Discovering and Balancing Fundamental Cycles in Large Signed Graphs

Ghadeer Alabandi*, Jelena Tešić, Lucas Rusnak, and Martin Burtscher


Social Networks
§ On-line social networks have become an important mode of human interaction
  § Receive news, participate in surveys, and express opinions via on-line social networks
§ Social Network Analysis has largely focused on
  § Community discovery, topic trending, quantifying the influence of a person, recommender systems
§ If a decision must be made, majority voting is used

Reaching Consensus
§ Majority voting problems
  § Ignores underlying network structure
  § Prone to bias due to super-influencers, cheaters, etc.
§ Network-wide consensus
  § Study consensus states across entire social network
  § Well-researched approach in field of psychology
§ Signed graph balancing
  § Computes nearest consensus states of signed social networks

Signed Social Networks
§ Signed social network is a directed graph 𝐺 = (𝑉, 𝐸, 𝛿)
  § 𝑉 is the vertex set representing the nodes of the network
  § 𝐸 ⊆ 𝑉 × 𝑉 is the edge set representing the network links
  § 𝛿 ∶ 𝐸 → {−1, +1} is a function that assigns
    § +1 for a positive link (friend/trust)
    § −1 for a negative link (foe/distrust)
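
The definition of 𝐺 = (𝑉, 𝐸, 𝛿) can be mirrored directly in a small data structure. The following is only a sketch: the function name `make_signed_graph` and the three-vertex toy network are hypothetical and not taken from the talk.

```python
def make_signed_graph(signed_edges):
    """Build (V, E, delta) from an iterable of (u, v, sign) triples.

    Hypothetical helper for illustration; delta maps each directed
    edge (u, v) to +1 (friend/trust) or -1 (foe/distrust).
    """
    vertices = set()
    delta = {}
    for u, v, s in signed_edges:
        if s not in (-1, +1):
            raise ValueError(f"sign must be -1 or +1, got {s}")
        vertices.update((u, v))
        delta[(u, v)] = s
    return vertices, set(delta), delta

# Toy network (assumed data): A trusts B, B distrusts C, C distrusts A.
V, E, delta = make_signed_graph([("A", "B", +1), ("B", "C", -1), ("C", "A", -1)])
```

Storing 𝛿 as a dictionary keyed by edge keeps the sign lookup a single operation, which is all the later balancing steps need from the input.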

Benefits of Graph Balancing
[Figure: wiki-Elec data analysis in graph-balancing space; 7115 vertices and 103689 edges (plus actual election results)]

Bottlenecks
§ Best existing graph-balancing implementation
  § Too slow and memory hungry
  § Computing 1000 nearest balanced states on the relatively small wiki-Elec graph takes 1.5 hours on a 16-node HPC cluster with two 14-core 2.4 GHz Xeon processors per node
  § Requires tens of gigabytes of memory per node
§ No prior solution exists for balancing large real-world signed social networks

Our Contribution
§ graphB+ signed graph balancing algorithm
  § Based on new vertex and edge labeling technique
  § Main benefits
    § Labeling only requires linear time and memory
    § Cycle balancing time is independent of graph size
  § Parallel OpenMP and CUDA implementations
    § 17 million cycles identified and balanced per second
    § Can handle inputs with billions of edges on 1 CPU/GPU
§ Can use proven balance model from psychology on large real-world social networks for first time

GraphB+

GraphB+ Benefits
§ High performance
  § GraphB+ incorporates a new algorithm for efficiently identifying, traversing, and balancing all fundamental cycles of a graph
§ Low memory footprint
  § GraphB+ requires one word of storage per vertex to record the new node ID (label) as well as two words of storage per edge to record the beginning and end of the reachable vertex range
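
To get a feel for the footprint claim, the label storage can be estimated from the vertex and edge counts. This is a back-of-the-envelope sketch: the 4-byte word size is an assumption (the slides do not state it), and the function name is hypothetical.

```python
def label_storage_bytes(num_vertices, num_edges, word_bytes=4):
    """Bytes for one word per vertex plus two words per edge.

    word_bytes=4 is an assumption; the slides do not give the word size.
    """
    return (num_vertices + 2 * num_edges) * word_bytes

# wiki-Elec sizes quoted in the talk: 7115 vertices and 103689 edges.
wiki_elec_bytes = label_storage_bytes(7115, 103689)
```

Under that assumption, the labels for wiki-Elec occupy 857,972 bytes, which is under one megabyte and covers only the labels, not the graph itself.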

Graph Balancing Example
§ The red pluses and minuses indicate the signs of the edges and are part of the input

Graph Balancing Example
§ Balancing is performed based on a spanning tree of the graph
§ Assume vertex 𝑅 is selected to be the root of the spanning tree

Graph Balancing Example
§ Use resulting BFS spanning tree as starting point for balancing
§ The tree edges are arrows pointing from the parent to the child
§ The non-tree edges are the dotted green lines
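
A BFS spanning tree and the leftover non-tree edges can be obtained as sketched below. The sketch assumes edge direction is ignored while building the tree and uses a hypothetical four-vertex graph; neither the function name nor the data come from the talk.

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    """Return (parent, tree_edges, non_tree_edges) for an undirected graph.

    adj maps each vertex to a list of neighbors (assumed symmetric).
    Edges are represented as frozensets so that direction is ignored.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:       # first time v is reached: tree edge
                parent[v] = u
                queue.append(v)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    all_edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    return parent, tree, all_edges - tree

# Hypothetical example graph with root R and one non-tree edge A-B.
adj = {"R": ["A", "B"], "A": ["R", "B", "C"], "B": ["R", "A"], "C": ["A"]}
parent, tree, non_tree = bfs_spanning_tree(adj, "R")
```

Each non-tree edge closes exactly one cycle when added to the tree; these are the fundamental cycles that the balancing step processes.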

GraphB+ Vertex Relabeling (Step 1)
§ GraphB+ relabels the vertices
  § Performs a pre-order traversal of the spanning tree
  § During this traversal, each reached vertex is assigned a new ID
  § The new ID is equal to the number of previously visited vertices
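
The relabeling rule (new ID = number of previously visited vertices) can be sketched with an explicit stack. The tree, the `children` mapping, and the function name are hypothetical; this is an illustration, not the graphB+ source.

```python
def preorder_labels(children, root):
    """Assign each vertex the number of vertices visited before it."""
    label = {}
    stack = [root]
    while stack:
        v = stack.pop()
        label[v] = len(label)   # count of previously visited vertices
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(children.get(v, [])))
    return label

# Hypothetical spanning tree: R has children A and B, and so on.
children = {"R": ["A", "B"], "A": ["C", "D"], "B": ["E"]}
label = preorder_labels(children, "R")
```

With this numbering, every subtree receives a contiguous block of IDs, which is the property the edge labeling step relies on.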

GraphB+ Edge Labeling (Step 2)
§ GraphB+ records a range on each tree edge
  § Range denotes which vertices are reachable when traversing the edge in the parent-to-child direction
  § Beginning determined using a pre-order traversal
  § End determined using a post-order traversal

GraphB+ Edge Labeling (Step 2 cont.)
§ Ranges can always be expressed by just two values because of the vertex relabeling step
§ This feature of graphB+ is essential to keep the memory consumption low and to make the cycle traversals fast
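
One way to see why two values suffice: after pre-order relabeling, the vertices below a tree edge occupy a contiguous ID interval. The sketch below assumes inclusive ranges keyed by the child vertex of each tree edge; that convention, the recursion, and all names are assumptions made for illustration.

```python
def label_tree(children, root):
    """Return (label, rng) with pre-order IDs and inclusive [begin, end] ranges.

    rng[v] conceptually belongs to the tree edge parent(v) -> v and covers
    exactly the IDs reachable through that edge. Recursion is acceptable
    here only because the example tree is tiny.
    """
    label, rng = {}, {}

    def visit(v):
        label[v] = len(label)                 # pre-order: range beginning
        for c in children.get(v, []):
            visit(c)
        rng[v] = (label[v], len(label) - 1)   # post-order: range end

    visit(root)
    return label, rng

# Hypothetical spanning tree used only for this example.
children = {"R": ["A", "B"], "A": ["C", "D"], "B": ["E"]}
label, rng = label_tree(children, "R")
```

Testing whether a vertex lies below an edge then reduces to two integer comparisons against the stored range, with no need to store vertex sets.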

GraphB+ Cycle Balancing (Step 3)
§ GraphB+ identifies and traverses all cycles that are created when inserting one non-tree edge at a time
§ During cycle traversal
  § Count the number of traversed edges with a negative sign
  § Set sign of non-tree edge such that cycle has an even number of negative signs → cycle is balanced
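
The sign rule itself is a parity check. The sketch below isolates just that rule, given the signs of the tree edges on one fundamental cycle; the function name is hypothetical.

```python
def balancing_sign(tree_edge_signs):
    """Sign the non-tree edge must take so the whole cycle is balanced.

    tree_edge_signs holds the +1/-1 signs of the tree edges on the cycle.
    An odd count of negatives needs one more negative edge; an even count
    needs a positive edge.
    """
    negatives = sum(1 for s in tree_edge_signs if s < 0)
    return -1 if negatives % 2 else +1

# One negative tree edge on the cycle: the non-tree edge must be negative.
odd_case = balancing_sign([+1, -1, +1])
# Two negative tree edges: the non-tree edge must be positive.
even_case = balancing_sign([-1, -1])
```

Equivalently, the chosen sign makes the product of all signs around the cycle equal to +1.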

GraphB+ Balanced Graph
§ Resulting balanced graph
  § Includes two changed signs
    § Edge 𝐵 → 𝐹
    § Edge 𝐷 → 𝐻

Using the Balanced Graph
§ Graph used to determine the Harary bipartition
  § All negative edges are cut (grayed out)
  § Resulting connected components (CCs) are computed
  § Bipartition is formed by combining all CCs with an even number of negative edges between them

Using the Balanced Graph (cont.)
§ Bipartition result
  § The brown vertices make up one side of the bipartition and the blue vertices the other
  § Everyone in the blue partition agrees with each other and everyone in the brown partition agrees with each other
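
For a balanced graph, the bipartition can also be computed by a two-coloring in which positive edges keep both endpoints on the same side and negative edges put them on opposite sides. The sketch below uses this coloring formulation rather than the cut-and-merge procedure on the slides; the function name and the example graph are hypothetical.

```python
from collections import deque

def harary_bipartition(vertices, sign):
    """Two-color a balanced signed graph; return None if it is unbalanced.

    sign maps undirected edges (u, v) to +1 or -1. The result maps each
    vertex to side 0 or 1.
    """
    adj = {v: [] for v in vertices}
    for (u, v), s in sign.items():
        adj[u].append((v, s))
        adj[v].append((u, s))
    side = {}
    for start in vertices:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, s in adj[u]:
                want = side[u] if s > 0 else 1 - side[u]
                if v not in side:
                    side[v] = want
                    queue.append(v)
                elif side[v] != want:
                    return None   # a cycle with an odd number of negatives
    return side

# Hypothetical balanced 4-cycle with two negative edges.
balanced = harary_bipartition(
    ["A", "B", "C", "D"],
    {("A", "B"): +1, ("B", "C"): -1, ("C", "D"): +1, ("A", "D"): -1},
)
# Hypothetical triangle with one negative edge (not balanced).
unbalanced = harary_bipartition(
    ["A", "B", "C"], {("A", "B"): +1, ("B", "C"): +1, ("A", "C"): -1}
)
```

In the balanced example, A and B land on one side and C and D on the other, so every positive edge stays within a side and every negative edge crosses between sides.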

Using the Balanced Graph (cont.)
§ Many such bipartitions are computed
  § Based on different spanning trees
  § graphB+ is invoked many times
  § Graph balancing is the most time intensive aspect
§ For each vertex, count how often it ends up in the majority (i.e., the larger bipartition)
  § This yields the status of each vertex, which is a very important consensus-based network-wide metric
  § Can be used to detect bias and outliers
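
The status count can be sketched as follows, given one side assignment per spanning tree. How ties between equally large sides are handled is not stated on the slides; this sketch arbitrarily treats side 0 as the majority in that case, and the function name and data are hypothetical.

```python
def vertex_status(bipartitions):
    """Count, per vertex, how often it lies on the larger side.

    Each bipartition maps vertex -> 0 or 1. Ties are resolved in favor
    of side 0, which is an assumption made for this sketch.
    """
    counts = {}
    for side in bipartitions:
        ones = sum(side.values())
        majority = 1 if ones * 2 > len(side) else 0
        for v, s in side.items():
            counts[v] = counts.get(v, 0) + int(s == majority)
    return counts

# Two hypothetical bipartitions of the same three vertices.
status = vertex_status([
    {"A": 0, "B": 0, "C": 1},   # majority is side 0: A and B
    {"A": 0, "B": 1, "C": 1},   # majority is side 1: B and C
])
```

Here B is in the majority both times, while A and C are each in the majority once.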

Parallelization

Parallelization
§ Vertex and edge labeling
  § Step 1: pre-order traversal of 𝑇
  § Step 2: both a pre- and a post-order traversal of 𝑇
  § These traversals are difficult to parallelize
§ Same result can be obtained with a bottom-up followed by a top-down pass over 𝑇’s BFS levels
  § All vertices in same level can be processed in parallel
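
One plausible way such level-wise passes could work is sketched below: the bottom-up pass computes subtree sizes, and the top-down pass hands each child its pre-order ID from the parent's ID and the sizes of earlier siblings. The code is sequential Python, and the details are assumptions rather than the graphB+ implementation; the comments mark the loops whose iterations are independent within a level.

```python
def level_labels(children, levels):
    """Compute pre-order IDs and inclusive ranges from BFS levels.

    levels[d] lists the vertices at depth d, with levels[0] == [root].
    """
    size = {}
    for level in reversed(levels):      # bottom-up pass: subtree sizes
        for v in level:                 # independent within a level
            size[v] = 1 + sum(size[c] for c in children.get(v, []))
    label = {levels[0][0]: 0}
    for level in levels:                # top-down pass: pre-order IDs
        for v in level:                 # independent within a level
            next_id = label[v] + 1
            for c in children.get(v, []):
                label[c] = next_id
                next_id += size[c]      # skip over c's whole subtree
    rng = {v: (label[v], label[v] + size[v] - 1) for v in label}
    return label, rng

# Hypothetical tree and its BFS levels.
children = {"R": ["A", "B"], "A": ["C", "D"], "B": ["E"]}
levels = [["R"], ["A", "B"], ["C", "D", "E"]]
label, rng = level_labels(children, levels)
```

The resulting IDs match what a sequential pre-order traversal of the same tree would produce, but no vertex ever waits on another vertex in its own level.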

Parallelization (cont.)
§ Cycle processing
  § Processing the cycles is the core of graphB+
  § To maximize the performance, the code only processes the non-tree edges in one direction
  § Based on the range information, it follows the appropriate edge from vertex to vertex until the cycle is complete
  § Along the way, it counts the number of negative edges and, ultimately, sets the sign of the non-tree edge such that the total number is even
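
A range-guided walk along one fundamental cycle might look like the sketch below. At each vertex it moves down into the child whose range contains the target's ID, or otherwise up to the parent. All names, the inclusive-range convention, and the example tree are assumptions for illustration.

```python
def balance_with_ranges(u, v, parent, children, rng, sign):
    """Walk the tree path from u to v, then fix the sign of edge (u, v).

    rng[x] is the inclusive pre-order ID range below x, so rng[x][0] is
    x's own ID. sign is keyed by frozenset edges and updated in place.
    Returns the number of negative tree edges on the path.
    """
    negatives = 0
    cur = u
    target = rng[v][0]
    while cur != v:
        nxt = parent[cur]                    # default: go up
        for c in children.get(cur, []):
            begin, end = rng[c]
            if begin <= target <= end:       # v lies below child c
                nxt = c
                break
        if sign[frozenset((cur, nxt))] < 0:
            negatives += 1
        cur = nxt
    sign[frozenset((u, v))] = -1 if negatives % 2 else +1
    return negatives

# Hypothetical tree R -> {A, B}, A -> {C}, with non-tree edge C-B.
parent = {"R": None, "A": "R", "B": "R", "C": "A"}
children = {"R": ["A", "B"], "A": ["C"]}
rng = {"R": (0, 3), "A": (1, 2), "C": (2, 2), "B": (3, 3)}
sign = {
    frozenset(("C", "A")): -1,
    frozenset(("A", "R")): +1,
    frozenset(("R", "B")): -1,
    frozenset(("C", "B")): -1,   # input sign of the non-tree edge
}
negatives = balance_with_ranges("C", "B", parent, children, rng, sign)
```

The walk visits C, A, R, B and sees two negative tree edges, so the non-tree edge C-B is flipped from −1 to +1 to keep the total even.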

Parallelization (cont.)
§ Cycle processing
  § OpenMP
    § The cycle processing is parallelized over the vertices
    § No synchronization as the shared data is only read
    § Dynamic schedule for load balancing
  § CUDA
    § GPUs require much higher degrees of parallelism
    § Our CUDA implementation is parallelized both across vertices (warps) and edges (threads in warps)

Results

Methodology: 2 Devices
§ Titan V GPU
  § 5120 cores
  § 4.5 MB L2, 12 GB (652 GB/s)
§ 3.5 GHz AMD Ryzen Threadripper 2950X CPU
  § 16 cores, 32 threads
  § 8 MB L3, 48 GB (87 GB/s)
§ Run all codes with 1000 trees
  § Roots are the 1000 highest-degree vertices

Methodology: 20 Signed Graphs

Comparison to Original Python Code

graphB+ (CUDA) is 1074 times faster than original algorithm (Python)

Throughput on Larger Graphs

It takes the GPU less than 15 minutes to compute 1000 nearest balanced states
CUDA code balances 17 million fundamental cycles per second

Speedup of OpenMP and CUDA

CUDA code is 2.6 to 53 times faster than serial code and up to 6.2 times faster than OpenMP code
OpenMP code is 5.7 to 12.1 times faster on 16 cores than serial code

Fundamental Cycle Properties

The average cycle length is between 5.0 and 10.6 (very small)
The average vertex degree encountered on a cycle is 147.7
147.7 is surprisingly high for graphs with an average degree of just 3.3

Spanning Tree Properties

Every single random BFS spanning tree of these social networks is shallow
Largest depth over all graphs and trees is just 21 and average depth is under 18
The average cycle length is linear in the average tree depth and the expected average tree depth is 𝑂(log(𝑛))

Summary and Conclusion
§ New efficient algorithm called graphB+ for balancing the signs on the edges of signed graphs
§ Based on a new vertex and edge labeling technique for rapidly determining and balancing all fundamental cycles of a graph
§ Runs in expected 𝑂(𝑚 × log(𝑛) × 𝑡) time and requires 𝑂(𝑛 + 𝑚) storage
§ Parallelized graphB+ using OpenMP and CUDA
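
The stated running time can be pieced together from the earlier slides roughly as follows. This is a sketch under assumptions: the graph is connected, 𝑡 is read as the number of spanning trees processed, and the symbol for the expected cycle length is introduced here only for illustration.

```latex
% Assumptions: connected graph with n vertices and m edges,
% t = number of spanning trees, \bar{\ell} = expected fundamental-cycle
% length (symbol introduced here for illustration only).
\begin{align*}
\#\text{cycles per tree} &= m - (n - 1) = m - n + 1 \le m \\
T_{\text{tree}} &= \underbrace{O(n + m)}_{\text{labeling}}
  + \underbrace{(m - n + 1)\,\bar{\ell}}_{\text{cycle traversals}},
  \qquad \bar{\ell} = O(\log n) \\
T_{\text{total}} &= t \cdot T_{\text{tree}} = O(m \cdot \log(n) \cdot t)
\end{align*}
```

The logarithmic cycle length follows from the observations that cycle length is linear in tree depth and that the expected BFS tree depth is 𝑂(log(𝑛)).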

§ Acknowledgments
  § NSF, DOE, NVIDIA
§ Contact information
  § [email protected]
  § [email protected]
§ Web page (link to paper and source code)
  § cs.txstate.edu/~burtscher/research/graphBplus/

Thank you!