Discovering and Balancing Fundamental Cycles in Large Signed Graphs
Ghadeer Alabandi*, Jelena Tešić, Lucas Rusnak, and Martin Burtscher
Social Networks
§ On-line social networks have become an important mode of human interaction
  § Receive news, participate in surveys, and express opinions via on-line social networks
§ Social Network Analysis has largely focused on
  § Community discovery, topic trending, quantifying the influence of a person, recommender systems
§ If a decision must be made, majority voting is used
Reaching Consensus
§ Majority voting problems
  § Ignores underlying network structure
  § Prone to bias due to super-influencers, cheaters, etc.
§ Network-wide consensus
  § Study consensus states across entire social network
  § Well-researched approach in field of psychology
§ Signed graph balancing
  § Computes nearest consensus states of signed social networks
Signed Social Networks
§ Signed social network is a directed graph 𝐺 = (𝑉, 𝐸, 𝛿)
  § 𝑉 is vertex set representing the nodes of the network
  § 𝐸 ⊆ 𝑉 × 𝑉 is edge set representing the network links
  § 𝛿 ∶ 𝐸 → {−1, +1} is a function that assigns
    § +1 for a positive link (friend/trust)
    § −1 for a negative link (foe/distrust)
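The definition above can be sketched in a few lines of Python. This is only an illustration of 𝐺 = (𝑉, 𝐸, 𝛿), assuming a plain dictionary for 𝛿; the names `SignedGraph`, `add_edge`, and `sign` are hypothetical and are not taken from graphB+.

```python
from dataclasses import dataclass, field


@dataclass
class SignedGraph:
    """G = (V, E, delta), where delta maps each edge to -1 or +1."""
    vertices: set = field(default_factory=set)
    delta: dict = field(default_factory=dict)  # (u, v) -> -1 or +1

    def add_edge(self, u, v, sign):
        if sign not in (-1, +1):
            raise ValueError("sign must be -1 or +1")
        self.vertices.update((u, v))
        self.delta[(u, v)] = sign

    def sign(self, u, v):
        return self.delta[(u, v)]


# Hypothetical example data.
g = SignedGraph()
g.add_edge("alice", "bob", +1)   # positive link: friend/trust
g.add_edge("bob", "carol", -1)   # negative link: foe/distrust
```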
Benefits of Graph Balancing
wiki-Elec data analysis in graph-balancing space: 7115 vertices and 103689 edges (plus actual election results)
Bottlenecks
§ Best existing graph-balancing implementation
  § Too slow & memory hungry
    § Computing 1000 nearest balanced states on relatively small wiki-Elec graph takes 1.5 hours on 16-node HPC cluster with two 14-core 2.4 GHz Xeon processors per node
    § Requires tens of gigabytes of memory per node
§ No prior solution exists for balancing large real-world signed social networks
Our Contribution
§ graphB+ signed graph balancing algorithm
  § Based on new vertex and edge labeling technique
§ Main benefits
  § Labeling only requires linear time and memory
  § Cycle balancing time is independent of graph size
§ Parallel OpenMP and CUDA implementations
  § 17 million cycles identified and balanced per second
  § Can handle inputs with billions of edges on 1 CPU/GPU
§ Can use proven balance model from psychology on large real-world social networks for first time
GraphB+
GraphB+ Benefits
§ High performance
  § GraphB+ incorporates a new algorithm for efficiently identifying, traversing, and balancing all fundamental cycles of a graph
§ Low memory footprint
  § GraphB+ requires one word of storage per vertex to record new node ID (label) as well as two words of storage per edge to record the beginning and end of the reachable vertex range
Graph Balancing Example
§ The red pluses and minuses indicate the signs of the edges and are part of the input
Graph Balancing Example
§ Balancing is performed based on a spanning tree of the graph
§ Assume vertex 𝑅 is selected to be the root of the spanning tree
Graph Balancing Example
§ Use resulting BFS spanning tree as starting point for balancing
§ The tree edges are arrows pointing from the parent to the child
§ The non-tree edges are the dotted green lines
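The split into tree and non-tree edges can be sketched as follows. This is a minimal illustration that assumes an undirected adjacency-list input; the function name `bfs_spanning_tree` and the five-vertex graph are hypothetical and are not the example from the slides.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return (parent, tree_edges, non_tree_edges) for a connected undirected graph.

    adj maps each vertex to a list of its neighbors.
    Tree edges are stored as (parent, child) pairs.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    tree_edges = {(p, c) for c, p in parent.items() if p is not None}
    non_tree_edges = set()
    for u in adj:
        for v in adj[u]:
            if (u, v) not in tree_edges and (v, u) not in tree_edges:
                # Store each undirected non-tree edge once.
                non_tree_edges.add((min(u, v), max(u, v)))
    return parent, tree_edges, non_tree_edges


# Hypothetical example graph with 5 vertices and 6 edges.
adj = {
    "R": ["A", "B"],
    "A": ["R", "B", "C"],
    "B": ["R", "A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
parent, tree_edges, non_tree_edges = bfs_spanning_tree(adj, "R")
```

A connected graph with 𝑛 vertices and 𝑚 edges has 𝑛 − 1 tree edges and 𝑚 − 𝑛 + 1 non-tree edges, so this example has 4 and 2.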
GraphB+ Vertex Relabeling (Step 1)
§ GraphB+ relabels the vertices
  § Performs a pre-order traversal of the spanning tree
  § During this traversal, each reached vertex is assigned a new ID
  § The new ID is equal to the number of previously visited vertices
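The relabeling rule can be sketched with an explicit stack. This is a serial illustration only, assuming the tree is given as a parent-to-children dictionary; `preorder_relabel` and the example tree are hypothetical.

```python
def preorder_relabel(children, root):
    """Give each vertex a new ID equal to the number of vertices visited before it."""
    new_id = {}
    stack = [root]
    while stack:
        v = stack.pop()
        new_id[v] = len(new_id)  # number of previously visited vertices
        # Push children in reverse so that the first child is visited first.
        stack.extend(reversed(children.get(v, [])))
    return new_id


# Hypothetical spanning tree: R has children A and B, A has C, B has D.
children = {"R": ["A", "B"], "A": ["C"], "B": ["D"]}
new_id = preorder_relabel(children, "R")
```

With this numbering, every subtree occupies a contiguous block of IDs, which is what the edge labeling step relies on.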
GraphB+ Edge Labeling (Step 2)
§ GraphB+ records a range on each tree edge
  § Range denotes which vertices are reachable when traversing the edge in the parent-to-child direction
  § Beginning determined using a pre-order traversal
  § End determined using a post-order traversal
GraphB+ Edge Labeling (Step 2 cont.)
§ Ranges can always be expressed by just two values because of the vertex relabeling step
§ This feature of graphB+ is essential to keep the memory consumption low and to make the cycle traversals fast
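The two-value ranges can be illustrated with a single recursive traversal that assigns the pre-order IDs and records, for each tree edge, the first and last ID below it. This is a sketch under the assumption that the tree is a parent-to-children dictionary; `label_tree` is a hypothetical name, and the recursion is only suitable for small examples.

```python
def label_tree(children, root):
    """Return (new_id, edge_range) for a rooted tree.

    edge_range[(parent, child)] = (first, last) means that the vertices
    reachable through that edge are exactly those with new IDs first..last.
    """
    new_id = {}
    edge_range = {}

    def visit(v):
        new_id[v] = len(new_id)       # pre-order: ID assigned on first visit
        for c in children.get(v, []):
            first = len(new_id)       # the ID that child c is about to receive
            visit(c)
            last = len(new_id) - 1    # post-order: last ID handed out below c
            edge_range[(v, c)] = (first, last)

    visit(root)
    return new_id, edge_range


# Hypothetical spanning tree: R has children A and B, A has C, B has D.
children = {"R": ["A", "B"], "A": ["C"], "B": ["D"]}
new_id, edge_range = label_tree(children, "R")
```

Because the IDs below any edge are contiguous, a membership test is just `first <= x <= last`, with no per-edge vertex list.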
GraphB+ Cycle Balancing (Step 3)
§ GraphB+ identifies and traverses all cycles that are created when inserting one non-tree edge at a time
§ During cycle traversal
  § Count the number of traversed edges with a negative sign
  § Set sign of non-tree edge such that cycle has an even number of negative signs → cycle is balanced
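The balancing rule can be sketched as follows. Note that this illustration finds each cycle by walking parent pointers up to the common ancestor, which is a simpler stand-in and not the range-based traversal that graphB+ uses; the names `balance`, `parent`, `depth`, and `sign`, as well as the example data, are hypothetical.

```python
def balance(parent, depth, sign, non_tree_edges):
    """Set non-tree edge signs so each fundamental cycle has an even number of negatives.

    sign maps frozenset({u, v}) -> -1 or +1 and is updated in place.
    Returns the list of non-tree edges whose sign was changed.
    """
    changed = []
    for u, v in non_tree_edges:
        negatives = 0
        a, b = u, v
        # Move the deeper endpoint up until both meet at the common ancestor.
        while a != b:
            if depth[a] < depth[b]:
                a, b = b, a
            if sign[frozenset((a, parent[a]))] == -1:
                negatives += 1
            a = parent[a]
        # The non-tree edge must be negative exactly when the tree path
        # contains an odd number of negative edges.
        wanted = -1 if negatives % 2 else +1
        key = frozenset((u, v))
        if sign[key] != wanted:
            sign[key] = wanted
            changed.append((u, v))
    return changed


# Hypothetical example: tree edges R-A, R-B, A-C, B-D; non-tree edges A-B, C-D.
parent = {"R": None, "A": "R", "B": "R", "C": "A", "D": "B"}
depth = {"R": 0, "A": 1, "B": 1, "C": 2, "D": 2}
sign = {
    frozenset(("R", "A")): +1,
    frozenset(("R", "B")): -1,
    frozenset(("A", "C")): +1,
    frozenset(("B", "D")): +1,
    frozenset(("A", "B")): +1,  # non-tree edge
    frozenset(("C", "D")): -1,  # non-tree edge
}
changed = balance(parent, depth, sign, [("A", "B"), ("C", "D")])
```

In this example only the edge A-B has to change: its cycle A-R-B contains one negative tree edge, so A-B must become negative, while C-D is already consistent.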
GraphB+ Balanced Graph
§ Resulting balanced graph
  § Includes two changed signs
    § Edge 𝐵 → 𝐹
    § Edge 𝐷 → 𝐻
Using the Balanced Graph
§ Graph used to determine the Harary bipartition
  § All negative edges are cut (grayed out)
  § Resulting connected components (CCs) are computed
  § Bipartition is formed by combining all CCs with an even number of negative edges between them
Using the Balanced Graph (cont.)
§ Bipartition result
  § The brown vertices make up one bipartition and the blue vertices the other
  § Everyone in blue partition agrees with each other and everyone in brown partition agrees with each other
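For a balanced, connected graph, the same two sides can also be obtained by a two-coloring in which positive edges keep their endpoints together and negative edges separate them. The sketch below uses that shortcut instead of the cut-and-combine procedure from the slides; `harary_bipartition` and the example graph are hypothetical.

```python
from collections import deque


def harary_bipartition(adj_signed, start):
    """Split a balanced, connected signed graph into two sides.

    adj_signed maps each vertex to a list of (neighbor, sign) pairs.
    Raises ValueError if some edge contradicts the coloring, which
    means the graph is not balanced.
    """
    side = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, s in adj_signed[u]:
            expected = side[u] if s == +1 else 1 - side[u]
            if v not in side:
                side[v] = expected
                queue.append(v)
            elif side[v] != expected:
                raise ValueError("graph is not balanced")
    part0 = {v for v, s in side.items() if s == 0}
    part1 = {v for v, s in side.items() if s == 1}
    return part0, part1


# Hypothetical balanced graph (every cycle has an even number of negative edges).
adj_signed = {
    "R": [("A", +1), ("B", -1)],
    "A": [("R", +1), ("B", -1), ("C", +1)],
    "B": [("R", -1), ("A", -1), ("D", +1)],
    "C": [("A", +1), ("D", -1)],
    "D": [("B", +1), ("C", -1)],
}
part0, part1 = harary_bipartition(adj_signed, "R")
```

All edges inside a side are positive and all edges between the two sides are negative, which matches the "everyone agrees within a partition" reading above.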
Using the Balanced Graph (cont.)
§ Many such bipartitions are computed
  § Based on different spanning trees
  § graphB+ is invoked many times
  § Graph balancing is the most time intensive aspect
§ For each vertex, count how often it ends up in the majority (i.e., the larger bipartition)
  § This yields the status of each vertex, which is a very important consensus-based network-wide metric
  § Can be used to detect bias and outliers
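The majority count can be sketched as below. Dividing by the number of bipartitions and skipping ties are assumptions of this illustration, not details given in the slides; `vertex_status` and the sample bipartitions are hypothetical.

```python
def vertex_status(bipartitions):
    """Fraction of bipartitions in which each vertex lies on the larger side.

    bipartitions is a list of (side_a, side_b) pairs of vertex sets, one
    pair per balanced state. If both sides have equal size, no vertex is
    counted as being in the majority (an assumption of this sketch).
    """
    counts = {}
    for side_a, side_b in bipartitions:
        for v in side_a | side_b:
            counts.setdefault(v, 0)
        if len(side_a) == len(side_b):
            continue
        majority = side_a if len(side_a) > len(side_b) else side_b
        for v in majority:
            counts[v] += 1
    total = len(bipartitions)
    return {v: c / total for v, c in counts.items()}


# Hypothetical bipartitions from three different spanning trees.
bipartitions = [
    ({"R", "A", "C"}, {"B", "D"}),
    ({"R", "A", "B", "C"}, {"D"}),
    ({"R", "D"}, {"A", "B", "C"}),
]
status = vertex_status(bipartitions)
```

Here A and C are in the majority every time, while D never is, which would mark D as a candidate outlier.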
Parallelization
Parallelization
§ Vertex and edge labeling
  § Step 1: pre-order traversal of 𝑇
  § Step 2: both a pre-order and a post-order traversal of 𝑇
  § These traversals are difficult to parallelize
§ Same result can be obtained with a bottom-up followed by a top-down pass over 𝑇’s BFS levels
  § All vertices in same level can be processed in parallel
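One plausible way to realize the two level-wise passes is to compute subtree sizes bottom-up and then derive IDs and ranges top-down. The slides do not spell out what each pass computes, so the use of subtree sizes is an assumption here; `label_by_levels` is a hypothetical name, and the loops are serial even though the iterations within a level are independent.

```python
def label_by_levels(children, root):
    """Compute pre-order IDs and edge ranges with two passes over BFS levels."""
    # Group the vertices by BFS level.
    levels = [[root]]
    while True:
        nxt = [c for v in levels[-1] for c in children.get(v, [])]
        if not nxt:
            break
        levels.append(nxt)

    # Bottom-up pass: subtree sizes. Each vertex only reads the sizes of
    # its children, so vertices in the same level are independent.
    size = {}
    for level in reversed(levels):
        for v in level:
            size[v] = 1 + sum(size[c] for c in children.get(v, []))

    # Top-down pass: a child's ID follows its parent's ID plus the sizes of
    # the subtrees of its earlier siblings. Each vertex only writes the
    # entries of its own children, so a level is again independent.
    new_id = {root: 0}
    edge_range = {}
    for level in levels:
        for v in level:
            next_id = new_id[v] + 1
            for c in children.get(v, []):
                new_id[c] = next_id
                edge_range[(v, c)] = (next_id, next_id + size[c] - 1)
                next_id += size[c]
    return new_id, edge_range


# Hypothetical spanning tree: R has children A and B, A has C, B has D.
children = {"R": ["A", "B"], "A": ["C"], "B": ["D"]}
new_id, edge_range = label_by_levels(children, "R")
```

The result is the same numbering a sequential pre-order traversal would produce, but no pass ever needs to follow a single long traversal order.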
Parallelization (cont.)
§ Cycle processing
  § Processing the cycles is the core of graphB+
  § To maximize the performance, the code only processes the non-tree edges in one direction
  § Based on the range information, it follows the appropriate edge from vertex to vertex until the cycle is complete
  § Along the way, it counts the number of negative edges and, ultimately, sets the sign of the non-tree edge such that the total number is even
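The range-guided walk can be sketched as follows, with all vertices identified by their new pre-order IDs. The data layout (`parent`, `child_ranges`, `tree_sign`) and the function name are hypothetical choices for this illustration, and treating "one direction" as "start at the smaller ID" is likewise an assumption.

```python
def count_negatives_on_tree_path(start, target, parent, child_ranges, tree_sign):
    """Walk the tree path from start to target using only range lookups.

    child_ranges[x] is a list of (child, first, last) tuples for vertex x.
    tree_sign[(p, c)] is the sign of the tree edge from parent p to child c.
    Returns the number of negative tree edges on the path.
    """
    negatives = 0
    x = start
    while x != target:
        step = None
        for child, first, last in child_ranges.get(x, []):
            if first <= target <= last:   # target lies below this child
                step = child
                break
        if step is None:                  # target is not below x: go up
            step = parent[x]
            edge = (step, x)
        else:                             # follow the matching edge down
            edge = (x, step)
        if tree_sign[edge] == -1:
            negatives += 1
        x = step
    return negatives


# Hypothetical relabeled tree: 0 has children 1 and 3, 1 has child 2, 3 has child 4.
parent = {0: None, 1: 0, 2: 1, 3: 0, 4: 3}
child_ranges = {0: [(1, 1, 2), (3, 3, 4)], 1: [(2, 2, 2)], 3: [(4, 4, 4)]}
tree_sign = {(0, 1): +1, (0, 3): -1, (1, 2): +1, (3, 4): +1}

# Process each non-tree edge once, starting from its smaller endpoint.
non_tree_sign = {(1, 3): +1, (2, 4): -1}
for u, v in list(non_tree_sign):
    negatives = count_negatives_on_tree_path(u, v, parent, child_ranges, tree_sign)
    non_tree_sign[(u, v)] = -1 if negatives % 2 else +1
```

The walk only reads shared data and touches a number of vertices proportional to the cycle length, independent of the graph size.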
Parallelization (cont.)
§ Cycle processing
  § OpenMP
    § The cycle processing is parallelized over the vertices
    § No synchronization as the shared data is only read
    § Dynamic schedule for load balancing
  § CUDA
    § GPUs require much higher degrees of parallelism
    § Our CUDA implementation is parallelized both across vertices (warps) and edges (threads in warps)
Results
Methodology: 2 Devices
§ Titan V GPU
  § 5120 cores
  § 4.5 MB L2, 12 GB (652 GB/s)
§ 3.5 GHz AMD Ryzen Threadripper 2950X CPU
  § 16 cores, 32 threads
  § 8 MB L3, 48 GB (87 GB/s)
§ Run all codes with 1000 trees
  § Roots are the 1000 highest-degree vertices
Methodology: 20 Signed Graphs
Comparison to Original Python Code
graphB+ (CUDA) is 1074 times faster than original algorithm (Python)
Throughput on Larger Graphs
It takes GPU less than 15 minutes to compute 1000 nearest balanced states
CUDA code balances 17 million fundamental cycles per second
Speedup of OpenMP and CUDA
CUDA code is 2.6 to 53 times faster than serial code and up to 6.2 times faster than OpenMP code
OpenMP code is 5.7 to 12.1 times faster on 16 cores than serial code
Fundamental Cycle Properties
The average cycle length is between 5.0 and 10.6 (very small)
The average vertex degree encountered on a cycle is 147.7
147.7 is surprisingly high for graphs with an average degree of just 3.3
Spanning Tree Properties
Every single random BFS spanning tree of these social networks is shallow
Largest depth over all graphs and trees is just 21 and average depth is under 18
The average cycle length is linear in the average tree depth and the expected average tree depth is 𝑂(log(𝑛))
Summary and Conclusion
§ New efficient algorithm called graphB+ for balancing the signs on the edges of signed graphs
§ Based on a new vertex and edge labeling technique for rapidly determining and balancing all fundamental cycles of a graph
§ Runs in expected 𝑂(𝑚 × log(𝑛) × 𝑡) time and requires 𝑂(𝑛 + 𝑚) storage
§ Parallelized graphB+ using OpenMP and CUDA
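The stated running time can be reconstructed informally from the earlier slides. The sketch below assumes a connected graph with 𝑛 vertices and 𝑚 edges, takes 𝑡 to be the number of spanning trees (the slides do not define it), and introduces 𝑑 as the tree depth, a symbol used only here.

```latex
\begin{align*}
\text{fundamental cycles per tree} &= m - n + 1 \\
\text{edges per cycle} &\le 2d + 1, \qquad \mathbb{E}[d] = O(\log n) \\
\text{expected time for } t \text{ trees}
  &= t \cdot \bigl( \underbrace{O(n + m)}_{\text{labeling}}
     + \underbrace{(m - n + 1) \cdot O(\log n)}_{\text{cycle balancing}} \bigr)
   = O(m \times \log(n) \times t)
\end{align*}
```

The bound of 2𝑑 + 1 holds because a fundamental cycle consists of one non-tree edge plus the tree path between its endpoints, which goes up at most 𝑑 levels and down at most 𝑑 levels.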
§ Acknowledgments
  § NSF, DOE, NVIDIA
§ Contact information
  § [email protected]
  § [email protected]
§ Web page (link to paper and source code)
  § cs.txstate.edu/~burtscher/research/graphBplus/
Thank you!