leopard: lightweight partitioning and replication for dynamic graphs
![Page 1: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/1.jpg)
Leopard: Lightweight Partitioning and Replication for Dynamic Graphs
Jiewen Huang and Daniel Abadi
Yale University
![Page 2: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/2.jpg)
Facebook Social Graph
![Page 3: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/3.jpg)
Social Graphs
![Page 4: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/4.jpg)
Web Graphs
![Page 5: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/5.jpg)
Semantic Graphs
![Page 6: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/6.jpg)
Graph Partitioning
Given a graph G and an integer k, partition the vertices into k disjoint sets such that:
● as few edges as possible are cut
● the sets are as balanced as possible
This problem is NP-hard.
Many systems use hash partitioning
● Results in many edges being “cut”
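The two goals can be made concrete with a small sketch. The function names, the toy graph, and the use of `v % k` as a stand-in for a hash function are all assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter

def hash_partition(vertices, k):
    """Assign integer vertex ids to partitions with v % k (a stand-in for a hash)."""
    return {v: v % k for v in vertices}

def edge_cut(edges, assignment):
    """Count edges whose two endpoints live in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

def partition_sizes(assignment, k):
    """Number of vertices per partition, used to judge balance."""
    counts = Counter(assignment.values())
    return [counts.get(p, 0) for p in range(k)]

# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
hashed = hash_partition(range(6), k=2)
clustered = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(edge_cut(edges, hashed), partition_sizes(hashed, 2))        # 5 [3, 3]
print(edge_cut(edges, clustered), partition_sizes(clustered, 2))  # 1 [3, 3]
```

Both assignments are perfectly balanced, but the hash-based one cuts five of the seven edges while the structure-aware one cuts only one.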
![Page 7: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/7.jpg)
State of the Art
Multilevel scheme: coarsening phase
![Page 8: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/8.jpg)
The only constant is change.
-------- Heraclitus
To Make the Problem more Complicated
Social graphs: new people and friendships
Semantic Web graphs: new knowledge
Web graphs: new websites and links
![Page 9: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/9.jpg)
Dynamic Graphs
A
Partition 1 Partition 2
Is partition 1 still the better partition for A?
![Page 10: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/10.jpg)
Repartitioning the entire graph upon every change is way too expensive
New Framework
Leopard:
● Locally reassesses partitioning as a result of changes without a full re-partitioning
● Integrates consideration of replication with partitioning
![Page 11: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/11.jpg)
Outline
Background and Motivation
Leopard
Overview
Computation Skipping
Replication
Experiments
![Page 12: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/12.jpg)
Algorithm Overview
For each added/deleted edge <V1, V2>
Compute best partition for V1 using a heuristic
Re-assign V1 if needed
The same for V2
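A minimal sketch of this loop, under stated assumptions: the class `PartitionedGraph`, its method names, and the placeholder scoring function (count of neighbours in a partition) are hypothetical and chosen only to show the control flow. A vertex is moved only when another partition scores strictly higher than its current one.

```python
from collections import defaultdict

class PartitionedGraph:
    """Toy in-memory graph with a vertex-to-partition map (illustrative only)."""

    def __init__(self, k):
        self.k = k
        self.adj = defaultdict(set)  # vertex -> set of neighbours
        self.part = {}               # vertex -> partition id

    def _scores(self, v, score):
        """Score every partition for v, not counting v itself in partition sizes."""
        result = []
        for p in range(self.k):
            neighbours = sum(1 for u in self.adj[v] if self.part.get(u) == p)
            size = sum(1 for u, q in self.part.items() if q == p and u != v)
            result.append(score(neighbours, size))
        return result

    def reassess(self, v, score):
        """Re-assign v only if another partition scores strictly higher."""
        scores = self._scores(v, score)
        best = max(range(self.k), key=scores.__getitem__)
        current = self.part.get(v)
        if current is None or scores[current] < scores[best]:
            self.part[v] = best

    def on_edge_change(self, v1, v2, score, added=True):
        """Apply the edge change, then reassess both endpoints in turn."""
        if added:
            self.adj[v1].add(v2)
            self.adj[v2].add(v1)
        else:
            self.adj[v1].discard(v2)
            self.adj[v2].discard(v1)
        for v in (v1, v2):
            self.reassess(v, score)

g = PartitionedGraph(k=2)
g.part = {"a": 0, "b": 1, "c": 1}
g.adj["a"].add("c")
g.adj["c"].add("a")
# Placeholder heuristic: just the number of neighbours in the partition.
g.on_edge_change("a", "b", score=lambda neighbours, size: neighbours)
print(g.part)  # {'a': 1, 'b': 1, 'c': 1}
```

Passing the scoring function in as a parameter keeps the loop independent of whichever heuristic is plugged in.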
![Page 13: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/13.jpg)
Example: Adding an Edge
AB
Partition 1 Partition 2
![Page 14: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/14.jpg)
Compute the Partition for B
A
B
Partition 1: # neighbours: 1, # vertices: 5
Partition 2: # neighbours: 3, # vertices: 3
Goals: (1) few cuts and (2) balanced
Heuristic: # neighbours * (1 - # vertices / capacity)
Partition 1: 1 * (1 - 5/6) = 0.17
Partition 2: 3 * (1 - 3/6) = 1.5 (higher score)
This heuristic is simple for the sake of presentation. More advanced heuristics are discussed in the paper.
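The arithmetic for B can be checked with a few lines of code. This is a sketch of the simplified presentation heuristic only; the function and parameter names are assumptions, and the capacity of 6 is read off the denominators in the example.

```python
def score(neighbours, vertices, capacity):
    """Simplified heuristic: # neighbours * (1 - # vertices / capacity)."""
    return neighbours * (1 - vertices / capacity)

p1 = score(neighbours=1, vertices=5, capacity=6)
p2 = score(neighbours=3, vertices=3, capacity=6)
print(round(p1, 2), round(p2, 2))  # 0.17 1.5
```

The first factor rewards partitions that already hold many neighbours (fewer cuts), and the second factor shrinks as a partition fills up (balance).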
![Page 15: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/15.jpg)
Compute the Partition for A
A
B
Partition 1: # neighbours: 1, # vertices: 4
Partition 2: # neighbours: 2, # vertices: 4
Goals: (1) few cuts and (2) balanced
Heuristic: # neighbours * (1 - # vertices / capacity)
Partition 1: 1 * (1 - 4/6) = 0.33
Partition 2: 2 * (1 - 4/6) = 0.67 (higher score)
![Page 16: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/16.jpg)
Example: Adding an Edge
B
Partition 1 Partition 2
A
(1) B stays put
(2) A moves to partition 2
![Page 17: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/17.jpg)
Outline
Background and Motivation
Leopard
Overview
Computation Skipping
Replication
Experiments
![Page 18: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/18.jpg)
Computation cost
For each new edge, must:
For both vertices involved in the edge:
Calculate the heuristic for each partition
(May involve communication for remote vertex location lookup)
![Page 19: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/19.jpg)
Computation Skipping
Observation: As the number of neighbors of a vertex increases, the influence of a new neighbor decreases.
![Page 20: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/20.jpg)
Computation Skipping
Basic Idea: Accumulate changes for a vertex; if the changes exceed a certain threshold, recompute the partition for the vertex.
For example, let the threshold on # accumulated changes / # neighbors be 20%.
(1) Compute the partition when V has 10 neighbors. Then 2 new edges are added for V: 2 / 12 = 17% < 20%. Don’t recompute.
(2) When 1 more new edge is added for V: 3 / 13 = 23% > 20%. Recompute the partition for V. Reset # accumulated changes to 0.
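The bookkeeping in this example can be sketched as follows. The class name `SkipTracker` and its fields are hypothetical; the sketch only tracks edge additions and assumes the ratio is taken against the neighbour count after the new edge is added, as in the numbers above.

```python
class SkipTracker:
    """Tracks accumulated changes per vertex and decides when to recompute."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold
        self.neighbours = {}  # vertex -> current number of neighbours
        self.pending = {}     # vertex -> changes since the last recomputation

    def record_new_edge(self, v):
        """Record one new edge at v; return True if v's partition should be recomputed."""
        self.neighbours[v] = self.neighbours.get(v, 0) + 1
        self.pending[v] = self.pending.get(v, 0) + 1
        if self.pending[v] / self.neighbours[v] > self.threshold:
            self.pending[v] = 0  # reset after triggering a recomputation
            return True
        return False

tracker = SkipTracker(threshold=0.2)
tracker.neighbours["V"] = 10  # partition was last computed with 10 neighbours
decisions = [tracker.record_new_edge("V") for _ in range(3)]
print(decisions)  # [False, False, True]
```

The first two edges give ratios 1/11 and 2/12, both under 20%, so the heuristic is skipped; the third gives 3/13 and triggers the recomputation.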
![Page 21: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/21.jpg)
Outline
Background and Motivation
Leopard
Overview
Computation Skipping
Replication
Experiments
![Page 22: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/22.jpg)
Replication
Goals of replication:
● fault tolerance (k copies for each data point/block)
● further cut reduction
![Page 23: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/23.jpg)
Minimum-Average Replication
It takes two parameters:
● minimum: fault tolerance
● average: cut reduction
![Page 24: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/24.jpg)
Example

| # copies | vertices |
|---|---|
| 2 | A, C, D, E, H, J, K, L |
| 3 | F, I |
| 4 | B, G |

min = 2
average = 2.5
first copy
replica
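The two figures for this example can be verified directly from the copy counts. The dictionary layout below is just a convenient encoding chosen for this check.

```python
# Number of copies -> vertices with that many copies (from the example).
copies = {2: "ACDEHJKL", 3: "FI", 4: "BG"}

per_vertex = {v: n for n, vs in copies.items() for v in vs}
minimum = min(per_vertex.values())
average = sum(per_vertex.values()) / len(per_vertex)
print(minimum, average)  # 2 2.5
```

There are 30 copies spread over 12 vertices, which gives the average of 2.5, and no vertex has fewer than 2 copies.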
![Page 25: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/25.jpg)
Example

| # copies | vertices |
|---|---|
| 2 | A, C, D, E, H, J, K, L |
| 3 | F, I |
| 4 | B, G |

min = 2
average = 2.5
![Page 26: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/26.jpg)
How Many Copies?
A
Scores of each partition
Partition 1: 0.1
Partition 2: 0.2
Partition 3: 0.3
Partition 4: 0.4
minimum = 2
average = 3
![Page 27: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/27.jpg)
How Many Copies?
A
Partition 1: 0.1
Partition 2: 0.2
Partition 3: 0.3
Partition 4: 0.4
minimum = 2
average = 3
minimum requirement
What about them?
![Page 28: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/28.jpg)
Comparing against Past Scores
Always keep the last n computed scores.
Past scores, sorted from high to low (further scores elided): 0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1
minimum = 2
average = 3
cutoff: top (avg - 1)/(k - 1) percent of scores
![Page 29: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/29.jpg)
Comparing against Past Scores
Past scores, sorted from high to low (further scores elided): 0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1
minimum = 2
average = 3
Marked positions: 30th, 31st
# copies: 2
cutoff: 30th highest score
![Page 30: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/30.jpg)
Comparing against Past Scores
Past scores, sorted from high to low (further scores elided): 0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1
minimum = 2
average = 3
Marked positions: 30th, 31st
# copies: 2
cutoff: 30th highest score
![Page 31: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/31.jpg)
Comparing against Past Scores
Past scores, sorted from high to low (further scores elided): 0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1
minimum = 2
average = 3
Marked positions: 30th, 31st
# copies: 3
cutoff: 30th highest score
![Page 32: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/32.jpg)
Comparing against Past Scores
Past scores, sorted from high to low (further scores elided): 0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1
minimum = 2
average = 3
Marked position: 30th
# copies: 4
cutoff: 30th highest score
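One way to read the procedure on these slides is sketched below. It is an interpretation, not the authors' implementation: the function names are hypothetical, the cutoff is taken as the score at the boundary of the top (avg - 1)/(k - 1) fraction of the retained scores, and a partition beyond the minimum gets a copy only if its score reaches that cutoff. The window of past scores is a toy list, far smaller than the one implied by a "30th highest score" cutoff.

```python
import math

def cutoff_score(past_scores, average, k):
    """Score at the boundary of the top (average - 1)/(k - 1) fraction of past scores."""
    ranked = sorted(past_scores, reverse=True)
    rank = math.ceil(len(ranked) * (average - 1) / (k - 1))
    rank = max(1, min(rank, len(ranked)))  # keep the rank inside the window
    return ranked[rank - 1]

def choose_copies(partition_scores, past_scores, minimum, average):
    """Return partition indices (0-based) that should hold a copy of the vertex."""
    k = len(partition_scores)
    order = sorted(range(k), key=lambda p: partition_scores[p], reverse=True)
    chosen = order[:minimum]  # minimum requirement: best-scoring partitions
    cutoff = cutoff_score(past_scores, average, k)
    # Extra copies only where the score is high relative to recent history.
    chosen += [p for p in order[minimum:] if partition_scores[p] >= cutoff]
    return chosen

scores_for_a = [0.1, 0.2, 0.3, 0.4]  # partitions 1..4 at indices 0..3
toy_window = [0.9, 0.87, 0.4, 0.3, 0.29, 0.22, 0.2, 0.11, 0.1]
print(choose_copies(scores_for_a, toy_window, minimum=2, average=3))  # [3, 2]
```

With this toy window the cutoff is 0.22, so only the two copies required by the minimum are placed. A window of lower past scores would lower the cutoff and admit more copies.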
![Page 33: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/33.jpg)
Outline
Background and Motivation
Leopard
Experiments
![Page 34: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/34.jpg)
Experiment Setup
● Comparison points
○ Leopard with FENNEL heuristics
○ One-pass FENNEL (no vertex reassignment)
○ METIS (static graphs)
○ ParMETIS (repartitioning for dynamic graphs)
○ Hash Partitioning
● Graph Datasets
○ Type: social graphs, collaboration graphs, Web graphs, email graphs, and synthetic graphs
○ Size: up to 66 million vertices and 1.8 billion edges
![Page 35: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/35.jpg)
Edge Cut
![Page 36: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/36.jpg)
Computation Skipping
![Page 37: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/37.jpg)
Effect of Replication on Edge Cut
![Page 38: Leopard: Lightweight Partitioning and Replication for Dynamic Graphs](https://reader035.vdocuments.us/reader035/viewer/2022070602/587c18b11a28abb5068b4bd9/html5/thumbnails/38.jpg)
Thanks!
Q & A