CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin
![Page 1: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/1.jpg)
Large-Scale Machine Learning and Graphs
Yucheng Low
![Page 2: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/2.jpg)
Phase 1: POSSIBILITY
Benz Patent Motorwagen (1886)
![Page 3: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/3.jpg)
Phase 2: SCALABILITY
Model T Ford (1908)
![Page 4: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/4.jpg)
Phase 3: USABILITY
![Page 5: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/5.jpg)
Possibility
Graph+ Scalability
Usability
![Page 6: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/6.jpg)
How will we design and implement
parallel learning systems?
The Big Question of Big Learning
![Page 7: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/7.jpg)
MapReduce for Data-Parallel ML
Excellent for large data-parallel tasks!
Data-Parallel (MapReduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics
Is there more to Machine Learning? Graph-Parallel:
• Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.
• Semi-Supervised Learning: Label Propagation, CoEM
• Graph Analysis: PageRank, Triangle Counting
• Collaborative Filtering: Tensor Factorization
![Page 8: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/8.jpg)
Semi-Supervised & Transductive Learning
Estimate Political Bias
[Figure: graph of posts; a few are labeled Liberal or Conservative and the rest are unlabeled (?)]
![Page 9: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/9.jpg)
Flashback to 1998
First Google advantage: a Graph Algorithm & a System to Support it!
![Page 10: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/10.jpg)
The Power of Dependencies
where the value is!
![Page 11: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/11.jpg)
It’s all about the graphs…
![Page 12: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/12.jpg)
Social Media, Advertising, Science, Web
• Graphs encode the relationships between: People, Facts, Products, Interests, Ideas
• Big: 100s of billions of vertices and edges and rich metadata
• Facebook (10/2012): 1B users, 144B friendships
• Twitter (2011): 15B follower edges
![Page 13: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/13.jpg)
Examples of Graphs in
Machine Learning
![Page 14: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/14.jpg)
Collaborative Filtering: Exploiting Dependencies
City of God
Wild Strawberries
The Celebration
La Dolce Vita
Women on the Verge of a Nervous Breakdown
Latent Factor Models
Matrix Completion/Factorization Models
![Page 15: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/15.jpg)
Topic Modeling
Cat
Apple
Growth
Hat
Plant
Latent Dirichlet Allocation, etc.
![Page 16: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/16.jpg)
Example Topics Discovered from Wikipedia
![Page 17: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/17.jpg)
Machine Learning Pipeline
Data: images, docs, movie ratings, social activity
Extract Features
Graph Formation
Structured Machine Learning Algorithm
Value from Data: face labels, doc topics, movie recommendations, sentiment analysis
![Page 18: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/18.jpg)
ML Tasks Beyond Data-Parallelism
Data-Parallel (MapReduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics
Graph-Parallel:
• Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.
• Semi-Supervised Learning: Label Propagation, CoEM
• Graph Analysis: PageRank, Triangle Counting
• Collaborative Filtering: Tensor Factorization
![Page 19: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/19.jpg)
Example of a Graph-Parallel Algorithm
![Page 20: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/20.jpg)
PageRank
What’s the rank of this user?
Rank?
Depends on rank of who follows her
Depends on rank of who follows them…
Loops in graph → Must iterate!
![Page 21: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/21.jpg)
PageRank Iteration
Iterate until convergence:
$$R[i] = \alpha + (1 - \alpha) \sum_{(j,i) \in E} w_{ji} \, R[j]$$
• α is the random reset probability
• w_ji is the probability of transitioning (similarity) from j to i
“My rank is weighted average of my friends’ ranks”
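The iteration on this slide can be sketched in a few lines of Python. This is a minimal synchronous version and not GraphLab code: the `in_edges` dictionary layout, the function name, and the choice of w_ji = 1/outdegree(j) for the toy graph are assumptions made for illustration.

```python
def pagerank(in_edges, alpha=0.15, tol=1e-10, max_iters=1000):
    """Iterate R[i] = alpha + (1 - alpha) * sum_j w_ji * R[j] until convergence.

    in_edges maps each vertex i to a list of (j, w_ji) pairs, one per edge (j, i).
    """
    rank = {v: 1.0 for v in in_edges}
    for _ in range(max_iters):
        # Synchronous sweep: every new value is computed from the old ranks.
        new_rank = {
            i: alpha + (1 - alpha) * sum(w * rank[j] for j, w in nbrs)
            for i, nbrs in in_edges.items()
        }
        delta = max(abs(new_rank[v] - rank[v]) for v in rank)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Toy graph (hypothetical): a->b, a->c, b->c, c->a, with w_ji = 1/outdegree(j).
toy_in_edges = {
    "a": [("c", 1.0)],
    "b": [("a", 0.5)],
    "c": [("a", 0.5), ("b", 1.0)],
}
ranks = pagerank(toy_in_edges)
```

Because the graph contains a loop (a → c → a), no single pass yields the answer, which is why the slide says the computation must iterate.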
![Page 22: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/22.jpg)
Properties of Graph Parallel Algorithms
Dependency Graph
Iterative Computation
My Rank
Friends Rank
Local Updates
![Page 23: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/23.jpg)
The Need for a New Abstraction
Data-Parallel (MapReduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics
Graph-Parallel:
• Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.
• Semi-Supervised Learning: Label Propagation, CoEM
• Data-Mining: PageRank, Triangle Counting
• Collaborative Filtering: Tensor Factorization
Need: Asynchronous, Dynamic Parallel Computations
![Page 24: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/24.jpg)
The GraphLab Goals
Efficient parallel predictions
Know how to solve ML problem
on 1 machine
![Page 25: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/25.jpg)
POSSIBILITY
![Page 26: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/26.jpg)
Data Graph
Data associated with vertices and edges
Graph:
• Social Network
Vertex Data:
• User profile text
• Current interests estimates
Edge Data:
• Similarity weights
![Page 27: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/27.jpg)
How do we program graph computation?
“Think like a Vertex.” - Malewicz et al. [SIGMOD’10]
![Page 28: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/28.jpg)
Update Functions
User-defined program: applied to a vertex, transforms data in the scope of the vertex
pagerank(i, scope) {
  // Get neighborhood data
  (R[i], w_ji, R[j]) ← scope;
  // Update the vertex data
  R[i] ← α + (1 − α) Σ_{j ∈ N[i]} w_ji × R[j];
  // Reschedule neighbors if needed
  if R[i] changes then
    reschedule_neighbors_of(i);
}
Dynamic computation
Update function applied (asynchronously) in parallel until convergence
Many schedulers available to prioritize computation
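To illustrate the dynamic scheduling idea, here is a single-threaded Python sketch in which a FIFO queue stands in for the scheduler. It is not the GraphLab engine: the queue discipline, the `tol` threshold, and the dictionary-based graph layout are all assumptions, and real GraphLab runs update functions in parallel under a consistency model.

```python
from collections import deque

def dynamic_pagerank(in_edges, out_nbrs, alpha=0.15, tol=1e-9):
    """Apply the update function only to scheduled vertices until none remain."""
    rank = {v: 1.0 for v in in_edges}
    queue = deque(in_edges)      # every vertex is scheduled once at the start
    queued = set(in_edges)
    updates = 0
    while queue:
        i = queue.popleft()
        queued.discard(i)
        # Update function: reads the scope of i and writes R[i] in place.
        new = alpha + (1 - alpha) * sum(w * rank[j] for j, w in in_edges[i])
        changed = abs(new - rank[i]) > tol
        rank[i] = new
        updates += 1
        if changed:
            # Reschedule the neighbors that depend on R[i].
            for k in out_nbrs[i]:
                if k not in queued:
                    queue.append(k)
                    queued.add(k)
    return rank, updates

# Toy graph (hypothetical): a->b, a->c, b->c, c->a, with w_ji = 1/outdegree(j).
dyn_in_edges = {"a": [("c", 1.0)], "b": [("a", 0.5)], "c": [("a", 0.5), ("b", 1.0)]}
dyn_out_nbrs = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
dyn_ranks, dyn_updates = dynamic_pagerank(dyn_in_edges, dyn_out_nbrs)
```

Vertices whose rank has stopped changing drop out of the queue, so work concentrates on the parts of the graph that are still converging.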
![Page 29: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/29.jpg)
The GraphLab Framework
Scheduler
Consistency Model
Graph Based Data Representation
Update Functions: User Computation
![Page 30: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/30.jpg)
Bayesian Tensor Factorization
Gibbs Sampling
Dynamic Block Gibbs Sampling
Matrix Factorization
Lasso
SVM
Belief Propagation
PageRank
CoEM
K-Means
SVD
LDA
Linear Solvers
Splash Sampler
Alternating Least Squares
…Many others…
![Page 31: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/31.jpg)
Never Ending Learner Project (CoEM)
Hadoop 95 Cores 7.5 hrs
Distributed GraphLab
32 EC2 machines
80 secs
0.3% of Hadoop time
2 orders of mag faster → 2 orders of mag cheaper
![Page 32: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/32.jpg)
• ML algorithms as vertex programs
• Asynchronous execution and consistency models
![Page 33: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/33.jpg)
Thus far…
GraphLab 1 provided exciting scaling performance
But…
We couldn’t scale up to Altavista Webgraph 2002: 1.4B vertices, 6.7B edges
![Page 34: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/34.jpg)
Natural Graphs
[Image from WikiCommons]
![Page 35: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/35.jpg)
Problem: Existing distributed graph computation systems perform poorly on Natural Graphs
![Page 36: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/36.jpg)
Achilles Heel: Idealized Graph Assumption
Assumed… Small degree → Easy to partition
But, Natural Graphs… Many high degree vertices (power-law degree distribution) → Very hard to partition
![Page 37: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/37.jpg)
Power-Law Degree Distribution
[Figure: log-log plot of Number of Vertices (count) versus Degree for the AltaVista WebGraph: 1.4B Vertices, 6.6B Edges]
High-Degree Vertices: 1% of vertices adjacent to 50% of edges
![Page 38: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/38.jpg)
High Degree Vertices are Common
• “Social” People (Obama)
• Popular Movies (Netflix: Users, Movies)
• Common Words (LDA: Docs, Words)
• Hyper Parameters (LDA: α, β)
[Figure: LDA plate diagram with hyperparameters α, β and per-document variables θ, Z, w]
![Page 39: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/39.jpg)
Power-Law Degree Distribution: “Star Like” Motif
President Obama Followers
![Page 40: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/40.jpg)
Problem: High Degree Vertices → High Communication for Distributed Updates
[Figure: a high-degree vertex Y whose edges span Machine 1 and Machine 2]
Data transmitted across network: O(# cut edges)
Natural graphs do not have low-cost balanced cuts [Leskovec et al. 08, Lang 04]
Popular partitioning tools (Metis, Chaco,…) perform poorly [Abou-Rjeili et al. 06]: extremely slow and require substantial memory
![Page 41: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/41.jpg)
Random Partitioning
• GraphLab 1, Pregel, Twitter, Facebook,… all rely on random (hashed) partitioning for natural graphs
Figure 4: (a) An edge-cut and (b) a vertex-cut of a graph into three parts. Shaded vertices are ghosts and mirrors, respectively.
5 Distributed Graph Placement
The PowerGraph abstraction relies on the distributed data-graph to store the computation state and encode the interaction between vertex programs. The placement of the data-graph structure and data plays a central role in minimizing communication and ensuring work balance.
A common approach to placing a graph on a cluster of p machines is to construct a balanced p-way edge-cut (e.g., Fig. 4a) in which vertices are evenly assigned to machines and the number of edges spanning machines is minimized. Unfortunately, the tools [21, 31] for constructing balanced edge-cuts perform poorly [1, 26, 23] or even fail on power-law graphs. When the graph is difficult to partition, both GraphLab and Pregel resort to hashed (random) vertex placement. While fast and easy to implement, hashed vertex placement cuts most of the edges:
Theorem 5.1. If vertices are randomly assigned to p machines then the expected fraction of edges cut is:
$$\mathbb{E}\left[\frac{|\text{Edges Cut}|}{|E|}\right] = 1 - \frac{1}{p} \tag{5.1}$$
For example, if just two machines are used, half of the edges will be cut, requiring order |E|/2 communication.
5.1 Balanced p-way Vertex-Cut
The PowerGraph abstraction enables a single vertex program to span multiple machines. Hence, we can ensure work balance by evenly assigning edges to machines. Communication is minimized by limiting the number of machines a single vertex spans. A balanced p-way vertex-cut formalizes this objective by assigning each edge e ∈ E to a machine A(e) ∈ {1, …, p}. Each vertex then spans the set of machines A(v) ⊆ {1, …, p} that contain its adjacent edges. We define the balanced vertex-cut objective:
$$\min_{A} \; \frac{1}{|V|} \sum_{v \in V} |A(v)| \tag{5.2}$$
$$\text{s.t.} \quad \max_{m} \, \bigl|\{e \in E \mid A(e) = m\}\bigr| < \lambda \frac{|E|}{p} \tag{5.3}$$
where the imbalance factor λ ≥ 1 is a small constant. We use the term replicas of a vertex v to denote the |A(v)| copies of the vertex v: each machine in A(v) has a replica of v. The objective term (Eq. 5.2) therefore minimizes the average number of replicas in the graph and as a consequence the total storage and communication requirements of the PowerGraph engine.
Vertex-cuts address many of the major issues associated with edge-cuts in power-law graphs. Percolation theory [3] suggests that power-law graphs have good vertex-cuts. Intuitively, by cutting a small fraction of the very high degree vertices we can quickly shatter a graph. Furthermore, because the balance constraint (Eq. 5.3) ensures that edges are uniformly distributed over machines, we naturally achieve improved work balance even in the presence of very high-degree vertices.
The simplest method to construct a vertex cut is to randomly assign edges to machines. Random (hashed) edge placement is fully data-parallel, achieves nearly perfect balance on large graphs, and can be applied in the streaming setting. In the following we relate the expected normalized replication factor (Eq. 5.2) to the number of machines and the power-law constant α.
Theorem 5.2 (Randomized Vertex Cuts). Let D[v] denote the degree of vertex v. A uniform random edge placement on p machines has an expected replication factor
$$\mathbb{E}\left[\frac{1}{|V|} \sum_{v \in V} |A(v)|\right] = \frac{p}{|V|} \sum_{v \in V} \left(1 - \left(1 - \frac{1}{p}\right)^{D[v]}\right). \tag{5.4}$$
For a graph with power-law constant α we obtain:
$$\mathbb{E}\left[\frac{1}{|V|} \sum_{v \in V} |A(v)|\right] = p - p \, \mathrm{Li}_{\alpha}\!\left(\frac{p-1}{p}\right) \Big/ \zeta(\alpha) \tag{5.5}$$
where Li_α(x) is the transcendental polylog function and ζ(α) is the Riemann Zeta function (plotted in Fig. 5a).
Higher α values imply a lower replication factor, confirming our earlier intuition. In contrast to a random 2-way edge-cut, which requires order |E|/2 communication, a random 2-way vertex-cut on an α = 2 power-law graph requires only order 0.3 |V| communication, a substantial savings on natural graphs where E can be an order of magnitude larger than V (see Tab. 1a).
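The "0.3 |V|" figure can be checked numerically. The sketch below evaluates Eq. 5.4 for an explicit degree list and Eq. 5.5 with truncated series for the polylog and zeta functions. The function names and the truncation length are assumptions, and Eq. 5.5 is read here as assuming degrees distributed as P(d) ∝ d^(−α) for d ≥ 1.

```python
def expected_replication(degrees, p):
    """Eq. 5.4: expected replicas per vertex under uniform random edge placement."""
    return sum(p * (1 - (1 - 1 / p) ** d) for d in degrees) / len(degrees)

def power_law_replication(alpha, p, terms=200_000):
    """Eq. 5.5, using truncated series for Li_alpha((p-1)/p) and zeta(alpha)."""
    x = (p - 1) / p
    polylog = sum(x ** k / k ** alpha for k in range(1, terms + 1))
    zeta = sum(1 / k ** alpha for k in range(1, terms + 1))
    return p - p * polylog / zeta

small_graph = expected_replication([1, 1, 2], p=2)   # two leaves and one degree-2 vertex
two_way = power_law_replication(alpha=2, p=2)        # about 1.29 replicas per vertex
extra_copies_per_vertex = two_way - 1                # about 0.3
```

With about 1.29 replicas per vertex, only about 0.29 extra copies per vertex need to be kept in sync, which matches the order 0.3 |V| communication quoted above.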
5.2 Greedy Vertex-Cuts
We can improve upon the randomly constructed vertex-cut by de-randomizing the edge-placement process. The resulting algorithm is a sequential greedy heuristic which places the next edge on the machine that minimizes the conditional expected replication factor. To construct the de-randomization we consider the task of placing the i+1 edge after having placed the previous i edges. Using the conditional expectation we define the objective:
$$\arg\min_{k} \; \mathbb{E}\left[\sum_{v \in V} |A(v)| \;\middle|\; A_i, \, A(e_{i+1}) = k\right] \tag{5.6}$$
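As a rough illustration of sequential greedy placement, the sketch below puts each edge on the machine that creates the fewest new replicas, breaking ties by load, subject to a balance bound in the style of Eq. 5.3. This is a simplified stand-in written for this example, not PowerGraph's actual placement rules: the cost function, the `lam` parameter, the fallback, and the need to know |E| in advance are all assumptions.

```python
from collections import defaultdict

def greedy_vertex_cut(edges, p, lam=1.0):
    """Place each edge on one of p machines, one edge at a time."""
    capacity = lam * len(edges) / p   # balance bound (<= used here so exact balance is allowed)
    replicas = defaultdict(set)       # A(v): machines holding a replica of v
    load = [0] * p                    # number of edges on each machine
    placement = {}
    for u, v in edges:
        # Machines that can take one more edge without exceeding the bound.
        open_machines = [m for m in range(p) if load[m] + 1 <= capacity]
        if not open_machines:         # fallback: least-loaded machine
            open_machines = [min(range(p), key=load.__getitem__)]
        # Fewest new replicas first, then lightest load.
        best = min(
            open_machines,
            key=lambda m: ((m not in replicas[u]) + (m not in replicas[v]), load[m]),
        )
        placement[(u, v)] = best
        replicas[u].add(best)
        replicas[v].add(best)
        load[best] += 1
    return placement, replicas, load

# A star graph: hub 0 connected to 40 leaves, split across 4 machines.
star = [(0, leaf) for leaf in range(1, 41)]
placement, replicas, load = greedy_vertex_cut(star, p=4)
replication_factor = sum(len(s) for s in replicas.values()) / len(replicas)
```

On the star, only the hub is replicated (once per machine) while every leaf lives on a single machine, which is the behavior a vertex-cut is meant to produce for high-degree vertices.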
For p Machines:
10 Machines → 90% of edges cut
100 Machines → 99% of edges cut!
All data is communicated… Little advantage over MapReduce
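The 90% and 99% figures follow directly from Theorem 5.1 (1 − 1/p). A small Python simulation, assuming a uniformly random graph and Python's `random` module as the hash function, gives the same answer empirically; the graph size and seeds are arbitrary choices for illustration.

```python
import random

def expected_cut_fraction(p):
    """Theorem 5.1: expected fraction of edges cut by random vertex placement."""
    return 1 - 1 / p

def simulated_cut_fraction(edges, p, seed=0):
    """Assign each vertex to a random machine and count edges that span machines."""
    rng = random.Random(seed)
    machine = {}
    for u, v in edges:
        for x in (u, v):
            if x not in machine:
                machine[x] = rng.randrange(p)
    cut = sum(machine[u] != machine[v] for u, v in edges)
    return cut / len(edges)

# Hypothetical test graph: 20,000 random edges over 2,000 vertices, no self-loops.
graph_rng = random.Random(1)
random_edges = []
while len(random_edges) < 20000:
    u, v = graph_rng.randrange(2000), graph_rng.randrange(2000)
    if u != v:
        random_edges.append((u, v))

cut_10 = simulated_cut_fraction(random_edges, p=10)
```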
![Page 42: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/42.jpg)
In Summary
GraphLab 1 and Pregel are not well suited for natural graphs
• Poor performance on high-degree vertices
• Low Quality Partitioning
![Page 43: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/43.jpg)
SCALABILITY
![Page 44: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/44.jpg)
Common Pattern for Update Fncs.
Gather Information About Neighborhood
Apply Update to Vertex
Scatter Signal to Neighbors & Modify Edge Data
GraphLab_PageRank(i)
  // Compute sum over neighbors
  total = 0
  foreach (j in in_neighbors(i)):
    total = total + R[j] * wji
  // Update the PageRank
  R[i] = 0.1 + total
  // Trigger neighbors to run again
  if R[i] not converged then
    foreach (j in out_neighbors(i)):
      signal vertex-program on j
R[i]
R[j] wji
![Page 45: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/45.jpg)
GAS Decomposition
Gather (Reduce): Accumulate information about neighborhood (Parallel “Sum”: Y + … + Y → Σ)
Apply: Apply the accumulated value to center vertex (Y, Σ → Y’)
Scatter: Update adjacent edges and vertices.
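The three phases can be sketched as separate Python functions driven by a tiny single-threaded loop. This is an illustrative toy, not the PowerGraph engine: the function signatures, the active-set loop, and the 0.15/0.85 PageRank constants (taken from later slides) are assumptions made for this example.

```python
from functools import reduce

def gas_pagerank(in_edges, out_nbrs, tol=1e-9):
    rank = {v: 1.0 for v in in_edges}

    def gather(j, w_ji):        # runs once per in-edge (j, i)
        return rank[j] * w_ji

    def merge(a, b):            # commutative, associative "sum" of gather results
        return a + b

    def apply(i, acc):          # writes the accumulated value to the center vertex
        old = rank[i]
        rank[i] = 0.15 + 0.85 * acc
        return abs(rank[i] - old)

    def scatter(i, change):     # signals out-neighbors only if i has not converged
        return out_nbrs[i] if change > tol else []

    active = set(in_edges)
    while active:
        i = active.pop()
        acc = reduce(merge, (gather(j, w) for j, w in in_edges[i]), 0.0)
        active.update(scatter(i, apply(i, acc)))
    return rank

# Toy graph (hypothetical): a->b, a->c, b->c, c->a, with w_ji = 1/outdegree(j).
gas_in_edges = {"a": [("c", 1.0)], "b": [("a", 0.5)], "c": [("a", 0.5), ("b", 1.0)]}
gas_out_nbrs = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
gas_ranks = gas_pagerank(gas_in_edges, gas_out_nbrs)
```

Because `merge` is commutative and associative, the gather phase for one vertex can be split across machines and combined afterwards, which is what makes the decomposition useful for high-degree vertices.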
![Page 46: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/46.jpg)
Many ML Algorithms fit into GAS Model
graph analytics, inference in graphical models, matrix factorization, collaborative filtering, clustering, LDA, …
![Page 47: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/47.jpg)
Minimizing Communication in GL2 PowerGraph: Vertex Cuts
Communication linear in # spanned machines
A vertex-cut minimizes # machines per vertex
Percolation theory suggests Power Law graphs can be split by removing only a small set of vertices [Albert et al. 2000]
→ Small vertex cuts possible!
GL2 PowerGraph includes novel vertex cut algorithms
Provides order of magnitude gains in performance
![Page 48: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/48.jpg)
From the Abstraction to a System
![Page 49: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/49.jpg)
Triangle Counting on Twitter Graph
34.8 Billion Triangles
Hadoop [WWW’11]: 1636 Machines, 423 Minutes
GraphLab2: 64 Machines, 15 Seconds
Why? Wrong Abstraction → Broadcast O(degree²) messages per Vertex
S. Suri and S. Vassilvitskii, “Counting triangles and the curse of the last reducer,” WWW’11
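For contrast with broadcasting O(degree²) messages per vertex, one common way to count triangles is to intersect the neighbor sets at the two endpoints of each edge. The Python sketch below shows that idea on a single machine; whether the GraphLab toolkit uses exactly this formulation is an assumption, and the function name is hypothetical.

```python
def count_triangles(edges):
    """Count triangles by intersecting neighbor sets across each undirected edge."""
    nbrs = {}
    undirected = set()
    for u, v in edges:
        if u == v:
            continue                      # ignore self-loops
        undirected.add((min(u, v), max(u, v)))
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    # A common neighbor of u and v closes a triangle on edge (u, v).
    per_edge = sum(len(nbrs[u] & nbrs[v]) for u, v in undirected)
    return per_edge // 3                  # each triangle is seen from its 3 edges

k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]   # complete graph on 4 vertices
k4_triangles = count_triangles(k4)
```

Each vertex only needs its own neighbor set, and the work per edge is a set intersection, so the data exchanged is proportional to the adjacency lists rather than to the square of the degree.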
![Page 50: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/50.jpg)
Topic Modeling (LDA)
• English language Wikipedia: 2.6M Documents, 8.3M Words, 500M Tokens
• Computationally intensive algorithm
[Figure: bar chart of throughput in Million Tokens Per Second, axis from 0 to 160]
Smola et al.: 100 Yahoo! Machines, specifically engineered for this task
GL2 PowerGraph: 64 cc2.8xlarge EC2 Nodes, 200 lines of code & 4 human hours
![Page 51: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/51.jpg)
How well does GraphLab scale?
Yahoo Altavista Web Graph (2002): One of the largest publicly available webgraphs
1.4B Webpages, 6.7 Billion Links
64 HPC Nodes
7 seconds per iter. 1B links processed per second
30 lines of user code
![Page 52: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/52.jpg)
GraphChi: Going small with GraphLab
Solve huge problems on small or embedded devices?
Key: Exploit non-volatile memory (starting with SSDs and HDs)
![Page 53: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/53.jpg)
GraphChi: disk-based GraphLab
Challenge: Random Accesses
Novel GraphChi solution: Parallel sliding windows method → minimizes number of random accesses
![Page 54: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/54.jpg)
Triangle Counting on Twitter Graph
40M Users, 1.2B Edges
Total: 34.8 Billion Triangles
Hadoop: 1636 Machines, 423 Minutes
GraphLab2: 64 Machines, 1024 Cores, 15 Seconds
GraphChi: 59 Minutes, 1 Mac Mini!
Hadoop results from [Suri & Vassilvitskii '11]
![Page 55: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/55.jpg)
• ML algorithms as vertex programs
• Asynchronous execution and consistency models
• Natural graphs change the nature of computation
• Vertex cuts and gather/apply/scatter model
![Page 56: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/56.jpg)
GL2 PowerGraph focused on Scalability at the loss of Usability
![Page 57: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/57.jpg)
GraphLab 1
Explicitly described operations
PageRank(i, scope) {
  acc = 0
  for (j in InNeighbors) {
    acc += pr[j] * edge[j].weight
  }
  pr[i] = 0.15 + 0.85 * acc
}
Code is intuitive
![Page 58: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/58.jpg)
GraphLab 1: Explicitly described operations
PageRank(i, scope) {
  acc = 0
  for (j in InNeighbors) {
    acc += pr[j] * edge[j].weight
  }
  pr[i] = 0.15 + 0.85 * acc
}
Code is intuitive
GL2 PowerGraph: Implicit operation, implicit aggregation
gather(edge) {
  return edge.source.value * edge.weight
}
merge(acc1, acc2) {
  return acc1 + acc2
}
apply(v, accum) {
  v.pr = 0.15 + 0.85 * accum
}
Need to understand engine to understand code
![Page 59: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/59.jpg)
What now?
Great flexibility, but hit scalability wall
Scalability, but very rigid abstraction (many contortions needed to implement SVD++, Restricted Boltzmann Machines)
![Page 60: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/60.jpg)
USABILITY
![Page 61: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/61.jpg)
Machine 1 Machine 2
GL3 WarpGraph Goals
Program Like GraphLab 1
Run Like GraphLab 2
![Page 62: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/62.jpg)
Fine-Grained Primitives
Expose Neighborhood Operations through Parallel Iterators
$$R[i] = 0.15 + 0.85 \sum_{(j,i) \in E} w[j,i] \cdot R[j]$$
PageRankUpdateFunction(Y) {
  Y.pagerank = 0.15 + 0.85 *
    MapReduceNeighbors(
      lambda nbr: nbr.pagerank * nbr.weight,
      lambda (a, b): a + b
    )
}
(aggregate sum over neighbors)
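A runnable Python 3 approximation of this update function is sketched below. `map_reduce_neighbors`, the `Neighbor` dataclass, and the `initial` argument are hypothetical stand-ins written for this example and are not the WarpGraph API; the real primitive runs the map and combine steps in parallel across machines.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass
class Neighbor:
    pagerank: float   # R[j]
    weight: float     # w[j, i] on the edge from j into the center vertex

def map_reduce_neighbors(neighbors, map_fn, combine_fn, initial=0.0):
    """Sequential stand-in for a parallel map-then-combine over a vertex's neighbors."""
    return reduce(combine_fn, map(map_fn, neighbors), initial)

def pagerank_update(in_neighbors):
    """R[i] = 0.15 + 0.85 * sum over in-neighbors of w[j, i] * R[j]."""
    return 0.15 + 0.85 * map_reduce_neighbors(
        in_neighbors,
        lambda nbr: nbr.pagerank * nbr.weight,
        lambda a, b: a + b,
    )

new_rank = pagerank_update([Neighbor(1.0, 0.5), Neighbor(2.0, 0.25)])
isolated_rank = pagerank_update([])
```

The update function reads like the sequential GraphLab 1 code, while the neighborhood sum is delegated to a primitive that the engine is free to parallelize.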
![Page 63: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/63.jpg)
Expressive, Extensible Neighborhood API
• MapReduce over Neighbors: Parallel Sum (Y + Y + … + Y)
• Parallel Transform Adjacent Edges: Modify adjacent edges
• Broadcast: Schedule a selected subset of adjacent vertices
![Page 64: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/64.jpg)
Can express every GL2 PowerGraph program (more easily) in GL3 WarpGraph
But GL3 is more expressive:
• Multiple gathers
• Scatter before gather
• Conditional execution
UpdateFunction(v) {
  if (v.data == 1)
    accum = MapReduceNeighs(g, m)
  else ...
}
![Page 65: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/65.jpg)
Graph Coloring
Twitter Graph: 41M Vertices, 1.4B Edges
32 Nodes x 16 Cores (EC2 HPC cc2.8x)
GL2 PowerGraph: 227 seconds
GL3 WarpGraph: 60 seconds (3.8x Faster)
WarpGraph outperforms PowerGraph with simpler code
![Page 66: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/66.jpg)
• ML algorithms as vertex programs
• Asynchronous execution and consistency models
• Natural graphs change the nature of computation
• Vertex cuts and gather/apply/scatter model
• Usability is key
• Access neighborhood through parallelizable iterators and latency hiding
![Page 67: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/67.jpg)
Usability for Whom???
… WarpGraph PowerGraph
![Page 68: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/68.jpg)
Machine Learning PHASE 3 (part 2)
USABILITY
![Page 69: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/69.jpg)
Exciting Time to Work in ML
With Big Data, I’ll take over the world!!!
We met because of Big Data
Why won’t Big Data read my mind???
Unique opportunities to change the world!! ☺
But every deployed system is a one-off solution, and requires PhDs to make work…
![Page 70: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/70.jpg)
ML key to any new service we want to build
But…
Even basics of scalable ML can be challenging
6 months from R/Matlab to production, at best
State-of-art ML algorithms trapped in research papers
Goal of GraphLab 3: Make huge-‐scale machine learning accessible to all!
![Page 71: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/71.jpg)
Step 0 : Learn ML With GraphLab Notebook
![Page 72: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/72.jpg)
Step 1: pip install graphlab, prototype on local machine
GraphLab + Python
![Page 73: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/73.jpg)
Step 2 : scale to full dataset in the cloud with minimal code changes
GraphLab + Python
![Page 74: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/74.jpg)
Step 3: deploy in production
GraphLab + Python
![Page 75: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/75.jpg)
Step 4: ???
GraphLab + Python
![Page 76: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/76.jpg)
Step 4: Profit
GraphLab + Python
![Page 77: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/77.jpg)
GraphLab Toolkits
Highly scalable, state-of-the-art machine learning methods… all accessible from python
• Graph Analytics
• Graphical Models
• Computer Vision
• Clustering
• Topic Modeling
• Collaborative Filtering
![Page 78: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/78.jpg)
Now with GraphLab: Learn/Prototype/Deploy
Even basics of scalable ML can be challenging
6 months from R/Matlab to production, at best
State-of-art ML algorithms trapped in research papers
Learn ML with GraphLab Notebook
pip install graphlab, then deploy on EC2/Cluster
Fully integrated via GraphLab Toolkits
![Page 79: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/79.jpg)
We’re selecting strategic partners
Help define our strategy & priorities
And, get the value of GraphLab in your company
![Page 80: CC-4007, Large-Scale Machine Learning on Graphs, by Yucheng Low, Joseph Gonzalez and Carlos Guestrin](https://reader033.vdocuments.us/reader033/viewer/2022042623/5453d781af79599f5c8b9557/html5/thumbnails/80.jpg)
Define our future: [email protected] Needless to say: [email protected]
C++ GraphLab 2.2 available now: graphlab.com
Beta Program: beta.graphlab.com
Follow us on Twitter: @graphlabteam