CS224W: Analysis of Networks
Jure Leskovec, Stanford University
http://cs224w.stanford.edu
9/29/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, cs224w.stanford.edu
Key network properties:
- Degree distribution: P(k)
- Path length: h
- Clustering coefficient: C
- Connected components: s

Definitions will be presented for undirected graphs; sometimes we will explicitly mention extensions to directed graphs, and sometimes the extensions will be obvious.
- Degree distribution P(k): probability that a randomly chosen node has degree k.
- Normalized histogram: let N_k = number of nodes with degree k; then P(k) = N_k / N, which we plot as a histogram over k.

[Figure: bar plot of P(k) versus k for a small example graph.]

- For directed graphs we have separate in- and out-degree distributions.
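The normalized histogram above is straightforward to compute. A minimal sketch, assuming the graph is stored as a dict mapping each node to its set of neighbors (a hypothetical representation, not code from the course):

```python
from collections import Counter

def degree_distribution(adj):
    """Normalized degree histogram: P(k) = N_k / N."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: nk / n for k, nk in counts.items()}

# Tiny example: a path graph 0-1-2-3
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(degree_distribution(adj))  # {1: 0.5, 2: 0.5}
```

For a directed graph one would run the same histogram twice, once over out-neighbor sets and once over in-neighbor sets.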
- A path is a sequence of nodes in which each node is linked to the next one.
- A path can intersect itself and pass through the same edge multiple times, e.g. the node sequence A-C-B-D-C-D-E-G.
- A path can be written as a node sequence P_n = {i_0, i_1, i_2, ..., i_n} or, equivalently, as an edge sequence P_n = {(i_0, i_1), (i_1, i_2), (i_2, i_3), ..., (i_{n-1}, i_n)}.

[Figure: example graph on nodes A-H used to illustrate paths.]
- Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting them.
  - If the two nodes are not connected, the distance is usually defined as infinite (or sometimes zero).
- In directed graphs, paths need to follow the direction of the arrows.
  - Consequence: distance is not symmetric, h_{B,C} ≠ h_{C,B}.

[Figure: undirected example with h_{B,D} = 2 and h_{A,X} = ∞; directed example with h_{B,C} = 1 but h_{C,B} = 2.]
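Shortest-path distance is exactly what breadth-first search computes. A minimal sketch (the three-node directed graph below is a made-up example chosen to reproduce the asymmetry h_{B,C} = 1, h_{C,B} = 2 from the figure):

```python
from collections import deque

def bfs_distance(adj, source, target):
    """Shortest-path distance (number of edges) via BFS.
    Returns float('inf') if target is unreachable, matching the
    convention above. Works for directed graphs too, since we only
    ever follow u -> v links stored in adj."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == target:
                    return dist[v]
                queue.append(v)
    return float('inf')

# Hypothetical directed graph with edges B->C, C->A, A->B
adj = {'A': {'B'}, 'B': {'C'}, 'C': {'A'}}
print(bfs_distance(adj, 'B', 'C'), bfs_distance(adj, 'C', 'B'))  # 1 2
```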
- Diameter: the maximum (shortest-path) distance between any pair of nodes in a graph.
- Average path length, for a connected graph or a strongly connected directed graph:

  h̄ = (1 / (2·E_max)) · Σ_{i, j≠i} h_ij

  where h_ij is the distance from node i to node j, and E_max is the maximum number of edges (i.e. the total number of node pairs), n(n−1)/2.
  - Many times we compute the average only over the connected pairs of nodes (that is, we ignore "infinite"-length paths).
  - Note that this measure also applies to the (strongly) connected components of a graph.
- Clustering coefficient (for undirected graphs):
  - How connected are i's neighbors to each other?
  - For node i with degree k_i:

    C_i = 2·e_i / (k_i·(k_i − 1))

    where e_i is the number of edges between the neighbors of node i. Note that k_i·(k_i − 1)/2 is the maximum number of edges between the k_i neighbors, so C_i ∈ [0, 1].
  - The clustering coefficient is undefined (or defined to be 0) for nodes with degree 0 or 1.
- Average clustering coefficient:

  C = (1/N) · Σ_i C_i
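A minimal sketch of C_i, again on a dict-of-neighbor-sets representation; the example graph is invented so that node 'D' reproduces the k_D = 4, e_D = 2 case from the next slide:

```python
def clustering(adj, i):
    """C_i = 2*e_i / (k_i*(k_i - 1)); returns 0.0 for degree 0 or 1,
    one common convention for the undefined case."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # e_i: each neighbor-neighbor edge is seen from both endpoints, so halve
    e = sum(1 for u in neighbors for v in adj[u] if v in neighbors) // 2
    return 2 * e / (k * (k - 1))

# Hypothetical graph: D has 4 neighbors with 2 edges among them (A-B, C-E)
adj = {
    'D': {'A', 'B', 'C', 'E'},
    'A': {'D', 'B'}, 'B': {'D', 'A'},
    'C': {'D', 'E'}, 'E': {'D', 'C'},
}
print(clustering(adj, 'D'))  # 2*2/(4*3) = 1/3
```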
- Example (undirected graph on nodes A-H):
  - k_B = 2, e_B = 1, so C_B = 2/2 = 1
  - k_D = 4, e_D = 2, so C_D = 4/12 = 1/3
  - Avg. clustering: C = 0.33
- Size of the largest connected component:
  - The largest set of nodes in which any two vertices can be joined by a path.
  - Largest component = giant component.
- How to find connected components:
  - Start from a random node and perform breadth-first search (BFS).
  - Label the nodes that BFS visits.
  - If all nodes are visited, the network is connected.
  - Otherwise, find an unvisited node and repeat BFS from it.

[Figure: example graph with several connected components.]
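The repeated-BFS procedure above can be sketched directly (dict-of-neighbor-sets representation assumed, as before):

```python
from collections import deque

def connected_components(adj):
    """Repeated BFS: label everything reachable from a start node,
    then restart from any unvisited node until all nodes are covered."""
    visited = set()
    components = []
    for start in adj:
        if start in visited:
            continue
        comp = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    comp.add(v)
                    queue.append(v)
        components.append(comp)
    return components

# Two 2-node components and one isolated node
adj = {1: {2}, 2: {1}, 3: {4}, 4: {3}, 5: set()}
comps = connected_components(adj)
print(max(len(c) for c in comps))  # size of the largest component: 2
```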
Key network properties, measured on real data:
- Degree distribution: P(k)
- Path length: h
- Clustering coefficient: C
- Connected components: s

Case study, MSN Messenger (1 month of activity):
- 245 million users logged in
- 180 million users engaged in conversations
- More than 30 billion conversations
- More than 255 billion exchanged messages
Network: 180M people, 1.3B edges.

[Figure: MSN "contact" and "conversation" networks.]
Messaging as an undirected graph:
- Edge (u, v) if users u and v exchanged at least 1 message
- N = 180 million people
- E = 1.3 billion edges

[Figure: degree distribution of the MSN network, first on linear axes, then the same data with both axes logarithmic.]
- Clustering as a function of degree: let C_k denote the average C_i over nodes i of degree k:

  C_k = (1/N_k) · Σ_{i: k_i = k} C_i

- Avg. clustering of the MSN network: C = 0.1140

[Figure: C_k versus degree k for the MSN network.]
[Figure: distribution of shortest-path lengths between pairs of nodes in the largest connected component.]

Avg. path length: 6.6. 90% of the nodes can be reached in < 8 hops.

Number of nodes reached as we do BFS out of a random node:

Steps | #Nodes
0     | 1
1     | 10
2     | 78
3     | 3,96
4     | 8,648
5     | 3,299,252
6     | 28,395,849
7     | 79,059,497
8     | 52,995,778
9     | 10,321,008
10    | 1,955,007
11    | 518,410
12    | 149,945
13    | 44,616
14    | 13,740
15    | 4,476
16    | 1,542
17    | 536
18    | 167
19    | 71
20    | 29
21    | 16
22    | 10
23    | 3
24    | 2
25    | 3
MSN network summary:
- Degree distribution: heavily skewed; avg. degree = 14.4
- Path length: 6.6
- Clustering coefficient: 0.11
- Connectivity: giant component

Are these values "expected"? Are they "surprising"?
To answer this we need a model!
Another example, a protein-protein interaction network:
- Undirected network: N = 2,018 proteins as nodes, E = 2,930 binding interactions as links
- Degree distribution: skewed; average degree ⟨k⟩ = 2.90
- Path length: avg. path length = 5.8
- Clustering: avg. clustering = 0.12
- Connectivity: 185 components; the largest component has 1,647 nodes (81% of the nodes)
Erdős–Rényi random graphs [Erdős–Rényi, '60]. Two variants:
- G_np: undirected graph on n nodes where each edge (u, v) appears i.i.d. with probability p
- G_nm: undirected graph with n nodes and m edges picked uniformly at random

What kind of networks do such models produce?

- n and p do not uniquely determine the graph!
  - The graph is the result of a random process.
- We can get many different realizations given the same n and p.

[Figure: several realizations of G_np with n = 10, p = 1/6.]
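A minimal sampler for G_np (a sketch, not the course's reference code): flip one biased coin per node pair, so different seeds give different realizations of the same (n, p):

```python
import random

def gnp(n, p, seed=None):
    """Sample one realization of G_np: each of the n*(n-1)/2 possible
    undirected edges appears independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

# Two draws with the same n, p but different seeds:
# generally two different graphs
g1 = gnp(10, 1/6, seed=0)
g2 = gnp(10, 1/6, seed=1)
```

This loops over all n(n−1)/2 pairs, which is fine for small n; for sparse graphs with huge n one would instead sample the number of edges and then sample endpoints.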
- Degree distribution: P(k)
- Path length: h
- Clustering coefficient: C
What are the values of these properties for G_np?

- Fact: the degree distribution of G_np is binomial.
- Let P(k) denote the fraction of nodes with degree k:

  P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)

  where C(n−1, k) counts the ways to select k of the other n−1 nodes as neighbors, p^k is the probability of having those k edges, and (1−p)^(n−1−k) is the probability of missing the remaining n−1−k edges.
Mean and variance of a binomial distribution:

  k̄ = p·(n−1)
  σ² = p·(1−p)·(n−1)

  σ/k̄ = [ (1−p)/p · 1/(n−1) ]^(1/2) ≈ 1/(n−1)^(1/2)

By the law of large numbers, as the network size increases, the distribution becomes increasingly narrow: we are increasingly confident that the degree of a node is in the vicinity of k̄.

[Figure: binomial P(k) concentrated around its mean k̄.]
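These formulas are easy to check empirically. A sketch (my own illustration, not course code) that samples one G_np and compares the empirical degree mean and standard deviation to k̄ = p(n−1) and σ = [p(1−p)(n−1)]^(1/2):

```python
import math
import random

def mean_std_degree(n, p, seed=0):
    """Empirical mean and std of node degrees in one G_np sample."""
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    m = sum(deg) / n
    var = sum((d - m) ** 2 for d in deg) / n
    return m, math.sqrt(var)

n, p = 2000, 0.01
m, s = mean_std_degree(n, p)
print(m, p * (n - 1))                       # empirical vs. theoretical mean
print(s, math.sqrt(p * (1 - p) * (n - 1)))  # empirical vs. theoretical std
```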
Clustering coefficient of G_np:
- Remember: C_i = 2·e_i / (k_i·(k_i − 1)), where e_i is the number of edges between i's neighbors.
- Edges in G_np appear i.i.d. with probability p.
- So the expected E[e_i] is:

  E[e_i] = p · k_i·(k_i − 1)/2

  since k_i·(k_i − 1)/2 is the number of distinct pairs of neighbors of a node i of degree k_i, and each pair is connected with probability p.
- Then:

  E[C_i] = 2·E[e_i] / (k_i·(k_i − 1)) = p = k̄/(n − 1) ≈ k̄/n

- The clustering coefficient of a random graph is small. If we generate bigger and bigger graphs with fixed avg. degree k̄ (that is, we set p = k̄ · 1/n), then C decreases with the graph size n.
G_np so far:
- Degree distribution: P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)
- Clustering coefficient: C = p ≈ k̄/n
- Path length: next!
- Connectivity: after that
- Graph G(V, E) has expansion α if for all S ⊆ V:

  #edges leaving S ≥ α · min(|S|, |V \ S|)

- Or equivalently:

  α = min_{S ⊆ V} ( #edges leaving S / min(|S|, |V \ S|) )

[Figure: a cut separating S from V \ S; a set of S nodes has at least α·S edges leaving it.]

- Fact: in a graph on n nodes with expansion α, for all pairs of nodes there is a path of length O((log n)/α).
- Random graph G_np: for log n > n·p > c, diam(G_np) = O(log n / log(n·p)).
  - Random graphs have good expansion, so it takes a logarithmic number of steps for BFS to visit all nodes.
An Erdős–Rényi random graph can grow very large, but nodes will be just a few hops apart.

[Figure: average shortest path length versus number of nodes (up to 1,000,000). Here n·p is held constant, i.e. the avg. degree k̄ is constant, and the average shortest path grows very slowly with n.]
G_np so far:
- Degree distribution: P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)
- Path length: O(log n)
- Clustering coefficient: C = p ≈ k̄/n
- Connected components: next!
- Graph structure of G_np as p changes, from the empty graph (p = 0) to the complete graph (p = 1):
  - p = 1/(n−1), i.e. avg. degree = 1: the giant component appears; each node has one edge in expectation
  - p = c/(n−1): avg. degree is constant; lots of isolated nodes
  - p = log(n)/(n−1): fewer isolated nodes
  - p = 2·log(n)/(n−1): no isolated nodes
- Emergence of a giant component (avg. degree k̄ = 2E/n, or p = k̄/(n−1)):
  - k̄ = 1 − ε: all components are of size O(log n)
  - k̄ = 1 + ε: one component of size Ω(n); all other components have size O(log n)
Simulation: G_np with n = 100,000 and k̄ = p·(n−1) ranging from 0.5 to 3.

[Figure: fraction of nodes in the largest component versus k̄; the giant component emerges at p·(n−1) = 1.]
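The phase transition is easy to reproduce at small scale. A sketch (my own illustration, using a smaller n than the slide's 100,000 so it runs quickly) that measures the largest-component fraction below and above k̄ = 1:

```python
import random
from collections import deque

def largest_component_fraction(n, k, seed=0):
    """Fraction of nodes in the largest connected component of one
    G_np sample with average degree k, i.e. p = k/(n-1)."""
    p = k / (n - 1)
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    # repeated BFS to find the largest component
    visited, best = set(), 0
    for s in range(n):
        if s in visited:
            continue
        size, queue = 1, deque([s])
        visited.add(s)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    size += 1
                    queue.append(v)
        best = max(best, size)
    return best / n

print(largest_component_fraction(2000, 0.5))  # subcritical: tiny fraction
print(largest_component_fraction(2000, 3.0))  # supercritical: most nodes
```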
Comparing the MSN network (n = 180M, k̄ ≈ 14) with a G_np of the same size and avg. degree:

Property              | MSN            | G_np
Degree distribution   | heavily skewed | binomial
Avg. path length      | 6.6            | O(log n): h ≈ 8.2
Avg. clustering coef. | 0.11           | k̄/n: C ≈ 8·10⁻⁸
Largest conn. comp.   | 99%            | GCC exists when k̄ > 1
- Are real networks like random graphs?
  - Giant connected component: yes
  - Average path length: yes
  - Clustering coefficient: no
  - Degree distribution: no
- Problems with the random network model:
  - The degree distribution differs from that of real networks
  - The giant component in most real networks does NOT emerge through a phase transition
  - No local structure: the clustering coefficient is too low
- Most important: are real networks random? The answer is simply NO!
- If G_np is wrong, why did we spend time on it?
  - It is the reference model for the rest of the class.
  - It will help us calculate many quantities that can then be compared to real data.
  - It will help us understand to what degree a particular property is the result of some random process.
- So, while G_np is WRONG, it will turn out to be extremely USEFUL!
Can we have high clustering while also having short paths?

High clustering coefficient, high diameter vs. low clustering coefficient, low diameter.
- The MSN network has 7 orders of magnitude larger clustering than the corresponding G_np!
- Other examples (h = average shortest path length, C = average clustering coefficient; "actual" = real network, "random" = random graph with the same avg. degree):
  - Actor collaborations (IMDB): N = 225,226 nodes, avg. degree k̄ = 61
  - Electrical power grid: N = 4,941 nodes, k̄ = 2.67
  - Network of neurons (C. elegans): N = 282 nodes, k̄ = 14

Network     | h_actual | h_random | C_random
Film actors | 3.65     | 2.99     | 0.00027
Power grid  | 18.70    | 12.40    | 0.005
C. elegans  | 2.65     | 2.25     | 0.05
- Consequence of expansion:
  - Short paths: O(log n). This is the smallest diameter we can get if we keep the degree constant.
  - But clustering is low!
- Real networks, however, have "local" structure:
  - Triadic closure: a friend of a friend is my friend
  - High clustering, but the diameter is also high
- How can we have both?
- Could a network with high clustering also be a small world (have log n diameter)?
  - How can we at the same time have high clustering and small diameter?
  - Clustering implies edge "locality"
  - Randomness enables "shortcuts"
Small-world model [Watts-Strogatz '98]. Two components to the model:
- (1) Start with a low-dimensional regular lattice (in our case, a ring), which has a high clustering coefficient.
- (2) Rewire: introduce randomness ("shortcuts") by adding/removing edges to join remote parts of the lattice:
  - For each edge, with probability p, move the other endpoint to a random node.
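The two-step construction can be sketched as follows (a simplified illustration, not the exact procedure from the paper: self-loops and duplicate edges produced by rewiring are simply skipped here):

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice on n nodes, each linked to its k nearest neighbors
    (k even), then each edge rewired with probability p."""
    rng = random.Random(seed)
    edges = set()
    # (1) regular ring lattice: connect each node to k/2 neighbors clockwise
    for u in range(n):
        for j in range(1, k // 2 + 1):
            edges.add((u, (u + j) % n))
    # (2) rewire each edge's far endpoint with probability p
    rewired = set()
    for (u, v) in edges:
        if rng.random() < p:
            w = rng.randrange(n)
            # skip rewirings that would create a self-loop or duplicate
            if w != u and (u, w) not in rewired and (w, u) not in rewired:
                v = w
        rewired.add((u, v))
    return rewired

ring = watts_strogatz(20, 4, 0.0, seed=0)  # pure lattice: no shortcuts
sw = watts_strogatz(20, 4, 0.1, seed=0)    # a few random shortcuts
print(len(ring))  # n*k/2 = 40 edges
```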
[Watts-Strogatz, '98]

Rewiring allows us to "interpolate" between a regular lattice and a random graph:
- Regular lattice (high clustering, high diameter): h = N/(2k̄), C ≈ 3/4
- Random graph (low clustering, low diameter): h = log N / log k̄, C = k̄/N
[Figure: clustering coefficient C and (scaled) average path length versus the rewiring probability p. There is a wide parameter region with both high clustering and low path length.]

Intuition: it takes a lot of randomness to ruin the clustering, but only a very small amount to create shortcuts.
- Could a network with high clustering be at the same time a small world?
  - Yes! You don't need more than a few random links.
- The Watts-Strogatz model:
  - Provides insight into the interplay between clustering and the small-world property
  - Captures the structure of many realistic networks
  - Accounts for the high clustering of real networks
  - Does not lead to the correct degree distribution
Generating large realistic graphs:
- How can we think of network structure recursively? Intuition: self-similarity.
  - An object is similar to a part of itself: the whole has the same shape as one or more of the parts.
- Mimic recursive graph/community growth: start from an initial graph and expand it recursively.
- The Kronecker product is a way of generating self-similar matrices.

[Figure: an initial graph and its recursive expansion.]
- Kronecker graphs: a recursive model of network structure [PKDD '05].

[Figure: initiator K_1 with a 3 x 3 adjacency matrix, its 9 x 9 Kronecker square, and the resulting 81 x 81 adjacency matrix.]

- The Kronecker product of an N x M matrix A and a K x L matrix B is the (N·K) x (M·L) block matrix whose (i, j) block is A_ij · B.
- We define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices.
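A minimal sketch of the Kronecker product on plain lists of lists (equivalent to numpy.kron, written out to make the block structure explicit):

```python
def kron(A, B):
    """Kronecker product: an (N x M) A and a (K x L) B give an
    (N*K) x (M*L) matrix whose (i, j) block is A[i][j] * B."""
    n, m = len(A), len(A[0])
    k, l = len(B), len(B[0])
    return [
        [A[i // k][j // l] * B[i % k][j % l] for j in range(m * l)]
        for i in range(n * k)
    ]

A = [[1, 0], [1, 1]]
print(kron(A, A))
# [[1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 1]]
```

Applied to a 0/1 adjacency matrix, each 1-entry of A is replaced by a full copy of B, which is exactly the recursive expansion pictured above.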
- A Kronecker graph is obtained by growing a sequence of graphs, iterating the Kronecker product over the initiator matrix K_1 [PKDD '05]:

  K_1^[m] = K_m = K_{m−1} ⊗ K_1

- Note: one can easily use multiple initiator matrices (K_1', K_1'', K_1'''), even of different sizes.
Stochastic Kronecker graphs [PKDD '05]:
- Create an N_1 x N_1 probability matrix Θ_1, e.g.

  Θ_1 = [0.5 0.2]
        [0.1 0.3]

- Compute the k-th Kronecker power Θ_k, e.g.

  Θ_2 = Θ_1 ⊗ Θ_1 = [0.25 0.10 0.10 0.04]
                    [0.05 0.15 0.02 0.06]
                    [0.05 0.02 0.15 0.06]
                    [0.01 0.03 0.03 0.09]

- For each entry p_uv of Θ_k, flip a biased coin and include edge (u, v) in the instance graph K_k with probability p_uv.
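The coin-flipping procedure can be sketched naively (my own illustration; this is the slow O(n²) version that the next slide improves on):

```python
import random

def stochastic_kronecker(theta, k, seed=None):
    """Naive sampler: take the k-th Kronecker power of the probability
    matrix theta, then flip one biased coin per entry of Theta_k."""
    rng = random.Random(seed)
    P = theta
    for _ in range(k - 1):
        # Kronecker product P ⊗ theta on plain lists of lists
        n, t = len(P), len(theta)
        P = [
            [P[i // t][j // t] * theta[i % t][j % t] for j in range(n * t)]
            for i in range(n * t)
        ]
    n = len(P)
    # one biased coin per (u, v) entry -> a set of directed edges
    return {(u, v) for u in range(n) for v in range(n) if rng.random() < P[u][v]}

theta = [[0.5, 0.2], [0.1, 0.3]]  # the initiator from the slide
edges = stochastic_kronecker(theta, 3, seed=0)  # directed graph on 8 nodes
```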
- How do we generate an instance of a (directed) stochastic Kronecker graph? Flipping a biased coin for every entry p_uv of Θ_k means flipping n² coins, which is way too slow!
- Is there a faster way? Yes!
- Idea: exploit the recursive structure of Kronecker graphs and "drop" edges onto the graph one by one.
- A faster way to generate Kronecker graphs: "drop" each edge into a graph G on n = 2^m nodes by descending recursively through the quadrants of Θ ⊗ Θ ⊗ ... ⊗ Θ.
- At each level of the descent, the quadrant structure (a, b; c, d) of the initiator determines which sub-block of the adjacency matrix of G the edge lands in.
- We may get a few edges colliding; we simply reinsert them.
Fast Kronecker generator algorithm (for generating directed graphs). To insert one edge into a graph G on n = 2^m nodes:
- Create the normalized matrix L_uv = Θ_uv / (Σ_op Θ_op)
- Start with x = 0, y = 0
- For i = 1 ... m:
  - Pick a row/column (u, v) with probability L_uv
  - Descend into quadrant (u, v) at level i of G; this means x += u · 2^(m−i), y += v · 2^(m−i)
- Add the edge (x, y) to G
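The algorithm above can be sketched as follows (my own illustration; it generalizes 2^(m−i) to t^(m−i) for a t x t initiator, and handles collisions by re-dropping the edge, as described above):

```python
import random

def fast_kronecker_edges(theta, m, num_edges, seed=None):
    """Fast generator: each edge descends m levels, choosing quadrant
    (u, v) with probability L_uv at every level."""
    rng = random.Random(seed)
    t = len(theta)
    total = sum(sum(row) for row in theta)
    # flattened normalized quadrant probabilities L_uv
    cells = [(u, v, theta[u][v] / total) for u in range(t) for v in range(t)]
    edges = set()
    while len(edges) < num_edges:
        x = y = 0
        for i in range(1, m + 1):
            r, acc = rng.random(), 0.0
            for (u, v, pr) in cells:
                acc += pr
                if r < acc:
                    break
            x += u * t ** (m - i)
            y += v * t ** (m - i)
        edges.add((x, y))  # a colliding edge is simply dropped again
    return edges

theta = [[0.5, 0.2], [0.1, 0.3]]
g = fast_kronecker_edges(theta, 10, 500, seed=0)  # 2^10 = 1024 nodes
print(len(g))  # 500
```

The cost is O(E · m) instead of O(n²), which is what makes very large instances feasible.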
- Kronecker graphs can be fitted to real networks; with an estimated initiator such as [ICML '07]

  Θ_1 = [0.99 0.54]
        [0.49 0.13]

  the real network and the Kronecker graph are very close.