TRANSCRIPT
BCAM January 2011 1
Adventures in random graphs: Models, structures and algorithms
Armand M. Makowski
ECE & ISR/HyNet
University of Maryland at College Park
Complex networks
• Many examples
– Biology (Genomics, proteomics)
– Transportation (Communication networks, Internet, roads and railroads)
– Information systems (World Wide Web)
– Social networks (Facebook, LinkedIn, etc)
– Sociology (Friendship networks, sexual contacts)
– Bibliometrics (Co-authorship networks, references)
– Ecology (food webs)
– Energy (Electricity distribution, smart grids)
• Larger context of “Network Science”
Objectives
• Identify generic structures and properties of “networks”
• Mathematical models and their analysis
• Understand how network structure and processes on networks interact
What is new?
Very large data sets now easily available!
• Dynamics of networks vs. dynamics on networks
Bibliography (I): Random graphs
• N. Alon and J.H. Spencer, The Probabilistic Method (Second Edition), Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, New York (NY), 2000.
• A.D. Barbour, L. Holst and S. Janson, Poisson Approximation, Oxford Studies in Probability 2, Oxford University Press, Oxford (UK), 1992.
• B. Bollobas, Random Graphs, Second Edition, Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge (UK), 2001.
• R. Durrett, Random Graph Dynamics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge (UK), 2007.
• M. Draief and L. Massoulie, Epidemics and Rumours in Complex Networks, Cambridge University Press, Cambridge (UK), 2009.
• D. Dubhashi and A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms, Cambridge University Press, New York (NY), 2009.
• M. Franceschetti and R. Meester, Random Networks for Communication: From Statistical Physics to Information Systems, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge (UK), 2007.
• S. Janson, T. Luczak and A. Rucinski, Random Graphs, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, 2000.
• R. Meester and R. Roy, Continuum Percolation, Cambridge University Press, Cambridge (UK), 1996.
• M.D. Penrose, Random Geometric Graphs, Oxford Studies in Probability 5, Oxford University Press, New York (NY), 2003.
Bibliography (II): Survey papers
• R. Albert and A.-L. Barabasi, “Statistical mechanics of complex networks,” Reviews of Modern Physics 74 (2002), pp. 47-97.
• M.E.J. Newman, “The structure and function of complex networks,” SIAM Review 45 (2003), pp. 167-256.
Bibliography (III): Complex networks
• A. Barrat, M. Barthelemy and A. Vespignani, Dynamical Processes on Complex Networks, Cambridge University Press, Cambridge (UK), 2008.
• R. Cohen and S. Havlin, Complex Networks - Structure, Robustness and Function, Cambridge University Press, Cambridge (UK), 2010.
• D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge University Press, Cambridge (UK), 2010.
• M.O. Jackson, Social and Economic Networks, PrincetonUniversity Press, Princeton (NJ), 2008.
• M.E.J. Newman, A.-L. Barabasi and D.J. Watts (Editors), The Structure and Dynamics of Networks, Princeton University Press, Princeton (NJ), 2006.
LECTURE 1
Basics of graph theory
What are graphs?
With V a finite set, a graph G is an ordered pair (V, E) where the elements of V are called vertices/nodes and E is the set of edges/links:

E ⊆ V × V, with the notation E(G) = E

Nodes i and j are said to be adjacent, written i ∼ j, if

e = (i, j) ∈ E, i, j ∈ V
Multiple representations for G = (V, E)
Set-theoretic – Edge variables ξij, i, j ∈ V, with

ξij = 1 if (i, j) ∈ E, and ξij = 0 if (i, j) ∉ E

Algebraic – Adjacency matrix A = (Aij) with

Aij = 1 if (i, j) ∈ E, and Aij = 0 if (i, j) ∉ E
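Both representations are easy to realize concretely. A minimal Python sketch (the helper names `edge_variables` and `adjacency_matrix` are ours, not from the lecture), assuming a simple undirected graph given as a list of edges:

```python
def edge_variables(V, E):
    # Edge variables xi_ij: 1 if (i, j) is an edge, 0 otherwise.
    # The graph is undirected, so both orientations are set.
    xi = {(i, j): 0 for i in V for j in V if i != j}
    for (i, j) in E:
        xi[(i, j)] = 1
        xi[(j, i)] = 1
    return xi

def adjacency_matrix(V, E):
    # Adjacency matrix A = (A_ij) as a nested list, rows/columns ordered by V.
    index = {v: k for k, v in enumerate(V)}
    n = len(V)
    A = [[0] * n for _ in range(n)]
    for (i, j) in E:
        A[index[i]][index[j]] = 1
        A[index[j]][index[i]] = 1
    return A

# Path graph on {1, 2, 3} with edges 1-2 and 2-3
V = [1, 2, 3]
E = [(1, 2), (2, 3)]
xi = edge_variables(V, E)
A = adjacency_matrix(V, E)
```

Note that the symmetry of A (and of ξ) reflects the restriction to undirected graphs.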
Some terminology
Simple graphs vs. multigraphs
Directed vs. undirected
(i, j) ∈ E if and only if (j, i) ∈ E
No self-loops: (i, i) ∉ E, i ∈ V
Here: Simple, undirected graphs with no self loops!
Types of graphs
• The empty graph
• Complete graphs
• Trees/forests
• A subgraph H = (W, F) of G = (V, E) is a graph with vertex set W such that

W ⊆ V and F = E ∩ (W × W)
• Cliques (Complete subgraphs)
Labeled vs. unlabeled graphs
A graph automorphism of G = (V, E) is any one-to-one mapping σ : V → V that preserves the graph structure, namely
(σ(i), σ(j)) ∈ E if and only if (i, j) ∈ E
Group Aut(G) of graph automorphisms of G
Of interest
• Connectivity and k-connectivity (with k ≥ 1)
• Number and size of components
• Isolated nodes
• Degree of a node: degree distribution/average degree, maximal/minimal degree
• Distance between nodes (in terms of number of hops): Shortest path, diameter, eccentricity, radius
• Small graph containment (e.g., triangles, trees, cliques, etc.)
• Clustering
• Centrality: Degree, closeness, betweenness
For i, j in V,

ℓij = shortest path length between nodes i and j in the graph G = (V, E)

Convention: ℓij = ∞ if nodes i and j belong to different components, and ℓii = 0.

Average distance

ℓAvg = (1 / (|V|(|V| − 1))) ∑i∈V ∑j∈V ℓij

Diameter

d(G) = max (ℓij, i, j ∈ V)
Eccentricity

Ec(i) = max (ℓij, j ∈ V), i ∈ V

Radius

rad(G) = min (Ec(i), i ∈ V)

ℓAvg ≤ d(G)

and

rad(G) ≤ d(G) ≤ 2 rad(G)
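All of these quantities can be computed from hop distances obtained by breadth-first search. A minimal Python sketch (the helper name `path_lengths` is ours), using the path graph 1-2-3-4 as an example:

```python
from collections import deque

def path_lengths(V, adj, i):
    # BFS from node i: hop distance to every node; inf if unreachable.
    dist = {v: float("inf") for v in V}
    dist[i] = 0
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == float("inf"):
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Path graph 1 - 2 - 3 - 4
V = [1, 2, 3, 4]
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}

ell = {i: path_lengths(V, adj, i) for i in V}
ecc = {i: max(ell[i][j] for j in V) for i in V}       # eccentricities
diameter = max(ecc.values())                          # d(G)
radius = min(ecc.values())                            # rad(G)
avg_dist = (sum(ell[i][j] for i in V for j in V)
            / (len(V) * (len(V) - 1)))                # ell_Avg
```

For this path graph d(G) = 3 and rad(G) = 2, consistent with rad(G) ≤ d(G) ≤ 2 rad(G).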
Centrality
Q: How central is a node?
Closeness centrality

g(i) = 1 / ∑j∈V ℓij, i ∈ V

Betweenness centrality

b(i) = ∑k≠i, j≠i, k≠j σkj(i) / σkj, i ∈ V

where σkj is the number of shortest paths between k and j, and σkj(i) is the number of those paths passing through i.
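Both centralities can be computed from BFS distances and shortest-path counts; one possible sketch in Python (helper names ours), using the identity σkj(i) = σki · σij whenever i lies on a shortest k-j path, and summing over unordered pairs:

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, s):
    # Hop distances from s together with the number of shortest paths.
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def closeness(V, adj, i):
    dist, _ = bfs_counts(adj, i)
    return 1.0 / sum(dist[j] for j in V if j != i)

def betweenness(V, adj, i):
    # Sum of sigma_kj(i)/sigma_kj over unordered pairs {k, j} avoiding i;
    # a shortest path through i splits as (k -> i) followed by (i -> j).
    dist_i, sigma_i = bfs_counts(adj, i)
    total = 0.0
    for k, j in combinations([v for v in V if v != i], 2):
        dist_k, sigma_k = bfs_counts(adj, k)
        if dist_k[i] + dist_i[j] == dist_k[j]:
            total += sigma_k[i] * sigma_i[j] / sigma_k[j]
    return total

# Path graph 1 - 2 - 3 - 4
V = [1, 2, 3, 4]
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On the path graph, node 2 lies on the shortest paths for the pairs {1, 3} and {1, 4}, so b(2) = 2, while the endpoint 1 has b(1) = 0.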
Clustering
Clustering coefficient of node i

C(i) = ∑j≠i, k≠i, j≠k ξij ξik ξkj / ∑j≠i, k≠i, j≠k ξij ξik

Average clustering coefficient

CAvg = (1/n) ∑i∈V C(i)

C = 3 · (Number of triangles) / (Number of connected triples)
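The formula for C(i) transcribes directly in terms of the edge indicators ξ. A small Python sketch (helper name ours; the example graph is a triangle 1-2-3 with a pendant node 4 attached to 3):

```python
def clustering(V, adj, i):
    # C(i): fraction of (ordered) neighbor pairs of i that are themselves
    # linked, written with the edge indicators xi as on the slide.
    xi = lambda a, b: 1 if b in adj[a] else 0
    pairs = [(j, k) for j in V for k in V if j != i and k != i and j != k]
    num = sum(xi(i, j) * xi(i, k) * xi(k, j) for j, k in pairs)
    den = sum(xi(i, j) * xi(i, k) for j, k in pairs)
    return num / den if den else 0.0  # convention: C(i) = 0 for degree < 2

# Triangle 1-2-3 with a pendant node 4 attached to 3
V = [1, 2, 3, 4]
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
C_avg = sum(clustering(V, adj, i) for i in V) / len(V)
```

Node 1 has both neighbors linked (C(1) = 1), while node 3 has only one linked pair among its three neighbors (C(3) = 1/3).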
Random graphs
Random graphs?
G(V) ≡ Collection of all (simple, undirected, self-loop-free) graphs with vertex set V.

Definition – Given a probability triple (Ω, F, P), a random graph is simply a graph-valued rv G : Ω → G(V).
Modeling – We need only specify the pmf
P [G = G] , G ∈ G(V ).
Many, many ways to do that!
Equivalent representations for G
Set-theoretic – Link assignment rvs ξij, i, j ∈ V, with

ξij = 1 if (i, j) ∈ E(G), and ξij = 0 if (i, j) ∉ E(G)

Algebraic – Random adjacency matrix A = (Aij) with

Aij = 1 if (i, j) ∈ E(G), and Aij = 0 if (i, j) ∉ E(G)
Why random graphs?
Useful models in many applications to capture binary relationships between participating entities
Because
|G(V)| = 2^(|V|(|V|−1)/2) ≈ 2^(|V|²/2), a very large number,

there is a need to identify/discover typicality!

Scaling laws – Zero-one laws as |V| becomes large, e.g.,

V ≡ Vn = {1, . . . , n} (n → ∞)
Menagerie of random graphs
• Erdos-Renyi graphs G(n;m)
• Erdos-Renyi graphs G(n; p)
• Generalized Erdos-Renyi graphs
• Geometric random models/disk models
• Intrinsic fitness and threshold random models
• Random intersection graphs
• Growth models: Preferential attachment, copying
• Small worlds
• Exponential random graphs
• Etc
Erdos-Renyi graphs G(n; m)
With

1 ≤ m ≤ (n choose 2) = n(n − 1)/2,

the pmf on G(Vn) is specified by

P [G(n; m) = G] = 1/u(n; m) if |E(G)| = m, and 0 if |E(G)| ≠ m,

where

u(n; m) = ((n(n − 1)/2) choose m)
Uniform selection over the collection of all graphs on the vertex set {1, . . . , n} with exactly m edges
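Sampling G(n; m) amounts to a uniform draw of m of the n(n − 1)/2 possible edges. A minimal sketch (helper name ours), relying on `random.sample` for the uniform selection:

```python
import random

def sample_gnm(n, m, rng=random):
    # Uniform selection of m edges among the n(n-1)/2 possible pairs,
    # i.e., a uniform pick from all graphs on {1,...,n} with m edges.
    possible = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]
    assert 1 <= m <= len(possible)
    return set(rng.sample(possible, m))

E = sample_gnm(5, 4, random.Random(0))
```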
Erdos-Renyi graphs G(n; p)
With

0 ≤ p ≤ 1,

the link assignment rvs χij(p), 1 ≤ i < j ≤ n, are i.i.d. {0, 1}-valued rvs with

P [χij(p) = 1] = 1 − P [χij(p) = 0] = p, 1 ≤ i < j ≤ n

For every G in G(V),

P [G(n; p) = G] = p^|E(G)| · (1 − p)^(n(n−1)/2 − |E(G)|)
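Sampling G(n; p) is even simpler: flip one biased coin per potential edge. A sketch (helper name ours):

```python
import random

def sample_gnp(n, p, rng=random):
    # Each of the n(n-1)/2 potential edges is present independently w.p. p.
    return {(i, j)
            for i in range(1, n + 1) for j in range(i + 1, n + 1)
            if rng.random() < p}

empty = sample_gnp(6, 0.0)   # p = 0: the empty graph
full = sample_gnp(6, 1.0)    # p = 1: the complete graph K_6
```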
Related to, but easier to implement than, G(n; m)

Similar behavior/results under the matching condition

|E(G(n; m))| = E [|E(G(n; p))|],

namely

m = (n(n − 1)/2) p
Generalized Erdos-Renyi graphs
With

0 ≤ pij ≤ 1, 1 ≤ i < j ≤ n,

the link assignment rvs χij(pij), 1 ≤ i < j ≤ n, are mutually independent {0, 1}-valued rvs with

P [χij(pij) = 1] = 1 − P [χij(pij) = 0] = pij, 1 ≤ i < j ≤ n

An important case: With positive weights w1, . . . , wn,

pij = wi wj / W with W = w1 + . . . + wn
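A sketch of the weighted special case pij = wi wj / W (often associated with the Chung-Lu model); note that without truncation pij can exceed 1, so the sketch clips it (helper name ours):

```python
import random

def sample_weighted(w, rng=random):
    # Edge {i, j} present independently w.p. p_ij = min(w_i w_j / W, 1),
    # with W the total weight; min() guards against p_ij > 1.
    n, W = len(w), sum(w)
    return {(i, j)
            for i in range(n) for j in range(i + 1, n)
            if rng.random() < min(w[i] * w[j] / W, 1.0)}

sure = sample_weighted([10.0, 10.0])   # p_01 clips to 1: edge always present
never = sample_weighted([0.0, 5.0])    # p_01 = 0: edge never present
```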
Geometric random graphs (d ≥ 1)
With random locations in Rd at
X1, . . . ,Xn,
the link assignment rvs χij(ρ), 1 ≤ i < j ≤ n are given by
χij(ρ) = 1 [ ‖Xi −Xj‖ ≤ ρ ] , 1 ≤ i < j ≤ n
where ρ > 0.
Usually, the rvs X1, . . . , Xn are taken to be i.i.d. rvs uniformly distributed over some compact subset Γ ⊆ R^d

Even then, it is not so obvious to write

P [G(n; ρ) = G], G ∈ G(V),

since the rvs χij(ρ), 1 ≤ i < j ≤ n, are no longer independent

For d = 2, long history for modeling wireless networks (known as the disk model), where ρ is interpreted as the transmission range
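A sketch of the disk model (helper name ours; the lecture does not fix Γ, here we take Γ = [0, 1]^d):

```python
import math
import random

def sample_rgg(n, rho, d=2, rng=random):
    # n i.i.d. uniform points in the unit cube [0,1]^d; nodes i < j are
    # linked whenever ||X_i - X_j|| <= rho.
    pts = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
    edges = {(i, j)
             for i in range(n) for j in range(i + 1, n)
             if math.dist(pts[i], pts[j]) <= rho}
    return pts, edges

# rho >= sqrt(2) = diam([0,1]^2), so every pair is linked: the complete graph
_, all_edges = sample_rgg(10, 2.0)
# rho < 0 can never be met: the empty graph
_, no_edges = sample_rgg(5, -1.0)
```

The dependence among the χij(ρ) is visible here: two links sharing a node both constrain the position of that common node.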
Threshold random graphs
Given R+-valued i.i.d. rvs W1, . . . , Wn with absolutely continuous probability distribution function F,
i ∼ j if and only if Wi +Wj > θ
for some θ > 0
Generalizations:
i ∼ j if and only if R(Wi,Wj) > θ
for some symmetric mapping R : R+ × R+ → R+.
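The linking rule itself is deterministic given the weights; a sketch (helper name ours; weights are fixed here for illustration, whereas the lecture draws W1, . . . , Wn i.i.d. from F):

```python
def threshold_graph(w, theta):
    # i ~ j iff W_i + W_j > theta.
    n = len(w)
    return {(i, j)
            for i in range(n) for j in range(i + 1, n)
            if w[i] + w[j] > theta}

# Weights 0.2, 0.9, 0.5 with theta = 1.0: only pairs involving the
# heavy node 1 clear the threshold.
E = threshold_graph([0.2, 0.9, 0.5], 1.0)
```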
Random intersection graphs
Given a finite set W ≡ {1, . . . , W} of features, with random subsets K1, . . . , Kn of W,
i ∼ j if and only if Ki ∩Kj 6= ∅
Co-authorship networks, random key distribution schemes,classification/clustering
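Again the linking rule is deterministic given the feature sets; a sketch (helper name ours), taking the random subsets Ki as given:

```python
def intersection_graph(K):
    # i ~ j iff the feature sets K_i and K_j share at least one feature.
    n = len(K)
    return {(i, j)
            for i in range(n) for j in range(i + 1, n)
            if K[i] & K[j]}

# Nodes 0 and 1 share feature 2; node 2's feature set is disjoint from both.
E = intersection_graph([{1, 2}, {2, 3}, {4}])
```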
Growth models
Gt, t = 0, 1, . . .
with rules

Vt+1 ← Vt

and

Gt+1 ← (Gt, Vt+1)
• Preferential attachment
• Copying
Scale-free networks
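One common preferential-attachment recipe (one edge per new node, attachment probability proportional to current degree; details vary across models, so this is only a sketch with hypothetical helper names):

```python
import random

def preferential_attachment(steps, rng=random):
    # Start from the single edge {0, 1}; each new node t attaches to one
    # existing node chosen w.p. proportional to its current degree,
    # implemented by sampling uniformly from the multiset of edge endpoints.
    targets = [0, 1]          # degree-weighted multiset of endpoints
    edges = [(0, 1)]
    for t in range(2, steps + 2):
        u = rng.choice(targets)
        edges.append((u, t))
        targets += [u, t]
    return edges

edges = preferential_attachment(5)
```

Each pass through the loop grows the vertex set by one node and the edge set by one edge, matching the Vt+1 ← Vt, Gt+1 ← (Gt, Vt+1) dynamics above.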
Small worlds
• Between randomness and order
• Shortcuts
• Short paths but high clustering
Milgram’s experiment and six degrees of separation
Exponential random graphs
• Models favored by sociologists and statisticians
• Graph analog of exponential families often used in statisticalmodeling
• Related to Markov random fields
With I parameters

θ = (θ1, . . . , θI)

and a set of I observables (statistics)

ui : G(V) → R+,

we postulate

P [G = G] = exp(∑_{i=1}^I θi ui(G)) / Z(θ), G ∈ G(V)

with normalization constant

Z(θ) = ∑_{G ∈ G(V)} exp(∑_{i=1}^I θi ui(G))
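For tiny n, the pmf and Z(θ) can be evaluated by brute-force enumeration of G(V); a sketch (helper name ours). With the single statistic u1(G) = |E(G)|, the model reduces to G(n; p) with p = e^θ/(1 + e^θ), and θ = 0 gives the uniform distribution on G(V):

```python
import math
from itertools import combinations, product

def ergm_pmf(n, thetas, stats):
    # Exhaustive enumeration of all 2^(n(n-1)/2) graphs on n vertices
    # (feasible only for tiny n); each graph is a frozenset of edges.
    possible = list(combinations(range(n), 2))
    graphs = [frozenset(e for e, keep in zip(possible, bits) if keep)
              for bits in product((0, 1), repeat=len(possible))]
    weight = lambda E: math.exp(sum(t * u(E) for t, u in zip(thetas, stats)))
    Z = sum(weight(E) for E in graphs)
    return {E: weight(E) / Z for E in graphs}

# One statistic u1(G) = |E(G)| (len of the edge set) with theta1 = 0:
# all 8 graphs on 3 vertices are equally likely.
pmf = ergm_pmf(3, [0.0], [len])
```

Enumerating G(V) is exactly what makes Z(θ) intractable for realistic n, which is why exponential random graphs are usually fit by Monte Carlo methods.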
In sum
• Many different ways to specify the pmf on G(V )
– Local description vs. global representation
– Static vs. dynamic
– Application-dependent mechanisms