The Universal Laws of Structural Dynamics in Large Graphs
Dmitri Krioukov, UCSD/CAIDA
David Meyer & David Rideout, UCSD/Math
M. Boguñá, M. Á. Serrano, F. Papadopoulos, M. Kitsak, kc claffy, A. Vahdat
DARPA’s GRAPHS, Chicago, July 2012
High-level project description
• Motivation:
– Predict network dynamics
– Detect anomalies
• Goal:
– Identify the universal laws of network dynamics
• Methods: random geometric graphs
– Past work: static graphs
– Future work: dynamic graphs
Past work: random geometric graphs in hyperbolic spaces
• Strengths:
– Common structural properties of real networks
– Optimality of their common functions
• Limitations:
– Static graphs
– Model real networks qualitatively
Future work: random geometric graphs on Lorentzian manifolds
• Dynamic graphs
• Model real networks quantitatively
Outline
• Introduction
– Why geometric graphs?
– Why fundamental laws?
– Real networks
– Network models
• Past work
– Random hyperbolic graphs (RHGs)
• Future work
– Random Lorentzian graphs (RLGs)
Why geometric graphs?
• Graphs are not “geometry”
• Yet real networks are navigable
– Efficient substrates for information propagation without global knowledge or central coordination
• How is this possible?
Geometric graphs
• “Coarse approximations” of smooth manifolds
– Riemann’s idea (Nature, v.7)
• “Sense of direction” (geometry) makes graphs navigable
• “Hyperbolicity”
– Maximizes navigability (optimal Internet routing)
– Reflects hierarchical (tree-like) organization
– Explains common structural properties
Why fundamental laws?
• Many different real networks have certain common structural properties
• Are there any common (fundamental) laws of network dynamics explaining the emergence of these common properties?
– If “no”, then… too bad (each network is unique)
– If “yes”, then one can utilize these laws to predict network dynamics and detect anomalies
Real networks
• Technological
– Internet
– Transportation
– Power grid
• Social
– Collaboration
– Trust
– Friendship
• Biological
– Gene regulation
– Protein interaction
– Metabolic
– Brain
• What can be common to all these networks?
• Naïve answer:
– Nothing
– Well, something: they all are messy, complex, “very random”
– And that’s it
“Very random” graphs
• Classical random graphs (Erdős-Rényi)
– Take N nodes
– Connect each node pair with probability p
• Soft version of $\bar{k}$-regular graphs ($\bar{k} \approx Np$)
• Maximum-entropy graphs of size N and expected average degree $\bar{k}$
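The construction above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function names, the seed, and the values of `n` and `p` are assumptions chosen to keep the run short. It checks that the average degree comes out close to Np.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Return the edge list of a classical random graph with n nodes."""
    rng = random.Random(seed)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Each node pair is connected independently with probability p.
            if rng.random() < p:
                edges.append((i, j))
    return edges


def average_degree(n, edges):
    # Every edge adds one to the degree of each of its two endpoints.
    return 2 * len(edges) / n


n, p = 1000, 0.01  # illustrative values
edges = erdos_renyi(n, p)
print("average degree:", average_degree(n, edges), "expected:", (n - 1) * p)
```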
Heterogeneity
• Distribution P(k) of node degrees k
– Real: $P(k) \sim k^{-\gamma}$
– Random: $P(k) \sim \bar{k}^{k} e^{-\bar{k}} / k!$
“Less random” graphs
• Random graphs with expected node degree distributions
– Take N nodes
– Assign to each node a random variable $\kappa$ drawn from a desired distribution, e.g., $\rho(\kappa) \sim \kappa^{-\gamma}$
– Connect each node pair with probability $p(\kappa, \kappa') \sim \kappa\kappa'/N$
• Soft version of the configuration model
• Maximum-entropy graphs of size N and expected degree distribution $\rho(\kappa)$
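A minimal sketch of this recipe follows, under stated assumptions: the slide gives the connection probability only up to a constant, so the normalization `kappa_i * kappa_j / (N * mean_kappa)`, the cap at 1, the cutoff `kappa_min`, and all function names are choices made here for illustration.

```python
import random


def sample_expected_degrees(n, gamma, kappa_min, rng):
    """Draw n values with density rho(kappa) ~ kappa**(-gamma), kappa >= kappa_min."""
    # Inverse-transform sampling of a Pareto distribution; 1 - random() lies in (0, 1].
    return [kappa_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


def soft_configuration_graph(kappas, rng):
    """Connect each pair with probability proportional to kappa_i * kappa_j / N."""
    n = len(kappas)
    mean_kappa = sum(kappas) / n
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Assumed normalization: makes the expected degree of node i close to kappa_i.
            p = min(1.0, kappas[i] * kappas[j] / (n * mean_kappa))
            if rng.random() < p:
                edges.append((i, j))
    return edges


rng = random.Random(1)
kappas = sample_expected_degrees(1000, gamma=3.0, kappa_min=2.0, rng=rng)
edges = soft_configuration_graph(kappas, rng)
mean_kappa = sum(kappas) / len(kappas)
mean_degree = 2 * len(edges) / len(kappas)
print("mean expected degree:", mean_kappa, "mean observed degree:", mean_degree)
```

With this normalization the observed average degree should track the average of the hidden variables, which is what makes the model a "soft" version of the configuration model.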
Clustering
• Distribution P(k) of node degrees k
– Real: $P(k) \sim k^{-\gamma}$
– Random: $P(k) \sim k^{-\gamma}$
• Average probability that node neighbors are connected
– Real: 0.5
– Random: $7 \times 10^{-4}$
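The "average probability that node neighbors are connected" is the average clustering coefficient. The sketch below computes it from an edge list; the function name and the convention that nodes with fewer than two neighbors contribute zero are assumptions made here. It is applied to a small classical random graph, where clustering is expected to be of order p.

```python
import random
from itertools import combinations


def average_clustering(n, edges):
    """Average over nodes of the fraction of neighbor pairs that are linked."""
    neighbors = [set() for _ in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    total = 0.0
    for i in range(n):
        k = len(neighbors[i])
        if k < 2:
            continue  # assumed convention: such nodes contribute zero
        linked = sum(1 for a, b in combinations(neighbors[i], 2) if b in neighbors[a])
        total += linked / (k * (k - 1) / 2)
    return total / n


# Illustrative classical random graph with n nodes and connection probability p.
rng = random.Random(0)
n, p = 300, 0.05
er_edges = [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]
print("clustering of the random graph:", average_clustering(n, er_edges))
```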
Random geometric graphs
• Take a compact region in a Euclidean space, e.g., a circle of radius R
• Sprinkle N nodes into it via the Poisson point process
• Connect each pair of nodes if the distance between them is $d \le r \le R$
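These three steps can be sketched as follows. One simplification is assumed: a fixed number of points `n` is placed uniformly at random, whereas a true Poisson point process would also draw the number of points at random. The names and parameter values are illustrative.

```python
import math
import random


def random_geometric_graph(n, R, r, seed=0):
    """Place n points uniformly in a disk of radius R; link pairs at distance <= r."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # The square root makes the density uniform in area rather than in radius.
        radius = R * math.sqrt(rng.random())
        angle = 2.0 * math.pi * rng.random()
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= r
    ]
    return points, edges


points, edges = random_geometric_graph(n=400, R=1.0, r=0.1)
print("nodes:", len(points), "edges:", len(edges))
```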
Heterogeneity lost
• Distribution P(k) of node degrees k
– Real: $P(k) \sim k^{-\gamma}$
– Random: $P(k) \sim \bar{k}^{k} e^{-\bar{k}} / k!$
• Average probability that node neighbors are connected
– Real: 0.5
– Random: 0.5
Strong heterogeneity and clustering are common properties of large networks
Network                                      Exponent of the degree distribution   Average clustering
Internet                                     2.1                                   0.46
Air transportation                           2.0                                   0.62
Actor collaboration                          2.3                                   0.78
Protein interaction (S. cerevisiae)          2.4                                   0.09
Metabolic (E. coli and S. cerevisiae)        2.0                                   0.67
Gene regulation (E. coli and S. cerevisiae)  2.1                                   0.09
Any other common properties?
• No
• Well, some randomness, of course
• Real large networks appear to be quite different and unique in all other respects
• Any simple random graph model that can reproduce these two universal properties?
Random hyperbolic graphs
• Take a compact region in a hyperbolic space, e.g., a circle of radius R
• Sprinkle N nodes into it via the Poisson point process ($R \sim \ln N$)
• Connect each pair of nodes if the distance between them is $x \le R$
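A minimal sketch of this construction on the hyperbolic plane of curvature -1 follows. Several details are assumptions not stated on the slide: a fixed number of points stands in for the Poisson process, the disk radius is taken as R = 2 ln(N / nu) with a hypothetical parameter `nu` that sets the average degree, and polar coordinates are sampled uniformly with respect to the hyperbolic area.

```python
import math
import random


def sample_hyperbolic_disk(n, R, rng):
    """Sample n points (r, theta) uniformly in a hyperbolic disk of radius R."""
    points = []
    for _ in range(n):
        # Inverse-CDF sampling of the radial density sinh(r) / (cosh(R) - 1).
        r = math.acosh(1.0 + (math.cosh(R) - 1.0) * rng.random())
        theta = 2.0 * math.pi * rng.random()
        points.append((r, theta))
    return points


def hyperbolic_distance(p, q):
    """Hyperbolic law of cosines for two points given in polar coordinates."""
    (r1, t1), (r2, t2) = p, q
    c = math.cosh(r1) * math.cosh(r2) - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2)
    return math.acosh(max(1.0, c))  # guard against rounding slightly below 1


def random_hyperbolic_graph(n, nu=1.0, seed=0):
    rng = random.Random(seed)
    R = 2.0 * math.log(n / nu)  # assumed scaling of the radius with n
    points = sample_hyperbolic_disk(n, R, rng)
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if hyperbolic_distance(points[i], points[j]) <= R
    ]
    return R, points, edges


n = 500
R, points, edges = random_hyperbolic_graph(n)
degrees = [0] * n
for i, j in edges:
    degrees[i] += 1
    degrees[j] += 1
print("mean degree:", sum(degrees) / n, "max degree:", max(degrees))
```

Nodes that land near the disk center are within distance R of a large share of the other nodes, which is where the heavy-tailed degrees come from.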
Succeeded finally
• Distribution P(k) of node degrees k
– Real: $P(k) \sim k^{-\gamma}$
– Random: $P(k) \sim k^{-\gamma}$
• Average probability that node neighbors are connected
– Real: 0.5
– Random: 0.5
• Node density: $\rho(r) \sim e^{r-R}$
• Node degree: $\bar{k}(r) \sim e^{(R-r)/2}$
• Degree distribution: $P(k) \sim k^{-\gamma}$, $\gamma = 3$
• Clustering (maximized): $\bar{c}(k) \sim k^{-1}$
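The exponent 3 can be checked with a change of variables. The sketch below assumes the scalings $\rho(r) \sim e^{r-R}$ for the node density and $\bar{k}(r) \sim e^{(R-r)/2}$ for the expected degree at radius r, and treats the degree as a deterministic function of r, ignoring fluctuations around the mean.

```latex
\begin{align}
\bar{k}(r) \sim e^{(R-r)/2}
  \;&\Longrightarrow\;
  r(k) = R - 2\ln k, \qquad \left|\frac{dr}{dk}\right| = \frac{2}{k}, \\
P(k) \sim \rho\bigl(r(k)\bigr)\left|\frac{dr}{dk}\right|
  \;&\sim\; e^{-2\ln k}\,\frac{2}{k} \;\sim\; k^{-3}.
\end{align}
```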
Fermi-Dirac connection probability
• connection probability p(x) ↔ Fermi-Dirac distribution
• hyperbolic distance x ↔ energy of links/fermions
• disk radius R ↔ chemical potential
• two times the inverse square root of curvature, $2/\zeta$ with $\zeta = \sqrt{-K}$ ↔ Boltzmann constant
• parameter T ↔ temperature
$$p(x) = \frac{1}{e^{\zeta (x - R)/(2T)} + 1} \;\xrightarrow{T \to 0}\; \Theta(R - x)$$
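As a sketch of how the temperature softens the connection rule, the function below evaluates a Fermi-Dirac connection probability of the form $p(x) = 1/(e^{\zeta(x-R)/(2T)} + 1)$. That form, the default `zeta=1.0`, and the explicit handling of T = 0 as a step function are assumptions made here for illustration.

```python
import math


def connection_probability(x, R, T, zeta=1.0):
    """Fermi-Dirac connection probability at hyperbolic distance x."""
    if T == 0:
        # Zero-temperature limit: connect exactly when x <= R.
        return 1.0 if x <= R else 0.0
    z = zeta * (x - R) / (2.0 * T)
    if z > 700.0:
        return 0.0  # avoid overflow in exp; the probability is negligible here
    return 1.0 / (math.exp(z) + 1.0)


R = 10.0
for T in (0.0, 0.1, 0.5, 1.0):
    row = [round(connection_probability(x, R, T), 3) for x in (8.0, 9.5, 10.0, 10.5, 12.0)]
    print("T =", T, row)
```

At low temperature the curve approaches the sharp threshold of the step-function model, and at higher temperature long-distance links become more likely, which is how T lowers clustering.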
Curvature and temperature
• Curvature $K = -\zeta^2 \le 0$ controls the power-law exponent $\gamma \in [2, \infty]$
– Graphs are not geometric (node density is not uniform) unless $\gamma = 3$
• Temperature $T \ge 0$ controls clustering $\bar{c} \in [0, c_{\max}]$
– Phase transition at $T = 1$
Limiting cases
Curvature \ Temperature T   Finite T                                     Infinite T
Finite curvature            Random hyperbolic graphs and real networks   Classical random graphs
Infinite curvature          Random geometric graphs                      Configuration model
Random graphs with hidden variables
• Definition:
– Take N nodes
– Assign to each node a random variable $\kappa$ drawn from distribution $\rho(\kappa)$
• $\kappa$ can be a vector (of attributes or coordinates)
– Connect each node pair with probability $p(\kappa, \kappa')$
• In random geometric graphs:
– $\kappa$’s are node coordinates
– $\rho(\kappa)$ is the uniform distribution
– $p(\kappa, \kappa')$ is a step function of distance $d(\kappa, \kappa')$
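Exponential random graphs
• Definition:
– Set of graphs G with probability measure $P(G) = e^{-H(G)}/Z$, where $Z = \sum_G e^{-H(G)}$, and H(G) is the graph Hamiltonian
• In soft configuration model:
– [Hamiltonian expression missing in the source], where $a_{ij}$ is the adjacency matrix and $\kappa_i$ are the expected degrees
• In random hyperbolic graphs:
– [Hamiltonian expressions missing in the source]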
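The Hamiltonians themselves did not survive in this transcript. One standard form that is consistent with the Fermi-Dirac connection probability of the hyperbolic model is sketched below as an assumption, not as the slide's original content. For a Hamiltonian linear in the adjacency matrix, with auxiliary fields $\omega_{ij}$, the ensemble factorizes over node pairs and each pair is connected with a Fermi-Dirac probability. The multipliers $\alpha_i$ are hypothetical symbols introduced here.

```latex
\begin{align}
P(G) &= \frac{e^{-H(G)}}{Z}, \qquad
H(G) = \sum_{i<j} \omega_{ij}\, a_{ij}
  \;\Longrightarrow\;
  Z = \prod_{i<j}\bigl(1 + e^{-\omega_{ij}}\bigr), \quad
  p_{ij} = \frac{1}{e^{\omega_{ij}} + 1}, \\
\text{soft configuration model:}\quad
  \omega_{ij} &= \alpha_i + \alpha_j, \\
\text{random hyperbolic graphs:}\quad
  \omega_{ij} &= \frac{\zeta\,(x_{ij} - R)}{2T}.
\end{align}
```

In the second line the $\alpha_i$ play the role of Lagrange multipliers fixing the expected degrees. In the third line the auxiliary fields are linear functions of the hyperbolic distances $x_{ij}$.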
Back to reality
• Infer/learn node coordinates in real networks using maximum-likelihood methods (MCMC)
• Compare the empirical probability of connections in the real network with the inferred coordinates against the theoretical prediction
• If the model is correct, the two should match
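The comparison step can be sketched as below. The inference of coordinates (the MCMC part) is not shown: the sketch assumes coordinates are already available and only bins node pairs by distance to measure the fraction that are connected. The function name, the binning scheme, and the toy one-dimensional data are assumptions for illustration.

```python
from collections import defaultdict


def empirical_connection_probability(points, edges, distance, bin_width):
    """Fraction of connected node pairs in each distance bin, keyed by the bin's lower edge."""
    edge_set = {frozenset(e) for e in edges}
    pairs = defaultdict(int)
    linked = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            b = int(distance(points[i], points[j]) // bin_width)
            pairs[b] += 1
            if frozenset((i, j)) in edge_set:
                linked[b] += 1
    return {b * bin_width: linked[b] / pairs[b] for b in sorted(pairs)}


# Toy data: nodes on a line, connected exactly when their distance is at most 2.
toy_points = [float(i) for i in range(10)]
toy_edges = [(i, j) for i in range(10) for j in range(i + 1, 10) if j - i <= 2]
measured = empirical_connection_probability(
    toy_points, toy_edges, lambda a, b: abs(a - b), bin_width=1.0
)
print(measured)
```

For a real network, `measured` would be plotted against the model's theoretical connection probability evaluated at the same distances.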
Physical meaning of node coordinates
• Radial coordinates
– Node degrees (popularity)
• Angular coordinates
– Similarity
• Projections of a properly weighted combination of all the factors shaping the network structure
Summary of RHG strengths
• Explanation of how the structure of complex networks maximizes the efficiency of their transport function
– Optimal Internet routing as a practical application
• Two universal structural properties of real networks (degree heterogeneity and strong clustering) emerge as simple consequences of the two basic properties of hyperbolic spaces (exponential expansion and metric property)
• Connections to
– Similarity distances
– Hidden variable models
– Exponential random graphs
• interpreting auxiliary fields as linear functions of hyperbolic distances
– Fermionic systems
– Self-similarity
– Conformal invariance?
– AdS/CFT correspondence?
• RHGs subsume
– Classical random graphs
– Random geometric graphs
– Soft configuration model
as limiting cases with degenerate geometry
Summary of RHG limitations
• Exponential random graphs are intrinsically static graphs (equilibrium ensembles)
– Real networks are growing (far from equilibrium)
• Either $\gamma = 3$, or node density is not uniform (graphs are not coarse approximations of smooth manifolds)
– $\gamma < 3$ in real networks
Proposal
• These observations suggest that the geometry of real networks is actually different
• Question:
– What is it?
• Proposal:
– Lorentzian geometry
• Why:
– Lorentzian geometry explicitly models time
– Indications that [expression missing in the source] in some RLGs
• Challenge:
– Dynamic exponential random graphs
Lorentzian manifolds
• A pseudo-Riemannian manifold is a manifold with a non-degenerate metric tensor
– Distances can be positive, zero, or negative
• A Lorentzian manifold is a manifold with signature $(-, +, \dots, +)$
– The coordinate corresponding to the minus sign is called time
– Negative distances are time-like
– Positive distances are space-like
Causal structure
• For each point $p$, the set of points at time-like distances from $p$ can be split in two subsets:
– $p$’s future $I^{+}(p)$
– $p$’s past $I^{-}(p)$
• If $q \in I^{+}(p)$, then $I^{+}(p) \cap I^{-}(q)$ is called the Alexandrov set of $p$ and $q$
Alexandrov sets
• Form a base of the manifold topology
– Similar to open balls in the Riemannian case
Random Lorentzian graphs
• Take a compact region of a Lorentzian manifold, e.g., a patch
– Similar to a circle in the Riemannian case
• Sprinkle N nodes into it via the Poisson point process
• Connect each pair of nodes if the distance between them is $x < 0$
– Because Alexandrov sets are analogous to balls
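A minimal sketch of this construction, under strong simplifying assumptions: the manifold is taken to be flat 1+1-dimensional Minkowski space (the talk leaves the choice of manifold open), the compact region is the Alexandrov set between the points (0, 0) and (T, 0), a fixed number of points stands in for the Poisson process, and "distance" means the squared interval $-\Delta t^2 + \Delta x^2$. All names are illustrative.

```python
import math
import random


def sprinkle_diamond(n, T, seed=0):
    """Place n points (t, x) uniformly in the causal diamond between (0, 0) and (T, 0)."""
    rng = random.Random(seed)
    side = T / math.sqrt(2.0)
    points = []
    for _ in range(n):
        # In light-cone coordinates u, v the diamond is a square, so uniform
        # sampling is straightforward; the rotation back to (t, x) preserves area.
        u, v = side * rng.random(), side * rng.random()
        points.append(((u + v) / math.sqrt(2.0), (u - v) / math.sqrt(2.0)))
    return points


def interval_squared(a, b):
    """Squared Minkowski interval with signature (-, +): negative means time-like."""
    dt, dx = a[0] - b[0], a[1] - b[1]
    return -dt * dt + dx * dx


def random_lorentzian_graph(n, T, seed=0):
    points = sprinkle_diamond(n, T, seed)
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if interval_squared(points[i], points[j]) < 0
    ]
    return points, edges


T = 1.0
points, edges = random_lorentzian_graph(n=200, T=T)
print("nodes:", len(points), "edges:", len(edges))
```

Sorting the points by their time coordinate and adding them one at a time gives a growing graph, since each new node can only link to nodes in its past.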
Summary of RLGs
• Random geometric graphs by construction
– Are they also exponential random graphs?
• Growing time T explicitly models graph growth
• Not to sacrifice the strengths of RHGs, find a Lorentzian manifold M such that:
– There exists a map between M and hyperbolic space H satisfying certain duality properties
– The groups of isometries of M and H are isomorphic
• Lorentz group
• M. Boguñá, F. Papadopoulos, and D. Krioukov, Sustaining the Internet with Hyperbolic Mapping, Nature Communications, v.1, 62, 2010
• D. Krioukov, F. Papadopoulos, A. Vahdat, and M. Boguñá, Hyperbolic Geometry of Complex Networks, Physical Review E, v.82, 036106, 2010; Physical Review E, v.80, 035101(R), 2009
• F. Papadopoulos, D. Krioukov, M. Boguñá, and A. Vahdat, Greedy Forwarding in Scale-Free Networks Embedded in Hyperbolic Metric Spaces, INFOCOM 2010; SIGMETRICS MAMA 2009
• M. Boguñá, D. Krioukov, and kc claffy, Navigability of Complex Networks, Physical Review Letters, v.102, 058701, 2009; Nature Physics, v.5, p.74-80, 2009
• M. Á. Serrano, M. Boguñá, and D. Krioukov, Self-Similarity of Complex Networks, Physical Review Letters, v.106, 048701, 2011; Physical Review Letters, v.100, 078701, 2008