Post on 31-Mar-2015
Scaling, renormalization and self-similarity in complex networks
Chaoming Song (CCNY)
Lazaros Gallos (CCNY)
Shlomo Havlin (Bar-Ilan, Israel)
Hernan A. Makse
Levich Institute and Physics Dept.
City College of New York
Protein interaction network
Are “scale-free” networks really ‘free-of-scale’?
“If you had asked me yesterday, I would have said surely not” - said Barabasi.
(Science News, February 2, 2005).
Small world contradicts self-similarity!
Small World effect shows that the distance between nodes grows logarithmically with N (the network size): $\langle \ell \rangle \sim \ln N$
OR
Self-similar = fractal topology, defined by a power-law relation: $N \sim \langle \ell \rangle^{d_B}$
AIM: How the network behaves under a scale transformation.
Implications for:
1. Dynamics
2. Modularity
3. Universality
WWW nd.edu
300,000 web-pages
Internet connectivity, with selected backbone ISPs (Internet Service Providers) colored separately.
Faloutsos et al., SIGCOMM ’99
Internet
J. Han et al., Nature (2004)
Yeast Protein-Protein Interaction Map
Individual proteins
Physical interactions from the “filtered yeast interactome” database: 2493 high-confidence interactions observed by at least two methods (yeast two-hybrid). 1379 proteins, <k> = 3.6
Colored according to protein function in the cell: Transcription, Translation, Transcription control, Protein-fate, Genome maintenance, Metabolism, Unknown, etc.
Modular structure according to function!
from MIPS database, mips.gsf.de
Metabolic network of biochemical reactions in E.coli
Chemical substrates
Biochemical interactions: enzyme-catalyzed reactions that transform one metabolite into another.
Modular structure according to the biochemical class of the metabolic products of the organism.
Colored according to product class: Lipids, essential elements, protein, peptides and amino acids, coenzymes and prosthetic groups, carbohydrates, nucleotides and nucleic acids.
H. Jeong et al., Nature 407, 651 (2000)
Biological networks
Protein Homology Tree of life
Similarities between sequences of amino acids (BLAST)
Network of 5 million proteins
1.2 TB of data growing at 50 GB per month.
Adai et al. J Mol Biol (2004)
Complex network of species representing their evolutionary history
~90,000 species
Coast lines Rivers Mountains
Clouds Lightning Neurons
Introduction to fractals
In Nature there exist many examples of random fractals
How long is the coastline of Norway?
It depends on the length of your ruler.
Fractal Dimension dB: Box Covering Method
Fractals look the same on all scales = ‘scale-invariant’.
Box length
Total no. of boxes
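In the box-covering method the fractal dimension dB is read off as minus the slope of log(total no. of boxes) against log(box length). The sketch below is only an illustration of that fit: the function name `fit_box_dimension` and the synthetic data (a filled square of side 64) are assumptions, not taken from the slides.

```python
import math

def fit_box_dimension(box_lengths, box_counts):
    """Least-squares slope of log(N_B) against log(l_B); returns d_B = -slope."""
    xs = [math.log(l) for l in box_lengths]
    ys = [math.log(n) for n in box_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Hypothetical data: a filled 64 x 64 square covered by square boxes of side l.
lengths = [1, 2, 4, 8, 16]
counts = [(64 // l) ** 2 for l in lengths]
d_b = fit_box_dimension(lengths, counts)  # close to 2 for a filled square
```

For real data the fit should be restricted to the range of box lengths where the log-log plot is actually straight.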
Boxing in Biology
How to “zoom out” of a complex network?
Generate boxes where all nodes are within a distance lB
Calculate the number of boxes, NB, of size lB needed to cover the network
We need the minimum number of boxes: NP-complete optimization problem!
Boxing in Biology
Most efficient tiling of the network
4 boxes
5 boxes
8 node network: Easy to solve
300,000 node network: Mapping to graph colouring problem. NP-complete: Greedy algorithm to find minimum boxes
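One way to read the mapping to graph colouring: two nodes that are at distance lB or more may not share a box, so they must receive different "colours", and each colour is a box. The sketch below applies a plain greedy colouring under that assumption. The function names and the 6-node path are illustrative, and the all-pairs distance table makes it suitable for small graphs only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_box_cover(adj, l_b):
    """Give each node a box id so that any two nodes in a box are closer than l_b."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    box_of = {}
    for u in adj:
        # Boxes already holding a node that is too far from u are forbidden.
        forbidden = {box_of[v] for v in box_of
                     if dist[u].get(v, float("inf")) >= l_b}
        box = 0
        while box in forbidden:
            box += 1
        box_of[u] = box
    return box_of

# Hypothetical example: the path 0-1-2-3-4-5 covered with l_b = 3.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
boxes = greedy_box_cover(path, l_b=3)
n_boxes = len(set(boxes.values()))
```

A greedy colouring gives a valid covering but not necessarily the minimum number of boxes; the result can depend on the order in which nodes are visited.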
Burning algorithms
1. Compact box burning: CBB. Song et al. JSTAT (2007)
2. Maximum excluded mass burning: MEMB. Burning from the hubs with the radius r
Minimizing the number of boxes is analogous to maximizing the mass of each box: implications for modularity
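The slides only name the burning algorithms, so the following is a rough sketch of the compact-box-burning idea under stated assumptions: a box is grown by repeatedly picking a random candidate and discarding every candidate that is at distance lB or more from it, with distances measured in the full network. The function names, the fixed random seed and the 12-node ring are illustrative.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def compact_box_burning(adj, l_b, seed=0):
    """Cover the network with boxes whose members are pairwise closer than l_b."""
    rng = random.Random(seed)
    uncovered = set(adj)
    boxes = []
    while uncovered:
        candidates = set(uncovered)
        box = []
        while candidates:
            p = rng.choice(sorted(candidates))
            candidates.discard(p)
            box.append(p)
            dist = bfs_distances(adj, p)
            # Keep only candidates that are still compatible with every chosen node.
            candidates = {c for c in candidates
                          if dist.get(c, float("inf")) < l_b}
        boxes.append(box)
        uncovered -= set(box)
    return boxes

# Hypothetical example: a ring of 12 nodes.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
ring_boxes = compact_box_burning(ring, l_b=3)
```

Because the choice of nodes is random, the number of boxes varies between runs; in practice one would repeat the procedure and keep the smallest covering.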
Two universality classes:
[Plot: log(NB) versus log(lB); the slope is -dB]
TOPOLOGICAL NON-FRACTALS TOPOLOGICAL FRACTALS
EUCLIDEAN NON-FRACTALS EUCLIDEAN FRACTALS
Percolation cluster: “holes” at all scales
Compact cluster
Box covering in yeast: protein interaction network
Many complex networks are Fractal
Metabolic Protein interaction
Song, Havlin, Makse, Nature (2005)
Biological networks
Three domains of life: archaea, bacteria, eukarya
E. coli, H. sapiens, yeast
43 organisms - all scale
yeast
Metabolic networks are fractals
More topological fractals
WWW
nd.edu domain
1. Protein homology network
2. Tree of life (taxonomy)
3. Genetic networks (Meyer-Ortmanns, Khang)
4. Neural networks (Yuste)
300,000 web-pages
Internet and social networks are not fractal
Other models fail too: Erdos-Renyi, hierarchical model, fitness model, JKK model, pseudo-fractal models, etc.
The Barabasi-Albert model of preferential attachment does not generate fractal networks
All available models fail to predict self-similarity
INTERNET: Router and AS level
Two universality classes
Fractal networks:
WWW
Biological networks: protein interactions, metabolic, genetic (Meyer-Ortmanns, Khang), taxonomy, tree of life, protein homology network, neural activity network.
Non-Fractal networks:
Internet (routers and AS level)
Social networks (citations (Khang), IMDB)
Models based on uncorrelated preferential attachment
Two ways to calculate fractal dimensions
Box covering method Cluster growing method
In homogeneous systems (all nodes with similar k) both definitions agree:
percolation
Box Covering= flat average Cluster Growing = biased
power law
Different methods yield different results due to heterogeneous topology
exponential
Box covering reveals the self similarity. Cluster growth reveals the small world. NO CONTRADICTION! SAME HUBS ARE USED MANY TIMES IN CG.
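The cluster-growing measurement can be sketched as follows: grow a ball of a given radius around each seed node and average the mass (number of nodes) over seeds. The function names and the star-shaped test network are assumptions for illustration. The star makes the bias visible, since the hub ends up inside almost every cluster.

```python
from collections import deque

def mass_within(adj, seed_node, radius):
    """Number of nodes within `radius` links of seed_node, the seed included."""
    dist = {seed_node: 0}
    queue = deque([seed_node])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand past the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist)

def average_cluster_mass(adj, radius):
    """Cluster-growing average: mean mass over all possible seed nodes."""
    return sum(mass_within(adj, u, radius) for u in adj) / len(adj)

# Hypothetical example: a star with hub 0 and 10 leaves.
star = {0: list(range(1, 11))}
star.update({i: [0] for i in range(1, 11)})
m1 = average_cluster_mass(star, 1)
m2 = average_cluster_mass(star, 2)
```

In the box-covering method each node belongs to exactly one box, whereas here the hub is counted once per seed, which is the sense in which the same hubs are used many times.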
Renormalization in Complex Networks
NOW, REGARD EACH BOX AS A SINGLE NODE AND ASK: WHAT IS THE DEGREE DISTRIBUTION OF THE NETWORK OF BOXES AT DIFFERENT SCALES?
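A minimal sketch of this renormalization step, assuming a box assignment is already available: every box becomes one node, two boxes are linked if any of their members are linked, and the degree distribution of the coarse network is then measured. The names `renormalize` and `degree_distribution` and the toy path are illustrative.

```python
from collections import Counter

def renormalize(adj, box_of):
    """Collapse each box into one node; boxes are linked if any members are linked."""
    coarse = {b: set() for b in set(box_of.values())}
    for u, neighbours in adj.items():
        for v in neighbours:
            bu, bv = box_of[u], box_of[v]
            if bu != bv:
                coarse[bu].add(bv)
                coarse[bv].add(bu)
    return coarse

def degree_distribution(adj):
    """P(k): fraction of nodes with degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical example: the path 0-1-2-3-4-5 tiled by three boxes of two nodes.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
box_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
coarse = renormalize(path, box_of)
p_k = degree_distribution(coarse)
```

Applying `renormalize` repeatedly, with a fresh box covering each time, gives the network at successively larger scales.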
Renormalization of WWW network with
Statistical properties are invariant under renormalization
WWW PIN
E.coli
Internet
Self-similarity: Invariant under renormalization
Internet is not fractal (dB → infinity), but it is renormalizable
FRACTALS NON-FRACTALS
DYNAMICS: Turning back the time
Repeatedly BOXING the network is the same as going back in time: from a single node to present day.
renormalization
time evolution
Can we “predict” the past, if not the future?
ancestral node
present-day network
THE RENORMALIZATION SCHEME
time evolution
Evolution of complex networks
opening boxes
How does Modularity arise?
The boxes have a physical meaning = self-similar nested communities
[Diagram: time evolution from the ancestral node to the present-day network; renormalization runs in reverse]
How to identify communities in complex networks?
Classes of genes in the yeast proteome
Is evolution of PIN fractal?
Ancestral Prokaryote Cell
Yeast, Other Fungi
Ancestral yeast
Animals + Plants
Ancestral Fungus
Archaea + Bacteria
Ancestral Eukaryote
present day
~ 300 million years ago
1 billion years ago
1.5 billion years ago
Following the phylogenetic tree of life:
3.5 billion years ago
COG database. von Mering et al., Nature (2002)
Suggests that present-day networks could have been created following a self-similar, fractal dynamics.
Same fractal dimension and scale-free exponent over 3.5 billion years…
Renormalization following the phylogenetic tree
P. Uetz, et al. Nature 403 (2000).
Emergence of Modularity in PIN
Boxes are related to the biologically relevant functional modules in the yeast protein interactome
time evolution renormalization
present-day network
translation transcription protein-fate cellular-fate organization
ancestral cell
Emergence of modularity in metabolic networks
Appearance of functional modules in E. coli metabolic network. More robust than non-fractal networks.
Scaling theory of modularity
How are the modules/communities linked?
k: degree of the nodes
k’: degree of the communities
Renormalization: k’ = s k (community degree = factor × node degree, with factor s < 1)
Example: k = 8, s = 1/4, k’ = 2
Gallos et al. PNAS (2007)
Theoretical approach to modular networks: Scaling theory to the rescue
WWW
The larger the module, the smaller its connectivity
new exponent describing how modules link
Scaling relations
A theoretical prediction relating the different exponents
new scaling relation
boxes
distance
degree
new exponent
Scaling relations
The communities also follow a self-similar pattern
WWW Metabolic
Scaling relation works
fractals
communities/modules
scale-free
prediction
What is the origin of topological fractality?
HINT: the key to understanding fractals is in the degree correlations P(k1,k2), not in P(k)
Can you see the difference?
Internet map Yeast protein map
E.coli metabolic map
NON FRACTAL FRACTAL
Compact cluster
Quantifying correlations P(k1,k2):
Probability to find a node with k1 links connected with a node of k2 links
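As an illustration of how P(k1,k2) can be estimated from a network, the sketch below counts, over all link ends, how often a node of degree k1 is attached to a node of degree k2. The normalisation by the total number of link ends is an assumption (the slides do not state one), and the function name and toy star are illustrative.

```python
from collections import Counter

def joint_degree_distribution(adj):
    """P(k1, k2): fraction of link ends joining a degree-k1 node to a degree-k2 node."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    counts = Counter()
    for u, nbrs in adj.items():
        for v in nbrs:
            # Each undirected link is visited once from each end, so P is symmetric.
            counts[(degree[u], degree[v])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

# Hypothetical example: a star with hub 0 and 4 leaves (hub linked only to non-hubs).
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
p = joint_degree_distribution(star)
```

In a network where hubs connect to hubs, most of the weight sits at large k1 and large k2; where hubs connect to non-hubs, it sits off the diagonal, as in this star.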
Internet map - non fractal Metabolic map - fractal
[Plots: P(k1,k2) as a function of log(k1) and log(k2) for each map, with high- and low-probability regions marked]
Hubs connected with hubs Hubs connected with non-hubs
Gallos et al. (2007)
Quantify anticorrelation between hubs at all length scales
hubs
hubs
Renormalize
Hubs connected directly
Hub-Hub Correlation function: fraction of hub-hub connections
Hub-hub correlations organized in a self-similar way
A larger de implies more anticorrelations
(fractal) (non-fractal)
Anticorrelations are essential for fractal structures
non-fractal
fractal
Exponent de determines the joint probability distribution
What is the origin of fractality?
Non-fractal networks
• very compact networks
• hubs connected with other hubs
• strong hub-hub “attraction”
• assortativity
Examples: Internet, social networks. All available models: BA model, hierarchical, random scale-free, JKK, etc.

Fractal networks
• less compact networks
• hubs connected with non-hubs
• strong hub-hub “repulsion”
• disassortativity
Examples: WWW, PIN, metabolic, genetic, neural networks, protein homology, taxonomy
How to model it? renormalization reverses time evolution
Mode II, Mode I
[Diagram: time runs forward through the growth steps; renormalizing reverses them]
Both mass and degree increase exponentially with time
Scale-free: offspring nodes attached to their parents (m = 2 in this case)
Song, Havlin, Makse, Nature Physics, 2006
How does the length increase with time?
Mode II: FRACTAL
Mode I: NON-FRACTAL, SMALL WORLD
Combine two modes together
[Diagram: time runs forward through the growth steps; renormalizing reverses them]
e = 0.5, m = 2
Mode I with probability e, Mode II with probability 1-e
Model
A multiplicative growth process of the number of nodes and links
Probability e: hubs always connected; strong hub attraction should lead to non-fractal
Probability 1-e: hubs never connected; strong hub repulsion should lead to fractal
Analogous to the duplication/divergence mechanism in proteins?
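A minimal sketch of one way to implement this growth rule, assuming that at each step every link end receives m offspring nodes, and that the parent-parent link is kept with probability e (Mode I) or replaced by a link between one offspring of each parent with probability 1-e (Mode II). The edge-list representation and the function name `grow` are illustrative choices.

```python
import random

def grow(edges, n_nodes, m, e, rng):
    """One growth step. `edges` is a list of (u, v) pairs over nodes 0..n_nodes-1."""
    new_edges = []
    next_id = n_nodes
    for u, v in edges:
        # Each end of every link receives m offspring nodes.
        kids_u = list(range(next_id, next_id + m))
        kids_v = list(range(next_id + m, next_id + 2 * m))
        next_id += 2 * m
        new_edges += [(u, c) for c in kids_u] + [(v, c) for c in kids_v]
        if rng.random() < e:
            new_edges.append((u, v))                  # Mode I: parents stay linked
        else:
            new_edges.append((kids_u[0], kids_v[0]))  # Mode II: link moves to offspring
    return new_edges, next_id

# Start from a single link and grow three generations with m = 2, e = 0.5.
rng = random.Random(1)
edges, n = [(0, 1)], 2
for _ in range(3):
    edges, n = grow(edges, n, m=2, e=0.5, rng=rng)
```

Under these assumptions the number of links is multiplied by exactly 2m + 1 at every step, whichever mode is chosen, and a tree stays a tree.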
For both models, at each step the total number of nodes scales as n = 2m + 1 (N(t+1) = n N(t)). Now we investigate how the lengths transform, which differs between the modes:
Mode I: L(t+1) = L(t) + 2
Mode II: L(t+1) = 2L(t) + 1
Mode III: L(t+1) = 3L(t)
This leads to two different scaling laws relating N and L.
Different growth modes lead to different topologies
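To see why the modes differ, the length recursions can simply be iterated. The sketch below starts each mode from L = 1 (an arbitrary starting value chosen for illustration) and applies the rule five times.

```python
def iterate_length(step, l0, steps):
    """Apply the recursion L(t+1) = step(L(t)) repeatedly, starting from l0."""
    lengths = [l0]
    for _ in range(steps):
        lengths.append(step(lengths[-1]))
    return lengths

mode_1 = iterate_length(lambda L: L + 2, 1, 5)      # additive growth
mode_2 = iterate_length(lambda L: 2 * L + 1, 1, 5)  # multiplicative growth, factor 2
mode_3 = iterate_length(lambda L: 3 * L, 1, 5)      # multiplicative growth, factor 3
```

Since N grows by a constant factor n per step, a length that grows linearly in t (Mode I) goes as the logarithm of N, which is the small-world behaviour. A length that grows by a constant factor a per step (a = 2 for Mode II, since L + 1 doubles, and a = 3 for Mode III) instead gives a power law between N and L with exponent ln n / ln a.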
Suppose we have e probability to have mode I, 1-e probability to have mode II and mode III. Then we have:
or
Dynamical model
Model predicts all exponents in terms of growth rates
At each step the total mass scales by a constant factor n and all the degrees scale by a constant factor s.
The length scales by a constant factor a; we obtain:
We predict the fractal exponents:
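The formulas themselves are not preserved in this transcript. The sketch below assumes the relations given in the published version of the model (Song, Havlin, Makse, Nature Physics, 2006): dB = ln n / ln a, dk = ln s / ln a and γ = 1 + dB/dk. The function name and the sample growth rates (n = 5, s = 2, a = 3) are illustrative assumptions, not values from the slides.

```python
import math

def predicted_exponents(n, s, a):
    """Exponents from per-step growth factors of mass (n), degree (s) and length (a)."""
    d_b = math.log(n) / math.log(a)   # fractal (box) dimension
    d_k = math.log(s) / math.log(a)   # degree exponent of the boxes
    gamma = 1 + d_b / d_k             # degree-distribution exponent
    return d_b, d_k, gamma

# Hypothetical growth rates, for illustration only.
d_b, d_k, gamma = predicted_exponents(n=5, s=2, a=3)
```

Note that γ depends only on n and s, since the factor ln a cancels in the ratio dB/dk.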
Predictions
Model reproduces local small world, scale-free and fractality
NON-FRACTAL
• attraction between hubs
• non-fractal
• small world globally
FRACTAL
• repulsion between hubs leads to fractal topology
• small world locally inside well defined communities
yeast
The model reproduces the main features of real networks
Case 1: e = 0.8: FRACTALS Case 2: e = 1.0: NON-FRACTALS
Summary of scaling exponents and scaling relations
Mass:
Links:
Hub-hub correlations:
Modularity ratio
Modularity exponent:
Number of hub-hub links
Number of links outside modules
Number of links inside modules
Modularity is also scale-invariant
Protein Homology
Similarities between sequences of amino acids (BLAST)
Network of 5 million proteins
1.2 TB of data growing at 50 GB per month.
Adai et al. J Mol Biol (2004)
Yeast protein interaction
Large modularity Ultramodularity
Time evolution in yeast network
Multiplicative and exponential growth in yeast PIN
Length-scales, number of conserved proteins and degree
Self-similar learning dynamics of the brain
Calcium imaging of spontaneous action potentials in large neuronal populations in a slice of the medial prefrontal cortex of a mouse brain.
Rafael Yuste and Ikegaya
John Cage: minimalist avant-garde music
t = 15 sec t = 30 sec t = 45 sec t = 60 sec
t = 75 sec t = 90 sec t = 105 sec t = 120 sec
Time evolution of the network
The degree distribution P(k) is invariant under evolution. The plots go from 30 sec to 120 sec
The fractal dimension is also invariant under evolution from 30 sec to 120 sec
The degree exponents and fractal dimension are invariant under the time evolution
Scale-transformation of degree
We verify the formula:
k(t1) = S(t1|t2) k(t2)
Here we fix t2 = 120 sec, and take t1 from 30 sec to 105 sec. The linear dependency is verified for different times t1.
From the theory: $N(t) = s(t)^{\gamma - 1}$
The inset shows that both N(t) and s(t) increase exponentially:
N(t) ~ exp(0.014t)
s(t) ~ exp(0.021t)
This gives rise to the following scaling relation:
Confirmation of the scaling formula for the degree exponent as a function of the fractal exponents
Tolerance of the network under random failure and intentional attack
We plot the largest cluster size as a function of the fraction p of nodes removed
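A sketch of how such a curve can be computed, under the assumption that "intentional attack" means removing the highest-degree nodes first (degrees taken from the intact network) and "random failure" means removing nodes uniformly at random. The function names and the star-shaped test network are illustrative.

```python
import random

def largest_cluster(adj, removed):
    """Size of the largest connected cluster after deleting the `removed` nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def attack(adj, p, targeted, rng):
    """Remove a fraction p of nodes: highest degree first if targeted, else at random."""
    nodes = list(adj)
    if targeted:
        nodes.sort(key=lambda u: len(adj[u]), reverse=True)
    else:
        rng.shuffle(nodes)
    removed = nodes[:int(p * len(nodes))]
    return largest_cluster(adj, removed)

# Hypothetical example: a star with hub 0 and 10 leaves, removing 10% of the nodes.
star = {0: list(range(1, 11))}
star.update({i: [0] for i in range(1, 11)})
after_attack = attack(star, 0.1, True, random.Random(0))
after_failure = attack(star, 0.1, False, random.Random(0))
```

Sweeping p from 0 to 1 and plotting the returned size for both settings gives the two tolerance curves; for random failure the result should be averaged over many realisations.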
A new principle of network dynamics
1930: solid-state physics, big world
1960: Erdos-Renyi model, small world; democracy = socialism
1999: BA model, “rich-get-richer” = capitalism
2005: fractal model of modularity, “rich-get-richer” at the expense of the “poor” = globalization
Less vulnerable to intentional attacks: designed by evolutionary pressure.
Summary
• In contrast to common belief, many real world networks are self-similar.
• FRACTALS: WWW, Protein interactions, metabolic networks, neural networks, homology networks, tree of life.
• NON-FRACTALS: Internet, social, all models.
• Communities/modules are self-similar, as well.
• Scaling theory describes the dynamical evolution.
• Boxes are related to the functional modules in metabolic and protein networks.
• Origin of self similarity: anticorrelation between hubs
• Fractal networks are less vulnerable than non-fractal networks
Graph theoretical representation of a metabolic network
(a) A pathway (catalyzed by Mg2+-dependent enzymes).
(b) All interacting metabolites are considered equally.
(c) For many biological applications it is useful to ignore co-factors, such as the high energy-phosphate donor ATP, which results in a second type of mapping that connects only the main source metabolites to the main products.
More topological fractals
WWW
nd.edu domain
Hollywood film actors
212,000 actors
300,000 web-pages
Burning algorithms
Compact box burning: CBB. Song et al. JSTAT (2007)
Maximum excluded mass burning: MEMB. Burning from the hubs with the radius r
Minimizing the number of boxes is analogous to maximizing the mass of each box: Modularity