
Page 1: ECS 289 / MAE 298, Lecture 3 April 8, 2014

ECS 289 / MAE 298, Lecture 3, April 8, 2014

“Preferential Attachment, Network Growth, Master Equations”

Page 2: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Announcements

• Videos available on Smartsite

• TA office hours
– Sam Johnson, Weds 1-2pm
– Vikram Vijayaraghavan, Mon 2-3pm
– Alex Waagen – When is good?

• Class survey results – 30 responses

• HW1 — includes project pitch or mathematical analysis

• Project
– Teams of 2-3 people
– Negative results are OK
– Ideally aim to have a result suitable for a journal or conference

Page 3: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Project pitch

• One paragraph (∼200 words) describing your idea. Submitted via Smartsite and shared with the class.

• Skill sets to merge: System or Application / Method / Data

• We have a lot of ideas:

Page 4: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Back to basics of networks
Recall: broad-scale degree distribution

Illustration: Analysis of the flight-weighted U.S. Air Transportation Network (airport level), Bounova 2009

• The flight-weighted U.S. air transportation network exhibits a partial power-law distribution.
• Power-law distribution for small and medium size nodes, up to approximately 250,000 flights per year.
• Non-power-law distribution above 250,000 flights per year.
• Hypothesis: limits to scale and capacitated nodes (i.e. capacity-constrained airports) are present in the non-power-law part of the distribution.

[Figure: example broad-scale degree distributions: social contacts (Szendroi and Csanyi) and airport traffic (Bounova 2009).]

[Figure: degree distribution (node degree, i.e. the number of interactions per protein) of the Drosophila protein-interaction network. Protein interactions: Giot et al., Science 302, 2003. The distribution decays slowly, close to a power law, and is fit by a combination of power-law and exponential decay.]

• Small data sets: power laws vs. log-normal, stretched-exponential, etc.

• Exceptions: Power grids? Router-level Internet?

Page 5: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Approximating broad scale by a Power Law
Properties of a power law PDF

(PDF = probability density function)

p_k = A k^{−γ}

• To be a properly defined probability distribution, we need γ > 1.

• For 1 < γ ≤ 2, both the average 〈k〉 and the variance σ^2 are infinite!

• For 2 < γ ≤ 3, the average 〈k〉 is finite, but the variance σ^2 is infinite!

• For γ > 3, both the average and the variance are finite.
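A minimal Python sketch of these moment properties (assuming numpy is available; the sampling trick and parameter names are illustrative, not from the lecture): draw from a continuous density ∝ x^{−γ} for x ≥ 1 by inverse-transform sampling and watch which sample moments stabilize as the sample grows.

import numpy as np

rng = np.random.default_rng(0)
for gamma in (1.5, 2.5, 3.5):
    b = gamma - 1.0                      # tail exponent of the CCDF
    for n in (10**3, 10**5, 10**7):
        u = 1.0 - rng.random(n)          # uniform on (0, 1]
        x = u ** (-1.0 / b)              # inverse transform: density ~ x^(-gamma), x >= 1
        print(f"gamma={gamma}  n={n:>9}  sample mean={x.mean():12.2f}  sample var={x.var():14.2f}")
# gamma = 1.5: the sample mean keeps growing with n (infinite mean and variance)
# gamma = 2.5: the mean stabilizes but the variance keeps growing (infinite variance)
# gamma = 3.5: both the mean and the variance stabilize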

Page 6: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Recall: The “classic” random graph, G(N, p) (A Classic Null Model)

• P. Erdos and A. Renyi, “On random graphs”, Publ. Math. Debrecen, 1959.

• P. Erdos and A. Renyi, “On the evolution of random graphs”, Publ. Math. Inst. Hungar. Acad. Sci., 1960.

• E. N. Gilbert, “Random graphs”, Annals of Mathematical Statistics, 1959.

[Figure: a small random graph realization.]

• Start with N isolated vertices.

• Add random edges one at a time. N(N − 1)/2 total edges are possible.

• After E edges, the probability p of any edge is p = 2E/(N(N − 1)).

What does the resulting graph look like? (Typical member of the ensemble)
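A small Python sketch of exactly this construction (names like build_random_graph are illustrative, not from the lecture): start with N isolated vertices, add E distinct random edges, and read off the implied edge probability p = 2E/(N(N − 1)).

import random

def build_random_graph(N, E, seed=0):
    rng = random.Random(seed)
    edges = set()
    while len(edges) < E:
        i, j = rng.sample(range(N), 2)     # two distinct vertices, uniformly at random
        edges.add((min(i, j), max(i, j)))  # undirected; the set rejects duplicate edges
    return edges

N, E = 1000, 2000
graph_edges = build_random_graph(N, E)
p = 2 * E / (N * (N - 1))                  # edge probability implied by E edges
print(f"N = {N}, E = {E}, p = 2E/(N(N-1)) = {p:.5f}")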

Page 7: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Emergence of a “giant component”

• p_c = 1/N.

• For p < p_c, C_max ∼ log(N).

• For p > p_c, C_max ∼ A · N.

(Average node degree t = pN, so t_c = 1.)

Branching process (Galton–Watson); “tree”-like at t_c = 1.
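A quick numerical illustration of the transition, assuming Python with networkx installed: sweep the average degree pN through 1 and record the size of the largest component.

import networkx as nx

N = 10_000
for avg_deg in (0.5, 0.9, 1.0, 1.1, 1.5, 2.0):
    p = avg_deg / N
    G = nx.gnp_random_graph(N, p, seed=42)
    giant = max(nx.connected_components(G), key=len)
    print(f"pN = {avg_deg:3.1f}   largest component: {len(giant):6d} nodes "
          f"({len(giant) / N:.1%} of N)")
# Below pN = 1 the largest component is only O(log N) nodes;
# above pN = 1 it contains a finite fraction of all N nodes.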

Page 8: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Degree distribution of G(n, p)

• The absence or presence of an edge is independent for all edges.

– Probability for node i to connect to all other n nodes is p^n.

– Probability for node i to be isolated is (1 − p)^n.

– Probability for a vertex to have degree k follows a binomial distribution:

p_k = (n choose k) p^k (1 − p)^{n−k}

Page 9: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Binomial converges to Poisson as n→∞

• Recall that z = (n − 1)p ≈ np (for large n).

lim_{n→∞} p_k = lim_{n→∞} (n choose k) p^k (1 − p)^{n−k}

= lim_{n→∞} [n! / ((n − k)! k!)] (z/n)^k (1 − z/n)^{n−k}

= z^k e^{−z} / k!

For more details see for instance: http://en.wikipedia.org/wiki/Poisson_distribution
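A short numerical check of this limit, assuming Python with scipy available: compare the Binomial(n, z/n) and Poisson(z) probability mass functions as n grows.

from scipy.stats import binom, poisson

z = 4.0
for n in (10, 100, 1_000, 10_000):
    p = z / n
    max_diff = max(abs(binom.pmf(k, n, p) - poisson.pmf(k, z)) for k in range(30))
    print(f"n = {n:>6}   max |Binomial(n, z/n) - Poisson(z)| over k < 30: {max_diff:.2e}")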

Page 10: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Poisson Distribution

Page 11: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Diameter of G(n,p)

The diameter of a graph is the maximum distance between any two connected vertices in the graph.

• Below the phase transition, only tiny components exist. In some sense, the diameter is infinite.

• Above the phase transition, all vertices in the giant component are connected to one another by some path.

• The mean number of neighbors a distance l away is z^l. To determine the diameter we want z^l ≈ n. Thus the typical distance through the network is l ≈ log n / log z.

• This is a small-world network: diameter d ∼ O(log N).
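A sketch of this estimate, assuming Python with networkx: measure the mean shortest-path length in the giant component of G(n, p) and compare it with log n / log z.

import math
import networkx as nx

z = 6.0
for n in (500, 2_000, 4_000):
    G = nx.gnp_random_graph(n, z / n, seed=1)
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    measured = nx.average_shortest_path_length(giant)
    estimate = math.log(n) / math.log(z)
    print(f"n = {n:>5}   measured <l> = {measured:.2f}   log(n)/log(z) = {estimate:.2f}")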

Page 12: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Clustering coefficient

A measure of transitivity: if node A is known to be connected to B and to C, does this make it more likely that B and C are connected?

(i.e., the friends of my friends are my friends)

• In E-R random graphs, all edges are created independently, so there is no clustering beyond chance: the clustering coefficient is simply C = p, which vanishes for large sparse graphs.
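A one-line check of this, assuming Python with networkx: the measured average clustering of a sparse G(n, p) is essentially just p.

import networkx as nx

n, z = 5_000, 10.0
p = z / n
G = nx.gnp_random_graph(n, p, seed=7)
print(f"p = {p:.4f}   measured average clustering = {nx.average_clustering(G):.4f}")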

Page 13: ECS 289 / MAE 298, Lecture 3 April 8, 2014

So, how well does G(n, p) model common real-world networks?

1. Phase transition: Yes! We see the emergence of a giant component in social and in technological systems.

2. Poisson degree distribution: NO! Many real networks have much broader distributions.

3. Small-world diameter: YES! Social systems, subway systems, the Internet, the WWW, biological networks, etc.

4. Clustering coefficient: NO!

Page 14: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Well then, why are random graphs important?

• Much of our basic intuition comes from random graphs.

• Phase transition and the existence of the giant component. Even if there is not a giant component, many systems have a dominant component much larger than all others.

From “The web is a bow tie” Nature 405, 113 (11 May 2000)

Page 15: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Generalized random graph – advanced HW
Accommodate any degree sequence

. . . The configuration model (1970s)
Molloy and Reed (1995)

• Specify a degree distribution p_k, such that p_k is the fraction of vertices in the network having degree k.

• We choose an explicit degree sequence by sampling in some unbiased way from p_k, and generate the set of n values k_i, the degree of vertex i.

• Think of attaching k_i “spokes” or “stubs” to each vertex i.

• Choose pairs of “stubs” (from two distinct vertices) at random, and join them. Iterate until done.

• Technical details: self-loops, parallel edges, ... (neglect in the n → ∞ limit); a stub-matching sketch follows below.
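A minimal Python sketch of the stub-matching step (the helper name configuration_model_edges is illustrative; networkx also ships a configuration_model generator): each vertex i receives k_i stubs, the stubs are shuffled, and consecutive pairs become edges. Self-loops and parallel edges are simply kept, since they are negligible as n → ∞.

import random

def configuration_model_edges(degree_sequence, seed=0):
    rng = random.Random(seed)
    # one stub per unit of degree: vertex i appears k_i times
    stubs = [i for i, k in enumerate(degree_sequence) for _ in range(k)]
    if len(stubs) % 2 == 1:
        raise ValueError("the degree sequence must sum to an even number")
    rng.shuffle(stubs)
    # pair consecutive stubs; self-loops and parallel edges are kept
    return list(zip(stubs[::2], stubs[1::2]))

deg_seq = [3, 3, 2, 2, 2, 1, 1, 1, 1]      # example sequence (sums to 16, so 8 edges)
print(configuration_model_edges(deg_seq))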

Page 16: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Back to power laws
Power laws in social systems

• Popularity of web pages and web search terms: N_k ∼ k^{−1}

• Rank of city sizes (“Zipf’s Law”): N_k ∼ k^{−1}

• Pareto. In 1906, Pareto made the now famous observation that twenty percent of the population owned eighty percent of the property in Italy, later generalised by Joseph M. Juran and others into the so-called Pareto principle (also termed the 80-20 rule) and generalised further to the concept of a Pareto distribution.

• Usually explained in social systems by “the rich get richer”(preferential attachment).

Page 17: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Known Mechanisms for Power Laws

• Phase transitions (singularities)

• Random multiplicative processes (fragmentation)

• Combination of exponentials (e.g. word frequencies)

• Preferential attachment / Proportional attachment
(Polya 1923, Yule 1925, Zipf 1949, Simon 1955, Price 1976, Barabasi and Albert 1999)

Attractiveness (rate of growth) is proportional to size: ds/dt ∝ s

Page 18: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Origins of preferential attachment

• 1923 — Polya, urn models.

• 1925 — Yule, explain genetic diversity.

• 1949 — Zipf, distribution of city sizes (1/f ).

• 1955 — Simon, distribution of wealth in economies. (“The rich get richer”.)

• [Interesting note: in sociology this is referred to as the Matthew effect after the biblical edict, “For to every one that hath shall be given ... ” (Matthew 25:29)]

Page 19: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Preferential attachment in networks

D. J. de S. Price: “Cumulative advantage”

• D. J. de S. Price, “Networks of scientific papers”, Science, 1965. First observation of power laws in a network context. Studied the paper citation network.

• D. J. de S. Price, “A general theory of bibliometric and other cumulative advantage processes”, J. Amer. Soc. Info. Sci., 1976.

Cumulative advantage seemed like a natural explanation for paper citations:

The rate at which a paper gains citations is proportional to the number it already has. (Probability to learn of a paper is proportional to the number of references it currently has.)

Page 20: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Preferential attachment in networks, continued

Cumulative advantage did not gain traction at the time. But it was rediscovered some decades later by Barabasi and Albert, in the now famous (about 1000 citations in SCI) paper:

“Emergence of Scaling in Random Networks”, Science 286, 1999.

They coined the term “preferential attachment” to describe the phenomenon.

(de S. Price’s work resurfaced after BA became widely renowned.)

Page 21: ECS 289 / MAE 298, Lecture 3 April 8, 2014

The Barabasi and Albert model

• A discrete time process.

• Start with single isolated node.

• At each time step, a new node arrives.

• This node makes m connections to already existing nodes. (Why m edges?)

• We are interested in the limit of large graph size.
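A minimal Python sketch of this growth process (not the authors' code). One common trick: keep a list that repeats each node once per unit of degree, so a uniform draw from the list is a degree-proportional draw. The slide starts from a single isolated node; here the sketch seeds the process with a small (m + 1)-clique so the first arrival has m distinct targets, which does not affect the large-n behavior.

import random
from collections import Counter

def grow_pa_graph(n, m, seed=0):
    rng = random.Random(seed)
    # seed graph: a clique on m+1 nodes (any small seed gives the same large-n behavior)
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    targets = [v for e in edges for v in e]      # each node appears deg(v) times
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))      # uniform over stubs = preferential
        for t in chosen:
            edges.append((new, t))
            targets.extend((new, t))
    return edges

edges = grow_pa_graph(n=50_000, m=2)
degree = Counter(v for e in edges for v in e)
hist = Counter(degree.values())
for k in (2, 4, 8, 16, 32, 64):
    print(f"k = {k:3d}   fraction of nodes with degree k: {hist[k] / len(degree):.5f}")
# The fractions fall off roughly as k^-3, as derived on the following slides.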

Page 22: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Visualizing a PA graph (m = 1) at n = 5000

Page 23: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Probabilistic treatment

• Start at t = 0 with one isolated node. At time t the total number of nodes is n = t.

• Probability that the incoming node attaches to node j:

Pr(t + 1 → j) = d_j / Σ_j d_j

• The normalization constant is easy (but time dependent): Σ_j d_j = 2mt = 2mn
(Each node 1 through t contributes m edges.)
(Each edge augments the degree of two nodes.)

• Probability of connecting to a particular node of degree k at time t: q_{k,t} = k/(2mt)

(Using q_{k,t} since p_{k,t} is reserved for the degree distribution.)

Page 24: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Network evolution
Process on the degree sequence

• As shown, q_{k,t} = k/(2mt).

• Also, when a node of degree k gains an attachment, it becomes a node of degree k + 1.

• When the new node arrives, it increases by one the number of nodes of degree m.

Page 25: ECS 289 / MAE 298, Lecture 3 April 8, 2014

The “Master Equation” / “rate eqns” / “kinetic theory”

(Let n_{k,t} ≡ expected number of nodes of degree k at time t, and n_t ≡ total number of nodes at time t. Note n_t = t.)

For each arriving link:

• For k > m : n_{k,t+1} = n_{k,t} + [(k − 1)/(2mt)] n_{k−1,t} − [k/(2mt)] n_{k,t}

• For k = m : n_{m,t+1} = n_{m,t} + 1 − [m/(2mt)] n_{m,t}

But each arriving node contributes m links:

• For k > m : n_{k,t+1} = n_{k,t} + [m(k − 1)/(2mt)] n_{k−1,t} − [mk/(2mt)] n_{k,t}

• For k = m : n_{m,t+1} = n_{m,t} + 1 − [m^2/(2mt)] n_{m,t}
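A sketch that iterates the per-node rate equations above forward in t (pure Python, no extra libraries) and compares the resulting fractions n_{k,t}/t with the steady-state answer derived on the following slides, p_k = 2m(m + 1)/(k(k + 1)(k + 2)). The initial condition is immaterial for the steady state; a degree cutoff kmax is imposed for the numerics.

m, kmax, T = 2, 100, 50_000
n = [0.0] * (kmax + 1)            # n[k] ~ expected number of nodes of degree k
for t in range(1, T + 1):         # one arriving node (m links) per time step
    new = n[:]
    new[m] = n[m] + 1 - (m / (2 * t)) * n[m]
    for k in range(m + 1, kmax + 1):
        new[k] = n[k] + ((k - 1) / (2 * t)) * n[k - 1] - (k / (2 * t)) * n[k]
    n = new
for k in (2, 5, 10, 30):
    exact = 2 * m * (m + 1) / (k * (k + 1) * (k + 2))
    print(f"k = {k:3d}   n_k/t = {n[k] / T:.3e}   2m(m+1)/(k(k+1)(k+2)) = {exact:.3e}")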

Page 26: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Translating from number of nodes n_{k,t} to probabilities p_{k,t}

p_{k,t} = n_{k,t}/n(t) = n_{k,t}/t   →   n_{k,t} = t p_{k,t}

For each arriving node (contributing m links):

• For k > m : (t + 1) p_{k,t+1} = t p_{k,t} + [(k − 1)/2] p_{k−1,t} − [k/2] p_{k,t}

• For k = m : (t + 1) p_{m,t+1} = t p_{m,t} + 1 − [m/2] p_{m,t}

Page 27: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Steady-state distribution

We want to consider the final, steady state: p_{k,t} = p_k.

• For k > m : (t + 1) p_k = t p_k + [(k − 1)/2] p_{k−1} − [k/2] p_k

• For k = m : (t + 1) p_m = t p_m + 1 − [m/2] p_m

Rearranging and solving for p_k:

• For k > m : p_k = [(k − 1)/(k + 2)] p_{k−1}

• For k = m : p_m = 2/(m + 2)

Page 28: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Recursing down to p_m

p_k = [(k − 1)(k − 2) · · · m] / [(k + 2)(k + 1) · · · (m + 3)] · p_m = [m(m + 1)(m + 2)] / [(k + 2)(k + 1) k] · [2/(m + 2)]

p_k = 2m(m + 1) / [(k + 2)(k + 1) k]

For k ≫ 1:

p_k ∼ k^{−3}
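A few lines of Python confirming the algebra: iterating p_k = [(k − 1)/(k + 2)] p_{k−1} from p_m = 2/(m + 2) reproduces the closed form, and k^3 p_k approaches the constant 2m(m + 1), i.e. p_k ∼ k^{−3}.

m = 3
p = {m: 2 / (m + 2)}                       # p_m = 2/(m+2)
for k in range(m + 1, 1001):
    p[k] = (k - 1) / (k + 2) * p[k - 1]    # p_k = (k-1)/(k+2) * p_{k-1}

for k in (3, 10, 100, 1000):
    closed = 2 * m * (m + 1) / (k * (k + 1) * (k + 2))
    print(f"k = {k:4d}   recursion: {p[k]:.3e}   closed form: {closed:.3e}   "
          f"k^3 * p_k = {k**3 * p[k]:.2f}")
# k^3 * p_k approaches 2m(m+1) = 24, i.e. p_k ~ k^-3 for large k.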

Page 29: ECS 289 / MAE 298, Lecture 3 April 8, 2014

For more on master equations

• “Rate Equations Approach for Growing Networks”, P. L. Krapivsky and S. Redner, invited contribution to the Proceedings of the XVIII Sitges Conference on “Statistical Mechanics of Complex Networks”.

• Dynamical Processes on Complex Networks, Barrat, Barthelemy, Vespignani.

Applications to cluster aggregation (e.g. Erdos-Renyi):

• “Kinetic theory of random graphs: From paths to cycles”, E. Ben-Naim and P. L. Krapivsky, Phys. Rev. E 71, 026129 (2005).

• “Local cluster aggregation models of explosive percolation”, R. M. D’Souza and M. Mitzenmacher, Physical Review Letters 104, 195702, 2010.

Page 30: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Did we prove the behavior of the degree distribution?

Page 31: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Details glossed over

1. Proof of convergence to the steady state (i.e. prove p_{k,t} → p_k)

2. Proof of concentration (need to show that fluctuations in each realization are small, so that the average n_k describes well most realizations of the process).

– For this model, we can use the second-moment method (show that the effect of one different choice at time t dies out exponentially in time).

Page 32: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Issues

• Are there really true power laws in networks? (This usually requires huge systems and no constraints on resources.)

• Only get γ = 3!

• Note only the old nodes are capable of attaining high degree.

Page 33: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Generalizations of Pref. Attach.

• Mix steps of P.A. with steps of random attachment.

Dorogovtsev SN, Mendes JFF, Samukhin AN (2000), Phys Rev Lett 85.

Achieves 2 < γ < 3.

• Consider non-linear P.A., where the probability of attaching to a node of degree k ∼ (d_k)^b.

“Organization of growing random networks”, P. L. Krapivsky and S. Redner, Phys. Rev. E 63, 066123 (2001).

– Sublinear (b < 1): degree distribution decays faster than a power law.

– Superlinear (b > 1): one node emerges as the center of a“star”-like topology.
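A small simulation sketch of non-linear P.A. (assuming numpy; one edge per arriving node, so m = 1), where the attachment weight of a degree-k node is k^b. It illustrates the sublinear vs. superlinear behavior described above.

import numpy as np

def grow_nonlinear_pa(n, b, seed=0):
    rng = np.random.default_rng(seed)
    deg = np.zeros(n)
    deg[0] = deg[1] = 1.0                       # seed: a single edge between nodes 0 and 1
    for new in range(2, n):
        w = deg[:new] ** b                      # attachment weight ~ (degree)^b
        target = rng.choice(new, p=w / w.sum())
        deg[new] += 1
        deg[target] += 1
    return deg

for b in (0.5, 1.0, 1.5):
    deg = grow_nonlinear_pa(10_000, b)
    print(f"b = {b}   max degree = {int(deg.max()):5d} out of 10,000 nodes")
# Sublinear b: the maximum degree stays modest and the degree distribution decays
# faster than a power law; superlinear b: one node collects nearly all the edges,
# i.e. a "star"-like topology.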

Page 34: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Alternatives to PA that yield p_k ∼ k^{−γ}

• Copying models
– WWW:
• “The web as a graph: measurements, models, and methods”, J. M. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, A. S. Tomkins, Proceedings of the 5th Annual International Conference on Computing and Combinatorics, 1999.
• “Stochastic models for the Web graph”, R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal, Proc. 41st IEEE Symp. on Foundations of Computer Science, pages 57–65, 2000.
– Biology (Duplication-Mutation-Complementation):
• “Modeling of Protein Interaction Networks”, Alexei Vazquez, Alessandro Flammini, Amos Maritan, Alessandro Vespignani, Complexus, Vol. 1, No. 1, 2003.

• Optimization models
– Fabrikant-Koutsoupias-Papadimitriou (2002).

– D’Souza-Borgs-Chayes-Berger-Kleinberg (2007).

Page 35: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Other approaches

• “Winners don’t take all: Characterizing the competition for links on the web”, D. M. Pennock, G. W. Flake, S. Lawrence, E. J. Glover, C. Lee Giles, PNAS 99 (2002).

• First mover advantage

• Second mover advantage

Page 36: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Further reading: (All refs available on the “references” tab of the course web page)

PA model of network growth

• Barabasi and Albert, “Emergence of Scaling in Random Networks”, Science 286, 1999.

• B. Bollobas, O. Riordan, J. Spencer, and G. Tusnady, “The degree sequence of a scale-free random process”, Random Structures and Algorithms 18(3), 279–290, 2001.

• Newman Review, pages 30-35.

• Durrett Book, Chapter 4.

Page 37: ECS 289 / MAE 298, Lecture 3 April 8, 2014

Further reading, cont.

Fitting power laws to data

• Newman Review, pages 12-13.

• M. Mitzenmacher, “A Brief History of Generative Models for Power Law and Lognormal Distributions”, Internet Mathematics 1(2), 226–251, 2003.

• A. Clauset, C. R. Shalizi and M. E. J. Newman, “Power-law distributions in empirical data”, SIAM Review, 2009.