dsouza supernova 2008

Layers of Networks (Towards a Science of Networks)

Raissa D’Souza
UC Davis
Dept of Mechanical and Aeronautical Eng.

Complexity Sciences Center
Santa Fe Institute

22 January 2007 CSE Advance 2

Networks:

Transportation networks / power grid (distribution / collection networks)

Biological networks
- protein interaction
- genetic regulation
- drug design

Computer networks

Social networks
- Immunology
- Information
- Commerce

Networks: Physical, Biological, Social

• Geometric versus virtual (Internet versus WWW).

• Natural / spontaneously arising versus engineered / built.

• Each network optimizes something unique.

• Identifying similarities and fundamental differences can guide future design/understanding.

1. How do we build a coherent distributed energy system integrating solar, wind, hydropower, bio-diesel, hydrogen, etc.?

2. Is old infrastructure introducing vulnerabilities in telecom?

• Definition of node can depend on level of representation.

Studying each network individually (though we know they interact)

• Topology (statistical properties of nodes and edges)

– degree and degree distribution (extremely varied)
– diameter (“small-world”)
– clustering coefficients
– assortative mixing
– betweenness, communities/partitioning, etc.

• Activity (information flows)

– epidemiology (humans and computers)
– Web search (ranking the web map)
– consensus formation / tipping points / phase transitions

Interactions between structure and function.
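One of the topology measures listed above, the diameter, can be computed with a breadth-first search. The sketch below is a minimal illustration, not a method from the talk. It reports both common conventions: the average shortest-path length and the longest shortest path. The function names and the four-node chain are hypothetical.

```python
from collections import deque

def bfs_distances(neighbors, source):
    """Hop counts from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_statistics(neighbors):
    """Return (average, maximum) shortest-path length over connected ordered pairs."""
    total = pairs = longest = 0
    for s in neighbors:
        for t, d in bfs_distances(neighbors, s).items():
            if t != s:
                total += d
                pairs += 1
                longest = max(longest, d)
    average = total / pairs if pairs else 0.0
    return average, longest

# Hypothetical toy network: a chain 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
average, longest = path_statistics(chain)
print(round(average, 3), longest)  # 1.667 3
```

Unreachable pairs are simply skipped, which is one possible choice for disconnected networks.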

Software call graphs and OSS developer networks

• Highly evolvable, modular, robust to mutation, exhibit punctuated equilibrium

• Open-source software as a “systems” / organization paradigm.

D’Souza, Filkov, Devanbu, Swaminathan, Hsu

NETWORK TOPOLOGY

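Connectivity matrix, M:

$$
M_{ij} =
\begin{cases}
1 & \text{if an edge exists between } i \text{ and } j \\
0 & \text{otherwise.}
\end{cases}
$$

$$
M =
\begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$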

Node degree is number of links.
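As a small illustration, node degrees can be read off the connectivity matrix as row sums. The sketch below uses the 5 × 5 matrix M from this slide. The function name and the `count_self_loops` flag are hypothetical, added because M has 1s on its diagonal and it is a modelling choice whether those count as links.

```python
# Connectivity matrix M from the slide (symmetric, with 1s on the diagonal).
M = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

def degrees(matrix, count_self_loops=True):
    """Return the degree of each node as the sum of its row."""
    result = []
    for i, row in enumerate(matrix):
        d = sum(row)
        if not count_self_loops:
            d -= row[i]  # drop the diagonal entry
        result.append(d)
    return result

print(degrees(M))                          # [4, 3, 2, 4, 2]
print(degrees(M, count_self_loops=False))  # [3, 2, 1, 3, 1]
```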

Broad Heterogeneity in node degree

e.g., the “Who-is-Who” network in Budapest (Balazs Szendroi and Gabor Csanyi)

Bayesian curve fitting → $p(k) = c\,k^{-\gamma} e^{-\alpha k}$

Random Power Law Graphs: (e.g., “Preferential Attachment”, Barabasi and Albert, Science 1999)

Hubs and leaves

letters to nature

NATURE | VOL 406 | 27 JULY 2000 | www.nature.com 379

called scale-free networks, which include the World-Wide Web [3–5], the Internet [6], social networks [7] and cells [8]. We find that such networks display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected even by unrealistically high failure rates. However, error tolerance comes at a high price in that these networks are extremely vulnerable to attacks (that is, to the selection and removal of a few nodes that play a vital role in maintaining the network’s connectivity). Such error tolerance and attack vulnerability are generic properties of communication networks.

The increasing availability of topological data on large networks, aided by the computerization of data acquisition, had led to great advances in our understanding of the generic aspects of network structure and development [9–16]. The existing empirical and theoretical results indicate that complex networks can be divided into two major classes based on their connectivity distribution $P(k)$, giving the probability that a node in the network is connected to $k$ other nodes. The first class of networks is characterized by a $P(k)$ that peaks at an average $\langle k \rangle$ and decays exponentially for large $k$. The most investigated examples of such exponential networks are the random graph model of Erdos and Renyi [9,10] and the small-world model of Watts and Strogatz [11], both leading to a fairly homogeneous network, in which each node has approximately the same number of links, $k \simeq \langle k \rangle$. In contrast, results on the World-Wide Web (WWW) [3–5], the Internet [6] and other large networks [17–19] indicate that many systems belong to a class of inhomogeneous networks, called scale-free networks, for which $P(k)$ decays as a power-law, that is $P(k) \sim k^{-\gamma}$, free of a characteristic scale. Whereas the probability that a node has a very large number of connections ($k \gg \langle k \rangle$) is practically prohibited in exponential networks, highly connected nodes are statistically significant in scale-free networks (Fig. 1).

We start by investigating the robustness of the two basic connectivity distribution models, the Erdos–Renyi (ER) model [9,10] that produces a network with an exponential tail, and the scale-free model [17] with a power-law tail. In the ER model we first define the $N$ nodes, and then connect each pair of nodes with probability $p$. This algorithm generates a homogeneous network (Fig. 1), whose connectivity follows a Poisson distribution peaked at $\langle k \rangle$ and decaying exponentially for $k \gg \langle k \rangle$.
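The ER construction described in the excerpt (define N nodes, then link each pair independently with probability p) can be sketched in a few lines. This is an illustrative sketch only. The function names, the seed, and the sizes n = 200, p = 0.02 are assumptions chosen for the example, not values from the paper.

```python
import random

def erdos_renyi(n, p, seed=None):
    """Adjacency sets for a random graph: each pair is linked with probability p."""
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def mean_degree(neighbors):
    """Average number of links per node."""
    return sum(len(v) for v in neighbors.values()) / len(neighbors)

g = erdos_renyi(n=200, p=0.02, seed=1)
# The mean degree should land near p * (n - 1) = 3.98.
print(mean_degree(g))
```

Because every pair is treated identically, no node is favoured, which is why the resulting degrees cluster around the mean.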

The inhomogeneous connectivity distribution of many real networks is reproduced by the scale-free model [17,18] that incorporates two ingredients common to real networks: growth and preferential attachment. The model starts with $m_0$ nodes. At every time step $t$ a new node is introduced, which is connected to $m$ of the already-existing nodes. The probability $\Pi_i$ that the new node is connected to node $i$ depends on the connectivity $k_i$ of node $i$ such that $\Pi_i = k_i / \sum_j k_j$. For large $t$ the connectivity distribution is a power-law following $P(k) = 2m^2/k^3$.
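The growth rule in the excerpt (each new node links to m existing nodes, chosen with probability proportional to their current degree) can be sketched as below. Two details are assumptions made for this illustration and are not stated in the excerpt: the m0 starting nodes are fully connected to each other so that every node has nonzero degree, and the m targets of a new node are required to be distinct. All names are hypothetical.

```python
import random

def preferential_attachment(n, m, m0=None, seed=None):
    """Grow a network to n nodes; return its edge list."""
    rng = random.Random(seed)
    m0 = m + 1 if m0 is None else m0
    if m0 < 2 or m0 < m:
        raise ValueError("need m0 >= 2 and m0 >= m")
    # Assumption: seed the process with a complete graph on m0 nodes.
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` picks node i with probability k_i / sum_j k_j.
    stubs = [node for edge in edges for node in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degree_counts(edges):
    """Map each node to its number of links."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

edges = preferential_attachment(n=500, m=2, seed=7)
deg = degree_counts(edges)
print(len(edges), min(deg.values()), max(deg.values()))
```

Early nodes tend to accumulate links, so the largest degree is typically far above m while most nodes stay at degree m or slightly more.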

The interconnectedness of a network is described by its diameter $d$, defined as the average length of the shortest paths between any two nodes in the network. The diameter characterizes the ability of two nodes to communicate with each other: the smaller $d$ is, the shorter is the expected path between them. Networks with a very large number of nodes can have quite a small diameter; for example, the diameter of the WWW, with over 800 million nodes [20], is around 19 (ref. 3), whereas social networks with over six billion individuals

[Figure 1 panels: a, Exponential; b, Scale-free]

Figure 1 Visual illustration of the difference between an exponential and a scale-free network. a, The exponential network is homogeneous: most nodes have approximately the same number of links. b, The scale-free network is inhomogeneous: the majority of the nodes have one or two links but a few nodes have a large number of links, guaranteeing that the system is fully connected. Red, the five nodes with the highest number of links; green, their first neighbours. Although in the exponential network only 27% of the nodes are reached by the five most connected nodes, in the scale-free network more than 60% are reached, demonstrating the importance of the connected nodes in the scale-free network. Both networks contain 130 nodes and 215 links ($\langle k \rangle = 3.3$). The network visualization was done using the Pajek program for large network analysis: http://vlado.fmf.uni-lj.si/pub/networks/pajek/pajekman.htm.

[Figure 2 plots: diameter d (vertical axis) versus fraction f of removed nodes (horizontal axis). Panel a: E and SF models; panel b: Internet; panel c: WWW. Each panel shows Attack and Failure curves.]

Figure 2 Changes in the diameter $d$ of the network as a function of the fraction $f$ of the removed nodes. a, Comparison between the exponential (E) and scale-free (SF) network models, each containing $N = 10{,}000$ nodes and 20,000 links (that is, $\langle k \rangle = 4$). The blue symbols correspond to the diameter of the exponential (triangles) and the scale-free (squares) networks when a fraction $f$ of the nodes are removed randomly (error tolerance). Red symbols show the response of the exponential (diamonds) and the scale-free (circles) networks to attacks, when the most connected nodes are removed. We determined the $f$ dependence of the diameter for different system sizes ($N = 1{,}000$; 5,000; 20,000) and found that the obtained curves, apart from a logarithmic size correction, overlap with those shown in a, indicating that the results are independent of the size of the system. We note that the diameter of the unperturbed ($f = 0$) scale-free network is smaller than that of the exponential network, indicating that scale-free networks use the links available to them more efficiently, generating a more interconnected web. b, The changes in the diameter of the Internet under random failures (squares) or attacks (circles). We used the topological map of the Internet, containing 6,209 nodes and 12,200 links ($\langle k \rangle = 3.4$), collected by the National Laboratory for Applied Network Research http://moat.nlanr.net/Routing/rawdata/. c, Error (squares) and attack (circles) survivability of the World-Wide Web, measured on a sample containing 325,729 nodes and 1,498,353 links [3], such that $\langle k \rangle = 4.59$.

© 2000 Macmillan Magazines Ltd

Albert, Jeong and Barabasi, Nature, 406 (27) 2000.

N=130, E=215. Red: five highest-degree nodes; green: their neighbors.

“Robust” to random failure, fragile to targeted attack.
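The contrast between random failure and targeted attack can be seen on a toy hub-and-leaves network. The sketch below is an illustration under stated assumptions. It measures damage by the size of the largest connected component, which is a swapped-in measure (the Albert, Jeong and Barabasi figure tracks the diameter instead), and the 11-node star network is hypothetical.

```python
def largest_component(neighbors, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in neighbors:
        if start in removed or start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in neighbors[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

# Hypothetical toy network: hub 0 linked to ten leaves.
star = {0: set(range(1, 11))}
for leaf in range(1, 11):
    star[leaf] = {0}

hub = max(star, key=lambda node: len(star[node]))
print(largest_component(star))                 # 11 (intact)
print(largest_component(star, removed=[5]))    # 10 (a leaf fails)
print(largest_component(star, removed=[hub]))  # 1  (the hub is attacked)
```

Losing a leaf barely changes the network, while losing the single hub leaves every remaining node isolated.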

Is connectivity a good thing?

Engineered networks (e.g., the Internet) are not random!

Optimization in network growth

(D’Souza, Borgs, Chayes, Berger, Kleinberg, PNAS 2007)

(Competing objectives)

Network Activity: FLOWS on NETWORKS

(Spread of disease, routing data, materials transport/flow, gossip spread/marketing)

Random walk on the network has state transition matrix, P:

$$
P =
\begin{pmatrix}
1/4 & 1/3 & 1/2 & 1/4 & 0 \\
1/4 & 1/3 & 0 & 1/4 & 0 \\
1/4 & 0 & 1/2 & 0 & 0 \\
1/4 & 1/3 & 0 & 1/4 & 1/2 \\
0 & 0 & 0 & 1/4 & 1/2
\end{pmatrix}
$$

The eigenvalues and eigenvectors convey much information. Markov Chains, Spectral Gap.
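As a concrete example, the transition matrix P on this slide matches the connectivity matrix M with each column divided by its column sum, and repeated multiplication by P drives any starting distribution toward the eigenvector with eigenvalue 1. The sketch below is a minimal pure-Python illustration. The function names and the choice of 1000 steps are assumptions, and the spectral gap itself is not computed here.

```python
# Connectivity matrix M from the earlier slide.
M = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

def transition_matrix(m):
    """Divide each column of m by its sum, giving a column-stochastic matrix."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

def step(p, x):
    """One step of the walk: multiply the distribution x by p."""
    n = len(p)
    return [sum(p[i][j] * x[j] for j in range(n)) for i in range(n)]

P = transition_matrix(M)
x = [1.0, 0.0, 0.0, 0.0, 0.0]  # walker starts on the first node
for _ in range(1000):
    x = step(P, x)

# The limit is proportional to the node degrees (4, 3, 2, 4, 2) / 15.
print([round(v, 4) for v in x])  # approximately [0.2667, 0.2, 0.1333, 0.2667, 0.1333]
```

How quickly the walk settles is governed by the gap between the largest eigenvalue (1) and the next largest in magnitude, which is the spectral gap named on the slide.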

Feedback and network growth of hierarchical organizations

• Functional = efficient information flow throughout organization.

• More functional → grow faster (but each new attachment less optimal)

• Less functional → grow slower but more balanced (each new attachment more considered)

(more balanced, efficient structures: respond to changing circumstances)

Building a “science of networks”

• Last ten years, since 1999.

• Understanding activity and topology of individual networks.

• “Nodes”, “Robustness” (e.g., connectivity) context dependent.

“all our modern critical infrastructure relies on networks”

Our modern infrastructure

Layered, interacting networks

• MATHEMATICS NEEDED? Multiple info streams; layered interactions; PDEs (calculus)
