Extracting information from complex networks: From the metabolism to collaboration networks. Roger Guimerà, Department of Chemical and Biological Engineering, Northwestern University. Bloomington, April 11th, 2005


Page 1: Extracting information from complex networks From the metabolism to collaboration networks Roger Guimerà Department of Chemical and Biological Engineering

Extracting information from complex networks

From the metabolism to collaboration networks

Roger Guimerà
Department of Chemical and Biological Engineering

Northwestern University

Bloomington, April 11th, 2005

Page 2

High-throughput techniques in biology

Protein interactions in fruit fly
Giot et al., Science (2003)

Metabolic network

Page 3

Large databases for critical infrastructures

World-wide airport network

Page 4

Large databases for social networks

Collaborations in Econometrica
Collaborations in the Astronomical Journal

Page 5

What do “statistical properties” tell us about the network?

Page 6

What are the important cities in the world-wide airport network?

Most connected cities

Most central cities

Page 7

Cartography of complex (metabolic) networks

with L. A. N. Amaral

Page 8

Cartography of complex (metabolic) networks

Modules: One divides the system into “regions”

Roles: One highlights important players

Page 9

Real metabolic networks are extremely complex…

Page 10

…and “regions” are not so well defined

Metabolic network of E. coli

Page 11

One can define a quantitative measure of modularity

Low modularity

High modularity

Newman & Girvan, PRE (2003)

Page 12

One can define a quantitative measure of modularity

Modularity of a partition: $M = \sum_s (d_s - D_s)$

Newman & Girvan, PRE (2003); Guimerà, Sales-Pardo, Amaral, PRE (2004)

$d_s$: fraction of links within module $s$

$D_s$: expected fraction of links within module $s$, for a random partition of the nodes
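The modularity of a given partition can be sketched in a few lines. The slide does not spell out how the expected fraction is computed, so this sketch assumes the usual Newman-Girvan null model, in which the expected fraction for module s is (sum of degrees in s / 2L)^2, with L the total number of links. The function name `modularity` and the `module_of` mapping are illustrative, not taken from the talk.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Return M = sum_s (d_s - D_s) for an undirected edge list.

    d_s: fraction of links with both ends in module s.
    D_s: assumed null-model expectation, (degree sum of s / 2L)^2.
    """
    total_links = len(edges)
    within = defaultdict(int)       # links inside each module
    degree_sum = defaultdict(int)   # sum of node degrees per module
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        degree_sum[mu] += 1
        degree_sum[mv] += 1
        if mu == mv:
            within[mu] += 1
    return sum(
        within[s] / total_links - (degree_sum[s] / (2 * total_links)) ** 2
        for s in degree_sum
    )


# Two triangles joined by a single link, split into the two obvious modules.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, two_modules))
```

For this toy graph the two-module partition gives M = 5/14, while placing all nodes in a single module gives M = 0.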

Page 13

We use simulated annealing to obtain the partition with largest modularity

Simulated Annealing
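A minimal sketch of modularity maximization by simulated annealing follows. It is a simplification and not the authors' implementation: it uses single-node moves only, a geometric cooling schedule, and the Newman-Girvan null model for the expected fraction of within-module links. All names and parameter values (`anneal_modules`, `t_start`, `t_end`, `cooling`) are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def modularity(edges, module_of):
    """M = sum_s (d_s - D_s), with D_s = (degree sum of s / 2L)^2 (assumed)."""
    total_links = len(edges)
    within, degree_sum = defaultdict(int), defaultdict(int)
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        degree_sum[mu] += 1
        degree_sum[mv] += 1
        if mu == mv:
            within[mu] += 1
    return sum(
        within[s] / total_links - (degree_sum[s] / (2 * total_links)) ** 2
        for s in degree_sum
    )


def anneal_modules(edges, t_start=1.0, t_end=1e-3, cooling=0.99, seed=0):
    """Search for a high-modularity partition; return (partition, modularity)."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    current = {n: i for i, n in enumerate(nodes)}  # start from singletons
    m_current = modularity(edges, current)
    best, m_best = dict(current), m_current
    temperature = t_start
    while temperature > t_end:
        for _ in range(len(nodes)):
            node = rng.choice(nodes)
            old_label = current[node]
            new_label = rng.randrange(len(nodes))
            if new_label == old_label:
                continue
            current[node] = new_label
            m_new = modularity(edges, current)
            delta = m_new - m_current
            # Metropolis rule: always accept improvements, sometimes accept losses.
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                m_current = m_new
                if m_current > m_best:
                    best, m_best = dict(current), m_current
            else:
                current[node] = old_label  # reject the move
        temperature *= cooling
    return best, m_best


edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition, m = anneal_modules(edges)
print(partition, m)
```

Recomputing the modularity from scratch after every move keeps the sketch short; for large networks one would update it incrementally instead.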

Page 14

The new algorithm for module detection outperforms previous algorithms

Page 15

Now we need to identify the role of each node

Page 16

We define the within-module degree and the participation coefficient

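Within-module degree $k_i$: number of links of node $i$ to other nodes in the same module

Within-module relative degree: $z_i = \dfrac{k_i - \langle k \rangle}{\sqrt{\langle k^2 \rangle - \langle k \rangle^2}}$, with the averages taken over the nodes in the module of node $i$

$f_{is}$: fraction of links of node $i$ in module $s$

Participation coefficient: $P_i = 1 - \sum_s f_{is}^2$

$P_i = 0$: all links in one module; $P_i \to 1$: links evenly distributed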
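As a hedged illustration of the two quantities defined on this slide, the sketch below computes the within-module relative degree z and the participation coefficient P for every node of a partitioned network. The function name `node_roles` and the conventions for degenerate cases (z = 0 when all nodes in a module have the same within-module degree, P = 0 for isolated nodes) are assumptions made here, not statements from the talk.

```python
import math
from collections import defaultdict


def node_roles(edges, module_of):
    """Return (z, P): within-module z-score and participation coefficient."""
    # links[i][s] = number of links of node i that end in module s
    links = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        links[u][module_of[v]] += 1
        links[v][module_of[u]] += 1

    within = {i: links[i][module_of[i]] for i in module_of}
    degrees_in_module = defaultdict(list)
    for i, s in module_of.items():
        degrees_in_module[s].append(within[i])

    z, p = {}, {}
    for i, s in module_of.items():
        ks = degrees_in_module[s]
        mean = sum(ks) / len(ks)
        std = math.sqrt(sum((k - mean) ** 2 for k in ks) / len(ks))
        z[i] = (within[i] - mean) / std if std > 0 else 0.0  # assumed convention
        degree = sum(links[i].values())
        p[i] = 1 - sum((c / degree) ** 2 for c in links[i].values()) if degree else 0.0
    return z, p


# Two triangles joined by the link (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
module_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
z, p = node_roles(edges, module_of)
print(z[2], p[0], p[2])
```

In this toy network node 0 has all its links inside its own module, so P = 0, while node 2 has two of its three links inside and one outside, so P = 1 - (4/9 + 1/9) = 4/9.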

Page 17

The within-module degree and the participation coefficient define the role of each node

Page 18

We define seven different roles

[Figure: the seven roles. Non-hubs: ultra-peripheral, peripheral, non-hub connectors, kinless non-hubs. Hubs: provincial hubs, connector hubs, kinless hubs.]
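The assignment of a role from the pair (z, P) can be sketched as a simple lookup. The slide itself does not give the boundaries between roles; the thresholds below (hubs at z >= 2.5, and the P cut-offs 0.05, 0.62, 0.80 for non-hubs and 0.30, 0.75 for hubs) are the ones reported in Guimerà & Amaral, Nature (2005), and should be treated as assumptions in the context of this talk. The function name `classify_role` is illustrative.

```python
def classify_role(z, p, hub_threshold=2.5):
    """Map (within-module z-score, participation coefficient) to a role name.

    Thresholds are assumed from Guimerà & Amaral, Nature (2005);
    they do not appear on the slide.
    """
    if z < hub_threshold:  # non-hubs
        if p <= 0.05:
            return "ultra-peripheral"
        if p <= 0.62:
            return "peripheral"
        if p <= 0.80:
            return "non-hub connector"
        return "kinless non-hub"
    # hubs
    if p <= 0.30:
        return "provincial hub"
    if p <= 0.75:
        return "connector hub"
    return "kinless hub"


print(classify_role(z=0.3, p=0.0))
print(classify_role(z=3.1, p=0.5))
```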

Page 19

The cartographic representation of the metabolic network of E. coli

Guimera & Amaral, Nature (2005)

Page 20

The loss rate quantifies the importance of a role

Metabolite   Role in Species A   Role in Species B
A            Ultra-peripheral    Peripheral
B            Connector hub       Connector hub
C            Ultra-peripheral    LOST
D            LOST                Peripheral
...

Loss rate of role R: $p_{\mathrm{loss}}(R) = p(\mathrm{lost} \mid R)$
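One way to read the loss-rate definition is sketched below, using the four example rows from the table. The counting convention is an assumption: each metabolite that has role R in one species counts once toward R, and it counts as lost if it is absent in the other species, with both directions of the comparison included. The names `loss_rates` and `LOST` are illustrative.

```python
from collections import Counter

LOST = "LOST"


def loss_rates(role_pairs):
    """Estimate p(lost | R) from (role in species A, role in species B) pairs."""
    seen, lost = Counter(), Counter()
    for role_a, role_b in role_pairs:
        # Look at the pair in both directions: A -> B and B -> A.
        for here, there in ((role_a, role_b), (role_b, role_a)):
            if here != LOST:
                seen[here] += 1
                if there == LOST:
                    lost[here] += 1
    return {role: lost[role] / seen[role] for role in seen}


pairs = [
    ("Ultra-peripheral", "Peripheral"),    # metabolite A
    ("Connector hub", "Connector hub"),    # metabolite B
    ("Ultra-peripheral", LOST),            # metabolite C
    (LOST, "Peripheral"),                  # metabolite D
]
print(loss_rates(pairs))
```

With only these four rows, ultra-peripheral and peripheral metabolites are each lost in one of two occurrences, and connector hubs are never lost.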

Page 21

Non-hub connectors are more conserved across species than provincial hubs

Comparison between 12 organisms: 4 archaea, 4 bacteria, 4 eukaryotes

[Figure axis labels: Ultra-peripheral, Peripheral, Non-hub connectors, Provincial hubs, Connector hubs]

Page 22

1 – Ultra-peripheral
2 – Peripheral
3 – Non-hub connectors
5 – Provincial hubs
6 – Connector hubs

Different networks have different role structures

Page 23

Collaboration networks: Team assembly, network structure, and performance

with B. Uzzi, J. Spiro, and L. A. N. Amaral

Page 24

Different collaboration networks have different properties

Collaborations in Econometrica
Collaborations in the Astronomical Journal

Page 25

How do collaboration networks grow? How are teams assembled?

A model for collaboration network formation must specify what rules determine the participation of an individual in a team

Page 26

Balancing expertise and diversity

Expertise Diversity

Performance

But:

Need to incorporate new people

But:

It is easier to work with similar people and with former collaborators

Page 27

Assembling a new team

[Figure: team-assembly diagram. Labels: Incumbents (probability p), Newcomers (probability 1-p).]

Page 28

Assembling a new team

[Figure: team-assembly diagram. Label: Incumbents.]

Page 29

Assembling a new team

[Figure: team-assembly diagram. Labels: Incumbents (probability p), Newcomers (probability 1-p).]

Page 30

Assembling a new team

[Figure: team-assembly diagram. Label: Newcomers.]

Page 31

Assembling a new team

[Figure: team-assembly diagram. Labels: Incumbents (probability p), Newcomers (probability 1-p).]

Page 32

Assembling a new team

[Figure: team-assembly diagram. Label: Incumbents.]

Page 33

Assembling a new team

[Figure: team-assembly diagram. Labels: Repeat collaboration (probability q), Any incumbent (probability 1-q).]

Page 34

Assembling a new team

[Figure: team-assembly diagram. Label: Repeat collaboration.]

Page 35

Assembling a new team

[Figure: team-assembly diagram, showing the completed team added to the network.]
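The assembly steps shown across these slides can be sketched as a small simulation. This is a reading of the figure labels (incumbents with probability p, newcomers with probability 1-p, repeat collaboration with probability q), not the authors' code. It also leaves out any rule for removing inactive agents from the pool. The names `assemble_team`, `collaborators`, and `next_id` are hypothetical.

```python
import random


def assemble_team(size, p, q, incumbents, collaborators, rng, next_id):
    """Assemble one team; return (team, next unused newcomer id).

    incumbents: set of agents who have been on a team before (updated in place).
    collaborators: dict mapping agent -> set of past teammates (updated in place).
    """
    team = []
    for _ in range(size):
        member = None
        if incumbents and rng.random() < p:
            pool = []
            if team and rng.random() < q:
                # Repeat collaboration: past teammates of current team members.
                pool = sorted(
                    {c for m in team for c in collaborators.get(m, ()) if c not in team}
                )
            if not pool:
                # Otherwise (or if no past teammate is available): any incumbent.
                pool = sorted(a for a in incumbents if a not in team)
            if pool:
                member = rng.choice(pool)
        if member is None:
            member = next_id  # bring in a newcomer
            next_id += 1
        team.append(member)

    # After assembly every member is an incumbent and has worked with the others.
    for m in team:
        incumbents.add(m)
        collaborators.setdefault(m, set()).update(x for x in team if x != m)
    return team, next_id


rng = random.Random(1)
incumbents, collaborators, next_id = set(), {}, 0
teams = []
for _ in range(5):
    team, next_id = assemble_team(3, 0.6, 0.5, incumbents, collaborators, rng, next_id)
    teams.append(team)
print(teams)
```

Sorting the candidate pools before drawing keeps the run reproducible for a given seed, at the cost of some speed.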

Page 36

The structure of the network depends on the fraction of incumbents...

Guimera, Uzzi, Spiro & Amaral, Science (forthcoming 2005)

Page 37

...and on the tendency to repeat past collaborations

The size of the “invisible college” increases with the fraction of incumbents, p, and decreases with the tendency to repeat collaborations, q.
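To make the "size of the invisible college" concrete, the sketch below assumes it is measured as the fraction of agents that belong to the largest connected component of the collaboration network, where two agents are linked if they have shared a team. That measurement choice is an assumption here; the slide does not define it. The function name `largest_component_fraction` is illustrative.

```python
from collections import defaultdict
from itertools import combinations


def largest_component_fraction(teams):
    """Fraction of agents in the largest connected component of the team network."""
    neighbors = defaultdict(set)
    for team in teams:
        for member in team:
            neighbors.setdefault(member, set())  # keep solo members as nodes
        for a, b in combinations(team, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    seen, largest = set(), 0
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search over one component.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        largest = max(largest, size)
    return largest / len(neighbors) if neighbors else 0.0


print(largest_component_fraction([[1, 2], [2, 3], [4, 5]]))
```

In the example, agents 1, 2, and 3 form one component of three and agents 4 and 5 another of two, so the largest component holds three of the five agents.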

Page 38

Most fields have very similar values of p and q

Page 39

The fraction of incumbents is positively correlated with the impact factor of journals

Page 40

The tendency to repeat collaborations is negatively correlated with the impact factor of journals

Page 41

Conclusions

We need to go one step further in the analysis of complex networks, so that we can provide specific answers to specific problems.

Modules and roles give important information about the structure of a network and about the importance of each node.

Networks with different functions have different role structures.

In creative collaboration networks, the emergence of the invisible college and team performance are correlated with expertise and diversity (in a “network sense”), and there may be a universal optimum.

Page 42

Acknowledgements

Marta Sales-Pardo, André A. Moreira, and Daniel B. Stouffer.

Fulbright Commission and Spanish Ministry of Education, Culture, and Sports.

More information:
http://amaral.northwestern.edu/roger/
http://amaral.northwestern.edu/