Extracting information from complex networks
From the metabolism to collaboration networks
Roger Guimerà
Department of Chemical and Biological Engineering
Northwestern University
Bloomington, April 11th, 2005
High-throughput techniques in biology
Protein interactions in fruit fly
Giot et al., Science (2003)
Metabolic network
Large databases for critical infrastructures
World-wide airport network
Large databases for social networks
Collaborations in Econometrica
Collaborations in the Astronomical Journal
What do “statistical properties” tell us about the network?
What are the important cities in the world-wide airport network?
Most connected cities
Most central cities
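The contrast between "most connected" and "most central" can be illustrated with a small sketch. It assumes that "connected" means degree and that "central" means betweenness centrality (the slides do not say which centrality is used), and it runs on a made-up toy graph, not on the airport data. Betweenness is computed with Brandes' algorithm for unweighted graphs.

```python
from collections import deque

def degree(adj):
    """Number of links of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness(adj):
    """Betweenness centrality (Brandes' algorithm), unweighted undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # every unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy network: two triangles joined through a bridge node X.
toy = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "X"],
    "X": ["c", "d"],
    "d": ["X", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
deg = degree(toy)
bc = betweenness(toy)
print(sorted(deg.items(), key=lambda kv: -kv[1])[:3])
print(sorted(bc.items(), key=lambda kv: -kv[1])[:3])
```

In this toy graph the bridge node has fewer links than its neighbors yet lies on more shortest paths, which is the kind of mismatch the two rankings are meant to expose.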
Cartography of complex (metabolic) networks
with L. A. N. Amaral
Cartography of complex (metabolic) networks
Modules: one divides the system into “regions”
Roles: one highlights important players
Real metabolic networks are extremely complex…
…and “regions” are not so well defined
Metabolic network of E. coli
One can define a quantitative measure of modularity
Low modularity
High modularity
Newman & Girvan, PRE (2003)
One can define a quantitative measure of modularity
Modularity of a partition: $M = \sum_s (d_s - D_s)$
Newman & Girvan, PRE (2003); Guimera, Sales-Pardo, Amaral, PRE (2004)
$d_s$: fraction of links within module s
$D_s$: expected fraction of links within module s, for a random partition of the nodes
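A minimal sketch of the modularity computation follows. It assumes the usual Newman-Girvan choice for the expected term, $D_s = (k_s / 2L)^2$, where $k_s$ is the sum of the degrees of the nodes in module s and L is the total number of links; the slide only describes $D_s$ in words, so that formula is an assumption. The graph and the function name are illustrative.

```python
def modularity(edges, module_of):
    """M = sum_s (d_s - D_s), with D_s = (k_s / 2L)^2 (assumed form)."""
    L = len(edges)
    within = {}       # number of links with both ends in module s
    degree_sum = {}   # sum of degrees of the nodes in module s
    for u, v in edges:
        su, sv = module_of[u], module_of[v]
        degree_sum[su] = degree_sum.get(su, 0) + 1
        degree_sum[sv] = degree_sum.get(sv, 0) + 1
        if su == sv:
            within[su] = within.get(su, 0) + 1
    return sum(within.get(s, 0) / L - (degree_sum[s] / (2 * L)) ** 2
               for s in degree_sum)

# Hypothetical example: two triangles joined by a single link.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_module = {v: 0 for v in "abcdef"}
print(modularity(edges, two_modules))
print(modularity(edges, one_module))
```

Putting every node in a single module gives M = 0 under this definition, so a positive value signals more within-module links than the random expectation.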
We use simulated annealing to obtain the partition with largest modularity
Simulated Annealing
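The search for a high-modularity partition can be sketched as below. This is a simplified illustration, not the algorithm from the talk: it uses only single-node moves, a geometric cooling schedule, and a full recomputation of M at each step, and all parameter names and values (`t_start`, `t_end`, `cooling`, `moves_per_t`) are assumptions. The expected term is again assumed to be $D_s = (k_s / 2L)^2$.

```python
import math
import random

def modularity(edges, module_of):
    """M = sum_s (d_s - D_s), with D_s = (k_s / 2L)^2 (assumed form)."""
    L = len(edges)
    within, degree_sum = {}, {}
    for u, v in edges:
        su, sv = module_of[u], module_of[v]
        degree_sum[su] = degree_sum.get(su, 0) + 1
        degree_sum[sv] = degree_sum.get(sv, 0) + 1
        if su == sv:
            within[su] = within.get(su, 0) + 1
    return sum(within.get(s, 0) / L - (degree_sum[s] / (2 * L)) ** 2
               for s in degree_sum)

def anneal(edges, nodes, t_start=1.0, t_end=1e-3, cooling=0.95,
           moves_per_t=200, seed=0):
    """Simulated annealing over partitions using single-node moves only."""
    rng = random.Random(seed)
    module_of = {v: i for i, v in enumerate(nodes)}  # start: one node per module
    labels = list(range(len(nodes)))
    current = modularity(edges, module_of)
    best, best_partition = current, dict(module_of)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            v = rng.choice(nodes)
            old, new = module_of[v], rng.choice(labels)
            if new == old:
                continue
            module_of[v] = new
            proposed = modularity(edges, module_of)
            delta = proposed - current
            # Metropolis rule: always accept improvements, sometimes accept losses
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current = proposed
                if current > best:
                    best, best_partition = current, dict(module_of)
            else:
                module_of[v] = old   # reject the move
        t *= cooling
    return best_partition, best

edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
nodes = ["a", "b", "c", "d", "e", "f"]
partition, m = anneal(edges, nodes)
print(partition, m)
```

Recomputing M from scratch at every move keeps the sketch short; for real networks one would update M incrementally from the links of the moved node.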
The new algorithm for module detection outperforms previous algorithms
Now we need to identify the role of each node
We define the within-module degree and the participation coefficient
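Within-module relative degree
$k_i$: number of links of node i to other nodes in the same module
Within-module degree: $z_i = (k_i - \langle k \rangle) / \sigma_k$, with the mean $\langle k \rangle$ and standard deviation $\sigma_k$ taken over the nodes in the module of i
Participation coefficient
$f_{is}$: fraction of links of node i in module s
Participation coefficient: $P_i = 1 - \sum_s f_{is}^2$
$P_i = 0$ if all links are in one module; $P_i \to 1$ if links are evenly distributed among modules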
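The two node-level quantities can be computed as in the following sketch. It assumes the forms $z_i = (k_i - \langle k \rangle)/\sigma_k$ and $P_i = 1 - \sum_s f_{is}^2$, uses the population standard deviation, and sets z to 0 when all nodes of a module have the same within-module degree; those last two conventions are assumptions. The toy graph and names are illustrative.

```python
import math

def z_and_participation(adj, module_of):
    """Return (z, P): within-module degree z-score and participation coefficient."""
    # k_within[i]: links of node i to nodes in its own module
    k_within = {i: sum(1 for j in adj[i] if module_of[j] == module_of[i])
                for i in adj}
    members = {}
    for i, s in module_of.items():
        members.setdefault(s, []).append(i)

    z = {}
    for s, nodes in members.items():
        ks = [k_within[i] for i in nodes]
        mean = sum(ks) / len(ks)
        std = math.sqrt(sum((k - mean) ** 2 for k in ks) / len(ks))
        for i in nodes:
            z[i] = (k_within[i] - mean) / std if std > 0 else 0.0

    P = {}
    for i in adj:
        k_i = len(adj[i])
        links_per_module = {}
        for j in adj[i]:
            s = module_of[j]
            links_per_module[s] = links_per_module.get(s, 0) + 1
        P[i] = (1.0 - sum((c / k_i) ** 2 for c in links_per_module.values())
                if k_i else 0.0)
    return z, P

# Hypothetical example: a star (module A) attached to a pair of nodes (module B).
adj = {
    "h": ["l1", "l2", "l3", "l4", "x"],
    "l1": ["h"], "l2": ["h"], "l3": ["h"], "l4": ["h"],
    "x": ["h", "y"], "y": ["x"],
}
module_of = {"h": "A", "l1": "A", "l2": "A", "l3": "A", "l4": "A",
             "x": "B", "y": "B"}
z, P = z_and_participation(adj, module_of)
print(z["h"], P["h"], P["x"])
```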
The within-module degree and the participation coefficient define the role of each node
We define seven different roles
Non-hubs: ultra-peripheral, peripheral, non-hub connectors, kinless non-hubs
Hubs: provincial hubs, connector hubs, kinless hubs
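Assigning a role from the pair (z, P) can be sketched as a simple lookup. The slides do not state the boundaries between roles; the thresholds below (hubs at z ≥ 2.5 and the cut-offs on P) are assumptions, believed to follow the values reported in Guimera & Amaral, Nature (2005), and should be checked against that paper before use.

```python
def role(z, P, hub_threshold=2.5):
    """Map (within-module degree z, participation coefficient P) to a role name.

    All numeric boundaries are assumed values, not taken from the slides.
    """
    if z < hub_threshold:          # non-hubs
        if P <= 0.05:
            return "ultra-peripheral"
        if P <= 0.62:
            return "peripheral"
        if P <= 0.80:
            return "non-hub connector"
        return "kinless non-hub"
    if P <= 0.30:                  # hubs
        return "provincial hub"
    if P <= 0.75:
        return "connector hub"
    return "kinless hub"

for z, P in [(0.0, 0.0), (0.3, 0.7), (3.0, 0.1), (3.0, 0.5)]:
    print(z, P, role(z, P))
```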
The cartographic representation of the metabolic network of E. coli
Guimera & Amaral, Nature (2005)
The loss rate quantifies the importance of a role
Metabolite   Role in Species A   Role in Species B
A            Ultra-peripheral    Peripheral
B            Connector hub       Connector hub
C            Ultra-peripheral    LOST
D            LOST                Peripheral
...
Loss rate of role R: $p_{\text{loss}}(R) = p(\text{lost} \mid R)$
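The loss rate can be estimated from such a table as in the sketch below. It assumes one direction of comparison only (role in Species A, presence in Species B) and uses the four example rows from the slide; how the talk aggregates over pairs of species is not stated, so the counting convention is an assumption.

```python
from collections import Counter

LOST = "LOST"

def loss_rates(pairs):
    """pairs: (role in species A, role in species B) for each metabolite.

    Returns p(lost in B | role R in A) for every role R seen in A.
    """
    present = Counter()
    lost = Counter()
    for role_a, role_b in pairs:
        if role_a == LOST:
            continue               # metabolite absent in A: no role to condition on
        present[role_a] += 1
        if role_b == LOST:
            lost[role_a] += 1
    return {r: lost[r] / present[r] for r in present}

# The example rows from the slide (metabolites A to D).
table = [
    ("Ultra-peripheral", "Peripheral"),
    ("Connector hub", "Connector hub"),
    ("Ultra-peripheral", LOST),
    (LOST, "Peripheral"),
]
print(loss_rates(table))
```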
Non-hub connectors are more conserved across species than provincial hubs
Comparison between 12 organisms: 4 archaea, 4 bacteria, 4 eukaryotes
[Figure: roles compared: ultra-peripheral, peripheral, non-hub connectors, provincial hubs, connector hubs]
1 – Ultra-peripheral, 2 – Peripheral, 3 – Non-hub connectors, 5 – Provincial hubs, 6 – Connector hubs
Different networks have different role structures
Collaboration networks: Team assembly, network structure, and performance
with B. Uzzi, J. Spiro, and L. A. N. Amaral
Different collaboration networks have different properties
Collaborations in Econometrica
Collaborations in the Astronomical Journal
How do collaboration networks grow? How are teams assembled?
A model for collaboration network formation must specify what rules determine the participation of an individual in a team
Balancing expertise and diversity
Expertise + Diversity → Performance
But: need to incorporate new people
But: it is easier to work with similar people and with former collaborators
Assembling a new team
[Animation, step by step: each member of a new team is drawn either from the incumbents, with probability p, or from the newcomers, with probability 1-p. A drawn incumbent is either a past collaborator of a current team member (repeat collaboration), with probability q, or any incumbent, with probability 1-q.]
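The assembly rules shown in the animation can be sketched as a small simulation. Several details are assumptions, not taken from the slides: teams have a fixed size `m`, the repeat-collaboration rule applies only when the team already contains an incumbent, an empty candidate list falls back to "any incumbent" and then to a newcomer, and agents never retire. Function and parameter names are illustrative.

```python
import random
from itertools import combinations

def assemble_team(m, p, q, incumbents, collaborators, next_id, rng):
    """Pick m distinct agents; returns (team, next unused agent id)."""
    team = []
    for _ in range(m):
        candidates = []
        if incumbents and rng.random() < p:          # draw an incumbent
            on_team = [a for a in team if a in incumbents]
            if on_team and rng.random() < q:         # repeat a past collaboration
                partner = rng.choice(on_team)
                candidates = [a for a in collaborators[partner] if a not in team]
            if not candidates:                       # any incumbent
                candidates = [a for a in incumbents if a not in team]
        if candidates:
            team.append(rng.choice(sorted(candidates)))
        else:                                        # newcomer
            team.append(next_id)
            next_id += 1
    return team, next_id

def simulate(steps, m=3, p=0.6, q=0.5, seed=0):
    """Grow a collaboration network; returns {agent: set of collaborators}."""
    rng = random.Random(seed)
    incumbents, collaborators, next_id = set(), {}, 0
    for _ in range(steps):
        team, next_id = assemble_team(m, p, q, incumbents,
                                      collaborators, next_id, rng)
        for a in team:
            collaborators.setdefault(a, set())
        for a, b in combinations(team, 2):           # link all team members
            collaborators[a].add(b)
            collaborators[b].add(a)
        incumbents.update(team)                      # everyone is now an incumbent
    return collaborators

network = simulate(steps=200, m=3, p=0.6, q=0.5)
print(len(network), "agents")
```

With p = 0 every team consists of newcomers only, so the network is a set of disconnected teams; increasing p reuses agents and links teams together.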
The structure of the network depends on the fraction of incumbents...
Guimera, Uzzi, Spiro & Amaral, Science (forthcoming 2005)
...and on the tendency to repeat past collaborations
The size of the “invisible college” increases with the fraction of incumbents, p, and decreases with the tendency to repeat collaborations, q.
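One way to quantify the size of the “invisible college” is sketched below. It assumes that the size is measured as the fraction of agents in the largest connected component of the collaboration network; the slides do not define the measure, so this is an assumption. The example network is made up.

```python
def largest_component_fraction(adj):
    """Fraction of nodes that belong to the largest connected component."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:                 # depth-first traversal of one component
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(adj) if adj else 0.0

# Hypothetical network: one team of three and one isolated pair.
example = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2},
    4: {5}, 5: {4},
}
print(largest_component_fraction(example))
```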
Most fields have very similar values of p and q
The fraction of incumbents is positively correlated with the impact factor of journals
The tendency to repeat collaborations is negatively correlated with the impact factor of journals
Conclusions
We need to go one step further in the analysis of complex networks, so that we can provide specific answers to specific problems.
Modules and roles give important information about the structure of a network and about the importance of each node.
Networks with different functions have different role structures.
In creative collaboration networks, the emergence of the invisible college and team performance are correlated with expertise and diversity (in a “network sense”), and there may be a universal optimum.
Acknowledgements
Marta Sales-Pardo, André A. Moreira, and Daniel B. Stouffer.
Fulbright Commission and Spanish Ministry of Education, Culture, and Sports.
More information: http://amaral.northwestern.edu/roger/ and http://amaral.northwestern.edu/