Robustness, clustering & evolutionary conservation

Stefan Wuchty
Center of Network Research, Department of Physics, University of Notre Dame



Complex systems

Made of many non-identical elements connected by diverse interactions.

NETWORK

[Figure: Bio-Map. GENOME: protein-gene interactions; PROTEOME: protein-protein interactions; METABOLISM: bio-chemical reactions (e.g. Citrate Cycle).]

Yeast protein network

Nodes: proteins

Links: physical interactions (binding)

P. Uetz et al., Nature, 2000; Ito et al., PNAS, 2001; …

Topology of the protein network

$P(k) \sim (k + k_0)^{-\gamma} \exp\left(-\frac{k + k_0}{k_\tau}\right)$

H. Jeong, S.P. Mason, A.-L. Barabasi & Z.N. Oltvai, Nature, 2001
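As a rough illustration of the degree distribution discussed on this slide, the sketch below computes an empirical P(k) from an edge list and evaluates a power law with exponential cutoff, (k + k0)^-γ exp(-(k + k0)/kτ). The toy edge list, the function names and the parameter values used in the last line are assumptions made for the example, not fitted values from the yeast data.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Return {k: P(k)} for an undirected edge list of (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}

def truncated_power_law(k, k0, gamma, k_tau):
    """Unnormalized (k + k0)^-gamma * exp(-(k + k0) / k_tau)."""
    return (k + k0) ** (-gamma) * math.exp(-(k + k0) / k_tau)

# Hypothetical toy network: hub "A" plus one extra link between B and C.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
pk = degree_distribution(edges)
print(pk)

# Illustrative placeholder parameters only (not taken from the slides).
print(truncated_power_law(5, k0=1.0, gamma=2.4, k_tau=20.0))
```

In practice the three parameters would be fitted to the measured P(k); the function above only evaluates the shape up to a normalization constant.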

Robustness

Complex systems maintain their basic functions even under errors and failures (cell mutations; Internet router breakdowns).

[Figure: node failure. S plotted against the fraction of removed nodes, f (both axes from 0 to 1), with the critical fraction fc marked.]

Robustness of scale-free networks

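[Figure: S plotted against the fraction of removed nodes, f (both axes from 0 to 1), with separate curves for attacks and failures; fc marks the critical fraction.]

Failures, $\gamma \le 3$: $f_c = 1$ (R. Cohen et al., PRL, 2000)

Topological error tolerance

R. Albert et al., Nature, 2000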
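The difference between random failures and targeted attacks can be explored with a small simulation. The following is a minimal sketch, not the procedure used in the cited papers: it assumes a preferential-attachment growth model as a stand-in for a scale-free network, takes S to be the largest connected cluster relative to the original number of nodes, and ranks nodes for the attack by their initial degree only. All function names are hypothetical.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow an undirected graph: each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    stubs = []  # every node appears once per link end
    for i in range(m + 1):  # fully connected starting core
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            stubs += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj

def largest_cluster_fraction(adj, removed):
    """Largest connected cluster among surviving nodes, divided by the
    original number of nodes (an assumed normalization for S)."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best / len(adj)

def remove_nodes(adj, f, attack, seed=0):
    """Pick a fraction f of nodes: the most connected ones (attack)
    or a uniform random sample (failure)."""
    count = round(f * len(adj))
    if attack:
        ranked = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
        return set(ranked[:count])
    return set(random.Random(seed).sample(sorted(adj), count))

net = preferential_attachment(n=300, m=2)
for f in (0.0, 0.1, 0.3):
    s_fail = largest_cluster_fraction(net, remove_nodes(net, f, attack=False))
    s_att = largest_cluster_fraction(net, remove_nodes(net, f, attack=True))
    print(f, round(s_fail, 2), round(s_att, 2))
```

Averaging over many seeds and larger networks would be needed before reading anything quantitative into the printed values.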

Yeast protein network: lethality and topological position

Highly connected proteins are more essential (lethal)...

H. Jeong et al., Nature, 2001

Modules in biological systems

Metabolic networks Protein networks

E. Ravasz et al., Science, 2002

Can we identify the modules?

$O_T(i,j) = \frac{J(i,j)}{\min(k_i, k_j)}$

J(i,j): number of nodes both i and j link to; +1 if there is a direct (i,j) link

Metabolism: E. Ravasz et al., Science, 2002

Protein interactions: Rives and Galitski, PNAS, 2003; Spirin and Mirny, PNAS, 2003
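The topological overlap, J(i,j) divided by min(k_i, k_j), can be computed directly from an adjacency structure. The sketch below assumes an undirected network stored as a dictionary of neighbour sets and takes k_i to be the degree of node i; the function name and the toy network (two triangles joined by one link) are made up for the example.

```python
def topological_overlap(adj, i, j):
    """O_T(i, j) = J(i, j) / min(k_i, k_j).
    J counts the common neighbours of i and j, plus 1 if i and j are linked."""
    smaller_degree = min(len(adj[i]), len(adj[j]))
    if smaller_degree == 0:
        return 0.0  # an isolated node overlaps with nothing
    common = len(adj[i] & adj[j])
    direct = 1 if j in adj[i] else 0
    return (common + direct) / smaller_degree

# Hypothetical toy network: triangles A-B-C and D-E-F joined by the link C-D.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
}
print(topological_overlap(adj, "A", "B"))  # same triangle
print(topological_overlap(adj, "C", "D"))  # bridge between the triangles
print(topological_overlap(adj, "A", "E"))  # different triangles, no shared neighbour
```

Pairs inside a densely connected group score high and pairs spanning groups score low, which is why the overlap matrix is a natural input for a clustering step.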

Open questions

Does the application of standard clustering algorithms reflect real modules well?

Since, for example, one protein can be part of more than one protein complex, overlapping clustering algorithms should give better results.

Motifs

Small subnetworks that appear in real-world networks significantly more often than in random graphs.

(Milo et al., Science, 2002; Conant and Wagner, Nature Genetics, 2003; Shen-Orr et al., Nature Genetics, 2002; Milo et al., Science, 2004)

From the particular to the universal

A.-L. Barabasi & Z. Oltvai, Science, 2002

Topology and Evolution

S. Wuchty, Z. Oltvai & A.-L. Barabasi, Nature Genetics, 2003

S. Wuchty, Genome Res., 2004

Topology and evolution

- General distribution of orthologs: $E = N(o)/N(p)$

- Degree-dependent distribution of orthologs: $e_k = N_k(o)/N_k$

Orthologous Excess Retention: $ER_k = e_k/E$
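A worked example may make the excess retention ratio more concrete. The sketch below assumes that N(o) is the number of proteins with an ortholog, N(p) the total number of proteins, N_k the number of proteins with k interaction partners and N_k(o) those among them with an ortholog; these readings, the function name and the toy data are assumptions for illustration.

```python
from collections import defaultdict

def excess_retention(degree, has_ortholog):
    """Return (E, {k: ER_k}) with E = N(o)/N(p), e_k = N_k(o)/N_k, ER_k = e_k/E.
    degree: protein -> number of interaction partners k
    has_ortholog: set of proteins with an ortholog in the reference organism"""
    total = len(degree)
    overall = sum(1 for p in degree if p in has_ortholog) / total
    if overall == 0:
        raise ValueError("no orthologs at all: ER_k is undefined")
    n_k = defaultdict(int)
    n_k_orth = defaultdict(int)
    for protein, k in degree.items():
        n_k[k] += 1
        if protein in has_ortholog:
            n_k_orth[k] += 1
    ratios = {k: (n_k_orth[k] / n_k[k]) / overall for k in sorted(n_k)}
    return overall, ratios

# Hypothetical toy data: four weakly connected proteins and two hubs.
degree = {"p1": 1, "p2": 1, "p3": 1, "p4": 1, "p5": 5, "p6": 5}
has_ortholog = {"p1", "p5", "p6"}
E, ER = excess_retention(degree, has_ortholog)
print(E, ER)
```

Values of ER_k above 1 mean that proteins with k partners retain orthologs more often than the network-wide average, values below 1 mean less often.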

Clustering in protein interaction networks

Goldberg and Roth, PNAS, 2003

high clustering = high quality of interaction

$C_{vw} = -\log \sum_{i=|N(v) \cap N(w)|}^{\min(|N(v)|,|N(w)|)} \frac{\binom{|N(v)|}{i} \binom{N - |N(v)|}{|N(w)| - i}}{\binom{N}{|N(w)|}}$
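The clustering score for an interacting pair (v, w) can be read as the negative logarithm of a hypergeometric tail probability: the chance of seeing at least the observed number of shared neighbours if the neighbourhoods N(v) and N(w) were drawn at random from N proteins. The sketch below follows that reading; it assumes N is the total number of proteins in the network and uses the natural logarithm, neither of which is stated on the slide. The function name and the toy network are hypothetical. It needs Python 3.8 or later for `math.comb`.

```python
import math

def mutual_clustering(adj, v, w):
    """C_vw = -log P(at least |N(v) & N(w)| shared neighbours)
    under a hypergeometric null model."""
    n_total = len(adj)                    # N, assumed: all proteins in the network
    nv, nw = len(adj[v]), len(adj[w])
    shared = len(adj[v] & adj[w])
    favourable = sum(
        math.comb(nv, i) * math.comb(n_total - nv, nw - i)
        for i in range(shared, min(nv, nw) + 1)
    )
    tail = favourable / math.comb(n_total, nw)
    return -math.log(tail)

# Hypothetical toy network with N = 10 proteins (f and g have no links).
nodes = ["v", "w", "x", "a", "b", "c", "d", "e", "f", "g"]
links = [("v", "a"), ("v", "b"), ("v", "c"),
         ("w", "a"), ("w", "b"), ("w", "c"),
         ("x", "d"), ("x", "e")]
adj = {n: set() for n in nodes}
for p, q in links:
    adj[p].add(q)
    adj[q].add(p)

print(mutual_clustering(adj, "v", "w"))  # all three neighbours shared
print(mutual_clustering(adj, "v", "x"))  # no shared neighbours
```

A pair that shares every neighbour gets a large score, while a pair with no shared neighbours scores zero because the tail probability is 1.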

Does that also hold for evolutionary conservation?

Protein-protein interaction data are highly flawed:

90% false positives, 50% false negatives

Von Mering et al., Nature, 2002

How stable are these results?

Something else?

Eisen et al., PNAS, 1998

Open question

Wuchty et al., submitted, 2004

Plasmodium falciparum

• Eukaryotic organism
• Malaria parasite
• Genome size 23 Mb, 14 chromosomes
• 5300 genes (estimated; Hall et al., Nature, 2002; Gardner et al., Nature, 2002)
• No protein interaction data available
• Co-expression data available (Bozdech et al., PLoS, 2003; Le Roch et al., Science, 2003)
• 868 orthologs with yeast (InParanoid; Remm et al., J. Mol. Biol., 2001)

Inferred protein interaction network in P. falciparum

• 667 nodes, 3,564 weighted interactions
• Clustering
- Iteratively pruning edges, starting with the least weighted link
- Quality of clusters is assessed by their modularity

$Q = \sum_i (e_{ii} - a_i^2)$

until a maximum is reached.
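The pruning procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the slides: e_ii is taken as the fraction of all links that lie inside cluster i, a_i as the fraction of link ends attached to cluster i, clusters are the connected components left after pruning, Q is evaluated on the full unweighted link list, and the sketch scans every cut point and keeps the best one rather than stopping at the first maximum. Function names and the toy network are hypothetical.

```python
def components(nodes, edges):
    """Connected components as a dict node -> cluster label."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in label:
                    label[nb] = start
                    stack.append(nb)
    return label

def modularity(edges, label):
    """Q = sum_i (e_ii - a_i^2), evaluated on the full (unpruned) link list."""
    m = len(edges)
    inside, ends = {}, {}
    for u, v in edges:
        cu, cv = label[u], label[v]
        ends[cu] = ends.get(cu, 0) + 1
        ends[cv] = ends.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (ends[c] / (2 * m)) ** 2 for c in ends)

def prune_for_modularity(weighted_edges):
    """weighted_edges: list of (u, v, weight). Remove links from the weakest
    upwards and return (best Q, node -> cluster label) over all cut points."""
    nodes = sorted({n for u, v, _ in weighted_edges for n in (u, v)})
    all_edges = [(u, v) for u, v, _ in weighted_edges]
    kept = [(u, v) for u, v, _ in sorted(weighted_edges, key=lambda e: e[2])]
    best_label = components(nodes, kept)
    best_q = modularity(all_edges, best_label)
    for cut in range(1, len(kept) + 1):
        label = components(nodes, kept[cut:])
        q = modularity(all_edges, label)
        if q > best_q:
            best_q, best_label = q, label
    return best_q, best_label

# Hypothetical toy network: two triangles joined by one weak link.
toy = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7),
       ("D", "E", 0.9), ("E", "F", 0.8), ("D", "F", 0.7),
       ("C", "D", 0.1)]
best_q, clusters = prune_for_modularity(toy)
print(best_q, clusters)
```

On the toy network the weak C-D link is cut first, which separates the two triangles and gives the highest Q; further pruning only fragments the triangles and lowers it.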

All edges shown have $C_{vw} > 1$. Color code: red: $C_{vw} > 4$, yellow: $C_{vw} > 3$, green: $C_{vw} > 2$, blue: $C_{vw} > 1$.

What does that mean?

Validation of results?

Co-expression patterns

Bozdech et al., PLoS, 2003

[Figure labels: replication, exo/proteasome, DNA processing, translation, RNA processing, ribosome]

Wuchty, Barabasi, Ferdig and Adams, in preparation

What's next?

• Uncovering evolutionary cores of interactions in other organisms.

• Application of a Maximum Set Cover Algorithm to predict protein interactions (Huang, Kaanan, Wuchty, Izaguirre and Cheng, submitted) to unfold the interactome using the evolutionary cores and experimentally derived interactions.