"home depot" model of evolution of prokaryotic metabolic networks and their regulation...

Post on 18-Dec-2015


"Home Depot" Model of Evolution of Prokaryotic Metabolic Networks and

Their Regulation

Sergei Maslov, Brookhaven National Laboratory

In collaboration with Kim Sneppen and Sandeep Krishna, Center for Models of Life, Copenhagen U, and Tin Yau Pang, Stony Brook U

Stover et al., Nature (2000)

van Nimwegen, TIG (2003)

The rise of bureaucracy!
• Fraction of bureaucrats grows with organization size
• Trend (if unchecked) could lead to a “bureaucratic collapse”: 100% bureaucrats and no workers
• Like human bureaucrats, transcription factors are replaceable and disposable
  - Many anecdotal stories of one regulator replacing another in closely related organisms
  - Not very essential (at least in yeast)
• One is tempted to view regulators nearly as “parasites” or superficial add-ons that marginally improve the efficiency of an organism
• But if you are a bureaucrat you see your role somewhat differently…

Encephalization Quotient

EQ~M(brain)1//M(body)

From Carl Sagan's book “The Dragons of Eden: Speculations on the Evolution of Human Intelligence”

Table from M.Y. Galperin, BMC Microbiology (2005)

Bacterial IQ

Bacterial IQ ~ $(N_{\text{signal transducers}})^{1/2} / N_{\text{genes}}$

Quadratic scaling applies to all types of regulation and signaling

Table from Molina, van Nimwegen, Biology Direct 2008

How to explain the quadratic law?

Let’s play with this scaling law

• $N_R = N_G^2/80{,}000 \;\Rightarrow\; \Delta N_R = \Delta N_G \cdot 2N_G/80{,}000$, so $\Delta N_G/\Delta N_R = 40{,}000/N_G$
• ~40 new genes per regulator for $N_G = 1000$
• ~4 new genes (1 regulator + 3 non-regulatory genes) for the largest bacterial genomes with $N_G \sim 10{,}000$
• Important observation: $\Delta N_G/\Delta N_R$ decreases with genome size
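The arithmetic behind these bullets can be checked with a few lines of Python. This is only a sketch: the constant 80,000 is the fit value quoted on the slide, and the function names are made up for illustration.

```python
# Empirical fit quoted on the slide: N_R = N_G**2 / 80,000.
SCALE = 80_000

def regulators(n_genes: float) -> float:
    """Number of regulators predicted by the quadratic law."""
    return n_genes ** 2 / SCALE

def new_genes_per_regulator(n_genes: float) -> float:
    """Marginal genes per added regulator: dN_G/dN_R = SCALE / (2 * N_G)."""
    return SCALE / (2 * n_genes)

for n_genes in (1_000, 10_000):
    print(n_genes, regulators(n_genes), new_genes_per_regulator(n_genes))
```

The second function is just the inverse of the derivative of the first, which is why the number of new genes per regulator falls as 1/N_G.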

Now to our model

Disclaimer: authors of this study (unfortunately) received no financial support from Home Depot, Inc. or Obi, GMBH

“Home Depot” argument
• Inspired by personal experience as a new homeowner buying tools
• Tools are bought to accomplish functional tasks, e.g. fix a leaking faucet
• Redundant tools are returned to “Home Depot”
• As your toolbox grows you need to get fewer and fewer new tools to accomplish a new task

• Tools are e.g. metabolic pathways acquired by Horizontal Gene Transfer
• Regulators control these pathways (we assume one regulator per task/pathway)
• Redundant genes are promptly deleted (in prokaryotes)
• Genomes shrink by deleting entire pathways that are no longer required
• All non-regulatory “workhorse” genes of an organism: its toolbox
• As it gets larger you need fewer new workhorse genes per new regulated function: FASTER THAN LINEAR SCALING

Random overlap between functions → no quadratic scaling!

• $N_{univ}$: the total number of tools in “Home Depot”
• $N_G$: the number of tools in my toolbox
• $L_{pathway}$: the number of tools needed for each new functional task
• If overlap is random then $L_{pathway} N_G/N_{univ}$ of them are redundant (already in the toolbox)
• $dN_G/dN_R = L_{pathway} - L_{pathway} N_G/N_{univ}$
• Superlinear only due to logarithmic corrections: $N_R = \frac{N_{univ}}{L_{pathway}} \log\frac{N_{univ}}{N_{univ}-N_G}$
• Networks are needed for non-random overlap between functional pathways
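A minimal numerical sketch of the random-overlap case, assuming a fixed pathway length (the values N_univ = 1800 and L_pathway = 10 are illustrative choices, not numbers from the slide). It iterates dN_G/dN_R = L_pathway (1 - N_G/N_univ) one pathway at a time, compares the result with the closed-form solution of that equation, and evaluates the local log-log exponent of N_R versus N_G.

```python
import math

def toolbox_random_overlap(n_univ: float, l_pathway: float, n_pathways: int) -> list:
    """Iterate dN_G/dN_R = L * (1 - N_G / N_univ), one pathway at a time."""
    n_g = 0.0
    history = []
    for _ in range(n_pathways):
        # Only the non-redundant fraction of each pathway is new to the toolbox.
        n_g += l_pathway * (1.0 - n_g / n_univ)
        history.append(n_g)
    return history

def closed_form_genes(n_univ: float, l_pathway: float, n_r: float) -> float:
    """Solution of the same equation: N_G = N_univ * (1 - exp(-L * N_R / N_univ))."""
    return n_univ * (1.0 - math.exp(-l_pathway * n_r / n_univ))

def local_exponent(n_univ: float, n_g: float) -> float:
    """d ln N_R / d ln N_G for N_R = (N_univ / L) * ln(N_univ / (N_univ - N_G))."""
    x = n_g / n_univ
    return x / (-(1.0 - x) * math.log(1.0 - x))

history = toolbox_random_overlap(n_univ=1800, l_pathway=10, n_pathways=100)
print(history[-1], closed_form_genes(1800, 10, 100))
print(local_exponent(1800, 180), local_exponent(1800, 900))
```

The local exponent stays close to 1 while the toolbox is small and rises only slowly as it fills up, which is the sense in which random overlap gives no quadratic law.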

Spherical cow model of metabolic networks

[Figure: cow analogy with labels Food, Milk, Waste; network diagram with labels nutrient (×3), central metabolism, anabolic pathways, biomass]

Horizontal gene transfer: entire pathways could be added in one step

Pathways could also be removed

• New pathways are added from the universal network formed by the union of all reactions in all organisms (bacterial answer to “Home Depot”)
• The only parameter: the size of the universal network $N_{univ}$
• The current size of the toolbox (# of genes ~ # of enzymes ~ # of metabolites): $N_G$
• Probability to join the existing pathway: $p_{join} = N_G/N_{univ}$
• $L_{pathway} = 1/p_{join} = N_{univ}/N_G$
• If one regulator per pathway: $\Delta N_G/\Delta N_R = L_{pathway} = N_{univ}/N_G$
• Quadratic law: $N_R = N_G^2/(2N_{univ})$
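The mean-field argument above can be iterated directly. The sketch below assumes a small starting core of 30 metabolites (an illustrative choice, not a value from the slides) and N_univ = 1800; each added pathway brings N_univ/N_G new enzymes and exactly one regulator.

```python
import math

def mean_field_toolbox(n_univ: float, n_core: float) -> list:
    """Add pathways of length L = N_univ / N_G until the toolbox spans N_univ."""
    n_g = n_core
    n_r = 0
    trajectory = [(n_r, n_g)]
    while n_g < n_univ:
        n_g += n_univ / n_g   # L_pathway = 1 / p_join = N_univ / N_G new enzymes
        n_r += 1              # one regulator per pathway
        trajectory.append((n_r, n_g))
    return trajectory

N_UNIV = 1800
trajectory = mean_field_toolbox(n_univ=N_UNIV, n_core=30)
n_r_final, n_g_final = trajectory[-1]
print("regulators:", n_r_final)
print("quadratic-law prediction:", n_g_final ** 2 / (2 * N_UNIV))

# Log-log slope of N_R versus N_G between an early and the final point.
n_r_early, n_g_early = trajectory[len(trajectory) // 10]
slope = math.log(n_r_final / n_r_early) / math.log(n_g_final / n_g_early)
print("slope:", slope)
```

The simulated count should track N_G^2/(2 N_univ) closely once the toolbox is much larger than the starting core, with a log-log slope near 2.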


We tried several versions of the toolbox model

• On a random network: analytically solved to give $N_R \sim N_{met}^{2}$
• On a union of all KEGG reactions: numerically solved to give $N_R \sim N_{met}^{1.8}$
  - ~1800 reactions and metabolites upstream of the central metabolism
  - Randomly select nutrients
  - Follow linear pathways until they overlap with existing network
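A stochastic sketch of the random-network variant, not the KEGG simulation itself. It assumes that the product of every reaction is a uniformly random metabolite drawn on the fly, that nutrients are chosen uniformly among metabolites not yet in the toolbox, and that the core has 30 metabolites; `fill_fraction`, `seed`, and the function names are illustrative.

```python
import math
import random

def simulate_random_network(n_univ=1800, n_core=30, fill_fraction=0.9, seed=0):
    """Grow a toolbox by adding meandering linear pathways on a random network."""
    rng = random.Random(seed)
    toolbox = set(range(n_core))      # metabolites 0..n_core-1 form the core
    n_regulators = 0
    trajectory = []                   # (N_G, N_R) after each added pathway
    while len(toolbox) < fill_fraction * n_univ:
        # Pick a random nutrient the organism cannot use yet.
        node = rng.randrange(n_univ)
        while node in toolbox:
            node = rng.randrange(n_univ)
        # Follow the pathway until it overlaps with the existing network.
        while node not in toolbox:
            toolbox.add(node)
            node = rng.randrange(n_univ)   # product of the next reaction
        n_regulators += 1                  # one regulator per pathway
        trajectory.append((len(toolbox), n_regulators))
    return trajectory

def loglog_slope(points):
    """Least-squares slope of log(y) against log(x)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

trajectory = simulate_random_network()
fit_points = [(n_g, n_r) for n_g, n_r in trajectory if n_r >= 20]
print("fitted exponent:", loglog_slope(fit_points))
```

Because each step joins the toolbox with probability N_G/N_univ, the mean pathway length is about N_univ/N_G, so the fitted exponent is expected to come out close to 2.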

[Figure: log-log plot of N_TF (10^0 to 10^3) versus N_genes (10^2 to 10^4)]

Green – all fully sequenced prokaryotes

Red – toolbox model on KEGG universal network with Nuniv=1800

From SM, S. Krishna, T.Y. Pang, K. Sneppen, PNAS (2009)

[Figure: log-log plot of cumulative distribution (10^-4 to 10^0) versus branch/regulon size (10^0 to 10^2)]

Green – linear branches in E. coli metabolic network

Red – toolbox model on KEGG

SM, S. Krishna, T.Y. Pang, K. Sneppen, PNAS (2009)

Length distribution of metabolic pathways/branches

-1=2

Model with shortest & branched instead of meandering & linear pathways

[Figure: log-log plot of N_TF (10^0 to 10^3) versus N_met (10^1 to 10^3); slope = 1.7]

SM, T.Y. Pang, in preparation (2010)

What does it mean for regulatory networks?

• $N_R \langle K_{out} \rangle = N_G \langle K_{in} \rangle$ = number of regulatory interactions
• $N_R/N_G = \langle K_{in} \rangle / \langle K_{out} \rangle$ increases with $N_G$
• Either $\langle K_{out} \rangle$ decreases with $N_G$: pathways become shorter as in our model
• Or $\langle K_{in} \rangle$ grows with $N_G$: regulation gets more coordinated
• Most likely both trends at once

E. van Nimwegen, TIG (2003)
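The counting identity can be illustrated on a toy network. The edge list and gene names below are hypothetical, chosen only to show that both averages count the same set of regulatory interactions.

```python
import math
from collections import Counter

# Hypothetical toy network: (regulator, regulated gene) pairs.
edges = [
    ("TF1", "geneA"), ("TF1", "geneB"), ("TF1", "geneC"),
    ("TF2", "geneC"), ("TF2", "geneD"),
]
genes = ["TF1", "TF2", "geneA", "geneB", "geneC", "geneD"]

k_out = Counter(regulator for regulator, _ in edges)   # out-degree per regulator
k_in = Counter(target for _, target in edges)          # in-degree per regulated gene

n_r = len(k_out)                           # number of regulators
n_g = len(genes)                           # number of genes, regulators included
mean_k_out = sum(k_out.values()) / n_r     # averaged over regulators only
mean_k_in = sum(k_in.values()) / n_g       # averaged over all genes

print(n_r * mean_k_out, n_g * mean_k_in, len(edges))
print(n_r / n_g, mean_k_in / mean_k_out)
```

Both products equal the number of edges, so the regulator fraction N_R/N_G can only grow if the mean out-degree falls, the mean in-degree rises, or both.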

Regulating pathways: basic version

[Figure: pathway diagram with labels nutrient (×2), TF1, TF2]

$\langle K_{out} \rangle$ decreases with $N_G$, $\langle K_{in} \rangle = 1 =$ const

Regulating pathways: long regulons

[Figure: pathway diagram with labels nutrient (×2), TF1, TF2]

$\langle K_{out} \rangle =$ const, $\langle K_{in} \rangle$ grows with $N_G$

Regulating pathways: TF→TF + upstream suppression

[Figure: pathway diagram with labels nutrient (×2), TF1, TF2]

Regulating pathways: new TFs

[Figure: pathway diagram with labels nutrient (×2), TF1 (×2), TF2]

Conclusions and future plans

• Toolbox “Home Depot” model explains:
  - Quadratic scaling of the number of regulators
  - Broad distribution (hubs and stubs) of regulon sizes: most functions need few tools, some need many
• Gene duplication models offer an alternative way to explain hubs in biological networks, but the ultimate explanation has to be functional
• Our model relies on Horizontal Gene Transfer instead of gene duplication
• To do list:
  - Coordination of regulation of different pathways: which of our proposed scenarios (if any) is realized?
  - What is Nature trying to minimize when adding branched pathways? The number of added reactions? The number of byproducts? Cross-talk with existing pathways?
  - Extensions to organizations, technology innovations, etc.?

Thank you!

[Figure: pathway schematic with labels Target product, By-product (×2), “Surface”, NM]

[Figure: log-log plot of surface of pathway (10^0 to 10^2) versus # of metabolites in a pathway (10^0 to 10^2); log-binned surface: exponent = 0.25 - 0.5]

[Figure: log-log plot of by-products (10^-1 to 10^2, y-axis labeled “Surface of pathway”) versus # of metabolites in a pathway (10^0 to 10^2); log-binned by-products: exponent = 1]

[Figure panels: Toolbox model; E. coli metabolic network (spanning tree)]

Deleting pathways

[Figure: pathway diagram with labels nutrient (×3), TF1 (×2), TF2]

Distribution of regulon sizes

[Figure: log-log plot of cumulative distribution (10^-4 to 10^0) versus branch length/regulon size (10^0 to 10^2), with exponent labels -1=1 and -1=2]

Green: regulons in E. coli

Red: toolbox model on full KEGG


[Figure, panels Fig. 2A, Fig. 2B, Fig. 2C, Fig. 2D: log-log plot of cumulative distribution (10^-3 to 10^0) versus regulon size (10^0 to 10^2)]

KEGG pathways vs reactions

In ~500 fully sequenced prokaryotes

[Figure: # of pathways ~ N_R versus # of reactions ~ N_G]

SM, S. Krishna, K. Sneppen (2008)

[Figure, panel A: log-log plot of N_TF (10^0 to 10^3) versus N_genes (10^2 to 10^4). Legend: MF-model, Nuniv=1750; kegg-maps, 1800; best fit to x: slope=2.15]

[Figure: log-log plot, both axes 10^0 to 10^2; axis label dNM; legend: surface, by-product]

Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Csaba Pal, Balazs Papp & Martin Lercher, Nat. Genet. (2005)


[Figure: log-log plot of # of enzymes / # of genes (10^-2 to 10^0) versus # of genes (10^2 to 10^4); mean fraction = 0.23]

[Figure: log-log plot of # of ABC transporters (pfam:PF00005) (10^0 to 10^3) versus N_G, the # of genes (10^2 to 10^4); all prokaryotes; fit slope 1.33]

Table from Erik van Nimwegen, TIG 2003

Complexity is manifested in Kin distribution

E. coli vs. S. cerevisiae vs. H. sapiens

[Figure, panels A and B: N(Kin) (10^0 to 10^3) versus Kin (0 to 20), and N(Kout) (10^-2 to 10^2) versus Kout (10^0 to 10^2)]

[Figure panels: Basic version; Coordinated activity of pathways]

SM, S. Krishna, K. Sneppen (2008)


Jerison 1983 The evolution of the mammalian brain as an information-processing system. pp. 113-146 IN Eisenberg, J. F. & Kleiman, D. G. (Ed.), Advances in the Study of Mammalian Behavior (Spec. Publ. Amer. Soc. Mamm. 7). Pittsburgh: American Society of Mammalogists. Figure redrawn from Jerison 1973


Trivia facts

Zebrafish – the largest # of TFs (~2700) or 10% of ~27,000 genes. (humans ~1900 TFs or 8% of 24,000 genes)

In bacteria it is Burkholderia sp. 383 : ~800 TFs out of 8000 genes (also 10% of the total)

Linear fit to log(NR) with log(Ngenes) explains 87% of the variance (cc~0.93)

Linear fit to NR/Ngenes with Ngenes explains 50%-60% of the variance (cc~0.7-0.75).

Gut/sewer (free-living) bacterium: E. coli K12: 4467 genes, 271 TFs, 6%

Aphid parasite: Buchnera aphidicola APS: 618 genes, 6 TFs, 1%

Soil bacterium: Rhodococcus sp. RHA1: 9221 genes, 641 TFs, 7%

http://www.g-language.org/g3/
