Transcript
Page 1: Why bacteria run Linux while eukaryotes run Windows?

Why bacteria run Linux while eukaryotes run Windows?

Sergei Maslov

Brookhaven National Laboratory, New York

Page 2:

Physical vs. Biological Laws

• Physical laws are often discovered by finding a simple common explanation for very different phenomena
    • Newton’s law: apples fall to the ground; planets revolve around the Sun
• Discovery of biological laws is slowed down by our having a cookie-cutter explanation in terms of natural selection:

Page 3:

Drawing from Facebook group: Trust me, I'm a "Biologist"

Page 4:

Genes encoded in bacterial genomes ~ Packages installed on Linux computers

Page 5:

• Complex systems have many components
    • Genes (Bacteria)
    • Software packages (Linux OS)
• Components do not work alone: they need to be assembled to work
• In individual systems only a subset of components is installed
    • Genome (Bacteria) – collection of genes
    • Computer (Linux OS) – collection of software packages
• Components have vastly different frequencies of installation

Page 6:

Justin Pollard, http://www.designboom.com

IKEA kits have many components

Page 7:

Justin Pollard, http://www.designboom.com

They need to be assembled to work

Page 8:

Different frequencies of use: common vs. rare

Page 9:

What determines the frequency of installation/use of a gene/package?

• Popularity, a.k.a. preferential attachment
    • Frequency ~ self-amplifying popularity
    • Relevant for social systems: WWW links, Facebook friendships, scientific citations
• Functional role
    • Frequency ~ breadth or importance of the functional role
    • Relevant for biological and technological systems, where selection adjusts undeserved popularity

Page 10:

Empirical data on component frequencies

• Bacterial genomes (eggnog.embl.de): 500 sequenced prokaryotic genomes, 44,000 orthologous gene families
• Linux packages (popcon.ubuntu.com): 200,000 Linux packages installed on 2,000,000 individual computers
• Binary tables: a component is either present or absent in a given system
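A minimal sketch of how a component's frequency could be read off such a binary presence/absence table. The toy table and the names `presence` and `component_frequencies` are illustrative assumptions, not data or code from eggNOG or popcon.

```python
from typing import Dict, List


def component_frequencies(presence: Dict[str, List[int]]) -> Dict[str, float]:
    """Return, for each component, the fraction of systems that contain it.

    `presence` maps a component name to a row of 0/1 flags, one per system.
    """
    return {name: sum(row) / len(row) for name, row in presence.items()}


# Toy table (made-up values): rows are components, columns are four systems.
presence = {
    "core_gene": [1, 1, 1, 1],
    "shell_gene": [1, 0, 1, 0],
    "orfan_gene": [0, 0, 0, 1],
}

freqs = component_frequencies(presence)
for name, f in freqs.items():
    print(name, f)
```

The frequency distribution P(f) discussed in the talk would then be a histogram of these per-component values over all components.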

Page 11:

Frequency distributions

$P(f) \sim f^{-1.5}$, except the top √N “universal” components with $f \sim 1$

Figure labels: Cloud, Shell, Core, ORFans

TY Pang, S. Maslov, PNAS (2013)

Page 12:

How to quantify functional importance?

• We want to check: Frequency ~ Importance
• Usefulness = Importance ~ a component is needed for proper functioning of other components
• Dependency network
    • A → B means A depends on B for its function
    • Formalized for Linux software packages
    • For metabolic enzymes, given by upstream-downstream positions in pathways
• Frequency ~ dependency degree, $K_{\mathrm{dep}}$
    • $K_{\mathrm{dep}}$ = the total number of components that directly or indirectly depend on the selected one
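The definition of the dependency degree can be sketched as a graph traversal. This is a minimal illustration under stated assumptions: the package names in `depends_on` are hypothetical, and `dependency_degrees` is a name chosen here, not a function from the cited paper or from any package manager.

```python
from typing import Dict, List, Set


def dependency_degrees(depends_on: Dict[str, List[str]]) -> Dict[str, int]:
    """Count, for each component, how many others depend on it directly or indirectly.

    `depends_on[a]` lists the components that `a` directly depends on (edges a -> b).
    """
    # Reverse the edges: dependents[b] lists components that directly depend on b.
    dependents: Dict[str, List[str]] = {name: [] for name in depends_on}
    for a, targets in depends_on.items():
        for b in targets:
            dependents.setdefault(b, []).append(a)

    k_dep: Dict[str, int] = {}
    for start in dependents:
        seen: Set[str] = set()
        stack = [start]
        # Depth-first walk over everything that (transitively) depends on `start`.
        while stack:
            node = stack.pop()
            for parent in dependents.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        seen.discard(start)  # do not count the component itself if there is a cycle
        k_dep[start] = len(seen)
    return k_dep


# Hypothetical toy dependency network.
depends_on = {
    "app": ["lib", "runtime"],
    "lib": ["libc"],
    "runtime": ["libc"],
    "libc": [],
}
k_dep = dependency_degrees(depends_on)
print(k_dep)
```

In this toy network the most basic component (`libc`) gets the largest dependency degree, which is the sense in which the talk expects frequency to follow importance.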

Page 13:

TY Pang, S. Maslov, PNAS (2013)

Page 14:

Frequency is positively correlated with functional importance

• Correlation coefficient ~0.4 for both Linux and genes
• Could be improved by using weighted dependency degree

TY Pang, S. Maslov, PNAS (2013)

Page 15:

Warm-up: tree-like metabolic network

Figure labels: TCA cycle; example nodes with $K_{\mathrm{dep}} = 5$ and $K_{\mathrm{dep}} = 15$

TY Pang, S. Maslov, PNAS (2013)

Page 16:

Dependency degree distribution on a critical branching tree

• $P(K) \sim K^{-1.5}$ for a critical branching tree
• Paradox: $K_{\max}^{-0.5} \sim 1/N \Rightarrow K_{\max} = N^{2} > N$
• Answer: parent tree size imposes a cutoff: there will be √N “core” nodes with $K_{\max} = N$, present in almost all systems (ribosomal genes or core metabolic enzymes)
• Need a new model: in a tree $D = 1$, while in real systems $D \sim 2 > 1$
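The paradox and its resolution can be spelled out in a few lines. This is a sketch of the standard extreme-value estimate, assuming the N dependency degrees can be treated as independent draws from the power law; it is not taken verbatim from the slides.

```latex
% Assumption: N degrees drawn independently from P(K) \sim K^{-1.5}.
\begin{align*}
P(K' \ge K) &\sim \int_{K}^{\infty} K'^{-1.5}\, dK' \sim K^{-0.5} \\
N \cdot P(K' \ge K_{\max}) \sim 1
  &\;\Rightarrow\; K_{\max}^{-0.5} \sim \frac{1}{N}
  \;\Rightarrow\; K_{\max} \sim N^{2} > N \\
\text{cutoff at } K = N:\quad
N \cdot P(K' \ge N) &\sim N \cdot N^{-0.5} = \sqrt{N}
\end{align*}
```

The first two lines give the paradox (a degree larger than the number of components); the last line counts the nodes piled up at the cutoff, which is the √N “core”.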

Page 17:

Bottom-down model of dependency network evolution

• Components are added gradually over evolutionary time
• A new component directly depends on D previously existing components selected randomly
• Versions:
    • D is drawn from some distribution: same as above
    • Recent components are preferentially selected (citations)
    • There is a fixed probability to connect to any previously existing component (food webs)
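A minimal simulation sketch of the basic version of this model (fixed D, uniformly random choice among earlier components). The function names, the network size, and the seed are assumptions made for illustration; this is not the authors' code.

```python
import random
from typing import List, Set


def grow_dependency_network(n: int, d: int, seed: int = 0) -> List[List[int]]:
    """Add components 0..n-1 in order; each new one depends on up to d random earlier ones."""
    rng = random.Random(seed)
    depends_on: List[List[int]] = []
    for new in range(n):
        k = min(d, new)  # the earliest components have fewer than d predecessors
        depends_on.append(rng.sample(range(new), k))
    return depends_on


def model_dependency_degrees(depends_on: List[List[int]]) -> List[int]:
    """k_dep[i] = number of components that depend on i directly or indirectly."""
    k_dep = [0] * len(depends_on)
    # ancestors[j] = everything component j depends on, directly or indirectly.
    # Dependencies always point to earlier components, so one forward pass suffices.
    ancestors: List[Set[int]] = []
    for deps in depends_on:
        anc: Set[int] = set()
        for i in deps:
            anc.add(i)
            anc |= ancestors[i]
        ancestors.append(anc)
        for i in anc:
            k_dep[i] += 1
    return k_dep


net = grow_dependency_network(n=1000, d=2)
k_dep = model_dependency_degrees(net)
print("oldest components:", k_dep[:5])
print("newest components:", k_dep[-5:])
```

Older components tend to accumulate more dependents than recent ones. The other versions listed on the slide would only change how the dependencies of a new component are sampled.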

Page 18:

• p(t,T) – probability that a component added at time T directly or indirectly depends on one added at time t

Page 19:

Page 20:

$K_{\mathrm{dep}}$ and $K_{\mathrm{out}}$ degree distributions

Page 21:

$K_{\mathrm{dep}}$ decreases with layer number

Panels: Linux; Model with D=2

TY Pang, S. Maslov, PNAS (2013)

Page 22:

Zipf plot for $K_{\mathrm{dep}}$ distributions

Panels: Metabolic enzymes vs. Model; Linux vs. Model

TY Pang, S. Maslov, PNAS (2013)

Page 23:

Frequency distributions

$P(f) \sim f^{-1.5}$, except the top √N “universal” components with $f \sim 1$

Figure labels: Cloud, Shell, Core, ORFans

TY Pang, S. Maslov, PNAS (2013)

Page 24:

What experiments does P(f) help to interpret?

Page 25:

Pan-genome of E. coli strains

M Touchon et al. PLoS Genetics (2009)

Page 26:

Metagenomes

The Human Microbiome Project Consortium, Nature (2012)

Page 27:

Pan-genome scaling

Page 28:

Pan-genome of all bacteria

Slope = -0.4; prediction of the toolbox model: -0.5

P. Lapierre, JP Gogarten, TIG (2009)

$(\#\text{ of genes in pan-genome}) \sim (\#\text{ of sequenced genomes})^{0.5}$

$(\#\text{ of new genes added to pan-genome}) \sim (\#\text{ of sequenced genomes})^{-0.5}$
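One way to see how a frequency distribution $P(f) \sim f^{-1.5}$ is compatible with these two exponents is a short numerical check. The sketch below rests on an illustrative assumption that is not stated on the slides: each gene family is present in each genome independently with probability f. The slide itself attributes the -0.5 prediction to the toolbox model, so this is a consistency check rather than the authors' derivation. The cutoff `f_min` and the grid size are arbitrary choices.

```python
import math


def pan_genome_size(n_genomes: int, f_min: float = 1e-8, grid: int = 4000) -> float:
    """Expected number of distinct families seen in n_genomes genomes, up to a constant.

    Integrates f**-1.5 * (1 - (1 - f)**n) over f in [f_min, 1] with the trapezoid
    rule on a uniform grid in u = ln f (so df = f du).
    """
    u_lo = math.log(f_min)
    du = -u_lo / grid
    total = 0.0
    for i in range(grid + 1):
        f = min(1.0, math.exp(u_lo + i * du))
        # Probability that a family of frequency f appears in at least one genome.
        seen = 1.0 if f >= 1.0 else -math.expm1(n_genomes * math.log1p(-f))
        weight = 0.5 if i in (0, grid) else 1.0
        total += weight * f ** -0.5 * seen * du
    return total


def log_log_slope(x1: float, y1: float, x2: float, y2: float) -> float:
    """Slope of the straight line through two points on a log-log plot."""
    return math.log(y2 / y1) / math.log(x2 / x1)


n1, n2 = 100, 10_000
slope_total = log_log_slope(n1, pan_genome_size(n1), n2, pan_genome_size(n2))

# New families contributed by one additional genome.
new1 = pan_genome_size(n1 + 1) - pan_genome_size(n1)
new2 = pan_genome_size(n2 + 1) - pan_genome_size(n2)
slope_new = log_log_slope(n1, new1, n2, new2)

print(f"pan-genome size exponent: {slope_total:.2f}")
print(f"new-genes-per-genome exponent: {slope_new:.2f}")
```

Under these assumptions the two fitted exponents should come out close to +0.5 and -0.5, since the second quantity is the discrete derivative of the first.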

Page 29:

Bacterial genome evolution happens in cooperation with phages

Page 30:

Comparative genomics of E. coli implicates phages for BitTorrent

K-12 to B comparison

Figure labels: 1 kb: gene length; phage capacity: 20 kb; other strains: up to 40 kb

Page 31:

Phage-Bacteria Infection Network

Data from Flores et al. (2011); experiments by Moebus and Nattkemper (1981)

WWW from AT&T website circa 1996, visualized by Mark Newman

Page 32:

Why eukaryotes run Windows?

• Dependency network = reuse of components
    • Bacteria do not keep redundant genes after HGT
    • Linux developers rely on previous efforts
    • Pros: smaller genomes, open source, economies of scale
    • Cons: less specialized, potentially unstable, “dependency hell”
• Eukaryotes are like Windows or Mac OS X
    • Keep redundant components
    • Proprietary software

Page 33:

Figure adapted from S. Maslov, TY Pang, K. Sneppen, S. Krishna, PNAS (2009)

Axes: # of genes (x-axis); # of pathways (or their regulators) (y-axis)

Page 34:

Software packages for Linux

Figure: log-log plot of # of selected packages (y-axis) vs. # of installed packages (x-axis); Linux data, slope 1.7

$N_{\text{selected packages}} \sim N_{\text{installed packages}}^{1.7}$

Page 35:

Collaborators: Tin Yau Pang, Stony Brook University

Support: Office of Biological and Environmental Research

Page 36:

Thank you!

