molecule as computation

Post on 12-Jan-2016

Molecule as Computation

Ehud Shapiro

Weizmann Institute of Science

Joint work with Aviv Regev and Bill Silverman

In collaboration with Corrado Priami, Naama Barkai and Luca Cardelli

The talk has three parts:

1. Briefly introduce molecular biology

2. Computer-based consolidation of molecular biology

3. Our work on helping this happen

Part I: Brief Introduction to Molecular Biology

Pentium II vs. E. coli

Pentium II | E. coli
3 million transistors | 1 million macromolecules
1/4 million bytes of memory | 1 million bytes of static genetic memory
80 million operations per second | 1 million amino-acids per second

Comparison courtesy of Erik Winfree

[Figure: Pentium II and E. coli shown side by side, each with a 1 micron scale bar]

Inside E. coli

(1 Mbyte)

Ribosomes in operation

Ribosomes translate RNA to Proteins

RNA Polymerase transcribes DNA to RNA

Computationally: A stateless string transducer from the RNA alphabet of nucleic acids to the Protein alphabet of amino acids

(= protein)
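The string-transducer view can be sketched in Python. This is an illustrative sketch, not part of the talk: the function names `transcribe` and `translate` are hypothetical, transcription is simplified to copying the coding strand with T replaced by U, and translation is assumed to start at the first AUG and stop at the first stop codon. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(dna):
    """RNA polymerase as a transducer: DNA alphabet -> RNA alphabet.

    Simplifying assumption: the input is the coding strand, so the
    transcript is the same string with T replaced by U.
    """
    return dna.upper().replace("T", "U")


def translate(rna):
    """Ribosome as a transducer: RNA alphabet -> amino-acid alphabet.

    Reads codon by codon from the first AUG until a stop codon.
    Expects an upper-case string over the letters A, C, G, U.
    """
    start = rna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `translate(transcribe("ATGGCCTAA"))` reads the codons AUG, GCC and UAA and yields the two-residue string `"MA"`.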


Sequences and String Transducers

Ribosomes translate RNA to Proteins

RNA Polymerase transcribes DNA to RNA

Molecular Biology in One Slide

Sequence: Sequence of DNA and Proteins


What about “The Rest” of biology: the function, activity and interaction of molecular systems in cells?

Part III: An Abstraction for Molecular Systems

The “New Biology”

The cell as an information processing device

Cellular information processing and passing are carried out by networks of interacting molecules

Ultimate understanding of the cell requires an information processing model

Which?

“We have no real ‘algebra’ for describing regulatory circuits across different systems...”

- T. F. Smith (TIG 14:291-293, 1998)

“The data are accumulating and the computers are humming, what we are lacking are the words, the grammar and the syntax of a new language…”

- D. Bray (TIBS 22:325-326, 1997)

Our Proposal: Molecule as Computational Process

“Cellular Abstractions: Cells as Computation”,

to appear in Nature, September 26th, 2002

A system of interacting molecular entities is described and modelled by a system of interacting computational entities.

Composition of two processes is a process, therefore:

Molecular ensembles as processes

Molecular networks as processes

Cells as processes (virtual cell)

Multi-cellular organisms as processes

Collections of organisms as processes

Towards “Molecule as Process”

1. Use the π-calculus process algebra as molecule description language

The π-calculus (Milner, Walker and Parrow 1989)

A program specifies a network of interacting processes

Processes are defined by their potential communication activities

Communication occurs on complementary channels, identified by names

Message content: Channel name

π-calculus key constructs

Parallel: A | B

Choice: A ; B

Communication: X ! M or X ? Y

Recursion, with state change: P :- … P’…

Molecules as Processes

Molecule | Process
Interaction capability | Channel
Interaction | Communication
Modification | State change

Na + Cl <-> Na+ + Cl-

Na | Na | … | Na | Cl | Cl | … | Cl

Na::= e ! [] , Na_plus .

Na_plus::= e ? [] , Na .

Cl::= e ? [] , Cl_minus .

Cl_minus::= e ! [] , Cl .

Processes, guarded communication, alternation between two states.
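The alternation between two states can be sketched in Python as a small table-driven simulation. This is a qualitative sketch under stated assumptions, not BioSPI: the `PROCESSES` table and the `step` function are hypothetical names, and the nondeterministic choice of which sender and receiver meet on channel `e` is modelled by a uniform random pick.

```python
import random

# For each process state: the action it offers on channel e
# ("!" = send, "?" = receive) and the state it moves to afterwards.
PROCESSES = {
    "Na":       ("!", "Na_plus"),
    "Na_plus":  ("?", "Na"),
    "Cl":       ("?", "Cl_minus"),
    "Cl_minus": ("!", "Cl"),
}


def step(population, rng):
    """Perform one communication on channel e, if one is possible.

    A sender and a receiver are picked at random (an assumption of this
    sketch); both then move to their next state. Returns False when no
    complementary pair exists.
    """
    senders = [i for i, s in enumerate(population) if PROCESSES[s][0] == "!"]
    receivers = [i for i, s in enumerate(population) if PROCESSES[s][0] == "?"]
    if not senders or not receivers:
        return False
    i, j = rng.choice(senders), rng.choice(receivers)
    population[i] = PROCESSES[population[i]][1]
    population[j] = PROCESSES[population[j]][1]
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["Na"] * 5 + ["Cl"] * 5
    for _ in range(20):
        step(population, rng)
    print(population)
```

Because every communication pairs one sender with one receiver, the number of `Na_plus` processes always equals the number of `Cl_minus` processes when the system starts from neutral `Na` and `Cl` only.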

The RTK-MAPK pathway

16 molecular species

24 domains; 15 sub-domains

Four cellular compartments

Binding, dimerization, phosphorylation, de-phosphorylation, conformational changes, translocation

~100 literature articles

250 lines of code

[Figure: RTK-MAPK pathway diagram; labelled components: GF, RTK, SHC, GRB2, SOS, RAS, GAP, RAF, MKK1, MP1, ERK1, PP2A, MKP1, IEG, IEP, J, F]

Molecular systems with π-calculus

Can express, qualitatively, the behavior of many complex molecular systems

Cannot express quantitative aspects

Towards “Molecule as Process”

1. Use the π-calculus process algebra as molecule description language

2. Provide a biochemistry-oriented stochastic extension (with Corrado Priami)

Stochastic π-Calculus (Priami, 1995; Regev, Priami, Shapiro, Silverman 2000)

Every channel x is associated with a base rate r

A global (external) clock is maintained

The clock is advanced and a communication is selected according to a race condition

Rate calculation and race condition adapted for chemical reactions: Rate(A + B -> C) = BaseRate * [A] * [B]

[A] = number of A’s willing to communicate with B’s.

[B] = number of B’s willing to communicate with A’s.
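The rate formula and the race between channels can be sketched in Python. This is a sketch under assumptions, not the BioSPI implementation: `reaction_rate` and `race` are hypothetical names, and the race is modelled in the "first reaction" style, where each enabled channel draws an exponentially distributed delay and the shortest delay wins.

```python
import random


def reaction_rate(base_rate, n_a, n_b):
    """Rate(A + B -> C) = BaseRate * [A] * [B]."""
    return base_rate * n_a * n_b


def race(channels, rng):
    """Pick the next communication by a race between channels.

    channels maps a channel name to (base_rate, n_senders, n_receivers).
    Returns (delay, channel_name) for the winner, or None when no channel
    has both a sender and a receiver.
    """
    winner = None
    for name, (base_rate, n_senders, n_receivers) in channels.items():
        rate = reaction_rate(base_rate, n_senders, n_receivers)
        if rate > 0:
            delay = rng.expovariate(rate)  # exponential waiting time
            if winner is None or delay < winner[0]:
                winner = (delay, name)
    return winner
```

The global clock would then be advanced by the winning delay before the selected communication is carried out.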

BioSPI implementation: π-calculus + Gillespie’s algorithm

Gillespie (1977): Accurate stochastic simulation of chemical reactions

The BioSPI system: Compiles (full) π-calculus

Runtime incorporates Gillespie’s algorithm

[Plot: simulation trace; horizontal axis 0 to 0.03, vertical axis 0 to 100]

global(e1(100),e2(10)).

Na::= e1 ! [] , Na_plus .

Na_plus::= e2 ? [] , Na .

Cl::= e1 ? [] , Cl_minus .

Cl_minus::= e2 ! [] , Cl .
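The behaviour of this stochastic program can be approximated outside BioSPI with a direct Python version of Gillespie's algorithm. This is a sketch under assumptions: the function name `gillespie_nacl` is hypothetical, the base rates 100 and 10 are taken from `global(e1(100),e2(10))`, and the initial population of 100 `Na` and 100 `Cl` and the end time of 0.03 are illustrative choices, not values stated in the slides.

```python
import random


def gillespie_nacl(n_na, n_cl, k_forward=100.0, k_backward=10.0,
                   t_end=0.03, seed=0):
    """Gillespie's direct method for Na + Cl <-> Na+ + Cl-.

    The forward reaction uses channel e1 (Na sends, Cl receives); the
    backward reaction uses channel e2 (Cl_minus sends, Na_plus receives).
    Returns a list of (time, n_Na, n_Na_plus) samples, one per event.
    """
    rng = random.Random(seed)
    t = 0.0
    na, cl, na_plus, cl_minus = n_na, n_cl, 0, 0
    trace = [(t, na, na_plus)]
    while True:
        a_forward = k_forward * na * cl
        a_backward = k_backward * na_plus * cl_minus
        a_total = a_forward + a_backward
        if a_total == 0:
            break  # nothing can react
        t += rng.expovariate(a_total)  # time to the next event
        if t > t_end:
            break
        if rng.random() * a_total < a_forward:
            na, cl = na - 1, cl - 1
            na_plus, cl_minus = na_plus + 1, cl_minus + 1
        else:
            na, cl = na + 1, cl + 1
            na_plus, cl_minus = na_plus - 1, cl_minus - 1
        trace.append((t, na, na_plus))
    return trace


if __name__ == "__main__":
    trace = gillespie_nacl(100, 100)
    print(len(trace), "events; final state:", trace[-1])
```

Each iteration computes the two propensities with the BaseRate * [A] * [B] rule, advances the clock by an exponentially distributed delay, and chooses the forward or backward reaction with probability proportional to its propensity.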

[Plot: simulation trace for Na + Cl <-> Na+ + Cl-; horizontal axis 0 to 4 x 10^-3, vertical axis 0 to 100]

Programming Experience with Stochastic Pi Calculus

Taught a semester-long M.Sc. course (available online) with many examples, exercises and final projects

Textbook examples from chemistry, organic chemistry, enzymatic reactions, metabolic pathways, signal-transduction pathways…

Circadian Clocks

J. Dunlap, Science (1998) 280 1548-9

The circadian clock machinery (Barkai and Leibler, Nature 2000)

[Figure: circadian clock machinery; genes A_GENE and R_GENE with promoters PA and PR, transcripts A_RNA and R_RNA with UTRA and UTRR, proteins A and R; arrows labelled transcription, translation and degradation]

Differential rates: Very fast, fast and slow

The machinery in π-calculus: “A” molecules

A_GENE ::= PROMOTED_A + BASAL_A
PROMOTED_A ::= pA ? {e} . ACTIVATED_TRANSCRIPTION_A(e)
BASAL_A ::= bA ? [] . (A_GENE | A_RNA)
ACTIVATED_TRANSCRIPTION_A ::= 1 . (ACTIVATED_TRANSCRIPTION_A | A_RNA) + e ? [] . A_GENE

A_RNA ::= TRANSLATION_A + DEGRADATION_mA
TRANSLATION_A ::= utrA ? [] . (A_RNA | A_PROTEIN)
DEGRADATION_mA ::= degmA ? [] . 0

A_PROTEIN ::= (new e1,e2,e3) PROMOTION_A-R + BINDING_R + DEGRADATION_A
PROMOTION_A-R ::= pA ! {e2} . e2 ! [] . A_PROTEIN + pR ! {e3} . e3 ! [] . A_PROTEIN
BINDING_R ::= rbs ! {e1} . BOUND_A_PROTEIN
BOUND_A_PROTEIN ::= e1 ? [] . A_PROTEIN + degpA ? [] . e1 ! [] . 0
DEGRADATION_A ::= degpA ? [] . 0

[Figure labels: A_Gene, A_RNA, A_protein]

The machinery in π-calculus: “R” molecules

R_GENE ::= PROMOTED_R + BASAL_R
PROMOTED_R ::= pR ? {e} . ACTIVATED_TRANSCRIPTION_R(e)
BASAL_R ::= bR ? [] . (R_GENE | R_RNA)
ACTIVATED_TRANSCRIPTION_R ::= 2 . (ACTIVATED_TRANSCRIPTION_R | R_RNA) + e ? [] . R_GENE

R_RNA ::= TRANSLATION_R + DEGRADATION_mR
TRANSLATION_R ::= utrR ? [] . (R_RNA | R_PROTEIN)
DEGRADATION_mR ::= degmR ? [] . 0

R_PROTEIN ::= BINDING_A + DEGRADATION_R
BINDING_A ::= rbs ? {e} . BOUND_R_PROTEIN
BOUND_R_PROTEIN ::= e1 ? [] . A_PROTEIN + degpR ? [] . e1 ! [] . 0
DEGRADATION_R ::= degpR ? [] . 0

[Figure labels: R_Gene, R_RNA, R_protein]

BioSPI simulation

Robust to random perturbations

[Plots: two BioSPI simulation traces of A and R; horizontal axes 0 to 10000, vertical axes 0 to 600]

The A hysteresis module

The entire population of A molecules (gene, RNA, and protein) behaves as one bi-stable module

[Figure: the A module as a switch between ON and OFF states with fast transitions, coupled to R; plot with axes labelled A and R, each 0 to 600]

Hysteresis module

ON_H-MODULE(CA) ::=
    {CA<=T1} . OFF_H-MODULE(CA) +
    {CA>T1} . (rbs ! {e1} . ON_DECREASE +
               e1 ! [] . ON_H-MODULE +
               pR ! {e2} . (e2 ! [] . 0 | ON_H-MODULE) +
               1 . ON_INCREASE)
ON_INCREASE ::= {CA++} . ON_H-MODULE
ON_DECREASE ::= {CA--} . ON_H-MODULE

OFF_H-MODULE(CA) ::=
    {CA>T2} . ON_H-MODULE(CA) +
    {CA<=T2} . (rbs ! {e1} . OFF_DECREASE +
                e1 ! [] . OFF_H-MODULE +
                2 . OFF_INCREASE)
OFF_INCREASE ::= {CA++} . OFF_H-MODULE
OFF_DECREASE ::= {CA--} . OFF_H-MODULE
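The switching logic of the hysteresis module can be sketched in Python as a counter with two thresholds. This is a sketch under assumptions: the class name `HysteresisModule` is hypothetical, communication on `rbs`, `e1` and `pR` is left out so that only the counter `CA` and the guards on `T1` and `T2` remain, `T1 <= T2` is assumed, and the counter is clamped at zero.

```python
class HysteresisModule:
    """Two-state (ON/OFF) module driven by a counter CA.

    ON  -> OFF when CA <= T1
    OFF -> ON  when CA >  T2
    With T1 < T2 the state depends on the history of CA, not only on its
    current value, which is the hysteresis.
    """

    def __init__(self, t1, t2, count=0, on=False):
        if t1 > t2:
            raise ValueError("this sketch assumes T1 <= T2")
        self.t1, self.t2 = t1, t2
        self.count = count
        self.on = on

    def _apply_guards(self):
        if self.on and self.count <= self.t1:
            self.on = False
        elif not self.on and self.count > self.t2:
            self.on = True

    def increase(self):
        """Counterpart of {CA++}."""
        self.count += 1
        self._apply_guards()

    def decrease(self):
        """Counterpart of {CA--}; clamped at zero (an assumption)."""
        self.count = max(0, self.count - 1)
        self._apply_guards()


if __name__ == "__main__":
    module = HysteresisModule(t1=2, t2=5)
    for _ in range(6):
        module.increase()
    print(module.count, module.on)
```

With `t1=2` and `t2=5`, the module turns ON only once the counter exceeds 5, and stays ON while the counter falls back through 5, 4 and 3, turning OFF only when it reaches 2.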

Modular cell biology

Build two representations in the π-calculus

Implementation (how?): molecular level

Specification (what?): functional module level

The circadian specification

R (gene, RNA, protein) processes are unchanged (modular;compositional)

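[Figure: circadian specification; the R machinery (R_GENE with promoter PR, R_RNA with UTRR, R protein; transcription, translation, degradation) coupled to an ON/OFF module with Counter_A]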

BioSPI simulation

[Plot: Module, R protein and R RNA; horizontal axis 0 to 10000, vertical axis 0 to 500]

[Plot: R (module vs. molecules); horizontal axis 7500 to 10000, vertical axis 0 to 600]

Modular cell biology

Build two representations in the π-calculus

Implementation (how?): molecular level

Specification (what?): functional module level

Ascribing a function to a biomolecular system ~ equivalence between specification and implementation

Limitation of stochastic π-calculus: Lack of location information

Membranes: Cells and cellular compartments, “inside” and “outside”

Molecular proximity: The identity of complexes and single molecules

Limited solution: programming tricks

Towards “Molecule as Process”

1. Use the π-calculus process algebra as molecule description language

2. Provide a biochemistry-oriented stochastic extension (with Corrado Priami)

3. Provide an Ambient Calculus extension (with Luca Cardelli)

Mobile compartments

Compartment | Compartment mobility | Process mobility
Cells | Cell movement | Trans-membranal molecules (receptors, channels, transporters); molecule entry and exit
Organelles and vesicles | Merging, budding, bursting |
Multi-molecular complexes | Form and break | Bind and unbind to molecular scaffolds

The ambient calculus (Cardelli and Gordon)

An ambient is a bounded place where computation happens

The ambient’s boundary restricts process interactions across it

Processes can move in and out of ambients

Ambients are mobile processes, too!

[Figure: an ambient containing processes]

Compartments as ambients

Cells, vesicles, compartments ~ Ambients

[Figure: a Cell containing processes P, Q, R and a Nucleus containing R]

cell [ P | Q | R | nuc [ R ] ]

Synchronized ambient movement

enter/accept, exit/expel, merge+/merge-

vesicle [merge- c . P | Q] | lysosome [merge+ c . R | S]  ->  lysosome [P | Q | R | S]

[Figure: a vesicle merging into a lysosome]

Enter, exit, merge ~ Budding-in or -out, endo- or exo-cytosis

[Figure: compartments with arrows labelled merge, enter, exit, merge]
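The three movement operations can be sketched in Python on a tree of nested ambients. This is a sketch under assumptions: the names `Ambient`, `merge`, `enter_ambient`, `exit_ambient` and `show` are hypothetical, processes are represented only by their names, the enclosing `cytoplasm` ambient in the usage example is invented for illustration, and the synchronization on a shared channel (`c` in the slide) that the calculus requires from both partners is left out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # ambients are compared by identity, not by content
class Ambient:
    name: str
    processes: list = field(default_factory=list)  # process names
    children: list = field(default_factory=list)   # nested ambients

    def child(self, name):
        for ambient in self.children:
            if ambient.name == name:
                return ambient
        raise KeyError(name)


def merge(parent, keeper, absorbed):
    """merge+/merge-: sibling `absorbed` dissolves into sibling `keeper`."""
    k, a = parent.child(keeper), parent.child(absorbed)
    k.processes = a.processes + k.processes
    k.children = a.children + k.children
    parent.children.remove(a)


def enter_ambient(parent, mover, target):
    """enter/accept: sibling `mover` moves inside sibling `target`."""
    m, t = parent.child(mover), parent.child(target)
    parent.children.remove(m)
    t.children.append(m)


def exit_ambient(parent, container, mover):
    """exit/expel: `mover` leaves `container` and becomes its sibling."""
    c = parent.child(container)
    m = c.child(mover)
    c.children.remove(m)
    parent.children.append(m)


def show(ambient):
    """Render an ambient in the bracket notation used on the slides."""
    parts = ambient.processes + [show(c) for c in ambient.children]
    return f"{ambient.name}[{' | '.join(parts)}]"


if __name__ == "__main__":
    cyto = Ambient("cytoplasm", [], [Ambient("vesicle", ["P", "Q"]),
                                     Ambient("lysosome", ["R", "S"])])
    merge(cyto, "lysosome", "vesicle")
    print(show(cyto))
```

In the usage example the vesicle disappears and its processes end up inside the lysosome, mirroring the reduction to `lysosome [P | Q | R | S]`.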

Molecules and complexes

Merge, enter, exit (with private channels) ~ Complex formation and breakage, molecule re-localization

[Figure: Mol1 (P, Q) and Mol2 (R, S) combine into a Complex (P, Q, R, S)]

Mol1 [P | merge+ c . Q] | Mol2 [merge- c . R | S]  ->  Complex [P | Q | R | S]

[Figure: Vesicle merging; a Vesicle and a Cell]

Single substrate reactions: Enzyme and substrate as ambients

[Figure: Enzyme ambient with S, X and P; arrows labelled enter and exit]

Bi-substrate reactions: Inter-ambient communication

[Figure: Enzyme ambient with S1, X, P1 and S2, Y, P2; arrows labelled enter and exit; communication labelled s2s]

Example: Multi-cellular system (hypothalamic body weight control system)

[Figure: hypothalamic body weight control loop. Recoverable labels include: Input; Afferent signal; Efferent signal; Controlled system; Fat cell mass; Leptin expression; Insulin expression; Insulin resistance; Glucose utilization in adipocytes; 1st order (ARC, VMN, PVN); 2nd order (PVN, PFA, LHA); POMC/CART; MSH expression, cleavage; NPY/AgRP expression; Orexin (PFA); MCH (LHA); TRH, CRH, OXY (PVN); Thyroid axis; Hypothalamic Pituitary Adrenal axis; Uterine function; Energy expenditure; Food intake; Weight gain / Weight loss; MSH, MC4, Gs, Gi, cAMP, PKA, NPY, NPYR, AgRP, IR, IRS-1, tub, LR, JAK, STAT]

Conclusions

The most advanced tools for computer process description seem to be also the best tools for the description of biomolecular systems

This intellectual economy validates the decades-long study of concurrency in computer science

An essential foundation for the forthcoming “Virtual Cell Project”
