to tackle the complexity of biological systems, investigate programming theory concepts
DESCRIPTION
Formal Validation and Inference of Biochemical Reaction Models from Temporal Logic Properties François Fages Constraint Programming project-team, INRIA Paris-Rocquencourt, France. To tackle the complexity of biological systems, investigate Programming Theory Concepts - PowerPoint PPT PresentationTRANSCRIPT
François Fages Grenoble, Oct. 2007
Formal Validation and Inference of Biochemical Reaction Models from
Temporal Logic Properties
François FagesConstraint Programming project-team,
INRIA Paris-Rocquencourt, France
To tackle the complexity of biological systems, investigate• Programming Theory Concepts• Formal Methods of Circuit and Program Verification• Automated Reasoning Tools
Prototype Implementation in the Biochemical Abstract Machine BIOCHAMmodeling environment available at http://contraintes.inria.fr/BIOCHAM
François Fages Grenoble, Oct. 2007
Issue of Abstraction in Systems Biology
Models are built in Systems Biology with two contradictory perspectives :
François Fages Grenoble, Oct. 2007
Issue of Abstraction in Systems Biology
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
François Fages Grenoble, Oct. 2007
Issue of Abstraction in Systems Biology
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
François Fages Grenoble, Oct. 2007
Issue of Abstraction in Systems Biology
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions.
To understand a system is not to know everything about it but to know
abstraction levels that are sufficient for answering questions about it
François Fages Grenoble, Oct. 2007
Semantics of Living Processes ?
Formally, “the” behavior of a system depends on our choice of observables.
? ?
Mitosis movie [Lodish et al. 03]
François Fages Grenoble, Oct. 2007
Boolean Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules over time
Boolean transitions
0 1
François Fages Grenoble, Oct. 2007
Continuous Differential Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules over time
Rates of reactions
x ý
François Fages Grenoble, Oct. 2007
Stochastic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Numbers of molecules over time
Probabilities of reaction
n
François Fages Grenoble, Oct. 2007
Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules over time
Temporal logic formulas
F xF x
F (x ^ F ( x ^ y))
FG (x v y)
…
François Fages Grenoble, Oct. 2007
Constraint Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules over time
Constraint LTL temporal formulas
F x>1F (x >0.2)
F (x >0.2 ^ F (x<0.1 ^ y>0.2))
FG (x>0.2 v y>0.2)
…
François Fages Grenoble, Oct. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems 1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools1. Inferring kinetic parameter values from Constraint-LTL
specification
2. Inferring reaction rules from CTL specification
L. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.
François Fages Grenoble, Oct. 2007
Syntax of proteins
Cyclin dependent kinase 1 Cdk1
(free, inactive)
Complex Cdk1-Cyclin B Cdk1–CycB
(low activity)
Phosphorylated form Cdk1~{thr161}-CycB
at site threonine 161
(high activity)
mitosis promotion factor
François Fages Grenoble, Oct. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
François Fages Grenoble, Oct. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
François Fages Grenoble, Oct. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
François Fages Grenoble, Oct. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
Transport: A::L1 => A::L2Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
François Fages Grenoble, Oct. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
SBML : exchange format, no semantics
François Fages Grenoble, Oct. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
SBML : exchange format, no semantics
BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules 1. Concurrent Transition System (asynchronous, non-deterministic)
2. Differential Semantics: concentration 1. Ordinary Differential Equations or Hybrid system (deterministic)
• Stochastic Semantics: number of molecules • Continuous time Markov chain
François Fages Grenoble, Oct. 2007
Hierarchy of Semantics
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Theory of abstract Interpretation
Abstractions as Galois connections
[Cousot Cousot POPL’77]
[Fages Soliman CMSB’06,TCSc’07]
Syntactical
model
François Fages Grenoble, Oct. 2007
Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence.
Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants:
A+BA+B+C+D
A+BA+B +C+D
A+BA+B+C+D
A+BA+B+C+D
Necessary for over-approximating the possible behaviors under the stochastic/discrete semantics (abstraction N {zero, non-zero})
Only the last transition is considered in Pathway logic [Lincoln et al. 02] or Petri nets [Reddy93, Heiner 05, Chaouiya 05]
François Fages Grenoble, Oct. 2007
Hierarchy of Analyses
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
Jacobian circuit analysis
Discrete circuit analysis
Boolean circuit analysisabstraction
abstraction
abstraction
Positive circuits are a
necessary condition
for multistationarity
[Thomas 81] [Soulé 03]
[Remy Ruet Thieffry 05]
François Fages Grenoble, Oct. 2007
Temporal Logic Properties
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
LTL with constraints on R
CTL properties
CTL propertiesabstraction
abstraction
abstraction
François Fages Grenoble, Oct. 2007
Type Checking / Inference by Abstract Interpretation
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
[Fages Soliman CMSB’06]
Influence graph of proteins
Protein functions
(kinase, phosphatase,…)
Compartments topology
Influence graph of proteins
(activation/inhibition)
François Fages Grenoble, Oct. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Grenoble, Oct. 2007
Interaction Graph
François Fages Grenoble, Oct. 2007
Stochastic Simulation
François Fages Grenoble, Oct. 2007
Differential Simulation
François Fages Grenoble, Oct. 2007
Boolean Simulation
François Fages Grenoble, Oct. 2007
Automatic Generation of True CTL PropertiesEi(reachable(Cyclin)))Ei(reachable(!(Cyclin))))Ai(oscil(Cyclin)))Ei(reachable(Cdc2~{p1})))Ei(reachable(!(Cdc2~{p1}))))Ai(oscil(Cdc2~{p1})))Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))Ei(reachable(Cdc2-Cyclin~{p1,p2})))Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))Ai(oscil(Cdc2-Cyclin~{p1,p2})))Ei(reachable(Cdc2-Cyclin~{p1})))Ei(reachable(!(Cdc2-Cyclin~{p1}))))Ai(oscil(Cdc2-Cyclin~{p1})))Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))Ei(reachable(Cdc2)))Ei(reachable(!(Cdc2))))Ai(oscil(Cdc2)))Ei(reachable(Cyclin~{p1})))Ei(reachable(!(Cyclin~{p1}))))Ai(oscil(Cyclin~{p1})))Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))
François Fages Grenoble, Oct. 2007
François Fages Grenoble, Oct. 2007
Mammalian Cell Cycle Control Map [Kohn 99]
François Fages Grenoble, Oct. 2007
Cell Cycle Model-Checking (with BDD NuSMV)
biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}). Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is truebiocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB). Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is truebiocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB))))). Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is falsebiocham: why.-- Loop starts here cycB-cdk1~{p1,p2,p3} is present cdk7 is present cycH is present cdk1 is present Myt1 is present cdc25C~{p1} is presentrule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}. cycB-cdk1~{p2,p3} is present cycB-cdk1~{p1,p2,p3} is absentrule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}. cycB-cdk1~{p2,p3} is absent cycB-cdk1~{p1,p2,p3} is present
François Fages Grenoble, Oct. 2007
Mammalian Cell Cycle Control Benchmark
165 genes and proteins. 500 variables, 2500 states. 800 rules.
BIOCHAM NuSMV model-checker time in sec. [Chabrier et al. TCS 04]
Initial state G2 Query: Time:
compiling 29 s
Reachability G1 EF CycE 2 s
Reachability G1 EF CycD 1.9 s
Reachability G1 EF PCNA-CycD 1.7 s
Checkpoint
for mitosis complex
EF ( Cdc25~{Nterm}
U Cdk1~{Thr161}-CycB)
2.2 s
Oscillation EG ( (CycA => EF CycA) ^
( CycA => EF CycA))
31.8 s
François Fages Grenoble, Oct. 2007
Temporal Logic with Constraints over the Reals
• Constraints over concentrations and derivatives as FOL formulae over the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
François Fages Grenoble, Oct. 2007
Temporal Logic with Constraints over the Reals
• Constraints over concentrations and derivatives as FOL formulae over the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
• Linear Time Logic LTL operators for time X, F, U, G• F([M]>0.2)
• FG([M]>0.2)
• F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0))))
• oscil(M,n) defined as at least n alternances of sign of the derivative
• Period(A,75)= t v F(T = t & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)
& F(T = t + 75 & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)))…
François Fages Grenoble, Oct. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Grenoble, Oct. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
François Fages Grenoble, Oct. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
First values found :
parameter(k3,10).
parameter(k4,70).
François Fages Grenoble, Oct. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3) & F([Cdc2-Cyclin~{p1}]>0.15), 150).
First values found :
parameter(k3,10).
parameter(k4,120).
François Fages Grenoble, Oct. 2007
3.1 Inferring Parameters from LTL Specification
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
period(Cdc2-Cyclin~{p1},35), 150).
First values found:
parameter(k3,10).
parameter(k4,280).
François Fages Grenoble, Oct. 2007
Leloup and Goldbeter (1999)
MPF preMPF
Wee1
Wee1P
Cdc25
Cdc25PAPC
APC
....
....
........
Cell cycle
Linking the Cell and Circadian Cycles through Wee1
BMAL1/CLOCK
PER/CRY
Circadian cycle
Wee1 mRNA
L [L. Calzone, S. Soliman 2006]
François Fages Grenoble, Oct. 2007
PCN
Wee1m
Wee1MPF
BN
Cdc25
François Fages Grenoble, Oct. 2007
entrainmententrainment
Condition on Wee1/Cdc25 for Period Synchronization
Period synchronization constraint expressed in LTL with the period formula
François Fages Grenoble, Oct. 2007
Inferring Rules from Temporal Properties
Given
• a BIOCHAM model (background knowledge)
• a set of properties formalized in temporal logic
learn
• revisions of the reaction model, i.e. rules to delete and rules to add such that the revised model satisfies the properties
François Fages Grenoble, Oct. 2007
Model Revision from Temporal Properties
• Background knowledge T: BIOCHAM model • reaction rule language: complexation, phosphorylation, …
• Examples φ: biological properties formalized in temporal logic language• Reachability
• Checkpoints
• Stable states
• Oscillations
• Bias R: Reaction rule patterns or parameter ranges• Kind of rules to add or delete
Find a revision T’ of T such that T’ |= φ
François Fages Grenoble, Oct. 2007
Model Revision Algorithm
General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule.
• Positive ECTL formula: if false, remains false after removing a rule• EF(φ) where φ is a boolean formula (pure state description)
• Oscil(φ)
• Negative ACTL formula: if false, remains false after adding a rule• AG(φ) where φ is a boolean formula,
• Checkpoint(a,b): ¬E(¬aUb)
• Remove a rule on the path given by the model checker (why command)
• Unclassified CTL formulae
François Fages Grenoble, Oct. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Grenoble, Oct. 2007
Automatic Generation of True CTL PropertiesEi(reachable(Cyclin)))Ei(reachable(!(Cyclin))))Ai(oscil(Cyclin)))Ei(reachable(Cdc2~{p1})))Ei(reachable(!(Cdc2~{p1}))))Ai(oscil(Cdc2~{p1})))Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))Ei(reachable(Cdc2-Cyclin~{p1,p2})))Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))Ai(oscil(Cdc2-Cyclin~{p1,p2})))Ei(reachable(Cdc2-Cyclin~{p1})))Ei(reachable(!(Cdc2-Cyclin~{p1}))))Ai(oscil(Cdc2-Cyclin~{p1})))Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))Ei(reachable(Cdc2)))Ei(reachable(!(Cdc2))))Ai(oscil(Cdc2)))Ei(reachable(Cyclin~{p1})))Ei(reachable(!(Cyclin~{p1}))))Ai(oscil(Cyclin~{p1})))Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))
François Fages Grenoble, Oct. 2007
Rule Deletion
biocham: delete_rules(Cdc2 => Cdc2~{p1}).
François Fages Grenoble, Oct. 2007
Rule Deletion
biocham: delete_rules(Cdc2 => Cdc2~{p1}).
biocham: check_all.
First formula not satisfied
Ei(EF(Cdc2-Cyclin~{p1}))
François Fages Grenoble, Oct. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
François Fages Grenoble, Oct. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
biocham: learn_one_addition.
(1) Cdc2 => Cdc2~{p1}.
(2) Cdc2 =[Cdc2]> Cdc2~{p1}.
(3) Cdc2 =[Cyclin]> Cdc2~{p1}.
François Fages Grenoble, Oct. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems
• SBML models are interpreted under three semantics in BIOCHAM :
Boolean semantics CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
• Parameter search from temporal properties proved useful and complementary to bifurcation theory tools (Xppaut)
• Rule inference from temporal properties will benefit from types (e.g. protein functions, influences …) to better prune the search.
François Fages Grenoble, Oct. 2007
Collaborations
EU STREP APrIL2 : Stephen Muggleton, IC, Luc de Raedt, U. Freiburg,…
• Learning in a probabilistic logic setting (finished)
INRIA ARC MOCA :
• modularity, compositionality and abstraction
EU NoE REWERSE : semantic web, François Bry, Münich, R. Backofen,
• Connecting Biocham to gene and protein ontologies (types)
EU STREP TEMPO : Cancer chronotherapies, INSERM Villejuif, F. Lévi; INRIA J. Clairambault, S. Soliman
• Coupled models of cell cycle, circadian cycle, cytotoxic drugs.
INRA Tours : E. Reiter, D. Heitzler, INRIA F. Clément
• Model of FSH signaling.