Methods for Construction and Analysis of Computational Models in Systems Biology

DESCRIPTION

Methods for Construction and Analysis of Computational Models in Systems Biology. Andrzej Mizera, Institute of Fundamental Technological Research, Polish Academy of Sciences. Outline. Introduction: Systems Biology. Part I: Model construction. Parameter estimation.

TRANSCRIPT

Slide 1

Andrzej Mizera
Institute of Fundamental Technological Research, Polish Academy of Sciences
November 22, 2011, University of Luxembourg

Methods for Construction and Analysis of Computational Models in Systems Biology

Outline

Introduction: Systems Biology

Part I: Model construction
- Parameter estimation
- Model validation
- Model identifiability

Part II: Higher-level model analysis methodologies
- Quantitative submodel comparison
- Techniques for model modifications (extensions and simplifications): model refinement of ODE-based self-assembly models

Summary

Introduction

Biological processes have long been seen as static systems comprising a vast number of loosely linked, highly detailed molecular devices.

Examples of such molecular devices are the ATP synthase rotary engine and calcium pumps.

Image sources: http://micro.magnet.fsu.edu/cells/animals/animalmodel.html; "ATP synthase - a marvellous rotary engine of the cell", Nature Reviews Molecular Cell Biology 2, 669-677 (September 2001); "Action of muscle Ca2+ ATPase", in Molecular Cell Biology, 4th edition, Lodish H, Berk A, Zipursky SL, et al., New York: W. H. Freeman, 2000.

Systems biology essentially advocates a departure from the reductionist viewpoint and puts the emphasis on a holistic approach to the analysis of a biological system.

Systems biology:
- is an emerging research field,
- aims to study biochemical and biological systems from a holistic perspective,
- has the goal of providing a comprehensive, system-level understanding of cellular behaviour.

The large interest in topics associated with systems biology among researchers from very different fields of expertise (such as biology, physics, engineering, and computer science) results in various views on what systems biology is. In consequence, there is no well-established consensus definition. For the purpose of the presented work we adopt the definition proposed in the World Technology Evaluation Center (WTEC) Panel Report:

"... the objective of systems biology is defined as the understanding of network behaviour, and in particular their dynamic aspects, which requires the utilization of mathematical modelling tightly linked to experiment."
- WTEC Panel Report on International Research and Development in Systems Biology, 2005

Systems biology, unlike traditional biology, focuses on high-level concepts.

The very terminology of systems biology is foreign to traditional biology and marks a drastic shift in research paradigm: network, component, robustness, efficiency, control, regulation, hierarchical design, signalling, synchronisation, parallelism, competition.

The key concepts in systems biology have already been studied for a long time in computer science (albeit from different perspectives).

A key contribution that computer science brings to systems biology is the ability to manipulate, analyse, and reason about such system-level concepts and structures. For example, mathematical modelling, formal system specifications, and control design are by now mainstream techniques in systems biology. Since my work is centred around mathematical modelling, I explain this term briefly on the next slides.

My research concerns a number of challenges of computational modelling in systems biology.

It is focused on the development and utilization of different methodologies having their origins in the fields of computer science and mathematics.

The presented methodologies are applied in the modelling of two biological processes chosen as case studies:
- the heat shock response mechanism in eukaryotic cells (HSR),
- the in vitro self-assembly of intermediate filaments from tetrameric vimentin.

The developed methodologies are general in nature: they can be applied to the modelling of other biological processes as well.

PART I: Model construction

The heat shock response

- The cell's response to elevated temperatures
- Intense research on HSR in recent years
- HSR is a very well-conserved regulatory network across all eukaryotes (bacteria have a similar mechanism)
- A good candidate for deciphering the engineering principles of regulatory networks

Heat shock proteins are very potent chaperones (sometimes called the master proteins of the cell):
- involved in a large number of regulatory processes
- also involved in anti-inflammatory processes
- found in the extracellular environment, which may suggest they are used for signalling
- play a major role in the resilience of cancer cells; attractive as targets for cancer treatment

HSR is tempting for a biomodelling, systems biology project because it involves relatively few main actors (at least in a first, simplified presentation).

Heat shock response: main actors

Heat shock proteins (HSP)
- Very potent chaperones
- Main task: assist the refolding of misfolded proteins
- Several types exist; we treat them all uniformly in our model, with hsp70 as the common denominator

Heat shock elements (HSE)
- Several copies are found upstream of the HSP-encoding gene and are used for the transactivation of the HSP-encoding genes
- We treat all HSEs of all HSP-encoding genes uniformly

Heat shock factors (HSF)
- Proteins acting as transcription factors for the HSP-encoding gene
- Trimerize, then bind to HSE to promote gene transcription

Generic proteins
- Considered in two states: correctly folded and misfolded
- Under elevated temperatures, proteins tend to misfold, expose their hydrophobic cores, form aggregates, and eventually lead to cell death (see Alzheimer's, vCJD, and other diseases)

Various bonds between these metabolites

The molecular model for HSR

[Figure: the heat shock gene with its HSE, RNA polymerase, HSF, HSP, the HSP:HSF complex, and misfolded proteins (MFP), shown at 37 °C and 42 °C.]

The new molecular model

Transcription

HSF + HSF → HSF2
HSF + HSF2 → HSF3
HSF3 + HSE → HSF3:HSE
HSF3:HSE → HSF3:HSE + HSP

Backregulation

HSP + HSF → HSP:HSF
HSP + HSF2 → HSP:HSF + HSF
HSP + HSF3 → HSP:HSF + 2 HSF
HSP + HSF3:HSE → HSP:HSF + 2 HSF + HSE

Response to stress

PROT → MFP
HSP + MFP → HSP:MFP
HSP:MFP → HSP + PROT

Protein degradation

HSP → ∅

Several criteria were followed when introducing this molecular model:

- as few reactions and reactants as possible;
- include the temperature-induced protein misfolding;
- include HSF in all its three forms: monomers, dimers, and trimers;
- include the HSP backregulation of the transactivation of the HSP-encoding gene;
- include the chaperone activity of HSP;
- include only well-documented, textbook-like reactions and reactants.

The Law of Mass Action

Introduced by Guldberg and Waage in 1864:

Waage, P.; Guldberg, C. M. Forhandlinger: Videnskabs-Selskabet i Christiania 1864, 35

The rate of any given reaction is proportional to the product of the concentrations of its reactants.

For example: S_A + S_B → P

In modern chemistry the law is derived using statistical mechanics.
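As a worked instance of the law for the example reaction, the rate equations can be sketched as follows. The rate constant $k$ and the bracket notation $[X]$ for the concentration of species $X$ are assumptions introduced here for illustration; they do not appear on the slide.

```latex
\begin{align*}
v &= k\,[S_A]\,[S_B], \\
\frac{d[P]}{dt} &= v, \qquad
\frac{d[S_A]}{dt} = \frac{d[S_B]}{dt} = -v .
\end{align*}
```

Doubling either reactant concentration doubles the rate $v$; doubling both quadruples it.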

The mathematical model

Modelling of the heat-induced misfolding

Adapted from Peper et al. (1997), based on the differential calorimetry studies of Lepock (1989, 1992):

$$\varphi(T) = \left(1 - \frac{0.4}{e^{T-37}}\right) \cdot 14.5 \cdot 10^{-7} \cdot 1.4^{T-37}\ \mathrm{s}^{-1}$$
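A minimal sketch of how this rate could be evaluated numerically. It assumes the constants on the slide read as 14.5 · 10^-7 and 1.4^(T-37), with T in degrees Celsius; the function name `misfolding_rate` is hypothetical, and the range check reflects the 37-45 °C validity stated on the slide.

```python
import math


def misfolding_rate(temp_c: float) -> float:
    """Generic protein misfolding rate (per second) at temp_c degrees Celsius.

    Assumed reading of the slide formula:
    (1 - 0.4 / e^(T-37)) * 14.5e-7 * 1.4^(T-37)
    """
    if not 37.0 <= temp_c <= 45.0:
        raise ValueError("formula is stated only for 37-45 degrees Celsius")
    d = temp_c - 37.0
    return (1.0 - 0.4 / math.exp(d)) * 14.5e-7 * 1.4 ** d


if __name__ == "__main__":
    for t in (37.0, 40.0, 42.0, 45.0):
        print(f"{t:.0f} C: {misfolding_rate(t):.3e} 1/s")
```

Under this reading the rate grows monotonically with temperature over the stated range, which is what drives the response in the model.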

The formula is valid for temperatures between 37 °C and 45 °C and gives a generic protein misfolding rate per second.

Our model contains three mass-conservation relations:
- the total number of HSF molecules is constant,
- the total number of heat shock elements (HSE) is constant,
- the total number of protein molecules (PROT) is constant.

As far as protein degradation is concerned, we consider it in the model only for HSP. If we also considered it for HSF and PROT, we would have to include the compensating mechanism of protein synthesis, including its control. For the sake of simplicity, and based on experimental evidence that the total amounts of HSF and PROT are roughly constant, we ignore the details of synthesis and degradation for HSF and PROT.
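The three conservation relations can be checked directly against the reaction list. The sketch below is an illustration, not the authors' implementation: it treats every reaction as forward-only mass action (the transcript does not preserve which steps are reversible), uses arbitrary rate constants and concentrations, and derives the weights of each conserved total from the stoichiometry (for example, a trimer counts as three HSF molecules).

```python
# Species of the molecular model, as named on the slides.
SPECIES = ["HSF", "HSF2", "HSF3", "HSE", "HSF3:HSE", "HSP",
           "HSP:HSF", "PROT", "MFP", "HSP:MFP"]

# Each reaction is (reactants, products) as {species: stoichiometric coefficient}.
REACTIONS = [
    ({"HSF": 2}, {"HSF2": 1}),
    ({"HSF": 1, "HSF2": 1}, {"HSF3": 1}),
    ({"HSF3": 1, "HSE": 1}, {"HSF3:HSE": 1}),
    ({"HSF3:HSE": 1}, {"HSF3:HSE": 1, "HSP": 1}),
    ({"HSP": 1, "HSF": 1}, {"HSP:HSF": 1}),
    ({"HSP": 1, "HSF2": 1}, {"HSP:HSF": 1, "HSF": 1}),
    ({"HSP": 1, "HSF3": 1}, {"HSP:HSF": 1, "HSF": 2}),
    ({"HSP": 1, "HSF3:HSE": 1}, {"HSP:HSF": 1, "HSF": 2, "HSE": 1}),
    ({"PROT": 1}, {"MFP": 1}),
    ({"HSP": 1, "MFP": 1}, {"HSP:MFP": 1}),
    ({"HSP:MFP": 1}, {"HSP": 1, "PROT": 1}),
    ({"HSP": 1}, {}),  # degradation of HSP
]

# Weights of each conserved total (derived here from the stoichiometry).
CONSERVED = {
    "total HSF": {"HSF": 1, "HSF2": 2, "HSF3": 3, "HSF3:HSE": 3, "HSP:HSF": 1},
    "total HSE": {"HSE": 1, "HSF3:HSE": 1},
    "total PROT": {"PROT": 1, "MFP": 1, "HSP:MFP": 1},
}


def rhs(conc, rates):
    """Mass-action right-hand side: time derivative of every species."""
    deriv = {s: 0.0 for s in SPECIES}
    for (reactants, products), k in zip(REACTIONS, rates):
        v = k
        for s, n in reactants.items():
            v *= conc[s] ** n
        for s, n in reactants.items():
            deriv[s] -= n * v
        for s, n in products.items():
            deriv[s] += n * v
    return deriv


def drift(deriv, weights):
    """Time derivative of a weighted total; zero if the total is conserved."""
    return sum(w * deriv[s] for s, w in weights.items())


if __name__ == "__main__":
    conc = {s: 1.0 + 0.5 * i for i, s in enumerate(SPECIES)}  # arbitrary state
    rates = [0.01 * (i + 1) for i in range(len(REACTIONS))]   # arbitrary constants
    d = rhs(conc, rates)
    for name, weights in CONSERVED.items():
        print(name, drift(d, weights))
```

Because the check only depends on stoichiometry, the totals stay conserved for any choice of rate constants, and adding the reverse direction of any step would not change that.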

Parameter estimation

Data readily available for this purpose: Kline and Morimoto (1997), heat shock of HeLa cells at 42 °C for up to 4 hours, data on DNA binding (HSF3:HSE).

Requirements for the model: 17 kinetic constants and 10 initial values to estimate, but:
- 3 conservation relations are available;
- the initial values of the model variables form a steady state at 37 °C, which gives 7 more independent algebraic equations (each of them quadratic).

Altogether: 17 independent values.

Other conditions: total HSF somewhat low, refolding is a fast reaction, HSPs are long-lived proteins.

Notes:
- Steady-state condition at 37 °C: this is a natural condition, since the model is supposed to reflect the reaction to temperatures raised above 37 °C.
- The 3 conservation relations introduce 3 new constants into the model: K1, K2, and K3 for the total amounts of HSF, HSE, and PROT, respectively. We assume that their values are known.

Parameter estimation was performed with COPASI (www.copasi.org):

Hoops, S., Sahle, S., Gauges, R., Lee, C., Pahle, J., Simus, N., Singhal, M., Xu, L., Mendes, P., and Kummer, U. (2006). COPASI - a COmplex PAthway SImulator. Bioinformatics 22, 3067-74.

- User-friendly
- Stochastic and deterministic time course simulation
- Steady state analysis
- Sensitivity analysis
- Metabolic control analysis
- Mass conservation analysis
- Optimization of arbitrary objective functions
- SBML-based
- Excellent for parameter estimation
- FREE for non-commercial use

Strategy for parameter estimation

Ideal approach:
- Solve the steady-state equations at 37 °C analytically.
- Use the solution to reduce the parameters and initial values to the independent ones.
- Do parameter estimation on the remaining independent variables to fit the model to the data at 42 °C.

Problem: the steady-state equations (37 °C) cannot be solved analytically because of their high overall degree.

Solution: use the Kline-Morimoto experimental data to fit the parameters, and also require that the fluctuations at 37 °C are (close to) 0. Duplicate the model (with the same parameter values!) and run both copies at the same time (37 °C and 42 °C).

Parameter estimation results
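The duplicated-model strategy described above can be sketched as a single objective function: one copy of the model is run at 42 °C and compared with the data, the other copy is run at 37 °C with the same parameters and penalised for drifting away from its initial state. Everything named here (`euler`, `objective`, `toy_rhs`, the `weight` factor, and the one-variable toy model) is a hypothetical illustration; the actual estimation was carried out in COPASI, whose settings the slides do not give.

```python
def euler(rhs, x0, temp_c, t_end, dt=1.0):
    """Forward-Euler integration of dx/dt = rhs(x, temp_c); returns the state at t_end."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        x = [xi + dt * dxi for xi, dxi in zip(x, rhs(x, temp_c))]
    return x


def objective(params, make_rhs, x0, data_42, observe, weight=1.0):
    """Fit error of the 42 C copy plus a penalty on drift of the 37 C copy."""
    rhs = make_rhs(params)  # both copies share the same parameter values
    total = 0.0
    for t, measured in data_42:
        hot = euler(rhs, x0, 42.0, t)    # copy 1: under heat shock
        cold = euler(rhs, x0, 37.0, t)   # copy 2: no heat shock
        total += (observe(hot) - measured) ** 2
        total += weight * sum((c - x) ** 2 for c, x in zip(cold, x0))
    return total


# Hypothetical one-variable toy model, NOT the HSR model:
# dx/dt = a * 1.4**(T - 37) - b * x, with steady state a / b at 37 C.
def toy_rhs(params):
    a, b = params
    return lambda x, temp_c: [a * 1.4 ** (temp_c - 37.0) - b * x[0]]


if __name__ == "__main__":
    true_params = (2.0, 0.5)
    x0 = [4.0]  # equals a / b, so the toy model is at rest at 37 C
    data = [(t, euler(toy_rhs(true_params), x0, 42.0, t)[0]) for t in (10, 20, 30)]
    first = lambda x: x[0]
    print(objective(true_params, toy_rhs, x0, data, first))  # zero by construction
    print(objective((3.0, 0.5), toy_rhs, x0, data, first))   # positive
```

The design choice is that the steady-state requirement enters as a soft penalty rather than as solved algebraic constraints, which is what makes the approach usable when the steady-state equations have no tractable closed form.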

Predictions and validation

- The higher the temperature, the higher the response
- Prolonged transcription at 43 °C confirmed, unlike in previous models
- Heat shock removed at the peak of the response: a more rapid attenuation phase confirmed

All data are given in relative terms with respect to the highest value in the graph, so that they can be easily compared.

Experiment: two waves of heat shock, the second applied after the level of HSP has peaked.
Observation: the second heat shock response is much milder than the first. The reason is that the cell is better prepared to deal with the second heat shock.
Therapeutic consequences have been suggested: train the cell for heat shock with an initial, milder heat shock.

The model prediction is in line with the experimental observation.
- Dotted line: heat shock at 42 °C for two hours, behaviour followed for up to 20 hours
- Continuous line: heat shock at 42 °C for two hours, followed by a second wave of heat shock after the level of HSP has peaked

Model identifiability

The problem of model identifiability: the question of the uniqueness of the set of parameters that fulfil the imposed conditions.

By repeating the whole parameter estimation procedure from scratch, we obtained several different sets of parameter values that yield both a good fit of the model to the experimental data and initial values that are steady states of the model at 37 °C. However, all these parameter sets failed the model validation tests with respect to the qualitative observations concerning the behaviour of cells under stress.

A new, thorough method of searching for alternative numerical model fits:
- based on a systematic parameter scan in the space determined by the considered ranges of parameter values;
- Latin Hypercube Sampling provides samples that are uniformly distributed over each parameter, while the number of samples is independent of the number of parameters.

Strategy: look for alternative model fits that are in agreement with the experimental data of Kline and Morimoto (1997) and satisfy the steady-state condition for the initial values.

- We sampled N = 100,000 sets of parameter values.
- For each set, we numerically estimated the steady state of the model at 37 °C and set it as the initial state of the model.
- We simulated the model for 14,400 s at 42 °C.
- We classified as non-responsive those parameter samples that led to a low DNA binding level at the peak of the response, and excluded them from further analysis.
- Only 31,506 of the 100,000 samples were responsive, a result that already points to difficulties in finding satisfactory alternative numerical fits.
- For each model, we made a scatter plot for each variable and each parameter, plotting the steady-state values of the variable at 37 °C against the values of the parameter.
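The sampling step of this scan can be illustrated with a small, standard-library implementation of Latin Hypercube Sampling. This is a generic sketch, not the tool used in the study; the function name `latin_hypercube` and the parameter ranges in the usage example are made up for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, rng=None):
    """Draw n_samples points; each parameter range is split into n_samples
    equal-width strata and every stratum is hit exactly once."""
    rng = rng or random.Random()
    columns = []
    for low, high in bounds:
        # One uniformly placed point inside each stratum of this parameter.
        points = [low + (high - low) * (i + rng.random()) / n_samples
                  for i in range(n_samples)]
        rng.shuffle(points)  # decouple the strata of different parameters
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical ranges for two rate constants.
    for sample in latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], random.Random(0)):
        print(sample)
```

The number of samples is chosen freely and does not grow with the number of parameters, which is the property the slide highlights; adding a parameter only adds one more column.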

At 37 °C:
- most of the alternative fits predicted high levels of gene transcription in the absence of heat shock;
- none of the sampled models reached a level of HSF as low as that of the reference model;
- the reference model is one of the very few models in which most of the HSF molecules are sequestered by HSP, i.e. at 37 °C the response mechanism is turned off;
- the basic model reaches the lowest values for HSF dimers, in agreement with the observation that HSF dimers are undetectable in biological experiments.

At 42 °C:
Our reference model obtained the lowest score, around 12, while the 13 best fits among the sampled models scored between 300 and 1000. All the other models had much worse scores, above 1000.

Conclusions

While it is likely that a model of this size is not uniquely identifiable, our parameter scan showed that finding parameter values satisfying our model constraints is far from easy.

PART II: Higher-level model analysis methodologies

Model decomposition and quantitative submodel comparison

Various experimental investigations of a given biochemical system often lead to a large variety of alternative molecular designs and models.

How to compare their functionality, efficiency, and robustness?

Comparing alternative models for a given biochemical system is, in general, a very difficult problem:
- the models may focus on different aspects of the same system,
- they may consist of very different species and reactions,
- the numerical setups of the associated computational models play a crucial role in the quantitative comparison.

It involves a deep analysis of the underlying network of reactions, the biological assumptions, and the numerical setup.

Unbiased methods to objectively compare the alternative designs are needed.

The problem becomes somewhat simpler when the alternative designs are actually submodels of a larger model:
- the underlying networks are similar, although not identical, and
- the biological constraints are given by the larger model.

For example, submodels are considered when striving to obtain a global picture of a large system's architecture, i.e. to understand the interactions between its various components.

Similar problems have been encountered in the engineering sciences, e.g., in control theory. One strategy is to adapt specific methods originating from these disciplines.

Three methods were developed:

1. Local submodels comparison

El. Czeizler, Eu. Czeizler, R.-J. Back, and I. Petre. Control strategies for the regulation of the eukaryotic heat shock response. In P. Degano and R. Gorrieri (Eds.), Computational Methods in Systems Biology, LNCS, vol. 5688, pp. 111-125, Heidelberg, 2009. Springer-Verlag.

2. Discrete approach for comparing continuous submodels

El. Czeizler, A. Mizera, and I. Petre. A Boolean approach for disentangling the numerical contribution of modules to the system-level behavior of a biomodel. Submitted, 2011.

3. A statistical method for quantitative submodel comparison

A. Mizera, El. Czeizler, and I. Petre. Methods for biochemical model decomposition and quantitative submodel comparison. Israel Journal of Chemistry, 51(1):151-164, 2011.

These methods:
- use the control-based decomposition of the reference model,
- compare knockdown mutants of the reference model, i.e., submodels missing one or several of the modules,
- are illustrated on the eukaryotic heat shock response case study.

Model decomposition

Local submodels comparison

Aim: a biologically unbiased analysis, i.e., elimination of accidental differences between models.

Question: which initial conditions should be used in the alternative models?
- Taking them from the reference model gives a biased comparison.
- Local submodel comparison: the initial setup of each submodel constitutes a steady state in the absence of a trigger, which gives an unbiased comparison.

The method: two initial biological constraints are imposed on the models:
- identical chemistry (reaction rates) and mass constants (conservation relations),
- initial values form a steady state under physiological conditions.

Comparison of the mutants

Based on the numerical observations, one can summarize the contribution of the modules to some performance indicators of the network.

In the case of the HSR model, for example, these are:
- economical use of the cellular resources,
- speed of response to a heat shock,
- effectiveness of the response,
- scalability.

Discrete approach for comparing continuous submodels

Three conditions are imposed on each knockdown mutant model:
- kinetic rate constants: the numerical prediction of the mutant model fits the experimental data;
- initial conditions: form a steady state under physiological conditions;
- mass constants: are chosen to be identical to those of the reference model.

All three constraints come as natural consequences of the fact that we consider all knockdown mutants as viable alternatives.

With each knockdown mutant we associate a Boolean formula that describes the mutant's control architecture (negation and conjunction of variables associated with the regulating mechanisms). Boolean formulas describing all mutants that show given behavioural properties are then written.

Example: which knockdown mutants can be both effective and economical?

The numerical comparison of the mutants is then performed by analysing the Boolean formulas associated with various behavioural properties.
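A minimal sketch of the idea, under loudly stated assumptions: the module names `m1`, `m2`, `m3` and both behavioural properties below are invented for illustration only and say nothing about the actual HSR modules or results. Each knockdown mutant is a truth assignment (module present or knocked down), its architecture is a conjunction of literals, and a query such as "effective and economical" becomes a conjunction of property formulas.

```python
from itertools import product

MODULES = ("m1", "m2", "m3")  # hypothetical regulating mechanisms


def mutants():
    """All knockdown mutants: module -> True (present) or False (knocked down)."""
    return [dict(zip(MODULES, bits))
            for bits in product((True, False), repeat=len(MODULES))]


def formula(mutant):
    """Conjunction of literals describing one mutant's control architecture."""
    return " & ".join(m if present else f"~{m}" for m, present in mutant.items())


def satisfying(*properties):
    """Formulas of all mutants for which every given property holds."""
    return [formula(m) for m in mutants() if all(p(m) for p in properties)]


# Hypothetical property formulas, made up for this example.
def effective(m):
    return m["m1"] or m["m3"]


def economical(m):
    return m["m2"] and not m["m3"]


if __name__ == "__main__":
    print(satisfying(effective, economical))
```

In the method described on the slides the property formulas would come from the numerical behaviour of the fitted mutants; here they are simply written by hand to show how the query is answered.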

Statistical method for quantitative submodel comparison

Statistical sampling of the reference model and mutant behaviour is performed:
- the parameter value space of the reference model is scanned (e.g. by Latin Hypercube Sampling),
- each coordinate of the sampled parameter vectors is associated with one of the parameters of the reference model.

For each sampled parameter vector:
- the parameters of the reference model and the submodel are set in accordance with the considered vector,
- the initial conditions of the reference model and the submodel are determined independently of each other by a systemic property (e.g. being in a steady state in a given setup),
- this gives one numerical instance for the mutant and one for the reference model,
- numerical simulations are run for both of them in order to evaluate their functional effectiveness.

The results obtained for each variant are summarized by means of statistical measures (e.g. the moving median technique).
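The moving median mentioned as an example summary can be sketched in a few lines. How the authors order the samples and choose the window is not stated on the slides, so the function name `moving_median`, the window size, and the numbers in the usage example are assumptions for illustration.

```python
from statistics import median


def moving_median(values, window):
    """Median of every run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [median(values[i:i + window])
            for i in range(len(values) - window + 1)]


if __name__ == "__main__":
    # Hypothetical effectiveness scores, ordered by some sampled parameter.
    scores = [1, 9, 2, 8, 3]
    print(moving_median(scores, 3))  # [2, 8, 3]
```

Compared with a moving average, the median is less sensitive to the occasional extreme sample, which is useful when a parameter scan produces a few outlying simulations.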

Refinement of ODE-based self-assembly models

Challenge: construction of models capable of capturing the evolution of filaments of a certain length, up to some arbitrarily chosen value.

R. Kirmse et al. A Quantitative Kinetic Model for the in Vitro Assembly of Intermediate Filaments from Tetrameric Vimentin. Journal of Biological Chemistry, 282(25):18563-18572, 2007.

[Figure: assembly stages (monomer, coiled-coil dimer, tetramer, ULF), with the simple model and the extended model.]

The proposed methodology of self-assembly model refinement is a particular instance of formal model refinement, a topic extensively studied in computer science, especially in connection with formal software specifications.

Model refinement:
- starting from an abstract model of a system, a more detailed model is constructed;
- the refinement mechanism preserves already proven systemic quantitative properties of the original model (e.g. model fit, stochastic semantics).

Our method: an instance of data refinement, where one replaces a variable with a set of other variables in a way that introduces more details into the model, while keeping the model constraints unchanged.

A. Mizera, Eu. Czeizler, and I. Petre. Self-assembly models of variable resolution. To appear in Transactions on Computational Systems Biology, 2012.

- A generic formal model for the process of self-assembly is presented.
- The notion of model resolution is introduced: a self-assembly mathematical model is of resolution n if it allows for capturing the dynamics of the number (or concentration) of components that are exactly of size i, where 0 < i ≤ n.
- Model refinement procedures preserving the fit to experimental data are developed for a family of self-assembly ODE models.
- To the best of our knowledge, this is the first time formal refinement is considered in the context of ODE-based mathematical models.
- Increasing the model resolution: an exact, constructive method based on analytical investigations of the ODEs is developed.
- Decreasing the model resolution is more challenging: the method is based on symbolic computations but requires numerical investigations and simulations of the models.
- Case study: the in vitro self-assembly of intermediate filaments.

Summary

Issues related to model construction methodologies: parameter estimation, model validation, model identifiability, choice of deterministic vs stochastic modelling framework.

New techniques for the problem of model comparison were discussed and developed:
- methodologies for model decomposition,
- special case: comparison between submodels of a certain model.

The problem of model modifications: various techniques useful for applying simplifications or extensions to an already fitted and validated mathematical model in such a way that the desired properties of the model are retained (fitting experimental data is computationally expensive!):
- computational heuristics for simplifying a biological model,
- model refinement.

Prof. Ion Petre
Computational Biomodelling Laboratory
Åbo Akademi University, Department of Information Technologies
Turku Centre for Computer Science

Prof. Barbara Gambin
Institute of Fundamental Technological Research
Polish Academy of Sciences