Models and Algorithmic Tools for Computational Processes in Cellular Biology
Bhaskar DasGupta
Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607-7053
ISBRA 2012

What is "systems biology" in one sentence?
A study to unravel and conceptualize the dynamic processes, feedback control loops and signal processing mechanisms underlying life

Cellular Networks
A single cell by itself is complex enough
Various technologies have facilitated the monitoring of expression of genes and activities of proteins
Difficult to find the causal relations and overall structure of the network
[Figure: cellular network diagram; source: http://www.nyas.org/ebriefreps/ebrief/000534/images/mendes2.gif]

Cellular Networks
Genes and gene products interact on several levels, e.g.:
Genes regulate each other's expression as part of gene regulatory networks
  transcription factors can activate or inhibit the transcription of genes to give mRNAs
  these transcription factors are themselves products of genes
Protein-protein interaction networks
  proteins can participate in diverse post-translational interactions that lead to modified protein functions or to the formation of protein complexes that have new roles
Different levels of interaction are integrated: e.g., the presence of an external signal triggers a cascade of interactions that involves biochemical reactions, protein-protein interactions and transcriptional regulation

Cellular networks
cellular interaction maps only represent a network of possibilities, and not all edges are present and active in vivo in a given condition or in a given cellular location
only an integration of time-dependent interaction and activity information will be able to give the correct dynamical picture of a cellular network

Modeling problem
interaction data produced by the biologist in the form of a diagram (e.g., some type of labeled digraph)
wish to pose questions about the behavior (dynamics) of such a network
essential to provide a precise mathematical formulation of its dynamics, and specifically how the state of each node depends on the state of the nodes interacting with it

Models
discrete, continuous and hybrid models
  their inter-relationships, powers and limitations
  computational complexity and algorithmic issues
  biological implications and validations
fascinating interplay between several areas such as:
  biology
  control theory
  discrete mathematics and computer science

System dynamics
state variables
  continuous
  discrete (e.g., small number of quantitative states)
time variables
  continuous (e.g., partial differential equations, delay equations)
  discrete (difference equations, quantized descriptions of continuous variables)
deterministic or probabilistic nature of the model
hybrid models
  combine continuous and discrete time-scales and/or
  combine continuous and discrete time variables

Continuous-state dynamics
Differential equation (continuous-time)
Difference equation (discrete-time)
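
The relation between the two continuous-state forms can be illustrated with a minimal sketch. It assumes a generic system dx/dt = f(x) (the slide's own equations are not reproduced here) and uses forward Euler, so that the differential equation becomes the difference equation x(t + h) = x(t) + h f(x(t)). The two-species right-hand side `toy_rhs` and the step size are hypothetical choices made only for illustration.

```python
def euler_step(f, x, h):
    """One forward-Euler step: x(t + h) = x(t) + h * f(x(t))."""
    dx = f(x)
    return [xi + h * dxi for xi, dxi in zip(x, dx)]


def simulate(f, x0, h, steps):
    """Iterate the difference equation and return the whole trajectory."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = euler_step(f, x, h)
        trajectory.append(x)
    return trajectory


def toy_rhs(x):
    """Hypothetical system: x1 is produced at unit rate and degraded;
    x2 is produced in proportion to x1 and degraded at rate 0.5."""
    x1, x2 = x
    return [1.0 - x1, x1 - 0.5 * x2]


# Starting from zero, the trajectory relaxes towards the steady state
# where toy_rhs vanishes, i.e. x1 = 1 and x2 = 2.
final_state = simulate(toy_rhs, [0.0, 0.0], h=0.01, steps=5000)[-1]
```

Forward Euler is used only because it makes the continuous-to-discrete correspondence explicit; it is not claimed to be the discretization used in the talk.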

Examples of other models
Boolean
  $x_1, x_2, x_3 \in \{0, 1\}$
Boolean feedforward
Signal transduction

Reverse engineering of models
Given
  partial knowledge about the process/network
  access to suitable biological experiments
How to gain more knowledge about the model?
  effective use of resources (time, cost)
Reverse engineering: the process of backward reasoning, requiring careful observation of inputs and outputs, to elucidate the structure of the system
[Figure: reverse-engineering illustration; source: http://www.computerworld.com/computerworld/records/images/story/46Reverse-engineering.gif]

Ingredients for reverse engineering
Mathematical model to be reverse engineered
  e.g., differential equation model
Biological experiments available, e.g.,
  perturbation experiments
  gene expression measurements

Many reverse engineering approaches are possible
I will discuss two types of approaches:
hitting set based combinatorial approaches
modular response analysis (MRA) approach

Reverse Engineering of Networks via the Modular Response Analysis Method

Ingredients for reverse engineering via the modular response analysis approach
Mathematical models
  differential equation model
Biological experiments available
  perturbation experiments

Differential Equation Model
state variables evolve by (unknown) ordinary differential equations
$x = (x_1(t), \dots, x_n(t))$: state variables over time t, measurable (e.g., activity levels of proteins)
$p = (p_1, \dots, p_m)$: parameters that can be manipulated
$f(x^*, p^*) = 0$
  $p^*$: wild-type (i.e., normal) condition of p
  $x^*$: corresponding steady-state condition

Settings for the modular response analysis method
do not know f
but, prior information of the following type is available
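parameter $p_j$ does not affect variable $x_i$ (i.e., whether $\partial f_i / \partial p_j \equiv 0$ or not)
Kholodenko, Kiyatkin, Bruggeman, Sontag, Westerhoff and Hoek, PNAS, 2002

Experimental protocols (perturbation experiments)
perturb one parameter, say $p_k$
for perturbed p, measure the steady-state vector $x = \xi(p)$
  let the system relax to steady state
  measure $x_i$ (western blots, microarrays, etc.)
estimate n sensitivities:
  $b_{ij} = \frac{\partial \xi_i}{\partial p_j}(p^*) \approx \frac{\xi_i(p^* + \delta\, e_j) - \xi_i(p^*)}{\delta}$ for $i = 1, \dots, n$,
  where $e_j$ is the jth canonical basis vector and $\delta$ is the size of the perturbation applied to $p_j$

Modeling goal
The modeling goal can be at different levels:
  topology of connections only
  direction of the relationship
  information about stimulatory or inhibitory effects
  strength of relationship
[Figure: a four-node example network (A, B, C, D) drawn at each level, with edge signs (+/-) and edge weights]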

Goal of the MRA approach
Obtain information about the sign of $\partial f_i / \partial x_j (x, p)$
e.g., if $\partial f_i / \partial x_j > 0$, then $x_j$ has a positive (catalytic) effect on the formation of $x_i$
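
How steady-state sensitivities can reveal these signs may be easier to see on a toy case. Differentiating f(ξ(p), p) = 0 with respect to p shows that row i of the Jacobian ∂f/∂x is orthogonal to every sensitivity column ∂ξ/∂p_j for which ∂f_i/∂p_j ≡ 0. The sketch below is a minimal illustration under strong assumptions, none of which come from the slides: a hypothetical linear three-variable system f(x, p) = A x + p, one parameter per variable, and a negative diagonal (self-degradation) used to fix the unknown scale of each row. All function names are illustrative.

```python
def det3(M):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(A, b):
    """Solve the 3x3 linear system A x = b by Cramer's rule."""
    d = det3(A)
    solution = []
    for k in range(3):
        Ak = [[b[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(det3(Ak) / d)
    return solution


# Hypothetical "unknown" Jacobian; the experimenter only sees steady states.
A_TRUE = [[-1.0, 0.0, -0.5],
          [2.0, -1.0, 0.0],
          [0.0, 1.5, -1.0]]


def steady_state(p):
    """Steady state xi(p) of f(x, p) = A_TRUE x + p = 0."""
    return solve3(A_TRUE, [-pk for pk in p])


def sensitivity_column(p_star, j, delta=1e-3):
    """Finite-difference estimate of d xi / d p_j at p_star."""
    perturbed = list(p_star)
    perturbed[j] += delta
    x_star, x_pert = steady_state(p_star), steady_state(perturbed)
    return [(b - a) / delta for a, b in zip(x_star, x_pert)]


def cross(u, v):
    """Cross product: a vector orthogonal to both u and v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def jacobian_row_signs(i, p_star, tol=1e-6):
    """Signs of row i of df/dx, using perturbations of the parameters
    that do not act directly on x_i (here: all p_j with j != i)."""
    j, k = [c for c in range(3) if c != i]
    row = cross(sensitivity_column(p_star, j), sensitivity_column(p_star, k))
    if row[i] > 0:  # assumption: df_i/dx_i < 0 fixes the overall sign
        row = [-r for r in row]
    return [0 if abs(r) < tol else (1 if r > 0 else -1) for r in row]


signs = [jacobian_row_signs(i, [1.0, 1.0, 1.0]) for i in range(3)]
```

In three dimensions the cross product stands in for the null-space computation that a general implementation would need; for larger n, a linear-algebra routine would replace it.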

In a nutshell
after some combinatorics and linear algebra, one can quantify the additional prior knowledge necessary to reach the goal
  Kholodenko, Kiyatkin, Bruggeman, Sontag, Westerhoff and Hoek, PNAS, 2002
  Berman, DasGupta and Sontag, Discrete Applied Math, 2007
  Berman, DasGupta and Sontag, Annals of NYAS, 2007

But, assuming (near-)sufficient prior information,
how do we determine a minimum or near-minimum number of perturbation experiments that will work?
This now becomes an algorithmic/complexity issue...

After some effort, one can see that designing minimal sets of experiments leads to the set multicover problem

In our biological application context,
our set-multicover algorithm provides a set of suggested experiments such that
the number of experiments is the minimum possible

Overall high-level picture
Modular Response Analysis for differential equations model → linear algebraic formulation → combinatorial formulation → combinatorial algorithms (randomized) → selection of appropriate perturbation experiments

Experimental validation of the MRA method
See the paper:
S. D. M. Santos, P. J. Verveer, P. I. H. Bastiaens, Growth factor-induced MAPK network topology shapes Erk response determining PC-12 cell fate, Nature Cell Biology 9, 324-330 (2007)
The MAPK pathway involving the proteins Raf, Mek and Erk is activated through the receptor tyrosine kinases TrkA and epidermal growth factor receptor (EGFR) by two different stimuli, NGF (neuronal growth factor) or EGF (epidermal growth factor)
The MRA method was applied to determine the MAPK network architecture in the context of NGF and EGF stimulations

Reverse Engineering of Networks via the Hitting-Set Based (Combinatorial) Method

Hitting set based combinatorial approaches
[Figure: two pipelines. (1) steady-state profiles of perturbations of the network → hitting set → (introduce redundancy) multi-hitting set → topology of interconnection network. (2) expression data representing state transition measurements for wild-type and perturbation data → hitting set → (introduce redundancy) multi-hitting set → topology of interconnection network]

Basic idea behind the hitting-set based approaches
which variables influence x5?
when x5 changes, so do x1, x3 and x4
at least one of {x1, x3, x4} must influence x5
build dependency information over all successive time steps: {x1,x3,x4}, {x1}, {x1,x3}, {x2,x3,x4}, {x1}
minimal dependency (hitting set problem): {x1, x2}
Why construct minimal dependency?
Occam's razor
  entia non sunt multiplicanda praeter necessitatem (entities must not be multiplied beyond necessity)
However, biological networks may be redundant, e.g.:
  G. Tononi, O. Sporns, G. M. Edelman, PNAS, 1999
  R. Albert et al., Physical Review E, 2011

How can we introduce redundancy if necessary?
First idea: add random extra dependencies (edges)
  not good, these edges may not be supported by the given data
Better idea: modify hitting set to multi-hitting set
  {x1,x3,x4}: previously, select at least 1; now, select at least 2 (in general, some r)
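
The step from hitting set to multi-hitting set can be made concrete with a small exhaustive search. This is only a sketch for tiny instances, not the algorithm used in the cited work: it tries candidate sets in order of increasing size, and it assumes that a dependency set with fewer than r elements only needs all of its elements selected (the `min(r, len(s))` cap is our assumption).

```python
from itertools import combinations


def min_multi_hitting_set(sets, r=1):
    """Smallest set of variables containing at least min(r, |S|) elements
    of every dependency set S, found by exhaustive search."""
    universe = sorted(set().union(*sets))
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            if all(len(chosen & s) >= min(r, len(s)) for s in sets):
                return chosen
    return set(universe)


# Dependency sets from the x5 example above.
dependency_sets = [{"x1", "x3", "x4"}, {"x1"}, {"x1", "x3"},
                   {"x2", "x3", "x4"}, {"x1"}]

minimal = min_multi_hitting_set(dependency_sets, r=1)
redundant = min_multi_hitting_set(dependency_sets, r=2)
```

With r = 1 the search returns {x1, x2}, the minimal dependency shown on the slide; with r = 2 it returns a three-element set, which is how redundancy enters without adding edges that the data do not support. Since hitting set is NP-hard, exhaustive search is only viable for very small instances.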

Evaluation of performance of reverse engineering methods
Reverse-engineering problems are ill-posed, i.e., their solutions are not unique:
  existence of measurement error
  not all molecular species involved in a given analyzed phenomenon are included in the construction of a network, i.e., existence of hidden variables
Two possible ways for evaluation:
Experimental testing of predictions: after a model has been inferred, newly found interactions or predictions can be tested experimentally
Benchmark testing: measure how accurate the method of interest is in recovering a known (gold standard) network

Evaluation of performance of reverse engineering methods
Metrics for accuracy for benchmark testing
Measurements:
  correct interactions inferred (true positives, TP)
  incorrect interactions inferred (false positives, FP)
  correct non-interactions inferred (true negatives, TN)
  incorrect non-interactions inferred (false negatives, FN)
Metrics:
  recall or true positive rate: $TPR = \frac{TP}{TP + FN}$
  false positive rate: $FPR = \frac{FP}{FP + TN}$
  accuracy: $ACC = \frac{TP + TN}{TP + FP + TN + FN}$
  precision or positive predictive value: $PPV = \frac{TP}{TP + FP}$
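
A benchmark comparison of this kind can be sketched in a few lines. The sketch assumes that networks are given as sets of directed edges over a known node list and that self-loops are not counted as possible interactions; both are our simplifying assumptions, and the three-node networks are made up for illustration.

```python
def confusion_counts(gold_edges, inferred_edges, nodes):
    """Count TP, FP, TN, FN over all ordered node pairs (no self-loops)."""
    possible = {(u, v) for u in nodes for v in nodes if u != v}
    gold, inferred = set(gold_edges), set(inferred_edges)
    tp = len(gold & inferred)
    fp = len(inferred - gold)
    fn = len(gold - inferred)
    tn = len(possible - gold - inferred)
    return tp, fp, tn, fn


def accuracy_metrics(tp, fp, tn, fn):
    """Standard benchmark metrics; empty denominators yield 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical gold-standard and inferred networks on three nodes.
nodes = ["A", "B", "C"]
gold = {("A", "B"), ("B", "C"), ("C", "A")}
inferred = {("A", "B"), ("B", "C"), ("A", "C")}

counts = confusion_counts(gold, inferred, nodes)
scores = accuracy_metrics(*counts)
```

A point in ROC space is the pair (false_positive_rate, recall), so one call per method and parameter setting is enough to reproduce a ROC-style comparison.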

Two published methods based on the hitting set approach
(A) Ideker, Thorsson, Karp, PSB (2000)
  First step (network inference): estimate a set of Boolean networks consistent with an observed set of steady-state gene expression profiles, each generated from a different perturbation to the genetic network
  Second step (optimization): use an entropy-based approach to select an additional perturbation experiment to perform a model selection from the set of predicted Boolean networks
(B) Jarrah, Laubenbacher, Stigler, Stillman, Adv. in Applied Mathematics (2007)
  Attempts to infer the most likely causal relationships among network elements from gene expression data
For other published results, see, for example: Krupa, Journal of Theoretical Biology (2002)

Comparative analysis (via benchmark testing) of two approaches by
(A) Ideker, Thorsson, Karp
(B) Jarrah, Laubenbacher, Stigler, Stillman
Two gold standard networks:
(a) Segment polarity network of Drosophila melanogaster (fruit fly):
  last step in the hierarchical cascade of gene families initiating the segmented body of the fruit fly
  genes of this network include engrailed (en), wingless (wg), hedgehog (hh), patched (ptc), cubitus interruptus (ci) and sloppy paired (slp), coding for the corresponding proteins
  1 parasegment of 4 cells
  60 nodes: variables are expression levels of segment polarity genes/proteins
  Boolean model from (Albert and Othmer, Journal of Theoretical Biology, 2003)
DasGupta, Vera-Licona, Sontag, 2011

(b) In silico network: gene regulatory network with external perturbations
  13 species: 10 genes plus 3 different environmental perturbations
  perturbations affect the transcription rate of the gene on which they act directly (through inhibition or activation) and their effect is propagated throughout the network by the interactions between the genes
  generated using the software package in (Mendes, Trends Biochem. Sci, 1997)
DasGupta, Vera-Licona, Sontag, 2011

generated time courses for both networks (a) and (b)
For method (A) we considered both greedy and linear programming based approximations to the hitting set problem, as well as redundancy values R = 1, 2
For method (B), input data must be discrete
used three discretization methods:
  graph-theoretic based approach D from (Dimitrova, Garcia-Puente, Jarrah, Laubenbacher, Stigler, Stillman, Vera-Licona, 2010)
  quantile discretization Q (method in which each variable state receives an equal number of data values)
  interval discretization I (select thresholds for the different discrete values)
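
The quantile (Q) and interval (I) discretizations can be sketched directly from their one-line descriptions. This is an illustrative reading, not the implementation used in the comparison: ties and sample sizes that do not divide evenly are handled in the simplest possible way, and the sample values and thresholds are made up. The graph-theoretic method D is not sketched.

```python
from bisect import bisect_right


def quantile_discretize(values, levels):
    """Q: rank the values and cut the ranking into `levels` groups of
    (nearly) equal size; returns one discrete state per value."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    states = [0] * n
    for rank, i in enumerate(order):
        states[i] = rank * levels // n
    return states


def interval_discretize(values, thresholds):
    """I: the state of a value is the number of thresholds it reaches."""
    cuts = sorted(thresholds)
    return [bisect_right(cuts, v) for v in values]


# Hypothetical expression time course of one gene.
expression = [0.1, 0.9, 0.4, 0.35, 0.8, 0.05]

q_states = quantile_discretize(expression, levels=3)
i_states = interval_discretize(expression, thresholds=[0.5])
```

Here Q places two values in each of three states regardless of their magnitudes, whereas I with a single threshold at 0.5 yields a Boolean series; the choice of discretization can therefore change the input a method sees.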
DasGupta, Vera-Licona, Sontag, 2011

Summary of comparison
network (b):
  method (B) was better than method (A) in ROC space
  method (A) achieved a performance no better than random guessing
network (a):
  method (B) could not obtain any results after running for over 12 hours
  method (A) was able to compute results in less than 1 minute
  method (A) improved slightly when small redundancy was introduced
DasGupta, Vera-Licona, Sontag, 2011

implementation of method (B): http://polymath.vbi.vt.edu/polynome/
implementation of method (A), done by (DasGupta, Vera-Licona, Sontag, 2011), at http://sts.bioengr.uic.edu/causal/
DasGupta, Vera-Licona, Sontag, 2011

Direct Synthesis of Signal Transduction Networks
Only from known interactions and information
No new experiments needed

Overall goal
[Figure: direct interactions between A and B, double-causal interactions in which A acts on the process linking B and C, and additional information are fed into a FAST method (algorithms, software) that outputs a network of minimal complexity that is biologically relevant]

Nature of experimental evidence
biochemical: direct interaction, e.g.,
  binding of two proteins
  a transcription factor activating the transcription of a gene
  a chemical reaction with a single reactant and a single product
pharmacological: indirect causal effects most probably resulting from a chain of interactions and reactions, e.g., binding of a chemical to a receptor protein starts a cascade of protein-protein interactions and chemical reactions that ultimately results in the transcription of a gene
genetic: evidence of differential responses to a stimulus
  can be direct, but most often indirect (double-causal)

We describe a method for synthesizing double-causal (path-level) information into a consistent network

Direct interactions
A promotes B: A → B
A inhibits B: A ⊣ B

Illustration of double-causal interaction
C promotes the process of A promoting B
[Figure: the double-causal interaction among A, B and C drawn using a pseudo-vertex]

Critical edge
(known direct interaction, part of input)

Main computational steps for network synthesis
Pseudo-vertex collapse (PVC): easy
Binary transitive reduction (BTR): hard, need heuristics
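
The "easy" step can be sketched as follows. Pseudo-vertex collapse merges pseudo-vertices u and v whenever in(u) = in(v) and out(u) = out(v). The representation below is our assumption, not the NET-SYNTHESIS data model: edges are triples (source, target, label) with label 0 for excitatory and 1 for inhibitory, and in/out neighborhoods include the edge labels. Because the two vertices have identical neighborhoods, merging amounts to deleting one of them.

```python
def pseudo_vertex_collapse(edges, pseudo):
    """Repeatedly merge pseudo-vertices with identical labeled in- and
    out-neighborhoods; returns the reduced edge set and pseudo-vertex set."""
    edges, pseudo = set(edges), set(pseudo)
    changed = True
    while changed:
        changed = False
        seen = {}
        for v in sorted(pseudo):
            ins = frozenset((u, lab) for (u, w, lab) in edges if w == v)
            outs = frozenset((w, lab) for (u, w, lab) in edges if u == v)
            key = (ins, outs)
            if key in seen:
                # v duplicates seen[key]: drop v and all its edges.
                edges = {e for e in edges if v not in (e[0], e[1])}
                pseudo.discard(v)
                changed = True
                break
            seen[key] = v
    return edges, pseudo


# Hypothetical input: P1 and P2 encode the same evidence, P3 does not.
edges = {("A", "P1", 0), ("P1", "B", 0),
         ("A", "P2", 0), ("P2", "B", 0),
         ("C", "P3", 1), ("P3", "B", 0)}

reduced_edges, reduced_pseudo = pseudo_vertex_collapse(edges, {"P1", "P2", "P3"})
```

Each pass costs time polynomial in the size of the graph, which is consistent with PVC being the easy step; the hard part of network synthesis is the transitive reduction.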

Pseudo-vertex collapse (PVC)
Intuitively, PVC is useful for reducing the pseudo-vertex set to the minimal set that keeps the graph consistent with all indirect experimental observations.
[Figure: pseudo-vertices u and v with in(u) = in(v) and out(u) = out(v) are merged into a new pseudo-vertex]

Illustration of Binary Transitive Reduction (BTR)
[Figure: two candidate edges. Remove? Yes, there is an alternate path. Remove? No, it is a critical edge]
Intuitively, the BTR problem is useful for determining the sparsest graph consistent with a set of experimental observations

Some biologists did look at very simplified or somewhat different versions of BTR, e.g.:
A. Wagner, Estimating Coarse Gene Network Structure from Large-Scale Gene Perturbation Data, Genome Research, 12, pp. 309-315, 2002
  too special (reachability only), no efficient algorithms reported
T. Chen, V. Filkov and S. Skiena, Identifying Gene Regulatory Networks from Experimental Data, Third Annual International Conference on Computational Molecular Biology, pp. 94-103, 1999
  excess edge deletion problem, biologically too restrictive version
See the following excellent surveys for more comprehensive information about biological network inference and modeling:
V. Filkov, Identifying Gene Regulatory Networks from Gene Expression Data, in Handbook of Computational Molecular Biology (edited by S. Aluru), Chapman & Hall/CRC Press, 2005
H. de Jong, Modelling and Simulation of Genetic Regulatory Systems: A Literature Review, Journal of Computational Biology, Volume 9, Number 1, pp. 67-103, 2002

High level description of the network synthesis process
Synthesize direct interactions → Optimize (BTR) → Synthesize double-causal interactions → Optimize (PVC, BTR) → Interaction with biologists
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

An excitatory (inhibitory) connection is encoded by edge label 0 (1)
[encode single causal relationships]
  1.1 Build networks for connections like A → B and A ⊣ B, noting each critical edge.
  1.2 Apply BTR.
[encode double causal relationships]
  2.1 For each double causal relationship of the form $A \xrightarrow{x} (B \xrightarrow{y} C)$ with $x, y \in \{0, 1\}$, add new nodes and/or edges as follows:
    if $B \xrightarrow{y} C \in E_{critical}$ then add $A \xrightarrow{x} (B \xrightarrow{y} C)$
    if there is no subgraph of the form [figure: subgraph on A, B, D, C with edge labels x, a, b] (for some node D with $b = a + b = y \pmod 2$), then add the subgraph [figure: subgraph on A, P, B, C with edge labels x, a, b] (where P is a new pseudo-node and $b = a + b = y \pmod 2$)
  2.2 Apply PVC.
[final reduction] Apply BTR.
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

All the steps in the network synthesis procedure except the steps that involve BTR can be done easily
Thus, it behooves us to look at BTR more closely

But, before that, biological validation of the network synthesis approach is desirable
Need a network that uses double-causal experimental evidence

Plant signal transduction network
consistent guard cell signal transduction network for ABA-induced stomatal closure
  manually curated
  described in S. Li, S. M. Assmann and R. Albert, Predicting Essential Components of Signal Transduction Networks: A Dynamic Model of Guard Cell Abscisic Acid Signaling, PLoS Biology, 4(10), October 2006
list of experimentally observed causal relationships collected by Li et al. and published as Table S1. This table contains around 140 interactions and causal inferences, both of type "A promotes B" and "C promotes process (A promotes B)"
We augment this list with critical edges drawn from biophysical/biochemical knowledge on enzymatic reactions and ion flows, and with simplifying hypotheses made by Li et al., both described in Text S1

We also formalized an additional rule specific to the context of this network (and implicitly assumed by Li et al.) regarding enzyme-catalyzed reactions
[Table: regulatory interactions between ABA signal transduction pathway components]

[Table: regulatory interactions between ABA signal transduction pathway components (continued)]
Example entries: NO, GC (not critical and not enzymatic); ERA1 (ABA CalM)

Some nodes in the network
  GCR1: putative G protein coupled receptor
  OST1: protein
  NO: nitric oxide
  ABH1: RNA cap-binding protein
  RAC1: small GTPase protein

(left) Guard cell signal transduction network for ABA-induced stomatal closure, manually curated by Li, Assmann and Albert [source: PLoS Biology, 4 (10), 2006].
(right) Our automated network synthesis procedure produced a reduced network (fewer edges) while preserving all observed pathways
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

Summary of comparison of the two networks
The network of Li et al. has 54 vertices and 92 edges
Our network has 57 vertices but 84 edges
Both networks have an identical strongly connected component of vertices
All the paths present in the reconstruction of Li et al. are present in our network as well
The two networks have 71 common edges
It took a few seconds to synthesize our network
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

Summary of comparison of the two networks (continued)
Thus the two networks are highly similar but diverge on a few edges.
None of these discrepancies is due to algorithmic deficiencies; all are due to human decisions.
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

Software is available at:
http://www.cs.uic.edu/~dasgupta/network-synthesis/
runs on any machine with MS Windows (Win32)
click, save the executable and run

Data sources for this type of network synthesis
Signal transduction pathway repositories such as TRANSPATH (http://www.gene-regulation.com/pub/databases.html#transpath)
protein interaction databases such as the Search Tool for the Retrieval of Interacting Proteins (http://string.embl.de)
These contain up to thousands of interactions, a large number of which are not supported by direct physical evidence.
NET-SYNTHESIS can be used to filter redundant information while keeping all direct interactions
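
A filtering step of this kind can be sketched as a greedy binary transitive reduction. This is a simple illustrative heuristic and not necessarily the one implemented in NET-SYNTHESIS: a non-critical edge u → v with label x is dropped if, without it, v is still reachable from u by a walk whose labels sum to x modulo 2 (0 = excitatory, 1 = inhibitory). The edge representation and the example network are our assumptions.

```python
from collections import deque


def has_walk(edges, src, dst, parity):
    """True if some walk with at least one edge leads from src to dst
    and its edge labels sum to `parity` modulo 2."""
    adj = {}
    for u, v, lab in edges:
        adj.setdefault(u, []).append((v, lab))
    seen = set()
    queue = deque([(src, 0)])
    while queue:
        node, par = queue.popleft()
        for nxt, lab in adj.get(node, []):
            state = (nxt, (par + lab) % 2)
            if state == (dst, parity):
                return True
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


def greedy_btr(edges, critical):
    """Drop each non-critical edge that is implied by the remaining ones."""
    kept = set(edges)
    for edge in sorted(set(edges) - set(critical)):
        u, v, lab = edge
        if has_walk(kept - {edge}, u, v, lab):
            kept.remove(edge)
    return kept


# Hypothetical network: A -> C (label 0) is implied by A -> B -> C, but
# the inhibitory A -> C (label 1) has no alternative of the same parity.
edges = {("A", "B", 0), ("B", "C", 0), ("A", "C", 0),
         ("A", "C", 1), ("A", "D", 1)}
critical = {("A", "B", 0)}

reduced = greedy_btr(edges, critical)
```

The greedy result depends on the order in which edges are examined and need not be optimal, which is why the quality of a heuristic has to be assessed empirically.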

The transitive reduction step used a heuristic
How good is the heuristic in general?

Performance of our BTR algorithm on random signal transduction networks
But, what is a random biological network?

Biological networks are scale-free: e.g.,
N. Guelzim, S. Bottani, P. Bourgine, and F. Kepes, Topological and causal structure of the yeast transcriptional regulatory network, Nature Genetics 31, 60-63, 2002
Biological networks are NOT scale-free: e.g.,
R. Khanin and E. Wit, How Scale-Free Are Biological Networks?, Journal of Computational Biology, 13 (3), 810-818, 2006
So, we decided to look at the literature ourselves and decide on a reasonable model for random signal transduction networks

According to us, random signal transduction networks have the following properties:
  the distribution of in-degree is exponential: Pr[in-degree = x] = $L e^{-Lx}$ (for a constant L); the maximum in-degree is 12
  the distribution of out-degree is governed by a power law: for x ≥ 1, Pr[out-degree = x] = $c x^{-c}$; Pr[out-degree = 0] c, 2 < c < 3; the maximum out-degree is 200
  the ratio of excitatory to inhibitory edges is between 2 and 4
random graphs with prescribed degree distributions are generated using the procedure described in: M. E. J. Newman, S. H. Strogatz and D. J. Watts, Random graphs with arbitrary degree distributions and their applications, Physical Review E, 64 (2), 026118-026134, 2001

What percentage of edges should be critical (known direct interactions)?
No known accurate estimates:
  the curated network of Ma'ayan et al. (Science, 2005) is expected to have close to 100% critical edges, as they specifically focused on collecting direct interactions only
  protein interaction networks are expected to be mostly critical (Giot et al., Science, 2003; Han et al., Nature, 2004; Li et al., Science, 2004)
  genetic interactions (e.g., synthetic lethal interactions) represent compensatory relationships; only a minority are direct interactions
  reverse engineering approaches lead to networks whose interactions are close to 0% critical

We tried a few small and large values, such as 1%, 2% and 50%, for the percentage of edges that are critical, to catch qualitatively all regions of dynamics of the network that are of interest

Tested on about 550 random networks
number of vertices in the range of about 100 to 1000
running time for individual networks: seconds to at most a minute

Verifying the robustness of the performance of our BTR algorithm
perturb the networks in ways that do not change the optimal solution of the original graph
The solution quality almost never changes because of this

Performance of our implemented algorithm for BTR on random networks
On average, we use about 5.5% more edges than the optimum
Albert, DasGupta, Dondi, Kachalo, Sontag, Zelikovsky, Westbrooks, 2007

Other applications of NET-SYNTHESIS
Synthesizing a Network for T Cell Survival and Death in LGL Leukemia
Background
Large Granular Lymphocytes (LGL)
  medium to large size cells with eccentric nuclei and abundant cytoplasm
  comprise 10%~15% of the total peripheral blood mononuclear cells
  two major lineages
    CD3- natural-killer (NK) cell lineage: ~85% of LGL cells
    CD3+ lineage: ~15% of LGL cells
Kachalo, Zhang, Sontag, Albert, DasGupta, 2008

LGL leukemia
disordered clonal expansion of LGL and their invasion of the marrow, spleen and liver

Background (continued)
Ras: small GTPase essential for controlling multiple essential signaling pathways
  its deregulation is frequently seen in human cancers
Activation of H-Ras requires its farnesylation, which can be blocked by farnesyltransferase inhibitors (FTIs)
FTIs are therefore envisioned as future anti-cancer drugs, and several FTIs have entered early-phase clinical trials
This observation, together with the finding that Ras is constitutively activated in leukemic LGL cells, leads to the hypothesis that Ras plays an important role in LGL leukemia, and may function through influencing the Fas/FasL pathway.

We constructed the cell-survival/cell-death regulation-related signaling network, with special interest in the effect of Ras on the apoptosis response through the Fas/FasL pathway
Goal: initiate understanding of the interactions between the Ras pathway and the Fas/FasL pathway, two of the major pathways that regulate the cell survival/death decision.
Currently, there is no standard therapy for LGL leukemia. Understanding the mechanism of this disease is crucial for drug/therapy development
Proteins that modulate the Ras-apoptosis response can potentially serve as future reference for drug design and the search for therapeutic target molecules, and this may not be restricted to LGL leukemia
Kachalo, Zhang, Sontag, Albert, DasGupta, 2008

Synthesizing a Network for T Cell Survival and Death in Large Granular Lymphocyte Leukemia
Synthesized a cell-survival/cell-death regulation-related signaling network from the TRANSPATH 6.0 database, with additional information manually curated from literature search
The 359 vertices of this network represent proteins/protein families and mRNAs participating in pro-survival and Fas-induced apoptosis pathways
The 1295 edges represent regulatory relationships between nodes, including protein interactions, catalytic reactions and transcriptional regulation (no double-causal interactions were known)
Performing BTR with NET-SYNTHESIS reduced the total number of edges to 873
Kachalo, Zhang, Sontag, Albert, DasGupta, 2008

To focus on pathways that involve the 33 known T-LGL deregulated proteins, we designated vertices that correspond to proteins with no evidence of being changed during T-LGL as pseudo-vertices, and deleted the label Y for those edges both of whose endpoints were pseudo-vertices
Recursively performing the "Reduction (faster)" (BTR) and "Collapse degree-2 pseudonodes" operations of NET-SYNTHESIS until no edge or node could be further removed simplified the network to 267 nodes and 751 edges.
Kachalo, Zhang, Sontag, Albert, DasGupta, 2008

For further results, see
R. Zhang, M. V. Shah, J. Yang, S. B. Nyland, X. Liu, J. K. Yun, R. Albert, and T. P. Loughran, Network Model of Survival Signaling in LGL Leukemia, PNAS, 2008

Binary transitive reduction raises two further interesting questions:
how redundant are biological networks?
  what is redundancy and how to measure it?
  percentage of edges removed by binary transitive reduction (Albert, DasGupta, Gitter, Gürsoy, Hegde, Pal, Sivanathan, Sontag, 2011)
are redundancy and dynamical properties correlated?

Feedback loops and dynamics of biological networks
analyzing the behavior of feedback loops is a long-standing topic in the context of regulation, metabolism and development
e.g., see classical reference works such as
J. Monod and F. Jacob, General conclusions: teleonomic mechanisms in cellular metabolism, growth, and differentiation, Cold Spring Harbor Symp. Quant. Biol., 26, 389-401, 1961

Monotone dynamical system

Monotone systems are simpler-behaved systems:
pathological behavior (chaos) is ruled out
even though they may have arbitrarily large dimensionality, monotone systems behave in many ways like one-dimensional systems
e.g., in monotone systems
  bounded trajectories generically converge to steady states
  there are no stable oscillatory behaviors

Associated Signal Transduction Network
[Figure: signal transduction network on nodes v1, ..., vi, vj, vk, ..., vn associated with the system]

[Figure: two signed graphs with edge signs + and -, one sign-consistent and one sign-inconsistent]
parity of a path: product of the signs of its edges
sign-consistent: every undirected path between two nodes has the same parity
(check the undirected paths 1 → 4 and 1 → 2 → 3 → 4)

sign-consistent networks are monotone systems
This allows us to define the degree of monotonicity M of a differential equation system in the following way:
the minimum percentage of edges we need to delete to make the associated signal transduction network sign-consistent
(Albert, DasGupta, Gitter, Gürsoy, Hegde, Pal, Sivanathan, Sontag, 2011)
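
Sign-consistency itself is easy to test, in contrast to computing M. A minimal sketch, assuming edges are given as (u, v, sign) triples with sign +1 or -1 and that edge direction is ignored: try to split the nodes into two groups so that positive edges stay inside a group and negative edges cross between groups. Such a split exists exactly when all undirected paths between any two nodes have the same parity. The edge signs in the example are hypothetical, since the signs in the original figure are not recoverable.

```python
from collections import deque


def is_sign_consistent(edges):
    """Breadth-first two-labeling of the underlying undirected graph."""
    adj = {}
    for u, v, sign in edges:
        adj.setdefault(u, []).append((v, sign))
        adj.setdefault(v, []).append((u, sign))
    label = {}
    for start in adj:
        if start in label:
            continue
        label[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                expected = label[u] if sign > 0 else 1 - label[u]
                if v not in label:
                    label[v] = expected
                    queue.append(v)
                elif label[v] != expected:
                    return False  # two paths of different parity meet
    return True


# Paths 1 -> 4 and 1 -> 2 -> 3 -> 4 with hypothetical signs.
same_parity = [(1, 4, +1), (1, 2, -1), (2, 3, -1), (3, 4, +1)]
different_parity = [(1, 4, -1), (1, 2, +1), (2, 3, +1), (3, 4, +1)]

consistent = is_sign_consistent(same_parity)
inconsistent = is_sign_consistent(different_parity)
```

The check runs in time linear in the size of the graph; the difficulty lies only in choosing which edges to delete when the check fails.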

Undirected Labeling Problem (ULP)
needed to compute the degree of monotonicity M
Given: undirected graph G = (V, E), edge labeling function $h: E \to \{0, 1\}$
Valid solution: a vertex labeling function $f: V \to \{0, 1\}$
Definition: an edge $\{u, v\} \in E$ is consistent if $h(u,v) = f(u) + f(v) \pmod 2$
Goal: maximize the number of consistent edges
Bad news: NP-hard and even MAX-SNP-hard.
DasGupta, Enciso, Sontag, Zhang, 2007

Algorithm for ULP
Solve the following vector program via semidefinite programming methods:
  maximize $\frac{1}{2} \sum_{\{u,v\} \in E,\ h(u,v) = 1} (1 - x_u \cdot x_v) + \frac{1}{2} \sum_{\{u,v\} \in E,\ h(u,v) = 0} (1 + x_u \cdot x_v)$
  subject to:
    for each $v \in V$, $x_v \cdot x_v = 1$
    for each $v \in V$, $x_v \in \mathbb{R}^{|V|}$
Select a uniformly random vector r on the |V|-dimensional unit sphere
Label each vertex v as 0 if $r \cdot x_v \ge 0$, and as 1 otherwise
It can be easily implemented in MATLAB
DasGupta, Enciso, Sontag, Zhang, 2007
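
For very small graphs, the ULP objective can be evaluated by exhaustive search, which may be useful for sanity-checking an approximate solution. The sketch below is not the semidefinite programming algorithm described above; it simply enumerates all vertex labelings, and it derives M as the percentage of edges left inconsistent by the best labeling, which is our reading of the definition. The example graphs are hypothetical.

```python
from itertools import product


def ulp_brute_force(vertices, labeled_edges):
    """Return (max number of consistent edges, one optimal labeling).
    An edge (u, v, h) is consistent if h == (f[u] + f[v]) % 2."""
    vertices = list(vertices)
    best_count, best_labeling = -1, None
    for bits in product((0, 1), repeat=len(vertices)):
        f = dict(zip(vertices, bits))
        count = sum(1 for u, v, h in labeled_edges if (f[u] + f[v]) % 2 == h)
        if count > best_count:
            best_count, best_labeling = count, f
    return best_count, best_labeling


def degree_of_monotonicity(vertices, labeled_edges):
    """Minimum percentage of edges to delete so that all remaining edges
    can be made consistent simultaneously."""
    best_count, _ = ulp_brute_force(vertices, labeled_edges)
    return 100.0 * (len(labeled_edges) - best_count) / len(labeled_edges)


# A triangle of label-1 edges: at most two edges can be consistent.
triangle = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1)]
# A 4-cycle with an even number of label-1 edges: fully consistent.
square = [(1, 4, 0), (1, 2, 1), (2, 3, 1), (3, 4, 0)]

triangle_best, _ = ulp_brute_force("abc", triangle)
triangle_m = degree_of_monotonicity("abc", triangle)
square_m = degree_of_monotonicity([1, 2, 3, 4], square)
```

The running time grows as 2^|V|, so this is only a reference implementation for toy inputs; the NP-hardness noted above is the reason an approximation algorithm is used in practice.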

We have two measurable properties:
(topological) redundancy R: percentage of edges removed by binary transitive reduction
(dynamical) monotonicity M: minimum percentage of edges we need to delete to make the associated signal transduction network sign-consistent
M is negatively correlated with R
(Albert, DasGupta, Gitter, Gürsoy, Hegde, Pal, Sivanathan, Sontag, 2011)

Some other conclusions from (Albert, DasGupta, Gitter, Gürsoy, Hegde, Pal, Sivanathan, Sontag, 2011)
the redundancy measure R is statistically significant
transcriptional networks are less redundant than signaling networks
redundancy of the C. elegans metabolic network is largely due to currency metabolites
calculation of redundancy values and minimal networks provides a way to gain insight into the predicted orientation of protein-protein interaction (PPI) networks

Future Research Questions in the context of parallel and distributed computing
Synchronization: no global clocks are known to exist for cellular processes (ignoring circadian rhythms and some other global timing mechanisms in higher organisms)
Spatial effects: localization (nuclear, cytoplasmic, membrane-bound) in cells
  akin to geographical location affecting communication speeds and coordination in distributed computing

List of some relevant references
R. Albert, B. DasGupta, et al. A New Computationally Efficient Measure of Topological Redundancy of Biological and Social Networks, Physical Review E, 84 (3), 036117, 2011.
B. DasGupta, P. Vera-Licona, E. Sontag. Reverse Engineering of Molecular Networks from a Common Combinatorial Approach, in Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications, John Wiley & Sons, Inc., 2011.
R. Albert, B. DasGupta, E. Sontag. Inference of signal transduction networks from double causal evidence, in Methods in Molecular Biology: Topics in Computational Biology, D. Fenyo (editor), Springer, 2010.
P. Berman, B. DasGupta, M. Karpinski. Approximating Transitive Reduction Problems for Directed Networks, 11th Algorithms and Data Structures Symposium, 2009.
R. Albert, B. DasGupta, R. Dondi, E. Sontag. Inferring (Biological) Signal Transduction Networks via Transitive Reductions of Directed Graphs, Algorithmica, 51 (2), 129-159, 2008.
S. Kachalo, R. Zhang, E. Sontag, R. Albert, B. DasGupta. NET-SYNTHESIS: A software for synthesis, inference and simplification of signal transduction networks, Bioinformatics, 24 (2), 293-295, 2008.
P. Berman, B. DasGupta, E. Sontag. Algorithmic Issues in Reverse Engineering of Protein and Gene Networks via the Modular Response Analysis Method, Annals of the New York Academy of Sciences, 2007.
R. Albert, B. DasGupta, et al. A Novel Method for Signal Transduction Network Inference from Indirect Experimental Evidence, Journal of Computational Biology, 14 (7), 927-949, 2007.
B. DasGupta, G. A. Enciso, E. Sontag, Y. Zhang. Algorithmic and Complexity Results for Decompositions of Biological Networks into Monotone Subsystems, Biosystems, 90 (1), 161-178, 2007.
P. Berman, B. DasGupta, E. Sontag. Computational Complexities of Combinatorial Problems With Applications to Reverse Engineering of Biological Networks, in Advances in Computational Intelligence: Theory and Applications, F.-Y. Wang and D. Liu (editors), Series in Intelligent Control and Intelligent Automation, World Scientific publishers, 303-316, 2007.
P. Berman, B. DasGupta, E. Sontag. Randomized Approximation Algorithms for Set Multicover Problems with Applications to Reverse Engineering of Protein and Gene Networks, Discrete Applied Mathematics, 155 (6-7), 733-749, 2007.

Acknowledgments
Thanks to research collaborators for these projects:
R. Albert (Penn State), P. Berman (Penn State), R. Dondi (U. of Bergamo), G. Enciso (UC Irvine), A. Gitter (CMU), G. Gürsoy (UIC), R. Hegde (UIC), S. Kachalo (UIC), M. Karpinski (Bonn), P. Pal, G. S. Sivanathan (UIC), E. Sontag (Rutgers), P. Vera-Licona (INRIA), K. Westbrooks (GSU), A. Zelikovsky (GSU), R. Zhang (Penn State), Y. Zhang (UIC)
Thanks to the National Science Foundation (NSF) for funding:
DBI-1062328, IIS-1064681, IIS-0346973, DBI-0543365, IIS-0610244, CCR-9800086, CNS-0206795, CCF-0208749
Thanks to DIMACS (Rutgers) for generous support during my sabbatical leave through their special focus on computational and mathematical epidemiology

Thank you for your attention!
Questions?
[Chart: histogram of the frequency of occurrence of % additional edges = ((|E'| / OPT) - 1) * 100]