System level dynamics and robustness of the genetic network regulating E. coli metabolism
Areejit Samal
Department of Physics and Astrophysics
University of Delhi
Delhi 110007 India
June 15, 2009 Areejit Samal
Outline
• Background
• System: E. coli transcriptional regulatory network controlling metabolism (iMC1010v1)
• Simulation results
• Design features of the regulatory network
• Conclusions
Cell can be viewed as a ‘network of networks’
[Figure: a cell in its environment, drawn as three coupled networks. Transcriptional regulatory network: genes A, B and C on DNA (each with a promoter and a 5’ to 3’ coding region) give rise to mRNA and then to proteins A, B and C. Protein interaction network: interactions among the proteins. Metabolic network: a metabolic pathway linking metabolites labelled A, B, C and D.]
Boolean network approach to model Gene Regulatory Networks
• Boolean networks were introduced by Stuart Kauffman as a framework to study the dynamics of genetic networks.
• In this approach, gene expression is quantized to two levels:
– on or active (represented by 1) and
– off or inactive (represented by 0).
• Each gene at any point of time is in one of the two states (i.e. active or inactive).
• In this approach, time is taken as discrete.
• The expression state of each gene at any time instant is determined by the state of its input genes at the previous time instant via a logical rule or update function.
Simplified Diagram of the Transcriptional Regulatory Network controlling metabolism
• An input may activate or repress the expression of a gene. For example: Gene B [t+1] = NOT Gene A [t]
• When a gene has more than one input, its expression state is determined by the states of the inputs through a logical rule.
• This logical rule may be expressed in terms of Boolean operators (AND, OR, NOT).
• For example: Gene C [t+1] = Gene A [t] AND NOT Gene B [t]
• The state of Gene C determines whether the metabolic reaction can occur inside the cell.
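The two example rules above can be run as a tiny synchronous Boolean network. This is a minimal sketch only: the rule for Gene A (it simply keeps its state) and the initial configuration are assumptions made here for illustration, not part of the slides.

```python
def step(state):
    """One synchronous update: every gene reads the states at time t."""
    a, b, c = state["A"], state["B"], state["C"]
    return {
        "A": a,            # assumption: Gene A has no regulator and keeps its state
        "B": not a,        # Gene B[t+1] = NOT Gene A[t]
        "C": a and not b,  # Gene C[t+1] = Gene A[t] AND NOT Gene B[t]
    }

# Hypothetical initial configuration
state = {"A": True, "B": True, "C": False}
for t in range(4):
    print(t, {gene: int(on) for gene, on in state.items()})
    state = step(state)
```

With Gene A held on, Gene B switches off after one step and Gene C switches on one step later, so the metabolic reaction becomes available with a two-step delay.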
Modelling Gene Regulatory Networks as Random Boolean Networks
In the absence of data on real genetic networks, Boolean networks have been used primarily to study the dynamics of genetic networks that were
– either members of an ensemble of random networks, or
– networks generated using the known connectivity of genes and TF in an organism, with a random Boolean rule at each node as the input function governing the output state of the gene.
E. coli transcriptional regulatory network controlling metabolism (iMC1010v1)
In this work, we have studied the database iMC1010v1, which contains the transcriptional regulatory network (TRN) controlling E. coli metabolism. The network contained in the database was reconstructed from primary literature sources.
The database iMC1010v1 contains the following types of information:
– the connections between genes and transcription factors (TF)
– the dependence of gene and TF activity on the presence or absence of external metabolites or nutrients in the environment
– the Boolean rule describing the regulation of each gene as a function of the state of its input nodes
Available at: Bernhard Palsson’s Group Webpage (http://gcrg.ucsd.edu/)
Schematic of Transcriptional Regulatory Network controlling metabolism
[Figure: in the transcriptional regulatory network, genes A, B and C on DNA (each with a promoter and a 5’ to 3’ coding region) give rise to mRNA and then to proteins A, B and C; in the metabolic network, a metabolic reaction involves the metabolites labelled C and D.]
Description of the E. coli TRN controlling metabolism (iMC1010v1)
• There are 583 genes in this network, which can be further subdivided into
– 479 genes that code for metabolic enzymes
– 104 genes that code for TF
• The state of these 583 genes depends upon
– the state of 103 TF and
– the presence or absence of 96 external metabolites
• The database provides a Boolean rule for each of the 583 genes contained in the network.
The pink nodes represent genes coding for TF, brown nodes represent genes that code for metabolic enzymes and the green nodes represent external metabolites.
The complete network can be subdivided into a large connected component and a few small disconnected components.
Example of an input function in form of a Boolean rule controlling the output state of a gene
Output gene: b2720. Inputs A and B are the genes b2731 and b3202; input C is the external metabolite o2(e).
b2720[t+1] = IF ( b2731[t] AND b3202[t] AND NOT o2(e)[t] )

Truth Table
A B C OUTPUT
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 0
1 0 0 0
1 0 1 0
1 1 0 1
1 1 1 0
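The truth table follows mechanically from the Boolean rule. As a small sketch, the rule can be written as a Python function and enumerated over all eight input combinations; the function and variable names are chosen here for illustration.

```python
from itertools import product

def b2720_next(b2731, b3202, o2_e):
    """b2720[t+1] = b2731[t] AND b3202[t] AND NOT o2(e)[t]"""
    return int(bool(b2731 and b3202 and not o2_e))

# Enumerate inputs in the same order as the table: A, B = the two genes, C = o2(e)
rows = [(a, b, c, b2720_next(a, b, c)) for a, b, c in product((0, 1), repeat=3)]

print("A B C OUTPUT")
for row in rows:
    print(*row)
```

Only the row with both genes on and oxygen absent gives an output of 1, matching the table.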
The Dynamical System
We have used the information in the database to construct the following discrete dynamical system:
$g_i(t+1) = G_i(\mathbf{g}(t), \mathbf{m}), \quad i = 1, \ldots, 583$
where
– $g_i(t+1)$ denotes the state of the ith gene at time t+1, which is either 1 or 0
– $\mathbf{g}(t)$ is a vector that collectively denotes the state of all genes at time t
– $\mathbf{m}$ is a vector of 96 elements (each 0 or 1) determining the state of the environment
– $G_i$ contains all the information regarding the internal wiring of the network as well as the regulatory logic
State of the genetic network
The state of the 583 genes at any given time instant gives the state of the network:
$\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t), \ldots, g_{583}(t))^T$
where $g_i(t)$ = 0 or 1; i = 1, ..., 583.
Since each gene at any given time instant can be in one of the two states (0 or 1), the size of the state space is $2^{583}$.
State of the environment
The presence or absence of the 96 external metabolites decides the state of the environment:
$\mathbf{m} = (m_1, m_2, \ldots, m_{96})^T$
where $m_i$ = 0 or 1; i = 1, ..., 96.
If an external metabolite or nutrient is present in the external environment, then we set the corresponding $m_i$ equal to 1, or else 0. In general, the concentrations of external metabolites change with time. In the present study, we have considered buffered minimal media (i.e., the vector m is constant in time).
E. coli TRN controlling metabolism as a Boolean dynamical system
Stuart Kauffman (1969, 1993) studied dynamical systems of the form:
$g_i(t+1) = G_i(\mathbf{g}(t))$
The system studied here has the form:
$g_i(t+1) = G_i(\mathbf{g}(t), \mathbf{m})$
The present database allowed us to systematically account for the effect of the presence or absence of nutrients in the environment on the dynamics of the regulatory network.
Attractors of the E. coli TRN
• In the Boolean approach, the configuration space of the system is finite. The discrete deterministic dynamics ensures that the system eventually returns to a configuration which it had at a previous time instant. The sequence of states that repeat themselves periodically is called an attractor of the system.
• Starting from any one of the $2^{583}$ vectors as the initial configuration of genes, and with a fixed environment, the system can flow to different attractors for different initial configurations of genes.
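The definition of an attractor translates directly into a search: iterate the update rule and stop at the first repeated configuration. The sketch below assumes a hypothetical three-gene, one-nutrient network (`toy_step`); it is not the iMC1010v1 rule set, only an illustration of how fixed points and two-cycles arise.

```python
def find_attractor(step, g0, m):
    """Iterate g(t+1) = step(g(t), m) until a state repeats.

    Returns (attractor_states, transient_length).
    """
    seen = {}          # state -> time at which it was first visited
    trajectory = []
    g = tuple(g0)
    while g not in seen:
        seen[g] = len(trajectory)
        trajectory.append(g)
        g = step(g, m)
    start = seen[g]
    return trajectory[start:], start

def toy_step(g, m):
    # Hypothetical rules: g1 follows the nutrient; g2 and g3 repress each other
    # and both need g1 to be on.
    g1, g2, g3 = g
    return (m[0], int(g1 and not g3), int(g1 and not g2))

print(find_attractor(toy_step, (1, 1, 0), (1,)))  # a fixed point
print(find_attractor(toy_step, (0, 0, 0), (1,)))  # a two-cycle
print(find_attractor(toy_step, (1, 1, 1), (0,)))  # nutrient absent
```

Because the state space is finite and the dynamics is deterministic, the loop always terminates. For the real network the state space is far too large to enumerate, so initial configurations have to be sampled.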
The Network exhibits stability against perturbations of gene configurations for a fixed environment
$g_i(t+1) = G_i(\mathbf{g}(t), \mathbf{m})$
Fix m to some buffered minimal medium, e.g. the glucose aerobic condition.
Start with different g(t) as the initial configuration of genes, and determine the attractor of the system for each initial configuration.
Question 1: How many attractors of the system do we obtain starting from different initial configurations of genes and for a fixed environment?
Answer 1: We found that the attractors of the genetic network were typically fixed points or two-cycles. For a given environment, the number of different attractors was up to 8 fixed points and 28 two-cycles. However, the maximum Hamming distance between any two attractor states for a given environment was 21. Hence, the states of most genes (≥562) were the same in all attractor states for a given environment.
We found that the network exhibits homeostasis, or stability against perturbations of initial gene configurations, for a fixed environment.
Cellular Homeostasis
The graph shows that, even starting from an initial configuration of genes that is the inverse of the attractor for the glucose aerobic minimal medium, the system reaches the attractor in four time steps. Thus, any perturbation of gene configurations is washed out in a few time steps, and the system is robust to such perturbations.
[Figure: Hamming distance w.r.t. the glucose aerobic condition attractor (y-axis, 0 to 600) against time (x-axis, 0 to 4). Curves: random initial condition; Hamming inverse of the attractor; attractor for glutamate aerobic medium; attractor for acetate aerobic medium.]
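The relaxation experiment on this slide can be mimicked on a small scale. The following sketch uses a hypothetical four-gene feed-forward network (`toy_step`) and a made-up two-nutrient medium; it starts from the bitwise inverse of the attractor and records the Hamming distance to the attractor at each step.

```python
def hamming(x, y):
    """Number of positions at which two configurations differ."""
    return sum(a != b for a, b in zip(x, y))

def toy_step(g, m):
    # Hypothetical feed-forward rules driven by two nutrients m[0], m[1]
    g1, g2, g3, g4 = g
    return (m[0], g1, int(g1 and not m[1]), int(g2 or g3))

m = (1, 0)                 # assumed medium: nutrient 1 present, nutrient 2 absent
attractor = (1, 1, 1, 1)   # fixed point of toy_step for this medium

g = tuple(1 - bit for bit in attractor)   # "Hamming inverse" of the attractor
distances = [hamming(g, attractor)]
while g != attractor:
    g = toy_step(g, m)
    distances.append(hamming(g, attractor))

print(distances)
```

Even the worst-case perturbation decays to zero within a few updates, which is the behaviour the plot reports for the full 583-gene network.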
E. coli TRN exhibits flexibility of response under changing environmental conditions
$g_i(t+1) = G_i(\mathbf{g}(t), \mathbf{m})$
Vary m across a set of 15,427 buffered minimal media.
Determine the attractors of the genetic system for the different environments m.
Question 2: How different are the attractors from each other for various environmental conditions?
Answer 2: We obtained the attractors of the system for 15,427 environmental conditions. The largest Hamming distance obtained between two attractors corresponding to different environmental conditions was 145. The system shows flexibility of response to changing environmental conditions.
We found that the system is insensitive to fluctuations in gene configurations for a given fixed external environment, while it can shift to a different attractor when it encounters a change in the environment. These properties ensure robust dynamics of the underlying network.
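The procedure for Question 2 (one attractor per environment, then pairwise Hamming distances) can be sketched as follows. The four-gene network and its two nutrients are hypothetical stand-ins; the toy rules are feed-forward, so every environment leads to a fixed point and the simple loop below is sufficient.

```python
from itertools import combinations, product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def toy_step(g, m):
    # Hypothetical feed-forward rules driven by two nutrients m[0], m[1]
    g1, g2, g3, g4 = g
    return (m[0], g1, int(g1 and not m[1]), int(g2 or g3))

def fixed_point(m, g=(0, 0, 0, 0)):
    """Iterate until the state stops changing (valid for this acyclic toy)."""
    while True:
        nxt = toy_step(g, m)
        if nxt == g:
            return g
        g = nxt

# One attractor per environment, over all 2^2 toy environments
attractors = {m: fixed_point(m) for m in product((0, 1), repeat=2)}
largest = max(hamming(a, b) for a, b in combinations(attractors.values(), 2))

for m, a in attractors.items():
    print(m, a)
print("largest Hamming distance:", largest)
```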
Flexibility of response
The graph shows that the largest Hamming distance between two attractors from the set of attractors for 15,427 environmental conditions was 145.
[Figure: histogram of frequency (y-axis, 0 to about 3x10^6) against Hamming distance between attractors (x-axis, 0 to 140), with an inset covering Hamming distances 136 to 146.]
Flexibility of response
Each gene takes a value 0 or 1 in the 15427 attractors for the different environmental conditions. The standard deviation of a gene’s value across 15427 attractors is a measure of the gene’s variability across environmental conditions.
[Figure: histogram of the number of genes (y-axis, 0 to 250) against the standard deviation of the gene’s value across attractors (x-axis bins: 0, 0 - 0.1, 0.1 - 0.2, 0.2 - 0.3, 0.3 - 0.4, 0.4 - 0.5).]
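The histogram axis stops at 0.5 for a simple reason: for a gene that is on in a fraction p of the attractors, the standard deviation of its 0/1 values is sqrt(p(1 - p)), which is 0 for a gene that never changes and at most 0.5 (at p = 0.5). The sketch below checks this, assuming the population form of the standard deviation (the slides do not say which form was used; for thousands of attractors the difference is negligible).

```python
from math import sqrt
from statistics import pstdev

def binary_std(values):
    """Population standard deviation of a list of 0/1 values."""
    p = sum(values) / len(values)   # fraction of attractors in which the gene is on
    return sqrt(p * (1 - p))

always_on = [1, 1, 1, 1]        # hypothetical gene, same state in every attractor
half_half = [0, 1, 0, 1]        # hypothetical gene, on in half of the attractors
mostly_off = [0, 0, 0, 1]

for values in (always_on, half_half, mostly_off):
    print(values, binary_std(values), pstdev(values))
```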
Functional significance of attractors of TRN controlling metabolism
Attractor for a given environment: 1010...1 (one bit per gene; the genes comprise metabolic enzyme genes and TF genes)
– Gene 1 is active: The enzyme is present to carry out a reaction in the metabolic network
– Gene 2 is inactive: The enzyme is absent and a reaction cannot happen in the network
The attractor of the genetic network for a given environment constrains the set of active enzymes that catalyze various reactions in the metabolic network.
Flux Balance Analysis (FBA)
INPUT:
– List of metabolic reactions with stoichiometric coefficients
– Biomass composition
– Medium of growth or environment
OUTPUT:
– Growth rate for the given medium
– Fluxes of all reactions
Reference: Varma and Palsson, Biotechnology (1994)
Incorporating regulatory constraints within FBA
Pure FBA
INPUT:
– List of metabolic reactions
– Biomass composition
– Medium of growth or environment
OUTPUT:
– Growth rate (pure)
– Fluxes of all reactions
Constrained FBA
– The state of the environment m determines the attractor of the genetic network (1010...1).
– The attractor selects the subset of the list of metabolic reactions that is passed to FBA.
– FBA on this subset gives the growth rate (constrained).
The ratio of constrained FBA growth rate to pure FBA growth rate is ≤ 1.
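The reason the ratio cannot exceed 1 is easiest to see when both problems are written as linear programs. The following is a sketch in the standard FBA form; the symbols S (stoichiometric matrix), v (flux vector), the flux bounds and the set R_off (reactions whose enzymes are absent in the attractor) are notation assumed here, not taken from the slides, and the mapping from inactive genes to blocked reactions is simplified.

```latex
\begin{aligned}
\text{Pure FBA:}\quad
  \mu_{\text{pure}} &= \max_{v}\; v_{\text{biomass}}
  \quad \text{s.t.}\quad S v = 0,\;\;
  v^{\min}_j \le v_j \le v^{\max}_j \;\; \forall j \\
\text{Constrained FBA:}\quad
  \mu_{\text{con}} &= \max_{v}\; v_{\text{biomass}}
  \quad \text{s.t.}\quad S v = 0,\;\;
  v^{\min}_j \le v_j \le v^{\max}_j \;\; \forall j,\;\;
  v_j = 0 \;\; \forall j \in R_{\text{off}}
\end{aligned}
```

Every flux vector that satisfies the constrained problem also satisfies the pure one, so the constrained maximum can never be larger: $\mu_{\text{con}} \le \mu_{\text{pure}}$, and the ratio is at most 1 whenever $\mu_{\text{pure}} > 0$.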
Adaptability
Question 3(a): What is the ratio of the constrained FBA growth rate to the pure FBA growth rate for various environmental conditions? In other words, is the regulatory network reaching an attractor that can make optimal use of the underlying metabolic network?
Answer 3(a): Histogram of the ratio of the constrained FBA growth rate in the attractor of each of the 15,427 minimal media to the pure FBA growth rate in that medium. The histogram peaks in the bin with the largest ratio (≥ 0.9).
[Figure: number of media (y-axis, 0 to 7000) against the ratio of constrained FBA growth rate to pure FBA growth rate (x-axis, bins of width 0.1 from 0 to 1.0).]
Adaptability
[Diagram: for a fixed medium m, the gene configuration evolves through a sequence of states (1010...1, 1100...1, ..., 1101...0) at t=0, t=1, ..., t=∞. Each configuration, together with the biomass composition, is passed to FBA, giving growth rates GR(t=0), GR(t=1), ..., GR(t=∞).]
Question 3(b): How well is the attractor of any particular medium “adapted” to that medium? Does the movement to the attractor “improve” the cell’s “metabolic functioning” in the medium?
[Figure: growth rate (y-axis, 0.0 to 1.4) against time (x-axis, 0 to 5). Curves: glutamine aerobic medium; lactate aerobic medium; fucose aerobic medium; acetate aerobic medium.]
Answer 3(b): Growth rate increases by a factor of 3.5, averaged over pairs of minimal media. From one minimal medium to another, the average time taken to reach the attractor is only 2.6 steps.
Thus the regulatory dynamics enables the cell to adapt to its environment and improve its metabolic efficiency very substantially and fairly quickly.
The graph shows the genetic network controlling E. coli metabolism.
Design Features of the network explain Homeostasis and Flexibility
This is an acyclic graph with maximal depth 4. Fixing the environment fixes the TF states and hence also the leaf nodes, which leads to homeostasis. When the environment changes, the attractor state changes, which endows the system with the property of flexible response.
[Figure: network drawn in three layers: external metabolites, transcription factors and metabolic genes.]
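The link between depth and homeostasis can be made concrete: in an acyclic network with synchronous updates, a node at depth d depends only on the environment after d steps, so the whole network forgets its initial configuration within a number of steps equal to the maximal depth. The sketch below computes that depth for a small hypothetical wiring diagram; apart from o2(e), the node names are invented for illustration.

```python
from functools import lru_cache

# Hypothetical wiring: node -> list of its inputs (empty list = environmental root)
inputs = {
    "glc(e)": [],
    "o2(e)": [],
    "TF1": ["glc(e)"],
    "TF2": ["TF1", "o2(e)"],
    "geneX": ["TF2"],
    "geneY": ["TF1", "glc(e)"],
}

@lru_cache(maxsize=None)
def depth(node):
    """Length of the longest path from an environmental root to this node."""
    parents = inputs[node]
    return 0 if not parents else 1 + max(depth(p) for p in parents)

max_depth = max(depth(node) for node in inputs)
for node in inputs:
    print(node, depth(node))
print("maximal depth:", max_depth)
```

A maximal depth of 4 in the real network is consistent with the perturbation experiment, where the inverse of the attractor relaxes in four time steps.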
Design Features of the network explain Homeostasis and Flexibility
The very few feedbacks from metabolism on to transcription factors are through the concentration of internal metabolites.
[Figure: the layered network of external metabolites, transcription factors and metabolic genes, with internal metabolites added as the source of feedback.]
Modularity, Flexibility and Evolvability
This is a highly disconnected structure.
The disconnected components are dynamically independent and hence can be regarded as modules.
Such a structure can facilitate adaptation to new environmental niches during evolution.
Almost all input functions in the E. coli TRN are canalyzing functions
• When a gene has K inputs, there are in general $2^{2^K}$ possible Boolean input functions.
– As K increases, the number of possible Boolean functions also increases.
• A canalyzing Boolean function has at least one input such that, for at least one value of that input, the output value is fixed.
• Stuart Kauffman proposed that canalyzing Boolean functions are likely to be over-represented in real networks.
• We found that all except four Boolean functions in the E. coli TRN were canalyzing.
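The definition of a canalyzing function can be checked by brute force over the truth table. The sketch below is illustrative only: the helper name `is_canalyzing` is chosen here, the three-input example reuses the form of the b2720 rule, and XOR serves as a standard non-canalyzing counterexample.

```python
from itertools import product

def is_canalyzing(f, k):
    """True if some input, held at some value, fixes the output of f."""
    rows = list(product((0, 1), repeat=k))
    for i in range(k):
        for value in (0, 1):
            outputs = {f(*row) for row in rows if row[i] == value}
            if len(outputs) == 1:   # output no longer depends on the other inputs
                return True
    return False

def and_and_not(a, b, c):
    # Same form as the b2720 rule: A AND B AND NOT C
    return int(a and b and not c)

def xor(a, b):
    return a ^ b

print(is_canalyzing(and_and_not, 3))  # True: A = 0 forces the output to 0
print(is_canalyzing(xor, 2))          # False: no single input fixes the output
print(2 ** (2 ** 3))                  # 256 possible Boolean functions for K = 3
```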
Design Features of the network
• The genetic network regulating E. coli metabolism is
– Largely acyclic
– Hierarchical
– Under root control by environmental variables
– Disconnected and modular in structure at the level of transcription factors
– Dominated by canalyzing Boolean functions
• There are some small cycles that exist due to the presence of control by fluxes or internal metabolites, but these cycles are very localized.
• Note that cycles are expected in developmental systems such as the cell cycle, which is a temporal phenomenon.
• In metabolism, lack of cycles at the genetic level can be an advantage, as this is a slow process.
• Most cycles in metabolism exist at the level of enzymes and internal metabolites; such a process is faster.
Dynamics of the E. coli TRN controlling metabolism is highly ordered in contrast to that of Random Boolean Networks
Kauffman found, using the Derrida plot, that Random Boolean Networks (RBN) with K=2 are at the edge of chaos. The Derrida plot is the discrete analog of the Lyapunov coefficient. Derrida plots for RBNs with K>2 are found to lie above the diagonal, and their dynamics is quite chaotic.
Reference: S.A. Kauffman (1993)
Derrida Plot
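The Derrida plot is a discrete analogue of the Lyapunov coefficient for continuous systems.
Example: two configurations, 100101 and 110111, differ in two positions at t=0, so H(0) = 2. After one time step they become 000111 and 100111, which differ in one position, so H(1) = 1.
[Figure: Derrida plot of H(1) against H(0). The region below the diagonal is the ordered regime; the region above the diagonal is the chaotic regime.]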
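One point of a Derrida plot can be estimated by sampling pairs of configurations at a chosen initial distance H(0), updating both once, and averaging the resulting distance H(1). The sketch below first reproduces the distances of the example pair and then applies the procedure to a hypothetical six-gene rule (`toy_step`, with the environment taken as fixed inside the rule); the sample size and seed are arbitrary choices.

```python
import random

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def derrida_point(step, n, h0, samples, rng):
    """Average H(1) over random pairs of n-gene states that start at distance h0."""
    total = 0
    for _ in range(samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        flipped = set(rng.sample(range(n), h0))          # positions to perturb
        y = tuple(1 - bit if i in flipped else bit for i, bit in enumerate(x))
        total += hamming(step(x), step(y))
    return total / samples

# The example pair: distances before and after one update
print(hamming("100101", "110111"))  # H(0) = 2
print(hamming("000111", "100111"))  # H(1) = 1

def toy_step(x):
    # Hypothetical rule: gene 0 is held on; every other gene ANDs itself
    # with its left neighbour.
    return (1,) + tuple(x[i] & x[i - 1] for i in range(1, len(x)))

rng = random.Random(0)
for h0 in (1, 2, 4):
    print(h0, derrida_point(toy_step, 6, h0, 200, rng))
```

Points with average H(1) below H(0) lie under the diagonal, which is the signature of ordered dynamics.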
[Figure: Derrida plot for the E. coli TRN controlling metabolism: H(1) (y-axis, 0 to 500) against H(0) (x-axis, 0 to 500).]
Reference: A. Samal and S. Jain (2008)
K can be as large as 8: the E. coli TRN controlling metabolism has input functions with up to K=8 inputs. Nevertheless, the dynamics of the E. coli TRN is highly ordered.
System is far from edge of chaos
• The simple architecture of the genetic network controlling E. coli metabolism endows the system with the properties of
– Homeostasis
– Flexibility of response
• Note that the dynamics is highly ordered and the system is far from the edge of chaos. It has been argued that the advantage of a system staying close to the edge of chaos lies in its ability to evolve and be flexible.
• We have shown that the real system has an architecture with root control by environmental variables which is highly flexible, evolvable and far from the edge of chaos.
• Such an architecture of the regulatory network can also be useful for organisms with different cell types.
Acknowledgement
Collaboration
Sanjay Jain
University of Delhi, India
Reference