Published in IET Systems Biology
Received on 8th January 2010
Revised on 13th July 2010
doi: 10.1049/iet-syb.2010.0010
Special issue on the Third q-bio Conference on Cellular Information Processing
ISSN 1751-8849
IET Syst. Biol., 2010, Vol. 4, Iss. 6, pp. 393–408
© The Institution of Engineering and Technology 2010
www.ietdl.org

Thermodynamic models of combinatorial gene regulation by distant enhancers

J. Narula, O.A. Igoshin
Department of Bioengineering, Rice University, 6500 Main Street, Houston, TX 77030, USA
E-mail: [email protected]

Abstract: The dynamical properties of distal and proximal gene regulatory elements are crucial to their functionality in gene regulatory networks. However, the multiplicity of regulatory interactions at control elements makes their theoretical and experimental characterisation difficult. Here a thermodynamic framework to describe gene regulation by distant enhancers via a chromatin mechanism is developed. In this mechanism transcription factors (TFs) modulate gene expression via shifts in the equilibrium between chromatin states. The designs of AND, OR, XOR and NAND two-input transcriptional gates for the chromatin mechanism are proposed and compared to similar gates based on the direct physical interactions of TFs with the transcriptional machinery. An algorithm is developed to estimate the thermodynamic parameters of chromatin mechanism gates from gene expression reporter data and applied to characterise the response function for the Gata2-3 enhancer in hematopoietic stem cells. In addition, waiting-time distributions for transcriptionally active states were analysed to expose the biophysical differences between the contact and chromatin mechanisms. These differences can be experimentally observed in single-cell experiments and therefore can serve as a signature of the gene regulation mechanism. Taken together, these results indicate the diverse functionality and unique features of the chromatin mechanism of combinatorial gene regulation.

1 Introduction

Differential regulation of gene expression is the key to cellular diversity in complex organisms. Its dynamical properties are controlled by underlying gene regulatory networks (GRNs) consisting of transcription factor (TF) genes and their cis-regulatory elements that, together with basic transcriptional machinery, control the expression levels of each gene [1]. The complexity of genetic regulation in higher organisms is related to the complexity of the underlying networks rather than the number of genes [2]. In particular, this complexity often manifests itself in combinatorial regulation of gene expression with multiple inputs converging on regulatory control elements. Binding sites for transcriptional regulators are found either in the immediate vicinity of a transcription initiation site or in the enhancer sequences situated several kilobases upstream or downstream [3].

The molecular mechanisms of gene regulation via distant enhancers are not very well understood. The proposed mechanisms of distal regulation of gene expression can be broadly characterised into two classes: contact mechanisms and non-contact mechanisms. Contact mechanisms involve DNA looping or packing that brings the enhancer-bound proteins close to the promoter (Fig. 1a) [4]. Non-contact mechanisms do not rely on direct physical contact of the enhancer-bound proteins and transcriptional machinery [5, 6]. Proposed mechanisms of non-contact enhancer action include superhelical tension in negatively supercoiled DNA, nuclear localisation and nucleosome remodelling [3]. The nucleosome-remodelling hypothesis is particularly attractive as it explains why chromatin integration is often essential to observe any enhancer action [7, 8]. It also explains why enhancers in some single-cell measurements affect the probability of transcription rather than the rate of transcription [7, 9] and how in many cases enhancers regulate transcription in a manner that is independent of their orientation and distance relative to the transcription initiation site [10].

Dynamical modelling of GRNs is often essential to understand their functionality as it provides information




about network steady states and its responsiveness to physiologically important inputs and perturbations. Construction of such models requires functional expressions that relate concentrations of TFs to the rate of transcription of regulated genes. Common approaches to constructing such input functions include the use of Boolean functions (such as logical gates), Hill functions and thermodynamic models. Boolean models and ordinary differential equation (ODE) models that use Hill functions provide useful qualitative information about the behaviour of regulatory networks. However, these approaches are based on phenomenological information about the networks rather than a specific biophysical mechanism of gene regulation [11–13]. In contrast, thermodynamic treatment of transcriptional regulation provides a rigorous method to translate hypotheses about the mechanism of transcriptional regulation into quantitative models [14–16]. This approach has been extensively used to model bacterial gene regulation but has not been widely adopted for combinatorial regulation in higher organisms.

Figure 1 Two mechanisms of combinatorial gene regulation by distant enhancers
a Contact mechanism: DNA looping brings the distant enhancer-bound TFs A and B close to the promoter-bound transcriptional machinery to allow protein–protein interactions. TFs A and B can activate transcription by stabilising promoter-bound transcriptional machinery or repress transcription by destabilising or sterically hindering the binding of the transcriptional machinery to the promoter
b Chromatin mechanism: RNA polymerase binding sites are inaccessible in the closed chromatin state where DNA is tightly wrapped into nucleosomes. The binding sites become accessible when DNA unwraps from nucleosomes and forms the open chromatin state. TFs A and B (activators) bind to enhancer binding sites in the open chromatin state to shift the equilibrium towards the open state and increase the probability of gene transcription. TF C (repressor) binds to the enhancer in the closed chromatin state and shifts the equilibrium away from the open state to decrease the probability of transcription

We recently developed a thermodynamic model of distant enhancer activation via chromatin disruption and applied it to the dynamic modelling of the core network module in haematopoiesis [7]. The model assumes that the structure of chromatin in the gene neighbourhood is in either an unstable open state that allows binding of the transcriptional machinery and gene transcription or a relatively stable closed state that does not allow transcription (see Fig. 1b). In the closed chromatin state, the binding regions for the transcriptional machinery are wrapped in nucleosomes and are inaccessible, and no gene expression is possible from this state. The closed chromatin state can spontaneously unwrap to an open state, where the binding sites become accessible and allow the transcriptional machinery to bind to the promoter and initiate transcription.


This model of chromatin structure dynamics is based upon experimental results that show that (i) the structure of chromatin, in particular nucleosomes, can impede transcriptional initiation [17, 18], (ii) chromatin exists in a dynamic equilibrium of open and closed states [19] and (iii) TFs can disrupt chromatin structure by displacing nucleosomes to control gene transcription [20, 21]. The central idea of our model is that by binding at the enhancer and modulating chromatin structure, TFs can control the rate of gene expression without any physical interactions with the transcriptional machinery.

In this paper, we further develop a general thermodynamic framework to construct input functions of combinatorial gene regulation. We generalise this mechanism to include the possibility of negative regulation via stabilisation of the closed chromatin conformation. With that generalisation, we show that this chromatin mechanism is capable of generating the same logical input functions as the direct contact mechanism of transcriptional regulation [14, 15]. We further compare the sensitivities of the resulting input functions with respect to changes of the parameters and indicate important distinctions between contact and chromatin mechanisms. In addition, we describe an approach that uses gene expression reporter measurements to estimate thermodynamic parameters and thereby characterise the complete response function for any enhancer design and apply it to characterise the regulation of Gata2, an essential haematopoietic stem cell (HSC) gene, by a distant enhancer. Finally, we compare the dynamic properties of the two mechanisms with respect to waiting-time distributions of transcriptionally active and inactive states.

Our results indicate that the chromatin mechanism of gene regulation can perform the same logic gate type input functions for transcriptional regulation as the contact mechanism. However, the differences in the biophysical mechanism (direct contact against chromatin) lead to differences in the design of regulatory elements, in sensitivities to mutations and in dynamical properties.

2 Results

2.1 Thermodynamic formalism to model combinatorial gene regulation via chromatin mechanism

Quantitative characterisation of gene regulation requires a mathematical expression relating the rate of gene transcription to the concentrations of TFs that regulate its expression. Because the initiation of transcription is usually the rate-limiting step in gene expression [22], at thermodynamic equilibrium the rate of gene expression $I$ is given by the product of the binding probability $p_B$ of transcriptional machinery to the promoter and the rate of RNA polymerase isomerisation $I_0$

$$I = I_0 p_B \tag{1}$$


We assume that in both contact and chromatin mechanisms of enhancer action, TFs at the enhancer modulate the transcriptional rate via probability $p_B$. Binding of TFs and the transcriptional machinery at DNA binding sites in the regulatory region generates multiple protein-bound DNA configurations or microstates. At thermodynamic equilibrium, the probability $p(a)$ of each microstate $a$ is given by a Boltzmann distribution

$$p(a) = \frac{e^{-G_a}}{Z} \tag{2}$$

Here $Z = \sum_a e^{-G_a}$ is a partition function that represents the sum of the Boltzmann weights of all possible configurations and $G_a$ is the dimensionless free energy of each configuration in the units of $kT$. In order to compute $p_B$, we then simply sum up probabilities of all configurations where the transcriptional machinery is bound to the promoter (we denote this set by $a_T$)

$$p_B = \sum_{a \in a_T} p(a) \tag{3}$$

Using (2) and (3) we obtain the following expression for the cumulative probability for transcriptional machinery bound to the promoter

$$p_B = \frac{Z_{\mathrm{ON}}}{Z_{\mathrm{ON}} + Z_{\mathrm{OFF}}} \tag{4}$$

where we split the partition function $Z$ into two parts $Z_{\mathrm{ON}}$ and $Z_{\mathrm{OFF}}$, corresponding to transcriptional machinery bound and not bound states, respectively

$$Z_{\mathrm{ON}} = \sum_{a \in a_T} e^{-G_a}; \qquad Z_{\mathrm{OFF}} = \sum_{a \notin a_T} e^{-G_a} \tag{5}$$

The free energies depend on the binding affinities, cooperative interaction energies and concentrations of all bound proteins in that configuration. Therefore TFs can activate gene transcription by increasing $Z_{\mathrm{ON}}$ or repress the transcription rate by increasing $Z_{\mathrm{OFF}}$. Concentrations of TFs and RNA polymerase/transcriptional machinery enter (4) and (5) via entropic contributions to free energies of each bound configuration

$$G_a = G^0_a - \sum_i \log([C_i]) \tag{6}$$

Here $[C_i]$ stands for the concentration of the $i$th protein, the summation is over all the bound protein monomers in the configuration and $G^0_a$ is the standard free energy of the configuration at unit TF concentration.
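As a minimal sketch of how (2)–(6) combine, the function below sums Boltzmann weights over a list of microstates and returns $p_B$. The function name, the tuple layout and the two-state example are assumptions made for illustration; they do not come from the paper.

```python
import math


def binding_probability(microstates):
    """Probability that the transcriptional machinery is promoter-bound.

    Each microstate is a tuple (g0, concentrations, machinery_bound):
      g0               standard free energy G0_a in units of kT
      concentrations   concentrations [C_i] of every bound protein monomer
      machinery_bound  True if the configuration belongs to the set a_T
    """
    z_on = 0.0
    z_off = 0.0
    for g0, concentrations, machinery_bound in microstates:
        # Eq. (6): G_a = G0_a - sum_i log([C_i])
        g = g0 - sum(math.log(c) for c in concentrations)
        weight = math.exp(-g)  # Boltzmann weight, numerator of Eq. (2)
        if machinery_bound:
            z_on += weight
        else:
            z_off += weight
    # Eq. (4): p_B = Z_ON / (Z_ON + Z_OFF)
    return z_on / (z_on + z_off)


# Hypothetical two-state promoter: empty, or machinery bound with
# G0 = -2 kT at a machinery concentration of 0.5 (arbitrary units).
states = [
    (0.0, [], False),
    (-2.0, [0.5], True),
]
p_b = binding_probability(states)
```

Because concentrations enter only through the logarithm in (6), each bound monomer simply multiplies the Boltzmann weight of its configuration by its concentration.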

Thus far the formalism is very general and can be applied to both mechanisms of enhancer action depicted in Fig. 1. The chromatin mechanism (Fig. 1b) includes microstates corresponding to both open and closed chromatin configurations. Transcription activators that attach to DNA binding sites in the open chromatin state shift the equilibrium towards the open state and increase the probability of gene transcription ($p_B$). Similarly repressors bind to and stabilise the DNA in the closed chromatin state, thereby decreasing $p_B$. This can occur for instance if the binding site consists of the sequence motifs that are only brought in close physical contact by DNA packaging (see Fig. 1b) [23, 24]. As a result, under the chromatin mechanism TFs modulate the rate of transcription even without direct physical interaction with the transcriptional machinery.

We first consider open and closed chromatin states in the absence of TF or transcriptional machinery binding and define their respective free energies as $G_0$ and $G_1$. Thereafter we set $G_0 = 0$ and measure all the other free energies from this reference state. We define an equilibrium constant of ‘spontaneous’ DNA opening (equilibrium constant for transitions between open and closed chromatin in the absence of TF bound) as

$$e^{-G_1} = K \tag{7}$$

In most cases, the probability of spontaneous opening is low resulting in a large equilibrium constant, $K \gg 1$ [19]. The binding of the transcriptional machinery occurs only in an open state and therefore only these states contribute to $Z_{\mathrm{ON}}$. On the other hand, $Z_{\mathrm{OFF}}$ includes contributions from open chromatin enhancer configurations ($a \in a_{\mathrm{Open}}$) and closed chromatin enhancer configurations ($a \in a_{\mathrm{Closed}}$) and is represented as follows

$$Z_{\mathrm{OFF}} = Z_{\mathrm{Closed}} + Z_{\mathrm{Open}} \tag{8}$$

where

$$Z_{\mathrm{Open}} = 1 + \sum_{a \in a_{\mathrm{Open}}} e^{-G_a} \quad \text{and} \quad Z_{\mathrm{Closed}} = K + \sum_{a \in a_{\mathrm{Closed}}} e^{-G_a} \tag{9}$$

We assume that enhancer-bound TFs do not interact directly with the promoter-bound transcriptional machinery. Therefore for each open chromatin microstate of the enhancer, the binding free energy for transcriptional machinery is the same value denoted $G_T$. As a result, the partition function $Z_{\mathrm{ON}}$ is factorised as

$$Z_{\mathrm{ON}} = e^{-G_T} Z_{\mathrm{Open}} \tag{10}$$

These general equations can be used to model any gene regulatory element that functions via the chromatin mechanism. We will use these equations to model the distant regulatory elements shown in Fig. 2 that implement input functions corresponding to various logic gates.
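The chromatin-mechanism equations (4) and (8)–(10) can be sketched as a single helper. The name `chromatin_p_bound`, its argument layout and the numbers in the example are hypothetical and chosen only to show the direction of each effect.

```python
def chromatin_p_bound(w_T, K, open_weights=(), closed_weights=()):
    """p_B for the chromatin mechanism.

    w_T             exp(-G_T), Boltzmann weight of machinery binding
    K               equilibrium constant of Eq. (7), weight of the bare closed state
    open_weights    Boltzmann weights of TF-bound open-chromatin enhancer states
    closed_weights  Boltzmann weights of TF-bound closed-chromatin enhancer states
    """
    z_open = 1.0 + sum(open_weights)      # Eq. (9)
    z_closed = K + sum(closed_weights)    # Eq. (9)
    z_on = w_T * z_open                   # Eq. (10)
    z_off = z_closed + z_open             # Eq. (8)
    return z_on / (z_on + z_off)          # Eq. (4)


# Illustrative numbers only (not taken from the paper).
basal = chromatin_p_bound(w_T=1.0, K=8.0)
activated = chromatin_p_bound(w_T=1.0, K=8.0, open_weights=[50.0])
repressed = chromatin_p_bound(w_T=1.0, K=8.0, closed_weights=[50.0])
```

In this sketch a TF-bound open state raises $p_B$ and a TF-bound closed state lowers it, although neither weight contains any interaction term with the transcriptional machinery.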


2.2 Implementation of cis-regulatory logic gate functions with chromatin mechanism

Buchler et al. [15] showed that AND, OR, NAND and XOR logic type cis-regulatory functions can be implemented with the contact model. In this section, we show that enhancer-bound TFs can produce similar logic gate input functions without TF–transcriptional machinery interactions based on the thermodynamic formalism given by (4), (5) and (8)–(10). The parameter values for each gate were numerically determined to minimise the mean-square difference from the corresponding gate function of the contact mechanism ([15] and Fig. S1).

Our implementation of the AND gate type input function is schematically shown in Fig. 2a. Both TFs are activators and we assume that they only bind to DNA in the open chromatin state and thereby increase $p_B$. The probability of transcription for the enhancer element is calculated by using the following expressions for $Z_{\mathrm{ON}}$ and $Z_{\mathrm{OFF}}$ in (4)

$$Z_{\mathrm{ON}} = e^{-G_T}\left(1 + [A]e^{-G_A} + [B]e^{-G_B} + [A][B]e^{-G_A - G_B - G_{AB}}\right)$$
$$Z_{\mathrm{OFF}} = K + 1 + [A]e^{-G_A} + [B]e^{-G_B} + [A][B]e^{-G_A - G_B - G_{AB}} \tag{11}$$

Clearly, the probability of transcription for an AND gate is maximum at saturating TF concentrations, that is $[A], [B] \gg 1$ [cf. (4) and (11)]. We calculate the transcription rate normalised to this maximum level of expression and the results are shown in Fig. 3a. Note that because promoter-bound transcriptional machinery and enhancer-bound TFs do not interact, the response function value at saturating levels of A is always the same as the value at saturating levels of B. This is not generally true for the contact model as the response function value at saturating concentrations depends on the free energy of the TF–transcriptional machinery interaction. However, for an appropriate choice of free energies this saturation effect will not be observed and the resulting input functions are very similar.
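The equal-saturation property can be checked directly from (4) and (11). The short derivation below is a sketch; the shorthand $w_T = e^{-G_T}$ and the symbol $Z_{\mathrm{Open}}$ for the bracketed sum in (11) are introduced here for compactness.

```latex
% Substituting (11) into (4), with w_T = e^{-G_T}:
\[
  p_B = \frac{w_T\, Z_{\mathrm{Open}}}{(1 + w_T)\, Z_{\mathrm{Open}} + K},
  \qquad
  Z_{\mathrm{Open}} = 1 + [A]e^{-G_A} + [B]e^{-G_B} + [A][B]e^{-G_A - G_B - G_{AB}} .
\]
% Z_Open grows without bound when either [A] or [B] becomes large, so
\[
  \lim_{[A] \to \infty} p_B \;=\; \lim_{[B] \to \infty} p_B \;=\; \frac{w_T}{1 + w_T},
\]
% which contains none of G_A, G_B or G_AB.
```

The limit depends only on $G_T$, which is why saturating either TF alone gives the same plateau in the chromatin mechanism.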

Figure 2 Designs of distant enhancers that exhibit a logic gate response
a AND gate response: the binding sites for the TFs in the distant enhancer are weak and cooperative interaction between TFs is strong
b OR gate response: the TF binding sites in the enhancer are strong and there is no cooperativity
c NAND gate response: the TFs bind weakly to the enhancer sites in the closed chromatin configuration
d XOR gate response: only one TF can bind to the enhancer in the open chromatin state because the binding sites overlap. Binding of the TFs to the closed chromatin state is weak but highly cooperative
Dashed lines indicate direct physical interaction. Weaker binding sites are hatched. In the chromatin mechanism, activators increase the probability of transcription by binding to the enhancer in the open chromatin state and repressors decrease gene expression by binding to enhancers in the closed chromatin state

The implementation of the OR logic input function is shown in Fig. 2b. The design of the OR logic gate is similar to the AND gate in that both TFs increase the rate of gene expression by binding to the enhancer. The expressions for $Z_{\mathrm{ON}}$ and $Z_{\mathrm{OFF}}$ for the OR logic enhancer are the same as the ones specified in (11) for the AND gate. However, crucial differences between the AND logic and OR logic gate designs are in the parameters characterising the strength of TF-enhancer binding. In the case of the AND gate, TFs bind to the enhancer weakly and the TF–TF interactions stabilise the TF–DNA complex. On the other hand, for the OR gate each TF binds very strongly to its binding site in the enhancer. We substitute the expressions from (11) into (4) to calculate the transcription rate for the OR gate normalised relative to the maximum rate at $[A], [B] \gg 1$. Fig. 3b shows the transcriptional input function of the OR gate. The OR gate type logic of the input function shown here and the contact model input function are very similar (cf. Fig. 3b and Fig. S1b).
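To make the OR design concrete, the sketch below evaluates (4) with (11) using the OR gate parameter values quoted in the Fig. 3b caption ($G_A = G_B = 3.38$, $G_{AB} = 1.4$, $K = 24.14$, $e^{-G_T} = 2.15$) and normalises to the rate at $[A] = [B] = 10^3$. The function and variable names are assumptions of this sketch, not the authors' code.

```python
import math


def p_bound(A, B, G_A, G_B, G_AB, K, w_T):
    """p_B from Eqs. (4) and (11); w_T stands for exp(-G_T)."""
    z_open = (1.0
              + A * math.exp(-G_A)
              + B * math.exp(-G_B)
              + A * B * math.exp(-G_A - G_B - G_AB))
    z_on = w_T * z_open    # Z_ON in Eq. (11)
    z_off = K + z_open     # Z_OFF in Eq. (11)
    return z_on / (z_on + z_off)


# OR gate parameters as quoted in the Fig. 3b caption.
OR_PARAMS = dict(G_A=3.38, G_B=3.38, G_AB=1.4, K=24.14, w_T=2.15)


def normalised_rate(A, B, params, A_ref=1e3, B_ref=1e3):
    """Transcription rate relative to the rate at the reference concentrations."""
    return p_bound(A, B, **params) / p_bound(A_ref, B_ref, **params)


# The four corners of the two-input truth table.
corners = {
    (a, b): normalised_rate(a, b, OR_PARAMS)
    for a in (0.0, 1e3)
    for b in (0.0, 1e3)
}
```

With these values a single saturating input already brings the normalised rate close to its maximum, while the rate with no input stays low, which is the OR-type behaviour described above.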

TFs A and B act as repressors of transcription in the NAND gate input function. Therefore we assume that both TFs bind to the enhancer in the closed chromatin state (Fig. 2c) and decrease the probability of transcription. This NAND gate is implemented with binding sites in the enhancer (silencer), unlike the implementation for the contact model NAND gate where the TF binding sites must overlap with the binding site of the transcriptional machinery [15]. $p_B$ depends on the following $Z_{\mathrm{ON}}$ and $Z_{\mathrm{OFF}}$

$$Z_{\mathrm{ON}} = e^{-G_T}$$
$$Z_{\mathrm{OFF}} = K + 1 + [A]e^{-G_A} + [B]e^{-G_B} + [A][B]e^{-G_A - G_B} \tag{12}$$

The resulting analytical expression of the chromatin mechanism NAND is identical to that of the contact mechanism because of the lack of TF–transcriptional machinery interaction energies in either mechanism. Therefore we can analytically find parameter values for the chromatin mechanism such that its normalised transcription rate shown in Fig. 3c is identical to the normalised transcription rate of the contact mechanism [see (12) and (13) in Supplementary information and Fig. S1c].
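A minimal sketch of the NAND response from (4) and (12), using the parameter values quoted in the Fig. 3c caption ($G_A = G_B = 1.61$, $G_{AB} = 0$, $K = 19$, $e^{-G_T} = 2000$) and normalising to the rate at $[A] = [B] = 0$. The function names are hypothetical.

```python
import math


def p_bound_nand(A, B, G_A=1.61, G_B=1.61, K=19.0, w_T=2000.0):
    """p_B from Eqs. (4) and (12); w_T stands for exp(-G_T)."""
    z_on = w_T  # only the bare open state binds the machinery
    z_off = (K + 1.0
             + A * math.exp(-G_A)
             + B * math.exp(-G_B)
             + A * B * math.exp(-G_A - G_B))
    return z_on / (z_on + z_off)


def normalised_nand(A, B):
    """Transcription rate relative to the rate with no TF present."""
    return p_bound_nand(A, B) / p_bound_nand(0.0, 0.0)


single_input = normalised_nand(1e3, 0.0)
both_inputs = normalised_nand(1e3, 1e3)
```

With these values a single repressor at high concentration leaves most of the expression intact and only the simultaneous presence of both switches the gene off, because the $[A][B]$ term in $Z_{\mathrm{OFF}}$ dominates only when both concentrations are large.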

The XOR gate input function is obtained with the chromatin mechanism as shown in Fig. 2d. In this design, TFs A and B both have two binding sites in the enhancer. One pair of binding sites is only accessible to the TFs in the open chromatin state (binding affinities $G^1_A$, $G^1_B$) and the two sites overlap such that only one TF can be bound at a time. The other pair of binding sites is weak and only accessible to TFs for binding in the closed chromatin state (binding affinities $G^2_A$, $G^2_B$). The two TFs bind to the closed chromatin sites cooperatively (high free energy of interaction $G^2_{AB}$). $Z_{\mathrm{ON}}$ and $Z_{\mathrm{OFF}}$ in this case are given by the following equations

$$Z_{\mathrm{ON}} = e^{-G_T}\left(1 + [A]e^{-G^1_A} + [B]e^{-G^1_B}\right)$$
$$Z_{\mathrm{OFF}} = K + [A]e^{-G^2_A} + [B]e^{-G^2_B} + [A][B]e^{-G^2_A - G^2_B - G^2_{AB}} + 1 + [A]e^{-G^1_A} + [B]e^{-G^1_B} \tag{13}$$

We use the above equations with numerically estimated parameters (see the Methods section) to calculate the normalised transcription rate relative to the maximum transcription rate for the case $[A] \gg 1$, $[B] = 0$ (or equivalently $[A] = 0$, $[B] \gg 1$). Despite our attempt to match the response functions of the contact mechanism XOR gates (single promoter model), the shape of the resulting response functions is slightly different (cf. Fig. 3d and Fig. S1d). However, we argue that the chromatin mechanism’s design mimics an XOR gate better than the contact mechanism’s design. In fact, the response function of the XOR gate of the chromatin mechanism is similar to the response of the contact mechanism XOR gate that involves two promoters (cf. [15]).
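The XOR design in (13) can be explored with a short sketch. The free energies below are illustrative assumptions chosen to match the qualitative description (strong, mutually exclusive open-state sites; weak but highly cooperative closed-state sites); they are not the fitted values, and the function name is hypothetical.

```python
import math


def p_bound_xor(A, B, G1=-4.5, G2=-0.1, G2_AB=-6.2, K=7.5, w_T=1.15):
    """p_B from Eqs. (4) and (13), with G1_A = G1_B = G1 and G2_A = G2_B = G2.

    All default values are illustrative assumptions, not fitted parameters.
    """
    # Open chromatin: overlapping sites, so there is no [A][B] term.
    z_open = 1.0 + A * math.exp(-G1) + B * math.exp(-G1)
    # Closed chromatin: weak single-TF binding, strong cooperativity.
    z_closed = (K
                + A * math.exp(-G2)
                + B * math.exp(-G2)
                + A * B * math.exp(-2.0 * G2 - G2_AB))
    z_on = w_T * z_open
    z_off = z_closed + z_open
    return z_on / (z_on + z_off)


reference = p_bound_xor(1e3, 0.0)           # normalisation used in the text
none = p_bound_xor(0.0, 0.0) / reference
only_b = p_bound_xor(0.0, 1e3) / reference
both = p_bound_xor(1e3, 1e3) / reference
```

Under these assumed values a single TF pulls the enhancer into the open state, whereas two TFs together favour the cooperative closed-state complex and suppress transcription well below the basal level.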

Figure 3 Gene expression response functions for the logic gates of the chromatin mechanism
a For the AND gate response, the normalised rate of transcription is calculated relative to the transcription rate at high TF concentrations $[A] = [B] = 10^3$ ($G_A = G_B = 5.26$, $G_{AB} = -0.70$, $K = 39.4$, $e^{-G_T} = 1.16$)
b For the OR gate type response, the normalised rate of transcription is calculated using (11) relative to the transcription rate at $[A] = [B] = 10^3$ ($G_A = G_B = 3.38$, $G_{AB} = 1.4$, $K = 24.14$, $e^{-G_T} = 2.15$)
c Normalised rate of gene transcription for the NAND response function relative to the transcription rate at $[A] = [B] = 0$ ($G_A = G_B = 1.61$, $G_{AB} = 0$, $K = 19$, $e^{-G_T} = 2000$)
d Normalised rate of transcription from (13) relative to the rate of transcription at $[A] = 10^3$, $[B] = 0$ shows XOR logic ($G^1_A = G^1_B = -4.52$, $G^2_A = G^2_B = -0.10$, $G^2_{AB} = -6.22$, $K = 7.51$, $e^{-G_T} = 1.15$)
Parameters were determined numerically so that the response function for each logic gate is as close as possible to the corresponding response function for the respective logic gate of the contact mechanism

2.3 Sensitivities of logic input functions to free energy values

The logic gate response functions discussed above depend on the values of free energies $G_i$ of TF binding and interactions. Even though the chromatin mechanism is capable of matching the response function gates, it still may possess different sensitivities to parameter variation. To quantify these differences, we calculate the logarithmic sensitivity as follows [25]

$$S_{G_i} = \frac{\partial \log(p_B)}{\partial G_i} \tag{14}$$

Here $p_B$ is the probability of transcription as given by (4) and (11), and the index $i = A, B, AB$ indicates a specific free energy. These free energies are easily affected by mutations in the DNA–TF binding sites. Therefore the sensitivities are important indicators of the evolutionary robustness and adaptability of the transcriptional response.
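One way to evaluate (14) numerically is a central finite difference on $\log(p_B)$. The sketch below applies it to (4) and (11) with the OR gate parameter values quoted in the Fig. 3b caption; the step size `h` and all function names are assumptions of this sketch rather than the procedure used in the paper.

```python
import math


def p_bound(A, B, G_A, G_B, G_AB, K, w_T):
    """p_B from Eqs. (4) and (11); w_T stands for exp(-G_T)."""
    z_open = (1.0
              + A * math.exp(-G_A)
              + B * math.exp(-G_B)
              + A * B * math.exp(-G_A - G_B - G_AB))
    return w_T * z_open / (w_T * z_open + K + z_open)


def log_sensitivity(A, B, params, name, h=1e-6):
    """Central-difference estimate of d log(p_B) / d G_i, Eq. (14)."""
    up = dict(params)
    down = dict(params)
    up[name] += h
    down[name] -= h
    return (math.log(p_bound(A, B, **up))
            - math.log(p_bound(A, B, **down))) / (2.0 * h)


# OR gate parameters as quoted in the Fig. 3b caption.
OR_PARAMS = dict(G_A=3.38, G_B=3.38, G_AB=1.4, K=24.14, w_T=2.15)

s_GA = log_sensitivity(10.0, 10.0, OR_PARAMS, "G_A")
s_GAB = log_sensitivity(10.0, 10.0, OR_PARAMS, "G_AB")
```

Both sensitivities come out negative for activators, since raising a binding or interaction free energy weakens the corresponding complex and lowers $p_B$.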

The AND gate response is most sensitive to the free energies $G_A$, $G_B$ and $G_{AB}$ at high concentrations of TFs A and B, respectively. The sensitivity of the AND gate response to these free energies is similar for the chromatin mechanism and the direct contact mechanism (see Figs. S2a–d).

The sensitivities of the OR gate response to free energies $G_A$, $G_B$ and $G_{AB}$ were calculated using (11). The chromatin mechanism OR gate response is sensitive to $G_A$ and $G_B$ in a larger range of TF concentrations than the direct contact mechanism (see Figs. 4a and b). However, the direct contact mechanism shows more sensitivity to $G_{AB}$ near saturating concentrations of TFs A and B (see Figs. 4c and d).

The sensitivities of the NAND gate response to variations in the TF binding and interaction energies are identical for the two mechanisms. This is expected because the models for the two systems are exactly the same as shown above [see (12) and (13) in the Supplementary information and Figs. S2e–h].

The sensitivities of the XOR gates to various free energies differ significantly between the two mechanisms (see Figs. S2i–l). However, these differences in sensitivity are mainly because of the dissimilarity of the response functions themselves.

In summary, we found that the sensitivities to free energies for AND and NAND logic gate responses of the chromatin mechanism do not differ significantly from the corresponding sensitivities of the contact mechanism. However, the chromatin mechanism OR gate response is more sensitive to free energies of TF-enhancer binding. This suggests that the OR gate response of the chromatin mechanism is more sensitive to mutations in the TF binding sites.

2.4 Parameter estimation for chromatin model from experiments

Statistical thermodynamic models can be used to predict the transcriptional response combinatorial cis-regulatory enhancers have over a range of TF concentrations and quantitatively characterise different designs of gene regulation as shown above. But these models usually have a large number of independent parameters, namely the free energies of all the configurations. Direct measurement of these parameters can be very cumbersome and without the parameter values it is difficult to relate results from these models to experimental information about gene expression. This problem greatly limits the utility of thermodynamic models. In this section, we outline an approach that reduces the dimensions of the unknown parameter space for the chromatin mechanism using experimental measurements of gene expression from enhancer-reporter constructs. As a result, a handful of reporter measurements allow us to quantitatively reconstruct the full transcriptional response function.

To illustrate our approach for parameter estimation, we develop a thermodynamic model of the regulation of Gata2, a gene that regulates the specification and differentiation of HSCs [26–30]. Enforced over-expression and knockout experiments have shown that the control of Gata2 expression has major implications for HSC function [29–32]. Gata2 gene expression is an ideal example for the illustration of our parameter estimation approach because its regulation is dependent on the presence of multiple TFs as well as the chromatin organisation of distant upstream regulatory regions [29, 31, 32]. Moreover, experimental gene expression measurements for the Gata2 enhancer–reporter constructs have recently become available [29].

Figure 4 Sensitivity of OR gate response to variations of free energy values
a and b Sensitivity of the transcription probability to the free energy of TF binding, $G_A$, for the contact mechanism and chromatin mechanism, respectively. The chromatin mechanism has a larger region of high sensitivity than the response of the contact mechanism
c and d Sensitivity of the transcriptional probability to the interaction energy between two TFs, $G_{AB}$, for the contact mechanism and chromatin mechanism, respectively. For both mechanisms, the response is sensitive to $G_{AB}$ only at high TF concentrations. In this region, the response for the contact mechanism is more sensitive

As shown in Fig. 5, Gata2 binds to an enhancer 3 kb upstream (Gata2–3) along with another TF, Fli1, to up-regulate its own transcription [29]. Both TFs enhance Gata2 gene expression [29]; therefore we assume that they bind to the Gata2–3 enhancer only in the open chromatin state. The effect of the Gata2–3 enhancer on gene expression was recently measured experimentally and reported in [29]. The authors cloned the Gata2–3 enhancer upstream of an SV40 promoter controlling a LacZ reporter gene [7, 29]. Thereafter, this construct was integrated into the genome of haematopoietic progenitor cells that show high concentrations of Gata2 and Fli1. The cells were then disrupted and analysed for β-galactosidase activity. Assuming that the reporter protein is stable, the level of β-galactosidase activity in cells with the enhancer–reporter construct is directly proportional to the rate of reporter transcription in these cells.

Figure 5 Application of the parameter estimation method to the Gata2–3 enhancer
a Schematic representation of Gata2–3 and mutant enhancer–reporter constructs. The wild-type (wt) enhancer contains both Gata2 and Fli1 binding sites, Enhancer 1 (E1) contains only a Gata2 binding site and Enhancer 2 (E2) contains only Fli1 binding sites. The numbers show the fold expression enhancement relative to the expression from Enhancer 3 (E3), which does not have any TF binding sites (data taken from [7, 29]). These measurements are used in (20)–(22) to calculate the parameters of the Gata2 response function
b Gata2 enhancer response function shows AND type logic for a chromatin equilibrium constant of $K = 300$. We have normalised the Gata2 and Fli1 concentrations with the respective wild-type concentrations and the white lines demarcate this physiologically relevant range of TF concentrations. Note that the transcription rates are normalised relative to the minimum transcription rate at [Gata2] = [Fli1] = 0. The fold change under over-expression of Gata2 and Fli1 is $\approx K$
c Sensitivity of the Gata2 response to the value of the chromatin equilibrium constant $K$. The Gata2 response function is not sensitive to the value of $K$ within the range of wild-type concentrations of TFs (demarcated by white lines). However, the response is sensitive to the value of the chromatin equilibrium constant when TFs are overexpressed

The measured rate of transcription of the reporter $I^G$ is proportional to the probability $p_B$ that the promoter is bound by the transcriptional machinery [see (4)]. Since RNA polymerase binds typical core promoters very weakly [14, 33–35] we find from (8)–(10) that

$$Z_{\mathrm{ON}} \ll Z_{\mathrm{OFF}} \tag{15}$$

Accordingly, we keep only $Z_{\mathrm{OFF}}$ in the denominator of (4) for $p_B$ to obtain

$$p_B = \frac{e^{-G_T}\left(1 + e^{-G_G} + e^{-G_F} + e^{-G_{FG}}\right)}{K_G + 1 + e^{-G_G} + e^{-G_F} + e^{-G_{FG}}} \tag{16}$$

Here $G_G$, $G_F$ and $G_{FG}$ represent the free energies of the Gata2-bound, Fli1 dimer-bound and Gata2–Fli1 dimer-bound enhancer configurations, respectively. $G_{FG}$ includes the binding affinities $G_G$, $G_F$ as well as the free energy of the Gata2–Fli1 protein–protein interaction. These free energies follow the definition in (6) and include entropic contributions from wild-type concentrations of Gata2 and Fli1, and the concentrations of Gata2 and Fli1 are normalised with these wild-type concentrations. $K_G$ represents the equilibrium constant for transitions between open and closed chromatin. The general idea behind the approach is that if the binding site of a TF is mutated or deleted, then the binding of that TF to the mutated enhancer becomes energetically unfavourable and the corresponding terms are excluded from both the numerator and denominator in the expression for $p_B$. This allows us to compute one of the remaining free energies from the ratio of transcription rates of reporters with wild-type and mutated enhancers.

Fig. 5a shows the fold expression enhancement for the reporter construct in the presence of the wild-type (wt) Gata2–3 enhancer and three reduced versions of this enhancer: Enhancer 1 (E1) – Fli1 binding sites deleted, Enhancer 2 (E2) – Gata2 binding site deleted, Enhancer 3 (E3) – all binding sites deleted. All the experimental data have been abstracted from [7, 29]. The fold change in gene expression for different enhancer–reporter constructs was calculated by normalising the level of β-galactosidase activity of cells with enhancer–reporter constructs with the β-galactosidase activity levels of cells with enhancerless–reporter constructs. Therefore fold enhancements of gene expression reflect the ratio of $p_B$ in the presence and absence of enhancers. Note in (16) that the factors $e^{-G_T}$ will cancel as ratios of transcription rates are computed.

We use the equations above to relate the free energies of different configurations to the fold enhancement of gene expression. The enhancer E1 can only bind Gata2. Accordingly, the ratio of transcription rates $I^G_{E1}/I^G_{E3}$ depends only on the free energy $G_G$ of TF Gata2 and the equilibrium constant $K_G$

$$\frac{I^G_{E1}}{I^G_{E3}} = \frac{p^{E1}_B}{p^{E3}_B} = \frac{\left(e^{-G_G} + 1\right)/\left(K_G + e^{-G_G} + 1\right)}{1/\left(K_G + 1\right)} \tag{17}$$

Similarly, only Fli1 can bind to the enhancer E2 and the rate of gene transcription from this enhancer relative to the expression rate from E3 only depends upon $G_F$ and $K_G$ (see (18)).

$$\frac{I^G_{E2}}{I^G_{E3}} = \frac{p^{E2}_B}{p^{E3}_B} = \frac{\left(e^{-G_F} + 1\right)/\left(K_G + e^{-G_F} + 1\right)}{1/\left(K_G + 1\right)} \tag{18}$$

The ratio of transcription rates from the wild-type Gata2–3 enhancer and the enhancer E3 is easily constructed using (16) and this ratio depends on the free energies $G_G$, $G_F$ and $G_{FG}$ (see (19)).

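Equations (17) and (18) are solved analytically for $G_G$ and $G_F$ in terms of the equilibrium constant $K_G$.

$$G_G = -\log\left(\frac{\left(I^G_{E1}/I^G_{E3} - 1\right)\left(K_G + 1\right)}{K_G + 1 - I^G_{E1}/I^G_{E3}}\right) \tag{20}$$

$$G_F = -\log\left(\frac{\left(I^G_{E2}/I^G_{E3} - 1\right)\left(K_G + 1\right)}{K_G + 1 - I^G_{E2}/I^G_{E3}}\right) \tag{21}$$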

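These solutions are used in (19) to solve for $G_{FG}$ as a function of only $K_G$.

$$G_{FG} = -\log\left(\frac{\left(I^G_{wt}/I^G_{E3} - 1\right)\left(K_G + 1\right)}{K_G + 1 - I^G_{wt}/I^G_{E3}} - \frac{\left(I^G_{E1}/I^G_{E3} - 1\right)\left(K_G + 1\right)}{K_G + 1 - I^G_{E1}/I^G_{E3}} - \frac{\left(I^G_{E2}/I^G_{E3} - 1\right)\left(K_G + 1\right)}{K_G + 1 - I^G_{E2}/I^G_{E3}}\right) \tag{22}$$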

Thus, using experimental data from [7, 29] for fold enhancement of gene expression in (20)–(22), we reduce the dimensions of the parameter space to one. If we know $K_G$ we can uniquely determine the free energies $G_G$, $G_F$ and $G_{FG}$. The unknown parameter $K_G$ can only be experimentally determined through overexpression of one of the TFs but these data are currently unavailable. We assume an appropriate value for $K_G$ to calculate the free energies and the response function of the Gata2–3 enhancer. The response function is shown in Fig. 5b and indicates that the Gata2–3 enhancer functions as an asymmetric AND gate. Cooperative binding of Fli1 and Gata2 predicted from our estimations ensures that a high expression level is achieved only in the vicinity of maximal concentrations of both TFs.

$$\frac{I^G_{wt}}{I^G_{E3}} = \frac{p^{wt}_B}{p^{E3}_B} = \frac{\left(e^{-G_G} + e^{-G_F} + e^{-G_{FG}} + 1\right)/\left(K_G + e^{-G_G} + e^{-G_F} + e^{-G_{FG}} + 1\right)}{1/\left(K_G + 1\right)} \tag{19}$$
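The estimation procedure can be illustrated with a minimal Python sketch. It solves the fold-enhancement relations (17)–(19) for the Boltzmann weights $e^{-G}$ and then takes logarithms. The function names and the E1 and E2 fold values are hypothetical and chosen only for illustration; only the wild-type fold enhancement of 120 and the choice $K_G = 300$ come from the text.

```python
import math


def fold_enhancement(weight, K_G):
    """Forward model of (17)-(19): fold enhancement relative to E3.

    `weight` is the summed Boltzmann weight of all TF-bound open configurations.
    """
    return (weight + 1.0) * (K_G + 1.0) / (K_G + weight + 1.0)


def boltzmann_weight(fold, K_G):
    """Invert the forward model for the summed Boltzmann weight."""
    if not 1.0 < fold < K_G + 1.0:
        raise ValueError("fold enhancement must lie between 1 and K_G + 1")
    return (fold - 1.0) * (K_G + 1.0) / (K_G + 1.0 - fold)


def estimate_free_energies(fold_E1, fold_E2, fold_wt, K_G):
    """Return (G_G, G_F, G_FG) from three fold enhancements and an assumed K_G."""
    x_G = boltzmann_weight(fold_E1, K_G)      # e^{-G_G}, from the E1 construct
    x_F = boltzmann_weight(fold_E2, K_G)      # e^{-G_F}, from the E2 construct
    x_FG = boltzmann_weight(fold_wt, K_G) - x_G - x_F  # remainder from wild type
    if x_FG <= 0.0:
        raise ValueError("data are inconsistent with a positive e^{-G_FG}")
    return -math.log(x_G), -math.log(x_F), -math.log(x_FG)


# Hypothetical E1 and E2 fold values; 120 and K_G = 300 are quoted in the text.
G_G, G_F, G_FG = estimate_free_energies(fold_E1=5.0, fold_E2=10.0,
                                        fold_wt=120.0, K_G=300.0)
print(G_G, G_F, G_FG)
```

Feeding the estimated free energies back into `fold_enhancement` should reproduce the input fold values, which gives a quick consistency check on any data set.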

As the exact value of $K_G$ is unknown, we explore the sensitivity of the Gata2 response to the value of the chromatin equilibrium constant. The logarithmic sensitivity was calculated with (14) and is shown in Fig. 5c. We found that the AND logic property of the response function is not sensitive to the value of the equilibrium constant $K_G$ (see Fig. 5c and Fig. S3). We can choose any value from the range $K_G > I^G_{wt}/I^G_{E3} = 120$ (here we choose $K_G = 300$). Note that the maximum fold change in gene expression is approximately $K_G$, thus showing that the chromatin equilibrium constant can be measured by overexpression of TFs. The choice of this equilibrium constant can also affect the dynamic properties of the transcriptional response. In the construction of dynamical ODE-type models, the value of the chromatin equilibrium constant may also be constrained by qualitative phenotypic requirements (cf. [7]).
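The bound on the fold change follows directly from the ratio in (19). As a short worked step, write $S$ for the summed Boltzmann weight of the TF-bound configurations (a symbol introduced here only for convenience):

```latex
\frac{I^G_{wt}}{I^G_{E3}} = \frac{(1 + S)(K_G + 1)}{K_G + 1 + S},
\qquad S = e^{-G_G} + e^{-G_F} + e^{-G_{FG}},
\qquad
\lim_{S \to \infty} \frac{(1 + S)(K_G + 1)}{K_G + 1 + S} = K_G + 1 \approx K_G
\quad (K_G \gg 1)
```

Because the free energies include the entropic contribution of the TF concentrations, overexpression increases $S$ and drives the fold change towards this limit. For any finite $S$ the fold change stays below $K_G + 1$, which is why the assumed $K_G$ must exceed the measured wild-type fold enhancement.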

2.5 Comparison of stochastic kinetics of gene regulation by direct contact and chromatin mechanisms

So far we have focused on the steady-state transcriptional response for combinatorial gene regulation via the chromatin mechanism and found that the chromatin mechanism can mimic the transcriptional response of the contact mechanism when the effect of the TFs is symmetric. In such cases, it might be difficult to distinguish the two mechanisms based on steady-state measurements of gene expression levels. However, the two mechanisms can still be distinguished based upon the differences in their dynamics. Recent advances in single-molecule experimental techniques offer a wealth of data about the dynamics of transcriptional regulation in single cells [36–38]. In this section, we will use a simple example to show how single-molecule experimental data about the dynamics of gene regulation can be used to infer the mechanism of gene regulation. For simplicity, we use a ‘toy model’ with a single transcription activator to demonstrate two differences in the microscopic kinetics of these two models that can be experimentally observed.

Consider a gene that is regulated by a single TF A that binds to a distant enhancer. TF A up-regulates gene expression via direct physical contact with the transcriptional machinery or by shifting the equilibrium of local chromatin structure to an open conformation in which the promoter is accessible to the transcriptional machinery. The probability of gene transcription in thermodynamic equilibrium for both mechanisms can be calculated using the framework discussed in Section 2.1. Note that throughout this section, the superscripts con and chr denote the direct contact mechanism and the chromatin mechanism, respectively. For the contact mechanism, the probability of transcription $p^{\mathrm{con}}_B$ is calculated using (4)

$$p^{\mathrm{con}}_B = \frac{e^{-G^{\mathrm{con}}_T}\left(1 + \omega [A]/K^{\mathrm{con}}_A\right)}{1 + [A]/K^{\mathrm{con}}_A + e^{-G^{\mathrm{con}}_T}\left(1 + \omega [A]/K^{\mathrm{con}}_A\right)} \tag{23}$$

Here $K^{\mathrm{con}}_A$ is the dissociation equilibrium constant of TF-enhancer binding and $\omega$ represents the strength of TF–transcriptional machinery interaction. The probability of transcription $p^{\mathrm{chr}}_B$ for the chromatin mechanism is also calculated using (4)

$$p^{\mathrm{chr}}_B = \frac{e^{-G^{\mathrm{chr}}_T}\left(1 + [A]/K^{\mathrm{chr}}_A\right)}{K + 1 + [A]/K^{\mathrm{chr}}_A + e^{-G^{\mathrm{chr}}_T}\left(1 + [A]/K^{\mathrm{chr}}_A\right)} \tag{24}$$

Similar to the contact model, $K^{\mathrm{chr}}_A$ represents the dissociation equilibrium constant of TF A binding to the enhancer and $G^{\mathrm{chr}}_T$ represents the free energy of transcription machinery binding. Note that there is no interaction energy between the TF and transcriptional machinery in the chromatin mechanism. It can easily be shown that the probabilities of transcription $p^{\mathrm{con}}_B$ and $p^{\mathrm{chr}}_B$ are equal for all TF concentrations if the parameters are chosen according to

$$K = \omega - 1, \qquad K^{\mathrm{chr}}_A = K^{\mathrm{con}}_A/\omega, \qquad G^{\mathrm{chr}}_T = G^{\mathrm{con}}_T - \log(\omega) \tag{25}$$

When these three conditions are satisfied, the steady-state rate of gene expression is the same for both mechanisms. However, there are still differences in the kinetics of the binding and dissociation of the transcriptional machinery in these two mechanisms.
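The equivalence stated above can be checked numerically. The following Python sketch evaluates (23) and (24) with chromatin parameters derived from the contact parameters through (25). Here `omega` stands for the TF–transcriptional machinery interaction strength of (23); the function names and the numerical parameter values are illustrative assumptions, not values from the paper.

```python
import math


def p_contact(A, K_con, G_con, omega):
    """Transcription probability for the contact mechanism, as in (23)."""
    a = A / K_con
    w = math.exp(-G_con)
    return w * (1.0 + omega * a) / (1.0 + a + w * (1.0 + omega * a))


def p_chromatin(A, K_chr, G_chr, K):
    """Transcription probability for the chromatin mechanism, as in (24)."""
    a = A / K_chr
    w = math.exp(-G_chr)
    return w * (1.0 + a) / (K + 1.0 + a + w * (1.0 + a))


def matched_chromatin_parameters(K_con, G_con, omega):
    """Apply the three conditions of (25); returns (K, K_chr, G_chr)."""
    return omega - 1.0, K_con / omega, G_con - math.log(omega)


# Illustrative parameter values only.
K_con, G_con, omega = 2.0, 3.0, 20.0
K, K_chr, G_chr = matched_chromatin_parameters(K_con, G_con, omega)
for A in (0.0, 0.5, 2.0, 50.0):
    print(A, p_contact(A, K_con, G_con, omega), p_chromatin(A, K_chr, G_chr, K))
```

The two columns agree for every concentration, so steady-state measurements alone cannot separate the two mechanisms under these conditions.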

Figs. 6a and b show the kinetic schemes for the direct contact and chromatin mechanisms, respectively. The model for the direct contact mechanism involves four configurations of the regulatory region – empty (O), TF bound (OA), transcriptional machinery bound (OR) and both TF and transcriptional machinery bound (OAR). The model for the chromatin mechanism involves five configurations: closed chromatin (C), open chromatin-empty (O), TF bound (OA), transcriptional machinery bound (OR) and TF and transcriptional machinery bound (OAR). We assume that only the rate constants of TF and transcriptional machinery dissociation from DNA are affected by their respective affinities for the binding sites. Using this assumption and (25), the rate constants of TF/transcriptional machinery binding and dissociation reactions are set to the values shown in Figs. 6a and b. Additionally, in the chromatin model $k_o$ and $k_c$, the rate constants of spontaneous closed to open chromatin (C to O) and open to closed chromatin (O to C) transitions, respectively, are related to the chromatin equilibrium constant defined in (25) as $K = k_c/k_o$.


We use the methods discussed in [39] to calculate the probability density functions (PDFs) of waiting times in the transcriptional machinery bound (ON) states and unbound states (OFF) for the two mechanisms (see the Methods section). $P_{\mathrm{ON}}(t)$ represents the probability density that the first exit from the ON state lies in the interval $(t_0 + t, t_0 + t + \Delta t)$, given that the system entered the ON state at time $t_0$. Similarly $P_{\mathrm{OFF}}(t)$ represents the PDF of the first exit times from the OFF state. The fraction of time spent in the ON state is directly related to the rate of transcription. The ON and OFF times are related to transcriptionally active and transcriptionally inactive states and therefore may be obtained from the time-series data of single-cell gene expression. In addition, several groups have already shown that the time spent in transcriptional machinery bound and unbound states can be tracked in vivo by adding fluorescent protein binding hairpin loops to the mRNA tail end [40–42] or by localisation enhancement that can detect protein binding and dissociation from a specific location [37, 38]. The waiting time distributions $P_{\mathrm{ON}}(t)$ and $P_{\mathrm{OFF}}(t)$ can be determined from these types of experiments and qualitative features of these distributions can be used to determine whether the gene regulation mechanism involves direct interactions between TFs and transcriptional machinery.

Figure 6 Waiting time distributions for transcriptional machinery bound and unbound states
a and b Kinetic schemes of the direct contact and chromatin mechanisms for transcriptional regulation by a single activator. O, OA, OR and OAR denote empty enhancer, activator bound, transcriptional machinery bound and activator + transcriptional machinery bound configurations of the enhancer-gene locus in both mechanisms. In addition, these four configurations of the chromatin mechanism represent the open chromatin configurations whereas the closed chromatin configuration is denoted by C. The transcription machinery-bound configurations OR and OAR together represent the ON state for both mechanisms. All other configurations are part of the OFF state. The rate constants of the transitions between different states are shown above the respective arrows (see the text for details)
c PDF for waiting times in the ON state for the contact mechanism (dashed line) shows two different timescales whereas the PDF for the chromatin mechanism (solid line) shows only one timescale. The time axis is normalised by the timescale of dissociation of the transcriptional machinery in the chromatin mechanism $(k^R_d/\omega)^{-1}$
d PDF for the waiting time in the OFF state for the chromatin mechanism (solid line) shows three timescales whereas the PDF for the contact mechanism (dashed line) shows only one. The time axis is normalised by the timescale of binding of the transcriptional machinery $(k_R)^{-1}$
e Mean waiting times in the ON state $\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle$, $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle$. $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle$ (solid line) is not a function of TF concentration whereas $\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle$ (dashed line) increases with an increase in TF concentration
f Mean waiting times in the OFF state $\langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle$, $\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle$. Waiting time $\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle$ decreases with an increase in TF concentration whereas $\langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle$ is independent of TF concentration

Fig. 6c shows the PDF for time spent in the ON state for the half-saturated TF concentration $[A] = K^{\mathrm{con}}_A$. The waiting time PDF $P^{\mathrm{chr}}_{\mathrm{ON}}(t)$ for the chromatin model depends on only one rate constant of dissociation of transcriptional machinery, $k^R_d/\omega$

$$P^{\mathrm{chr}}_{\mathrm{ON}}(t) = \frac{k^R_d}{\omega}\, e^{-k^R_d t/\omega} \tag{26}$$

This happens because the rate of transcriptional machinery dissociation is the same for the OR and OAR states without direct interactions with the activator. In contrast, for the contact mechanism the rate of exit from the ON state depends on whether the system is in sub-state OR or OAR because the rate of exit from the two states is different. Accordingly, the PDF $P^{\mathrm{con}}_{\mathrm{ON}}(t)$ is a sum of two exponential terms

$$P^{\mathrm{con}}_{\mathrm{ON}}(t) = w_1 r_1 e^{-r_1 t} + (1 - w_1) r_2 e^{-r_2 t} \tag{27}$$

$r_1$ and $r_2$ represent two different timescales for the waiting time in the ON state and $w_1 \in [0, 1]$ is a weighting factor that represents the probability of observing the $r_1$ timescale. The weighting factors and $P^{\mathrm{con}}_{\mathrm{ON}}(t)$ can be modulated with the TF concentration. In the absence of TF A ($[A] \to 0$) the probability of being in the substate OAR is zero. As a result, $w_1 \to 1$ and $r_1 \to k^R_d$, resulting in an exponential decay of $P^{\mathrm{con}}_{\mathrm{ON}}(t)$ with the characteristic rate $k^R_d$. At saturating levels of A ($[A] \to \infty$), the substate OAR dominates, $w_1 \to 0$ and $r_2 \to k^R_d/\omega$, and this results in an exponential decay of $P^{\mathrm{con}}_{\mathrm{ON}}(t)$ with the characteristic rate $k^R_d/\omega$. The contact model assumes a strong interaction of transcriptional machinery and the activator resulting in separation of these timescales ($\omega > 1$). For intermediate concentrations of TF A two timescales are visible (see Fig. 6c). The range of concentrations for which this timescale separation can be observed depends on the strength of the TF–transcriptional machinery interaction $\omega$ and $k^R_d/k^A_d$.

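The mean waiting times in the ON state are as follows

$$\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle = \frac{w_1}{r_1} + \frac{1 - w_1}{r_2}, \qquad \langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle = \frac{\omega}{k^R_d} \tag{28}$$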

Notably $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle$ is independent of the TF concentration whereas $\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle$ increases with TF concentration till $\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle = \langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle$ at saturating concentrations of A (see Fig. 6e). Moreover, because there are no TF–transcriptional machinery interactions in the chromatin model, $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle$ has only one timescale and is independent of TF concentration. This result will hold even when multiple TFs bind to the enhancer to regulate gene expression (not shown).
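The concentration dependence of the mean ON time can be sketched with a small first-passage calculation in Python. The rate assignment below is an assumption reconstructed from the text, because the rate labels of Fig. 6 are not reproduced here: the machinery leaves OR at rate `k_R_d` and OAR at `k_R_d / omega`, the activator binds at `k_A * A` and leaves OAR at `k_A_d / omega`, and entry into the ON state is weighted by the equilibrium occupancy of O and OA. Here `omega` is the interaction strength of (23), and all names and default values are illustrative.

```python
def mean_on_time_contact(A, k_A=1.0, k_A_d=1.0, k_R_d=1.0, omega=20.0):
    """Mean dwell time in the ON state {OR, OAR} of the contact scheme."""
    a = k_A * A              # OR -> OAR (activator binding)
    b = k_A_d / omega        # OAR -> OR (activator dissociation, assumed slowed)
    exit_R = k_R_d           # OR -> OFF (machinery dissociation)
    exit_AR = k_R_d / omega  # OAR -> OFF (machinery dissociation, slowed)

    # Mean first-passage times to OFF, from the two coupled linear equations
    # (a + exit_R) * T_R = 1 + a * T_AR and (b + exit_AR) * T_AR = 1 + b * T_R.
    denom = a * exit_AR + b * exit_R + exit_R * exit_AR
    t_from_OR = (a + b + exit_AR) / denom
    t_from_OAR = (a + b + exit_R) / denom

    # Entry weights: binding rate is the same from O and OA, so entry into
    # OR or OAR is proportional to the occupancy of O or OA (ratio 1 : [A]/K_A).
    occupancy = A * k_A / k_A_d
    p_enter_OAR = occupancy / (1.0 + occupancy)
    return (1.0 - p_enter_OAR) * t_from_OR + p_enter_OAR * t_from_OAR


def mean_on_time_chromatin(k_R_d=1.0, omega=20.0):
    """Mean ON time for the chromatin scheme, as in (28)."""
    return omega / k_R_d


for A in (0.0, 0.1, 1.0, 10.0, 100.0):
    print(A, mean_on_time_contact(A), mean_on_time_chromatin())
```

Under these assumed rates the contact value starts at `1 / k_R_d` without activator and rises towards the constant chromatin value `omega / k_R_d` at saturation, matching the qualitative behaviour described for Fig. 6e.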

The situation is different for the distribution of transcriptionally inactive states (Fig. 6d). In this case, the waiting time distribution in the OFF state for the contact model is exponential

$$P^{\mathrm{con}}_{\mathrm{OFF}}(t) = k_R e^{-k_R t} \tag{29}$$

This is a consequence of the assumption that the binding rate of the polymerase does not change with the presence of an activator (only the dissociation rate does). Therefore the decay of the PDF is determined by the rate constant of transcriptional machinery binding, $k_R$. In contrast, up to three distinct timescales can be present in the waiting time distribution in the OFF state for the chromatin mechanism (Fig. 6d, solid line). The three different timescales of the chromatin mechanism are reflected in the PDF $P^{\mathrm{chr}}_{\mathrm{OFF}}(t)$, which consists of three exponentials

$$P^{\mathrm{chr}}_{\mathrm{OFF}}(t) = c_1 r_1 e^{-r_1 t} + c_2 r_2 e^{-r_2 t} + (1 - c_1 - c_2) r_3 e^{-r_3 t} \tag{30}$$

Here $r_1$, $r_2$ and $r_3$ represent three different timescales for the waiting time in the OFF state and $c_1, c_2 \in (0, 1)$ are weighting factors that represent the probabilities of the $r_1$ and $r_2$ timescales, respectively. $P^{\mathrm{chr}}_{\mathrm{OFF}}(t)$ is modulated by changing the TF concentration. In the absence of TF A there are only two timescales. The fast timescale corresponds to the direct exit from the open state (O) and the slow timescale involves switching between the open and closed (C) chromatin states before exiting from the open state. As the TF concentration is increased, three timescales become visible owing to the presence of an OA state. At saturating concentrations of A the equilibrium of the open and closed chromatin states shifts almost completely towards the open state. As a result, the chromatin mechanism resembles the contact mechanism and there is only a single timescale ($1/k_R$) in the waiting time distribution.

$$P^{\mathrm{con}}_{\mathrm{OFF}}(t) = k_R e^{-k_R t} \tag{31}$$

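The mean waiting times in the OFF state are as follows

$$\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle = \frac{c_1}{r_1} + \frac{c_2}{r_2} + \frac{1 - c_1 - c_2}{r_3}, \qquad \langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle = \frac{1}{k_R} \tag{32}$$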

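We find that $\langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle$ is independent of TF concentrations whereas $\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle$ decreases with TF concentration till $\langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle = \langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle$ at saturating levels of TF A (see Fig. 6f). Note that while changing the TF concentration affects only $\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle$ in the chromatin mechanism and only $\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle$ in the contact mechanism, the fractional time spent in the ON state is the same for both mechanisms: $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle/(\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle + \langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle) = \langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle/(\langle t^{\mathrm{con}}_{\mathrm{ON}} \rangle + \langle t^{\mathrm{con}}_{\mathrm{OFF}} \rangle)$. This fractional time in the ON state is proportional to the probability of transcription $p_B$. Because we assumed that the probability of transcription is the same for both mechanisms, this result shows that our analysis is self-consistent.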
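The self-consistency argument also yields a compact expression for the mean OFF time of the chromatin mechanism. The following worked step is a sketch under one assumption: the fraction of time spent in the ON state is taken to be equal to $p^{\mathrm{chr}}_B$ of (24). Here $\omega$ is the interaction strength of (23) and $\langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle = \omega/k^R_d$ is taken from (28).

```latex
\begin{aligned}
\langle t^{\mathrm{chr}}_{\mathrm{OFF}} \rangle
  &= \langle t^{\mathrm{chr}}_{\mathrm{ON}} \rangle \,
     \frac{1 - p^{\mathrm{chr}}_B}{p^{\mathrm{chr}}_B} \\
  &= \frac{\omega}{k^R_d}\, e^{G^{\mathrm{chr}}_T}\,
     \frac{K + 1 + [A]/K^{\mathrm{chr}}_A}{1 + [A]/K^{\mathrm{chr}}_A}
\end{aligned}
```

The last factor falls monotonically from $K + 1$ at $[A] = 0$ to 1 as $[A] \to \infty$, so under this assumption the mean OFF time decreases with TF concentration by an overall factor of $K + 1$, in line with the trend described for Fig. 6f.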

Our results show that qualitative differences in the ON and OFF state waiting time distributions can be used to identify the biophysical mechanism of gene regulation. Although a relatively simple model was chosen to illustrate the effect, many of the results can be generalised for combinatorial regulation by multiple TFs. A more detailed investigation of the waiting-time distributions will be reported elsewhere.

3 Discussion

Our results show that the chromatin mechanism and the direct contact mechanism are capable of creating functionally similar logic transcriptional gates. Notably, the NAND gate is a universal gate that can be used to create any logic gate. Moreover, transcriptional logic gates are continuous functions that can be adapted to more complicated combinatorial operations by tuning the TF binding affinities through binding site mutations as shown by the sensitivity analysis. This adaptability suggests that virtually any response function can be constructed with a combination of different logic gates and appropriate manipulation of TF binding affinities.

Although chromatin and direct contact mechanisms can show functionally equivalent transcriptional responses, the designs of regulatory elements for any transcriptional input function are very different between the two mechanisms. These differences in design may have important implications.

The chromatin mechanism is more flexible in the design of enhancers. Specific interactions between TFs and the transcriptional machinery are unnecessary to produce the same response as the contact mechanism. The only requirement is that the chromatin structure at the enhancer and the gene unpack together. This allows considerable flexibility in enhancer location because chromatin domains as large as several kilobases in length can open as a whole [43]. It is important to note that according to the chromatin mechanism, enhancers can act in a non-specific manner to activate the transcription of genes in the neighbourhood of the target genes. In fact, this type of non-specific transcriptional activation is consistent with a number of reports regarding the effects of distant enhancers and locus control regions in eukaryotes [44–46]. Each TF binds to the enhancer and disturbs the equilibrium between open and closed chromatin states and changes the probability of binding of other TFs in a manner similar to the Monod–Wyman–Changeux model for allosteric enzymes [47]. As a result, physical TF–TF interactions are unnecessary for cooperativity between TFs. For example, the NAND gate response functions of the two mechanisms are identical [cf. equations (22) and (23) in Supplementary information] but the response equation for the contact mechanism has an explicit TF–TF interaction term whereas the equation for the chromatin mechanism does not. The effective cooperativity that emerges from the equilibrium between open and closed chromatin in this case is equivalent to an effective free energy of interaction (see Supplementary information). The emergence of cooperativity without direct physical interaction between TFs means that any two DNA binding proteins can be used as TFs under the chromatin mechanism.
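The origin of this effective cooperativity can be illustrated with a simple worked example. It is a sketch under stated assumptions and is not taken from the Supplementary information: two TFs A and B bind independently and only to open chromatin, with Boltzmann weights $a$ and $b$ (symbols introduced here), and closed chromatin carries the weight $K$ as in (24).

```latex
Z = \underbrace{K}_{\text{closed}} + \underbrace{1 + a + b + ab}_{\text{open}}
  = (K + 1)\left[\, 1 + a' + b' + (K + 1)\, a' b' \,\right],
\qquad a' = \frac{a}{K + 1}, \quad b' = \frac{b}{K + 1}
```

If the closed and open-empty configurations are lumped into a single unbound state, the doubly bound configuration carries an extra factor $K + 1$ relative to independent binding. Under these assumptions this corresponds to an effective interaction free energy of $-\log(K + 1)$, even though no direct TF–TF contact is present.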

In contrast, the direct contact mechanism restricts the location of binding sites for transcriptional regulation. First, transcriptional repression requires binding sites in the promoter vicinity. For example, in the NAND gate of the contact model [15], binding sites for repressors A and B must be in the promoter region so that they can occlude the RNA polymerase binding site. Second, for all contact gates the free energies of DNA looping and TF–transcriptional machinery interaction affect the possible enhancer location. Third, each response function requires specific domains on TFs that are responsible for the appropriate TF–transcriptional machinery interactions.

Both the contact mechanism and chromatin mechanism can be utilised for combinatorial gene regulation in higher organisms and it might be necessary to investigate the particulars of the mechanism for each gene. Our models suggest several experimental designs to distinguish between the alternatives. Although the two mechanisms are functionally equivalent within the operating range of TF concentrations, we could distinguish the two through forced over-expression of any one of the TFs. Saturating concentrations of any TF will show the same level of gene expression for the chromatin mechanism. On the other hand, expression rates at saturating concentrations of different TFs might be different for the direct contact mechanism. Another method involves shifting the position of the enhancer relative to the promoter. Regulation by the contact mechanism is sensitive to such translocations because the free energy of the enhancer-bound TFs and promoter-bound transcriptional machinery depends on the distance between them. Regulation by the chromatin mechanism will likely be unaffected by translocation of the enhancer because local accessibility of DNA at the enhancer can be propagated over long distances (several kb) to establish an open chromatin state [43]. Interestingly, the sensitivities of the AND and NAND logic gate designs to free energies of TF-enhancer binding are not very different for the two mechanisms. However, the chromatin mechanism OR gate is more sensitive to TF-enhancer binding free energies. This increased sensitivity of the OR gate response suggests that this design is more sensitive to binding site mutations than the equivalent design of the contact mechanism. The thermodynamic approach that we have developed allows us to characterise gene expression input functions based on a handful of transcriptional reporter measurements. From this perspective, gene regulation via the chromatin mechanism is easier to quantify because it does not involve binding energies between TFs and the transcriptional machinery and therefore involves fewer parameters. In this case, the method that we have proposed can use experimental results directly in parameter estimation.

Although the chromatin mechanism and the direct contact mechanism can produce functionally equivalent time-averaged transcriptional responses, there are intrinsic differences between the two mechanisms that nevertheless lead to differences in the stochastic kinetics of gene expression. We have shown that the chromatin mechanism can be distinguished from the contact mechanism based on single-molecule gene expression data. The chromatin mechanism can easily be identified from such data from the single characteristic timescale in the PDF of time spent in the transcriptionally active state. On the other hand, multiple timescales are present in the PDF of time spent in the transcriptionally inactive state. We have also found that the mean waiting time in the ON state is independent of TF concentration for the chromatin mechanism. These distinguishing dynamical properties highlight the irreducible differences between the two mechanisms. Although these results were obtained using a somewhat oversimplified model of transcriptional activation, we expect our observations to hold even for more complex kinetic schemes. This will be a subject of a separate investigation.

We also note that transcriptional regulators using the chromatin mechanism have potentially promising applications in synthetic biology and genetic engineering. At present, synthetic biology circuits use simple promoter architectures with a single regulator to control gene transcription. This clearly limits the transcriptional response of the gene and functional properties of the circuits. This limitation exists because combinatorial gene regulation that follows the contact mechanism requires specialised TFs with appropriately interacting domains. At the same time, cis-regulatory modules of living systems, especially eukaryotes, are typically dauntingly complex. The increase in complexity of gene regulation is associated with the evolutionary emergence of complex multicellular organisms [2, 48]. Moreover, the increase in proteins that control chromatin structure and nucleosome remodelling correlates well with the increase in complexity of cis-control elements in metazoans [2]. This adoption of the chromatin mechanism of gene regulation in higher organisms reflects the advantages of the flexibility in design of complex combinatorial regulation. Synthetic designs of combinatorial regulation based on the chromatin mechanism can harness this flexibility to avoid the limitations of the contact mechanism regulation. The designs of logic gates with combinatorial regulation via the chromatin mechanism that we have discussed in this paper are only an indication of how this mechanism can help simplify the design of synthetic circuits for any transcriptional response function.

4 Methods

4.1 Calculation of waiting times in transcriptional machinery bound (ON) and unbound (OFF) states

The methods discussed in [39] were used to calculate the PDF of the time spent in the transcriptional machinery bound (ON) and unbound (OFF) states for the contact and chromatin mechanism kinetic schemes. The dynamics for either mechanism in the ON state can be described by the following system of ODEs

$$\frac{d}{dt} G = (H - U)\, G \tag{33}$$

Here $H$ is a rate matrix such that each element $H_{ij}$ represents the rate constant of the $j \to i$ transition and $H_{ii} = -\sum_{j \neq i} H_{ji}$. $U_{ij} = K_{ij}\ \forall\, i \neq j$ where the $j \to i$ transition represents the dissociation of the transcriptional machinery from the regulatory region. All remaining elements of $U$ are set to zero. Similarly, $V_{ij} = K_{ij}\ \forall\, i \neq j$ where the $j \to i$ transition represents the binding of transcriptional machinery to the regulatory region. Given the initial conditions, $G_j(t)$ is the probability that the system has reached state $j$ at time $t$ without the dissociation of transcriptional machinery, given that the system was initially in the ON state.

We used the equations above to calculate the PDF of the time spent in the ON state, $P_{ON}(t)$:

$$P_{ON}(t) = (\mathbf{U}\mathbf{G}(t))^{\dagger}\,\mathbf{p}_{in}, \quad \text{where} \quad \mathbf{p}_{in} = \frac{\mathbf{V}\mathbf{p}_{ss}}{\mathbf{1}^{\dagger}\mathbf{V}\mathbf{p}_{ss}} \qquad (34)$$

Here $p_{ss}(j)$, the vector of steady-state probabilities of each state, is computed using (2) and the partition function for each mechanism (see Section 2.5). The probability $p_{in}(j)$ of entering the ON substate $j$ is a weighting factor for the calculation of the ON state PDF ($\dagger$ represents transpose and $\mathbf{1}$ represents a unit vector). Similarly, the OFF state PDF can be calculated as

$$P_{OFF}(t) = (\mathbf{V}\mathbf{G}(t))^{\dagger}\,\mathbf{p}_{in}, \quad \text{where} \quad \mathbf{p}_{in} = \frac{\mathbf{U}\mathbf{p}_{ss}}{\mathbf{1}^{\dagger}\mathbf{U}\mathbf{p}_{ss}} \qquad (35)$$

We used (34) to calculate the PDF for ON state waiting times for both the chromatin and contact mechanisms:

$$P^{chr}_{ON}(t) = \frac{k^R_d}{\omega}\, e^{-k^R_d t/\omega} \qquad (36)$$

(see (37) for $P^{con}_{ON}(t)$)

where $w_1 \in (0, 1)$ is a weighting factor. These equations show that the ON state PDF for the chromatin mechanism has only one timescale, whereas the PDF for the contact mechanism has two timescales: $1/r_1$ and $1/r_2$.

Similarly, we used (35) to calculate the PDF for OFF state waiting times for both the contact and chromatin mechanisms:

$$P^{con}_{OFF}(t) = k^R e^{-k^R t} \qquad (38)$$

$$P^{chr}_{OFF}(t) = w_1 r_1 e^{-r_1 t} + w_2 r_2 e^{-r_2 t} + (1 - w_1 - w_2)\, r_3 e^{-r_3 t} \qquad (39)$$

where $w_1, w_2 \in (0, 1)$ are weighting factors. Equations (38) and (39) show that the waiting time distribution of the OFF state in the contact model has only one timescale, $(k^R)^{-1}$, whereas the PDF of the chromatin mechanism has three timescales (see Supplementary information for details).

The moments of these waiting time distributions can be easily calculated from the PDFs.
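As a worked illustration of this last statement (a sketch that relies only on the sum-of-exponentials form of the PDFs above; the moment order $n$ and the index $i$ are symbols introduced here), the moments follow from the standard integral $\int_0^\infty t^n r e^{-rt}\,dt = n!/r^n$:

```latex
\begin{align}
P(t) &= \sum_i w_i\, r_i\, e^{-r_i t}, \qquad \sum_i w_i = 1 \\
\langle t^n \rangle &= \int_0^\infty t^n P(t)\, dt = n! \sum_i \frac{w_i}{r_i^{\,n}} \\
\langle t \rangle^{chr}_{ON} &= \frac{\omega}{k^R_d}, \qquad
\langle t \rangle^{con}_{OFF} = \frac{1}{k^R}, \qquad
\langle t \rangle^{con}_{ON} = \frac{w_1}{r_1} + \frac{1 - w_1}{r_2}
\end{align}
```

The single-timescale cases (36) and (38) correspond to one term with weight 1, so their mean waiting times are simply the inverse rates.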

4.2 Construction of logic gates for the chromatin mechanism

We define the response function for the logic gates as

$$f^{i}([A], [B]) = \frac{p_B([A], [B])}{\max(p_B)} \qquad (40)$$

where $f^{i}$ is the normalised rate of gene expression in the presence of TFs A and B ($i$ = con for the contact mechanism; $i$ = chr for the chromatin mechanism) relative to the maximum rate of expression. Parameters and equations for the contact mechanism logic gates were taken from [15]. The expressions for $p^{chr}_B$ for each of the logic gates were derived using (4) and the appropriate expressions for $Z_{ON}$ and $Z_{OFF}$ from Section 2.2.

$$P^{con}_{ON}(t) = w_1 r_1 e^{-r_1 t} + (1 - w_1)\, r_2 e^{-r_2 t}$$

$$r_{1,2} = \frac{1}{2}\left( \frac{k^A_d}{\omega} + k^A[A] + k^R_d + \frac{k^R_d}{\omega} \pm \sqrt{\left( \frac{k^A_d}{\omega} + k^A[A] + k^R_d + \frac{k^R_d}{\omega} \right)^2 - 4\left( \frac{k^R_d k^A_d}{\omega} + \frac{k^R_d k^A[A]}{\omega} + \frac{(k^R_d)^2}{\omega} \right)} \right) \qquad (37)$$

$f^{con}$ and $f^{chr}$ were used to construct an objective function $S$ that represents the mean square difference between the response functions of the two mechanisms:

$$S = \iint \left( f^{con}([A], [B]) - f^{chr}([A], [B]) \right)^2 \, d\log[A] \, d\log[B] \qquad (41)$$

The parameters for the AND, OR and XOR gates of the chromatin mechanism were estimated numerically by minimising $S$. The fminimax library routine of MATLAB was used to solve the non-linear optimisation.

Parameters of the NAND gate of the chromatin mechanism were derived analytically from the parameters of the NAND gate of the contact mechanism. See Supplementary information for the details.

5 Acknowledgments

The authors would like to thank Dr. Bertie Gottgens and Dr. Aileen Smith for useful discussions regarding the regulation of Gata2. We also wish to thank J. Christian J. Ray and Abhinav Tiwari for their comments on the theoretical aspects of this study. J.N. and O.A.I. are supported by Rice University startup funds and NSF award MCB-0845919.

6 References

[1] DAVIDSON E.H., RAST J.P., OLIVIERI P., ET AL.: ‘A genomic regulatory network for development’, Science, 2002, 295, (5560), pp. 1669–1678

[2] LEVINE M., TJIAN R.: ‘Transcription regulation and animal diversity’, Nature, 2003, 424, (6945), pp. 147–151

[3] BLACKWOOD E.M., KADONAGA J.T.: ‘Going the distance: a current view of enhancer action’, Science, 1998, 281, (5373), pp. 60–63

[4] PTASHNE M., GANN A.: ‘Transcriptional activation by recruitment’, Nature, 1997, 386, (6625), pp. 569–577

[5] POLACH K.J., WIDOM J.: ‘Mechanism of protein access to specific DNA sequences in chromatin: a dynamic equilibrium model for gene regulation’, J. Mol. Biol., 1995, 254, (2), pp. 130–149

[6] RAVEH-SADKA T., LEVO M., SEGAL E.: ‘Incorporating nucleosomes into thermodynamic models of transcription regulation’, Genome Res., 2009, 19, (8), pp. 1480–1496

[7] NARULA J., SMITH A.M., GOTTGENS B., IGOSHIN O.A.: ‘Modeling reveals bistability and low-pass filtering in the network module determining stem cell fate’, PLoS Comput. Biol., 2009, 6, (5), pp. e1000771

[8] PIMANDA J.E., DONALDSON I.J., DE BRUJIN M.F., ET AL.: ‘The SCL transcriptional network and BMP signaling pathway interact to regulate RUNX1 activity’, Proc. Natl. Acad. Sci. USA, 2007, 104, (3), pp. 840–845

[9] WALTERS M.C., FIERING S., EIDEMILLER J., ET AL.: ‘Enhancers increase the probability but not the level of gene expression’, Proc. Natl. Acad. Sci. USA, 1995, 92, (15), pp. 7125–7129

[10] KHOURY G., GRUSS P.: ‘Enhancer elements’, Cell, 1983, 33, (2), pp. 313–314

[11] CHAVES M., ALBERT R., SONTAG E.D.: ‘Robustness and fragility of Boolean models for genetic regulatory networks’, J. Theoret. Biol., 2005, 235, (3), pp. 431–449

[12] KAUFFMAN S., PETERSON C., SAMUELSSON B., TROEIN C.: ‘Random Boolean network models and the yeast transcriptional network’, Proc. Natl. Acad. Sci. USA, 2003, 100, (25), pp. 14796–14799

[13] LASLO P., SPOONER C.J., WARMFLASH A., ET AL.: ‘Multilineage transcriptional priming and determination of alternate hematopoietic cell fates’, Cell, 2006, 126, (4), pp. 755–766

[14] BINTU L., BUCHLER N.E., GARCIA H.G., ET AL.: ‘Transcriptional regulation by the numbers: models’, Curr. Opin. Genet. Dev., 2005, 15, (2), pp. 116–124

[15] BUCHLER N.E., GERLAND U., HWA T.: ‘On schemes of combinatorial transcription logic’, Proc. Natl. Acad. Sci. USA, 2003, 100, (9), pp. 5136–5141

[16] SHEA M.A., ACKERS G.K.: ‘The OR control system of bacteriophage lambda. A physical-chemical model for gene regulation’, J. Mol. Biol., 1985, 181, (2), pp. 211–230

[17] HAN M., GRUNSTEIN M.: ‘Nucleosome loss activates yeast downstream promoters in vivo’, Cell, 1988, 55, (6), pp. 1137–1145

[18] LORCH Y., LAPOINTE J.W., KORNBERG R.D.: ‘Nucleosomes inhibit the initiation of transcription but allow chain elongation with the displacement of histones’, Cell, 1987, 49, (2), pp. 203–210


[19] LI G., LEVITUS M., BUSTAMANTE C., WIDOM J.: ‘Rapid spontaneous accessibility of nucleosomal DNA’, Nat. Struct. Mol. Biol., 2005, 12, (1), pp. 46–53

[20] FELSENFELD G.: ‘Chromatin as an essential part of the transcriptional mechanism’, Nature, 1992, 355, (6357), pp. 219–224

[21] OWEN-HUGHES T., WORKMAN J.L.: ‘Remodeling the chromatin structure of a nucleosome array by transcription factor-targeted trans-displacement of histones’, Embo J., 1996, 15, (17), pp. 4702–4712

[22] PTASHNE M., JEFFREY A., JOHNSON A.D., ET AL.: ‘How the lambda repressor and cro work’, Cell, 1980, 19, (1), pp. 1–11

[23] REEVES R.: ‘Molecular biology of HMGA proteins: hubs of nuclear function’, Gene, 2001, 277, (1–2), pp. 63–81

[24] STRAUSS F., VARSHAVSKY A.: ‘A protein binds to a satellite DNA repeat at three specific sites that would be brought into mutual proximity by DNA folding in the nucleosome’, Cell, 1984, 37, (3), pp. 889–901

[25] SAVAGEAU M.A.: ‘Biochemical systems analysis: a study of function and design in molecular biology’ (Addison-Wesley Pub. Co., Reading, MA, 1976)

[26] CURTIS D.J., HALL M.A., VAN STEKELENBURG L.J., ET AL.: ‘SCL is required for normal function of short-term repopulating hematopoietic stem cells’, Blood, 2004, 103, (9), pp. 3342–3348

[27] KOBAYASHI-OSAKI M., OHNEDA O., SUZUKI N., ET AL.: ‘GATA motifs regulate early hematopoietic lineage-specific expression of the Gata2 gene’, Mol. Cell Biol., 2005, 25, (16), pp. 7005–7020

[28] LUGUS J.J., CHUNG Y.S., MILLS J.C., ET AL.: ‘GATA2 functions at multiple steps in hemangioblast development and differentiation’, Development, 2007, 134, (2), pp. 393–405

[29] PIMANDA J.E., OTTERSBACH K., KNEZEVIC K., ET AL.: ‘Gata2, Fli1, and Scl form a recursively wired gene-regulatory circuit during early hematopoietic development’, Proc. Natl. Acad. Sci. USA, 2007, 104, (45), pp. 17692–17697

[30] RODRIGUES N.P., JANZEN V., FORKERT R., ET AL.: ‘Haploinsufficiency of GATA-2 perturbs adult hematopoietic stem-cell homeostasis’, Blood, 2005, 106, (2), pp. 477–484

[31] BRESNICK E.H., MARTOWICZ M.L., PAL S., JOHNSON K.D.: ‘Developmental control via GATA factor interplay at chromatin domains’, J. Cell Physiol., 2005, 205, (1), pp. 1–9

[32] GRASS J.A., BOYER M.E., PAL S., ET AL.: ‘GATA-1-dependent transcriptional repression of GATA-2 via disruption of positive autoregulation and domain-wide chromatin remodeling’, Proc. Natl. Acad. Sci. USA, 2003, 100, (15), pp. 8811–8816

[33] KUHLMAN T., ZHANG Z., SAIER JR. M.H., HWA T.: ‘Combinatorial transcriptional control of the lactose operon of Escherichia coli’, Proc. Natl. Acad. Sci. USA, 2007, 104, (14), pp. 6043–6048

[34] FAKHOURI W.D., AY A., SAYAL R., ET AL.: ‘Deciphering a transcriptional regulatory code: modeling short-range repression in the Drosophila embryo’, Mol. Syst. Biol., 2010, 6, p. 341

[35] ZINZEN R.P., SENGER K., LEVINE M., PAPATSENKO D.: ‘Computational models for neurogenic gene expression in the Drosophila embryo’, Curr. Biol., 2006, 16, (13), pp. 1358–1365

[36] CAI L., FRIEDMAN N., XIE X.S.: ‘Stochastic protein expression in individual cells at the single molecule level’, Nature, 2006, 440, (7082), pp. 358–362

[37] ELF J., LI G.W., XIE X.S.: ‘Probing transcription factor dynamics at the single-molecule level in a living cell’, Science, 2007, 316, (5828), pp. 1191–1194

[38] YU J., XIAO J., REN X., LAO K., XIE X.S.: ‘Probing gene expression in live cells, one protein molecule at a time’, Science, 2006, 311, (5767), pp. 1600–1603

[39] GOPICH I.V., SZABO A.: ‘Theory of the statistics of kinetic transitions with application to single-molecule enzyme catalysis’, J. Chem. Phys., 2006, 124, (15), p. 154712

[40] CHUBB J.R., TRCEK T., SHENOY S.M., SINGER R.H.: ‘Transcriptional pulsing of a developmental gene’, Curr. Biol., 2006, 16, (10), pp. 1018–1025

[41] GOLDING I., PAULSSON J., ZAWILSKI S.M., COX E.C.: ‘Real-time kinetics of gene activity in individual bacteria’, Cell, 2005, 123, (6), pp. 1025–1036

[42] PROSHKIN S., RAHMOUNI A.R., MIRONOV A., NUDLER E.: ‘Cooperation between translating ribosomes and RNA polymerase in transcription elongation’, Science, 2010, 328, (5977), pp. 504–508

[43] JENUWEIN T., FORRESTER W.C., FERNANDEZ-HERRERO L.A., ET AL.: ‘Extension of chromatin accessibility by nuclear matrix attachment regions’, Nature, 1997, 385, (6613), pp. 269–272


[44] FRASER P., GROSVELD F.: ‘Locus control regions, chromatin activation and transcription’, Curr. Opin. Cell. Biol., 1998, 10, (3), pp. 361–365

[45] GROSVELD F.: ‘Activation by locus control regions?’, Curr. Opin. Genet. Dev., 1999, 9, (2), pp. 152–157

[46] LI Q., PETERSON K.R., FANG X., STAMATOYANNOPOULOS G.: ‘Locus control regions’, Blood, 2002, 100, (9), pp. 3077–3086

[47] MONOD J., WYMAN J., CHANGEUX J.P.: ‘On the nature of allosteric transitions: a plausible model’, J. Mol. Biol., 1965, 12, pp. 88–118

[48] CHEN K., RAJEWSKY N.: ‘The evolution of gene regulation by transcription factors and microRNAs’, Nat. Rev. Genet., 2007, 8, (2), pp. 93–103


Detailed Methods

1. Calculation of waiting times in transcriptional machinery bound (ON) and unbound (OFF) states

The methods discussed in [1] were used to calculate the probability distribution function (PDF) of the time spent in the transcriptional machinery bound (ON) and unbound (OFF) states for the contact and chromatin mechanism kinetic schemes. The dynamics for either mechanism can be described by a system of ordinary differential equations. Let $\mathbf{H}$ be a rate matrix such that each element $H_{ij}$ represents the rate constant of the $j \to i$ transition and $H_{ii} = -\sum_{j \neq i} H_{ji}$.

To obtain the PDF of the time spent in the ON state we first require the probability that the system does not release the bound transcriptional machinery for time $t$ given that the transcriptional machinery binds to the regulatory region at $t = 0$. Given that the system starts in the ON sub-state $j$, $G_i(t)$, the probability that the system is in the ON sub-state $i$ at time $t$ without releasing the transcriptional machinery, is given by the rate equation:

$$\frac{d\mathbf{G}}{dt} = (\mathbf{H} - \mathbf{U})\,\mathbf{G} \qquad (1)$$

with the initial conditions $G_j(0) = 1$, $G_k(0) = 0 \;\forall k \neq j$. Here $\mathbf{U}$ is the matrix of rate constants such that $U_{ij} = H_{ij}$ for all $i \neq j$ for which the $j \to i$ transition represents the dissociation of the transcriptional machinery from the regulatory region. All remaining elements of $\mathbf{U}$ are set to zero. Similarly we can define a matrix $\mathbf{V}$ such that $V_{ij} = H_{ij}$ for all $i \neq j$ for which the $j \to i$ transition represents the binding of transcriptional machinery to the regulatory region. The solution of equation (1) is $\mathbf{G}(t) = \exp((\mathbf{H} - \mathbf{U})t)\,\mathbf{G}(0)$. The PDF of spending time $\tau$ in the ON state depends on three factors: 1) the probability of entering the ON substate $j$ at time $t_0$, 2) the probability of going from substate $j \to i$ in time $\tau$ and 3) the probability of exiting the ON state from ON substate $i$ at time $t_0 + \tau$. This PDF can be calculated as:

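$$P_{ON}(\tau) = (\mathbf{U}\mathbf{G}(\tau))^{\dagger}\,\mathbf{p}_{in}, \quad \text{where} \quad \mathbf{p}_{in} = \frac{\mathbf{V}\mathbf{p}_{ss}}{\mathbf{1}^{\dagger}\mathbf{V}\mathbf{p}_{ss}} \qquad (2)$$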

Here $p_{ss}(j)$, the vector of steady-state probabilities of each state, is computed using equation (2) of the main text and the partition function for each mechanism (see section 2.5 of the main text). The probability $p_{in}(j)$ of entering the ON substate $j$ is a weighting factor for the calculation of the ON state PDF ($\dagger$ represents transpose and $\mathbf{1}$ represents a unit vector). Similarly the OFF state PDF can also be calculated:

$$P_{OFF}(\tau) = (\mathbf{V}\mathbf{G}(\tau))^{\dagger}\,\mathbf{p}_{in}, \quad \text{where} \quad \mathbf{p}_{in} = \frac{\mathbf{U}\mathbf{p}_{ss}}{\mathbf{1}^{\dagger}\mathbf{U}\mathbf{p}_{ss}} \qquad (3)$$

For the contact mechanism we use the following numbering of sub-states: O (1), OA (2), OR (3) and OAR (4). For Figure 6(a), the matrices $\mathbf{H}$, $\mathbf{U}$ and $\mathbf{V}$ for the contact mechanism are:

$$\mathbf{H} = \begin{bmatrix} -(k^R + k^A[A]) & k^A_d & k^R_d & 0 \\ k^A[A] & -(k^A_d + k^R) & 0 & k^R_d/\omega \\ k^R & 0 & -(k^R_d + k^A[A]) & k^A_d/\omega \\ 0 & k^R & k^A[A] & -(k^A_d + k^R_d)/\omega \end{bmatrix} \qquad (4)$$

$$\mathbf{U} = \begin{bmatrix} 0 & 0 & k^R_d & 0 \\ 0 & 0 & 0 & k^R_d/\omega \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad (5)$$

$$\mathbf{V} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ k^R & 0 & 0 & 0 \\ 0 & k^R & 0 & 0 \end{bmatrix} \qquad (6)$$
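The construction of these matrices, the steady-state vector and the entry probabilities can be sketched numerically as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the matrix entries follow the convention that element $(i, j)$ is the rate of the $j \to i$ transition, and the parameter values are those quoted later in this section for the main-text calculations.

```python
import numpy as np


def contact_matrices(kA, kAd, kR, kRd, omega, A):
    """Return H, U, V for sub-states O (0), OA (1), OR (2), OAR (3).

    Convention: M[i, j] is the rate constant of the j -> i transition.
    """
    a = kA * A  # pseudo-first-order binding rate of TF A
    H = np.array([
        [-(kR + a), kAd,          kRd,        0.0],
        [a,         -(kAd + kR),  0.0,        kRd / omega],
        [kR,        0.0,          -(kRd + a), kAd / omega],
        [0.0,       kR,           a,          -(kAd + kRd) / omega],
    ])
    U = np.zeros((4, 4))   # dissociation of the transcriptional machinery
    U[0, 2] = kRd          # OR  -> O
    U[1, 3] = kRd / omega  # OAR -> OA
    V = np.zeros((4, 4))   # binding of the transcriptional machinery
    V[2, 0] = kR           # O  -> OR
    V[3, 1] = kR           # OA -> OAR
    return H, U, V


def steady_state(H):
    """Solve H p = 0 with sum(p) = 1 by replacing one balance equation."""
    system = H.copy()
    system[-1, :] = 1.0
    rhs = np.zeros(H.shape[0])
    rhs[-1] = 1.0
    return np.linalg.solve(system, rhs)


def entry_distribution(flux_matrix, p_ss):
    """p_in = M p_ss / (1^T M p_ss), with M = V (entering ON) or U (entering OFF)."""
    flux = flux_matrix @ p_ss
    return flux / flux.sum()


H, U, V = contact_matrices(kA=0.001, kAd=0.001, kR=0.001, kRd=0.1, omega=20.0, A=1.0)
p_ss = steady_state(H)
p_in_on = entry_distribution(V, p_ss)    # weights for entering OR and OAR
p_in_off = entry_distribution(U, p_ss)   # weights for entering O and OA
```

Each column of `H` sums to zero, which is a quick consistency check on the sign and index conventions.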

To find $P^{con}_{ON}(\tau)$, the PDF of waiting time in the ON state for the contact model, equations (1), (2) and (4)-(6) were first used with the Laplace transform to solve for $\tilde{P}^{con}_{ON}(s)$:

$$\tilde{P}^{con}_{ON}(s) = \frac{k^R_d k^A_d \left( s + \dfrac{k^A_d + k^R_d}{\omega} \right) + \dfrac{k^R_d k^A[A]}{\omega} \left( s + k^A[A] + k^R_d + 2k^A_d \right)}{\left( k^A_d + k^A[A] \right) \left( \left( s + k^A[A] + k^R_d \right) \left( s + \dfrac{k^A_d + k^R_d}{\omega} \right) - \dfrac{k^A[A]\, k^A_d}{\omega} \right)} \qquad (7)$$

The inverse Laplace transform of $\tilde{P}^{con}_{ON}(s)$ shows that the PDF $P^{con}_{ON}(\tau)$ is a sum of two exponential terms:

$$P^{con}_{ON}(\tau) = w_1 r_1 e^{-r_1 \tau} + (1 - w_1)\, r_2 e^{-r_2 \tau}$$

$$r_{1,2} = \frac{1}{2}\left( \frac{k^A_d}{\omega} + k^A[A] + k^R_d + \frac{k^R_d}{\omega} \pm \sqrt{\left( \frac{k^A_d}{\omega} + k^A[A] + k^R_d + \frac{k^R_d}{\omega} \right)^2 - 4\left( \frac{k^R_d k^A_d}{\omega} + \frac{k^R_d k^A[A]}{\omega} + \frac{(k^R_d)^2}{\omega} \right)} \right) \qquad (8)$$

where $w_1 \in (0,1)$ is a weighting factor. Equation (8) shows that the PDF has two timescales: $1/r_1$ and $1/r_2$. The point of separation of these two timescales, $\tau_s$, is the point at which the two exponential terms are equal:

$$\tau_s = \frac{1}{r_1 - r_2} \log\left( \frac{w_1 r_1}{(1 - w_1)\, r_2} \right) \qquad (9)$$
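The two rates, the weighting factor and the separation point can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it restricts the dynamics to the two ON sub-states (OR, OAR), takes the entry probabilities in proportion to the equilibrium occupancies of O and OA, and uses the parameter values quoted for the main-text calculations. All variable names are introduced here.

```python
import numpy as np

# Parameter values quoted in the text for the main-text calculations
kA, kAd, kR, kRd, omega, A = 0.001, 0.001, 0.001, 0.1, 20.0, 1.0
a = kA * A

# (H - U) restricted to the ON sub-states OR (0) and OAR (1)
M = np.array([[-(a + kRd), kAd / omega],
              [a, -(kAd + kRd) / omega]])
u = np.array([kRd, kRd / omega])        # exit rates from OR and OAR
p_in = np.array([kAd, a]) / (kAd + a)   # entry weights, proportional to p_ss(O), p_ss(OA)

# Closed-form rates from equation (8)
T = kAd / omega + a + kRd + kRd / omega
D = kRd * kAd / omega + kRd * a / omega + kRd**2 / omega
root = np.sqrt(T**2 - 4.0 * D)
r1, r2 = 0.5 * (T + root), 0.5 * (T - root)

# Numerical rates and weights from the eigen-decomposition of M
lam, vecs = np.linalg.eig(M)
order = np.argsort(lam.real)            # most negative eigenvalue first
rates = -lam.real[order]
vecs = vecs.real[:, order]
coef = np.linalg.solve(vecs, p_in)      # expansion of p_in in eigenvectors
amps = (u @ vecs) * coef                # P_ON(tau) = sum_i amps[i] * exp(-rates[i] * tau)
weights = amps / rates                  # w_1 and 1 - w_1
w1 = weights[0]

# Separation point from equation (9)
tau_s = np.log(w1 * r1 / ((1.0 - w1) * r2)) / (r1 - r2)
```

With these values the two rates differ by roughly a factor of ω, which is why the two timescales are well separated.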

Using equations (1), (3) and (4)-(6) we can calculate $\tilde{P}^{con}_{OFF}(s)$, the Laplace transform of the OFF state waiting time distribution:

$$\tilde{P}^{con}_{OFF}(s) = \frac{k^R}{s + k^R} \qquad (10)$$

The inverse Laplace transform of $\tilde{P}^{con}_{OFF}(s)$ shows that the PDF $P^{con}_{OFF}(\tau)$ has a single exponential term:

$$P^{con}_{OFF}(\tau) = k^R e^{-k^R \tau} \qquad (11)$$

Thus the waiting time distribution of the OFF state in the contact model has only one timescale: $(k^R)^{-1}$.

We use the same method to calculate the waiting time distributions for the chromatin mechanism. We use the following numbering of sub-states: C (1), O (2), OA (3), OR (4) and OAR (5). The matrices $\mathbf{H}$, $\mathbf{U}$ and $\mathbf{V}$ for the chromatin mechanism are:

$$\mathbf{H} = \begin{bmatrix} -k_o & k_c & 0 & 0 & 0 \\ k_o & -(k_c + k^R + k^A[A]) & k^A_d/\omega & k^R_d/\omega & 0 \\ 0 & k^A[A] & -(k^A_d/\omega + k^R) & 0 & k^R_d/\omega \\ 0 & k^R & 0 & -(k^R_d/\omega + k^A[A]) & k^A_d/\omega \\ 0 & 0 & k^R & k^A[A] & -(k^A_d/\omega + k^R_d/\omega) \end{bmatrix} \qquad (12)$$

$$\mathbf{U} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & k^R_d/\omega & 0 \\ 0 & 0 & 0 & 0 & k^R_d/\omega \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad (13)$$

$$\mathbf{V} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & k^R & 0 & 0 & 0 \\ 0 & 0 & k^R & 0 & 0 \end{bmatrix} \qquad (14)$$

Using equations (1), (2) and (12)-(14) we solve for the Laplace transform of the ON state waiting time distribution $\tilde{P}^{chr}_{ON}(s)$:

$$\tilde{P}^{chr}_{ON}(s) = \frac{k^R_d/\omega}{s + k^R_d/\omega} \qquad (15)$$

The inverse Laplace transform of $\tilde{P}^{chr}_{ON}(s)$ shows that the PDF $P^{chr}_{ON}(\tau)$ has a single exponential term:

$$P^{chr}_{ON}(\tau) = \frac{k^R_d}{\omega}\, e^{-k^R_d \tau/\omega} \qquad (16)$$
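The single timescale can also be seen without the Laplace transform. The short derivation below is a sketch: the survival probability $S(\tau)$ is a symbol introduced here, and the argument assumes, as in the matrix $\mathbf{U}$ for the chromatin mechanism, that both ON sub-states (OR and OAR) release the transcriptional machinery with the same rate $k^R_d/\omega$.

```latex
\begin{align}
S(\tau) &= \sum_{i \in \{OR,\,OAR\}} G_i(\tau), \qquad S(0) = 1 \\
\frac{dS}{d\tau} &= -\frac{k^R_d}{\omega}\, S(\tau)
  \quad \text{(transitions between OR and OAR cancel in the sum)} \\
S(\tau) &= e^{-k^R_d \tau/\omega} \\
P^{chr}_{ON}(\tau) &= -\frac{dS}{d\tau} = \frac{k^R_d}{\omega}\, e^{-k^R_d \tau/\omega}
\end{align}
```

Because the exit rate is identical in every ON sub-state, the result does not depend on the entry probabilities or on the TF binding kinetics. The same reasoning applies to the OFF state of the contact mechanism, where both OFF sub-states bind the machinery with rate $k^R$.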

Again using equations (1), (3) and (12)-(14) we can calculate $\tilde{P}^{chr}_{OFF}(s)$, the Laplace transform of the OFF state waiting time distribution:

$$\tilde{P}^{chr}_{OFF}(s) = \frac{N(s)}{D(s)}$$

$$N(s) = k^R \left[ \left( k^A[A] + \frac{k^A_d}{\omega} \right) \left( s + k^R + k^A[A] + \frac{k^A_d}{\omega} \right) (s + k_o) + k^A[A]\, k_c\, s \right]$$

$$D(s) = \left( k^A[A] + \frac{k^A_d}{\omega} \right) \left[ (s + k^R) \left( s + k^R + k^A[A] + \frac{k^A_d}{\omega} \right) (s + k_o) + k_c\, s \left( s + k^R + \frac{k^A_d}{\omega} \right) \right] \qquad (17)$$

The inverse Laplace transform of $\tilde{P}^{chr}_{OFF}(s)$ shows that the PDF $P^{chr}_{OFF}(\tau)$ is a sum of three exponential terms:

$$P^{chr}_{OFF}(\tau) = w_1 r_1 e^{-r_1 \tau} + w_2 r_2 e^{-r_2 \tau} + (1 - w_1 - w_2)\, r_3 e^{-r_3 \tau} \qquad (18)$$

where $w_1, w_2 \in (0,1)$ are weighting factors that represent the probabilities associated with the timescales $1/r_1$ and $1/r_2$ respectively. The moments of these waiting time distributions can be easily calculated from the PDFs.
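The three timescales can be illustrated numerically from the OFF sub-states (C, O, OA) alone. The sketch below rests on several assumptions that are not stated in the text: the opening and closing rates `k_o` and `k_c` are hypothetical values chosen only so that `k_c / k_o = 19`, the TF dissociation rate in the open state is taken as $k^A_d/\omega$, and the entry weights are taken in proportion to the equilibrium occupancies of OR and OAR.

```python
import numpy as np

# Values quoted in the text; k_o and k_c are hypothetical illustrative values
kA, kAd, kR, omega, A = 0.001, 0.001, 0.001, 20.0, 1.0
k_o, k_c = 0.01, 0.19
a = kA * A          # TF binding rate in the open state
b = kAd / omega     # assumed TF dissociation rate in the open state

# (H - V) restricted to the OFF sub-states C (0), O (1), OA (2)
M = np.array([
    [-k_o, k_c, 0.0],
    [k_o, -(k_c + kR + a), b],
    [0.0, a, -(b + kR)],
])
u = np.array([0.0, kR, kR])             # machinery binds only from O and OA
p_in = np.array([0.0, b, a]) / (a + b)  # entry weights into O and OA

lam, vecs = np.linalg.eig(M)
order = np.argsort(lam.real)            # fastest rate first
rates = -lam.real[order]                # r_1, r_2, r_3
vecs = vecs.real[:, order]
coef = np.linalg.solve(vecs, p_in)
amps = (u @ vecs) * coef                # P_OFF(tau) = sum_i amps[i] * exp(-rates[i] * tau)
weights = amps / rates                  # w_1, w_2 and 1 - w_1 - w_2
mean_off = np.sum(weights / rates)      # first moment of the OFF waiting time


def p_off(tau):
    """Evaluate the three-exponential OFF-state PDF at one or more times."""
    tau = np.atleast_1d(tau)
    return np.exp(-np.outer(tau, rates)) @ amps
```

The mean OFF time exceeds $1/k^R$ because time spent in the closed state C delays rebinding of the machinery.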

For the calculations shown in the main text the rate of binding of TF A was assumed to be near the diffusion limit, $k^A = 0.001\ \mathrm{nM^{-1}\,s^{-1}}$ [2], and a typical value was chosen for the TF dissociation constant, $K_A = k^A_d/k^A = 1\ \mathrm{nM}$ [3, 4]. The binding rate constant for the transcriptional machinery was also assumed to be diffusion limited, $k^R = 0.001\ \mathrm{s^{-1}}$ (note that this is a first-order rate constant, unlike $k^A$) [2, 5, 6]. Binding of the transcriptional machinery to core promoters is typically weak [4, 7], so we assumed that $k^R_d = 0.1\ \mathrm{s^{-1}}$. The chromatin equilibrium constant $K$ and the strength of TF-transcriptional machinery interactions $\omega$ are both known to be in the range 10-1000 [8, 9], so we assumed $K + 1 = \omega = 20$. The concentration of TF A was chosen to ensure that the response is not saturated, $[A] = 1\ \mathrm{nM}$.

2. Construction of logic gates for the chromatin mechanism

Parameters for the contact mechanism logic gates were taken from Ref. [10]. The parameters for each logic gate of the chromatin mechanism were chosen to ensure that the response of this design was as close as possible to the response of the corresponding contact mechanism logic gate.

The response function for the logic gates is given by

$$f^{i}([A],[B]) = \frac{p_B([A],[B])}{\max(p_B)} \qquad (19)$$

where $f^{i}$ is the normalized rate of gene expression in the presence of TFs A and B ($i$ = con for the contact mechanism; $i$ = chr for the chromatin mechanism) relative to the maximum rate of expression.

For the AND, OR and XOR gates the parameters were estimated numerically by minimizing the square of the difference between the response functions of each logic gate of the two mechanisms. The normalized rate of gene expression for the AND logic of the contact mechanism ($f^{con}_{AND}$) was calculated by using equation (4) of the main text for the probability of transcription $p^{con}_B$. In a similar fashion, the normalized rate of gene expression for the AND logic of the chromatin mechanism ($f^{chr}_{AND}$) was calculated by substituting $Z_{ON}$ and $Z_{OFF}$ from equation (11) of the main text in equation (4) of the main text to find $p^{chr}_B$.

$f^{con}_{AND}$ and $f^{chr}_{AND}$ were used to construct the objective function $S$:

$$S = \iint \left( f^{con}_{AND}([A],[B]) - f^{chr}_{AND}([A],[B]) \right)^2 \, d\log[A] \, d\log[B] \qquad (20)$$

To approximate this integral, the range of concentrations $(1, 10^3)$ was discretized into 40 log-uniform intervals for each TF. The square of the difference between the normalized transcription rates of the two mechanisms was calculated at each combination of TF concentrations ($[A]_i, [B]_j$; $i, j = 1, 2, \ldots, 40$) and summed to construct the following discrete version of the objective function:

$$S = \sum_{i,j} \left( f^{con}_{AND}([A]_i,[B]_j) - f^{chr}_{AND}([A]_i,[B]_j) \right)^2 \qquad (21)$$

The parameters $K$, $e^{-G_T}$, $G_A$, $G_B$ and $G_{AB}$ for the chromatin mechanism were chosen to minimize $S$. The fminimax library routine of MATLAB was used to solve the nonlinear optimization. The parameters for the OR gate and XOR gate were estimated using similar objective functions. Normalized transcription rates for these gates can be easily calculated with the equations listed in section 2.3.
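The discretisation can be sketched as follows. This is an illustration only: the function names are introduced here, the grid uses 40 log-uniformly spaced points per TF, and the two response surfaces `f_ref` and `f_trial` are hypothetical placeholders rather than the gate response functions of either mechanism. The MATLAB fminimax call itself is not reproduced; any general-purpose minimiser could be applied to the returned value.

```python
import numpy as np


def log_uniform_grid(lo=1.0, hi=1.0e3, n=40):
    """Concentrations spaced uniformly on a logarithmic scale between lo and hi."""
    return np.logspace(np.log10(lo), np.log10(hi), n)


def discrete_objective(f_con, f_chr, grid_a, grid_b):
    """Sum of squared differences between two response functions over the grid."""
    A, B = np.meshgrid(grid_a, grid_b, indexing="ij")
    diff = f_con(A, B) - f_chr(A, B)
    return float(np.sum(diff**2))


# Hypothetical AND-like response surfaces, used only to exercise the objective
def f_ref(A, B):
    return (A * B) / ((30.0 + A) * (30.0 + B))


def f_trial(A, B):
    return (A * B) / ((50.0 + A) * (50.0 + B))


grid = log_uniform_grid()
S_same = discrete_objective(f_ref, f_ref, grid, grid)
S_diff = discrete_objective(f_ref, f_trial, grid, grid)
```

Because the grid is uniform in the logarithm of concentration, the plain sum approximates the integral over $d\log[A]\,d\log[B]$ up to a constant factor, which does not change the location of the minimum.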

Parameters of the NAND gate of the chromatin mechanism were derived analytically from the parameters of the NAND gate of the contact mechanism. The probability of transcription is highest when $[A] = [B] = 0$ in the case of NAND logic. Numerical methods were not necessary for estimating parameters of the NAND gate because there are no TF-transcriptional machinery interactions in the contact model of the NAND gate (see Ref. [10]). The probabilities $p^{con}_B$ and $p^{chr}_B$ for the NAND gate of the contact mechanism and the NAND gate of the chromatin mechanism are given by:

$$p^{con}_B = \frac{e^{-G^{con}_T}}{1 + e^{-G^{con}_A}[A] + e^{-G^{con}_B}[B] + e^{-G^{con}_A - G^{con}_B - G^{con}_{AB}}[A][B] + e^{-G^{con}_T}} \qquad (22)$$

$$p^{chr}_B = \frac{e^{-G^{chr}_T}}{K + 1 + e^{-G^{chr}_A}[A] + e^{-G^{chr}_B}[B] + e^{-G^{chr}_A - G^{chr}_B}[A][B] + e^{-G^{chr}_T}} \qquad (23)$$

Using the substitutions

$$G^{chr}_T = G^{con}_T + G^{con}_{AB}, \quad G^{chr}_A = G^{con}_A + G^{con}_{AB}, \quad G^{chr}_B = G^{con}_B + G^{con}_{AB} \quad \text{and} \quad K + 1 = e^{-G^{con}_{AB}} \qquad (24)$$

the response function for the contact mechanism in equation (22) can be rearranged to give the response function for the chromatin mechanism shown in equation (23). Clearly the two expressions are analytically identical and the parameters of the chromatin mechanism can be derived from the substitutions used above. Note that in the response function for the chromatin mechanism the TFs A and B do not interact. This implies that the cooperativity between the two TFs emerges from the equilibrium between open and closed chromatin states. The strength of this emergent cooperativity matches the free energy of the TF-TF interaction in equation (22):

$$G^{con}_{AB} = -\log(K + 1) \qquad (25)$$
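The equivalence of the two NAND response functions under these substitutions can be checked numerically. The sketch below is a minimal illustration: the function names are introduced here, the forms of the two probabilities are those read from equations (22) and (23), and the contact-mechanism free energies are arbitrary illustrative values (in units of $k_BT$) rather than the parameters used in the paper, apart from the choice $K + 1 = 20$.

```python
import numpy as np


def p_con(A, B, GA, GB, GAB, GT):
    """Contact-mechanism NAND gate: TFs interact with each other (GAB)."""
    z_off = (1.0 + np.exp(-GA) * A + np.exp(-GB) * B
             + np.exp(-GA - GB - GAB) * A * B)
    return np.exp(-GT) / (z_off + np.exp(-GT))


def p_chr(A, B, K, GA, GB, GT):
    """Chromatin-mechanism NAND gate: no direct TF-TF interaction."""
    z_off = K + (1.0 + np.exp(-GA) * A) * (1.0 + np.exp(-GB) * B)
    return np.exp(-GT) / (z_off + np.exp(-GT))


# Illustrative (hypothetical) contact-mechanism parameters
GA_con, GB_con = -2.0, -2.0
GAB_con = -np.log(20.0)
GT_con = -np.log(100.0)

# Substitutions of equation (24)
K = np.exp(-GAB_con) - 1.0
GA_chr = GA_con + GAB_con
GB_chr = GB_con + GAB_con
GT_chr = GT_con + GAB_con

conc = np.logspace(0.0, 3.0, 40)
A, B = np.meshgrid(conc, conc, indexing="ij")
response_con = p_con(A, B, GA_con, GB_con, GAB_con, GT_con)
response_chr = p_chr(A, B, K, GA_chr, GB_chr, GT_chr)
max_abs_difference = float(np.max(np.abs(response_con - response_chr)))
```

The two surfaces agree to floating-point precision for any choice of the contact-mechanism parameters, which is the numerical counterpart of the analytical identity stated above.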

Supplementary Figures

 

 

 

Figure S1. Gene expression response functions for contact model logic gates. Parameters and equations have been adapted from Ref. [10]. (a) Gene expression rates relative to the maximum level of expression ([ ] [ ] 3A B 10= = ) for the AND logic with typical cellular concentrations of TFs A and B ( .A BG 5G 1 2= = , .AB 99G 2= − , . , .AP BP ABP2 99G -6 68G G= = − = , .TGe 0 029− = ). (b) Gene expression rates for the OR logic ( single promoter model) normalized relative to the maximum expression level at [ ] [ ] 3A B 10= = ( .A BG 3G 2 1= = − , ABG 0= ,

. , .AP BP ABP2 99G -3 69G G= = − = , .TGe 0 05− = ). (c) For the NAND logic transcription rates were normalized relative to the maximum expression level at [ ] [ ]A B 0= = ( .A BG 3G 2 1= = − , .ABG 2 99= − , TGe 100− = ). (d) For the XOR logic the maximum expression level at [ ] ,[ ]3A 10 B 0= = was used to normalize the transcription rates ( .1 1

A B 6G G 1 2= −= , 1ABG 0= , . , .1 1 1

AP BP ABP2 9G G 9 3 69G -= = − = , .2 2A B 1G G 0 1= −= , .AB

2 99G 2= − , .TGe 0 1− = ).

Figure S2. Comparison of the sensitivities to free energy values of the AND, NAND and XOR logic gate responses. (a)-(d) The AND gate response is most sensitive to the free energies $G_A$ and $G_{AB}$ at high concentrations of TFs A and B, respectively. The sensitivity of the AND gate response to these free energies is similar for the chromatin mechanism and the direct contact mechanism. (The sensitivity to $G_B$ is symmetric to the sensitivity to $G_A$.) (e)-(h) Similar to the AND gate response, the NAND gate response is most sensitive to $G_A$ and $G_{AB}$ at high TF concentrations. The sensitivities of the contact and chromatin mechanisms are identical. (i) In the contact model the XOR gate response is sensitive to $G^1_A$ only at intermediate concentrations of the TFs. (j) The XOR gate response for the chromatin mechanism is less sensitive than the contact mechanism at intermediate concentrations of TFs and more sensitive at high concentrations of TFs. (k),(l) The contact mechanism’s response is more sensitive to $G^1_{AB}$ at high TF concentrations than the chromatin mechanism’s response. These differences are largely a result of the difference in the XOR response of the two mechanisms rather than of parameter sensitivities (see section 2.3). Parameters and equations for the contact model were taken from [10].

 

 

Figure S3. Sensitivity of the parameter estimation method to the choice of chromatin equilibrium constant $K$. (a) and (c) show the Gata2 response function for $K = 250$ and $K = 500$ respectively, as determined using the expression data given in Figure 5(a) (the colorbars represent fold-change increase in gene expression relative to the expression rate at [Gata2] = [Fli1] = 0). Note that the response function within the range of wild-type TF concentrations (0 < [Gata2], [Fli1] ≤ 1; demarcated by white lines) does not change over this two-fold change in $K$. However, the response functions are drastically different when the TFs are over-expressed. In fact the maximum fold-change (at saturating TF concentrations) in each case is approximately equal to $K$. Therefore the value of $K$ can be determined by over-expressing the enhancer binding TFs. (b) and (d) show the sensitivity of the Gata2 response to the value of $K$ for $K = 250$ and $K = 500$ respectively. In the range of wild-type concentrations (0 < [Gata2], [Fli1] ≤ 1; area bounded by white lines) the response functions are not sensitive to the value of $K$, but outside this region the response function depends strongly on the chosen value of $K$. Therefore the response function inside the region of wild-type concentrations can be reliably predicted even without the knowledge of $K$.


References

1. Gopich, I.V. and A. Szabo: 'Theory of the statistics of kinetic transitions with application to single-molecule enzyme catalysis', J Chem Phys, 2006, 124, (15), pp. 154712.

2. von Hippel, P.H. and O.G. Berg: 'Facilitated target location in biological systems', J Biol Chem, 1989, 264, (2), pp. 675-8.

3. Bintu, L., et al.: 'Transcriptional regulation by the numbers: applications', Curr Opin Genet Dev, 2005, 15, (2), pp. 125-35.

4. Bintu, L., et al.: 'Transcriptional regulation by the numbers: models', Curr Opin Genet Dev, 2005, 15, (2), pp. 116-24.

5. Darzacq, X., et al.: 'In vivo dynamics of RNA polymerase II transcription', Nat Struct Mol Biol, 2007, 14, (9), pp. 796-806.

6. Narayan, S., et al.: 'RNA polymerase II transcription. Rate of promoter clearance is enhanced by a purified activating transcription factor/cAMP response element-binding protein', J Biol Chem, 1994, 269, (17), pp. 12755-63.

7. Kuhlman, T., et al.: 'Combinatorial transcriptional control of the lactose operon of Escherichia coli', Proc Natl Acad Sci U S A, 2007, 104, (14), pp. 6043-8.

8. Buchler, N.E., U. Gerland, and T. Hwa: 'Nonlinear protein degradation and the function of genetic circuits', Proc Natl Acad Sci U S A, 2005, 102, (27), pp. 9559-64.

9. Li, G., et al.: 'Rapid spontaneous accessibility of nucleosomal DNA', Nat Struct Mol Biol, 2005, 12, (1), pp. 46-53.

10. Buchler, N.E., U. Gerland, and T. Hwa: 'On schemes of combinatorial transcription logic', Proc Natl Acad Sci U S A, 2003, 100, (9), pp. 5136-41.