lecture 12' - structural transitions in nucleic acids ii

8/14/2019 Lecture 12' - Structural Transitions in Nucleic Acids II

Lecture 12: Structural Transitions in Nucleic Acids II


Outline

Introduction

The DNA Helix-Coil Transition

Very quick review of basic DNA structure:

Focus: base-pair stacking.

DNA melting and melting curves:

Thermal denaturation: breaking the stacks.

Experimental monitoring of base-pair stacking.

Modeling DNA Melting:

Idea: Generalize our Aligned Zipper model (Lecture 12).

To treat concentration-dependence, shifted structures, loops.

Focus is still: Equilibrium Treatment.

1. Weighting conformations (both stacks and loops);

2. Thermodynamic parameter sets;

3. Models of duplex formation;

Comparisons with experiment.


Our Focus: the Helix-Coil Transition in DNA

In particular, we focus on two related processes:

dsDNA melting

B helix to two coils.

dsDNA annealing

two coils to a B helix.

Note: single dsDNA species.

Understanding these aids in modeling more complicated transitions,

e.g., those with many competing species.

Ultimate focus: complex annealing.


Single-Stranded DNA (the Coil)

An unbranched polynucleotide chain: monomer units = nucleotides.

each contains three components:

a negatively charged phosphate ($PO_4^-$);

a 2'-deoxyribose sugar;

one of 4 heterocyclic bases (A, T, G, C);

adjacent nucleotides linked by phosphodiester bonds.

Nitrogenous bases are of two types:

Purines (2 rings): Adenine (A), Guanine (G).

Pyrimidines (1 ring): Thymine (T), Cytosine (C).

ssDNA has a 5' to 3' polarity. The 5' and 3' ends are chemically distinct.

by convention, DNA sequence is written 5' to 3'.


The B Helix

Natural dsDNAs in solution adopt a double-stranded, helical structure.

strand orientations are antiparallel.

under physiological conditions: B helix.

Helix characterized by Watson-Crick base pairing:

A pairs with T (2 H-bonds).

G pairs with C (3 H-bonds).

At right, we show the dsDNA formed by annealing of:

5'-CTAGTCGTGGTTC-3'

5'-GAACCACGACTAG-3'


Monitoring the Helix-Coil Transition

Degree of stacking is experimentally observable:

Let $\theta_B$ = mean fraction of stacked base pairs.

Ultraviolet absorbance at 260 nm ($A_{260}$)

decreases as $\theta_B$ increases (approximately linearly);

Also called hypochromicity.

DNA melting is accompanied by a 40% increase in $A_{260}$.

Plot of $A_{260}$ vs. T yields $\theta_B$ vs. T:

The DNA melting curve.
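The conversion from an absorbance trace to a melting curve can be sketched as follows. This is a minimal illustration that assumes a linear relation between $A_{260}$ and $\theta_B$ and temperature-independent baselines; the function name `stacked_fraction`, the baseline arguments, and the sample readings are hypothetical and do not come from the lecture.

```python
def stacked_fraction(a260, a_helix, a_coil):
    """Estimate theta_B from one A260 reading by linear interpolation.

    a_helix: absorbance of the fully stacked duplex (theta_B = 1).
    a_coil:  absorbance of the fully melted strands (theta_B = 0).
    Both baselines are assumed constant with temperature (a simplification).
    """
    if a_coil <= a_helix:
        raise ValueError("expected a_coil > a_helix (absorbance rises on melting)")
    theta = (a_coil - a260) / (a_coil - a_helix)
    # Clamp so that noise outside the baselines cannot leave [0, 1].
    return min(1.0, max(0.0, theta))


# Made-up (temperature in deg C, A260) readings, for illustration only.
readings = [(25.0, 1.00), (55.0, 1.20), (85.0, 1.40)]
melting_curve = [(t, stacked_fraction(a, 1.00, 1.40)) for t, a in readings]
```

The sample baselines differ by 40%, matching the increase quoted above; real traces usually need sloping baselines fitted on either side of the transition.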


DNA Melting Curves

$\theta_B$ decreases monotonically from 1 to 0 (for fully matched strands).

sigmoidal shape indicates DNA melting is cooperative.

One sigmoid = cooperative melting of entire DNA;

The DNA melting temperature ($T_m$):

For fully matching strands: temp. at which $\theta_B = 1/2$;

Width ($\Delta T$) is non-zero (e.g., for 10-mers, $\Delta T \approx 10\,^\circ$C).

Melting curves of longer DNAs show more structure: several independently melting regions (AT-rich regions are less stable).


Why focus on DNA Melting?

Fitting of model curves to experiment facilitates investigation of DNA thermodynamic parameters,

which describe the fundamental properties of DNA interaction.

Helps to understand more general DNA mixtures.

Modeling provides a demonstration of general techniques:

Equilibrium chemistry;

Statistical Thermodynamic weighting;

Although parameter values vary by polymer and transition, general principles apply to modeling other biopolymers:

protein folding and structural transitions, RNA folding, etc.

We will use an Equilibrium approach based on Statistical Thermodynamics.


The Aligned-Zipper Model

In L. 11, we adopted a simplified model.

Three non-zipper assumptions:

Homoduplex melting: 1 kind of base pair.

Strands perfectly aligned: no shifting.

Strand separation neglected: no [strand] effects.

This allowed an aligned Zipper model:

Annealing: forward transition (coil to helix).

Melting: reverse transition (helix to coil),

from a fully-helical state, H = hhhhhh,

to a fully-melted state, C = cccccc.

Model Application proceeded by:

1. Defining model parameters:

The nucleation parameter, $\sigma$.

The propagation parameter, s.

2. Applying our Zipper expression for $\theta_B$:

Result: DNA Melting Curve.

However, our model is a bit too simple!
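As a reminder of what the aligned zipper model computes, here is a minimal sketch. It assumes the commonly quoted single-sequence zipper form, in which a conformation has one contiguous helical run of k units, weight $\sigma s^k$, and (n - k + 1) possible positions; the exact expression used in the earlier lecture is not reproduced in these slides, so this form and the function name `zipper_fraction_helix` are assumptions.

```python
def zipper_fraction_helix(n, sigma, s):
    """Mean helical fraction for a zipper of n units.

    Assumed weights: the all-coil state has weight 1; a single helical run
    of k contiguous units has weight sigma * s**k and can be placed in
    (n - k + 1) positions along the chain.
    """
    q = 1.0          # partition function, starting with the all-coil state
    weighted = 0.0   # sum over states of (helical units) * (weight)
    for k in range(1, n + 1):
        w = sigma * (n - k + 1) * s ** k
        q += w
        weighted += k * w
    return weighted / (n * q)


# Sweeping s from below 1 to well above 1 traces a sigmoidal transition.
curve = [(s, zipper_fraction_helix(10, 4.5e-5, s)) for s in (0.5, 2.0, 5.0, 100.0)]
```

Because $\sigma$ is small, the helical fraction stays near zero until s is well above 1 and then rises steeply, which is the cooperative behavior the model is meant to capture.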


Need for a Better Model

Most dsDNAs of interest are not homopolymers:

Generally contain all 4 bases (A, T, G, C).

At least 10 propagation parameters, $s_i$, required.

Strand separation is also significant:

Results in a dependence on strand concentration.

Particularly for oligonucleotides.

Annealing, Melting also much more complicated:

Shifted alignments, looped structures can be significant.


Melting Curve Prediction:

We adopt an equilibrium model.

Our simple equilibrium of interest: $A + B \rightleftharpoons AB$.

Quantities of Interest:

Fraction of stacked base pairs (bps): $\theta_B = \theta_{ext}\,\theta_{int}$;

fraction of associated strands: $\theta_{ext} = 2C_{AB}/C_{tot}$;

mean fraction of stacked bps per dsDNA conformation: $\theta_{int}$;

Let's begin with an estimation of $\theta_{ext}$.

First, we need some simple equations:

Mass Action: $K_D = C_A C_B / C_{AB} = 1/K_{assoc}$.

ssDNA Strand Conservation:

$C_A^o = C_A + C_{AB}$ and $C_B^o = C_B + C_{AB}$


Melting Curve Prediction (cont.):

Continuing our estimation of $\theta_{ext}$.

Analysis is system-dependent:

Mass Action: $K_D = C_A C_B / C_{AB} = 1/K_{assoc}$.

ssDNA Strand Conservation:

$C_A^o = C_A + C_{AB}$ and $C_B^o = C_B + C_{AB}$

Idea: Combine to yield a quadratic eq. (then solve for $\theta_{ext}$).

Usual is an Equivalent co-polymer treatment: assume A = B;

Good for long polymers; not so good for oligonucleotides.

Result: $\theta_{ext} = [(C_{tot}/K_D + 1) - (1 + 2C_{tot}/K_D)^{1/2}] / (C_{tot}/K_D)$
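The closed form above can be checked numerically. The sketch below assumes that the two strands are present at equal total concentration, $C_A^o = C_B^o = C_{tot}/2$, which is one reading of "A = B" under which mass action and strand conservation reduce to exactly this quadratic; the function names are illustrative.

```python
import math


def theta_ext(c_tot, k_d):
    """Fraction of associated strands, from the closed-form result.

    Assumes equal total strand concentrations, C_A^o = C_B^o = c_tot / 2.
    c_tot and k_d must share the same concentration units (e.g. mol/L).
    """
    x = c_tot / k_d
    return ((x + 1.0) - math.sqrt(1.0 + 2.0 * x)) / x


def implied_k_d(c_tot, theta):
    """Recompute K_D = C_A * C_B / C_AB from theta_ext, as a consistency check."""
    c_ab = theta * c_tot / 2.0
    c_free = (1.0 - theta) * c_tot / 2.0   # C_A = C_B under the assumption above
    return c_free * c_free / c_ab


# Half of the strands are associated when K_D = c_tot / 4.
half_point = theta_ext(4.0e-6, 1.0e-6)
```

For very small $C_{tot}/K_D$ the subtraction in the numerator loses precision; a series expansion would be safer in that limit.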


Statistical Thermodynamics

Computing $\theta_B$ still requires estimates of $K_D$ and $\theta_{int}$.

Tool: Statistical Thermodynamics.

assumption: system always (nearly) at equilibrium.

note a limitation: no rate information.

Consider an Ensemble of Systems:

large number of instances of our system, $O(10^{23})$.

each prepared identically.

members distributed over all accessible conformations:

single-stranded states (unstacked ssDNAs, hairpins).

distinct double-stranded states.

Stat-thermo addresses:

equilibrium probabilities of state occupancy.

changes in system variables which accompany equilibrium transitions.


Ising Model of Stacking

Assumption: stack formation is all-or-none.

each base has either single-stranded or stacked character.

big simplification.

Each dsDNA conformation is then specified by:

alignment between the interacting strands.

stacking pattern.

no worrying about partial stacks.

Conformation specified by location of helical and ss regions.


The Gibbs Factor

So, how do we estimate relative occupancies?

As before, each conformation, i, gets a statistical weight, $\omega_i$,

related to its standard free energy of formation, $\Delta G_i^o$:

$\omega_i = \exp[-\Delta G_i^o/RT]$; R = molar gas constant.

the Gibbs Factor.

Relative probability of observing states i and j: estimated by the ratio of weights:

$P(i)/P(j) = \omega_i/\omega_j = \exp[-(\Delta G_i^o - \Delta G_j^o)/RT]$

at equilibrium, more stable states much more likely.

What about the absolute probabilities?

we need to normalize by dividing by the Partition Function.


The Partition Function, Z

Z = the sum of the statistical weights of all states:

$Z = \sum_i \omega_i = \sum_i \exp[-\Delta G_i^o/RT]$

(we called this Q, earlier)

equal to the product of external and internal Zs: $Z = Z_{ext} Z_{int}$.

As before, the absolute probability of observing any state, i:

$P(i) = \omega_i/Z$.

All thermodynamic quantities derivable from Z.

macroscopic observables correspond to ensemble averages.

ensemble average of observable X:

$\langle X \rangle = \sum_i X_i \omega_i / Z$;

$X_i$ = X value characteristic of state i.
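The weighting, normalization, and averaging steps can be sketched in a few lines. The three-state system, its free energies, and all function names below are hypothetical and serve only to show the bookkeeping; free energies are taken in kcal/mol.

```python
import math

R_KCAL = 1.987e-3  # molar gas constant in kcal/(mol K)


def gibbs_weights(dg_by_state, temp_k):
    """Statistical weight exp(-dG/RT) for every state (dG in kcal/mol)."""
    return {state: math.exp(-dg / (R_KCAL * temp_k))
            for state, dg in dg_by_state.items()}


def occupancies(weights):
    """Absolute probabilities: each weight divided by the partition function Z."""
    z = sum(weights.values())
    return {state: w / z for state, w in weights.items()}


def ensemble_average(x_by_state, weights):
    """Ensemble average of an observable X: sum_i X_i * w_i / Z."""
    z = sum(weights.values())
    return sum(x_by_state[state] * w for state, w in weights.items()) / z


# Hypothetical free energies (kcal/mol) and stacked fractions per state.
dg = {"coil": 0.0, "half": -1.0, "helix": -3.0}
stacked = {"coil": 0.0, "half": 0.5, "helix": 1.0}

w = gibbs_weights(dg, 310.15)
p = occupancies(w)
mean_stacked = ensemble_average(stacked, w)
```

Taking the stacked fraction of each state as the observable X turns the ensemble average into a mean fraction of stacked base pairs.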


Estimating $K_D$

$K_D$ = equilibrium constant of dissociation.

estimated by the partition functions of products and reactants.

reactants = all double-strands (dsDNAs).

products = all fully melted single strands (ssDNAs).

For dsDNA melting: $K_D = Z(ss)^2 / Z(ds) = 1/(\beta Z_c)$.

$Z_c$ = ratio of internal partition functions = $Z_{int}(ds)$.

$\beta$ = ratio of external partition functions = $Z_{ext}(ds)/Z_{ext}^2(ss)$

= the strand association parameter.

Note: If we like, we can also model hairpin melting:

$K_D = Z(ss)/Z(hp) = 1/Z_c$.

Also note that $K_D = 1/K_{assoc}$.


Estimating $\theta_{int}$

Recall that $\theta_B = \theta_{int}\,\theta_{ext}$.

$K_D$ allows us to model $\theta_{ext}$.

however, $\theta_{int}$ must also be estimated:

The mean fraction of stacked bps/duplex.

Let $f_i$ denote the fraction of stacked base pairs in conformation i.

then, $\theta_{int}$ is just the ensemble average of $f_i$,

denoted $\langle f \rangle$.

$\theta_{int}$ can be estimated from the partition function:

$\theta_{int} = \langle f \rangle = \sum_i f_i \omega_i / Z_c$

Here, $Z_c = Z_{int}$ is the conformational partition function;

Only the weight of associated conformations included.


Estimating Statistical Weights

Now we know how to compute $\theta_{int}$ and $\theta_{ext}$.

We also have: $K_{assoc} = \beta Z_c = \beta \sum_i \omega_i$

How do we estimate the statistical weight, $\omega_i$, of each conformation?

by decomposition.

We model a given dsDNA conformation X as

a linear chain, consisting of simpler structures:

base-pair doublets, hairpin loops, internal loops, bulges,

each weighted independently.

weights form a set of thermodynamic parameters.

product of weights = overall weight.


Example

Decomposition of a conformation into subunits.

Overall Statistical Weight = product of a set of smaller weights,

which are determined by the identities of the subunits.


Statistical Weight (cont.)

Overall weight of conformation X denoted $\omega_X$.

$\omega_X$ estimated by a product of smaller weights.

Distinct weight for each type of structure:

$s_i$ - each stacked base pair doublet of type i.

$\sigma^{1/2}$ - each junction between stacked/unstacked pairs.

f(m) - each internal loop of m broken base pairs.

$s_{bulge}$ - each bulge loop.

F(n) - each terminal (hairpin) loop of n bases.

$s_{end}$ - each dangling end.

we must also assign a weight for dsDNA chain association.

We now address each, in turn.


Statistical Weight of a Stacked Base-pair Doublet, s

Nearest-neighbor model:

Enthalpy ($\Delta H^o$) and Entropy ($\Delta S^o$) of doublet stack formation

depend only on the identity of the base pair doublet.

10 types of Watson-Crick base-pair doublets = 20 distinct parameters.

Statistical weight of a stacked base-pair doublet:

$s_{nn}(i) = \exp[-\Delta G_i^o/RT]$.

$\Delta G_i^o = \Delta H_i^o - T\Delta S_i^o$ = Gibbs free energy of stacking.

Many Nearest-neighbor parameter sets:

10 Watson-Crick doublets (SantaLucia, et al., 1998).

Various singlet-mismatches (Allawi, et al., 1997, etc.).
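A doublet weight follows directly from its enthalpy and entropy. The sketch below is illustrative only: the three parameter pairs are the values commonly quoted for the SantaLucia (1998) unified set and should be checked against the published table before any real use, and the dictionary and function names are assumptions.

```python
import math

R_KCAL = 1.987e-3  # molar gas constant in kcal/(mol K)

# (dH in kcal/mol, dS in cal/(mol K)) for three of the ten doublets.
# Values as commonly quoted for the 1998 unified set; verify before use.
NN_PARAMS = {
    "AA/TT": (-7.9, -22.2),
    "CA/GT": (-8.5, -22.7),
    "GG/CC": (-8.0, -19.9),
}


def stack_free_energy(doublet, temp_k):
    """dG = dH - T*dS in kcal/mol (dS converted from cal to kcal)."""
    dh, ds = NN_PARAMS[doublet]
    return dh - temp_k * ds / 1000.0


def stack_weight(doublet, temp_k):
    """Statistical weight s_nn = exp(-dG/RT) of one stacked doublet."""
    return math.exp(-stack_free_energy(doublet, temp_k) / (R_KCAL * temp_k))


s_aa_37 = stack_weight("AA/TT", 310.15)
s_gg_37 = stack_weight("GG/CC", 310.15)
```

Each weight falls as the temperature rises, and it is this temperature dependence of s that drives the melting transition.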


Sequence-Dependence of s

Stacking $\Delta G^o$s depend on GC content,

and will vary with specific doublet identity:

i.e., adjacent pairs of base-pairs.

We will expect the size of our propagation parameter:

$s = \exp[-\Delta G_{ch}^o/RT]$,

to increase with GC content;

Duplexes with higher GC-content:

should form more easily

and be more resistant to melting.


Modeling End Unraveling: The Cooperativity Parameter ($\sigma$)

Unraveling at a duplex end:

generally modeled by $\sigma^{1/2}$.

accounts for the cooperativity of DNA melting.

formation of an isolated base pair much more difficult.

Consensus value: $\sigma = 4.5 \times 10^{-5}$ (0.1 M [Na+]).

Some care is required:

$\sigma$ always included in chain association parameter, $\beta$.

$\sigma^{1/2}$ sometimes included in terminal loop weight, F(n).

1 factor of $\sigma^{1/2}$ sometimes normalized into the zero free energy state (Benight, et al., 1988).

$\sigma$ is also salt-dependent (S. Kozyavkin, 1987).


Statistical Weight of an Internal Loop

Internal loop of m unbonded base pairs:

Statistical weight = $\sigma f(m)$:

End unraveling: accounted for by 2 factors of $\sigma^{1/2}$.

Normalized probability of loop closure:

Jacobson-Stockmayer: $f(n) = 1/n^{1.5}$; unrestricted loop, n links.

Purely entropic in origin (no T-dependence).

Empirical form (R. Wartell, 1977): $f(m) = 1/[(1 - 1.38^{-0.1m})(m+1)^{1.7}]$, m > 3.

Accounts for volume exclusion and chain stiffness.

Note: Due to the large penalty, looped conformations usually discarded for oligos.


Statistical Weight of a Bulge Loop

Bulge loop: only one strand has unpaired bases.

Example:

Perturbation to intact helix, $\Delta G^o$.

statistical weight, $s_{bulge} = \exp[-\Delta G^o/RT]$

1-base bulges well-studied (Zhu and Wartell, 1999):

statistical penalty roughly $\sigma^{1/2}$; sequence-dependent.

Larger bulges less well-studied:

parameters for RNA bulges (Freier, et al., 1986).

statistical penalty > , increases with size.


Statistical Weight of a Terminal Loop

A terminal loop of n unpaired bases: hairpin loop.

Statistical weight = $\sigma^{1/2} F_{end}(n)$.

Example:

Strand unraveling: modeled by $\sigma^{1/2}$.

Normalized probability of loop closure:

$F_{end}(n) = M(n)/(n+1)^{1.5}$ (Benight, et al., 1988).

M(n) accounts for steric hindrance, chain stiffness.


Statistical Weight of a Dangling End

Dangling ends (overhanging, unpaired bases):

stacking of first dangling base against duplex core.

often as stabilizing as an extra stack.

Energetics sequence dependent.

Nearest-neighbor model, $\Delta G^o$ (Bommarito, et al., 2000).

Energies depend, to 1st order, only on:

Identity of dangling base + duplex core bases;

Statistical weight, $s_{dangle} = \exp[-\Delta G^o/RT]$.

Values for all dangles reported.


Bimolecular Helix Initiation

Strand Association Parameter: $\beta$.

Accounts for both nucleation and end unraveling.

includes a factor of $\sigma$ (one $\sigma^{1/2}$ for each duplex end).

Length, temperature dependent (W. Hillen, 1981).

$\beta = K N^{a + b[1 - \theta_{int}]}$; K = 5000, a = -2.8, b = -3.2 ([Na+] = 0.1 M).

Nearest-neighbor model (SantaLucia, et al., 1998):

Simpler, approximate treatment of $\beta$ (deviations < 20%).

length-independent initiation free energy, $\Delta G_{nuc}^o$.

$\beta = \exp[-\Delta G_{nuc}^o/RT]$.


Impact of Strand Anchorage

Duplex conformations formed on microchips

are well known to be much less stable.

Impact may be treated as a multiplicative correction:

A. Fotin, et al., 1998.

length, nature of linker: no substantial effect.

$\Delta H^o = 24 \pm 4$ kcal/mol

$\Delta S^o = 70 \pm 12$ cal/(mol K)

Then, $\Delta G^o = \Delta H^o - T\Delta S^o$

$s_{anchorage} = \exp[-\Delta G^o/RT]$.
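To get a feel for the size of this correction, the sketch below evaluates the anchorage weight from the central values quoted above, ignoring the stated uncertainties. The function name and the choice of 37 °C are illustrative assumptions.

```python
import math

R_KCAL = 1.987e-3  # molar gas constant in kcal/(mol K)


def anchorage_weight(temp_k, dh_kcal=24.0, ds_cal=70.0):
    """Multiplicative weight exp(-dG/RT) with dG = dH - T*dS.

    Defaults are the central values from the slide (dH in kcal/mol,
    dS in cal/(mol K)); the quoted uncertainties are not propagated.
    """
    dg = dh_kcal - temp_k * ds_cal / 1000.0   # kcal/mol
    return math.exp(-dg / (R_KCAL * temp_k))


s_anchor_37 = anchorage_weight(310.15)   # about 0.024 at 37 deg C
```

At 37 °C the free-energy penalty is roughly 2.3 kcal/mol, so under these central values an anchored duplex carries only a few percent of the weight of the same duplex in solution.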


Example 1: Simple DNA Duplex

$K_d$ for formation from isolated strands:

one factor of $\beta$ for helix initiation.

an appropriate factor of s for each stacked base pair.

recall internal partition function for each isolated strand = 1.

$K_d = 1/(\beta\, s_{CC/GG}^2\, s_{CA/GT}\, s_{AA/TT}^2) = 1/K_a$.

Approx. form for a conformation of this type: let s = mean weight of doublet stacking.

$K_d = 1/(\beta s^{l-1})$.
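The bookkeeping in this example amounts to a product of weights. In the sketch below every numerical value is a placeholder chosen for illustration, the initiation factor is passed in as a plain number, and l is read as the number of base pairs (so there are l - 1 stacks); none of this is taken from the lecture's figure.

```python
import math


def duplex_kd(initiation, stack_weights):
    """K_d = 1 / (initiation factor * product of doublet stacking weights)."""
    k_a = initiation * math.prod(stack_weights)
    return 1.0 / k_a


def duplex_kd_mean_stack(initiation, s_mean, n_bp):
    """Approximate form: K_d = 1 / (initiation * s_mean**(n_bp - 1))."""
    return 1.0 / (initiation * s_mean ** (n_bp - 1))


# Placeholder weights: two CC/GG stacks, one CA/GT stack, two AA/TT stacks.
s_cc_gg, s_ca_gt, s_aa_tt = 19.0, 11.0, 5.0
initiation = 1.0e-3   # hypothetical helix-initiation factor
stacks = [s_cc_gg, s_cc_gg, s_ca_gt, s_aa_tt, s_aa_tt]

kd_exact = duplex_kd(initiation, stacks)

# With the geometric mean as "mean weight", the approximate form is exact.
s_geo = math.prod(stacks) ** (1.0 / len(stacks))
kd_approx = duplex_kd_mean_stack(initiation, s_geo, len(stacks) + 1)
```

Using an arithmetic mean, or a sequence-averaged s, would make the approximate form deviate from the exact product, which is the trade-off of the mean-stack shortcut.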


Example 2: Simple DNA Hairpin

$K_d$ for formation from unfolded single-strand:

one factor of $\sigma^{1/2} F(7)$, for the terminal loop.

one factor of $\sigma^{1/2}$, for unraveling at the free end.

an appropriate factor of s for each stacked base-pair.

recall internal partition function for the isolated strand = 1.

$K_a = \sigma\, s_{CC/GG}^2\, s_{CA/GT}\, s_{AA/TT}^2\, F(7)$; then $K_d = 1/K_a$.

Approx. form for a conformation of this type: let s = mean weight of doublet stacking.

$K_a = \sigma s^{l-1} M(n)/(n+1)^{1.5}$.



Simplest Application: Short Oligos

For short oligos, a 2-state model often used;

Only 2 conformations: fully melted (2 ssDNAs) + fully-aligned dsDNA;

Applied model can be statistical (melting curves), or van't Hoff.

Generally, focus is $T_m$ value.

van't Hoff assumption: at $T_m$, $\theta_B = \theta_{ext} = 1/2$.

Example: Length 9 bp mis-matched oligo:

Result: All-or-none model gives good agreement for short oligos:

Both $\Delta G^o$ and $T_m$ predictions are acceptable (SantaLucia, et al.);

However: for oligos longer than 15 bps, a shifted Zipper model required...
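A two-state melting temperature can be obtained by setting the fraction of associated strands to 1/2. The sketch below assumes non-self-complementary strands at equal concentration (so that $K_D = C_{tot}/4$ at $T_m$) and temperature-independent duplex-formation $\Delta H^o$ and $\Delta S^o$; the thermodynamic values and function names are hypothetical.

```python
import math

R_CAL = 1.987  # molar gas constant in cal/(mol K)


def two_state_tm(dh_kcal, ds_cal, c_tot):
    """T_m in kelvin from duplex-formation dH (kcal/mol) and dS (cal/(mol K)).

    Derived from theta_ext = 1/2, i.e. K_D = c_tot / 4, for two distinct
    strands at equal concentration; c_tot is the total strand molarity.
    """
    return 1000.0 * dh_kcal / (ds_cal + R_CAL * math.log(c_tot / 4.0))


def two_state_kd(dh_kcal, ds_cal, temp_k):
    """K_D = exp(+dG_assoc / RT), with dG_assoc = dH - T*dS in cal/mol."""
    dg_cal = 1000.0 * dh_kcal - temp_k * ds_cal
    return math.exp(dg_cal / (R_CAL * temp_k))


def theta_ext(c_tot, k_d):
    """Fraction of associated strands for equal strand concentrations."""
    x = c_tot / k_d
    return ((x + 1.0) - math.sqrt(1.0 + 2.0 * x)) / x


# Hypothetical duplex: dH = -60 kcal/mol, dS = -165 cal/(mol K), 1 uM strands.
dh, ds, c_tot = -60.0, -165.0, 1.0e-6
tm = two_state_tm(dh, ds, c_tot)
theta_at_tm = theta_ext(c_tot, two_state_kd(dh, ds, tm))
```

Evaluating `theta_ext` at the computed temperature returns one half, which confirms that the formula and the mass-action treatment agree under the stated assumptions.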



Limitations of the 2-state Model

Is a 2-state model good for long oligonucleotides?

Study: 100 dsDNAs of length 23 bps (A. Suyama, et al).

Experimental vs. Calculated $T_m$ values:

Result: correlation pretty good, but... calculated values too low!

Unpredicted stabilization probably due to melting intermediates.

Adequate for predicting gross behavior/trends;

Inadequate for accurate or detailed prediction.



Longer DNAs

Statistical Zipper Model (1 duplex/conformation):

very successful for predicting DNA melting, up to 150 bps;

e.g.: Differential melting curves for 4 lac DNA fragments:

Watson-Crick SZM predictions shown;

Note additional structure: 2 melting regions (two peaks) in (d).

For polymers > 150 bps, a general, aligned model usually required.

[Figure: differential melting curves, experimental vs. theoretical (dashed), for fragments a: 80 bps, b: 101 bps, c: 188 bps, d: 219 bps.]



Conclusion

In this Lecture, we have:

Discussed the Helix-Coil transition of biopolymers.

in the context of DNA melting and renaturation.

Described physical methods necessary for modeling:

Equilibrium Chemistry and Statistical Thermodynamics.

Note: also apply to protein and polysaccharide modeling.

Investigated the generalization of the model:

DNA strand design:

Stat-thermo modeling of error/efficiency.

Quantitative design for minimized error.

Next come Real-world applications:

Low-error Tag-Antitag system design.
