

Page 1: Chemical, physical, and theoretical kinetics of an …Chemical, physical, and theoretical kinetics of an ultrafast folding protein Jan Kubelkaa,b,1, Eric R. Henrya,1, Troy Cellmera,

Chemical, physical, and theoretical kinetics of an ultrafast folding protein

Jan Kubelka (a,b,1), Eric R. Henry (a,1), Troy Cellmer (a), James Hofrichter (a), and William A. Eaton (a,2)

(a) Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520; and (b) Department of Chemistry, University of Wyoming, Laramie, WY 82071

This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on April 25, 2006.

Contributed by William A. Eaton, September 3, 2008 (sent for review June 25, 2008)

An extensive set of equilibrium and kinetic data is presented and analyzed for an ultrafast folding protein, the villin subdomain. The equilibrium data consist of the excess heat capacity, tryptophan fluorescence quantum yield, and natural circular-dichroism spectrum as a function of temperature, and the kinetic data consist of time courses of the quantum yield from nanosecond-laser temperature-jump experiments. The data are well fit with three kinds of models: a three-state chemical-kinetics model, a physical-kinetics model, and an Ising-like theoretical model that considers $10^5$ possible conformations (microstates). In both the physical-kinetics and theoretical models, folding is described as diffusion on a one-dimensional free-energy surface. In the physical-kinetics model the reaction coordinate is unspecified, whereas in the theoretical model, order parameters, either the fraction of native contacts or the number of native residues, are used as reaction coordinates. The validity of these two reaction coordinates is demonstrated from calculation of the splitting probability from the rate matrix of the master equation for all $10^5$ microstates. The analysis of the data on site-directed mutants using the chemical-kinetics model provides information on the structure of the transition-state ensemble; the physical-kinetics model allows an estimate of the height of the free-energy barrier separating the folded and unfolded states; and the theoretical model provides a detailed picture of the free-energy surface and a residue-by-residue description of the evolution of the folded structure, yet contains many fewer adjustable parameters than either the chemical- or physical-kinetics models.

fluorescence | funneled energy landscape | Ising-like model | laser temperature jump | polypeptide

A major challenge to advancing our understanding of how proteins fold is the development of an analytical theoretical model capable of calculating the quantities directly measured in both equilibrium and kinetic experiments. We have approached this problem experimentally by studying a small ultrafast folding protein, the 35-residue subdomain from the villin headpiece (1–7) (Fig. 1). It is the smallest naturally occurring protein that autonomously folds into a globular structure (8–10), so it should have one of the simplest protein-folding mechanisms, which may therefore be amenable to understanding in depth by a theoretical model. Moreover, because folding of this protein occurs in a few microseconds, close to the proposed theoretical speed limit (4, 11), it can be investigated in detail by molecular-dynamics simulations. Our theoretical approach is to calculate the experimentally measured quantities with an Ising-like statistical mechanical model (12, 13), originally developed to explain our results on the β-hairpin from the protein GB1 (14, 15), and similar to models of Baker, Finkelstein, and coworkers (16–18). The key simplifying feature of these models is that they explicitly consider only interactions between residues that are in contact in the native structure [the perfectly funneled energy landscape of Wolynes and Onuchic (19–21)]. These models have been remarkably successful in predicting both the number of observable states and the folding rates for individual proteins and have also had some success in predicting the relative effect on folding rates and equilibrium constants produced by site-directed mutations [i.e., Φ-values (22)] (12, 16–18, 23). However, up to now they have not been used to calculate the physical properties that are actually measured experimentally.

In this work, we present and analyze an extensive set of equilibrium and kinetic data on the villin subdomain. The equilibrium data consist of the excess heat capacity (6), tryptophan fluorescence quantum yield (QY), and natural circular-dichroism (CD) spectrum (1, 2, 4) as a function of temperature. The kinetic data consist of time courses of the QY over a wide temperature range from nanosecond-laser-induced temperature-jump experiments. These measured quantities are calculated using a coarse-grained version of the Ising-like model of Munoz, Henry, and Eaton (12, 13). The kinetics are described by diffusion on a one-dimensional free-energy surface (24), using either the number of ordered residues (25) or the fraction of native contacts (26, 27) as reaction coordinates (28), and a position-dependent diffusion coefficient determined from measurements of the relaxation rate as a function of viscosity (7). We also calculate Φ-values for a number of mutants from the change in relaxation rate resulting from the perturbation of the free-energy surface produced by the mutation. To test the validity of our use of order parameters as reaction coordinates, we calculated the splitting probability (also called the $p_{\mathrm{fold}}$) (29–32) from the rate matrix of the master equation for all $10^5$ microstates of the model.

Author contributions: J.K., E.R.H., T.C., J.H., and W.A.E. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

(1) J.K. and E.R.H. contributed equally to this work.

(2) To whom correspondence should be addressed at: Building 5, Room 104, National Institutes of Health, Bethesda, MD 20892-0520. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0808600105/DCSupplemental.

© 2008 by The National Academy of Sciences of the USA

Fig. 1. Structure of villin subdomain solved by x-ray diffraction (PDB 1WY4) (2). Ribbon diagram of backbone showing the side chains of W23 and H27 (A) and structure with all nonhydrogen atoms (B). Residues F6, K7, A8, G11, M12, and T13 are shown in black because their contacts contribute most to the stability of the most populated microstates of the transition state ensemble at 310 K (see Figs. S8 and S9).

www.pnas.org/cgi/doi/10.1073/pnas.0808600105 | PNAS | December 2, 2008 | vol. 105 | no. 48 | 18655–18662

BIOPHYSICS | CHEMISTRY | INAUGURAL ARTICLE

In addition, we have analyzed the data in terms of a conventional chemical-kinetics model and a model in which the kinetics are described by diffusion on an empirical free-energy surface, similar to what has previously been done for other proteins by Gruebele (33), Munoz (34), and their coworkers. For lack of a better term, we call this a physical-kinetics model. The chemical- and physical-kinetics models are not only helpful in interpreting experimental results, but they also expose features that support the validity of the theoretical results. We believe that our analysis, using three very different types of models to interpret the data, represents the most comprehensive approach so far to understanding the results of equilibrium and kinetic experiments on protein folding.

Results

Experimental Data to Be Calculated. The most important equilibrium data to be calculated are the heat capacity as a function of temperature measured by differential scanning calorimetry (6). The reason for its special importance is that the parameters required to fit the data are purely thermodynamic quantities, unlike both equilibrium fluorescence and CD data that require additional parameters to describe the temperature dependence of the QY and CD for each state of the chemical-kinetics model, each position along the reaction coordinate for the physical-kinetics model, or each microstate of the theoretical model. Because the theoretical-kinetics model does not consider the contributions from hydration or internal degrees of freedom arising from bond and angle vibrations, and the chemical-kinetics model considers only differences in heat capacity relative to the native state, the relevant experimental quantity is the heat capacity in excess of that for the fully folded, native state (Fig. 2).

Fig. 3 shows the equilibrium thermal unfolding curves measured by fluorescence and CD. To reduce the number of adjustable parameters in fitting with all three models, we also measured the fluorescence of a fragment to simulate the QY for the fully unfolded protein (upper dotted curve in Fig. 3A). The results of the kinetics experiments are shown in Fig. 4. The observed progress curves [QY versus time, supporting information (SI) Fig. S1] are well characterized at each temperature by two exponential processes with relaxation times of ~100 ns and 1–5 µs. Several lines of evidence, including an independent estimate of the folding rate from an analysis of end-to-end contact measurements (3) and infrared studies by Dyer and coworkers (35, 36), indicate that the slower relaxation corresponds to global unfolding/refolding. Fig. 5 shows Φ-values for 10 mutants calculated from simultaneous fitting of fluorescence and CD equilibrium unfolding curves as in reference (1) and relaxation rates assuming a two-state model (equilibrium and rate constants are summarized in Table S1).

Chemical-Kinetics Model. The simplest chemical-kinetics model is a two-state model, in which there are only two populations of molecules at equilibrium and at all times in kinetic experiments. However, the observation of two phases in the kinetic progress curves requires consideration of a third state. The conventional three-state folding model is one in which there is an intermediate state (I) on the pathway from the unfolded (U) to folded (F) state, i.e.,

$$\mathrm{U} \;\underset{10^{5}\text{–}10^{6}\ \mathrm{s^{-1}}}{\rightleftharpoons}\; \mathrm{I} \;\underset{10^{7}\ \mathrm{s^{-1}}}{\rightleftharpoons}\; \mathrm{F}$$

This model provides an excellent fit to both equilibrium and kinetic data (Figs. 2–4). The standard equations used and the values of the adjustable parameters of the model are given in SI Appendix and Table S2. The important result of the three-state model, as indicated in the scheme above, is that the intermediate (I) interconverts with the folded state (F) much faster than it does with the unfolded state (U) and therefore lies on the folded side of the major free-energy barrier (the populations of the three states as a function of temperature are shown in Fig. S2).

Fig. 2. Excess heat capacity as a function of temperature. The filled circles are the experimental data. The curves are the fits using the three-state chemical-kinetics (dashed, red), physical-kinetics (continuous, cyan), and theoretical (continuous, blue) models.

Fig. 3. Tryptophan fluorescence quantum yield and circular dichroism as a function of temperature. The filled circles are the experimental data. The curves are the fits using the three-state chemical-kinetics (dashed, red), physical-kinetics (continuous, cyan), and theoretical (continuous, blue) models. (A) Tryptophan fluorescence: the upper dotted curve corresponds to the measured quantum yield of the fragment (AcWKQQH); the lower dotted curve is the quantum yield of the theoretical-kinetics model for microstates that have a W23-H27 contact. (B) Circular dichroism. The curves are the fits using the three-state chemical-kinetics (dashed, red), physical-kinetics (continuous, cyan), and theoretical (continuous, blue) models.
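To make the two-phase behavior of the three-state scheme U ⇌ I ⇌ F concrete, the following is a minimal sketch of how two relaxation rates arise from four microscopic rate constants. The function name and the numerical rate constants are illustrative assumptions chosen to fall in the ranges quoted in the scheme; they are not the fitted values of Table S2.

```python
import math

def three_state_relaxation_rates(k_ui, k_iu, k_if, k_fi):
    """Return (slow, fast) relaxation rates in s^-1 for U <-> I <-> F.

    The linear rate equations have one zero eigenvalue (equilibrium) and
    two nonzero ones, which are the roots of x^2 - s*x + p = 0 with
    s = k_ui + k_iu + k_if + k_fi and
    p = k_ui*k_if + k_ui*k_fi + k_iu*k_fi.
    """
    s = k_ui + k_iu + k_if + k_fi
    p = k_ui * k_if + k_ui * k_fi + k_iu * k_fi
    root = math.sqrt(s * s - 4.0 * p)
    return (s - root) / 2.0, (s + root) / 2.0

# Hypothetical rate constants (s^-1): U<->I slow, I<->F fast.
slow, fast = three_state_relaxation_rates(2.0e5, 2.0e5, 5.0e6, 5.0e6)
tau_slow = 1.0 / slow   # microsecond phase
tau_fast = 1.0 / fast   # roughly 100-ns phase
print(f"slow phase: {tau_slow * 1e6:.2f} us, fast phase: {tau_fast * 1e9:.0f} ns")
```

With these assumed values the fast phase is set almost entirely by the I ⇌ F exchange, which is the qualitative point made in the text about the intermediate lying on the folded side of the barrier.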

Physical-Kinetics Model. Our physical-kinetics model consists of a free energy (G) versus reaction coordinate function, and the values of the observables (QY and CD) at each value of the reaction coordinate (q). We assumed that there are only two deep minima on this free-energy surface, corresponding to the unfolded and folded states, with the position of the minima allowed to move with temperature as a way of generating an additional relaxation. Recognizing that, with the possible exception of the calorimetric data, the information content in the data is insufficient to determine curvatures, we used effectively the same curvature for both wells and barrier top and parameterized the surface with coordinates $(q_i, G_i)$ for the unfolded state, the barrier top, and the folded state. Fig. 6 shows the free-energy surfaces as a function of temperature that optimally fit the experimental data (the coordinate dependencies of the QY and CD are shown in Fig. S3). The calorimetric data (Fig. 2) were fit from the temperature dependence of an equilibrium constant (Fig. S4), defined by the ratio of the populations on either side of the dividing line, taken as the value of q at the free-energy barrier top (see SI Appendix for details).
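The definition of the equilibrium constant as a ratio of populations on either side of the barrier top can be sketched as follows. This is an illustration on a toy double-well profile, not the fitted surface of Fig. 6; the grid, the well positions, the barrier height, and all function names are assumptions made for the example, and the grid is taken to be uniform so that sums stand in for integrals.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def barrier_top_index(g):
    """Index of the highest point between the unfolded and folded minima."""
    n = len(g)
    i_u = min(range(n // 2), key=lambda i: g[i])      # unfolded-well minimum
    i_f = min(range(n // 2, n), key=lambda i: g[i])   # folded-well minimum
    return max(range(i_u, i_f + 1), key=lambda i: g[i])

def equilibrium_constant(g, temperature):
    """K = folded population / unfolded population, divided at the barrier top."""
    rt = R_KCAL * temperature
    g_min = min(g)
    w = [math.exp(-(gi - g_min) / rt) for gi in g]
    top = barrier_top_index(g)
    unfolded = sum(w[:top]) + 0.5 * w[top]   # the barrier-top point is split evenly
    folded = sum(w[top + 1:]) + 0.5 * w[top]
    return folded / unfolded

def toy_surface(height, tilt, n=101):
    """Double well with minima near q = 0.2 and 0.8 and a barrier near q = 0.5."""
    q = [i / (n - 1) for i in range(n)]
    return [height * (1.0 - ((x - 0.5) / 0.3) ** 2) ** 2 + tilt * (x - 0.5) for x in q]

k_symmetric = equilibrium_constant(toy_surface(2.0, 0.0), 310.0)
k_stabilized = equilibrium_constant(toy_surface(2.0, -1.0), 310.0)
print(k_symmetric, k_stabilized)
```

Repeating the calculation over a range of temperatures would give the temperature dependence of K from which a heat-capacity curve could then be derived.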

The physical-kinetics model also provides an excellent fit to all of the equilibrium and kinetic data (Figs. 2–4). The important results of analyzing the data by using this model are that the free-energy barrier to folding is very small (~2 kcal/mol) and that the ~100-ns process is explained as relaxation in the folded well to more unfolded conformations as the temperature increases, as indicated by the movement of the folded-well minimum to smaller values of the reaction coordinate.

Theoretical-Kinetics Model. The model has been described in detail elsewhere (6, 12, 13) (see SI Appendix and Table S3 for details). The principal assumptions of the model are that each residue of the polypeptide chain can exist in one of two possible states, native (n) or nonnative (c), as in an Ising model (37), and that no more than two continuous stretches of native residues are allowed in each molecule (e.g., …cnnncccnnnccc…). This latter assumption, the so-called double-sequence approximation, greatly reduces the number of possible configurations from $2^{35}$ ($3 \times 10^{10}$) to $6 \times 10^{4}$. The free energy and thermodynamic weight of a stretch of native residues of length j beginning at position i are, respectively,

$$G_{ji} = n_{ji}\,\varepsilon - jT\,\Delta s_{\mathrm{conf}}, \qquad w_{ji} = \exp\left(-G_{ji}/RT\right) \qquad [1]$$

where $n_{ji}$ is the number of contacts in the stretch, $\varepsilon$ is the energy per contact, and $\Delta s_{\mathrm{conf}}$ is the conformational-entropy cost of fixing a residue in its native conformation. The model allows contacts between residues in a native segment and between residues in two different native segments, so there is an additional destabilizing term in the partition function for connecting the two segments by a disordered loop, but it contains no adjustable parameters. The same energy ($\varepsilon$) was assigned to each interresidue contact and is an adjustable parameter of the model. In the same spirit, the same entropy decrease ($\Delta s_{\mathrm{conf}}$) for the nonnative to native transition was assigned to every residue. Allowing $\Delta s_{\mathrm{conf}}$ and $\varepsilon$ to be temperature-dependent to account for hydrophobic interactions gave no improvement to the fits.
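As a rough check on the size of the configuration space and on the form of Eq. 1, the sketch below counts two-state chains with at most two native stretches and evaluates the free energy and weight of a single stretch. The counting formula, the function names, the contact energy, and the contact count are assumptions made for illustration; only the per-residue entropy of −3.70 cal mol⁻¹ K⁻¹ is a value quoted in the text. For a 35-residue chain the unrestricted count is 2^35, about 3.4 × 10^10, and the restricted count comes out near 6 × 10^4.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1

def count_all_configurations(n_residues):
    """Every residue independently native or nonnative."""
    return 2 ** n_residues

def count_double_sequence(n_residues):
    """Chains with zero, one, or two contiguous native stretches.

    A stretch is fixed by two of the n+1 boundaries between residues, so one
    stretch gives C(n+1, 2) chains and two separated stretches give C(n+1, 4).
    """
    m = n_residues + 1
    return 1 + math.comb(m, 2) + math.comb(m, 4)

def stretch_free_energy(n_contacts, length, eps, ds_conf, temperature):
    """Eq. 1: G_ji = n_ji * eps - j * T * ds_conf, in cal/mol."""
    return n_contacts * eps - length * temperature * ds_conf

def stretch_weight(n_contacts, length, eps, ds_conf, temperature):
    """Eq. 1: w_ji = exp(-G_ji / RT)."""
    g = stretch_free_energy(n_contacts, length, eps, ds_conf, temperature)
    return math.exp(-g / (R_CAL * temperature))

print(count_all_configurations(35), count_double_sequence(35))

# Hypothetical stretch: 8 native residues making 10 contacts at 310 K,
# with an assumed contact energy of -1000 cal/mol.
g_example = stretch_free_energy(10, 8, -1000.0, -3.70, 310.0)
w_example = stretch_weight(10, 8, -1000.0, -3.70, 310.0)
print(g_example, w_example)
```

Because the entropy change is negative, the second term of Eq. 1 is a free-energy cost that grows with stretch length and temperature, so a stretch is favorable only when its contacts outweigh that cost.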

Fig. 4. Relaxation rates (A) and kinetic amplitudes (B) as a function of temperature. The open and filled circles are the relaxation rates obtained from biexponential fits to the measured kinetic progress curves shown in Fig. S1. The curves are the fits using the three-state chemical-kinetics (red), physical-kinetics (cyan), and theoretical models with P (green) and Q (blue) as reaction coordinates.

Fig. 5. Φ-Values. The red points are the experimental data at 310 K (see Table S1); the continuous blue and green curves are the Φ-values calculated for the Q and P reaction coordinates, respectively, assuming a two-state model and a 50 cal/mol perturbation of the stability by altering the contact energy for the mutated residue. The dotted blue and green curves, for Q and P respectively, correspond to the (Boltzmann-weighted) average fraction of native contacts for the microstates at reaction coordinates with free energies within 1 $k_BT$ of the barrier top.

Fig. 6. Free-energy surface (continuous) and populations (dashed) of physical-kinetics model.


A contact between residues exists if the distance between α-carbons of the polypeptide backbone in the folded structure is within 0.8 nm. This definition results in a rather sparse interresidue contact map (Fig. 7A), with only 13 nonlocal contacts (which we define as contacts between residues separated by five or more residues in the sequence) and 9 contacts between helices. Using atom–atom contacts as a criterion for interresidue contacts, Chiu et al. (2) found the corresponding numbers to be 18 and 13. Importantly, our more coarse-grained contact map does contain contacts by all three of the core phenylalanines (F6, F10, and F17) and two of the three interhelical hydrogen bonds.

The excess heat capacity was calculated directly from the partition function of the model (see Eqs. S1–S5 in SI Appendix), whereas calculation of the unfolding curves measured by CD and fluorescence required a model for the CD and the QY for each microstate of the model (see SI Appendix for details). For short helices, there is a large dependence of the CD on the length of the helix. We used the model of Thompson et al. (38), which also considers the contribution to the CD from the lone tryptophan (see Eqs. S7 and S8 in SI Appendix). For the QY of each microstate we assumed that the only source of quenching relative to the fully unfolded state arises from the contact of the tryptophan with the protonated histidine one turn away on the helix (Fig. 1) when all of the intervening residues are in their native conformation. This assumption is based on the well known quenching of tryptophan fluorescence by protonated histidine (38) and our observation of a ~4-fold decrease in the amplitude for the slower unfolding/refolding relaxation at pH 7, where the histidine is mainly unprotonated (data not shown). For microstates in which this contact is not made, we assumed that the QY is the same as that measured for a short peptide fragment containing the lone tryptophan of the sequence (upper dotted curve in Fig. 3A).
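The statement that the excess heat capacity follows directly from the partition function can be illustrated with the standard enthalpy-fluctuation formula. The sketch below is not a reproduction of Eqs. S1–S5; it assumes each state has a temperature-independent enthalpy and entropy measured relative to the native state, and the two-state numbers (25 kcal/mol unfolding enthalpy, 343 K midpoint) are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def excess_heat_capacity(states, temperature):
    """Excess heat capacity in kcal mol^-1 K^-1 from enthalpy fluctuations.

    states: list of (enthalpy, entropy) pairs relative to the native state,
    in kcal/mol and kcal mol^-1 K^-1. Uses C = (<H^2> - <H>^2) / (R T^2),
    which holds when the per-state enthalpies do not depend on temperature.
    """
    rt = R_KCAL * temperature
    weights = [math.exp(-(h - temperature * s) / rt) for h, s in states]
    z = sum(weights)  # partition function
    mean_h = sum(w * h for w, (h, _) in zip(weights, states)) / z
    mean_h2 = sum(w * h * h for w, (h, _) in zip(weights, states)) / z
    return (mean_h2 - mean_h ** 2) / (R_KCAL * temperature ** 2)

# Hypothetical two-state system: native reference plus one unfolded state.
DH = 25.0        # kcal/mol, assumed
T_MID = 343.0    # K, assumed midpoint where both states are equally populated
TWO_STATE = [(0.0, 0.0), (DH, DH / T_MID)]

for t in (300.0, 343.0, 380.0):
    print(t, excess_heat_capacity(TWO_STATE, t))
```

For the full model the list of states would hold one entry per microstate, which is why only the thermodynamic parameters of the model enter the heat-capacity fit.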

Fig. 7. Results of theoretical-kinetics model. (A) Contact map (PDB 1YRF). (B) Free energy and relative population versus Q reaction coordinate. (C) Free energy and relative population versus P reaction coordinate. (D) Probability that a residue is in its native conformation at each temperature. The thick red bars indicate helical residues. (E) Probability that a residue is in its native conformation at each value of Q relative to all microstates at that value of Q. (F) Probability that a residue is in its native conformation at each value of P relative to all microstates at that value of P. (G–I) Relative probability that a contact is formed at Q = 0.09 (barrier top) (G), 0.33 (H), and 0.71 (I). See Fig. S5 for corresponding plot for P reaction coordinate.

The kinetics were obtained by solving the system of differential equations describing reversible hopping between adjacent discrete values of the reaction coordinate on a one-dimensional free-energy surface, where the reaction coordinate is taken as either the number of ordered residues (P) (25) or the fraction of native contacts (Q) (26) (see Eqs. S9–S12 in SI Appendix) (Figs. 7 B and C). The QY at each position along the reaction coordinate was calculated from the Boltzmann-weighted contribution of the subset of the 92,696 microstates of the partition function having that value of the reaction coordinate. Fig. S1 shows representative fits to the progress curves, with all of the results summarized in Fig. 4, which compares the calculated and experimental values of the relaxation rates and amplitudes at each temperature.
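Reversible hopping between adjacent values of a discrete reaction coordinate can be sketched with a small master equation. The example below is a toy under stated assumptions: a six-point free-energy profile, a single coordinate-independent prefactor `k0` (the text uses a position-dependent diffusion coefficient), symmetric rates chosen to satisfy detailed balance, and simple explicit Euler time stepping. None of these choices is taken from Eqs. S9–S12.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def hopping_rates(g, temperature, k0):
    """Nearest-neighbor rates: up[i] is i -> i+1, down[i] is i+1 -> i."""
    rt = R_KCAL * temperature
    up = [k0 * math.exp(-(g[i + 1] - g[i]) / (2.0 * rt)) for i in range(len(g) - 1)]
    down = [k0 * math.exp(-(g[i] - g[i + 1]) / (2.0 * rt)) for i in range(len(g) - 1)]
    return up, down

def boltzmann(g, temperature):
    """Equilibrium populations for the free-energy profile g."""
    rt = R_KCAL * temperature
    w = [math.exp(-gi / rt) for gi in g]
    z = sum(w)
    return [wi / z for wi in w]

def propagate(p0, up, down, dt, n_steps):
    """Explicit Euler integration of the master equation (dt must be small)."""
    p = list(p0)
    for _ in range(n_steps):
        # Net flux across each bond, evaluated from the current populations.
        flux = [up[i] * p[i] - down[i] * p[i + 1] for i in range(len(up))]
        for i, f in enumerate(flux):
            p[i] -= f * dt
            p[i + 1] += f * dt
    return p

# Toy profile in kcal/mol: unfolded well at index 0, barrier at 2, folded well at 5.
G_TOY = [0.0, 0.8, 1.6, 0.8, -0.4, -1.0]
up, down = hopping_rates(G_TOY, 310.0, 1.0e7)   # k0 = 1e7 s^-1 is hypothetical
p_final = propagate([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], up, down, dt=1.0e-9, n_steps=40000)
print(p_final)
print(boltzmann(G_TOY, 310.0))
```

An observable such as the QY would then be obtained at each time as the population-weighted sum of its values along the coordinate, and the relaxation rates as the nonzero eigenvalues of the same rate matrix.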

Fig. 5 contains the Φ-values predicted by the model using two different types of calculations (see SI Appendix for details). In one, Φ-values were calculated in the small-perturbation limit by adjusting the contact energy of the mutated residue to produce a new free-energy surface with a 50-cal/mol decrease in the stability of the folded state, assuming a two-state system with the dividing surface at the barrier top (Q = 0.09 or P = 5). The folding rate was then obtained from the new relaxation rate and equilibrium constant to yield the Φ-value, as was done with the experimental data. In the second method, the Φ-value was obtained from the (Boltzmann-weighted) fraction of native contacts for each residue for all microstates at and close to the barrier top, assumed to be the transition state. This method corresponds to the conventional interpretation of experimental Φ-values and is also similar to what is done to calculate Φ-values from simulation results; the correct calculation, namely the determination of the change in folding rate and equilibrium constant produced by the mutation, is much more difficult to calculate from simulations and has not yet been done for any protein.
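The first type of calculation, in which a folding rate is extracted from a relaxation rate and an equilibrium constant and then compared between wild type and mutant, can be sketched as follows. The two-state relations used are k_relax = k_f + k_u and K_eq = k_f/k_u, and the Φ-value is taken as the ratio of the change in ln k_f to the change in ln K_eq. The function names and all numerical inputs are hypothetical, not values from Table S1.

```python
import math

def folding_rate(k_relax, k_eq):
    """Two-state folding rate from k_relax = k_f + k_u and K_eq = k_f / k_u."""
    return k_relax * k_eq / (1.0 + k_eq)

def phi_value(k_relax_wt, k_eq_wt, k_relax_mut, k_eq_mut):
    """Phi = change in ln k_f divided by change in ln K_eq upon mutation."""
    kf_wt = folding_rate(k_relax_wt, k_eq_wt)
    kf_mut = folding_rate(k_relax_mut, k_eq_mut)
    return math.log(kf_mut / kf_wt) / math.log(k_eq_mut / k_eq_wt)

# Hypothetical wild type: relaxation rate 3.0e5 s^-1 with K_eq = 1.0.
# Hypothetical mutant:    relaxation rate 3.6e5 s^-1 with K_eq = 0.5.
phi = phi_value(3.0e5, 1.0, 3.6e5, 0.5)
print(round(phi, 3))
```

A value near 0 means the mutation changes stability without changing the folding rate, and a value near 1 means the folding rate changes by the same factor as the equilibrium constant.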

The important output of the theoretical model (Fig. 7) is mechanistic information, that is, the prediction of how the structure evolves from the unfolded to folded state along the reaction coordinates, P and Q. Fig. 7 E and F show the relative probability at each value of the Q or P reaction coordinate, respectively, that a residue is in its native conformation, whereas Fig. 7 G–I shows the evolution of the interresidue contacts along the Q reaction coordinate (the corresponding contact maps for P are shown in Fig. S5 and are very similar).

Splitting Probabilities. The splitting probability (or $p_{\mathrm{fold}}$) is the probability that a given microstate of the model will reach the folded state before reaching the unfolded state (29–32). These probabilities were calculated for the individual microstates from the rate matrix for the master equation of the theoretical model (see SI Appendix) (39). Fig. 8 shows the splitting-probability distribution for those microstates of the model with free energies of 2 $k_BT$ or less above the lowest-free-energy microstate for the specified value of the coordinate. Fig. S6 shows the distribution for all microstates at these values of the reaction coordinates.
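The idea of a splitting probability can be illustrated on a one-dimensional chain of states, where it has a closed form. The sketch below is a toy, not the calculation over all microstates described in the text: it assumes nearest-neighbor hopping with symmetric detailed-balance rates, treats the first point as the absorbing unfolded state and the last as the absorbing folded state, and uses a made-up free-energy profile.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def splitting_probabilities(g, temperature):
    """p_fold at each point of a 1D profile g (kcal/mol).

    For nearest-neighbor hopping, the increment p[i+1] - p[i] is proportional
    to 1 / (population_i * rate_{i -> i+1}), which for symmetric
    detailed-balance rates is exp((g[i] + g[i+1]) / (2 RT)).
    """
    rt = R_KCAL * temperature
    resist = [math.exp((g[i] + g[i + 1]) / (2.0 * rt)) for i in range(len(g) - 1)]
    total = sum(resist)
    p = [0.0]                      # unfolded end: never folds first
    for r in resist:
        p.append(p[-1] + r / total)
    return p                       # last entry is 1 (folded end)

flat = splitting_probabilities([0.0, 0.0, 0.0, 0.0, 0.0], 310.0)
barrier = splitting_probabilities([0.0, 1.0, 2.0, 1.0, 0.0], 310.0)
print(flat)
print(barrier)
```

On the symmetric barrier profile the point at the top has a splitting probability of one half, which is the property that makes the barrier top of a good reaction coordinate coincide with the transition state.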

Discussion

Each of the three models used to simultaneously fit the equilibrium and kinetic data provides information on the mechanism of folding of the villin subdomain. The three-state chemical-kinetics model provides an excellent fit to all of the data (Figs. 2–4), which is not surprising because there are 18 adjustable parameters in the model (Table S2). The important result from this model is that the intermediate interconverts with the fully folded state (~$10^7$ s$^{-1}$) much faster than with the unfolded state (~$10^5$ to $10^6$ s$^{-1}$), from which we conclude that it is located on the folded side of the major free-energy barrier. The three-state model also attributes the increase in CD before the main unfolding transition (Fig. 3B) to an increase in the partially unfolded intermediate-state population of lower helix content (Fig. S2). The important feature of a chemical-kinetics model is that there is a straightforward prescription for obtaining information on the ensemble of structures of the transition state from the relative effects of site-directed mutants on the folding rate and equilibrium constant, the Φ-value (22), defined as $\Delta \ln k_f / \Delta \ln K_{eq}$. Because of the separation in time scales between the fast phase and the global unfolding/refolding phase, the fast phase could be ignored, permitting the calculation of Φ-values from a simple two-state analysis to obtain the folding rate and equilibrium constant for the wild-type and the mutants. None of the mutations are ideal for Φ-value analysis, because they do not represent small structural changes, such as simple deletion of a methyl group, as in replacing isoleucine with valine or threonine with serine. Nevertheless, the Φ-values are unusually low at 310 K (Fig. 5), suggesting a transition state with little structure formation and therefore one that appears very early along the reaction coordinate. There is, moreover, a significant increase in Φ-values for four of the nine residues for which Φ-values could be calculated at 340 K (Table S1), suggesting a shift in the transition state toward the folded state at the higher temperature.

The physical-kinetics model, which consists of an empirical one-dimensional free-energy surface with two deep minima (Fig. 6) and the coordinate dependence of the fluorescence QY and CD (Fig. S3) to give a total of 16 adjustable parameters (Table S4), also provides an excellent fit to the equilibrium and kinetic data (Figs. 2–4). According to this model, the position of the folded-well minimum moves toward smaller values of the reaction coordinate with increasing temperature. This motion represents partial unfolding and explains the decrease in helix content before the global thermal unfolding (Fig. 3B). It also explains the ~100-ns phase in the kinetics (Fig. 4) as reconfiguration in the shifted folded well at the elevated temperature. The lack of temperature dependence for the rate of the ~100-ns phase is consistent with our a priori assumption that there is no additional barrier on the free-energy surface.

Fig. 8. Distribution of splitting probabilities (pfold) calculated from the rate matrix for microstates up to 2 k_BT above the most stable one at each value of the reaction coordinate at 310 K. (A) pfold distribution at Q = 0.03, 0.09 (the barrier top), and 0.27. (B) pfold distribution at P = 3, 5 (the barrier top), and 10. See Fig. S6 for pfold distribution for all microstates at the same values of the reaction coordinates.

Kubelka et al. PNAS | December 2, 2008 | vol. 105 | no. 48 | 18659

The major result of the physical-kinetics model is the prediction of the height of the free-energy barrier separating folded and unfolded states. According to Kramers' theory, the rate of barrier crossing depends on the curvature of the wells and the barrier top and exponentially on the barrier height (24), so the barrier height dominates. An important result of the physical-kinetics model, then, is that the free-energy barrier to folding (Fig. 6) is small (~2 kcal/mol). A ~2-kcal/mol barrier is also simply calculated from Kramers' equation for a two-state system, assuming that the diffusion coefficients and curvatures in the folded and unfolded well and at the barrier top of the free-energy surface are the same, i.e., $\Delta G_f^{\ddagger} = RT_f \ln(\tau_f / 2\pi\tau) \approx 1.6$ kcal/mol, where $\tau_f$ (4.6 μs) is the folding time and $\tau$ (70 ns) is the reconfiguration time in the folded well (4). The 1- to 2-kcal/mol barriers at the folding temperature are also obtained by fitting the calorimetric data (Fig. 2) with the variable barrier model of Muñoz and Sanchez-Ruiz (6) or the temperature dependence of the relaxation rates (Fig. 4A) with a mean-field model of Naganathan et al. (34).
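The two-state Kramers estimate quoted above can be checked numerically. This is a minimal sketch that assumes the relation ΔG‡ = RT_f ln(τ_f/2πτ), with τ_f = 4.6 μs and τ = 70 ns taken from the text and T_f = 340 K taken from the folding temperature given later in the Discussion. The function name and the rounded gas constant are illustrative choices.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def kramers_barrier(tau_fold, tau_well, temperature):
    """Barrier height (kcal/mol) from folding time and well reconfiguration time.

    Assumes equal diffusion coefficients and curvatures in the wells and at
    the barrier top, so that tau_fold = 2*pi*tau_well*exp(dG/RT).
    """
    return R_KCAL * temperature * math.log(tau_fold / (2.0 * math.pi * tau_well))

# tau_f = 4.6 microseconds, tau = 70 ns, T_f = 340 K
barrier = kramers_barrier(tau_fold=4.6e-6, tau_well=70e-9, temperature=340.0)
print(f"barrier = {barrier:.2f} kcal/mol")
```

Because the barrier depends only logarithmically on the ratio of the two times, even a twofold uncertainty in either time changes the estimate by less than 0.5 kcal/mol at this temperature.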

The Ising-like theoretical model provides remarkably good fits to all of the experimental equilibrium and kinetic data (Figs. 2–4) and yields the most information about the folding mechanism, yet contains far fewer adjustable parameters than either the chemical-kinetics or physical-kinetics models. Although the fit to the excess heat capacity-vs.-temperature curve (Fig. 2) is not as good as that of either of the other two models, the theoretical model requires only two adjustable parameters to (nearly) fit the heat-capacity data, a contact energy and a conformational-entropy loss (Eq. 1) that are the same for every residue, compared with eight adjustable parameters of the chemical-kinetics model and nine of the physical-kinetics model. The fitted value of −3.70 cal mol⁻¹ K⁻¹ for the conformational-entropy loss, moreover, is comparable to the average value expected from thermodynamic analysis of many proteins (6).

The kinetics are described by the theoretical model as hopping along the discretized one-dimensional free-energy surface given by the partition function, by using either the fraction of native contacts (Q) or number of native residues (P) as reaction coordinates (25–27). By invoking a linear free-energy relation between the hopping rate and the ratio of equilibrium populations at adjacent values of the reaction coordinate, the additional adjustable parameters required to describe the kinetics are the proportionality constant and an activation energy for its temperature dependence (Eqs. S10–S12 in SI Appendix). From a recent study of the viscosity dependence of the measured relaxation rates using a viscogen that has no effect on the equilibrium properties, we obtained the reaction-coordinate dependence of this proportionality constant by fitting the data with the theoretical model (see Fig. S7) (7). The resulting simultaneous fits to all of the equilibrium and kinetic data (Figs. 2–4 and Fig. S1), apart from the CD (see SI Appendix), are not quite as good as those of the other models, but again there is a large difference in the number of adjustable parameters (7 for the theoretical model, 15 for the physical-kinetics model, and 18 for the chemical-kinetics model). The theoretical model fails to come close to fitting the ~100-ns phase, attributed by the physical-kinetics model to reconfiguration in the folded well. This relaxation apparently reflects the fine structure of the free-energy surface, which is not captured by this simple theoretical model. The failure of the model may also result from the implicit assumption of the same prefactor independent of the type of motion (40) or from an oversimplification in our treatment of the tryptophan (W23) quantum yield, which assumes that contact with the protonated histidine (H27) one turn away in the helix (Fig. 1) is the only additional source of fluorescence quenching in the folded state.
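One simple way to realize a linear free-energy relation of this kind is to set the hopping rate between adjacent values of the reaction coordinate proportional to the square root of the ratio of their equilibrium populations. This specific form, the function name, and the free-energy profile below are assumptions made for illustration; the relation actually used is defined in Eqs. S10–S12 of the SI Appendix, which is not reproduced here. The sketch builds such rates and checks that they satisfy detailed balance.

```python
import math

def hopping_rates(free_energy_kT, prefactor):
    """Forward/backward hopping rates between adjacent reaction-coordinate values.

    Assumed form: k(i -> j) = prefactor * sqrt(p_j / p_i), with equilibrium
    populations p_i proportional to exp(-G_i / kT).
    """
    forward, backward = [], []
    for g_here, g_next in zip(free_energy_kT, free_energy_kT[1:]):
        dg = g_next - g_here
        forward.append(prefactor * math.exp(-dg / 2.0))   # i -> i+1
        backward.append(prefactor * math.exp(+dg / 2.0))  # i+1 -> i
    return forward, backward

# Hypothetical profile in units of kT: unfolded well, barrier, folded well.
profile = [0.0, 1.5, 3.0, 1.0, -1.0, -2.0]
k_fwd, k_bwd = hopping_rates(profile, prefactor=1.0e7)

# Detailed balance: k_fwd / k_bwd must equal the population ratio p_next / p_here.
for i, (kf, kb) in enumerate(zip(k_fwd, k_bwd)):
    ratio = math.exp(-(profile[i + 1] - profile[i]))
    assert math.isclose(kf / kb, ratio, rel_tol=1e-12)
```

With this choice the only kinetic parameter is the prefactor, which sets the overall time scale without altering the equilibrium populations.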

A question that immediately arises in describing the kinetics as diffusion on the one-dimensional free-energy surface is: are Q and P good reaction coordinates? A critical test is whether the splitting probability is close to one-half for the microstates at the free-energy barrier tops of these profiles. We therefore calculated the splitting probability (also called the pfold) from the rate matrix of the master equation of the model (see SI Appendix). Fig. 8 shows the distribution of splitting probabilities for those microstates of the model within 2 k_BT of the lowest free-energy microstate at the barrier top and at positions 1.6 k_BT to the left and right of the barrier top. A substantial fraction of microstates exhibits a pfold between 0.4 and 0.6 at the barrier top for both Q (28% at Q = 0.09) and P (44% at P = 5) reaction coordinates, with a sharp change in the pfold distribution at higher and lower values of Q and P. We also asked the question: what fraction of the low-lying microstates with a pfold between 0.4 and 0.6 is within the barrier region, i.e., at reaction coordinates within 1 k_BT of the barrier top? If we identify the low-lying microstates as those within 2 k_BT of the lowest free-energy microstate at each value of the reaction coordinate, we find that the barrier region contains 84% of the microstates with a pfold between 0.4 and 0.6 for Q and 100% for P. If the low-lying microstates are identified as those within 3 k_BT of the lowest free-energy microstate at each value of the reaction coordinate, the barrier region contains 48% for Q and 90% for P. All of these results indicate that both Q and P are indeed good reaction coordinates for describing the kinetics.
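To make the splitting-probability test concrete, the sketch below computes pfold for every state of a small master equation by solving the stationary backward equation, Σ_j k_ij (p_j − p_i) = 0, with pfold fixed at 0 in the unfolded state and 1 in the folded state. This is a generic textbook approach applied to a hypothetical five-state chain; it is not the calculation for the ~10⁵ microstates of the model, whose details are in the SI Appendix.

```python
def committor(rates, unfolded, folded):
    """Splitting probability p_fold for each state of a small master equation.

    rates[i][j] is the transition rate from state i to state j (i != j).
    p_fold[i] is the probability of reaching `folded` before `unfolded`.
    """
    n = len(rates)
    # Linear system A p = b: boundary rows fix p, interior rows are the
    # backward equation sum_j k_ij (p_j - p_i) = 0.
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        if i in (unfolded, folded):
            A[i][i] = 1.0
            b[i] = 1.0 if i == folded else 0.0
            continue
        for j in range(n):
            if j != i:
                A[i][j] += rates[i][j]
                A[i][i] -= rates[i][j]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            if f != 0.0:
                for c in range(col, n):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]
    # Back substitution.
    p = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][c] * p[c] for c in range(i + 1, n))
        p[i] = s / A[i][i]
    return p

# Hypothetical 5-state chain with equal hopping rates in both directions.
n = 5
rates = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    rates[i][i + 1] = 1.0
    rates[i + 1][i] = 1.0
p_fold = committor(rates, unfolded=0, folded=n - 1)
print(p_fold)
```

For this flat chain the middle state has a pfold of one-half by symmetry, which is the behavior expected of microstates at the barrier top when the chosen coordinate is a good reaction coordinate.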

Another important test of the theoretical model is its ability to predict Φ-values, which cannot be calculated from either the chemical- or physical-kinetics model. Fig. 5 shows the Φ-values for the 10 mutations studied. Of these, only two (L20V and L28V) might represent a sufficiently small structural perturbation to satisfy the assumption of the Φ-value analysis that the mutation does not alter the unfolded state and changes only the interaction of the mutated residue with its contacting neighbors in the transition and folded states without perturbing any other residue–residue interactions. At 310 K the observed Φ-values are unusually low compared with all other proteins, which is qualitatively explained by the theoretical model as resulting from the major barrier being very early along the reaction coordinate for both Q and P (Fig. 7 B and C), where there is a very low probability of native contact formation for most residues (Fig. 7G and Figs. S5A, S8, and S9). The model also does a remarkably good job of predicting the low Φ-values quantitatively using two different methods (Fig. 5 and Table S1).

Although the system is not far from two-state, as judged by the temperature dependence of the probability that a residue is in its native state (Fig. 7D), we do not have the simple case of a high barrier that remains at the same position along the reaction coordinate at all temperatures. At temperatures above the folding temperature of 340 K, the surface becomes more complex, indicating formation of an intermediate, and a new major barrier appears closer to the folded state (Q ≈ 0.5 and P ≈ 25). (A simple calculation explains almost all of the shift: increasing the temperature by ΔT results in an increase in free energy at P = 25 relative to P = 5 from the change in the contribution from the conformational entropy of ΔP × ΔT × Δs_conf, which, for a 40°C temperature increase, is 3 kcal/mol.) This change in the surface contributes to the denaturant independence (Fig. S10) of the observed relaxation rate because of the relative insensitivity of the folding and unfolding barriers to the contact energy at low and high denaturant concentration, respectively (Fig. 7 B and C) (5). The change in the free-energy surface also results in the decrease in the viscosity dependence of the relaxation rate as the temperature is increased because of an increased contribution of internal friction in the more compact structures at higher values of the reaction coordinate (7) (Fig. S7). The model therefore also predicts that there should be an increase in Φ-values with increasing temperature, and in the four mutants where measurements could be made at 340 K, there is indeed a significant increase (Table S1).
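The parenthetical estimate above is simple arithmetic and can be reproduced directly. The sketch assumes ΔP = 25 − 5 = 20 residues, ΔT = 40 K, and the magnitude of the fitted conformational-entropy loss, 3.70 cal mol⁻¹ K⁻¹ per residue, all taken from the text. The function name is illustrative.

```python
def entropy_shift_kcal(delta_p, delta_t, ds_conf_cal):
    """Free-energy shift in kcal/mol: delta_P * delta_T * |delta_s_conf|."""
    return delta_p * delta_t * ds_conf_cal / 1000.0  # cal -> kcal

shift = entropy_shift_kcal(delta_p=25 - 5, delta_t=40.0, ds_conf_cal=3.70)
print(f"shift = {shift:.2f} kcal/mol")
```

The result rounds to the 3 kcal/mol quoted in the text.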

Because of the complexity of the surface, both experimental and theoretical Φ-values are only approximate. The correct and more rigorous approach is to compare the new thermal unfolding curves and relaxation rates that result from the mutation. That is, vary the contact energy (and possibly also the conformational entropy change) of the mutated residue to optimally fit the new thermal unfolding curve and compare the calculated relaxation rate on the new free-energy surface with the observed relaxation rate. To test this prediction of the model will require more extensive data on the temperature dependence of the kinetics, particularly results from more conservative mutations than are currently available.

18660 | www.pnas.org/cgi/doi/10.1073/pnas.0808600105 Kubelka et al.

The utility of the theoretical model, of course, is that, if accurate, it provides a detailed description of the mechanism. Unlike Φ-value analysis, which, with rare exceptions (21), provides structural information at a single, albeit important, point along the reaction coordinate (the transition state), the theoretical model describes the complete evolution of the unfolded to folded structure. This information is shown in Fig. 7 E and F as the probability of a residue being in the native conformation at each position on the reaction coordinate and the evolution of the contact map (Fig. 7 G–I and Fig. S5 A–C). The model predicts that the N-terminal helix (D3 to F10) forms first, along with an interhelical contact with the first few residues of the second helix (R14 to F17), followed by the middle helix, and finally the C-terminal helix (L21 to E31). The longest-range contacts between the N- and C-terminal helices (V9-K32, F10-L28) do not begin to form until very late along the reaction coordinate. Diagrams of the individual microstates of the transition-state ensemble at 310 K (Figs. S8 and S9) show that for both Q and P reaction coordinates, the interaction of the helical residues F6-K7-A8 with the G11-M12-T13 loop contributes most to the stability (Fig. 1) (reflected in the peaks of the theoretically calculated Φ-values in Fig. 5), so that forming the transition-state ensemble does not correspond to simple helix nucleation.

There have been a large number of simulations of various types with the aim of describing the dynamical properties and folding mechanism of the villin subdomain (see SI Appendix for a list of simulation references). So an obvious question might be: How does this evolution of the structure compare with the predictions of simulations that provide much more structural detail than can be obtained from our theoretical model? We believe that it is premature to make detailed comparisons for at least two reasons. First, none of the directly measured experimental quantities have been calculated in any of the simulations, most importantly the temperature dependence of the heat capacity, which tests their thermodynamic accuracy. Second, although the agreement between the predictions of our theoretical model and experimental results is impressive, as pointed out above, more extensive mutagenesis will be required to more rigorously test the predicted order of structure formation. Nevertheless, there are two intriguing results that should be mentioned. Shakhnovich and coworkers used Monte Carlo simulations of an atomistic model to calculate Φ-values from the fraction of native contacts in the ensemble of microstates with pfold between 0.4 and 0.6 (41). Although their calculated Φ-values are uniformly higher at 300 K than we observe at 310 K (Fig. 5), the pattern of Φ-values indicates that the N-terminal and middle helices form before the C-terminal helix. Pande and coworkers have carried out all-atom molecular-dynamics calculations in explicit solvent (42, 43). They find that the native conformation is in fast exchange with ensembles of conformations containing the middle and C-terminal helices and conformations containing only the middle helix, providing a possible explanation of the ~100-ns relaxation (see also ref. 44).

Concluding Remarks

Our analysis of equilibrium and kinetic data using three different approaches highlights the importance of a theoretical model. Despite having many fewer adjustable parameters than either the chemical-kinetics or physical-kinetics models, the Ising-like model produces an almost equally good fit to a wide range of experimental data. An obvious question is: Why does such a simple model, with a perfectly funneled energy landscape and no distinction between the different amino acid residues, work so well? A possible answer to the first part is that coarse graining works because of enthalpy–entropy compensation. This issue could be explored further by using contact maps generated with different distance or atomistic criteria or with residue-specific potentials (45). The answer to the second part of the question is more biological than physical, and is based on the idea that natural selection minimizes nonnative interactions to prevent traps that will slow folding or lead to aggregation (19).

What can be done to further test and refine the model? One computational test would be to carry out Langevin simulations of a coarse-grained representation of the protein. These calculations could determine how many contiguous sequences would be required in the model to adequately describe the folding process. An experimental test would be to carry out mutational studies that can be interpreted more rigorously, in which the change in the amino acid is a simple methyl-group deletion (e.g., threonine to serine or leucine to norvaline). Finally, a more complete description of the mechanism can, of course, be obtained from the kinetic equations that describe the interconversion of all microstates of the theoretical model (Eq. S13). Solution of these equations will yield the distribution of pathways that the protein takes from the folded to the unfolded state, as can be obtained by molecular simulations, and might be tested by single-molecule FRET experiments (46).

Materials and Methods

The 35-residue villin subdomain (LSDED FKAVF GMTRS AFANL PLWKQ QHLKK EKGLF) was obtained from California Peptide Research. Solutions were buffered with 20 mM sodium acetate at pH 4.9. Details of the differential scanning calorimetry experiments have been described by Godoy-Ruiz et al. (6). The absolute heat capacity for the fully formed native structure was taken from the study of Freire, using $C_P^F = (b + c(T - 273.15)) M_r$ cal K⁻¹ mol⁻¹, with b = 0.329 cal K⁻¹ g⁻¹, c = 1.9 × 10⁻³ cal K⁻² g⁻¹, and the molecular weight of the protein M_r = 4084 g mol⁻¹. Godoy-Ruiz et al. argued that the larger surface-to-volume ratio for this 35-residue protein, with a greater fraction of surface residues having conformational freedom of the side chains compared with the much larger proteins used in the Freire study (47), justified the use of the Freire parameters that produce the highest heat capacity within the experimental uncertainty.
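As a quick numerical check of the native-state baseline, the sketch below evaluates the heat capacity at a chosen temperature. It assumes the linear form C_P^F = (b + c(T − 273.15)) M_r with the parameter values quoted in the text; the function name and the choice of 298.15 K as an example temperature are illustrative.

```python
def native_heat_capacity(temp_k, b=0.329, c=1.9e-3, molar_mass=4084.0):
    """Native-state heat capacity in cal K^-1 mol^-1.

    b is in cal K^-1 g^-1, c in cal K^-2 g^-1, molar_mass in g mol^-1;
    the temperature dependence is linear in degrees Celsius.
    """
    return (b + c * (temp_k - 273.15)) * molar_mass

cp_25c = native_heat_capacity(298.15)  # 25 degrees Celsius
print(f"C_P^F(25 C) = {cp_25c:.1f} cal K^-1 mol^-1")
```

The excess heat capacity analyzed in the fits is measured relative to a baseline of this kind, so the choice of b and c directly affects the apparent barrier height.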

CD was measured with a Jasco J-720 spectropolarimeter at a protein concentration of 0.2 mM. Fluorescence thermal unfolding curves were measured with a SPEX Fluorolog spectrofluorometer at 10 μM concentration. Folding kinetics were measured at 0.5 mM by using a laser-temperature-jump apparatus described in ref. 38. Each kinetic progress curve was obtained from the average of 512 laser shots. The size of the temperature jump (7–10 K) was calibrated by using the temperature dependence of N-acetyltryptophanamide fluorescence.

ACKNOWLEDGMENTS. We thank Attila Szabo, Peter Wolynes, Victor Muñoz, and Eugene Shakhnovich for many helpful discussions. This work was supported by the Intramural Research Program of the National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health.

1. Kubelka J, Eaton WA, Hofrichter J (2003) Experimental tests of villin subdomain folding simulations. J Mol Biol 329:625–630.

2. Chiu TK, et al. (2005) High-resolution x-ray crystal structures of the villin headpiece subdomain, an ultrafast folding protein. Proc Natl Acad Sci USA 102:7517–7522.

3. Buscaglia M, Kubelka J, Eaton WA, Hofrichter J (2005) Determination of ultrafast protein folding rates from loop formation dynamics. J Mol Biol 347:657–664.

4. Kubelka J, Chiu TK, Davies DR, Eaton WA, Hofrichter J (2006) Sub-microsecond protein folding. J Mol Biol 359:546–553.

5. Cellmer T, Henry ER, Kubelka J, Hofrichter J, Eaton WA (2007) Relaxation rate for an ultrafast folding protein is independent of chemical denaturant concentration. J Am Chem Soc 129:14564–14565.

6. Godoy-Ruiz R, et al. (2008) Estimating free energy barrier heights for an ultrafast folding protein from calorimetric and kinetic data. J Phys Chem B 112:5938–5949.

7. Cellmer T, Henry ER, Hofrichter J, Eaton WA (2008) Measuring internal friction in ultrafast folding kinetics. Proc Natl Acad Sci USA, in press.

8. McKnight CJ, Doering DS, Matsudaira PT, Kim PS (1996) A thermostable 35-residue subdomain within villin headpiece. J Mol Biol 260:126–134.

9. McKnight CJ, Matsudaira PT, Kim PS (1997) NMR structure of the 35-residue villin headpiece subdomain. Nat Struct Biol 4:180–184.

10. Frank BS, Vardar D, Buckley DA, McKnight CJ (2002) The role of aromatic residues in the hydrophobic core of the villin headpiece subdomain. Protein Sci 11:680–687.


11. Kubelka J, Hofrichter J, Eaton WA (2004) The protein folding 'speed limit'. Curr Opin Struct Biol 14:76–88.

12. Muñoz V, Eaton WA (1999) A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA 96:11311–11316.

13. Henry ER, Eaton WA (2004) Combinatorial modeling of protein folding kinetics: free energy profiles and rates. Chem Phys 307:163–185.

14. Muñoz V, Thompson PA, Hofrichter J, Eaton WA (1997) Folding dynamics and mechanism of beta-hairpin formation. Nature 390:196–199.

15. Muñoz V, Henry ER, Hofrichter J, Eaton WA (1998) A statistical mechanical model for β-hairpin kinetics. Proc Natl Acad Sci USA 95:5872–5879.

16. Alm E, Baker D (1999) Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures. Proc Natl Acad Sci USA 96:11305–11310.

17. Galzitskaya OV, Finkelstein AV (1999) A theoretical search for folding/unfolding nuclei in three-dimensional protein structures. Proc Natl Acad Sci USA 96:11299–11304.

18. Alm E, Morozov AV, Kortemme T, Baker D (2002) Simple physical models connect theory and experiment in protein folding kinetics. J Mol Biol 322:463–476.

19. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG (1995) Funnels, pathways, and the energy landscape of protein-folding: A synthesis. Proteins Struct Funct Genet 21:167–195.

20. Onuchic JN, Luthey-Schulten A, Wolynes PG (1997) Theory of protein folding: The energy landscape perspective. Ann Rev Phys Chem 48:545–600.

21. Oliveberg M, Wolynes PG (2005) The experimental survey of protein-folding energy landscapes. Quart Rev Biophys 38:245–288.

22. Fersht A (1999) Structure and Mechanism in Protein Science (Freeman, New York).

23. Garbuzynskiy SO, Finkelstein AV, Galzitskaya OV (2004) Outlining folding nuclei in globular proteins. J Mol Biol 336:509–525.

24. Kramers HA (1940) Brownian motion in a field of force and the diffusion model of chemical reactions. Physica VII:284–304.

25. Bryngelson JD, Wolynes PG (1989) Intermediates and barrier crossing in a random energy-model (with applications to protein folding). J Phys Chem 93:6902–6915.

26. Sali A, Shakhnovich E, Karplus M (1994) How does a protein fold. Nature 369:248–251.

27. Socci ND, Onuchic JN, Wolynes PG (1996) Diffusive dynamics of the reaction coordinate for protein folding funnels. J Chem Phys 104:5860–5868.

28. Onuchic JN, Wolynes PG, Luthey-Schulten Z, Socci ND (1995) Toward an outline of the topography of a realistic protein-folding funnel. Proc Natl Acad Sci USA 92:3626–3630.

29. Du R, Pande VS, Grosberg AY, Tanaka T, Shakhnovich EI (1998) On the transition coordinate for protein folding. J Chem Phys 108:334–350.

30. Best RB, Hummer G (2005) Reaction coordinates and rates from transition paths. Proc Natl Acad Sci USA 102:6732–6737.

31. Shakhnovich E (2006) Protein folding thermodynamics and dynamics: Where physics, chemistry, and biology meet. Chem Rev 106:1559–1588.

32. Berezhkovskii A, Szabo A (2006) A perturbation theory of phi-value analysis of two-state protein folding: Relation between p(fold) and phi values. J Chem Phys 125:104902.

33. Ma HR, Gruebele M (2005) Kinetics are probe-dependent during downhill folding of an engineered lambda(6–85) protein. Proc Natl Acad Sci USA 102:2283–2287.

34. Naganathan AN, Doshi U, Muñoz V (2007) Protein folding kinetics: Barrier effects in chemical and thermal denaturation experiments. J Am Chem Soc 129:5673–5682.

35. Brewer SH, et al. (2005) Effect of modulating unfolded state structure on the folding kinetics of the villin headpiece subdomain. Proc Natl Acad Sci USA 102:16662–16667.

36. Brewer SH, Song BB, Raleigh DP, Dyer RB (2007) Residue specific resolution of protein folding dynamics using isotope-edited infrared temperature jump spectroscopy. Biochemistry 46:3279–3285.

37. Zwanzig R, Szabo A, Bagchi B (1992) Levinthals paradox. Proc Natl Acad Sci USA 89:20–22.

38. Thompson PA, et al. (2000) The helix-coil kinetics of a heteropeptide. J Phys Chem B 104:378–389.

39. Berezhkovskii A, Szabo A (2004) Ensemble of transition states for two-state protein folding from the eigenvectors of rate matrices. J Chem Phys 121:9186–9187.

40. Portman JJ, Takada S, Wolynes PG (2001) Microscopic theory of protein folding rates. II. Local reaction coordinates and chain dynamics. J Chem Phys 114:5082–5096.

41. Yang JS, Wallin S, Shakhnovich EI (2008) Universality and diversity of folding mechanics for three-helix bundle proteins. Proc Natl Acad Sci USA 105:895–900.

42. Jayachandran G, Vishal V, Pande VS (2006) Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the villin headpiece. J Chem Phys 124:164902.

43. Jayachandran G, Vishal V, Garcia AE, Pande VS (2007) Local structure formation in simulations of two small proteins. J Struct Biol 157:491–499.

44. Lei HX, Duan Y (2007) Two-stage folding of HP-35 from ab initio simulations. J Mol Biol 370:196–206.

45. Miyazawa S, Jernigan RL (1985) Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules 18:534–552.

46. Schuler B, Eaton WA (2008) Protein folding studied by single-molecule FRET. Curr Opin Struct Biol 18:16–26.

47. Freire E (1995) in Protein Stability and Folding. Theory and Practice, ed Shirley BA (Humana, Totowa, NJ), pp 191–218.
