Dissertation Defense
Thu Zar W. Lwin
Departments of Chemistry and Molecular Biology and Biochemistry
University of California at Irvine
February 22, 2005
Application of Replica Exchange Method in Protein Folding Simulation
Basics of Protein Structure
- 20 natural amino acids: polar and nonpolar
- Hydrophobic core
- Ordered secondary structures (helix and sheet): 33% in helix, 33% in sheet, 33% in loops and turns
How are the above structures formed?
Structure of Beta Hairpin
G E W T Y D D A T K T F T V T E
- B1 Domain of Protein G in Streptococcal bacteria
- Binds to mammalian IgG
- NMR and X-ray
- No disulfide bridges => Folding study
- Fragmentation study
  - Stability in aqueous solution
  - Initiation site for folding
- Computational studies
NC
16 residues
Questions about Protein Folding
• How does a protein spontaneously fold into its native structure?
• How do we predict a protein’s native structure from its sequence?
• How can we design a protein with a specified function?
Replica Exchange Method
Hansmann, UHE (1997) Chem. Phys. Lett., 281:140-150
Efficient sampling of conformational space
Can quickly reach the states available at a specified temperature
18 replicas: 270 K to 690 K
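The exchange step of the replica exchange method can be sketched as follows. This is a minimal illustration of the standard Metropolis swap criterion, not the code used in this work. The geometric temperature spacing, the function names, and the energy units (kcal/mol) are assumptions; the slide gives only the replica count and the temperature range.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def temperature_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced temperatures (the spacing rule is an assumption)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


def exchange_probability(e_i, t_i, e_j, t_j):
    """Metropolis probability of swapping the configurations held at t_i and t_j.

    e_i and e_j are the potential energies of the configurations currently
    simulated at temperatures t_i and t_j.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # Accept with probability min(1, exp(delta)); the branch avoids overflow.
    return 1.0 if delta >= 0.0 else math.exp(delta)


temps = temperature_ladder(270.0, 690.0, 18)
```

A swap is always accepted when the colder replica holds the higher-energy configuration, which is what lets trapped low-temperature replicas escape local minima.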
Amber Force Field Model
• Amber
• CHARMM
• Cedar
• Gromos
• OPLS
An all atom energy model
Outline
• Influences of solvent models
  - Explicit solvent vs. implicit solvent
  - PB vs. GB
• Influences of force field on secondary structure propensity
• Sampling algorithm
  - Test on model energy function
  - Ab initio folding
Motivation to Analyze Solvation Models
Most populated structures
Zhou, R. (2003) Proteins, 53:148-161
- These 2 degrees of freedom describe the folding landscape
- Rg(core): radius of gyration of the residues that form the hydrophobic core (F, W, T, V) => describes compactness of the hydrophobic core
- The number of H-bonds represents the secondary structure
(Explicit) (Implicit)
Explicit vs. Implicit Solvent Model
Every atom of solvent molecule is represented
No explicit representation: the solvent is a continuum medium
Explicit Implicit
Implicit Solvent: How do we do it?
Solvation free energy
Components of the Solvation Energy
Polar Solvation Models
Solvent Accessible (SA) Model
PB/PBSA (1)
GB/GBSA (2)
(1) Lu, Q. and Luo, R. (2003) J. Chem. Phys. 119:11035-11047
(2) Onufriev, A. et al. (2004) Proteins 55:383-394
Amber99ci
Poisson-Boltzmann Model
Solves the Poisson-Boltzmann equation numerically
[Figure: positive and negative charges in a solute region (p) and a solvent region (s)]
Terms of the equation:
- Dielectric constant
- Electrostatic potential
- Charge density
- Charge of salt ion in solution
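For reference, the terms listed on this slide match the textbook nonlinear Poisson-Boltzmann equation, written below in Gaussian units. The equation on the original slide did not survive extraction, so the symbol choices here are assumptions: \varepsilon is the dielectric constant, \phi the electrostatic potential, \rho the solute charge density, q_i the charge of salt ion species i, and c_i its bulk concentration.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  = -4\pi \rho(\mathbf{r})
    - 4\pi \sum_i q_i \, c_i \,
      \exp\!\left( -\frac{q_i \, \phi(\mathbf{r})}{k_B T} \right)
```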
Generalized Born Model
=> It is an approximation to the PB equation.
Electrostatic screening effect of salt
Effective Born radius
Solvent dielectric constant
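The three quantities named on this slide (salt screening, effective Born radii, solvent dielectric constant) enter the widely used pairwise generalized Born expression. The sketch below is a minimal version of that expression, assuming an interior dielectric of 1, precomputed effective Born radii, charges in units of e, and distances in Å. The function names and the Coulomb constant value are choices made for this example and are not taken from the slides.

```python
import math

COULOMB = 332.0637  # kcal*Angstrom/(mol*e^2), an assumed conversion constant


def f_gb(r, born_i, born_j):
    """Smooth interpolation between Born (r = 0) and Coulomb (large r) limits."""
    rr = born_i * born_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))


def gb_polar_energy(charges, coords, born_radii, eps_solvent=80.0, kappa=0.0):
    """Polar solvation energy in kcal/mol.

    kappa is the Debye-Huckel screening parameter (1/Angstrom);
    kappa = 0 means no salt.
    """
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # i == j gives the Born self-energy term
            r = math.dist(coords[i], coords[j])
            f = f_gb(r, born_radii[i], born_radii[j])
            screening = 1.0 - math.exp(-kappa * f) / eps_solvent
            total += charges[i] * charges[j] * screening / f
    return -0.5 * COULOMB * total
```

For a single ion the double sum collapses to the Born formula, which is a convenient sanity check.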
Backbone RMSD (Å) at 282 K
Are the conformations similar to the crystal structure?
Misfolded Salt Bridges
Residues shown: E42, D46, D47, K50, E56
SA appears to generate more misfolded salt bridges.
Calculated NOE Distances
34: 5.52 6.21 Å
35: 4.9 6.3 Å
Blanco, FJ. et al. (1994) Struct. Biol. 1:584-590

NOE pair #    Type of proton coupling
1-10          Intra-residue: HN ... HCA
11-35         Inter-residue: HN, HCA, HCB
  11-22         (i)HCA ... (i+1)HN
  23-29         (i)HN ... (i+1)HN
  30-31         HCA ... HCA (Y F, W V)
  32            HCA ... HE2 (K Y)
  33, 34        HCA, HCB ... HD2, HE1 (Y F)
  35            HH2 ... HCB (W F)

PB models agree with NMR data.
In NMR, distance information for macromolecules can be obtained from the Nuclear Overhauser Effect (NOE), the transfer of spin polarization between nuclei. The rate of increase in NOE peak intensity is proportional to 1/r^6, where r is the distance between the two protons.
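Because of the steep distance dependence of the NOE, simulated ensembles are commonly compared with NOE data through an r^-6 weighted average rather than a plain mean. The sketch below shows that averaging; whether this exact averaging was used for the calculated distances on the previous slide is not stated, so treat it as an assumption. The function names are hypothetical.

```python
def noe_distance(distances):
    """Effective distance <r^-6>^(-1/6) over a set of snapshot distances (Angstrom)."""
    if not distances:
        raise ValueError("need at least one distance")
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def satisfies_noe(distances, upper_bound):
    """True if the effective distance is within the experimental upper bound."""
    return noe_distance(distances) <= upper_bound
```

Short contacts dominate the average: a pair that is 2 Å apart in half of the snapshots and 10 Å apart in the rest has an effective distance of about 2.2 Å, far below the arithmetic mean of 6 Å.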
Free Energy Landscape
The problem is specific to the GB model.
Native Contacts and Melting Temperatures of β-Hairpin

Solvent       % Native contacts (273 K)    Melting temperature (K)
PB            81.6                         370
PBSA          78.3                         400
GB            66.5                         320
GBSA          63.0                         350
Experiment    80.0                         ~300

Muñoz, V., et al. (1997) Nature, 390:196-199
Summary
• Performance of polar solvation based on PB is reasonably good.
• The nonpolar interaction needs to be better defined.
Outline
• Influences of solvent models
  - Explicit solvent vs. implicit solvent
  - PB vs. GB
• Influences of secondary structure propensity
  - Force fields
• Sampling algorithm
  - Test on model function
  - Ab initio folding
Why is Force Field Analysis Necessary?
• A helical peptide can be erroneously folded into a beta-hairpin with Amber96.
Ace-A5(AAARA)3A-NME
García, AE and Sanbonmatsu, KY (2002) Proc. Natl. Acad. Sci. USA, 99:2782-2787
Folded short (Fs) peptide
AMBER96 vs. QM
Condensed phase, 300 K: AMBER96 vs. QM/MM*
RMSD: 1.794 kcal/mol

Region     Amber96    QM/MM
Beta       0.86       0.61
Pass       0.02       0.16
Alpha R    0.11       0.26
Alpha L    0.00       0.07

Lu, Q. and Luo, R. (in preparation)
*Hu, H. et al. (2003) Proteins, 50:451-463
Amber94 Favors Helical Structures
García, AE and Sanbonmatsu, KY (2001) Proteins, 42:345-354
• A hairpin peptide can be erroneously folded into a helix with Amber94.
AMBER94 vs. QM
Condensed phase, 300 K: AMBER94 vs. QM/MM*
RMSD: 1.985 kcal/mol

Region     Amber94    QM/MM
Beta       0.17       0.61
Pass       0.01       0.16
Alpha R    0.82       0.26
Alpha L    0.00       0.07

Lu, Q. and Luo, R. (in preparation)
*Hu, H. et al. (2003) Proteins, 50:451-463
Spline Fitting vs. QM
Condensed phase, 300 K: Spline vs. QM/MM*
RMSD: 0.0056 kcal/mol

Region     Amber Spline    QM/MM
Beta       0.66            0.61
Pass       0.01            0.16
Alpha R    0.21            0.26
Alpha L    0.10            0.07

Lu, Q. and Luo, R. (in preparation)
*Hu, H. et al. (2003) Proteins, 50:451-463
AMBER Force Fields
1. Amber03: Duan, Y. et al. (2003) J. Comput. Chem. 24:1999-2012.
2. Amber99ci: Lu, Q. and Luo, R. (in preparation).
3. Amber99m2: Wang, J. and Luo, R. (in preparation).
4. Amber99m1: Simmerling, C. et al. (2002) J. Am. Chem. Soc. 124:11258-11259.
5. Amber99off: García, AE and Sanbonmatsu, KY (2002) Proc. Natl. Acad. Sci. USA 99:2782-2787.
6. Amber94: Cornell, WD, et al. (1995) J. Am. Chem. Soc. 117:5179-5197.
Comparing PME and PB using β-Hairpin Peptide

Region     PME             PB
Beta       0.46 (0.10)     0.53 (0.10)
Pass       0.02 (0.004)    0.01 (0.008)
Helix-R    0.38 (0.09)     0.26 (0.07)
Helix-L    0.09 (0.02)     0.19 (0.02)
State 4    0.01 (0.007)    0.01 (0.005)

Distribution of φ/ψ angles from 10 residues in sheet (450 K)
6 force fields => 14 μs
Is PB good enough?
Comparisons
Structure
- Crystal structure
  => Native contact
  => Backbone RMSD
- Experimental NOE
- Secondary structure propensity
Thermodynamics
- Population
  => Fluorescence
  => NMR
- Transition temperature
Mechanism
- Free energy landscape
- Order of hydrogen bonding
Native Contact Fraction
- Cα-Cα distance of non-neighboring residue pairs
  => 6.5 Å cutoff distance
  => 21 pairs in crystal structure
  => Fractional number of pairs found in a conformation
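The native contact fraction defined above can be sketched as follows. The 6.5 Å cutoff comes from the slide; the minimum sequence separation of 3 used to define "non-neighboring" and the function names are assumptions made for this example.

```python
import math


def contact_pairs(ca_coords, cutoff=6.5, min_separation=3):
    """Residue index pairs (i, j) within the cutoff in a reference structure.

    min_separation excludes sequence neighbors; its value is an assumption.
    """
    pairs = []
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


def native_contact_fraction(ca_coords, native_pairs, cutoff=6.5):
    """Fraction of the reference (native) pairs that are formed in a conformation."""
    if not native_pairs:
        raise ValueError("native_pairs must not be empty")
    formed = sum(
        1 for i, j in native_pairs
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff
    )
    return formed / len(native_pairs)
```

In use, `contact_pairs` would be called once on the crystal structure to obtain the reference list, and `native_contact_fraction` would then be evaluated on every simulation snapshot.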
Backbone RMSD
The smaller the RMSD value, the better.
Distribution of Salt Bridges
[Figure: population (0 to 0.7) of the salt bridges E42-->K50, D46-->K50, D47-->K50, and E56-->K50 for force fields ff99ci, ff03, ff99m2, ff99m1, ff99off, and ff94; structure labeled with residues E42, D46, D47, K50, and E56]
NOE
ff99ci is in agreement with all NOE structural data.
Secondary Structure Propensity
Gray: Helix
Olive green: Beta-sheet
Is there a balance between secondary structures?
Comparisons
Structure
- Crystal structure
  => Native contact
  => Backbone RMSD
- Experimental NOE
- Secondary structure propensity
Thermodynamics
- Native population
  => Fluorescence
  => NMR
- Transition temperature
Mechanism
- Free energy landscape
- Order of hydrogen bonding
Comparison of Hairpin Populations to NMR

Experimental data in aqueous solution:

Research group       Type of exp.      Temperature (K)    Hairpin population (%)
Blanco (1994)        NMR (direct)      278                42
Fesinmeyer (2004)    NMR (mutation)    278                42 43
Fesinmeyer (2004)    NMR (mutation)    298                30

Simulations in PB solvent with dielectric 80.0:

Force field    Avg. % population at 282 K
ff03           28
ff99ci         31
ff99m2         4.6
ff99m1         8.0
ff99off        1.5
ff94           0.5
Comparison of Native Contact Populations to Fluorescence Data
                              Population % (270 K)
ff03                          74.4
ff99ci                        81.5
ff99m2                        59.5
ff99m1                        50.8
ff99off                       44.5
ff94                          43.4
Fluorescence study (273 K)    80.0

Muñoz, V., et al. (1997) Nature, 390:196-199
Transition Temperature
50% of the sheet population exists at the transition temperature.

                      TF (K)
ff03                  385
ff99ci                368
ff99m2                368
ff99m1                368
Fluorescence study    ~300

Muñoz, V., et al. (1997) Nature, 390:196-199
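Given the folded population at each replica temperature, the 50% criterion above can be located by linear interpolation between neighboring temperatures. This is a sketch of one simple way to do so, and the interpolation scheme and function name are assumptions. The numbers in the usage note are hypothetical, not simulation results.

```python
def transition_temperature(temps, populations, threshold=0.5):
    """Temperature at which the population first crosses the threshold.

    temps must be sorted in increasing order and populations must have
    the same length. Returns None if the threshold is never crossed.
    """
    for k in range(len(temps) - 1):
        p0, p1 = populations[k], populations[k + 1]
        crosses = (p0 - threshold) * (p1 - threshold) <= 0.0
        if crosses and p0 != p1:
            frac = (threshold - p0) / (p1 - p0)
            return temps[k] + frac * (temps[k + 1] - temps[k])
    return None
```

For example, with hypothetical populations of 0.8, 0.6, and 0.2 at 300 K, 350 K, and 400 K, the function returns 362.5 K.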
Comparisons
Structure
- Crystal structure
  => Native contact
  => Backbone RMSD
- Experimental NOE
- Secondary structure propensity
Thermodynamics
- Population
  => Fluorescence
  => NMR
- Transition temperature
Mechanism
- Free energy landscape
- Order of hydrogen bonding
Free Energy Landscape
Temperature 282K
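A free energy landscape of this kind is typically built by histogramming the sampled conformations along the chosen coordinates and converting bin populations P to free energies with F = -kT ln P. The sketch below assumes two coordinates such as Rg(core) and the number of H-bonds; the binning scheme, the function name, and the choice to report free energies relative to the most populated bin are assumptions for this example.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_surface(samples, temperature, bin_widths):
    """Map each occupied bin to its free energy in kcal/mol.

    samples: iterable of coordinate tuples, e.g. (rg_core, n_hbonds).
    Free energies are relative to the most populated bin, which is 0.
    """
    counts = Counter(
        tuple(math.floor(x / w) for x, w in zip(sample, bin_widths))
        for sample in samples
    )
    if not counts:
        raise ValueError("no samples given")
    max_count = max(counts.values())
    kt = K_B * temperature
    return {b: -kt * math.log(c / max_count) for b, c in counts.items()}
```

A bin visited one third as often as the most populated bin lies kT ln 3 higher, which is about 0.62 kcal/mol at 282 K.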
Hydrogen Bonding Probability
3 > 5 > 4 > 2 4 > 5 > 3 > 2 7 > 4 > 5 > 3
5 > 4 > 3 > 6 7 > 5 > 4 > 6 7 > 6
3 > 5 > 4 > 2 4 > 5 > 3 > 2
Summary
Out of 6 force fields, only the 2 most recent (ff03 and ff99ci) treat the backbone torsions correctly.
Structure
=> Can produce native-like conformations.
=> Misfolded salt bridges can form in imperfect force fields.
Thermodynamics
=> Balance between helical and sheet structures
Mechanism
=> L-shaped landscape
=> Existence of intermediates and their locations depend on the force field.
Outline
• Influences of solvent models
  - Explicit solvent vs. implicit solvent
  - PB vs. GB
• Influences of secondary structure propensity
  - Force fields
• Sampling algorithm
  - Test on model function
  - Ab initio folding
Dill, K. A. and Chan, H. S., (1997) Nature Struct. Biol., 1:10-19.
Residue Model All atom model
Why Is a New Method Necessary?
Dual REM
Model Energy:
Global Optimization
Thermodynamic Simulation
Testing Dual REM
β-Hairpin Peptide:
Ab Initio Folding
16 residues
G E W T Y D D A T K T F T V T E
300 4390
Instantaneous Energy to Global minimum
Energy Model
Efficiency
3d_2d 14
5d_4d 1
5d_3d 43
5d_2d 49
Distribution of Energy
Comparing Imperfectness across Different Resolution Combinations
Simulations
PB REM
- 18 temps (18 nodes)
- 1 ps MD before replica exchange
- 5 replica exchanges before resolution exchange
*Lattice REM
- 9 temps, one node
- 100 MC before replica exchange
- 10 replica exchanges before resolution exchange
Simulated annealing (2.5, 10 anneal steps, 200 MC/step)
Const. temp lattice run (200 MC)
Reconstruction
Gas/Min: 500 steps
Heating: 100 steps
Interface (400 trial exchanges)
- 270 K: 0.90
- 295 K: 0.98
Single PB REMs:
- Extended structure (5 ns)
- Crystal structure (2 ns)
Dual REM (2 ns)
Interface
*MMTSB
Ab Initio Folding (RMSD)
[Figure: probability (0 to 0.7) vs. RMSD (0 to 12) for Dual REM, Single REM, and Crystal REM]
Dual REM can fold into the native structure (1 to 2 Å)
Analysis on the last 0.5 ns of simulation
Dual REM 0.2 ns
Single REM > 5.0 ns
Summary
• Dual REM is faster than single REM in both testing scenarios.
• Limitations of this method:
  => The imperfectness between the two resolutions must be small.
  => The low-resolution model must be fairly efficient.
  => The computational cost of the interface must be low. In our folding simulation, this cost is negligible.
Future Directions
• Better treatment of nonpolar solvation.
• Similar testing on a helical peptide in force field analysis.
• Improvement of ff99ci with condensed-phase QM calculations.
• Testing of ab initio folding on a protein that contains both kinds of secondary structure, such as domain B1 of protein G.
Acknowledgements
Mengjuei Hsieh
Morris Chen
Dr. Qiang Lu
Dr. Chuck Tan
Dr. Yu-Hong Tan
Dr. Lijiang Yang
Department of Chemistry
Chemical and Material Physics Program
UC Regents Dissertation Fellowship
Committee Members:
Professor Ray Luo
Professor Douglas J. Tobias
Professor David A. Brant