tools and methods for multiscale biomolecular simulations


Upload: enye

Post on 01-Feb-2016

Page 1: Tools and methods for multiscale biomolecular simulations

Tools and methods for multiscale biomolecular simulations

We represent a partnership of 7 researchers located at three universities and the National Institute of Environmental Health Sciences (NIEHS), all within North Carolina’s Research Triangle

NC State Members: Jerry Bernholc, Lubos Mitas, Christopher Roland and Celeste Sagui (PI) – Physics

UNC Member: Lee Pedersen – Chemistry and NIEHS

Duke Member: John Board – Computer Science

NIEHS Collaborator: Tom Darden – Structural Biology Lab

Celeste Sagui

Department of Physics, NC State University, Raleigh NC

Page 2: Tools and methods for multiscale biomolecular simulations

ITR Scientific Aims

• explore the science needed to enable a set of scalable computational tools for large-scale, multiscale biomolecular simulations

• multiscale methods are to range from Quantum Monte Carlo (QMC) to continuum methods

• codes will be based on real-space grids, with multigrid acceleration and convergence

• electrostatics will be treated in a highly efficient and accurate manner

• codes will be used to solve paradigmatic biomolecular problems

• codes will ultimately be distributed under the Open Source GPL license

Page 3: Tools and methods for multiscale biomolecular simulations

Some Highlights of Current Progress

1. Accurate and efficient electrostatics for large-scale biomolecular simulations (Sagui, Darden, Roland)

2. Coupling of QMC and MD (Mitas)

3. Thomas-Fermi and DFT (Bernholc)

4. PMEMD and AMBER 8.0 (Pedersen, Duke)

5. Applications

Page 4: Tools and methods for multiscale biomolecular simulations
Page 5: Tools and methods for multiscale biomolecular simulations

Classical Molecular Dynamics Developments: Accurate Electrostatics

Page 6: Tools and methods for multiscale biomolecular simulations

Accurate and Efficient Electrostatics for Large-Scale Biomolecular Simulations

Accurate electrostatics is essential for meaningful biomolecular simulations: electrostatic interactions stabilize the delicate 3-D structures, bind complexes together, and represent the computational bottleneck in current simulations

Key Challenges:

(a) More accurate description of electrostatics with higher-order multipoles is needed (Our solution: Wannier functions)

(b) Computationally efficient ways of simulating such systems are needed (Our solution: advanced PME/ Multigrid approaches)

Page 7: Tools and methods for multiscale biomolecular simulations

Limitations of Current Modeling of Electrostatic Fields

Higher-order multipoles have not been implemented due to their overwhelming costs:

1. Multipoles up to hexadecapoles have 35 degrees of freedom per site, so the interaction matrix between two sites has 1225 components; i.e., the cost of a fixed-cutoff implementation is 3 orders of magnitude higher than for charges alone!

2. Ewald implementation grows like O(N^2)

3. Use of cutoffs alleviates the problem: WRONG!

much of the cost originates in the direct part

truncation leads to artifactual behavior unless cutoffs of the order of 25 Å are used
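The component count quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the 35 degrees of freedom are the independent Cartesian components of the symmetric multipole tensors of rank 0 through 4; the function name is illustrative and not taken from any of the codes discussed here.

```python
def cartesian_components(rank: int) -> int:
    """Independent components of a symmetric rank-`rank` Cartesian tensor in 3-D."""
    return (rank + 1) * (rank + 2) // 2

names = ["charge", "dipole", "quadrupole", "octupole", "hexadecapole"]
per_rank = [cartesian_components(rank) for rank in range(5)]
total = sum(per_rank)        # components carried by one site
pair_terms = total ** 2      # entries in the site-site interaction matrix

for name, count in zip(names, per_rank):
    print(f"{name:>13}: {count}")
print("total per site:", total)
print("pair interaction matrix entries:", pair_terms)
```

Compared with a single charge-charge term per pair, the pair matrix is roughly three orders of magnitude larger, which is the cost argument made on this slide.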

Page 8: Tools and methods for multiscale biomolecular simulations

Sagui, Pedersen, and Darden, J. Chem. Phys. 120, 73 (2004)

Our Solution

1. Implement a McMurchie-Davidson formalism for the direct part of the Ewald summation

2. Switch most of the calculation to reciprocal space

3. Implement a Particle-Mesh Ewald (PME)-based approach for single-processor machines

4. Implement a multigrid-based approach for parallel machines
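The idea of moving work from the direct sum to reciprocal space can be illustrated with the standard Ewald splitting of the Coulomb kernel for point charges, 1/r = erfc(βr)/r + erf(βr)/r. The sketch below is not the McMurchie-Davidson multipole formalism itself; the function name and the numerical values of β and the cutoff are illustrative assumptions.

```python
import math

def ewald_split(r: float, beta: float) -> tuple[float, float]:
    """Split 1/r into a short-range (direct) part and a smooth long-range part."""
    direct = math.erfc(beta * r) / r   # decays quickly, summed in real space up to a cutoff
    smooth = math.erf(beta * r) / r    # slowly varying, handled in reciprocal space or on a grid
    return direct, smooth

beta = 0.50    # Ewald coefficient in 1/Angstrom (illustrative value)
r_cut = 5.63   # direct-space cutoff in Angstrom (illustrative value)

direct, smooth = ewald_split(r_cut, beta)
print(f"direct part at cutoff: {direct:.2e}")
print(f"smooth part at cutoff: {smooth:.4f}")
```

Raising β makes the direct part die off at a shorter cutoff, at the price of a finer grid for the smooth part; that trade-off is what lets most of the multipole cost be shifted to reciprocal space.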

Page 9: Tools and methods for multiscale biomolecular simulations

PME-based results for 4096 water molecules

| Multipole order | Ewald coefficient (Å^-1) | Rc (Å) | Spline order | hx (Å) | Direct (sec) | Reciprocal (sec) | Overall (sec) |
|---|---|---|---|---|---|---|---|
| charges | 0.50 | 5.63 | 5 | 0.775 | 0.21 | 0.18 | 0.42 |
| dipoles | 0.50 | 5.60 | 5 | 0.775 | 0.33 | 0.21 | 0.58 |
| quadrupoles | 0.55 | 5.10 | 6 | 0.668 | 0.57 | 0.32 | 0.96 |
| octupoles | 0.70 | 4.25 | 8 | 0.620 | 0.93 | 0.60 | 1.70 |
| hexadecapoles | 0.85 | 3.60 | 8 | 0.459 | 1.54 | 1.12 | 3.05 |

Relative RMS force error: 5×10^-4. For an error of 5×10^-5 the hexadecapole cost is 4.4 sec; for 5×10^-6 it is 5 sec.

With an Rc = 8 Å cutoff, the cost is 6 times that of PME and the RMS error is about 0.05.

Single processor Intel Xeon, 3.06 GHz, 512 kB cache, 2 GB memory, g77 compiler
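To put the PME timings in perspective, the overall cost of each multipole order can be expressed relative to the charges-only run. The snippet below is simple arithmetic on the overall timings listed above; the variable names are illustrative.

```python
# Overall PME timings in seconds for 4096 water molecules, from the results above.
overall_sec = {
    "charges": 0.42,
    "dipoles": 0.58,
    "quadrupoles": 0.96,
    "octupoles": 1.70,
    "hexadecapoles": 3.05,
}

# Cost of each multipole order relative to a charges-only calculation.
ratios = {order: t / overall_sec["charges"] for order, t in overall_sec.items()}
for order, ratio in ratios.items():
    print(f"{order:>13}: {ratio:4.1f}x the cost of charges")
```

At this accuracy, going all the way to hexadecapoles costs well under ten times the charges-only run, far from the three orders of magnitude expected of a naive fixed-cutoff implementation.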

Page 10: Tools and methods for multiscale biomolecular simulations

Multigrid-based results for 4096 water molecules

| Multipole order | Ewald coefficient (Å^-1) | Rc (Å) | RG (Å) | hx (Å) | Direct (sec) | Reciprocal (sec) |
|---|---|---|---|---|---|---|
| charges | 0.60 | 5.20 | 3.50 | 0.620 | 0.20 | 2.30 |
| dipoles | 0.61 | 5.20 | 3.63 | 0.620 | 0.29 | 2.66 |
| quadrupoles | 0.70 | 4.80 | 3.45 | 0.516 | 0.52 | 4.64 |
| octupoles | 0.75 | 4.25 | 3.10 | 0.443 | 0.94 | 8.42 |
| hexadecapoles | 0.79 | 4.25 | 3.05 | 0.388 | 2.21 | 15.71 |

Relative RMS force error: 5×10^-4

Single processor Intel Xeon, 3.06 GHz, 512 kB cache, 2 GB memory, g77 compiler

Page 11: Tools and methods for multiscale biomolecular simulations

J. Baucom et al, submitted to J. Chem. Phys. 2004

MD simulation of DNA decamer d(CCAACGTTGG)2 in a crystal environment

Page 12: Tools and methods for multiscale biomolecular simulations

[Figure: RMS deviation of crystal simulations at constant pressure with respect to the crystal structure. Panels: charges only; charges and induced dipoles]

Page 13: Tools and methods for multiscale biomolecular simulations

Calculating Multipole Moments via Wannier Functions (WFs)

• to partition the charge density and calculate the multipole moments, we use a WANNIER FUNCTION (WF) approach

• this has several advantages:

1. WFs provide a chemically and physically intuitive way of partitioning the charge (ref. Marzari and Vanderbilt, PRB 56, 12847 (1997))

2. WFs are distributed in space, which allows for a more faithful representation of the electrostatic potential

3. no ad hoc assignment of the charge

4. numerically quite stable procedure

Ref: C. Sagui, P. Pomorski, T. Darden and C. Roland, J. Chem. Phys. 120, 4530 (2004)
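Once a piece of the charge has been assigned to a localized function, its multipole moments follow from standard moment sums about a chosen center. The sketch below shows this for a set of point charges (which could equally be grid samples of a density multiplied by the volume element). It is an illustration only: the function name, the choice of expansion center, and the two-charge example are assumptions, not the procedure of the cited paper.

```python
import numpy as np

def multipole_moments(charges, positions, center):
    """Monopole, dipole and traceless quadrupole of point charges about `center`."""
    q = np.asarray(charges, dtype=float)
    d = np.asarray(positions, dtype=float) - np.asarray(center, dtype=float)
    monopole = q.sum()
    dipole = (q[:, None] * d).sum(axis=0)
    # Traceless Cartesian quadrupole: sum_i q_i (3 d_a d_b - |d|^2 delta_ab) / 2
    r2 = (d ** 2).sum(axis=1)
    outer = 3.0 * d[:, :, None] * d[:, None, :] - r2[:, None, None] * np.eye(3)
    quadrupole = 0.5 * np.einsum("i,iab->ab", q, outer)
    return monopole, dipole, quadrupole

# Hypothetical example: a +1/-1 pair separated by 1 length unit along z.
charges = [1.0, -1.0]
positions = [[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]]
monopole, dipole, quadrupole = multipole_moments(charges, positions, center=[0.0, 0.0, 0.0])
print("monopole:", monopole)
print("dipole:", dipole)
print("quadrupole:\n", quadrupole)
```

Higher moments (octupoles, hexadecapoles) follow the same pattern with higher-rank tensors; they are omitted here to keep the sketch short.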

Page 14: Tools and methods for multiscale biomolecular simulations

Maximally localized Wannier functions for water

• water molecule has 4 WFs

• 2 associated with OH bond (light blue)

• 2 associated with O lone pairs (dark blue)

Page 15: Tools and methods for multiscale biomolecular simulations

[Figure: Electrostatic potential for a single water molecule as generated by WFs. Panels: ab initio, m, mdq, mdqo, mdqoh]

Page 16: Tools and methods for multiscale biomolecular simulations

Wannier functions for carbon dioxide

• CO2 has 8 WFs

• 6 associated with the CO bonds (light blue)

• 2 associated with O (dark blue)

Page 17: Tools and methods for multiscale biomolecular simulations

[Figure: Electrostatic potential for a carbon dioxide molecule as generated by WFs. Panels: ab initio, m, mdq, mdqo, mdqoh]

Page 18: Tools and methods for multiscale biomolecular simulations

Quantum Monte Carlo Developments

Page 19: Tools and methods for multiscale biomolecular simulations

New continuous quantum Monte Carlo/molecular dynamics method

• we propose a new method for coupling ab initio molecular dynamics ionic steps with stochastic DMC electronic steps to provide accurate DMC energies “on-the-fly”

• exploits the slowness of MD evolution, which allows the QMC sampling process to be updated very efficiently

• accurate for both thermal averages and description of energies along the pathways

• we have carried out the first QMC/MD simulations using both forces and energies from QMC

Ref: J. Grossman and L. Mitas, preprint 2004

Page 20: Tools and methods for multiscale biomolecular simulations

Coupling of QMC and MD: Basic Idea

Instead of discrete sampling of each point with a new QMC run: calculate QMC energies “on-the-fly” during the dynamic simulation !

Continuously update the DMC walkers so that they correctly represent the evolving wave function (CDMC method)

Evolution of both configuration spaces is coupled: as the ionic dynamical trajectories evolve, so does the population of DMC electrons

Average distance moved by an ion in one MD time step: 10^-4 to 10^-3 a.u.

Average distance moved by an electron in a typical DMC time step: 10^-2 to 10^-1 a.u.
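The basic idea can be tried out on a toy problem: one particle in a 1-D harmonic well whose center drifts by a small amount each "MD step", with the same walker population carried along and refreshed by a few diffusion/branching steps instead of being restarted. Everything in this sketch is an assumption made for illustration (the model potential, step sizes, population control, and function name); it has no importance sampling, nodes, or orbital-overlap checks and is not the CDMC implementation described here.

```python
import numpy as np

def toy_cdmc(n_md_steps=200, dmc_steps_per_md=3, n_target=2000,
             dt=0.01, ion_step=1e-3, seed=0):
    """Toy continuous DMC: walkers follow a slowly moving 1-D harmonic well."""
    rng = np.random.default_rng(seed)
    x0 = 0.0                                   # "ion" position (center of the well)
    walkers = rng.normal(0.0, 1.0, n_target)   # drawn from the ground state of the initial well
    e_ref = 0.5                                # reference energy, exact value at the start
    energies = []
    for _ in range(n_md_steps):
        x0 += ion_step                         # stand-in for one MD step of the ion
        for _ in range(dmc_steps_per_md):
            # Diffusion: free-particle move with variance dt (atomic units, unit mass).
            walkers = walkers + rng.normal(0.0, np.sqrt(dt), walkers.size)
            v = 0.5 * (walkers - x0) ** 2
            # Branching: replicate or remove walkers according to their weights.
            weights = np.exp(-dt * (v - e_ref))
            copies = (weights + rng.random(walkers.size)).astype(int)
            walkers = np.repeat(walkers, copies)
            # Without importance sampling the energy estimate is the mean potential.
            e_est = np.mean(0.5 * (walkers - x0) ** 2)
            # Population control: nudge e_ref to keep the walker count near the target.
            e_ref = e_est + np.log(n_target / walkers.size)
        energies.append(e_est)                 # one "on-the-fly" energy per MD step
    return np.array(energies)

energies = toy_cdmc()
print(f"mean energy along the trajectory: {energies.mean():.3f} (exact ground state: 0.5)")
```

Because the well moves about a hundred times less per MD step than a walker diffuses per DMC step, the existing population stays a good sample of the new ground state, which is the separation of scales quoted above.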

Page 21: Tools and methods for multiscale biomolecular simulations

Stable CDMC simulation

Flowchart:

1. ab initio MD step

2. Compute orbital overlaps with current DMC walkers

3. Orbital swapping or rotation?

   yes: do VMC then DMC

   no: check for node crossings, compute weights

4. Take DMC step(s) and calculate energy

R, Ψ(R)

Successful CDMC Algorithm

• Stable DMC population

• How accurate is it?

• Benchmark against discrete DMC

Page 22: Tools and methods for multiscale biomolecular simulations

CDMC: Number of DMC steps needed per MD step

• As the simulation progresses, 1-step CDMC energies begin to differ significantly from discrete DMC

• Using 3 steps corrects the time “lag”

• 33 times more efficient than discrete sampling

• Use large discrete sampled runs (1000 steps each) for comparison

Thermal Averages (over 1 ps)

E(discrete DMC) = -6.228(2)

E(1 step continuous) = -6.220(2)

E(2 steps continuous) = -6.220(2)

E(3 steps continuous) = -6.226(2)

E(10 steps continuous) = -6.230(2)

E(20 steps continuous) = -6.228(2)

• Thermal averages are converged for N≥3

• Same convergence (3 CDMC steps) observed for Si2H6 and Si5H12
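The convergence statement can be checked directly from the thermal averages listed above by expressing each continuous result as a deviation from the discrete DMC value in units of the combined error bar. The sketch assumes the digit in parentheses is one standard error in the last decimal place (the usual QMC notation); energy units are not stated in the source and do not matter for the ratio.

```python
import math

# (energy, standard error) pairs read off the thermal averages above.
discrete = (-6.228, 0.002)
continuous = {
    1: (-6.220, 0.002),
    2: (-6.220, 0.002),
    3: (-6.226, 0.002),
    10: (-6.230, 0.002),
    20: (-6.228, 0.002),
}

def deviation_in_sigma(a, b):
    """Difference between two estimates in units of their combined standard error."""
    return abs(a[0] - b[0]) / math.hypot(a[1], b[1])

sigmas = {n: deviation_in_sigma(e, discrete) for n, e in continuous.items()}
for n, s in sigmas.items():
    print(f"{n:>2} DMC step(s) per MD step: {s:.1f} sigma from discrete DMC")
```

With 1 or 2 steps the continuous average sits nearly three combined error bars from the discrete value, while for 3 or more steps it agrees within one.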

Page 23: Tools and methods for multiscale biomolecular simulations

CDMC: Si2H6

As for SiH4, asymmetrically stretch molecule and let go

Average temperature ~ 1500 K


Page 24: Tools and methods for multiscale biomolecular simulations

CDMC: Si2H6 Results

For Si2H6, 3 steps appear to lead to stability, as for SiH4

The number of steps appears to depend on the dynamics rather than on system size

Can pinpoint specific types of strain that lead to wave-function lag

Page 25: Tools and methods for multiscale biomolecular simulations

Test of quantum Monte Carlo/molecular dynamics method on water dissociation

• DMC forces in very good agreement with DFT forces

• DMC-MD and DFT-MD trajectories are in excellent agreement

[Figure panels: SiH4 at 1500 K; H2O dissociation]

“QMC only” molecular dynamics, with no external input from DFT

Page 26: Tools and methods for multiscale biomolecular simulations

Density Functional Theory Developments

Page 27: Tools and methods for multiscale biomolecular simulations

Development of hybrid QM calculations: interfacing DFT with Thomas-Fermi calculations

Idea: in many biological systems, only part of the system is chemically active

Use ab initio methods for this part (real-space, multigrid-based code)

Use more approximate methods for the rest of the system (in this case the Thomas-Fermi approach, with frozen density for molecules and gradient corrections)

Ref: M. Hodak, W. Lu, and J. Bernholc, in preparation
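For orientation, the basic Thomas-Fermi kinetic-energy functional is T_TF[n] = C_F ∫ n(r)^(5/3) d^3r with C_F = (3/10)(3π^2)^(2/3) in atomic units. The sketch below evaluates it for a spherically symmetric density on a radial grid. The hydrogen 1s density is only a convenient test case, the function name is illustrative, and the gradient corrections and frozen-density treatment mentioned above are not included.

```python
import numpy as np

C_F = 0.3 * (3.0 * np.pi ** 2) ** (2.0 / 3.0)   # Thomas-Fermi constant, atomic units

def thomas_fermi_kinetic_energy(r, density):
    """T_TF = C_F * integral of n^(5/3) over space for a spherically symmetric density."""
    integrand = C_F * density ** (5.0 / 3.0) * 4.0 * np.pi * r ** 2
    # Trapezoidal rule written out explicitly to avoid NumPy version differences.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))

r = np.linspace(0.0, 30.0, 30001)     # radial grid in bohr
n_1s = np.exp(-2.0 * r) / np.pi       # hydrogen 1s density, used only as a test case
t_tf = thomas_fermi_kinetic_energy(r, n_1s)
print(f"T_TF[hydrogen 1s] = {t_tf:.4f} hartree (exact kinetic energy: 0.5 hartree)")
```

The functional needs only the density, not orbitals, which is why it is so much cheaper than a full ab initio treatment; the price is the reduced accuracy visible even in this one-electron test.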

Page 28: Tools and methods for multiscale biomolecular simulations

Hybrid calculation tests

● Interaction of two water molecules in hybrid calculation

- Hydrogen bonding test

- Molecule 1: Ab initio

- Molecule 2: Thomas-Fermi

Gives an estimated speed-up of 500 times!

Page 29: Tools and methods for multiscale biomolecular simulations

Parallelization and Coding Advances

Page 30: Tools and methods for multiscale biomolecular simulations

Improvements in performance and parallel scalability of AMBER MD Software

A redesign of the AMBER SANDER program, along with a rewrite in Fortran 90, was undertaken with the goal of substantially improving the practicality of multi-nanosecond PME simulations (i.e., in the 100,000 to 300,000 atom range)

The resulting software has been released to the AMBER community in 3 phases; the new software is named PMEMD, for Particle Mesh Ewald Molecular Dynamics

Release dates:

July 2003 – PMEMD 3.00

October 2003 – PMEMD 3.10

March 2004 – PMEMD 8.0 (part of AMBER 8.0)

PMEMD is now the primary high-performance MD modeling tool in AMBER!

Page 31: Tools and methods for multiscale biomolecular simulations

Representative Benchmark

Results for the Factor IX constant pressure system from Dr. Lalith Perera, a solvated system with a total of 90906 atoms. The time step was 1.5 fs, the direct force cutoff was 8.0 angstrom, and all simulations used PME. The runs were done on the IBM 1.3 GHz p690 Regatta at the Edinburgh Parallel Computing Centre.

Throughput in psec/day (nd = no data):

| #procs | PMEMD 8 | PMEMD 3.1 | PMEMD 3.0 | Sander 8 | Sander 7 | Sander 6 |
|---|---|---|---|---|---|---|
| 8 | nd | 346 | 353 | nd | 233 | 182 |
| 16 | 672 | 607 | 594 | nd | 279 | 258 |
| 32 | 1125 | 1035 | 929 | nd | 306 | 297 |
| 64 | 1975 | 1770 | 1127 | 369 | 318 | 339 |
| 96 | 2743 | 2304 | nd | nd | nd | nd |
| 112 | 2945 | 2631 | nd | nd | nd | nd |
| 128 | 2516* | 2864 | nd | 339 | nd | nd |

* Performance falloff observed here. Max performance obtained at higher processor count was 3600 psec/day, but required using only 4 of the 8 CPUs on each 8-CPU multi-chip module of the SP4.
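A common way to read such a benchmark is through speedup and parallel efficiency relative to the smallest run. The snippet below does this for the PMEMD 8 figures from the benchmark above, taking the 16-processor run as the baseline (a choice made here because no 8-processor PMEMD 8 value is given); variable names are illustrative.

```python
# PMEMD 8 throughput in psec/day by processor count, from the benchmark above.
pmemd8 = {16: 672, 32: 1125, 64: 1975, 96: 2743, 112: 2945}

base_procs = 16
base_rate = pmemd8[base_procs]

efficiency = {}
for procs, rate in pmemd8.items():
    speedup = rate / base_rate                          # measured gain over the baseline run
    efficiency[procs] = speedup / (procs / base_procs)  # fraction of ideal linear scaling
    print(f"{procs:>4} procs: speedup {speedup:4.2f}x, efficiency {efficiency[procs]:.0%}")
```

Efficiency decays gradually rather than collapsing, so adding processors keeps paying off up to 112 processors on this system.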

Page 32: Tools and methods for multiscale biomolecular simulations

Software References

Maximum throughput obtainable on the SP4 with PMEMD 8 is up to an order of magnitude higher than with any version of SANDER

Software Publication References:

R.E. Duke and L.G. Pedersen, PMEMD 3.0 (2003)

R.E. Duke and L.G. Pedersen, PMEMD 3.1 (2003)

D.A. Case et al, AMBER 8.0 (2004)

Page 33: Tools and methods for multiscale biomolecular simulations

Some Applications …

1. QM/MM studies of enzymatic sulfate transfer in the heparan sulfate precursor

2. QM/MM studies of enzymatic sulfate transfer in estrogen sulfation

3. PMEMD study of the mammalian P450 enzyme and the ternary blood coagulation complex tissue factors

4. Protein folding study on ionic domains of the coagulation protein prothrombin

5. Solvation and deprotonation of formic acid

6. Crystallographic studies of DNA

7. Binding of vancomycin and teicoplanin antibiotics to bacterial cell wall termini

8. Structure and function of serine proteases

9. QM/MM study of role of Mg ions in the mechanism of DNA polymerase

Page 34: Tools and methods for multiscale biomolecular simulations

Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase

P. Lin and L. Pedersen

[Figure labels: K47, PAPS, H107, E2]

Page 35: Tools and methods for multiscale biomolecular simulations

Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase

P. Lin and L. Pedersen

• estrogen is one of the most important hormones found in the human body

• it is extremely important that the body regulate estrogen, being able to both turn it on and off

• the deactivation of estrogen takes place by means of transferring a sulfate group to the hormone

• the details of this important reaction were investigated by means of a mixed quantum and classical molecular dynamics simulation, as shown in the movie

• the movie shows how the sulfate group gets placed on the estrogen

Page 36: Tools and methods for multiscale biomolecular simulations

Summary

Scientific aims are to produce a set of scalable and portable computational tools for multiscale biomolecular calculations

Considerable progress in a number of aspects:

1. Development of accurate and efficient methods for treatment of long-range electrostatic forces

2. Development of QMC and MD methods

3. Development of DFT and TF interface

4. PMEMD and AMBER 8.0

5. Applications