Molecular Modelling
Dr Michelle Kuttel
Department of Computer Science
University of Cape Town
Aim
Brief introduction to/overview of molecular modelling: why, theory, how
Hands-on experience of a molecular modelling package (NAMD) and visualization software (VMD)
Resources
Slides
Molecular Modelling: Principles and Applications. A. R. Leach, Addison Wesley Longman Limited, 1996 (in library)
Essentials of Computational Chemistry - Theories and Models, 2nd Edition, Christopher J. Cramer
NAMD website - http://www.ks.uiuc.edu/
Computational Chemistry
Chemical/physical problem
?
Molecular Simulations
Analytical Tools
Comparison with experiment
Validation
Insight
Simulation data
Predictions
Why is Computational Chemistry Increasingly Popular?
Chemical waste disposal and computational technology: which keeps getting cheaper and cheaper, and which more and more expensive?
Simulations: Modelling Strategies
Chemical/physical problem
?
Molecular Simulations
Force Field Methods
Ab initio QM Methods
Quantum Mechanics
The postulates and theorems of quantum mechanics form the rigorous foundation for the prediction of observable chemical properties from first principles.
Microscopic systems are described by wave functions that completely characterise all the physical properties of the system.
Operators applied to the wave function allow one to predict the probability of the system having a given value or range of values of an observable.
Quantum mechanics vs Force Field methods
QM deals with electrons in system
  Accurate
  Can deal with reactions (bond breaking etc.)
  Often used to parameterize force fields
  Large number of particles means infeasibly time-consuming for molecules as large as proteins
  Static models only (no time)
FF methods
  Molecular mechanics
  Cannot answer questions that depend on electron distribution in a molecule
  But fast and surprisingly useful
Computable properties I
Structure
  Determination of the “best” structure is a very common application of computational chemistry
  “Best” means lowest possible energy
  Take care when comparing theory with experiment: the measured structure is a thermal average
Computable properties II
Potential Energy Surfaces
  Fully characterize the potential energy surface (PES) for a given chemical formula
  3N-6 coordinate dimensions, where N is the number of atoms (N >= 3)
  Points of interest are local minima (optimal structures) and saddle points (lowest energy barriers on the paths connecting minima: transition states)
  Typically, take slices through the PES involving 1 or 2 coordinates, as it is hard to visualize otherwise
Computable properties III
Chemical Properties
  Single molecule properties, e.g. spectral quantities: NMR shifts and coupling constants etc.
  Thermodynamic quantities: enthalpy, free energy
    Theory is extensively used to estimate equilibrium constants, which are derived from free energy differences between minima on a PES, and rate constants, which are derived from free energy differences between minima and connected transition state structures
    Reaction thermochemistries, heats of formation and combustion, hydrogen bonding strengths etc.
Molecular Mechanics
Approach to understanding structure-function relationships
Applications:
  Structure determination and refinement
  Homology modelling
  Structure-based ligand design
  Pharmacophore modelling
  Mutant structure prediction
  Enzyme mechanism
  Protein folding pathways
  Protein design
  Molecular dynamics
  Normal mode analysis (“characteristic motions”)
Molecular Mechanics Force Fields
Classical Mechanical approximation
Molecular mechanics Potential
describe deviation from a reference value
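The equation on this slide did not survive extraction. As a hedged reminder of what such a potential usually looks like, the block below gives a common general functional form (of the kind used by CHARMM-like force fields); it is an assumption, not necessarily the exact expression shown on the slide. The bond and angle terms are the ones that "describe deviation from a reference value" ($b_0$, $\theta_0$); the symbol names are chosen here for illustration.

```latex
V(\mathbf{r}) =
    \sum_{\text{bonds}} k_b \,(b - b_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} k_\phi \,\bigl[1 + \cos(n\phi - \delta)\bigr]
  + \sum_{i<j} \left\{ 4\varepsilon_{ij}
      \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
           - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right]
    + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} \right\}
```

The first three sums are the bonded terms; the last sum runs over non-bonded atom pairs (van der Waals and electrostatics).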
Force Fields - Parameterization
Can have more/fewer terms
  CHARMM and similar force fields seek to have a minimal set of easily comprehensible terms
To define a force field, one must specify not only the functional form, but also the parameters
  Two force fields may have identical functional form but very different parameters
  e.g. the “CHARMM” force field has many possible parameter sets
Force field parameter terms are expressed in terms of “atom type”
  Distinguish e.g. between sp, sp2 and sp3 hybridized carbons
Force Fields - Parameterization
Force field parameters are not necessarily transferable
  “energy” is relative
Typically parameterized for a specific class of molecules: protein, DNA/RNA, carbohydrates, etc.
  “general” force fields: CVFF etc.
  Usually designed to predict structural properties
Force field parameterization is a full-time job
Force Field examples
AMBER
GROMOS
CHARMM - topology and parameter files
MM2/MM3
Force Fields
How can I pick the best force field for my problem?
How can I trust the results?
Coordinate files: PDB
Simulations start with atomic structures from the Protein Data Bank, in standard PDB file format
PDB files contain a lot of data: species, tissue, authorship, citations, secondary structure etc.
  Only the atomic data is of interest here
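As a rough illustration of keeping only the atomic data, the sketch below reads ATOM/HETATM records using the fixed column positions of the standard PDB format. The function name `parse_atoms` and the sample record are made up for illustration, and further fields of a real record (occupancy, B-factor, element) are ignored here.

```python
def parse_atoms(lines):
    """Return a list of atom dicts from PDB ATOM/HETATM records."""
    atoms = []
    for line in lines:
        if line[0:6] not in ("ATOM  ", "HETATM"):
            continue  # skip header, remarks, secondary structure, etc.
        atoms.append({
            "serial": int(line[6:11]),
            "name": line[12:16].strip(),
            "resname": line[17:20].strip(),
            "chain": line[21],
            "resseq": int(line[22:26]),
            "x": float(line[30:38]),
            "y": float(line[38:46]),
            "z": float(line[46:54]),
        })
    return atoms


# Hypothetical sample record, assembled field by field so the columns line up.
sample = (
    "ATOM  " + f"{1:>5}" + " " + " N  " + " " + "ALA" + " " + "A"
    + f"{1:>4}" + " " * 4
    + f"{11.104:8.3f}{6.134:8.3f}{-6.504:8.3f}"
)
header = "HEADER    HYPOTHETICAL EXAMPLE"
atoms = parse_atoms([header, sample])
print(atoms[0]["name"], atoms[0]["resname"], atoms[0]["x"])
```

In practice VMD reads PDB files directly; a hand-written reader like this is mainly useful for quick checks of "before" and "after" coordinates.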
Molecular Visualization Packages
Huge number of molecular visualization packages
  RasMol
  The PyMOL Molecular Graphics System
  gOpenMol
  VMD
Similar approaches to visualization
  May combine with structure refinement/molecular modelling etc.
My recommendation is to use one product and get to know it well.
Molecular Modelling Software
Commercial: Cerius2, Insight II (from Accelrys)
Academic: CHARMM, AMBER, GROMOS, NAMD
Why NAMD?
“VMD, NAMD, and BioCoRE represent a broad effort by the Theoretical and Computational Biophysics Group, an NIH Resource for Macromolecular Modeling and Bioinformatics, to develop and freely distribute effective tools (with source code) for molecular dynamics studies in structural biology.”
Scales best …
Uses CHARMM force field
Written in C++
Good tutorial and user guide
What type of simulation?
What are the most stable/probable conformations?
  Energy minimization
  Molecular dynamics
  Monte Carlo methods
  Hybrid MD-MC methods
  Simulated annealing
What are the functional motions?
  Molecular (deterministic) dynamics
  Stochastic dynamics
  Normal modes
There are of course other questions…
optimization
sampling
dynamics
Energy Minimization/Geometry Optimization
Aim: find the lowest-energy conformation
A problem in applied mathematics
  Given a function f with independent variables x1, x2, …, xn, find the values of those variables where f has a minimum value
Can’t use analytical methods because of the complicated way energy varies with coordinates
Minima are located using numerical methods, which gradually change the coordinates to produce configurations with lower and lower energies until the minimum is found
  Various optimization procedures
Energy Minimization - Algorithms
Most algorithms only go downhill on surface - multiple minima problem
Energy Minimization - Algorithms
Some algorithms use derivatives of energy, others do not
Usually one uses a combination of methods
  More robust (but less efficient) first
  Less robust (more efficient) second
Convergence criteria
  Usually monitor energy from one iteration to the next and stop when the difference between successive values falls beneath a certain threshold
Energy Minimization - Simplex Method
Non-derivative method
A simplex is a geometrical figure with M+1 interconnected vertices, where M is the dimensionality of the energy function
Each vertex corresponds to a set of coordinates for which the energy can be calculated
Energy Minimization - Simplex Method
3 moves possible:
  Reflect (1)
  Contract in one dimension (2)
  Contract around lowest point (3)
Energy Minimization - Simplex Method
Need to generate vertices of initial simplex
  Add constant increment to each coordinate in turn
Expensive algorithm: many energy evaluations
  Most useful where the initial configuration is very high in energy (i.e. very far from minimum)
  Often used for initial few minimization steps
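The construction of the initial simplex and a single reflection move can be sketched as follows. This is a minimal illustration, not a full simplex minimizer: the two-variable `energy` function, the starting point and the increment are hypothetical, and the contraction moves are left out.

```python
def initial_simplex(x0, increment):
    """Build M+1 vertices: x0 plus one vertex per coordinate shifted by increment."""
    vertices = [list(x0)]
    for i in range(len(x0)):
        vertex = list(x0)
        vertex[i] += increment
        vertices.append(vertex)
    return vertices


def energy(x):
    # Hypothetical two-variable energy function, used only for illustration.
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


simplex = initial_simplex([5.0, 5.0], 0.5)
energies = [energy(v) for v in simplex]

# Reflection move: mirror the highest-energy vertex through the centroid
# of the remaining vertices.
worst = max(range(len(simplex)), key=lambda i: energies[i])
others = [v for i, v in enumerate(simplex) if i != worst]
centroid = [sum(c) / len(others) for c in zip(*others)]
reflected = [2.0 * c - w for c, w in zip(centroid, simplex[worst])]
print(simplex, reflected, energy(reflected))
```

Every vertex costs one energy evaluation, which is why the method becomes expensive for systems with many coordinates.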
Energy Minimization - Steepest Descent Algorithm
1st derivative method
Moves in direction parallel to net force: “straight downhill”
Need to decide how far to move
  Most implementations have a step size with predetermined default value
  If a step decreases E, increase the step size by a factor for the next iteration
  If energy increases, decrease step size by a factor
Both gradients and directions of successive steps are orthogonal (when each step is a line search to the minimum along the step direction)
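The adaptive step-size scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the growth and shrink factors, the default step, the choice to discard a step that raises the energy, and the two-variable test function are all hypothetical and not taken from any particular package.

```python
def steepest_descent(energy, gradient, x0, step=0.01, grow=1.2, shrink=0.5,
                     tol=1e-10, max_iter=10000):
    """Minimize energy(x) by moving along -gradient with an adaptive step size."""
    x = list(x0)
    e = energy(x)
    for _ in range(max_iter):
        g = gradient(x)
        norm = sum(gi * gi for gi in g) ** 0.5
        if norm == 0.0:
            break  # exactly at a stationary point
        # "Straight downhill": along the force, i.e. opposite to the gradient.
        trial = [xi - step * gi / norm for xi, gi in zip(x, g)]
        e_trial = energy(trial)
        if e_trial < e:
            # Convergence test: energy change between successive iterations.
            converged = (e - e_trial) < tol
            x, e = trial, e_trial
            step *= grow      # energy went down: be bolder next time
            if converged:
                break
        else:
            step *= shrink    # energy went up: discard the step and retry
    return x, e


def energy(x):
    # Hypothetical "long narrow valley" with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def gradient(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min, e_min = steepest_descent(energy, gradient, [5.0, 5.0])
print(x_min, e_min)
```

Because the step length is adapted rather than found by a line search, successive directions in this sketch are only approximately orthogonal.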
Energy Minimization - Steepest Descent Algorithm
Advantages:
  Direction of gradient is determined by largest inter-atomic forces, good for relieving highest energy features in initial configuration
  Robust
Disadvantage:
  Numerous small steps when proceeding down a long narrow valley: right-angled turn at each step
  “tacking into the wind”
Energy Minimization - Conjugate gradients method
1st-derivative method
Does not show oscillatory behaviour in narrow valleys.
Similar to steepest descents, but while gradients are orthogonal, directions are conjugate
$$\mathbf{v}_k = -\mathbf{g}_k + \gamma_k \mathbf{v}_{k-1}$$
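The direction update can be tried out on a quadratic energy function, where an exact line search is available in closed form. The sketch below is an illustration under assumptions: the slide does not define γ_k, so the Fletcher-Reeves choice γ_k = (g_k·g_k)/(g_{k-1}·g_{k-1}) is used here, and the matrix `A` and vector `b` are hypothetical.

```python
def conjugate_gradient(A, b, x0, n_steps):
    """Minimize f(x) = 0.5 x^T A x - b^T x for symmetric positive definite A."""
    def matvec(M, v):
        return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
    v = [-gi for gi in g]                             # first direction: steepest descent
    for _ in range(n_steps):
        gg = dot(g, g)
        if gg < 1e-30:
            break  # gradient has vanished
        Av = matvec(A, v)
        alpha = -dot(g, v) / dot(v, Av)               # exact line minimum along v
        x = [xi + alpha * vi for xi, vi in zip(x, v)]
        g_new = [gi + alpha * avi for gi, avi in zip(g, Av)]
        gamma = dot(g_new, g_new) / gg                # Fletcher-Reeves (assumed)
        v = [-gn + gamma * vi for gn, vi in zip(g_new, v)]  # v_k = -g_k + gamma_k v_{k-1}
        g = g_new
    return x


# Hypothetical narrow valley: f = (x - 1)^2 + 10 (y + 2)^2 up to a constant.
A = [[2.0, 0.0], [0.0, 20.0]]
b = [2.0, -40.0]
x_cg = conjugate_gradient(A, b, [5.0, 5.0], n_steps=2)
print(x_cg)
```

For a quadratic function of M variables the minimum is reached in at most M line searches, here two, which is the contrast with the many right-angled steps of steepest descent.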
Energy Minimization - Newton-Raphson Algorithm
Simplest second derivative method
Uses information about the curvature of the function
  Should be familiar from 1st year…
Computationally demanding and more suited to smaller molecules
Has problems with structures far from minimum: minimization can become unstable.
$$x_{n+1} = x_n - \frac{V'(x_n)}{V''(x_n)}$$
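A one-dimensional sketch of the iteration, using hypothetical potentials chosen for illustration: a harmonic well (where one step suffices) and a double well V(x) = x^4 - 2x^2 (where the result depends on the starting point).

```python
def newton_raphson(dV, d2V, x0, n_iter=20, tol=1e-12):
    """Iterate x_{n+1} = x_n - V'(x_n) / V''(x_n)."""
    x = x0
    for _ in range(n_iter):
        step = dV(x) / d2V(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Harmonic potential V(x) = (x - 3)^2: a single step lands on the minimum.
x_quad = newton_raphson(lambda x: 2.0 * (x - 3.0), lambda x: 2.0, 10.0, n_iter=1)


# Double well V(x) = x^4 - 2 x^2: minima at x = -1 and x = +1, maximum at x = 0.
def dV(x):
    return 4.0 * x ** 3 - 4.0 * x


def d2V(x):
    return 12.0 * x ** 2 - 4.0


x_near = newton_raphson(dV, d2V, 1.5)  # starts where the curvature is positive
x_far = newton_raphson(dV, d2V, 0.2)   # starts where the curvature is negative
print(x_quad, x_near, x_far)
```

The start at 0.2 converges to the stationary point at x = 0, which is a maximum: the method only seeks V'(x) = 0, one reason it is unreliable far from a minimum.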
Work for tomorrow
Install NAMD and VMD
Perform minimization of a suitable protein
Have “before” and “after” .pdb files: send to me via email
Energy Minimization - problems
Only local minima found
  Multiple minima problem
Only minimum of potential energy, not free energy
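The multiple minima problem is easy to reproduce on a toy surface. In the sketch below, a plain downhill minimizer is started from two points on a hypothetical asymmetric double well V(x) = x^4 - 2x^2 + 0.3x; the function, step size and starting points are assumptions chosen for illustration.

```python
def minimize_downhill(gradient, x0, step=0.01, n_iter=5000):
    """Fixed-step gradient descent: only ever moves downhill from x0."""
    x = x0
    for _ in range(n_iter):
        x -= step * gradient(x)
    return x


def V(x):
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x


def dV(x):
    return 4.0 * x ** 3 - 4.0 * x + 0.3


left = minimize_downhill(dV, -0.5)
right = minimize_downhill(dV, 0.5)
print(left, V(left))
print(right, V(right))
```

Both runs converge, but the one started at 0.5 ends in the shallower well: the minimizer has no way of knowing that a lower minimum exists on the other side of the barrier.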
Energy Minimization - applications
Widely used in molecular modelling
  Prior to Monte Carlo simulations or MD, to remove any unfavourable interactions in the initial configuration of the system
  Often used for optimizing experimental structures
NAMD minimization procedure
Conjugate gradient parameters
The default minimizer uses a sophisticated conjugate gradient and line search algorithm with much better performance than the older velocity quenching method. The method of conjugate gradients is used to select successive search directions (starting with the initial gradient) which eliminate repeated minimization along the same directions. Along each direction, a minimum is first bracketed (rigorously bounded) and then converged upon by either a golden section search, or, when possible, a quadratically convergent method using gradient information.
For most systems, it just works.
* minimization: perform conjugate gradient energy minimization?
  Acceptable Values: on or off
  Default Value: off
  Description: Turns efficient energy minimization on or off.
* minTinyStep: first initial step for line minimizer
  Acceptable Values: positive decimal
  Default Value: 1.0e-6
  Description: If your minimization is immediately unstable, make this smaller.
* minBabyStep: max initial step for line minimizer
  Acceptable Values: positive decimal
  Default Value: 1.0e-2
  Description: If your minimization becomes unstable later, make this smaller.
* minLineGoal: gradient reduction factor for line minimizer
  Acceptable Values: positive decimal
  Default Value: 1.0e-4
  Description: Varying this might improve conjugate gradient performance.
Grid search with energy minimization
Approach to “mapping” the energy surface
  Produce an adiabatic map as a function of the chief conformational coordinates
  E.g. Ramachandran map of conformational energy as a function of the backbone dihedral angles φ and ψ
Search through conformational coordinates in increments
Very time-consuming if done thoroughly, only practicable in a few dimensions
$$\left(\frac{360}{D}\right)^{n}$$
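The cost of a grid search can be made concrete with a short calculation. The slide does not define the symbols, so the sketch assumes D is the angular increment in degrees and n the number of conformational coordinates scanned; the function name is hypothetical.

```python
def grid_points(increment_deg, n_coords):
    """Number of grid points (360 / D)^n for n torsion angles scanned in steps of D degrees."""
    per_coord = 360 // increment_deg
    return per_coord ** n_coords


for n in (2, 5, 10):
    print(n, grid_points(30, n))
```

With a 30 degree increment, the two angles of a Ramachandran map need 144 minimizations, while ten angles would need 12^10, which is why the approach is only practicable in a few dimensions.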
Energy Minimization - Normal Mode Analysis
Normal modes of vibration are simple harmonic oscillations about a local energy minimum, characteristic of a system's structure and its energy function.
For a purely harmonic function any motion can be exactly expressed as a superposition of normal modes.
For an anharmonic function, the potential near the minimum will still be well approximated by a harmonic potential, and any small-amplitude motion can still be well described by a sum of normal modes.
Energy Minimization - Normal Mode Analysis
The normal mode spectrum of a 3-dimensional system of N atoms contains 3N - 6 normal modes (3N - 5 for linear molecules in 3D).
In general, the number of modes is the system's total number of degrees of freedom minus the number of degrees of freedom that correspond to pure rigid body motion (rotation or translation).
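The counting rule can be written as a small helper; the function name is hypothetical. A non-linear molecule has 6 rigid-body degrees of freedom (3 translations, 3 rotations), a linear one only 5, because rotation about the molecular axis does not move any atom.

```python
def n_normal_modes(n_atoms, linear=False):
    """Vibrational modes = 3N total degrees of freedom minus rigid-body ones."""
    if n_atoms < 2:
        raise ValueError("need at least two atoms for a vibration")
    rigid_body = 5 if linear else 6
    return 3 * n_atoms - rigid_body


print(n_normal_modes(3))               # water: bent, 3 atoms
print(n_normal_modes(3, linear=True))  # carbon dioxide: linear, 3 atoms
print(n_normal_modes(2, linear=True))  # any diatomic molecule
```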
Energy Minimization - Normal Mode Analysis
Each mode is defined by an eigenvector and its corresponding eigenfrequency.
The eigenvector contains the amplitude and direction of motion for each atom (same frequency of vibration for all atoms)
Energy Minimization - Normal Mode Analysis
In macromolecules, the lowest frequency modes correspond to delocalized motions, in which a large number of atoms oscillate with considerable amplitude.
The highest frequency motions are more localized, with appreciable amplitudes for fewer atoms, e.g., the stretching of bonds between carbon and hydrogen atoms.
Energy Minimization - Normal Mode Analysis
Normal modes are useful because they correspond to collective motions of atoms in a coupled system that can be individually excited
Frequencies of normal modes and displacements may be calculated from a molecular mechanics force field using the Hessian matrix of second derivatives
Molecule must be at a minimum
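The smallest possible example of the Hessian route is two masses joined by a harmonic spring and restricted to one dimension. The sketch below is an illustration in arbitrary units, with hypothetical names: it builds the mass-weighted Hessian and obtains the angular frequencies as the square roots of its eigenvalues, using the closed-form eigenvalues of a symmetric 2x2 matrix.

```python
import math


def diatomic_frequencies(k, m1, m2):
    """Angular frequencies of two masses joined by a spring of force constant k (1D)."""
    # Hessian of V = 0.5 * k * (x2 - x1 - r0)^2 with respect to (x1, x2)
    # is [[k, -k], [-k, k]]; mass-weighting divides H_ij by sqrt(m_i * m_j).
    a = k / m1
    d = k / m2
    b = -k / math.sqrt(m1 * m2)
    # Eigenvalues of the symmetric matrix [[a, b], [b, d]].
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    eigenvalues = (mean - radius, mean + radius)
    # Clamp tiny negative rounding errors before taking the square root.
    return [math.sqrt(max(ev, 0.0)) for ev in eigenvalues]


freqs = diatomic_frequencies(k=1.0, m1=1.0, m2=1.0)
print(freqs)
```

The zero eigenvalue is the rigid translation of the pair; the other is the bond stretch, with frequency sqrt(k/μ) where μ = m1·m2/(m1 + m2) is the reduced mass. For a real molecule the same steps apply to the 3N x 3N Hessian evaluated at a minimized geometry.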
Energy Minimization - Normal Mode Analysis
Results can be:
  Used to calculate thermodynamic quantities
  Compared to spectroscopic experiments
  Used in parameterization of force fields
For large molecules, low-energy vibrations are of most interest
  Correspond to large-scale conformational motions
  Can be compared to molecular dynamics simulations
Energy Minimization - Docking
The prediction of the strength and specificity with which a small to medium sized molecule can bind to a biological macromolecule
Docking: evaluating the energy of binding between two molecules for various relative positions of the two
  Simplification: use rigid molecules
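A toy version of rigid docking: both molecules keep their internal geometry, the ligand is translated to a list of trial positions, and the interaction energy is evaluated at each. Everything specific here is an assumption for illustration: the energy is a bare Lennard-Jones sum in reduced units, rotations are not sampled, and the "molecules" are single atoms. Trial positions must not place two atoms on top of each other (r = 0).

```python
import math


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def interaction_energy(receptor, ligand):
    """Sum of pair energies between every receptor atom and every ligand atom."""
    return sum(lj_energy(math.dist(a, b)) for a in receptor for b in ligand)


def rigid_translation_scan(receptor, ligand, translations):
    """Return (translation, energy) of the lowest-energy rigid placement tried."""
    best = None
    for t in translations:
        moved = [tuple(c + dc for c, dc in zip(atom, t)) for atom in ligand]
        e = interaction_energy(receptor, moved)
        if best is None or e < best[1]:
            best = (t, e)
    return best


receptor = [(0.0, 0.0, 0.0)]
ligand = [(0.0, 0.0, 0.0)]
trials = [(d, 0.0, 0.0) for d in (0.9, 1.0, 1.12, 1.5, 2.0)]
best = rigid_translation_scan(receptor, ligand, trials)
print(best)
```

The scan picks the separation closest to the Lennard-Jones minimum at 2^(1/6) σ. Real docking programs also sample orientations and use richer scoring functions.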