Molecular Modelling

Dr Michelle Kuttel

Department of Computer Science

University of Cape Town

Aim

Brief introduction to, and overview of, molecular modelling: why, theory, how

Hands-on experience of a molecular modelling package (NAMD) and visualization software (VMD)

Resources

Slides

Molecular Modelling: Principles and Applications. A. R. Leach, Addison Wesley Longman Limited, 1996 (in library)

Essentials of Computational Chemistry - Theories and Models, 2nd Edition, Christopher J. Cramer

NAMD website - http://www.ks.uiuc.edu/

Computational Chemistry

[Flow diagram. Labels: chemical/physical problem; molecular simulations; simulation data; analytical tools; comparison with experiment; validation; insight; predictions]

Why is Computational Chemistry Increasingly Popular?

Chemical waste disposal and computational technology: which keeps getting cheaper and cheaper, and which more and more expensive?

Simulations: Modelling Strategies

[Diagram. Labels: chemical/physical problem; molecular simulations; force field methods; ab initio QM methods]

Quantum Mechanics

The postulates and theorems of quantum mechanics form the rigorous foundation for the prediction of observable chemical properties from first principles.

Microscopic systems are described by wave functions that completely characterise all the physical properties of the system.

Operators applied to the wave function allow one to predict the probability of the system having a value or range of values.

Quantum mechanics vs Force Field methods

QM deals with the electrons in the system

Accurate

Can deal with reactions (bond breaking etc.)

Often used to parameterize force fields

Large number of particles means it is infeasibly time-consuming for molecules as large as proteins

Static models only (no time)

FF methods (molecular mechanics)

Cannot answer questions that depend on the electron distribution in a molecule

But fast and surprisingly useful

Computable properties I

Structure

Determination of the “best” structure is a very common application of computational chemistry

Lowest possible energy

Take care when comparing theory with experiment: the measured structure is a thermal average

Computable properties II

Potential Energy Surfaces

Fully characterize the potential energy surface (PES) for a given chemical formula

3N-6 coordinate dimensions, where N is the number of atoms (N >= 3)

Points of interest are local minima (optimal structures) and saddle points (lowest energy barriers on the paths connecting minima, i.e. transition states)

Typically, take slices through the PES involving 1 or 2 coordinates, as it is hard to visualize otherwise

Computable properties III

Chemical Properties

Single molecule properties, e.g. spectral quantities: NMR shifts and coupling constants etc.

Thermodynamic quantities: enthalpy, free energy.

Theory is extensively used to estimate equilibrium constants, which are derived from free energy differences between minima on a PES, and rate constants, which are derived from free energy differences between minima and connected transition state structures.

Reaction thermochemistries, heats of formation and combustion, hydrogen bonding strengths etc.

Molecular Mechanics

Approach to understanding structure-function relationships

Applications:

Structure determination and refinement

Homology modelling

Structure-based ligand design

Pharmacophore modelling

Mutant structure prediction

Enzyme mechanism

Protein folding pathways

Protein design

Molecular dynamics

Normal mode analysis (“characteristic motions”)

Molecular Mechanics Force Fields

Classical Mechanical approximation


Molecular mechanics Potential

describe deviation from a reference value
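As a minimal sketch of a term that describes deviation from a reference value, the snippet below evaluates a harmonic bond-stretch energy. The function name, the force constant and the reference bond length are illustrative assumptions and are not taken from any particular force field.

```python
def harmonic_bond_energy(b, b0, k):
    """Harmonic penalty for a bond length b deviating from its reference value b0.

    Uses the form E = k * (b - b0)**2; some force fields write the same
    term with an extra factor of 1/2.
    """
    return k * (b - b0) ** 2


# Made-up parameters for illustration: k in kcal/mol/A^2, lengths in Angstrom.
K_BOND = 300.0
B0 = 1.53

for length in (1.43, 1.53, 1.63):
    energy = harmonic_bond_energy(length, B0, K_BOND)
    print(f"b = {length:.2f} A, E = {energy:.2f} kcal/mol")
```

The energy is zero at the reference value and grows quadratically on either side, which is why such terms are only reliable for small deviations.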

Force Fields - Parameterization

Can have more or fewer terms

CHARMM and similar force fields seek to have a minimal set of easily comprehensible terms

To define a force field, one must specify not only the functional form, but also the parameters.

Two force fields may have identical functional form but very different parameters, e.g. the “CHARMM” force field has many possible parameter sets

Force field parameters are expressed in terms of “atom types”, which distinguish e.g. between sp, sp2 and sp3 hybridized carbons

Force Fields - Parameterization

Force field parameters are not necessarily transferable

“Energy” is relative

Typically parameterized for a specific class of molecules: proteins, DNA/RNA, carbohydrates, etc.

“General” force fields: CVFF etc.

Usually designed to predict structural properties

Force field parameterization is a full-time job

Force Field examples

AMBER

GROMOS

CHARMM - topology and parameter files

MM2/MM3

Force Fields

How can I pick the best force field for my problem?

How can I trust the results?

Coordinate files: PDB

Simulations start with atomic structures from the Protein Data Bank, in standard PDB file format

PDB files contain a lot of data: species, tissue, authorship, citations, secondary structure etc. We are only interested in the atomic data
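To illustrate what "only the atomic data" means in practice, here is a minimal sketch that pulls ATOM and HETATM records out of PDB-format text using the fixed column layout of the format. The `Atom` type, the function name and the sample records (with made-up coordinates) are assumptions for illustration; real work would normally rely on VMD or an established parsing library.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_pdb_atoms(lines):
    """Return the atoms found in ATOM/HETATM records, ignoring all other records."""
    atoms = []
    for line in lines:
        if line[0:6] not in ("ATOM  ", "HETATM"):
            continue  # skip HEADER, REMARK, HELIX, ... records
        atoms.append(Atom(
            serial=int(line[6:11]),        # columns 7-11
            name=line[12:16].strip(),      # columns 13-16
            res_name=line[17:20].strip(),  # columns 18-20
            chain=line[21],                # column 22
            res_seq=int(line[22:26]),      # columns 23-26
            x=float(line[30:38]),          # columns 31-38
            y=float(line[38:46]),          # columns 39-46
            z=float(line[46:54]),          # columns 47-54
        ))
    return atoms


# Toy records with made-up coordinates, laid out in PDB fixed columns.
SAMPLE_PDB = [
    "HEADER    TOY EXAMPLE, NOT A REAL ENTRY",
    "ATOM      1  N   MET A   1      10.000  11.000  12.000  1.00  0.00           N",
    "ATOM      2  CA  MET A   1      11.458  11.000  12.000  1.00  0.00           C",
    "HETATM   10  O   HOH A 101      15.000  20.000   5.000  1.00  0.00           O",
]

atoms = parse_pdb_atoms(SAMPLE_PDB)
for atom in atoms:
    print(atom)
```

Slicing by column rather than splitting on whitespace matters because adjacent PDB fields can run together without a separating space.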



Molecular Visualization Packages

Huge number of molecular visualization packages: RasMol, The PyMOL Molecular Graphics System, gOpenMol, VMD

Similar approaches to visualization; may combine with structure refinement/molecular modelling etc.

My recommendation is to use one product and get to know it well.

Molecular Modelling Software

Commercial: Cerius2, Insight II (from Accelrys)

Academic: CHARMM, AMBER, GROMOS, NAMD

Why NAMD?

“VMD, NAMD, and BioCoRE represent a broad effort by the Theoretical and Computational Biophysics Group, an NIH Resource for Macromolecular Modeling and Bioinformatics, to develop and freely distribute effective tools (with source code) for molecular dynamics studies in structural biology.”

Scales best …

Uses CHARMM force field

Written in C++

Good tutorial and user guide

What type of simulation?

What are the most stable/probable conformations?

Energy minimization

Molecular dynamics

Monte Carlo methods

Hybrid MD-MC methods

Simulated annealing

What are the functional motions?

Molecular (deterministic) dynamics

Stochastic dynamics

Normal modes

There are of course other questions…

[Diagram labels: optimization, sampling, dynamics]

Energy Minimization/Geometry Optimization

Aim: find the lowest-energy conformation

A problem in applied mathematics: given a function f with independent variables x1, x2, …, xn, find the values of those variables where f has a minimum value

Can’t use analytical methods because of the complicated way energy varies with coordinates

Minima are located using numerical methods which gradually change the coordinates to produce configurations with lower and lower energies until the minimum is found

Various optimization procedures

Energy Minimization - Algorithms

Most algorithms only go downhill on surface - multiple minima problem

Energy Minimization - Algorithms

Some algorithms use derivatives of energy, others do not

Usually one uses a combination of methods: more robust (but less efficient) first, less robust (more efficient) second

Convergence criteria: usually monitor the energy from one iteration to the next and stop when the difference between successive values falls beneath a certain threshold

Energy Minimization - Simplex Method

Non-derivative method

A simplex is a geometrical figure with M+1 interconnected vertices, where M is the dimensionality of the energy function

Each vertex corresponds to a set of coordinates for which the energy can be calculated

Energy Minimization - Simplex Method

3 moves possible:

Reflect (1)

Contract in one dimension (2)

Contract around lowest point (3)

Energy Minimization - Simplex Method

Need to generate the vertices of the initial simplex: add a constant increment to each coordinate in turn

Expensive algorithm: many energy evaluations

Most useful where the initial configuration is very high in energy (i.e. very far from a minimum)

Often used for the initial few minimization steps
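The construction of the initial simplex can be sketched as follows: starting from one configuration with M coordinates, each further vertex adds a constant increment to one coordinate in turn, giving M+1 vertices. The function names and the toy energy function are assumptions for illustration only.

```python
def initial_simplex(start, increment):
    """Build the M+1 vertices of an initial simplex from an M-dimensional start point."""
    vertices = [list(start)]
    for i in range(len(start)):
        vertex = list(start)
        vertex[i] += increment  # displace one coordinate at a time
        vertices.append(vertex)
    return vertices


def toy_energy(coords):
    """Stand-in energy function (a simple bowl), not a molecular force field."""
    return sum(c * c for c in coords)


simplex = initial_simplex([1.0, 2.0, 3.0], 0.5)
energies = [toy_energy(v) for v in simplex]
for vertex, energy in zip(simplex, energies):
    print(vertex, energy)
```

Each vertex needs its own energy evaluation, which is where the cost of the method comes from as the number of coordinates grows.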

Energy Minimization - Steepest Descent Algorithm

1st derivative method

Moves in the direction parallel to the net force (“straight downhill”)

Need to decide how far to move

Most implementations have a step size with a predetermined default value

If the 1st step gives a decrease in E, increase the step size by a factor for the next iteration

If the energy increases, decrease the step size by a factor

Both the gradients and the directions of successive steps are orthogonal
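The step-size rule described above can be sketched on a toy two-dimensional "narrow valley" function. This is an illustrative sketch under assumed values (the default step, the growth and shrink factors and the threshold are all arbitrary choices), not the scheme of any particular package. It stops when the energy drop between successive accepted steps falls below a threshold.

```python
def steepest_descent(energy, gradient, x0, step=0.1, grow=1.2, shrink=0.5,
                     tol=1e-10, max_iter=10000):
    """Adaptive-step steepest descent; returns (coordinates, energy)."""
    x = list(x0)
    e = energy(x)
    for _ in range(max_iter):
        g = gradient(x)
        g_norm = sum(gi * gi for gi in g) ** 0.5
        if g_norm == 0.0:
            break  # already at a stationary point
        # Move along the force (minus the gradient), using a unit direction.
        trial = [xi - step * gi / g_norm for xi, gi in zip(x, g)]
        e_trial = energy(trial)
        if e_trial < e:
            delta = e - e_trial
            x, e = trial, e_trial
            step *= grow      # energy went down: be bolder next time
            if delta < tol:
                break         # successive energies barely differ: converged
        else:
            step *= shrink    # energy went up: reject the move, shorten the step
    return x, e


def valley_energy(p):
    """Toy surface with a long, narrow valley along x."""
    return p[0] ** 2 + 10.0 * p[1] ** 2


def valley_gradient(p):
    return [2.0 * p[0], 20.0 * p[1]]


start = [1.0, 1.0]
x_min, e_min = steepest_descent(valley_energy, valley_gradient, start)
print(x_min, e_min)
```

Rejected moves leave the coordinates unchanged, so the energy never increases from one accepted configuration to the next.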

Energy Minimization - Steepest Descent Algorithm

Advantages:

Direction of the gradient is determined by the largest inter-atomic forces, so it is good for relieving the highest energy features in the initial configuration

Robust

Disadvantage:

Numerous small steps when proceeding down a long narrow valley, with a right-angled turn at each step (“tacking into the wind”)

Energy Minimization - Conjugate gradients method

1st-derivative method

Does not show oscillatory behaviour in narrow valleys.

Similar to steepest descents, but while gradients are orthogonal, directions are conjugate

$$\mathbf{v}_k = -\mathbf{g}_k + \gamma_k \mathbf{v}_{k-1}$$
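The direction update v_k = -g_k + γ_k v_{k-1} can be sketched for a quadratic energy E(x) = ½ x·A·x - b·x, where the line minimum along each direction has a closed form. The choice γ_k = (g_k·g_k)/(g_{k-1}·g_{k-1}) used here is the Fletcher-Reeves formula, which is an assumption since the slide does not say which definition of γ_k is intended. All names are illustrative.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, x0, tol=1e-12, max_iter=100):
    """Minimise E(x) = 0.5 x.A.x - b.x for a symmetric positive-definite A."""
    x = list(x0)
    g = [ai - bi for ai, bi in zip(matvec(A, x), b)]  # gradient g_0 = A x - b
    v = [-gi for gi in g]                             # first direction: v_0 = -g_0
    steps = 0
    while dot(g, g) > tol and steps < max_iter:
        Av = matvec(A, v)
        alpha = -dot(g, v) / dot(v, Av)               # exact line minimum along v
        x = [xi + alpha * vi for xi, vi in zip(x, v)]
        g_new = [gi + alpha * avi for gi, avi in zip(g, Av)]
        gamma = dot(g_new, g_new) / dot(g, g)         # Fletcher-Reeves gamma_k
        v = [-gn + gamma * vi for gn, vi in zip(g_new, v)]
        g = g_new
        steps += 1
    return x, steps


# Narrow valley E = x^2 + 10 y^2, i.e. A = diag(2, 20) and b = 0.
A = [[2.0, 0.0], [0.0, 20.0]]
b = [0.0, 0.0]
x_cg, n_steps = conjugate_gradient(A, b, [1.0, 1.0])
print(x_cg, n_steps)
```

For a quadratic function in M dimensions, exact line searches along conjugate directions reach the minimum in at most M steps, in contrast to the zigzag of steepest descent in the same valley.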

Energy Minimization - Newton-Raphson Algorithm

Simplest second derivative method

Uses information about the curvature of the function

Should be familiar from 1st year…

Computationally demanding and more suited to smaller molecules

Has problems with structures far from a minimum: minimization can become unstable.

$$x_{n+1} = x_n - \frac{V'(x_n)}{V''(x_n)}$$
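A one-dimensional sketch of the iteration x_{n+1} = x_n - V'(x_n)/V''(x_n), applied to a toy double-well potential V(x) = x^4 - 2x^2 chosen here purely for illustration. The potential has minima at x = ±1 and a maximum at x = 0. The function names and tolerances are assumptions.

```python
def newton_raphson(d1, d2, x0, tol=1e-10, max_iter=50):
    """Iterate x <- x - V'(x)/V''(x) until the update is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        curvature = d2(x)
        if curvature == 0.0:
            raise ZeroDivisionError("zero second derivative: step undefined")
        step = d1(x) / curvature
        x -= step
        if abs(step) < tol:
            break
    return x


def v_prime(x):
    """First derivative of V(x) = x**4 - 2*x**2."""
    return 4.0 * x ** 3 - 4.0 * x


def v_double_prime(x):
    """Second derivative of V(x) = x**4 - 2*x**2."""
    return 12.0 * x ** 2 - 4.0


near_minimum = newton_raphson(v_prime, v_double_prime, 1.5)
far_from_minimum = newton_raphson(v_prime, v_double_prime, 0.3)
print(near_minimum, far_from_minimum)
```

Starting at 1.5 the iteration settles into the minimum at x = 1, but starting at 0.3, where the curvature is negative, it converges to the maximum at x = 0. The method finds stationary points, not specifically minima, which is one reason it is usually applied only after a more robust method has brought the structure close to a minimum.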

Work for tomorrow

Install NAMD and VMD

Perform minimization of a suitable protein

Have “before” and “after” .pdb files - send to me via email

Energy Minimization - problems

Only local minima found (multiple minima problem)

Only minimum of potential energy, not free energy

Energy Minimization - applications

Widely used in molecular modelling

Prior to Monte Carlo simulations or MD, to remove any unfavourable interactions in the initial configuration of the system

Often used for optimizing experimental structures

NAMD minimization procedure

Conjugate gradient parameters

The default minimizer uses a sophisticated conjugate gradient and line search algorithm with much better performance than the older velocity quenching method. The method of conjugate gradients is used to select successive search directions (starting with the initial gradient) which eliminate repeated minimization along the same directions. Along each direction, a minimum is first bracketed (rigorously bounded) and then converged upon by either a golden section search, or, when possible, a quadratically convergent method using gradient information.

For most systems, it just works.

* minimization < Perform conjugate gradient energy minimization? >
Acceptable Values: on or off
Default Value: off
Description: Turns efficient energy minimization on or off.

* minTinyStep < first initial step for line minimizer >
Acceptable Values: positive decimal
Default Value: 1.0e-6
Description: If your minimization is immediately unstable, make this smaller.

* minBabyStep < max initial step for line minimizer >
Acceptable Values: positive decimal
Default Value: 1.0e-2
Description: If your minimization becomes unstable later, make this smaller.

* minLineGoal < gradient reduction factor for line minimizer >
Acceptable Values: positive decimal
Default Value: 1.0e-4
Description: Varying this might improve conjugate gradient performance.

Grid search with energy minimization

Approach to “mapping” the energy surface

Produce an adiabatic map as a function of the chief conformational coordinates

E.g. Ramachandran map of conformational energy as a function of the backbone dihedral angles φ and ψ

Search through the conformational coordinates in increments

Very time-consuming if done thoroughly, only practicable in a few dimensions



$$\left(\frac{360}{D}\right)^{n}$$
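To see why a thorough grid search is only practicable in a few dimensions, the sketch below evaluates (360/D)^n, reading D as the angular increment in degrees and n as the number of conformational coordinates being scanned. That reading of the symbols is an assumption, as is the function name.

```python
def grid_points(increment_deg, n_coords):
    """Number of grid points when n_coords angles are each scanned over 360 degrees."""
    if 360 % increment_deg != 0:
        raise ValueError("this sketch assumes the increment divides 360 evenly")
    return (360 // increment_deg) ** n_coords


for n in (1, 2, 3, 5):
    print(n, grid_points(10, n))
```

Every grid point requires its own constrained energy minimization, so the cost grows by a factor of 360/D with each extra coordinate.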

Energy Minimization - Normal Mode Analysis

Normal modes of vibration are simple harmonic oscillations about a local energy minimum, characteristic of a system's structure and its energy function.

For a purely harmonic function any motion can be exactly expressed as a superposition of normal modes.

For an anharmonic function, the potential near the minimum will still be well approximated by a harmonic potential, and any small-amplitude motion can still be well described by a sum of normal modes.

Energy Minimization - Normal Mode Analysis

The normal mode spectrum of a 3-dimensional system of N atoms contains 3N - 6 normal modes (3N - 5 for linear molecules in 3D).

In general, the number of modes is the system's total number of degrees of freedom minus the number of degrees of freedom that correspond to pure rigid body motion (rotation or translation).
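The counting rule can be written as a small helper: a non-linear molecule loses 6 rigid-body degrees of freedom (3 translations, 3 rotations), while a linear molecule loses only 5 because rotation about its own axis does not move any atom. The function name is illustrative.

```python
def n_normal_modes(n_atoms, linear=False):
    """Number of vibrational normal modes of a molecule in three dimensions."""
    if n_atoms < 2:
        raise ValueError("need at least two atoms for a vibration")
    rigid_body = 5 if linear else 6  # translations plus rotations
    return 3 * n_atoms - rigid_body


print(n_normal_modes(3))               # bent triatomic such as water
print(n_normal_modes(3, linear=True))  # linear triatomic such as CO2
print(n_normal_modes(2, linear=True))  # diatomic
```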


Energy Minimization - Normal Mode Analysis

Each mode is defined by an eigenvector and its corresponding eigenfrequency.

The eigenvector contains the amplitude and direction of motion for each atom (same frequency of vibration for all atoms)

Energy Minimization - Normal Mode Analysis

In macromolecules, the lowest frequency modes correspond to delocalized motions, in which a large number of atoms oscillate with considerable amplitude.

The highest frequency motions are more localized, with appreciable amplitudes for fewer atoms, e.g., the stretching of bonds between carbon and hydrogen atoms.

Energy Minimization - Normal Mode Analysis

Normal modes are useful because they correspond to collective motions of atoms in a coupled system that can be individually excited

Frequencies of normal modes and displacements may be calculated from a molecular mechanics force field using the Hessian matrix of second derivatives

Molecule must be at a minimum
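The Hessian route can be sketched on the smallest possible case: two masses joined by a harmonic spring and restricted to motion along one line. The Hessian of second derivatives is mass-weighted and diagonalised; the eigenvalues are squared angular frequencies and the eigenvectors give the relative atomic displacements. The masses and force constant are arbitrary illustrative values, and the closed-form 2x2 eigen-solver is a convenience of this toy case.

```python
import math


def symmetric_2x2_eigen(a, b, d):
    """Eigenvalues (low, high) and the high eigenvector of [[a, b], [b, d]], b != 0."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    low, high = mean - radius, mean + radius
    vec = (b, high - a)  # satisfies (a - high) * b + b * (high - a) = 0
    norm = math.hypot(vec[0], vec[1])
    return low, high, (vec[0] / norm, vec[1] / norm)


# Arbitrary illustrative parameters in consistent units.
K, M1, M2 = 1.0, 1.0, 3.0

# Hessian of V = 0.5 * K * (x2 - x1 - r0)**2 is [[K, -K], [-K, K]];
# mass-weighting divides element (i, j) by sqrt(m_i * m_j).
a = K / M1
d = K / M2
b = -K / math.sqrt(M1 * M2)

lam_low, lam_high, q = symmetric_2x2_eigen(a, b, d)
omega = math.sqrt(max(lam_high, 0.0))  # vibrational angular frequency
displacement = (q[0] / math.sqrt(M1), q[1] / math.sqrt(M2))  # back to Cartesian

print("rigid translation eigenvalue:", lam_low)
print("vibration omega:", omega, "displacements:", displacement)
```

The zero eigenvalue is the rigid translation, and the remaining mode has omega^2 = K(1/M1 + 1/M2), the familiar reduced-mass result. Its displacements leave the centre of mass fixed.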

Energy Minimization - Normal Mode Analysis

Results can be:

Used to calculate thermodynamic quantities

Compared to spectroscopic experiments

Used in parameterization of force fields

For large molecules, low-energy vibrations are of most interest: they correspond to large-scale conformational motions and can be compared to molecular dynamics simulations

Energy Minimization - Docking

The prediction of the strength and specificity with which a small to medium sized molecule can bind to a biological macromolecule

Docking: evaluating the energy of binding between two molecules for various relative positions of the two

Simplification: use rigid molecules
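A toy sketch of the rigid-molecule simplification: the "ligand" is translated as a rigid body over a grid of positions relative to a fixed "receptor", and the interaction energy is evaluated at each position. The Lennard-Jones pair energy, the coordinates and the grid are all assumptions chosen for illustration; a real docking code would also scan orientations and use a proper scoring function.

```python
import itertools


def lennard_jones(r2, epsilon=1.0, sigma=1.0):
    """Pair energy 4*eps*((sigma/r)**12 - (sigma/r)**6) from the squared distance."""
    if r2 < 1e-12:
        return float("inf")  # atoms on top of each other
    s6 = (sigma * sigma / r2) ** 3
    return 4.0 * epsilon * (s6 * s6 - s6)


def interaction_energy(receptor, ligand, shift):
    """Sum of pair energies with the rigid ligand translated by shift."""
    total = 0.0
    for rx, ry, rz in receptor:
        for lx, ly, lz in ligand:
            dx = lx + shift[0] - rx
            dy = ly + shift[1] - ry
            dz = lz + shift[2] - rz
            total += lennard_jones(dx * dx + dy * dy + dz * dz)
    return total


def grid_dock(receptor, ligand, grid_values):
    """Try every translation on a cubic grid and keep the lowest-energy one."""
    best_shift, best_energy = None, float("inf")
    for shift in itertools.product(grid_values, repeat=3):
        energy = interaction_energy(receptor, ligand, shift)
        if energy < best_energy:
            best_shift, best_energy = shift, energy
    return best_shift, best_energy


# Made-up rigid point sets (reduced units, sigma = 1).
RECEPTOR = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.1, 0.0)]
LIGAND = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
GRID = [i * 0.25 for i in range(-12, 13)]  # translations from -3 to 3

best_shift, best_energy = grid_dock(RECEPTOR, LIGAND, GRID)
print(best_shift, best_energy)
```

Because neither molecule changes shape, each relative position needs only one energy evaluation, which is what makes an exhaustive scan of positions affordable.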
