
Methods of Molecular Simulations: Lecture Notes

Petra Imhof

June 16, 2014

Contents

1 Introduction

2 Exploring the Energy Landscape
  2.1 Energy functions
  2.2 Finding Energy Minima
    2.2.1 Steepest Descent
    2.2.2 Conjugate Gradients
    2.2.3 Newton's Method
    2.2.4 Quasi Newton Methods
    2.2.5 Step size control
  2.3 Convergence Criteria
  2.4 Normal Mode Analysis
    2.4.1 Infrared Spectra
  2.5 Finding Transition States
    2.5.1 Eigenvector Following
    2.5.2 Coordinate Driving
    2.5.3 Synchronous Transit
    2.5.4 Nudged Elastic Band
    2.5.5 Conjugate Peak Refinement
    2.5.6 Characterisation of Stationary Points
  2.6 Intrinsic Reaction Path

3 Electrostatic interactions
  3.1 Basics
    3.1.1 Numbers
  3.2 Implicit Solvent Methods
    3.2.1 Poisson-Boltzmann
    3.2.2 Reaction Field
    3.2.3 Generalised Born
    3.2.4 Polarisable Continuum Method
  3.3 Discrete solution of the Poisson equation
    3.3.1 Discretizing the Laplace operator
    3.3.2 Periodic Boundary Conditions
    3.3.3 Solution via Fourier Transformation
    3.3.4 Direct summation
  3.4 Ewald summation
    3.4.1 Particle Mesh Ewald

1 Introduction

to be written....

Recommended Reading

[1] Leach, A. R. (2001) Molecular modelling: principles and applications, 2nd edition, Pearson Prentice Hall, New Jersey.

[2] Schlick, T. (2010) Molecular Modeling and Simulation: An Interdisciplinary Guide, 2nd edition, Springer, New York, Heidelberg, London.

[3] Frenkel, D. and Smit, B. (2002) Understanding Molecular Simulation: From Algorithms to Applications, 2nd edition, Academic Press, San Diego, London.

[4] Allen, M. P. and Tildesley, D. J. (1987) Computer simulation of liquids, 2nd edition, Oxford University Press, Oxford, New York.


2 Exploring the Energy Landscape

In order to describe molecules quantitatively, the concept of the potential energy surface (PES) has turned out to be useful. The PES can be regarded as an energy landscape with valleys, peaks and mountain passes, whose features correspond to the following molecular properties:

Feature of the landscape | Corresponds to
minima in valleys | (stable) molecular structures
energy difference (difference in altitude) between reactant and product valley | reaction energy
height and profile of mountain pass | reaction rate
shape of valley | vibrational spectrum
whole, separated potential energy surface | electronic state
energy differences between separate potential energy surfaces | electronic spectrum (in Born-Oppenheimer approximation)

Figure 1 shows a schematic, two-dimensional energy function and its important features.

Figure 1: A Potential Energy surface and its features from ref. [?]


2.1 Energy functions

The potential energy surface is a function of the atomic coordinates and of the atom types. One possibility is to solve the time-independent Schrödinger equation

$$ H\Psi = E\Psi \tag{2.1.1} $$

In this case, the atom types are defined by their atomic number (and the type of isotope). The Hamiltonian $H$, and hence the solution of the Schrödinger equation, depends on the coordinates of the nuclei $R$ and on the electronic coordinates $r$:

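$$
H(R, r) = -\frac{\hbar^2}{2}\sum_A \frac{1}{M_A}\Delta_A
+ \frac{e^2}{4\pi\varepsilon_0}\sum_{A<B}\frac{Z_A Z_B}{R_{AB}}
- \frac{\hbar^2}{2m}\sum_i \Delta_i
+ \frac{e^2}{4\pi\varepsilon_0}\left[-\sum_i\sum_A \frac{Z_A}{r_{Ai}}
+ \frac{1}{2}\sum_i\sum_{j\neq i}\frac{1}{r_{ij}}\right]
\tag{2.1.2}
$$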

In eq. 2.1.2, $\hbar$ is the reduced Planck constant, $Z_A$ and $Z_B$ are the nuclear charges, $e$ is the elementary charge, $\varepsilon_0$ is the vacuum permittivity, and $M_A$ and $m$ are the nuclear and the electron masses, respectively. $R_{AB}$, $r_{Ai}$ and $r_{ij}$ are the distances between nuclei $A$ and $B$, between electron $i$ and nucleus $A$, and between electrons $i$ and $j$. $\Delta$ represents the Laplace operator, $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$. The many-electron problem can only be solved iteratively. Note that for large biomolecules, such as proteins, the number of atoms is of the order of several thousand, and the number of electrons is another several thousand (six, seven and eight for each carbon, nitrogen and oxygen atom, respectively).

A simpler approach is an empirical molecular mechanical energy function, which does not consider the electrons explicitly. A typical molecular mechanics energy function looks like

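$$
E_{MM} = \sum_{\text{bonds}} \frac{k_b}{2}\,(b-b_0)^2
+ \sum_{\text{angles}} \frac{k_\theta}{2}\,(\theta-\theta_0)^2
+ \sum_{\text{torsions}} \frac{V_n}{2}\left[1+\cos(n\phi-\phi_0)\right]
+ \sum_{i,j} 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right]
+ \sum_{i,j} \frac{q_i q_j}{4\pi\varepsilon\, r_{ij}}
\tag{2.1.3}
$$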

This means that (covalent) bonds and bond angles are described by harmonic potentials. The force constants $k_b$, $k_\theta$ and the equilibrium values $b_0$, $\theta_0$ are empirical parameters. The torsion cannot be described by a harmonic potential. In the torsion potential in eq. 2.1.3, $V_n$ is the barrier height of the torsion of multiplicity $n$ (the periodicity, i.e. the number of repeats between zero and $2\pi$). $V_n$ and the value $\phi_0$ are also parameters. In addition to the bonded terms (the sums over bonds, angles and torsions), there are non-bonded terms, which are summed over all pairs of atoms that are not connected by a bonded term. The first non-bonded term is a Lennard-Jones potential representing the van der Waals interaction of two atoms $i$ and $j$, and the last term is the Coulomb interaction of two charges $q_i$ and $q_j$. The set of all parameters is called a force field.

In order to account for the different bonding situations an atom can be in, there are more atom types than chemical elements. Figure 3 shows the atom types and their charges for the amino acid asparagine as used in the CHARMM force field. Note that there are three different types of carbon atoms, two different kinds of nitrogen atoms, two different oxygen atom types and four different types of hydrogen atoms.


Figure 2: Energy terms in a molecular mechanical energy function.
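The individual terms of eq. 2.1.3 can be written down directly. The following is a minimal Python sketch, not taken from any particular simulation package; the function names and the `coulomb_const` argument (standing for $1/4\pi\varepsilon$ in whatever unit system is used) are assumptions made for illustration.

```python
import math


def bond_energy(b, k_b, b0):
    """Harmonic bond term of eq. 2.1.3: k_b/2 * (b - b0)^2."""
    return 0.5 * k_b * (b - b0) ** 2


def torsion_energy(phi, v_n, n, phi0):
    """Periodic torsion term: V_n/2 * [1 + cos(n*phi - phi0)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - phi0))


def lennard_jones(r, eps, sigma):
    """Lennard-Jones term: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def coulomb(r, q_i, q_j, coulomb_const=1.0):
    """Coulomb term q_i*q_j/(4*pi*eps*r); coulomb_const stands for 1/(4*pi*eps)."""
    return coulomb_const * q_i * q_j / r


# The Lennard-Jones potential crosses zero at r = sigma and has its
# minimum of depth -eps at r = 2^(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
print(lennard_jones(1.0, eps=1.0, sigma=1.0))
print(lennard_jones(r_min, eps=1.0, sigma=1.0))
```

The angle term has the same harmonic form as the bond term, so `bond_energy` can be reused for it with $k_\theta$ and $\theta_0$.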

Which atoms are bound to each other, and consequently which atoms form a bond angle or a torsion (dihedral) angle, has to be pre-defined. Bonds are represented by lines and bars in figure 3. Every set of three atoms connected via bonds forms a bond angle, and every set of four atoms connected via bonds defines a dihedral (torsion) angle. This definition, i.e. this topology, remains fixed during the entire simulation. That means that chemical reactions, which involve bond breaking and making, cannot be described appropriately by this sort of empirical energy function (see footnote 1). Another drawback is that each new type of molecule requires new parameters to be developed. Quantum mechanical (electronic structure) methods require in principle only the coordinates and types of the nuclei. They are flexible in the sense that they can be used for all sorts of molecules, changes in the electronic density such as those occurring in bond breaking and making can be treated accurately, and even different electronic states can be handled. However, the computational cost is rather high, and at least nowadays a full first-principles calculation of several thousand atoms is a tremendous challenge. Empirical force fields, on the other hand, are computationally efficient. This is their major advantage and the reason why they are so widely used in biomolecular simulations.

Footnote 1: There are, however, other types of classical force field that make use of e.g. Morse potentials so as to better account for elongated or dissociating bonds.


Atom name | Atom type | Charge
N    | NH1 | -0.47
HN   | H   |  0.31
CA   | CT1 |  0.07
HA   | HB  |  0.09
CB   | CT2 | -0.18
HB1  | HA  |  0.09
HB2  | HA  |  0.09
CG   | CC  |  0.55
OD1  | O   | -0.55
ND2  | NH2 | -0.62
HD21 | H   |  0.32
HD22 | H   |  0.30

    !     |
    !  HN-N
    !     |   HB1 OD1    HD21 (cis to OD1)
    !     |   |   ||    /
    !  HA-CA--CB--CG--ND2
    !     |   |         \
    !     |   HB2        HD22 (trans to OD1)
    !   O=C
    !     |

Figure 3: Atom types and their charges of Asparagine in the CHARMM force field [?, ?].

2.2 Finding Energy Minima

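In the following we will discuss a few methods to locate local minima on the potential energy surface. A minimum is defined as a point where the first derivative of the energy function vanishes, $\partial E / \partial x_i = 0$, and the second derivative is larger than zero, $\partial^2 E / \partial x_i^2 > 0$, for all (atomic) coordinates $x_i$. (In more than one dimension the precise condition is that all eigenvalues of the matrix of second derivatives are positive.) The vector of the first derivatives is called the gradient

$$
g = \begin{pmatrix} \dfrac{\partial E}{\partial x_1} \\ \vdots \\ \dfrac{\partial E}{\partial x_n} \end{pmatrix} = \frac{\partial E}{\partial x}
\tag{2.2.1}
$$

and the Hessian matrix $H$ contains all second derivatives

$$
H = \begin{pmatrix}
\dfrac{\partial^2 E}{\partial x_1 \partial x_1} & \dots & \dfrac{\partial^2 E}{\partial x_1 \partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial^2 E}{\partial x_n \partial x_1} & \dots & \dfrac{\partial^2 E}{\partial x_n \partial x_n}
\end{pmatrix} = \frac{\partial^2 E}{\partial x^2}
\tag{2.2.2}
$$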

Note that the Hessian is symmetric. Numerically, local minima are determined iteratively.


2.2.1 Steepest Descent

One of the simplest methods to locate local minima is steepest descent. This method can be considered as a walk downhill on the energy landscape, with the walk being in the direction of the net force, i.e. along the negative gradient (note that the force is the negative gradient). Consider a linear model, i.e. a Taylor expansion up to first order of the new iteration point $x_{k+1}$ around the previous point $x_k$

$$ E(x_{k+1}) = E(x_k + d_k) \approx E(x_k) + d_k^T g_k \tag{2.2.3} $$

where $d_k$ is the step direction. Which direction now minimises $E(x_k + d_k)$?

$$
\min\{E(x_k + d_k)\} = \min\{E(x_k) + d_k^T g_k\} = E(x_k) + \min\{d_k^T g_k\} = E(x_k) - \max\{-d_k^T g_k\}
\tag{2.2.4}
$$

For a direction of fixed length, $-d_k^T g_k$ is maximal if $d_k$ is antiparallel to the gradient, hence

$$ d_k = -\frac{g_k}{|g_k|} \tag{2.2.5} $$

and the new point is given by

$$ x_{k+1} = x_k - \sigma_k g_k \tag{2.2.6} $$

where $\sigma_k$ determines the step size. The step size can be chosen heuristically: start with an arbitrary step size. If the iteration reduces the energy value, increase $\sigma$; if the iteration increases the energy, reject the step and decrease $\sigma$. Alternatively, the step size can be determined by a line search in the direction of the gradient $g$, which is a one-dimensional problem.

If the line search locates the minimum along the search direction, the gradient at the new point is orthogonal to that of the previous one, $g_k^T g_{k+1} = 0$. As a consequence, the steepest descent algorithm makes quick progress initially, but is slow at later stages. In particular, in practice it does not reach the minimum exactly but only comes close to it. For shallow minima it can exhibit oscillations as illustrated in Fig. 4.

Figure 4: Steepest descent minimisation. Due to the orthogonality of subsequent gradients, the algorithm oscillates around the minimum.
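The step-size heuristic described above (increase $\sigma$ after a successful step, reject the step and decrease $\sigma$ otherwise) can be sketched as follows. This is an illustrative Python sketch: the growth and shrink factors 1.2 and 0.5, the tolerance and the quadratic test function are arbitrary assumptions, not values from the text.

```python
def steepest_descent(energy, gradient, x0, sigma=0.1, tol=1e-8, max_iter=10000):
    """Minimise `energy` by steps x_{k+1} = x_k - sigma * g_k (eq. 2.2.6)."""
    x = list(x0)
    e = energy(x)
    for iteration in range(max_iter):
        g = gradient(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return x, e, iteration
        x_new = [xi - sigma * gi for xi, gi in zip(x, g)]
        e_new = energy(x_new)
        if e_new < e:
            x, e = x_new, e_new   # accept the step and try a larger one next
            sigma *= 1.2
        else:
            sigma *= 0.5          # reject the step and shrink the step size
    return x, e, max_iter


# Hypothetical test function: a narrow valley, E = x^2 + 10 y^2.
def energy(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def gradient(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_min, e_min, n_iter = steepest_descent(energy, gradient, [1.0, 1.0])
print(x_min, e_min, n_iter)
```

Because a step is only accepted when it lowers the energy, the energy decreases monotonically, at the price of occasional rejected trial steps.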

2.2.2 Conjugate Gradients

In order to avoid oscillations and to avoid searching again where we have already looked, we would like to preserve the minimisation already achieved in previous iterations. The search direction is then given by

$$ d_k = -g_k + \beta_k d_{k-1} \tag{2.2.7} $$

where the second term $\beta_k d_{k-1}$ restores "some minimisation from before". This term is determined by the concept of conjugate directions. Let us say we have three iteration points $x_k$, $x_{k+1}$ and $x_{k+2}$, where $x_k$ and $x_{k+1}$ are connected by $d_k = \frac{x_{k+1}-x_k}{|x_{k+1}-x_k|}$ and accordingly $x_{k+1}$ and $x_{k+2}$ are connected by $d_{k+1} = \frac{x_{k+2}-x_{k+1}}{|x_{k+2}-x_{k+1}|}$. We furthermore know that at the minimum of a line search $d_k \cdot g_{k+1} = 0$ and $d_{k+1} \cdot g_{k+2} = 0$. We now also want $d_k \cdot g_{k+2} = 0$, i.e. we want to go in a completely new direction that does not undo the minimisation already achieved along $d_k$.

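Using a Taylor expansion up to second order

$$ E(x_{k+1}) = E(x_k + d_k) \approx E(x_k) + d_k^T g_k + \frac{1}{2} d_k^T H_k d_k \tag{2.2.8} $$

and taking the derivative with respect to $d_k$

$$ \frac{\partial}{\partial d_k} E(x_{k+1}) = \frac{\partial}{\partial d_k} E(x_k) + \frac{\partial}{\partial d_k} d_k^T g_k + \frac{\partial}{\partial d_k} \frac{1}{2} d_k^T H_k d_k $$

$$ \frac{\partial}{\partial d_k} E(x_{k+1}) = g_k + H_k d_k + \dots \tag{2.2.9} $$

$$ 0 = g_k + H_k d_k $$

we obtain the gradient at the new point, which vanishes at a minimum. Within the quadratic model, i.e. with the Hessian taken as constant, we can set

$$ g_{k+1} = g_k + H_k d_k \tag{2.2.10} $$

$$ g_{k+2} = g_{k+1} + H_k d_{k+1} $$

and use this to obtain

$$ d_k \cdot g_{k+2} = 0 $$

$$ d_k \cdot (g_{k+1} + H_k d_{k+1}) = 0 \tag{2.2.11} $$

which, since $d_k \cdot g_{k+1} = 0$, yields

$$ d_k \cdot H_k d_{k+1} = 0 \tag{2.2.12} $$

The two directions $d_k$ and $d_{k+1}$ are conjugate, i.e. orthogonal with respect to the matrix $H$. Note that in steepest descent $d_k \cdot d_{k+1} = 0$, which is equivalent to orthogonality with respect to the identity matrix $I$. For our iteration to work we have to determine $\beta$ in eq. 2.2.7:

$$ d_{k+1} \cdot H_k d_k = 0 \tag{2.2.13} $$

$$ (-g_{k+1} + \beta_{k+1} d_k) \cdot H_k d_k = 0 $$

Since $H_k d_k$ is parallel to $g_{k+1} - g_k$ (as can be seen from eq. 2.2.10), we get

$$ (-g_{k+1} + \beta_{k+1} d_k) \cdot (g_{k+1} - g_k) = 0 \tag{2.2.14} $$

$$ -|g_{k+1}|^2 + \beta_{k+1}\, d_k \cdot g_{k+1} - \beta_{k+1}\, d_k \cdot g_k + g_{k+1} \cdot g_k = 0 $$

At the minimum (of the line search in the previous step) $g_{k+1} \cdot g_k = 0$ and $d_k \cdot g_{k+1} = 0$, and we obtain

$$ \beta_{k+1} = -\frac{|g_{k+1}|^2}{d_k \cdot g_k} \tag{2.2.15} $$

and with $d_k \cdot g_k = -|g_k|^2$ (which holds for $d_k = -g_k$ and, because $d_{k-1} \cdot g_k = 0$, also for any $d_k$ constructed according to eq. 2.2.7)

$$ \beta_{k+1} = \frac{|g_{k+1}|^2}{|g_k|^2} \tag{2.2.16} $$

and the update of the direction reads

$$ d_{k+1} = -g_{k+1} + \frac{|g_{k+1}|^2}{|g_k|^2} \cdot d_k \tag{2.2.17} $$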
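The update 2.2.17 can be tried out on a quadratic model energy $E(x) = \frac{1}{2} x^T A x - b^T x$, for which the gradient is $g = Ax - b$, the Hessian is the constant matrix $A$, and the line search along $d$ can be done exactly. The sketch below is illustrative only; the matrix, the vector and the helper names are assumptions, and for a general energy function the exact line search would have to be replaced by a numerical one.

```python
def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(a, b, x0, tol=1e-12):
    """Minimise E(x) = x^T A x / 2 - b^T x using the direction update 2.2.17."""
    x = list(x0)
    g = [axi - bi for axi, bi in zip(matvec(a, x), b)]  # gradient A x - b
    d = [-gi for gi in g]                               # first step: steepest descent
    steps = 0
    while dot(g, g) ** 0.5 > tol and steps < 10 * len(b):
        ad = matvec(a, d)
        alpha = -dot(d, g) / dot(d, ad)                 # exact line search along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, ad)]  # cf. eq. 2.2.10
        beta = dot(g_new, g_new) / dot(g, g)            # eq. 2.2.16
        d = [-gn + beta * di for gn, di in zip(g_new, d)]     # eq. 2.2.17
        g = g_new
        steps += 1
    return x, steps


a = [[4.0, 1.0], [1.0, 3.0]]   # symmetric, positive definite model Hessian
b = [1.0, 2.0]
x_min, steps = conjugate_gradient(a, b, [2.0, 1.0])
print(x_min, steps)
```

For an $n$-dimensional quadratic function the minimum is reached after at most $n$ such steps, up to rounding errors; here the result should agree with the solution of $Ax = b$, i.e. $x = (1/11, 7/11)$.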

2.2.3 Newton's Method

Before we deal with finding minima, recall Newton's method for finding the root of a (one-dimensional) function:

$$ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \tag{2.2.18} $$

Figure 5: Newton's iterative method to find the root of $y = f(x)$. Calculate the function value $f(x_0)$ at a starting point $x_0$. The intercept of the tangent, which has the slope $f'(x_0)$, with the x-axis gives the new point $x_1$, etc. With the slope of the tangent becoming smaller and smaller, the evaluation points $x_k$ approach $x^*$ with $f(x^*) = 0$.

In figure 5 we see that

$$ f'(x_0) = \frac{f(x_0)}{x_0 - x_1} $$

$$ x_0 - x_1 = \frac{f(x_0)}{f'(x_0)} \tag{2.2.19} $$

$$ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} $$


Finding a minimum can be understood as finding the root of $f'(x)$ in one dimension. Newton's method to find stationary points, i.e. points where the gradient vanishes, is thus equivalent to finding the root of the gradient.

We start with a Taylor expansion up to second order (one dimension)

$$ f(x_{k+1}) = f(x_k + d_k) \tag{2.2.20} $$

$$ f(x_{k+1}) \approx f(x_k) + d_k \cdot f'(x_k) + \frac{1}{2} d_k^2 \cdot f''(x_k) $$

In order to find the step $d_k$ that minimises $f(x_{k+1})$ we take the derivative with respect to $d_k$ and set it to zero:

$$ f'(x_{k+1}) \approx f'(x_k) + d_k \cdot f''(x_k) $$

$$ 0 = f'(x_k) + d_k \cdot f''(x_k) $$

$$ d_k = -\frac{f'(x_k)}{f''(x_k)} \tag{2.2.21} $$

And in the multi-dimensional case

$$ f(x_{k+1}) = f(x_k + d_k) $$

$$ f(x_{k+1}) \approx f(x_k) + d_k^T \cdot g_k + \frac{1}{2} d_k^T \cdot H_k d_k $$

$$ d_k = -H_k^{-1} g_k \tag{2.2.22} $$

such that the new point is given by

$$ x_{k+1} = x_k - \sigma H_k^{-1} \cdot g_k \tag{2.2.23} $$

where $\sigma$ defines the step size.

Newton's method is exact for quadratic functions, since the search direction points exactly to the minimum. It is efficient close to stationary points where the function is approximately quadratic (quadratic region). However, it finds the closest stationary point, regardless of whether it is a minimum, a maximum or a saddle point. Thus, the outcome depends on the starting point. (Note that in a search for a minimum one can always check whether the function value, i.e. the energy, decreases.) The most severe drawback of the method is the need to compute the Hessian $H$ and to either invert it or solve the linear equation system $H_k d_k = -g_k$. This is computationally expensive for large systems, and for very large systems the storage of the Hessian alone can cause problems.
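That Newton's method converges to the nearest stationary point, whatever its character, is easy to see in one dimension. The sketch below applies eq. 2.2.21 to the hypothetical test function $f(x) = x^3 - 3x$, which has a minimum at $x = 1$ and a maximum at $x = -1$; the function, the tolerance and the names are assumptions chosen for illustration.

```python
def newton_stationary(df, d2f, x0, tol=1e-10, max_iter=50):
    """Iterate x_{k+1} = x_k - f'(x_k)/f''(x_k) until the step is below tol."""
    x = x0
    for _ in range(max_iter):
        step = -df(x) / d2f(x)   # eq. 2.2.21
        x += step
        if abs(step) < tol:
            break
    return x


def df(x):
    return 3.0 * x * x - 3.0     # f'(x) for f(x) = x^3 - 3x


def d2f(x):
    return 6.0 * x               # f''(x)


x_from_right = newton_stationary(df, d2f, 2.0)    # approaches the minimum at +1
x_from_left = newton_stationary(df, d2f, -2.0)    # approaches the maximum at -1
print(x_from_right, d2f(x_from_right) > 0)
print(x_from_left, d2f(x_from_left) > 0)
```

The sign of the second derivative at the converged point distinguishes the two cases, which is the one-dimensional analogue of checking the eigenvalues of the Hessian.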

2.2.4 Quasi Newton Methods

Instead of computing the Hessian (and inverting it), an approximation to the Hessian is used that is based on the change in the gradient between the iterations. Written symbolically,

$$ H_{k+1} = \frac{g_{k+1} - g_k}{x_{k+1} - x_k} \tag{2.2.24} $$

which is analogous to computing the finite difference

$$ f''(x_{k+1}) = \frac{f'(x_{k+1}) - f'(x_k)}{x_{k+1} - x_k} \tag{2.2.25} $$

This leads us to the secant equation

$$ H_{k+1}(x_{k+1} - x_k) = g_{k+1} - g_k $$

$$ H_{k+1} S_k = Y_k \tag{2.2.26} $$

with

$$ S_k = x_{k+1} - x_k, \qquad Y_k = g_{k+1} - g_k \tag{2.2.27} $$

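Symmetric rank-one update (SR1)

The new Hessian is computed as "the old Hessian plus something". Both $H_k$ and $H_{k+1}$ must be symmetric and positive definite.

We set

$$ H_{k+1} = H_k + \alpha u u^T \tag{2.2.28} $$

with $\alpha \neq 0$, $u \in \mathbb{R}^n$, $u \neq 0$. Plugging eq. 2.2.28 into the secant equation 2.2.26 we get

$$ H_{k+1} S_k = H_k S_k + \alpha u u^T S_k $$

$$ H_k S_k + \alpha u u^T S_k = Y_k \tag{2.2.29} $$

We choose

$$ u = Y_k - H_k S_k \tag{2.2.30} $$

such that

$$ \alpha \left(u^T S_k\right) = 1 \tag{2.2.31} $$

$$ \alpha = \frac{1}{u^T S_k} $$

$$ \alpha = \frac{1}{(Y_k - H_k S_k)^T S_k} $$

For the Hessian update we thus obtain

$$ H_{k+1} = H_k + \frac{(Y_k - H_k S_k)(Y_k - H_k S_k)^T}{(Y_k - H_k S_k)^T S_k} $$

$$ H_{k+1} = H_k + \frac{u u^T}{\alpha^{-1}} \tag{2.2.32} $$

This update becomes problematic if the denominator $(Y_k - H_k S_k)^T S_k$ is close to zero: the update correction should be small and must not approach $\infty$.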

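BFGS

A solution to this problem is an update scheme of rank 2

$$ H_{k+1} = H_k + \alpha u u^T + \beta v v^T \tag{2.2.33} $$

Plugging this update into the secant equation

$$ H_{k+1} S_k = H_k S_k + \alpha u u^T S_k + \beta v v^T S_k $$

$$ H_k S_k + \alpha u \left(u^T S_k\right) + \beta v \left(v^T S_k\right) = Y_k \tag{2.2.34} $$

and choosing

$$ u = Y_k \tag{2.2.35} $$

$$ v = H_k S_k \tag{2.2.36} $$

we get

$$ H_k S_k + \alpha Y_k \left(Y_k^T S_k\right) + \beta H_k S_k (H_k S_k)^T S_k = Y_k \tag{2.2.37} $$

$$ \alpha Y_k \left(Y_k^T S_k\right) + \beta H_k S_k \left(S_k^T H_k S_k\right) = Y_k - H_k S_k $$

Comparing the coefficients we get for $\alpha$ and $\beta$

$$ \alpha \left(Y_k^T S_k\right) = 1 $$

$$ \beta \left(S_k^T H_k S_k\right) = -1 $$

and we arrive at the update formula of Broyden-Fletcher-Goldfarb-Shanno (BFGS)

$$ H_{k+1} = H_k + \frac{Y_k Y_k^T}{Y_k^T S_k} - \frac{H_k S_k (H_k S_k)^T}{S_k^T H_k S_k} \tag{2.2.38} $$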
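A quick numerical check of eq. 2.2.38 is to apply one update and confirm that the result satisfies the secant equation $H_{k+1} S_k = Y_k$ and stays symmetric. The following Python sketch does this for a small example; the helper names and the numbers are assumptions chosen for illustration.

```python
def matvec(h, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in h]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def bfgs_update(h, s, y):
    """Return H_{k+1} from H_k, S_k and Y_k according to eq. 2.2.38."""
    hs = matvec(h, s)
    ys = dot(y, s)      # Y^T S, must be positive
    shs = dot(s, hs)    # S^T H S, positive for positive definite H
    n = len(s)
    return [[h[i][j] + y[i] * y[j] / ys - hs[i] * hs[j] / shs
             for j in range(n)] for i in range(n)]


h_old = [[1.0, 0.0], [0.0, 1.0]]   # start from the identity matrix
s = [1.0, 0.0]                     # S_k = x_{k+1} - x_k
y = [2.0, 1.0]                     # Y_k = g_{k+1} - g_k, with Y^T S = 2 > 0
h_new = bfgs_update(h_old, s, y)
print(h_new)
print(matvec(h_new, s))            # reproduces Y_k if the secant equation holds
```

For these numbers the updated matrix has a positive trace and a positive determinant, so it is positive definite, in line with the hereditary properties discussed for this update.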

DFP

An alternative update scheme uses an approximation of the inverse Hessian, $B_{k+1} = H_{k+1}^{-1}$, directly. The quasi-Newton condition is

$$ x_{k+1} - x_k = B_{k+1}(g_{k+1} - g_k) $$

leading to the update formula of Davidon-Fletcher-Powell (DFP)

$$ B_{k+1} = B_k + \frac{S_k S_k^T}{S_k^T Y_k} - \frac{B_k Y_k (B_k Y_k)^T}{Y_k^T B_k Y_k} \tag{2.2.39} $$

Note that this update formula has the same structure as the BFGS formula and can be obtained from it by replacing $H_k$ with $B_k$ and interchanging $S_k$ and $Y_k$.

By plugging in one can see that the secant equation is fulfilled:

$$ B_{k+1} Y_k = B_k Y_k + \frac{S_k S_k^T}{S_k^T Y_k} Y_k - \frac{B_k Y_k (B_k Y_k)^T}{Y_k^T B_k Y_k} Y_k $$

$$ B_{k+1} Y_k = B_k Y_k + \frac{S_k \left(S_k^T Y_k\right)}{S_k^T Y_k} - \frac{B_k Y_k \cdot Y_k^T B_k Y_k}{Y_k^T B_k Y_k} $$

$$ B_{k+1} Y_k = B_k Y_k + S_k - B_k Y_k $$

$$ B_{k+1} Y_k = S_k \tag{2.2.40} $$


In both update schemes, BFGS and DFP, the updated (inverse) Hessian matrices show hereditary symmetry and hereditary positive definiteness, that is, if $B_k$ is symmetric and positive definite, then $B_{k+1}$ is also symmetric and positive definite. An update scheme that ensures hereditary symmetry but not positive definiteness, i.e. one that allows the Hessian to have negative eigenvalues, is the Powell-Symmetric-Broyden update (PSB)

$$
H^{PSB}_{k+1} = H_k + \frac{(Y_k - H_k S_k) S_k^T + S_k (Y_k - H_k S_k)^T}{S_k^T S_k} - \frac{S_k^T (Y_k - H_k S_k)\, S_k S_k^T}{\left(S_k^T S_k\right)^2}
\tag{2.2.41}
$$

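L-BFGS

The limited-memory variant of the BFGS update neither stores nor manipulates the approximate (inverse) Hessian of size $N^2$. Instead it stores only the vectors $S_k$ and $Y_k$ of length $N$ from the last $m$ iterations, which define the matrices

$$ V_k = I - \frac{Y_k S_k^T}{Y_k^T S_k} $$

and thus "remembers" the update only from the last $m$ iterations. With $H_k$ denoting here the approximation to the inverse Hessian, the update is then "unrolled" via

$$ H_k = V_{k-1}^T H_{k-1} V_{k-1} + \frac{S_{k-1} S_{k-1}^T}{Y_{k-1}^T S_{k-1}} \tag{2.2.42} $$

$$ H_k = V_{k-1}^T V_{k-2}^T H_{k-2} V_{k-2} V_{k-1} + V_{k-1}^T \frac{S_{k-2} S_{k-2}^T}{Y_{k-2}^T S_{k-2}} V_{k-1} + \frac{S_{k-1} S_{k-1}^T}{Y_{k-1}^T S_{k-1}} $$

and so on for the earlier iterations.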

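Proof of hereditary positive definiteness

Let $H_k$ be positive definite and $Y_k^T S_k > 0$. Then $z^T H_k z > 0$ for all $z \neq 0$. We will now show that $z^T H_{k+1} z$ is also positive for the BFGS update 2.2.38:

$$ z^T H_{k+1} z = z^T H_k z + \frac{z^T Y_k Y_k^T z}{Y_k^T S_k} - \frac{z^T (H_k S_k)(H_k S_k)^T z}{S_k^T H_k S_k} $$

$$ z^T H_{k+1} z = z^T H_k z + \frac{\left(z^T Y_k\right)^2}{Y_k^T S_k} - \frac{\left(z^T H_k S_k\right)^2}{S_k^T H_k S_k} \tag{2.2.43} $$

Since $H_k$ is positive definite we can write $H_k = H_k^{1/2} \cdot H_k^{1/2}$, and the Cauchy-Schwarz inequality gives

$$ z^T H_k S_k = z^T H_k^{1/2} H_k^{1/2} S_k \leq \left\| H_k^{1/2} z \right\| \left\| H_k^{1/2} S_k \right\| $$

$$ \left(z^T H_k S_k\right)^2 \leq \left\| H_k^{1/2} z \right\|^2 \left\| H_k^{1/2} S_k \right\|^2 $$

$$ \left(z^T H_k S_k\right)^2 \leq z^T H_k z \cdot S_k^T H_k S_k $$

$$ -\frac{\left(z^T H_k S_k\right)^2}{S_k^T H_k S_k} \geq -z^T H_k z $$

$$ z^T H_k z - \frac{\left(z^T H_k S_k\right)^2}{S_k^T H_k S_k} \geq z^T H_k z - z^T H_k z $$

$$ z^T H_k z - \frac{\left(z^T H_k S_k\right)^2}{S_k^T H_k S_k} \geq 0 \tag{2.2.44} $$

$$ z^T H_{k+1} z = z^T H_k z - \frac{\left(z^T H_k S_k\right)^2}{S_k^T H_k S_k} + \frac{\left(z^T Y_k\right)^2}{Y_k^T S_k} \geq 0 $$

Equality in the Cauchy-Schwarz step requires $z$ to be parallel to $S_k$, in which case $z^T Y_k \neq 0$ because $Y_k^T S_k > 0$. The sum is therefore strictly positive for every $z \neq 0$.

□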

2.2.5 Step size control

The quadratic model (Taylor expansion) is usually only valid in a small local region, the quadratic region. The region in which the model is trusted is called the trust region. Typically it is assumed to be a sphere of radius $\tau$

$$ T(\tau) = \{ x : \| x - x_a \| \leq \tau \} \tag{2.2.45} $$

With a quadratic model $m_a$ of the function $f$ around $x_a$

$$ m_a(x) = f(x_a) + \nabla f(x_a)^T (x - x_a) + \frac{1}{2} (x - x_a)^T \nabla^2 f(x_a) (x - x_a) \tag{2.2.46} $$

we first solve an auxiliary problem

$$ \min_{\|d\| \leq \tau} \{ m_a(x_a + d) \} \tag{2.2.47} $$

that leads to a trial solution $d_v$ with $x_v = x_a + d_v$. Then we test whether the quadratic model is a good approximation of $f$ in $T(\tau)$. Define the actual reduction $ared_a$ and the one predicted by the quadratic model, $pred_a$:

$$ ared_a = f(x_a) - f(x_v) $$

$$
\begin{aligned}
pred_a &= m_a(x_a) - m_a(x_v) \\
&= f(x_a) + \nabla f(x_a)^T (x_a - x_a) + \frac{1}{2} (x_a - x_a)^T \nabla^2 f(x_a) (x_a - x_a) \\
&\quad - f(x_a) - \nabla f(x_a)^T (x_v - x_a) - \frac{1}{2} (x_v - x_a)^T \nabla^2 f(x_a) (x_v - x_a) \\
&= -\nabla f(x_a)^T (x_v - x_a) - \frac{1}{2} (x_v - x_a)^T \nabla^2 f(x_a) (x_v - x_a) \\
&= -\nabla f(x_a)^T (x_v - x_a) - \frac{1}{2} (x_v - x_a)^T H_a (x_v - x_a)
\end{aligned}
\tag{2.2.48}
$$

with $H_a = \nabla^2 f(x_a)$. We define the ratio

$$ \rho = \frac{ared_a}{pred_a} \tag{2.2.49} $$

to evaluate the current step in the quadratic model and the current trust radius:

if $\rho < 10^{-4}$: reject the step (at worst $\rho$ is negative, i.e. going uphill)
if $\rho < 0.25$: decrease the trust radius, $\tau_{new} = \frac{1}{2} \tau_{old}$
if $\rho > 0.75$: increase the trust radius, $\tau_{new} = 2 \tau_{old}$
otherwise: leave the trust radius unchanged, $\tau_{new} = \tau_{old}$
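The acceptance test and the radius update based on $\rho$ translate into a few lines of code. The sketch below is illustrative; the function name and return convention are assumptions, and it presumes that the predicted reduction is positive, as it is when the trial step minimises the quadratic model.

```python
def update_trust_radius(ared, pred, tau):
    """Return (accept_step, new_radius) from the ratio rho = ared / pred.

    Assumes pred > 0, i.e. the quadratic model predicts an energy decrease.
    """
    rho = ared / pred
    accept = rho >= 1e-4          # reject steps that barely decrease or go uphill
    if rho < 0.25:
        tau_new = 0.5 * tau       # model is poor: shrink the trust region
    elif rho > 0.75:
        tau_new = 2.0 * tau       # model is good: enlarge the trust region
    else:
        tau_new = tau
    return accept, tau_new


print(update_trust_radius(0.9, 1.0, 1.0))    # good agreement with the model
print(update_trust_radius(0.5, 1.0, 1.0))    # moderate agreement
print(update_trust_radius(-0.2, 1.0, 1.0))   # energy went up
```

A rejected step always has $\rho < 0.25$, so the next trial step is automatically taken with a smaller trust radius.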

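The energy is minimised under the constraint that the step is not larger than $\tau$. This is done by introducing a Lagrangian multiplier $\lambda$ such that

$$ L = E(x) - \frac{1}{2} \lambda \left( \Delta x^2 - \tau^2 \right) \tag{2.2.50} $$

with $E(x) \approx g \cdot \Delta x + \frac{1}{2} \Delta x^T H \Delta x$

$$ \frac{dL}{d\Delta x} = \frac{d}{d\Delta x} E(x) - \lambda \Delta x $$

$$ \frac{dL}{d\Delta x} = g + H \Delta x - \lambda \Delta x $$

$$ 0 = g + H \Delta x - \lambda \Delta x $$

$$ -(H - \lambda I) \Delta x = g $$

$$ \Delta x = -(H - \lambda I)^{-1} g \tag{2.2.51} $$

Even when $H$ has negative eigenvalues one can get a controlled step towards the minimum by forcing $\lambda$ to be smaller (more negative) than the lowest eigenvalue of $H$. For transition state optimisation $\lambda$ must be larger than the lowest, but smaller than the second-lowest eigenvalue of $H$.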

2.3 Convergence Criteria

All the above methods are iterative and will only come close to the minimum but never reach the true minimum (except for Newton's method for quadratic functions). Therefore one needs to define convergence criteria which have to be fulfilled for the iteration to stop. Typical quantities that are required to fall below a given threshold are

• the energy difference between two iteration steps, $|E(x_k) - E(x_{k+1})| < \varepsilon\, [1 + |E(x_k)|]$

• the coordinate difference (displacement), $|x_k - x_{k+1}| < \varepsilon\, [1 + |x_k|]$

• the gradient norm, $|g_k| < \varepsilon\, [1 + |E(x_k)|]$
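The three criteria can be combined into a single test. The sketch below is illustrative: requiring all three criteria at once, the common threshold `eps` and the use of absolute values in the energy criterion are assumptions; in practice separate thresholds are often used for each quantity.

```python
def norm(v):
    return sum(vi * vi for vi in v) ** 0.5


def is_converged(e_k, e_k1, x_k, x_k1, g_k, eps=1e-6):
    """Check the energy, displacement and gradient criteria listed above."""
    energy_ok = abs(e_k - e_k1) < eps * (1.0 + abs(e_k))
    displacement = norm([a - b for a, b in zip(x_k, x_k1)])
    displacement_ok = displacement < eps * (1.0 + norm(x_k))
    gradient_ok = norm(g_k) < eps * (1.0 + abs(e_k))
    return energy_ok and displacement_ok and gradient_ok


# Tiny changes in energy and coordinates together with a small gradient: converged.
print(is_converged(1.0, 1.0 - 1e-9, [1.0, 2.0], [1.0, 2.0 + 1e-9], [1e-8, 0.0]))
# Same changes but a large gradient: not converged.
print(is_converged(1.0, 1.0 - 1e-9, [1.0, 2.0], [1.0, 2.0 + 1e-9], [1.0, 0.0]))
```

The factor $1 + |\cdot|$ on the right-hand sides makes each test a relative one for large values and an absolute one for values close to zero.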


2.4 Normal Mode Analysis

Let us consider a molecule as a multi-dimensional harmonic oscillator. We expand the energy in a Taylor series around a stationary point $x_{min}$

$$ E(x) \approx E(x_{min}) + \Delta x^T g(x_{min}) + \frac{1}{2} \Delta x^T H(x_{min}) \Delta x \tag{2.4.1} $$

The first term is a constant which we can arbitrarily set to zero. The second term vanishes since at a stationary point the gradient is zero. The force at a displacement $\Delta x$ from the stationary point is

$$ F = -\frac{dE}{d\Delta x} = -H(x_{min}) \Delta x \tag{2.4.2} $$

which we recognise as Hooke's law $F = -kx$ for the multi-dimensional case. We further know that $F = m\ddot{x}$, hence

$$ M \Delta\ddot{x} = -H \Delta x \tag{2.4.3} $$

This differential equation can be solved (like the one-dimensional harmonic oscillator) with the ansatz

$$ \Delta x(t) = a \cos(\omega t + \phi) $$

$$ \Delta\dot{x}(t) = -a \sin(\omega t + \phi) \cdot \omega $$

$$ \Delta\ddot{x}(t) = -a \cos(\omega t + \phi) \cdot \omega^2 \tag{2.4.4} $$

Using this in eq. 2.4.3 we get

$$ -M a \cos(\omega t + \phi) \cdot \omega^2 = -H a \cos(\omega t + \phi) $$

$$ a \cos(\omega t + \phi) \cdot \omega^2 = M^{-1} H a \cos(\omega t + \phi) $$

$$ a \cdot \omega^2 = M^{-1} H a \tag{2.4.5} $$

where the $\omega^2$ are the eigenvalues and the $a$ are the eigenvectors, that is, the collective vibrations of the atoms in the molecule, also known as normal modes. In order to calculate the eigenvalues the following secular equation has to be solved

$$ H^m a - \omega^2 a = 0 $$

$$ \sum_{j=1}^{3N} H^m_{ij} a_j - \omega^2 a_i = 0 $$

$$ \left( \omega^2 I - H^m \right) a = 0 \tag{2.4.6} $$

where $H^m$ is the mass-weighted Hessian $H^m = M^{-1/2} H M^{-1/2}$ and the eigenvectors are expressed in mass-weighted coordinates. Eq. 2.4.6 is solved by setting the secular determinant to zero

$$ \det\left( H^m - \omega^2 I \right) = 0 $$

$$
\begin{vmatrix}
H^m_{1,1} - \omega^2 & \dots & H^m_{1,3N} \\
\vdots & \ddots & \vdots \\
H^m_{3N,1} & \dots & H^m_{3N,3N} - \omega^2
\end{vmatrix} = 0
$$


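To give an explicit example let us consider a diatomic molecule whose atoms vibrate only in the x-direction

O1 ←→    O2 ←→

corresponding to the displacements $\Delta x_1$ and $\Delta x_2$. This yields the equation system

$$ -\omega^2 m_1 \Delta x_1 = -k^{11}_{xx} \Delta x_1 - k^{12}_{xx} \Delta x_2 $$

$$ -\omega^2 m_2 \Delta x_2 = -k^{21}_{xx} \Delta x_1 - k^{22}_{xx} \Delta x_2 $$

The $k^{ij}_{xx}$ are the force constants for the movements in the x direction, i.e. the second derivatives of the energy with respect to the x coordinates of atoms $i$ and $j$.

In matrix notation, dropping the $xx$, we get

$$
\begin{pmatrix} k^{11} - \omega^2 m_1 & k^{12} \\ k^{21} & k^{22} - \omega^2 m_2 \end{pmatrix}
\begin{pmatrix} \Delta x_1 \\ \Delta x_2 \end{pmatrix} = 0
\tag{2.4.7}
$$

For non-trivial solutions the determinant of the matrix must be zero

$$ \left( k^{11} - \omega^2 m_1 \right)\left( k^{22} - \omega^2 m_2 \right) - k^{21} \cdot k^{12} = 0 \tag{2.4.8} $$

which has the two solutions $\omega_1^2$ and $\omega_2^2$. Let us set $m_1 = m_2 = m$. Then

$$ k^{11} k^{22} - \omega^2 m\, k^{22} - \omega^2 m\, k^{11} + \left( \omega^2 m \right)^2 - k^{21} \cdot k^{12} = 0 $$

$$ \left( \omega^2 m \right)^2 - \omega^2 m \left( k^{22} + k^{11} \right) + \left( k^{11} k^{22} - k^{21} \cdot k^{12} \right) = 0 \tag{2.4.9} $$

$$ \omega^2_{1,2} = \frac{k^{11} + k^{22}}{2m} \pm \sqrt{ \left( \frac{k^{11} + k^{22}}{2m} \right)^2 - \frac{k^{11} k^{22} - k^{21} \cdot k^{12}}{m^2} } \tag{2.4.10} $$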
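As a check of eq. 2.4.10, consider two equal masses connected by a single spring with force constant $k$, so that $k^{11} = k^{22} = k$ and $k^{12} = k^{21} = -k$. One expects one solution $\omega^2 = 0$ (rigid translation of both atoms) and one solution $\omega^2 = 2k/m$ (the stretching vibration, i.e. $k/\mu$ with the reduced mass $\mu = m/2$). The short Python sketch below evaluates the formula; the function name and the numerical values are assumptions chosen for illustration.

```python
import math


def diatomic_omega_squared(k11, k12, k21, k22, m):
    """Both solutions of eq. 2.4.10 for two atoms of equal mass m."""
    mean = (k11 + k22) / (2.0 * m)
    discriminant = mean ** 2 - (k11 * k22 - k12 * k21) / m ** 2
    root = math.sqrt(discriminant)
    return mean - root, mean + root


k, m = 4.0, 2.0   # arbitrary units
omega2_translation, omega2_stretch = diatomic_omega_squared(k, -k, -k, k, m)
print(omega2_translation)   # rigid translation
print(omega2_stretch)       # stretching mode, compare with 2k/m
```

The vanishing eigenvalue illustrates why translations (and, in three dimensions, rotations) have to be separated from the $3N - 6$ genuine vibrations.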

2.4.1 Infrared Spectra

Vibrational frequencies can be computed from the eigenvalues of a normal mode analysis, and the corresponding eigenvectors represent the vibrations, i.e. the movement of the atoms. The vibrational frequencies of a molecule are in the range of infrared light, i.e. 430 THz to 300 GHz, corresponding to photon energies of roughly 1.8 eV to 1.2 meV or wavelengths of 700 nm to 1 mm. The unit typically used in infrared spectroscopy is the wavenumber in cm^-1.
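The conversions between frequency, wavelength, photon energy and wavenumber quoted above follow from $\lambda = c/\nu$, $E = h\nu$ and $\tilde{\nu} = \nu/c$. A small Python sketch, with function names that are assumptions made for illustration, is:

```python
H_PLANCK = 6.62607015e-34    # Planck constant in J s (exact SI value)
C_LIGHT = 299792458.0        # speed of light in m/s (exact SI value)
E_CHARGE = 1.602176634e-19   # elementary charge in C (exact SI value)


def wavelength_m(freq_hz):
    """Wavelength in metres, lambda = c / nu."""
    return C_LIGHT / freq_hz


def photon_energy_ev(freq_hz):
    """Photon energy in electron volts, E = h * nu."""
    return H_PLANCK * freq_hz / E_CHARGE


def wavenumber_per_cm(freq_hz):
    """Wavenumber in cm^-1, nu / c with c expressed in cm/s."""
    return freq_hz / (C_LIGHT * 100.0)


for freq in (430e12, 300e9):   # the two limits of the infrared range
    print(freq, wavelength_m(freq), photon_energy_ev(freq), wavenumber_per_cm(freq))
```

In these units the infrared range spans wavenumbers from about 10 cm^-1 to roughly 14000 cm^-1.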

If a system has a dipole whose oscillation frequency coincides with the frequency of electromagnetic radiation according to $h\nu_{em} = E_f - E_i$, the system absorbs the light and undergoes a transition from state $i$ to state $f$. In the case of IR light, these are vibrational transitions.

The intensity of the absorption is proportional to the square of the corresponding transition dipole moment $\mu_{if}$

$$ I_{if}(Q_n) \propto \left[ \mu_{if} \right]^2_{xyz} \tag{2.4.11} $$

where $Q_n$ is the normal mode coordinate of mode $n$ which is excited upon the transition. The transition dipole moment is given by

$$ \mu_{if} = \left\langle \Psi_f | \mu | \Psi_i \right\rangle \tag{2.4.12} $$

where $\mu$ is the molecular dipole moment, which has components in the $\zeta = x, y, z$ directions. For small displacements the molecular dipole moment can be expanded in a Taylor series

$$ \mu_\zeta = \mu^0_\zeta + \sum_{n}^{3N-6} \left( \frac{\partial \mu_\zeta}{\partial Q_n} \right)_0 Q_n + \dots \tag{2.4.13} $$

and

$$ \mu_{if} = \mu^0_\zeta \underbrace{\left\langle \Psi_f | \Psi_i \right\rangle}_{=0} + \sum_{n}^{3N-6} \left( \frac{\partial \mu_\zeta}{\partial Q_n} \right)_0 \underbrace{\left\langle \Psi_f | Q_n | \Psi_i \right\rangle}_{f = i \pm 1} \tag{2.4.14} $$

Figure 6: Spectrum of Electromagnetic Radiation.

The first term on the right hand side vanishes because of the orthogonality of the two wave functions. The integral in the second term is only different from zero for $f = i \pm 1$, i.e. only single excitations are allowed. And only if the dipole moment varies with $Q_n$ does the second term on the right hand side not vanish.

This also shows that the intensity of the absorption is determined by the change in the molecular dipole moment along the respective mode. Totally symmetric modes such as the symmetric stretch in carbon dioxide, with the two C=O bonds stretching in-phase, i.e. both bond lengths being the same throughout the vibration, are not IR-active.

Figure 7: Calculated (top) and experimental (bottom) IR spectra of carbon dioxide (left) and water (right).

Figure 8: Energy Surface with a first order saddle point (TS) connecting two minima Min1 and Min2

2.5 Finding Transition States

Transition states are first order saddle points, connecting reactant and product state. At a first order saddle point, the energy surface curves down in one direction, the one connecting R and P, and it curves up in all other directions. This corresponds to one negative eigenvalue of the Hessian, whose eigenvector is the transition vector, with all other eigenvalues positive.

2.5.1 Eigenvector Following

One way to find a transition state is "walking up valleys" (Fig. 9), also called shallowest ascent, following the lowest eigenvector. Diagonalisation of the Hessian yields the eigenvalues and eigenvectors

$$ H(x_k)\, a = \omega^2 a $$

where $\omega_1^2$ is the lowest eigenvalue with the corresponding eigenvector $a_1$. Close to the transition state the lowest eigenvalue is negative and the corresponding eigenvector points towards the transition state. The uphill direction is obtained from a projection of the gradient at $x_k$ onto the eigenvector $a_1$

$$ a_1^{up} = \frac{(a_1 \cdot g_k)\, a_1}{|(a_1 \cdot g_k)\, a_1|} \tag{2.5.1} $$

The step is then

$$ x_{k+1} = x_k + \lambda a_1^{up} \tag{2.5.2} $$

This method may lead to a wrong transition state that does not connect to the desired state on the other side (walking up the wrong hill).

Figure 9: Walking up valleys

Figure 10: Classes of reactions on an energy surface, represented by contour lines. a) hindered rotation, I-shape b) breaking and forming bonds, L- or V-shape, c) T-shape, d) H-shape

Figure 11: Coordinate driving

2.5.2 Coordinate Driving

Depending on the shape of the potential energy surface (cf. Fig. 10), different strategies to find a transition state can be successful or not.

A way to obtain information about the energy surface and eventually locate a transition state is a PES scan, also called coordinate driving or drag method. Here one reaction coordinate (RC) is defined that ideally describes the transition from reactant to product state. Usually it is an internal coordinate or a combination of a few internal coordinates. This reaction coordinate is "driven" by constraining the system to particular values along the RC and relaxing (optimising) all other degrees of freedom. For I-shaped (cf. Fig. 10) potential energy surfaces this should work. However, consider the PES shown in Fig. 11. Starting from R and driving the reaction coordinate, the system climbs up the PES close to the slowest ascent, as represented by the filled circles. When the system reaches an energy that is higher than that of the (unknown) saddle point, it slips over to the product valley (relaxation along the solid lines) and misses the saddle point. The corresponding energy profile shows a discontinuity at the RC value of the jump.

2.5.3 Synchronous Transit

In linear synchronous transit (LST) a pathway is constructed from idealised structures that are defined as interpolates between the reactant state R and the product state P

$$ r_{ab}(\xi) = (1 - \xi)\, r^R_{ab} + \xi\, r^P_{ab} \tag{2.5.3} $$

where $r_{ab}$ are the internuclear distances. The reactant state is at $\xi = 0$ and the product state is at $\xi = 1$. The $N(N-1)/2$ distinct internuclear distances overspecify the system of $3N - 6$ coordinates. The path is therefore obtained by minimising

$$ S(\xi) = \frac{1}{2} \sum_{a \neq b} \frac{\left[ r_{ab} - r^i_{ab}(\xi) \right]^2}{\left[ r^i_{ab}(\xi) \right]^4} + 10^{-6} \sum_{\zeta = x,y,z} \sum_a \left( \zeta_a - \zeta^i_a(\xi) \right)^2 \tag{2.5.4} $$

where $i$ denotes the $i$th interpolate and $\zeta$ are the Cartesian coordinates. The second term, weighted by $10^{-6}$, suppresses rigid translations and rotations.

The maximum of the LST path is then optimised laterally, under the constraint that the path coordinate $\vartheta$ remains the same, with

$$ \vartheta = \frac{d_R}{d_R + d_P} $$

where $d_R$ and $d_P$ are the distances from the intermediate structure to R and P, respectively:

$$ d_R = \left[ \frac{1}{N} \sum_{\zeta = x,y,z} \sum_a^N \left[ \zeta_a - \zeta^R_a \right]^2 \right]^{1/2} \tag{2.5.5} $$

Quadratic synchronous transit (QST) is a three-point interpolation where the third point $r^M_{ab}$ is obtained from linear synchronous transit

$$ r_{ab}(\xi) = (1 - \xi)\, r^R_{ab} + \xi\, r^P_{ab} + \gamma\, \xi (1 - \xi) \tag{2.5.6} $$

with

$$ \gamma = \frac{r^M_{ab} - (1 - \vartheta_M)\, r^R_{ab} - \vartheta_M\, r^P_{ab}}{\vartheta_M (1 - \vartheta_M)} \tag{2.5.7} $$
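The interpolation formulas can be checked for a single internuclear distance: the quadratic path should reproduce the reactant value at $\xi = 0$, the product value at $\xi = 1$ and the intermediate value $r^M_{ab}$ at $\xi = \vartheta_M$. The sketch below is illustrative; it writes the denominator of $\gamma$ as $\vartheta_M (1 - \vartheta_M)$, the form implied by requiring the path 2.5.6 to pass through $r^M_{ab}$, and all names and numbers are assumptions.

```python
def lst_distance(r_reactant, r_product, xi):
    """Linear interpolate of one internuclear distance, eq. 2.5.3."""
    return (1.0 - xi) * r_reactant + xi * r_product


def qst_gamma(r_reactant, r_product, r_mid, theta_mid):
    """Coefficient gamma chosen so that the QST path passes through r_mid."""
    linear = lst_distance(r_reactant, r_product, theta_mid)
    return (r_mid - linear) / (theta_mid * (1.0 - theta_mid))


def qst_distance(r_reactant, r_product, gamma, xi):
    """Quadratic interpolate of one internuclear distance, eq. 2.5.6."""
    return lst_distance(r_reactant, r_product, xi) + gamma * xi * (1.0 - xi)


r_r, r_p, r_m, theta_m = 1.0, 2.0, 1.8, 0.5   # hypothetical distances
gamma = qst_gamma(r_r, r_p, r_m, theta_m)
print(gamma)
print(qst_distance(r_r, r_p, gamma, 0.0))      # reactant value
print(qst_distance(r_r, r_p, gamma, theta_m))  # intermediate value
print(qst_distance(r_r, r_p, gamma, 1.0))      # product value
```

In a real application this interpolation is carried out for every pair of atoms, and the Cartesian structure at each $\xi$ is then obtained by minimising $S(\xi)$ from eq. 2.5.4.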

2.5.4 Nudged Elastic Band

Nudged elastic band (NEB) is a chain-of-states method, in which the path connecting reactant and product state is (initially) represented by a series of points $x_i$, $i = 1, \dots, m$, e.g. linear interpolates.

The optimal path is the one minimising the integral of the energy along the path

$$ E^{path} = \frac{1}{L} \int E(x(\xi))\, d\xi \tag{2.5.8} $$

represented as a discrete set of points

$$ V^{path} = \sum_i E(x_i) \tag{2.5.9} $$

All states are minimised at the same time under the constraint that the distances between adjacent states are approximately equal. To keep the points equidistant a spring potential is introduced

$$ V^{spring} = \frac{1}{2} k^{spring} \sum_i (x_i - x_{i-1})^2 \tag{2.5.10} $$

At each point the gradient has a component from the potential energy

$$ g^{path}_i = \frac{dV^{path}}{dx_i} = g^{path,\perp}_i + g^{path,\parallel}_i \tag{2.5.11} $$

and one from the spring potential

$$ g^{spring}_i = \frac{dV^{spring}}{dx_i} = g^{spring,\perp}_i + g^{spring,\parallel}_i \tag{2.5.12} $$

where both gradients can be separated into a component parallel and one perpendicular to the path. In the path optimisation, the points should slide down the potential energy gradient only perpendicular to the path, $g^{path,\perp}_i = g^{path}_i - g^{path,\parallel}_i$. The spring potential is used only to displace (nudge) the points along the path to keep them equidistant, i.e. only the parallel component of $g^{spring}_i = k^{spring} \left[ (x_i - x_{i-1}) - (x_{i+1} - x_i) \right]$ is kept. The gradient used in the path optimisation is then

$$ g^{NEB}_i = g^{path,\perp}_i + g^{spring,\parallel}_i \tag{2.5.13} $$

Figure 12: Initial (small circles) and final (thick circles) configurations of a nudged elastic band (NEB) with 16 images.

Figure 13: Depending on the initial coordinates, the NEB method converges to different minimum energy paths.

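Parallel means along the tangent $\tau_i$ to the path at point $i$

$$g_i^{\parallel} = \left( g_i^T \tau_i \right) \tau_i \tag{2.5.14}$$

The tangent can be obtained by e.g. central difference or bisection of two vectors:

$$\tau_i = \frac{x_{i+1} - x_i}{|x_{i+1} - x_i|} + \frac{x_i - x_{i-1}}{|x_i - x_{i-1}|} \tag{2.5.15}$$

normalised to unit length before it is used in eq. 2.5.14.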

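The minimisation/iteration is carried out as e.g.

$$\begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix}_{n+1} = \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix}_{n} - \lambda \begin{pmatrix} g_1^{NEB} \\ \vdots \\ g_m^{NEB} \end{pmatrix}_{n} \tag{2.5.16}$$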

or using other gradient-based methods such as L-BFGS. A sketch of initial and final NEB paths is shown in Fig. 12.

NEB usually converges fast in the beginning but can be slow towards the end. It finds a whole minimum energy path that can contain several minima and transition states; however, it does not necessarily include the exact transition state. To get closer to the saddle point one can interpolate between images, e.g. by a cubic polynomial, where the continuity of energy and force at both ends has to be kept.
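The NEB iteration can be illustrated on a two-dimensional model surface. The following is a minimal sketch, not a production implementation: the model potential, the spring constant, the step size and all function names are assumptions chosen for illustration. It uses the bisection tangent of eq. 2.5.15 (normalised), the perpendicular part of the potential gradient, the parallel part of the spring gradient, and plain steepest-descent updates as in eq. 2.5.16.

```python
import numpy as np

def energy(p):
    """Model surface with minima at (-1, 0), (1, 0) and a saddle at (0, -1)."""
    x, y = p
    return (x**2 - 1.0)**2 + 2.0 * (y - x**2 + 1.0)**2

def gradient(p):
    """Analytical gradient of the model surface."""
    x, y = p
    u = y - x**2 + 1.0
    return np.array([4.0 * x * (x**2 - 1.0) - 8.0 * x * u, 4.0 * u])

def neb(start, end, n_images=11, k_spring=5.0, step=0.01, n_steps=3000):
    """Steepest-descent NEB with fixed end points."""
    path = np.linspace(start, end, n_images)          # linear interpolates
    for _ in range(n_steps):
        new_path = path.copy()
        for i in range(1, n_images - 1):              # end points stay fixed
            fwd = path[i + 1] - path[i]
            bwd = path[i] - path[i - 1]
            tau = fwd / np.linalg.norm(fwd) + bwd / np.linalg.norm(bwd)
            tau /= np.linalg.norm(tau)                # unit tangent (bisection)
            g = gradient(path[i])
            g_perp = g - np.dot(g, tau) * tau         # potential: perpendicular part
            g_spring = -k_spring * (fwd - bwd)        # gradient of the spring potential
            g_par = np.dot(g_spring, tau) * tau       # spring: parallel part
            new_path[i] = path[i] - step * (g_perp + g_par)
        path = new_path
    return path

path = neb(np.array([-1.0, 0.0]), np.array([1.0, 0.0]))
barrier = max(energy(p) for p in path)
```

For this symmetric model the central image relaxes onto the saddle point, so `barrier` approximates the true barrier height; in general the highest image only brackets the transition state, as discussed above.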

2.5.5 Conjugate Peak Refinement

Conjugate Peak Refinement (CPR) is another chain-of-states optimiser. It starts by generating a set of points between reactant R and product state P, e.g. a linear interpolate. Then, the maximum $Y_1$ along this path is optimised along the conjugate vectors, resulting in a new point $S_1$. This point is used to construct a new path $R - S_1 - P$ with a new maximum $Y_2$ along the tangent of that path. Conjugate minimisation of that point yields a new point $S_2$ etc. (Fig. 14). CPR finds a path close to the minimum energy path that can contain several minima and saddle points. The true minimum energy path can be obtained by e.g. steepest descent from the saddle points. In contrast to NEB, the saddle points are true first-order transition states if the starting end points are true minima and the optimisation has been carried out to a strict enough convergence criterion.

2.5.6 Characterisation of Stationary Points

Once stationary points, i.e. minima and transition states, have been located, one should check whether the respective point is a true minimum or a true first-order saddle point. This can be done by calculating and diagonalising the Hessian. For a minimum all eigenvalues must be positive. For a transition state, there must be one and only one negative eigenvalue. Moreover, the transition vector (the eigenvector belonging to the negative eigenvalue) must point in the direction connecting reactant and product state. For simple transitions this can be loosely checked by visualising the corresponding atom movement.
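The eigenvalue check can be sketched as follows. The model surface, the finite-difference Hessian and the names `numerical_hessian` and `classify_stationary_point` are illustrative assumptions; for a real molecule the six (or five) near-zero eigenvalues of overall translation and rotation would have to be projected out before counting.

```python
import numpy as np

def gradient(p):
    """Gradient of the model surface V(x, y) = (x^2 - 1)^2 + y^2."""
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

def numerical_hessian(grad, p, h=1e-5):
    """Central-difference Hessian built from an analytical gradient."""
    p = np.asarray(p, dtype=float)
    n = p.size
    hess = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        hess[:, j] = (grad(p + step) - grad(p - step)) / (2.0 * h)
    return 0.5 * (hess + hess.T)                      # symmetrise

def classify_stationary_point(hess, tol=1e-6):
    """Count negative Hessian eigenvalues to classify a stationary point."""
    eigenvalues = np.linalg.eigvalsh(hess)
    if np.any(np.abs(eigenvalues) <= tol):
        return "degenerate (zero eigenvalue present)"
    n_negative = int(np.sum(eigenvalues < 0.0))
    if n_negative == 0:
        return "minimum"
    if n_negative == 1:
        return "first-order saddle point"
    return "higher-order saddle point"

kind_at_minimum = classify_stationary_point(numerical_hessian(gradient, [1.0, 0.0]))
kind_at_saddle = classify_stationary_point(numerical_hessian(gradient, [0.0, 0.0]))
```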

A more rigorous check is the computation of a steepest descent path, starting from the transition state

$$\frac{dx(s)}{ds} = -\frac{g(s)}{|g(s)|} \tag{2.5.17}$$

where $s$ is the arc length along the path in mass-weighted coordinates. The steepest descent path equals the intrinsic reaction coordinate (for a classical particle moving with infinitesimal energy).

2.6 Intrinsic Reaction Path

Consider a molecule with $N$ atoms of mass $m_n$ (for simplicity all atoms have the same mass). Then we can write a Lagrangian

$$\begin{aligned} L &= T - V = E_{kin} - E_{pot} \\ &= \sum_{n=1}^{N} \frac{m_n}{2} \left| \frac{dx(n)}{dt} \right|^2 - V[x(1), \ldots, x(N)] \\ &= \sum_{n=1}^{N} \frac{1}{2} \left| \frac{d\zeta(n)}{dt} \right|^2 - V[\zeta(1), \ldots, \zeta(N)] \end{aligned} \tag{2.6.1}$$

where $\zeta(n) = \sqrt{m_n}\, x(n)$ is a mass-weighted coordinate. Moreover, we can set

kinetic force = potential force

$$\frac{d^2 \zeta(n)}{dt^2} = -\frac{\partial V}{\partial \zeta(n)} \tag{2.6.2}$$

Figure 14: Conjugate Peak Refinement. The maximum along an interpolation of end states R and P is minimised orthogonal (in conjugate directions) to the connecting line to give a new path point. New path points are determined in subsequent maximisation-minimisation iterations so as to obtain a path that is close to the minimum energy path (green line), containing all saddle points between the end states R and P.

Figure 15: Reaction path from Transition State TS to the reactant valley R. Different initial velocities lead to different trajectories of similar style (green and blue). With almost zero velocities the steepest descent path (red) is followed.

All mass-weighted coordinates experience a classical force opposite to the potential gradient. When solving the differential equation with the saddle point as initial position $\zeta(t = 0)$ and an initial velocity component towards either reactant or product, the system will accelerate down the potential energy surface and oscillate (vibrate) from side to side of the valley.

Upon introduction of heavy friction such that the velocities are almost zero we obtain

$$\frac{d\zeta}{dt} = -\frac{\partial V}{\partial \zeta}$$

the steepest descent path in mass-weighted coordinates, i.e. the intrinsic reaction path. Let $t$ now parametrise the path instead of denoting time; then $\zeta(t)$ and $\zeta(t + dt)$ are two neighbouring points on the path, separated by $ds$ with

$$ds^2 = |\zeta(t + dt) - \zeta(t)|^2 = \left| \frac{d\zeta(t)}{dt} \right|^2 dt^2 = \left| \frac{\partial V}{\partial \zeta(t)} \right|^2 dt^2 \tag{2.6.3}$$

so that $dt = ds / \left| \partial V / \partial \zeta(t) \right|$, hence

$$\frac{d\zeta}{ds} = -\frac{\partial V / \partial \zeta}{\left| \partial V / \partial \zeta \right|} \tag{2.6.4}$$

which is equivalent to eq. 2.5.17. The distance $s$ (with units length$\cdot\sqrt{\text{mass}}$) is the intrinsic reaction coordinate.
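Following the normalised steepest-descent equation numerically can be sketched with simple Euler steps of fixed arc length `ds`. The model surface (unit masses, so that mass-weighted and Cartesian coordinates coincide), the small initial displacement along the transition vector and the stopping rule are assumptions made for this illustration only.

```python
import numpy as np

def energy(z):
    """Model surface with a saddle at (0, -1) and minima at (-1, 0) and (1, 0)."""
    x, y = z
    return (x**2 - 1.0)**2 + 2.0 * (y - x**2 + 1.0)**2

def gradient(z):
    x, y = z
    u = y - x**2 + 1.0
    return np.array([4.0 * x * (x**2 - 1.0) - 8.0 * x * u, 4.0 * u])

def steepest_descent_path(start, ds=1e-3, max_steps=10000):
    """Euler integration of d(zeta)/ds = -g/|g| until the energy stops decreasing."""
    path = [np.asarray(start, dtype=float)]
    for _ in range(max_steps):
        g = gradient(path[-1])
        g_norm = np.linalg.norm(g)
        if g_norm < 1e-10:                   # stationary point reached
            break
        candidate = path[-1] - ds * g / g_norm
        if energy(candidate) >= energy(path[-1]):
            break                            # stepping over the minimum
        path.append(candidate)
    return np.array(path)

saddle = np.array([0.0, -1.0])
transition_vector = np.array([1.0, 0.0])     # eigenvector of the negative eigenvalue
irc = steepest_descent_path(saddle + 0.01 * transition_vector)
arc_length = 1e-3 * (len(irc) - 1)           # intrinsic reaction coordinate s
```

Starting with the opposite sign of the displacement gives the branch towards the other minimum; together both branches form the full reaction path.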


3 Electrostatic interactions

3.1 Basics

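The electric field $\mathbf{E}$ due to a static charge density distribution $\rho(\mathbf{r})$ is determined by Gauss' law (one of Maxwell's equations):

$$\nabla \cdot \mathbf{E}(\mathbf{r}) = \frac{\rho(\mathbf{r})}{\varepsilon_0}$$

where $\varepsilon_0$ is the dielectric constant of vacuum. A static electric field can be derived from the electric potential $\phi(\mathbf{r})$: $\mathbf{E} = -\nabla \phi$, yielding the Poisson equation:

$$\nabla^2 \phi_i(\mathbf{r}) = -\frac{\rho_i(\mathbf{r})}{\varepsilon_0} \tag{3.1.1}$$

where the Laplace operator is given by

$$\nabla^2 = \nabla \cdot \nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$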

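The Poisson equation relates the charge density and the potential field generated by it. For special geometries the Poisson equation can be solved directly. Consider for example a Gaussian charge density with charge $q$ centered at the origin

$$\rho(\mathbf{r}) = G_\sigma(r) = \frac{q}{\sigma^3 (2\pi)^{3/2}} \exp\left( -\frac{r^2}{2\sigma^2} \right)$$

where $r = |\mathbf{r}|$ is the distance from the origin. We transform $\nabla^2$ to spherical coordinates:

$$\frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \phi_\sigma(r) \right) = -\frac{G_\sigma(r)}{\varepsilon_0}$$

Integrating two times with respect to $r$ and rearranging yields the solution

$$\phi_\sigma(r) = \frac{q}{4\pi\varepsilon_0 r}\, \mathrm{erf}\left( \frac{r}{\sqrt{2}\sigma} \right). \tag{3.1.2}$$

Taking the limit $\sigma \to 0$ provides the potential of a point charge at the origin:

$$\phi_\delta(r) = \frac{q}{4\pi\varepsilon_0 r}$$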
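The behaviour of the Gaussian-charge potential in its two limits can be checked numerically. The sketch below works in units where $4\pi\varepsilon_0 = 1$ (an assumption made for brevity); the function names are illustrative.

```python
import math

def phi_gaussian(r, q=1.0, sigma=1.0):
    """Potential of a Gaussian charge cloud, eq. (3.1.2), with 4*pi*eps0 = 1."""
    if r == 0.0:
        # finite limit r -> 0, from erf(x) ~ 2x/sqrt(pi) for small x
        return q * math.sqrt(2.0 / math.pi) / sigma
    return q / r * math.erf(r / (math.sqrt(2.0) * sigma))

def phi_point(r, q=1.0):
    """Potential of a point charge with 4*pi*eps0 = 1."""
    return q / r

far_field_difference = phi_point(5.0) - phi_gaussian(5.0, sigma=1.0)
value_at_origin = phi_gaussian(0.0, sigma=1.0)
```

A few $\sigma$ away from the centre the Gaussian cloud is indistinguishable from a point charge, while at the centre the potential stays finite.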

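In general, let there be a point charge $q_i$ at position $\mathbf{r}_i$. The electric potential generated by this point charge is

$$\phi_i(\mathbf{r}) = \frac{q_i}{4\pi\varepsilon_0 |\mathbf{r} - \mathbf{r}_i|} \tag{3.1.3}$$

We generalize this by letting $\rho_i(\mathbf{r})$ be an arbitrary charge density of ion $i$. The potential can then be written as a superposition of the potentials generated by point charges:

$$\phi_i(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int d^3r'\, \frac{\rho_i(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \tag{3.1.4}$$

The electrostatic interaction energy of a point charge $q$ at position $\mathbf{r}$ with a potential $\phi(\mathbf{r})$ is given by

$$E = q\, \phi(\mathbf{r}),$$

in the special case of two ions at positions $\mathbf{r}_i$ and $\mathbf{r}_j$ (Coulomb's law):

$$E_{ij} = \frac{q_i q_j}{4\pi\varepsilon_0 |\mathbf{r}_j - \mathbf{r}_i|}.$$

Finally, the total interaction energy of $N$ charges $q_1, \ldots, q_N$ is given by the sum over all distinct pairs:

$$E = \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 |\mathbf{r}_j - \mathbf{r}_i|}$$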

3.1.1 Numbers

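TIP3P water model:

Parameter   kcal / Å units                kJ / nm units
r(OH)       0.9572 Å                      0.09572 nm
∠HOH        104.52°                       104.52°
A           582 × 10^3 kcal Å^12 / mol    582 × 10^3 × 4.184 kJ/kcal × 10^-12 nm^12/Å^12 = 2.435088 × 10^-6 kJ nm^12 / mol
B           595 kcal Å^6 / mol            595 × 4.184 kJ/kcal × 10^-6 nm^6/Å^6 = 2.48948 × 10^-3 kJ nm^6 / mol
q(O)        -0.834 e                      -0.834 e
q(H)        0.417 e                       0.417 e

Elementary charge: $q_0 = 1.602176565 \times 10^{-19}$ C
Dielectric constant of vacuum: $\varepsilon_0 = 8.854187817 \times 10^{-12}$ C$^2$ N$^{-1}$ m$^{-2}$
Distance in Ångström: 1 Å $= 10^{-10}$ m
Avogadro constant: $N_A = 6.0221413 \times 10^{23}$ mol$^{-1}$

For two elementary charges at distance $r$ the Coulomb energy per mole is

$$E_{ij} = \frac{1}{r/\mathrm{nm}} \cdot \frac{(1.602176565 \times 10^{-19}\,\mathrm{C})^2 \times 6.0221413 \times 10^{23}\,\mathrm{mol^{-1}}}{4\pi \times 8.854187817 \times 10^{-12}\,\mathrm{C^2\,J^{-1}\,m^{-1}} \times 10^{-9}\,\mathrm{m}} \times 10^{-3}\,\frac{\mathrm{kJ}}{\mathrm{J}} = \frac{138.935}{r/\mathrm{nm}}\ \frac{\mathrm{kJ}}{\mathrm{mol}}$$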
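The unit conversion can be reproduced with a few lines, using the constants quoted above. The function names are illustrative; charges are given in units of the elementary charge and distances in nm.

```python
import math

E_CHARGE = 1.602176565e-19     # elementary charge in C
EPS0 = 8.854187817e-12         # dielectric constant of vacuum in C^2 N^-1 m^-2
N_AVOGADRO = 6.0221413e23      # Avogadro constant in 1/mol

def coulomb_prefactor_kj_mol_nm():
    """e^2 N_A / (4 pi eps0), converted from J m / mol to kJ nm / mol."""
    joule_metre_per_mol = E_CHARGE**2 * N_AVOGADRO / (4.0 * math.pi * EPS0)
    return joule_metre_per_mol * 1e9 * 1e-3    # m -> nm, J -> kJ

def coulomb_energy(q_i, q_j, r_nm):
    """Pair energy in kJ/mol for charges in units of e and a distance in nm."""
    return coulomb_prefactor_kj_mol_nm() * q_i * q_j / r_nm

prefactor = coulomb_prefactor_kj_mol_nm()
# TIP3P oxygen and hydrogen partial charges at a hypothetical distance of 0.3 nm
e_oh = coulomb_energy(-0.834, 0.417, 0.3)
```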

Figure 16: A) A molecule (protein) in explicit solvent. B) In implicit solvent models, only the effect the solvent has on the molecule is modelled, not the solvent molecules themselves.

3.2 Implicit Solvent Methods

3.2.1 Poisson-Boltzmann

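In a uniform dielectric medium with permittivity $\varepsilon$ the electrostatic potential is (omitting the $4\pi$ of the volume element)

$$\Delta \phi(r) = -\frac{\rho(r)}{\varepsilon_0 \varepsilon} \tag{3.2.1}$$

and the energies

$$E(r_{ij}) = \frac{q_i q_j}{4\pi\varepsilon_0 \varepsilon\, r_{ij}} \tag{3.2.2}$$

In a non-uniform dielectric medium, the permittivity is position-dependent, $\varepsilon = \varepsilon(r)$, and the Poisson equation becomes

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -\frac{\rho(r)}{\varepsilon_0} \cdot 4\pi \tag{3.2.3}$$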

In a simulation, a protein can be considered as a set of atomic point charges, immersed in a low dielectric medium ($\varepsilon_{protein} = 2$ to $4$), and the surrounding solvent (usually water) is given by a high-dielectric medium, containing ions. We have to consider the combined effect of solute charges, dielectric distribution and ionic distribution.

Express the density of a particle at any point relative to the density in absence of interactions with other particles

$$\rho_{particle}(r) = g(r)\, \rho^0_{particle}(r) \tag{3.2.4}$$

where $g(r)$ is the distribution function of that particle described by the Boltzmann distribution

$$g(r) = \exp\left[ -w(r) / k_B T \right] \tag{3.2.5}$$

$w(r)$ is the potential of mean force (its gradient w.r.t. the particle coordinates gives the mean force acting on the particle), representing the interaction energy of a particle with all other particles.

Figure 17: Sketch of a protein in ionic solution

The ion distribution $\rho_{ion}$ is hence described by Boltzmann statistics in a mean-field approximation, i.e. the effect of the whole system is condensed into a single-particle potential:

$$\rho_{ion}(r) = \sum_s c_s(r)\, q_s \exp\left[ -\beta q_s \phi(r) \right]$$

with $q_s$ the ion charge of species $s$, $c_s$ the local concentration of species $s$, and $\beta = \frac{1}{k_B T}$.

The Poisson equation for the protein with a sum of (delta function) charges $\rho(r)$ and such an ion distribution is then given by the Poisson-Boltzmann equation

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -\left[ \rho(r) + \rho_{ion}(r) \right] \cdot 4\pi$$

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -\left[ \rho(r) + \sum_s c_s(r)\, q_s \exp\left[ -\beta q_s \phi(r) \right] \right] \cdot 4\pi \tag{3.2.6}$$

Expanding the exponential in a power series of $\phi$:

$$\sum_s c_s(r)\, q_s \exp\left[ -\beta q_s \phi(r) \right] = \sum_s c_s(r)\, q_s - \beta \sum_s c_s(r)\, q_s^2\, \phi(r) + \ldots$$

For an electroneutral solution, the first sum on the right-hand side is zero. Truncating the power series after first order we get

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -\left[ \rho(r) - \beta \sum_s c_s(r)\, q_s^2\, \phi(r) \right] \cdot 4\pi \tag{3.2.7}$$

Introducing the ionic strength $I(r) = \frac{1}{2} \sum_s c_s(r)\, q_s^2$ we arrive at the linearised Poisson-Boltzmann equation

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -\left[ \rho(r) - \beta \cdot 2 I(r)\, \phi(r) \right] \cdot 4\pi \tag{3.2.8}$$

also often expressed as

$$\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = -4\pi \rho(r) + K^2 \varepsilon\, \phi(r)$$

with the Debye screening constant $K^2 = \frac{8\pi I}{\varepsilon k_B T}$; $\frac{1}{K}$ is the Debye length.

Figure 18: Illustration of the cubic grid for the numerical solution of the Poisson-Boltzmann equation.

Even for the linearised Poisson-Boltzmann equation analytical solutions are only possible for simple geometries.

Numerical solution

The numerical solution of the PBE is carried out by finite differences. The system is mapped on a cubic grid with spacing $l$, with a point charge $q_0$ at the centre of the cube and an ionic strength $I_0$ inside the cube. On the six faces $i = 1, \ldots, 6$ of the cube dielectric constants $\varepsilon_i$ are defined. Inside the adjacent cubes $i = 1, \ldots, 6$, we have the electrostatic potentials $\phi_i$, the charges $q_i$ and the ionic strengths $I_i$.

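We can then integrate over the volume of the cube

$$\int_{V_{cube}} \nabla \left[ \varepsilon(r) \nabla \phi(r) \right] dV = \int_{V_{cube}} 8\pi\beta I(r)\, \phi(r)\, dV - 4\pi \int_{V_{cube}} \rho(r)\, dV$$

$$\int_{V_{cube}} \nabla \left[ \varepsilon(r) \nabla \phi(r) \right] dV = 8\pi\beta I_0 \phi_0 l^3 - 4\pi q_0 \tag{3.2.9}$$

where the point charge contributes $\int_{V_{cube}} \rho(r)\, dV = q_0$. Using Gauss' theorem, the volume integral is transformed into a surface integral

$$\int_{V_{cube}} \nabla \left[ \varepsilon(r) \nabla \phi(r) \right] dV = \int_{S_{cube}} \varepsilon(r) \nabla \phi(r) \cdot \vec{n}\, dS \tag{3.2.10}$$

The derivative is replaced by an increment (finite difference)

$$\nabla \phi \cdot \vec{n} \approx \frac{\phi_i - \phi_0}{l}$$

such that

$$\sum_{i=1}^{6} \frac{\varepsilon_i (\phi_i - \phi_0)}{l}\, l^2 = 8\pi\beta I_0 \phi_0 l^3 - 4\pi q_0 \tag{3.2.11}$$

Solving this for $\phi_0$ we get

$$\sum_{i=1}^{6} \varepsilon_i \phi_i - \sum_{i=1}^{6} \varepsilon_i \phi_0 = 8\pi\beta I_0 \phi_0 l^2 - \frac{4\pi q_0}{l}$$

$$8\pi\beta I_0 \phi_0 l^2 + \sum_{i=1}^{6} \varepsilon_i \phi_0 = \sum_{i=1}^{6} \varepsilon_i \phi_i + \frac{4\pi q_0}{l} \tag{3.2.12}$$

$$\phi_0 = \frac{\sum_{i=1}^{6} \varepsilon_i \phi_i + 4\pi q_0 / l}{\sum_{i=1}^{6} \varepsilon_i + 8\pi\beta I_0 l^2}$$

Figure 19: Illustration of focussing in the numerical solution of the linearised Poisson-Boltzmann equation. The grid spacing is set finer and finer in a series of calculations while the solute is occupying a larger fraction of the grid space.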

Each $\phi_0$ depends on the other $\phi_i$. The solution is carried out iteratively, starting from arbitrary values. In order to obtain a stable solution grid sizes of $l \simeq 0.3$ would be necessary. This is usually computationally too demanding for proteins. Instead, the iterative solutions are obtained with focussing, i.e. initial solutions are computed on a coarse grid. In subsequent iterations, the grid becomes finer and finer, mainly in the region of interest.
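The iterative solution can be sketched with a Jacobi sweep over a small cubic grid. The sketch assumes the update rule $\phi_0 = \left( \sum_i \varepsilon_i \phi_i + 4\pi q_0 / l \right) / \left( \sum_i \varepsilon_i + 8\pi\beta I_0 l^2 \right)$ with $q_0$ the charge inside a grid cell, and simplifies it to a uniform dielectric constant and a uniform ionic strength; zero potential on the outer boundary, the grid size and the parameter values are further assumptions made for illustration.

```python
import numpy as np

def solve_lpbe(charge, eps=1.0, beta_I=0.0, l=1.0, n_iter=2000):
    """Jacobi iteration for the linearised PBE on a cubic grid.

    charge : 3D array with the charge assigned to each grid cell
    eps    : uniform dielectric constant (all six face values equal)
    beta_I : product beta * I0 of inverse temperature and ionic strength
    l      : grid spacing
    """
    phi = np.zeros_like(charge, dtype=float)          # boundary values stay zero
    denominator = 6.0 * eps + 8.0 * np.pi * beta_I * l**2
    for _ in range(n_iter):
        # sum over the six neighbours of every interior grid point
        neighbours = (phi[:-2, 1:-1, 1:-1] + phi[2:, 1:-1, 1:-1]
                      + phi[1:-1, :-2, 1:-1] + phi[1:-1, 2:, 1:-1]
                      + phi[1:-1, 1:-1, :-2] + phi[1:-1, 1:-1, 2:])
        phi[1:-1, 1:-1, 1:-1] = (eps * neighbours
                                 + 4.0 * np.pi * charge[1:-1, 1:-1, 1:-1] / l) / denominator
    return phi

n = 17
charge = np.zeros((n, n, n))
charge[8, 8, 8] = 1.0                                 # unit charge in the central cell
phi_plain = solve_lpbe(charge, beta_I=0.0)
phi_screened = solve_lpbe(charge, beta_I=0.05)
```

With a nonzero ionic strength the potential decays faster with distance from the charge, which is the screening effect described by the Debye length.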

3.2.2 Reaction Field

When a charge, distributed on a molecular system with $\varepsilon_{int} = \varepsilon_p$, is transferred from a uniform phase into the solvent, $\varepsilon_{ext} = \varepsilon_w$, it experiences a reaction field, given by the difference of the electrostatic potentials

$$\phi_{reac} = \phi_{sol} - \phi_{vac} \tag{3.2.13}$$

One can write an effective energy function $W$ of the molecular coordinates $r_M$ which takes into account the solvation

$$W(r_M) = H_{MM}(r_M) + \Delta G_{solv}(r_M) \tag{3.2.14}$$

The solvation free energy $\Delta G_{solv}$ consists of the cavity formation, i.e. the rearrangement of the solvent molecules, and the electrostatic solvation (polarisation)

$$\Delta G_{pol} = \frac{1}{2} \sum_i q_i\, \phi_{reac}(r_i)$$

Additionally there would be solvent entropy and nonpolar solvent-solute contributions to be considered. The electrostatic solvation free energy in a continuous form reads

$$\Delta G_{pol} = \frac{1}{2} \int \rho(r)\, \phi_{reac}(r)\, d^3r \tag{3.2.15}$$

Figure 20: Born solvation of a spherical particle(s)

3.2.3 Generalised Born

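The Born solvation free energy

$$\Delta G_{Born} = -\frac{q^2}{2a} \left( \frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w} \right)$$

describes the reaction field experienced by a single spherical charge of radius $a$.

Let us assume that the solvation free energy is given by a pairwise sum over interacting partial charges $q_i$ in the solute

$$\Delta G_{pol} = -\frac{1}{2} \left( \frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w} \right) \sum_{i,j} \frac{q_i q_j}{f_{GB}(r_{ij})} \tag{3.2.16}$$

The function $f_{GB}$ interpolates between the distance $r_{ij}$ at large distances and an "effective Born radius" $R_i$ at short distances

$$f_{GB}(r_{ij}) = \left[ r_{ij}^2 + R_i R_j \exp\left( -\frac{r_{ij}^2}{4 R_i R_j} \right) \right]^{\frac{1}{2}}$$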

For large distances, $r_{ij}^2 \gg R_i R_j$, the reaction field neglects the size of the atoms. At small distances, $r_{ij}^2 < R_i R_j$, the Born radii dominate.

Considering only one charge $q_i$ alone, the effective Born radius returns the electrostatic energy according to the Born equation if all other atoms were uncharged

$$\Delta G^i_{pol} = -\frac{q_i^2}{2 R_i} \left( \frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w} \right)$$

$$\Delta G^i_{pol} = \frac{q_i}{2}\, \phi^i_{reac}(r_i)$$

Hence, the effective Born radii could in principle be determined if $\phi_{reac}$ was known (e.g. from solving the PBE). This is however not useful since we want to avoid the computation of $\phi_{reac}$.
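The interpolating character of $f_{GB}$ and the reduction to the Born formula for a single ion can be checked with a short sketch. It assumes the pairwise form $\Delta G_{pol} = -\frac{1}{2} \left( \frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w} \right) \sum_{i,j} q_i q_j / f_{GB}(r_{ij})$, units with $4\pi\varepsilon_0 = 1$, and Born radii that are simply given as input; all names and numbers are illustrative.

```python
import math

def f_gb(r_ij, R_i, R_j):
    """Generalised Born interpolation function."""
    return math.sqrt(r_ij**2 + R_i * R_j * math.exp(-r_ij**2 / (4.0 * R_i * R_j)))

def gb_polarisation_energy(charges, born_radii, coords, eps_p=1.0, eps_w=80.0):
    """Pairwise GB solvation energy; the i = j terms are the Born self-energies."""
    prefactor = -0.5 * (1.0 / eps_p - 1.0 / eps_w)
    total = 0.0
    for i in range(len(charges)):
        for j in range(len(charges)):
            r_ij = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r_ij, born_radii[i], born_radii[j])
    return prefactor * total

# single ion: reduces to the Born formula -q^2/(2R) (1/eps_p - 1/eps_w)
single_ion = gb_polarisation_energy([1.0], [2.0], [(0.0, 0.0, 0.0)])
# hypothetical ion pair, 3 length units apart
ion_pair = gb_polarisation_energy([1.0, -1.0], [2.0, 1.5],
                                  [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
```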

Figure 21: Illustration of the Still method for the determination of Born radii.

The Born radii can be determined by the Still method:

The Born radius or effective radius reflects the degree of burial inside of the solute, i.e. the distance from the atom to the molecular surface.

$$\Delta G_{pol} = -\frac{1}{2} \left( \frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_w} \right) q_i^2 \left\{ \sum_{k=1}^{M} \frac{A_k}{4\pi r_k^2} \left[ \frac{1}{r_k - \frac{1}{2}\Gamma_k} - \frac{1}{r_k + \frac{1}{2}\Gamma_k} \right] + \frac{1}{r_{M+1} - \frac{1}{2}\Gamma_{M+1}} \right\}$$

The expression in braces, $\{\ldots\} = \frac{1}{a_i}$, yields the Born radii. The $r_k$ are the radii of spheres around the surface of the molecule, increasing until the whole molecule is included. $A_k$ is the corresponding amount of the surface area of such a sphere that is not included in the van der Waals surface of the molecule. $\Gamma_{k+1} = (1 + F)\, \Gamma_k$, where $F$ and $\Gamma$ are parameters.

3.2.4 Polarisable Continuum Method

The polarisation of the solute molecule M must satisfy the Poisson equation

$$-\nabla \left[ \varepsilon(r) \nabla \phi(r) \right] = 4\pi \rho_M(r) \tag{3.2.17}$$

This can be simplified to

$$-\nabla^2 \phi_{in}(r) = 4\pi \rho_M(r) \tag{3.2.18}$$

$$-\varepsilon \nabla^2 \phi_{out}(r) = 0 \tag{3.2.19}$$

where in and out refer to inside and outside the cavity. The shape of the cavity is given by the van der Waals radii of the solute atoms.

The total electrostatic potential is the sum of the molecular electrostatic potential and the reaction potential: $\phi(r) = \phi_M(r) + \phi_\sigma(r)$. This assumes that all real charges, described by $\rho_M$, are solute charges inside the cavity.

On the cavity surface $\Sigma$, two conditions must be fulfilled

$$\phi_{in} - \phi_{out} = 0 \tag{3.2.20}$$

and

$$\left( \frac{\partial \phi}{\partial n} \right)_{in} - \varepsilon \left( \frac{\partial \phi}{\partial n} \right)_{out} = 0 \tag{3.2.21}$$

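where $\vec{n}$ is the normal to the cavity surface.

The reaction potential $\phi_\sigma$ can be described by an apparent charge distribution $\sigma$ on the cavity surface.

$$\phi_\sigma(r) = \int_\Sigma \frac{\sigma(s)}{|r - s|}\, d^2s \tag{3.2.22}$$

where $s$ is a point on the cavity surface $\Sigma$.

The surface charge distribution that appears at the boundary of the two regions in and out is

$$\sigma_{in,out}(s) = -\left( \frac{\varepsilon_{out} - 1}{4\pi} \nabla \phi_{out} - \frac{\varepsilon_{in} - 1}{4\pi} \nabla \phi_{in} \right) \cdot \vec{n}_{in,out} \tag{3.2.23}$$

Inside the cavity we have $\varepsilon_{in} = 1$ and therefore $\frac{\varepsilon_{in} - 1}{4\pi} \nabla \phi_{in} = 0$, and outside $\varepsilon_{out} = \varepsilon$. The surface distribution is thus

$$\sigma(s) = \frac{\varepsilon - 1}{4\pi} \nabla \phi_{out} \cdot \vec{n}$$

or

$$\sigma(s) = \frac{\varepsilon - 1}{4\pi\varepsilon} \nabla \phi_{in} \cdot \vec{n} \tag{3.2.24}$$

Using $\phi_{in}(r) = \phi_{M,in}(r) + \phi_{\sigma,in}(r)$ we set

$$\sigma(s) = \frac{\varepsilon - 1}{4\pi\varepsilon} \frac{\partial \left( \phi_{M,in} + \phi_{\sigma,in} \right)}{\partial \vec{n}} \tag{3.2.25}$$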

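where $\vec{n}$ is the normal from in to out.

To solve the apparent surface charge (ASC) equations, the cavity surface is divided into small tesserae (boundary element method) with area $\Delta S$ small enough that $\sigma(s)$ can be considered constant within one tessera, represented by a charge $q_k$ at a point $s_k$ within $\Delta S_k$.

The charges

$$q_k = \Delta S_k\, \sigma(s_k)$$

$$q_k = \Delta S_k \cdot \frac{\varepsilon - 1}{4\pi\varepsilon} \nabla \phi_{M,in}(s_k) \cdot \vec{n} \tag{3.2.26}$$

are determined such that

$$\phi_\sigma(r) = \sum_k \frac{\sigma(s_k)\, \Delta S_k}{|r - s_k|} \tag{3.2.27}$$

$$\phi_\sigma(r) = \sum_k \frac{q_k}{|r - s_k|} \tag{3.2.28}$$

which is equation 3.2.22 simplified to a sum over all tesserae.

The charges $q_k$ are calculated iteratively according to

$$q_k^m = q_k^0 \left[ 1 + A_k + \ldots + A_k^m \right] - \sum_{i=0}^{m-1} A_k^i \sum_{l \neq k} q_l^{m-i-1} B_{k,l}$$

with

$$A_k = \frac{\varepsilon - 1}{4\pi\varepsilon} \left( 2\pi - 2\pi \sqrt{\frac{\Delta S}{4\pi R_k^2}} \right)$$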

$R_k$ is the local radius of the cavity, and

$$B_{k,l} = \frac{\varepsilon - 1}{4\pi\varepsilon}\, \Delta S\, \frac{(S_k - S_l) \cdot \vec{n}_k}{|S_k - S_l|^3}$$

The $A_k$ and $B_{k,l}$ depend on the cavity and the dielectric constant $\varepsilon$.

This can be formulated as a set of linear equations

$$D \cdot q = E_{in}$$

with

$$D_{kk} = \frac{4\pi\varepsilon}{\varepsilon - 1} (1 - A_k) \tag{3.2.29}$$

$$D_{kl} = \frac{4\pi\varepsilon}{\varepsilon - 1} B_{kl} \tag{3.2.30}$$

$$E_{in,k} = -\Delta S_k \nabla \phi_{M,in}(S_k) \cdot \vec{n}_k \tag{3.2.31}$$

The matrix $D$ depends on the initial definition of cavity and dielectric and does not change during the iteration.

Figure 22: a) In the polarisable continuum model, the solvent is represented as apparent surface charges, positioned in the centres of small tesserae of the cavity surface. b) The surface is formed by e.g. the van der Waals surface.


3.3 Discrete solution of the Poisson equation

3.3.1 Discretizing the Laplace operator

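The discrete solution of the Poisson equation requires the discretization of the Laplace operator $\nabla^2$. In a $d$-dimensional regular lattice with grid spacing $h$, the Laplace operator can be discretized as:

$$\nabla_d^2 \phi_i = \frac{1}{h^2} \left[ \sum_{\text{neighbors } j} \phi_j - 2d\, \phi_i \right]$$

For example, in 1D:

$$\nabla_1^2 \phi_i = \frac{\phi_{i-1} + \phi_{i+1} - 2\phi_i}{h^2}$$

this is equivalent to the finite difference

$$\frac{\partial^2 \phi}{\partial x^2} = \frac{\phi(x + h) + \phi(x - h) - 2\phi(x)}{h^2}$$

at all grid points.

In 2D, on an $m \times n$ lattice:

$$\nabla_2^2 \phi_{i,j} = \frac{\phi_{i-1,j} + \phi_{i+1,j} + \phi_{i,j-1} + \phi_{i,j+1} - 4\phi_{i,j}}{h^2}$$

and in 3D:

$$\nabla_3^2 \phi_{i,j,k} = \frac{\phi_{i-1,j,k} + \phi_{i+1,j,k} + \phi_{i,j-1,k} + \phi_{i,j+1,k} + \phi_{i,j,k-1} + \phi_{i,j,k+1} - 6\phi_{i,j,k}}{h^2}$$

Using the 2D case, we obtain the discretized Poisson equation (with $\varepsilon_0 = 1$):

$$4\phi_{i,j} - \phi_{i-1,j} - \phi_{i+1,j} - \phi_{i,j-1} - \phi_{i,j+1} = h^2 \rho_{i,j}.$$

This equation can be written in matrix-vector form

$$\begin{pmatrix} 4 & -1 & \cdots & 0 & 0 \\ -1 & 4 & & & \vdots \\ \vdots & & \ddots & & \\ 0 & & & 4 & -1 \\ 0 & \cdots & & -1 & 4 \end{pmatrix} \begin{pmatrix} \phi_{11} \\ \phi_{21} \\ \vdots \\ \phi_{mn} \end{pmatrix} = h^2 \begin{pmatrix} \rho_{11} \\ \rho_{21} \\ \vdots \\ \rho_{mn} \end{pmatrix}$$

where the matrix that contains the coefficients specifying the update rule is block-tridiagonal and sparse, allowing the use of efficient solvers.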
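Setting up and solving the 2D linear system can be sketched as follows, assuming the form $4\phi_{i,j} - \sum_{\text{neighbors}} \phi = h^2 \rho_{i,j}$ with $\varepsilon_0 = 1$ and zero potential on the boundary (both the boundary condition and the test charge density are assumptions of this example). A dense solver is used for brevity; a sparse solver would be the natural choice for large grids.

```python
import numpy as np

def poisson_matrix_2d(m, n):
    """Matrix with 4 on the diagonal and -1 for each of the four grid neighbours.

    Unknowns are ordered (phi_11, phi_21, ..., phi_mn), first index fastest.
    """
    def second_difference(size):
        return 2.0 * np.eye(size) - np.eye(size, k=1) - np.eye(size, k=-1)
    return (np.kron(np.eye(n), second_difference(m))
            + np.kron(second_difference(n), np.eye(m)))

def solve_poisson_2d(rho, h):
    """Solve the discretised Poisson equation with phi = 0 on the boundary."""
    m, n = rho.shape
    matrix = poisson_matrix_2d(m, n)
    phi = np.linalg.solve(matrix, h**2 * rho.flatten(order="F"))
    return phi.reshape((m, n), order="F")

# test charge density on the unit square with a known analytical solution
n_points = 20
h = 1.0 / (n_points + 1)
x = h * np.arange(1, n_points + 1)
phi_exact = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
rho = 2.0 * np.pi**2 * phi_exact
phi = solve_poisson_2d(rho, h)
max_error = np.max(np.abs(phi - phi_exact))
```

The remaining deviation from the analytical solution is the discretization error, which shrinks quadratically with the grid spacing $h$.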

3.3.2 Periodic Boundary Conditions

In a molecular simulation we are naturally limited to simulating only small and finite systems. A popular trick to mimic an infinite or at least large system, and thereby better model a real-world scenario, are periodic boundary conditions (PBC). With PBC the real box is replicated infinitely often by images in all dimensions. This further allows the particles to leave the simulation box while maintaining the box size: a particle that has left the box "to the left" will re-enter "from the right", see Fig. 23.

With the addition of periodic images, however, the grid for solving the Poisson equation by discretisation becomes infinite, too. Since every challenge bears an opportunity we will actually make use of the periodicity, i.e. the periodic charge density.

Figure 23: Illustration of periodic boundary conditions. All particles of the primary cell (shown in dark) are replicated in periodic images (grey). Particles leaving the box re-enter from the other side.

3.3.3 Solution via Fourier Transformation

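We consider the Fourier-transformed variants $\hat\phi(\mathbf{k})$ and $\hat\rho(\mathbf{k})$ in reciprocal space $\mathbf{k}$.

Consider real space, spanned by vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$, represented as a regular lattice. Upon discretisation on a mesh we have the following real-space coordinates

$$\mathbf{r}_m = \frac{m_1}{|\mathbf{a}_1|} \mathbf{a}_1 + \frac{m_2}{|\mathbf{a}_2|} \mathbf{a}_2 + \frac{m_3}{|\mathbf{a}_3|} \mathbf{a}_3$$

with index set $m = (m_1, m_2, m_3)$.

Associated with the lattice of mesh points is a finite reciprocal lattice:

$$\mathbf{k}_m = m_1 \mathbf{b}_1 + m_2 \mathbf{b}_2 + m_3 \mathbf{b}_3$$

where the vectors $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ with

$$\mathbf{b}_1 = 2\pi \frac{\mathbf{a}_2 \times \mathbf{a}_3}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}; \quad \mathbf{b}_2 = 2\pi \frac{\mathbf{a}_3 \times \mathbf{a}_1}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}; \quad \mathbf{b}_3 = 2\pi \frac{\mathbf{a}_1 \times \mathbf{a}_2}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}$$

span the reciprocal space or k-space, i.e. the 3D space of plane waves $\exp(i \mathbf{k} \mathbf{r})$ with wave vector $\mathbf{k}$.

In a crystal lattice, i.e. a regular lattice with translational symmetry, there is a translational vector $\mathbf{R}$ that transforms the crystal into itself. For the waves with that periodicity and wave vectors $\mathbf{k}$ we get

$$\exp(i \mathbf{k} \mathbf{r}) = \exp\left[ i \mathbf{k} (\mathbf{R} + \mathbf{r}) \right]$$

and

$$\exp(i \mathbf{k} \mathbf{R}) = 1$$

Our infinite regular lattice in real space can therefore be Fourier-transformed into finite k-space. Likewise we can transform the periodic meshed charge density via:

$$\hat\rho(\mathbf{k}) = \sum_{\mathbf{r}_m} \rho(\mathbf{r})\, e^{-i \langle \mathbf{k}, \mathbf{r} \rangle}$$

into k-space and backtransform via

$$\rho(\mathbf{r}) = \frac{1}{|\mathbf{a}_1| |\mathbf{a}_2| |\mathbf{a}_3|} \sum_{\mathbf{k}_m} \hat\rho(\mathbf{k})\, e^{i \langle \mathbf{k}, \mathbf{r} \rangle}$$

The potential can be transformed back and forth accordingly:

$$\hat\phi(\mathbf{k}) = \sum_{\mathbf{r}_m} \phi(\mathbf{r})\, e^{-i \langle \mathbf{k}, \mathbf{r} \rangle}$$

$$\phi(\mathbf{r}) = \frac{1}{|\mathbf{a}_1| |\mathbf{a}_2| |\mathbf{a}_3|} \sum_{\mathbf{k}_m} \hat\phi(\mathbf{k})\, e^{i \langle \mathbf{k}, \mathbf{r} \rangle}$$

The Fourier-transformed Poisson equation in reciprocal space $\mathbf{k}$ is:

$$k^2 \hat\phi(\mathbf{k}) = \frac{\hat\rho(\mathbf{k})}{\varepsilon_0} \tag{3.3.1}$$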

Therefore we can solve for $\phi$ by:

1. Transform charge density to reciprocal space: $\hat\rho(\mathbf{k}) = \mathcal{F}\{\rho(\mathbf{r})\}$

2. Compute reciprocal potential: $\hat\phi(\mathbf{k}) = \hat\rho(\mathbf{k}) / (k^2 \varepsilon_0)$

3. Transform back to real space: $\phi(\mathbf{r}) = \mathcal{F}^{-1}\{\hat\phi(\mathbf{k})\}$

The discrete Fourier transforms can be performed using the fast Fourier transform (FFT), which has a computational complexity of $M \log M$. Assuming that we assign $m$ lattice points to each charge, this amounts to a complexity of $Nm \log Nm = m(N \log N + N \log m)$, which has formal complexity $N \log N$ with a possibly large pre-factor $m$.
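The three steps can be sketched with NumPy's FFT routines for a cubic periodic box. The cosine test density, the grid size and the treatment of the $\mathbf{k} = 0$ mode (set to zero, which fixes the average potential and presumes a neutral box) are assumptions of this example.

```python
import numpy as np

def solve_poisson_fft(rho, box_length, eps0=1.0):
    """Solve the periodic Poisson equation on a cubic n x n x n grid."""
    n = rho.shape[0]
    # angular wave numbers 2*pi*m/L along one axis
    k_1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    kx, ky, kz = np.meshgrid(k_1d, k_1d, k_1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    rho_k = np.fft.fftn(rho)                               # step 1
    phi_k = np.zeros_like(rho_k)
    nonzero = k2 > 0.0                                     # skip k = 0
    phi_k[nonzero] = rho_k[nonzero] / (k2[nonzero] * eps0) # step 2
    return np.fft.ifftn(phi_k).real                        # step 3

n, box_length = 16, 2.0
x = box_length * np.arange(n) / n
xx, _, _ = np.meshgrid(x, x, x, indexing="ij")
k0 = 2.0 * np.pi / box_length
rho = np.cos(k0 * xx)                  # single plane wave along x, zero net charge
phi = solve_poisson_fft(rho, box_length)
```

For a single plane wave the result is simply the charge density divided by $k_0^2 \varepsilon_0$, which makes this a convenient test case.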

3.3.4 Direct summation

The total interaction energy for charges under PBC has to include interactions between periodic images

$$E_{elec} = \frac{1}{2} \cdot \frac{1}{4\pi\varepsilon_0} {\sum_{\mathbf{n}}}' \sum_{i,j=1}^{N} \frac{q_i q_j}{|\mathbf{r}_{ij} + \mathbf{n}|} \tag{3.3.2}$$

where the sum $\sum'_{\mathbf{n}}$ runs over all periodic box vectors $\mathbf{n}$, the prime indicates that the term $i = j$ is omitted in the primary cell, and the factor $\frac{1}{2}$ corrects for counting each pair twice. The vector $\mathbf{n} = (i_x, i_y, i_z) L$ points to the origin of a periodic box of length $L$, with indices $i_{x,y,z} = 0, \pm 1, \pm 2, \ldots$. The sum (3.3.2) is only conditionally convergent. Moreover the convergence is very slow.

3.4 Ewald summation

Idea: We recompute the electrostatic energy, but rewrite the charge distribution by subtracting and adding a smeared charge distribution $G_\sigma(\mathbf{r})$:

$$\rho_i(\mathbf{r}) = \underbrace{q_i \delta(\mathbf{r} - \mathbf{r}_i) - q_i G_\sigma(\mathbf{r} - \mathbf{r}_i)}_{\rho_i^S(\mathbf{r})} + \underbrace{q_i G_\sigma(\mathbf{r} - \mathbf{r}_i)}_{\rho_i^L(\mathbf{r})}$$

and split this sum into parts $\rho_i^S(\mathbf{r})$ and $\rho_i^L(\mathbf{r})$. This approach is useful because with an appropriate choice of $G_\sigma$ the contribution $-q_i G_\sigma(\mathbf{r} - \mathbf{r}_i)$ counteracts (shields) the charge $q_i$ such that the part $\rho_i^S(\mathbf{r})$ only generates a short-ranged field that decays quickly in real space and can be efficiently truncated using a real-space cutoff. We are left with $\rho_i^L(\mathbf{r})$, which does generate a long-ranged potential but is periodic on the lattice and, with an appropriate choice of $G_\sigma$, is smooth enough that we can efficiently describe it as a Fourier sum using not too many Fourier coefficients. For $G_\sigma$ various choices are possible, but we will use a normalized Gaussian distribution:

$$G_\sigma(\mathbf{r}) = \frac{1}{(2\pi\sigma^2)^{3/2}} \exp\left( -\frac{r^2}{2\sigma^2} \right)$$

where $r = |\mathbf{r}|$. The scheme below illustrates the charge splitting of a set of point charges into shielded point charges, to be evaluated as a direct real-space sum, and shielding potentials that will be transformed to reciprocal space, solved there, and then back-transformed to real space.

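Potential decomposition:

$$\phi_i(\mathbf{r}) = \underbrace{\frac{q_i}{4\pi\varepsilon_0} \left[ \frac{1}{|\mathbf{r}_i - \mathbf{r}|} - \int d^3r'\, \frac{G_\sigma(\mathbf{r}' - \mathbf{r}_i)}{|\mathbf{r} - \mathbf{r}'|} \right]}_{\phi_i^S(\mathbf{r})} + \underbrace{\frac{q_i}{4\pi\varepsilon_0} \int d^3r'\, \frac{G_\sigma(\mathbf{r}' - \mathbf{r}_i)}{|\mathbf{r} - \mathbf{r}'|}}_{\phi_i^L(\mathbf{r})} \tag{3.4.1}$$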

Short range potential and interaction energy

The potential field generated by a Gaussian charge distribution is evaluated by solving Poisson's equation $\nabla^2 \phi_\sigma(\mathbf{r}) = -G_\sigma(\mathbf{r}) / \varepsilon_0$. This has been done above and the result is given in Eq. (3.1.2). Using this result and associating $r = |\mathbf{r} - \mathbf{r}_i|$ for each charge, we can thus obtain the electric potential $\phi_i^S(\mathbf{r})$ of the decomposition in Eq. (3.4.1):

$$\phi_i^S(\mathbf{r}) = \frac{q_i}{4\pi\varepsilon_0 |\mathbf{r}_i - \mathbf{r}|} \left[ 1 - \mathrm{erf}\left( \frac{|\mathbf{r}_i - \mathbf{r}|}{\sqrt{2}\sigma} \right) \right] = \frac{q_i}{4\pi\varepsilon_0 |\mathbf{r}_i - \mathbf{r}|}\, \mathrm{erfc}\left( \frac{|\mathbf{r}_i - \mathbf{r}|}{\sqrt{2}\sigma} \right) \tag{3.4.2}$$

The error function erf quickly increases from $\mathrm{erf}(0) = 0$ to a value of 1, while the complementary error function erfc quickly decays from $\mathrm{erfc}(0) = 1$ to 0. Therefore, the shielded potential $\phi_i^S(\mathbf{r})$ is a short-range potential that can be efficiently cut off. We therefore compute the corresponding electrostatic energy directly between pairs of charges, and will later truncate this term at a certain cutoff radius:

$$E_S = \frac{1}{8\pi\varepsilon_0} \sum_{\mathbf{n}} \sum_{i=1}^{N} {\sum_{j=1}^{N}}^{*} \frac{q_i q_j}{|\mathbf{r}_i - \mathbf{r}_j + \mathbf{n}L|}\, \mathrm{erfc}\left( \frac{|\mathbf{r}_i - \mathbf{r}_j + \mathbf{n}L|}{\sqrt{2}\sigma} \right) \tag{3.4.3}$$

Here the asterisk on the sum indicates that the term $i = j$ is omitted in the primary cell ($\mathbf{n} = 0$) in order to avoid interaction of the charge with itself.

Figure 24: Illustration of the transformation of point charges to Gaussian charge distributions in real space, and then to k-space in the Ewald summation.

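Long range potential and interaction energy

The potential $\phi_i^L(\mathbf{r})$ is dominated by an $r^{-1}$ decay, and is thus a long-range potential that cannot be efficiently cut off:

$$\phi_i^L(\mathbf{r}) = \frac{q_i}{4\pi\varepsilon_0 |\mathbf{r}_i - \mathbf{r}|}\, \mathrm{erf}\left( \frac{|\mathbf{r}_i - \mathbf{r}|}{\sqrt{2}\sigma} \right). \tag{3.4.4}$$

The total long-ranged charge density

$$\rho^L(\mathbf{r}) = \sum_{\mathbf{n}} \sum_{j=1}^{N} q_j G_\sigma(\mathbf{r} - \mathbf{r}_j + \mathbf{n}L) \tag{3.4.5}$$

is periodic in $\mathbf{n}$, and likewise the total potential $\phi^L$ is periodic. Therefore we solve for it by Fourier transforming the charge distribution to reciprocal space, solving for $\phi^L$ there and then transforming it back. We will see that in this way we will get an expression for the potential and total energy that can also be efficiently cut off, only that now the cutoff parameter is not a real-space radius but rather a reciprocal-space radius, i.e. we restrict ourselves to considering only a few wave vectors.

Consider the Fourier transforms of $\rho^L$ and $\phi^L$, denoted by $\hat\rho^L = \mathcal{F}\{\rho^L(\mathbf{r})\}$ and $\hat\phi^L = \mathcal{F}\{\phi^L(\mathbf{r})\}$. The forward and backward Fourier transforms are given by:

$$\hat\phi^L(\mathbf{k}) = \int_V d^3r\, \phi^L(\mathbf{r})\, e^{-i \langle \mathbf{k}, \mathbf{r} \rangle} \qquad \phi^L(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{k}} \hat\phi^L(\mathbf{k})\, e^{i \langle \mathbf{k}, \mathbf{r} \rangle}$$

$$\hat\rho^L(\mathbf{k}) = \int_V d^3r\, \rho^L(\mathbf{r})\, e^{-i \langle \mathbf{k}, \mathbf{r} \rangle} \qquad \rho^L(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{k}} \hat\rho^L(\mathbf{k})\, e^{i \langle \mathbf{k}, \mathbf{r} \rangle}$$

where $\mathbf{k} = \frac{2\pi}{L} (k_1, k_2, k_3)$ are reciprocal lattice vectors, with $k_{1,2,3} \in \mathbb{Z}$.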

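We Fourier-transform the charge density (3.4.5). Due to periodicity we simplify the calculation by considering only the charges of the primary cell and take the integral over the entire $\mathbb{R}^3$:

$$\hat\rho^L(\mathbf{k}) = \sum_{j=1}^{N} q_j \int_{\mathbb{R}^3} d^3r\, G_\sigma(\mathbf{r} - \mathbf{r}_j)\, e^{-i \langle \mathbf{k}, \mathbf{r} \rangle} = \sum_{j=1}^{N} q_j\, e^{-i \langle \mathbf{k}, \mathbf{r}_j \rangle}\, e^{-\sigma^2 k^2 / 2}$$

where $k = |\mathbf{k}|$ is the reciprocal vector length. Now we compute the reciprocal potential using the reciprocal Poisson equation (3.3.1):

$$\hat\phi^L(\mathbf{k}) = \frac{1}{\varepsilon_0} \sum_{j=1}^{N} q_j\, e^{-i \langle \mathbf{k}, \mathbf{r}_j \rangle}\, \frac{e^{-\sigma^2 k^2 / 2}}{k^2}$$

and compute the real-space potential by back-transforming $\hat\phi^L(\mathbf{k})$:

$$\phi^L(\mathbf{r}) = \frac{1}{V \varepsilon_0} \sum_{\mathbf{k}} \sum_{j=1}^{N} q_j\, e^{i \langle \mathbf{k}, \mathbf{r} \rangle}\, e^{-i \langle \mathbf{k}, \mathbf{r}_j \rangle}\, \frac{e^{-\sigma^2 k^2 / 2}}{k^2}$$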

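Note that the contribution from the $\mathbf{k} = 0$ term is zero if the total charge is zero, i.e. $\sum_i q_i = 0$. We now compute the total long-range interaction energy (do not confuse the imaginary unit $i$ with the index $i$):

$$E_L = \frac{1}{2 V \varepsilon_0} \sum_{\mathbf{k} \neq 0} \sum_{i=1}^{N} \sum_{j=1}^{N} q_i q_j\, e^{i \langle \mathbf{k}, \mathbf{r}_i \rangle}\, e^{-i \langle \mathbf{k}, \mathbf{r}_j \rangle}\, \frac{e^{-\sigma^2 k^2 / 2}}{k^2} = \frac{1}{2 V \varepsilon_0} \sum_{\mathbf{k} \neq 0} |S(\mathbf{k})|^2\, \frac{e^{-\sigma^2 k^2 / 2}}{k^2} \tag{3.4.6}$$

where we have defined the structure factor

$$S(\mathbf{k}) = \sum_{j=1}^{N} q_j\, e^{i \langle \mathbf{k}, \mathbf{r}_j \rangle}$$

Using the structure factor, the calculation of the long-ranged energy is linear in $N$. The sum over $\mathbf{k}$ is infinite, but can be efficiently truncated by choosing a maximal cutoff for $k$, because the term $e^{-\sigma^2 k^2 / 2} / k^2$ decays quickly in $k$.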

Self-interaction energy

The long-ranged interaction energy is given by the sum of the interaction energies of each charge with the potential generated by the other charges. However, we have so far ignored that a given charge should not interact with itself. In the short-range energy (3.4.3) this is easy to take care of, by simply excluding the pair $i = j$ for the primary cell. However, in the long-range energy we have avoided such an "exception" for the primary cell so as to make sure that the lattice is entirely periodic. Therefore, this is not yet taken care of: the Gaussian charge centered at position $\mathbf{r}_i$ does generate a nonzero potential at $\mathbf{r}_i$, and hence $E_L$ includes a nonphysical self-interaction that needs to be subtracted. This spurious self-interaction is easy to compute by considering the long-range potential given by Eq. (3.4.4) and letting $\mathbf{r} \to \mathbf{r}_i$:

$$\lim_{\mathbf{r} \to \mathbf{r}_i} \phi_i^L(\mathbf{r}) = \lim_{\mathbf{r} \to \mathbf{r}_i} \frac{q_i}{4\pi\varepsilon_0 |\mathbf{r}_i - \mathbf{r}|}\, \mathrm{erf}\left( \frac{|\mathbf{r}_i - \mathbf{r}|}{\sqrt{2}\sigma} \right) = \frac{q_i}{4\pi\varepsilon_0 \sigma} \sqrt{\frac{2}{\pi}},$$

yielding a total self-energy generated by all charges of:

$$E_{self} = \frac{1}{2 \varepsilon_0 \sigma (2\pi)^{3/2}} \sum_{i=1}^{N} q_i^2 \tag{3.4.7}$$

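Total Ewald interaction energy

The total Ewald interaction energy is thus given by combining Eqs. (3.4.3), (3.4.6) and (3.4.7):

$$\begin{aligned} E &= E_S + E_L - E_{self} \\ &= \frac{1}{8\pi\varepsilon_0} \sum_{\mathbf{n}} \sum_{i=1}^{N} {\sum_{j=1}^{N}}^{*} \frac{q_i q_j}{|\mathbf{r}_i - \mathbf{r}_j + \mathbf{n}L|}\, \mathrm{erfc}\left( \frac{|\mathbf{r}_i - \mathbf{r}_j + \mathbf{n}L|}{\sqrt{2}\sigma} \right) \\ &\quad + \frac{1}{2 V \varepsilon_0} \sum_{\mathbf{k} \neq 0} |S(\mathbf{k})|^2\, \frac{e^{-\sigma^2 k^2 / 2}}{k^2} \\ &\quad - \frac{1}{2 \varepsilon_0 \sigma (2\pi)^{3/2}} \sum_{i=1}^{N} q_i^2 \end{aligned} \tag{3.4.8}$$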
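The three contributions can be put together in a compact, unoptimised sketch for a cubic box. It works in units where $4\pi\varepsilon_0 = 1$, so that the prefactors become $\frac{1}{2}$ for the short-range sum, $\frac{2\pi}{V}$ for the reciprocal sum and $\frac{1}{\sqrt{2\pi}\sigma}$ for the self-energy. The cutoffs `n_real` and `k_max`, the value of $\sigma$ and the rock-salt test arrangement are choices made for this illustration.

```python
import itertools
import math
import numpy as np

def ewald_energy(positions, charges, box_length, sigma, n_real=2, k_max=6):
    """Total Ewald energy E_S + E_L - E_self for a cubic box (4*pi*eps0 = 1)."""
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    n_atoms = len(charges)
    volume = box_length**3

    # short-range part: direct sum over image cells, i = j skipped in primary cell
    e_short = 0.0
    for n_vec in itertools.product(range(-n_real, n_real + 1), repeat=3):
        shift = box_length * np.array(n_vec, dtype=float)
        for i in range(n_atoms):
            for j in range(n_atoms):
                if i == j and n_vec == (0, 0, 0):
                    continue
                r = np.linalg.norm(positions[i] - positions[j] + shift)
                e_short += 0.5 * charges[i] * charges[j] \
                    * math.erfc(r / (math.sqrt(2.0) * sigma)) / r

    # long-range part: reciprocal sum over the structure factor, k = 0 omitted
    e_long = 0.0
    for m_vec in itertools.product(range(-k_max, k_max + 1), repeat=3):
        if m_vec == (0, 0, 0):
            continue
        k = 2.0 * math.pi / box_length * np.array(m_vec, dtype=float)
        k2 = float(np.dot(k, k))
        structure_factor = np.sum(charges * np.exp(1j * (positions @ k)))
        e_long += 2.0 * math.pi / volume * abs(structure_factor)**2 \
            * math.exp(-sigma**2 * k2 / 2.0) / k2

    # self-interaction of each Gaussian cloud with its own point charge
    e_self = float(np.sum(charges**2)) / (math.sqrt(2.0 * math.pi) * sigma)
    return e_short + e_long - e_self

# rock-salt arrangement: 8 unit charges in a cubic cell, nearest-neighbour distance 1
nacl_positions = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1),
                  (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
nacl_charges = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
e_total = ewald_energy(nacl_positions, nacl_charges, box_length=2.0, sigma=0.4)
madelung_estimate = -e_total / 4.0   # energy per ion pair in units of q^2 / r0
```

For this lattice the energy per ion pair should reproduce the Madelung constant of rock salt (about 1.7476), and the total should not depend on the choice of $\sigma$ as long as both cutoffs are large enough; both properties are useful checks for an implementation.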


3.4.1 Particle Mesh Ewald

The superlinear ($N^{3/2}$) performance of the standard Ewald method comes from the fact that we need to increase both the real-space cutoff and the reciprocal-space cutoff when the system size grows, in order to maintain the same error level. PME seeks to keep the real-space cutoff constant while maintaining the error level by introducing the following trick:

1. Map the charges on a periodic lattice.
2. Solve the Poisson equation in reciprocal space on the periodic lattice using FFT.
3. Map the potential back to the charges.

FFT is complete (no cutoff). The FFT part depends only mildly on $N$ ($N \log N$), while the real-space sum has a constant cost per particle, i.e. scales linearly with $N$, when the real-space cutoff is fixed. Thus we arrive at a total performance of $N \log N$.

In order to make use of FFT, we cannot evaluate the charges at their actual positions but have to map them onto a grid.

Map the charges onto a grid

Consider the mesh coordinates

$$\mathbf{r}_m = \frac{m_1}{|\mathbf{a}_1|} \mathbf{a}_1 + \frac{m_2}{|\mathbf{a}_2|} \mathbf{a}_2 + \frac{m_3}{|\mathbf{a}_3|} \mathbf{a}_3$$

with index set $m = (m_1, m_2, m_3)$.

The charge at a mesh point $\mathbf{r}_m$ is in general given by

$$q(\mathbf{r}_m) = \sum_i q_i W(\mathbf{r}_m - \mathbf{r}_i)$$

where $W$ is a spreading function. In PME, we use an approximation to the Gaussian spread functions:

$$q(\mathbf{r}_m) = \sum_{\mathbf{n}} \sum_{j=1}^{N} q_j G_\sigma(\mathbf{r}_m - \mathbf{r}_j + \mathbf{n}L)$$

(Note that for grids of realistic width we must use different spreading functions that preserve the charge, $\sum_m q(\mathbf{r}_m) = \sum_i q_i$. We won't go into details about that here.)

Solving $\phi^L$ via the Poisson equation

As described in Sec. 3.3, we can solve for $\phi$ either by using a direct solver (via setting up the linear system $Ax = b$ and then using a sparse matrix solver), or by transforming to reciprocal space, dividing by $k^2$, and backtransforming to real space via FFT. Either way, when efficiently implemented, the effort is $N \log N$.

We note that the difference from the situation in Sec. 3.3 is that we do not have point charges but Gaussian spread charges. If these were approximated by a fine grid, we would directly obtain:

$$\hat\rho^L(\mathbf{k}_m) = \frac{1}{V}\, e^{-\sigma^2 k^2 / 2} \sum_{\mathbf{r}_m} q(\mathbf{r}_m)\, e^{-i \langle \mathbf{k}_m, \mathbf{r}_m \rangle}$$

which is transformed into the reciprocal potential:

$$\hat\phi^L(\mathbf{k}_m) = \frac{1}{V \varepsilon_0}\, G'(k) \sum_{\mathbf{r}_m} q(\mathbf{r}_m)\, e^{-i \langle \mathbf{k}_m, \mathbf{r}_m \rangle} \tag{3.4.9}$$

with

$$G'(k) = \frac{e^{-\sigma^2 k^2 / 2}}{k^2} \tag{3.4.10}$$

Next, we back-transform $\hat\phi^L$ via inverse FFT:

$$\phi^L(\mathbf{r}_m) = \sum_{\mathbf{k}_m} e^{i \langle \mathbf{k}_m, \mathbf{r}_m \rangle}\, \hat\phi^L(\mathbf{k}_m)$$

However, it is more efficient to use a not-so-fine grid and, instead of true Gaussians, use spread functions with finite support, such that we only need to spread charges to neighboring grid cells. Thus, we introduce discretization errors that can be counteracted by using different coefficients $G'(k)$ instead of Eq. (3.4.10). Making these choices appropriately such that computational effort and discretization error are balanced is the art of PME implementations, and we will not go into details about that here.

Back-interpolation

We interpolate $\phi^L(\mathbf{r}_m)$ back onto the charges, obtaining $\phi^L(\mathbf{r}_i)$, and compute the total energy

$$E_L = \frac{1}{2} \sum_i q_i\, \phi^L(\mathbf{r}_i)$$

Figure 25: Conceptual overview over Ewald summation and Particle Mesh Ewald (PME). The computational complexity of PME is $N \log N$ (see Poisson solution, above).
