
Page 1: Advanced Review Geometry optimization - Dept of Chemistry, chem.wayne.edu/schlegel/Pub_folder/346.pdf

Advanced Review

Geometry optimization
H. Bernhard Schlegel∗

Geometry optimization is an important part of most quantum chemical calculations. This article surveys methods for optimizing equilibrium geometries, locating transition structures, and following reaction paths. The emphasis is on optimizations using quasi-Newton methods that rely on energy gradients, and the discussion includes Hessian updating, line searches, trust radius, and rational function optimization techniques. Single-ended and double-ended methods are discussed for transition state searches. Single-ended techniques include quasi-Newton, reduced gradient following, and eigenvector following methods. Double-ended methods include nudged elastic band, string, and growing string methods. The discussions conclude with methods for validating transition states and following steepest descent reaction paths. © 2011 John Wiley & Sons, Ltd. WIREs Comput Mol Sci 2011, 1, 790–809. DOI: 10.1002/wcms.34

INTRODUCTION

Geometry optimization is a key component of most computational chemistry studies that are concerned with the structure and/or reactivity of molecules. This chapter describes some of the methods that are used to optimize equilibrium geometries, locate transition structures (TSs), and follow reaction paths. Several surveys and reviews of geometry optimization are available.1–14 Rather than an extensive review of the literature, this article is directed toward practical geometry optimization methods that are currently in use. In particular, it is concerned with geometry optimization methods applicable to electronic structure calculations.15,16 Because electronic structure calculations can be lengthy, geometry optimization methods need to be efficient and robust. For inexpensive molecular mechanics calculations, simpler methods may be adequate. Most major electronic structure packages have a selection of geometry optimization algorithms. By discussing the components of various optimization methods, we hope to aid the reader in selecting the most appropriate optimization methods and in overcoming the occasional difficulties that may arise. Unconstrained nonlinear optimization methods have been discussed extensively in numerical analysis texts (e.g., Refs 17–20). These methods are designed to find a local minimum closest to the starting point. Global optimization and conformational searching are more difficult problems and will be reviewed elsewhere in this series. Global mapping of reaction networks21,22 is outside the scope of this chapter. Transition path sampling and reaction paths on free energy surfaces are also not covered.23 Discussions of optimization methods to find conical intersections and minimum energy seam crossings can be found elsewhere.22,24–36 Here, we focus on finding the equilibrium geometry of an individual molecule, locating a TS for a reaction, and following the reaction path connecting reactants through a TS to products. The geometry optimization methods discussed in this chapter use energy derivatives and depend on factors such as the choice of coordinates, methods for calculating the step direction, Hessian updating methods, line search, and step size control strategies. Some of these methods have been compared recently37 for a modest set of test cases for optimizing equilibrium geometries38 and TSs.39

∗Correspondence to: [email protected]
Department of Chemistry, Wayne State University, Detroit, MI, USA
DOI: 10.1002/wcms.34

POTENTIAL ENERGY SURFACES

The structure of a molecule can be specified by giving the locations of the atoms in the molecule. For a given structure and electronic state, a molecule has a specific energy. A potential energy surface describes how the energy of the molecule in a particular state varies as a function of the structure of the molecule. A simple representation of a potential energy surface is shown in Figure 1, in which the energy (vertical coordinate) is a function of two geometric variables (the two horizontal coordinates).

790 Volume 1, September/October 2011 © 2011 John Wiley & Sons, Ltd.

WIREs Computational Molecular Science Geometry optimization

FIGURE 1 | Model potential energy surface showing minima, transition structures, second-order saddle points, reaction paths, and a valley ridge inflection point. (Reprinted with permission from Ref 5. Copyright 1998 John Wiley & Sons.)

The notions of molecular structure and potential energy surfaces are outcomes of the Born–Oppenheimer approximation, which allows us to separate the motion of the electrons from the motion of the nuclei. Because the nuclei are much heavier and move much more slowly than the electrons, the energy of a molecule in the Born–Oppenheimer approximation is obtained by solving the electronic structure problem for a set of fixed nuclear positions. Because this can be repeated for any set of nuclear positions, the energy of a molecule can be described as a parametric function of the positions of the nuclei, thereby yielding a potential energy surface.

A potential energy surface, like the one shown in Figure 1, can be visualized as a hilly landscape, with valleys, peaks, and mountain passes. Even though most molecules have many more than two geometric variables, most of the important features of a potential energy surface can be represented in such a landscape.

The valleys of a potential energy surface represent reactants, intermediates, and products of a reaction. The position of the minimum in a valley represents the equilibrium structure. The energy difference between the product valley and reactant valley minima represents the energy of the reaction. Vibrational motion of the molecule about the reactant and product equilibrium geometries can be used to compute zero-point energy and thermal corrections needed to calculate enthalpy and free energy differences.40 The lowest energy pathway between the reactant valley and the product valley is the reaction path.41 The highest point on this lowest energy reaction path is the TS for the reaction, and the difference between the energy of the TS and the reactant is the energy barrier for the reaction. A TS is a maximum in one direction (the direction connecting reactant and product along the reaction path) and a minimum in all other directions (directions perpendicular to the reaction path). A TS is also termed a first-order saddle point. In Figure 1, it can be visualized as a mountain pass connecting two valleys. A second-order saddle point (SOSP) is a maximum in two directions and a minimum in all the remaining directions. If a reaction path goes through an SOSP, a lower energy reaction path can always be found by displacing the path away from the SOSP. An n-th order saddle point is a maximum in n directions and a minimum in all the other directions.

For a thermally activated reaction, the energy of the TS and the shape of the potential energy surface around the TS can be used to estimate the reaction rate (see other reviews in this series). The steepest descent reaction path (SDP) from the TS down to the reactants and to the products is termed the minimum energy path (MEP) or the intrinsic reaction coordinate (IRC; the MEP in mass-weighted coordinates). The reaction path from reactants through intermediates (if any) to products describes the reaction mechanism.41 A more detailed description of a reaction can be obtained by classical trajectory calculations42–45 that simulate molecular dynamics by integrating the classical equations of motion for a molecule moving on a potential energy surface. Photochemistry involves motion on multiple potential energy surfaces and transitions between them (see Ref 46).
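As a concrete illustration (not part of the original article), steepest descent from a first-order saddle point can be traced with a crude Euler integrator on a hypothetical double-well model surface; the surface, step size, and starting displacement below are all illustrative assumptions, and practical IRC followers use far more accurate integrators in mass-weighted coordinates.

```python
def grad(x, y):
    """Gradient of the model surface E = (x^2 - 1)^2 + y^2, which has
    a first-order saddle point at (0, 0) and minima at (+/-1, 0)."""
    return 4.0 * x * (x * x - 1.0), 2.0 * y

# Displace slightly off the saddle point along the transition vector
# (here, the x direction) and follow the negative gradient downhill.
x, y, h = 0.01, 0.0, 0.01
for _ in range(5000):
    gx, gy = grad(x, y)
    x -= h * gx
    y -= h * gy
# The path terminates near the product minimum at (1, 0).
```

Starting with a negative displacement instead (x = -0.01) would trace the other branch of the path, down to the reactant minimum at (-1, 0).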

ENERGY DERIVATIVES

The first and second derivatives of the energy with respect to the geometrical parameters can be used to construct a local quadratic approximation to the potential energy surface:

E(x) = E(x0) + g0^T Δx + 1/2 Δx^T H0 Δx,   (1)

where g0 is the gradient (dE/dx) at x0, H0 is the Hessian (d2E/dx2) at x0, and Δx = x − x0. The gradient and Hessian can be used to confirm the character of minima and TSs. The negative of the gradient is the vector of forces on the atoms in the molecule. Because the forces are zero for minima, TSs, and higher-order saddle points, these structures are also termed stationary points. The Hessian or matrix of second derivatives of the energy is also known as the force constant matrix. The eigenvectors of the mass-weighted Hessian in Cartesian coordinates correspond to the normal modes of vibration (plus five or six modes for translation and rotation).47 For a structure to be characterized as a minimum, the gradient


Advanced Review wires.wiley.com/wcms

must be zero and all of the eigenvalues of the Hessian corresponding to molecular vibrations must be positive; equivalently, the vibrational frequencies must be real (the vibrational frequencies are proportional to the square root of the eigenvalues of the mass-weighted Hessian). For a TS, the potential energy surface is a maximum in one direction (along the reaction path) and a minimum in all other perpendicular directions. Therefore, a TS is characterized by a zero gradient and a Hessian that has one (and only one) negative eigenvalue; correspondingly, a TS has one and only one imaginary vibrational frequency. An n-th order saddle point (also called a stationary point of index n) has a zero gradient and is a maximum in n orthogonal directions and hence has n imaginary frequencies. For a TS, the vibrational mode corresponding to the imaginary frequency is also known as the transition vector. At the TS, the transition vector is tangent to the reaction path in mass-weighted coordinates.
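The eigenvalue-counting rule can be sketched in a few lines; the function name and tolerance below are illustrative choices, and a real code would apply this to the mass-weighted Hessian after projecting out the translational and rotational modes.

```python
import numpy as np

def classify_stationary_point(hessian, tol=1e-8):
    """Classify a stationary point (zero gradient assumed) by the number
    of negative Hessian eigenvalues: 0 -> minimum, 1 -> transition
    structure, n -> n-th order saddle point."""
    n_neg = int(np.sum(np.linalg.eigvalsh(hessian) < -tol))
    if n_neg == 0:
        return "minimum"
    if n_neg == 1:
        return "transition structure"
    return "saddle point of index %d" % n_neg
```

For example, a diagonal Hessian with eigenvalues (2, 1) is classified as a minimum, (-1, 2) as a transition structure, and (-1, -2) as a second-order saddle point.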

Most methods for efficient geometry optimization rely on first derivatives of the energy; some also require second derivatives. For most levels of theory used routinely for geometry optimization, the first derivatives can be calculated analytically at a cost comparable to that for the energy. Analytic second derivatives are also available for several levels of theory, but the cost is usually considerably higher than for first derivatives. With the possible exception of optimization of diatomic molecules, derivative-based geometry optimization methods are significantly more efficient than energy-only algorithms. If analytic first derivatives are not available, it is possible to use simplex and pattern search methods,48–50 but these become less efficient as the number of degrees of freedom increases.51 Thus, it may be more efficient to compute gradients numerically and to use a gradient-based optimization algorithm than to use an energy-only algorithm.

COORDINATES

In principle, any complete set of coordinates can be used to represent a molecule and its potential energy surface. However, choosing a good coordinate system can significantly improve the performance of geometry optimizations. Inspection of the Hessian used in the local quadratic approximation to the potential energy surface, Eq. (1), can reveal some favorable aspects of a good coordinate system. For example, an optimization will be less efficient if there are some very stiff coordinates and some very flexible coordinates. This corresponds to a mixture of very large and very small eigenvalues of the Hessian (i.e., the Hessian is an ill-conditioned matrix). Strong coupling between coordinates can also slow down an optimization. This corresponds to off-diagonal Hessian matrix elements that are comparable in magnitude to the diagonal elements. Strong anharmonicity can seriously degrade the performance of an optimization. If the Hessian changes rapidly when the geometry of the molecule is changed, or if the valley around a minimum is strongly curved, then the quadratic expression in Eq. (1) is a poor approximation to the potential energy surface and the optimization will be slow to converge. The nature of the Hessian and the anharmonicity of the potential energy surface will be directly affected by the choice of the coordinate system.

There are a number of coordinate systems that are typically used for geometry optimization. Cartesian coordinates are perhaps the most universal and the least ambiguous. An advantage is that most energy and derivative calculations are carried out in Cartesian coordinates. However, they are not well suited for geometry optimization because they do not reflect the 'chemical structure' and bonding of a molecule. The x, y, and z coordinates of an atom are strongly coupled to each other and to the coordinates of neighboring atoms.

Internal coordinates such as bond lengths and valence angles are more descriptive of the molecular structure and are more useful for geometry optimization. Bond stretching requires more energy than angle bending or torsion about single bonds. More importantly, the couplings between stretches, bends, and torsions are usually much smaller than between Cartesian coordinates. In addition, internal coordinates are much better than Cartesians for representing curvilinear motions such as valence angle bending and rotation about single bonds. For an acyclic molecule with N atoms, it is easy to select a set of 3N−6 internal coordinates to represent the molecule (3N−5 coordinates for a linear molecule). Z-matrix coordinates are an example of such a coordinate system.52 It is straightforward to convert geometries and derivatives between Cartesian and Z-matrix internal coordinates.53

For acyclic molecules, the set of all bonds, angles, and torsions represents the intrinsic connectivity and flexibility of the molecule. However, for a cyclic molecule, this introduces more than the 3N−6 coordinates required to define the geometry of the molecule. Such a coordinate system has a certain amount of redundancy in the geometric parameters.53–68 Because only 3N−6 of these redundant internal coordinates can be transformed back to Cartesian coordinates in three dimensions, certain combinations of the redundant internals must


be constrained during the optimization. The set of all bonds, valence angles, and torsions (if necessary, augmented by out-of-plane bends and linear bends) constitutes a primitive redundant coordinate system.53,58

In some cases, it may be advantageous to form linear combinations of the primitive redundant internals to form natural or delocalized redundant internal coordinates54–57 or symmetry-adapted redundant internal coordinates. For periodic systems such as solids or surfaces, unit cell parameters need to be added66–68

(either explicitly or implicitly via coordinates that cross the boundaries of the unit cell). For molecules in nonisotropic media, additional coordinates are needed to specify the orientation of the molecule. For systems containing more than one fragment, additional coordinates are required to specify the positions of the fragments relative to each other. The union of the redundant internal coordinates for the reactants and products is usually a good coordinate system for TS optimization.58 The transformation of Cartesian coordinates and derivatives to redundant internals is straightforward, but the back transformation of a finite displacement in redundant internals to Cartesian coordinates usually has to be solved iteratively.53–68
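The iterative back transformation can be sketched for a triatomic described by two bond lengths and a valence angle. This is a simplified illustration, not the production algorithm: the Wilson B matrix is built by finite differences rather than analytically, the target geometry and step are arbitrary, and the Moore-Penrose pseudoinverse stands in for the proper generalized inverse treatment of redundancy.

```python
import numpy as np

def internals(x):
    """q = (r01, r02, theta) for a triatomic with Cartesians
    x = [x0, y0, z0, x1, ..., z2]; atom 0 is the central atom."""
    a, b, c = x[0:3], x[3:6], x[6:9]
    u, v = b - a, c - a
    r1, r2 = np.linalg.norm(u), np.linalg.norm(v)
    theta = np.arccos(np.clip(u @ v / (r1 * r2), -1.0, 1.0))
    return np.array([r1, r2, theta])

def b_matrix(x, h=1e-6):
    """Numerical Wilson B matrix, B_ij = dq_i/dx_j."""
    q0 = internals(x)
    B = np.zeros((len(q0), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += h
        B[:, j] = (internals(xp) - q0) / h
    return B

def back_transform(x0, q_target, max_iter=25, tol=1e-10):
    """Iterate dx = B^+ dq until the internals match q_target."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        dq = q_target - internals(x)
        if np.linalg.norm(dq) < tol:
            break
        x += np.linalg.pinv(b_matrix(x)) @ dq
    return x

# Water-like starting geometry; displace to new bonds/angle.
x0 = np.array([0.0, 0.0, 0.0, 0.96, 0.0, 0.0, -0.24, 0.93, 0.0])
q_target = np.array([1.0, 1.0, 1.8])   # bond lengths and angle (rad)
x_new = back_transform(x0, q_target)
```

Because the Cartesian solution is determined only up to overall translation and rotation, the pseudoinverse selects the minimum-norm displacement; any Cartesian set reproducing the target internals is acceptable.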

NEWTON AND QUASI-NEWTON METHODS

As described in standard texts on optimization,17–20 most nonlinear optimization algorithms are based on a local quadratic approximation of the potential energy surface, Eq. (1). Differentiation with respect to the coordinates yields an approximation for the gradient, given by:

g(x) = g0 + H0 Δx.   (2)

At a stationary point, the gradient is zero, g(x) = 0; thus, in the local quadratic approximation to the potential energy surface, the displacement to the minimum is given by:

Δx = −H0^−1 g0.   (3)

This is known as the Newton or Newton–Raphson step.

Newton and quasi-Newton methods are the most efficient and widely used procedures for optimizing equilibrium geometries and can also be used effectively to find TSs. For each step in the Newton method, the Hessian in Eq. (3) is calculated at the current point. For quasi-Newton methods, Eq. (3) is used with an approximate Hessian that is updated at each step of the optimization (see below). Because actual potential energy surfaces are rarely quadratic, several Newton or quasi-Newton steps are required to reach a stationary point. For minimization, the Hessian must have all positive eigenvalues (i.e., be positive definite). If one or more eigenvalues are negative, the step will be toward a first- or higher-order saddle point. Thus, without some means of controlling the step size and direction, simple Newton steps are not robust. Similarly, if the aim is to optimize to a TS, the Hessian must have one and only one negative eigenvalue, and the corresponding eigenvector (i.e., the transition vector) must be roughly parallel to the reaction path. Methods for ensuring that the step is in the desired direction for minimization or TS optimization are discussed in the section dealing with step size control.
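Iterating the Newton step of Eq. (3) on a simple anharmonic two-dimensional model surface (an illustrative function, not from the article) shows the typical behavior: with a positive definite Hessian along the way, the iteration converges rapidly to the minimum whose basin contains the starting geometry.

```python
import numpy as np

def grad_hess(x):
    """Gradient and Hessian of the model surface
    E = x0^4 + 2*x1^2 + x0*x1 (illustrative, anharmonic)."""
    g = np.array([4.0 * x[0] ** 3 + x[1], 4.0 * x[1] + x[0]])
    H = np.array([[12.0 * x[0] ** 2, 1.0], [1.0, 4.0]])
    return g, H

x = np.array([0.5, 0.0])          # starting geometry
for _ in range(50):
    g, H = grad_hess(x)
    if np.linalg.norm(g) < 1e-10:
        break
    x += np.linalg.solve(H, -g)   # Newton step, Eq. (3)
# Converges to the minimum at (1/4, -1/16); the Hessian there
# has all positive eigenvalues.
```

Starting near (0, 0) instead, where this surface has a negative Hessian eigenvalue, the raw Newton step would walk toward the saddle point, which is exactly why the step size control discussed below is needed.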

At each step, Newton methods require the explicit calculation of the Hessian, which can be rather costly. Quasi-Newton methods start with an inexpensive approximation to the Hessian. The difference between the calculated change in the gradient and the change predicted with the approximate Hessian is used to improve the Hessian at each step in the optimization:17–20

Hnew = Hold + ΔH.   (4)

For a quadratic surface, the updated Hessian must fulfill the Newton condition,

Δg = Hnew Δx,   (5)

where Δg = g(xnew) − g(xold) and Δx = xnew − xold. However, there are an infinite number of ways to update the Hessian and fulfill the Newton condition. One of the simplest updates is the symmetric rank one (SR1) update,17–20 also known as the Murtagh–Sargent update:69

ΔH_SR1 = (Δg − Hold Δx)(Δg − Hold Δx)^T / [(Δg − Hold Δx)^T Δx].   (6)

This formula can encounter numerical problems if |Δg − Hold Δx| is very small. The Broyden family of updates70 avoids this problem while ensuring that the updated Hessian remains positive definite. The Broyden–Fletcher–Goldfarb–Shanno (BFGS) update70–73 is the most successful and widely used member of this family:

ΔH_BFGS = Δg Δg^T / (Δg^T Δx) − Hold Δx Δx^T Hold / (Δx^T Hold Δx).   (7)
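A direct check (an illustration with arbitrary test matrices, not from the article) that the BFGS update of Eq. (7) satisfies the Newton condition of Eq. (5): on a quadratic model surface Δg = H_true Δx, and the updated Hessian reproduces Δg exactly along Δx while staying symmetric and positive definite.

```python
import numpy as np

def bfgs_update(H, dx, dg):
    """BFGS update of Eq. (7); preserves symmetry and positive
    definiteness provided the curvature condition dg.dx > 0 holds."""
    Hdx = H @ dx
    return (H + np.outer(dg, dg) / (dg @ dx)
              - np.outer(Hdx, Hdx) / (dx @ Hdx))

H_true = np.array([[4.0, 1.0], [1.0, 3.0]])   # exact Hessian of a model quadratic
H_approx = np.eye(2)                          # inexpensive initial guess
dx = np.array([0.3, -0.1])                    # a finite step
dg = H_true @ dx                              # gradient change on the quadratic
H_new = bfgs_update(H_approx, dx, dg)
# H_new satisfies the Newton condition of Eq. (5): H_new dx = dg.
```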

For TS optimization, it is important that the Hessian has one and only one negative eigenvalue. This should be checked at every step of the optimization. If the Hessian does not have the correct number of negative eigenvalues, the eigenvalues need to be shifted or one of the methods for step size control


needs to be used (see Step Size Control). The initial estimate of the Hessian for TS optimizations must have one negative eigenvalue, and the associated eigenvector should be approximately parallel to the reaction path. The Hessian update should not be forced to be positive definite. The Powell-symmetric-Broyden (PSB) update18 fulfills this role:

ΔH_PSB = [(Δg − Hold Δx) Δx^T + Δx (Δg − Hold Δx)^T] / (Δx^T Δx) − [Δx^T (Δg − Hold Δx)] Δx Δx^T / (Δx^T Δx)^2.   (8)

Bofill74 found that a combination of the PSB and SR1 updates performs better for TS optimizations:

ΔH_Bofill = φ ΔH_SR1 + (1 − φ) ΔH_PSB, where φ = [(Δg − Hold Δx)^T Δx]^2 / (|Δg − Hold Δx|^2 |Δx|^2).   (9)
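The SR1, PSB, and Bofill updates of Eqs. (6), (8), and (9) can be written compactly in terms of the residual r = Δg − Hold Δx. By the Cauchy-Schwarz inequality φ lies in [0, 1], and since SR1 and PSB each satisfy the Newton condition, so does any Bofill mixture. The model Hessians and step below are arbitrary illustrations.

```python
import numpy as np

def sr1(H, dx, dg):
    r = dg - H @ dx                      # residual of Eq. (5)
    return np.outer(r, r) / (r @ dx)     # Eq. (6)

def psb(H, dx, dg):
    r = dg - H @ dx
    s2 = dx @ dx
    return ((np.outer(r, dx) + np.outer(dx, r)) / s2
            - (dx @ r) * np.outer(dx, dx) / s2 ** 2)    # Eq. (8)

def bofill(H, dx, dg):
    r = dg - H @ dx
    phi = (r @ dx) ** 2 / ((r @ r) * (dx @ dx))         # Eq. (9)
    return phi * sr1(H, dx, dg) + (1.0 - phi) * psb(H, dx, dg)

# A TS-like model surface: one negative Hessian eigenvalue.
H_true = np.diag([-1.0, 2.0, 3.0])
H = np.diag([-0.5, 1.0, 1.0])            # rough initial guess
dx = np.array([0.1, -0.2, 0.3])
dg = H_true @ dx                         # quadratic model surface
H_new = H + bofill(H, dx, dg)
# H_new reproduces dg along dx and, unlike BFGS, is free to keep
# a negative eigenvalue, as a TS search requires.
```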

Farkas and Schlegel75 constructed a similar update for minimization:

ΔH_FS = φ^(1/2) ΔH_SR1 + (1 − φ^(1/2)) ΔH_BFGS,   (10)

where φ is the same as in Eq. (9). Bofill76 has recently reviewed Hessian updating methods and proposed some promising new formulas suitable for minima and TSs. For controlling the step size by trust radius (τ) methods or rational function optimization (see Step Size Control), diagonalizing the Hessian can become a computational bottleneck for large systems. Updating the eigenvectors and eigenvalues of the Hessian can be considerably more efficient.77

Quasi-Newton methods require an initial estimate of the Hessian. A scaled identity matrix may be sufficient in some cases, but a better starting Hessian can be obtained from knowledge of the structure and connectivity of the molecule. Simple empirical estimates of stretching, bending, and torsional force constants are usually satisfactory.78–80 Calculation of an initial Hessian by molecular mechanics or semiempirical electronic structure methods can provide a better estimate. If the molecule has been optimized at a lower level of theory, the updated Hessian from the optimization or the Hessian from a frequency calculation at a lower level of theory are even better initial estimates. For harder cases, the rows and columns of the Hessian can be calculated numerically for a few critical coordinates. For more difficult cases, the full Hessian can be calculated at the beginning of the optimization and recalculated every few steps if necessary. Recalculating the Hessian at each step corresponds to the Newton method.

Standard quasi-Newton methods store and invert the full Hessian. For large optimization problems, this may be a bottleneck. The updating methods can be reformulated to update the inverse of the Hessian. For example, the BFGS formula for the update of the inverse Hessian is:

ΔB_BFGS = Δx Δx^T / (Δx^T Δg) − Bold Δg Δg^T Bold / (Δg^T Bold Δg),   (11)

where B = H^−1, and the updated inverse Hessian obeys Δx = B Δg. Limited-memory quasi-Newton methods such as L-BFGS80–89 avoid the storage of the full Hessian or its inverse, which would require O(n^2) memory for n variables. Instead, they start with a diagonal inverse Hessian and store only the Δx and Δg vectors from a limited number of previous steps; thus, the storage is only O(n). The inverse Hessian is written as a diagonal Hessian plus the updates using the stored vectors. For the Newton step, xnew = xold − B gold, the product of the updated inverse Hessian and the gradient involves only O(n) work because it can be expressed in terms of dot products between vectors.
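The O(n) inverse-Hessian-vector product is conventionally computed with the standard two-loop recursion (a textbook algorithm, sketched here rather than spelled out in the article); the diagonal scaling γ = sᵀy / yᵀy is a common but not unique choice. By construction, the result satisfies the secant relation B Δg = Δx exactly for the most recently stored pair.

```python
import numpy as np

def lbfgs_direction(g, s_list, y_list):
    """Two-loop recursion: computes B g, the product of the L-BFGS
    inverse Hessian with a vector, in O(m n) work for m stored pairs
    (s = step vectors, y = gradient-change vectors)."""
    q = np.array(g, dtype=float)
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    # Initial diagonal inverse Hessian: gamma * I with gamma = s.y / y.y
    s, y = s_list[-1], y_list[-1]
    q *= (s @ y) / (y @ y)
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return q

# Model quadratic surface: y_i = H_true s_i for the stored pairs.
H_true = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 0.5], [0.0, 0.5, 2.0]])
s_list = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.5])]
y_list = [H_true @ s for s in s_list]
r = lbfgs_direction(y_list[-1], s_list, y_list)
# r recovers s_list[-1]: the secant relation holds for the newest pair.
```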

CONJUGATE GRADIENT METHODS

Conjugate gradient methods are suitable for very large systems because they require less storage than limited-memory quasi-Newton methods. The concept behind conjugate gradient methods is to choose a new search direction that will lower the energy while remaining at or near the minimum in the previous search direction. If the Hessian has coupling between the coordinates, the optimal search directions are not orthogonal but are conjugate, in the sense that Δxnew^T H Δxold = 0. Two of the most frequently used conjugate gradient methods are Fletcher–Reeves90 and Polak–Ribière91:

s_i = −g_i + (g_i^T g_i / g_(i−1)^T g_(i−1)) s_(i−1),   (12)

s_i = −g_i + [(g_i − g_(i−1))^T g_i / g_(i−1)^T g_(i−1)] s_(i−1),   (13)

where s_i and s_(i−1) are the current and previous search directions, respectively, and g_i and g_(i−1) are the current and previous gradients, respectively. Note that only three vectors need to be stored. Unlike quasi-Newton methods, conjugate gradient methods require a line search at each step in order to converge.
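A minimal sketch of the Polak–Ribière scheme of Eq. (13) on a quadratic model surface, where the exact line search has a closed form (the model matrices are arbitrary illustrations); on an n-dimensional quadratic, conjugacy makes the method converge in at most n steps up to rounding.

```python
import numpy as np

def pr_conjugate_gradient(H, b, x0, gtol=1e-10, max_iter=50):
    """Polak-Ribiere conjugate gradient, Eq. (13), minimizing the
    quadratic 1/2 x^T H x - b^T x with an exact line search."""
    x = np.array(x0, dtype=float)
    g = H @ x - b
    s = -g
    for k in range(max_iter):
        if np.linalg.norm(g) < gtol:
            return x, k
        alpha = -(g @ s) / (s @ H @ s)          # exact step along s
        x += alpha * s
        g_new = H @ x - b
        beta = ((g_new - g) @ g_new) / (g @ g)  # Eq. (13)
        s = -g_new + beta * s                   # only three vectors stored
        g = g_new
    return x, max_iter

H_model = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 0.5], [0.0, 0.5, 2.0]])
b_model = np.array([1.0, 2.0, 3.0])
x_min, n_iter = pr_conjugate_gradient(H_model, b_model, np.zeros(3))
```

On a real anharmonic surface, H s is not available; the step along s must instead be found by the numerical line searches described below.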

GDIIS

Direct inversion of the iterative subspace (DIIS) constructs a solution to a set of linear equations such as Eq. (3) by expanding in a set of error vectors or


residuals from prior iterations. Geometry optimization by DIIS (GDIIS) applies this approach to the optimization of equilibrium geometries and transition structures.92–95 In the numerical analysis literature, DIIS is known as the generalized minimal residual (GMRES) method.96 It has its origins in methods like the Krylov,97 Lanczos,98 and Arnoldi99 algorithms. In GDIIS, the goal is to construct a new geometry as a linear combination of previous geometries so as to minimize the size of the Newton step [i.e., the residual in the iterative solution of Eq. (3)]:

x* = Σ_i c_i x_i,   g* = Σ_i c_i g(x_i),   Σ_i c_i = 1.   (14)

Minimize |xnew − x*|^2 with respect to the c_i, where xnew = x* − H^−1 g*.   (15)

The GDIIS coefficients are determined by minimizing the residual and can be obtained by solving the following:

[ A    1 ] [ c ]   [ 0 ]
[ 1^T  0 ] [ λ ] = [ 1 ],

where A_ij = (H^−1 g_i)^T (H^−1 g_j).   (16)

The GEDIIS method95 uses A_ij = (g_i − g_j)^T (x_i − x_j). If A becomes singular, the oldest geometry is discarded and the solution is attempted again. The approximate Hessian can be held constant, but better performance is obtained if it is updated. If the minimization wanders into a region with a negative eigenvalue, it may converge to a TS. For a TS optimization, GDIIS sometimes converges not to the desired reaction barrier but to the nearest saddle point. One way to control GDIIS optimizations is to compare the predicted geometry with the one from a Newton step.94 If the angle is too large, it is safer to take the Newton step (provided the Hessian has the correct number of negative eigenvalues). In the final stages of a minimization, GDIIS sometimes converges faster than quasi-Newton methods; a hybrid of quasi-Newton and DIIS methods can be more efficient.95
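The machinery of Eqs. (14)-(16) reduces to a small augmented linear system. In the idealized case sketched below (a quadratic model with the exact Hessian, and arbitrary trial geometries), the GDIIS geometry xnew lands on the minimum in a single solve; with an approximate, updated Hessian the step is only a correction and the procedure iterates.

```python
import numpy as np

H = np.array([[3.0, 1.0], [1.0, 2.0]])        # model (exact) Hessian
xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
gs = [H @ x for x in xs]                      # gradients of E = 1/2 x^T H x

es = [np.linalg.solve(H, g) for g in gs]      # error vectors H^-1 g_i
m = len(xs)
A = np.array([[ei @ ej for ej in es] for ei in es])

# Augmented system of Eq. (16): minimize |sum_i c_i e_i|^2 with sum_i c_i = 1.
M = np.zeros((m + 1, m + 1))
M[:m, :m] = A
M[:m, m] = M[m, :m] = 1.0
rhs = np.zeros(m + 1)
rhs[m] = 1.0
c = np.linalg.solve(M, rhs)[:m]

x_star = sum(ci * xi for ci, xi in zip(c, xs))
g_star = sum(ci * gi for ci, gi in zip(c, gs))
x_new = x_star - np.linalg.solve(H, g_star)   # Eq. (15)
# With the exact Hessian of this quadratic, x_new is the minimum (origin).
```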

LINE SEARCH

Newton, quasi-Newton, and GDIIS methods will converge without a line search if the surface is quadratic. However, life is not quadratic. For anharmonic surfaces, the Newton step (Eq. (3)) may be too large or too small. Near an inflection point, Newton steps can get into an infinite hysteresis loop. Conjugate gradient methods need a line search because they generate only a search direction but not a step length. Thus, a line search is a good idea for any method, especially if it can be done with little or no extra cost. The line search need not be exact but must fulfill the Wolfe conditions, ΔE < α g^T Δx and gnew^T Δx > β gold^T Δx, reducing the function value and the magnitude of the gradient (α ≈ 0.1 and β ≈ 0.5 have been found to be practical for quasi-Newton optimizations18). A simple approach is to fit a polynomial to the energy and gradient at the current and the previous geometry (a cubic polynomial, or a quartic polynomial constrained to have no negative second derivatives100). The minimum is found on the polynomial, the gradient is interpolated to the minimum, and the interpolated gradient is used for the next quasi-Newton step. If there is a large extrapolation to the minimum, it is safer to explicitly search for the minimum by doing additional energy (and gradient) calculations to obtain a sufficient reduction in the energy (and the magnitude of the gradient).
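The cubic fit can be sketched as follows. Given the energy and directional gradient at the current point (a = 0) and the trial point (a = 1), the two remaining cubic coefficients are fixed, and the interpolated minimizer is the root of the derivative with positive curvature. This is a bare-bones illustration assuming a descent direction (g0 < 0); production line searches (such as the constrained quartic of Ref 100) add safeguards against extrapolation.

```python
def cubic_step(E0, g0, E1, g1):
    """Fit E(a) ~ E0 + g0*a + c2*a^2 + c3*a^3 to the energy and
    directional gradient at a = 0 and a = 1; return the interpolated
    minimizer (zero slope, positive curvature). Assumes g0 < 0."""
    d = E1 - E0 - g0          # c2 + c3
    e = g1 - g0               # 2*c2 + 3*c3
    c3 = e - 2.0 * d
    c2 = 3.0 * d - e
    if abs(c3) < 1e-12:       # cubic term vanishes: quadratic fit
        return -g0 / (2.0 * c2)
    disc = c2 * c2 - 3.0 * c3 * g0
    return (-c2 + disc ** 0.5) / (3.0 * c3)
```

For the exactly quadratic test function E(a) = (a − 0.3)^2, sampling at a = 0 and a = 1 recovers the minimizer a = 0.3; for an anharmonic E(a) = a^4 − a, the single cubic fit already lands close to the true minimizer near a ≈ 0.63.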

STEP SIZE CONTROL

The quadratic approximation to the potential energy surface is satisfactory only for a small local region, usually specified by a trust radius, τ. Steps outside this region are risky, and optimizations are more robust if the step size does not exceed τ. An initial estimate of τ can be updated during the course of an optimization based on how well the potential energy surface can be fit by a quadratic expression. A typical updating recipe is as follows:18

ρ = ΔE/(gᵀΔx + ½ ΔxᵀH0Δx).

If ρ > 0.75 and 5/4 |Δx| > τ_old, then τ_new = 2 τ_old.
If ρ < 0.25, then τ_new = ¼ |Δx|.
Otherwise, τ_new = τ_old. (17)
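The updating recipe of Eq. (17) is simple enough to sketch directly; the function and variable names (`update_trust_radius`, `de_actual`, `de_predicted`) are illustrative, not from the article.

```python
# Minimal sketch of the trust-radius update of Eq. (17): rho compares the
# actual energy change with the quadratic prediction g^T dx + 1/2 dx^T H dx.
def update_trust_radius(tau, de_actual, de_predicted, step_norm):
    rho = de_actual / de_predicted
    if rho > 0.75 and 1.25 * step_norm > tau:
        return 2.0 * tau          # model is good and the step hit the boundary
    if rho < 0.25:
        return 0.25 * step_norm   # model is poor: shrink below the step taken
    return tau                    # otherwise keep tau unchanged

tau = update_trust_radius(0.3, -0.010, -0.011, 0.3)   # rho ~ 0.91: tau doubles
```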

The simplest approach to step size control is to scale the Newton step back if τ is exceeded. A better approach is to minimize the energy under the constraint that the step is not larger than τ.18,20 In the trust radius method (TRM), this is done by using a Lagrangian multiplier, λ, and corresponds to minimizing E(x) − ½ λ (|Δx|² − τ²). With the usual quadratic approximation for E(x), this yields:

g0 + H0Δx − λΔx = 0

or Δx = −(H0 − λI)⁻¹g0, (18)

where I is the identity matrix. For minimizations, λ must be chosen so that all
the eigenvalues of the shifted Hessian, H − λI, are positive [i.e., λ must be smaller (more negative) than the lowest eigenvalue of H]. Thus, even if H has one or more negative eigenvalues, this approach can be used to take a controlled step downhill toward a minimum. A plot of the square of the step size as a function of λ is shown in Figure 2(a) (the singularities occur when λ is equal to one of the eigenvalues of H).

Volume 1, September/October 2011, 795. © 2011 John Wiley & Sons, Ltd.

FIGURE 2 | Plot of the displacement squared, |Δx|² = |(H − λI)⁻¹g|², as a function of the Hessian shift parameter λ (Reprinted with permission from Ref 102. Copyright 1983 ACS Publications.). The singularities occur at the eigenvalues of the Hessian, hi. (a) For a displacement to a minimum with a trust radius of τ1, the shift parameter λ1 must be less than the lowest eigenvalue of the Hessian, h1. (b) For a displacement to a transition structure, the shift parameter must be between h1 and ½h2. For τ1, the lower of the two solutions is chosen. For a smaller trust radius τ2, there is no solution; λ2 is chosen as the minimum in the curve [or λ2 = ½(h1 + ½h2) if λ2 > ½h2] and the displacement is scaled back to the trust radius (see Refs 101–104 for more details).

For TS optimization, the shifted Hessian must have one negative eigenvalue (i.e., λ must be larger than the lowest eigenvalue of H but smaller than the second lowest eigenvalue). As shown in Figure 2(b), if there are two solutions for a given step size, λ is usually chosen to be closer to the lowest eigenvalue; if there are no solutions for a given step size, λ is chosen as the minimum between the lowest and second lowest eigenvalues and the resulting step is scaled back to τ.101–104 This procedure then allows a controlled step to be taken toward the TS even when the Hessian does not have the correct number of negative eigenvalues.

An alternative approach for TS optimization is to use the eigenvectors of the approximate Hessian to divide the coordinates into two groups; the optimization then searches for a maximum along one eigenvector and for a minimum in the remaining space. The step size in this approach can be controlled by using two Lagrangian multipliers, one for the step uphill along one eigenvector and the other for the step downhill in the remainder of the space. Because this approach allows one to follow an eigenvector uphill even if it does not have the lowest eigenvalue, it is also known as eigenvector following.74,101–109 Alternatively, a separate λ can be chosen for each eigenvector such that the step varies from steepest descent for large gradients, to a Newton step with a shifted Hessian for moderate gradients, to a simple Newton step for small gradients.2 For all of these approaches to transition structure optimization, the Hessian must have a suitable eigenvector that resembles the desired reaction path. Instead of focusing on a single eigenvector, the reduced potential surface approach selects a small subset of the coordinates for the transition structure search and minimizes the energy in the remaining space.110,111

The rational function optimization (RFO) method is a closely related approach for controlling the step size in both minimization and transition structure searching that replaces the quadratic approximation by a rational function approximation.105

ΔE(x) = (gᵀΔx + ½ ΔxᵀHΔx)/(1 + ΔxᵀSΔx)

     = ½ ([Δxᵀ 1] [H g; gᵀ 0] [Δx; 1]) / ([Δxᵀ 1] [S 0; 0 1] [Δx; 1]) (19)

Minimizing ΔE with respect to Δx by setting dΔE/dΔx = 0 leads to an eigenvalue equation involving the Hessian augmented by the gradient:

[H g; gᵀ 0] [Δx; 1] = 2ΔE [S 0; 0 1] [Δx; 1] (20)

Typically, S is chosen as a constant times the identity matrix [in this case the top row of Eq. (20) reduces to Δx = −(H0 − λI)⁻¹g0, as in the TRM, Eq. (18)]. The lowest eigenvalue and eigenvector of the augmented Hessian are used for minimizations, whereas the second lowest eigenvalue and eigenvector are chosen for transition structure optimizations. The Δx obtained by the RFO approach is reduced if it is larger than τ.105,106,108,109 A partitioned RFO method may be better for saddle point searches if the Hessian does not have the correct number of negative eigenvalues.105 In this case, one eigenvector is chosen to be followed uphill to a maximum and the remaining eigenvectors are chosen for minimization; the RFO method is used for optimization in both subspaces. It is possible to reach other transition structures by following an eigenvector other than the one with the lowest eigenvalue.105–109
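In one dimension (with S taken as the identity), the augmented Hessian of Eq. (20) is a 2×2 matrix whose eigenpairs are available in closed form, which makes the RFO minimization step easy to sketch. The function name and numbers below are illustrative, not from the article.

```python
# Toy sketch of a 1-D RFO minimization step: take the eigenvector of the
# augmented Hessian [[h, g], [g, 0]] with the lowest eigenvalue, scaled so
# its last component is 1; the first component is then dx.
import math

def rfo_step_1d(h, g):
    lam_low = 0.5 * (h - math.sqrt(h * h + 4.0 * g * g))   # lowest eigenvalue
    # Eigenvector (v1, v2) satisfies h*v1 + g*v2 = lam_low*v1; set v2 = 1:
    v1 = g / (lam_low - h)
    return v1

dx = rfo_step_1d(2.0, 0.5)   # downhill step, slightly damped vs Newton (-0.25)
```

Note that dx equals −g/(h − λ) with λ the lowest eigenvalue of the augmented matrix, which is the shifted-Newton form quoted in the text.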


CONSTRAINED OPTIMIZATION

Under a variety of circumstances, it may be necessary to apply constraints while optimizing the geometry (e.g., scanning potential energy surfaces, coordinate driving, reaction path following, etc.). For nonredundant coordinate systems and simple constraints, the coordinates being held constant can easily be removed from the space of variables being optimized. For more general constraints and/or redundant internal coordinate systems, constraints can be applied by penalty functions, projection methods, or Lagrangian multipliers.

In the penalty function method, the constraints Ci(x) = 0 are imposed by adding an extra term, ½ Σi αi Ci(x)², to the energy in Eq. (1), and the energy is minimized as usual. Because the αi need to have suitably large values so that the constraints are approximately satisfied at the minimum, the optimization may converge much more slowly than the corresponding unconstrained optimization.

The preferred method for including constraints in an optimization is by using Lagrangian multipliers:

L(x) = E(x) + Σi λi Ci(x) (21)

At convergence of the constrained optimization, the derivatives of the Lagrangian L(x) with respect to the coordinates x and the Lagrangian multipliers λi must be zero. Within the local quadratic approximation,

∂L(x)/∂x = g0 + H0Δx + Σi λi ∂Ci(x)/∂x = 0

and ∂L(x)/∂λi = Ci(x) = 0. (22)

Because the Lagrangian multipliers are optimized along with the geometric variables, this method generally converges much faster than the penalty function method, and the constraints are satisfied exactly. In the special case in which the constraint is a linear function of the displacements, cᵀΔx = c0, the Lagrangian multiplier problem can be solved by using an augmented Hessian:

g0 + H0Δx + λc = 0 and cᵀΔx = c0

or [H c; cᵀ 0] [Δx; λ] = [−g0; c0] (23)

Another way of applying linear constraints is the projection method, in which a projector P is used to remove the directions in which the displacements are constrained to be zero:

Pg0 + PH0PΔx + α(I − P)Δx = 0, P = I − Σi ci ciᵀ/|ci|², (24)

where the ci are a set of orthogonal constraint vectors and α > 0. For redundant internal coordinates, the projector needs to remove the coordinate redundancies as well as the constraint directions.58

SPECIAL CONSIDERATIONS FOR TRANSITION STRUCTURE OPTIMIZATION

Finding a minimum is comparatively easy because the negative of the gradient always points downhill. By contrast, a transition structure optimization must step uphill in one direction and downhill in all other orthogonal directions. Often, the uphill direction is not known in advance and must be determined during the course of the optimization. As a result, numerous special methods have been developed for transition structure searching, and many of them are closely related. They can be loosely classified as single-ended and double-ended methods.2 Single-ended methods start with an initial structure and displace it toward the transition structure. These algorithms include quasi-Newton and related methods (as discussed above and in Quasi-Newton Methods for Transition Structures and Related Single-Ended Methods). Double-ended methods start from the reactants and products and work from both sides to find the transition structure and the reaction path. These include the nudged elastic band (NEB) method, the string method (SM), and the growing string method (GSM; see Chain-of-States and Double-Ended Methods).

The success of a given transition structure optimization method depends on the topology of the surface as well as on the starting structure(s), initial Hessian(s), and coordinates. The two-dimensional contour plots in Figure 3 illustrate some of the important features that can be found on higher dimensional surfaces.112,113 If the change from reactant to product is dominated by a single coordinate, the potential energy surface will have an approximately linear valley, as shown in Figure 3(a). Hindered rotation about a single bond is an example of such an I-shaped surface. All methods should be able to find the transition structure on this type of surface without difficulty. Making one bond and breaking another results in an L- or V-shaped surface, as shown in Figure 3(b). Because the reaction path is strongly curved, finding the transition structure can be a bit more difficult. A T-shaped surface involves a transition structure
in a hanging valley at the side of a main valley, Figure 3(c). A method that follows the main valley will miss the transition structure, but will reach the transition structure if it starts from the other valley. The H- or X-shaped surface in Figure 3(d) is the most difficult because following the valley floor from either the reactant or the product side will miss the transition structure.

FIGURE 3 | Various classes of reaction channels near the transition structure on reactive potential energy surfaces: (a) I-shaped valley, (b) L- or V-shaped valley, (c) T-shaped valley, and (d) H- or X-shaped valley. (Reprinted with permission from Ref 113. Copyright 2009 ACS Publications.)

Quasi-Newton and single-ended methods are discussed first. These methods differ primarily in the way they try to get close to the quadratic region of the transition structure. Double-ended methods start with the reactants and products, and optimize the reaction path as well as the transition structure.

Quasi-Newton Methods for Transition Structures

Newton and quasi-Newton algorithms are the most efficient single-ended methods for optimizing transition structures if the starting geometry is within the quadratic region of the transition structure. With suitable techniques for controlling the optimization steps, such as the TRM, RFO, and eigenvector following (see Step Size Control), these methods will also converge to a transition structure even if they start outside the

quadratic region. Many of the related methods differ primarily in the techniques they use to get close to the quadratic region. There are three main differences in using quasi-Newton methods to optimize transition structures compared with minima:

1. The Hessian update must allow for negative eigenvalues (Newton and Quasi-Newton Methods). BFGS (Eq. (7)) yields a positive definite update and is not appropriate. SR1 and PSB updates are suitable (Eqs. (6) and (8)); the Bofill update74 (Eq. (9)) combines SR1 and PSB, yielding better results.

2. Line searches (Step Size Control) are generally not possible because the step toward the transition structure may be uphill or downhill. If the coordinates can be partitioned a priori into a subspace spanning the coordinates involved in the transition vector and the remaining subspace involving only coordinates to be minimized, then a line search can be used in the latter subspace.

3. Controlling the step size and direction (see Step Size Control) is key to the success of quasi-Newton methods for transition structures, especially if the initial structure is not in the quadratic region. For the TRM (Eq. (18)), the shifted Hessian, H − λI, is required to have one negative eigenvalue, and λ must be chosen so that the step size does not exceed τ. This is illustrated in Figure 2(b) and discussed in the section Step Size Control. For the partitioned rational function optimization and eigenvector following methods,105–109 Eqs. (19) and (20), two different values of λ are used, one to shift the eigenvalue for the maximization direction and the other to shift the eigenvalues for minimization in the remaining directions.

A quasi-Newton optimization of a transition structure requires an initial geometry and an initial estimate of the Hessian. The initial geometry must be somewhere near the quadratic region of the transition structure. This can be challenging at times because our qualitative understanding of transition structure geometries is not as well developed as it is for equilibrium geometries. There are some general rules of thumb that may be useful. For example, the bonds that are being made or broken in a reaction are, in many cases, elongated by ca 50% in the transition structure. Hammond's postulate114 is useful for estimating the position of the transition structure along the reaction path. It states that the transition structure
is more reactant-like for an exothermic reaction and more product-like for an endothermic reaction. Orbital symmetry/phase conservation rules115 may indicate when a reaction should occur via a non-least-motion pathway. The choice of coordinates for the optimization is also very important. Often the union of the (redundant) internal coordinates of the reactants and products provides a good coordinate set for the transition structure. Sometimes extra dummy atoms may need to be added to avoid problems with the coordinate system.

When a transition structure for an analogous reaction is not available as an initial guess, then bracketing the search region for the transition structure optimization can be useful. In the QST2 approach,116 a structure on the reactant side and one on the product side are used to provide bounds on the transition structure geometry and to approximate the direction of the reaction path through the transition structure. The QST3 input adds a third structure as an initial estimate of the transition structure. A series of steps is taken along the path connecting these two or three structures until a maximum is reached. Then, a full dimensional quasi-Newton optimization is carried out to find the transition structure.

Quasi-Newton methods require an initial estimate of the Hessian that has a negative eigenvalue with a corresponding eigenvector that is roughly parallel to the desired reaction path. The empirical rules for estimating Hessians for equilibrium geometries78–80 do not provide suitable estimates of the required negative eigenvalue and eigenvector of the Hessian. Thus, quasi-Newton transition structure optimizations are typically started with an analytical or numerical calculation of the full Hessian (or at least a numerical calculation of the most important components of the Hessian needed to obtain the transition vector). For the QSTn methods,116 updating the Hessian during the initial maximization steps along the reaction path is usually sufficient to produce a suitable negative eigenvalue and corresponding eigenvector.

An empirical valence bond (EVB) model117 of the surface can be useful in obtaining a starting geometry and an initial Hessian for a quasi-Newton transition structure optimization. A quadratic surface is constructed around the reactant minimum and another around the product minimum. These two surfaces intersect along a seam. The lowest point along this seam is a good estimate of the transition structure geometry118,119 (provided that there are no additional intermediates or transition structures along the reaction path). An initial estimate of the Hessian can be obtained from the interaction of the reactant surface and the product surface using a 2 × 2 EVB Hamiltonian with a suitable guess for the interaction matrix element.120

If more information about the potential energy surface is needed to find the quadratic region of the transition structure, then one can calculate the energy for a series of points along the linear synchronous transit121 (LST) path between reactants and products to find a maximum along this approximate path (an LST scan); internal or distance matrix coordinates are better than Cartesian coordinates for an LST scan. For more complex systems, two or more directions may need to be scanned to find the ridge separating reactants and products. These scans can be carried out with or without optimizing the remaining coordinates (relaxed vs rigid surface scan). The reduced surface method110,111 selects a set of coordinates to span a two- or three-dimensional surface and minimizes the energy with respect to all of the remaining coordinates. This reduced dimensional surface can be interpolated using distance-weighted interpolants and searched exhaustively for the lowest energy transition structures and reaction paths.
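A rigid LST scan of the kind described above can be sketched in a few lines. The double-well surface below is a made-up model, not one from the article; the maximum along the straight-line path brackets the saddle region.

```python
# Sketch of a rigid LST scan: evaluate the energy at evenly spaced points on
# the straight line between reactant and product geometries and take the
# maximum as a starting guess for a transition structure search.

def energy(x, y):
    return (x * x - 1.0) ** 2 + 2.0 * y * y     # toy double well, minima at (+-1, 0)

reactant, product = (-1.0, 0.0), (1.0, 0.0)
n = 21
path = [(reactant[0] + (product[0] - reactant[0]) * i / (n - 1),
         reactant[1] + (product[1] - reactant[1]) * i / (n - 1))
        for i in range(n)]
energies = [energy(x, y) for x, y in path]
i_max = max(range(n), key=lambda i: energies[i])
ts_guess = path[i_max]                          # near the (0, 0) saddle region
```

On a real surface, the structure at `i_max` (or a maximization along the interpolated path, as in the QSTn methods) would seed a full quasi-Newton transition structure optimization.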

Related Single-Ended Methods

In principle, one should be able to start at the reactants (or products) and go uphill to the transition structure. However, all directions from a minimum are uphill, and most will end at higher energy transition structures or higher-order saddle points. One approach is to choose a coordinate that will carry the molecule from reactant to product and step along this coordinate while minimizing all other directions. This is known as coordinate driving and involves a series of constrained optimizations (see Constrained Optimization). Coordinate driving has its pitfalls.122–125 For example, if the dominant coordinate changes along the reaction path, this approach may not work (e.g., Figure 3(c) and (d)). Alternatively, one can choose a direction and perform a constrained optimization of the components of the gradient perpendicular to this direction. This is known variously as line-then-plane,126 reduced gradient following,127–129 and Newton trajectories.130–132

Growing string methods (GSMs)131–136 are coordinate driving or reduced gradient following/Newton trajectory methods that start from both the reactant and product sides. Because GSMs are double-ended methods closely related to chain-of-states methods such as the NEB method and the SM, they are discussed in the next section.

An alternative to coordinate driving and reduced gradient following is to step uphill along the lowest eigenvector of the Hessian; this amounts
to the walking up valleys or eigenvector following method described above.101–109 Stepping toward the transition structure is handled by a quasi-Newton approach with the TRM and RFO, Eqs. (18)–(20), to control the step.

The dimer method137–140 is a variant of the eigenvector following approach that uses gradients calculated at two closely spaced points to keep track of the direction to search for a maximum. At each step, the dimer is rotated to minimize the sum of the two energies (this can be the more costly step in the dimer method). The minimum in the dimer energy occurs when the vector between the points is aligned with the Hessian eigenvector that has the lowest eigenvalue. The midpoint of the dimer is displaced uphill toward the transition structure in a manner similar to the eigenvector following method.
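The key property of the dimer rotation (its energy is minimized when the dimer axis aligns with the softest Hessian eigenvector) can be demonstrated on a quadratic model surface. This is an illustrative toy: the brute-force angle scan stands in for the gradient-based rotation used in real implementations, and all numbers are made up.

```python
# Toy sketch of the dimer rotation on E = 1/2 (a x^2 + b y^2): rotating the
# dimer to minimize the sum of its two endpoint energies aligns it with the
# lowest-curvature (here, negative-curvature) eigenvector.
import math

a, b = -1.0, 2.0                      # curvatures; x is the uphill direction
def energy(x, y):
    return 0.5 * (a * x * x + b * y * y)

x0, delta = (0.3, 0.4), 0.01          # dimer midpoint and half-length
def dimer_energy(theta):
    dx, dy = math.cos(theta), math.sin(theta)
    return (energy(x0[0] + delta * dx, x0[1] + delta * dy)
            + energy(x0[0] - delta * dx, x0[1] - delta * dy))

# Brute-force rotation scan (a real implementation rotates using gradients):
best = min((dimer_energy(t * math.pi / 400.0), t * math.pi / 400.0)
           for t in range(400))[1]
direction = (math.cos(best), math.sin(best))   # ~ (+-1, 0), the soft mode
```

Once the direction is found, the dimer midpoint would be displaced uphill along it, in the manner of eigenvector following.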

Another way to walk uphill along the shallowest ascent path is to follow a gradient extremal path.141–143 Although steepest descent reaction paths can only be followed downhill, gradient extremal paths are defined locally (the gradient is an eigenvector of the Hessian) and can be followed uphill as well as downhill. A number of algorithms have been developed for following gradient extremal paths.144–146

Gradient extremals do pass through minima, transition structures, and higher-order saddle points; however, they have a tendency to wander about the surface rather than follow the more direct route that steepest descent paths take.146,147 This makes gradient extremal path following less attractive for transition structure optimization.

The image function or associated surface approach148–150 converts a transition structure search into a minimization on a modified surface whose minimum coincides with the transition structure of the original function. In a representation in which the Hessian is diagonal, this is accomplished by choosing an eigenvector to follow and inverting the sign of the corresponding component of the gradient and of the Hessian eigenvalue. A TRM approach, Eq. (18), is then used to search for the minimum on the image function or associated surface. The transition structure can also be optimized as a minimum on the gradient norm surface, |g(x)| (Refs 151–153); however, the gradient norm can have additional minima at nonstationary points where the gradient is not zero. Because of the many minima and the smaller radius of convergence, minimizing the gradient norm is less practical for transition structure optimizations.

Bounds on the transition structure can be found by approaching it from both the reactant and product sides, providing progressively tighter limits.142,154,155

This is closely related to the GSM131–136 (see below).

FIGURE 4 | An example of a double-ended reaction path optimization on the Muller–Brown surface.154 The path optimization with 11 points starts with the linear synchronous transit path (black); the first two iterations are shown in blue and green, respectively. The path after 12 steps is in red and can be compared with the steepest descent path in light blue.

Given two points on opposite sides of the ridge separating reactants from products, the ridge method156,157 can also be used to optimize the transition structure. A point on the ridge is obtained by finding a maximum along a line connecting a point in the reactant valley and a point in the product valley. Two points on opposite sides of the ridge point are generated by taking small steps along this line. Then, a downhill step is taken from these two points to generate two new points; the maximum along the line between these new points is a lower energy ridge point. The process is repeated until a minimum is found along the ridge; this minimum is a transition structure.

Chain-of-States and Double-Ended Methods

Instead of optimizing a single point toward the transition structure for a reaction, an alternative approach is to optimize the entire reaction path connecting reactants to products. Typically this is done by representing the path by a series of points, that is, a chain of states. An example of a path optimization on the Muller–Brown surface154 is shown in Figure 4. The various methods of this type differ primarily in how the points are generated, what function of the points is minimized, and what constraints are imposed to control the optimization. Many of the path optimization
methods are based on minimizing the integral of the energy along the path, normalized by the path length158:

E_path = (1/L) ∫ E(x(s)) ds, (25)

where L is the total length of the path. Although this does not yield a steepest descent path,159 it provides a good approximation to it. The elastic band or chain-of-states method replaces the integral by a discrete set of points, and the path is found by a constrained minimization of the sum of the energies of the points on the path from reactants to products,

V_path = Σi E(xi), (26)

where the xi are the points on the path, which are required to remain equally spaced. Additional potentials are sometimes introduced to prevent the path from kinking or coiling up in minima.160 The number of points needed depends on the nature of the path (e.g., the number of intermediates and transition states, the curvature of the path, etc.) and can range from fewer than 10 to more than 50. Typically, several thousand steps are required for convergence, primarily because the motions of adjacent points are strongly coupled. The nudged elastic band (NEB) method161–170 and the string method (SM)171–176 are two current approaches that offer some improvement in the convergence of the path optimization.

In the NEB method,161–163 the points are kept equally spaced by adding a spring potential between the points (cf. the penalty function method, Constrained Optimization):

V_spring = ½ k_spring Σi (xi − xi−1)². (27)

The gradient for a point has contributions from the potential energy surface and from the spring potential, and these can be projected into components parallel and perpendicular to the path:

gi = dV_path/dxi, gi∥ = (giᵀτi)τi, gi⊥ = gi − gi∥, (28)

g̃i = dV_spring/dxi, g̃i∥ = (g̃iᵀτi)τi, g̃i⊥ = g̃i − g̃i∥, (29)

where τi is the normalized tangent to the path at xi and the tilde denotes the spring contribution. The tangent can be calculated by central difference (of xi−1 and xi+1) or by forward difference using the point uphill from xi. The NEB method161–163 uses the gradient of the spring potential to displace (nudge) the points along the reaction path, and uses the gradient of the unmodified potential energy surface for directions perpendicular to the path:

gi^NEB = gi⊥ + g̃i∥ (30)

The doubly nudged elastic band method164 includes a portion of the spring gradient perpendicular to the path (i.e., a second nudge):

gi^DNEB = gi⊥ + g̃i∥ + (g̃i⊥ − (g̃i⊥ᵀ gi⊥) gi⊥/|gi⊥|²) (31)

This accelerates the optimization in the early stages but may inhibit full convergence in the later stages.168

The original NEB method used a quenched or damped velocity Verlet method to propagate the points toward the path.161–163 After each step, most of the velocity is removed. Alternatively, a quasi-Newton method can be used for the constrained optimization, preferably by moving all of the points simultaneously.164,165,167,168 Because the dimension of the optimization problem (the number of coordinates times the number of points) can be rather large, limited memory quasi-Newton methods such as L-BFGS and ABNR have been used.164,165,167,168 Currently, the NEB/L-BFGS method is the most efficient and stable nudged elastic band method.168

The spring potential used to maintain uniform spacing in the NEB method introduces additional coupling between the points on the path that slows down the convergence of the optimization. An early path optimization method avoided the spring potential by moving the points so that the gradient would lie along the steepest descent path.173 The recently developed string method171,172 also minimizes the perpendicular gradient, gi⊥ in Eq. (28), for the points on the path. A cubic spline is used to redistribute the points to maintain equal spacing, and a fourth-order Runge–Kutta method is used to evolve the path.171,172 In the quadratic string method, the points move on local quadratic approximations to the surface using updated Hessians.174,175 Alternatively, the L-BFGS method can be used to propagate all of the points at the same time.168 The efficiency of the best string methods appears to be similar to that of the best NEB methods, requiring from hundreds to thousands of gradient calculations to optimize the path.168,176

The GSM133–136 reduces the total effort by generating the points one at a time, starting from the reactants and products. This is similar to applying the line-then-plane method126 and reduced gradient following/Newton trajectories131,132 from both ends of the path. The GSM is also closely related to earlier methods that stepped from the reactant and product toward the transition structure.142,154,155 As the string grows toward the center, the endpoints of the string provide better brackets for the transition structure. The search can be completed by interpolating to the
maximum along the path and optimizing the transition structure.136 If the ends of the growing string are too far apart, a NEB method or SM can be used to optimize the entire path. To reduce the cost, a low level of theory can be used to grow the string, and a higher level of theory can be used to refine the string or to optimize the transition structure.134,135

Chain-of-states methods such as NEB, SM, and GSM are still more costly than quasi-Newton methods for transition structures, and there is considerable room for improving their efficiency. However, these methods do map out the entire reaction path. This can be advantageous if the path contains several transition states and intermediates. Furthermore, chain-of-states methods are more readily parallelized than quasi-Newton and other single-ended methods. The development of these methods is ongoing.

Characterization of an Optimized Transition Structure

Once a transition structure has been optimized, it is necessary to confirm that it is indeed a transition structure and is appropriate for the reaction under consideration. A vibrational frequency calculation needs to be carried out for the transition structure to confirm that it has one and only one imaginary frequency (equivalently, the full dimensional Hessian must have one and only one negative eigenvalue). It is not sufficient to examine the updated Hessian used in the quasi-Newton optimization process because it may be subject to numerical noise, and the optimization may not have explored the full coordinate space (e.g., modes that lower the symmetry of the molecule). First, an accurate Hessian must be calculated analytically or numerically and used for the vibrational frequency analysis. Second, the transition vector, that is, the vibrational normal mode associated with the imaginary frequency, must be inspected to make sure that the motion corresponds to the desired reaction. This is most easily done with graphical software that can animate vibrational modes. For complex reaction networks, reaction path following may be needed to verify that the transition structure connects the desired reactants and products. Chain-of-states methods yield good approximations to the reaction path as a part of the optimization. For quasi-Newton and other methods that converge on a single structure, following the steepest descent path (see Reaction Path Following) will indicate the reactants and products connected by the transition structure. Reaction path following may also reveal other stationary points on the path.

REACTION PATH FOLLOWING

For quasi-Newton and related methods that locate only the transition structure, the steepest descent reaction path can be obtained by following the gradient downhill. Chain-of-states approaches such as the NEB method, SM, and GSM (see Chain-of-States and Double-Ended Methods) provide an approximation to the reaction path as a part of the optimization. Methods such as variational transition state theory177 and the reaction path Hamiltonian178 may require a more accurate path. In this case, rather than trying to obtain very tight convergence with a chain-of-states method, it is much more efficient to calculate an accurate path by a reaction path following method.

In the steepest descent reaction path (SDP) or minimum energy path (MEP), x(s) is defined as

dx(s)/ds = −g(x(s)) / |g(x(s))|,   (32)

where s is the arc length along the path. When mass-weighted coordinates are used, the path is the IRC of Fukui179 and corresponds to a classical particle moving with infinitesimal kinetic energy. A number of reviews are available for reaction path following methods.8,180 In principle, the reaction path can be obtained by standard methods for numerical integration of ordinary differential equations.181 However, in some regions of the potential energy surface, these methods produce paths that zigzag across the valley unless very small step sizes are used. This is characteristic of stiff differential equations, for which special techniques such as implicit methods are needed.182
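In the simplest scheme, Eq. (32) is integrated with explicit Euler steps. The sketch below is a minimal illustration on an assumed two-dimensional model surface, E = (x² − 1)² + 2y², not a molecular system; note that with a fixed normalized step the walker ends up hovering about the minimum instead of converging tightly, a mild version of the step-size problems mentioned above:

```python
import numpy as np

def gradient(x):
    # Analytic gradient of the model surface E = (x0^2 - 1)^2 + 2*x1^2,
    # which has a saddle at (0, 0) and minima at (+-1, 0).
    return np.array([4.0 * x[0] * (x[0]**2 - 1.0), 4.0 * x[1]])

def euler_sdp(x0, step=0.01, max_steps=2000):
    """Explicit Euler integration of dx/ds = -g/|g| (Eq. 32)."""
    x = np.array(x0, dtype=float)
    sdp_path = [x.copy()]
    for _ in range(max_steps):
        g = gradient(x)
        gnorm = np.linalg.norm(g)
        if gnorm < 1e-12:          # landed exactly on a stationary point
            break
        x = x - step * g / gnorm   # fixed arc-length step along -g/|g|
        sdp_path.append(x.copy())
    return np.array(sdp_path)

# Displace off the saddle along the transition vector and follow downhill;
# the endpoint stays within about one step length of the minimum at (1, 0).
sdp_path = euler_sdp([0.05, 0.02])
```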

The Ishida, Morokuma, Komornicki (IMK) method183 uses an Euler step followed by a line search to step back toward the path. The LQA method of McIver and Page,184,185 and subsequent improvements by others,186–188 obtain a step along the path by integrating Eq. (32) analytically on a quadratic energy surface using a calculated or updated Hessian. Because it makes use of quadratic information, LQA can take larger steps than IMK. The second-order method of Gonzalez and Schlegel189,190 (GS2) is an implicit trapezoid method that combines an explicit Euler step with an implicit Euler step. The latter is obtained by a constrained optimization. The larger step size and greater stability of the GS2 method more than compensate for the few extra gradient calculations needed in the optimization. The implicit trapezoid method has been generalized and extended by Burger and Yang191,192 to a family of implicit–explicit methods for accurate and efficient integration of reaction paths.
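The structure of such a step can be sketched as follows. This is only an illustration of the implicit-trapezoid idea on the same kind of toy surface as above, with the implicit stage solved by plain fixed-point iteration rather than the constrained optimization used in GS2:

```python
import numpy as np

def energy(x):
    # Model surface E = (x0^2 - 1)^2 + 2*x1^2: saddle at (0,0), minima at (+-1,0).
    return (x[0]**2 - 1.0)**2 + 2.0 * x[1]**2

def unit_grad(x):
    g = np.array([4.0 * x[0] * (x[0]**2 - 1.0), 4.0 * x[1]])
    return g / np.linalg.norm(g)

def implicit_trapezoid_step(x, h, n_iter=30):
    """One implicit-trapezoid step for dx/ds = -g/|g| (cf. GS2).

    An explicit Euler half-step is followed by an implicit Euler
    half-step; the implicit stage is solved here by fixed-point
    iteration (GS2 itself uses a constrained optimization).
    """
    x_mid = x - 0.5 * h * unit_grad(x)    # explicit half-step
    y = x_mid.copy()
    for _ in range(n_iter):               # solve y = x_mid - (h/2) g(y)/|g(y)|
        y = x_mid - 0.5 * h * unit_grad(y)
    return y

x0 = np.array([0.10, 0.05])
x1 = implicit_trapezoid_step(x0, h=0.05)
```

Because the step direction averages the unit gradients at both endpoints, the step stays closer to the true path than an explicit Euler step of the same length.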

802 Volume 1, September/October 2011. © 2011 John Wiley & Sons, Ltd.


The reaction path can be thought of as an average over a collection of classical trajectories or a trajectory in a very viscous medium. The dynamic reaction path method193 and the damped velocity Verlet approach194 obtain a reaction path by calculating a classical trajectory in which almost all of the kinetic energy has been removed at every step. The Hessian-based predictor–corrector method for reaction paths195,196 adapts an algorithm used for classical trajectory calculations. An LQA predictor step is taken on a local quadratic surface; a new Hessian is obtained at the end of the predictor step (by updating or by direct calculation); a corrector step is obtained on a distance-weighted interpolant using the local quadratics at the beginning and end of the predictor step.
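The damped-trajectory idea can be sketched as follows, again on an assumed toy two-dimensional surface with unit masses and an arbitrary damping factor: ordinary velocity Verlet steps are taken, but most of the velocity is discarded after each step, so the trajectory creeps along the valley floor instead of oscillating across it:

```python
import numpy as np

def gradient(x):
    # Model surface E = (x0^2 - 1)^2 + 2*x1^2 (same toy example as above).
    return np.array([4.0 * x[0] * (x[0]**2 - 1.0), 4.0 * x[1]])

def damped_verlet_path(x0, dt=0.05, damping=0.9, max_steps=2000, gtol=1e-4):
    """Velocity Verlet trajectory with the velocity quenched every step.

    Removing almost all of the kinetic energy at each step makes the
    trajectory hug the steepest descent path.
    """
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    traj = [x.copy()]
    for _ in range(max_steps):
        a = -gradient(x)                    # unit mass: acceleration = -grad E
        x = x + v * dt + 0.5 * a * dt**2    # Verlet position update
        a_new = -gradient(x)
        v = (v + 0.5 * (a + a_new) * dt) * (1.0 - damping)  # quench KE
        traj.append(x.copy())
        if np.linalg.norm(a_new) < gtol:    # forces vanished: at a minimum
            break
    return np.array(traj)

# Start near the saddle at (0, 0); the damped trajectory drifts down
# the valley toward the minimum near (1, 0).
traj = damped_verlet_path([0.05, 0.02])
```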

A reaction path calculated at a lower level of theory can be used to provide an estimate of the transition structure at a higher level of theory when full optimization at the higher level is not practical. The potential energy surface is more likely to change along the path than perpendicular to the path, because reaction energies and bond making/breaking coordinates are typically more sensitive to the accuracy of the electronic structure calculation. In the IRCMax method, single point high-level energy calculations along the low-level path are interpolated to give an estimate of the transition structure and energy at the high level of theory.197
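The interpolation step can be illustrated schematically: given arc-length coordinates along the low-level path and high-level single-point energies at those points, a parabola through the highest point and its two neighbors locates the high-level barrier. The data below are synthetic, not from the cited work:

```python
import numpy as np

def interpolate_path_maximum(s, e_high):
    """Quadratic interpolation of the energy maximum along a path.

    s      : arc-length coordinates of points on the low-level IRC
    e_high : high-level single-point energies at those points
    Returns the interpolated position and energy of the maximum
    (assumes the maximum is an interior point of the path).
    """
    i = int(np.argmax(e_high))
    # Fit a parabola through the highest point and its two neighbors.
    p = np.polyfit(s[i - 1:i + 2], e_high[i - 1:i + 2], 2)
    s_max = -p[1] / (2.0 * p[0])
    e_max = np.polyval(p, s_max)
    return s_max, e_max

# Synthetic example: a high-level barrier peaked at s = 0.3.
s = np.linspace(-1.0, 1.0, 21)
e = -(s - 0.3)**2
s_max, e_max = interpolate_path_maximum(s, e)
```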

Reaction paths can also be formulated as a variational method by finding the path that minimizes the following integral159,198–201:

I = ∫ √(g(x(t))T g(x(t))) √((dx(t)/dt)T (dx(t)/dt)) dt = ∫ √(g(x(s))T g(x(s))) ds,   (33)

where x(t) is the reaction path parameterized by an arbitrary variable t, whereas x(s) is the reaction path as a function of the arc length (see Eq. (32)). This is closely related to obtaining a classical trajectory by finding the path that minimizes the classical action, and may hold promise for future developments.
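The discretized value of this functional is easy to evaluate for a trial path; a variational path method would minimize it with respect to the interior points. A minimal sketch on the same assumed toy surface as above:

```python
import numpy as np

def gradient(x):
    # Model surface E = (x0^2 - 1)^2 + 2*x1^2: minima at (+-1, 0).
    return np.array([4.0 * x[0] * (x[0]**2 - 1.0), 4.0 * x[1]])

def path_functional(points):
    """Discretized Eq. (33): sum of |g| * ds over the path segments,
    with |g| evaluated at segment midpoints."""
    total = 0.0
    for a, b in zip(points[:-1], points[1:]):
        mid = 0.5 * (a + b)
        ds = np.linalg.norm(b - a)
        total += np.linalg.norm(gradient(mid)) * ds
    return total

# Straight-line trial path between the two minima (-1, 0) and (1, 0);
# for this surface the line y = 0 happens to be the exact MEP, and the
# analytic value of the integral is 2.
pts = np.array([[-1.0 + 2.0 * t, 0.0] for t in np.linspace(0.0, 1.0, 101)])
I_straight = path_functional(pts)
```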

SUMMARY

Methods for optimizing equilibrium geometries, locating transition structures, and following reaction paths have been outlined in this chapter. Quasi-Newton methods are very efficient for geometry optimization when used with Hessian updating, approximate line searches, and trust radius or rational function techniques for step size control. Single-ended and double-ended methods for transition structure searches have been summarized, and procedures for approaching transition structures are described. Quasi-Newton methods with eigenvector following and rational function optimization are efficient for finding transition structures. Double-ended techniques for transition structures include the nudged elastic band, string, and growing string methods. The chapter concludes with a discussion of validating transition structures and a description of methods for following steepest descent paths.

REFERENCES

1. Hratchian HP, Schlegel HB. Finding minima, transition states, and following reaction pathways on ab initio potential energy surfaces. In: Dykstra CE, Frenking G, Kim KS, Scuseria GE, eds. Theory and Applications of Computational Chemistry: the First Forty Years. New York: Elsevier; 2005, 195–249.

2. Wales DJ. Energy Landscapes. Cambridge: Cambridge University Press; 2003.

3. Schlegel HB. Exploring potential energy surfaces for chemical reactions: an overview of some practical methods. J Comput Chem 2003, 24:1514–1527.

4. Pulay P, Baker J. Optimization and reaction path algorithms. In: Spencer ND, Moore JH, eds. Encyclopedia of Chemical Physics and Physical Chemistry. Bristol: Institute of Physics; 2001, 2061–2085.

5. Schlegel HB. Geometry optimization. In: Schleyer PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia of Computational Chemistry. Vol 2. Chichester: John Wiley & Sons; 1998, 1136–1142.

6. Schlick T. Geometry optimization: 2. In: Schleyer PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia of Computational Chemistry. Vol 2. Chichester: John Wiley & Sons; 1998, 1142–1157.

7. Jensen F. Transition structure optimization techniques. In: Schleyer PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia of Computational Chemistry. Vol 5. Chichester: John Wiley & Sons; 1998, 3114–3123.


8. Schlegel HB. Reaction path following. In: Schleyer PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia of Computational Chemistry. Vol 4. Chichester: John Wiley & Sons; 1998, 2432–2437.

9. Schlegel HB. Geometry optimization on potential energy surfaces. In: Yarkony DR, ed. Modern Electronic Structure Theory. Singapore: World Scientific Publishing; 1995, 459–500.

10. Liotard DA. Algorithmic tools in the study of semiempirical potential surfaces. Int J Quantum Chem 1992, 44:723–741.

11. Schlegel HB. Some practical suggestions for optimizing geometries and locating transition states. In: Bertran J, ed. New Theoretical Concepts for Understanding Organic Reactions. Vol 267. The Netherlands: Kluwer Academic; 1989, 33–53.

12. Head JD, Weiner B, Zerner MC. A survey of optimization procedures for stable structures and transition-states. Int J Quantum Chem 1988, 33:177–186.

13. Schlegel HB. Optimization of equilibrium geometries and transition structures. Adv Chem Phys 1987, 67:249–286.

14. Bell S, Crighton JS. Locating transition-states. J Chem Phys 1984, 80:2464–2475.

15. Jensen F. Introduction to Computational Chemistry. Chichester; NY: John Wiley & Sons; 2007.

16. Cramer CJ. Essentials of Computational Chemistry: Theories and Models. West Sussex, New York: John Wiley & Sons; 2004.

17. Gill PE, Murray W, Wright MH. Practical Optimization. London, New York: Academic Press; 1981.

18. Dennis JE, Schnabel RB. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs, NJ: Prentice-Hall; 1983.

19. Scales LE. Introduction to Non-Linear Optimization. New York: Springer-Verlag; 1985.

20. Fletcher R. Practical Methods of Optimization. 2nd ed. Chichester: John Wiley & Sons; 1987.

21. Maeda S, Ohno K, Morokuma K. An automated and systematic transition structure explorer in large flexible molecular systems based on combined global reaction route mapping and microiteration methods. J Chem Theory Comput 2009, 5:2734–2743.

22. Maeda S, Ohno K, Morokuma K. Automated global mapping of minimal energy points on seams of crossing by the anharmonic downward distortion following method: a case study of H2CO. J Phys Chem A 2009, 113:1704–1710.

23. E W, Vanden-Eijnden E. Transition path theory and path finding algorithms for the study of rare events. Annu Rev Phys Chem 2010, 61:391–420.

24. Koga N, Morokuma K. Determination of the lowest energy point on the crossing seam between two potential surfaces using the energy gradient. Chem Phys Lett 1985, 119:371–374.

25. Farazdel A, Dupuis M. On the determination of the minimum on the crossing seam of two potential energy surfaces. J Comput Chem 1991, 12:276–282.

26. Ragazos IN, Robb MA, Bernardi F, Olivucci M. Optimization and characterization of the lowest energy point on a conical intersection using an MCSCF Lagrangian. Chem Phys Lett 1992, 197:217–223.

27. Manaa MR, Yarkony DR. On the intersection of two potential energy surfaces of the same symmetry. Systematic characterization using a Lagrange multiplier constrained procedure. J Chem Phys 1993, 99:5251–5256.

28. Bearpark MJ, Robb MA, Schlegel HB. A direct method for the location of the lowest energy point on a potential surface crossing. Chem Phys Lett 1994, 223:269–274.

29. Anglada JM, Bofill JM. A reduced-restricted-quasi-Newton-Raphson method for locating and optimizing energy crossing points between two potential energy surfaces. J Comput Chem 1997, 18:992–1003.

30. De Vico L, Olivucci M, Lindh R. New general tools for constrained geometry optimizations. J Chem Theory Comput 2005, 1:1029–1037.

31. Levine BG, Ko C, Quenneville J, Martinez TJ. Conical intersections and double excitations in time-dependent density functional theory. Mol Phys 2006, 104:1039–1051.

32. Keal TW, Koslowski A, Thiel W. Comparison of algorithms for conical intersection optimisation using semiempirical methods. Theor Chem Acc 2007, 118:837–844.

33. Bearpark MJ, Larkin SM, Vreven T. Searching for conical intersections of potential energy surfaces with the ONIOM method: application to previtamin D. J Phys Chem A 2008, 112:7286–7295.

34. Sicilia F, Blancafort L, Bearpark MJ, Robb MA. New algorithms for optimizing and linking conical intersection points. J Chem Theory Comput 2008, 4:257–266.

35. Dick B. Gradient projection method for constraint optimization and relaxed energy paths on conical intersection spaces and potential energy surfaces. J Chem Theory Comput 2009, 5:116–125.

36. Madsen S, Jensen F. Locating seam minima for macromolecular systems. Theor Chem Acc 2009, 123:477–485.

37. Bakken V, Helgaker T. The efficient optimization of molecular geometries using redundant internal coordinates. J Chem Phys 2002, 117:9160–9174.


38. Baker J. Techniques for geometry optimization: a comparison of Cartesian and natural internal coordinates. J Comput Chem 1993, 14:1085–1100.

39. Baker J, Chan FR. The location of transition states: a comparison of Cartesian, Z-matrix, and natural internal coordinates. J Comput Chem 1996, 17:888–904.

40. McQuarrie DA. Statistical Thermodynamics. Mill Valley, CA: University Science Books; 1973.

41. Heidrich D. The Reaction Path in Chemistry: Current Approaches and Perspectives. Dordrecht: Kluwer; 1995.

42. Haile JM. Molecular Dynamics Simulation: Elementary Methods. New York: John Wiley & Sons; 1992.

43. Thompson DL. Trajectory simulations of molecular collisions: classical treatment. In: Schleyer PvR, Allinger NL, Kollman PA, Clark T, Schaefer HF III, Gasteiger J, Schreiner PR, eds. Encyclopedia of Computational Chemistry. Vol 5. Chichester: John Wiley & Sons; 1998, 3056–3073.

44. Hase WL, ed. Advances in Classical Trajectory Methods. Greenwich: JAI Press; 1992.

45. Bunker DL. Classical trajectory methods. Methods Comput Phys 1971, 10:287.

46. Lasorne B, Worth GA, Robb MA. Excited-state dynamics. WIREs Comput Mol Sci 2011, 1:460–475.

47. Wilson EB, Decius JC, Cross PC. Molecular Vibrations: the Theory of Infrared and Raman Vibrational Spectra. New York: Dover Publications; 1980.

48. Kolda TG, Lewis RM, Torczon V. Optimization by direct search: new perspectives on some classical and modern methods. SIAM Rev 2003, 45:385–482.

49. Garcia-Palomares UM, Rodriguez JF. New sequential and parallel derivative-free algorithms for unconstrained minimization. SIAM J Optim 2002, 13:79–96.

50. Lewis RM, Torczon V, Trosset MW. Direct search methods: then and now. J Comput Appl Math 2000, 124:191–207.

51. Han LX, Neumann M. Effect of dimensionality on the Nelder–Mead simplex method. Optim Math Software 2006, 21:1–16.

52. Hehre WJ, Radom L, Schleyer PvR, Pople JA. Ab Initio Molecular Orbital Theory. New York: John Wiley & Sons; 1986.

53. Pulay P, Fogarasi G. Geometry optimization in redundant internal coordinates. J Chem Phys 1992, 96:2856–2860.

54. Fogarasi G, Zhou XF, Taylor PW, Pulay P. The calculation of ab initio molecular geometries: efficient optimization by natural internal coordinates and empirical correction by offset forces. J Am Chem Soc 1992, 114:8191–8201.

55. Baker J, Kessi A, Delley B. The generation and use of delocalized internal coordinates in geometry optimization. J Chem Phys 1996, 105:192–212.

56. Baker J, Kinghorn D, Pulay P. Geometry optimization in delocalized internal coordinates: an efficient quadratically scaling algorithm for large molecules. J Chem Phys 1999, 110:4986–4991.

57. von Arnim M, Ahlrichs R. Geometry optimization in generalized natural internal coordinates. J Chem Phys 1999, 111:9183–9190.

58. Peng CY, Ayala PY, Schlegel HB, Frisch MJ. Using redundant internal coordinates to optimize equilibrium geometries and transition states. J Comput Chem 1996, 17:49–56.

59. Farkas O, Schlegel HB. Methods for geometry optimization of large molecules. I. An O(N2) algorithm for solving systems of linear equations for the transformation of coordinates and forces. J Chem Phys 1998, 109:7100–7104.

60. Farkas O, Schlegel HB. Geometry optimization methods for modeling large molecules. THEOCHEM 2003, 666:31–39.

61. Baker J, Pulay P. Geometry optimization of atomic microclusters using inverse-power distance coordinates. J Chem Phys 1996, 105:11100–11107.

62. Eckert F, Pulay P, Werner HJ. Ab initio geometry optimization for large molecules. J Comput Chem 1997, 18:1473–1483.

63. Baker J, Pulay P. Efficient geometry optimization of molecular clusters. J Comput Chem 2000, 21:69–76.

64. Billeter SR, Turner AJ, Thiel W. Linear scaling geometry optimisation and transition state search in hybrid delocalised internal coordinates. Phys Chem Chem Phys 2000, 2:2177–2186.

65. Paizs B, Baker J, Suhai S, Pulay P. Geometry optimization of large biomolecules in redundant internal coordinates. J Chem Phys 2000, 113:6566–6572.

66. Kudin KN, Scuseria GE, Schlegel HB. A redundant internal coordinate algorithm for optimization of periodic systems. J Chem Phys 2001, 114:2919–2923.

67. Bucko T, Hafner J, Angyan JG. Geometry optimization of periodic systems using internal coordinates. J Chem Phys 2005, 122:124508.

68. Doll K, Dovesi R, Orlando R. Analytical Hartree-Fock gradients with respect to the cell parameter: systems periodic in one and two dimensions. Theor Chem Acc 2006, 115:354–360.

69. Murtagh BA, Sargent RWH. Computational experience with quadratically convergent minimization methods. Comput J 1972, 13:185–194.

70. Broyden CG. The convergence of a class of double rank minimization algorithms. J Inst Math Appl 1970, 6:76–90.

71. Fletcher R. A new approach to variable metric algorithms. Comput J 1970, 13:317–322.

72. Goldfarb D. A family of variable metric methods derived by variational means. Math Comput 1970, 24:23–26.


73. Shanno DF. Conditioning of quasi-Newton methods for functional minimization. Math Comput 1970, 24:647–657.

74. Bofill JM. Updated Hessian matrix and the restricted step method for locating transition structures. J Comput Chem 1994, 15:1–11.

75. Farkas O, Schlegel HB. Methods for optimizing large molecules. II. Quadratic search. J Chem Phys 1999, 111:10806–10814.

76. Bofill JM. Remarks on the updated Hessian matrix methods. Int J Quantum Chem 2003, 94:324–332.

77. Liang W, Wang H, Hung J, Li XS, Frisch MJ. Eigenspace update for molecular geometry optimization in nonredundant internal coordinates. J Chem Theory Comput 2010, 6:2034–2039.

78. Schlegel HB. Estimating the Hessian for gradient-type geometry optimizations. Theor Chim Acta 1984, 66:333–340.

79. Wittbrodt JM, Schlegel HB. Estimating stretching force constants for geometry optimization. THEOCHEM 1997, 398:55–61.

80. Fischer TH, Almlof J. General methods for geometry and wave-function optimization. J Phys Chem 1992, 96:9768–9774.

81. Nocedal J. Updating quasi-Newton matrices with limited storage. Math Comput 1980, 35:773–782.

82. Byrd RH, Lu PH, Nocedal J, Zhu CY. A limited memory algorithm for bound constrained optimization. SIAM J Sci Comput 1995, 16:1190–1208.

83. Liu DC, Nocedal J. On the limited memory BFGS method for large-scale optimization. Math Program 1989, 45:503–528.

84. Biegler LT, Nocedal J, Schmid C. A reduced Hessian method for large-scale constrained optimization. SIAM J Optim 1995, 5:314–347.

85. Morales JL, Nocedal J. Enriched methods for large-scale unconstrained optimization. Comput Optim Appl 2002, 21:143–154.

86. Fletcher R. An optimal positive definite update for sparse Hessian matrices. SIAM J Optim 1995, 5:192–218.

87. Byrd RH, Nocedal J, Schnabel RB. Representations of quasi-Newton matrices and their use in limited memory methods. Math Program 1994, 63:129–156.

88. Anglada JM, Besalu E, Bofill JM. Remarks on large-scale matrix diagonalization using a Lagrange-Newton-Raphson minimization in a subspace. Theor Chem Acc 1999, 103:163–166.

89. Prat-Resina X, Garcia-Viloca M, Monard G, Gonzalez-Lafont A, Lluch JM, Bofill JM, Anglada JM. The search for stationary points on a quantum mechanical/molecular mechanical potential energy surface. Theor Chem Acc 2002, 107:147–153.

90. Fletcher R, Reeves CM. Function minimization by conjugate gradients. Comput J 1964, 7:149–154.

91. Polak E. Computational Methods in Optimization: a Unified Approach. New York: Academic; 1971.

92. Csaszar P, Pulay P. Geometry optimization by direct inversion in the iterative subspace. J Mol Struct 1984, 114:31–34.

93. Farkas O. PhD (Csc) thesis. Budapest: Eotvos Lorand University and Hungarian Academy of Sciences; 1995.

94. Farkas O, Schlegel HB. Methods for optimizing large molecules. III. An improved algorithm for geometry optimization using direct inversion in the iterative subspace (GDIIS). Phys Chem Chem Phys 2002, 4:11–15.

95. Li XS, Frisch MJ. Energy-represented direct inversion in the iterative subspace within a hybrid geometry optimization method. J Chem Theory Comput 2006, 2:835–839.

96. Saad Y, Schultz MH. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J Sci Stat Comput 1986, 7:856–869.

97. Krylov AN. On the numerical solution of the equation by which in technical questions frequencies of small oscillations of material systems are determined. Izvestiya Akademii Nauk SSSR, Otdel Mat i Estest Nauk 1931, 7:491–539.

98. Lanczos C. Solution of systems of linear equations by minimized iterations. J Res Nat Bur Stand 1952, 49:33–53.

99. Arnoldi WE. The principle of minimized iterations in the solution of the matrix eigenvalue problem. Q Appl Math 1951, 9:17–29.

100. Schlegel HB. Optimization of equilibrium geometries and transition structures. J Comput Chem 1982, 3:214–218.

101. Cerjan CJ, Miller WH. On finding transition states. J Chem Phys 1981, 75:2800–2806.

102. Simons J, Jorgensen P, Taylor H, Ozment J. Walking on potential energy surfaces. J Phys Chem 1983, 87:2745–2753.

103. Nichols J, Taylor H, Schmidt P, Simons J. Walking on potential energy surfaces. J Chem Phys 1990, 92:340–346.

104. Simons J, Nichols J. Strategies for walking on potential energy surfaces using local quadratic approximations. Int J Quantum Chem 1990, 38(suppl 24):263–276.

105. Banerjee A, Adams N, Simons J, Shepard R. Search for stationary points on surface. J Phys Chem 1985, 89:52–57.

106. Baker J. An algorithm for the location of transition states. J Comput Chem 1986, 7:385–395.


107. Culot P, Dive G, Nguyen VH, Ghuysen JM. A quasi-Newton algorithm for first-order saddle-point location. Theor Chim Acta 1992, 82:189–205.

108. Besalu E, Bofill JM. On the automatic restricted-step rational-function-optimization method. Theor Chem Acc 1998, 100:265–274.

109. Anglada JM, Bofill JM. On the restricted step method coupled with the augmented Hessian for the search of stationary points of any continuous function. Int J Quantum Chem 1997, 62:153–165.

110. Bofill JM, Anglada JM. Finding transition states using reduced potential energy surfaces. Theor Chem Acc 2001, 105:463–472.

111. Burger SK, Ayers PW. Dual grid methods for finding the reaction path on reduced potential energy surfaces. J Chem Theory Comput 2010, 6:1490–1497.

112. Shaik SS, Schlegel HB, Wolfe S. Theoretical Aspects of Physical Organic Chemistry: the SN2 Mechanism. New York: John Wiley & Sons; 1992.

113. Sonnenberg JL, Wong KF, Voth GA, Schlegel HB. Distributed Gaussian valence bond surface derived from ab initio calculations. J Chem Theory Comput 2009, 5:949–961.

114. Hammond GS. A correlation of reaction rates. J Am Chem Soc 1955, 77:334–338.

115. Woodward RB, Hoffmann R. The Conservation of Orbital Symmetry. Weinheim/Bergstr.: Verlag Chemie; 1970.

116. Peng CY, Schlegel HB. Combining synchronous transit and quasi-Newton methods to find transition states. Isr J Chem 1993, 33:449–454.

117. Warshel A, Weiss RM. An empirical valence bond approach for comparing reactions in solutions and in enzymes. J Am Chem Soc 1980, 102:6218–6226.

118. Jensen F. Transition structure modeling by intersecting potential energy surfaces. J Comput Chem 1994, 15:1199–1216.

119. Ruedenberg K, Sun JQ. A simple prediction of approximate transition states on potential energy surfaces. J Chem Phys 1994, 101:2168–2174.

120. Anglada JM, Besalu E, Bofill JM, Crehuet R. Prediction of approximate transition states by Bell–Evans–Polanyi principle: I. J Comput Chem 1999, 20:1112–1129.

121. Halgren T, Lipscomb WN. The synchronous-transit method for determining reaction pathways and locating molecular transition states. Chem Phys Lett 1977, 49:225–232.

122. Burkert U, Allinger NL. Pitfalls in the use of the torsion angle driving method for the calculation of conformational interconversions. J Comput Chem 1982, 3:40–46.

123. Williams IH, Maggiora GM. Use and abuse of the distinguished coordinate method for transition state structure searching. THEOCHEM 1982, 6:365–378.

124. Scharfenberg P. Theoretical analysis of constrained minimum energy paths. Chem Phys Lett 1981, 79:115–117.

125. Rothman MJ, Lohr LL. Analysis of an energy minimization method for locating transition states on potential energy hypersurfaces. Chem Phys Lett 1980, 70:405–409.

126. Cardenas-Lailhacar C, Zerner MC. Searching for transition states: the line-then-plane (LTP) approach. Int J Quantum Chem 1995, 55:429–439.

127. Quapp W, Hirsch M, Imig O, Heidrich D. Searching for saddle points of potential energy surfaces by following a reduced gradient. J Comput Chem 1998, 19:1087–1100.

128. Hirsch M, Quapp W. Improved RGF method to find saddle points. J Comput Chem 2002, 23:887–894.

129. Crehuet R, Bofill JM, Anglada JM. A new look at the reduced-gradient-following path. Theor Chem Acc 2002, 107:130–139.

130. Quapp W. Newton trajectories in the curvilinear metric of internal coordinates. J Math Chem 2004, 36:365–379.

131. Quapp W. A growing string method for the reaction pathway defined by a Newton trajectory. J Chem Phys 2005, 122:174106.

132. Quapp W. The growing string method for flows of Newton trajectories by a second order method. J Theor Comput Chem 2009, 8:101–117.

133. Peters B, Heyden A, Bell AT, Chakraborty A. A growing string method for determining transition states: comparison to the nudged elastic band and string methods. J Chem Phys 2004, 120:7877–7886.

134. Goodrow A, Bell AT, Head-Gordon M. Development and application of a hybrid method involving interpolation and ab initio calculations for the determination of transition states. J Chem Phys 2008, 129:174109.

135. Goodrow A, Bell AT, Head-Gordon M. Transition state-finding strategies for use with the growing string method. J Chem Phys 2009, 130:244108.

136. Goodrow A, Bell AT, Head-Gordon M. A strategy for obtaining a more accurate transition state estimate using the growing string method. Chem Phys Lett 2010, 484:392–398.

137. Henkelman G, Jonsson H. A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives. J Chem Phys 1999, 111:7010–7022.

138. Kaestner J, Sherwood P. Superlinearly converging dimer method for transition state search. J Chem Phys 2008, 128:014106.

139. Shang C, Liu ZP. Constrained Broyden minimization combined with the dimer method for locating transition state of complex reactions. J Chem Theory Comput 2010, 6:1136–1144.

140. Heyden A, Bell AT, Keil FJ. Efficient methods for finding transition states in chemical reactions: comparison of improved dimer method and partitioned rational function optimization method. J Chem Phys 2005, 123:224101.

141. Pancir J. Calculation of the least energy path on the energy hypersurface. Collect Czech Chem Commun 1975, 40:1112–1118.

142. Basilevsky MV, Shamov AG. The local definition of the optimum ascent path on a multidimensional potential energy surface and its practical application for the location of saddle points. Chem Phys 1981, 60:347–358.

143. Hoffman DK, Nord RS, Ruedenberg K. Gradient extremals. Theor Chim Acta 1986, 69:265–279.

144. Jorgensen P, Jensen HJA, Helgaker T. A gradient extremal walking algorithm. Theor Chim Acta 1988, 73:55–65.

145. Schlegel HB. Following gradient extremal paths. Theor Chim Acta 1992, 83:15–20.

146. Sun JQ, Ruedenberg K. Gradient extremals and steepest descent lines on potential energy surfaces. J Chem Phys 1993, 98:9707–9714.

147. Bondensgard K, Jensen F. Gradient extremal bifurcation and turning points: an application to the H2CO potential energy surface. J Chem Phys 1996, 104:8025–8031.

148. Smith CM. How to find a saddle point. Int J Quantum Chem 1990, 37:773–783.

149. Helgaker T. Transition state optimizations by trust-region image minimization. Chem Phys Lett 1991, 182:503–510.

150. Sun JQ, Ruedenberg K. Locating transition states by quadratic image gradient descent on potential energy surfaces. J Chem Phys 1994, 101:2157–2167.

151. McIver JW, Komornicki A. Structure of transition states in organic reactions. General theory and an application to the cyclobutene-butadiene isomerization using a semiempirical molecular orbital method. J Am Chem Soc 1972, 94:2625–2633.

152. Poppinger D. Calculation of transition states. Chem Phys Lett 1975, 35:550–554.

153. Komornicki A, Ishida K, Morokuma K, Ditchfield R, Conrad M. Efficient determination and characterization of transition states using ab initio methods. Chem Phys Lett 1977, 45:595–602.

154. Mueller K, Brown LD. Location of saddle points and minimum energy paths by a constrained simplex optimization procedure. Theor Chim Acta 1979, 53:75–93.

155. Dewar MJS, Healy EF, Stewart JJP. Location of transition states in reaction mechanisms. J Chem Soc, Faraday Trans II 1984, 80:227–233.

156. Ionova IV, Carter EA. Ridge method for finding saddle-points on potential energy surfaces. J Chem Phys 1993, 98:6377–6386.

157. Ionova IV, Carter EA. Direct inversion in the iterative subspace-induced acceleration of the ridge method for finding transition states. J Chem Phys 1995, 103:5437–5441.

158. Elber R, Karplus M. A method for determining reaction paths in large molecules: application to myoglobin. Chem Phys Lett 1987, 139:375–380.

159. Quapp W. Chemical reaction paths and calculus of variations. Theor Chem Acc 2008, 121:227–237.

160. Choi C, Elber R. Reaction path study of helix formation in tetrapeptides: effect of side chains. J Chem Phys 1991, 94:751–760.

161. Jonsson H, Mills G, Jacobsen W. Nudged elastic band method for finding minimum energy paths of transitions. In: Berne BJ, Cicotti G, Coker DF, eds. Classical and Quantum Dynamics in Condensed Phase Simulations. Singapore: World Scientific; 1998, 385.

162. Henkelman G, Jonsson H. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. J Chem Phys 2000, 113:9978–9985.

163. Henkelman G, Uberuaga BP, Jonsson H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J Chem Phys 2000, 113:9901–9904.

164. Trygubenko SA, Wales DJ. A doubly nudged elastic band method for finding transition states. J Chem Phys 2004, 120:2082–2094.

165. Chu JW, Trout BL, Brooks BR. A super-linear minimization scheme for the nudged elastic band method. J Chem Phys 2003, 119:12708–12717.

166. Maragakis P, Andreev SA, Brumer Y, Reichman DR, Kaxiras E. Adaptive nudged elastic band approach for transition state calculation. J Chem Phys 2002, 117:4651–4658.

167. Galvan IF, Field MJ. Improving the efficiency of the NEB reaction path finding algorithm. J Comput Chem 2008, 29:139–143.

168. Sheppard D, Terrell R, Henkelman G. Optimization methods for finding minimum energy paths. J Chem Phys 2008, 128:134106.

169. Alfonso DR, Jordan KD. A flexible nudged elastic band program for optimization of minimum energy pathways using ab initio electronic structure methods. J Comput Chem 2003, 24:990–996.

170. Gonzalez-Garcia N, Pu JZ, Gonzalez-Lafont A, Lluch JM, Truhlar DG. Searching for saddle points by using the nudged elastic band method: an implementation for gas-phase systems. J Chem Theory Comput 2006, 2:895–904.

171. E W, Ren WQ, Vanden-Eijnden E. String method for the study of rare events. Phys Rev B 2002, 66:052301.

172. E W, Ren WQ, Vanden-Eijnden E. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J Chem Phys 2007, 126:164103.

173. Ayala PY, Schlegel HB. A combined method for determining reaction paths, minima, and transition state geometries. J Chem Phys 1997, 107:375–384.

174. Burger SK, Yang WT. Quadratic string method for determining the minimum-energy path based on multiobjective optimization. J Chem Phys 2006, 124:054109.

175. Burger SK, Yang W. Sequential quadratic programming method for determining the minimum energy path. J Chem Phys 2007, 127:164107.

176. Koslover EF, Wales DJ. Comparison of double-ended transition state search methods. J Chem Phys 2007, 127:134102.

177. Truhlar DG, Garrett BC. Variational transition state theory. Annu Rev Phys Chem 1984, 35:159–189.

178. Miller WH, Handy NC, Adams JE. Reaction path Hamiltonian for polyatomic molecules. J Chem Phys 1980, 72:99–112.

179. Fukui K. The path of chemical reactions—the IRC approach. Acc Chem Res 1981, 14:363–368.

180. McKee ML, Page M. Computing reaction pathways on molecular potential energy surfaces. In: Lipkowitz KB, Boyd DB, eds. Reviews in Computational Chemistry. Vol 4. New York: VCH; 1993, 35–65.

181. Garrett BC, Redmon MJ, Steckler R, Truhlar DG, Baldridge KK, Bartol D, Schmidt MW, Gordon MS. Algorithms and accuracy requirements for computing reaction paths by the method of steepest descent. J Phys Chem 1988, 92:1476–1488.

182. Gear CW. Numerical Initial Value Problems in Ordinary Differential Equations. Englewood Cliffs, NJ: Prentice-Hall; 1971.

183. Ishida K, Morokuma K, Komornicki A. The intrinsic reaction coordinate. An ab initio calculation for HNC → HCN and H− + CH4 → CH4 + H−. J Chem Phys 1977, 66:2153–2156.

184. Page M, Doubleday C, McIver JW. Following steepest descent reaction paths. The use of higher energy derivatives with ab initio electronic structure methods. J Chem Phys 1990, 93:5634–5642.

185. Page M, McIver JW. On evaluating the reaction path Hamiltonian. J Chem Phys 1988, 88:922–935.

186. Sun JQ, Ruedenberg K. Quadratic steepest descent on potential energy surfaces. 1. Basic formalism and quantitative assessment. J Chem Phys 1993, 99:5257–5268.

187. Sun JQ, Ruedenberg K. Quadratic steepest descent on potential energy surfaces. 2. Reaction path following without analytical Hessians. J Chem Phys 1993, 99:5269–5275.

188. Eckert F, Werner HJ. Reaction path following by quadratic steepest descent. Theor Chem Acc 1998, 100:21–30.

189. Gonzalez C, Schlegel HB. An improved algorithm for reaction path following. J Chem Phys 1989, 90:2154–2161.

190. Gonzalez C, Schlegel HB. Reaction path following in mass-weighted internal coordinates. J Phys Chem 1990, 94:5523–5527.

191. Burger SK, Yang WT. Automatic integration of the reaction path using diagonally implicit Runge-Kutta methods. J Chem Phys 2006, 125:244108.

192. Burger SK, Yang WT. A combined explicit-implicit method for high accuracy reaction path integration. J Chem Phys 2006, 124:224102.

193. Stewart JJP, Davis LP, Burggraf LW. Semiempirical calculations of molecular trajectories: method and applications to some simple molecular systems. J Comput Chem 1987, 8:1117–1123.

194. Hratchian HP, Schlegel HB. Following reaction pathways using a damped classical trajectory algorithm. J Phys Chem A 2002, 106:165–169.

195. Hratchian HP, Schlegel HB. Accurate reaction paths using a Hessian based predictor-corrector integrator. J Chem Phys 2004, 120:9918–9924.

196. Hratchian HP, Schlegel HB. Using Hessian updating to increase the efficiency of a Hessian based predictor-corrector reaction path following method. J Chem Theory Comput 2005, 1:61–69.

197. Malick DK, Petersson GA, Montgomery JA. Transition states for chemical reactions. I. Geometry and classical barrier height. J Chem Phys 1998, 108:5704–5713.

198. Olender R, Elber R. Yet another look at the steepest descent path. THEOCHEM 1997, 398:63–71.

199. Vanden-Eijnden E, Heymann M. The geometric minimum action method for computing minimum energy paths. J Chem Phys 2008, 128:061103.

200. Aguilar-Mogas A, Crehuet R, Gimenez X, Bofill JM. Applications of analytic and geometry concepts of the theory of Calculus of Variations to the Intrinsic Reaction Coordinate model. Mol Phys 2007, 105:2475–2492.

201. Aguilar-Mogas A, Gimenez X, Bofill JM. Finding reaction paths using the potential energy as reaction coordinate. J Chem Phys 2008, 128:104102.
