amathematicalintroductiontohartree-fockscf ...formalism needed for the fundamental-state quantum...

arX

iv:0

705.

0337

v1 [

phys

ics.

chem

-ph]

2 M

ay 2

007

A mathematical introduction to Hartree-Fock SCF

methods in Quantum Chemistry

Pablo Echenique1,2∗ and J. L. Alonso1,2

1 Theoretical Physics Department, University of Zaragoza,

Pedro Cerbuna 12, 50009, Zaragoza, Spain.2 Institute for Biocomputation and Physics of Complex Systems (BIFI),

Edificio Cervantes, Corona de Aragon 42, 50009, Zaragoza, Spain.

October 31, 2018

It would indeed be remarkable if Naturefortified herself against further advances inknowledge behind the analytical difficultiesof the many-body problem.

— Max Born, 1960

Abstract

We present here an introduction to Hartree-Fock theory in Quantum Chem-istry. From the molecular Hamiltonian, using and discussing the Born-Oppen-heimer approximation, we arrive to the Hartree and Hartree-Fock equationsfor the electronic problem. Special emphasis is placed in the most relevantmathematical aspects of the theoretical derivation of the final equations, aswell as in the results regarding the existence and uniqueness of their solutions.Additionally, the discretization of the one-electron orbitals space is reviewedand the Roothaan-Hall formalism introduced. Finally, the basic underlyingconcepts related to the construction and selection of Gaussian basis sets arediscussed focusing in algorithmic efficiency issues. The whole work is inten-tionally introductory and rather self-contained, so that it may be useful fornon experts that aim to use quantum chemical methods in interdisciplinaryapplications. Moreover, much material that is found scattered in the litera-ture has been put together here to facilitate comprehension and to serve as ahandy reference.

∗E-mail address: [email protected] — Web page: http://www.pabloechenique.com

http://arxiv.org/abs/0705.0337v1

http://www.pabloechenique.com

1 Introduction

In the field of computer simulation of macromolecules, available potential energyfunctions are often not accurate enough to properly describe complex processes suchas the folding of proteins [1, 2, 3, 4, 5, 6]. In order to improve the situation, it isconvenient to extract ab initio information from quantum mechanical calculationswith the hope of being able to devise less computationally demanding methodsthat can be used to tackle large systems. In this spirit, the effective potential forthe nuclei calculated in the non-relativistic Born-Oppenheimer approximation istypically considered as a good reference to assess the accuracy of cheaper potentials[7, 8, 9, 10, 11, 12, 13]. The study of molecules at this level of theoretical detailand the design of computationally efficient approximations for solving the demandingequations that appear constitute the major part of the field called quantum chemistry[14, 15]. In this document, we voluntarily circumscribe ourselves to some of the basicformalism needed for the fundamental-state quantum chemical calculations that aretypically performed in this context. For more general accounts, we refer the readerto any of the wonderful accounts in refs. 16, 17, 18.

In sec. 2, we introduce the molecular Hamiltonian and a special set of units (theatomic ones) that are convenient to simplify the equations. In sec. 3, we present in anaxiomatic way the concepts and expressions related to the separation of the electronicand nuclear problems in the Born-Oppenheimer scheme. In sec. 4, we introducethe variational method that underlies the derivation of the basic equations of theHartree and Hartree-Fock approximations, discussed in sec. 6 and 7 respectively.The computational implementation of the Hartree-Fock approximation is tackledin sec. 8, where the celebrated Roothaan-Hall equations are derived and a briefintroduction to Gaussian basis sets is made in sec. 9. Finally, in sec. 10, the Møller-Plesset 2 (MP2) theory to incorporate correlation to the Hartree-Fock results isdiscussed.

2 Molecular Hamiltonian and atomic unitsGetting rid of constants and powers of ten

Since 1960, the international scientific community has agreed on an ‘official’ set ofbasic units for measurements: Le Systeme International d’Unites, or SI for short(see http:// www.bipm.org/en/si/ and ref. 19). The meter (m), the kilogram(kg), the second (s), the ampere (A), the kelvin (K), the mole (mol), the joule (J)and the pascal (Pa) are examples of SI units.

Sticking to the SI scheme, the non-relativistic quantum mechanical Hamiltonianoperator of a molecule consisting of NN nuclei (with atomic numbers Zα and massesMα, α = 1, . . . , NN) and N electrons (i.e., the molecular Hamiltonian) is expressedas1:

1 Note that the non-relativistic molecular Hamiltonian does not depend on spin-like variables.

2

H = −NN∑

α=1

~2

2Mα∇2

α −N∑

i=1

~2

2me∇2

i +1

2

∑

α6=β

(e2

4πǫ0

)ZαZβ

|~Rβ − ~Rα|

−N∑

i=1

NN∑

α=1

(e2

4πǫ0

)Zα

|~Rα − ~ri|+

1

2

∑

i 6=j

(e2

4πǫ0

)1

|~rj − ~ri|, (2.1)

where ~ stands for h/2π, being h Planck’s constant, me denotes the electron

mass, e the proton charge, ~ri the position of the i-th electron, ~Rα that of the α-thnucleus, ǫ0 the vacuum permittivity and ∇2

i the Laplacian operator with respect tothe coordinates of the i-th particle.

Although using a common set of units presents obvious communicative advan-tages, when circumscribed to a particular field of science, it is common to appeal tonon-SI units in order to simplify the most frequently used equations by getting ridof some constant factors that always appear grouped in the same ways and, thus,make the numerical values in any calculation of the order of unity. In the field ofquantum chemistry, atomic units (see table 1), proposed in ref. 20 and named inref. 21, are typically used. In these units, eq. (2.1) is substantially simplified to

H = −NN∑

α=1

1

2Mα

∇2α −

N∑

i=1

1

2∇2

i +1

2

∑

α6=β

ZαZβ

|~Rβ − ~Rα|

−N∑

i=1

NN∑

α=1

Zα

|~Rα − ~ri|+

1

2

∑

i 6=j

1

|~rj − ~ri|. (2.2)

Since all the relevant expressions in quantum chemistry are derived in one wayor another from the molecular Hamiltonian, the simplification brought up by theuse of atomic units propagates to the whole formalism. Consequently, they shall bethe choice all throughout this document.

Apart from the atomic units and the SI ones, there are some other miscellaneousunits that are often used in the literature: the angstrom, which is a unit of lengthdefined as 1 A= 10−10 m, and the units of energy cm−1 (which reminds about the

Unit of mass: mass of the electron = me = 9.1094 · 10−31 kg

Unit of charge: charge on the proton = e = 1.6022 · 10−19 C

Unit of length: 1 bohr = a0 =4πǫ0~2

mee2= 0.52918 A= 5.2918 · 10−11 m

Unit of energy: 1 hartree = ~2

mea20= 627.51 kcal/mol = 4.3597 · 10−18 J

Table 1: Atomic units up to five significant digits. Taken from the National Insti-tute of Standards and Technology (NIST) web page at http://physics.nist.gov/cuu/Constants/.

3

1 hartree 1 eV 1 kcal/mol 1 kJ/mol 1 cm−1

1 hartree 1 27.211 627.51 262.54 219470

1 eV 3.6750 · 10−2 1 23.061 96.483 8065.5

1 kcal/mol 1.5936 · 10−3 4.3363 · 10−2 1 4.1838 349.75

1 kJ/mol 3.8089 · 10−4 1.0364 · 10−2 2.3902 · 10−1 1 83.595

1 cm−1 4.5560 · 10−6 1.2398 · 10−4 2.8592 · 10−3 1.1962 · 10−2 1

Table 2: Energy units conversion factors to five significant digits. Taken from the Na-tional Institute of Standards and Technology (NIST) web page at http://physics.nist.gov/cuu/Constants/. The table must be read by rows. For example,the value 4.1838, in the third row, fourth column, indicates that 1 kcal/mol =4.1838 kJ/mol.

spectroscopic origins of quantum chemistry and, even, quantum mechanics), elec-tronvolt (eV), kilocalorie per mole (kcal/mol) and kilojoule per mole (kJ/mol). Thelast two are specially used in the field of macromolecular simulations and quantifythe energy of a mole of entities; for example, if one asserts that the torsional barrierheight for H2O2 is ∼ 7 kcal/mol, one is really saying that, in order to make a moleof H2O2 (i.e., NA ≃ 6.0221 · 1023 molecules) rotate 180o around the O–O bond, onemust spend ∼ 7 kcal. For the conversion factors between the different energy units,see table 2.

Finally, to close this section, we rewrite eq. (2.2) introducing some self-explanatorynotation that will be used in the subsequent discussion:

H = TN + Te + VNN + VeN + Vee , (2.3a)

TN := −NN∑

α=1

1

2Mα

∇2α , (2.3b)

Te := −N∑

i=1

1

2∇2

i , (2.3c)

VNN :=1

2

∑

α6=β

ZαZβ

Rαβ, (2.3d)

VeN := −N∑

i=1

NN∑

α=1

Zα

Rαi, (2.3e)

Vee :=1

2

∑

i 6=j

1

rij. (2.3f)

4

3 The Born-Oppenheimer approximationSeparating electrons from nuclei

To think of a macromolecule as a set of quantum objects described by a wavefunctionΨ(X1, . . . , XNN

, x1, . . . , xN) dependent on the spatial and spin2 degrees of freedom,

xi := (~ri, σi), of the electrons and on those of the nuclei, Xα := (~Rα,Σα), would betoo much for the imagination of physicists and chemists. All the language of chem-istry would have to be remade and simple sentences in textbooks, such as “rotationabout this single bond allows the molecule to avoid steric clashes between atoms”or even “a polymer is a long chain-like molecule composed of repeating monomerunits”, would have to be translated into long and counter-intuitive statements in-volving probability and ‘quantum jargon’. Conscious or not, we think of moleculesas classical objects.

More precisely, we are ready to accept that electrons are quantum (we know ofthe interference experiments, electrons are light, we are accustomed to draw atomic‘orbitals’, etc.), however, we are reluctant to concede the same status to nuclei.Nuclei are heavier than electrons (at least ∼ 2000 times heavier, in the case ofthe single proton nucleus of hydrogen) and we picture them in our imagination as‘classical things’ that move, bond to each other, rotate around bonds and are atprecise points at precise times. We imagine nuclei ‘slowly moving’ in the field of theelectrons, which, for each position of the first, immediately ‘adjust their quantumstate’.

The formalization of these ideas is called Born-Oppenheimer approximation [22,23] and the confirmation of its being good for many relevant problems is a fact thatsupports our intuitions about the topic and that lies at the foundations of the vastmajority of the images, the concepts and the research in quantum chemistry3.

Like any approximation, the Born-Oppenheimer one may be either derived fromthe exact problem (in this case, the entangled behaviour of electrons and nuclei as thesame quantum object) or simply proposed on the basis of physical intuition, and laterconfirmed to be good enough (or not) by comparison with the exact theory or withthe experiment. Of course, if it is possible, the first way should be preferred, since itallows to develop a deeper insight about the terms we are neglecting and the specificdetails that we will miss. However, although in virtually every quantum chemistrybook [17, 18, 25, 26, 27] hands-waving derivations up to different levels of detailare performed and the Born-Oppenheimer approximation is typically presented asunproblematic, it seems that the fine mathematical details on which these ‘standard’approaches are based are far from clear [28, 29, 30]. This state of affairs does notimply that the final equations that will need to be solved are ill-defined or that the

2 One convenient way of thinking about functions that depend on spin-like variables is as anm-tuple of ordinary R

3N functions, where m is the finite number of possible values of the spin. Inthe case of a one-particular wavefunction describing an electron, for example, σ can take two values(say, −1/2 and 1/2) in such a way that one may picture any general spin-orbital Ψi(x) as a 2-tuple(Φ

−1/2i (~r),Φ

1/2i (~r)

). Of course, another valid way of imagining Ψi(x) is simply as a function of

four variables, three real and one discrete.3 There are many phenomena, however, in which the Born-Oppenheimer approximation is

broken. For example, in striking a flint to create a spark, mechanical motion of the nuclei exciteselectrons into a plasma that then emits light [24].

5

numerical methods based on the theory are unstable; in fact, it is just the contrary(see the discussion below), because the problems are related only to the preciserelation between the concepts in the whole theory and those in its simplified version.Nevertheless, the many subtleties involved in a derivation of the Born-Oppenheimerapproximation scheme from the exact equations suggest that the second way betaken. Hence, in the following paragraphs, an axiomatic presentation of the mainexpressions, aimed mostly to fix the notation and to introduce the language, will beperformed.

First of all, if we examine the Hamiltonian operator in eq. (2.2), we see that theterm VNe prevents the problem from being separable in the nuclear and electroniccoordinates, i.e., if we define x := (x1, . . . , xN ) as the set of all electronic coordinates(spatial and spin-like) and do likewise with the nuclear coordinates X, the term VNe

prevents any wavefunction Ψ(X, x) solution of the time-independent Schrodingerequation,

H Ψ(X, x) =(TN + Te + VNN + VeN + Vee

)Ψ(X, x) = E Ψ(X, x) , (3.1)

from being written as a product, Ψ(X, x) = ΨN(X)Ψe(x), of an electronic wave-function and a nuclear one. If this were the case, the problem would still be difficult(because of the Coulomb terms VNN and Vee), but we would be able to focus on theelectrons and on the nuclei separately.

The starting point for the Born-Oppenheimer approximation consists in assumingthat a less strict separability is achieved, in such a way that, for a pair of suitablychosen ΨN(X) and Ψe(x; X), any wavefunction solution of eq. (3.1) (or at least thosein which we are interested; for example, the eigenstates corresponding to the lowestlying eigenvalues) can be expressed as

Ψ(X, x) = ΨN(X)Ψe(x; X) , (3.2)

where we have used a ‘;’ to separate the two sets of variables in the electronicpart of the wavefunction in order to indicate that, in what follows, it is convenientto use the image that ‘from the point of view of the electrons, the nuclear degreesof freedom are fixed’, so that the electronic wavefunction depends ‘parametrically’on them. In other words, that the X are not quantum variables in eq. (3.3) below.Of course, it is just a ‘semantic’ semicolon; if anyone feels uncomfortable about it,she may drop it and write a normal comma.

Notably, in ref. 31, Hunter showed that any solution of the Schrodinger equationcan in fact be written exactly in the form of eq. (3.2), and that the two functions,ΨN(X) and Ψe(x; X), into which Ψ(X, x) is split may be interpreted as marginal andconditional probability amplitudes respectively. However, despite the insight thatis gained from this treatment, it is of no practical value, since the knowledge of theexact solution Ψ(X, x) is required in order to compute ΨN(X) and Ψe(x; X).

In the Born-Oppenheimer scheme, an additional assumption is made in order toavoid this drawback: the equations obeyed by the electronic and nuclear parts of thewavefunction are supposed to be known. Hence, Ψe(x; X) is assumed to be a solutionof the time-independent clamped nuclei Schrodinger equation,

6

(Te + VeN(r; R) + Vee(r)

)Ψe(x; R) := He(R)Ψe(x; R) = Ee(R)Ψe(x; R) , (3.3)

where the electronic Hamiltonian operator He(R) and the electronic energy Ee(R)(both dependent on the nuclei positions) have been defined, and, since the nuclearspins do not enter the expression, we have explicitly indicated that Ψe dependsparametrically on R and not on X.

The common interpretation of the clamped nuclei equation is, as we have ad-vanced at the beginning of the section, that the nuclei are much ‘slower’ than theelectrons and, therefore, the latter can automatically adjust their quantum stateto the instantaneous positions of the former. Physically, eq. (3.3) is just the time-independent Schrodinger equation of N particles (the electrons) of mass me andcharge −e in the external electric field of NN point charges (the nuclei) of size eZα at

locations ~Rα. Mathematically, it is an eigenvalue problem that has been thoroughlystudied in the literature and whose properties are well-known [32, 33, 34, 35, 36, 37].In particular, it can be shown that, in the case of neutral or positively chargedmolecules (i.e., with Z :=

∑α Zα ≥ N), the clamped nuclei equation has an infinite

number of normalizable solutions in the discrete spectrum of He(R) (bound-states)for every value of R [38, 39].

These solutions must be regarded as the different electronic energy levels, anda further approximation that is typically made consists in, not only accepting thatthe electrons immediately ‘follow’ nuclear motion, but also that, for each value ofthe nuclear positions R, they are in the fundamental electronic state4, i.e., the onewith the lower Ee(R).

Consequently, we define

Eeffe (R) := E0

e (R) . (3.4)

to be the effective electronic field in which the nuclei move, in such a way that,once we have solved the problem in eq. (3.3) and know E0

e (R), the time-independentnuclear Schrodinger equation obeyed by ΨN(X) is:

(TN + VNN(R) + Eeff

e (R))ΨN(X) := HNΨN(X) = ENΨN(X) , (3.5)

where the effective nuclear Hamiltonian HN has been implicitly defined.Now, to close the section, we put together the main expressions of the Born-

Oppenheimer approximation for quick reference and we discuss them in some moredetail:

4 This is customarily assumed in the literature and it is supported by the general fact thatelectronic degrees of freedom are typically more difficult to excite than nuclear ones. Hence, inthe vast majority of the numerical implementations of the theory, only the fundamental electronicstate is sought. We will see this in the forecoming sections.

7

He(R)Ψe(x; R) :=(Te + VeN(R) + Vee

)Ψe(x; R) = Ee(R)Ψe(x; R) , (3.6a)

Eeffe (R) := E0

e (R) , (3.6b)

HNΨN(X) :=(TN + VNN(R) + Eeff

e (R))ΨN(X) = ENΨN(X) , (3.6c)

Ψ(x,X) ≃ Ψ0e(x; R)ΨN(X) , E ≃ EN . (3.6d)

To start, note that the above equations are written in the logical order in whichthey are imagined and used in any numerical calculation. First, we assume the nu-clei fixed at R and we (hopefully) solve the clamped nuclei electronic Schrodingerequation (eq. (3.6a)), obtaining the fundamental electronic state Ψ0

e(x,R) with itscorresponding energy E0

e (R). Next, we repeat this procedure for all possible values5

of R and end up with an hyper-surface E0e (R) in R-space. Finally, we add this func-

tion to the analytical and easily computable VNN(R) and find the effective potentialthat determines the nuclear motion:

V effN (R) := VNN (R) + E0

e (R) . (3.7)

It is, precisely, this effective potential that is called Potential Energy Surface(PES) (or, more generally, Potential Energy Hyper-Surface (PEHS)) in quantumchemistry and that is the central object through which scientists picture chemicalreactions or conformational changes of macromolecules [40]. In fact, the conceptis so appealing and the classical image so strong that, after ‘going quantum’, wecan ‘go classical’ back again and think of nuclei as perfectly classical particles thatmove in the classical potential V eff

N (R). In such a case, we would not have to solveeq. (3.6c) but, instead, integrate the Newtonian equations of motion. This is thebasic assumption of every typical force field used for molecular dynamics, such as theones in the popular CHARMM [41, 42], AMBER [43, 44, 45] or OPLS [46] packages.

Finally, we would like to remind the reader that, despite the hands-waving char-acter of the arguments presented, up to this point, every computational step has aclear description and eqs. (3.6a) through (3.6c) could be considered as definitions in-volving a certain degree of notational abuse. To assume that the quantities obtainedthrough this process are close to those that proceed from a rigorous solution of thetime independent Schrodinger equation (eq. (3.1)) is where the approximation re-ally lies. Hence, the more accurate eqs. (3.6d) are, the better the Born-Oppenheimerguess is, and, like any other one, if one does not trust in the heuristic grounds onwhich the final equations stand, they may be taken as axiomatic and judged a pos-teriori according to their results in particular cases6.

5 Of course, this cannot be done in practice. Due to the finite character of available computa-tional resources, what is customarily done is to define a ‘grid’ in R-space and compute E0

e (R) in afinite number of points.

6 Until now, two approximations have been done: the non-relativistic character of the objectsstudied and the Born-Oppenheimer approximation. In the forecoming, many more will be done.The a priori quantification of their goodness in large molecules is a formidable task and, despitethe efforts in this direction, in the end, the comparison with experimental data is the only soundmethod for validation.

8

In quantum chemistry, the Born-Oppenheimer approximation is assumed in agreat fraction of the studies and it allows the central concept of potential energysurface to be well-defined, apart from considerably simplifying the calculations. Thesame decision is taken in this document.

4 The variational methodLooking for the fundamental state

There exists a mathematically appealing way of deriving the time independentSchrodinger equation (eq. (3.1)) from an extremal principle. If we define the func-tional (see appendix A) that corresponds to the expected value of the energy,

F [Ψ] := 〈Ψ| H |Ψ〉 , (4.1)

and we restrict the search space to the normalized wavefunctions, the constrained-extremals problem that results can be solved via the Lagrange multipliers method(see appendix B) by constructing the associated functional F [Ψ], where we introducea Lagrange multiplier λ to force normalization:

F [Ψ] := F [Ψ] + λ(〈Ψ|Ψ〉 − 1

). (4.2)

If we now ask that the functional derivative of F [Ψ] with respect to the complexconjugate Ψ∗ of the wave function7 be zero, i.e., we look for the stationary pointsof F [Ψ] conditioned by 〈Ψ|Ψ〉 = 1, we obtain the eigenvalues equation for H, i.e.,the time-independent Schrodinger equation. Additionally, it can be shown, first,that, due to the self-adjointedness of H, the equation obtained from the stationaritycondition with respect to Ψ (not Ψ∗) is just the complex conjugate and adds no newinformation.

Moreover, one can see that the reverse implication is also true, so that, if a givennormalized wavefunction Ψ is a solution of the eigenvalue problem and belongs tothe discrete spectrum of H, then the functional in eq. (4.2) is stationary with respectto Ψ∗:

δF [Ψ]

δΨ∗= 0 ⇐⇒ H Ψ = −λΨ := E Ψ and 〈Ψ|Ψ〉 = 1 . (4.3)

This result, despite its conceptual interest, is of little practical use, because itdoes not indicate an operative way to solve the Schrodinger equation different fromthe ones that we already knew. The equivalence above simply illustrates that math-ematical variational principles are over-arching theoretical statements from whichthe differential equations that actually contain the details of physical systems canbe extracted. Nevertheless, using similar ideas, we will derive another simple theo-rem which is indeed powerfully practical: the Variational Theorem.

7 A function of a complex variable z (or, analogously, a functional on a space of complexfunctions) may be regarded as depending on two different sets of independent variables: eitherRe(z) and Im(z) or z and z∗. The choice frequently depending on technical issues.

9

Let {Ψn} be a basis of eigenstates of the Hamiltonian operator H and {En} theircorresponding eigenvalues. Since H is self-adjoint, the eigenstates Ψn can be chosento be orthonormal (i.e., 〈Ψm|Ψn〉 = δmn) and any normalized wavefunction Ψ in theHilbert space can be written as a linear combination of them8:

|Ψ〉 =∑

n

Cn|Ψn〉 provided that∑

n

|Cn|2 = 1 . (4.4)

If we now denote by E0 the lowest En (i.e., the energy of the fundamental state)9

and calculate the expected value of the energy on an arbitrary state Ψ such as theone in eq. (4.4), we obtain

〈Ψ| H |Ψ〉 =∑

m,n

C∗mCn〈Ψm| H |Ψn〉 =

∑

m,n

C∗mCnEn〈Ψm|Ψn〉

=∑

m,n

C∗mCnEnδmn =

∑

n

|Cn|2En ≥∑

n

|Cn|2E0 = E0 . (4.5)

This simple relation is the Variational Theorem and it states that any wavefunc-tion of the Hilbert space has an energy larger than the one of the fundamental state(the equality can only be achieved if Ψ = Ψ0). However trivial this fact may ap-pear, it allows a very fruitful ‘everything-goes’ strategy when trying to approximatethe fundamental state in a difficult problem. If one has a procedure for finding apromising guess wavefunction (called variational ansatz ), no matter how heuristic,semi-empirical or intuitive it may be, one may expect that the lower the correspond-ing energy, the closer to the fundamental state it is10. This provides a systematicstrategy for improving the test wavefunction which may take a number of particularforms.

One example of the application of the Variational Theorem is to propose a fam-ily of normalized wavefunctions Ψθ parametrically dependent on a number θ andcalculate the θ-dependent expected value of the energy:

E(θ) := 〈Ψθ| H |Ψθ〉 . (4.6)

Then, one may use the typical tools of one-variable calculus to find the minimumof E(θ) and thus make the best guess of the energy E0 constrained to the family Ψθ.If the ansatz is cleverly chosen, this estimate could be rather accurate, however, for

8 We assume here, for the sake of simplicity and in order to highlight the relevant concepts,that H has only discrete spectrum. The ideas involved in a general derivation are the same butthe technical details and the notation are more complicated [47].

9 Its existence is not guaranteed: it depends on the particular potential in H . However, for thephysically relevant cases, there is indeed a minimum energy in the set {En}.

10 Of course, this not necessarily so (and, in any case, it depends on the definition of ‘closer’),since it could happen that the 〈Ψ|H |Ψ〉 landscape in the constrained subset of the Hilbert space inwhich the search is performed be ‘rugged’. In such a case, we may have very different wavefunctions(say, in the sense of the L2-norm) with similar energies 〈Ψ|H|Ψ〉. The only ‘direction’ in whichone can be sure that the situation improves when using the variational procedure is the (veryimportant) energetic one. That one is also moving towards better values of any other observableis, in general, no more than a bona fide assumption.

10

large systems that lack symmetry, it is very difficult to write a good enough formfor Ψθ.

When dealing with a large number of particles, there exists another protocolbased on the Variational Theorem that will permit us to derive the Hartree andHartree-Fock equations for the electronic wavefunction Ψe (see secs. 6 and 7, re-spectively). The first step is to devise a restricted way to express Ψe in termsof one-electron wavefunctions, also called orbitals and denoted by {ψa(x)}11, thusreducing the search space to a (typically small) subset of the whole Hilbert space:

Ψe(x1, . . . , xN) = f({ψa(xi)}

). (4.7)

The second step consists in establishing a (possibly infinite) number of constraintson the one-electron functions12,

Lk

({ψa(xi)}

)= 0 . (4.8)

With these two ingredients, we can now write the Lagrange functional that de-scribes the constrained problem in terms of the orbitals ψa (see eq. (4.2)):

F[{ψa}

]=⟨f({ψa(xi)}

) ∣∣∣ H∣∣∣ f({ψa(xi)}

)⟩+∑

k

λkLk

({ψa(xi)}

). (4.9)

Finally, we take the derivatives of F [{ψa}] with respect to every ψa(x) (normally,with respect to the complex conjugate ψ∗

a(x), see footnote 7) and we ask each oneto be zero (see appendix A). This produces the final equations that must be solvedin order to find the stationary one-electron orbitals.

Of course, these final equations may have multiple solutions. In the cases treatedin this document, there exist procedures to check that a particular solution (foundcomputationally) is, not only stationary, but also minimal [48]. However, to assurethat it is, not only locally minimal, but also globally (i.e., that is optimal), could be,in general, as difficult as for any other multi-dimensional optimization problem [49,50, 51]. In the Hartree and Hartree-Fock cases, discussed in secs. 6 and 7 respectively,the aufbau principle and a clever choice of the starting guess are particular techniquesintended to alleviate this problem.

5 Statement of the problemSolve the electronic Hamiltonian

Assuming the Born-Oppenheimer approximation (see sec. 3 and eqs. (3.6)), thecentral problem that one must solve in quantum chemistry is to find the fundamental

11 In principle, there could be more orbitals than electrons, however, in both the Hartree andHartree-Fock applications of this formalism, the index a runs, just like i, from 1 to N .

12 Actually, both restrictions (the one at the level of the total wavefunction in eq. (4.7) andthe one involving the one-particle ones in eq. (4.8)) are simply constraints (see appendix B). Thedistinction is not fundamental but operative, and it also helps us to devise variational ansatzsseparating the two conceptual playgrounds.

11

state of the electronic Hamiltonian for a fixed position R of the nuclei13:

H := T + VeN + Vee := −N∑

i=1

1

2∇2

i −N∑

i=1

NN∑

α=1

Zα

Rαi

+1

2

∑

i 6=j

1

rij. (5.1)

As already remarked in sec. 3, this problem is well posed for neutral and positivelycharged molecules, and, in the same way in which the term VeN prevented the totalwavefunction to be a product of an electronic and a nuclear part, the term Veein the expression above breaks the separability in the one-electron variables xi ofthe electronic time-independent Schrodinger equation associated to H . Hence, ageneral solution Ψ(x) cannot be a product of orbitals and the search must be apriori performed in the whole Hilbert space. However, this is a much too big placeto look for Ψ(x), since the computational requirements to solve the Schrodingerequation grow exponentially on the number of electrons.

Partially recognizing this situation, in the first days of quantum mechanics, Diracwrote that,

The underlying physical laws necessary for the mathematical theory ofa large part of physics and the whole of chemistry are thus completelyknown, and the difficulty is only that the exact application of these equa-tions leads to equations much too complicated to be soluble. It thereforebecomes desirable that approximate practical methods of applying quan-tum mechanics should be developed, which can lead to an explanationof the main features of complex atomic systems without too much com-putation [47].

The description of the most popular approximate methods, which the great physi-cist envisaged to be necessary, will be the objective of the following sections. Twobasic points responsible of the relative success of such an enterprise are the severereduction of the space in where the fundamental state is sought (which, of course,leads to only an approximation of it) and the availability of computers unimaginablyfaster than anything that could be foreseen in times of Dirac.

6 The Hartree approximationDistinguishable spinless independent electrons

One of the first and most simple approximations aimed to solve the problem posedin the previous section is due to Hartree in 1927 [20] (although the way in whichthe Hartree equations will be derived here, using the Variational Theorem, is dueto Slater [52]). In this approximation, the total wavefunction is constrained to bea product (typically referred to as Hartree product) of N one-electron orbitals (see

13 Since, from now on, we will only be dealing with the ‘electronic problem’, the notation hasbeen made simpler by dropping superfluous subindices e where there is no possible ambiguity. Asa consequence, for example, the electronic Hamiltonian is now denoted by H , the electronic kineticenergy by T and the electronic wavefunction by Ψ(x) (dropping the parametric dependence on Rin the same spirit).

12

eq. (4.7)), where the spin of the electrons and the antisymmetry (i.e., the Pauliexclusion principle) are not taken into account14:

Φ(~r1, . . . , ~rN) =N∏

i=1

φi(~ri) , (6.1)

where the a index in the orbitals has been substituted by i due to the fact thateach function is paired to a specific set of electron coordinates, consequently beingthe same number of both of them.

Also, the additional requirement that the one-particle wavefunctions be normal-ized is imposed (see eq. (4.8)):

〈φi|φi〉 = 1 , i = 1, . . . , N . (6.2)

With these two ingredients, we can construct the auxiliary functional whosezero-derivative condition produces the solution of the constrained stationary pointsproblem (see eq. (4.9)). To this effect, we introduce N Lagrange multipliers λi thatforce the normalization constraints15:

F[{φi}

]=

⟨N∏

i=1

φi(~ri)

∣∣∣∣∣ H∣∣∣∣∣

N∏

i=1

φi(~ri)

⟩+

N∑

i=1

λi

(〈φi|φi〉 − 1

). (6.3)

This functional may be considered to depend on 2N independent functions: theN one-electron φi and their N complex conjugates (see footnote 7). The Hartree

equations are obtained by imposing that the functional derivative of F with respectto φ∗

k be zero for k = 1, . . . , N . In order to obtain them and as an appetizer for theslightly more complicated process in the more used Hartree-Fock approximation, thefunctional derivative will be here computed in detail following the steps indicated inappendix A.

First, we write out16 the first term in the right-hand side of eq. (6.3):

⟨∏

i

φi(~ri)

∣∣∣∣∣ H∣∣∣∣∣∏

i

φi(~ri)

⟩=

− 1

2

∑

i

(∏

j 6=i

〈φj|φj〉)∫

φ∗i (~r)∇2φi(~r)d~r

−∑

i

(∏

j 6=i

〈φj|φj〉)∫

φ∗i (~r)φi(~r)

(NN∑

A=1

ZA

|~r − ~RA|

)d~r

14 We shall denote with capital Greek letters the wavefunctions depending on all the electronicvariables, and with lowercase Greek letters the one-electron orbitals. In addition, by Ψ (or ψ), weshall indicate wavefunctions containing spin part (called spin-orbitals) and, by Φ (or φ), those thatdepend only on spatial variables.

15 Note that the normalization of the total wavefunction is a consequence of the normalizationof the one-electron ones and needs not to be explicitly asked.

16 The limits in sums and products are dropped if there is no possible ambiguity.

13

+1

2

∑

i

∑

j 6=i

(∏

k 6=i,j

〈φk|φk〉)∫∫

φ∗i (~r)φi(~r)φ

∗j(~r

′)φj(~r′)

|~r − ~r ′| d~r ′d~r , (6.4)

where d~r denotes the Euclidean R3 volume element dx dy dz.

Now, we realize that the products outside the integrals can be dropped using theconstraints in eq. (6.2) (see the last paragraphs of appendix B for a justification thatthis can be done before taking the derivative). Then, using the previous expressionand conveniently rearranging the order of the integrals and sums, we write out thefirst term in the numerator of the left-hand side of eq. (A.1) that corresponds to aninfinitesimal variation of the function φ∗

k:

F[φ∗k + εδφ∗

k

]:=

F[φ1, φ

∗1, . . . , φk, φ

∗k + εδφ∗

k, . . . , φN , φ∗N

]=

− 1

2

∑

i

∫φ∗i (~r)∇2φi(~r) d~r −

∑

i

∫ ∣∣φi(~r)∣∣2(

NN∑

A=1

ZA

|~r − ~RA|

)d~r

+1

2

∑

i

∫ ∣∣φi(~r)∣∣2(∫ ∑

j 6=i

∣∣φj(~r′)∣∣2

|~r − ~r ′| d~r ′

)d~r +

∑

i

λi

(∫ ∣∣φi(~r)∣∣2d~r − 1

)

− 1

2ε

∫δφ∗

k(~r)∇2φk(~r) d~r − ε

∫δφ∗

k(~r)φk(~r)

(NN∑

A=1

ZA

|~r − ~RA|

)d~r

+ ε

∫δφ∗

k(~r)φk(~r)

(∫ ∑i 6=k

∣∣φi(~r′)∣∣2

|~r − ~r ′| d~r ′

)d~r + ελk

∫δφ∗

k(~r)φk(~r) d~r . (6.5)

We subtract from this expression the quantity F[{φi(~ri)}

], so that the first four

terms cancel, and we can write

limε→0

F[φ∗k + εδφ∗

k

]− F

[φ∗k

]

ε=

∫ [− 1

2∇2φk(~r)−

(NN∑

A=1

ZA

|~r − ~RA|

)φk(~r)

+

(∫ ∑i 6=k |φi(~r

′)|2|~r − ~r ′| d~r ′

)φk(~r) + λkφk(~r)

]δφ∗

k(~r) d~r . (6.6)

Now, by simple inspection of the right-hand side, we see that the functionalderivative (see eq. (A.1)) is the part enclosed by square brackets:

δF[{φi}

]

δφ∗k

=

(−1

2∇2 + Ve(~r) + V k

e (~r) + λk

)φk(~r) , (6.7)

where the nuclear potential energy and the electronic potential energy have beenrespectively defined as17

17 Compare the notation with the one in eqs. (2.3), here a subindex e has been dropped todistinguish the new objects defined.

14

VN(~r) := −NN∑

A=1

ZA

|~r − ~RA|, (6.8a)

V ke (~r) :=

∫ ∑i 6=k |φi(~r

′)|2|~r − ~r ′| d~r ′ . (6.8b)

Finally, if we ask the functional derivative to be zero for k = 1, . . . , N and wedefine εk := −λk, we arrive to the equations that the stationary points must satisfy,the Hartree equations:

Hk[φ]φk(~r) :=

(−1

2∇2 + VN(~r) + V k

e (~r)

)φk(~r) = εk φk(~r) , k = 1, . . . , N . (6.9)

Let us note that, despite the fact that the object Hk[φ] defined above is nota operator strictly speaking, since, as the notation emphasizes, it depends on theorbitals φi 6=k, we will stick to the name Hartree operator for it, in order to beconsistent with most of the literature.

Now, some remarks related to the Hartree equations are worth making. First, itcan be shown that, if the variational ansatz in eq. (6.1) included the spin degrees offreedom of the electrons, all the expressions above would be kept, simply changingthe orbitals φi(~ri) by the spin-orbitals ψi(~ri, σi).

Secondly, and moving into more conceptual playgrounds, we note that the specialstructure of V k

e (~r) in eq. (6.8b) makes it mandatory to interpret the Hartree schemeas one in which each electron ‘feels’ only the average effect of the rest. In fact, ifthe quantum charge density ρi(~r) := |φi(~r)|2 is regarded for a moment as a classicalcontinuum distribution, then the potential produced by all the electrons but the k-this precisely the one in eq. (6.8b). Supporting this image, note also the fact that, ifwe write the joint probability density of electron 1 being at the point ~r1, electron 2being at the point ~r2 and so on (simply squaring eq. (6.1)),

ρ(~r1, . . . , ~rN) :=∣∣Φ(~r1, . . . , ~rN)

∣∣ 2 =N∏

i=1

∣∣φi(~ri)∣∣ 2 =

N∏

i=1

ρi(~ri) , (6.10)

we see that, in a probabilistic sense, the electrons are independent (they couldnot be independent in a physical, complete sense, since we have already said thatthey ‘see’ each other in an average way).

Anyway, despite these appealing images and also despite the fact that, disguisedunder the misleading (albeit common) notation, these equations seem ‘one-particle’,they are rather complicated from a mathematical point of view. On the one hand, itis true that, whereas the original electronic Schrodinger equation in (3.6a) dependedon 3N spatial variables, the expressions above only depend on 3. This is what wehave gained from drastically reducing the search space to the set of Hartree productsin eq. (6.1) and what renders the approximation tractable. On the other hand,however, we have paid the price of greatly increasing the mathematical complexityof the expressions, so that, while the electronic Schrodinger equation was one linear

15

differential equation, the Hartree ones in (6.9) are N coupled non-linear integro-differential equations [53].

This complexity precludes any analytical approach to the problem and forces usto look for the solutions using less reliable iterative methods. Typically, in compu-tational studies, one proposes a starting guess for the set of N orbitals {φ0

k}; withthem, the Hartree operator Hk[φ

0] in the left-hand side of eq. (6.9) is constructedfor every k and the N equations are solved as simple eigenvalue problems. For eachk, the φ1

k that corresponds to the lowest ε1k is selected and a new Hartree operatorHk[φ

1] is constructed with the {φ1k}. The process is iterated until (hopefully) the n-

th set of solutions {φnk} differ from the (n− 1)-th one {φn−1

k } less than a reasonablysmall amount.

Many technical issues exist that raise doubts about the possible success of such anapproach. The most important ones being related to the fact that a proper definitionof the Hartree problem should be: find the global minimum of the energy functional〈Φ| H |Φ〉 under the constraint that the wavefunction Φ be a Hartree product andnot: solve the Hartree equations (6.9), whose solutions indeed include the globalminimum sought but also all the rest of stationary points.

While the possibility that a found solution be a maximum or a saddle pointcan be typically ruled out [48, 53], as we remarked in sec. 4 and due to the factthat there are an infinite number of solutions to the Hartree equations [54], to besure that any found minimum is the global one is, in a general case, impossible.There exists, however, one way, related to a theorem by Simon and Lieb [55, 56],of hopefully biasing a particular found solution of the Hartree equations to be theglobal minimum that we are looking for. They showed that, first, for neutral orpositively charged molecules (Z ≥ N), the Hartree global minimization problemhas a solution (its uniqueness is not established yet [53]) and, second, that theminimizing orbitals {φk} correspond to the lowest eigenvalues of the Hk[φ] operatorsself-consistently constructed with them18. Now, although the reverse of the secondpart of the theorem is not true in general (i.e., from the fact that a particular set oforbitals are the eigenstates corresponding to the lowest eigenvalues of the associatedHartree operators, does not necessarily follow that they are the optimal ones) [53],in practice, the insight provided by Lieb and Simon’s result is invoked to buildeach successive state in the iterative procedure described above, choosing the lowestlying eigenstates each time. In this way, although one cannot be sure that theglobal minimum has been reached, the fact that the found one has a property thatthe former also presents is regarded as a strong hint that it must be so (see also thediscussion for the Hartree-Fock case in the next section).

This drawback and all the problems arising from the fact that an iterative pro-cedure such as the one described above could converge to a fixed point, oscillate

18 In quantum chemistry, where the number of electrons considered is typically small, the versionof the Hartree equations that is used is the one derived here, with the Hartree operators dependingon the index k in a non-trivial way. However, if the number of electrons is large enough (suchas in condensed matter applications), is customary to add to the effective electronic repulsion ineq. (6.8b) the self-interaction of electron k with himself. In such a case, the Hartree operatoris independent of k so that, after having achieved self-consistency, the orbitals φk turn out tobe eigenstates corresponding to different eigenvalues of the same Hermitian operator, H[φ], and,therefore, mutually orthogonal.

16

eternally or even diverge, are circumvented in practice by a clever choice of thestarting guess orbitals {φ0

k}. If they are extracted, for example, from a slightly lessaccurate theory, one may expect that they could be ‘in the basin of attraction’ of thetrue Hartree minimum (so that the stationary point found will be the correct one)and close to it (so that the iterative procedure will converge). This kind of wish-ful thinking combined with large amounts of heuristic protocols born from manydecades of trial-and-error-derived knowledge pervade and make possible the wholequantum chemistry discipline.

7 The Hartree-Fock approximationIndistinguishable quasi-independent electrons with spin

The Hartree theory discussed in the previous section is not much used in quantumchemistry and many textbooks on the subject do not even mention it. Although itcontains the seed of almost every concept underlying the Hartree-Fock approximationdiscussed in this section, it lacks an ingredient that turns out to be essential tocorrectly describe the behaviour of molecular species: the indistinguishability of theelectrons. This was noticed independently by Fock [57] and Slater [52] in 1930, andit was corrected by proposing a variational ansatz for the total wavefunction thattakes the form of a so-called Slater determinant (see eq. (7.2) below).

The most important mathematical consequence of the indistinguishability amonga set of N quantum objects of the same type is the requirement that the total N -particle wavefunction must either remain unchanged (symmetric) or change sign(antisymmetric) when any pair of coordinates, xi and xj , are swapped. In thefirst case, the particles are called bosons and must have integer spin, while in thesecond case, they are called fermions and have semi-integer spin. Electrons arefermions, so the total wavefunction must be antisymmetric under the exchange ofany pair of one-electron coordinates. This is a property that is certainly not metby the single Hartree product in eq. (6.1) but that can be easily implemented byforming linear combinations of many of them. The trick is to add all the possibleHartree products that are obtained from eq. (6.1) changing the order of the orbitalslabels while keeping the order of the coordinates ones19, and assigning to each termthe sign of the permutation p needed to go from the natural order 1, . . . , N to thecorresponding one p(1), . . . , p(N). The sign of a permutation p is 1 if p can bewritten as a composition of an even number of two-element transpositions, and it is−1 if the number of transpositions needed is odd. Therefore, we define T (p) as theminimum number20 of transpositions needed to perform the permutation p, and wewrite the sign of p as (−1)T (p).

Using this, an antisymmetric wavefunction constructed from Hartree productsof N different orbitals may be written as

19 It is immaterial whether the orbitals labels are kept and the coordinates ones changed or viceversa.

20 It can be shown that the parity of all decompositions of p into products of elementary trans-positions is the same. We have chosen the minimum only for T (p) to be well defined.

17

Ψ(x1, . . . , xN ) =1√N !

∑

p∈SN

(−1)T (p)ψp(1)(x1) · · ·ψp(N)(xN ) , (7.1)

where the factor 1/√N ! enforces normalization of the total wavefunction Ψ (if

we use the constraints in eq. (7.3)) and SN denotes the symmetric group of order N ,i.e., the set of all permutations of N elements (with a certain multiplication rule).

The above expression is more convenient to perform the calculations that leadto the Hartree-Fock equations, however, there is also a compact way of rewritingeq. (7.1) which is commonly found in the literature and that is useful to illustratesome particular properties of the problem. It is the Slater determinant :

Ψ(x1, . . . , xN) =1√N !

∣∣∣∣∣∣∣∣∣

ψ1(x1) ψ2(x1) · · · ψN(x1)ψ1(x2) ψ2(x2) · · · ψN(x2)

......

. . ....

ψ1(xN ) ψ2(xN ) · · · ψN(xN )

∣∣∣∣∣∣∣∣∣

. (7.2)

Now, having established the constraints on the form of the total wavefunction,we ask the Hartree-Fock one-electron orbitals to be, not only normalized, like wedid in the Hartree case, but also mutually orthogonal:

〈ψi|ψj〉 = δij , i, j = 1, . . . , N . (7.3)

Note also that, contrarily to what we did in the previous section, we have nowused one-electron wavefunctions ψi dependent also on the spin σ (i.e., spin-orbitals)to construct the variational ansatz. A general spin-orbital21 may be written as (seealso footnote 2)

ψ(x) = φα(~r)α(σ) + φβ(~r) β(σ) , (7.4)

where the functions α and β correspond to the spin-up and spin-down eigenstatesof the operator associated to the z-component of the one-electron spin. They aredefined as

α(−1/2) = 0 β(−1/2) = 1α(1/2) = 1 β(1/2) = 0 .

(7.5)

Next, to calculate the expected value of the energy in a state such as the one ineqs. (7.1) and (7.2), let us denote the one-particle part (that operates on the i-thcoordinates) of the total electronic Hamiltonian H in eq. (5.1) by

hi := −∇2i

2−

NN∑

α=1

Zα

|~Rα − ~ri|, (7.6)

in such a way that,

21 Note that, if we had not included the spin degrees of freedom, the search space would have beenhalf as large, since, where we now have 2N functions of ~r (i.e., φαi (~r) and φ

βi (~r), with i = 1, . . . , N),

we would have had just N (the φi(~r)).

18

〈Ψ| H |Ψ〉 =∑

i

〈Ψ| hi |Ψ〉+ 1

2

∑

i 6=j

〈Ψ| 1

rij|Ψ〉 , (7.7)

where rij := |~rj − ~ri|.We shall compute separately each one of the sums in the expression above. Let us

start now with the first one: For a given i in the sum, the expected value 〈Ψ| hi |Ψ〉is a sum of (N !)2 terms of the form

1

N !(−1)T (p)+T (p ′)

⟨ψp(1)(x1) · · ·ψp(N)(xN)

∣∣ hi∣∣ψp ′(1)(x1) · · ·ψp ′(N)(xN)

⟩, (7.8)

but, since hi operates on the xi and due to the orthogonality of the spin-orbitalswith different indices, we have that the only non-zero terms are those with p = p ′.Taking this into account, all permutations are still present, and we see that everyorbital ψj appears depending on each coordinates xi (in the terms for which p(i) = j).In such a case, the term above reads

1

N !

(∏

k 6=j

〈ψk|ψk〉)〈ψj| h |ψj〉 , (7.9)

where we have used that (−1)2 T (p) = 1, and we have dropped the index i fromhi noticing that the integration variables in 〈ψj(xi)| hi |ψj(xi)〉 are actually dummy.

Next, we use again the one-electron wavefunctions constraints in eq. (7.3) toremove the product of norms in brackets, and we realize that, for each j, there asmany terms like the one in the expression above as permutations of the remainingN − 1 orbital indices (i.e., (N − 1)!). In addition, we recall that all j must occurand we perform the first sum in eq. (7.7), yielding

∑

i

〈Ψ| hi |Ψ〉 =∑

i

(N − 1)!

N !

∑

j

〈ψj | h |ψj〉 =∑

j

〈ψj| h |ψj〉 . (7.10)

The next step is to calculate the second sum in eq. (7.7). Again, we have that,for each pair (i, j), 〈Ψ| 1/rij |Ψ〉 is a sum of (N !)2 terms like

1

N !(−1)T (p)+T (p ′)

⟨ψp(1)(x1) · · ·ψp(N)(xN )

∣∣ 1

rij

∣∣ψp ′(1)(x1) · · ·ψp ′(N)(xN )⟩. (7.11)

For this expected value, contrarily to the case of hi and due to the two-bodynature of the operator 1/rij, not only do the terms with p = p ′ survive, but alsothose in which p and p ′ only differ on i and j, i.e., those for which p(i) = p ′(j),p(j) = p ′(i) and p(k) = p ′(k), ∀k 6= i, j.

Using that 1/rij operates only on xi and xj , the orthonormality conditions ineq. (7.3) and the fact that (−1)2 T (p) = 1, we have that, when ψk depends on xi andψl depends on xj , the p = p ′ part of the term in eq. (7.11) reads

1

N !〈ψkψl|

1

r|ψkψl〉 , (7.12)

19

where we have defined

〈ψiψj |1

r|ψkψl〉 :=

∑

σ, σ ′

∫∫ψ∗i (x)ψj(x

′)ψ∗k(x)ψl(x

′)

|~r − ~r ′| d~r d~r ′ . (7.13)

Next, we see that, for each pair (k, l), there are (N−2)! permutations among theN − 2 indices of the orbitals on which 1/rij does not operate, therefore producing(N − 2)! identical terms like the one above. In addition, if we perform the sum oni and j in eq. (7.7) and remark that the term in eq. (7.12) does not depend on thepair (i, j) (which is obvious from the suggestive notation above), we have that thep = p ′ part of the second sum in eq. (7.7), which is typically called Coulomb energy,reads

1

2

∑

i 6=j

(N − 2)!

N !

∑

k 6=l

〈ψkψl|1

r|ψkψl〉 =

1

2

∑

k 6=l

〈ψkψl|1

r|ψkψl〉 . (7.14)

On the other hand, in the case in which p and p ′ only differ in that the indices ofthe orbitals that depend on xi and xj are swapped, all the derivation above appliesexcept for the facts that, first, (−1)T (p)+T (p ′) = −1 and, second, the indices k andl must be exchanged in eq. (7.14) (it is immaterial if they are exchanged in the braor in the ket, since the indices are summed over and are dummy). Henceforth, theremaining part of the second sum in eq. (7.7), typically termed exchange energy,may be written as

− 1

2

∑

k 6=l

〈ψkψl|1

r|ψlψk〉 , (7.15)

so that the expected value of the energy in the Hartree-Fock variational state Ψturns out to be

E := 〈Ψ| H |Ψ〉 =∑

i

〈ψi| h |ψi〉︸︷︷︸hi

+1

2

∑

i,j

(〈ψiψj|

1

r|ψiψj〉

︸︷︷︸Jij

−〈ψiψj |1

r|ψjψi〉

︸︷︷︸Kij

),

(7.16)where the one-electron integrals hi have been defined together with the two-

electron integrals, Jij and Kij, and the fact that Jii = Kii, ∀i has been used toinclude the diagonal terms in the second sum.

Now, the energy functional above is what we want to minimize under the or-thonormality constraints in eq. (7.3). So we are prepared to write the auxiliary

functional F , introducing N2 Lagrange multipliers λij (see eq. (4.9) and comparewith the Hartree example in the previous section):

F[{ψi}

]=∑

i

hi +1

2

∑

i,j

(Jij −Kij) +∑

i,j

λij

(〈ψi|ψj〉 − δij

). (7.17)

In order to get to the Hartree-Fock equations that the stationary orbitals ψk

must satisfy, we impose that the functional derivative of F[{ψi}

]with respect to

20

ψ∗k be zero. To calculate δF/δψ∗

k, we follow the procedure described in appendix A,using the same notation as in eq. (6.5):

limε→0

F[ψ∗k + εδψ∗

k

]− F

[ψ∗k

]

ε=

〈δψk| h |ψk〉+∑

j

(〈δψkψj |

1

r|ψkψj〉 − 〈δψkψj |

1

r|ψjψk〉

)+∑

j

λkj〈δψk|ψj〉 =

∫ [h ψk(x) +

∑

j

(ψk(x)

∫ |ψj(x′)|2

|~r − ~r ′| dx ′ − ψj(x)

∫ψ∗j (x

′)ψk(x′)

|~r − ~r ′| dx ′

)

+∑

j

λkj ψj(x)

]δψ∗

k(x) dx , (7.18)

where we have used the more compact notation∫dx instead of

∑σ

∫d~r.

Now, like in the previous section, by simple inspection of the right-hand side,we see that the functional derivative is the part enclosed by square brackets (seeeq. (A.1)):

δF[{ψi}

]

δψ∗k

=

[h+

∑

j

(Jj[ψ]− Kj [ψ] + λkj

)]ψk(~x) , (7.19)

where the Coulomb and exchange operators are respectively defined by theiraction on an arbitrary function ϕ(x) as follows22:

Jj[ψ]ϕ(x) :=

(∫ |ψj(x′)|2

|~r − ~r ′| dx ′

)ϕ(x) , (7.20a)

Kj [ψ]ϕ(x) :=

(∫ψ∗j (x

′)ϕ(x ′)

|~r − ~r ′| dx ′

)ψj(x) . (7.20b)

Therefore, if we define the Fock operator as

F [ψ] := h +∑

j

(Jj[ψ]− Kj [ψ]

), (7.21)

and denote ηkj := −λkj, we may arrive to a first version of the Hartree-Fockequations by asking that the functional derivative in eq. (7.19) be zero:

F [ψ]ψk(x) =∑

j

ηkjψj(x) , k = 1, . . . , N . (7.22)

Now, in order to obtain a simpler version of them, we shall take profit for thefact that the whole problem is invariant under a unitary transformation among theone-electron orbitals.

22 Like in the Hartree case in the previous section, the word operator is a common notationalabuse if they act upon the very ψi on which they depend. This is again made explicit in thenotation.

21

If we repeat the calculation in eq. (7.18) but varying ψk this time, instead of ψ∗k,

we find that

limε→0

F[ψk + εδψk

]− F

[ψk

]

ε=

〈ψk| h |δψk〉+∑

j

(〈ψkψj |

1

r|δψkψj〉 − 〈ψjψk|

1

r|δψkψj〉

)+∑

j

λjk〈ψj |δψk〉 ,(7.23)

where the order of the indices in the exchange part has been conveniently ar-ranged (using that they are dummy) in order to facilitate the next part of thecalculations.

If we use the following relations:

〈ψi|ψj〉∗ = 〈ψj|ψi〉 , (7.24a)

〈ψi| h |ψj〉∗ = 〈ψj| h |ψi〉 , (7.24b)

〈ψiψj|1

r|ψkψl〉∗ = 〈ψkψl|

1

r|ψiψj〉 , (7.24c)

we may subtract the complex conjugate of eq. (7.23) from eq. (7.18) (both mustbe zero when evaluated on solutions of the Hartree-Fock equations) yielding

∑

j

(λkj − λ∗jk

)〈δψk|ψj〉 = 0 , k = 1, . . . , N , (7.25)

so that, since the variations δψk are arbitrary, we have that λkj = λ∗jk and theN × N matrix λ := (λij) of Lagrange multipliers is Hermitian (i.e., λ = λ+). Ofcourse, the same is true about the matrix η := (ηij) in eq. (7.22), and, consequently,a unitary matrix U exists that diagonalizes η; in the sense that ε := U−1ηU = U+ηUis a diagonal matrix, i.e., εij = δijεi.

Using that unitary matrix U or any other one, we can transform the set of orbitals{ψi} into a new one {ψ ′

i }:

ψk(x) =∑

j

Ukj ψ′j(x) . (7.26)

This transformation is physically legitimate since it only changes the N -electronwavefunction Ψ in an unmeasurable phase eiφ. To see this, let us denote by Sij

the (ij)-element of the matrix inside the Slater determinant in eq. (7.2), i.e., Sij :=ψj(xi). Then, after using the expression above, the (ij)-element of the new matrixS ′ can be related to the old ones via Skj =

∑j UkjS

′ij, in such a way that S = S ′UT

and the desired result follows:

Ψ({ψi}

)=

detS√N !

=det(S ′UT

)√N !

=detS ′ detUT

√N !

=eiφ detS ′

√N !

= eiφΨ({ψ′

i}).

(7.27)

22

Now, we insert eq. (7.26) into the first version of the Hartree-Fock equationsin (7.22):

F [Uψ ′]

(∑

j

Ukjψ′j(x)

)=∑

i,j

ηkiUijψ′j(x) , k = 1, . . . , N . (7.28)

Next, we multiply by U−1lk each one of the N expressions and sum in k:

F [Uψ ′]

∑

j,k

U−1lk Ukj︸︷︷︸δlj

ψ ′j(x)

=

∑

i,j,k

U−1lk ηkiUij︸︷︷︸

εlj = δljεj

ψ ′j(x)

=⇒ F [Uψ ′]ψ ′l (x) = εlψ

′l (x) , l = 1, . . . , N . (7.29)

Although this new version of the Hartree-Fock equations can be readily seen asa pseudo-eigenvalue problem and solved by the customary iterative methods, we cango a step further and show that, like the N -particle wavefunction Ψ (see eq. (7.27)),the Fock operator F [ψ], as a function of the one-electron orbitals, is invariant undera unitary transformation such as the one in eq. (7.26). In fact, this is true for eachone of the sums of Coulomb and exchange operators in eq. (7.21) separately:

∑

j

Jj [Uψ′]ϕ(x) =

∑

j

(∫ |∑k Ujk ψ′k(x

′)|2|~r − ~r ′| dx ′

)ϕ(x) =

∑

j

(∫ ∑k,l

U−1kj︷︸︸︷U∗jk Ujl ψ

′∗k (x ′)ψ ′

l (x′)

|~r − ~r ′| dx ′

)ϕ(x) =

(∫ ∑j,k,lU

−1kj Ujl ψ

′∗k (x ′)ψ ′

l (x′)

|~r − ~r ′| dx ′

)ϕ(x) =

∑

k

(∫ |ψ ′k(x

′)|2|~r − ~r ′| dx ′

)ϕ(x) =

∑

j

Jj[ψ′]ϕ(x) , ∀ϕ(x) , (7.30)

where, in the step before the last, we have summed on j and l, using that∑j U

−1kj Ujl = δkl.

Performing very similar calculations, one can show that

∑

j

Kj [Uψ′]ϕ(x) =

∑

j

Kj [ψ′]ϕ(x) , ∀ϕ(x) , (7.31)

and therefore, that F [Uψ ′] = F [ψ ′]. In such a way that any unitary transfor-mation on a set of orbitals that constitute a solution of the Hartree-Fock equationsin (7.22) yields a different set that is also a solution of the same equations. Forcomputational and conceptual reasons (see, for example, Koopmans’ Theorem be-low), it turns out to be convenient to use this freedom and choose the matrix U

23

in such a way that the Lagrange multipliers matrix is diagonalized (see eq. (7.25)and the paragraph below it). The particular set of one-electron orbitals {ψ ′

i } ob-tained with this U are called canonical orbitals and their use is so prevalent that wewill circumscribe the forecoming discussion to them and drop the prime from thenotation.

Using the canonical orbitals, the Hartree-Fock equations can be written as

F [ψ]ψi(x) = εiψi(x) , i = 1, . . . , N . (7.32)

Many of the remarks related to these equations are similar to those made aboutthe Hartree ones in (6.9), although there exist important differences due to theinclusion of the indistinguishability of the electrons in the variational ansatz. Thisis clearly illustrated if we calculate the joint probability density associated to awavefunction like the one in eq. (7.1) of the coordinates with label 1 taking thevalue x1, the coordinates with label 2 taking the value x2, and so on:

ρ(x1, . . . , xN) =∣∣Ψ(x1, . . . , xN)

∣∣2 =1

N !

∑

p,p ′∈SN

(−1)T (p)+T (p ′)ψ∗p(1)(x1) · · ·ψ∗

p(N)(xN)ψp ′(1)(x1) · · ·ψp ′(N)(xN ) .(7.33)

If we compare this expression with eq. (6.10), we see that the antisymmetryof Ψ has completely spoiled the statistical independence among the one-electroncoordinates. However, there is a weaker quasi-independence that may be recovered:If, using the same reasoning about permutations that took us to the one-electronpart

∑i〈Ψ| hi |Ψ〉 of the energy functional in page 19, we calculate the marginal

probability density of the i-th coordinates taking the value xi, we find

ρi(xi) :=

∫ (∏

k 6=i

dxk

)ρ(x1, . . . , xN) =

1

N

∑

j

∣∣ψj(xi)∣∣2 . (7.34)

Now, since the coordinates indices are just immaterial labels, the actual proba-bility density of finding any electron with coordinates x is given by

ρ(x) :=∑

i

ρi(x) =∑

i

∣∣ψi(x)∣∣2 , (7.35)

which can be interpreted as a charge density (except for the sign), as, in atomicunits, the charge of the electron is e = −1. The picture being consistent with thefact that ρ(x) is normalized to the number of electrons N :

∫ρ(x) dx = N . (7.36)

Additionally, if we perform the same type of calculations that allowed to calculatethe two-electron part of the energy functional in page 19, we have that the two-body probability density of the i-th coordinates taking the value xi and of the j-thcoordinates taking the value xj reads

24

ρij(xi, xj) :=

∫ (∏

k 6=i,j

dxk

)ρ(x1, . . . , xN) =

1

N(N − 1)

(∑

k

∣∣ψk(xi)∣∣2∑

l

∣∣ψl(xj)∣∣2 −

∑

k,l

ψ∗k(xi)ψ

∗l (xj)ψl(xi)ψk(xj)

),(7.37)

and, if we reason in the same way as in the case of ρi(xi), in order to get tothe probability density of finding any electron with coordinates x at the same timethat any other electron has coordinates x ′, we must multiply the function above byN(N − 1)/2, which is the number of immaterial (i, j)-labelings, taking into accountthat the distinction between x and x ′ is also irrelevant:

ρ(x, x ′) :=N(N − 1)

2ρij(x, x

′) =

1

2

(∑

k

∣∣ψk(x)∣∣2∑

l

∣∣ψl(x′)∣∣2 −

∑

k,l

ψ∗k(x)ψ

∗l (x

′)ψl(x)ψk(x′)

). (7.38)

Finally, taking eq. (7.35) to this one, we have

ρ(x, x ′) =1

2

(ρ(x) ρ(x ′)−

∑

k,l

ψ∗k(x)ψ

∗l (x

′)ψl(x)ψk(x′)

), (7.39)

where the first term corresponds to independent electrons and the second onecould be interpreted as an exchange correction.

Although, in general, this is the furthest one may go, when additional constraintsare imposed on the spin part of the one-electron wavefunctions (see the discussionabout Restricted Hartree-Fock in the following pages, for example), the exchangecorrection in eq. (7.39) above vanishes for electrons of opposite spin, i.e., electronsof opposite spin turn out to be pairwise independent. However, whereas it is truethat more correlation could be added to the Hartree-Fock results by going to higherlevels of the theory (see, for example, sec. 10) and, in this sense, Hartree-Fock couldbe considered the first step in the ‘correlation ladder’, one should not regard it as an‘uncorrelated’ approximation, since, even in the simplest case of RHF (see below),Hartree-Fock electrons (of the same spin) are statistically correlated.

Let us now point out that, like in the Hartree case, the left-hand side of theHartree-Fock equations in (7.32) is a complicated, non-linear function of the or-bitals {ψi} and the notation chosen is intended only to emphasize the nature of theiterative protocol that is typically used to solve the problem. However, note that,while the Hartree operator Hk[φ] depended on the index of the orbital φk on whichit acted, the Fock operator in eq. (7.32) is the same for all the spin-orbitals ψi. Thisis due to the inclusion of the i = j terms in the sum of the Coulomb and exchangetwo-electron integrals in eq. (7.16) and it allows to perform the iterative proceduresolving only one eigenvalue problem at each step, instead of N of them like in theHartree case.

25

The one-particle appearance of eqs. (7.32) is again strong and, whereas the ‘eigen-values’ εi are not the energies associated to individual orbitals (or electrons) as itmay seem, they have some physical meaning via the well-known Koopmans’ Theorem[58].

To introduce it, let us multiply eq. (7.32) from the left by ψi(x), for a given i,and then integrate over x. Using the definition of the Fock operator in eq. (7.21)together with the Coulomb and exchange ones in eqs. (7.20), we obtain

〈ψi| F |ψi〉 = hi +∑

j

(Jij −Kij

)= εi , i = 1, . . . , N , (7.40)

where we have used the same notation as in eq. (7.16) and the fact that theone-electron orbitals are normalized.

If we next sum on i and compare the result with the expression in eq. (7.16), wefound that the relation of the eigenvalues εi with the actual Hartree-Fock energy isgiven by

E =∑

i

εi −1

2

∑

i,j

(Jij −Kij

). (7.41)

Finally, if we assume that upon ‘removal of an electron from the k-th orbital’ therest of the orbitals will remain unmodified, we can calculate the ionization energyusing the expression in (7.16) together with the equations above:

∆E := EN−1 − EN =∑

i 6=k

hi −∑

i

hi +1

2

∑

i,j 6=k

(Jij −Kij

)− 1

2

∑

i,j

(Jij −Kij

)=

−hk −∑

j

(Jkj −Kkj

)= −εk , (7.42)

and this is Koopmans’ Theorem, namely, that the k-th ionization energy in thefrozen-orbitals approximation is εk (of course, the ambiguity in the sign depends onwhether we define the ionization energy as the one lost by the system or as the onethat we must provide from the outside to make the process happen).

Turning now to the issue about the solution of the Hartree-Fock equationsin (7.32), we must remark that the necessity of using the relatively unreliable it-erative approach to tackle them stems again from their complicated mathematicalform. Like in the Hartree case, we have managed to largely reduce the dimensionof the space on which the basic equations are defined: from 3N in the electronicSchrodinger equation in (3.6a) to 3 in the Hartree-Fock ones. However, to havethis, we have payed the price of dramatically increasing their complexity [53], since,while the electronic Schrodinger equation was one linear differential equation, theHartree-Fock ones in (7.32) are N coupled non-linear integro-differential equations,thus precluding any analytical approach to their solution.

A typical iterative procedure23 begins by proposing a starting guess for the set ofN spin-orbitals {ψ0

i }. With them, the Fock operator F [ψ0] in the left-hand side of

23 The process described in this paragraph must be taken only as an outline of the one that isperformed in practice. It is impossible to deal in a computer with a general function as it is (a

26

eq. (7.32) is constructed and the set of N equations is solved as one simple eigenvalueproblem. Then, the {ψ1

i } that correspond to the N lowest eigenvalues ε1i are selected(see the discussion of the aufbau principle below) and a new Fock operator F [ψ1]is constructed with them. The process is iterated until (hopefully) the n-th setof solutions {ψn

i } differs from the (n − 1)-th one {ψn−1i } less than a reasonably

small amount (defining the distance among solutions in some suitable way typicallycombined with a convergence criterium related to the associated energy change).When this occurs, the procedure is said to have converged and the solution orbitalsare called self-consistent ; also, a calculation of this kind is commonly termed self-consistent field (SCF).

Again, like in the Hartree case, many issues exist that raise doubts about the pos-sible success of such an approach. The most important ones are related to the factthat a proper definition of the Hartree-Fock problem should be: find the global min-imum of the energy functional 〈Ψ| H |Ψ〉 under the constraint that the wavefunctionΨ be a Slater determinant of one-electron spin-orbitals, and not: solve the Hartree-Fock equations (7.32). The solutions of the latter are all the stationary points ofthe constrained energy functional, while we are interested only in the particular onethat is the global minimum. Even ruling out the possibility that a found solutionmay be a maximum or a saddle point (which can be done [48, 53]), one can neverbe sure that it is the global minimum and not a local one.

There exists, however, one way, related to the Hartree-Fock version of the the-orem by Simon and Lieb [55, 56] mentioned in the previous section, of hopefullybiasing a particular found solution of eqs. (7.32) to be the global minimum that weare looking for. They showed, first, that for neutral or positively charged molecules(Z ≥ N), the Hartree-Fock global minimization problem has a solution (its unique-ness is not established yet [53]) and, second, that the minimizing orbitals {ψi} corre-spond to the N lowest eigenvalues of the Fock operator F [ψ] that is self-consistentlyconstructed with them. Therefore, although the reverse is not true in general [53](i.e., from the fact that a particular set of orbitals are the eigenstates correspondingto the lowest eigenvalues of the associated Fock operator, does not necessarily followthat they are the optimal ones), the information contained in Simon and Lieb’s resultis typically invoked to build each successive state in the iterative procedure describedabove by keeping only the N orbitals that correspond to the N lowest eigenvalues εi.Indeed, by doing that, one is effectively constraining the solutions to have a prop-erty that the true solution does have, so that, in the worst case, the space in whichone is searching is of the same size as the original one, and, in the best case (evenplaying with the possibility that the reverse of Simon and Lieb’s theorem be true,though not proved), the space of solutions is reduced to the correct global minimumalone. This wishful-thinking way of proceeding is termed the aufbau principle [53],and, together with a clever choice of the starting-guess set of orbitals [59] (typicallyextracted from a slightly less accurate theory, so that one may expect that it couldbe ‘in the basin of attraction’ of the true Hartree-Fock minimum), constitute one ofthe many heuristic strategies that make possible that the aforementioned drawbacks

non-countable infinite set of numbers), and the problem must be discretized in some way. Thetruncation of the one-electron Hilbert space using a finite basis set, described in secs. 8 and 9, isthe most common way of doing this.

27

(and also those related to the convergence of iterative procedures) be circumventedin real cases, so that, in practice, most of SCF calculations performed in the fieldof quantum chemistry do converge to the true solution of eqs. (7.32) in spite of thetheoretical notes of caution.

Now, to close this section, let us discuss some points regarding the imposition ofconstraints as a justification for subsequently introducing two commonly used formsof the Hartree-Fock theory that involve additional restrictions on the variationalansatz (apart from the ones in eqs. (7.1) and (7.3)).

In principle, the target systems in which we are interested and to which the theorydeveloped in this document is meant to be applied are rather complex. They havemany degrees of freedom and the different interactions that drive their behaviourtypically compete with one another, thus producing complicated, ‘frustrated’ energylandscapes (see refs. 60, 61, 62, 63, 64, 65, but note, however, that we do notneed to think about macromolecules; a small molecule like CO2 already has 22electrons). This state of affairs renders the a priori assessment of the accuracyof any approximation to the exact equations an impossible task. As researcherscalculate more and more properties of molecular species using quantum chemistryand the results are compared to higher-level theories or to experimental data, muchempirical knowledge about ‘how good is Theory A for calculating Property X’ isbeing gathered. However, if the characterization of a completely new molecule thatis not closely related to any one that has been previously studied is tackled with, say,the Hartree-Fock approximation, it would be very unwise not to ‘ask for a secondopinion’.

All of this also applies, word by word, to the choice of the constraints on thewavefunction in variational approaches like the one discussed in this section: Forexample, it is impossible to know a priori what will be the loss of accuracy due tothe requirement that the N -particle wavefunction Ψ be a Slater determinant as ineq. (7.1). However, in the context of the Hartree-Fock approximation, there exists away of proceeding, again, partly based on wishful thinking and partly confirmed byactual calculations in particular cases, that is almost unanimously used to chooseadditional constraints which are expected to yield more efficient theories. It consistsof imposing constraints to the variational wavefunction that are properties that theexact solution to the problem does have. In such a way that the obvious loss ofaccuracy due to the reduction of the search space is expected to be minimized, whilethe decrease in computational cost could be considerable.

This way of thinking is clearly illustrated by the question of whether or not oneshould allow that the one-electron spin-orbitals ψi (and therefore the total wavefunc-tion Ψ) be complex valued. Indeed, due to the fact that the electronic Hamiltonianin eq. (5.1) is self-adjoint, the real and imaginary parts of any complex eigenfunctionsolution of the time independent Schrodinger equation in (3.6a) are also solutions ofit [53]. Therefore, the fundamental state, which is the exact solution of the problemthat we are trying to solve, may be chosen to be real valued. Nevertheless, theexact minimum will not be achieved, in general, in the smaller space defined by theHartree-Fock constraints in eqs. (7.1) and (7.3), so that there is no a priori reason tobelieve that allowing the Hartree-Fock wavefunction to take complex values wouldnot improve the results by finding a lower minimum. In fact, in some cases, this

28

ReGHF

CoGHF

ReUHF

CoUHF CoRHF

ReRHF

1 1/2 1/4

1/81/41/2

Figure 1: Schematic relation map among the six types of Hartree-Fock methods dis-

cussed in the text: General Hartree-Fock (GHF), Unrestricted Hartree-Fock (UHF) and

Restricted Hartree-Fock (RHF), in both their complex (Co) and real (Re) versions. The

arrows indicate imposition of constraints; horizontally, in the spin part of the orbitals,

and, vertically, from complex- to real-valued wavefunctions. Next to each method, the size

of the search space relative to that in CoGHF is shown.

happens [59]. Nevertheless, if one constrains the search to real orbitals, the compu-tational cost is reduced by a factor two, and, after all, ‘some constraints must beimposed’.

Additionally, apart from these ‘complex vs. real’ considerations, there exist twofurther restrictions that are commonly found in the literature and that affect the spinpart of the one-electron orbitals ψi. The N -electron wavefunction Ψ of the GeneralHartree-Fock (GHF) approximation (which is the one discussed up to now) is notan eigenstate of the total-spin operator, S2, nor of the z-component of it, Sz [59].However, since both of them commute with the electronic Hamiltonian in eq. (5.1),the true fundamental state of the exact problem can be chosen to be an eigenstate ofboth operators simultaneously. So two additional constraints on the spin part of theGHF wavefunction in (7.1) are typically made that force the variational ansatz tosatisfy these fundamental-state properties and that should be seen in the light of theabove discussion, i.e., as reducing the search space, thus yielding an intrinsically lessaccurate theory, but also as being good candidates to hope that the computationalsavings will pay for this.

The first approximation to GHF (in a logical sense) is called Unrestricted Hartree-Fock (UHF) and it consists of asking the orbitals ψi to be a product of a partφi(~r) depending on the positions ~r times a spin eigenstate of the one-electron szoperator, i.e., either α(σ) or β(σ) (see eq. (7.5)). This is denoted by ψi(x) :=φi(~r)γi(σ), where γi is either the α or the β function. Now, if we call Nα and Nβ thenumber of spin-orbitals of each type, we have that, differently from the GHF one, theUHF N -particle wavefunction Ψ is an eigenstate of the Sz operator with eigenvalue(1/2)(Nα − Nβ) (in atomic units, see sec. 2). However, it is not an eigenstate

of S2 (the UHF wavefunction can be projected into pure S2-states, however, theresult is multideterminantal [59] and will not be considered here). Regarding thecomputational cost of the UHF approximation, it is certainly lower than that ofGHF, since the search space is half as large: In the latter case, we had to consider2N (complex or real) functions of R3 (the φα

i (~r) and the φβi (~r), see eq. (7.4)), while

in UHF we only have to deal with N of them: the φi(~r).

29

Now, we may perform a derivation analogous to the one performed for the GHFcase, using the UHF orbitals, and get to the Unrestricted version of the Hartree-Fockequations24:

FUHFi [φ]φi(~r) :=

(h+

N∑

j

Jj[φ]−N∑

j

δγiγjKj [φ]

)φi(~r) = εiφi(~r) , i = 1, . . . , N ,

(7.43)where the Coulomb and exchange operators dependent on the spatial orbitals φi

are defined by their action on an arbitrary function ϕ(~r) as follows:

Jj[φ]ϕ(~r) :=

(∫ |φj(~r′)|2

|~r − ~r ′| d~r ′

)ϕ(~r) , (7.44a)

Kj [φ]ϕ(~r) :=

(∫φ∗j (~r

′)ϕ(~r ′)

|~r − ~r ′| d~r ′

)φj(~r) , (7.44b)

and one must note that, differently from the GHF case, due to the fact that theexchange interaction only takes place between orbitals ‘of the same spin’, the UHFFock operator FUHF

i [φ] depends on the index i.If we now introduce the special form of the UHF orbitals into the general ex-

pression in (7.38), we can calculate the two-body probability density of finding anyelectron with coordinates x at the same time that any other electron has coordinatesx ′:

ρUHF(x, x ′) =1

2

(∑

k,l

∣∣φk(~r)∣∣2∣∣φl(~r

′)∣∣2γk(σ) γl(σ ′)

−∑

k,l

φ∗k(~r)φ

∗l (~r

′)φl(~r)φk(~r′) γk(σ) γl(σ

′) γk(σ′) γl(σ)

).(7.45)

If we compute this probability density for ‘electrons of the same spin’, i.e., forσ = σ ′, we obtain25

ρUHF(~r, ~r ′; σ = σ ′) =

1

2

∑

k,l∈Iσ

(∣∣φk(~r)∣∣2∣∣φl(~r

′)∣∣2 − φ∗

k(~r)φ∗l (~r

′)φl(~r)φk(~r′))=

1

2

(ρ(~r, σ) ρ(~r ′, σ)−

∑

k,l

δγkσδγlσφ∗k(~r)φ

∗l (~r

′)φl(~r)φk(~r′)

), (7.46)

24 Placing functions as arguments of the Kronecker’s delta δγiγjis a bit unorthodox mathemat-

ically, but it constitutes an intuitive (and common) notation.25 Placing a function and a coordinate as arguments of the Kronecker’s delta δγkσ is even more

unorthodox mathematically than placing two functions (in fact, δγkσ is exactly the same as γk(σ)),however, the intuitive character of the notation compensates again for this.

30

where the following expression for the one-electron charge density has been used:

ρUHF(~r, σ) =∑

k∈Iσ

∣∣φk(~r)∣∣2 . (7.47)

At this point, note that eq. (7.46) contains, like in the GHF case, the exchangecorrection to the first (independent electrons) term. Nevertheless, as we advanced,if we calculate the two-body ρUHF for ‘electrons of opposite spin’, i.e., for σ 6= σ ′,we have that

ρUHF(~r, ~r ′; σ 6= σ ′) =1

2ρ(~r, σ) ρ(~r ′, σ ′) , (7.48)

i.e., that UHF electrons of opposite spin are statistically pairwise independent.The other common approximation to GHF is more restrictive than UHF and is

called Restricted Hartree-Fock (RHF). Apart from asking the orbitals ψi to be aproduct of a spatial part times a spin eigenstate of the one-electron sz operator likein the UHF case, in RHF, the number of ‘spin-up’ and ‘spin-down’ orbitals is thesame, Nα = Nβ (note that this means that RHF may only be used with moleculescontaining an even number of electrons), and each spatial wavefunction occurs twice:once multiplied by α(σ) and the other time by β(σ). This is typically referred toas a closed-shell situation and we shall denote it by writing ψi(x) := φi(~r)α(σ)if i ≤ N/2, and ψi(x) := φi−N/2(~r) β(σ) if i > N/2; in such a way that thereare N/2 different spatial orbitals denoted by φI(~r), with, I = 1, . . . , N/2. Due tothese additional restrictions, we have that, differently from the GHF one, the RHFN -particle wavefunction Ψ is an eigenstate of both the S2 and the Sz operators,with zero eigenvalue in both cases [18, 59], just like the fundamental state of theexact problem. Regarding the computational cost of the RHF approximation, it iseven lower than that of UHF, since the size of the search space has been reducedto one quarter that of GHF: In the latter case, we had to consider 2N (complex orreal) functions of R3 (the φα

i (~r) and the φβi (~r), see eq. (7.4)), while in RHF we only

have to deal with N/2 of them: the φI(~r) (see fig. 1).Again, we may perform a derivation analogous to the one performed for the GHF

case and get to the Restricted version of the Hartree-Fock equations :

FRHF[φ]φI(~r) :=

h+N/2∑

J

(2JJ [φ]− KJ [φ]

)

φI(~r) = εIφI(~r) , I = 1, . . . , N/2 ,

(7.49)where the RHF Fock operator FRHF[φ] has been defined in terms of the Coulomb

and exchange operators in eq. (7.44) and, differently from the UHF case, it does notdepend on the index I of the orbital on which it operates.

If we follow the same steps as in the GHF case in page 26, we can relate theRHF energy to the eigenvalues εI and the two-electron spatial integrals:

E = 2

N/2∑

I

εI −N/2∑

I,J

(2〈φIφJ |

1

r|φIφJ〉 − 〈φIφJ |

1

r|φJφI〉

). (7.50)

31

Now, if we introduce the special form of the RHF orbitals into the general expres-sion in (7.38), we can calculate the RHF two-body probability density of finding anyelectron with coordinates x at the same time that any other electron has coordinatesx ′:

ρRHF(x, x ′) =1

2

N/2∑

K,L

∣∣φK(~r)∣∣2∣∣φL(~r

′)∣∣2 − δσσ ′

N/2∑

K,L

φ∗K(~r)φ

∗L(~r

′)φL(~r)φK(~r′)

=

1

2

ρ(~r, σ) ρ(~r ′, σ ′)− δσσ ′

N/2∑

K,L

φ∗K(~r)φ

∗L(~r

′)φL(~r)φK(~r′)

. (7.51)

where the following expression for the RHF one-electron charge density has beenused:

ρRHF(~r, σ) =

N/2∑

K

∣∣φK(~r)∣∣2 . (7.52)

We notice that the situation is the same as in the UHF case: For RHF electronswith equal spin, there exists an exchange term in ρRHF(x, x ′) that corrects the ‘in-dependent’ part, whereas RHF electrons of opposite spin are statistically pairwiseindependent.

Finally, let us stress that the RHF approximation (in its real-valued version)is the most used one in the literature [13, 66, 67, 68, 69, 70, 71, 72, 73, 74] (formolecules with an even number of electrons) and that all the issues related to theexistence of solutions, as well as those regarding the iterative methods used to solvethe equations, are essentially the same for RHF as for GHF (see the discussion inthe previous pages).

8 The Roothaan-Hall equations

Let’s discretize

The Hartree-Fock equations in the RHF form in expression (7.49) are a set of N/2of coupled integro-differential equations. As such, they can be tackled by finite-differences methods and solved on a discrete grid; this is known as numerical Hartree-Fock [75], and, given the present power of computers, it is only applicable to verysmall molecules.

In order to deal with larger systems, such as biological macromolecules, indepen-dently proposed by Roothaan [76] and Hall [77] in 1951, a different kind of discretiza-tion must be performed, not in R

3 but in the Hilbert space H of the one-electronorbitals. Hence, although the actual dimension of H is infinite, we shall approximateany function in it by a finite linear combination of M different functions χa

26. In

26 In all this section and the forecoming ones, the indices belonging to the first letters of thealphabet, a, b, c, d, etc., run from 1 to M (the number of functions in the finite basis set); whereasthose named with capital letters from I towards the end of the alphabet, I, J,K, L, etc., run from1 to N/2 (the number of spatial wavefunctions φI , also termed the number of occupied orbitals).

32

particular, the one-electron orbitals that make up the RHF wavefunction, shall beapproximated by

φI(~r) ≃M∑

a

caI χa(~r) , I = 1, . . . , N/2 , M ≥ N

2. (8.1)

In both cases, numerical Hartree-Fock and discretization of the function space,the correct result can be only be reached asymptotically; when the grid is very fine,for the former, and when M → ∞, for the latter. This exact result, which, in thecase of small systems, can be calculated up to several significant digits, is known asthe Hartree-Fock limit [78].

In practical cases, however, M is finite (often, only about an order of magnitudelarger than N/2) and the set {χa}Ma=1 in the expression above is called the basis set.We shall devote the next section to discuss its special characteristics, but, for now,it suffices to say that, in typical applications, the functions χa are atom-centered,i.e., each one of them has non-negligible value only in the vicinity of a particularnucleus. Therefore, like all the electronic wavefunctions we have dealt with in thelast sections, they parametrically depend on the positions R of the nuclei (see sec. 3).This is why, sometimes, the functions χa are called atomic orbitals27 (AO) (since theyare localized at individual atoms), the φI are referred to as molecular orbitals (MO)(since they typically have non-negligible value in the whole space occupied by themolecule), and the approximation in eq. (8.1) is called linear combination of atomicorbitals (LCAO). In addition, since we voluntarily circumscribe to real-RHF, weassume that both the coefficients caI and the functions χa in the above expressionare real.

Now, if we introduce the linear combination in eq. (8.1) into the Hartree-Fockequations in (7.49), multiply the result by χb (for a general value of b) and integrateon ~r, we obtain

∑

a

FbacaI = εI∑

a

SbacaI , I = 1, . . . , N/2 , b = 1, . . . ,M , (8.2)

where we denote by Fba the (b, a)-element of the Fock matrix 28, and by Sba theone of the overlap matrix, defined as

Fba := 〈χb| F [φ] |χa〉 and Sba := 〈χb|χa〉 , (8.3)

respectively.Note that we do not ask the χa in the basis set to be mutually orthogonal, so

that the overlap matrix is not diagonal in general.Next, if we define the M ×M matrices F [c] := (Fab) and S := (Sab), together

with the (column) M-vector cI := (caI), we can write eq. (8.2) in matricial form:

27 Some authors [17] suggest that, being strict, the term atomic orbitals should be reserved forthe one-electron wavefunctions φI that are the solution of the Hartree-Fock problem (or even tothe exact Schrodinger equation of the isolated atom), and that the elements χa in the basis setshould be termed simply localized functions. However, it is very common in the literature not tofollow this recommendation and choose the designation that appear in the text [17, 76, 79]. Weshall do the same for simplicity.

28 Note that the RHF superindex has been dropped from F .

33

F [c]cI = εI ScI . (8.4)

Hence, using the LCAO approximation, we have traded the N/2 coupled integro-differential Hartree-Fock equations in (7.49) for this system of N/2 algebraic equa-tions for the N/2 orbital energies εI and theM ·N/2 coefficients caI , which are calledRoothaan-Hall equations [76, 77] and which are manageable in a computer.

Now, if we forget for a moment that the Fock matrix depends on the coefficientscaI (as stressed by the notation F [c]) and also that we are only looking for N/2vectors cI while the matrices F and S areM×M , we may regard the above expressionas a M-dimensional generalized eigenvalue problem. Many properties are sharedbetween this kind of problem and a classical eigenvalue problem (i.e., one in whichSab = δab) [76], being the most important one that, due to the hermiticity of F [c],one can find an orthonormal set of M vectors ca corresponding to real eigenvaluesεa (where, of course, some eigenvalue could be repeated).

In fact, it is using this formalism how actual Hartree-Fock computations areperformed, the general outline of the iterative procedure being essentially the sameas the one discussed in sec. 7: Choose a starting-guess for the coefficients caI (letus denote it by c0aI), construct the corresponding Fock matrix F [c0]29 and solvethe generalized eigenvalue problem in eq. (8.4). By virtue of the aufbau principlediscussed in the previous section, from the M eigenvectors ca, keep the N/2 onesc1I that correspond to the N/2 lowest eigenvalues ε1I , construct the new Fock matrixF [c1] and iterate (by convention, the eigenvalues εna , for all n, are ordered from thelowest to the largest as a runs from 1 to M). This procedure ends when the n-thsolution is close enough (in a suitable defined way) to the (n− 1)-th one. Also, notethat, after convergence has been achieved, we end up with M orthogonal vectorsca. Only the N/2 ones that correspond to the lowest eigenvalues represent real one-electron solutions and they are called occupied orbitals ; the M − N/2 remainingones do not enter in the N -electron wavefunction (although they are relevant forcalculating corrections to the Hartree-Fock results, see sec. 10) and they are calledvirtual orbitals.

Regarding the mathematical foundations of this procedure, let us stress, however,that, whereas in the finite-dimensional GHF and UHF cases it has been proved thatthe analogous of Lieb and Simon’s theorem (see the previous section) is satisfied, i.e.,that the global minimum of the original optimization problem corresponds to thelowest eigenvalues of the self-consistent Fock operator, in the RHF case, contrarily,no proof seems to exist [53]. Of course, in practical applications, the positive resultis assumed to hold.

Finally, if we expand Fab in eq. (8.3), using the shorthand |a〉 for |χa〉, we have

Fab = 〈a| h |b〉+∑

c,d

(∑

J

ccJcdJ

)

︸︷︷︸Dcd[c]

(2〈ac| 1

r|bd〉 − 〈ac| 1

r|db〉

)

︸︷︷︸Gcd

ab

, (8.5)

29 Note (in eq. (8.5), for example) that the Fock matrix only depends on the vectors ca witha ≤ N/2.

34

where we have introduced the density matrix Dcd[c], and also the matrix Gcdab,

made up by the four-center integrals 〈ac| 1/r |bd〉 defined by

〈ac| 1r|bd〉 :=

∫∫χa(~r)χb(~r

′)χc(~r)χd(~r′)

|~r − ~r ′| d~r d~r ′ . (8.6)

After convergence has been achieved, the RHF energy in the finite-dimensionalcase can be computed using the discretized version of eq. (7.50):

E = 2

N/2∑

I

εI −N/2∑

I,J

∑

a,b,c,d

(2caIcbJccIcdJ〈ab|

1

r|cd〉 − caIcbJccJcdI〈ab|

1

r|dc〉

)=

2

N/2∑

I

εI −N/2∑

I,J

∑

a,b,c,d

caIcbJccIcdJ〈ab|1

r|cd〉 =

2

N/2∑

i

εi −∑

a,b,c,d

Dac[c]Dbd[c]〈ab|1

r|cd〉 , (8.7)

where a convenient rearrangement of the indices in the two sums has been per-formed from the first to the second line.

9 Introduction to Gaussian basis sets

Flora and fauna

In principle, arbitrary functions may be chosen as the χa to solve the Roothaan-Hallequations in the previous section, however, in eq. (8.5), we see that one of the mainnumerical bottlenecks in SCF calculations arises from the necessity of calculating the∼M4 four-center integrals 〈ab| 1

r|cd〉 (since the solution of the generalized eigenvalue

problem in eq. (8.4) typically scales only like M3, and there are ∼M2 two-center〈a| h |b〉 integrals). Either if these integrals are calculated at each iterative stepand directly taken from RAM memory (direct SCF ) or if they are calculated atthe first step, written to disk, and then read from there when needed (conventionalSCF ), an appropriate choice of the functions χa in the finite basis set is essentialif accurate results are sought, M is intended to be kept as small as possible andthe integrals are wanted to be computed rapidly. When one moves into higher-leveltheoretical descriptions (such as the MP2 method discussed in the next section) andthe numerical complexity scales with M even more unpleasantly, the importance ofthis choice greatly increases.

In this section, in order to support that study, we shall introduce some of theconcepts involved in the interesting field of basis-set design. For further details notcovered here, the reader may want to check refs. 17, 18, 80, 81.

The only analytically solvable molecular problem in non-relativistic quantummechanics is the hydrogen-like atom, i.e., the system formed by a nucleus of chargeZ and only one electron (H, He+, Li2+, etc.). Therefore, it is not strange that all the

35

thinking about atomic-centered basis sets in quantum chemistry is much influencedby the particular solution to this problem.

The spatial eigenfunctions of the Hamiltonian operator of an hydrogen-like atom,in atomic units and spherical coordinates, read30

φnlm(r, θ, ϕ) =

√(2Z

n

)3(n− l − 1)!

2n[(n+ l)!]3

(2Z

nr

)l

L2l+1n−l−1

(2Z

nr

)e−Zr/nYlm(θ, ϕ) ,

(9.1)where n, l and m are the energy, total angular momentum and z-angular mo-

mentum quantum numbers, respectively. Their ranges of variation are coupled: allbeing integers, n runs from 1 to ∞, l from 0 to n− 1 and m from −l to l. The func-tion L2l+1

n−l−1 is a generalized Laguerre polynomial [82], for which it suffices to say herethat it is of order n− l−1 (thus having, in general, n− l−1 zeros), and the functionYlm(θ, ϕ) is a spherical harmonic, which is a simultaneous eigenfunction of the totalangular momentum operator l 2 (with eigenvalue l(l+ 1)) and of its z-component lz(with eigenvalue m).

The hope that the one-electron orbitals that are the solutions of the Hartree-Fockproblem in many-electron atoms could not be very different from the φnlm above31,together with the powerful chemical intuition that states that ‘atoms-in-moleculesare not very different from atoms-alone’, is what mainly drives the choice of thefunctions χa in the basis set, and, in the end, the variational procedure that willbe followed is expected to fix the largest failures coming from these too-simplisticassumptions.

Hence, it is customary to choose functions that are centered at atomic nuclei andthat partially resemble the exact solutions for hydrogen-like atoms. In this spirit,the first type of AOs to be tried [79] were the Slater-type orbitals (STOs), proposedby Slater [83] and Zener [84] in 1930:

χSTOa (~r ; ~Rαa

) := N STOa Y c,s

lama(θαa

, ϕαa)∣∣~r − ~Rαa

∣∣na−1exp

(− ζa |~r − ~Rαa

|), (9.2)

where N STOa is a normalization constant and ζa is an adjustable parameter. The

index αa is that of the nucleus at which the function is centered, and, of course, inthe majority of cases, there will be several χSTO

a corresponding to different values ofa centered at the same nucleus. The integers la and ma can be considered quantumnumbers, since, due to the fact that the only angular dependence is in Y c,s

lama(see

below for a definition), the STO defined above is still a simultaneous eigenstate of

30 For consistency with the rest of the text, the Born-Oppenheimer approximation has been alsoassumed here. So that the reduced mass µ := meMN/(me+MN ) that should enter the expressionis considered to be the mass of the electron µ ≃ me (recall that, in atomic units, me = 1 andMN & 2000).

31 Note that the N -electron wavefunction of the exact fundamental state of a non-hydrogen-like atom depends on 3N spatial variables in a way that cannot be written, in general, as aSlater determinant of one-electron functions. The image of single electrons occupying definiteorbitals, together with the possibility of comparing them with the one-particle eigenfunctions ofthe Hamiltonian of hydrogen-like atoms, vanishes completely outside the Hartree-Fock formalism.

36

the one-electron angular momentum operators l 2 and lz (with the origin placed at~Rαa

). The parameter na, however, should be regarded as a ‘principal (or energy)quantum number’ only by analogy, since, on the one hand, it does not exist a ‘one-atom Hamiltonian’ whose exact eigenfunctions it could label and, on the other hand,only the leading term of the Laguerre polynomial in eq. (9.1) has been kept in theSTO32.

Additionally, in the above notation, the fact that χSTOa parametrically depends

on the position of a certain αa-th nucleus has been stressed, and the functions Y c,slama

,which are called real spherical harmonics [85] (remember that we want to do realRHF), are defined in terms of the classical spherical harmonics Ylama

by

Y clama

(θαa, ϕαa

) :=Ylama

(θαa, ϕαa

) + Y ∗lama

(θαa, ϕαa

)√2

∝ Pma

la(cos θαa

) cos(maϕαa) ,

(9.3a)

Y slama

(θαa, ϕαa

) := −iYlama(θαa

, ϕαa)− Y ∗

lama(θαa

, ϕαa)√

2∝ Pma

la(cos θαa

) sin(maϕαa) ,

(9.3b)

where c stands for cosine, s for sine, the functions Pma

laare the associated Legendre

polynomials [82], and the spherical coordinates θαaand ϕαa

also carry the αa-label toremind that the origin of coordinates in terms of which they are defined is locatedat ~Rαa

. Also note that, using that Y cla0

= Y sla0, there is the same number of real

spherical harmonics as of classical ones.These χSTO

a have some good physical properties. Among them, we shall mention

that, for |~r− ~Rαa| → 0, they present a cusp (a discontinuity in the radial derivative),

as required by Kato’s theorem [86]; and also that they decay at an exponential rate

when |~r − ~Rαa| → ∞, which is consistent with the image that, an electron that

is taken apart from the vicinity of the nucleus must ‘see’, at large distances, anunstructured point-like charge (see, for example, the STO in fig. 2). Finally, thefact that they do not present radial nodes (due to the aforementioned absence of thenon-leading terms of the Laguerre polynomial in eq. (9.1)) can be solved by makinglinear combinations of functions with different values of ζa

33.Now, despite their being good theoretical candidates to expand the MO φI that

make up the N -particle solution of the Hartree-Fock problem, these STOs haveserious computational drawbacks: Whereas the two-center integrals (such as 〈a| h |b〉in eq. (8.5)) can be calculated analytically, the four-center integrals 〈ac| 1/r |bd〉 cannot [17, 79] if functions like the ones in eq. (9.2) are used. This fact, which wasknown as “the nightmare of the integrals” in the first days of computational quantum

32 If we notice that, within the set of all possible STOs (as defined in eq. (9.2)), every hydrogen-like energy eigenfunction (see eq. (9.1)) can be formed as a linear combination, we easily see thatthe STOs constitute a complete basis set. This is important to ensure that the Hartree-Fock limitcould be actually approached by increasing M .

33 This way of proceeding renders the choice of the exponent carried by the |~r− ~Rαa| part (na−1

in the case of the STO in eq. (9.2)) a rather arbitrary one. As a consequence, different definitionsmay be found in the literature and the particular exponent chosen in actual calculations turns outto be mostly a matter of computational convenience.

37

chemistry [79], precludes the use of STOs in practical ab initio calculations of largemolecules.

A major step to overcome these difficulties that has revolutioned the whole field ofquantum chemistry [53, 79] was the introduction of Cartesian Gaussian-type orbitals(cGTO):

χcGTOa (~r ; ~Rαa

) := N cGTOa

(r1−R1

αa

)l xa (r2−R2

αa

)l ya (r3−R3

αa

)l zaexp

(−ζa |~r−~Rαa

|2),

(9.4)where the rp and the Rp

αa, with p = 1, 2, 3, are the Euclidean coordinates of the

electron and the αa-th nucleus respectively, and the integers l xa , lya and l za , which

take values from 0 to ∞, are called orbital quantum numbers34.Although these GTOs do not have the good physical properties of the STOs

(compare, for example, the STO and the GTO in fig. 2), in 1950, Boys [87] showedthat all the integrals appearing in SCF theory could be calculated analytically ifthe χa had the form in eq. (9.4). The enormous computational advantage thatthis entails makes possible to use a much larger number of functions to expandthe one-electron orbitals φi if GTOs are used, partially overcoming their bad short-and long-range behaviour and making the Gaussian-type orbitals the universallypreferred choice in SCF calculations [17].

To remedy the fact that the angular behaviour of the Cartesian GTOs in eq. (9.4)is somewhat hidden, they may be linearly combined to form Spherical Gaussian-typeorbitals (sGTO):

χsGTOa (~r ; ~Rαa

) := N sGTOa Y c,s

lama(θαa

, ϕαa)∣∣~r − ~Rαa

∣∣ la exp(− ζa |~r − ~Rαa

|2), (9.5)

which are proportional to the real spherical harmonic Y c,slama

(θαa, ϕαa

), and towhich the same remarks made in footnote 33 for the STOs, regarding the exponentin the |~r − ~Rαa

| part, may be applied.The fine mathematical details about the linear combination that relates the

Cartesian GTOs to the spherical ones are beyond the scope of this introduction.We refer the reader to refs. 88 and 85 for further information and remark here somepoints that will have interest in the subsequent discussion.

First, the cGTOs that are combined to make up a sGTO must have all thesame value of la := l xa + l ya + l za and, consequently, this sum of the three orbitalquantum numbers l xa , l

ya and l za in a particular Cartesian GTO is typically (albeit

dangerously) referred to as the angular momentum of the function. In addition,apart from the numerical value of la, the spectroscopic notation is commonly usedin the literature, so that cGTOs with la = 0, 1, 2, 3, 4, 5, . . . are called s, p, d, f, g, h,. . . , respectively. Where the first four come from the archaic words sharp, principal,diffuse and fundamental, while the subsequent ones proceed in alphabetical order.

34 Since the harmonic-oscillator energy eigenfunctions can be constructed as linear combinationsof Cartesian GTOs, we have that the latter constitute a complete basis set and, like in the case ofthe STOs, we may expect that the Hartree-Fock limit is approached as M is increased.

38

Second, for a given la > 1, there are more Cartesian GTOs ((la+1)(la+2)/2) thanspherical ones (2la + 1), in such a way that, from the (la + 1)(la + 2)/2 functionallyindependent linear combinations that can be formed using the cGTOs of angularmomentum la, only the angular part of 2la +1 of them turns out to be proportionalto a real spherical harmonic Y c,s

lama(θαa

, ϕαa); the rest of them are proportional to

real spherical harmonic functions with a different value of the angular momentumquantum number. For example, from the six different d-Cartesian GTOs, whosepolynomial parts are x2, y2, z2, xy, xz and yz (using an evident, compact notation),only five different spherical GTOs can be constructed: the ones with polynomialparts proportional to 2z2 − x2 − y2, xz, yz, x2 − y2 and xy [88]. Among these newsGTOs, which, in turn, are proportional (neglecting also powers of r, see footnote 33)

to the real spherical harmonics Y20, Yc21, Y

s21, Y

c22 and Y

s22, the linear combination x2+

y2+z2 is missing, since it presents the angular behaviour of an s-orbital (proportional

to Y00).Finally, let us remark that, whereas Cartesian GTOs in eq. (9.4) are easier to be

coded in computer applications than sGTOs[88], it is commonly accepted that thesespurious spherical orbitals of lower angular momentum that appear when cGTOsare used do not constitute efficient choices to be included in a basis set [85] (afterall, if we wanted an additional s-function, why include it in such an indirect andclumsy way instead of just designing a specific one that suits our particular needs?).Consequently, the most common practice in the field is to use Cartesian GTOsremoving from the basis sets the linear combinations such as the x2+ y2+ z2 above.

Now, even if the integrals involving cGTOs can be computed analytically, thereare still ∼ M4 of them in a SCF calculation. For example, in the model dipeptideHCO-L-Ala-NH2, which is commonly used to mimic an alanine residue in a protein[69, 72, 89, 90], there are 62 electrons and henceforth 31 RHF spatial orbitals φI .If a basis set with only 31 functions is used (this is a lower bound that will berarely reached in practical calculations due to symmetry issues, see below), near amillion of four-center 〈ac| 1/r |bd〉 integrals must be computed. This is why, onemust use the freedom that remains once the decision of sticking to cGTOs has beentaken (namely, the choice of the exponents ζa and the angular momentum la) todesign basis sets that account for the relevant behaviour of the systems studiedwhile keeping M below the ‘pain threshold’.

The work by Nobel Prize John Pople’s group has been a major reference in thisdiscipline, and their STO-nG family [91], together with the split-valence Gaussianbasis sets, 3-21G, 4-31G, 6-31G, etc. [92, 93, 94, 95, 96, 97, 98, 99], shall be usedhere to exemplify some relevant issues.

To begin with, let us recall that the short- and long-range behaviour of the Slater-type orbitals in eq. (9.2) is better than that of the more computationally efficientGTOs. In order to improve the physical properties of the latter, it is customaryto linearly combine Ma Cartesian GTOs, denoted now by ξ µ

a (µ = 1, . . . ,Ma), andtermed primitive Gaussian-type orbitals (PGTO), having the same atomic center~Rαa

, the same set of orbital quantum numbers, l xa , lya and l za , but different exponents

ζ µa , to make up a contracted Gaussian-type orbitals (CGTO), defined by

39

4 3 2 1 0

ra

contracted GTOprimitive GTOs

STOsingle GTO

Figure 2: Radial behaviour of the 1s-contracted GTO of the hydrogen atom in the STO-

3G basis set [91], the three primitive GTOs that form it, the STO that is meant to be

approximated and a single GTO with the same norm and exponent as the STO. The

notation ra is shorthand for the distance to the αa-th nucleus |~r − ~Rαa |.

χa(~r ; ~Rαa) :=

Ma∑

µ

g µa ξ

µa (~r ;

~Rαa) =

(r1 −R1

αa

)l xa (r2 −R2

αa

)l ya (r3 − R3

αa

)l za Ma∑

µ

g µa N µ

a exp(− ζ µ

a |~r − ~Rαa|2),(9.6)

where the normalization constants N µa have been kept inside the sum because

they typically depend on ζ µa . Also, we denote now by MC the number of contracted

GTOs and, by MP :=∑

aMa, the number of primitive ones.In the STO-nG family of basis sets, for example, n primitive GTOs are used for

each contracted one, fitting the coefficients g µa and the exponents ζ µ

a to resemble theradial behavior of Slater-type orbitals [91]. In fig. 2, the 1s-contracted GTO (seethe discussion below) of the hydrogen atom in the STO-3G basis set is depicted,together with the three primitive GTOs that form it and the STO that is meant tobe approximated35. We can see that the contracted GTO has a very similar behaviorto the STO in a wide range of distances, while the single GTO that is also shownin the figure (with the same norm and exponent as the STO) has not.

35 Basis sets were obtained from the Extensible Computational Chemistry Environment BasisSet Database at http://www.emsl.pnl.gov/forms/basisform.html, Version 02/25/04, as devel-oped and distributed by the Molecular Science Computing Facility, Environmental and MolecularSciences Laboratory which is part of the Pacific Northwest Laboratory, P.O. Box 999, Richland,Washington 99352, USA, and funded by the U.S. Department of Energy. The Pacific NorthwestLaboratory is a multi-program laboratory operated by Battelle Memorial Institute for the U.S. De-partment of Energy under contract DE-AC06-76RLO 1830. Contact Karen Schuchardt for furtherinformation.

40

1s-shell 2sp-shell

ζ µa g µ

a ζ µa g µ

a (p) g µa (s)

71.6168370 0.15432897 2.9412494 -0.09996723 0.1559162713.0450960 0.53532814 0.6834831 0.39951283 0.607683723.5305122 0.44463454 0.2222899 0.70115470 0.39195739

Table 3: Exponents ζ µa and contraction coefficients g µ

a of the primitive Gaussian shells

that make up the three different contracted ones in the STO-3G basis set for carbon (see

ref. 91 and footnote 35). The exponents of the 2s- and 2p-shells are constrained to be the

same.

Typically, the fitting procedure that leads to contracted GTOs is performed onisolated atoms and, then, the already mentioned chemical intuition that states that‘atoms-in-molecules are not very different from atoms-alone’ is invoked to keep thelinear combinations fixed from there on. Obviously, better results would be obtainedif the contraction coefficients were allowed to vary. Moreover, the number of four-center integrals that need to be calculated depends on the number of primitive GTOs(like ∼ M4

P ), so that we have not gained anything on this point by contracting.However, the size of the variational space is MC (i.e., the number of contractedGTOs), in such a way that, once the integrals 〈ac| 1/r |bd〉 are calculated (for non-direct SCF), all subsequent steps in the iterative self-consistent procedure scale aspowers of MC . Also, the disk storage (again, for non-direct schemes) depends onnumber of contracted GTOs and, frequently, it is the disk storage and not the CPUtime the limiting factor of a calculation.

An additional chemical concept that is usually defined in this context and thatis needed to continue with the discussion is that of shell : Atomic shells in quantumchemistry are defined analogously to those of the hydrogen atom, so that each elec-tron is regarded as ‘filling’ the multi-electron atom ‘orbitals’ according to Hund’srules [100]. Hence, the occupied shells of carbon, for example, are defined to be 1s,2s and 2p, whereas those of, say, silicon, would be 1s, 2s, 2p, 3s an 3p. Each shellmay contain 2(2l + 1) electrons if complete, where 2l + 1 accounts for the orbitalangular momentum multiplicity and the 2 factor for that of electron spin.

Thus, using these definitions, all the basis sets in the aforementioned STO-nGfamily are minimal ; in the sense that they are made up of only 2l + 1 contractedGTOs for each completely or partially occupied shell, so that the STO-nG basis setsfor carbon, for example, contain two s-type contracted GTOs (one for the 1s- and theother for the 2s-shell) and three p-type ones (belonging to the 2p-shell). Moreover,due to rotational-symmetry arguments in the isolated atoms, all the 2l+1 functionsin a given shell are chosen to have the same exponents and the same contractioncoefficients, differing only on the polynomial that multiplies the Gaussian part. Such2l + 1 CGTOs shall be said to constitute a Gaussian shell (GS), and we shall alsodistinguish between the primitive (PGS) and contracted (CGS) versions.

In table 3, the exponents ζ µa and the contraction coefficients g µ

a of the primitive

41

1s-shell 2sp-shell

ζ µa g µ

a ζ µa g µ

a (p) g µa (s)

3047.52490 0.0018347 7.8682724 -0.1193324 0.0689991457.36951 0.0140373 1.8812885 -0.1608542 0.3164240103.94869 0.0688426 0.5442493 1.1434564 0.744308329.21015 0.23218449.286663 0.4679413 0.1687144 1.0000000 1.00000003.163927 0.3623120

Table 4: Exponents ζ µa and contraction coefficients g µ

a of the primitive Gaussian shells

that make up the three different constrained ones in the 6-31G basis set for carbon (see

ref. 93 and footnote 35). In the 1s-shell, there is only one contracted Gaussian shell made

by six primitive ones, whereas, in the 2s- and 2p-valence shells, there are two CGSs, one

of them made by three PGSs and the other one only by a single PGS. The exponents of

the 2s- and 2p-shells are constrained to be the same.

GTOs that make up the three different shells in the STO-3G basis set for carbon arepresented (see ref. 91 and footnote 35). The fact that the exponents ζ µ

a in the 2s-and 2p-shells are constrained to be the same is a particularity of some basis sets (likethis one) which saves some computational effort and deserves no further attention.

Next, let us introduce a common notation that is used to describe the contractionscheme: It reads (primitive shells) / [contracted shells ], or alternatively (primitiveshells) → [contracted shells ]. According to it, the STO-3G basis set for carbon,for example, is denoted as (6s,3p) → [2s,1p], or (6,3) → [2,1]. Moreover, since fororganic molecules it is frequent to have only hydrogens and the 1st-row atoms C,N and O36 (whose occupied shells are identical), the notation is typically extendedand the two groups of shells are separated by a slash; as in (6s,3p/3s) → [2s,1p/1s]for STO-3G.

The first improvement that can be implemented on a minimal basis set such asthe ones in the STO-nG is the splitting, which consists in including more than oneGaussian shell for each occupied one. If the splitting is evenly performed, i.e., eachshell has the same number of GSs, then the basis set is called double zeta (DZ),triple zeta (TZ), quadruple zeta (QZ), quintuple zeta (5Z), sextuple zeta (6Z), andso on; where the word zeta comes from the Greek letter ζ used for the exponents.A hypothetical TZ basis set in which each CGTO is made by, say, four primitiveGTOs, would read (24s,12p/12s) → [6s,3p/3s] in the aforementioned notation.

At this point, the already familiar intuition that says that ‘atoms-in-molecules arenot very different from atoms-alone’ must be refined with another bit of chemicalexperience and qualified by noticing that ‘core electrons are less affected by themolecular environment and the formation of bonds than valence electrons’37. In this

36 In proteins, one may also have sulphur in cysteine and methionine residues.37 Recall that, for the very concept of ‘core’ or ‘valence electrons’ (actually for any label applied

to a single electron) to have any sense, we must be in the Hartree-Fock formalism (see footnote 31).

42

spirit, the above evenness among different shells is typically broken, and distinctbasis elements are used for the energetically lowest lying (core) shells than for thehighest lying (valence) ones.

On one side, the contraction scheme may be different. In which case, the notationused up to now becomes ambiguous, since, for example, the designation (6s,3p/3s)→[2s,1p/1s], that was said to correspond to STO-3G, would be identical for a differentbasis set in which the 1s-Gaussian shell of heavy atoms be formed by 4 PGSs andthe 2s-Gaussian shell by 2 PGSs (in 1st-row atoms, the 2s- and 2p-shells are definedas valence and the 1s-one as core, while in hydrogen atoms, the 1s-shell is a valenceone). This problem can be solved by explicitly indicating how many primitive GSsform each contracted one, so that, for example, the STO-3G basis set is denoted by(33,3/3) → [2,1/1], while the other one mentioned would be (42,3/3) → [2,1/1] (wehave chosen to omit the angular momentum labels this time).

The other point at which the core and valence Gaussian shells may differ is intheir respective ‘zeta quality’, i.e., the basis set may contain a different number ofcontracted Gaussian shells in each case. For example, it is very common to use asingle CGS for the core shells and a multiple splitting for the valence ones. Thesetype of basis sets are called split-valence and the way of naming their quality isthe same as before, except for the fact that a capital V, standing for valence, isadded either at the beginning or at the end of the acronyms DZ, TZ, QZ, etc., thusbecoming VDZ, VTZ, VQZ, etc. or DZV, TZV, QZV, etc.

Pople’s 3-21G [97], 4-31G [92], 6-31G [93] and 6-311G [96] are well-known ex-amples of split-valence basis sets that are commonly used for SCF calculations inorganic molecules and that present the two characteristics discussed above. Theirnames indicate the contraction scheme, in such a way that the number before thedash represents how many primitive GSs form the single contracted GSs that is usedfor core shells, and the numbers after the dash how the valence shells are contracted,in much the same way as the notation in the previous paragraphs. For example, the6-31G basis set (see table 4), contains one CGS made up of six primitive GSs in the1s-core shell of heavy atoms (the 6 before the dash) and two CGS, formed by threeand one PGSs respectively, in the 2s- and 2p-valence shells of heavy atoms and inthe 1s-shell of hydrogens (the 31 after the dash). The 6-311G basis set, in turn, isjust the same but with an additional single-primitive Gaussian shell of functions inthe valence region. Finally, to fix the concepts discussed, let us mention that, usingthe notation introduced above, these two basis sets may be written as (631,31/31)→ [3,2/2] and (6311,311/311) → [4,3/3], respectively.

Two further improvements that are typically used and that may also be incor-porated to Pople’s split-valence basis sets are the addition of polarization [94, 95] ordiffuse functions [95, 98, 99]. We shall discuss them both to close this section.

Up to now, neither the contraction nor the splitting involved GTOs of largerangular momentum than the largest one among the occupied shells. However, themolecular environment is highly anisotropic and, for most practical applications,it turns out to be convenient to add these polarization (large angular momentum)Gaussian shells to the basis set, since they present lower symmetry than the GSsdiscussed in the preceding paragraphs. Typically, the polarization shells are single-primitive GSs and they are denoted by adding a capital P to the end of the previous

43

acronyms, resulting into, for example, DZP, TZP, VQZP, etc., or, say, DZ2P, TZ3P,VQZ4P if more than one polarization shell is added. In the case of Pople’s basissets [94, 95], these improvements are denoted by specifying, in brackets and afterthe letter G, the number and type of the polarization shells separating heavy atomsand hydrogens by a comma38. For example, the basis set 6-31G(2df,p) contains thesame Gaussian shells as the original 6-31G plus two d-type shells and one f-type shellcentered at the heavy atoms, as well as one p-type shell centered at the hydrogens.

Finally, for calculations in charged species (specially anions), where the chargedensity extends further in space and the tails of the distribution are more importantto account for the relevant behaviour of the system, it is common to augment thebasis sets with diffuse functions, i.e., single-primitive Gaussian shells of the sameangular momentum as some preexisting one but with a smaller exponent ζ thanthe smallest one in the shell. In general, this improvement is commonly denoted byadding the prefix aug- to the name of the basis set. In the case of Pople’s basis sets,on the other hand, the insertion of a plus sign ‘+’ between the contraction schemeand the letter G denotes that the set contains one diffuse function in the 2s- and2p-valence shells of heavy atoms. A second + indicates that there is another one inthe 1s-shell of hydrogens. For example, one may have the doubly augmented (anddoubly polarized) 6-31++G(2d,2p) basis set.

10 Møller-Plesset 2

Despite the simplicity of the variational ansatz in eq. (7.1) and the severe truncationof the N -electron space that it implies, Hartree-Fock calculations typically accountfor a large part of the energy of molecular systems [17, 18], and they are knownto yield surprisingly accurate results in practical cases [68, 101, 102, 103, 104].Therefore, it is not strange that the Hartree-Fock method is chosen as a first stepin many works found in the literature [13, 72, 105, 106].

However, the interactions that drive the conformational behaviour of organicmolecules are weak and subtle [107, 108], so that, sometimes, higher levels of thetheory than HF are required in order to achieve the target accuracy (see for examplerefs. 72, 74). As we discussed in the previous sections, the error in the Hartree-Fockresults may come from two separate sources: the fact that we are using only oneSlater determinant (see sec. 7) and the fact that, in actual calculations, the basisset that spans the one-electron space is finite (see sec. 9). Increasing the basis setsize leads to the so-called Hartree-Fock limit, i.e., the exact HF result. On theother hand, any improvement in the description of the N -electron space is said toadd correlation to the problem, since, due to the good behaviour mentioned above,Hartree-Fock is almost ever used as a reference to improve the formalism, and, aswe remarked in sec. 7, the popular RHF and UHF versions of Hartree-Fock presentsome interesting properties of statistical independence between pairs of electrons.Apart from that, the word correlation is not much more than a convenient wayof referring to the difference between the exact fundamental state of the electronic

38 There also exists an old notation for the addition of a single polarization shell per atom thatreads 6-31G** and that is equivalent to 6-31G(d,p).

44

Schrodinger equation and that provided by the solution of the Hartree-Fock problemin the infinite basis set limit. In this spirit, the correlation energy is defined as

Ecorr := E − EHF , (10.1)

where E is the exact energy and EHF is the Hartree-Fock one (in the HF limit).One may improve the description of theN -electron wavefunction in different ways

that yield methods of distinct computational and physical properties (see refs. 17,18, 109, 110 for detailed accounts). Here, we shall circumscribe to the particularfamily of methods known as Møller-Plesset, after the physicists that introducedthem in 1934 [111], and, we shall give explicit expressions for the 2nd order Møller-Plesset (MP2) variant. MP2 is typically regarded as accurate in the literature andit is considered as the reasonable starting point to include correlation [8, 102, 106,108, 112, 113]. It is also commonly used as a reference calculation to evaluate orparameterize less demanding methods [11, 13, 114, 115, 116].

The basic formalism on which the Møller-Plesset methods are based is the Rayleigh-Schrodinger perturbation theory [18]. According to it, one must separate, if possible,the exact Hamiltonian operator H of the problem into a part H0 which is easy tosolve and another one H ′ which is ‘much smaller’ than H0 and that can, therefore,be regarded as a perturbation:

H = H0 + H ′ . (10.2)

Then, a dimensionless parameter λ is introduced,

H(λ) = H0 + λH ′ , (10.3)

so that, for λ = 0, we have the unperturbed Hamiltonian (i.e., H(0) = H0), and,for λ = 1, we have the exact one (i.e., H(1) = H).

The only utility of this parameter λ is to formally expand any exact eigenfunctionΨn of H and the corresponding exact eigenvalue En in powers of it,

Ψn(x) = Ψ(0)n (x) + λΨ(1)

n (x) + λ2Ψ(2)n (x) + λ3Ψ(3)

n (x) + o(λ4) , (10.4a)

En = E(0)n + λE(1)

n + λ2E(2)n + λ3E(3)

n + o(λ4) , (10.4b)

and then truncate the expansion at low orders invoking the aforementioned ‘rel-ative smallness’ of H ′39.

Now, to calculate the different terms in the expressions above, we take them,together with eq. (10.3), to the Schrodinger equation HΨn(x) = EnΨn(x), obtaininga series of coupled relations that result from equating the factors that multiply eachpower of λ:

39 The approach presented here is common in quantum chemistry books [17, 18] but differentfrom the one usually taken in physics [117, 118], where the perturbation is not H ′ but λH ′, sinceλ is considered to be small and H ′ of the same ‘size’ as H0. Nevertheless, the derivation and theconcepts involved are very similar and, of course, the final operative equations are identical.

45

H0Ψ(0)n (x) = E(0)

n Ψ(0)n (x) , (10.5a)

H0Ψ(1)n (x) + H ′Ψ(0)

n (x) = E(0)n Ψ(1)

n (x) + E(1)n Ψ(0)

n (x) , (10.5b)

H0Ψ(2)n (x) + H ′Ψ(1)

n (x) = E(0)n Ψ(2)

n (x) + E(1)n Ψ(1)

n (x) + E(2)n Ψ(0)

n (x) , (10.5c)

. . .

H0Ψ(k)n (x) + H ′Ψ(k−1)

n (x) =

k∑

i=0

E(i)n Ψ(k−i)

n (x) . (10.5d)

Eq. (10.5a) indicates that the zeroth order energies E(0)n and wavefunctions

Ψ(0)n (x) are actually the eigenvalues and eigenvectors of the unperturbed Hamil-

tonian H0. Then, we denote |n〉 := |Ψ(0)n 〉 and require that 〈n|Ψn〉 = 1, ∀n. This

condition, which is called intermediate normalization, can always be achieved, if|Ψn〉 and |n〉 are not orthogonal, by appropriately choosing the norm of |Ψn〉. Next,if we ‘multiply’ eq. (10.4a) from the left by 〈n| and use this property, we have that

〈n|Ψn〉 = 〈n|n〉+ λ 〈n|Ψ(1)n 〉+ λ2〈n|Ψ(2)

n 〉+ λ3〈n|Ψ(3)n 〉+ o(λ4) = 1 . (10.6)

Since this identity must hold for every value of λ, we see that, due to the normal-ization chosen, the zeroth order eigenfunctions |n〉 are orthogonal to all their higher

order corrections, i.e., that 〈n|Ψ(k)n 〉 = 0, ∀k > 0. Next, if we use this property and

multiply all eqs. (10.5) from the left by 〈n|, we obtain very simple expressions forthe different orders of the energy:

E(0)n = 〈n| H0 |n〉 , (10.7a)

E(1)n = 〈n| H ′ |n〉 , (10.7b)

E(2)n = 〈n| H ′ |Ψ(1)

n 〉 , (10.7c)

. . .

E(k)n = 〈n| H ′ |Ψ(k−1)

n 〉 . (10.7d)

The expressions for the zeroth and first orders of the n-th energy level En, ineqs. (10.7a) and (10.7b) respectively, do not require the knowledge of any correction

to the wavefunction. The second order energy E(2)n , however, involves the function

Ψ(1)n , which is a priori unknown.In order to arrive to an equation for E

(2)n that is operative, we write each correc-

tion Ψ(k>0)n to the n-th eigenstate (and particularly Ψ

(1)n ) as a linear combination of

the eigenstates of H0 themselves (which we assume to form a complete basis of theN -electron Hilbert space)40:

40 Like we did in sec. 4, we assume that the Hamiltonian operator H0 has only discrete spectrumin order to render the derivation herein more clear. Anyway, in the only application of the Rayleigh-Schrodinger perturbation theory that is performed in this document, the Møller-Plesset family ofmethods in computational quantum chemistry, the Hilbert space is finite-dimensional (since thebasis sets is finite) and we are in the conditions assumed.

46

|Ψ(k)n 〉 =

∑

m6=n

c(k)nm|m〉 =∑

m6=n

〈m|Ψ(k)n 〉 |m〉 , ∀k > 0 , (10.8)

where the m = n term has been omitted because, from the orthogonality condi-tions mentioned in the preceding paragraphs, it follows that c

(k)nn = 0, ∀k > 0.

Now, we take this to eq. (10.7c) and obtain

E(2)n =

∑

m6=n

〈n| H ′ |m〉〈m|Ψ(1)n 〉 . (10.9)

Next, we rearrange eq. (10.5b):

(E(0)

n − H0

)|Ψ(1)

n 〉 =(H ′ −E(1)

n

)|n〉 =

(H ′ − 〈n| H ′ |n〉

)|n〉 , (10.10)

where, in the last step, eq. (10.7b) has been used.Recalling that 〈m|n〉 = 0, we multiply the identity above from the left by an

arbitrary eigenvector 〈m| 6= 〈n| of H0:

(E(0)

n −E(0)m

)〈m|Ψ(1)

n 〉 = 〈m| H ′ |n〉 =⇒ 〈m|Ψ(1)n 〉 = 〈m| H ′ |n〉

E(0)n − E

(0)m

. (10.11)

Finally, we take this to eq. (10.9) to write an operative expression for the secondorder correction to the energy:

E(2)n =

∑

m6=n

|〈m| H ′ |n〉|2E

(0)n −E

(0)m

. (10.12)

The higher order corrections, which can be derived performing similar calcula-tions, shall not be obtained here. Also, note that, although the equation above isvalid for all n, we intend to apply it to the fundamental state, in such a way thatn = 0 and, since E

(0)0 < E

(0)m , ∀m 6= 0, all the terms in the sum are negative, and

E(2)0 < 0.So far, we have introduced the Rayleigh-Schrodinger perturbation theory in a

completely general manner that may be applied to any quantum system. Thepractical use of all this formalism in quantum chemistry is what is known as theMøller-Plesset approach. According to it, one must first perform a SCF Hartree-Fock calculation (which shall be assumed here to be of the RHF class) using a finitebasis set of size M (see secs. 8 and 9 for reference). As a result, a set of M orthogo-nal orbitals φa (also specified by M-tuples cba) are produced, among which the N/2ones with the lowest eigenvalues εa of the self-consistent Fock operator F [φ] are saidto be occupied and denoted with capital indices starting in the middle part of thealphabet (I, J,K, L, . . .). The remaining M − N/2 are called virtual orbitals andshall be indicated using indices beginning by r, s, t, . . .

The unperturbed Hamiltonian H0 used in the general expressions above is definedto be here the sum of the self-consistent one-electron Fock operators in eq. (7.49),acting each one of them on one of the N coordinates ~ri (just add an i-label to the ~r

47

coordinates in the definition of the Coulomb and exchange operators in in eq. (7.44)and simply use eq. (7.6) for hi):

H0 :=

N∑

i

Fi[φ] =

N∑

i

hi +

N/2∑

J

(2J i

J [φ]− KiJ [φ]

) . (10.13)

The perturbation H ′ is the difference

H ′ := H − H0 =1

2

∑

i 6=j

1

rij−

N/2∑

J

(2J i

J [φ]− KiJ [φ]

), (10.14)

where H is the total electronic Hamiltonian in eq. (5.1) and one must note thatthere is no reason to believe that H ′ is ‘small’. As a result, the Møller-Plessetseries may be ill-behaved or even divergent in some situations [17] and, like in theHartree-Fock case, the quality of theoretical predictions must be ultimately assessedby comparison to experimental data or to more accurate methods.

Now, if, despite this lack of certainty, we are brave enough to keep on walking theRayleigh-Schrodinger path, we must first note that, although it is easy to show thatthe N -particle Slater determinant constructed with the N/2 Hartree-Fock optimizedorbitals is an eigenstate of H0, we need to know the rest of the eigenstates if thesecond order expression for the energy in eq. (10.12) is to be used.

The whole N -electron Hilbert space in which these eigenstates ‘live’ has infinitedimension, however, as a part of the Møller-Plesset strategy, there is a convenientway of extrapolating the LCAO approximation in sec. 8 to the N-particle space.This strategy is not only used in MP2 but also in other post-HF methods [17, 18,109, 110] and it allows us to construct a natural finite basis of eigenfunctions ofH0. The trick is simple: Due to the fact that, except for the case of minimalbasis sets and complete-shell atoms (noble gases), the total number of canonicalorbitals φa obtained from a SCF calculation (M) is larger than that of the occupiedones (N/2), we may take any φI from this second group and substitute it by anyφr among the virtual orbitals in order to form a new Slater determinant. This iscalled a single substitution, and the determinants obtained from all possible single,double, triple,. . . , M − 1, and M substitutions constitute a basis set of N -particleeigenfunctions of H0 that is complete in the (antisymmetrized) tensorial product ofthe one-particle Hilbert spaces spanned by the M molecular orbitals φa (i.e., in thefull-Configuration Interaction space [17, 18, 109, 110]).

Any one of these N -electron states is completely specified, in the RHF case,by two unordered set of integers: on the one hand, Iα := {a1, a2, . . . , aN/2}, whichcontains the indices of the spatial orbitals that appear multiplied by α spin functions,and, on the other, Iβ := {b1, b2, . . . , bN/2}, which corresponds to the β electrons. Inorder for the Slater determinant defined by a given pair I := (Iα, Iβ) not to be zero,all the indices within each set must be different, however, any of them may appearin both sets at the same time. The number of possible Iα (or Iβ) sets is, therefore,

(M

N/2

):=

M !

(M −N/2)!(N/2)!,

48

and, consequently, the number of different RHF Slater determinants constructedfollowing this procedure (the number of elements in I) is

(M

N/2

)2

:=

(M !

(M −N/2)!(N/2)!

)2

,

which, already for small molecules and small basis sets, is a daunting number.For example, in the case of the model dipeptide HCO-L-Ala-NH2 [69, 72, 89, 90]and the 6-31G(d) basis set (see sec. 9), we have that N/2 = 31 and M = 128,so that there exist more than 1059 different RHF Slater determinants that can beconstructed with the Hartree-Fock molecular orbitals.

Due to the fact that any given orbital φa is an eigenfunction of the Fock operatorF [φ] with eigenvalue εa, one can show that, if ΨI is the Slater determinant thatcorresponds to the set of indices I :=

({a1, . . . , aN/2}, {b1, . . . , bN/2}

), then, it is an

eigenstate of H0 with eigenvalue

√N !

N/2∑

I

εaI +

N/2∑

I

εbI

. (10.15)

Henceforth, it is clear that the Slater determinant of lowest H0 eigenvalue, i.e.,the zeroth order approximation |0〉 to the fundamental state, is the one that corre-sponds to the particular set of orbital indices I0 := {1, 1, 2, 2, . . . , N/2, N/2}. Thisis precisely the optimal Hartree-Fock wavefunction, and the zeroth order approxi-mation to the energy of the fundamental state in eq. (10.7a) is

E(0)0 = 2

N/2∑

I

εI . (10.16)

Next, in order to find the first and second order corrections, we need to calculate,in the basis of N -electron functions introduced above, the matrix elements 〈m| H ′ |n〉which are associated to the perturbation.

First, if we are interested in a diagonal matrix element 〈ΨI | H ′ |ΨI〉, correspond-ing to the eigenstate |ΨI〉 := |m〉 = |n〉, the calculations performed in sec. 7 can berecalled to show that, in the RHF case, we have that

〈ΨI| H ′ |ΨI〉 =

−N/2∑

I,J

[〈φaIφbJ |

1

r|φaIφbJ 〉+

1

2

(〈φaIφaJ |

1

r|φaIφaJ 〉+ 〈φbIφbJ |

1

r|φbIφbJ 〉

)

− 1

2

(〈φaIφaJ |

1

r|φaJφaI 〉+ 〈φbIφbJ |

1

r|φbJφbI 〉

)]. (10.17)

If we now particularize this equation to the case of the Hartree-Fock wavefunction|ΨI0〉 = |0〉, an operative expression for the first order energy correction in eq. (10.7b)is obtained:

49

E(1)0 = 〈0| H ′ |0〉 = −

N/2∑

I,J

(2〈φIφJ |

1

r|φIφJ〉 − 〈φIφJ |

1

r|φJφI〉

). (10.18)

Using this relation together with eq. (10.16), we see that the first order-correctedenergy, i.e., the MP1 result,

EMP10 := E

(0)0 +E

(1)0 = 2

N/2∑

I

εI−N/2∑

I,J

(2〈φIφJ |

1

r|φIφJ〉−〈φIφJ |

1

r|φJφI〉

), (10.19)

exactly matches the RHF energy in eq. (7.50), so that the differences betweenHartree-Fock and Møller-Plesset perturbation theory appear at second order for thefirst time.

In order to calculate E(2)0 , we first remark that it can be easily shown that, since

H ′ is made up of (at most) two-body operators, the matrix element 〈m| H ′ |n〉 willbe zero if |n〉 and |m〉 differ in more than two φa orbitals (due to the orthogonalityof the latter).

On the other hand, Brillouin’s theorem [17, 18] states that the matrix elements〈m| H |n〉 of the exact electronic Hamiltonian H between states that differ in only oneorbital are zero, and, using again the orthogonality of the φa, one can additionallyshow that the same happens for matrix elements 〈m| H0 |n〉 of the unperturbedHamiltonian. Since H ′ is just the difference H − H0, it follows that 〈m| H ′ |n〉 iszero if |n〉 and |m〉 differ in only one φa orbital.

Putting all together, the only terms that survive in the sum in eq. (10.12) (forn = 0) are those corresponding to matrix elements between the Hartree-Fock fun-damental state |0〉 and its doubly-substituted Slater determinants. Consequently,and after some calculations, it can be shown that the second order correction to theenergy of the fundamental state in terms of the SCF orbitals |φa〉 (denoted again as|a〉) is given by

E(2)0 =

N/2∑

I,J

M∑

r,s=N/2+1

2∣∣〈IJ | 1

r|rs〉

∣∣2 − 〈IJ | 1r|rs〉〈IJ | 1

r|sr〉

εI + εJ − εr − εs. (10.20)

The computational cost of calculating E(2)0 scales like M5 [17], and the MP2

energy is

EMP20 := E

(0)0 + E

(1)0 + E

(2)0 = ERHF

0 + E(2)0 . (10.21)

Finally, let us note that, in agreement with the already mentioned chemicalintuition (in sec. 9) that states that ‘core electrons are less affected by the molecularenvironment and the formation of bonds than valence electrons’, it is common toremove the core shells from the calculation of the second order correction to theenergy in eq. (10.20). This is called the frozen core approximation.

50

A Functional derivatives

A functional F [Ψ] is a mapping that takes functions to numbers (in this document,only functionals in the real numbers are going to be considered):

F : G −→ R

Ψ 7−→ F [Ψ]

For example, if the function space G is the Hilbert space of square-integrablefunctions L2 (the space of states of quantum mechanics), the objects in the domainof F (i.e., the functions in L2) can be described by infinite-tuples (c1, c2, . . .) ofcomplex numbers and F may be pictured as a function of infinite variables.

When dealing with function spaces G that meet certain requirements41, the limiton the left-hand side of the following equation can be written as the integral on theright-hand side:

limε→0

F [Ψ0 + εδΨ]− F [Ψ0]

ε:=

∫δF [Ψ0]

δΨ(x)δΨ(x)dx , (A.1)

where x denotes a point in the domain of the functions in G, and the the object(δF [Ψ0]/δΨ)(x) (which is a function of x not necessarily belonging to G) is calledthe functional derivative of F [Ψ] in the in the point Ψ0.

One common use of this functional derivative is to find stationary points offunctionals. A function Ψ0 is said to be an stationary point of F [Ψ] if:

δF [Ψ0]

δΨ(x) = 0 . (A.2)

In order to render this definition operative, one must have a method for comput-ing (δF [Ψ0]/δΨ)(x). Interestingly, it is possible, in many useful cases (and in all theapplications of the formalism in this document), to calculate the sought derivativedirectly from the left-hand side of eq. A.1. The procedure, in such a situation, beginsby writing out F [Ψ0 + εδΨ] and clearly separating the different orders in ε. Sec-ondly, one drops the terms of zero order (by virtue of the subtraction of the quantityF [Ψ0]) and those of second order or higher (because they vanish when divided by εand the limit ε → 0 is taken). The remaining terms, all of order one, are dividedby ε and, finally, (δF [Ψ0]/δΨ)(x) is identified out of the resulting expression (whichmust written in the form of the right-hand side of eq. A.1). For a practical exampleof this process, see secs. 6 and 7.

B Lagrange multipliers and constrained

stationary points

Very often, when looking for the stationary points of a function (or a functional), thesearch space is not the whole one, in which the derivatives are taken, but a certain

41 We will not discuss the issue further but let it suffice to say that L2 does satisfy theserequirements.

51

Figure 1: Schematic depiction of a constrained stationary points problem. Σ is the

2-dimensional search space, which is embedded in R3. The white-filled circles are solu-

tions of the unconstrained problem only, the gray-filled circles are solutions of only the

constrained one and the gray-filled circles inside white-filled circles are solutions of both.

A, B, C and D are examples of different situations discussed in the text.

subset of it defined by a number of constraints. An elegant and useful method forsolving the constrained problem is that of the Lagrange multipliers.

Although it can be formally generalized to infinite dimensions (i.e., to functionals,see appendix A), here we will introduce the method in R

N in order to gain somegeometrical insight and intuition.

The general framework may be described as follows: we have a differentiablefunction f(~x) that takes points in R

N to real numbers and we want to find thestationary points of f restricted to a certain subspace Σ of RN, which is defined byK constraints42:

Li(~x) = 0 i = 1, . . . , K . (B.1)

The points that are the solution of the constrained problem are those ~x belongingto Σ where the first order variation of f would be zero if the derivatives were taken‘along’ Σ. In other words, the points ~x where the gradient ~∇f has only components(if any) in directions that ‘leave’ Σ (see below for a rigorous formalization of theseintuitive ideas). Thus, when comparing the solutions of the unconstrained problemto the ones of the constrained problem, three distinct situations arise (see fig. 1):

i) A point ~x is a solution of the unconstrained problem (i.e. it satisfies ~∇f(~x) = 0)but it does not belong to Σ. Hence, it is not a solution of the constrainedproblem. This type of point is depicted as a white-filled circle in fig. 1.

42 If the constraints are functionally independent, one must also ask that K < N . If not, Σ willbe either a point (if K = N) or empty (if K > N).

52

ii) A point ~x is a solution of the unconstrained problem (i.e. it satisfies ~∇f(~x) = 0)and it belongs to Σ. Hence, it is also a solution of the constrained problem,since, in particular, the components of the gradient in directions that do notleave Σ are zero. This type of point is depicted as a gray-filled circle inside awhite-filled circle in fig. 1.

iii) A point ~x is not a solution of the unconstrained problem (i.e., one has ~∇f(~x) 6= 0)but it belongs to Σ and the only non-zero components of the gradient are indirections that leave Σ. Hence, it is a solution of the constrained problem.This type of point is depicted as a gray-filled circle in fig. 1.

From this discussion, it can be seen that, in principle, no conclusions about thenumber (or existence) of solutions of the constrained problem may be drawn onlyfrom the number of solutions of the unconstrained one. This must be investigatedfor each particular situation.

In fig. 1, an schematic example in R3 is depicted. The constrained search space

Σ is a 2-dimensional surface and the direction43 in which one leaves Σ is shownat several points as a perpendicular vector ~pΣ. In such a case, the criterium that~∇f has only components in the direction of leaving Σ may be rephrased by asking~∇f to be parallel to ~pΣ, i.e., by requiring that there exists a number λ such that~∇f = −λ~pΣ. The case λ = 0 is also admitted and the explanation of the minus signwill be given in the following.

In this case, K = 1, and one may note that the perpendicular vector ~pΣ isprecisely ~pΣ = ~∇L1. Let us define f as

f(~x) := f(~x) + λL1(~x) . (B.2)

It is clear that, requiring the gradient of f to be zero, one recovers the condition~∇f = −λ~pΣ, which is satisfied by the points solution of the constrained stationarypoints problem. If one also asks that the derivative of f with respect to λ be zero,the constraint L1(~x) = 0 that defines Σ is obtained as well.

This process illustrates the Lagrange multipliers method in this particular exam-ple. In the general case, described by eq. B.1 and the paragraph above it, it can beproved that the points ~x which are stationary subject to the constraints imposedsatisfy

~∇f(~x) = 0 and∂f(~x)

∂λi= 0 i = 1, . . . , K , (B.3)

where

f(~x) := f(~x) +

K∑

i=1

λiLi(~x) . (B.4)

43 Note that, only ifK = 1, i.e., if the dimension of Σ is N−1, there will be a vector perpendicularto the constrained space. For K > 1, the dimensionality of the vector space of the directions inwhich one ‘leaves’ Σ will be also larger than 1.

53

Of course, if one follows this method, the parameters λi (which are, in fact, theLagrange multipliers) must also be determined and may be considered as part of thesolution.

Also, it is worth remarking here that any two pair of functions, f1 and f2, of RN

whose restrictions to Σ are equal (i.e., that satisfy f1|Σ = f2|Σ) obviously representthe same constrained problem and they may be used indistinctly to construct theauxiliary function f . This fact allows us, after having constructed f from a particularf , to use the equations of the constraints to change f by another simpler functionwhich is equal to f when restricted to Σ. This freedom is used to derive the Hartreeand Hartree-Fock equations, in secs. 6 and 7, respectively.

The formal generalization of these ideas to functionals (see appendix A) isstraightforward if the space R

N is substituted by a functions space F , the points ~xby functions, the functions f , Li and f by functionals and the requirement that thegradient of a function be zero by the requirement that the functional derivative ofthe analogous functional be zero.

Finally, let us stress something that is rarely mentioned in the literature: Thereis another (older) method, apart from the Lagrange multipliers one, for solving aconstrained optimization problem: simple substitution. I.e., if we can find a setof N − K independent adapted coordinates that parameterize Σ and we can writethe score function f in terms of them, we would be automatically satisfying theconstraints. Actually, in practical cases, the method chosen is a suitable combinationof the two; in such a way that, if substituting the constraints in f is difficult, thenecessary Lagrange multipliers are introduced to force them, and vice versa.

As a good example of this, the reader may want to check the derivation ofthe Hartree equations in sec. 6 (or the Hartree-Fock ones in sec. 7). There, westart by proposing a particular form for the total wavefunction Φ in terms of theone-electron orbitals φi (see eq. (6.1)) and we write the functional F (which isthe expected value of the energy) in terms of that special Φ (see eq. (6.3)). In asecond step, we impose the constraints that the one-particle orbitals be normalized(〈φi|φi〉 = 1, i = 1, . . . , N) and force them by means of N Lagrange multipliers λi.Despite the different treatments, both conditions are constraints standing on thesame footing. The only difference is not conceptual, but operative: for the firstcondition, it would be difficult to write it as a constraint; while, for the second one,it would be difficult to define adapted coordinates in the subspace of normalizedorbitals. So, in both cases, the easiest way for dealing with them is chosen.

References

[1] J. Skolnick, Putting the pathway back into protein folding, Proc. Natl. Acad.Sci. USA 102 (2005) 2265–2266.

[2] C. D. Snow, E. J. Sorin, Y. M. Rhee, and V. S. Pande, How well cansimulation predict protein folding kinetics and thermodynamics?, Annu. Rev.Biophys. Biomol. Struct. 34 (2005) 43–69.

54

[3] O. Schueler-Furman, C. Wang, P. Bradley, K. Misura, andD. Baker, Progress in modeling of protein structures and interactions, Sci-ence 310 (2005) 638–642.

[4] K. Ginalski, N. V. Grishin, A. Godzik, and L. Rychlewski, Practicallessons from protein structure prediction, Nucleic Acids Research 33 (2005)1874–1891.

[5] R. Bonneau and D. Baker, Ab initio protein structure prediction: Progressand prospects, Annu. Rev. Biophys. Biomol. Struct. 30 (2001) 173–189.

[6] M.-H. Hao and H. A. Scheraga, Designing potential energy functions forprotein folding, Curr. Opin. Struct. Biol. 9 (1999) 184–188.

[7] A. V. Morozov, K. Tsemekhman, and D. Baker, Electron density re-distribution accounts for half the cooperativity of α-helix formation, J. Phys.Chem. B 110 (2006) 4503–4505.

[8] F. Jensen, An introduction to the state of the art in quantum chemistry,Ann. Rep. Comp. Chem. 1 (2005) 1–17.

[9] A. R. MacKerell Jr., M. Feig, and C. L. Brooks III, Extending thetreatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributionsin molecular dynamics simulations, J. Comp. Chem. 25 (2004) 1400–1415.

[10] A. V. Morozov, T. Kortemme, K. Tsemekhman, and D. Baker, Closeagreement between the orientation dependence of hydrogen bonds observed inprotein structures and quantum mechanical calculations, Proc. Natl. Acad.Sci. USA 101 (2004) 6946–6951.

[11] A. J. Bordner, C. N. Cavasotto, and R. A. Abagyan, Direct derivationof van der Waals force fields parameters from quantum mechanical interactionenergies, J. Phys. Chem. B 107 (2003) 9601–9609.

[12] R. A. Friesner and M. D. Beachy, Quantum mechanical calculations onbiological systems, Curr. Opin. Struct. Biol. 8 (1998) 257–262.

[13] M. Beachy, D. Chasman, R. Murphy, T. Halgren, and R. Friesner,Accurate ab initio quantum chemical determination of the relative energetics ofpeptide conformations and assessment of empirical force fields, J. Am. Chem.Soc. 119 (1997) 5908–5920.

[14] C. J. Barden and H. F. Schaffer III, Quantum chemistry in the 21st

century, Pure Appl. Chem. 72 (2000) 1405–1423.

[15] J. Simons, An experimental chemists’s guide to ab initio quantum chemistry,J. Chem. Phys. 95 (1991) 1017–1029.

[16] I. N. Levine, Quantum Chemistry, Prentice Hall, Upper Saddle River, 5thedition, 1999.

55

[17] F. Jensen, Introduction to Computational Chemistry, John Wiley & Sons,Chichester, 1998.

[18] A. Szabo and N. S. Ostlund, Modern Quantum Chemistry: Introduced toAdvanced Electronic Structure Theory, Dover Publications, New York, 1996.

[19] B. N. Taylor, Guide for the Use of the International System of Units (SI),NIST Special Publication 811, National Institute of Standards and Technology,1995.

[20] D. R. Hartree, The wave mechanics of an atom with a non-Coulomb centralfield. Part I. Theory and methods, Proc. Camb. Philos. Soc. 24 (1927) 89.

[21] H. Shull and G. G. Hall, Atomic units, Nature 184 (1959) 1559.

[22] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, chapterAppendices VII and VIII, Oxford University Press, London, 1954.

[23] M. Born and J. R. Oppenheimer, Zur Quantentheorie der Molekeln, Ann.Phys. Leipzig 84 (1927) 457–484.

[24] M. P. Marder, Condensed Matter Physics, Wiley-Interscience, New York,2000.

[25] T. Shida, The Chemical Bond: A Fundamental Quantum-Mechanical Picture,Springer Series in Chemical Physics, Springer-Verlag, Berlin, 2006.

[26] C. J. Cramer, Essentials of Computational Chemistry: Theories and Models,John Wiley & Sons, Chichester, 2nd edition, 2002.

[27] R. G. Parr and W. Yang, Density-Functional Theory of Atoms andMolecules, volume 16 of International series of monographs on chemistry, Ox-ford University Press, New York, 1989.

[28] B. T. Sutcliffe and R. G. Woolley, Molecular structure calculationswithout clamping the nuclei, Phys. Chem. Chem. Phys. 7 (2005) 3664–3676.

[29] B. T. Sutcliffe, The nuclear motion problem in molecular physics, Adv.Quantum Chem. 28 (1997) 65–80.

[30] B. T. Sutcliffe, The coupling of nuclear and electronic motions inmolecules, J. Chem. Soc. Faraday Trans. 89 (1993) 2321–2335.

[31] G. Hunter, Conditional probability amplitudes in wave mechanics, Intl. J.Quant. Chem. 9 (1975) 237–242.

[32] H. Yserentant, On the electronic Schrodinger equation, Technical report,Universitat Tubingen, 2003, http://www.math.tu-berlin.de/∼yserenta/.

[33] B. Simon, Schrodinger operators in the twentieth century, J. Math. Phys. 41(2000) 3523–3555.

56

[34] W. Hunziger and I. M. Sigal, The quantum N-body problem, J. Math.Phys. 41 (2000) 3448–3510.

[35] M. B. Ruskai and J. P. Solovej, Asymptotic neutrality of polyatomicmolecules, Lect. Notes Phys. 403 (1992) 153–174.

[36] W. Hunziker, On the spectra of Schrodinger multiparticle Hamiltonians,Helv. Phys. Acta 39 (1966) 451–462.

[37] C. Van Winter, Theory of finite systems of particles I. The Green function,Mat. Fys. Skr. Dan. Vid. Selsk 2 (1964) 1–60.

[38] G. Friesecke, The multiconfiguration equations for atoms and molecules:Charge quantization and existence of solutions, Arch. Rational Mech. Anal.169 (2003) 35–71.

[39] G. M. Zhislin, Discussion of the spectrum of Schrodinger operators forsystems of many particles, Trudy Moskovskogo matematiceskogo obscestva 9

(1960) 81–120, (In Russian).

[40] H. P. Hratchian and H. B. Schlegel, Finding minima, transition states,and following reaction pathways on ab initio potential energy surfaces, inTheory and Applications of Computational Chemistry: The First Forty Years,edited by C. D. et al., chapter 10, Elsevier, 2005.

[41] B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States,S. Swaminathan, and M. Karplus, CHARMM: A program for macro-molecular energy, minimization, and dynamics calculations, J. Comp. Chem.4 (1983) 187–217.

[42] A. D. MacKerell Jr., B. Brooks, C. L. Brooks III, L. Nilsson,B. Roux, Y. Won, and M. Karplus, CHARMM: The energy function andits parameterization with an overview of the program, in The Encyclopedia ofComputational Chemistry, edited by P. v. R. Schleyer et al., pp. 217–277,John Wiley & Sons, Chichester, 1998.

[43] D. A. Pearlman, D. A. Case, J. W. Caldwell, W. R. Ross, T. E.

Cheatham III, S. DeBolt, D. Ferguson, G. Seibel, and P. Koll-

man, AMBER, a computer program for applying molecular mechanics, normalmode analysis, molecular dynamics and free energy calculations to elucidate thestructures and energies of molecules, Comp. Phys. Commun. 91 (1995) 1–41.

[44] J. W. Ponder and D. A. Case, Force fields for protein simulations, Adv.Prot. Chem. 66 (2003) 27–85.

[45] T. E. Cheatham III and M. A. Young, Molecular dynamics simulationof nucleic acids: Successes, limitations and promise, Biopolymers 56 (2001)232–256.

57

[46] W. L. Jorgensen and J. Tirado-Rives, The OPLS potential functions forproteins. Energy minimization for crystals of cyclic peptides and Crambin, J.Am. Chem. Soc. 110 (1988) 1657–1666.

[47] P. A. M. Dirac, Quantum Mechanics of Many-Electron Systems, Proc. Roy.Soc. London 123 (1929) 714.

[48] R. Seeger and J. A. Pople, Self-consistent molecular orbital methods.XVIII. Constraints and stability in Hartree-Fock theory, J. Chem. Phys. 66(1977) 3045–3050.

[49] E. Marinari and G. Parisi, Simulated tempering: A new Monte Carloscheme, Europhys. Lett. 19 (1992) 451–458.

[50] V. Cerny, A thermodynamical approach to the travelling salesman problem:An efficient simulation algorithm, J. Optimiz. Theory App. 45 (1985) 41–51.

[51] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Optimization bysimulated annealing, Science 220 (1983) 671–680.

[52] J. C. Slater, Note on Hartree’s method, Phys. Rev. 35 (1930) 210–211.

[53] E. Cances, M. DeFranceschi, W. Kutzelnigg, C. Le Bris, andY. Maday, Computational quantum chemistry: A primer, in Handbook of nu-merical analysis. Volume X: Special volume: Computational chemistry, editedby P. Ciarlet and C. Le Bris, pp. 3–270, Elsevier, 2003.

[54] P. L. Lions, Solutions of Hartree-Fock equations for Coulomb systems, Com-mun. Math. Phys. 109 (1987) 33–97.

[55] E. H. Lieb and B. Simon, On solutions to the Hartree-Fock problem foratoms and molecules, J. Chem. Phys. 61 (1974) 735–736.

[56] E. H. Lieb and B. Simon, The Hartree-Fock theory for Coulomb systems,Commun. Math. Phys. 53 (1977) 185–194.

[57] V. Fock, Naherungsmethode zur Losung des quantenmechanischenMehrkorperproblems, Z. Phys. 61 (1930) 126.

[58] T. Koopmans, Uber die Zuordnung von Wellenfunktionen und Eigenwertenzu den einzelnen Electronen eines Atoms, Physica 1 (1934) 104.

[59] H. B. Schlegel and J. J. W. McDouall, Do you have SCF stabilityand convergence problems?, in Computational Advances in Organic Chem-istry: Molecular Structure and Reactivity, edited by C. Ogretir and I. G.

Csizmadia, pp. 167–185, Kluwer Academic, The Netherlands, 1991.

[60] J. N. Onuchic and P. G. Wolynes, Theory of protein folding, Curr. Opin.Struct. Biol. 14 (2004) 70–75.

58

[61] S. S. Plotkin and J. Onuchic, Understanding protein folding with energylandscape theory. Part I: Basic concepts, Quart. Rev. Biophys. 35 (2002)111–167.

[62] K. A. Dill, Polymer principles and protein folding, Prot. Sci. 8 (1999)1166–1180.

[63] C. M. Dobson, A. Sali, and M. Karplus, Protein folding: A perspectivefrom theory and experiment, Angew. Chem. Int. Ed. 37 (1998) 868–893.

[64] J. D. Bryngelson, J. N. Onuchic, N. D. Socci, and P. G. Wolynes,Funnels, pathways, and the energy landscape of protein folding: A synthesis,Proteins 21 (1995) 167–195.

[65] J. D. Bryngelson and P. G. Wolynes, Spin-glasses and the statistical-mechanics of protein folding, Proc. Natl. Acad. Sci. USA 84 (1987) 7524–7528.

[66] R. Vargas, J. Garza, B. P. Hay, and D. A. Dixon, Conformationalstudy of the alanine dipeptide at the MP2 and DFT levels, J. Phys. Chem. A106 (2002) 3213–3218.

[67] A. Lang, I. G. Csizmadia, and A. Perczel, Peptide models. XLV: Con-formational properties of N-formyl-L-methioninamide ant its relevance to me-thionine in proteins, PROTEINS: Struct. Funct. Bioinf. 58 (2005) 571–588.

[68] A. Perczel, O. Farkas, I. Jakli, I. A. Topol, and I. G. Csizmadia,Peptide models. XXXIII. Extrapolation of low-level Hartree-Fock data of pep-tide conformation to large basis set SCF, MP2, DFT and CCSD(T) results.The Ramachandran surface of alanine dipeptide computed at various levels oftheory, J. Comp. Chem. 24 (2003) 1026–1042.

[69] C.-H. Yu, M. A. Norman, L. Schafer, M. Ramek, A. Peeters, andC. van Alsenoy, Ab initio conformational analysis of N-formyl L-alanineamide including electron correlation, J. Mol. Struct. 567–568 (2001) 361–374.

[70] M. Elstner, K. J. Jalkanen, M. Knapp-Mohammady, and S. Suhai,Energetics and structure of glycine and alanine based model peptides: Approx-imate SCC-DFTB, AM1 and PM3 methods in comparison with DFT, HF andMP2 calculations, Chem. Phys. 263 (2001) 203–219.

[71] H. A. Baldoni, G. Zamarbide, R. D. Enriz, E. A. Jauregui,O. Farkas, A. Perczel, S. J. Salpietro, and I. G. Csizmadia, Peptidemodels XXIX. Cis-trans iosmerism of peptide bonds: Ab initio study on smallpeptide model compound; the 3D-Ramachandran map of formylglycinamide, J.Mol. Struct. 500 (2000) 97–111.

[72] A. M. Rodrıguez, H. A. Baldoni, F. Suvire, R. Nieto Vazquez,G. Zamarbide, R. D. Enriz, O. Farkas, A. Perczel, M. A. McAllis-

ter, L. L. Torday, J. G. Papp, and I. G. Csizmadia, Characteristics ofRamachandran maps of L-alanine diamides as computed by various molecular

59

mechanics, semiempirical and ab initio MO methods. A search for primarystandard of peptide conformational stability, J. Mol. Struct. 455 (1998) 275–301.

[73] A. G. Csaszar and A. Perczel, Ab initio characterization of building unitsin peptides and proteins, Prog. Biophys. Mol. Biol. 71 (1999) 243–309.

[74] R. F. Frey, J. Coffin, S. Q. Newton, M. Ramek, V. K. W. Cheng,F. A. Momany, and L. Schafer, Importance of correlation-gradient geom-etry optimization for molecular conformational analyses, J. Am. Chem. Soc.114 (1992) 5369–5377.

[75] J. Kobus, Diatomic molecules: Exact solutions of HF equations, Adv. Quan-tum Chem. 28 (1997) 1–14.

[76] C. C. J. Roothaan, New developments in molecular orbital theory, Rev.Mod. Phys. 23 (1951) 69–89.

[77] G. G. Hall, The molecular orbital theory of chemical valency VIII. A methodof calculating ionization potentials, Proc. Roy. Soc. London Ser. A 205 (1951)541–552.

[78] F. Jensen, Estimating the Hartree-Fock limit from finite basis set calculations,Theo. Chem. Acc. 113 (2005) 267–273.

[79] J. A. Pople, Nobel lecture: Quantum chemical models, Rev. Mod. Phys. 71(1999) 1267–1274.

[80] J. M. Garcıa de la Vega and B. Miguel, Basis sets for computationalchemistry, in Introduction to Advanced Topics of Computational Chemistry,edited by L. A. Montero, L. A. Dıaz, and R. Bader, chapter 3, pp.41–80, Editorial de la Universidad de la Habana, 2003.

[81] T. Helgaker and P. R. Taylor, Gaussian basis sets and molecular in-tegrals, in Modern Electronic Structure Theory. Part II, edited by D. R.

Yarkony, pp. 725–856, World Scientific, Singapore, 1995.

[82] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functionswith Formulas, Graphs, and Mathematical Tables, Dover, New York, 9 edition,1964.

[83] J. C. Slater, Atomic shielding constants, Phys. Rev. 36 (1930) 57–54.

[84] C. Zener, Analytic atomic wave functions, Phys. Rev. 36 (1930) 51–56.

[85] R. J. Mathar, Mutual conversion of three flavors of Gaussian Type Orbitals,Intl. J. Quant. Chem. 90 (2002) 227–243.

[86] T. Kato, On the eigenfunctions of many-particle systems in quantum me-chanics, Commun. Pure Appl. Math. 10 (1957) 151–177.

60

[87] S. F. Boys, Electronic wavefunctions. I. A general method of calculation forstationary states of any molecular system, Proc. Roy. Soc. London Ser. A 200

(1950) 541–554.

[88] H. B. Schlegel and M. J. Frisch, Transformation between Cartesian andpure spherical harmonic Gaussians, Intl. J. Quant. Chem. 54 (1995) 83–87.

[89] P. Echenique, I. Calvo, and J. L. Alonso, Quantum mechanical calcu-lation of the effects of stiff and rigid constraints in the conformational equilib-rium of the Alanine dipeptide, J. Comp. Chem. 27 (2006) 1748–1755.

[90] P. Echenique and J. L. Alonso, Definition of Systematic, ApproximatelySeparable and Modular Internal Coordinates (SASMIC) for macromolecularsimulation, J. Comp. Chem. 27 (2006) 1076–1087.

[91] W. J. Hehre, R. F. Stewart, and J. A. Pople, Self-consistent molecular-orbital methods. I. Use of Gaussian expansions of Slater-type atomic orbitals,J. Chem. Phys. 51 (1969) 2657–2664.

[92] R. Ditchfield, W. J. Hehre, and J. A. Pople, Self-consistent molecular-orbital methods. IX. An extended Gaussian-type basis for molecular-orbitalstudies of organic molecules, J. Chem. Phys. 54 (1971) 724–728.

[93] W. J. Hehre, R. Ditchfield, and J. A. Pople, Self-consistent molecular-orbital methods. XII. Further extensions of Gaussian-type basis sets for usein molecular-orbital studies of organic molecules, J. Chem. Phys. 56 (1972)2257–2261.

[94] P. C. Hariharan and J. A. Pople, The influence of polarization functionson molecular orbital hydrogenation energies, Theor. Chim. Acta 28 (1973)213–222.

[95] M. J. Frisch, J. A. Pople, and J. S. Binkley, Self-consistent molecular-orbital methods. 25. Supplementary functions for Gaussian basis sets, J. Chem.Phys. 80 (1984) 3265–3269.

[96] R. Krishnan, J. S. Binkley, R. Seeger, and J. A. Pople, Self-consistentmolecular-orbital methods. XX. A basis set for correlated wave functions, J.Chem. Phys. 72 (1980) 650–654.

[97] J. S. Binkley, J. A. Pople, and W. J. Hehre, Self-consistent molecular-orbital methods. 21. Small split-valence basis sets for first-row elements, J.Am. Chem. Soc. 102 (1980) 939–947.

[98] G. W. Spitznagel, T. Clark, J. Chandrasekhar, and P. v. R.

Schleyer, Stabilization of methyl anions by first row substituents. The su-periority of diffuse function-augmented basis sets for anion calculations, J.Comp. Chem. 3 (1982) 363–371.

61

[99] T. Clark, J. Chandrasekhar, G. W. Spitznagel, and P. v. R.

Schleyer, Efficient diffuse function-augmented basis sets for anion calcula-tions. III. The 3-21+G basis set for first-row elements Li–F, J. Comp. Chem.4 (1983) 294–301.

[100] G. K. Woodgate, Elementary Atomic Structure, Oxford University Press,USA, 2 edition, 1983.

[101] P. Hudaky, I. Jakli, A. G. Csaszar, and A. Perczel, Peptide models.XXXI. Conformational properties of hydrophobic residues shaping the coreof proteins. An ab initio study of N-formyl-L-valinamide and N-formyl-L-phenylalaninamide, J. Comp. Chem. 22 (2001) 732–751.

[102] M. D. Halls and H. B. Schlegel, Comparison study of the predictionof Raman intensities using electronic structure methods, J. Chem. Phys. 111(1999) 8819–8824.

[103] A. P. Scott and L. Radom, Harmonic vibrational frequencies: An eval-uation of Hartree-Fock, Møller-Plesset, Quadratic Configuration Interaction,Density Functional Theory, and semiempirical scale factors, J. Phys. Chem.100 (1996) 16502–16513.

[104] A. G. Csaszar, On the structures of free glycine and α-alanine, J. Mol.Struct. 346 (1995) 141–152.

[105] C. P. Chun, A. A. Connor, and G. A. Chass, Ab initio conformationalanalysis of N- and C-terminally-protected valyl-alanine dipeptide model, J.Mol. Struct. 729 (2005) 177–184.

[106] A. St.-Amant, W. D. Cornell, P. A. Kollman, and T. A. Halgren,Calculation of molecular geometries relative conformational energies, dipolemoments, and molecular electrostatic potential fitted charges of small organicmolecules of biochemical interest by Density Functional Theory, J. Comp.Chem. 12 (1995) 1483–1506.

[107] T. Van Mourik, P. G. Karamertzanis, and S. L. Price, Molecularconformations and relative stabilities can be as demanding of the electronicstructure method as intermolecular calculations, J. Phys. Chem. 110 (2006)8–12.

[108] P. Hobza and J. Sponer, Toward true DNA base-stacking energies: MP2,CCSD(T) and Complete Basis Set calculations, J. Am. Chem. Soc. 124 (2002)11802–11808.

[109] P. Knowles,M. Schutz, andH.-J. Werner, Ab initio nethods for electroncorrelation in molecules, in Modern Methods and Algorithms of QuantumChemistry, edited by J. Grotendorst, volume 3, pp. 97–179, Julich, 2000,John von Neumann Institute for Computing.

62

[110] W. Kutzelnigg and P. Von Herigonte, Electron correlation at the dawnthe 21st century, Adv. Quantum Chem. 36 (1999) 185–229.

[111] C. Møller and M. S. Plesset, Note on an approximation treatment formany-electron systems, Phys. Rev. 46 (1934) 618–622.

[112] R. A. DiStasio Jr., Y. Jung, and M. Head-Gordon, A Resolution-of-The-Identity implementation of the local Triatomics-In-Molecules model forsecond-order Møller-Plesset perturbation theory with application to alaninetetrapeptide conformational energies, J. Chem. Theory Comput. 1 (2005) 862–876.

[113] P. Hobza and J. Sponer, Structure, energetics, and dynamics of the nucleicacid base pairs: Nonempirical ab initio calculations, Chem. Rev. 99 (1999)3247–3276.

[114] Y. Zhao and D. G. Truhlar, Infinite-basis calculations of binding energiesfor the hydrogen bonded and stacked tetramers of formic acid and formamideand their use for validation of hybrid DFT and ab initio methods, J. Phys.Chem. A 109 (2005) 6624–6627.

[115] J. C. Sancho-Garcıa and A. Karpfen, The torsional potential in 2,2′

revisited: High-level ab initio and DFT results, Chem. Phys. Lett. 411 (2005)321–326.

[116] W. Wang, Method-dependent relative stability of hydrogen bonded and π–πstacked structures of the formic acid tetramer, Chem. Phys. Lett. 402 (2005)54–56.

[117] A. Galindo and P. Pascual, Quantum Mechanics, Springer-Verlag, Berlin,1990.

[118] C. Cohen-Tannoudji, B. Diu, and F. Laloe, Quantum Mechanics, Her-mann and John Wiley & Sons, Paris, 2nd edition, 1977.

63

amathematicalintroductiontohartree-fockscf ...formalism needed for the fundamental-state quantum...

Documents