quantum simulation techniques for electrons and ions · 2015. 10. 7. · j. p. perdew and s. kurth,...

Quantum simulation techniques for electronsand ions

Roman MartonákDepartment of Experimental Physics

Faculty of Mathematics, Physics and InformaticsComenius University, Bratislava, Slovakia

[email protected]

2013

Literature:Richard M. Martin, Electronic Structure Basic Theory and Practical Methods,

Cambridge University Press (2004)

Outline

• I Ab Initio Total Energy Calculations (Introduction to Density FunctionalTheory)

• II Ab-initio Molecular Dynamics (Car-Parrinello) Method

• III Special topics

• IV Path Integral Quantum Monte Carlo methods

1

Autor: Roman MartonákNázov: Quantum simulation techniques for electrons and ionsVydavatel’: Knižnicné a edicné centrum FMFI UKRok vydania: 2013Miesto vydania: BratislavaVydanie: prvéPocet strán (císlované): 112Internetová adresa: http://www.fmph.uniba.sk/index.php?id=el_st_mISBN: 978-80-8147-014-1EAN: 9788081470141

2

Part I

Ab initio total-energy calculations(Introduction to Density FunctionalTheory)Need for better accuracy of the potential energy

• classical potentials are good for many applications but have clear limits

• good close to equilibrium state for which they are fitted, away from equilib-rium the accuracy deteriorates rapidly

• processes of bond breaking and remaking are typically described poorly

• chemistry is excluded

• because of insufficient accuracy the predictive power is often limited

• improving accuracy of the total energy calculations is one of the main chal-lenges of computational physics

• we have to solve the Schrödinger equation for the electrons (in some ap-proximation)

• hierarchy (in the order of increasing accuracy):

– semi-empirical methods (e.g. tight-binding schemes)

– density-functional theory (DFT)

– Quantum Monte Carlo (QMC)

– accurate quantum chemistry methods (configurational interaction, cou-pled cluster), extremely demanding

An accurate description allows us to calculate a wide spectrum of properties:

• shapes and size of molecules

• crystal structures of solids

• binding energies

• ionization energies and electron affinities

3

• heights of energy barriers to various processes

• static response functions

• vibrational and phonon frequencies

• mechanical properties of solids

• phase boundaries

• once we have an accurate potential, we can run MD and directly observesome processes

• provides a predictive power

How to solve the electronic problem ?

• we assume the Born-Oppenheimer approximation

• we use the atomic units, mass unitme, charge unit e, length unit Bohr radiusa0 = 4πε0~2

e2me= 0.529 Å, energy unit Hartree = ~2

mea20= 27.21 eV

• electrons in the field of the nuclei

Te(r) = −∑i

1

2∇ri

2

Vee(r) =∑i<j

1

|ri − rj|

VeN(r, R) = −∑iJ

ZJ|ri −RJ |

=N∑i=1

v(ri)

H = Te(r) + Vee(r) + VeN(r, R)

• we want to solve the time-independent Schrödinger equation

HΨ(r, R) = E(R) Ψ(r, R)

• we are particularly interested in the ground state energy E0(R)

4

1 The Hartree-Fock methodThe variational principle for the ground state

• we define energy E[Ψ] as a functional of the wavefunction Ψ

E[Ψ] =〈Ψ|H|Ψ〉〈Ψ|Ψ〉

=

∫drΨ∗HΨ∫drΨ∗Ψ

• variational principle

E[Ψ] ≥ E0

E0 = minΨE0

• every eigenstate Ψ is an extremum of E[Ψ]

• condition δE[Ψ] = 0 is equivalent to the Schrödinger equation

• the normalization condition can be included via Lagrange multiplier δ[〈Ψ|H|Ψ〉−E〈Ψ|Ψ〉] = 0

• the condition is true also for excited states Ψ1,Ψ2, . . .

• the ground-state energy E[Ψ] depends only on the number of electrons Nand the nuclear potential VeN(r)

• kinetic energy and electron repulsion terms are universal (determined onlyby N )

• the ground-state energy E can be seen as functional of N and VeN(r)

The Hartree-Fock approximation (mean-field theory) Hartree (1928), Fock(1930)

• electrons are fermions

• we take a simple ansatz for the wavefunction Ψ

• we approximate the wavefunction ΨHF as an antisymmetrized product (Slaterdeterminant) of N orthornormal spin orbitals Ψi(x)

5

• each Ψi(r) is a product of a spatial orbital φi(r) and spin function σ(s) =α(s) (↑) or β(s) (↓)

ΨHF =1√N !

∣∣∣∣∣∣∣∣∣Ψ1(x1) Ψ2(x1) . . . ΨN(x1)Ψ1(x2) Ψ2(x2) . . . ΨN(x2)

......

...Ψ1(xN) Ψ2(xN) . . . ΨN(xN)

∣∣∣∣∣∣∣∣∣ =1√N !

det[Ψ1Ψ2 . . .ΨN ]

• the Hartree-Fock wavefunction is normalized 〈ΨHF |ΨHF 〉 = 1

• the product approximation corresponds to treating each electron in the mean-field of the other electrons

• the optimal orbitals Ψi(x) can be determined by a variational procedure

The Hartree-Fock equationsthe expectation value of the energy reads

EHF = 〈ΨHF |H|ΨHF 〉 =N∑i=1

Hi +1

2

N∑i,j=1

(Jij −Kij)

where Hi =

∫Ψ∗i (x)[−1

2∇2 + v(x)]Ψi(x)dx

Jij =

∫ ∫Ψi(x1)Ψ∗i (x1)

1

r12

Ψj(x2)Ψ∗j(x2)dx1dx2

Kij =

∫ ∫Ψ∗i (x1)Ψj(x1)

1

r12

Ψi(x2)Ψ∗j(x2)dx1dx2

• Jij are Coulomb integrals, Kij are exchange integrals, Jii = Kii

• EHF is minimized with respect to orbitals Ψi(x) subject to orthonormaliza-tion conditions

∫Ψ∗i (x)Ψj(x) = δij

The Hartree-Fock equations

• set of Hartree-Fock differential equations

FΨi(x) =N∑i=1

εijΨj(x)

F = −1

2∇2 + v + g

6

• εij are Lagrange multipliers for the constraints

• g is the Coulomb exchange operator, g = j − k

j(x1)f(x1) =N∑k=1

∫Ψ∗k(x2)Ψk(x2)

1

r12

f(x1)dx2

k(x1)f(x1) =N∑k=1

∫Ψ∗k(x2)f(x2)

1

r12

Ψk(x1)dx2

• coupled system of equations can be solved by iteration until self-consistencyis achieved

The Hartree-Fock equations

• “orbital energies”

εi = εii = 〈Ψi|F |Ψi〉 = Hi +N∑j=1

(Jij −Kij)

• the total energy EHF is not a sum of orbital energies

EHF =N∑j=1

εi − Vee

Vee =

∫Ψ∗HF (xN)

(∑ij

1

rij

)ΨHF (xN) dxN =

1

2

N∑i,j=1

(Jij −Kij)

• nonlinear self-consistent field method

• EHF is an exact upper bound for the energy E (variational principle)

Correlation energy

• the exact wavefunction of an interacting system is never a single determi-nant

• error in calculation by Hartree-Fock is called correlation energy

Ecorr = E − EHF , Ecorr < 0

7

• Ecorr is the part of energy due to many-body effects beyond mean-field de-scription

• calculation of Ecorr is a major problem in many-body theory

• in quantum chemistry, mixing of many determinants is used to calculateaccurate energies (Configurational Interaction CI)

• many-body perturbation techniques

• only small systems can be treated accurately

• Ecorr is of crucial importance when the bonds change

Electron density

ρ(r1) = N∑s1...sn

∫. . .

∫|Ψ(r1s1, r2s2, . . . , rNsN)|2 dr2 . . . drN , ρ(r) ≥ 0

• electron density ρ(r) is a function of just one position r and∫ρ(r)dr = N

〈Vext〉 =

∫. . .

∫Ψ∗(r1, . . . , rN)

N∑i=1

Vext(ri) Ψ(r1, . . . , rN) dr1 . . . drN

=N∑i=1

∫. . .

∫dr1 . . . drNΨ∗(r1, . . . , rN)Vext(ri)Ψ(r1, . . . , rN)

∫dr δ(r− ri)

=N∑i=1

∫drVext(r)

∫. . .

∫dr1 . . . drNΨ∗(r1, . . . , rN)Ψ(r1, . . . , rN)δ(r− ri)

=

∫drVext(r)〈

N∑i=1

δ(r− ri)〉 =

∫drVext(r)ρ(r)

• 〈Vext〉 depends on the wavefunction Ψ only through the density ρ(r)

Hellman - Feynman theorem

• Hamiltonian H(λ) dependent on a parameter λ

8

• Ψ(λ) is a normalized eigenstate with eigenvalue E(λ)

H(λ)Ψ(λ) = E(λ)Ψ(λ)

E(λ) = 〈Ψ(λ)|H(λ)|Ψ(λ)〉dE(λ)

dλ=

d

dλ′〈Ψ(λ′)|H(λ)|Ψ(λ′)〉

∣∣∣λ′=λ

+ 〈Ψ(λ)|∂H(λ)

∂λ|Ψ(λ)〉

• the first term is zero because of the variational principle

• the Hellman - Feynman theorem

dE(λ)

dλ= 〈Ψ(λ)|∂H(λ)

∂λ|Ψ(λ)〉

• integral form

E(λ1)− E(λ2) =

∫ λ2

λ1

〈Ψ(λ)|∂H(λ)

∂λ|Ψ(λ)〉

Electrostatic theorem Feynman (1939)

• assume the electronic Hamiltonian for static nuclei, nuclear positions Rare parameters

H + VNN = −∑i

1

2∇ri

2 +∑i<j

1

|ri − rj|−∑iJ

ZJ|ri −RJ |

+∑I<J

ZIZJ|RI −RJ |

• force acting on the nuclei is

−∂(E + VNN)

∂RI

= 〈Ψ| − ∂(H + VNN)

∂RI

|Ψ〉

=

∫dr ρ(r)

ZI(r−RI)

|r−RI |3+∑J 6=I

ZIZJ(RI −RJ)

|RI −RJ |3

• electrostatic theorem for the forces

• the quantum expression is exactly identical to the classical one

• all we need is the electronic density

• can be used for the geometry optimization of molecular structure as well asfor ab initio molecular dynamics

9

2 Density functional theory (DFT)Literature:

• Robert G. Parr and Weitao Yang, Density Functional Theory of Atoms andMolecules, Oxford University press (1989)

• J. P. Perdew and S. Kurth, Density Functionals for Non-relativistic CoulombSystems in the New Century. Lecture Notes in Physics 620, pp. 1-55, (2003)Springer Verlag

Density functional theory

• techniques based on wavefunctions are complicated

• for ground states, a special theory exists

• instead of the wavefunction Ψ(x1, . . . , xN), the density ρ(r) is the basicvariable

• conceptually this represents a great simplification

• offers a practical computational scheme

• many free and commercial codes exist

• widespread use in many branches of physics and chemistry

The Thomas-Fermi model Thomas (1927), Fermi (1927)

• how does the kinetic energy of electrons depend on density?

• consider noninteracting electrons at T = 0

• we divide space into small cubes of size ∆V = l3 containing ∆N electrons,ρ(r) = ∆N

∆V(may be different in different cells)

• density of states g(ε) = π4

(8ml2

h2

) 32ε12

• at T = 0 occupation f(ε) = 1 for ε < εF , f(ε) = 0 for ε > εF

• kinetic energy ∆E = 2∫ε f(ε)g(ε)dε = 8π

5

(2mh2

) 32 l3ε

52F

• number of particles ∆N = 2∫f(ε)g(ε)dε = 8π

3

(2mh2

) 32 l3ε

32F

∆E =3h2

10m

(3

8π

) 23

l3(

∆N

l3

) 53

=3h2

10m

(3

8π

) 23

ρ53 (r) ∆V

10

The Thomas-Fermi model

• integrating over the whole space we get the Thomas-Fermi kinetic energyfunctional (in atomic units)

TTF [ρ] = CF

∫ρ

53 (r)dr , CF =

3

10(3π2)

23

• the underlying idea is in fact the local density approximation (LDA)

• application to an atom

ETF [ρ(r)] = CF

∫ρ

53 (r)dr− Z

∫ρ(r)

rdr +

1

2

∫ ∫ρ(r1)ρ(r2)

|r1 − r2|dr1 dr2

• one can search for the density ρ(r) which minimizes the functionalETF [ρ(r)]under the constraint

∫ρ(r)dr = N

• fails to predict binding in molecules

• before the advent of DFT (1964) the TF model was considered merely as asimple and inaccurate model

• the Hohenberg-Kohn theorems provide a different interpretation: TF modelrepresents an approximation to the exact density functional theory

Hohenberg-Kohn theorems Hohenberg and Kohn 1964

• the external potential v(r) and number of electrons N determine the Hamil-tonian entirely

• all properties of the ground state are given by v(r) and N

• we assume a non-degenerate ground state

Theorem 1 (First Hohenberg-Kohn theorem). The external potential v(r) is uniquelydetermined (up to an additive constant) by the electron density ρ(r).

The theorem allows to replace the wavefunction Ψ(~r) by the density ρ(r), whichis much more practical.

11

First Hohenberg-Kohn theorem

Proof. • we assume a non-degenerate ground state

• ρ(r) determines the number of electrons N

• assume there are two different potentials v, v′ defining two HamiltoniansH, H ′ such that the respective ground states Ψ,Ψ′ are different but give thesame density ρ(r)

• variational principle

E0 < 〈Ψ′|H|Ψ′〉 = 〈Ψ′|H ′|Ψ′〉+ 〈Ψ′|H − H ′|Ψ′〉 = E ′0 +

∫ρ(r)[v(r)− v′(r)]dr

E ′0 < 〈Ψ|H ′|Ψ〉 = 〈Ψ|H|Ψ〉+ 〈Ψ|H ′ − H|Ψ〉 = E0 −∫ρ(r)[v(r)− v′(r)]dr

• summing the two equations we get a contradiction E0 + E ′0 < E ′0 + E0

• there cannot be two different potentials v, v′ that give for the ground statethe same ρ(r)

First Hohenberg-Kohn theorem Consequences

• ρ(r) itself determines N and v and therefore all properties of the groundstate

• ground state energy E[ρ] is functional of the density

Ev[ρ] = T [ρ] + VNe[ρ] + Vee[ρ] =

∫ρ(r)v(r)dr + FHK [ρ]

where FHK [ρ] = T [ρ] + Vee[ρ]

• FHK [ρ] is independent of v(r) and is a universal functional of ρ(r)

• we separate from Vee the classical repulsion J [ρ] (Hartree term)

J [ρ] =1

2

∫ ∫ρ(r1)ρ(r2)

|r1 − r2|dr1 dr2

Vee[ρ] = J [ρ] + non-classical term

• the non-classical term will be the main part of the exchange-correlationenergy

12

Second Hohenberg-Kohn theoremrepresents a variational principle for the density

Theorem 2 (Second Hohenberg-Kohn theorem). For any normalized density ρ(r) ≥0,∫ρ(r)dr = N ,

E0 ≤ Ev[ρ] ,

where Ev[ρ] =∫ρ(r)v(r)dr + FHK [ρ].

Proof. • ρ(r) determines the v(r), Hamiltonian H and wavefunction Ψ

〈Ψ|H|Ψ〉 =

∫ρ(r)v(r)dr + FHK [ρ] = Ev[ρ] ≥ Ev[ρ]

Second Hohenberg-Kohn theorem Consequences

• ground-state energy is stationary with respect to variations of ρ(r)

δ

Ev[ρ]− µ

[∫ρ(r)dr−N

]= 0

• Euler-Lagrange equation

µ =δEv[ρ]

δρ(r)= v(r) +

δFHK [ρ]

δρ(r)

• µ is the chemical potential

• in principle the last equation is an exact equation for the ground state ρ(r)but we don’t know the exact FHK [ρ]

• we can only construct approximations

Levy constrained-search derivation (Levy 1979)

• in the variational principle we split the minimization over all wavefunctionsΨ into two steps

E = minΨ〈Ψ|H|Ψ〉 = minρ

minΨ→ρ〈Ψ|H|Ψ〉

13

• first we minimize over all wavefunctions Ψ → ρ(r) that yield the givenN-particle density ρ(r)

minΨ→ρ〈Ψ|H|Ψ〉 = minΨ→ρ〈Ψ|T + Vee|Ψ〉+

∫ρ(r)v(r)dr

• all Ψ that yield the same ρ(r) have the same value of∫ρ(r)v(r)dr

• we define the universal functional

FHK [ρ] = minΨ→ρ〈Ψ|T + Vee|Ψ〉 = 〈Ψminρ |T + Vee|Ψmin

ρ 〉

• in the second step we minimize over all densities ρ

E = minρ

FHK [ρ] +

∫ρ(r)v(r)dr

• the minimizing density is the ground-state density

• equivalent and more "constructive" derivation of the DFT (avoids proof bycontradiction)

• in principle it provides a path towards the exact functional FHK [ρ], but it isnot possible to calculate it explicitly

3 The Kohn-Sham methodThe Kohn-Sham method Kohn, Sham (1965)

• DFT provides an enormous simplification due to use of electronic densityρ(r) as basic variable instead of Ψ(r)

• problem: the functional

FHK [ρ] = T [ρ] + Vee[ρ] = T [ρ] + J [ρ] + non-classical term

is unknown

• Thomas-Fermi theory is an approximation to DFT: T [ρ] is taken from non-interacting electron gas and the non-classical term is neglected completely

• there were attempts to improve the accuracy of the TF approach and con-struct better approximations for the FHK [ρ], but no much success

14

• the method of Kohn and Sham employs a different approach, based on re-introducing wavefunctions in the form of one-electron orbitals

• much better accuracy can be achieved

• method provides a practical scheme for numerical calculations and is cur-rently the most widespread use of DFT in physics and chemistry

The Kohn-Sham method

• we introduce a fictitious non-interacting system with no electron-electronrepulsion Vee and an effective external potential vs

Hs = −N∑i=1

1

2∇ri

2 +N∑i=1

vs(ri) =N∑i=1

hs(ri)

• by construction the ground state density of the non-interacting system mustbe identical to the ground state density of the interacting system

• the ground-state wavefunction of the non-interacting system is a Slater de-terminant

Ψs =1√N !

det[Ψ1Ψ2 . . .ΨN ]

• Ψi are the N lowest eigenstates of the one-electron Hamiltonian hs

hsΨi =

[−1

2∇2

ri+ vs(ri)

]Ψi = εiΨi

ρ(r) =N∑i=1

∑s

|Ψi(r, s)|2

The Kohn-Sham method

• the exact kinetic energy of the non-interacting system Ts[ρ] is easy to cal-culate

Ts[ρ] = 〈Ψs|N∑i=1

(−1

2∇2

ri)|Ψs〉 =

N∑i=1

〈Ψi| −1

2∇2

r|Ψi〉

15

• we express the functional F [ρ] in a different way

F [ρ] = T [ρ] + Vee[ρ] = Ts[ρ] + J [ρ] + Exc[ρ]

Exc[ρ] = T [ρ]− Ts[ρ] + Vee[ρ]− J [ρ]

• Exc[ρ] is called the exchange-correlation energy

• Exc[ρ] contains the difference T [ρ] − Ts[ρ] (hopefully small) and the non-classical part of electron-electron repulsion Vee[ρ]

• T [ρ] is typically large while Exc[ρ] is much smaller

• Exc[ρ] is better suited for approximations than T [ρ], many different approx-imations were constructed

• Exc[ρ] is of crucial importance for chemical bonding

Exchange-correlation potential

• the energy functional to be minimized is

E[ρ] = Ts[ρ] + J [ρ] + Exc[ρ] +

∫ρ(r)v(r)dr

• the Euler variational equation becomes

µ =δE[ρ]

δρ(r)= v(r) +

δTs[ρ]

δρ(r)+δJ [ρ]

δρ(r)+δExc[ρ]

δρ(r)= veff (r) +

δTs[ρ]

δρ(r)

where veff (r) = v(r) +δJ [ρ]

δρ(r)+δExc[ρ]

δρ(r)= v(r) +

∫ρ(r′)

|r− r′|dr′ + vxc(r)

• veff (r) is the Kohn-Sham effective potential

• the quantity vxc(r) = δExc[ρ]δρ(r)

is the exchange-correlation potential

• the form of Ts[ρ] is still unknown

• an additional ingredient is necessary to make progress

16

The Kohn-Sham equations

• we formulate the variational problem in terms of the Kohn-Sham orbitalsΨi, required to be orthonormal∫

Ψ∗i (r)Ψi(r) dr = δij

• E[ρ] is expressed as functional of the KS orbitals

E[ρ] = Ts[ρ] + J [ρ] + Exc[ρ] +

∫ρ(r)v(r)dr

=N∑i=1

∫Ψ∗i (r)(−1

2∇2

r)Ψi(r) dr + J [ρ] + Exc[ρ] +

∫ρ(r)v(r)dr

where ρ(r) =N∑i=1

∑s

|Ψi(r, s)|2

• we minimize Ω[Ψi], εij are Lagrange multipliers

Ω[Ψi] = E[ρ]−N∑i=1

N∑j=1

εij

∫Ψ∗i (r)Ψi(r) dr


• imposing the variational condition δΩ[Ψi] = 0 provides the Kohn-Shamequations [

−1

2∇2

r + veff (r)

]Ψi = εiΨi

veff (r) = v(r) +

∫ρ(r′)

|r− r′|dr′ + vxc(r)

ρ(r) =N∑i=1

∑s

|Ψi(r, s)|2

• HKS = −12∇2

r + veff (r) is called the Kohn-Sham Hamiltonian

• veff (r) depends on ρ(r)

• ρ(r) depends on Ψi which depend on veff (r)

17

• one-particle Schrödinger equation is typically solved by diagonalization insome basis, N lowest lying eigenstates are found

• coupled system of nonlinear equations that can be solved iteratively to self-consistency

• to start we have to have an initial guess for the density


• total energy E in the KS theory is not the sum of the orbital energies

N∑i=1

εi =N∑i=1

〈Ψi| −1

2∇2

r + veff (r)|Ψi〉 = Ts[ρ] +

∫ρ(r)veff (r)dr

E =N∑i=1

εi −1

2

∫ ∫ρ(r)ρ(r′)

|r− r′|dr dr′ + Exc[ρ]−

∫ρ(r)vxc(r)dr

• many-electron system is described by one-electron equations (similar to theHartree-Fock theory)

• however, Hartree-Fock theory is approximate by construction while in theKS theory the accuracy can be systematically improved by better approxi-mations to Exc[ρ]

• the orbital energies εi have no direct physical meaning

Local density approximation (LDA)

• the difficult problem was transferred into the terms Exc[ρ] and vxc(r)

• we still need some approximation for it

• search for an improved Exc[ρ] is the greatest challenge in the DFT

• the simplest choice is the local density approximation

ELDAxc [ρ] =

∫ρ(r)εxc(ρ(r))dr

• εxc is the exchange and correlation energy per particle of a uniform electrongas of density ρ

vLDAxc (r) =δELDA

xc [ρ]

δρ(r)= εxc(ρ(r)) + ρ(r)

∂εxc(ρ)

∂ρ

18

• accurate values for εxc(ρ) are known from QMC calculations (Ceperley,Alder 1980)

• QMC values have been interpolated to provide an analytic expression

• LDA was a widely used method in the beginnings of DFT, nowadays moreelaborated schemes are preferred due to better accuracy

Iterative solution of the Kohn-Sham equations basic iteration loop

• good choice of the initial density is important

• for an atom, one can start from the Thomas-Fermi density

• for a molecular or solid state system one can start from a sum of atomicdensities

Simple argument about DFT E. Bright Wilson (1965)

• the total density determines the number of electrons

N =

∫ρ(r)dr

• at the nucleus the Coulomb potential is singular and the wavefunction isfinite but non-analytic

• the position of cusps in the density determines the position of the nuclei

• the nuclear cusp condition (Kato 1957, Steiner 1963) determines the nuclearcharges

∂

∂rAρ(rA)

∣∣∣∣rA=0

= −2 ZA ρ(0)

where ρ(rA) is angularly averaged density around nucleus A

• the nuclear charges determine the whole Hamiltonian, because kinetic en-ergy and electron-electron repulsion depend only on the number of electronsN

• density itself determines all the properties

19

Atoms [1]

• in atoms we assume both the density and the KS potential to be sphericallysymmetric (always true in a closed shell system)

• in open shell atoms, we assume all degenerate orbitals to be equally filled

• Hartree potential

vHartree(r) =4π

r

∫ r

0

dr′ r′2n(r′) + 4π

∫ ∞r

dr′ r′ n(r′)

• KS wave functions have simple separable form with traditional atomic quan-tum numbers n,m, l

φi(r) = Rnl(r)Ylm(θ,Φ)

• KS equation becomes a simple 1D equation[−1

2

d2

dr2− 1

r

d

dr+l(l + 1)

2r2+ vKS(r)

]Rnl(r) = εnlRnl(r)

Atoms

• 2nd order equation can be transformed in two coupled 1st order equations(fnl(r) ≡ Rnl(r))

dfnl(r)

dr= gnl(r)

dgnl(r)

dr+

2

rgnl(r)−

l(l + 1)

2r2fnl(r) + 2[εnl − vKS(r)]fnl(r) = 0

• asymptotic behaviour for r →∞ (we have vKS(r)→ 0)

dfnl(r)

dr= gnl(r)

dgnl(r)

dr+ 2εnlfnl(r) ' 0

fnl(r) → exp(−√−2εnlr)

gnl(r) → −√−2εnlfnl(r)

20

Atoms

• at the origin r → 0 the solutions have form

fnl(r) → Arα

gnl(r) → Brβ

where B = lA, α = l, β = l − 1

• for fixed εnl and A one can integrate the 1st order equations starting fromthe above initial condition

• if εnl is not an eigenvalue, the solution will not obey the boundary conditionat r →∞ (will diverge)

• eigenvalues εnl can be found by integrating the equations forward (fromr = 0) and backward (from “practical”∞) to some intermediate point andmatching the two solutions

• no diagonalization is necessary

• important to generate pseudopotentials (see later)

4 Plane-wave pseudopotential methodLiterature

• W. E. Pickett, Pseudopotential methods in Condensed matter applications,Comp. Phys. Rep. 9, 115 (1989)

• M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, J. D. Joannopoulos, Itera-tive minimization techniques for ab initio total-energy calculations: molec-ular dynamics and conjugate gradients, Rev. Mod. Phys. 64, 1045 (1992).

• M. Fuchs and M. Scheffler, Ab initio pseudopotentials for electronic struc-ture calculations of poly-atomic systems using density-functional theory,Comp. Phys. Comm. 119, 67-98 (1999)

21

Condensed matter calculations Choice of a basis set

• to solve the KS equations numerically we need to choose a basis set fν(simple analytic functions) to expand the orbitals

Ψi(r) =∑ν

ciνfν(r, R)

• various choices are possible (accuracy vs. efficiency)

• in quantum chemistry, localized basis is often used (Gaussian type orbitalscentered on the nuclei)

• if PBC are applied, a completely delocalized basis set is provided by planewaves

fG(r) =1√Ω

exp(iG · r), Ω is the volume of the periodic cell

• other possibilities:

– combinations of localized and originless sets

– wavelets - localized in both real and reciprocal space

– real space grids - potential energy operator is diagonal and derivativesare approximated by finite difference formulas

Plane waves

• the plane-wave basis set is complete and orthonormal

• plane waves do not depend on the positions of the nuclei - originless basis

• derivatives become just multiplications in the G space

• operators can be evaluated in the space where there are diagonal and FastFourier transform can be efficiently used

• unbiased, all regions within the periodic cell are treated on equal footing(can be an advantage as well as disadvantage)

• problem - a large number of PW is needed to achieve a high spatial resolu-tion in the region close to the nuclei

• solution - pseudopotentials

• plane wave scheme with pseudopotentials is considered a “standard model”for the total energy calculations

22

Supercell

• perfect solids have natural periodicity determined by the unit cell

• also non-periodic systems (defective solids, molecules, clusters, surfaces,liquids etc.) can be studied by using PBC - supercell

Bloch’s theorem

• Kohn-Sham potential itself has the periodicity of the cell

• electronic wavefunction in a periodic system

φk,n(r) = exp(ik · r)uk,n(r)

• k is a vector in the first Brillouin zone

• uk,n is a cell-periodic function that can be expanded in plane waves

uk,n(r) =∑G

ck,n(G) exp(iG · r)

• G are reciprocal lattice vectors, G · l = 2πm, l is lattice vector of thesupercell, m is an integer

φk,n(r) =∑G

ck,n(G) exp[i(k + G) · r]

• KS energies are εk,n, n is the band index

k-point sampling

• density of k-points is proportional to the volume of the supercell

• for an infinite system, we would calculate finite number of electronic wavefunctions at an infinite number of k-points

• wave functions at close k-points are very similar

• special methods exist to approximate a filled electronic band by calculationof states at special set of k-points in the Brillouin zone (e.g. Monkhorst andPack 1976) ∫

dk→∑

k

wk

where wk are weights of the integration points

23

• good for insulators and semiconductors, good accuracy can be reached witha very small set of k-points

• for metals, a dense set of k-points is needed to define the Fermi surface

• the density of k-points can be increased until convergence of the total energyis reached

• for a large supercell, only the Γ point is often taken

Plane-wave basis sets

• kinetic energy of a plane wave is 12|k + G|2

• we introduce a cutoff energy Ecutoff and keep only plane waves with 12|k +

G|2 < Ecutoff

• we obtain a finite basis set with NPW ∼√

23π2 ΩE

32cutoff plane waves

• the cutoff introduces an error to the total energy calculation

• cutoff Ecutoff can be increased until convergence is reached

• both the Brillouin zone sampling and the cutoff have to be fine tuned in anycalculation

• electron density

ρ(r) =∑k,n

∑G,G′

f(εk,n)c∗k,n(G′)ck,n(G) exp[i(G−G′) · r]

ρ(G) =∑k,n

∑G′

f(εk,n)c∗k,n(G′ −G)ck,n(G′)

• sum over G in the expansion of the density extends over double the rangeof the wavefunction expansion

Pseudopotentials (Motivation)

• in the region close to the nuclei, wavefunctions oscillate rapidly over shortlength scale, because the wavefunctions of different orbitals are orthogonal

• huge number of plane waves would be needed to perform all-electron cal-culations (to describe the core orbitals)

24

• most physical and chemical properties of condensed matter systems aremainly dependent on the valence electrons and to much lesser extent onthe core electrons

• pseudopotential approximation (originally proposed by Fermi 1934, Philips1958, Heine and Cohen 1970, etc.) - core electrons are removed

• core electrons and the strong ionic potential VAE are replaced by a weakerand smooth pseudopotential VPS

• PP acts on the pseudo wavefunctions |ΦPS〉 rather than on true valencewavefunctions |ΨAE〉

• Schrödinger equation for all electrons is replaced by equation for valenceelectrons only

(T + VAE)|ΨAE〉 = ε|ΨAE〉(T + VPS)|ΦPS〉 = ε|ΦPS〉

Pseudopotentials

• pseudo wavefunctions approach the true valence wavefunctions in the re-gion outside the core radius rc

• inside the core radius the nodal structure of the true valence wavefunctionis replaced by a smooth function

• this allows a small plane wave cutoff

• pseudopotential should be constructed to be transferable - usable in differ-ent chemical environments

• |ψc〉, |ψv〉 are exact solutions of the Schrödinger equation for the core andvalence electrons

H|ψc〉 = Ec|ψc〉, H|ψv〉 = Ev|ψv〉

• valence orbitals are a sum of smooth pseudo wavefunction |φv〉 and an os-cillating part resulting from orthogonalization to the core orbitals

|ψv〉 = |φv〉+∑c

αcv|ψc〉

〈ψc|ψv〉 = 0 ⇒ αcv = −〈ψc|φv〉H|φv〉 = Ev|φv〉+

∑c

(Ec − Ev)|ψc〉〈ψc|φv〉

25

• |φv〉 satisfies a Schrödinger-like equation with an energy-dependent pseudoHamiltonian

HPK(E) = H −∑c

(Ec − E)|ψc〉〈ψc|

Pseudopotentials

wPK(E) = v −∑c

(Ec − E)|ψc〉〈ψc|

• v is the bare potential and wPK(E) is effective potential for the valenceelectrons

• at radius where the core states decay to zero, w → v

• additional advantages of using pseudopotentials:

• fewer electronic states have to be calculated

• the total energy is a fraction of the true total energy, so much lower accuracyis sufficient to get the energy differences between different configurations

• however, the total energy has no physical meaning, only differences aremeaningful

• originally, pseudopotentials were constructed empirically by fitting an ana-lytic form to experimental data (band structure)

Ab-initio pseudopotentials

• much more realistic pseudopotentials can be generated ab initio

• in early DFT studies, they were developed by fitting parametrized functionalforms to reproduce the atomic eigenvalues and orbital shapes

• general form of the angular momentum dependent pseudopotential

W =∑lm

|lm〉Vl〈lm|

• wavefunction is projected on the angular momentum eigenfunctions andeach component is multiplied by the relevant pseudopotential Vl

26

• the projector is non-local

〈r|lm〉〈lm|r′〉 = 〈Ω|lm〉〈lm|Ω′〉δ(|r| − |r′|)

• taking Vl as a local function Vl = wl(r) we get the PP in the (most common)semi-local form (Ylm are spherical harmonics)

w(r, r′) =∑l

l∑m=−l

Y ∗lm(r)wl(r, r′)Ylm(r′)

wl(r, r′) = wl(r)δ(r − r′)

Norm conserving pseudopotentials Hamann, Schlüter, Chiang (1979) [2]

• the exchange-correlation energy depends on the electron density

• outside the core radius rc the pseudo wavefunction should produce the iden-tical density as the real wavefunction

• the two wavefunctions should be identical in absolute magnitude, not justin spatial dependence

• all-electron calculation is performed for an isolated atom

• inversion of the Schrödinger equation is performed to find the PP

• PP found in this way is used for any environment of the atom

• when the PP is used, the same XC functional has to be used that was usedto generate the PP

• Bachelet, Hamann, Schlüter (1982) [3] published a table of pseudopoten-tials for all elements from Z = 1 to Z = 94

Generation of a norm conserving pseudopotential

• integral of the pseudo and real wavefunction inside rc should be equal sothat the Coulomb potential generated outside rc is identical

RPPl (r) = RAE

nl (r) if r > rl∫ rl

0

dr r2|RPPl (r)|2 =

∫ rl

0

dr r2|RAEnl (r)|2 if r < rl

εPPl = εAEnl

27

• pseudowavefunctions should have no nodal surfaces

• cutoff radii rl depend on l, cannot be smaller than the outermost nodal sur-face of the true wavefunction

• rl are not adjustable paramters

• with smaller rl the PP becomes more accurate and transferable, but alsostronger

• the choice of rl is a balance between size of the plane-wave basis set andaccuracy

Generation of a norm conserving pseudopotential Procedure

1. the free atom KS radial equations are solved[−1

2

d2

dr2+l(l + 1)

2r2+ vAEKS [nAE](r)

]rRAE

nl (r) = εAEnl rRAEnl (r)

vAEKS [nAE](r) = −Zr

+ vHartree[nAE](r) + vXC [nAE](r)

2. pseudo wavefunction is defined using norm conservation conditions, theshape of the wavefunction for r < rl depends on the kind of PP and has tobe defined

3. PP is determined by inverting the radial KS equation

wl,scr(r) = εPPl −l(l + 1)

2r2+

1

2rRPPl (r)

d2

dr2[rRPP

l (r)]

4. unscreening of the PP - subtraction of the screening effects due to the va-lence electrons

wl(r) = wl,scr(r)− vHartree[nPP ](r)− vXC [nPP ](r)

Non-linear core correction Louie, Froyen, Cohen (1982) [4]

• in the unscreening we actually assumed a linear dependence of the vXC onthe density

• this is only true if the frozen core electrons and the valence states do notoverlap

28

• if overlap is present, linearization will lead to poor transferability becausevXC is a non-linear function

• solution: the unscreened PP is redefined as

wl(r) = wl,scr(r)− vHartree[nPP ](r)− vXC [ncore + nPP ](r)

and the partial core density ncore(r) is supplied together with the PP

• in the calculation, we use

EXC = EXC [ncore + n]

and VXC = VXC [ncore + n]

Non-linear core correction

• ncore(r) is equal to the core density of the atomic reference state in theregion of overlap with valence states

• in order to eliminate the rugged core density, ncore(r) joins the true coredensity at the core cutoff radius rnlc and in the core region equals a smoothpolynomial P (r) going to zero at r = 0

ncore(r) =

ncore(r) for r ≥ rnlcP (r) for r < rnlc P (r)∫ rnlc

0

dr r2 ncore(r) <

∫ rnlc

0

dr r2 ncore(r)

• rnlc defines the region of overlap, typically is chosen at the point where thetrue core density becomes smaller than the atomic valence density

• important for the alkali and transition metals where core orbitals extend intovalence density (Zn, Cd)

• the NLC correction substantially increases the transferability of the PP

Troullier-Martins PP (Troullier, Martins 1991) [5]

• smooth PP, pseudo wave-functions are defined as

RPPl (r) =

RAEnl (r) for r > rl

rl exp[p(r)] for r < rl

p(r) = c0 + c2r2 + c4r

4 + c6r6 + c8r

8 + c10r10 + c12r

12

29

• coefficients ci are imposed by

– norm conservation

– continuity of pseudo wavefunctions and their first four derivatives atr = rl

– zero curvature of the screened PP at the origin

Kleinman-Bylander separable form of PP Kleinman, Bylander (1982) [6]

• a semi-local pseudopotential can be rewritten in a separable form

wl(r, r′) = vl(r)vl(r

′)

• the real-space expression becomes more complicated (truly non-local) butthe Fourier space representation is simpler

• separation of short and long-range part (Coulomb), long-range is local

∆wl = wl(r)− wlocal(r)

• wlocal(r) is chosen so that the rest becomes as small as possible

• typically the most repulsive PP component is chosen as wlocal(r)

w(r, r′) = wlocal(r) +∑l

∆wl(r)l∑

m=−l

Y ∗lm(r′)Ylm(r)δ(r − r′)

wKB(r, r′) = wlocal(r) +∑l

l∑m=−l

φlm(r)∆wl(r)∆wl(r′)φlm(r′)∫

dr ∆wl(r)|φlm(r)|2

Kleinman-Bylander separable form of PP

• the action of wKB(r, r′) on the valence pseudo wavefunctions φlm(r) isidentical to that of the original semi-local one∫

dr′ wKB(r, r′) φlm(r) = wl(r)φlm(r)

• nevertheless w(r, r′) and wKB(r, r′) are two different potentials

• the construction illustrates another dimension of non-uniqueness of PP

30

• when applied to molecular or solid state system, the two forms do not pro-duce identical results (both are equally valid)

• if the PP approximation is valid, the results must be very similar

• KB is the most used form in DFT codes

• “ghost” states problem can arise with a bad choice of the wlocal(r)

Quality of the PP approximation - transferability

• the logarithmic derivative ddr

lnφ (ratio slope-value) plays a role of theboundary condition for the wavefunction at the core cutoff radius rc

• by construction it is adjusted to have the same value in the all-electron andpseudo wavefunctions at the atomic eigenvalues

• in molecular or condensed matter system, the eigenvalues are shifted fromthe atomic ones by bonding or banding

• the norm conservation is related to the logarithmic derivative of the wavefunction

−1

2(rφε)

2 d

dε

(d

drlnφε

)=

∫ R

0

φ2ε r

2 dr

where φε is a solution of the Schrödinger equation for energy ε (not neces-sarily an eigenvalue)

• the first energy derivatives of the logarithmic derivative in the all-electronand pseudo wavefunctions also agree

• if the norm-conserving PP reproduces the scattering properties of the all-electron potential in the neighborhood of the atomic eigenvalues, it does soalso in the entire range of valence bands or molecular orbitals

Quality of the PP approximation - transferability

• norm-conserving PP provide good transferability

• quality and computational efficiency are conflicting requirements but can bereasonably well satisfied by modern norm-conserving PP

• quality of a PP depends on

31

– correct scattering properties

– choice of the core cutoff radius rc, smaller rc are good for accuracybut bad for efficiency (strong PP)

– adequate account of the non-linear XC interaction between core andvalence electrons

– quality of the KB form (avoiding the spurious states)

– validity of the frozen core approximation

• in any case testing in different environments has to be performed

• other common type of PP is the Vanderbilt ultrasoft PP (Vanderbilt 1990)[7] which partially relaxes the requirement of the norm conservation infavour of generating softer PP (useful for oxygen)

Plane-wave representation of the KS equations

• kinetic energy

T =N∑i=1

〈φi| −1

2∇2

r|φi〉 =1

2

∑k,n

∑G

f(εk,n) |ck,n(G)|2 |k + G|2

• potentials are represented in the Fourier space

• Hartree energy

EHartree =1

2

∫ρ(r)vHartree(r)dr =

Ω

2

∑G

vHartree(G)ρ(G)

vHartree(r) =

∫ρ(r′)

|r− r′|dr′

• Ω is the volume of the cell

• Poisson’s equation

vHartree(G) = 4πρ(G)

G2

• vHartree(G) diverges at G = 0 (positive contribution, interaction betweenlike charges)

32

Plane-wave representation of the KS equations

• we consider the external potential in a general non-local formw(r, r′) (pseu-dopotential)

Eei =

∫ρ(r)v(r)dr→

N∑i=1

∫dr

∫dr′φi(r)w(r, r′)φ∗i (r

′)

• local potential is a special case w(r, r′) = w(r)δ(r− r′)

Eei =∑k,n

∑G

∑G′

f(εk,n)c∗k,n(G)ck,n(G′)w(k + G,k + G′)

w(k + G,k + G′) =

∫dr

∫dr′ w(r, r′) exp[−i(k + G)r] exp[i(k + G′)r′]

w(r, r′) =∑j,α

wα(r− ρj −Rα, r′ − ρj −Rα)

w(k + G,k + G′) =∑α

exp[−i(G−G′)Rα]wα(k + G,k + G′)

where ρj are supercell lattice vectors

• wα(k + G,k + G′) is the Fourier transform of the ionic pseudopotential

Evaluation of wα(k + G,k + G′)

• calculation of this matrix element for a general non-local or semilocal PPwα(r, r′) scales as N2

PW which is prohibitively expensive

wα(k,k′) =

∫dr

∫dr′ wα(r, r′) exp[−ikr] exp[ik′r′]

• with PP in separable form (Kleinman-Bylander) it becomes much easier

wα(r, r′) = wlocalα (r)δ(r, r′) +∑l

l∑m=−l

γlm φlm(r)∆wl(r)∆wl(r′)φlm(r′)

wα(k,k′) = wlocalα (k− k′)+∑l

l∑m=−l

γlm

[∫dr exp[−ikr]φlm(r)∆wl(r)

] [∫dr′ exp[ik′r′]φlm(r′)∆wl(r

′)

]

• this procedure scales as NPW

33

Electrostatics

• the local part of pseudopotential contains the tail of the ionic Coulomb po-tential ∼ −1

r

• the electron-ion energy

Eei(G,G′) =∑k,n

f(εk,n)c∗k,n(G)ck,n(G′)w(k + G,k + G′)

diverges at G,G′ → 0 (negative contribution, interaction between unlikecharges)

• ion-ion Coulomb energy

Enn =1

2

∑αα′jj′

′ ZαZα′

|ρj + Rα − ρj′ −Rα′ |

also diverges (positive contribution, interaction between like charges)

• the sum is handled by Ewald summation and the divergencies ofEHartree, Eeiand Enn at G→ 0 cancel each other, leaving a finite contribution (the totalsystem electrons + ions is charge neutral)

Electrostatics

limG,G′→0

[∑k,n

f(εk,n)c∗k,n(G)ck,n(G′)w(k + G,k + G′) +Ω

2vHartree(G)ρ(G)

]+Enn = Erep + EEwald

Erep = Ztotal1

Ω

∑α

Λα

Λα =1

Ω

∫dr

[wα,local(r) +

Zαr

]EEwald =

1

2

∑α,α′

ZαΓα,α′Zα′

Γα,α′ =4π

Ω

∑G 6=0

cos[G · (Rα −Rα′)]

G2exp(−G

2

4η2) +

∑j

erfc(η|ρj + Rα −Rα′|)|ρj + Rα −Rα′ |

− π

η2Ω− 2η√

πδα,α′

34

Total energy

• the Ewald term does not depend on the density and is evaluated only onceat the beginning of each iteration cycle

• the total energy is

Etot = T + E ′Hartree + E ′ei + EXC + EEwald + Erep

• the terms with G,G′ = 0 are excluded from the Hartree and pseudopoten-tial contributions

• exchange-correlation energy (LDA)

EXC =

∫ρ(r)εxc(ρ(r))dr = Ω

∑G

εxc(ρ(G))ρ(G)

• the XC functional εxc(ρ(r)) is local in real space so it has to be evaluatedon a real space grid and εxc(ρ(G)) is obtained via Fourier transform

KS equations in the plane wave representation

• simple form∑G′

HG,G′(k)ck,n(G′) = εk,nck,n(G)

HG,G′(k) =1

2|k + G|2δG,G′ + w(k + G,k + G′)

+ vHartree(G−G′) + vXC(G−G′)

• Hamiltonian matrix for each k point is constructed

• in principle, it can be solved by direct diagonalization and iteration untilself-consistency

• the total energy should be converged both in cutoff and in k-points, but inpractice it is more important to converge the energy differences rather thanthe absolute energy

• diagonalization scales as P 3 where P is the dimension of the matrix

• typically we have ∼ 100 plane waves per atom

• severe limitation to the size of the system

• alternative techniques for minimizing the KS functional have been devel-oped

35

Iteration to self-consistency

• iteration to self-consistency is performed until a convergence criterion isreached

• convergence in energy |E(i) − E(i−1)| < ηE or in density∫dr |ρ(i) −

ρ(i−1)| < ηρ

• to start a the new iteration cycle, the new density is mixed with the previousdensity to avoid oscillations

• simplest version - linear mixing

ρ(i+1) = βρ′ + (1− β)ρi

ρ′(r) =occ∑i

|φi(r)|2

• β ≈ 0.3

• more sophisticated versions of mixing exist, using densities from severalprevious iterations (Broyden 1965) [8]

The band gap problem

• KS orbitals and energies do not have a direct physical meaning

• the plane waves KS method provides a band structure

• valence and conduction bands are typically reproduced reasonably well

• the difference between the bottom of the conduction band and the top ofthe valence band (KS gap) is in semiconductors typically about two timessmaller than the experimental band gap

The band gap problem

• e.g. in Si the experimental indirect band gap is 1.17 eV and the LDA one is0.48 eV (from Γ to 0.85X)

• this is not only due to inaccuracy of LDA, even the exact XC functional willnot guarantee that the band gap has the correct value (Sham and Schlüter1983) [9]

36

• DFT is a ground state theory

• experimentally, the electronic excitation spectrum of a solid is measured byphotoemission experiments

• the excitation spectrum and the band gap are related to quasiparticles

• when self-energy correction (many-body perturbation theory) is taken intoaccount, substantial improvement can be obtained (GW approximation, Hedin1965) [10]

Example 3. DFT study of solid silicon phases (Yin and Cohen 1982) [11]

• Si has ground state electronic configuration 1s22s22p63s23p2

• reference configuration 3s23p0.53d0.5, PP of the Hamann-Schlüter-Chiangtype

• for isolated atom the PP reproduces eigenvalues within few mRy and wave-functions outside the core radius within 1 %

• plane-wave basis is not biased towards any particular crystal structure

• Ecutoff = 11.5 Ry

Example 4. DFT study of solid silicon phases (Yin and Cohen 1982)

• total energy was calculated for 7 crystal structures: fcc, bcc, hcp, cubicdiamond (CD), hexagonal diamond (HD), β-tin, simple cubic (sc)

• for semiconducting CD (HD) phases, 10 (6) special k points in the irre-ducible BZ were sufficient to reach convergence

• other phases were found metallic, number of sampling k-points was 24, 35,70, 36, 60 in the β-tin, sc, bcc, hcp and fcc, respectively

• with increasing Ecutoff the total energy decreases (more variational free-dom) and cohesive energy Ec = Ecrystal/Natom − Eatom increases


• upon formation of a crystal Ekin grows because of localization of electronswhen covalent bonds are formed

• at the same time EXC and Epot decreases

• the CD crystal structure is stabilized due to both EXC and Epot

37

• zero-point (quantum) phonon energy has to be taken into account at T = 0


• equations of state calculated for various phases are fitted on the Murnaghan’sequation

Etot(V ) =B0V

B′0

[(V0/V )B

′0

B′0 − 1

]+ const

where B0 and B′0 are bulk modulus and its pressure derivative at the equi-librium volume V0

• CD phase is slightly more stable than the HD phase

• at p = 9.9 GPa a transition from CD to β-tin structure is predicted (expt. at12.5 GPa at room temperature), driven by the Ewald (ion-ion) energy


• because of norm-conserving properties of the PP, the pseudovalence chargedensity reproduces well the true electronic density

• in order to compare with experiment (X-ray diffraction data) the core struc-ture factors have to be added to the valence structure factors

• the agreement is very good

• phases in the order of decreasing covalent bond character : CD, HD, β-tin,sc, bcc, fcc, hcp

Small computational project - ab initio calculation of Si crystal

• for Si crystal in the diamond phase, calculate the equilibrium lattice con-stant a and bulk modulus B0

• use the CPMD code

• use the LDA

• cell of 2x2x2 unit cells (64 atoms)

• Γ point only for BZ sampling

• Troullier-Martins pseudopotential

• Ecutoff = 12 Ry (check for convergence at one volume)

• calculate few points on the T = 0 equation of state E(V )

• compare results with other calculations and with exp. data

38

5 Exchange-correlation functionals (beyond LDA)Exchange-correlation functionals and their accuracy

• EXC is a relatively small fraction of the total energy of the system

• EXC is about 100 % of the chemical bonding energy

• the exchange-correlation functional is the crucial ingredient of DFT scheme

• empirical and non-empirical functionals

• empirical ones include parameters fitted to some database of systems (molecules,solids, . . . ) and can be seen as interpolation schemes

• non-empirical ones try to incorporate as many exact constraints (knownfrom theory) as possible - can be seen as extrapolation schemes

The pair correlation function

• diagonal element of the second order density matrix

ρ2(r1, r2) = ρ2(r1r2, r1r2)

=1

2N(N − 1)

∑s1...sn

∫. . .

∫dr3 . . . drN |Ψ(r1s1, r2s2, r3s3, . . . , rNsN)|2

ρ(r) =2

N − 1

∫ρ2(r, r′)dr′

• electron-electron repulsion 〈Vee〉

〈Vee〉 =

∫dr

∫dr′

ρ2(r, r′)

|r− r′|

• we define the pair-correlation function h(r, r′) which includes all non-classical effects

ρ2(r, r′) =1

2ρ(r)ρ(r′)[1 + h(r, r′)]

39

The exchange-correlation hole

• for every r, the function h(r, r′) satisfies the sum rule∫dr′ ρ(r′)h(r, r′) = −1

• we define the exchange-correlation hole ρxc(r, r′) of an electron at r

ρxc(r, r′) = ρ(r′)h(r, r′)

which satisfies the sum rule∫dr′ ρxc(r, r

′) = −1

• exchange-correlation hole contains a unit charge with sign opposite to thatof the electron

• the electron-electron repulsion 〈Vee〉 becomes

〈Vee〉 = J [ρ] +1

2

∫dr

∫dr′

ρ(r)ρxc(r, r′)

|r− r′|

The XC functional via the exchange-correlation hole Gunnarsson and Lundquist(1976) [12]

• XC energy functional

Exc[ρ] = T [ρ]− Ts[ρ] + Vee[ρ]− J [ρ]

F [ρ] = minΨ→ρ〈Ψ|T + Vee|Ψ〉 = 〈Ψminρ |T + Vee|Ψmin

ρ 〉

• in the Kohn-Sham non-interacting system there is no Vee

Ts[ρ] = minΨ→ρ〈Ψ|T |Ψ〉 = 〈Φminρ |T |Φmin

ρ 〉

where |Φminρ 〉 is a Slater determinant

• we consider λ coupling between the non-interacting (λ = 0) and normal(interacting, λ = 1) system

Fλ[ρ] = minΨ→ρ〈Ψ|T + λVee|Ψ〉 = 〈Ψλρ |T + λVee|Ψλ

ρ〉

where |Ψλρ〉 minimizes 〈T + λVee〉 and provides density ρ(r)

• ρ(r) remains the same as λ changes

40

The XC functional via the exchange-correlation hole

F1[ρ] = F [ρ] = T [ρ] + Vee[ρ]

F0[ρ] = Ts[ρ]

• we assume a smooth adiabatic connection between λ = 0 and λ = 1

Exc[ρ] = F1[ρ]− F0[ρ]− J [ρ] =

∫ 1

0

dλ∂Fλ[ρ]

∂λ− J [ρ]

• we apply the Hellman - Feynman theorem and get

Exc[ρ] =

∫ 1

0

dλ〈Ψλρ |Vee|Ψλ

ρ〉 − J [ρ]

• only λ = 1 is physical, all other values are fictitious

Exc[ρ] =

∫ 1

0

dλ

∫dr

∫dr′

ρλ2(r, r′)

|r− r′|− J [ρ]

The XC functional via the exchange-correlation hole

Exc[ρ] =

∫dr

∫dr′

ρ2(r, r′)

|r− r′|− J [ρ] where

ρ2(r, r′) =

∫ 1

0

dλ ρλ2(r, r′) =1

2ρ(r)ρ(r′)[1 + h(r, r′)]

Exc[ρ] =1

2

∫dr

∫dr′

ρ(r)ρ(r′)h(r, r′)

|r− r′|=

1

2

∫dr

∫dr′

ρ(r)ρxc(r, r′)

|r− r′|

where ρxc(r, r′) = ρ(r′)h(r, r′) is the average exchange-correlation hole

• Exc[ρ] is Coulomb interaction between each electron and the average exchange-correlation hole around it

• three contributions:

1. self-interaction correction (electron does not interact with itself)2. the Pauli exclusion principle which keeps 2 electrons with parallel spin

apart in space3. the Coulomb repulsion which which keeps any 2 electrons apart in

space

• effects (1) and (2) yield exchange energy (present also at λ = 0) while effect(3) yields the correlation energy (arises only at λ 6= 0)

41

The XC functional via the exchange-correlation holeTwo important consequences:

• the sum rule ∫dr′ ρxc(r, r

′) =

∫dr′ ρ(r′)h(r, r′) = −1

is a local condition providing stringent test on DFT functionals where anapproximate ρxc(r, r′) is used

• Exc[ρ] depends only on the spherically averaged behaviour of ρxc(r, r′)

Exc[ρ] =1

2

∫drρ(r)

∫ ∞0

4πs ds ρSAxc (r, s)

where ρSAxc (r, s) =1

4π

∫Ω:|r−r′|=s

ρxc(r, r′)dr′

is spherically averaged exchange-correlation hole

• ρxc(r, r′) can be further decomposed into exchange and correlation contri-butions

Why LDA works

• Exc[ρ] depends only on the system, spherically and coupling-constant aver-aged XC hole

Exc[ρ] =N

2

∫ ∞0

ds 4πs2 〈ρSAxc (r, s)〉s

where 〈ρSAxc (r, s)〉 =1

N

∫drρ(r)

1

4π

∫Ω:|r−r′|=s

ρxc(r, r′)dr′

• the XC hole in LDA is that of a possible physical system (homogeneouselectron gas) and satisfies several important sum rules

• the LDA model for 〈ρSAxc (r, s)〉

〈ρSAxc (r, s)〉 =1

N

∫drρ(r)ρunifxc (ρ(r), s)

can be quite realistic

• the system average unweights the regions where LDA is less reliable

• cancellation of errors between the exchange and correlation energy

42

Properties of LDA

• LSDA is exact for the homogeneous electron gas and good approximationin the long-wavelength limit

• LSDA is reliable moderate-accuracy approximation (reproduces chemicaltrends)

• LSDA is sufficient for many purposes in solid-state

• cohesive energies in solids are systematically overestimated

• lattice constants are systematically underestimated (by few %)

• LDA fails to describe hydrogen bonds

Local spin-density approximation (LSDA)

• local density approximation

ELDAxc [ρ] =

∫dr ρ(r)εxc(ρ(r))

where εxc(ρ(r)) corresponds to homogeneous spin-compensated electrongas (ρ(r) = ρ↑(r) + ρ↓(r)))

• local spin-density approximation LSDA is a generalization

ELSDAxc [ρ↑, ρ↓] =

∫dr ρ(r)εxc[ρ↑(r), ρ↓(r)]

where εxc(ρ↑, ρ↓) corresponds to homogeneous spin-polarized electron gas

• necessary when external magnetic field is present

• also in absence of magnetic field it is better, e.g. in systems with unpairedelectrons (open-shell molecules)

• various parametrizations exist

43

Beyond LDA - generalized gradient approximation (GGA) functionals

• ideally, all good properties of LSDA should be preserved and bad onesshould be improved

EGGAxc [ρ↑, ρ↓] =

∫dr ρ(r)εGGAxc [ρ↑(r), ρ↓(r),∇ρ↑(r),∇ρ↓(r)]

• GGA includes a gradient correction to the XC hole near the electron

• while the construction of LSDA is in principle unique, it is not the case withGGA

• BLYP (empirical) - the combination of exchange functional (Becke 1988)[13] with the correlation functional (Lee, Yang, and Parr 1988) [14]

• PBE non-empirical functional (Perdew, Burke, and Ernzerhof 1996) [15]

• GGA corrects LDA and sometimes overcorrects

Further developments

• meta GGA functionals include additional ingredients

EMGGAxc [ρ↑, ρ↓] =

∫dr ρ(r)εMGGA

xc [ρ↑(r), ρ↓(r),∇ρ↑(r),∇ρ↓(r), τ↑, τ↓]

where τσ(r) =∑occ

i12|∇ψiσ(r)|2 is the kinetic energy density for the occu-

pied KS orbitals

• τσ(r) is non-local functional of the density ρσ(r)

• PKZB (Perdew, Kurth, Zupan, Blaha 1999) [16] - one empirical parameter

• TPSS - (Tao, Perdew, Staroverov, Scuseria 2003) [17] - nonempirical func-tional

• adding fraction of exact exchange - hybrid functionals

• exact exchange techniques - third generation of density functionals

• exact exchange is expensive to calculate

44

Problem with van der Waals interactions in DFT

• dispersion interactions are weak but important (physisorption, soft matter,protein folding, . . . )

• van der Waals interactions between distant closed-shell atoms originate fromcorrelated dipolar fluctuations V (R) = −C6R

−6

• in DFT the interaction should arise from a long-ranged correlation hole,very different from that in uniform electron gas

• conventional LDA and GGA miss the van der Waals interactions completelybecause they are local

• fully nonlocal functionals are required to get the long-range forces fromdistant (nonoverlapping) atoms, molecules, and solids

• various solutions (special functionals) have been proposed

• in practice, classical van der Waals interaction terms are often added to ab-initio calculations (Grimme 2006) [18]

Example 8. The influence of density functional models in ab initio MD simulationof liquid water (VandeVondele, Mohamed, Krack, Hutter, Sprik, Parrinello 2005)[19]

• strong effect of functionals on the structure and dynamics (diffusivity) ofliquid water

• BLYP, PBE, and TPSS are too sluggish and overstructured

• some functionals exhibit less structure compared to experiment

• qualitative differences observed

Part II

Ab-initio Molecular Dynamics(Car-Parrinello) MethodLiterature:Giulia Galli, Michele Parrinello, Ab-Initio Molecular Dynamics: Principles and

45

Practical Implementation, Computer Simulation in Materials Science, NATO ASISeries Volume 205, 1991, pp 283-304Dominik Marx, Jürg Hutter, Ab initio molecular dynamics: Basic Theory andAdvanced Methods, Cambridge University Press (2009)

6 Car-Parrinello Molecular DynamicsAb-initio Molecular Dynamics Motivation

• static total energy calculations are good but MD is better

• allows to study chemical reactions and provides information about time evo-lution of electronic properties

• we assume that the Born-Oppenheimer approximation is valid

• true for insulators and semiconductors

• in case of metals, it is more subtle but it is believed to be true as well

• in principle, total energy DFT calculations allow to find an interatomic po-tential V (RI) from first principles

• the Hellman - Feynman theorem allows to find the forces F(RI)

• we have to find the interaction potential for each ionic configuration RI

V (RI) = 〈Ψ0|H|Ψ0〉F(RI) = −∇RIV (RI)

where |Ψ0〉 is the instantaneous electronic ground state

• in order to do MD, we need to perform at least ∼ 104 evaluations of theinteraction potential and forces

Dynamical approach to energy functional minimization Car, Parrinello (1985)[20]

• different approach to DFT calculations without explicit use of the KS equa-tions

46

• minimization of the KS functional in the space of electronic wavefunctionsΨi is a non-linear optimization problem

E[Ψi, RI] =N∑i=1

〈Ψi| −1

2∇2

r|Ψi〉+1

2

∫ ∫ρ(r)ρ(r′)

|r− r′|dr dr′ + Exc[ρ]

+

∫ρ(r)v(r)dr +

1

2

∑I,J

ZIZJ|RI −RJ |

• minimization can be performed by means of a fictitious dynamics of the Ψi

• simplest possibility is steepest-descent dynamics in fictitious time variablet

Ψi(r, t) = −1

2

δE

δΨ∗i (r, t)

δE

δΨ∗i (r, t)= 2HKSΨi(r, t)

Dynamical approach to energy functional minimization

• orthonormality constraints have to be added∫Ψ∗i (r)Ψi(r) dr = δij

• for a given ionic configuration RI the functional E[Ψi, RI] has asingle minimum

• the steepest-descent procedure would lead to absolute minimum ofE[Ψi, RI]without a need for explicit diagonalization

• more powerful techniques than steepest-descent can be used (e.g. conjugategradients)

• to get equilibrium ionic structure, we want to minimize E[Ψi, RI] alsowith respect to the ionic coordinates RI

E0 = minRI

minΨiE[Ψi, RI]

• at each step the electronic problem is minimized to perfect self-consistency

47

Minimization in the coupled space of electrons and ions

E0 = minRI

minΨiE[Ψi, RI] = min[RI,Ψi]E[Ψi, RI]

• generalized scheme: electrons and ions can be optimized at the same time

µiΨi(r, t) = −1

2

δE

δΨ∗i (r, t)+ constraints

MiRI = − ∂E

∂RI(t)

• “masses” µi, Mi are introduced to bring both problems to the same timescale

• electrons and ions can be relaxed at the same time

• perfect electronic self-consistency is not needed at every step of the ionicminimization and is only reached at the end of the procedure when the min-imum is found

• the scheme allows to optimize large systems

Car-Parrinello Lagrangian Car, Parrinello (1985) [20]

• instead of local optimization one prefers to perform global optimization(there can be more minima as function of RI)

• second-order Newton dynamics for all degrees of freedom generated by anextended Lagrangian

L = 2occ∑i

∫dr µi|Ψi(r)|2 +

1

2

∑I

MIR2I − E[Ψi, RI]

+ 2∑ij

Λij

(∫dr Ψ∗i (r)Ψj(r)− δij

)MI are physical ionic masses and µi are arbitrary parameters

• two classical kinetic energy terms

Ke = 2occ∑i

∫dr µi|Ψi(r)|2 electrons

KI =1

2

∑I

MIR2I ions

48

Car-Parrinello Lagrangian

• equations of motion

µΨi(r, t) = −1

2

δE

δΨ∗i (r, t)+∑j

ΛijΨj(r, t)

MIRI = − ∂E

∂RI(t)

• sampling of the parameter space can be performed with MD

• analogy to simulated annealing (Kirkpatrick, Gelatt and Vecchi 1983)

• a global minimum can be found (unlike steepest-descent dynamics)

Car-Parrinello Molecular Dynamics

• the Newtonian dynamics can be used to simulate quasi Born-Oppenheimertrajectories for the ions

• the correct MD would be obtained from the equations of motion

MIRI = −∂V (RI)∂RI

• the ionic trajectories generated by the CP Lagrangian would coincide withthe true BO ones only when E[Ψi, RI] is all the time in the instanta-neous minimum

• if the fictitious electronic masses µi are chosen in such way that the timescale of electronic motion is much smaller than that of the ionic motion,there will be very little energy transfer between the electrons and ions

• if the electrons are at the beginning in the minimum, they will remain closeto the minimum for a long time, close to the instantaneous BO surface

• the electrons must be “cold” regardless of the ionic temperature

• the system must remain in a metastable two-temperatures regime

• this regime of CPMD is efficient in keeping E[Ψi, RI] close to theminimum without explicit minimization

49

Constants of motion

• the Car-Parrinello equations of motion can be integrated to perform ab initioMD

• the Lagrangian is time-independent and has a conserved quantity (with nodirect physical significance)

H = Ke +KI + E[Ψi, RI]

• as long as “fictitious” kinetic energy Ke remains negligible, the physicalhamiltonian HI is approximately conserved

HI = KI + E[Ψi, RI] ≈ const

• the CP simulation can be seen as simulation of the ionic system in a micro-canonical ensemble

• usual statistical mechanics averages for the ions can be calculated along thetrajectory

• Ke measures the deviation from the Born-Oppenheimer surface

• Ke should not have an appreciable drift during the MD run

• Ψi(r, t = 0) = 0 is chosen as initial condition for the wavefunctions

Example 9. Toy system of 8 Si stoms in the diamond structure (Pastore, Smar-giassi, Buda 1991) [21]

• system is an insulator with direct gap of 2.24 eV at Γ

• H is constant

• HI is practically constant

• Ke oscillates within ∼ 3 × 10−5 Hartree, the equipartition value would be9.5× 10−3 Hartree

• total of 20000 time steps

• no systematic drift of Ke

• at the end of the simulation, the departure from the Born-Oppenheimer sur-face was < 10−5 Hartree

50

Example 10. Toy system of 8 Si stoms in the diamond structure (Pastore, Smar-giassi, Buda 1991)

• two different time scales are clearly visible on the evolution of the Ke

• the slower one corresponds to the ionic motion

• the faster (and smaller) component is due to the intrinsic dynamics of theKS orbitals

• the fast component allows the electrons to follow adiabatically the slowlychanging ground state

Dynamics of electronsThe dynamics of the electrons is determined by the KS eigenvalues

• for small deviations from the ground state the dynamics corresponds to setof harmonic oscilators with frequencies

ωij = [2(εj − εi)/µ]12

where εj (εi) is the KS eigenvalue of an empty (occupied) state

• the lowest frequency that can appear in the system is

ωmin =

(2Egµ

) 12

where Eg is the energy gap

Control of adiabaticity

• the high-frequency motion of the electrons serves two purposes:

1. allows the electrons to follow the slow motion of the ions

2. makes the irreversible exchange of energy between the fast and slowvariables very small

• ωelmin ωionmax is the condition of the adiabatic separation

• ωionmax and gap Eg are physical properties of the system

• we can only control the adiabaticity by the fictitious electron mass µ

51

• decreasing µ shifts the electronic spectrum upwards, but also stretches thewhole spectrum

• the maximum electronic frequency scales as

ωelmax ∝ [Ecutoff/µ]12

• the maximum MD time step is proportional to the inverse highest frequencyin the system ∆tmax ∝ [µ/Ecutoff ]

12

• compromise between adiabaticity and time step has to be made

• typical values are µ ∼ 500 - 1500 a.u.

• typical time step is ∆t ∼ 5− 10 a.u. (0.12 - 0.24 fs)

The case of metalsCPMD works well for the systems with large gap

• in metals the gap is close to zero

• the violation of adiabaticity causes drift of the electronic energy whichcauses problems

• Nosé-Hoover thermostats applied separately to the electrons and ions (Blöchland Parrinello 1992) can keep the two temperatures different

• for metals it is certainly better to use a true Born-Oppenheimer MD

• another source of non-adiabaticity is level crossing of the electronic levels

CP forces and Born-Oppenheimer forces Pastore, Smargiassi, Buda (1991)[21]

• the deviations of the CP forces from the true Born-Oppenheimer ones aresmall and oscillatory

• average CPMD forces reproduce the BO ones with high accuracy

52

CP equations of motion in the plane wave representation

• CP Lagrangian

L = µocc∑i

∑G

|ci(G)|2 +1

2

∑I

MIR2I − E[G, RI]

+∑ij

Λij

(∑G

c∗i (G)cj(G)− δij

)

• the orthonormality constraint does not depend on the nuclear positions

• equations of motion

µci(G, t) = − ∂E

∂c∗i (G)+∑j

Λijcj(G)

MIRI = − ∂E

∂RI

• the 2 sets of equations are coupled through the KS functional

• EOM can be integrated by the velocity Verlet/RATTLE algorithm

Integration of the CP equations of motion

53

˙RI(t+ δt) = RI(t) +δt

2MI

FI(t)

RI(t+ δt) = RI(t) + δt ˙RI(t+ δt)

˙ci(t+ δt) = ci(t) +δt

2µfi(t)

ci(t+ δt) = ci(t) + δt ˙ci(t+ δt)

ci(t+ δt) = ci(t+ δt) +∑j

Xijcj(t) where Xij =δt2

2µΛPij

calculate FI(t+ δt)

calculate fi(t+ δt)

RI(t+ δt) = ˙RI(t+ δt) +δt

2MI

FI(t+ δt)

c′i(t+ δt) = ˙ci(t+ δt) +δt

2µfi(t+ δt)

ci(t+ δt) = c′i(t+ δt) +∑j

Yijcj(t+ δt) where Yij =δt

2µΛvij

Imposing the orthogonality constraints

• Lagrange multipliers for positions ΛPij and velocities Λv

ij are treated as inde-pendent variables

C†(t+ δt)C(t+ δt)− I = 0

XX† + XB + B†X† = I−A

Aij = c†i (t+ δt)cj(t+ δt)

Bij = c†i (t)cj(t+ δt)

• the equation can be solved iteratively, starting from X(0) = 12(I−A)

X(n+1) =1

2

[I−A + X(n)(I−B) + (I−B)X(n) − (X(n))2

]• orthogonality condition on the orbital velocities

c†i (t+ δt)cj(t+ δt) + c†i (t+ δt)cj(t+ δt) = 0

is applied to the trial states C′ + YC

54

• the rotation matrix Y is obtained without iteration

Y = −1

2(Q + Q†), where Qij = c†i (t+ δt)c′†j (t+ δt)

Scaling of the CPMD method

• N is number of occupied orbitals

• M is number of plane waves

• kinetic energy is diagonal so its action on Ψ is O(NM)

• action of local potential on Ψ is O(NM logM) (FFT)

• action of non-local terms (pseudopotential) on Ψ is O(N2M)

• orthogonalization procedure is also O(N2M)

• if N M , the big dimension M enters only via M or M logM

• significant improvement over O(M3) scaling of a direct diagonalization

Practical aspects of CPMD simulation

1. for the initial ionic configuration the wavefunction must be well optimized

2. equations of motion are integrated

3. the kinetic energy of the electrons has to be monitored

4. if the electrons become hot, the wavefunction has to be quenched to the BOsurface

Example 11. Importance of the choice of the fictitious mass µ in study of liquidwater (Grossman, Schwegler, Draeger, Gygi and Galli 2004) [22]

• water is a wide-gap insulator with εHOMO − εLUMO ∼ 4 eV

• vibrational spectra of the electronic degrees of freedom are calculated from

ν(ω) =

∫ ∞0

dt cos(ωt)〈∑i

Ψi(t)Ψi(0)〉

• the highest ionic mode is the O-H stretch mode

• values of µM

up to ∼ 15

are appropriate

• values of µM∼ 1

3are too large and result in a strong deviation in structural

properties

55

A different possibility - Ehrenfest MD

• there are various ways how to make the electrons follow adiabatically theground state as the ions move

• Ehrenfest MD - Newton dynamics for ions and time-dependent Schrödingerequation for electrons

MIRI(t) = −∇RI〈Ψ0(t)|He|Ψ0(t)〉

i∂

∂t|Ψ0(t)〉 = He(RI, ri)|Ψ0(t)〉

• if the Born-Oppenheimer approximation is valid, electrons follow adiabati-cally and remain in the ground state (no diagonalization needed)

• propagation of the wavefunction is unitary and so the norm and orthogonal-ity is preserved

• Trotter-Suzuki formula can be used to derive the integrator

exp(−iHδt) ' exp(−iT δt2

) exp(−iV δt) exp(−iT δt2

)

• practically, integration of the Schrödinger equation is difficult and requiresextremely small time step

• CPMD combines advantages of Born-Oppenheimer and Ehrenfest MD

Example 12. Simulation of thermally induced conversion of carbon from diamondto graphite (De Vita, Galli, Canning, Car 1996) [23]

• diamond is a high-pressure phase of carbon and is metastable at ambientconditions

• the most stable phase is hexagonal graphite

• the barrier for conversion to graphite is very high and diamond is stableeven on a geological time scale

• important for chemical vapour deposition growth of diamond films whichmay contain graphitic islands

• experiments suggested that the transition is surface-induced

• CPMD study on a diamond slab containing 12 (111) atomic layers, sepa-rated by a vaccum slab of 9 Å

56

• slab is initially terminated by (2x1) reconstructed surfaces

• 192 atom slab was heated to T ∼ 2500 K

• cutoff of 33 and 99 Ry used for orbitals and density, respectively

• only Γ point used in the calculations

• total simulation time about 15 ps

Example 13. Simulation of thermally induced conversion of carbon from diamondto graphite (De Vita, Galli, Canning, Car 1996)

(a) original reconstructed slab at 0 K

(b) initial stage of graphitization at 2500 K

• nucleation of a graphitic seed

• initial breaking of bonds

• flattening of double layer

• change of hybridization from sp3 to sp2

(c) fast penetration of the graphite phase into the diamond slab proceeding viacorrelated breaking of nearest neighbour bonds allong the z-direction

(d) end of the phase transition, defect-free graphitic planes

Example 14. Simulation of thermally induced conversion of carbon from diamondto graphite (De Vita, Galli, Canning, Car 1996)

• the seed of the transition from previous simulation was taken and simulatedat 1000 K

• a stable interface was formed

• to reduce finite-size effects, a larger 288 atom system was prepared fromthe interface and simulated in the range 0 - 1000 K, interface still stable

• networks of graphitic sheets are coherently connected to the buckled dia-mond (111) double layers

• the interface reconstruction maximizes the number of 6-membered rings atthe phase boundary

57

7 Extensions of the CPMD techniquePulay stress Froyen and Cohen 1986

[24]

• the use of an incomplete plane-wave basis set has some consequences

• NPW at constant cutoff Ec depends on the volume NPW ∼√

23π2 ΩE

32c

• if Etot is not completely converged, Etot = Etot(NPW ,Ω)

• Etot(Ω) curves at constant NPW and constant Ec are different

• in lack of full convergence it is better to keep Ec = const because thisguarantees constant spatial resolution

• stress tensor is typically calculated with NPW = const

• Pulay stress arises from the volume dependence of NPW

dEtot

dΩ=

∂Etot

∂Ω

∣∣∣∣NPW

+∂Etot

∂NPW

∂NPW

∂Ω=∂Etot

∂Ω

∣∣∣∣NPW

+∂Etot

∂Ec

∂Ec∂NPW

∂NPW

∂Ω

pEc = pNPW −2

3

EcΩ

∂Etot

∂Ec

∣∣∣∣Ω

• the correction term can be easily determined from few calculations withdifferentEc (should be taken into account in constant-pressure calculations)

Constant-pressure Car-Parrinello MD Focher, Chiarotti, Bernasconi, Tosatti,Parrinello (1994) [25]

• various extended Lagrangian techniques can be combined

• combination of Car-Parrinello and Parrinello-Rahman Lagrangians

• the original wavefunctions depend on the cell so they have to be expressedin the scaled variables s

L = µ

occ∑i

∫ds |Ψi(s)|2 +

1

2

∑i

MI(STI GSI)− E[Ψi, hSI]

+∑ij

Λij

(∫ds Ψ∗i (s)Ψj(s)− δij

)+

1

2W Tr hT h− PV

where h = (a,b, c), r = hs, V = det h,G = hTh

58

• as the cell fluctuates, the G vectors change and the number of plane waveswithin given Ecutoff fluctuates too

• allows study of structural transitions in solids without empirical potentials

• important to study matter at extreme conditions (temperatures, pressures),e.g. geoscience, planetary science

Example 15. Simulation of pressure-induced conversion of carbon from graphiteto diamond (Scandolo, Bernasconi, Chiarotti, Focher, Tosatti 1995) [26]

• at pressure of ∼ 15 GPa and T > 1000 K graphite converts to the high-pressure phase, diamond

• process of great technological and fundamental importance

• two kinds of diamond structure, cubic and hexagonal (Lonsdaleite)

• in shock wave experiments, formation of hexagonal diamond was observed

• three kinds of graphite - hexagonal, orthorhombic and rhombohedral

• constant-pressure CPMD study, cutoff of 35 Ry, only Γ point

• starting structure hexagonal diamond, 48 or 64 atoms supercell

Example 16. Simulation of pressure-induced conversion of carbon from graphiteto diamond (Scandolo, Bernasconi, Chiarotti, Focher, Tosatti 1995) Possible geo-

metric paths for graphite to diamond transformation

(a) rhombohedral graphite to cubic diamond

(b) orthorhombic graphite to cubic diamond

(c) orthorhombic graphite to hexagonal diamond

• stacking of graphite determines the orientation of the final diamond phase

• shock-wave experiment is consistent with the mechanism (b)

Example 17. Simulation of pressure-induced conversion of carbon from graphiteto diamond (Scandolo, Bernasconi, Chiarotti, Focher, Tosatti 1995) Dynamics of

the transformation from graphite to diamond in a 48 atom cell

• compression to 30 GPa at 200 K, sliding of the graphitic planes betweenhex and orth

59

• heating to 1000 K

• fast increase of pressure, sliding of the graphitic planes between orth andhex

• at 90 GPa (much higher than the exp. value of 15 GPa) orthorhombicgraphite converted to cubic diamond

• mechanism corresponds to the path (b), no formation of hex diamond ob-served

Example 18. Simulation of pressure-induced conversion of carbon from graphiteto diamond (Scandolo, Bernasconi, Chiarotti, Focher, Tosatti 1995) Transformation

from graphite to diamond in a 64 atom cell

• in a bigger cell more paths are possible

• simulation performed at 150 and 1500 K

• sliding of graphite to orth stacking observed again

• at 1500 K hex diamond was found

• at 150 K structure with mixed hexagonal and cubic diamond arrangementwas found

• role of the orthorhombic graphite confirmed

Combination of CPMD and path integrals Marx and Parrinello 1994 [27]

• when light atoms (hydrogen, . . . ) are present, quantum effects on the ionicmotion are important

• static quantum effects can be calculated within the path-integral formalism

• path-integrals can be sampled by either MC or MD

• extended Lagrangian combining CPMD and path-integrals

L =1

P

P∑s=1

occ∑i

µ〈ψ(s)i |ψ

(s)i 〉 − E[ψ(s)

i , R(s)I ]

+∑ij

Λ(s)ij

(〈ψ(s)

i |ψ(s)j 〉 − δij

)

+P∑s=1

∑I

1

2M ′

I(R(s)I )2 −

∑I

1

2MIω

2P (R

(s)I −R

(s+1)I )2

60

• P is the Trotter number, M ′I 6= MI is fictitious mass for the ions, ω2

P =P/β2

Combination of CPMD and path integrals

• path integral is sampled by MD, the dynamics is unphysical (just for sam-pling the configuration space)

• Problems:

– because of the harmonic springs, path integral representation behavesas collection of nearly harmonic oscillators with ωP ∝

√P

– MD sampling of the ionic system presents strong non-ergodicity, forlarge P potential energy 1

PU(R) is just a small perturbation

– because of the high ionic frequencies the adiabaticity condition ofCPMD may be violated, use of small µ would require too small timestep

• Solution:

– can be solved by diagonalization of the harmonic part of the Lagrangian- staging which brings all modes to the same scale (Tuckerman, Marx,Klein, Parrinello 1996) [28]

– application of Nosé-Hoover chain thermostat to the ions (massive ther-mostatting - one chain for each Cartesian component of ionic coordi-nates)

– application of Nosé-Hoover chain thermostat to the electrons

• method parallelizes trivially - each Trotter slice runs on separate node, littlecommunication

Example 19. Tunnelling and zero-point motion in high-pressure ice Benoit, Marx,Parrinello (1998) [29]

• normally the configuration O – H . . . O is highly asymmetric (one covalentand one hydrogen bond)

• upon high pressure the bonds get shorter

• the normal configuration can transform into one where the proton stays inthe middle between between two oxygens

• transition from ice VII to ice X - non-molecular phase of ice

61

• quantum tunnelling effects crucial - constant-pressure path-integral CPMD

Example 20. Tunnelling and zero-point motion in high-pressure ice Benoit, Marx,Parrinello (1998)

• the symmetrization transition occurs in quantum system at ∼ 72 GPa whilein the classical one at ∼ 102 GPa

• two forms of ice X: proton-disordered symmetric ice (c) and proton-orderedsymmetric ice (d)

Metadynamics and CPMD Iannuzzi, Laio, Parrinello (2003) [30]

• Lagrangian metadynamics - useful to study chemical reactions

• large barriers can be crossed in very short time (few ps) and a lot of energyis released to degrees of freedom associated with the collective variable

• set of collective variables sα(RI), α = 1, . . . , d

• Lagrangian includes the dynamical collective variables sα(RI) with fic-titious mass Mα

L = L0 +∑α

1

2Mα

˙s2α −

∑α

1

2kα[sα(RI)− sα]2 + V (t, s)

L0 is the usual CPMD Lagrangian

• the instantaneous values of the collective variables Sα(RI) fluctuate aroundthe corresponding Sα

• the scheme allows for explicit control of the dynamics and temperature ofthe collective variable sα via thermostat, avoiding possible instabilities

Metadynamics and CPMD

• if the mass Mα is large, the dynamics of the collective variables is slow andadiabatically separated from that of ions and electrons

• Mα is tuned to obtain a smooth evolution of sα

• the spring constants kα must be chosen in such a way that the typical valueof the difference |sα(RI) − sα| is smaller than the length on which thefree energy varies of ∼ kBT

62

• the history-dependent potential can be constructed in form of a d-dimensionalGaussian tube accumulating infinitesimal tube slices along the trajectory ofs

V (t, s) =

∫ t

0

dt′| ˙s(t′)|w exp

−(s− s(t′))2

2δs2

δ

(˙s(t′)

| ˙s(t′)|· (s− s(t′))

)

δs is the size in the orthogonal direction

Dehydrogenation process of a Si6H8 cluster, collective coordinates are Si-Si, Si-H,and H-H coordination numbers

Example 21. Study of Azulene-to-Naphthalene Rearrangement Stirling, Iannuzzi,Laio, Parrinello (2004) [31]

• complex potential energy surface

• various C-C and C-H coordination numbers used as collective variables toexplore various proposed mechanisms

• several possible mechanisms were found with different barriers

Hybrid Schemes Quantum Mechanics/Molecular Mechanics (QM/MM)Literature:P. Sherwood, in Modern Methods and Algorithms of Quantum Chemistry NIC

Ser. Vol. 1, John von Neumann Institute for Computing, Juelich 2000, pp. 257 -277

• despite all progress in ab initio methods, it is not possible to treat largeenough systems

• in chemistry (catalysis), biochemistry (study of enzymes), materials scienceetc., often many thousands of atoms participate in the processes

• partitioning of the system to the QM and MM regions - multiscale approach

• QM is used to model more important processes in the inner region (small)

• MM (force field) is used for the less important processes in the outer (large)region (environment)

• the boundary must be sufficiently far from the reaction center (requireschemical intuition and testing)

• ideally, the boundary should not cut molecules (not always possible)

63

• if the boundary cuts a covalent bond, the interaction between the two sys-tems is strong and the QM calculation in the QM region alone does notmake sense

Approaches to QM termination

• treatment of bonds that connect different sides of the boundary is the mostdifficult aspect

• some way of termination or treating the boundary is necessary

1. link atoms

• additional centers are added to the QM calculation which are invisibleto the MM calculation

• positions of the link atoms are either independent variables or functionof positions of atoms in the (I) and (O) regions

2. boundary region

• subset of centers of the systems which are present in both QM andMM calculations

Construction of the QM/MM energy expressionTwo ways to construct the total energy of the system (E = entire, I = inner, O

= outer, L = link)

Additive schemes

• the QM and MM energies are considered complementary

• the case of link atoms

EQM/MM = EOMM + EI,L

QM + EI,OQM−MM − E

I,Lcorr

• the term EI,OQM−MM includes all terms that couple the two regions (e.g. van

der Waals interactions)

• mainly used in biomolecular modeling

• the case of boundary-region based methods

EEQM/MM = EO,B

MM + EI,BQM + EI,O,B

QM/MM

• the boundary atoms are treated by both QM and MM so the classical energymust be modified to avoid multiple counting of the interactions

64

Subtractive schemes

• the entire system is treated by MM and the energy of the I region is alsoevaluated by MM

• for the link atoms schemes

EEQM/MM = EE

MM + EI,LQM − E

I,LMM

• no coupling term is required

• the difference between the QM and MM forces in the link region should besmall

• it is useful if the forcefield is designed to reproduce the forces of the partic-ular QM approximation used for the inner region

• not suitable when the electronic ctructure of the QM region is significantlyperturbed by the environment

• simpler to implement, mainly used in chemistry

Choice of QM and MM models

• QM - typically DFT (CPMD)

• sometimes even a higher level method is necessary (QM/QM/MM)

• choice of MM depends on whether additive or subtractive scheme is used

• in additive schemes, the type of MM model has large influence on the treat-ment of the boundary

• two kinds of force fields:

– valence force fields - bond stretching terms, angle bending terms, tor-sions, etc.

– ionic force fields based on Coulomb and short range forces (shell mod-els)

• in additive schemes based on link atoms, it is easier to use the valence forcefields - more easy to separate the different contributions and leave out whatis treated by QM

65

Treatment of QM/MM electrostatic terms

A mechanical embedding - QM calculation is performed in the gas phase

B electrostatic embedding - classical partition appears as external charge dis-tribution, polarization of the QM part due to the MM charges occurs as partof the QM calculation

C polarised embedding polarisation of the MM region due to charges in theQM region is also included

D extension of C where the QM and MM polarizations are brought to self-consistency, e.g. iteratively

Termination of the QM region

• to terminate a broken covalent bond the simplest choice is to add a hydrogenlink atom - “capping” of the dangling bond

• the position of the link atom is an extra degree of freedom

• optimised link atoms - the link atoms are allowed to move

• constrained link atoms - link atom coordinates are expressed as function ofreal atom coordinates

• the force acting on the link atom has to be distributed to the real atoms

dE

dRI

=∂E

∂RI

+∂E

∂RL

∂RL

∂RI

• the constrained approach is better for MD - no spurious vibrational frequen-cies are introduced

Example 22. QM/MM study of defect formation in silica Zipoli, Laino, Laio,Bernasconi, Parrinello (2006) [33]

• a defect in a solid implies a change in electronic configuration as well asrelaxation of the lattice

• if the fully QM technique is used, the size limitation (few hundred atoms)would imply extremely high concentration of defects and strong defect-defect interaction

• in QM/MM, the defect can be treated by QM and the bulk part by MM

• the QM subsystem is embedded in a classical cluster composed of 508 SiO2

units

66

8 Minimization techniques for the KS functionalLiterature:M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias and J. D. Joannopoulos, Rev.Mod. Phys. 64, 1045 - 1097 (1992)

Two different approaches to optimization of KS orbitals

• Fixed-point methods (iterative diagonalization)

– we do not need all eigenvectors and eigenvalues, only the occupiedones (N M )

– there are methods which allow to do this and do not require storage ofthe full matrix, just the calculation of H|Ψ〉

– Lanczos (1950), Davidson (1975), . . .

– used together with iterative improvement of the density (mixing)

– robust, well suited for systems with no or small gap (metals)

– for review, see Kresse and Furthmüller (1996) [34]

• Direct non-linear minimization methods

– steepest descent

– conjugate gradients

– convergence acceleration schemes

– orbital rotation schemes

– . . .

• in both cases preconditioning is useful

Steepest descent

• function F (x) to be minimized in multidimensional space x

• the simplest possibility is to follow the direction of negative gradient

• gradient operator G

g1 = − ∂F

∂x

∣∣∣∣x=x1

= −Gx1

67

• one moves from the point x1 in the direction g1 to the point x2 = x1 + b1g1

where F (x1 + b1g1) has minimum

g1 ·G(x1 + b1g1) = g1 · g2 = 0

• the new search direction is orthogonal to the previous one

• the process converges but often needs a large number of steps, especially incase of strongly anisotropic valley

• at each step the search direction is chosen using only information from thepresent point

Conjugate gradients

• a better technique would combine the information from all the samplingpoints and search in direction orthogonal to previous searches

• we start from point xi and perform line-minimization of the function F (xi+λidi) along the direction di

• the new minimum is xi+1

• the search directions di are obtained from

dm = gm + γmdm−1

γm =gm · gm

gm−1 · gm−1

, γ1 = 0

• the directions di are conjugate

• minimizations along conjugate directions are independent

• subsequent step does not spoil the previous one

Preconditioned conjugate gradients scheme

• for efficient optimization in many dimensions it is useful to bring all degreesof freedom onto the same length scale - preconditioning

• for large G vectors the KS matrix HG,G′ is dominated by kinetic energywhich is diagonal - simple preconditioner

KG,G′ = HG,G δG,G′ if|G| ≥ Gc

KG,G′ = HGc,Gc δG,G′ if|G| ≤ Gc

where Gc is a free parameter (one can take Gc = 0.5 Hartree for eachsystem)

68

Conjugate gradients for plane-wave wavefunction optimization

• update scheme

ci(G) ← ci(G) + αhi(G)

hni (G) =

gni (G) n = 0gni (G) + γn−1hn−1

i (G) n = 1, 2, 3, . . .

gni (G) = K−1G,Gψ

ni (G)

γn =

∑i〈g

n+1i (G)|gn+1

i (G)〉〈gni (G)|gni (G)〉

where ψni (G) is the wavefunction gradient

• it is efficient to perform the update band by band (Teter, Payne, Allan 1989)[35]

Direct inversion in the iterative subspace (DIIS) Pulay (1980) [36], Hutter,Lüthi, Parrinello (1994) [37]

• convergence acceleration for iterative sequences, using information frommprevious iterations

• we have a sequence of m vectors xi

• assume that for each vector we can guess the difference to the stationarypoint x0

ei = xi − x0

• ansatz: find the best linear combination

xm+1 =m∑i=1

cixi

ideally we would havem∑i=1

cixi = x0

m∑i=1

cix0 +m∑i=1

ciei = x0

69

Direct inversion in the iterative subspace (DIIS)

• this would be satisfied ifm∑i=1

ci = 1 andm∑i=1

ciei = eM+1 = 0

• the last condition is in practice satisfied only in the mean-square sense

• we introduce a Lagrange multiplier λ and matrix bij = ei · ej and minimize

d

dci(m∑

i,j=1

cicj ei · ej − λm∑i=1

ci) =d

dci(m∑

i,j=1

cibij − λm∑i=1

ci) = 0

• system of (m+ 1) linear equations is solvedb11 b12 . . . b1m 1b21 b22 . . . b2m 1...

... . . ....

...bm1 bm2 . . . bmm 11 1 . . . 1 0

c1

c2...cm−λ

=

00...01

Direct inversion in the iterative subspace (DIIS)

• the error vectors em are not known, but close to the minimum x0 the func-tion is approximately quadratic F (x) ≈ 1

2xHx−ax (H is Hessian, assumed

to be constant)

G(x) ≈ Hx− a = H(x− x0 + x0)− a = H(x− x0)

em = H−1G(xm)

• HessianH can be approximated by the diagonal part of the KS Hamiltonian

HKSG,G′ =

1

2G2 + V H

G,G′ + V XCG,G′ + V i

G,G′

or by the preconditioner matrix KG,G′

• the new estimate for x0 is

x0 ≈ xM+1 − eM+1

70

Born-Oppenheimer MD and CPMD

• due to progress in minimization and iterative diagonalization techniques itis possible to perform also Born-Oppenheimer MD

• for systems with slow ionic dynamics, large time step can be used

• BO might be even faster than CPMD but with a large time step the energyconservation decreases

• in systems with stiff covalent bonds, even in BO a shorter time step has tobe used

• BO is preferred for metals where CPMD has problems

Part III

Special topics9 Tight-binding methodsThe tight-binding method

Literature: C. M. Goringe, D. R. Bowler and E. Hernández, Rep. Prog. Phys.60, 1447 (1997)

• TB is a simplified QM scheme using a series of approximations

• TB is between fully ab initio schemes and empirical models

• TB uses a minimal set of localized basis functions

• the basis set is not explicitly constructed but has the same symmetry prop-erties as the atomic orbitals

• instead of exact many-body Hamiltonian operator a parametrized Hamilto-nian matrix is used

• number of basis states is much smaller compared to plane wave DFT

• e.g. modeling carbon, only 2s and 2p orbitals are considered (4 states)

• origin - linear combination of atomic orbitals (LCAO) Method for the Peri-odic Potential Problem - Slater and Koster (1954)

71

The tight-binding method

• Hamiltonian matrix elements are written as

Hiαjβ(k) =∑Rj ,J

exp [ik · (Rj −Ri)]hαβJ(|Rj −Ri|)GαβJ(l,m, n)

sum goes over periodic images Rj , α, β correspond to atomic orbitals,k, l,m are the direction cosines of the vector Rj −Ri

• J is the angular momentum of the bond

– if either α or β is s, J = 0 (σ)

– if both α and β are at least p, J = 0 or 1 (π) etc.

• GαβJ(l,m, n) is the angular dependence tabulated in the paper by Slater andKoster (1954)

• often the function hαβJ(r) is short-ranged and a cutoff is used (e.g. at sec-ond neighbour distance)

• the whole scheme is described by the functions hαβJ(r), parametrized ana-lytically or numerically (e.g. splines)

• the quality of a TB scheme is determined by the parametrization


• the Hamiltonian can be solved by direct diagonalization

H|Ψi〉 = εi|Ψi〉

Etot = Eband + Erep = 2N∑i=1

εi +∑i 6=j

Uij

• Eband is the band structure term

• pairwise additive Erep acounts for all terms which are not included in Eband

• the H matrix is sparse

• TB is non self-consistent scheme so no iteration is performed, energy isobtained in one step for each atomic configuration

72

• if charge transfer is a problem, local charge neutrality can be imposed as aconstraint

• Hubbard U term can also be used (cost for charging an atom)

• by differentiating Eband (Hellman - Feynman theorem) the forces on atomscan be found and TB MD can be performed


• parameters of TB schemes are either determined from ab-initio calculationsor fitted to experimental results

• compared to full ab initio methods TB is 2 to 3 orders of magnitude faster,but less transferable

• compared to classical potentials, TB is 2 to 3 orders of magnitude slower,but QM nature of bonding is retained

• in TB the angular nature of bonding is correctly described even in configu-rations far from equilibrium structure

• useful in situations where QM effects are important but fully ab initio treat-ment of the large system is impossible

• TB is often used for simulations of carbon structures - fullerenes, nanotubes,onions etc.

• large scale applications - defects (point and extended), dislocations, grainboundaries, study of brittle fracture, crack propagation etc.

Example 23. Spanning the continuum to quantum length scales in a dynamic sim-ulation of brittle fracture Abraham, Broughton, Bernstein, Kaxiras (1998) [39]

• rapid brittle fracture of a Si slab flawed by a microcrack at its center, underuniaxial tension

• multiscale simulation with 3 regions (2 dynamical boundaries):

– continuum (finite elements) - “far-field” region

– atomistic (classical MD) - region around the crack (large strain gradi-ents but no bond rupture)

– quantum - (TB) in the region of bond failure at the crack tip

73

10 Linear scaling methodsLiterature:

Stefan Goedecker, Rev. Mod. Phys. 71, 1085 - 1123 (1999)D R Bowler, T Miyazaki and M J Gillan, J. Phys.: Condens. Matter 14, 2781(2002)

Scaling of electronic structure calculations

• basic operation of all methods is HΨ

• due to use of FFT HΨ scales as NM lnM where N is the number of atomsandM is a number of plane waves (compared toO(NM2) with the straight-forward approach)

• bottleneck is the orthogonalization step which is O(N2M)

• M ∼ N , for large enough systems we get O(N3) scaling

• with larger system there are more basis functions, more wavefunctions andeach wavefunction has to be orthogonal to all the others

• in classical simulations we have typically O(N)

• it is possible to achieve O(N) scaling tCPU = c1N also in electronic struc-ture calculations but the prefactor c1 is very large

• crossover where theO(N) methods win overO(N3) methods is still at largeN

• with faster computers the importance of O(N) methods will grow becausewe will get beyond the crossover point

Locality in quantum mechanics

• QM is local - properties of a certain region are only weakly influenced byfactors which are far away from the region

• this is not taken into account in the theories based on wavefunctions whichextend over the whole system

74

• single-particle density matrix in terms of KS orbitals ψi(r)

F (r, r′) =occ∑i

fiψ∗i (r)ψi(r

′) ρ(r) = F (r, r)

T = −1

2

∫dr′ ∇2

rF (r, r′)∣∣r=r′

V =

∫dr F (r, r)V (r)

• as |r−r′| → ∞, F (r, r′)→ 0, exponentially in insulators and algebraicallyin metals

• in principle it should be possible to achieve O(N) scaling

• concept of localization region - quantities are evaluated only inside the re-gion and neglected outside

Strategies for O(N) scaling

• typically O(N) algorithms are based either directly on the density matrix orits representation in terms of localized (Wannier) functions

• finite temperature density matrix is the Fermi operator

F = f [β(H − µ)] where f(x) =1

1 + exp(x)

• e.g. the Fermi operator expansion method (Goedecker and Colombo 1994)approximates the function f(x) by a polynomial over the interval betweenthe smallest and largest eigenvalue of H

F =n∑p=0

cpHp

• problems with metals where the gap is zero

• other methods are density matrix minimization, orbital minimization, divide-and-conquer methods, penalty functionals, . . .

• O(N) methods are typically used in connection with TB schemes where theprefactor is smaller due to much smaller number of basis states comparedto DFT

75

Linear scaling method suitable also for metals Krajewski and Parinello 2006[40]

• generic expression for the total energy in theories that can be formulated inan effective single particle form (e.g. Kohn-Sham, tight binding)

E = 2N∑i=1

εi + Vr

• first term is the so-called band structure term, εi are eigenvalues of a single-particle Hamiltonian H

• Vr corrects for double counting and accounts for the direct ion-ion interac-tion, can be calculated in O(N) operations

• the band structure term has an apparent O (N3) complexity

• it is convenient to use grand-canonical ensemble

• grand canonical potential for independent fermions

Ω = − 2

βTr ln

(1 + eβ(µ−H)

)= − 2

βln det

(1 + eβ(µ−H)

)where the identity Tr ln A = ln det A was used

Linear scaling method suitable also for metals

• factorization

1 + e−β(µ−H) =

P/2∏l=1

(M†

lMl

)where P is an even integer and

Ml = 1 + eiπ(2l−1)/P eβP

(µ−H)

• M†lMl is a positive definite operator and the identity holds

det(M†

lMl

)− 12

=

∫D[Φl]e

−Φ†lM

†lMlΦl∫

D[Φl]e−Φ†lΦl

• field theoretical formulation

Ω =4

β

P/2∑l=1

ln

∫D[Φl] e

−Φ†lM

†lMlΦl + const.

• the auxiliary field Φl has the dimension of the full Hilbert space

76


• no sign problem

• observables can be calculated as derivatives of Ω, e.g. Ne = − ∂∂µ

Ω

• the low temperature limit provides the band structure term

Eband = 2N∑i=1

εi = limβ→∞

∂

∂β(βΩ) + µNe

• force on particle I at position RI coming from the band term

FbandI = −∇RI

Ω

• general expression for the derivatives is

∂Ω

∂λ=

4

β

P/2∑l=1

∫D[Φl]

[Φ†l

(∂M†

l

∂λMl + M†

l∂Ml

∂λ

)Φl

]e−Φ†

lM†lMlΦl∫

D[Φl] e−Φ†lM

†lMlΦl

.

• so far no approximation was made


• high-temperature expansion can be made on eβP

(µ−H) in Ml

Ml = 1 + eπ(2l−1)/P

[1 +

β

P(µ−H)

]+O

((β

P

)2)

• we assume the case of tight-binding where H is sparse

• the operator Ml has the same sparsity of H which leads to linear scaling

• no assumption was made on the character of the spectrum εi so the schemeis usable also for metals

• sampling of the integral can be done by drawing a sequence of normal dis-tributed random numbers Ψl and computing Φl

• solving the equations MlΦl = Ψl is O (N)

• no diagonalization necessary

• the quantities calculated are noisy - statistical error

• in case of metals Ml≈P2

can have eigenvalues close to zero

77

Sampling the auxiliary fields by Langevin dynamics

• the distribution e−Φ†lM

†lMlΦl can be sampled by Langevin dynamics

mlΦl = −M†lMlΦl − γeΦl + ξl

ml is a fictitious mass, γe is friction term

• the components α of the white random noise vector ξl obey the relation

〈ξαl (0)ξαl (t)〉 = 2mlγeδ(t)

• friction and random noise simulate collisions with the heat bath

• trajectory of the stochastic differential equation samples the canonical en-semble

• using the Liouville formalism the integration algorithms can be found

• no matrix inversion is necessary, the expensive operation is the matrix-vector product MlΦl which is O (N) because Ml is sparse

• since the inter-atomic forces are noisy, energy-conserving MD cannot beperformed

Using the scheme for MD simulation

• the same Langevin dynamics can be used to sample also the ions

MRI = FI − γIRI + ΞI

• the random force obeys the relations

〈ΞI(0)ΞI(t)〉 = 6kBTMγIδ(t)

〈FI(0)ΞI(t)〉 = 0

• from the electrons Langevin dynamics an approximation FLI = FI + ΞL

I tothe exact forces FI is found

• assumption: the statistical error ΞLI is a white noise⟨

ΞLI (0)ΞL

I (t)⟩ ∼= 6kBTMγLI δ(t) and⟨

FI(0)ΞLI (t)

⟩ ∼= 0

• the noises ΞLI and ΞI add together

MRI = FI −(γI + γLI

)RI + ΞI + ΞL

I

78

Using the scheme for MD simulation

• the value of γLI is unknown but can be tuned until the equipartition theorem⟨12MR2

I

⟩= 3

2kBT is satisfied

• once the value of γLI is determined it has to be kept constant during thesimulation

• the assumptions about ΞLI were explicitly verified for a small system and

found to be satisfied

• even with noisy forces a correct Boltzmann sampling can be performed

• after each ionic displacement the fields Φl evolve under the action of thenew M†

lMl until the distribution is equilibrated

• the fields Φl are used to calculate the ionic forces for the next integrationstep

• the chemical potential µ is continuously adjusted during the simulation tokeep the number of electrons constant

• the algorithm can be parallelized easily

Example 24. Comparison of results for liquid and crystalline Si obtained by thelinear scaling method and standard diagonalization

• crossover point for Si is at ∼ 500 atoms

• 64 Si atoms, P = 200 was sufficient to make the approximation valid

• the crystalline Si is at 300 K (semiconductor)

• the liquid Si is at 3000 K (metallic)

11 Density functional perturbation theoryLiterature: S. Baroni, S. de Gironcoli, A. Dal Corso, P. Giannozzi, Rev. Mod.Phys. 73, 515 (2001)

79

Density functional perturbation theory (DFPT) Baroni, Giannozzi, Testa (1987)[41]

• we want to calculate also derivatives of the total energy with respect tostructural parameters or external perturbations

• linear response in periodic systems

• structural parameters can be particle positions or unit cell

• response to microscopic particle displacements is related to phonons

• response to macroscopic strain is related to elastic constants

• external perturbation can be e.g. static or time-dependent electric field (po-larization, dielectric constant)

• in principle, one can calculate the response via finite differences of energy

• for long-wavelength perturbations a large supercell would be needed

• the case of external electric field is particular

• we consider only static perturbations

General strategy

• the external potential V (λi is a differentiable function of a set of param-eters λi

• Hellman - Feynman theorem

∂E

∂λi=

∫∂Vλ(r)

∂λinλ(r) dr

∂2E

∂λi∂λj=

∫∂2Vλ(r)

∂λi∂λjnλ(r) dr +

∫∂Vλ(r)

∂λi

∂nλ(r)

∂λjdr

• we need the electron-density response ∂nλ(r)∂λi

∆n(r) = 4 Reocc∑i=1

ψ∗i (r)∆ψi(r) = 4occ∑i=1

ψ∗i (r)∆ψi(r)

because for a real potential ψi(r) and ψ∗i (r) are degenerate

• finite-difference operator

∆λF =∑i

∂Fλ∂λi

∆λi

80

Straightforward approach

• first-order perturbation theory correction to a given eigenfunction of theSchrödinger equation is

∆ψn(r) =∑m 6=n

ψm(r)〈ψm|∆V KS|ψn〉

εn − εm

• the charge-density response

∆n(r) = 4occ∑i=1

∑m6=n

ψ∗n(r)ψm(r)〈ψm|∆V KS|ψn〉

εn − εm

• contributions from occupied states cancel each other

• n(r) does not respond to perturbations which only mix the occupied stateswithin the occupied-state manifold

• a knowledge of the full spectrum of HKS is required (no iterative diagonal-ization techniques)

• extensive sumations over empty conduction bands have to be performed

Density functional perturbation theory approach

• variation of the KS orbitals and energies in the first-order perturbation the-ory

(HKS − εn)|∆ψn〉 = −(∆V KS −∆εn)|ψn〉∆εn = 〈ψn|∆V KS|ψn〉

(Sternheimer equation)

• first-order correction to the self-consistent potential

∆V KS(r) = ∆V (r) +

∫∆n(r′)

|r− r′|dr′ +

dvxc(n)

dn

∣∣∣∣n=n(r)

∆n(r)

• equations form a set of self-consistent equations for the perturbed system(∆V KS(r) and ∆n(r) must be selfconsistent)

• all equations are coupled because each ∆ψn depends through ∆n(r) on allother ∆ψm,m 6= n

• generalized linear problem, can be solved iteratively, similar to standard KSproblem

81

Density functional perturbation theory approach

• the operator (HKS − εn) is singular because it has a null eigenvalue

• only the projection on the empty-state manifold is important

• we introduce projectors on occupied (v) and empty (c) KS states

Pv =∑vk

|ψkv 〉〈ψk

v |

Pc = 1−∑vk

|ψkv 〉〈ψk

v |

• we make the operator nonsingular by adding αPv

(HKS + αPv − εn)|∆ψn〉 = −Pc∆V KS|ψn〉

• we only discuss the case of finite gap (insulators, semiconductors)

• the case of metals requires modification

Monochromatic perturbation

• the perturbing potential can be decomposed into Fourier components

∆V KS(r) =∑

q

∆vqKS(r)eiq·r

• one can show that responses to perturbations of different wavelengths aredecoupled

• for each wavevector q a set of self-consistent equations can be solved interms of lattice-periodic functions uk(r) only

• the workload for each calculation is of the same order as that for the unper-turbed system

• phonons can be calculated without the use of supercell

82

Response to particle displacement - Phonons

• we assume a crystalline solid with atoms at regular positions

• Rµ is unit cell, τs is equilibrium position of the atom s within the unit celland uµs is displacement

Rµs = Rµ + τs + uµs

• harmonic force-constants

Φss1αβ (Rµ,Rµ1) =

∂2Etot

∂uµsα∂uµ1s1β= Φss1

αβ (Rµ −Rµ1)

• Φss1αβ (Rµ − Rµ1) depend only on the distance Rµ − Rµ1 (translational in-

variance)

• Fourier transform of Φss1αβ (Rµ −Rµ1)

Φss1αβ (q) =

∑R

e−iq·RΦss1αβ (R) =

1

Nc

∂2E

∂u∗sα(q)∂us1β(q)

• distortion pattern

Rµs [us(q)] = Rµ + τs + us(q)eiq·Rµ


• Φss1αβ (q) has an electronic and ionic contribution

Φss1αβ (q) = elΦss1

αβ (q) +ion Φss1αβ (q) where

elΦss1αβ (q) =

1

Nc

[∫(∂n(r)

∂u∗sα(q))∗∂V ion(r)

∂us1β(q)dr +

∫n(r)

∂2V ion(r)

∂u∗sα(q)∂us1β(q)dr

]V ion(r) =

∑µs

vs[r−Rµ − τs − uµs ]

• vs[r] is the ionic (pseudo-) potential

• the term ionΦss1αβ (q) is independent on the electronic structure (calculated

from Ewald summation)

∂V ion(r)

∂us1β(q)= −

∑µ

∂vs[r−Rµ − τs]∂r

eiq·Rµ

• phonon frequencies are solutions of the secular equation

det

∣∣∣∣ 1√MsMt

Φss1αβ (q)− ω2(q)

∣∣∣∣ = 0

83


• the case of long-wavelength vibrations in polar materials has to be treatedin a special way (LO-TO splitting)

• the accuracy of ab initio lattice dynamics calculations can compete with thatof absorption or diffraction spectroscopies

• DFPT can provide predictions for systems that have not yet been synthe-sized

Response to external electric field

• dielectric constant is defined as

εαβ∞ = 1 + 4π∂Pα∂Eβ

P =−eV

∫dr rn(r)

εαβ∞ = 1− 4πe

V

∫dr rα

∂n(r)

∂Eβ

• we need to calculate

∆En(r) = 4occ∑n=1

ψ∗n(r)∆Eψn(r)

• QM is formulated in terms of potentials, not fields

• potential of the homogeneous electric field VE(r) = eE · r is unboundedfrom below and incompatible with the periodic boundary conditions

∆V KS = eE · r + ∆VH + ∆VXC

Response to external electric field

• position operator r and its matrix elements between states satisfying PBCare ill-defined (boundary dependent)

84

• to get the linear charge-density response we need only off-diagonal matrixelements

∆ψn(r) =∑m 6=n

ψm(r)〈ψm|∆V KS|ψn〉

εn − εm

which can be calculated via

〈ψm|r|ψn〉 =〈ψm|[HKS, r]|ψn〉

εm − εn, ∀m 6= n

• if V KS is local, [HKS, r] is proportional to the momentum operator

[HKS, r] = −~2

m

∂

∂r(in normal units)

• can be generalized also to non-local pseudopotential

12 Berry phase theory of macroscopic polarizationin crystalline dielectrics

Literature: R. Resta, Rev. Mod. Phys. 66, 899 (1994)

Berry phase Berry (1984) [42]Geometric phase in quantum systems

• quantum Hamiltonian with a parametric dependence

H(ξ)|Ψ(ξ)〉 = E(ξ)|Ψ(ξ)〉

• parameter ξ can be quite general

• |Ψ(ξ)〉 is non-degenerate ground state for any ξ

• phase difference between two states does not have any physical meaning

e−i∆φ12 =〈Ψ(ξ1)|Ψ(ξ2)〉|〈Ψ(ξ1)|Ψ(ξ2)〉|

∆φ12 = −Im ln〈Ψ(ξ1)|Ψ(ξ2)〉

because the phase factors of the states are arbitrary

85

Berry phase

• total phase difference along a closed path is gauge-invariant

γ = ∆φ12 + ∆φ23 + ∆φ34 + ∆φ41

= −Im ln〈Ψ(ξ1)|Ψ(ξ2)〉〈Ψ(ξ2)|Ψ(ξ3)〉〈Ψ(ξ3)|Ψ(ξ4)〉〈Ψ(ξ4)|Ψ(ξ1)〉

because all arbitrary phases cancel in pairs

• assume a continuous closed circuit C in the parameter space

e−i∆φ12 =〈Ψ(ξ)|Ψ(ξ + ∆ξ)〉|〈Ψ(ξ)|Ψ(ξ + ∆ξ)〉|

−i∆φ ≈ 〈Ψ(ξ)|∇ξΨ(ξ)〉 ·∆ξ

• in the continuum limit we get a circuit integral of the Berry connection

γ =N∑s=1

∆φs,s+1 →∮C

〈Ψ(ξ)|∇ξΨ(ξ)〉 · dξ

• a gauge-invariant quantity is potentially a physical observable

• the phase γ cannot be expressed in terms of eigenvalues of any operator

Berry phase theory of macroscopic polarizationKing-Smith, Vanderbilt (1993) [43], Resta (1993) [44]

• polarization of a periodic charge distribution is ill-defined (Martin 1974)

• changes of polarization ∆P can be unambiguously defined (related to cur-rent)

∆P =

∫ 1

0

dλdP

dλ

Berry phase theory of macroscopic polarization

• we decompose P into ionic and electronic contributions

∆P = ∆Pel + ∆Pion

∆Pel =1

V

∫dr r∆ρ(r)

86

• Wannier transformation to localized wavefunctions

a(λ)n (r) =

√V

(2π)3

∫dq ψ(λ)

n (q, r)

• periodic charge difference ∆ρ = ρ(1) − ρ(0)

∆ρ(r) = 2eocc∑n=1

∑l

[|a(1)n (r−Rl)|2 − |a(0)

n (r−Rl)|2]

Rl are lattice sites

• ∆ρ is decomposed into sum of localized and neutral charge distributions

• dipole moment per cell

∆Pel =2e

V

occ∑n=1

∫dr r

[|a(1)n (r)|2 − |a(0)

n (r)|2]


• the phases of the Bloch functions are arbitrary so it has to be demonstratedthat ∆Pel is unique

• periodic part u of the Bloch orbitals

u(λ)n (q, r) = e−iq·rψ(λ)

n (q, r) =√V e−iq·r

∑l

eiq·Rla(λ)n (r−Rl)

• basic identity

i

(2π)3

∫dq 〈u(q)|∇qu(q)〉 =

i

(2π)3

∫dq

∫dr u∗(q, r)∇qu(q, r)

= iV

(2π)3

∫dq

∫dr∑l

e−iq·(Rl−r)a∗(r−Rl)∇q

∑m

eiq·(Rm−r)a(r−Rm)

= iV

(2π)3

∫dr∑lm

a∗(r−Rl)a(r−Rm)i(Rm − r)

∫dq e−iq·(Rl−r)eiq·(Rm−r)

= −∑m

∫dr (Rm − r)|a(r−Rm)|2 =

∫dr r|a(r)|2

87


• u(λ)n (q) are eigenstates of the Hamiltonian

H(λ)(q) =1

2m(p + ~q)2 + V (λ)(r)

• u(λ)n (q) satisfy the relation u(λ)

n (q + G, r) = exp(−iG · r)u(λ)n (q, r) where

Gi are reciprocal lattice vectors

• we define q = ξ1G1 + ξ2G2 + ξ3G3 and ξ4 = λ

Gj

∫dr r|a(ξ4)

n (r)|2 = i

∫dξ1dξ2dξ3〈un(ξ)| ∂

∂ξjun(ξ)〉 j = 1, 2, 3

where |un(ξ)〉 = |u(λ)n (q)〉 and the integral is performed over the unit cube

• we consider the parametric Hamiltonian H(ξ) in the 4D space of ξ

• we define the Berry connection

χ(ξ) = iocc∑n=1

〈un(ξ)|∇ξun(ξ)〉


• we want to express the change of polarization ∆Pel as a Berry phase

γ(C) = −∮C

dφ = −∮C

χ(ξ) · dξ

• we choose C to be a contour parallel to ξ3 and ξ4 axes at given values of ξ1

and ξ2

• the two sides parallel to ξ4 give zero contribution since the segments differby G3

• the two remaining sides give contribution

γ(C) = −∮C

dφ =

∮C0

dφ−∮C1

dφ

where C0 is ξ = (ξ1, ξ2, x, 0) and C1 is ξ = (ξ1, ξ2, x, 1), 0 ≤ x ≤ 1

• the component of ∆Pel along G3 is

G3 ·∆Pel =2e

V

∫dξ1dξ2γ(C)

where the double integral is over a unit square

88


• macroscopic polarization is a property of the phase of the wavefunctionsand not of the periodic charge densities (where phase is lost)

• the Berry phase formalism is useful for simulations because it requires onlythe wavefunctions of the initial ξ4 = 0 and final ξ4 = 1 state

• applies only to the periodic case (E = 0), cannot be used to calculate thestatic dielectric constant

• calculation of infrared spectra (Bernasconi, Silvestrelli, Parrinello 1998)[45]

ε2(ω) =2πω

3V kBT

∫ ∞∞

dt e−iωt〈M(t)M(0)〉

where M is the total dipole moment of the sample and the electronic con-tribution is calculated via the Berry phase scheme

• more modern version (Resta 1998) [46] QM Position Operator in ExtendedSystems

〈X〉 =L

2πIm ln〈Ψ0|ei

2πLX |Ψ0〉 where X =

N∑i=1

xi

Part IV

Path Integral Quantum MonteCarlo methods13 Quantum vs. classical simulationWhy do we do classical simulations ?

Because we don’t know how to perform the integration on high-dimensionalphase space. In a classical simulation the probability distribution to be sampledon the phase space is known analytically.

p = exp(−βE(R,P)) (1)

Simulation (e.g. Monte Carlo or molecular dynamics or other methods) just per-forms the sampling of the distribution in the multi-dimensional space.

89

Why do we do quantum simulations ?Because in the quantum case, we don’t even know analytically the distribution,

just the underlying hamiltonian.

ρ = exp(−βH(R, P)) (2)

The density matrix in the above expression is an operator which is not diagonalin R and thus cannot be sampled directly. Simulation has to account for both thefunction to be sampled and the sampling itself.

QMC is more difficult.

Finite-temperature methods

• Path Integral QMC

calculate a trace over the thermal density matrix exp(−βH)

Zero-temperature methods

• Variational QMC

• Projector Methods

calculate properties of a single wavefunction Ψthese techniques can be applied to both continuum and lattice models

14 Imaginary time density matrixConsider a non-relativistic system of N particles with only translational degreesof freedom interacting via a pair potential (for simplicity, not an essential condi-tion) described by the Hamiltonian

H = − ~2

2m

N∑i=1

∇2i +

∑i<j

v(rij) = K + V , (3)

We want to study properties of this system in equilibrium at a finite temperatureT . This means calculating averages of operators corresponding to physical quan-tities. Let Φi be the complete system of eigenfunctions of the hamiltonian withcorresponding eigenvalues Ei, HΦi = EiΦi. For the beginning we ignore thestatistics of the particles and assume they are distinguishable.

90

Prescription of statistical mechanics

For inverse temperature β = 1/kBT we have density matrix ρ and partitionfunction Z and we can calculate the thermal average of an operator O

ρ = exp(−βH)

Z = Trρ =∑i

exp(−βEi)

〈O〉 =1

ZTr[ρO] =

1

Z

∑i

〈Φi|O|Φi〉 exp(−βEi)

However, this form is not very useful - we don’t know neither Φi nor Ei andthere’s no easy way to calculate them for non-trivial many body systems.

Note: Path integrals we will deal with from now on are sometimes called alsoimaginary time path integrals. This originates from the relation between the timeevolution operator in quantum mechanics U = exp(− i

~Ht) and the equilibriumdensity matrix in statistical mechanics ρ = exp(−βH). Substituting for the realtime t in the expression for U the imaginary time t → −iβ~ (Wick rotation) weobtain ρ.

Let’s try a different route.

〈O〉 =1

Z

∫dR

∫dR′〈R| exp(−βH)|R′〉〈R′|O|R〉

=1

Z

∫dR

∫dR′ρ(R,R′; β)〈R′|O|R〉 ,

where we introduced the position representation of the density matrix ρ(R,R′; β) =〈R| exp(−βH)|R′〉. Product of two density matrices is again a density matrix.

exp[−(β1 + β2)H] = exp(−β1H) exp(−β2H) (4)

This can be translated into a convolution property for matrix elements of the den-sity matrix

ρ(R1,R3; β) =

∫dR2ρ(R1,R2; β1)ρ(R2,R3; β2) (5)

In order to obtain a path-integral representation, we now write β = Pτ , whereτ = β/P is the time step. The integer number P is the Trotter number. Theoperator identity

exp(−βH) = (exp(−τH))P

= exp(−τH) exp(−τH) . . . exp(−τH)

91

now can be translated into the following exact convolution property for the densitymatrix

ρ(R0,RP ; β) =

∫. . .

∫dR1dR2 . . . dRP−1

× ρ(R0,R1; τ)ρ(R1,R2; τ) . . . ρ(RP−1,RP ; τ)

For P finite we have a discrete time-path R0,R1, . . . ,RP, taking the limit P →∞ we get a continuous path.

15 Path Integral Monte Carlo (PIMC)What did we gain ?

Quite a lot. The problem is how to evaluate an exponential of an operatorconsisting of two parts which do not commute.

H = K + V, [K,V ] 6= 0⇒exp(−βH) = exp(−β(K + V )) 6= exp(−βK) exp(−βV )

But the density matrices ρ(Ri−1,Ri; τ) in the representation (6) are at inversetemperature τ = β/P , and therefore correspond to temperature PT which is Ptimes higher than the true temperature T .

At high temperatures quantum mechanics becomes equivalent to the classicalmechanics !

For large P , τ = β/P is small enough and we can apply the short time ap-proximation for ρ(Ri−1,Ri; τ) which consists of writing

exp(−τ(K + V )) ≈ exp(−τK) exp(−τV ) . (6)

The convergence of this approximation for large P is guaranteed by the Trotterformula

exp(−β(K + V )) = limP→∞

[exp(−τK) exp(−τV )]P (7)

This is the simplest approximation possible and is called the primitive approxima-tion. Inserting the decomposition of the identity operator 1 =

∫dR′1 |dR′1〉〈dR′1|

in the position representation we get for the short time density matrix

ρ(R1,R2; τ) = 〈R1| exp(−τH)|R2〉 ≈∫dR′1〈R1| exp(−τK)|R′1〉〈R′1| exp(−τV )|R2〉

The potential energy operator is diagonal in the position representation

〈R′1| exp(−τV )|R2〉 = exp(−τV (R2))δ(R′1 −R2) , (8)

92

therefore

ρ(R1,R2; τ) ≈ 〈R1| exp(−τK)|R2〉 exp(−τV (R2)) . (9)

The matrix element of the exponential of the kinetic energy operator is just the freeparticle density matrix. For a single particle we can calculate it e.g. by insertingtwice the resolution of identity operator in the momentum representation (anotherpossibility is to solve the Bloch equation)

〈r1| exp(−τK)|r2〉

=

∫dp

∫dp′〈r1|p〉〈p| exp(−τK)|p′〉〈p′|r2〉

=

∫dp

∫dp′〈r1|p〉〈p|p′〉〈p′|r2〉 exp(−τ p2

2m)

=1

(2π~)3

∫dp exp

(i

~p · (r1 − r2)− τ p2

2m

)=

(Pm

2πβ~2

) 32

exp

[− Pm

2β~2(r1 − r2)2

]Using these expressions we obtain the discrete path-integral representation of thedensity matrix in the primitive approximation

ρP (R0,RP ; β) =

(Pm

2πβ~2

) 3NP2∫dR1 . . .

∫dRP−1

× exp

(−

P∑i=1

[Pm

2β~2(Ri−1 −Ri)

2 +β

PV (Ri)

])

Using this form, we get for the approximate partition function

ZP =

∫dR0ρP (R0,R0; β) =

(Pm

2πβ~2

) 3NP2∫dR1 . . .∫

dRP exp

(−

P∑i=1

[Pm

2β~2(Ri−1 −Ri)

2 +β

PV (Ri)

])

=

(Pm

2πβ~2

) 3NP2∫dR1 . . .

∫dRP exp(−βHP

eff ) ,

where we set R0 = RP (periodic boundary condition along the Trotter direction).It has a form of a classical configurational integral.

93

The last expression represents a mapping from a quantum to a classical systemwith a temperature dependent effective hamiltonian (or action) given by

HPeff =

P∑i=1

[Pm

2β2~2(Ri−1 −Ri)

2 +1

PV (Ri)

](10)

Therefore we can use in principle all simulation techniques developed for classicalsystems to sample the path integral representation of the quantum system.

Trotter approximation is the only approximation used in the derivation of thepath integral representation. For P → ∞ it becomes again exact and equal tothe Feynman path integral over the space of continuous paths. For finite P it isunder control, since for any inverse temperature β we can make the time stepτ arbitrarily small by increasing P and thus improve the accuracy of the path-integral representation to any extent (in principle).

The price we have to pay for the mapping quantum→ classical are the addi-tional integrations. We have to deal with P “replicas” of the original system andaltogether with 3NP degrees of freedom instead of original 3N . The configura-tion Ri = r1

i , r2i , . . . , r

Ni will be referred to as the i-th Trotter slice.

Few remarks on the Trotter number P

• P depends on the properties of the system and temperature

• the lower is the temperature T , the higher is the value of P necessary for agiven accuracy. It is therefore impossible to reach zero temperature. Pathintegral is not a ground state method !

• the stronger are the quantum effects, the higher is the required value of P .

• for P = 1, we obtain the classical system.

• in practical applications, P is typically in the range 10 - 100.

• we come back to this point later in more detail.

Properties and physical meaning of the mapping[47]Each particle ri of the system is now represented by P particles

ri → rikPk=1 = ri1, ri2, . . . , riP . (11)

This can be interpreted as a chain of “beads”, or a polymer ring. The effectiveclassical hamiltonian HP

eff consists of two parts.

94

• interactions

P∑i=1

[Pm

2β2~2(Ri−1 −Ri)

2

]=

Pm

2β2~2

P∑i=1

N∑k=1

(rki−1 − rki )2 (12)

can be regarded as harmonic springs acting between “beads” representingthe same particle in successive Trotter slices (nearest neighbours along theTrotter direction). The higher P , the stronger become the springs.

• interactions∑P

i=11PV (Ri) act only between “beads” in the same Trotter

slice and are equal to the potential energy of the quantum system divided byP . The higher P , the weaker become these interactions.

The chain of beads connected with springs thus can be regarded as polymer.Because of the periodic boundary condition imposed along the Trotter direction,R0 = RP , the chain must come back to its initial point after P steps, so it is a ringpolymer. The polymers interact with each other in a particular way - only withinthe same Trotter slice.

The path integral defines an isomorphism between the quantum system andthe classical system of interacting ring polymers. This provides us with a methodto understand many properties of quantum systems entirely in terms of classicalstatistical mechanics.

How to calculate the averages of observables ?The path-integral mapping allows us to easily calculate the average of any

operator which is diagonal in the position representation.

〈O〉P =1

ZP

∫dRρP (R,R; β)〈R|O|R〉

=1

ZP

(Pm

2πβ~2

) 3NP2∫dR1 . . .

∫dRP 〈RP |O|RP 〉

× exp(−P∑i=1

[Pm

2β~2(Ri−1 −Ri)

2 + τV (Ri)])

All Trotter slices are equal, therefore we can take the average of the operatorO inthe last expression in all slices and divide by P .

〈O〉P =1

ZP

(Pm

2πβ~2

) 3NP2∫dR1 . . .

∫dRP

1

P

P∑j=1

〈Rj|O|Rj〉 exp(−P∑i=1

[Pm

2β~2(Ri−1 −Ri)

2 + τV (Ri)])

95

In this way we can calculate any correlation functions, e.g. structure-related quan-tities. We simply calculate the quantity on each Trotter slice like in a classicalsimulation, and then take the average over all Trotter slices.

How to get an estimator for internal energy ?Internal energy contains apart from potential also the kinetic energy contri-

bution, which is not diagonal in the position representation. The easiest way toderive the estimator is via thermodynamics.

EP = − 1

ZP

∂ZP∂β

(13)

The effective hamiltonian itself is temperature dependent. Taking the derivativewe get

EP =3NP

2β+

1

ZP

(Pm

2πβ~2

) 3NP2∫dR1 . . .

∫dRP

×

[P∑i=1

(− Pm

2β2~2(Ri−1 −Ri)

2 +1

PV (Ri)

)]exp(−βHP

eff )

The expression in the square bracket can be seen as estimator, or quantity whichhas to be averaged over the probability distribution exp(−βHP

eff ) in order to getthe internal energy of the system. The last equation can thus be written as

EP =3NP

2β+

⟨1

P

P∑i=1

[−m

2

N∑k=1

(rki−1 − rki

~τ

)2

+ V (Ri)

]⟩

Clearly, the expression

K =3NP

2β−

⟨1

P

P∑i=1

[m

2

N∑k=1

(rki−1 − rki

~τ

)2]⟩

corresponds to kinetic energy (primitive estimator). At first look it might lookstrange, because of the negative sign in front of the energy of the springs. The“faster” the particle moves along the path between the slice i−1 and i, the lower isthe kinetic energy. The paradox is resolved when one realizes that we are dealingwith imaginary time path integral, so the sign is reverted. In fact, it’s consistentwith the uncertainty principle: the more delocalized the particle is, the lower thekinetic energy. When the particle is confined to stay within a small region ofspace, the kinetic energy is large.

While the last expression has a clear physical interpretation, it is not veryuseful for practical computation. Kinetic energy arises as a difference between

96

two quantities which both diverge as P . A statistical error in the average will thusbe amplified by this factor and for large P serious problems show up.

How to get a good estimator for internal energy ?A solution to the above problem was provided in Ref.[48]. The idea is to

extract the kinetic energy directly from the potential energy, without explicit ref-erence to the energy of the harmonic springs. In classical statistical mechanics,the virial theorem connects the two quantities. Applying the theorem to path inte-grals, on finds a new estimator for kinetic energy

K =

⟨1

2P

P∑i=1

Ri ·∂V (Ri)

∂Ri

⟩(14)

which is exact and has much smaller variance than the primitive estimator. It isoften used in real simulations.

Specific heat can be either calculated from internal energy via finite differ-ences, or one can derive a fluctuation formula directly from the path integral rep-resentation.

What happens when P is large ?Computationally, the workload of PIMC scales linearly with P . The higher P ,

the smaller is the error introduced by the Trotter approximation. Thus one mightthink it’s easy to improve the accuracy by using high values of P .

However, this is not true,because of the particular properties of the path-integral action. The harmonic

term grows linearly with P , while the potential energy term decreases as 1P

. Forlarge P therefore the harmonic term becomes completely dominant while the po-tential energy term becomes just a small perturbation. In a perfectly harmonicsystem, there is no exchange of energy between the normal modes. If the per-turbation on the harmonic term is small, which is just the case here for large P ,the energy exchange between the normal modes will be small. The system willenter the so called KAM (Kolmogorov - Arnold - Moser) regime, characterizedby loss of ergodicity [49]. If one used molecular dynamics to sample the action, itwould fail completely to sample the phase space, since the time average will notagree with the phase space average. With Monte Carlo, the problem is to someextent less severe, since there’s no deterministic dynamics. However, even herethe stiff springs cause a trouble since they allow only small displacements of thesingle particle moves resulting in a large auto-correlation times and non-efficientsampling.

Convergence of the path integral representation in PFor any finite P , the path integral representation is approximate. How does it

converge to exact quantum results when P →∞ ?

97

In case of the primitive approximation the answer for the partition function Zand average of a Hermitian operator O is provided by the formulas (Fye 1986)[50]

ZP = Z +O(τ 2) = Z +O

[(β

P

)2]

= Z +O(P−2)

〈O〉P = 〈O〉+O(τ 2) = 〈O〉+O

[(β

P

)2]

= 〈O〉+O(P−2)

The last result is extremely useful, since it explicitly provides the asymptoticscaling form as a function of P of the approximate averages. It is therefore notnecessary to reach the regime where the correction is negligible. Instead it is suf-ficient to reach the asymptotic O(P−2) regime. In practice, one usually performsseveral simulations with different values of P . If the results for a given quantityare equal for different values of P within the statistical error, the quantity can beconsidered converged and its value equal to the exact quantum value. If the valuesfor different P differ considerably, one can try to plot them against O(P−2) and ifthey collapse on a straight line, extrapolate to P →∞.

More specifically, it can be proved [51] that the expansion of averages is aneven function of 1/P , and one can write

〈O〉P = 〈O〉+a

P 2+

b

P 4+

c

P 6+ . . . (15)

so even when the asymptotic O(P−2) regime is not reached, one can try to fit andextrapolate the simulation data provided they are available for several values ofP .

16 Strategies to solve the “large P problem”There are several possibilities aimed at different aspects of the problem.

1. Improving on the action

2. Improving on the sampling

3. Improving on the extrapolation to P →∞

Improving on the action (how to improve the accuracy of the path integralrepresentation while keeping P low)

98

The decomposition of the hamiltonian in K and V we used (primitive approx-imation) is the simplest possible, and usually does not provide the best accuracyfo a given P . It is useful to include in the action as many of its exact properties aspossible.

In the case of liquid He, the primitive approximation would require a Trotternumber as large as P = 1000 in order to reach the temperature of the λ transition.With improved action, the actual simulation was performed with P = 20 [52].

Some methods to improve on the action

• renormalization of the pair density matrix [52]

• matrix squaring method [52]

• higher order approximations [51, 53]

Improving on the sampling of the path integralIn principle, we have to with ordinary multidimensional integration. It should

therefore be possible to apply any standard sampling method like Monte Carloor molecular dynamics. However, due to the already mentioned problem withergodicity, sampling might not always be efficient.

How to remedy the ergodicity problem ?The origin of the stiffness of the springs is the very form of the free particle

propagator.

〈r1| exp(−τK)|r2〉

=

(Pm

2πβ~2

) 32

exp

[− Pm

2β~2(r1 − r2)2

]=

(1

2πλ2

) 32

exp

[− 1

2λ2(r1 − r2)2

]where λ2 = β~2

Pm= τ~2

mis the thermal wavelength corresponding to the inverse

temperature τ . For |r1 − r2| & λ, the value of the propagator drops to zero veryfast. Since λ → 0 as P grows, the single particle moves will be constrained tosmaller and smaller displacements.

In the classical limit ~→ 0, λ→ 0 and the springs will become infinitely stiff,constraining the beads representing a given atom to have the same position in allTrotter slices (chain will collapse). There will be no quantum fluctuations, but onewould still like to sample the classical phase space. The only way to do so is tomove all the beads representing a given atom in all Trotter slices simultaneously.Such move is, however, a multislice or collective move. We recover back the usualMonte Carlo sampling; we have P beads each feeling the potential U/P , whichare constrained to move simultaneously.

99

Even when we are not in the classical limit, it is extremely useful to employsuch multislice moves. Moving all the beads representing a given atom in allTrotter slices simultaneously, one does not deform the springs; (ri+∆)− (ri+1 +∆) = ri − ri+1, so the internal structure of the path is not changed. Thereforethe displacements of these multislice moves are not directly constrained by thethermal wavelength λ, but rather by the structure of the potential U(R and usuallycan be much larger. Such translational moves of whole chains can be interpretedas sampling the classical degrees of freedom of the center of mass of the chains.On the other hand, the local moves where each bead is displaced independently,alter the structure of the path and can be interpreted as sampling the internal, orquantum degrees of freedom of the chain. Both kinds of moves can be mixedeither at random, or on a regular basis; e.g. after performing a local move oneach bead of a chain, one performs a collective move on the whole chain. Thepresence of collective moves considerably improves the sampling and decreasesthe autocorrelation time of the random walk.

Fourier space and normal-mode sampling.It’s possible to further generalize the above idea. The moves of a whole chain

actually represent moves of the k = 0 Fourier mode of the chain. One can de-compose the total kinetic action (springs) in normal modes and instead of actingon the beads act directly on all the normal modes. A different approach is to ex-press the exact path integral representation (in the limit P →∞) as integrals overthe Fourier components. In this way one obtains an infinite number of modes; anapproximation is introduced by truncating the number of modes in order to makethe problem tractable. Instead of discretizing the path in M time steps, one trun-cates the number of modes, so the problem is how to approximate the effect of theneglected modes. Partial averaging [54].

Quite generally, in MC we have the freedom to choose the moves, or even in-vent special ones, provided we satisfy the detailed balance. Several sophisticatedmethods have been designed, in particular in connection with liquid He simula-tion.

• multilevel Metropolis method [52]

• bisection method [52]

• staging [55]

• PIMD with Nose Hoover thermostat. We will come back to this again inconnection with ab-initio path integrals.

Improved extrapolation to P →∞For systems with stiff degrees of freedom, like covalent bonds in molecules or

solids, at low temperatures it might be difficult to reach the P−2 scaling regime

100

unless a very large value of P is used. Remarkably, the Trotter approximation isnot exact even for a harmonic oscillator, so if we wanted to simulate a stiff oneat a low temperature, we would have to use a large P in order to recover a trivialquantum mechanical result. However, for a harmonic oscillator the exact resultsfor a finite P can be calculated analytically.

This fact is a basis of the improved extrapolation method of Ref.[56]. Oftenthe dominant part of the pronounced systematic P dependence of the path integralresults comes from hard, high frequency modes of the system, which are almostharmonic. Provided a harmonic approximation of the system is known, it is possi-ble to perform its finite P evaluation and correct for the systematic P dependenceof the raw PIMC data. The P dependence of the corrected data is then consid-erably less pronounced and the extrapolation to P → ∞ can be performed withmuch lower values of P . For strongly anharmonic systems, it is possible to usethe self-consistent harmonic approximation.

17 Techniques for treating rotational degrees of free-dom

Techniques presented so far discretized the full kinetic energy in cartesian coordi-nates. This is suitable for systems where all degrees of freedom are translational,which is in principle always applicable in condensed matter systems consisting ofatoms. Sometimes, however, it is useful to treat some subsystems of the system asrigid objects and express the hamiltonian of the system also in terms of rotationaldegrees of freedom. Typical case are molecular crystals or molecular adsorbateson surfaces. Usually the intramolecular degrees of freedom are very stiff, withhigh vibration frequencies corresponding to temperatures of the order of 103 K.Even at room temperature these are completely frozen and thus decoupled fromtranslations and rotations of the molecule. Under such conditions, it is well jus-tified to treat the molecules as rigid rotors characterized apart from mass also bymoment of inertia which can be fully represented by the coordinates of the centreof mass and a set of angles specifying the orientation.

Apart from physical motivation, there’s a lot of technical reasons for usingrigid molecule models. Neglecting the stiff vibrational degrees of freedom, theTrotter number can be considerably reduced with respect to the values necessaryto treat fully flexible models in Cartesian coordinates. This brings a possibilityof much longer runs within a given amount of CPU time and results in muchbetter statistics. Moreover, these systems often exhibit phase transitions involvingthe orientational degrees of freedom. These are typically induced by relativelyweek dipolar or quadrupolar interactions and therefore the ordering can only

101

take place at a low temperature, where the quantum effects on the rotations alsobecome important. Such orientational phase transitions involve long correlationlengths, and therefore exhibit pronounced finite-size effects. In order to extractinformation about the nature of the transition, it is often important to be ableto simulate large systems and perform finite-size scaling. This has only beenpossible due to use of the rigid rotor representation of the molecules.

We will assume that the system is described by the hamiltonian

H = T tr + T rot + V (16)

where

T rot =drot∑i=1

L2i

2Θii

(17)

and Li are the components of the angular momentum operator and Θii are mo-ment(s) of inertia of the rigid molecule. drot is the number of angles necessary tospecify the orientation of the molecule.

Now the decomposition of the identity operator can be written as 1 =∫dRdω |Rω〉〈Rω|,

where ω represents the angles specifying the orientation. We can write the short-time approximation

exp(−τ(T tr + T rot + V )) =

exp(−τ2V ) exp

[−τ(T tr + T rot)

]exp(−τ

2V ) .

where we treated the potential energy operator in a more symmetric way. In therigid rotor approximation, [T tr, T rot] = 0 and so we can write

ρ(R1ω1,R2ω2; τ) = 〈R1ω1| exp(−τH)|R2ω2〉 ≈〈R1ω1| exp[−τ

2V (R1)] exp(−τT tr) exp(−τT rot)

exp[−τ2V (R2)]|R2ω2〉 = exp[−τ

2(V (R1) + V (R2))]

〈R1| exp(−τT tr)|R2〉〈ω1| exp(−τT rot)|ω2〉

The term 〈ω1| exp(−τT rot)|ω2〉 is new and represents a propagator for rotationalmotion, or free rotor density matrix (sometimes also called kernel). Generally,this is more difficult to evaluate than the free particle density matrix (propagatorfor translational motion). For simplicity of the notation, we will now concentrateon a single free rotor without loss of generality. It is useful to treat separately thethree cases depending on the dimensionality of the rotation.

One-dimensional rotation (linear molecule confined to rotate in a plane)drot = 1

102

There’s just one angle 0 ≤ φ < 2π specifying the orientation of the molecule.The hamiltonian reads

T rot = −B ∂2

∂φ2, (18)

where B = ~22Θ

is the rotational constant and Θ is the moment of inertia of therotor.

Note. Formally the same hamiltonian appears in the context of Josephsonjunctions, where the particle number operator conjugate to the superconductingphase operator φ is n = −i ∂

∂φ. The electrostatic charging energy then reads E =

Q2

2C= (en)2

2C= −Ec ∂

2

∂φ2, where Ec = e2

2C.

The hamiltonian (18) has an orthonormal set of eigenfunctions 〈φ|m〉 = 1√2πeimφ

with eigenvalues Em = Bm2. We can evaluate the free rotor density matrix asfollows

〈φ1| exp(−τT rot)|φ2〉 =∑m1,m2

〈φ1|m1〉〈m1| exp(−τT rot)|m2〉〈m2|φ2〉 =

1

2π

∞∑m1=−∞

exp[−Bτm2

1 − i(φ1 − φ2)m1

]Now we can use the Poisson summation formula

n=∞∑n=−∞

f(n) =M=∞∑M=−∞

∫ ∞−∞

f(x)e2πiMxdx (19)

and transform the last equation into

1

2π

∞∑m=−∞

exp[−Bτm2 − im(φ1 − φ2)

]=

(1

4πBτ

) 12

∞∑M=−∞

exp[− 1

4Bτ(φ1 − φ2 + 2πM)2]

With the last form we get the path integral representation of the partition functionof a free rotor

ZP =

(P

4πBβ

)P2

P∏i=1

[∞∑

Mi=−∞

∫ 2π

0

dφi

](20)

exp

[−β

P∑j=1

P

4Bβ2(φj − φj+1 + 2πMj)

2

](21)

103

Apart from the continuous angular variables φj , we have also the discrete integervariables Mj . In order to fully specify the path in imaginary time, we thus needthe full set φ1, . . . , φP ,M1, . . . ,MP. Both sets of variables therefore have tobe sampled in the Monte Carlo simulation. In principle, this could be done, butit is more convenient to perform a further transformation to get a more tractableexpression.

The winding number representationOne can define the following set of new variables

φ1 = φ1

φ2 = φ2 − 2πM1

φ3 = φ3 − 2πM1 − 2πM2

...

φP = φP − 2πP−1∑j=1

Mj

in which the 2πMj terms successively cancel for j = 1, . . . , P − 1, since φj −φj+1 = φj − φj+1 + 2πMj . For j = P , instead, all the terms sum up and we getφP − φP+1 + 2πMP = φP − φ1 + 2π

∑Pj=1Mj = φP − φ1 + 2πn.

The sum n =∑P

j=1Mj is called the winding number of the path. The path isnow specified by the set φ1, . . . , φP , n, and except for φ1 ∈ [0, 2π), the otherangles φj ∈ (∞,∞) for j = 2, . . . , P .

The winding number counts how many times a given path winds around thecircle [0, 2π) and thus represents a topological property of the path which is in-variant with respect to the small (local) deformations. A purely local Monte Carloalgorithm is thus uncapable of sampling the winding number, and global movesupdating at the same time all P angles φ1, . . . , φP are necessary. One has to cutthe path, change the winding number and reconnect it again.

The decoupled winding number representationIn the previous scheme the angular mismatch due to the non-zero winding

number was concentrated in one of the Trotter slices. A further simplification isobtained by applying the transformation

φj = φj − 2πnj − 1

P, j = 1, . . . , P (22)

which distributes the mismatch uniformly through all the slices. One can show

104

this leads to the representation

ZP =

(P

4πBβ

)P2

∞∑n=−∞

∫ 2π

0

dφ1

P∏i=2

[∫ ∞−∞

dφi

]

exp

[−β

P∑j=1

P

4Bβ2(φj − φj+1)2 +

(2πn)2

4β2B

]where the winding number is decoupled from the angular degrees of freedom. Thesum over n, however, cannot be performed explicitly, since for non-free rotorsthe transformation (22) has to be applied also in the potential energy propagator(not included in the above expression). Nevertheless, the winding number n canbe sampled directly as a random variable from a Gaussian distribution, which isconvenient for the simulation.

Moreover, this representation provides some physical insight into the distri-bution of winding numbers. It’s gaussian and thus only paths with (2πn)2

2λ2. 1,

where λ2 = 2βB = ~2βΘ

can be interpreted as angular thermal wavelength, cancontribute significantly. The lower the temperature, the more probable becomepaths with larger winding numbers.

Two-dimensional rotation (linear molecule rotating in 3D space)drot = 2The orientation of the molecule is defined by two angles, ω = (ϑ, φ). The

kinetic energy operator

T rot = −B[

1

sinϑ

∂

∂ϑ

(sinϑ

∂

∂ϑ

)+

1

sin2 ϑ

∂2

∂ϑ2

](23)

is proportional to the square of the angular momentum operator. The orthonor-mal eigenstates of T rot are the spherical harmonics 〈ϑ, φ|lm〉 = Ylm(ϑ, φ) witheigenvalues El = Bl(l + 1). We again calculate the free rotor density matrixby inserting the decomposition of the identity operator in the angular momentumrepresentation

〈ϑ1φ1| exp(−τT rot)|ϑ2φ2〉 =∑l

l∑m=−l

〈ϑ1φ1|lm〉〈lm|ϑ2φ2〉 exp(−τBl(l + 1))

Using the addition theorem for spherical harmonics we can explicitly sum overm and get

〈ϑ1φ1| exp(−τT rot)|ϑ2φ2〉 =∑l

2l + 1

4πPl(cos γ12) exp(−τBl(l + 1)) ,

105

where Pl is the Legendre polynomial of degree l and cos γ12 = cosϑ1 cosϑ2 +sinϑ1 sinϑ2 cos(φ1−φ2) is the relative angle between the two orientations (ϑ1, φ1)and (ϑ2, φ2).

The sum over l in the last expression cannot be performed analytically. In prin-ciple, one could also sample the discrete variable l by Monte Carlo, but becauseof the oscillating sign of the Legendre polynomials, this would lead to a severesign problem. It is much more convenient to perform the sum over l numericallyand store the resulting kernel function of cos γ12 in a table for simulation.

Three-dimensional rotation (arbitrary molecule rotating in 3D space)drot = 3In this case we need 3 Euler angles to specify the orientation of the molecule.

The approach is the same as above just the formulas are more complicated.Coupling to nuclear spinsSo far we ignored the symmetry requirements imposed on the wavefunction

of the rotor by the statistics of the nuclei. The spins of the nuclei implicitly con-strain the symmetry of the rotational state and have to be taken into account inthe construction of the kernel (e.g. not all values of the angular momentum maybe allowed). Several subtle points arise related to different kinds of averaging(quenched or annealed) over the molecular species with different symmetry (awell-known case is ortho vs. para hydrogen). In some cases, the sign problem ispresent. For discussion, see Ref.[57] and references therein.

18 Lattice systems with discrete degrees of freedomSo far we discussed only systems with continuous degrees of freedom. The PIMCmethodology can, however, be applied also to lattice systems with discrete degreesof freedom. We will illustrate the procedure here on the well-known case of Isingmodel in transverse field [51]. The hamiltonian of the system reads

H = −Γ∑i

σxi − J∑〈ij〉

σzi σzj = K + U

K = −Γ∑i

σxi , U = −J∑〈ij〉

σzi σzj ,

where we assume that the spins occupy sites of a d-dimensional cubic latticeand σxi , σ

yi , σ

zi are Pauli matrices corresponding to spin on lattice site i. Clearly,

[K,U ] 6= 0. We proceed analogously to the case of continuous degrees of free-

106

with J ′ = −12

ln tanh a, C2 = 12

sinh a. We get for our matrix elements

〈sk|e−βK/P |sk+1〉 =N∏i=1

〈ski | exp(βΓ

Pσxi )|sk+1

i 〉

=N∏i=1

CeJ′ski s

k+1i = CN

N∏i=1

eJ′ski s

k+1i

J ′ = −1

2ln tanh

Γβ

P,C =

(1

2sinh

2Γβ

P

) 12

which can be interpreted as an Ising-like coupling between the Trotter replicas ofthe same spin which are nearest-neighbours along the Trotter dimension.

The expression 〈sk|e−βK/P e−βU/P |sk+1〉 now can be written as follows

〈sk|e−βK/P e−βU/P |sk+1〉 = CN

N∏i=1

eJ′ski s

k+1i

∏〈ij〉

eβJPski s

kj (25)

For the full partition function we thus get

ZP = CNP∑s1

. . .∑sP

P∏k=1

N∏i=1

eJ′ski s

k+1i

∏〈ij〉

eβJPski s

kj

which represents a partition function of a (d + 1)-dimensional anisotropic Isingmodel. It has couplings βJ

Pin the d-dimensional hyperplanes and J ′ along the

extra dimension. The β-dependent term CNP just adds a trivial contribution tothe free energy. The Trotter decomposition has thus introduced an additionaldimension into the problem. However, even when we take the thermodynamiclimit N →∞, the size of the system along the extra dimension remains finite andequal to the Trotter number P .

The mapping of a d-dimensional quantum system onto a (d+ 1)-dimensionalclassical system provides a practical tool for simulation. Almost all the properties(including difficulties for large P ) mentioned for continuous degrees of freedomremain. Some new peculiar properties appear:

• sampling - in discrete systems we cannot adjust the acceptance ratio byvarying the size of the moves, we have to accept what comes.

• it’s not always clear how to decompose the hamiltonian in order to apply theTrotter formula, there may be many different possibilities. Checkerboarddecomposition.

108

• sometimes it’s necessary to invent sophisticated multiparticle moves in or-der to satisfy the conservation laws imposed by a particular decompositionof the hamiltonian

• one has to carefully derive estimators for response functions (e.g. suscep-tibilities) related to operators which do not commute with the hamiltonian,as they are not a simple translation of classical expressions.

• for many examples and details, see Ref.[51]

What happens as T → 0 ?If we fix τ = β/P and let P → ∞, we effectively perform the limit β → ∞

which corresponds to zero temperature. The size of the system along the extra di-mension becomes also infinite and thus we have a truly (d+1) dimensional system.It can be shown in this way that the ground state of the model in d dimensions isequivalent to a (d + 1)-dimensional model where Γ plays the role of tempera-ture. This has an intrinsic theoretical importance beyond simulations; e.g. in theIsing model in transverse field, the quantum phase transition driven at T = 0 bychange of Γ has the universality class and critical exponents corresponding to thetemperature driven transition in (d+ 1)-dimensional Ising model.

References[1] F. Nogueira, A. Castro, M. A. L. Marques, A Tutorial on the Density Func-

tional Theory. Lecture Notes in Physics 620, 218 - 256, (2003) SpringerVerlag

[2] D. R. Hamann, M. Schlüter, and C. Chiang, Phys. Rev. Lett. 43, 1494 - 1497(1979)

[3] G. B. Bachelet, D. R. Hamann, and M. Schlüter, Phys. Rev. B 26, 4199 -4228 (1982)

[4] Steven G. Louie, Sverre Froyen and Marvin L. Cohen, Phys. Rev. B 26, 1738- 1742 (1982)

[5] N. Troullier and José Luriaas Martins, Phys. Rev. B 43, 1993 - 2006 (1991)

[6] Leonard Kleinman and D. M. Bylander, Phys. Rev. Lett. 48, 1425 - 1428(1982)

[7] David Vanderbilt, Phys. Rev. B 41, 7892 (1990)

109

[8] Broyden, C. G., Mathematics of Computation (American Mathematical So-ciety) 19, 577 - 593 (1965)

[9] L. J. Sham and M. Schlüter, Phys. Rev. Lett. 51, 1888 - 1891 (1983)

[10] Lars Hedin, Phys. Rev. 139, A796 - A823 (1965)

[11] M. T. Yin and Marvin L. Cohen, Phys. Rev. B 26, 5668 - 5687 (1982)

[12] O. Gunnarsson, B. I. Lundqvist, Phys. Rev. B 13, 4274 - 4298 (1976)

[13] A. D. Becke, Phys. Rev. A 38, 3098 - 3100 (1988)

[14] Chengteh Lee, Weitao Yang, and Robert G. Parr, Phys. Rev. B 37, 785 - 789(1988)

[15] John P. Perdew, Kieron Burke, and Matthias Ernzerhof, Phys. Rev. Lett. 77,3865 - 3868 (1996)

[16] John P. Perdew, Stefan Kurth, Aleš Zupan, Peter Blaha, Phys. Rev. Lett. 82,2544 - 2547 (1999)

[17] J. Tao, J. P. Perdew, V. N. Staroverov, and G. E. Scuseria, Phys. Rev. Lett.91, 146401 (2003).

[18] S. Grimme, J. Comput. Chem. 27, 1787 (2006).

[19] Joost VandeVondele, Fawzi Mohamed, Matthias Krack, Jürg Hutter, MichielSprik, and Michele Parrinello, J. Chem. Phys. 122, 014515 (2005)

[20] R. Car, M. Parrinello, Phys. Rev. Lett. 55, 2471 - 2474 (1985)

[21] G. Pastore, E. Smargiassi, F. Buda, Phys. Rev. A 44, 6334 - 6347 (1991)

[22] Jeffrey C. Grossman, Eric Schwegler, Erik W. Draeger, François Gygi, andGiulia Galli, J. Chem. Phys. 120, 300 (2004)

[23] A. DeVita, G. Galli, A. Canning and R. Car, Nature, 379, 523 (1996)

[24] Sverre Froyen and Marvin L Cohen, J. Phys. C: Solid State Phys. 19 (1986)2623 - 2632

[25] P. Focher, G. L. Chiarotti, M. Bernasconi, E. Tosatti, and M. Parrinello, Eu-rophys. Lett. 26, 345 (1994).

[26] S. Scandolo, M. Bernasconi, G. L. Chiarotti, P. Focher, and E. Tosatti, Phys.Rev. Lett. 74, 4015 - 4018 (1995)

110

[27] D. Marx and M. Parrinello, Z. Phys. B: Condens. Matter 95, 143 (1994)

[28] Mark E. Tuckerman, Dominik Marx, Michael L. Klein, and Michele Par-rinello, J. Chem. Phys. 104, 5579 (1996)

[29] Magali Benoit, Dominik Marx and Michele Parrinello, Nature 392, 258 -261 (1998)

[30] Marcella Iannuzzi, Alessandro Laio, and Michele Parrinello, Phys. Rev. Lett.90, 238302 (2003)

[31] A. Stirling, M. Iannuzzi, A. Laio, and M. Parrinello, ChemPhysChem, 5,1558, (2004)

[32] P. Sherwood, in "Modern Methods and Algorithms of Quantum Chemistry"NIC Ser. Vol. 1, John von Neumann Institute of Computing, Juelich 2000,pp. 257 - 277

[33] Federico Zipoli, Teodoro Laino, Alessandro Laio, Marco Bernasconi, andMichele Parrinello, J. Chem. Phys. 124, 154707 (2006)

[34] G. Kresse, J. Furthmüller, Phys. Rev. B 54, 11169 - 11186 (1996); Comp.Mat. Sci. 6, 15 - 50 (1996)

[35] Michael P. Teter, Michael C. Payne, Douglas C. Allan, Phys. Rev. B 40,12255 - 12263 (1989)

[36] P. Pulay, Chem. Phys. Lett. 73, 393 (1980)

[37] Jürg Hutter, Hans Peter Lüthi and Michele Parrinello, Comp. Mat. Sci. 2,244 (1994)

[38] C M Goringe, D R Bowler and E HernÃ¡ndez, Rep. Prog. Phys. 60, 1447(1997)

[39] F. F. Abraham, J. Q. Broughton, N. Bernstein, and E. Kaxiras, Comput. Phys.12, 538 (1998)

[40] Florian R. Krajewski and Michele Parrinello, Phys. Rev. B 71, 233105(2005)

[41] Stefano Baroni, Paolo Giannozzi, Andrea Testa, Phys. Rev. Lett. 58, 1861 -1864 (1987)

[42] M. V. Berry, Proc. Roy. Soc. Lond. A 392, 45 (1984)

111

[43] R. D. King-Smith and David Vanderbilt, Phys. Rev. B 47, 1651 - 1654 (1993)

[44] R. Resta, Europhys. Lett. 22, 133 (1993)

[45] M. Bernasconi, P. L. Silvestrelli, and M. Parrinello, Phys. Rev. Lett. 81, 1235- 1238 (1998)

[46] R. Resta, Phys. Rev. Lett. 80, 1800 - 1803 (1998)

[47] D. Chandler, P. G. Wolynes, J. Chem. Phys. 74, 4078 (1981).

[48] M.F. Herman, E.J. Bruskin and B.J. Berne, J. Chem. Phys. 76, 5150 (1982).

[49] R.W. Hall, B.J. Berne, J. Chem. Phys. 81, 3641 (1984).

[50] R.M. Fye, Phys. Rev. B, 33, 6271 (1986).

[51] Progs. Theor. Phys. 56, 1454 (1976), Quantum Monte Carlo Methodsin Equilibrium and Nonequilibrium Systems, Proceedings of the NinthTaniguchi International Symposium, Susono, Japan, 1986, ed. by M. Suzuki(Springer-Verlag Berlin Heidelberg 1987)

[52] D. M. Ceperley, Rev. Mod. Phys., 67, 279 (1995).

[53] M. Takahashi, M. Imada, J. Phys. Soc. Jpn, 53, 3765 (1984).

[54] J.D. Doll, R.D. Coalson, D.L. Freeman, Phys. Rev. Lett., 55, 1 (1985).

[55] M. Sprik, M.L. Klein, D. Chandler, Phys. Rev. B, 31, 4234 (1985).

[56] A. Cuccoli, A. Macchi, G. Pedrolli, V. Tognetti, R. Vaia, Phys. Rev. B, 51,12369 (1995).

[57] Marx D., Muser M.H., Path integral simulations of rotors: theory and appli-cations J PHYS-CONDENS MAT 11: (11) R117-R155 (1999).

112

quantum simulation techniques for electrons and ions · 2015. 10. 7. · j. p. perdew and s. kurth,...

Documents