casinoqmc.net€¦ · introduction vmc statistics dmc statistics normally-distributed numbers...

IntroductionVMC statisticsDMC statistics

Normally-distributed numbersSummary

Statistics in Quantum Monte Carlo

Pablo Lopez Rıos

The Towler Institute

3 August 2010

Pablo Lopez Rıos Statistics in Quantum Monte Carlo

The need for statistical analysis

A QMC calculation produces millions of data

We want a single number (with its error bar) as a result:

E±σE

Serial correlation needs to be removed

How to manipulate quantities with error bars

E±σE

DefinitionsSampling and serial correlationStatistical efficiency

Basic statistics

The configurations {Ri}i=Mi=1 distributed according to |Ψ(R)|2

The local energy Ei = EL(Ri) = Ψ−1(Ri)HΨ(Ri)

EL(R) forms a distribution with:

EV = 〈Ψ|H|Ψ〉〈Ψ|Ψ〉 ≈ E = ∑

Mi=1 EiM

Variance

= 〈Ψ|H2|Ψ〉〈Ψ|Ψ〉 −

[〈Ψ|H|Ψ〉〈Ψ|Ψ〉

]2≈ σ2

∑Mi=1(Ei−E)

Basic statistics

EV = 〈Ψ|H|Ψ〉〈Ψ|Ψ〉 ≈ E = ∑

Mi=1 EiM

Variance

= 〈Ψ|H2|Ψ〉〈Ψ|Ψ〉 −

[〈Ψ|H|Ψ〉〈Ψ|Ψ〉

]2≈ σ2

∑Mi=1(Ei−E)

Basic statistics

EV = 〈Ψ|H|Ψ〉〈Ψ|Ψ〉 ≈ E = ∑

Mi=1 EiM

Variance

= 〈Ψ|H2|Ψ〉〈Ψ|Ψ〉 −

[〈Ψ|H|Ψ〉〈Ψ|Ψ〉

]2≈ σ2

∑Mi=1(Ei−E)

Basic statistics

EV = 〈Ψ|H|Ψ〉〈Ψ|Ψ〉 ≈ E = ∑

Mi=1 EiM

Variance

= 〈Ψ|H2|Ψ〉〈Ψ|Ψ〉 −

[〈Ψ|H|Ψ〉〈Ψ|Ψ〉

]2≈ σ2

∑Mi=1(Ei−E)

Basic statistics

EV = 〈Ψ|H|Ψ〉〈Ψ|Ψ〉 ≈ E = ∑

Mi=1 EiM

Variance

= 〈Ψ|H2|Ψ〉〈Ψ|Ψ〉 −

[〈Ψ|H|Ψ〉〈Ψ|Ψ〉

]2≈ σ2

∑Mi=1(Ei−E)

Basic statistics

E can be determined to a given degree of certainty→ it has an error bar σE→ it is a distribution with:

E = ∑Mi=1 EiM

Variance

σ2E ≈ σ2

E =∑

Mi=1(Ei−E)

M(M−1)

Basic statistics

E = ∑Mi=1 EiM

Variance

σ2E ≈ σ2

E =∑

Mi=1(Ei−E)

M(M−1)

Basic statistics

E = ∑Mi=1 EiM

Variance

σ2E ≈ σ2

E =∑

Mi=1(Ei−E)

M(M−1)

Basic statistics

E = ∑Mi=1 EiM

Variance

σ2E ≈ σ2

E =∑

Mi=1(Ei−E)

M(M−1)

Basic statistics

E = ∑Mi=1 EiM

Variance

σ2E ≈ σ2

E =∑

Mi=1(Ei−E)

M(M−1)

Local energy and mean energy

0.5 0.55E (a.u.)

EL distribution

E distribution

The local energy distribution is what we sample.The mean energy distribution is what we obtain.

0.5 0.55E (a.u.)

EL distribution

E distribution

0.5 0.55E (a.u.)

EL distribution

E distribution

Sampling of configuration space

{Ri}i=Mi=1 must be distributed according to |Ψ(R)|2.

Sampling algorithm at i-th step

Start at config Ri

Propose a new config R′i

Compute the wave function ratio qi =∣∣∣Ψ(R′i)

Ψ(Ri)

∣∣∣2Generate an uniformly-distributed random numberξ ∈ [0,1)

Accept/reject step:

if ξ < qi → set Ri+1 = R′i (accept new config)if ξ > qi → set Ri+1 = Ri (reject new config)

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Start at config Ri

Ψ(Ri)

Accept/reject step:

Proposing Ri→ R′i

If R′i proposed at random:→ chances of landing in a reasonable region of configurationspaces are slim→ qi will be small→ most moves are rejected→ poor sampling

If R′i is Ri plus a small displacement:→ R′i similar to Ri

→ EL(R′i) similar to EL(Ri)→ Serial correlation

Effect of serial correlation

Consider an uncorrelated set of energies {E1,E2,E3, . . . ,EM}Generate a new set with artificial serial correlation:

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

No new information → mean and error bar should beunchanged

Computed mean of new set is E′ = E

Computed error bar of new set is σ ′E = σE/√

→ error bar underestimated!

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

{E1, . . . ,E1︸︷︷︸τ

,E2, . . . ,E2︸︷︷︸τ

,E3, . . . ,E3︸︷︷︸τ

, . . . ,EM, . . . ,EM︸︷︷︸τ

Removing serial correlation

In this example we can remove the serial correlation byignoring τ−1 of every τ consecutive energies

For real data the correlation time τ varies during the run→ would need to ignore τmax−1 of each τmax data to be safe→ lots of relevant data discarded→ inefficiency

However we may assume that

σ ′E = σE/√

still holds, where τ is the average correlation time.

This is an alternative approach to the reblocking algorithm

σ ′E = σE/√

The reblocking algorithm

Consider the following operation on data, where the itemunder each brace is the average of the two numbers above:

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

If we succesively apply transformations until τmax of theoriginal data are averaged together→ resulting (smaller) data set is not serially correlated

Cannot compute τmax directly. Must find out how manyreblocking transformations to apply in some other way

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

E(0)1 E(0)

2︸︷︷︸ E(0)3 E(0)

4︸︷︷︸ E(0)5 E(0)

6︸︷︷︸ E(0)7 E(0)

8︸︷︷︸E(1)

1 E(1)2︸︷︷︸ E(1)

3 E(1)4︸︷︷︸

. . . . . .

Error estimator after reblocking

At the k-th iteration in this procedure:

σ(k+1)2E ≈ σ

(k)2E +

2∑M(k)/2i=1

2i−1− E)(

E(k)2i − E

)M(k)(M(k)−2)

If there is no serial correlation, the last term tends to zero

If there is serial correlation, the last term is positive

Hence σ(k)

E will increase until it reaches the true error bar atk ≈ log2(τmax)

Plateau in σ(k)

E signals convergence of reblocking algorithm

σ(k+1)2E ≈ σ

(k)2E +

2∑M(k)/2i=1

2i−1− E)(

E(k)2i − E

)M(k)(M(k)−2)

Hence σ(k)

Plateau in σ(k)

σ(k+1)2E ≈ σ

(k)2E +

2∑M(k)/2i=1

2i−1− E)(

E(k)2i − E

)M(k)(M(k)−2)

Hence σ(k)

Plateau in σ(k)

σ(k+1)2E ≈ σ

(k)2E +

2∑M(k)/2i=1

2i−1− E)(

E(k)2i − E

)M(k)(M(k)−2)

Hence σ(k)

Plateau in σ(k)

σ(k+1)2E ≈ σ

(k)2E +

2∑M(k)/2i=1

2i−1− E)(

E(k)2i − E

)M(k)(M(k)−2)

Hence σ(k)

Plateau in σ(k)

Reblock plot

0 5 10 15 20 25k

0.00000

0.00001

0.00002

0.00003

0.00004

~ σ E (k)

log 2(τ

How to run efficient VMC calculations

Reducing serial correlation, by

Choosing an appropriate timestepUsing electron-by-electron samplingSkipping the right number of steps between every twocalculations of expectation values

Reducing the intrinsic variance/expense, by

Using appropriate trial wave functions

The VMC timestep

The “timestep” T is the variance of the distribution used togenerate the random displacements when proposing moves

It is actually a squared length, but can be regarded a time ifconsidering a diffusion process

T does not enter the VMC formalism→ can be chosen so that statistics are improved

T small → R′i very similar to Ri→ serial correlation increasedT large → R′i very dissimilar from Ri→ most moves are rejected→ serial correlation increased

The VMC timestep

The 50% rule

Choose T such that the acceptance ratio a = 50%

0.001 0.01 0.1 1T (a.u.)

The 50% rule

There are better strategies for some systems, e.g. the electrongas

0.01 0.1 1 10T (a.u.)

Electron-by-electron sampling

QMC sampling usually described using configuration moves,→ Configuration-by-configuration sampling (CBCS)Usual implementation: move each electron and accept/rejecttheir moves individually→ Electron-by-electron sampling (EBES)Set T to the same value in CBCS and EBES→ aC would equal aN

E→ EBES winsSet T so that aC = aE = a→ the chances of Ri+1 = Ri in the CBCS is a,while in the EBES it is aN

→ EBES wins

The EBES is more efficient

→ EBES wins

Choosing the right wave function

The better the wave function, the lower the variance→ fewer data required to achieve a given errorbar→ lower cost

Generally, better wave functions are more expensive toevaluate→ higher cost

Important!

Regarding accuracy, the best trial wave function for aproblem need not be the most complicated

There is a risk of over-parametrization!

Important!

Wave functions and the local energy distribution

0.5 0.55E

L (a.u.)

Slater-JastrowSlater-Jastrow-backflow

Local energy distributions for two different wave functionsPablo Lopez Rıos Statistics in Quantum Monte Carlo

Definitions

The projection method

Time-dependent Schrodinger equation

H(R)Φ(R,x) = i ∂Φ(R,x)∂x

Imaginary time (ix = t) and energy shift(H(R)−ET

)Φ(R, t) = ∂Φ(R,t)

Eigenstate expansion

Φ(R, t) = ∑∞n=0 cnΦn(R)e−(En−ET )t

If we adjust ET ∼ E0, excited eigenstates decay exponentiallyas t→ ∞ and only ground state remains

Definitions

)Φ(R, t) = ∂Φ(R,t)

Definitions

)Φ(R, t) = ∂Φ(R,t)

Definitions

)Φ(R, t) = ∂Φ(R,t)

Definitions

Projection using discrete walkers

Consider the equation

A(R)f (R, t) =− ∂ f (R,t)∂ t

Given f (R, t) at time t and Green’s function G(R′← R,T):

f (R, t + T) =∫

G(R′← R,T)f (R, t)dRIf f (R, t) probability distribution → can represent discretely:

f (R, t)≈ ∑Pp=1 wp(t)δ [R−Rp(t)]

Therefore:

f (R, t + T) = ∑Pp=1 wp(t)G[R′← Rp(t),T]≈

≈ ∑Pp=1 wp(t + T)δ [R−Rp(t + T)]

Definitions

A(R)f (R, t) =− ∂ f (R,t)∂ t

f (R, t + T) =∫

Therefore:

≈ ∑Pp=1 wp(t + T)δ [R−Rp(t + T)]

Definitions

A(R)f (R, t) =− ∂ f (R,t)∂ t

f (R, t + T) =∫

Therefore:

≈ ∑Pp=1 wp(t + T)δ [R−Rp(t + T)]

Definitions

A(R)f (R, t) =− ∂ f (R,t)∂ t

f (R, t + T) =∫

Therefore:

≈ ∑Pp=1 wp(t + T)δ [R−Rp(t + T)]

Definitions

Green’s function in DMC

The Green’s function can be used as a transition probability todetermine the evolution of walkers in a stochastic approach

Definitions

Diffusion Monte Carlo

DMC consists in choosing f (R, t) = Φ(R, t)Ψ(R), where Ψ isthe trial wave function and Φ is called the DMC wavefunction

For small T, Green’s function can be split into drift, diffusionand branching

f (R, t) is not a probability distribution unless Φ and Ψ havethe same sign everywhere in space

The fixed-node approximation

Φ is constrained to have the same nodes as ΨThe resulting Φ is the lowest energy wave function among

those with the same nodal structure as ΨTherefore DMC is always better than VMC

Definitions

The DMC algorithm

Start from P walkers {R0,α}Pα=1 distributed according to

|Ψ(R)|2 (from VMC)DMC evolution of the walkers:

Drift-diffusion: move Ri,α → R′i,αBranching: define weight wi,α→ configurations breed/die according to branching factorw′i,α/wi,α→ variable number of walkers Pi

Equilibrate the walkers until we reach infinite-time limit→ look at Ei = ∑

Piα=1 wα,iEα,i/∑

Piα=1 wα,i

Accumulate data after equilibration to improve statistics ofresult

DMC mixed estimator

〈A〉DMC = limt→∞〈Ψ|A|Φ(t)〉/〈Ψ|Φ(t)〉Pablo Lopez Rıos Statistics in Quantum Monte Carlo

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

The DMC algorithm

Piα=1 wα,i

DMC mixed estimator

Definitions

Calculation of the energy in DMC

0 500 1000 1500

Number of moves

Local energy (Ha)Reference energyBest estimate

0 500 1000 15001000

POPULATION

ED ≈ E =∑

Mi=1 WiEi

∑Mi=1 Wi

; σ2E ≈ σ

∑Mi=1 Wi

(Ei− E

∑Mi=1 Wi− ∑

Mi=1 W2

∑Mi=1 Wi

)Pablo Lopez Rıos Statistics in Quantum Monte Carlo

Definitions

Sources of error in DMC

Timestep: we have assumed that T is small→ must extrapolate to zero timestep to obtain a reliable result→ cannot use timestep to improve statistics

Population: Φ is represented by set of configurations→ must use sufficient configurations to represent it accurately→ possible to extrapolate to infinite population

Fixed-node error: only limitation of DMC→ ED is still variational (very important!)→ can be reduced by using Ψ with better nodes

Locality approximation: from pseudopotentials→ ED non-variational→ goes away with good Ψ

Definitions

The normal distribution

The normal distribution is D(E; E,σ) = 1√2πσ

exp[− (E−E)2

]The probability of the E being in an interval (A,B) is

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

One-sigma interval (E−σ , E + σ) → 68.3% → unreliable

Two-sigma interval (E−2σ , E + 2σ) → 95.4% → reliable

Three-sigma interval (E−3σ , E + 3σ) → 99.7% → veryreliable

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

exp[− (E−E)2

P(A < E < B) = f(

B−Eσ

)− f(

A−Eσ

)f (x) = 1√

∫ x−∞

exp(−y2/2

Comparison of a Gaussian and the local energy distribution

0.5 0.55E (a.u.)

GaussianVMC histogram

68.3%95.4%99.7%

-σ +σ-2σ

-3σ+2σ

From Central Limit Theorem:

The mean energy is exactly normal

How to compare quantities with errorbars

Want to find distribution of difference, denoted(E−±σ−

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

Ψ1 gives E1 =−14.66728(2) a.u.Ψ2 gives E2 =−14.66733(7) a.u.Comparison: E− = 0.00005(7) a.u. → 76% chance of E2 < E1→ unreliable!If E2 =−14.66733(2) a.u. instead → E− = 0.00005(3) a.u.→ 95% chance of E2 < E1 → reliable

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

)=(E1±σ1

)−(E2±σ2

)Results in

E− = E1− E2σ2− = σ2

1 + σ22

Example:

Summary

Reblocking algorithm applied using the reblock utility

Average correlation time τ given in VMC runs andreblock utility

VMC timestep automatically optimized to give a = 50% (donot apply on HEG)

EBEA is the default in both VMC and DMC

DMC statistics monitored using graphit utility

Timestep extrapolation carried out using theextrapolate tau utility

Summary

Reblocking algorithm applied using the reblock utility

Average correlation time τ given in VMC runs andreblock utility

VMC timestep automatically optimized to give a = 50% (donot apply on HEG)

EBEA is the default in both VMC and DMC

DMC statistics monitored using graphit utility

Timestep extrapolation carried out using theextrapolate tau utility

casinoqmc.net€¦ · introduction vmc statistics dmc statistics normally-distributed numbers...

Documents