Parallel Adaptive Wang-Landau - GDR, November 2011

DESCRIPTION

http://arxiv.org/abs/1109.3829
http://cran.r-project.org/web/packages/PAWL/index.html
http://statisfaction.wordpress.com/2011/09/21/density-exploration-and-wang-landau-algorithms-with-r-package/

TRANSCRIPT

Wang–Landau algorithm / Improvements / 2D Ising model / Conclusion
Parallel Adaptive Wang–Landau Algorithm
Pierre E. Jacob
CEREMADE - Université Paris-Dauphine & CREST, funded by AXA Research
15 November 2011
Joint work with Luke Bornn (UBC), Arnaud Doucet (Oxford), and Pierre Del Moral (INRIA & Université de Bordeaux)
Pierre E. Jacob PAWL 1/ 18
Outline
1 Wang–Landau algorithm
2 Improvements: Automatic Binning, Parallel Interacting Chains, Adaptive Proposals
3 2D Ising model
4 Conclusion
Wang–Landau
Context
unnormalized target density π
on a state space X
A kind of adaptive MCMC algorithm:

it iteratively generates a sequence X_t,

the stationary distribution is not π itself,

at each iteration a different stationary distribution is targeted.
Wang–Landau
Partition the space
The state space X is cut into d bins:
X = ⋃_{i=1}^{d} X_i   and   ∀ i ≠ j, X_i ∩ X_j = ∅
Goal
The generated sequence spends the same time in each bin X_i;

within each bin X_i, the sequence is asymptotically distributed according to the restriction of π to X_i.
Wang–Landau
Stationary distribution
Define the mass of π over X_i by:

ψ_i = ∫_{X_i} π(x) dx

The stationary distribution of the WL algorithm is:

π_ψ(x) ∝ π(x) × 1/ψ_{J(x)}

where J(x) is the index such that x ∈ X_{J(x)}.
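As a sanity check, the bin masses ψ_i and the equal-weighting effect of π_ψ can be computed numerically. Here is a minimal Python sketch; the bimodal target, the partition, and the grid are illustrative choices, not taken from the slides:

```python
import numpy as np

# Illustrative unnormalized bimodal target: two unit-variance Gaussian
# bumps of equal mass, centred at 0 and 10.
def target(x):
    return np.exp(-0.5 * x**2) + np.exp(-0.5 * (x - 10.0)**2)

# Partition X = [-5, 15] into d = 2 bins, split at x = 5.
edges = [-5.0, 5.0, 15.0]
grid = np.linspace(edges[0], edges[-1], 200001)
dx = grid[1] - grid[0]
vals = target(grid)

# psi_i = mass of pi over bin X_i (simple Riemann sum).
psi = [vals[(grid >= lo) & (grid < hi)].sum() * dx
       for lo, hi in zip(edges[:-1], edges[1:])]

# Under pi_psi(x) ∝ pi(x) / psi_{J(x)}, the unnormalized mass of each bin
# becomes psi_i / psi_i = 1: all bins carry equal weight.
print(psi)  # both close to sqrt(2*pi) ≈ 2.5066
```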
Wang–Landau
Example with a bimodal, univariate target density: π and two π_ψ corresponding to different partitions.

[Figure: log density over x ∈ (−5, 15), three panels: the original density with partition lines, the density biased by X, and the density biased by log density.]
Wang–Landau
Plugging estimates
In practice we cannot compute ψ_i analytically. Instead we plug in estimates θ_t(i) of ψ_i at iteration t, and define the distribution π_{θ_t} by:

π_{θ_t}(x) ∝ π(x) × 1/θ_t(J(x))

Metropolis–Hastings

The algorithm performs a Metropolis–Hastings step targeting π_{θ_t} at iteration t, generating a new point X_t.
Wang–Landau
Estimate of the bias
The update of the estimated bias θ_t(i) is done according to:

θ_t(i) ← θ_{t−1}(i) [1 + γ_t (1{X_t ∈ X_i} − d^{−1})]

with γ_t a decreasing sequence of step sizes, e.g. γ_t = 1/t.
Wang–Landau
Result
In the end we get:
a sequence X_t asymptotically following π_ψ,

as well as estimates θ_t(i) of ψ_i.
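Putting the pieces together, the basic single-chain algorithm fits in a few lines of Python. This is a sketch, not the PAWL implementation: the bimodal target, the two-bin partition, the proposal scale, and the iteration count are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    # Illustrative unnormalized bimodal target: equal-mass bumps at 0 and 10.
    return np.exp(-0.5 * x**2) + np.exp(-0.5 * (x - 10.0)**2)

d = 2
def J(x):
    # Bin index: X_0 = (-inf, 5), X_1 = [5, +inf).
    return 0 if x < 5.0 else 1

theta = np.ones(d)     # running estimates theta_t(i) of psi_i
x = 0.0
counts = np.zeros(d)   # bin occupations, to check the "equal time" goal
T = 100000

for t in range(1, T + 1):
    # Metropolis-Hastings step targeting pi_theta(x) ∝ pi(x) / theta[J(x)].
    y = x + rng.normal(0.0, 4.0)
    ratio = (target(y) / theta[J(y)]) / (target(x) / theta[J(x)])
    if rng.uniform() < ratio:
        x = y
    # Wang-Landau update of the bias estimates, with gamma_t = 1/t.
    gamma = 1.0 / t
    indicator = np.zeros(d)
    indicator[J(x)] = 1.0
    theta *= 1.0 + gamma * (indicator - 1.0 / d)
    counts[J(x)] += 1

print(counts / T)  # occupation frequencies, roughly equal across the bins
```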
Automatic Binning / Parallel Interacting Chains / Adaptive Proposals
Automatic Binning

Easily move from one bin to another.

Maintain some kind of uniformity within bins. If non-uniform, split the bin.
[Figure: histograms of the log density within a bin, (a) before the split and (b) after the split.]
Parallel Interacting Chains
N chains (X_t^(1), …, X_t^(N)) instead of one:

targeting the same biased distribution π_{θ_t} at iteration t,

sharing the same estimated bias θ_t at iteration t.

The update of the estimated bias becomes:

θ_t(i) ← θ_{t−1}(i) [1 + γ_t ((1/N) Σ_{j=1}^{N} 1{X_t^(j) ∈ X_i} − d^{−1})]
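In code, the shared update is a single vectorized line over the bin occupations of the N chains. A Python sketch, where d, N, γ_t, and the bin memberships are made-up values for illustration:

```python
import numpy as np

# One update of the shared estimates theta_t from the occupation of all
# N chains at once (toy values, for illustration only).
d, N = 4, 10
theta = np.ones(d)                                  # theta_{t-1}(i)
bin_of = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3])   # J(X_t^(j)) for j = 1..N
gamma_t = 0.1

occup = np.bincount(bin_of, minlength=d) / N        # (1/N) sum_j 1{X_t^(j) in X_i}
theta = theta * (1.0 + gamma_t * (occup - 1.0 / d))

print(theta)  # bins visited more than average get their estimate inflated
```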
Adaptive proposals
For continuous state spaces
We can use an adaptive random-walk proposal whose variance σ_t is learned along the iterations to target a given acceptance rate.

Robbins–Monro stochastic approximation update:

σ_{t+1} = σ_t + ρ_t (2 · 1{A > 0.234} − 1)

Or alternatively:

Σ_t = δ × Cov(X_1, …, X_t)
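A Python sketch of the Robbins–Monro update on a plain random walk. The standard-normal target, the step sizes ρ_t, and the initial scale are illustrative; adapting log σ rather than σ itself, as done here, is an implementation choice to keep the scale positive:

```python
import math, random

random.seed(3)

def logpi(x):
    # Standard normal target (illustrative choice).
    return -0.5 * x * x

x = 0.0
log_sigma = math.log(10.0)   # deliberately poor initial scale
accepted = 0
T = 20000

for t in range(1, T + 1):
    sigma = math.exp(log_sigma)
    y = x + random.gauss(0.0, sigma)
    a = min(1.0, math.exp(logpi(y) - logpi(x)))
    if random.random() < a:
        x = y
        accepted += 1
    # Robbins-Monro move: increase sigma when the acceptance probability
    # exceeds 0.234, decrease it otherwise.
    rho_t = 1.0 / math.sqrt(t)
    log_sigma += rho_t * (2.0 * (a > 0.234) - 1.0)

print(math.exp(log_sigma), accepted / T)  # scale settles near its equilibrium
```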
2D Ising model
Higdon (1998), JASA 93(442)
Target density
Consider a 2D Ising model, with posterior density

π(x | y) ∝ exp( α Σ_i 1[y_i = x_i] + β Σ_{i∼j} 1[x_i = x_j] )

with α = 1, β = 0.7.

The first term (likelihood) encourages states x which are similar to the original image y.

The second term (prior) favours states x for which neighbouring pixels are equal, like a Potts model.
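The unnormalized log posterior is a few lines of NumPy: sum the agreements with the data, then the agreements between horizontal and vertical neighbour pairs. The 5×5 binary image y below is hypothetical toy data, not the image from the talk:

```python
import numpy as np

def log_target(x, y, alpha=1.0, beta=0.7):
    # Unnormalized log posterior from the slide: a data-agreement
    # (likelihood) term plus a Potts-style smoothness (prior) term over
    # the horizontal and vertical neighbour pairs i ~ j.
    likelihood = alpha * np.sum(x == y)
    prior = beta * (np.sum(x[1:, :] == x[:-1, :]) +
                    np.sum(x[:, 1:] == x[:, :-1]))
    return likelihood + prior

# Toy 5x5 binary "observed image" (hypothetical data, for illustration).
y = np.array([[0, 0, 0, 1, 1],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 1],
              [0, 1, 1, 1, 1],
              [0, 1, 1, 1, 1]])

# Perfect agreement x = y maximizes the likelihood term (alpha * 25 here),
# while the global flip x = 1 - y zeroes it and leaves the prior unchanged.
print(log_target(y, y), log_target(1 - y, y))
```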
2D Ising models
[Figure: (a) original image, (b) focused region of the image.]
2D Ising models
[Figure: binary pixel states (on/off) on the X1 × X2 grid at iterations 300,000 through 500,000, for Metropolis–Hastings (top) and Wang–Landau (bottom).]

Figure: Spatial model example: states explored over 200,000 iterations for Metropolis–Hastings (top) and the proposed algorithm (bottom).
2D Ising models
[Figure: average pixel state (scale 0.4 to 1.0) on the X1 × X2 grid, Metropolis–Hastings (left) and Wang–Landau (right).]

Figure: Spatial model example: average state explored with Metropolis–Hastings (left) and Wang–Landau after importance sampling (right).
Conclusion
Automatic binning
We still have to define a range.
Parallel Chains
In practice it is more efficient to use N chains for T iterations instead of 1 chain for N × T iterations.
Adaptive Proposals
Convergence results with fixed proposals are already challenging, and making the proposal adaptive might add a layer of complexity.
Bibliography
Article: An Adaptive Interacting Wang–Landau Algorithm for Automatic Density Exploration, L. Bornn, P.E. Jacob, P. Del Moral, A. Doucet, available on arXiv.

Software: PAWL, an R package, available on CRAN:

install.packages("PAWL")

References:

F. Wang, D.P. Landau, Physical Review E, 64(5):056101

Y. Atchadé, J. Liu, Statistica Sinica, 20:209–233