An introduction to advanced (?) MCMC methods
Christian P. Robert
Université Paris-Dauphine and CREST-INSEE, http://www.ceremade.dauphine.fr/~xian
Royal Statistical Society, October 13, 2010
1 Motivating example
2 The Metropolis-Hastings Algorithm
Motivating example

Latent structures make life harder!

Even simple models may lead to computational complications, as in latent variable models

f(x|θ) = ∫ f⋆(x, x⋆|θ) dx⋆ .

If (x, x⋆) is observed, fine!
If only x is observed, trouble!
Example (Mixture models)

Models of mixtures of distributions:

X ∼ fj with probability pj ,

for j = 1, 2, . . . , k, with overall density

X ∼ p1 f1(x) + · · · + pk fk(x) .

For a sample of independent random variables (X1, . . . , Xn), the sample density is

∏_{i=1}^{n} { p1 f1(xi) + · · · + pk fk(xi) } .

Expanding this product involves k^n elementary terms: prohibitive to compute in large samples.
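Note that the k^n-term expansion is never needed to *evaluate* the likelihood: each of the n factors is itself just a k-term sum, so direct evaluation costs O(nk). A minimal sketch for a Gaussian mixture (function names and the toy data are illustrative, not from the slides):

```python
import numpy as np

def mixture_loglik(x, weights, means, sds):
    """Log-likelihood of a Gaussian mixture, evaluated factor by factor.

    Each factor p1 f1(xi) + ... + pk fk(xi) is a k-term sum, so the total
    cost is O(n k) -- the k^n-term expansion of the product is never formed.
    """
    x = np.asarray(x, float)[:, None]                    # shape (n, 1)
    w, mu, sd = (np.asarray(a, float) for a in (weights, means, sds))
    # component densities p_l * phi((x - mu_l) / sd_l), shape (n, k)
    comp = w * np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    return np.log(comp.sum(axis=1)).sum()

# toy two-component sample, in the spirit of the slides' examples
rng = np.random.default_rng(0)
data = np.where(rng.random(500) < 0.3,
                rng.normal(0.0, 1.0, 500),
                rng.normal(2.5, 1.0, 500))
ll = mixture_loglik(data, [0.3, 0.7], [0.0, 2.5], [1.0, 1.0])
```

(In practice one would work with log-sum-exp for numerical stability when components are far apart; the direct form above suffices for illustration.)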
[Figure: log-likelihood surface of the mixture 0.3 N(µ1, 1) + 0.7 N(µ2, 1), as a function of (µ1, µ2).]
A typology of Bayes computational problems

(i) use of a complex parameter space, as for instance in constrained parameter sets like those resulting from imposing stationarity constraints in dynamic models;
(ii) use of a complex sampling model with an intractable likelihood, as for instance in missing data and graphical models;
(iii) use of a huge dataset;
(iv) use of a complex prior distribution (which may be the posterior distribution associated with an earlier sample);
(v) use of a complex inferential procedure, as for instance Bayes factors

Bπ01(x) = [ P(θ ∈ Θ0 | x) / P(θ ∈ Θ1 | x) ] / [ π(θ ∈ Θ0) / π(θ ∈ Θ1) ] .
The Metropolis-Hastings Algorithm

1 Motivating example
2 The Metropolis-Hastings Algorithm
  Monte Carlo Methods based on Markov Chains
  The Metropolis–Hastings algorithm
  A collection of Metropolis-Hastings algorithms
  Extensions
  Convergence assessment
Monte Carlo Methods based on Markov Chains

Running Monte Carlo via Markov Chains

Fact: It is not necessary to use a sample from the distribution f to approximate the integral

I = ∫ h(x) f(x) dx .

We can obtain X1, . . . , Xn ∼ f (approximately) without directly simulating from f, using an ergodic Markov chain with stationary distribution f.
Running Monte Carlo via Markov Chains (2)

Idea
For an arbitrary starting value x(0), an ergodic chain (X(t)) is generated using a transition kernel with stationary distribution f.

Ensures the convergence in distribution of (X(t)) to a random variable from f.
For a "large enough" T0, X(T0) can be considered as distributed from f.
Produces a dependent sample X(T0), X(T0+1), . . . , generated from f, sufficient for most approximation purposes.
The Metropolis–Hastings algorithm

Problem: How can one build a Markov chain with a given stationary distribution?

MH basics
An algorithm that converges to the objective (target) density f, using an arbitrary transition kernel density q(x, y) called the instrumental (or proposal) distribution.
The MH algorithm

Algorithm (Metropolis–Hastings)
Given x(t),
1 Generate Yt ∼ q(x(t), y).
2 Take
X(t+1) = Yt with prob. ρ(x(t), Yt), and X(t+1) = x(t) with prob. 1 − ρ(x(t), Yt),
where
ρ(x, y) = min { [f(y)/f(x)] · [q(y, x)/q(x, y)] , 1 } .
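The two steps above translate almost line for line into code. A minimal sketch (the function and argument names are illustrative; the N(0, 1) target with a Gaussian random-walk proposal is chosen only as a test case):

```python
import numpy as np

def metropolis_hastings(log_f, propose, log_q, x0, n_iter, rng=None):
    """Generic Metropolis-Hastings kernel (a sketch; names are illustrative).

    log_f           : unnormalised log target density f
    propose(x, rng) : draws Yt ~ q(x, .)
    log_q(x, y)     : log q(x, y), up to an additive constant
    """
    rng = rng or np.random.default_rng()
    x = x0
    chain = [x0]
    for _ in range(n_iter):
        y = propose(x, rng)
        # log rho(x, y) = log min{ f(y) q(y, x) / (f(x) q(x, y)), 1 }
        log_rho = min(log_f(y) - log_f(x) + log_q(y, x) - log_q(x, y), 0.0)
        if np.log(rng.random()) < log_rho:
            x = y
        chain.append(x)
    return np.array(chain)

# test case: N(0, 1) target with a Gaussian random-walk proposal,
# for which q is symmetric and its contribution cancels in rho
chain = metropolis_hastings(
    log_f=lambda x: -0.5 * x**2,
    propose=lambda x, rng: x + rng.normal(),
    log_q=lambda x, y: 0.0,
    x0=0.0,
    n_iter=20_000,
    rng=np.random.default_rng(1),
)
```

Working on the log scale avoids overflow in the ratio, and only the unnormalised log f is needed, since normalizing constants cancel in ρ.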
Features

Independent of normalizing constants for both f and q(x, ·) (i.e., those constants independent of x)
Never moves to values with f(y) = 0
The chain (x(t))t may take the same value several times in a row, even though f is a density w.r.t. Lebesgue measure
The sequence (yt)t is usually not a Markov chain
Satisfies the detailed balance condition

f(x) K(x, y) = f(y) K(y, x)

[Green, 1995]
Convergence properties

1 The M-H Markov chain is reversible, with invariant/stationary density f.
2 As f is a probability measure, the chain is positive recurrent.
3 If

Pr [ f(Yt) q(Yt, X(t)) / ( f(X(t)) q(X(t), Yt) ) ≥ 1 ] < 1 ,   (1)

i.e., if the event {X(t+1) = X(t)} occurs with positive probability, then the chain is aperiodic.
Convergence properties (2)

4 If

q(x, y) > 0 for every (x, y),   (2)

the chain is irreducible.
5 For M-H, f-irreducibility implies Harris recurrence.
6 Thus, under conditions (1) and (2),
(i) for h with Ef |h(X)| < ∞,

lim_{T→∞} (1/T) ∑_{t=1}^{T} h(X(t)) = ∫ h(x) f(x) dx   a.e. f ;

(ii) and

lim_{n→∞} ‖ ∫ Kn(x, ·) µ(dx) − f ‖_TV = 0

for every initial distribution µ, where Kn(x, ·) denotes the kernel for n transitions.
A collection of Metropolis-Hastings algorithms

The Independent Case

The instrumental distribution q(x, ·) is independent of x and is denoted g.

Algorithm (Independent Metropolis-Hastings)
Given x(t),
1 Generate Yt ∼ g(y)
2 Take
X(t+1) = Yt with prob. min { [f(Yt) g(x(t))] / [f(x(t)) g(Yt)] , 1 }, and X(t+1) = x(t) otherwise.
Properties

The resulting sample is not iid, but there exist strong convergence properties:

Theorem (Ergodicity)
The algorithm produces a uniformly ergodic chain if there exists a constant M such that

f(x) ≤ M g(x) ,   x ∈ supp f.

In this case,

‖Kn(x, ·) − f‖_TV ≤ (1 − 1/M)^n .

[Mengersen & Tweedie, 1996]
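A sketch of the independent sampler, using a heavy-tailed Cauchy proposal for a N(0, 1) target so that the domination condition f ≤ M g of the theorem holds (here M = sup f/g ≈ 1.52; all names are illustrative):

```python
import numpy as np

def independent_mh(log_f, log_g, sample_g, x0, n_iter, rng=None):
    """Independent MH: the proposal g does not depend on the current state (a sketch)."""
    rng = rng or np.random.default_rng(4)
    x, w = x0, log_f(x0) - log_g(x0)       # importance log-weight log(f/g)
    out = np.empty(n_iter)
    for t in range(n_iter):
        y = sample_g(rng)
        w_y = log_f(y) - log_g(y)
        # accept with prob min{ f(y) g(x) / (f(x) g(y)), 1 } = min{ (f/g)(y) / (f/g)(x), 1 }
        if np.log(rng.random()) < w_y - w:
            x, w = y, w_y
        out[t] = x
    return out

# N(0, 1) target with a Cauchy proposal: the weight f/g is bounded,
# so the chain is uniformly ergodic by the theorem above
chain = independent_mh(
    log_f=lambda x: -0.5 * x**2,
    log_g=lambda x: -np.log1p(x**2),   # Cauchy log-density up to a constant
    sample_g=lambda rng: rng.standard_cauchy(),
    x0=0.0,
    n_iter=20_000,
)
```

With a lighter-tailed proposal (say a Normal narrower than f) the weight f/g would be unbounded and uniform ergodicity would fail.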
Example (Noisy AR(1))

Hidden Markov chain from a regular AR(1) model,

xt+1 = ϕ xt + εt+1 ,   εt ∼ N(0, τ²),

and observables

yt | xt ∼ N(xt², σ²) .

The distribution of xt given xt−1, xt+1 and yt is proportional to

exp −(1/2τ²) { (xt − ϕ xt−1)² + (xt+1 − ϕ xt)² + (τ²/σ²)(yt − xt²)² } .
Example (Noisy AR(1) too)

Use for proposal the N(µt, ωt²) distribution, with

µt = ϕ (xt−1 + xt+1) / (1 + ϕ²)   and   ωt² = τ² / (1 + ϕ²) .

The ratio

π(x) / qind(x) ∝ exp { −(yt − xt²)² / 2σ² }

is bounded.
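One independent-MH update of xt under this proposal can be sketched as follows. Since the proposal matches the two AR(1) factors of the target exactly, the acceptance ratio reduces to the bounded observation term (function and variable names are illustrative):

```python
import numpy as np

def noisy_ar1_step(x, x_prev, x_next, y, phi, tau2, sigma2, rng):
    """One independent-MH update of x_t given its neighbours and y_t (a sketch).

    Proposal: the Gaussian N(mu_t, omega2_t) obtained from the two AR(1)
    factors; the ratio pi/q then reduces to exp{-(y_t - x_t^2)^2 / (2 sigma^2)},
    which is bounded, so the acceptance probability is easy to compute.
    """
    mu = phi * (x_prev + x_next) / (1 + phi**2)
    omega2 = tau2 / (1 + phi**2)
    prop = rng.normal(mu, np.sqrt(omega2))
    # log of [pi(prop)/q(prop)] / [pi(x)/q(x)]
    log_rho = (-(y - prop**2) ** 2 + (y - x**2) ** 2) / (2 * sigma2)
    if np.log(rng.random()) < min(log_rho, 0.0):
        return prop
    return x

rng = np.random.default_rng(0)
x_new = noisy_ar1_step(x=0.5, x_prev=0.2, x_next=0.4, y=1.0,
                       phi=0.9, tau2=1.0, sigma2=1.0, rng=rng)
```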
[Figure: (top) last 500 realisations of the chain {Xk}k out of 10,000 iterations; (bottom) histogram of the chain, compared with the target distribution.]
Random walk Metropolis–Hastings

Instead, use a local perturbation as proposal,

Yt = X(t) + εt ,

where εt ∼ g, independent of X(t). The instrumental density is now of the form g(y − x), and the Markov chain is a random walk if g is symmetric:

g(x) = g(−x)
Algorithm (Random walk Metropolis)
Given x(t),
1 Generate Yt ∼ g(y − x(t))
2 Take
X(t+1) = Yt with prob. min { 1, f(Yt)/f(x(t)) }, and X(t+1) = x(t) otherwise.
Probit illustration

Likelihood and posterior given by

π(β | y, X) ∝ ℓ(β | y, X) ∝ ∏_{i=1}^{n} Φ(xiᵀβ)^{yi} (1 − Φ(xiᵀβ))^{ni − yi} ,

under the flat prior. A random walk proposal works well for a small number of predictors. Use the maximum likelihood estimate β̂ as starting value and the asymptotic (Fisher) covariance matrix of the MLE, Σ̂, as scale.
MCMC algorithm

Probit random-walk Metropolis-Hastings
Initialization: Set β(0) = β̂ and compute Σ̂.
Iteration t:
1 Generate β̃ ∼ Nk+1(β(t−1), τ Σ̂)
2 Compute

ρ(β(t−1), β̃) = min ( 1, π(β̃ | y) / π(β(t−1) | y) )

3 With probability ρ(β(t−1), β̃) set β(t) = β̃; otherwise set β(t) = β(t−1).
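A minimal sketch of this sampler on simulated Bernoulli data (ni = 1). In the slides the chain starts at the MLE with its Fisher covariance as scale; here, to stay self-contained, we start at zero with a fixed diagonal scale, which is an assumption of this sketch, not the slides' prescription:

```python
import numpy as np
from scipy.stats import norm

def probit_logpost(beta, X, y):
    """Log posterior under the flat prior (= log likelihood), Bernoulli case n_i = 1."""
    p = np.clip(norm.cdf(X @ beta), 1e-12, 1 - 1e-12)   # guard against log(0)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def probit_rwmh(X, y, beta0, Sigma, tau=1.0, n_iter=2000, rng=None):
    """Random-walk MH for the probit posterior; beta0 and Sigma play the roles
    of the MLE and its asymptotic covariance in the slides' algorithm."""
    rng = rng or np.random.default_rng(5)
    L = np.linalg.cholesky(tau * Sigma)      # proposal: N(beta, tau * Sigma)
    beta = np.asarray(beta0, float)
    lp = probit_logpost(beta, X, y)
    chain = np.empty((n_iter, beta.size))
    for t in range(n_iter):
        prop = beta + L @ rng.standard_normal(beta.size)
        lp_prop = probit_logpost(prop, X, y)
        if np.log(rng.random()) < lp_prop - lp:   # rho = pi(prop)/pi(beta) ∧ 1
            beta, lp = prop, lp_prop
        chain[t] = beta
    return chain

# toy data with true coefficients (1, -1)
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))
y = (rng.random(500) < norm.cdf(X @ np.array([1.0, -1.0]))).astype(float)
chain = probit_rwmh(X, y, np.zeros(2), 0.01 * np.eye(2))
```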
R bank benchmark

Probit modelling with no intercept over the four measurements. Three different scales τ = 1, 0.1, 10: the best mixing behavior is associated with τ = 1. The average of the parameters over 9,000 MCMC iterations gives the plug-in estimate

pi = Φ(−1.2193 xi1 + 0.9540 xi2 + 0.9795 xi3 + 1.1481 xi4) .

[Figure: trace plots, histograms, and autocorrelation functions of the four coefficients, for the three scales τ.]
Example (Mixture models)

π(θ | x) ∝ ∏_{j=1}^{n} ( ∑_{ℓ=1}^{k} pℓ f(xj | µℓ, σℓ) ) π(θ)

Metropolis-Hastings proposal:

θ(t+1) = θ(t) + ωε(t) if u(t) < ρ(t), and θ(t+1) = θ(t) otherwise,

where

ρ(t) = [ π(θ(t) + ωε(t) | x) / π(θ(t) | x) ] ∧ 1

and ω scaled for good acceptance rate
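This random-walk proposal over θ = (µ1, µ2) can be sketched directly for the slides' two-component target .7 N(µ1, 1) + .3 N(µ2, 1), under a flat prior on the means (an assumption of this sketch; all names are illustrative):

```python
import numpy as np

def mixture_logpost(mu, x, p=(0.7, 0.3)):
    """Log posterior of (mu1, mu2) for p1 N(mu1,1) + p2 N(mu2,1), flat prior on the means."""
    d = np.exp(-0.5 * (x[:, None] - np.asarray(mu)) ** 2) / np.sqrt(2 * np.pi)
    return np.log(d @ np.asarray(p)).sum()

def rw_chain(x, omega, n_iter=5000, rng=None):
    """Random-walk Metropolis over theta = (mu1, mu2) with step scale omega."""
    rng = rng or np.random.default_rng(2)
    theta = np.zeros(2)
    lp = mixture_logpost(theta, x)
    out = np.empty((n_iter, 2))
    for t in range(n_iter):
        prop = theta + omega * rng.standard_normal(2)
        lp_prop = mixture_logpost(prop, x)
        if np.log(rng.random()) < lp_prop - lp:   # rho = pi(prop|x)/pi(theta|x) ∧ 1
            theta, lp = prop, lp_prop
        out[t] = theta
    return out

rng = np.random.default_rng(0)
data = np.where(rng.random(300) < 0.7,
                rng.normal(0.0, 1.0, 300),
                rng.normal(2.5, 1.0, 300))
chain_1 = rw_chain(data, omega=1.0)            # scale 1
chain_01 = rw_chain(data, omega=np.sqrt(0.1))  # scale sqrt(.1)
```

With the smaller scale the moves are shorter and the chain explores the posterior more slowly, which is the point of the iteration-by-iteration comparison on the slides.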
[Figure: random walk MCMC output for .7 N(µ1, 1) + .3 N(µ2, 1) and scale 1, shown at iterations 1, 10, 100, 500, and 1000.]

[Figure: random walk MCMC output for .7 N(µ1, 1) + .3 N(µ2, 1) and scale √.1, shown at iterations 10, 100, 500, 1000, 5000, and 10,000.]
Convergence properties

Uniform ergodicity is prohibited by the random walk structure. At best, geometric ergodicity:

Theorem (Sufficient ergodicity)
For a symmetric density f, log-concave in the tails, and a positive and symmetric density g, the chain (X(t)) is geometrically ergodic.
[Mengersen & Tweedie, 1996]

no tail effect
Example (Comparison of tail effects)

Random-walk Metropolis–Hastings algorithms based on a N(0, 1) instrumental for the generation of (a) a N(0, 1) distribution and (b) a distribution with density ψ(x) ∝ (1 + |x|)⁻³.

[Figure: 90% confidence envelopes of the means, derived from 500 parallel independent chains, for cases (a) and (b).]
Extensions

There are many other families of MH algorithms:

Adaptive Rejection Metropolis Sampling
Reversible Jump
Langevin algorithms

to name just a few...
Langevin Algorithms

The proposal is based on the Langevin diffusion Lt, defined by the stochastic differential equation

dLt = dBt + (1/2) ∇ log f(Lt) dt,

where Bt is the standard Brownian motion.

Theorem
The Langevin diffusion is the only non-explosive diffusion which is reversible with respect to f.
![Page 64: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/64.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
Because continuous time cannot be simulated, consider thediscretised sequence
x(t+1) = x(t) +σ2
2∇ log f(x(t)) + σεt, εt ∼ Np(0, Ip)
where σ2 corresponds to the discretisation step
Example off(x) = exp(−x4)
Den
sity
−1.5 −1.0 −0.5 0.0 0.5 1.0 1.5
0.0
0.1
0.2
0.3
0.4
0.5
0.6
σ2 = .1
![Page 65: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/65.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
[Histogram of the discretised chain for f(x) = exp(−x⁴), now with σ² = .01]
![Page 66: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/66.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
[Histogram of the discretised chain for f(x) = exp(−x⁴), now with σ² = .001]
![Page 67: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/67.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
[Histogram of the discretised chain for f(x) = exp(−x⁴), now with σ² = .0001]
![Page 68: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/68.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
[Histogram of the discretised chain for f(x) = exp(−x⁴), with σ² = .0001 (∗)]
![Page 69: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/69.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Discretization
Unfortunately, the discretized chain may be transient, for instance when
lim_{x→±∞} | σ² ∇ log f(x) |x|⁻¹ | > 1
Example of f(x) = exp(−x⁴) when σ² = .2
![Page 70: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/70.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
MH correction
Accept the new value Yt with probability
f(Yt)/f(x(t)) · exp{ −‖x(t) − Yt − (σ²/2) ∇ log f(Yt)‖² / 2σ² } / exp{ −‖Yt − x(t) − (σ²/2) ∇ log f(x(t))‖² / 2σ² } ∧ 1 .
Choice of the scaling factor σ
Should lead to an acceptance rate of 0.574 to achieve optimal convergence rates (when the components of x are uncorrelated)
[Roberts & Rosenthal, 1998]
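A sketch of the corrected (Metropolis-adjusted Langevin) step for the same toy target; the helper `mala_step` and the tuning loop are illustrative assumptions, not code from the talk:

```python
import numpy as np

def mala_step(x, log_f, grad_log_f, sigma2, rng):
    """One Langevin proposal plus Metropolis-Hastings correction (1-d target)."""
    sigma = np.sqrt(sigma2)
    fwd_mean = x + 0.5 * sigma2 * grad_log_f(x)
    y = fwd_mean + sigma * rng.standard_normal()
    bwd_mean = y + 0.5 * sigma2 * grad_log_f(y)
    # log q(x | y) - log q(y | x) for the two Gaussian proposal kernels
    log_q_ratio = (-(x - bwd_mean) ** 2 + (y - fwd_mean) ** 2) / (2.0 * sigma2)
    log_alpha = log_f(y) - log_f(x) + log_q_ratio
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False

log_f = lambda x: -x**4            # f(x) = exp(-x^4), up to a constant
grad_log_f = lambda x: -4.0 * x**3
rng = np.random.default_rng(1)
x, n_acc = 0.0, 0
for _ in range(5_000):
    x, accepted = mala_step(x, log_f, grad_log_f, 0.2, rng)
    n_acc += accepted
rate = n_acc / 5_000   # sigma would then be tuned toward ~0.574 acceptance
```

Note that the MH correction keeps the chain stable even at σ² = .2, where the uncorrected discretised chain may misbehave.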
![Page 71: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/71.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Optimizing the Acceptance Rate
Problem of choice of the transition kernel from a practical point of view. Most common alternatives:
(a) a fully automated algorithm like ARMS;
(b) an instrumental density g which approximates f, such that f/g is bounded, for uniform ergodicity to apply;
(c) a random walk.
In both cases (b) and (c), the choice of g is critical.
![Page 72: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/72.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Case of the random walk
Different approach to acceptance rates: a high acceptance rate does not indicate that the algorithm is moving correctly, since it indicates that the random walk is moving too slowly on the surface of f. If x(t) and yt are close, i.e. f(x(t)) ≃ f(yt), then yt is accepted with probability
min( f(yt)/f(x(t)), 1 ) ≃ 1 .
For multimodal densities with well-separated modes, the negative effect of limited moves on the surface of f clearly shows.
![Page 74: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/74.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Case of the random walk (2)
If the average acceptance rate is low, the successive values of f(yt) tend to be small compared with f(x(t)), which means that the random walk moves quickly on the surface of f since it often reaches the “borders” of the support of f
![Page 75: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/75.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Rule of thumb
In small dimensions, aim at an average acceptance rate of 50%. In large dimensions, at an average acceptance rate of 25%.
[Gelman, Gilks and Roberts, 1995]
This rule is to be taken with a pinch of salt!
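The trade-off behind these rates can be seen numerically with a plain random-walk Metropolis–Hastings sampler on a standard normal target; this is a hedged sketch, with `rw_mh` an illustrative helper and the two scales arbitrary:

```python
import numpy as np

def rw_mh(log_f, x0, scale, n, rng):
    """Random-walk Metropolis-Hastings; returns the chain and acceptance rate."""
    x = np.empty(n)
    x[0] = x0
    n_acc = 0
    for t in range(n - 1):
        prop = x[t] + scale * rng.standard_normal()
        # symmetric proposal: accept with probability min(1, f(prop)/f(x))
        if np.log(rng.uniform()) < log_f(prop) - log_f(x[t]):
            x[t + 1] = prop
            n_acc += 1
        else:
            x[t + 1] = x[t]
    return x, n_acc / (n - 1)

log_f = lambda x: -0.5 * x**2       # standard normal target, up to a constant
rng = np.random.default_rng(2)
_, rate_small = rw_mh(log_f, 0.0, 0.1, 5_000, rng)   # tiny moves: acceptance near 1
_, rate_large = rw_mh(log_f, 0.0, 10.0, 5_000, rng)  # huge moves: low acceptance
```

Neither extreme explores efficiently, which is why intermediate target rates such as 25–50% are advocated.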
![Page 77: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/77.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Example (Noisy AR(1) continued)
For a Gaussian random walk with scale ω small enough, the random walk never jumps to the other mode. But if the scale ω is sufficiently large, the Markov chain explores both modes and gives a satisfactory approximation of the target distribution.
![Page 78: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/78.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Markov chain based on a random walk with scale ω = .1
![Page 79: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/79.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Markov chain based on a random walk with scale ω = .5
![Page 80: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/80.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Extensions
Where do we stand?
MCMC in a nutshell:
Running a sequence Xt+1 = Ψ(Xt, Yt) provides an approximation to the target density f when the detailed balance condition holds
f(x)K(x, y) = f(y)K(y, x)
Easiest implementation of the principle is random walk Metropolis–Hastings
Yt = X(t) + εt
Practical convergence requires sufficient energy from the proposal, which is calibrated by trial and error.
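On a finite state space the detailed balance condition can be checked directly; this sketch (the function name `mh_kernel` is mine, not from the slides) builds the MH transition matrix for a symmetric proposal and verifies f(x)K(x, y) = f(y)K(y, x):

```python
import numpy as np

def mh_kernel(f, proposal):
    """Metropolis-Hastings transition matrix for target probabilities f
    and a symmetric proposal matrix."""
    n = len(f)
    K = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            if x != y:
                K[x, y] = proposal[x, y] * min(1.0, f[y] / f[x])
        K[x, x] = 1.0 - K[x].sum()   # stay put on rejection
    return K

f = np.array([0.1, 0.2, 0.3, 0.4])
proposal = np.full((4, 4), 0.25)     # uniform, hence symmetric
K = mh_kernel(f, proposal)
flux = f[:, None] * K                # flux[x, y] = f(x) K(x, y)
```

Detailed balance makes `flux` a symmetric matrix, from which stationarity of f under K follows.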
![Page 84: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/84.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Convergence diagnostics
How many iterations?
Rule # 1 There is no absolute number of simulations, i.e. 1,000 is neither large nor small.
Rule # 2 It takes [much] longer to check for convergence than for the chain itself to converge.
Rule # 3 MCMC is a “what-you-get-is-what-you-see” algorithm: it tells nothing about unexplored parts of the space.
Rule # 4 When in doubt, run MCMC chains in parallel and check for consistency.
Many “quick-&-dirty” solutions in the literature, but not necessarily 100% trustworthy.
![Page 87: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/87.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Example (Bimodal target)
Density
f(x) = exp(−x²/2)/√(2π) × [4(x − .3)² + .01] / [4(1 + (.3)²) + .01] .
[Plot of the bimodal density on (−4, 4)]
and use of a random walk Metropolis–Hastings algorithm with variance .04. Evaluation of the missing mass by
∑_{t=1}^{T−1} [θ(t+1) − θ(t)] f(θ(t))
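The missing-mass evaluation can be sketched as a Riemann sum over the ordered chain values; `riemann_mass` is an illustrative name, iid normal draws stand in here for actual MCMC output, and the target density must be normalised:

```python
import numpy as np

def riemann_mass(theta, f):
    """Riemann-sum evaluation sum_t [theta_(t+1) - theta_(t)] f(theta_(t))
    over the ordered sample; values near 1 suggest little missing mass."""
    s = np.sort(theta)
    return float(np.sum(np.diff(s) * f(s[:-1])))

normal_pdf = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
rng = np.random.default_rng(3)
mass = riemann_mass(rng.standard_normal(2_000), normal_pdf)  # close to 1
```

A chain stuck in one mode would produce a value visibly below 1, which is the diagnostic use made of this quantity.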
![Page 88: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/88.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Sequence [in blue] and mass evaluation [in brown]
[Philippe & Robert, 2001]
![Page 89: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/89.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Effective sample size
How many iid simulations from π are equivalent to N simulations from the MCMC algorithm?
Based on the estimated k-th order autocorrelation,
ρk = corr( x(t), x(t+k) ) ,
the effective sample size is
Ness = n ( 1 + 2 ∑_{k=1}^{T0} ρk )^(−1) ,
Only a partial indicator, as it fails to signal chains stuck in one mode of the target
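A direct translation of this effective sample size (the cut-off `max_lag` standing in for T0, and the function name, are illustrative choices):

```python
import numpy as np

def ess(x, max_lag=100):
    """Effective sample size n / (1 + 2 * sum of autocorrelations up to max_lag)."""
    n = len(x)
    xc = x - x.mean()
    var = xc @ xc / n
    rho = np.array([(xc[:-k] @ xc[k:]) / (n * var) for k in range(1, max_lag + 1)])
    return n / (1.0 + 2.0 * rho.sum())

rng = np.random.default_rng(4)
iid = rng.standard_normal(20_000)      # no autocorrelation: ess close to n
ar1 = np.empty(20_000)                 # AR(1) chain with strong autocorrelation
ar1[0] = 0.0
for t in range(19_999):
    ar1[t + 1] = 0.9 * ar1[t] + rng.standard_normal()
# ess(ar1) is roughly 20_000 * (1 - 0.9) / (1 + 0.9)
```

As the slide warns, a chain trapped in one mode can still report a comfortable ESS, so this is no substitute for parallel-chain checks.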
![Page 91: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/91.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Tempering
Facilitate exploration of π by flattening the target: simulate from πα(x) ∝ π(x)^α for α > 0 small enough
Determine where the modal regions of π are (possibly with parallel versions using different α's)
Recycle simulations from π(x)^α into simulations from π by importance sampling
Simple modification of the Metropolis–Hastings algorithm, with new acceptance probability
{ ( π(θ′|x)/π(θ|x) )^α · q(θ|θ′)/q(θ′|θ) } ∧ 1
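A sketch of this tempered acceptance with a random-walk proposal (symmetric, so the q-ratio cancels), applied to the bimodal density of the earlier example; the names and tuning values are illustrative:

```python
import numpy as np

def tempered_rw_step(theta, log_target, alpha, scale, rng):
    """Random-walk MH move on the flattened target pi^alpha:
    accept with probability (pi(theta')/pi(theta))^alpha, capped at 1."""
    prop = theta + scale * rng.standard_normal()
    log_ratio = alpha * (log_target(prop) - log_target(theta))
    if np.log(rng.uniform()) < log_ratio:
        return prop
    return theta

# Bimodal target of the earlier example, up to a normalising constant
log_target = lambda x: -0.5 * x**2 + np.log(4.0 * (x - 0.3) ** 2 + 0.01)
rng = np.random.default_rng(5)
chain = np.empty(5_000)
chain[0] = 0.0
for t in range(4_999):
    chain[t + 1] = tempered_rw_step(chain[t], log_target, alpha=0.2, scale=1.0, rng=rng)
```

With α = .2 the flattened chain crosses the low-density region near x = .3 easily, so both modes are visited; the draws can then be recycled by importance sampling as described above.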
![Page 93: Introduction to advanced Monte Carlo methods](https://reader034.vdocuments.us/reader034/viewer/2022042714/5549cdfdb4c9057c6d8b4b34/html5/thumbnails/93.jpg)
An introduction to advanced (?) MCMC methods
The Metropolis-Hastings Algorithm
Convergence assessment
Tempering with the mean mixture
[Three scatterplots of simulations from the tempered target πα, for α = 1, .5 and .2]