discussion of matti vihola's talk

10
On stability and convergence of adaptive MC[MC] Discussion by Christian P. Robert Universit´ e Paris-Dauphine, IuF, and CREST http://xianblog.wordpress.com Adap’skiii, Park City, Utah, Jan. 03, 2011 C.P. Robert On stability and convergence of adaptive MC[MC]

Upload: christian-robert

Post on 04-Jul-2015

1.017 views

Category:

Education


2 download

DESCRIPTION

Presented at Adap'skiii, Park City, Utah, Jan. 3-4, 2010

TRANSCRIPT

Page 1: Discussion of Matti Vihola's talk

On stability and convergenceof adaptive MC[MC]

Discussion by Christian P. Robert

Universite Paris-Dauphine, IuF, and CRESThttp://xianblog.wordpress.com

Adap’skiii, Park City, Utah, Jan. 03, 2011

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 2: Discussion of Matti Vihola's talk

The adaptive algorithm

Focussing on the scale matrix of an adaptive random-walkMetropolis Hastings scheme is theoretically fascinating, especiallywhen considering

removal of containment for LLN [intuitive but hard to prove]

potential link with renewal theory [in the fixed componentversion]

no lower bound on Σm

verifiable assumptions

[Roberts & Rosenthal, 2007]

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 3: Discussion of Matti Vihola's talk

The adaptive algorithm (2)

Limited applicability of the adaptive scheme when using arandom-walk Metropolis Hastings scheme because of

strong tail [or support] conditions

more generaly, dependence on parameterisation

curse of dimensionality [ellip. symm. less & less appropriate]

unavailability of the performance target [e.g., acceptance of0.234]

Congratulations and thanks for adding a software to the theoreticaldevelopment!

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 4: Discussion of Matti Vihola's talk

The adaptive algorithm (2)

Limited applicability of the adaptive scheme when using arandom-walk Metropolis Hastings scheme because of

strong tail [or support] conditions

more generaly, dependence on parameterisation

curse of dimensionality [ellip. symm. less & less appropriate]

unavailability of the performance target [e.g., acceptance of0.234]

Congratulations and thanks for adding a software to the theoreticaldevelopment!

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 5: Discussion of Matti Vihola's talk

Initialising importance sampling

Alternative: Population Monte Carlo (PMC) offers a solution to thedifficulty of picking the importance function q through adaptivity:Given a target π, PMC produces a sequence qt of importancefunctions (t = 1, . . . , T ) aimed at improving the approximation ofπ

[Cappe, Douc, Guillin, Marin & X., 2007]

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 6: Discussion of Matti Vihola's talk

Adaptive importance sampling

Use of mixture densities

qt(x) = q(x;αt, θt) =D∑

d=1

αtdϕ(x; θt

d)

[West, 1993]

where

αt = (αt1, . . . , α

tD) is a vector of adaptable weights for the D

mixture components

θt = (θt1, . . . , θ

tD) is a vector of parameters which specify the

components

ϕ is a parameterised density (usually taken to be multivariateGaussian or Student-t)

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 7: Discussion of Matti Vihola's talk

PMC updates

Maximization of Lt(α, θ) leads to closed form solutions inexponential families (and for the t distributions)For instance for Np(µd,Σd):

αt+1d =

∫ρd(x;αt, µt,Σt)π(x)dx,

µt+1d =

∫xρd(x;αt, µt,Σt)π(x)dx

αt+1d

,

Σt+1d =

∫(x− µt+1

d )(x− µt+1d )Tρd(x;αt, µt,Σt)π(x)dxαt+1

d

.

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 8: Discussion of Matti Vihola's talk

Banana benchmark

Twisted Np(0,Σ) target with Σ = diag(σ21 , 1, . . . , 1), changing the

second co-ordinate x2 to x2 + b(x21 − σ2

1)

x1

x 2

−40 −20 0 20 40

−40

−30

−20

−10

010

20

p = 10, σ21 = 100, b = 0.03

[Haario et al. 1999]

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 9: Discussion of Matti Vihola's talk

Simulation

C.P. Robert On stability and convergence of adaptive MC[MC]

Page 10: Discussion of Matti Vihola's talk

Comparison to MCMC

Adaptive MCMC: Proposal is a multivariate Gaussian with Σupdated/based on previous values in the chain. Scale and updatetimes chosen for optimal results.

[Kilbinger, Wraith et al., 2010]

!"# $"# %"# &"# '"#

!$

!(

!!

"!

($

)!

!"# $"# %"# &"# '"#

!$

!(

!!

"!

($

)!

!"# $"# %"# &"# '"#

!$

!(

!!

"!

($

)(

!"# $"# %"# &"# '"#

!$

!(

!!

"!

($

)(

fa fa

fbfb

PMC MCMC

Evolution of π(fa) (top panels) and π(fb) (bottom panels) from 10k points to 100k points for both PMC (leftpanels) and MCMC (right panels).

C.P. Robert On stability and convergence of adaptive MC[MC]