computation of the marginal likelihood

Computation of the marginal likelihood: brief summary and method of power posteriors. Jean-Louis Foulley, JLF/BigMC, 06/01/2011.


DESCRIPTION

First talk at BigMC seminar on 06/01/2010 (Institut Henri Poincaré, Paris), by Jean-Louis Foulley, INRA, on "Computation of the marginal likelihood".

TRANSCRIPT

Page 1: Computation of the marginal likelihood

Computation of the marginal likelihood: brief summary and method of power posteriors

Jean-Louis Foulley

JLF/BigMC, 06/01/2011

Page 2: Computation of the marginal likelihood

Outline

Objectives

Brief summary of current methods

Direct Monte Carlo

Harmonic mean

Generalized harmonic mean

Chib

Bridge sampling

Nested sampling

Power Posteriors

Relationship with fractional BF

Algorithm

Examples

Conclusion


Page 3: Computation of the marginal likelihood

Objectives

Marginal likelihood ("prior predictive", "evidence"):

$$m(y) = \int_\Theta f(y \mid \theta)\,\pi(\theta)\,d\theta$$

- Normalization constant of the posterior:

$$\pi(\theta \mid y) = \frac{\pi^*(\theta \mid y)}{m(y)}, \quad \text{where } \pi^*(\theta \mid y) = f(y \mid \theta)\,\pi(\theta)$$

- Component of the Bayes factor:

$$BF_{12} = \frac{\pi(M_1 \mid y)/\pi(M_2 \mid y)}{\pi(M_1)/\pi(M_2)} = \frac{m_1(y)}{m_2(y)}$$

$$\Delta D_{m,12} = -2\ln BF_{12} = D_{m,1} - D_{m,2}, \quad \text{where } D_{m,j} = -2\ln m_j(y) \text{ is the marginal deviance}$$

Calibration: Jeffreys & Turing (deciban: $10\log_{10} BF$)

Page 4: Computation of the marginal likelihood

Methods/Monte Carlo, Harmonic Mean

1) Direct Monte Carlo

$$\hat m_{MC}(y) = \frac{1}{G}\sum_{g=1}^{G} f(y \mid \theta^{(g)}), \quad \theta^{(1)},\dots,\theta^{(G)}: \text{draws from } \pi(\theta)$$

Converges (a.s.) to $m(y)$ but is very inefficient: many samples fall outside the regions of high likelihood.

2) Harmonic mean (Newton & Raftery, 1994)

$$\hat m_{NR}(y) = \left[\frac{1}{G}\sum_{g=1}^{G}\frac{1}{f(y \mid \theta^{(g)})}\right]^{-1}, \quad \theta^{(1)},\dots,\theta^{(G)}: \text{draws from } \pi(\theta \mid y)$$

A special case of WIS:

$$\hat m(y) = \frac{\sum_{j=1}^{J} f(y \mid \theta^{(j)})\,w_j}{\sum_{j=1}^{J} w_j}, \quad \text{where } w_j \propto \frac{\pi(\theta^{(j)})}{g(\theta^{(j)})} \text{ for } g(\theta) \propto f(y \mid \theta)\,\pi(\theta)$$

Converges (a.s.) but is very unstable (infinite variance): to be absolutely avoided.

"Worst Monte Carlo Method Ever", Radford Neal (2010)

The harmonic mean is hardly affected by a change in the prior, whereas the true marginal is highly sensitive to the prior.
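The two estimators above can be compared on a model where $m(y)$ is known in closed form. The sketch below is only an illustration under stated assumptions: the model $y_i \sim N(\theta, 1)$, $\theta \sim N(\mu, \tau^2)$, the data vector, the seed and all function names are chosen here and do not come from the slides.

```python
import math
import random

def log_lik(y, theta):
    """log f(y | theta) for y_i ~ N(theta, 1), iid (illustrative model)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (yi - theta) ** 2 for yi in y)

def log_marginal_exact(y, mu, tau2):
    """Closed-form log m(y) when theta ~ N(mu, tau2)."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * ss
            - 0.5 * math.log(n * tau2 + 1)
            - 0.5 * n * (ybar - mu) ** 2 / (n * tau2 + 1))

def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))

def log_m_direct(y, mu, tau2, n_draws, rng):
    """Direct Monte Carlo: average the likelihood over draws from the prior."""
    sd = math.sqrt(tau2)
    return log_mean_exp([log_lik(y, rng.gauss(mu, sd)) for _ in range(n_draws)])

def log_m_harmonic(y, mu, tau2, n_draws, rng):
    """Harmonic mean: average 1/likelihood over draws from the (exact) posterior."""
    n = len(y)
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = post_var * (sum(y) + mu / tau2)
    sd = math.sqrt(post_var)
    neg = [-log_lik(y, rng.gauss(post_mean, sd)) for _ in range(n_draws)]
    return -log_mean_exp(neg)

y = [0.3, -0.5, 1.2, 0.8, -0.1, 0.4, 1.0, -0.7, 0.6, 0.2]  # made-up data
rng = random.Random(1)
exact = log_marginal_exact(y, 0.0, 1.0)
direct = log_m_direct(y, 0.0, 1.0, 20000, rng)
harmonic = log_m_harmonic(y, 0.0, 1.0, 20000, rng)
print(exact, direct, harmonic)
```

With a prior this concentrated the direct estimator behaves well. Rerunning with a much larger `tau2` should show the direct estimator degrading, while the harmonic mean barely moves even though the exact value changes.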

Page 5: Computation of the marginal likelihood

Methods/Gelfand&Dey & Chib

3) Generalized harmonic mean (Gelfand & Dey, 1994; Chen & Shao, 1997)

$$\hat m_{GD}(y) = \left[\frac{1}{G}\sum_{g=1}^{G}\frac{g(\theta^{(g)})}{f(y \mid \theta^{(g)})\,\pi(\theta^{(g)})}\right]^{-1}, \quad \theta^{(1)},\dots,\theta^{(G)}: \text{draws from } \pi(\theta \mid y)$$

$g(\cdot)$ is taken as an approximation of the posterior: problems in large dimensions.

4) Chib's method (1995)

$$\ln m(y) = \ln f(y \mid \theta) + \ln \pi(\theta) - \ln \pi(\theta \mid y), \quad \forall\,\theta$$

$$\ln \hat m_{SC}(y) = \ln f(y \mid \theta^*) + \ln \pi(\theta^*) - \ln \hat\pi(\theta^* \mid y)$$

$\theta^*$ is selected, e.g. ML, MAP or $E(\theta \mid y)$, and $\hat\pi(\theta^* \mid y)$ is to be estimated from $\theta^{(1)},\dots,\theta^{(G)}$, draws from $\pi(\theta \mid y)$.

Simple and often effective.
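Chib's identity holds for every $\theta^*$, which is easy to check on a conjugate model where the posterior ordinate is known exactly. The following sketch assumes the illustrative model $y_i \sim N(\theta,1)$, $\theta \sim N(\mu,\tau^2)$ (not taken from the slides); in a real application the last term would be an estimate $\hat\pi(\theta^* \mid y)$ rather than an exact density.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) at x."""
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (x - mean) ** 2 / var

def chib_log_marginal(y, mu, tau2, theta_star):
    """ln m(y) = ln f(y|theta*) + ln pi(theta*) - ln pi(theta*|y)."""
    n = len(y)
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = post_var * (sum(y) + mu / tau2)
    log_lik = sum(log_normal_pdf(yi, theta_star, 1.0) for yi in y)
    log_prior = log_normal_pdf(theta_star, mu, tau2)
    # Exact posterior ordinate here; it has to be estimated in general.
    log_post = log_normal_pdf(theta_star, post_mean, post_var)
    return log_lik + log_prior - log_post

y = [0.3, -0.5, 1.2, 0.8, -0.1, 0.4, 1.0, -0.7, 0.6, 0.2]  # made-up data
values = [chib_log_marginal(y, 0.0, 1.0, ts) for ts in (0.0, 0.32, 2.0)]
print(values)  # the three values agree up to rounding error
```

Although any $\theta^*$ gives the same answer when the ordinate is exact, a high-density point is preferred in practice because the estimate of $\pi(\theta^* \mid y)$ is most accurate there.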

Page 6: Computation of the marginal likelihood

Chib(Cont.)

4) Chib (1995)

$$\ln \hat m_{SC}(y) = \ln f(y \mid \theta^*) + \ln \pi(\theta^*) - \ln \hat\pi(\theta^* \mid y)$$

a) Gibbs & Rao-Blackwellization (Chib, 1995)

b) Metropolis-Hastings (Chib & Jeliazkov, 2001)

c) Kernel estimator (Chen, 1994)

Page 7: Computation of the marginal likelihood

Chib via Gibbs

If $\theta = (\theta_1, \theta_2)$:

$$\pi(\theta_1, \theta_2 \mid y) = \underbrace{\pi(\theta_1 \mid y, \theta_2)}_{\text{known}}\;\underbrace{\pi(\theta_2 \mid y)}_{\text{estimated}}$$

$$\pi(\theta_2 \mid y) = \int \underbrace{\pi(\theta_2 \mid y, \theta_1)}_{\text{known}}\;\underbrace{\pi(\theta_1 \mid y)}_{\text{MCMC draws}}\,d\theta_1$$

"Estimation by Rao-Blackwellization":

$$\hat\pi(\theta_2^* \mid y) = \frac{1}{G}\sum_{g=1}^{G}\pi(\theta_2^* \mid y, \theta_1^{(g)}), \quad \theta_1^{(g)}: \text{draws from } \pi(\theta_1 \mid y)$$

Page 8: Computation of the marginal likelihood

Bridge sampling

5) Bridge sampling (Meng & Wong, 1996)

$$m(y) = \frac{\int \alpha(\theta)\,f(y \mid \theta)\,\pi(\theta)\,g(\theta)\,d\theta}{\int \alpha(\theta)\,g(\theta)\,\pi(\theta \mid y)\,d\theta}$$

Page 9: Computation of the marginal likelihood

Bridge sampling/cont.

5) Bridge sampling (Meng & Wong, 1996)

$\alpha(\theta)$: "bridge function"; $g(\theta)$: density to be calibrated.

$$m(y) = \frac{\int \alpha(\theta)\,f(y \mid \theta)\,\pi(\theta)\,g(\theta)\,d\theta}{\int \alpha(\theta)\,g(\theta)\,\pi(\theta \mid y)\,d\theta} = \frac{E_{g}\left[\alpha(\theta)\,f(y \mid \theta)\,\pi(\theta)\right]}{E_{\pi(\theta \mid y)}\left[\alpha(\theta)\,g(\theta)\right]}$$

$\theta^{(l)}$, $l = 1,\dots,L$: draws from $g(\theta)$; $\theta^{(m)}$, $m = 1,\dots,M$: draws from $\pi(\theta \mid y)$.

For $\alpha = 1/g$ (importance sampling, IS):

$$\hat m_{BS,IS}(y) = \frac{1}{L}\sum_{l=1}^{L}\frac{f(y \mid \theta^{(l)})\,\pi(\theta^{(l)})}{g(\theta^{(l)})}$$

For $\alpha = 1/(f\,\pi)$ (Gelfand-Dey, 1994):

$$\hat m_{BS,GD}(y) = \left[\frac{1}{M}\sum_{m=1}^{M}\frac{g(\theta^{(m)})}{f(y \mid \theta^{(m)})\,\pi(\theta^{(m)})}\right]^{-1}$$

For $\alpha = 1/\sqrt{f\,\pi\,g}$ (Lopes-West, 2004):

$$\hat m_{BS,LW}(y) = \frac{L^{-1}\sum_{l=1}^{L}\left[f(y \mid \theta^{(l)})\,\pi(\theta^{(l)})/g(\theta^{(l)})\right]^{1/2}}{M^{-1}\sum_{m=1}^{M}\left[g(\theta^{(m)})/\{f(y \mid \theta^{(m)})\,\pi(\theta^{(m)})\}\right]^{1/2}}$$

Page 10: Computation of the marginal likelihood

Bridge sampling (cont.)

5) Bridge sampling (Meng & Wong, 1996; Lopes & West, 2004; Frühwirth-Schnatter, 2004; Ando, 2010)

$$m(y) = \frac{E_{g}\left[\alpha(\theta)\,f(y \mid \theta)\,\pi(\theta)\right]}{E_{\pi(\theta \mid y)}\left[\alpha(\theta)\,g(\theta)\right]}$$

$\theta^{(l)}$, $l = 1,\dots,L$: draws from $g(\theta)$ (cf. numerator); $\theta^{(m)}$, $m = 1,\dots,M$: draws from $\pi(\theta \mid y)$ (cf. denominator).

For $\alpha = 1/\sqrt{f\,\pi\,g}$ (Lopes & West, 2004; Ando, 2010):

$$\hat m_{BS,LW}(y) = \frac{L^{-1}\sum_{l=1}^{L}\left[f(y \mid \theta^{(l)})\,\pi(\theta^{(l)})/g(\theta^{(l)})\right]^{1/2}}{M^{-1}\sum_{m=1}^{M}\left[g(\theta^{(m)})/\{f(y \mid \theta^{(m)})\,\pi(\theta^{(m)})\}\right]^{1/2}}$$

For $\alpha = 1/\left[s_L\,g(\theta) + s_M\,\pi(\theta \mid y)\right]$, the optimum estimator with respect to E(RMS), computed iteratively:

$$\hat m^{(t+1)}_{BS,opt}(y) = \hat m^{(t)}_{BS,opt}(y)\;\frac{L^{-1}\sum_{l=1}^{L}\hat\pi^{(t)}(\theta^{(l)} \mid y)\big/\left[s_L\,g(\theta^{(l)}) + s_M\,\hat\pi^{(t)}(\theta^{(l)} \mid y)\right]}{M^{-1}\sum_{m=1}^{M} g(\theta^{(m)})\big/\left[s_L\,g(\theta^{(m)}) + s_M\,\hat\pi^{(t)}(\theta^{(m)} \mid y)\right]}$$

where $\hat\pi^{(t)}(\theta \mid y) = f(y \mid \theta)\,\pi(\theta)/\hat m^{(t)}_{BS,opt}(y)$, $s_L = 1 - s_M = L/(M+L)$, and $\hat m^{(0)}_{BS,opt} = \hat m_{BS,IS}$ or $\hat m_{BS,GD}$.
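The iterative scheme with the optimal bridge function can be sketched in a few lines. This is a hedged illustration, not the implementation used in the talk: the normal model, the data, the choice of $g$ (a normal fitted to posterior draws with inflated variance) and all function names are assumptions made here. Writing $r(\theta) = f(y \mid \theta)\pi(\theta)/g(\theta)$, each update is $m \leftarrow \{L^{-1}\sum_l r_l/(s_L + s_M r_l/m)\}/\{M^{-1}\sum_m 1/(s_L + s_M r_m/m)\}$, which is the iteration above divided through by $g$.

```python
import math
import random

def log_unnorm_post(y, theta, mu, tau2):
    """log of f(y|theta) * pi(theta) for y_i ~ N(theta, 1), theta ~ N(mu, tau2)."""
    ll = sum(-0.5 * math.log(2 * math.pi) - 0.5 * (yi - theta) ** 2 for yi in y)
    lp = -0.5 * math.log(2 * math.pi * tau2) - 0.5 * (theta - mu) ** 2 / tau2
    return ll + lp

def bridge_log_marginal(log_q, log_g, draws_g, draws_post, n_iter=50):
    """Iterative bridge estimator with alpha = 1 / (s_L g + s_M pi(theta|y))."""
    L, M = len(draws_g), len(draws_post)
    s_L, s_M = L / (L + M), M / (L + M)
    lr_g = [log_q(t) - log_g(t) for t in draws_g]      # log r at draws from g
    lr_p = [log_q(t) - log_g(t) for t in draws_post]   # log r at posterior draws
    shift = max(lr_g + lr_p)                            # rescale r for numerical safety
    r_g = [math.exp(v - shift) for v in lr_g]
    r_p = [math.exp(v - shift) for v in lr_p]
    m = sum(r_g) / L                                    # start from the IS estimate
    for _ in range(n_iter):
        num = sum(r / (s_L + s_M * r / m) for r in r_g) / L
        den = sum(1.0 / (s_L + s_M * r / m) for r in r_p) / M
        m = num / den
    return math.log(m) + shift

y = [0.3, -0.5, 1.2, 0.8, -0.1, 0.4, 1.0, -0.7, 0.6, 0.2]  # made-up data
mu, tau2 = 0.0, 1.0
rng = random.Random(2)
post_var = 1.0 / (len(y) + 1.0 / tau2)
post_mean = post_var * (sum(y) + mu / tau2)
draws_post = [rng.gauss(post_mean, math.sqrt(post_var)) for _ in range(5000)]
# Calibrate g on the posterior draws (variance inflated by 1.5, an arbitrary choice).
g_mean = sum(draws_post) / len(draws_post)
g_var = 1.5 * sum((t - g_mean) ** 2 for t in draws_post) / len(draws_post)
draws_g = [rng.gauss(g_mean, math.sqrt(g_var)) for _ in range(5000)]

def log_q(t):
    return log_unnorm_post(y, t, mu, tau2)

def log_g(t):
    return -0.5 * math.log(2 * math.pi * g_var) - 0.5 * (t - g_mean) ** 2 / g_var

log_m_bridge = bridge_log_marginal(log_q, log_g, draws_g, draws_post)
print(log_m_bridge)
```

Posterior draws are exact here because the model is conjugate; with MCMC output the same function applies, but autocorrelation reduces the effective value of $M$.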

Page 11: Computation of the marginal likelihood

Nested sampling

6) Nested sampling (Skilling, 2006; Murray et al, 2006; Chopin & Robert, 2010)

$$Z = m(y) = \int f(y \mid \theta)\,\pi(\theta)\,d\theta = E_\pi\left[L(\theta)\right]$$

Let $x = \varphi^{-1}(l) = \Pr\left(L(\theta) > l\right)$ be the survival function of the rv $L(\theta)$, where $l = \varphi(x)$ is the (upper tail) quantile function of $L(\theta)$, so that $\varphi^{-1}(L(\theta)) \sim U(0,1)$.

Then $Z$ is the area under the curve:

$$Z = \int_0^1 \varphi(x)\,dx \quad \text{and} \quad \hat Z = \sum_{i=1}^{m} \Delta x_i\,l_i$$

with $\Delta x_i = x_{i-1} - x_i$, or $\Delta x_i = \tfrac{1}{2}(x_{i-1} - x_{i+1})$ if trapezoidal integration is used.

Page 12: Computation of the marginal likelihood

Nested sampling/Cont.

1) Draw $N$ points $\theta_{1,i}$, $i = 1,\dots,N$, from the prior; find $\theta_1 = \mathrm{Argmin}_{i=1,\dots,N}\,L(\theta_{1,i})$ and set $l_1 = L(\theta_1)$.

2) Obtain $N$ points $\theta_{2,i}$ by repeating the $\theta_{1,i}$, except that $\theta_1$ is replaced by a draw from the prior constrained by $L(\theta) > l_1$; record $\theta_2 = \mathrm{Argmin}_{i=1,\dots,N}\,L(\theta_{2,i})$ and set $l_2 = L(\theta_2)$.

3) Repeat 1 & 2 until a stopping rule is met (change in the max of $L$ below $\varepsilon$).

Since $x_i = \varphi^{-1}(l_i)$ is unknown, set

a) deterministic: $x_i = \exp(-i/N)$, so that $\ln x_i = E\left[\ln \varphi^{-1}(l_i)\right]$

or b) random: $x_{i+1} = t_i\,x_i$, $x_0 = 1$, with $t_i \sim Be(N, 1)$

Main difficulty: how to sample from the prior constrained by $L(\theta) > l_i$?

See Chopin & Robert (2010), extended importance sampling scheme:

$$\hat Z = \sum_{i=1}^{m} \Delta x_i\,\varphi_i\,w(\theta_i) \quad \text{with} \quad \tilde\pi(\theta)\,\tilde L(\theta)\,w(\theta) = \pi(\theta)\,L(\theta)$$
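The three steps, with the deterministic choice $x_i = \exp(-i/N)$, can be sketched on a toy problem in which sampling from the constrained prior is trivial. Everything specific below is an assumption made for illustration: a $U(0,1)$ prior, a likelihood $L(\theta) = \exp\{-(\theta - 0.5)^2/(2\sigma^2)\}$ with $\sigma = 0.1$, a fixed number of iterations instead of a stopping rule, and a final correction term using the remaining live points.

```python
import math
import random

SIGMA = 0.1  # width of the toy likelihood (illustrative choice)

def log_like(theta):
    """log L(theta) for the toy likelihood centred at 0.5."""
    return -0.5 * ((theta - 0.5) / SIGMA) ** 2

def draw_constrained_prior(log_l_min, rng):
    """Draw from the U(0,1) prior restricted to {theta : L(theta) > l_min}."""
    radius = SIGMA * math.sqrt(-2.0 * log_l_min)
    return rng.uniform(max(0.0, 0.5 - radius), min(1.0, 0.5 + radius))

def nested_sampling(n_live, n_iter, rng):
    """Return Z_hat = sum_i (x_{i-1} - x_i) l_i plus the live-point remainder."""
    live = [rng.random() for _ in range(n_live)]       # step 1: N prior draws
    live_ll = [log_like(t) for t in live]
    z_hat, x_prev = 0.0, 1.0
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_ll.__getitem__)
        log_l = live_ll[worst]                          # l_i = lowest likelihood
        x_i = math.exp(-i / n_live)                     # deterministic prior mass
        z_hat += (x_prev - x_i) * math.exp(log_l)
        x_prev = x_i
        live[worst] = draw_constrained_prior(log_l, rng)  # step 2: replace worst
        live_ll[worst] = log_like(live[worst])
    # Remaining prior mass x_prev times the average likelihood of the live points.
    z_hat += x_prev * sum(math.exp(v) for v in live_ll) / n_live
    return z_hat

rng = random.Random(4)
z_hat = nested_sampling(n_live=300, n_iter=3600, rng=rng)
z_exact = SIGMA * math.sqrt(2 * math.pi) * math.erf(0.5 / (SIGMA * math.sqrt(2)))
print(z_hat, z_exact)
```

In realistic models the constrained draw in `draw_constrained_prior` is the hard part, which is the difficulty flagged on the slide.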

Page 13: Computation of the marginal likelihood

Power Posteriors/basic principle

Method due to Friel & Pettitt (2008)

Lartillot & Philippe (2006): "Annealing-Melting"

Power posterior defined as

$$\pi(\theta \mid y, t) = \frac{f(y \mid \theta)^t\,\pi(\theta)}{z_t(y)}, \quad \text{where } z_t(y) = \int f(y \mid \theta)^t\,\pi(\theta)\,d\theta$$

and $t \in [0,1]$, with $t^{-1}$ equivalent to "physical temperature".

0 to 1: cooling down or "annealing"; 1 to 0: "melting"

Notice the path sampling scheme (Gelman & Meng, 1998):

$\pi(\theta \mid y, 0) = \pi(\theta)$ with $z_0(y) = 1$

$\pi(\theta \mid y, 1) = \pi(\theta \mid y)$ with $z_1(y) = m(y)$

Page 14: Computation of the marginal likelihood

PP/key result

$$\log m(y) = \int_0^1 E_{\theta \mid y, t}\left[\log f(y \mid \theta)\right]dt$$

where $\theta \mid y, t$ has density

$$\pi(\theta \mid y, t) = \frac{f(y \mid \theta)^t\,\pi(\theta)}{z_t(y)}$$

Thermodynamic integration (end of the 70's)

Ripley (1988), Ogata (1989), Neal (1993)

"Path sampling" (Gelman & Meng, 1998)

Page 15: Computation of the marginal likelihood

PP formula/proof as a special case of path sampling

If $p(\theta \mid t) = q(\theta \mid t)/z(t)$, where $z(t) = \int q(\theta \mid t)\,d\theta$,

let us label $U(\theta, t) = \dfrac{d}{dt}\ln q(\theta \mid t)$ as the potential.

One has

$$\ln\frac{z(1)}{z(0)} = \int_0^1 E_{\theta \mid t}\left[U(\theta, t)\right]dt$$

Here $p(\theta \mid t) = \pi(\theta \mid y, t)$ and $q(\theta \mid t) = f(y \mid \theta)^t\,\pi(\theta)$.

Then $U(\theta, t) = \ln f(y \mid \theta)$.
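The intermediate step behind the path sampling identity is short enough to spell out. The derivation below assumes that differentiation under the integral sign is permitted and that the prior is proper, so that $z(0) = 1$.

```latex
\begin{align*}
\frac{d}{dt}\ln z(t)
  &= \frac{1}{z(t)}\int \frac{\partial q(\theta \mid t)}{\partial t}\,d\theta
   = \int \frac{\partial \ln q(\theta \mid t)}{\partial t}\,
          \frac{q(\theta \mid t)}{z(t)}\,d\theta
   = E_{\theta \mid t}\left[U(\theta,t)\right], \\
\ln\frac{z(1)}{z(0)}
  &= \int_0^1 \frac{d}{dt}\ln z(t)\,dt
   = \int_0^1 E_{\theta \mid t}\left[U(\theta,t)\right]dt. \\
\intertext{With $q(\theta \mid t) = f(y \mid \theta)^t\,\pi(\theta)$:}
U(\theta,t) &= \frac{d}{dt}\left[t\,\ln f(y \mid \theta) + \ln\pi(\theta)\right]
             = \ln f(y \mid \theta), \qquad z(0) = 1,\quad z(1) = m(y), \\
\log m(y) &= \int_0^1 E_{\theta \mid y,t}\left[\log f(y \mid \theta)\right]dt.
\end{align*}
```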

Page 16: Computation of the marginal likelihood

PP/Example

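$y_i \mid \theta \sim_{iid} N(\theta, 1)$, $i = 1,\dots,N$

$\theta \sim N(\mu, \tau^2)$

Then $\theta \mid y, t \sim N(\mu_t, \tau_t^2)$ with

$$\mu_t = \frac{N t\,\bar y + \mu\,\tau^{-2}}{N t + \tau^{-2}}; \quad \tau_t^2 = \frac{1}{N t + \tau^{-2}}$$

With $\bar y = N^{-1}\sum_{i=1}^{N} y_i$ and $s^2 = N^{-1}\sum_{i=1}^{N}(y_i - \bar y)^2$:

$$\bar D_t = E_{\theta \mid y, t}\left[-2\log f(y \mid \theta)\right] = N\log 2\pi + N s^2 + \frac{N(\bar y - \mu)^2}{(N\tau^2 t + 1)^2} + \frac{N\tau^2}{N\tau^2 t + 1}$$

$$D_m = -2\log m(y) = \int_0^1 \bar D_t\,dt = N\log 2\pi + N s^2 + \log(N\tau^2 + 1) + \frac{N(\bar y - \mu)^2}{N\tau^2 + 1}$$

High sensitivity to $\tau^2$: as $\tau^2 \to \infty$, $\bar D_0 \to \infty$ and $D_m \to \infty$.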
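The closed forms of this example can be checked numerically: integrating $\bar D_t$ over $t$ should reproduce $D_m$. The sketch below uses made-up summary statistics and a power-law grid $t_i = (i/n)^c$; the function names and numerical settings are assumptions for illustration.

```python
import math

def mean_deviance(t, n, ybar, s2, mu, tau2):
    """E_{theta|y,t}[-2 log f(y|theta)] for y_i ~ N(theta,1), theta ~ N(mu,tau2)."""
    prec = n * t + 1.0 / tau2
    mu_t = (n * t * ybar + mu / tau2) / prec
    tau2_t = 1.0 / prec
    return n * math.log(2 * math.pi) + n * s2 + n * ((ybar - mu_t) ** 2 + tau2_t)

def marginal_deviance_exact(n, ybar, s2, mu, tau2):
    """Closed-form D_m = -2 log m(y)."""
    return (n * math.log(2 * math.pi) + n * s2 + math.log(n * tau2 + 1)
            + n * (ybar - mu) ** 2 / (n * tau2 + 1))

def marginal_deviance_pp(n, ybar, s2, mu, tau2, n_grid=2000, c=3):
    """Trapezoidal integration of the mean deviance over t_i = (i/n_grid)^c."""
    ts = [(i / n_grid) ** c for i in range(n_grid + 1)]
    ds = [mean_deviance(t, n, ybar, s2, mu, tau2) for t in ts]
    return sum(0.5 * (ts[i + 1] - ts[i]) * (ds[i + 1] + ds[i]) for i in range(n_grid))

# Made-up summary statistics: N = 10, ybar = 0.32, s2 = 0.4, prior N(0, 10).
d_exact = marginal_deviance_exact(10, 0.32, 0.4, 0.0, 10.0)
d_pp = marginal_deviance_pp(10, 0.32, 0.4, 0.0, 10.0)
print(d_exact, d_pp)
```

Increasing `tau2` in `marginal_deviance_exact` makes the marginal deviance grow through the $\log(N\tau^2 + 1)$ term, which is the sensitivity noted above.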

Page 17: Computation of the marginal likelihood

PP/Example/cont.


Page 18: Computation of the marginal likelihood

KL distance Prior-Posterior

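$$KL\left(\pi(\theta \mid y), \pi(\theta)\right) = \int \ln\frac{\pi(\theta \mid y)}{\pi(\theta)}\,\pi(\theta \mid y)\,d\theta = \int \ln\frac{f(y \mid \theta)}{m(y)}\,\pi(\theta \mid y)\,d\theta$$

$$KL = E_{\theta \mid y}\left[\ln f(y \mid \theta)\right] - \ln m(y)$$

$$2\,KL = D_m - \bar D \;\Rightarrow\; D_m = \bar D + 2\,KL \quad \text{(by-product of PP)}$$

with $\bar D = E_{\theta \mid y}\left[-2\ln f(y \mid \theta)\right]$; compare with

$$DIC = \bar D + p_D, \quad \text{where } p_D = \bar D - D(\bar\theta) \text{ is the model complexity}$$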

Page 19: Computation of the marginal likelihood

PP/partial BF

1) If $\pi(\theta)$ is improper $\Rightarrow$ the marginal $f(x)$ is also improper, resulting in problems for defining the BF

2) High sensitivity of the BF to priors (does not vanish with increasing sample size)

Idea behind the partial BF (Lempers, 1971): $y = (y_P, y_T)$

- $y_P$: learning or pilot sample to tune the prior

- $y_T$: testing sample for data analysis

Intrinsic BF (Berger & Pericchi, 1996)

Fractional BF (O'Hagan, 1995)

Page 20: Computation of the marginal likelihood

Fractional BF

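A fraction $b$ of the likelihood is used to tune the prior:

$$f(y_P \mid \theta) \approx f(y \mid \theta)^b, \quad b = m/N < 1 \quad \text{(O'Hagan, 1995)}$$

(here $m$ denotes the size of the pilot sample, not the marginal likelihood), resulting in:

$$\pi(\theta, b) \propto f(y \mid \theta)^b\,\pi(\theta)$$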

Page 21: Computation of the marginal likelihood

PP & fractional BF

$$\pi(\theta, b) \propto f(y \mid \theta)^b\,\pi(\theta)$$

$$m_F(y, b) = \int f(y \mid \theta)^{1-b}\,\pi(\theta, b)\,d\theta$$

$$m_F(y, b) = \frac{\int f(y \mid \theta)\,\pi(\theta)\,d\theta}{\int f(y \mid \theta)^b\,\pi(\theta)\,d\theta} = \frac{m(y, 1)}{m(y, b)}, \quad \text{where } m(y, t) = z_t(y)$$

PP directly provides $m_F(y, b)$ via $t = b$:

$$\log m_F(y, b) = \int_b^1 E_{\theta \mid y, t}\left[\log f(y \mid \theta)\right]dt$$

Page 22: Computation of the marginal likelihood

PP/algorithm

MCMC with discretization of $t$ on $[0,1]$:

$$0 = t_0 < t_1 < \dots < t_i < \dots < t_{n-1} < t_n = 1$$

$$t_i = (i/n)^c \text{ with } i = 1,\dots,n; \quad n = 20\text{-}100; \quad c = 2\text{-}5$$

1) Make $G$ MCMC draws $\theta^{(g,i)}$, $g = 1,\dots,G$, from $\pi(\theta \mid y, t_i)$

2) Compute

$$\hat E_i = \hat E_{\theta \mid y, t_i}\left[\log f(y \mid \theta)\right] = \frac{1}{G}\sum_{g=1}^{G}\log f(y \mid \theta^{(g,i)})$$

Often conditional independence holds: $\log f(y \mid \theta) = \sum_{j=1}^{N}\log f(y_j \mid \theta_j)$, e.g. if $\theta_j$ is the closest stochastic parent of $y_j$ (as for DIC)

3) Approximate the integral (e.g. trapezoidal rule):

$$\log \hat m(y) = \tfrac{1}{2}\sum_{i=0}^{n-1}(t_{i+1} - t_i)\left(\hat E_{i+1} + \hat E_i\right)$$

Error due to this numerical approximation: see Calderhead & Girolami (2009)

Formula for the MC sampling error: see Friel & Pettitt
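The three steps can be sketched with a plain random-walk Metropolis sampler run at each temperature. This is a minimal illustration under stated assumptions, not the OpenBUGS implementation used in the talk: the scalar normal model, the data, the fixed proposal step, the warm start from the previous temperature and every function name are choices made here.

```python
import math
import random

def temperature_schedule(n, c):
    """t_i = (i/n)^c, i = 0..n: more points near t = 0."""
    return [(i / n) ** c for i in range(n + 1)]

def metropolis_mean(log_target, stat, start, step, n_draws, burn_in, rng):
    """Random-walk Metropolis; returns the mean of stat over kept draws and the last state."""
    theta, lp = start, log_target(start)
    total = 0.0
    for k in range(burn_in + n_draws):
        prop = theta + rng.gauss(0.0, step)
        lp_prop = log_target(prop)
        if math.log(1.0 - rng.random()) < lp_prop - lp:   # accept or reject
            theta, lp = prop, lp_prop
        if k >= burn_in:
            total += stat(theta)
    return total / n_draws, theta

def power_posterior_log_marginal(log_lik, log_prior, start, rng,
                                 n=30, c=3, n_draws=2000, burn_in=200, step=0.8):
    ts = temperature_schedule(n, c)
    means, theta = [], start
    for t in ts:
        # Step 1: sample the power posterior, log density t*log f + log prior.
        def target(th, t=t):
            return t * log_lik(th) + log_prior(th)
        # Step 2: Monte Carlo estimate of E_{theta|y,t}[log f(y|theta)].
        e_i, theta = metropolis_mean(target, log_lik, theta, step, n_draws, burn_in, rng)
        means.append(e_i)
    # Step 3: trapezoidal rule over the temperature grid.
    return sum(0.5 * (ts[i + 1] - ts[i]) * (means[i + 1] + means[i]) for i in range(n))

y = [0.3, -0.5, 1.2, 0.8, -0.1, 0.4, 1.0, -0.7, 0.6, 0.2]  # made-up data

def log_lik(theta):
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (yi - theta) ** 2 for yi in y)

def log_prior(theta):  # theta ~ N(0, 1)
    return -0.5 * math.log(2 * math.pi) - 0.5 * theta ** 2

log_m_pp = power_posterior_log_marginal(log_lik, log_prior, 0.0, random.Random(5))
print(log_m_pp)
```

A single step size is used at every temperature for simplicity; since the power posterior widens as $t \to 0$, tuning the proposal per temperature would be a natural refinement.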

Page 23: Computation of the marginal likelihood

PP/Little toy example

0) $y_i \mid \lambda_i \sim_{id} \mathcal{P}(\lambda_i x_i) \;\Leftrightarrow\; f(y_i \mid \lambda_i) = \dfrac{(\lambda_i x_i)^{y_i}\exp(-\lambda_i x_i)}{y_i!}$

1) $\lambda_i \sim_{iid} \mathcal{G}(\alpha, \beta) \;\Leftrightarrow\; \pi(\lambda_i) = \dfrac{\beta^\alpha\,\lambda_i^{\alpha-1}\exp(-\beta\lambda_i)}{\Gamma(\alpha)}$

0+1) $y_i \sim_{id} \mathcal{BN}(\alpha, p_i)$, where $p_i = \beta/(\beta + x_i)$

Direct approach:

$$f(y_i) = \frac{\Gamma(y_i + \alpha)}{\Gamma(\alpha)\,y_i!}\,p_i^{\alpha}\,(1 - p_i)^{y_i}$$

$$\ln f(y) = -n\ln\Gamma(\alpha) + \sum_{i=1}^{n}\ln\Gamma(y_i + \alpha) - \sum_{i=1}^{n}\ln y_i! + \sum_{i=1}^{n}\left[\alpha\ln p_i + y_i\ln(1 - p_i)\right]$$

Indirect approach:

$$f(y) = \prod_{i=1}^{n}\int f(y_i \mid \lambda_i)\,\pi(\lambda_i)\,d\lambda_i$$

Page 24: Computation of the marginal likelihood

PP/Little toy example/cont.

Ex: Pump data (Ex#2 in WinBUGS; Carlin-Louis, p126)

$y$ = number of failures of pumps in $x$ ($10^3$ hrs)

$y = (5, 1, 5, 14, 3, 19, 1, 1, 4, 22)$; $n = 10$; $\alpha = \beta = 1$

$x = (94.3, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5)$

$D = -2\ln f(y) = 66.03$; $\hat D_{PP} = 66.28 \pm 0.03$ (20 pts)
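The direct value can be reproduced from the negative binomial formula, and the power posterior estimate can be imitated without MCMC because, for this model, each power posterior is conjugate: $\lambda_i \mid y, t \sim \mathcal{G}(\alpha + t\,y_i, \beta + t\,x_i)$. The sketch below relies on that observation; the grid ($n = 40$, $c = 4$), the number of draws and the seed are arbitrary choices made here, so the estimate need not match the 66.28 reported on the slide.

```python
import math
import random

Y = [5, 1, 5, 14, 3, 19, 1, 1, 4, 22]
X = [94.3, 15.7, 62.9, 126, 5.24, 31.4, 1.05, 1.05, 2.1, 10.5]
ALPHA, BETA = 1.0, 1.0

def deviance_direct():
    """D = -2 ln f(y) from the negative binomial marginal."""
    total = 0.0
    for y, x in zip(Y, X):
        p = BETA / (BETA + x)
        total += (math.lgamma(y + ALPHA) - math.lgamma(ALPHA) - math.lgamma(y + 1)
                  + ALPHA * math.log(p) + y * math.log(1.0 - p))
    return -2.0 * total

def mean_log_lik(t, n_draws, rng):
    """Monte Carlo estimate of E_{lambda|y,t}[log f(y|lambda)] using exact gamma draws."""
    total = 0.0
    for _ in range(n_draws):
        for y, x in zip(Y, X):
            # random.gammavariate takes (shape, scale); scale = 1 / rate.
            lam = rng.gammavariate(ALPHA + t * y, 1.0 / (BETA + t * x))
            total += y * math.log(lam * x) - lam * x - math.lgamma(y + 1)
    return total / n_draws

def deviance_power_posterior(n=40, c=4, n_draws=2000, seed=3):
    """D_hat = -2 * trapezoidal integral of E_t[log f] over t_i = (i/n)^c."""
    rng = random.Random(seed)
    ts = [(i / n) ** c for i in range(n + 1)]
    e = [mean_log_lik(t, n_draws, rng) for t in ts]
    log_m = sum(0.5 * (ts[i + 1] - ts[i]) * (e[i + 1] + e[i]) for i in range(n))
    return -2.0 * log_m

d_direct = deviance_direct()
d_pp = deviance_power_posterior()
print(round(d_direct, 2), round(d_pp, 2))
```

Because $E_t[\log f]$ is strongly curved near $t = 0$ for the pumps with large exposure $x_i$, the estimate is sensitive to how finely the grid covers small $t$.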

Page 25: Computation of the marginal likelihood

PP/Toy example in Openbugs


Page 26: Computation of the marginal likelihood

PP/Toy example in Openbugs/Cont.


Page 27: Computation of the marginal likelihood

PP/Toy example in Openbugs/Cont.


Page 28: Computation of the marginal likelihood

PP/Toy example in Openbugs/Cont.


Page 29: Computation of the marginal likelihood

Sampling both θ & t

$$\log m(y) = \int_0^1 E_{\theta \mid y, t}\left[\log f(y \mid \theta)\right]dt = \int_0^1 E_{\theta \mid y, t}\left[\frac{\log f(y \mid \theta)}{p(t \mid y)}\right]p(t \mid y)\,dt = E_{\theta, t \mid y}\left[\frac{\log f(y \mid \theta)}{p(t \mid y)}\right]$$

if we assume

$$\pi(\theta, t \mid y) \propto f(y \mid \theta)^t\,\pi(\theta)\,p(t) \;\Rightarrow\; p(t \mid y) \propto z_t(y)\,p(t)$$

Sampling $(\theta, t)$ in such conditions gives poor estimation (too few draws of $t$ close to 0).

Page 30: Computation of the marginal likelihood

Example 1: Potthoff & Roy's data

Growth measurements in 11 girls and 16 boys (Potthoff and Roy, 1964; Little and Rubin, 1987): distance from the centre of the pituitary to the pterygomaxillary fissure (unit: 10^-4 m), by age in years. NA: missing value.

Girl   8    10   12   14      Boy   8    10   12   14
1      210  200  215  230     1     260  250  290  310
2      210  215  240  255     2     215  NA   230  265
3      205  NA   245  260     3     230  225  240  275
4      235  245  250  265     4     255  275  265  270
5      215  230  225  235     5     200  NA   225  260
6      200  NA   210  225     6     245  255  270  285
7      215  225  230  250     7     220  220  245  265
8      230  230  235  240     8     240  215  245  255
9      200  NA   220  215     9     230  205  310  260
10     165  NA   190  195     10    275  280  310  315
11     245  250  280  280     11    230  230  235  250
                              12    215  NA   240  280
                              13    170  NA   260  295
                              14    225  255  255  260
                              15    230  245  260  300
                              16    220  NA   235  250

Page 31: Computation of the marginal likelihood

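Model comparison on Potthoff's data

$i$: subscript for individual, $i = 1,\dots,I = 27$ (11 girls + 16 boys)

$j$: subscript for measurement at age $t_j$ (8, 10, 12, 14 yrs)

1) Purely fixed model

$$y_{ij} = \underbrace{(\alpha_0 + \alpha x_i)}_{\text{intercept}} + \underbrace{(\beta_0 + \beta x_i)}_{\text{slope}}(t_j - 8) + e_{ij}$$

2) Random intercept model

$$y_{ij} = (\alpha_0 + \alpha x_i + a_i) + (\beta_0 + \beta x_i)(t_j - 8) + e_{ij}$$

3) Random intercept & slope model assuming independent effects

$$y_{ij} = (\alpha_0 + \alpha x_i + a_i) + (\beta_0 + \beta x_i + b_i)(t_j - 8) + e_{ij}$$

or

$$y_{ij} = \phi_{i1} + \phi_{i2}(t_j - 8) + e_{ij}, \quad e_{ij} \sim_{iid} N(0, \sigma_e^2)$$

$$\text{with } \phi_i = \begin{pmatrix}\phi_{i1}\\ \phi_{i2}\end{pmatrix} \sim N\left(\begin{pmatrix}\alpha_0 + \alpha x_i\\ \beta_0 + \beta x_i\end{pmatrix}, \begin{pmatrix}\sigma_a^2 & 0\\ 0 & \sigma_b^2\end{pmatrix}\right)$$

4) Random intercept & slope model assuming correlated effects

$$\phi_i = \begin{pmatrix}\phi_{i1}\\ \phi_{i2}\end{pmatrix} \sim N\left(\begin{pmatrix}\alpha_0 + \alpha x_i\\ \beta_0 + \beta x_i\end{pmatrix}, \begin{pmatrix}\sigma_a^2 & \sigma_{ab}\\ \sigma_{ab} & \sigma_b^2\end{pmatrix}\right)$$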

Page 32: Computation of the marginal likelihood

Model presentation:Hierarchical Bayes

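1st level: $y_{ij} \sim_{id} N(\eta_{ij}, \sigma_e^2)$ with $\eta_{ij} = \phi_{i1} + \phi_{i2}(t_j - 8)$

2nd level:

2a) $\phi_i = \begin{pmatrix}\phi_{i1}\\ \phi_{i2}\end{pmatrix} \sim N\left(\begin{pmatrix}\alpha_0 + \alpha x_i\\ \beta_0 + \beta x_i\end{pmatrix}, \Sigma\right)$, $\Sigma = \begin{pmatrix}\sigma_a^2 & \sigma_{ab}\\ \sigma_{ab} & \sigma_b^2\end{pmatrix}$

2b) $\sigma_e \sim U(0, \Delta_e)$ or $\sigma_e^2 \sim \mathrm{InvG}(1, \tilde\sigma_e^2)$

3rd level:

Fixed effects: $\alpha_0, \alpha, \beta_0, \beta \sim U(\inf, \sup)$

Var (covar) components:

- If $\sigma_{ab} = 0$, then i) $\sigma_a \sim U(0, \Delta_a)$, same for $\sigma_b \sim U(0, \Delta_b)$; or ii) $\sigma_a^2 \sim \mathrm{InvG}(1, \tilde\sigma_a^2)$, same for $\sigma_b^2 \sim \mathrm{InvG}(1, \tilde\sigma_b^2)$

- If $\sigma_{ab} \neq 0$, then i) $\sigma_a \sim U(0, \Delta_a)$, $\sigma_b \sim U(0, \Delta_b)$, $\rho \sim U(-1, 1)$; or ii) $\Omega \sim W\left((\nu\tilde\Sigma)^{-1}, \nu\right)$* for $\Omega = \Sigma^{-1}$, with $\nu = \dim(\Omega) + 1$ and $\tilde\Sigma$ a known location parameter

*Take care: WinBUGS uses another notation, i.e. $W(\nu\tilde\Sigma, \nu)$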

Page 33: Computation of the marginal likelihood

Results


Page 34: Computation of the marginal likelihood

Results/fractional priors (b=0 vs 0.125)


Page 35: Computation of the marginal likelihood

Example 2:Models of genetic differentiation

2-level hierarchical model

$i$ = locus; $j$ = (sub)population

$y_{ij}$ = number of genes carrying a given allele at locus $i$ in pop. $j$

$\alpha_{ij}$ = frequency of that allele at locus $i$ in pop. $j$

0) $y_{ij} \mid \alpha_{ij} \sim_{id} B(n_{ij}, \alpha_{ij})$

1) $\alpha_{ij} \mid \pi_i, c_j \sim_{id} \mathrm{Beta}\left(\tau_j\pi_i, \tau_j(1 - \pi_i)\right)$, where $\tau_j = \dfrac{1 - c_j}{c_j}$

$c_j$: differentiation index

$\pi_i$: frequency of that allele at locus $i$ in the gene pool

2) $\pi_i \sim_{iid} \mathrm{Beta}(a_\pi, b_\pi)$, $c_j \sim_{iid} \mathrm{Beta}(a_c, b_c)$

Migration-drift at equilibrium (Balding)

Page 36: Computation of the marginal likelihood

Ex2: Nicholson’s model

Nicholson et al (2002): same as previously, but

1) $\alpha_{ij} \mid \pi_i, c_j \sim_{id} N\left(\pi_i, c_j\,\pi_i(1 - \pi_i)\right)$

Truncated normal with masses in 0 and 1, so that $y_{ij} \mid \alpha_{ij}^* \sim_{id} B(n_{ij}, \alpha_{ij}^*)$, where $\alpha_{ij}^* = \max(0, \min(1, \alpha_{ij}))$

2) $\pi_i \sim_{iid} \mathrm{Beta}(a_\pi, b_\pi)$, $c_j \sim_{iid} \mathrm{Beta}(a_c, b_c)$

Pure drift model

Page 37: Computation of the marginal likelihood

Results


Page 38: Computation of the marginal likelihood

Conclusion

Derived from thermodynamic integration

Link with "path sampling"

Easy to understand and quite general

Well suited to complex hierarchical models

"Thetas" can be defined as the closest stochastic parents of the data, making the latter conditionally independent

Draws only from posterior distributions

Gives the fractional BF as a by-product

Easy to implement (including in OpenBUGS) but time-consuming

Caution needed in the discretization of t (close to 0)

Page 39: Computation of the marginal likelihood

Some references

Chen M, Shao Q, Ibrahim J (2000) Monte Carlo methods in Bayesian computation. Springer

Chib S (1995) Marginal likelihood from the Gibbs output. JASA, 90, 1313-1321

Chopin N, Robert CP (2010) Properties of nested sampling. Biometrika, 97, 741-755

Friel N, Pettitt AN (2008) Marginal likelihood estimation via power posteriors. JRSS, B, 70, 589-607

Frühwirth-Schnatter (2004) Estimating marginal likelihoods from mixtures & Markov switching models using bridge sampling techniques. Econometrics Journal, 7, 143-167

Gelman A, Meng X-L (1998) Simulating normalizing constants: from importance sampling to bridge sampling and path sampling. Statistical Science, 13, 163-185

Lartillot N, Philippe H (2006) Computing Bayes factors using thermodynamic integration. Systematic Biology, 55, 195-207

Marin JM, Robert CP (2009) Importance sampling methods for Bayesian discrimination between embedded models. arXiv:0910.2325v1

Meng X-L, Wong WH (1996) Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica, 6, 831-860

O'Hagan A (1995) Fractional Bayes factors for model comparison. JRSS, B, 57, 99-138

Page 40: Computation of the marginal likelihood

Acknowledgements

Nial Friel (University College Dublin) for his interest in these applications and his invaluable explanations & suggestions

Tony O'Hagan for further insight into FBF

Gilles Celeux and Mathieu Gautier as co-advisors of the Master dissertation of Yoan Soussan (Paris VI)

Christian Robert for his blog and his relevant comments, standpoints and bibliographical references

The Applibugs & Babayes groups for stimulating discussions on DIC, BF, CPO & other information criteria (AIC, BIC)