Bayesian Methods with Monte Carlo Markov Chains III


Post on 31-Dec-2015


Page 1: Bayesian Methods with Monte Carlo Markov Chains III

1

Bayesian Methods with Monte Carlo Markov Chains III

Henry Horng-Shing Lu
Institute of Statistics
National Chiao Tung University
[email protected]

http://tigpbp.iis.sinica.edu.tw/courses.htm

Page 2: Bayesian Methods with Monte Carlo Markov Chains III

2

Part 8 More Examples of Gibbs Sampling

Page 3: Bayesian Methods with Monte Carlo Markov Chains III

3

An Example with Three Random Variables (1)

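To sample (X, Y, N) as follows:

$$
(X, Y, N) \sim f(x, y, n) = c \binom{n}{x} y^{x+\alpha-1} (1-y)^{n-x+\beta-1} \frac{e^{-\lambda} \lambda^{n}}{n!},
$$

where $x = 0, 1, 2, \ldots, n$, $0 \le y \le 1$, $n = 0, 1, 2, \ldots$; $\alpha$, $\beta$, $\lambda$ are known, and $c$ is a constant.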

Page 4: Bayesian Methods with Monte Carlo Markov Chains III

4

An Example with Three Random Variables (2)

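One can see that

$$
f(x \mid y, n) \propto \binom{n}{x} y^{x} (1-y)^{n-x} \sim \text{Binomial}(n, y),
$$

$$
f(y \mid x, n) \propto y^{x+\alpha-1} (1-y)^{n-x+\beta-1} \sim \text{Beta}(x+\alpha,\; n-x+\beta),
$$

$$
f(n-x \mid x, y) \propto \frac{e^{-(1-y)\lambda} \left((1-y)\lambda\right)^{n-x}}{(n-x)!} \sim \text{Poisson}((1-y)\lambda).
$$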

Page 5: Bayesian Methods with Monte Carlo Markov Chains III

5

An Example with Three Random Variables (3)

Gibbs sampling algorithm:

1. Initial setting: t = 0,
   $y_0 \sim \text{Unif}[0,1]$ or an arbitrary value in $[0,1]$,
   $n_0 \sim \text{Discrete Unif}[1,\infty)$ or an arbitrary integer value in $[1,\infty)$,
   $x_0 \sim \text{Bin}(n_0, y_0)$.

2. Sample a value $(x_{t+1}, y_{t+1}, n_{t+1})$ from
   $y_{t+1} \sim \text{Beta}(x_t+\alpha,\; n_t-x_t+\beta)$,
   $n_{t+1} = x_t + \text{Poisson}((1-y_{t+1})\lambda)$,
   $x_{t+1} \sim \text{Bin}(n_{t+1}, y_{t+1})$.

3. t = t+1, repeat step 2 until convergence.
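The three update rules can be sketched in Python as below. This is a minimal sketch, not the code shown in the lecture: the function names, the burn-in length, the starting value for n and the hand-written binomial and Poisson samplers (used to stay within the standard library) are all assumptions. The defaults α=2, β=7, λ=16 are the values used in the lecture's R and C runs.

```python
import math
import random


def sample_binomial(rng, n, p):
    """Binomial(n, p) draw as a sum of n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sample_poisson(rng, mu):
    """Poisson(mu) draw by Knuth's multiplication method (fine for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def gibbs_xyn(num_samples, alpha=2.0, beta=7.0, lam=16.0, burn_in=1000, seed=0):
    """Gibbs sampler for (X, Y, N); returns a list of (x, y, n) draws."""
    rng = random.Random(seed)
    # Step 1: arbitrary starting point (the upper bound 50 is an assumption).
    y = rng.random()
    n = rng.randint(1, 50)
    x = sample_binomial(rng, n, y)
    draws = []
    for t in range(burn_in + num_samples):
        # Step 2: cycle through the three full conditionals.
        y = rng.betavariate(x + alpha, n - x + beta)
        n = x + sample_poisson(rng, (1.0 - y) * lam)
        x = sample_binomial(rng, n, y)
        if t >= burn_in:
            draws.append((x, y, n))
    return draws


draws = gibbs_xyn(5000)
print("mean of n:", sum(n for _, _, n in draws) / len(draws))
print("mean of y:", sum(y for _, y, _ in draws) / len(draws))
```

Summing the joint density over x shows that, marginally, N is Poisson(λ) and Y is Beta(α, β), so the sample means of n and y should settle near λ and α/(α+β). That gives a quick sanity check on the chain.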

Page 6: Bayesian Methods with Monte Carlo Markov Chains III

6

An Example with Three Random Variables by R

10000 samples with α=2, β=7 and λ=16

Page 7: Bayesian Methods with Monte Carlo Markov Chains III

7

An Example with Three Random Variables by C (1)

10000 samples with α=2, β=7 and λ=16

Page 8: Bayesian Methods with Monte Carlo Markov Chains III

8

An Example with Three Random Variables by C (2)

Page 9: Bayesian Methods with Monte Carlo Markov Chains III

9

An Example with Three Random Variables by C (3)

Page 10: Bayesian Methods with Monte Carlo Markov Chains III

10

Example 1 in Genetics (1)

Two linked loci with alleles A and a, and B and b.
A, B: dominant; a, b: recessive.
A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.

F (Female): 1-r', r' (female recombination fraction)
M (Male): 1-r, r (male recombination fraction)

[Figure: allele diagrams for the loci A/a and B/b]

Page 11: Bayesian Methods with Monte Carlo Markov Chains III

11

Example 1 in Genetics (2)

r and r' are the recombination rates for male and female.
Suppose the parental origin of these heterozygotes is from the mating of AABB × aabb. The problem is to estimate r and r' from the offspring of selfed heterozygotes.

Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92.

http://en.wikipedia.org/wiki/Genetics
http://www2.isye.gatech.edu/~brani/isyebayes/bank/handout12.pdf

Page 12: Bayesian Methods with Monte Carlo Markov Chains III

12

Example 1 in Genetics (3)

Each cell gives the offspring genotype and its probability.

FEMALE \ MALE  | AB, (1-r)/2           | ab, (1-r)/2           | aB, r/2           | Ab, r/2
AB, (1-r')/2   | AABB, (1-r)(1-r')/4   | aABb, (1-r)(1-r')/4   | aABB, r(1-r')/4   | AABb, r(1-r')/4
ab, (1-r')/2   | AaBb, (1-r)(1-r')/4   | aabb, (1-r)(1-r')/4   | aaBb, r(1-r')/4   | Aabb, r(1-r')/4
aB, r'/2       | AaBB, (1-r)r'/4       | aabB, (1-r)r'/4       | aaBB, rr'/4       | AabB, rr'/4
Ab, r'/2       | AABb, (1-r)r'/4       | aAbb, (1-r)r'/4       | aABb, rr'/4       | AAbb, rr'/4

Page 13: Bayesian Methods with Monte Carlo Markov Chains III

13

Example 1 in Genetics (4)

Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
A*: the dominant phenotype from (Aa, AA, aA).
a*: the recessive phenotype from aa.
B*: the dominant phenotype from (Bb, BB, bB).
b*: the recessive phenotype from bb.
A*B*: 9 gametic combinations.
A*b*: 3 gametic combinations.
a*B*: 3 gametic combinations.
a*b*: 1 gametic combination.
Total: 16 combinations.

Page 14: Bayesian Methods with Monte Carlo Markov Chains III

14

Example 1 in Genetics (5)

Let $\theta = (1-r)(1-r')$, then

$$
P(A^*B^*) = \frac{2+\theta}{4}, \qquad
P(A^*b^*) = P(a^*B^*) = \frac{1-\theta}{4}, \qquad
P(a^*b^*) = \frac{\theta}{4}.
$$
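These four probabilities can be checked by enumerating the 16 gametic combinations. The Python sketch below does that; the function names and the test values r = 0.1, r' = 0.3 are illustrative assumptions, not taken from the lecture.

```python
from itertools import product


def gamete_probs(rec):
    """Gamete probabilities for a double heterozygote with recombination rate rec."""
    return {"AB": (1 - rec) / 2, "ab": (1 - rec) / 2, "aB": rec / 2, "Ab": rec / 2}


def phenotype_probs(r, r_prime):
    """Sum the 16 male x female gamete combinations into the four phenotypes."""
    probs = {"A*B*": 0.0, "A*b*": 0.0, "a*B*": 0.0, "a*b*": 0.0}
    male, female = gamete_probs(r), gamete_probs(r_prime)
    for (m, pm), (f, pf) in product(male.items(), female.items()):
        # A phenotype is dominant if at least one gamete carries the dominant allele.
        a_part = "A*" if "A" in (m[0], f[0]) else "a*"
        b_part = "B*" if "B" in (m[1], f[1]) else "b*"
        probs[a_part + b_part] += pm * pf
    return probs


r, r_prime = 0.1, 0.3  # hypothetical recombination rates
theta = (1 - r) * (1 - r_prime)
probs = phenotype_probs(r, r_prime)
print(probs)
print((2 + theta) / 4, (1 - theta) / 4, (1 - theta) / 4, theta / 4)
```

The two printed lines should agree up to floating-point rounding, which confirms that the phenotype probabilities depend on r and r' only through θ.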

Page 15: Bayesian Methods with Monte Carlo Markov Chains III

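15

Example 1 in Genetics (6)

Hence, the random sample of n from the offspring of selfed heterozygotes will follow a multinomial distribution:

$$
\text{Multinomial}\left(n;\; \frac{2+\theta}{4},\; \frac{1-\theta}{4},\; \frac{1-\theta}{4},\; \frac{\theta}{4}\right).
$$

We know that $\theta = (1-r)(1-r')$, $0 \le r \le 1/2$, and $0 \le r' \le 1/2$.
So, $1 \ge \theta \ge 1/4$.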

Page 16: Bayesian Methods with Monte Carlo Markov Chains III

16

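Example 1 in Genetics (7)

Suppose that we observe the data $y = (y_1, y_2, y_3, y_4) = (125, 18, 20, 24)$, which is a random sample from

$$
\text{Multinomial}\left(n;\; \frac{2+\theta}{4},\; \frac{1-\theta}{4},\; \frac{1-\theta}{4},\; \frac{\theta}{4}\right).
$$

Then the probability mass function is

$$
f(y \mid \theta) = \frac{n!}{y_1!\, y_2!\, y_3!\, y_4!}
\left(\frac{2+\theta}{4}\right)^{y_1}
\left(\frac{1-\theta}{4}\right)^{y_2+y_3}
\left(\frac{\theta}{4}\right)^{y_4}.
$$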

Page 17: Bayesian Methods with Monte Carlo Markov Chains III

17

Example 1 in Genetics (8)

How to estimate $\theta$?

MME (shown in the last week): http://en.wikipedia.org/wiki/Method_of_moments_%28statistics%29
MLE (shown in the last week): http://en.wikipedia.org/wiki/Maximum_likelihood
Bayesian Method: http://en.wikipedia.org/wiki/Bayesian_method

Page 18: Bayesian Methods with Monte Carlo Markov Chains III

18

Example 1 in Genetics (9)

As the value of $\theta$ is between ¼ and 1, we can assume that the prior distribution of $\theta$ is Uniform(¼, 1).

The posterior distribution is

$$
f(\theta \mid y) = \frac{f(y \mid \theta)\, f(\theta)}{\int f(y \mid \theta)\, f(\theta)\, d\theta}.
$$

The integration in the above denominator,

$$
\int f(y \mid \theta)\, f(\theta)\, d\theta,
$$

does not have a convenient closed form.

Page 19: Bayesian Methods with Monte Carlo Markov Chains III

19

Example 1 in Genetics (10)

We will consider the mean of the posterior distribution (the posterior mean),

$$
E(\theta \mid y) = \int \theta\, f(\theta \mid y)\, d\theta.
$$

The Monte Carlo Markov Chains method is a good method to estimate $E(\theta \mid y) = \int \theta\, f(\theta \mid y)\, d\theta$ even if $\int f(y \mid \theta)\, f(\theta)\, d\theta$ and the posterior mean do not have closed forms.
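One standard way to estimate this posterior mean by MCMC is a Gibbs sampler with data augmentation. The Python sketch below is an assumption about how that could be done, not the lecture's own code. It swaps the Uniform(¼, 1) prior for a Beta(a, b) prior on (0, 1) and introduces a latent count z, the part of y1 attributed to the θ/4 component of the first cell probability ½ + θ/4. Then z given θ is Binomial(y1, θ/(2+θ)), and θ given z is Beta(z + y4 + a, y2 + y3 + b).

```python
import random

Y = (125, 18, 20, 24)  # observed counts from the lecture


def gibbs_posterior_mean(a=1.0, b=1.0, num_samples=10000, burn_in=1000, seed=0):
    """Posterior mean of theta under a Beta(a, b) prior via data augmentation."""
    rng = random.Random(seed)
    y1, y2, y3, y4 = Y
    theta = 0.5  # arbitrary starting value
    total = 0.0
    for t in range(burn_in + num_samples):
        # z | theta: how many of the y1 offspring fall in the theta/4 part of cell 1.
        p = theta / (2.0 + theta)
        z = sum(1 for _ in range(y1) if rng.random() < p)
        # theta | z: conjugate Beta update.
        theta = rng.betavariate(z + y4 + a, y2 + y3 + b)
        if t >= burn_in:
            total += theta
    return total / num_samples


print(gibbs_posterior_mean())          # Beta(1,1) prior
print(gibbs_posterior_mean(2.0, 2.0))  # Beta(2,2) prior
```

Neither conditional needs the normalizing integral, which is the point made above. With a Beta(1,1) prior the estimate should come out near 0.57.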

Page 20: Bayesian Methods with Monte Carlo Markov Chains III

20

Example 1 by R

Direct numerical integration when $\theta \sim U(\tfrac{1}{4}, 1)$:

We can assume other prior distributions to compare the results of posterior means: Beta(1,1), Beta(2,2), Beta(2,3), Beta(3,2), Beta(0.5,0.5), Beta(10^-5,10^-5)
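The R listing from this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of direct numerical integration for the posterior mean. The midpoint rule, the grid size and the function names are assumptions; the data and priors are the ones named on the slide.

```python
import math

Y = (125, 18, 20, 24)  # observed counts from the lecture


def log_likelihood_kernel(theta):
    """log f(y | theta) up to an additive constant."""
    y1, y2, y3, y4 = Y
    return (y1 * math.log(2.0 + theta)
            + (y2 + y3) * math.log(1.0 - theta)
            + y4 * math.log(theta))


def log_beta_prior(a, b):
    """Unnormalized log density of a Beta(a, b) prior."""
    return lambda t: (a - 1.0) * math.log(t) + (b - 1.0) * math.log(1.0 - t)


def posterior_mean(log_prior, lower, upper, num_intervals=20000):
    """Midpoint-rule estimate of E(theta | y) on (lower, upper)."""
    h = (upper - lower) / num_intervals
    grid = [lower + (i + 0.5) * h for i in range(num_intervals)]
    log_post = [log_likelihood_kernel(t) + log_prior(t) for t in grid]
    shift = max(log_post)  # subtract the maximum to avoid overflow
    weights = [math.exp(v - shift) for v in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


print(posterior_mean(lambda t: 0.0, 0.25, 1.0))       # U(1/4, 1) prior
print(posterior_mean(log_beta_prior(2, 2), 0.0, 1.0))  # Beta(2, 2) prior
```

Because the posterior mean is a ratio of two integrals, the prior and likelihood only need to be known up to constants. The midpoint grid never touches 0 or 1, so the logarithms stay finite.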

Page 21: Bayesian Methods with Monte Carlo Markov Chains III

21

Example 1 by C/C++

Other prior distributions, such as Beta(1,1), …, Beta(1e-5,1e-5), can be substituted.

Page 22: Bayesian Methods with Monte Carlo Markov Chains III

22

Beta Prior

Page 23: Bayesian Methods with Monte Carlo Markov Chains III

23

Comparison for Example 1 (1)

Method               | Estimate | Method                       | Estimate
MME                  | 0.683616 | Bayesian Beta(2,3)           | 0.564731
MLE                  | 0.663165 | Bayesian Beta(3,2)           | 0.577575
Bayesian U(¼,1)      | 0.573931 | Bayesian Beta(½,½)           | 0.574928
Bayesian Beta(1,1)   | 0.573918 | Bayesian Beta(10^-5,10^-5)   | 0.588925
Bayesian Beta(2,2)   | 0.572103 | Bayesian Beta(10^-7,10^-7)   | shown below

Page 24: Bayesian Methods with Monte Carlo Markov Chains III

24

Comparison for Example 1 (2)

Method                      | Estimate             | Method                       | Estimate
Bayesian Beta(10,10)        | 0.559905             | Bayesian Beta(10^-7,10^-7)   | 0.193891
Bayesian Beta(10^2,10^2)    | 0.520366             | Bayesian Beta(10^-7,10^-7)   | 0.400567
Bayesian Beta(10^4,10^4)    | 0.500273             | Bayesian Beta(10^-7,10^-7)   | 0.737646
Bayesian Beta(10^5,10^5)    | 0.500027             | Bayesian Beta(10^-7,10^-7)   | 0.641388
Bayesian Beta(10^n,10^n)    | → 0.5 as n → ∞       | Bayesian Beta(10^-7,10^-7)   | Not stationary

Page 25: Bayesian Methods with Monte Carlo Markov Chains III

25

Part 9 Gibbs Sampling Strategy

Page 26: Bayesian Methods with Monte Carlo Markov Chains III

26

Sampling Strategy (1)

Strategy I:
Run one chain for a long time. After some "burn-in" period, take a sample point every fixed number of steps.

The code example of Gibbs sampling in the previous lecture uses sampling strategy I.

http://www.cs.technion.ac.il/~cs236372/tirgul09.ps

[Figure: one chain; a burn-in segment followed by N samples from that chain]

Page 27: Bayesian Methods with Monte Carlo Markov Chains III

27

Sampling Strategy (2)

Strategy II:
Run the chain N times, each run for M steps. Each run starts from a different state point. Return the last state of each run.

[Figure: N chains, each with its own burn-in; the N samples are the last sample of each chain]
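The two strategies can be sketched side by side in Python. The transition kernel here is a toy assumption (a Gibbs sweep for a standard bivariate normal with correlation 0.8), chosen only so the example runs on its own; the function names, the starting-point range and the run lengths are likewise hypothetical.

```python
import math
import random

RHO = 0.8  # hypothetical correlation of the toy target


def gibbs_step(state, rng):
    """One Gibbs sweep for a standard bivariate normal with correlation RHO."""
    x, y = state
    sd = math.sqrt(1.0 - RHO * RHO)
    x = rng.gauss(RHO * y, sd)
    y = rng.gauss(RHO * x, sd)
    return (x, y)


def strategy_one(n_samples, burn_in, thin, rng, start=(0.0, 0.0)):
    """Strategy I: one long chain, burn-in once, keep every thin-th state."""
    state = start
    for _ in range(burn_in):
        state = gibbs_step(state, rng)
    samples = []
    for _ in range(n_samples):
        for _ in range(thin):
            state = gibbs_step(state, rng)
        samples.append(state)
    return samples


def strategy_two(n_samples, run_length, rng):
    """Strategy II: N independent runs of M steps, keep only each run's last state."""
    samples = []
    for _ in range(n_samples):
        state = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))  # a new start per run
        for _ in range(run_length):
            state = gibbs_step(state, rng)
        samples.append(state)
    return samples


rng = random.Random(0)
s1 = strategy_one(2000, burn_in=500, thin=5, rng=rng)
s2 = strategy_two(2000, run_length=50, rng=rng)
print(len(s1), len(s2))
```

With these settings Strategy I takes 500 + 2000 × 5 kernel steps while Strategy II takes 2000 × 50, which shows the cost of repeating the burn-in for every sample.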

Page 28: Bayesian Methods with Monte Carlo Markov Chains III

28

Sampling Strategy (3) Strategy II by R:

Page 29: Bayesian Methods with Monte Carlo Markov Chains III

29

Sampling Strategy (4) Strategy II by C/C++:

Page 30: Bayesian Methods with Monte Carlo Markov Chains III

30

Strategy Comparison

Strategy I:
Performs "burn-in" only once and saves time.
Samples might be correlated (although only weakly).

Strategy II:
Better chance of "covering" the state space, especially if the chain is slow to reach stationarity.
Must perform "burn-in" steps for each chain, and so spends more time.

Page 31: Bayesian Methods with Monte Carlo Markov Chains III

31

Hybrid Strategies (1)

Run several chains and take a few samples from each. This combines the benefits of both strategies.

[Figure: several chains, each with its own burn-in followed by N samples from that chain]
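A hybrid scheme can be sketched as follows in Python. As a stand-in for the lecture's R and C/C++ listings, which are not in the transcript, it uses a toy Gibbs kernel for a standard bivariate normal with correlation 0.8. The kernel, the function names and all run lengths are assumptions.

```python
import math
import random

RHO = 0.8  # hypothetical correlation of the toy target


def gibbs_step(state, rng):
    """One Gibbs sweep for a standard bivariate normal with correlation RHO."""
    x, y = state
    sd = math.sqrt(1.0 - RHO * RHO)
    x = rng.gauss(RHO * y, sd)
    y = rng.gauss(RHO * x, sd)
    return (x, y)


def hybrid_strategy(n_chains, per_chain, burn_in, thin, rng):
    """Run several chains; after each chain's burn-in keep per_chain thinned states."""
    samples = []
    for _ in range(n_chains):
        state = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))  # a new start per chain
        for _ in range(burn_in):
            state = gibbs_step(state, rng)
        for _ in range(per_chain):
            for _ in range(thin):
                state = gibbs_step(state, rng)
            samples.append(state)
    return samples


rng = random.Random(0)
samples = hybrid_strategy(n_chains=20, per_chain=100, burn_in=100, thin=5, rng=rng)
print(len(samples))
```

Twenty burn-ins are paid for instead of one (Strategy I) or two thousand (Strategy II), while the different starting points still spread the chains over the state space.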

Page 32: Bayesian Methods with Monte Carlo Markov Chains III

32

Hybrid Strategies (2) Hybrid Strategy by R:

Page 33: Bayesian Methods with Monte Carlo Markov Chains III

33

Hybrid Strategies (3) Hybrid Strategy by C/C++:

Page 34: Bayesian Methods with Monte Carlo Markov Chains III

34

Part 10 Metropolis-Hastings Algorithm

Page 35: Bayesian Methods with Monte Carlo Markov Chains III

35

Metropolis-Hastings Algorithm (1)

Another kind of MCMC method.
The Metropolis-Hastings algorithm can draw samples from any probability distribution π(x), requiring only that a function proportional to the density can be calculated at x.

Process in three steps:
1. Set up a Markov chain;
2. Run the chain until stationary;
3. Estimate with Monte Carlo methods.

http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm

Page 36: Bayesian Methods with Monte Carlo Markov Chains III

36

Metropolis-Hastings Algorithm (2)

Let $\pi = (\pi_1, \ldots, \pi_n)$ be a probability density (or mass) function (pdf or pmf).
$f(\cdot)$ is any function and we want to estimate

$$
I = E_\pi(f) = \sum_{i=1}^{n} f(i)\, \pi_i .
$$

Construct $P = \{P_{ij}\}$, the transition matrix of an irreducible Markov chain with states $1, 2, \ldots, n$, where

$$
P_{ij} = \Pr\{X_{t+1} = j \mid X_t = i\}, \qquad X_t \in \{1, 2, \ldots, n\},
$$

and $\pi$ is its unique stationary distribution.

Page 37: Bayesian Methods with Monte Carlo Markov Chains III

37

Metropolis-Hastings Algorithm (3)

Run this Markov chain for times $t = 1, \ldots, N$ and calculate the Monte Carlo sum

$$
\hat{I} = \frac{1}{N} \sum_{t=1}^{N} f(X_t),
$$

then

$$
\hat{I} \to I \quad \text{as } N \to \infty .
$$

Sheldon M. Ross (1997). Proposition 4.3. Introduction to Probability Models. 7th ed.

http://nlp.stanford.edu/local/talks/mcmc_2004_07_01.ppt

Page 38: Bayesian Methods with Monte Carlo Markov Chains III

38

Metropolis-Hastings Algorithm (4)

In order to perform this method for a given distribution π, we must construct a Markov chain transition matrix P with π as its stationary distribution, i.e. πP = π.

Suppose the matrix P is made to satisfy the reversibility condition that for all i and j, $\pi_i P_{ij} = \pi_j P_{ji}$.

Summing both sides over i, this property ensures that

$$
\sum_i \pi_i P_{ij} = \pi_j \quad \text{for all } j,
$$

and hence π is a stationary distribution for P.

Page 39: Bayesian Methods with Monte Carlo Markov Chains III

39

Metropolis-Hastings Algorithm (5)

Let a proposal $Q = \{Q_{ij}\}$ be irreducible, where $Q_{ij} = \Pr(X_{t+1} = j \mid X_t = i)$, and the range of Q is equal to the range of π.

But π does not have to be a stationary distribution of Q.

Process: Tweak $Q_{ij}$ to yield π.

[Figure: states from $Q_{ij}$ (not π) → Tweak → states from $P_{ij}$ (π)]

Page 40: Bayesian Methods with Monte Carlo Markov Chains III

40

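Metropolis-Hastings Algorithm (6)

We assume that $P_{ij}$ has the form

$$
P_{ij} = Q_{ij}\, \alpha(i, j) \quad (i \ne j), \qquad
P_{ii} = 1 - \sum_{j \ne i} P_{ij},
$$

where $\alpha(i, j)$ is called the acceptance probability, i.e. given $X_t = i$,

$$
\text{take } \begin{cases}
X_{t+1} = j & \text{with probability } \alpha(i, j), \\
X_{t+1} = i & \text{with probability } 1 - \alpha(i, j).
\end{cases}
$$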

Page 41: Bayesian Methods with Monte Carlo Markov Chains III

41

Metropolis-Hastings Algorithm (7)

For $\pi_i P_{ij} = \pi_j P_{ji}$,

$$
\pi_i Q_{ij}\, \alpha(i, j) = \pi_j Q_{ji}\, \alpha(j, i). \qquad (*)
$$

WLOG for some (i, j), $\pi_i Q_{ij} > \pi_j Q_{ji}$.
In order to achieve equality (*), one can introduce a probability $\alpha(i, j) < 1$ on the left-hand side and set $\alpha(j, i) = 1$ on the right-hand side.

Page 42: Bayesian Methods with Monte Carlo Markov Chains III

42

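Metropolis-Hastings Algorithm (8)

Then

$$
\pi_i Q_{ij}\, \alpha(i, j) = \pi_j Q_{ji}\, \alpha(j, i) = \pi_j Q_{ji}
\;\Rightarrow\;
\alpha(i, j) = \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}} .
$$

These arguments imply that the acceptance probability $\alpha(i, j)$ must be

$$
\alpha(i, j) = \min\left\{1,\; \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}}\right\}.
$$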
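The construction can be verified numerically on a small state space. The Python sketch below builds P from a target π and a proposal Q using the acceptance probability above, then checks the reversibility condition and πP = π. The four-state target and the non-symmetric proposal matrix are made-up illustrations.

```python
PI = [0.1, 0.2, 0.3, 0.4]  # hypothetical target distribution
Q = [                      # hypothetical irreducible, non-symmetric proposal
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.20, 0.20, 0.20],
    [0.10, 0.30, 0.30, 0.30],
    [0.25, 0.25, 0.25, 0.25],
]


def mh_transition_matrix(pi, q):
    """P_ij = Q_ij * alpha(i, j) for i != j; P_ii takes the remaining mass."""
    n = len(pi)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and q[i][j] > 0.0:
                alpha = min(1.0, pi[j] * q[j][i] / (pi[i] * q[i][j]))
                p[i][j] = q[i][j] * alpha
        p[i][i] = 1.0 - sum(p[i])
    return p


P = mh_transition_matrix(PI, Q)
n = len(PI)
# Largest violation of pi_i P_ij = pi_j P_ji over all pairs (i, j).
balance_gap = max(abs(PI[i] * P[i][j] - PI[j] * P[j][i])
                  for i in range(n) for j in range(n))
# Largest violation of (pi P)_j = pi_j.
stationary_gap = max(abs(sum(PI[i] * P[i][j] for i in range(n)) - PI[j])
                     for j in range(n))
print(balance_gap, stationary_gap)
```

Both gaps should be zero up to floating-point rounding, even though π is not stationary for Q itself.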

Page 43: Bayesian Methods with Monte Carlo Markov Chains III

43

Metropolis-Hastings Algorithm (9)

M-H Algorithm:

Step 1: Choose an irreducible Markov chain transition matrix Q with transition probability $Q_{ij}$.

Step 2: Let t = 0 and initialize $X_0$ from the states in Q.

Step 3 (Proposal Step): Given $X_t = i$, sample $Y = j$ from $Q_{iY}$.

Page 44: Bayesian Methods with Monte Carlo Markov Chains III

44

Metropolis-Hastings Algorithm (10)

M-H Algorithm (cont.):

Step 4 (Acceptance Step): Generate a random number U from Uniform(0,1).

$$
\text{If } U \le \alpha(i, j), \text{ set } X_{t+1} = Y = j; \quad
\text{else } X_{t+1} = X_t = i.
$$

Step 5: t = t+1, repeat Steps 3~5 until convergence.
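Steps 1 to 5 can be sketched for a finite state space as follows. This is an illustrative Python sketch: the unnormalized target weights, the proposal matrix, the function name and the fixed number of steps (used in place of a convergence check) are all assumptions.

```python
import random

WEIGHTS = [1.0, 2.0, 3.0, 4.0]  # hypothetical target, known only up to a constant
Q = [                           # hypothetical irreducible proposal matrix (Step 1)
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.20, 0.20, 0.20],
    [0.10, 0.30, 0.30, 0.30],
    [0.25, 0.25, 0.25, 0.25],
]


def mh_discrete(weights, q, num_steps, seed=0):
    """Run the M-H chain and return the visit frequency of each state."""
    rng = random.Random(seed)
    states = range(len(weights))
    x = rng.choice(list(states))  # Step 2: initialize X_0
    counts = [0] * len(weights)
    for _ in range(num_steps):
        # Step 3: propose Y = j from row i of Q.
        y = rng.choices(states, weights=q[x])[0]
        # Step 4: accept with probability alpha(i, j).
        alpha = min(1.0, weights[y] * q[y][x] / (weights[x] * q[x][y]))
        if rng.random() <= alpha:
            x = y
        counts[x] += 1  # Step 5: move on to the next time step
    return [c / num_steps for c in counts]


freqs = mh_discrete(WEIGHTS, Q, 50000)
print(freqs)
```

Only ratios of the weights enter the acceptance probability, so the normalizing constant of π is never needed. The visit frequencies should approach the normalized weights (0.1, 0.2, 0.3, 0.4).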

Page 45: Bayesian Methods with Monte Carlo Markov Chains III

45

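Metropolis-Hastings Algorithm (11)

An example of Steps 3~5:

[Figure: proposals Y1, Y2, Y3, ..., YN drawn from $Q_{ij}$ are tweaked into the states X1 = Y1, X2 = Y1, X3 = Y3, ..., XN of $P_{ij}$]

1. $Y_1$ from $Q_{X_0, Y}$ and

$$
\alpha(X_0, Y_1) = \min\left\{1,\; \frac{\pi(Y_1)\, Q_{Y_1, X_0}}{\pi(X_0)\, Q_{X_0, Y_1}}\right\};
$$

accepted, so $X_1 = Y_1$.

2. $Y_2$ from $Q_{X_1, Y}$ and

$$
\alpha(X_1, Y_2) = \min\left\{1,\; \frac{\pi(Y_2)\, Q_{Y_2, X_1}}{\pi(X_1)\, Q_{X_1, Y_2}}\right\};
$$

not accepted, so $X_2 = X_1 = Y_1$.

3. $Y_3$ from $Q_{X_2, Y}$ and

$$
\alpha(X_2, Y_3) = \min\left\{1,\; \frac{\pi(Y_3)\, Q_{Y_3, X_2}}{\pi(X_2)\, Q_{X_2, Y_3}}\right\};
$$

accepted, so $X_3 = Y_3$,

and so on.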

Page 46: Bayesian Methods with Monte Carlo Markov Chains III

46

Metropolis-Hastings Algorithm (12)

We may define a "rejection rate" as the proportion of times t for which $X_{t+1} = X_t$. Clearly, in choosing Q, high rejection rates are to be avoided.

Example:

[Figure: the target density π with the current state $X_t$ and a proposal Y marked]

$\pi(Y) / \pi(X_t)$ will be small and it is likely that $X_{t+1} = X_t$.
More Steps 3~5 are needed.
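The effect of the proposal on the rejection rate can be seen in a small Python experiment. The setup is an assumption made for illustration: a one-dimensional standard normal target and a symmetric uniform random-walk proposal whose half-width is varied.

```python
import math
import random


def rejection_rate(half_width, num_steps=20000, seed=0):
    """Proportion of steps with X_{t+1} = X_t for a N(0, 1) target."""
    rng = random.Random(seed)
    x = 0.0
    rejected = 0
    for _ in range(num_steps):
        y = x + rng.uniform(-half_width, half_width)  # symmetric proposal
        log_ratio = 0.5 * (x * x - y * y)             # log of pi(Y) / pi(X_t)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
        else:
            rejected += 1
    return rejected / num_steps


for width in (0.5, 1.0, 5.0, 20.0):
    print(width, rejection_rate(width))
```

Wide proposals often land far out in the tails, where π(Y)/π(X_t) is tiny, so most of them are rejected. Very narrow proposals are rarely rejected but move the chain slowly, so a low rejection rate alone does not guarantee fast convergence.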

Page 47: Bayesian Methods with Monte Carlo Markov Chains III

47

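Example (1)

Simulate a bivariate normal distribution:

$$
X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_2(\mu, \Sigma), \qquad
\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad
\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix},
$$

i.e.

$$
\pi(X) = \frac{\exp\left(-\frac{1}{2}(X-\mu)^{T} \Sigma^{-1} (X-\mu)\right)}{2\pi\, |\Sigma|^{1/2}},
$$

where the π in the denominator is the constant 3.14159….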

Page 48: Bayesian Methods with Monte Carlo Markov Chains III

48

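Example (2)

Metropolis-Hastings Algorithm:

1. $X_0 = \mu$, $i = 0$.

2. Generate $U_1 \sim U(-1, 1)$ and $U_2 \sim U(-1, 1)$ such that $U_1$ and $U_2$ are independent, then $U_i = (U_1, U_2)^{T}$.

3. $Y_i = X_i + U_i$.

4. Set
$$
X_{i+1} = \begin{cases}
Y_i & \text{w.p. } \alpha(X_i, Y_i) = \min\left\{1,\; \dfrac{\pi(Y_i)}{\pi(X_i)}\right\}, \\
X_i & \text{w.p. } 1 - \alpha(X_i, Y_i).
\end{cases}
$$

5. $i = i + 1$, repeat steps 2~4 until convergence.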
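The five steps can be sketched in Python as below. The lecture's R and C listings are not in the transcript, so this is only a sketch under assumptions: the values of μ and Σ are hypothetical (picked to sit inside the axis ranges of the check figure), and a fixed number of iterations stands in for a convergence test.

```python
import math
import random

MU = (3.0, 7.0)                   # hypothetical mean vector
SIGMA = ((1.0, 0.5), (0.5, 1.0))  # hypothetical covariance matrix


def log_density_kernel(x, mu=MU, sigma=SIGMA):
    """log pi(x) up to a constant: -0.5 * (x - mu)^T Sigma^{-1} (x - mu)."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form using the 2x2 inverse [[d, -b], [-c, a]] / det.
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return -0.5 * quad


def mh_bivariate_normal(num_steps, seed=0):
    """Random-walk M-H with independent U(-1, 1) increments in each coordinate."""
    rng = random.Random(seed)
    x = MU  # step 1: start at mu
    log_px = log_density_kernel(x)
    samples = []
    for _ in range(num_steps):
        # Steps 2 and 3: Y = X + U with U1, U2 independent U(-1, 1).
        y = (x[0] + rng.uniform(-1.0, 1.0), x[1] + rng.uniform(-1.0, 1.0))
        log_py = log_density_kernel(y)
        # Step 4: accept with probability min{1, pi(Y) / pi(X)}.
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
        samples.append(x)
    return samples


samples = mh_bivariate_normal(20000)
print(sum(s[0] for s in samples) / len(samples),
      sum(s[1] for s in samples) / len(samples))
```

The proposal is symmetric, so the Q terms cancel and the acceptance probability reduces to the density ratio. Working with log densities avoids computing the normalizing constant and guards against underflow.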

Page 49: Bayesian Methods with Monte Carlo Markov Chains III

49

Example of M-H Algorithm by R

Page 50: Bayesian Methods with Monte Carlo Markov Chains III

50

Example of M-H Algorithm by C (1)

Page 51: Bayesian Methods with Monte Carlo Markov Chains III

51

Example of M-H Algorithm by C (2)

Page 52: Bayesian Methods with Monte Carlo Markov Chains III

52

Example of M-H Algorithm by C (3)

Page 53: Bayesian Methods with Monte Carlo Markov Chains III

53

A Figure to Check Simulation Results

[Figure: scatter plot with X1 on the horizontal axis (0 to 6) and X2 on the vertical axis (4 to 10)]

Black points are simulated samples; colored points show the probability density.

Page 54: Bayesian Methods with Monte Carlo Markov Chains III

54

Exercises

Write your own programs similar to those examples presented in this talk, including Example 1 in Genetics and other examples.

Write programs for those examples mentioned at the reference web pages.

Write programs for the other examples that you know.