Bayesian Inference Presenting: Assaf Tzabari

Post on 21-Dec-2015


Agenda

- Basic concepts
- Conjugate priors
- Generalized Bayes rules
- Empirical Bayes
- Admissibility
- Asymptotic efficiency

Basic concepts

- $x$: random vector with density $f(x|\theta)$
- $\theta$: unknown parameter with prior density $\pi(\theta)$
- Joint density of $x$ and $\theta$: $h(x,\theta) = f(x|\theta)\,\pi(\theta)$
- Marginal density of $x$: $m(x) = \int h(x,\theta)\,d\theta$
- Posterior density of $\theta$: $\pi(\theta|x) = \dfrac{h(x,\theta)}{m(x)}$

Basic concepts (cont.)

Elements of a decision problem:

- $A$: the set of all possible decisions
- $L(\theta,a)$: loss function defined for all $(\theta,a) \in \Theta \times A$
- $\delta(x): X \to A$: decision rule

Risk function: $R(\theta,\delta) = E_{x|\theta}\left[L(\theta,\delta(x))\right]$

Bayes risk function: $r(\pi,\delta) = E^{\pi}\left[R(\theta,\delta)\right]$

Basic concepts (cont.)

A Bayes rule is a decision rule which minimizes $r(\pi,\delta)$.

A Bayes rule can be found by choosing, for each $x$, an action which minimizes the posterior expected loss:

$\int L(\theta,a)\,\pi(\theta|x)\,d\theta$

or, equivalently, which minimizes:

$\int L(\theta,a)\,f(x|\theta)\,\pi(\theta)\,d\theta$

Basic concepts (cont.)

Example: Bayesian estimation under MSE

$L(\theta,a) = (\theta-a)^2$

The posterior expected loss is

$E^{\pi(\theta|x)}\left[(\theta-a)^2\right] = \int (\theta-a)^2\,\pi(\theta|x)\,d\theta$

Setting the derivative with respect to $a$ to zero:

$\dfrac{d}{da}\int (\theta-a)^2\,\pi(\theta|x)\,d\theta = 0 \;\Rightarrow\; 2a - 2E^{\pi(\theta|x)}[\theta] = 0 \;\Rightarrow\; \delta^{\pi}(x) = E^{\pi(\theta|x)}[\theta]$
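The claim that the posterior mean minimizes the posterior expected squared-error loss can be checked numerically. The following is a minimal sketch, assuming a made-up four-point posterior (the grid and the probabilities are illustrative and not taken from the slides); it compares the posterior mean with the result of a brute-force search over candidate actions.

```python
# Discrete posterior over a few theta values (illustrative numbers only).
thetas = [0.0, 1.0, 2.0, 3.0]
posterior = [0.1, 0.4, 0.3, 0.2]  # probabilities, sum to 1


def posterior_expected_loss(a):
    """Posterior expected squared-error loss of taking action a."""
    return sum(p * (t - a) ** 2 for t, p in zip(thetas, posterior))


# Closed-form minimizer derived on the slide: the posterior mean.
posterior_mean = sum(p * t for t, p in zip(thetas, posterior))

# Brute-force minimizer over a fine grid of candidate actions in [0, 3].
candidates = [i / 1000 for i in range(0, 3001)]
best_action = min(candidates, key=posterior_expected_loss)

print(posterior_mean, best_action)
```

Both numbers should agree up to the grid resolution, which is what the derivative argument predicts.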

Conjugate priors

Definition: A class $P$ of prior distributions is a conjugate family for $F = \{f(x|\theta)\}$ if $\pi(\theta|x) \in P$ for all $\pi \in P$ and $f \in F$.

Example: the class of normal priors is a conjugate family for the class of normal sample densities,

$\pi(\theta) = N(\mu,\tau^2),\; f(x|\theta) = N(\theta,\sigma^2) \;\Rightarrow\; \pi(\theta|x) = N(\mu(x),\rho^{-1}),$

where $\mu(x) = \dfrac{\sigma^2}{\sigma^2+\tau^2}\,\mu + \dfrac{\tau^2}{\sigma^2+\tau^2}\,x$ and $\rho = \dfrac{\sigma^2+\tau^2}{\sigma^2\tau^2}$
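The normal example can be sketched in a few lines. This is a minimal illustration, assuming a single observation x ~ N(theta, sigma2) and a prior theta ~ N(mu, tau2); the function name `normal_posterior` and the numbers are hypothetical choices, not part of the slides.

```python
def normal_posterior(x, sigma2, mu, tau2):
    """Posterior (mean, variance) of theta for x ~ N(theta, sigma2), theta ~ N(mu, tau2)."""
    # Posterior mean: precision-weighted average of the prior mean and the observation.
    mean = (sigma2 * mu + tau2 * x) / (sigma2 + tau2)
    # Posterior precision is the sum of the prior and sampling precisions.
    precision = 1.0 / sigma2 + 1.0 / tau2
    return mean, 1.0 / precision


mean, var = normal_posterior(x=2.0, sigma2=1.0, mu=0.0, tau2=1.0)
print(mean, var)
```

With equal prior and sampling variances the posterior mean lands halfway between the prior mean and the observation, and the posterior variance is half of either.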

Using conjugate priors

Step 1: Find a conjugate prior

Choose a class with the same form as the likelihood functions $l_x(\theta) = f(x|\theta)$

Step 2: Calculate the posterior

Gather the factors involving $\theta$ in $h(x,\theta)$

Using conjugate priors (cont.)

Example: Finding conjugate prior for the Poisson distribution

$x = (x_1,\dots,x_n)$ where $x_i \sim P(\theta)$ are iid,

$f(x|\theta) = \prod_{i=1}^{n} \dfrac{\theta^{x_i} e^{-\theta}}{x_i!} = \dfrac{\theta^{n\bar{x}} e^{-n\theta}}{\prod_{i=1}^{n} x_i!} \;\Rightarrow\; l_x(\theta) \propto \theta^{n\bar{x}} e^{-n\theta}$

Factors fit to gamma distribution of $\theta$

Assuming $\theta \sim G(a,b)$ we get:

$\pi(\theta) = c(a,b)\,\theta^{a-1} e^{-\theta/b}$

$h(x,\theta) = f(x|\theta)\,\pi(\theta) = c(a,b,x)\,\theta^{n\bar{x}+a-1} e^{-\theta(n+1/b)} = \pi(\theta|x)\,m(x)$

The posterior is $\pi(\theta|x) \sim G\!\left(n\bar{x}+a,\ \dfrac{1}{n+1/b}\right)$

Using conjugate priors (cont.)

Example (cont.): Finding conjugate prior for the Poisson distribution

$\pi(\theta|x) \sim G(\tilde{a},\tilde{b})$ where $\tilde{a} = n\bar{x}+a$ and $\tilde{b} = \dfrac{1}{n+1/b}$

The Bayes estimator under MSE is then,

$\delta^{\pi}(x) = E^{\pi(\theta|x)}[\theta] = \tilde{a}\,\tilde{b} = \dfrac{n\bar{x}+a}{n+1/b}$

The ML estimator is $\delta_{ML}(x) = \bar{x}$

[Figure: Gamma densities $G(a,b)$ on $0 \le \theta \le 10$ for $(a,b) = (1,2),\ (2,2),\ (3,2),\ (10,0.5)$]
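The Poisson example can be sketched as follows, assuming the gamma prior is parameterized by shape a and scale b (density proportional to theta^(a-1) * exp(-theta/b)), so the posterior is a gamma with shape n*xbar + a and scale 1/(n + 1/b). The function name and the data are illustrative assumptions.

```python
def gamma_poisson_update(xs, a, b):
    """Posterior (shape, scale) for iid Poisson counts xs under a Gamma(shape=a, scale=b) prior."""
    n = len(xs)
    a_post = sum(xs) + a            # n * xbar + a
    b_post = 1.0 / (n + 1.0 / b)    # 1 / (n + 1/b)
    return a_post, b_post


xs = [3, 5, 4, 6]                   # illustrative counts
a_post, b_post = gamma_poisson_update(xs, a=2.0, b=2.0)

bayes_estimate = a_post * b_post    # posterior mean = Bayes estimator under MSE
ml_estimate = sum(xs) / len(xs)     # sample mean = ML estimator
prior_mean = 2.0 * 2.0              # mean of Gamma(shape=2, scale=2)

print(prior_mean, bayes_estimate, ml_estimate)
```

In this run the Bayes estimate falls between the prior mean and the ML estimate, which is the usual shrinkage behaviour of a conjugate update.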

Using conjugate priors (cont.)

More conjugate priors for common statistical distributions:

Binomial $x \sim b(p,n)$ and Beta prior

$\pi(p) = B(a,b) \;\Rightarrow\; \pi(p|x) = B(a+x,\ b+n-x)$

where $E^{\pi(p)}[p] = \dfrac{a}{a+b}$ and $E^{\pi(p|x)}[p] = \dfrac{a+x}{a+b+n}$

[Figure: Beta densities $B(a,b)$ on $0 \le p \le 1$ for $(a,b) = (2,2),\ (0.5,0.5),\ (2,5),\ (5,2)$]

Using conjugate priors (cont.)

Uniform iid $x = (x_1,\dots,x_n)$, $x_i \sim U(0,\theta)$ and Pareto prior

$\pi(\theta) = Pa(\theta_0, a) \;\Rightarrow\; \pi(\theta|x) = Pa\big(\max\{\theta_0, x_1,\dots,x_n\},\ a+n\big)$

where $E^{\pi(\theta)}[\theta] = \dfrac{a\,\theta_0}{a-1}$ and $E^{\pi(\theta|x)}[\theta] = \dfrac{(a+n)\max\{\theta_0, x_1,\dots,x_n\}}{a+n-1}$

[Figure: Pareto densities on $0 \le \theta \le 4$ for $a = 1, 2, 3$]
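A minimal sketch of the uniform/Pareto update, assuming the Pareto prior is written Pa(theta0, a) with scale theta0 and shape a, so that the posterior is Pa(max(theta0, x_1, ..., x_n), a + n). The helper names and the data are hypothetical.

```python
def pareto_uniform_update(xs, theta0, a):
    """Posterior (scale, shape) for iid U(0, theta) data xs under a Pa(theta0, a) prior."""
    return max(theta0, max(xs)), a + len(xs)


def pareto_mean(theta0, a):
    """Mean of Pa(theta0, a); it is finite only for shape a > 1."""
    if a <= 1:
        raise ValueError("the Pareto mean is infinite for a <= 1")
    return a * theta0 / (a - 1)


xs = [0.8, 2.5, 1.7]                # illustrative observations
theta0_post, a_post = pareto_uniform_update(xs, theta0=1.0, a=2.0)
posterior_mean = pareto_mean(theta0_post, a_post)

print(theta0_post, a_post, posterior_mean)
```

The posterior scale can only move up to the largest observation, since theta cannot be smaller than any observed x_i.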

Conjugate priors (cont.)

Advantages:

- Easy to calculate $\pi(\theta|x)$
- Intuitive
- Useful for sequential estimation

Can a conjugate prior be a reasonable approximation to the true prior?

Not always!
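The point about sequential estimation can be illustrated with the binomial/Beta pair from the earlier slide: updating batch by batch, with each posterior serving as the next prior, gives the same result as a single update on the pooled data. This is a sketch under that assumption; the batches are made up.

```python
def beta_update(a, b, successes, trials):
    """Beta(a, b) prior plus binomial data gives a Beta(a + x, b + n - x) posterior."""
    return a + successes, b + trials - successes


batches = [(3, 10), (7, 10), (4, 10)]   # (successes, trials), illustrative

# Sequential: the posterior after each batch becomes the prior for the next one.
a, b = 2.0, 2.0
for x, n in batches:
    a, b = beta_update(a, b, x, n)

# Batch: a single update with the pooled data.
total_x = sum(x for x, _ in batches)
total_n = sum(n for _, n in batches)
a_batch, b_batch = beta_update(2.0, 2.0, total_x, total_n)

print((a, b), (a_batch, b_batch))
```

Only the two hyperparameters need to be stored between batches, which is what makes conjugate families convenient here.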

Conjugate priors (cont.)

Example: estimating $\theta$ under MSE based on $x \sim N(\theta,1)$

Step 1: subjectively determine $\alpha$-fractiles. A point $z(\alpha)$ is called an $\alpha$-fractile if $F^{\pi}(z(\alpha)) = \alpha$, where $F^{\pi}$ is the prior distribution function

Step 2: look for a matching prior and find the Bayes estimator

If the median is $z(\tfrac12) = 0$ and the quartiles are $z(\tfrac14) = -1$, $z(\tfrac34) = 1$, there are two matching priors: $\pi_N = N(0, 2.19)$ or $\pi_C = C(0,1)$

$\delta^{N}(x) = x - \dfrac{x}{3.19}$ while $\delta^{C}(x) \approx x - \dfrac{2x}{1+x^2}$ for large $|x|$

Only $\pi_N$ is a conjugate prior, but which is the better estimator?
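The two candidate priors can be compared numerically. The sketch below assumes x ~ N(theta, 1) and computes the posterior mean under a N(0, 2.19) prior and under a standard Cauchy C(0, 1) prior by simple midpoint-rule integration; the integration window and step count are arbitrary choices made for illustration.

```python
import math


def posterior_mean(x, prior_density, half_width=12.0, steps=24000):
    """Posterior mean of theta for x ~ N(theta, 1), integrating over [x - w, x + w]."""
    h = 2 * half_width / steps
    num = den = 0.0
    for i in range(steps):
        theta = x - half_width + (i + 0.5) * h
        # Unnormalized posterior weight: likelihood times prior density.
        w = math.exp(-0.5 * (x - theta) ** 2) * prior_density(theta)
        num += theta * w
        den += w
    return num / den


def normal_prior(theta, var=2.19):
    return math.exp(-0.5 * theta ** 2 / var) / math.sqrt(2 * math.pi * var)


def cauchy_prior(theta):
    return 1.0 / (math.pi * (1.0 + theta ** 2))


for x in (0.0, 2.0, 10.0):
    print(x, posterior_mean(x, normal_prior), posterior_mean(x, cauchy_prior))
```

For an observation far from the prior median, the normal prior keeps shrinking x by a fixed proportion, while the heavy-tailed Cauchy prior leaves it almost unchanged. This is one way to see that the two priors, although they share the same quartiles, lead to quite different estimators.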

Improper priors

Improper prior: a prior with infinite mass, $\int \pi(\theta)\,d\theta = \infty$

- Bayes risk has no meaning
- The posterior function usually exists

Useful in the following cases:

- Prior information is not available (noninformative priors are usually improper)
- The parameter space is restricted

Generalized Bayes rules

Definition: If $\pi(\theta)$ is an improper prior, a generalized Bayes rule, for given $x$, is an action which minimizes

$\int L(\theta,a)\,f(x|\theta)\,\pi(\theta)\,d\theta$

or, if $0 < m(x) < \infty$, which minimizes the posterior expected loss.

Example: estimating $\theta > 0$ under MSE based on $x \sim N(\theta,\sigma^2)$

Noninformative (improper) prior: $\pi(\theta) = I_{(0,\infty)}(\theta) \;\Rightarrow\; \pi(\theta|x) = \dfrac{e^{-(\theta-x)^2/2\sigma^2}\, I_{(0,\infty)}(\theta)}{\int_0^{\infty} e^{-(\theta-x)^2/2\sigma^2}\,d\theta}$

$\delta^{\pi}(x) = E^{\pi(\theta|x)}[\theta] = x + \dfrac{\sigma\,(2\pi)^{-1/2}\, e^{-x^2/2\sigma^2}}{P(Z \ge -x/\sigma)}$ where $Z \sim N(0,1)$
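The generalized Bayes estimator for a positive normal mean is the mean of a normal distribution N(x, sigma^2) truncated to (0, infinity). A minimal sketch, using only `math.erfc` for the normal tail probability; the function name is a hypothetical choice and the formula is numerically safe only for moderate values of x/sigma.

```python
import math


def generalized_bayes_positive_mean(x, sigma):
    """Posterior mean of theta > 0 for x ~ N(theta, sigma^2) under the flat prior on (0, inf)."""
    z = x / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # standard normal density at x/sigma
    tail = 0.5 * math.erfc(-z / math.sqrt(2))               # P(Z >= -x/sigma)
    # For very negative z both pdf and tail underflow; this sketch does not guard against that.
    return x + sigma * pdf / tail


for x in (-3.0, 0.0, 3.0):
    print(x, generalized_bayes_positive_mean(x, sigma=1.0))
```

The estimate stays strictly positive even when x is negative, and it approaches x when x is large relative to sigma.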

Generalized Bayes rules (cont.)

[Figure: the generalized Bayes estimator and the ML estimator plotted against $x$ from $-10$ to $5$ (vertical axis 0 to 6). The Bayes estimator is drawn for more than one parameter value, of which only the value 2 is legible]

Generalized Bayes rules (cont.)

Generalized Bayes rules are useful in solving problems which don't include prior information

Example: Location parameter estimation under $L(a-\theta)$

$f(x|\theta)$ is a location density with location parameter $\theta$ if $f(x|\theta) = f(x-\theta)$

Using $\pi(\theta) = 1$ we get,

$\pi(\theta|x) = f(x-\theta)$

$E^{\pi(\theta|x)}\left[L(a-\theta)\right] = \int L(y+a-x)\,f(y)\,dy = E^{f(y)}\left[L(y-K)\right]$

where $y = x-\theta$ and $K = x-a$

Generalized Bayes rules (cont.)

Example (cont.): Location parameter estimation under $L(a-\theta)$

The generalized Bayes rule has the form,

$\delta^{\pi}(x) = x - K$ where $K$ minimizes $E^{f(y)}\left[L(y-K)\right]$

This is a group of invariant rules, and the best invariant rule is the generalized Bayes rule with the prior $\pi(\theta) = 1$

Generalized Bayes rules (cont.)

Example (cont.): Location parameter estimation under $L(a-\theta)$

Under MSE $\delta^{\pi}(x)$ is the posterior mean,

$\delta^{\pi}(x) = \int \theta\, f(x-\theta)\,d\theta = x - E^{f(y)}[y]$

For $x = (x_1,\dots,x_n)$, Pitman's estimator is derived:

$\delta^{\pi}(x) = \dfrac{\int \theta\, f(x_1-\theta,\dots,x_n-\theta)\,d\theta}{\int f(x_1-\theta,\dots,x_n-\theta)\,d\theta}$
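Pitman's estimator is a ratio of two integrals and can be approximated numerically. The sketch below assumes independent observations, so the joint density factorizes as a product of f(x_i - theta), and uses a midpoint rule on a window centred at the sample mean; the window, the step count and the two example densities are illustrative assumptions.

```python
import math


def pitman_estimate(xs, density, half_width=15.0, steps=30000):
    """Ratio of the integrals of theta * prod f(x_i - theta) and prod f(x_i - theta)."""
    center = sum(xs) / len(xs)
    h = 2 * half_width / steps
    num = den = 0.0
    for i in range(steps):
        theta = center - half_width + (i + 0.5) * h
        w = 1.0
        for x in xs:
            w *= density(x - theta)     # joint density of an iid location sample
        num += theta * w
        den += w
    return num / den


def std_normal(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)


def std_laplace(y):
    # Double exponential density, used here only as a non-normal illustration.
    return 0.5 * math.exp(-abs(y))


xs = [1.2, -0.4, 2.1, 0.7, 5.0]
print(pitman_estimate(xs, std_normal), sum(xs) / len(xs))
print(pitman_estimate(xs, std_laplace))
```

For the normal density the Pitman estimate coincides with the sample mean, while for a heavier-tailed density such as the Laplace it is pulled less by the outlying observation.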

Empirical Bayes

Development of Bayes rules using auxiliary empirical (past or current) data

Methods:

- Using past data in constructing the prior
- Using past data in estimating the marginal distribution $m(x)$
- Dealing simultaneously with several decision problems

$x_{n+1}$: sample information with density $f(x_{n+1}|\theta_{n+1})$

$x_1,\dots,x_n$: past observations with densities $f(x_i|\theta_i)$

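Determination of the prior from past data

Assumption: $\theta_1,\dots,\theta_n,\theta_{n+1}$ are parameters from a common prior $\pi(\theta)$

$\mu_f(\theta),\ \sigma_f^2(\theta)$: conditional mean and variance of $x_i$

$\mu_m,\ \sigma_m^2$: marginal mean and variance of $x_i$

Lemma 1: $\mu_m = E^{\pi}[\mu_f(\theta)]$ and $\sigma_m^2 = E^{\pi}[\sigma_f^2(\theta)] + E^{\pi}\left[(\mu_f(\theta)-\mu_m)^2\right]$

Result 1: If $\mu_f(\theta) = \theta$ and $\sigma_f^2(\theta) = \sigma_f^2$ is constant, then $\mu_m = \mu_\pi$ and $\sigma_m^2 = \sigma_f^2 + \sigma_\pi^2$, where $\mu_\pi$ and $\sigma_\pi^2$ are the prior mean and variance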

Determination of the prior from past data (cont.)

Step 1: Assume a certain functional form for $\pi(\theta)$. A conjugate family of priors is convenient

Step 2: Estimate $\mu_\pi$, $\sigma_\pi^2$ based on $x_1,\dots,x_n$ ($x_{n+1}$ can be included too)

If $\mu_f(\theta) = \theta$ and $\sigma_f^2$ is constant then:

Step 2a: Estimate $\mu_m$, $\sigma_m^2$ from the data, e.g. $\hat{\mu}_m = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ and $\hat{\sigma}_m^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i-\hat{\mu}_m)^2$

Step 2b: Use Result 1 to calculate $\mu_\pi$, $\sigma_\pi^2$

Determination of the prior from past data (cont.)

Example: $x_i \sim N(\theta_i, 1)$ and $\pi(\theta)$ is assumed to be normal (conjugate prior). Estimation of $\mu_\pi$ and $\sigma_\pi^2$ is needed for determining the prior.

$\hat{\mu}_m = \bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ and $\hat{\sigma}_m^2 = s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i-\bar{x})^2$

$\hat{\mu}_\pi = \bar{x}$ and $\hat{\sigma}_\pi^2 = (s^2-1)^{+}$ where $(s^2-1)^{+} = s^2-1$ if $s^2 > 1$ and $0$ otherwise

The prior is $N(\hat{\mu}_\pi, \hat{\sigma}_\pi^2)$ and the Bayes estimate of $\theta_{n+1}$ under MSE is

$\delta(x_{n+1}) = E^{\pi(\theta_{n+1}|x_{n+1})}[\theta_{n+1}] = x_{n+1} - \dfrac{x_{n+1}-\bar{x}}{1+\hat{\sigma}_\pi^2} = x_{n+1} - \min\left\{1, \dfrac{1}{s^2}\right\}(x_{n+1}-\bar{x})$
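The normal empirical Bayes recipe can be sketched directly: estimate the marginal mean and variance from past data, subtract the known sampling variance of 1 to get the prior variance (truncated at zero), and shrink the new observation toward the sample mean. The function name and the data below are illustrative assumptions.

```python
def empirical_bayes_normal(past, x_new):
    """Empirical Bayes estimate of theta_{n+1} for x_i ~ N(theta_i, 1) with a normal prior."""
    n = len(past)
    xbar = sum(past) / n                                # estimated marginal (and prior) mean
    s2 = sum((x - xbar) ** 2 for x in past) / (n - 1)   # estimated marginal variance
    prior_var = max(s2 - 1.0, 0.0)                      # (s^2 - 1)^+, estimated prior variance
    shrink = 1.0 / (1.0 + prior_var)                    # equals min(1, 1 / s^2)
    return x_new - shrink * (x_new - xbar)


past = [1.0, 3.0, 5.0, 7.0, 9.0]    # illustrative past observations
print(empirical_bayes_normal(past, x_new=10.0))

tight = [5.0, 5.1, 4.9]             # spread below the sampling noise, so s^2 < 1
print(empirical_bayes_normal(tight, x_new=10.0))
```

When the past observations vary much more than the sampling noise, the new observation is barely shrunk; when they vary less, the estimated prior variance is zero and the estimate collapses to the sample mean.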

Estimation of marginal distribution from past data

Assumption: The Bayes rule can be represented in terms of $m(x)$

Advantage: No need to estimate the prior

Step 1: Estimate $m(x)$. $x_1,\dots,x_n,x_{n+1}$ are a sample from the distribution with density $m(x)$

E.g. in the discrete case, $\hat{m}(j) = \dfrac{\text{the number of } x_i \text{ equal to } j}{n+1}$

Step 2: Estimate the Bayes rule, using $\hat{m}(j)$

Estimation of marginal distribution from past data (cont.)

Example: The Bayes estimation of $\theta_{n+1}$ when $x_i \sim P(\theta_i)$, under MSE.

$\delta^{\pi}(x_{n+1}) = E^{\pi(\theta_{n+1}|x_{n+1})}[\theta_{n+1}] = \dfrac{\int \theta\, f(x_{n+1}|\theta)\,\pi(\theta)\,d\theta}{m(x_{n+1})}$

$= \dfrac{\int \theta^{x_{n+1}+1} e^{-\theta}\,\pi(\theta)\,d\theta}{x_{n+1}!\; m(x_{n+1})} = \dfrac{(x_{n+1}+1)\, m(x_{n+1}+1)}{m(x_{n+1})}$

and $\hat{\delta}(x_{n+1}) = \dfrac{(x_{n+1}+1)\,\hat{m}(x_{n+1}+1)}{\hat{m}(x_{n+1})}$
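The Poisson example needs only empirical frequencies, never the prior itself. A minimal sketch, assuming the marginal is estimated by the relative frequency of each count among x_1, ..., x_n, x_{n+1}; the function name and the counts are illustrative.

```python
from collections import Counter


def empirical_bayes_poisson(past, x_new):
    """Estimate theta_{n+1} as (x + 1) * m_hat(x + 1) / m_hat(x) with x = x_new."""
    sample = list(past) + [x_new]       # x_1..x_n and x_{n+1} all come from the marginal m
    counts = Counter(sample)
    size = len(sample)                  # n + 1

    def m_hat(j):
        return counts[j] / size         # relative frequency of the value j

    # m_hat(x_new) is never zero because x_new itself is part of the sample.
    return (x_new + 1) * m_hat(x_new + 1) / m_hat(x_new)


past = [0, 1, 1, 2, 2, 2, 3, 3, 4]      # illustrative past counts
print(empirical_bayes_poisson(past, x_new=2))
```

If no observation equals x_new + 1 the estimate is zero, which shows that this plug-in rule can be erratic for small samples.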

Compound decision problems

Independent $x_1,\dots,x_n$ are observed, where the $\theta_i$ are from a common prior $\pi(\theta)$

Goal: simultaneously make decisions involving $\theta_1,\dots,\theta_n$

The loss is $L(\theta_1,\dots,\theta_n,a)$

Solution: Determine the prior from $x_1,\dots,x_n$ using empirical Bayes methods

Admissibility of Bayes rules

Bayes rules with finite Bayes risk are typically admissible:

- If a Bayes rule $\delta^{\pi}$ is unique then it is admissible. E.g. under MSE the Bayes rule is unique. Proof: any rule R-better than $\delta^{\pi}$ must be a Bayes rule itself
- For discrete $\theta$, assuming that $\pi$ is positive, $\delta^{\pi}$ is admissible
- For continuous $\theta$, if $R(\theta,\delta)$ is continuous in $\theta$ for every $\delta$ then $\delta^{\pi}$ is admissible

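Admissibility of Bayes rules (cont.)

Generalized Bayes rules can be inadmissible and the verification of their admissibility can be difficult.

Example: generalized Bayes estimator of $\boldsymbol{\theta}$ based on $\mathbf{x} \sim N_p(\boldsymbol{\theta}, \mathbf{I}_p)$ versus the James-Stein estimator

$\mathbf{x} = (x_1,\dots,x_p)^t,\quad \boldsymbol{\theta} = (\theta_1,\dots,\theta_p)^t,\quad \pi(\boldsymbol{\theta}) = 1,\quad L(\boldsymbol{\theta},\mathbf{a}) = \sum_{i=1}^{p} (\theta_i-a_i)^2$

$\pi(\boldsymbol{\theta}|\mathbf{x}) = N_p(\mathbf{x}, \mathbf{I}_p),\quad \delta^{\pi}(\mathbf{x}) = E^{\pi(\boldsymbol{\theta}|\mathbf{x})}[\boldsymbol{\theta}] = \mathbf{x}$

$\delta^{\pi}$ is identical to the ML estimator

$R(\boldsymbol{\theta},\delta^{\pi}) = E_{\mathbf{x}|\boldsymbol{\theta}}\left[\sum_{i=1}^{p} (x_i-\theta_i)^2\right] = \sum_{i=1}^{p} E\left[(x_i-\theta_i)^2\right] = p$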

Admissibility of Bayes rules (cont.)

Example (cont.): generalized Bayes estimator of $\boldsymbol{\theta}$ versus the James-Stein estimator

The James-Stein estimator is $\delta^{JS}(\mathbf{x}) = \left(1 - \dfrac{p-2}{\|\mathbf{x}\|^2}\right)\mathbf{x}$

And its risk is $R(\boldsymbol{\theta},\delta^{JS}) = p - (p-2)^2\, E_{\mathbf{x}|\boldsymbol{\theta}}\!\left[\dfrac{1}{\|\mathbf{x}\|^2}\right] < p$

For $p > 2$, $\delta^{\pi}$ is not admissible
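The risk comparison can be illustrated by simulation. The sketch below assumes x ~ N_p(theta, I_p) with a made-up true mean vector, and estimates by Monte Carlo the summed squared-error risk of the estimator delta(x) = x and of the James-Stein estimator (1 - (p - 2)/|x|^2) x; the dimension, mean vector, trial count and seed are arbitrary choices.

```python
import random


def squared_error(estimate, theta):
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def james_stein(x):
    """James-Stein estimator (1 - (p - 2) / |x|^2) * x."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    factor = 1.0 - (p - 2) / norm2
    return [factor * v for v in x]


def simulate_risks(theta, trials=20000, seed=0):
    """Monte Carlo risk estimates for delta(x) = x and for the James-Stein estimator."""
    rng = random.Random(seed)
    ml_total = js_total = 0.0
    for _ in range(trials):
        x = [rng.gauss(t, 1.0) for t in theta]
        ml_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return ml_total / trials, js_total / trials


theta = [1.0] * 10      # illustrative true mean vector, p = 10
ml_risk, js_risk = simulate_risks(theta)
print(ml_risk, js_risk)
```

The simulated risk of delta(x) = x should be close to p, and the James-Stein risk should come out smaller, in line with the risk formula on the slide.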

Admissibility of Bayes rules (cont.)

$f(x|\theta)$ is a p-dimensional exponential density if,

$f(x|\theta) = h(x)\exp\left\{\sum_{i=1}^{p} \theta_i T_i(x) - B(\theta)\right\}$

E.g. the normal distribution

Theorem: If $x$ is continuous with p-dimensional exponential density and $\Theta$ is closed, then any admissible estimator is a generalized Bayes rule

Asymptotic efficiency of Bayes estimators

$x_1,\dots,x_n$ are iid samples with density $f(x_i|\theta)$

Definitions:

- Estimator $\delta_n(x_1,\dots,x_n)$ of $\theta$ is defined asymptotically unbiased if $\sqrt{n}\,(\delta_n-\theta) \xrightarrow{d} H$ and $E[H] = 0$
- Asymptotically unbiased estimator is defined asymptotically efficient if $v(\theta) = \dfrac{1}{n I(\theta)}$

$v(\theta)$: asymptotic variance

$I(\theta)$: Fisher information in a single sample

Asymptotic efficiency of Bayes estimators (cont.)

Assumptions for the next theorems

- The posterior $\pi(\theta|x)$ is a proper, continuous and positive density (a further condition on the slide did not survive extraction)
- The prior can be improper!
- The likelihood function $f(x|\theta)$ satisfies regularity conditions

Asymptotic efficiency of Bayes estimators (cont.)

Theorem: For large values of n, the posterior distribution is approximately

$N\!\left(\theta + \dfrac{l'(\theta)}{n I(\theta)},\ \dfrac{1}{n I(\theta)}\right)$

where $l'(\theta)$ is the derivative of the log-likelihood

Conclusion: Bayes estimators such as the posterior mean are asymptotically unbiased

The effect of the prior declines as n increases

Asymptotic efficiency of Bayes estimators (cont.)

Theorem: If $\delta_n$ is the Bayes estimator under MSE then,

$\sqrt{n}\,(\delta_n-\theta) \xrightarrow{d} N\!\left(0,\ \dfrac{1}{I(\theta)}\right)$

Conclusion: The Bayes estimator $\delta_n$ under MSE is asymptotically efficient

Asymptotic efficiency of Bayes estimators (cont.)

Example: estimator of $p$ based on binomial sample $x \sim b(p,n)$ under MSE

Beta distribution is a conjugate family for the binomial distribution,

$p \sim B(a,b) \;\Rightarrow\; p|x \sim B(a+x,\ b+n-x)$

The Bayes estimator under MSE is $\delta^{\pi}(x) = E^{\pi(p|x)}[p] = \dfrac{a+x}{a+b+n}$

The ML estimator is $\delta_{ML}(x) = \arg\max_{p} \ln f(x|p) = \dfrac{x}{n}$

$\delta^{\pi}(x) = \dfrac{a+b}{a+b+n}\,E^{\pi}[p] + \dfrac{n}{a+b+n}\,\delta_{ML}(x)$
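The decomposition of the Bayes estimator into a weighted average of the prior mean and the ML estimate can be verified numerically. A minimal sketch, with hypothetical function names and an illustrative sample of x = 30 successes in n = 100 trials:

```python
def bayes_estimate(x, n, a, b):
    """Posterior mean of p for x ~ b(p, n) with a Beta(a, b) prior."""
    return (a + x) / (a + b + n)


def weighted_form(x, n, a, b):
    """The same estimator written as a mix of the prior mean and the ML estimate x / n."""
    w_prior = (a + b) / (a + b + n)
    return w_prior * (a / (a + b)) + (1.0 - w_prior) * (x / n)


x, n = 30, 100
for a, b in [(2.0, 2.0), (2000.0, 2000.0)]:
    print(a, b, bayes_estimate(x, n, a, b), weighted_form(x, n, a, b), x / n)
```

With a = b = 2 the estimate stays near the ML value x/n, whereas with a = b = 2000 the weight on the prior mean 0.5 dominates and the data barely move the estimate.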

Asymptotic efficiency of Bayes estimators (cont.)

[Figure: the Bayes estimator $\delta^{\pi}(x)$ and the ML estimator plotted against $x$ from 0 to 100 (vertical axis 0 to 1), in two panels: $a=b=2000$ and $a=b=2$]

If the prior is concentrated it determines the estimator, “Don’t confuse me with the facts!”

Asymptotic efficiency of Bayes estimators (cont.)

[Figure: the Bayes estimator $\delta^{\pi}(x)$ when $a=b=2$ and the ML estimator plotted against $x$ (vertical axis 0 to 1), in two panels: $n=10$ and $n=1000$]

For large samples, the Bayes estimator tends to become independent of the prior
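The vanishing influence of the prior can be seen by holding the observed frequency fixed and letting n grow. The sketch below assumes a Beta(2, 2) prior for a binomial proportion and an observed frequency of 0.3; both are illustrative choices.

```python
def bayes_estimate(x, n, a=2.0, b=2.0):
    """Posterior mean of p for x ~ b(p, n) with a Beta(a, b) prior."""
    return (a + x) / (a + b + n)


gaps = {}
for n in (10, 100, 1000, 10000):
    x = round(0.3 * n)                  # keep the observed frequency x / n fixed at 0.3
    gaps[n] = abs(bayes_estimate(x, n) - x / n)
    print(n, gaps[n])
```

The gap between the Bayes and ML estimates shrinks roughly like 1/n, since the prior contributes only a fixed number of pseudo-observations.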

Asymptotic efficiency of Bayes estimators (cont.)

More examples of asymptotically efficient Bayes estimators

- Location distributions: if the likelihood function $f(x|\theta) = f(x-\theta)$ satisfies the regularity conditions, then the Pitman estimator after one observation is asymptotically efficient
- Exponential distributions: if $f(x|\theta) = e^{\theta T(x) - A(\theta)}$ then it satisfies the regularity conditions, and the asymptotic efficiency depends on the prior

Conclusions

- Bayes rules are designed for problems with prior information, but are useful in other cases as well
- Determining the prior is a crucial step, which affects the admissibility and the computational complexity
- Bayes estimators, under MSE, perform well on large samples