Bayesian Hierarchical Model
Ying Nian Wu, UCLA Department of Statistics
IPAM Summer School, July 12, 2007
Plan
• Bayesian inference
• Learning the prior
• Examples
• Josh’s example
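Inference of normal mean
$[Y_1, Y_2, \ldots, Y_n \mid \theta] \sim \mathrm{N}(\theta, \sigma^2)$ independently
$\theta$: unknown parameter
$\sigma^2$: given constant
Example: $\theta$ is one’s height
$Y_1, Y_2, \ldots, Y_n$: repeated measurements
$\sigma^2$: known precision
Prior distribution
$\theta \sim \mathrm{N}(\mu, \tau^2)$
$(\mu, \tau^2)$: known hyper-parameters
The larger $\tau^2$, the more uncertain we are about $\theta$; as $\tau^2 \to \infty$, the prior becomes non-informative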
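Bayesian inference
Prior: $\theta \sim \mathrm{N}(\mu, \tau^2)$
Data: $[Y_1, Y_2, \ldots, Y_n \mid \theta] \sim \mathrm{N}(\theta, \sigma^2)$ independently
Posterior: $[\theta \mid Y_1, \ldots, Y_n] \sim \mathrm{N}\left( \frac{\frac{n}{\sigma^2} \bar{Y} + \frac{1}{\tau^2} \mu}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}},\ \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}} \right)$
where $\bar{Y} = \frac{1}{n} \sum_{j=1}^{n} Y_j$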
Compromise between prior and data
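The compromise can be checked numerically. The sketch below assumes the notation $\theta \sim \mathrm{N}(\mu, \tau^2)$ for the prior and $\mathrm{N}(\theta, \sigma^2)$ for each measurement; the function name and the height measurements are hypothetical, chosen only for illustration.

```python
def normal_posterior(ys, sigma2, mu, tau2):
    """Posterior mean and variance of theta for ys ~ N(theta, sigma2), theta ~ N(mu, tau2)."""
    n = len(ys)
    ybar = sum(ys) / n
    data_precision = n / sigma2        # 1 / (sigma^2 / n)
    prior_precision = 1.0 / tau2       # 1 / tau^2
    post_var = 1.0 / (data_precision + prior_precision)
    # Precision-weighted average of the sample mean and the prior mean.
    post_mean = post_var * (data_precision * ybar + prior_precision * mu)
    return post_mean, post_var

# Hypothetical numbers: five height measurements in cm, prior centred at 165 cm.
heights = [171.0, 169.0, 170.5, 170.0, 169.5]
post_mean, post_var = normal_posterior(heights, sigma2=1.0, mu=165.0, tau2=25.0)
```

With these numbers the posterior mean lies between the prior mean (165) and the sample mean (170), much closer to the sample mean because $\sigma^2/n$ is small relative to $\tau^2$.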
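Bayesian inference
Prior: $\theta \sim p(\theta)$
Data: $[Y \mid \theta] \sim p(y \mid \theta)$
Posterior: $[\theta \mid Y = y] \sim p(\theta \mid y) \propto p(\theta)\, p(y \mid \theta)$
posterior $\propto$ prior $\times$ likelihood
Joint: $[\theta, Y] \sim p(\theta, y) = p(\theta)\, p(y \mid \theta)$
[Figure labels: $y$, $[\theta \mid Y = y]$, $[Y \mid \theta]$]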
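The rule posterior $\propto$ prior $\times$ likelihood can be applied directly on a grid of parameter values. This is a minimal sketch, not a method from the slides: the grid, the helper names, and the normal prior and likelihood in the example are all assumptions made for illustration.

```python
import math

def grid_posterior(grid, prior, likelihood):
    """Normalised posterior weights on a grid: posterior ∝ prior × likelihood."""
    unnorm = [prior(t) * likelihood(t) for t in grid]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical setup: theta ~ N(0, 4), one observation y = 2 with Y | theta ~ N(theta, 1).
grid = [-10 + 0.01 * k for k in range(2001)]
post = grid_posterior(
    grid,
    prior=lambda t: normal_pdf(t, 0.0, 4.0),
    likelihood=lambda t: normal_pdf(2.0, t, 1.0),
)
post_mean = sum(t * w for t, w in zip(grid, post))
```

For this conjugate setup the grid answer can be compared with the closed-form posterior mean $(2/1 + 0/4)/(1/1 + 1/4) = 1.6$.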
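Illustration
Inference of normal mean
Prior: $\theta \sim \mathrm{N}(\mu, \tau^2)$
Data: $[Y_1, Y_2, \ldots, Y_n \mid \theta] \sim \mathrm{N}(\theta, \sigma^2)$ independently
Sufficient statistic: $[\bar{Y} \mid \theta] \sim \mathrm{N}\left(\theta, \frac{\sigma^2}{n}\right)$, where $\bar{Y} = \frac{1}{n} \sum_{j=1}^{n} Y_j$
[Figure labels: $\bar{Y}$, $[\theta \mid \bar{Y}]$, $[\bar{Y} \mid \theta]$]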
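Combining prior and data
[Figure labels: $\bar{Y}$, $[\theta \mid \bar{Y}]$, "$\tau^2$ large", "$\sigma^2/n$ small"]
$[\theta \mid Y_1, \ldots, Y_n] \sim \mathrm{N}\left( \frac{\frac{n}{\sigma^2} \bar{Y} + \frac{1}{\tau^2} \mu}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}},\ \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}} \right)$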
Prior knowledge is useful for inferring $\theta$
Learning the prior
Prior: $\theta \sim \mathrm{N}(\mu, \tau^2)$
Data: $[Y_1, Y_2, \ldots, Y_n \mid \theta] \sim \mathrm{N}(\theta, \sigma^2)$ independently
The prior distribution cannot be learned from a single realization of $\theta$
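Learning the prior
Prior: $[\theta_i \mid \mu, \tau^2] \sim \mathrm{N}(\mu, \tau^2)$, $i = 1, \ldots, m$
Data: $[Y_{ij} \mid \theta_i, \sigma^2] \sim \mathrm{N}(\theta_i, \sigma^2)$, $j = 1, \ldots, n_i$
The prior distribution can be learned from multiple experiences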
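Hierarchical model
Prior: $[\theta_i \mid \mu, \tau^2] \sim \mathrm{N}(\mu, \tau^2)$, $i = 1, \ldots, m$
Data: $[Y_{ij} \mid \theta_i, \sigma^2] \sim \mathrm{N}(\theta_i, \sigma^2)$, $j = 1, \ldots, n_i$
[Diagram, top to bottom: $(\mu, \tau^2)$; then $\theta_1, \theta_2, \ldots, \theta_i, \ldots, \theta_m$; then the observations $(Y_{1,1}, Y_{1,2}, \ldots, Y_{1,n_1})$, ..., $(Y_{i,1}, Y_{i,2}, \ldots, Y_{i,n_i})$, ..., $(Y_{m,1}, Y_{m,2}, \ldots, Y_{m,n_m})$, abbreviated as $\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_i, \ldots, \mathbf{Y}_m$]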
Hierarchical model
Collapsing (projecting):
$p(\mathbf{Y}_i \mid \mu, \tau^2) = \int p(\theta_i \mid \mu, \tau^2)\, p(\mathbf{Y}_i \mid \theta_i)\, d\theta_i$
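Sufficient statistics
Prior: $[\theta_i \mid \mu, \tau^2] \sim \mathrm{N}(\mu, \tau^2)$, $i = 1, \ldots, m$
Data: $[Y_{ij} \mid \theta_i, \sigma^2] \sim \mathrm{N}(\theta_i, \sigma^2)$, $j = 1, \ldots, n_i$
$\bar{Y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}$
$[\bar{Y}_i \mid \theta_i] \sim \mathrm{N}\left(\theta_i, \frac{\sigma^2}{n_i}\right)$
Integrating out $\theta_i$
$[\bar{Y}_i \mid \theta_i] \sim \mathrm{N}\left(\theta_i, \frac{\sigma^2}{n_i}\right)$
$[\bar{Y}_i \mid \mu, \tau^2] \sim \mathrm{N}\left(\mu, \tau^2 + \frac{\sigma^2}{n_i}\right)$
Collapsing
$[\bar{Y}_i \mid \mu, \tau^2] \sim \mathrm{N}\left(\mu, \tau^2 + \frac{\sigma^2}{n}\right)$ when every group has $n_i = n$ observations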
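A quick simulation can make the collapsed distribution concrete: draw each group mean $\theta_i$ from $\mathrm{N}(\mu, \tau^2)$, draw $n$ observations around it with variance $\sigma^2$, and compare the spread of the group averages with $\tau^2 + \sigma^2/n$. The function name, parameter values, and seed below are hypothetical choices for this sketch.

```python
import random
import statistics

def simulate_group_means(m, n, mu, tau2, sigma2, seed=0):
    """Simulate m group averages from the two-level normal model."""
    rng = random.Random(seed)
    means = []
    for _ in range(m):
        theta_i = rng.gauss(mu, tau2 ** 0.5)                      # theta_i ~ N(mu, tau^2)
        ys = [rng.gauss(theta_i, sigma2 ** 0.5) for _ in range(n)]  # Y_ij ~ N(theta_i, sigma^2)
        means.append(sum(ys) / n)
    return means

group_means = simulate_group_means(m=20000, n=4, mu=1.0, tau2=2.0, sigma2=4.0)
empirical_mean = statistics.fmean(group_means)
empirical_var = statistics.pvariance(group_means)
theoretical_var = 2.0 + 4.0 / 4   # tau^2 + sigma^2 / n
```

Up to simulation noise, `empirical_mean` should be near $\mu$ and `empirical_var` near `theoretical_var`.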
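Estimating hyper-parameter
$\hat{\mu} = \frac{1}{m} \sum_{i=1}^{m} \bar{Y}_i$
$\hat{\tau}^2 + \frac{\sigma^2}{n} = \frac{1}{m} \sum_{i=1}^{m} (\bar{Y}_i - \hat{\mu})^2$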
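Empirical Bayes
Borrowing strength from other observations
$\hat{\theta}_i = \frac{\frac{n}{\sigma^2} \bar{Y}_i + \frac{1}{\hat{\tau}^2} \hat{\mu}}{\frac{n}{\sigma^2} + \frac{1}{\hat{\tau}^2}}$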
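The two steps (estimate the hyper-parameters from the group averages, then plug them into the posterior mean) can be sketched as follows, assuming equal group sizes $n$ and known $\sigma^2$. The function name and the data are hypothetical, and truncating $\hat{\tau}^2$ at zero is a safeguard added here, not something stated on the slides.

```python
def empirical_bayes(group_means, sigma2, n):
    """Plug-in (empirical Bayes) estimates of each theta_i from group averages."""
    m = len(group_means)
    mu_hat = sum(group_means) / m
    # Spread of the group averages estimates tau^2 + sigma^2 / n.
    total_var = sum((y - mu_hat) ** 2 for y in group_means) / m
    tau2_hat = max(total_var - sigma2 / n, 0.0)   # truncation at 0 is an added assumption
    if tau2_hat == 0.0:
        return mu_hat, tau2_hat, [mu_hat] * m     # complete pooling
    # Weight on each group's own average; the remainder goes to the pooled mean.
    w = (n / sigma2) / (n / sigma2 + 1.0 / tau2_hat)
    theta_hat = [w * y + (1.0 - w) * mu_hat for y in group_means]
    return mu_hat, tau2_hat, theta_hat

# Hypothetical group averages, each based on n = 4 observations with sigma^2 = 4.
mu_hat, tau2_hat, theta_hat = empirical_bayes([1.0, 2.0, 3.0, 4.0, 5.0], sigma2=4.0, n=4)
```

Each estimate is pulled from its own group average toward the pooled mean, which is the sense in which a group borrows strength from the others.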
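Full Bayesian
Hyper prior: $(\mu, \tau^2) \sim p(\mu, \tau^2)$, e.g., constant
[Diagram, top to bottom: $(\mu, \tau^2)$; then $\theta_1, \theta_2, \ldots, \theta_i, \ldots, \theta_m$; then the observations $(Y_{1,1}, Y_{1,2}, \ldots, Y_{1,n_1})$, ..., $(Y_{i,1}, Y_{i,2}, \ldots, Y_{i,n_i})$, ..., $(Y_{m,1}, Y_{m,2}, \ldots, Y_{m,n_m})$]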
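Full Bayesian
$[\mu, \tau^2;\ \theta_1, \ldots, \theta_m;\ \mathbf{Y}_1, \ldots, \mathbf{Y}_m] \sim p(\mu, \tau^2) \prod_{i=1}^{m} [p(\theta_i \mid \mu, \tau^2)\, p(\mathbf{Y}_i \mid \theta_i)]$
$p(\mu, \tau^2;\ \theta_1, \ldots, \theta_m \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_m) \propto p(\mu, \tau^2) \prod_{i=1}^{m} [p(\theta_i \mid \mu, \tau^2)\, p(\mathbf{Y}_i \mid \theta_i)]$
$p(\mu, \tau^2 \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_m) \propto p(\mu, \tau^2) \prod_{i=1}^{m} p(\mathbf{Y}_i \mid \mu, \tau^2)$
$p(\theta_1, \ldots, \theta_m \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_m) = \int p(\mu, \tau^2 \mid \mathbf{Y}_1, \ldots, \mathbf{Y}_m) \prod_{i=1}^{m} p(\theta_i \mid \mathbf{Y}_i, \mu, \tau^2)\, d\mu\, d\tau^2$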
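One rough way to carry out this integration numerically is to put $(\mu, \tau^2)$ on a grid. The sketch below assumes equal group sizes $n$, known $\sigma^2$, and a constant hyper-prior restricted to the grid; it works with the group averages $\bar{Y}_i$, whose collapsed distribution is $\mathrm{N}(\mu, \tau^2 + \sigma^2/n)$. The function name, grids, and data are hypothetical, and a grid is only one of several possible approximations.

```python
import math

def full_bayes_theta_means(group_means, sigma2, n, mu_grid, tau2_grid):
    """Posterior means of theta_i, averaging over a grid of (mu, tau^2)."""
    s2 = sigma2 / n
    cells = []
    for mu in mu_grid:
        for tau2 in tau2_grid:
            v = tau2 + s2   # collapsed variance of each group average
            loglik = sum(-0.5 * math.log(2 * math.pi * v) - (y - mu) ** 2 / (2 * v)
                         for y in group_means)
            cells.append((loglik, mu, tau2))
    top = max(ll for ll, _, _ in cells)                 # subtract max for numerical stability
    weights = [(math.exp(ll - top), mu, tau2) for ll, mu, tau2 in cells]
    total = sum(w for w, _, _ in weights)
    post_means = []
    for y in group_means:
        acc = 0.0
        for w, mu, tau2 in weights:
            shrink = (1.0 / s2) / (1.0 / s2 + 1.0 / tau2)
            # E[theta_i | Ybar_i, mu, tau^2], weighted by p(mu, tau^2 | data).
            acc += (w / total) * (shrink * y + (1.0 - shrink) * mu)
        post_means.append(acc)
    return post_means

# Hypothetical data and grids (tau^2 grid must stay strictly positive).
ybars = [1.0, 2.0, 3.0, 4.0, 5.0]
mu_grid = [0.1 * k for k in range(61)]
tau2_grid = [0.1 * k for k in range(1, 101)]
theta_means = full_bayes_theta_means(ybars, sigma2=4.0, n=4,
                                     mu_grid=mu_grid, tau2_grid=tau2_grid)
```

Unlike the plug-in approach, this averages the shrinkage over the uncertainty in $(\mu, \tau^2)$ rather than fixing them at point estimates.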
Bayesian hierarchical model
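Stein’s estimator
$[Y_i \mid \theta_i] \sim \mathrm{N}(\theta_i, \sigma^2)$, $i = 1, \ldots, m$
Example: measure each person’s height
$\hat{\theta}_i = Y_i$, with $\mathrm{E}\left[\sum_{i=1}^{m} (\hat{\theta}_i - \theta_i)^2\right] = m\sigma^2$
$\tilde{\theta}_i = \left(1 - \frac{(m-2)\sigma^2}{\sum_{i=1}^{m} Y_i^2}\right) Y_i$, with $\mathrm{E}\left[\sum_{i=1}^{m} (\tilde{\theta}_i - \theta_i)^2\right] < m\sigma^2$ for $m \ge 3$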
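The risk comparison can be checked by simulation. This is a minimal sketch: the true means, number of replications, and seed are hypothetical, and $\sigma^2$ is treated as known.

```python
import random

def james_stein(ys, sigma2):
    """Shrink every observation toward 0 by the common Stein factor."""
    m = len(ys)
    shrink = 1.0 - (m - 2) * sigma2 / sum(y * y for y in ys)
    return [shrink * y for y in ys]

def compare_risk(thetas, sigma2, reps=5000, seed=0):
    """Monte Carlo estimates of total squared-error risk for Y_i and for Stein's estimator."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    loss_raw = loss_js = 0.0
    for _ in range(reps):
        ys = [rng.gauss(t, sd) for t in thetas]
        js = james_stein(ys, sigma2)
        loss_raw += sum((y - t) ** 2 for y, t in zip(ys, thetas))
        loss_js += sum((e - t) ** 2 for e, t in zip(js, thetas))
    return loss_raw / reps, loss_js / reps

thetas = [0.2 * k for k in range(10)]   # hypothetical true means, m = 10
risk_raw, risk_js = compare_risk(thetas, sigma2=1.0)
```

Here `risk_raw` should be close to $m\sigma^2 = 10$, and `risk_js` should come out smaller.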
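Stein’s estimator
$[Y_i \mid \theta_i] \sim \mathrm{N}(\theta_i, \sigma^2)$, $i = 1, \ldots, m$
$\mathrm{E}[Y_i^2] = \theta_i^2 + \sigma^2$
$\mathrm{E}\left[\sum_{i} Y_i^2\right] = \sum_{i} \theta_i^2 + m\sigma^2$
Empirical Bayes interpretation
$\theta_i \sim \mathrm{N}(0, \tau^2)$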
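The link to Stein's estimator can be written out in a few lines. This is a sketch of the standard argument, assuming the prior $\theta_i \sim \mathrm{N}(0, \tau^2)$ with $\tau^2$ unknown and $\sigma^2$ known.

```latex
% Posterior mean of theta_i under the N(0, tau^2) prior:
\mathrm{E}[\theta_i \mid Y_i] = \left(1 - \frac{\sigma^2}{\sigma^2 + \tau^2}\right) Y_i

% Marginally Y_i ~ N(0, sigma^2 + tau^2) independently, so
\frac{\sum_{i=1}^{m} Y_i^2}{\sigma^2 + \tau^2} \sim \chi^2_m ,
\qquad
\mathrm{E}\left[\frac{m - 2}{\sum_{i=1}^{m} Y_i^2}\right] = \frac{1}{\sigma^2 + \tau^2}
\quad (m \ge 3)

% Replacing 1 / (sigma^2 + tau^2) by this unbiased estimate gives Stein's estimator:
\tilde{\theta}_i = \left(1 - \frac{(m - 2)\,\sigma^2}{\sum_{i=1}^{m} Y_i^2}\right) Y_i
```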
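Beta-Binomial example
Data: $[Y \mid \theta] \sim \mathrm{Binomial}(n, \theta)$
$p(Y = y \mid \theta) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y}$
e.g., flip a coin: $\theta$ is the probability of heads, $Y$ is the number of heads out of $n$ flips
e.g., pre-election poll
Prior: $\theta \sim \mathrm{Beta}(a_1, a_0)$
$p(\theta) = \frac{\Gamma(a_1 + a_0)}{\Gamma(a_1)\,\Gamma(a_0)} \theta^{a_1 - 1} (1 - \theta)^{a_0 - 1}$
$\mathrm{E}[\theta] = \frac{a_1}{a_1 + a_0}$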
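Conjugate prior
Data: $[Y \mid \theta] \sim \mathrm{Binomial}(n, \theta)$
$p(Y = y \mid \theta) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y}$
Prior: $\theta \sim \mathrm{Beta}(a_1, a_0)$
$p(\theta) = \frac{\Gamma(a_1 + a_0)}{\Gamma(a_1)\,\Gamma(a_0)} \theta^{a_1 - 1} (1 - \theta)^{a_0 - 1}$
Posterior: $[\theta \mid Y = y] \sim \mathrm{Beta}(y + a_1,\ n - y + a_0)$
$\mathrm{E}[\theta \mid Y = y] = \frac{y + a_1}{n + a_1 + a_0}$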
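The conjugate update amounts to adding the observed counts to the prior parameters. A minimal sketch, with a hypothetical function name and made-up coin-flip counts:

```python
def beta_binomial_posterior(y, n, a1, a0):
    """Posterior Beta parameters and posterior mean after y successes in n trials."""
    post_a1 = y + a1          # prior "successes" plus observed successes
    post_a0 = n - y + a0      # prior "failures" plus observed failures
    post_mean = post_a1 / (post_a1 + post_a0)
    return post_a1, post_a0, post_mean

# Hypothetical data: 7 heads in 10 flips with a Beta(2, 2) prior.
post_a1, post_a0, post_mean = beta_binomial_posterior(y=7, n=10, a1=2.0, a0=2.0)
```

The posterior mean $9/14 \approx 0.643$ sits between the prior mean $0.5$ and the observed frequency $0.7$.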
Hierarchical model
Examples:
• a number of coins, each with its own probability of heads
• a number of MLB players, each with his own probability of a hit
• pre-election polls in different states
$\theta_i \sim \mathrm{Beta}(a_1, a_0)$
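Dirichlet-Multinomial
Roll a die: $\theta = (\theta_1, \ldots, \theta_6)$
$[Y_1, \ldots, Y_6 \mid \theta] \sim \mathrm{multinomial}(n, \theta)$
$p(y_1, \ldots, y_6 \mid \theta) = \binom{n}{y_1, \ldots, y_6} \theta_1^{y_1} \cdots \theta_6^{y_6}$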
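Conjugate prior
Data: $[Y_1, \ldots, Y_6 \mid \theta] \sim \mathrm{multinomial}(n, \theta)$
$p(y_1, \ldots, y_6 \mid \theta) = \binom{n}{y_1, \ldots, y_6} \theta_1^{y_1} \cdots \theta_6^{y_6}$
Prior: $\theta \sim \mathrm{Dirichlet}(a_1, \ldots, a_6)$
$p(\theta) = \frac{\Gamma(a_1 + \cdots + a_6)}{\Gamma(a_1) \cdots \Gamma(a_6)} \theta_1^{a_1 - 1} \cdots \theta_6^{a_6 - 1}$
Posterior: $[\theta \mid y] \sim \mathrm{Dirichlet}(y_1 + a_1, \ldots, y_6 + a_6)$
$\mathrm{E}[\theta_k \mid y] = \frac{y_k + a_k}{n + a_1 + \cdots + a_6}$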
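As in the Beta-Binomial case, the update adds the observed counts to the prior parameters. A minimal sketch, with a hypothetical function name and made-up die-roll counts:

```python
def dirichlet_posterior(counts, alphas):
    """Posterior Dirichlet parameters and posterior means for multinomial counts."""
    post = [y + a for y, a in zip(counts, alphas)]
    total = sum(post)                      # n + a_1 + ... + a_K
    means = [p / total for p in post]
    return post, means

# Hypothetical data: 30 rolls of a die with a Dirichlet(1, ..., 1) prior.
counts = [3, 5, 2, 4, 6, 10]
post_params, post_means = dirichlet_posterior(counts, [1.0] * 6)
```

For face 6 the posterior mean is $11/36$, slightly below the raw frequency $10/30$ because the prior pulls every face toward $1/6$.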
Hierarchical model
$\theta_i \sim \mathrm{Dirichlet}(a_1, \ldots, a_6)$