st5219: bayesian hierarchical modelling lecture 2.1

PRIORS, NORMAL MODELS, COMPUTING POSTERIORS

Plan for lecture

Priors: how to choose them, different types

The normal distribution in Bayesianism

Tutorial 1: over to you

Computing posteriors: Monte Carlo, Importance Sampling, Markov chain Monte Carlo

What is random?

FREQUENTISM

Something with a long run frequency distribution

E.g. coin tosses

Patients in a clinical trial

“Measurement” errors?

BAYESIANISM

Everything

What you don’t know is random

Unobserved data, parameters, unknown states, hypotheses

Observed data still arise from probability model

Knock on effects on how to estimate things and assess hypotheses

This week: practical issues

CHOOSING A PRIOR

Very misunderstood

“How did you choose your priors?”

Please never answer “Oh, I just made them up”

For data analysis, you need strong rationale for choice of prior

DOING COMPUTATIONS

(later)

An example to illustrate priors

Following infection: body creates antibodies

These target pathogen and remain in the blood

Antibodies can provide data on historic disease exposure

H1N1 in Singapore

Cook, Chen, Lim (2010) Emerg Inf Dis DOI:10.3201/EID.1610.100840

Serology of H1N1

Singapore study longitudinal

Chen et al (2010) J Am Med Assoc 303:1383--91

Measurements

Observation in (xij,2xij) for individual i, observation j

Define “seroconversion” to be a “four-fold” rise in antibody levels, i.e.

yi = 1 if xi2 ≥ 4xi1 and 0 otherwise

Out of 727 participants with follow up, we have 98 seroconversions

Q: what proportion were infected?

AIDS ≠ influenza A H1N1

Seroconversion “test” not perfect: its sensitivity is something like 80%

Infection rate should be higher than seroconversion rate

Board work
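A rough sketch of why the infection rate should exceed the seroconversion rate. It assumes (this is not taken from the slides, and the board work is not reproduced here) that an infected participant seroconverts with probability sigma of about 0.8 and that an uninfected participant never does. Under that assumption the expected seroconversion rate is p × sigma, so a crude plug-in estimate of p is the observed rate divided by sigma. The function name is hypothetical.

```python
# Crude plug-in correction for an imperfect seroconversion "test".
# Assumption (not from the slides): P(seroconvert | infected) = sigma
# and P(seroconvert | not infected) = 0.

def corrected_infection_rate(seroconversions, participants, sigma):
    """Return (raw seroconversion rate, raw rate divided by sigma)."""
    raw = seroconversions / participants
    # A proportion cannot exceed 1, so cap the corrected value.
    return raw, min(raw / sigma, 1.0)

raw_rate, infection_rate = corrected_infection_rate(98, 727, sigma=0.8)
print(f"seroconversion rate:    {raw_rate:.3f}")
print(f"plug-in infection rate: {infection_rate:.3f}")
```

This treats sigma as known exactly. The Bayesian approach instead carries the uncertainty in sigma through a prior.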

Bayesian approach

Need some priors

Last time: “U(0,1) good way to represent lack of knowledge of a probability”

Before we collected the JAMA data, we didn’t know what p would be, and a prior p~U(0,1) makes sense

But there are data out there on σ!

Other data

Zambon et al (2001) Arch Intern Med 161:2116--22

Other data

m = 791 y = 629

This can give you a prior!!!

σ~Be(630,163)

Board work
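To get a feel for how much information Be(630,163) carries, here is a minimal sketch that summarises it using only the Python standard library. The mean and standard deviation use the standard beta formulas; the 95% interval is approximated by simulation, so its last digits depend on the seed. The helper name and the number of draws are arbitrary choices.

```python
import math
import random

def beta_summary(a, b):
    """Mean and standard deviation of a Be(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# Uniform Be(1,1) prior updated with y = 629 out of m = 791.
a, b = 1 + 629, 1 + (791 - 629)
prior_mean, prior_sd = beta_summary(a, b)

# Approximate central 95% interval from simulated draws.
rng = random.Random(1)
draws = sorted(rng.betavariate(a, b) for _ in range(20000))
lower = draws[int(0.025 * len(draws))]
upper = draws[int(0.975 * len(draws)) - 1]

print(f"Be({a},{b}): mean {prior_mean:.3f}, sd {prior_sd:.3f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```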

Kinds of priors

NON-INFORMATIVE

p~U(0,1)   σ²~U(0,∞)   μ~U(-∞, ∞)   β~N(0,1000²)

Should give you no information about that parameter except what is in the data

INFORMATIVE

σ~Be(630,163)   μ~N(15.2,6.8²)

Lets you supplement natural information content of the data when not enough information on that aspect

Can give information on other parameters indirectly

How to choose?

Scenario 1. You are trying to reach an optimal decision in the presence of uncertainty: use whatever information you can, even if subjective, via informative priors

Scenario 2. You are trying to estimate parameters for a scientific data analysis (you cannot or don’t want to use external data): use non-informative prior

Scenario 3. You are trying to estimate parameters for a scientific data analysis (you have good external data): use non-informative priors for those bits you have no data for or in which you want your own data to speak for themselves; use informative priors elsewhere

Whence came that Be(630,163)?

Step 1: uniform prior for σ

Step 2: fit model to Zambon data

Step 3: posterior for that becomes prior for main analysis

Board work

Conjugacy

The beta distribution is conjugate to a binomial model, in that if you start with a beta prior and use it in a binomial model for p and x, you end with a beta posterior of known form

I.e. if p~Be(a,b) and x~Bin(n,p), p|x~Be(a+x,b+n-x)

Other conjugate priors exist for simple models, e.g. ...

Board work
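The conjugate update can be sketched in a few lines. The function name is hypothetical and the numbers in the example are made up for illustration. The check at the end confirms, at one arbitrary value of p, that the log of (beta prior kernel × binomial likelihood kernel) equals the log kernel of Be(a+x, b+n-x), which is what conjugacy means here.

```python
import math

def beta_binomial_update(a, b, x, n):
    """Posterior Be parameters for p ~ Be(a, b), x ~ Bin(n, p)."""
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    return a + x, b + n - x

def log_beta_kernel(p, a, b):
    """Log of p^(a-1) (1-p)^(b-1), the beta density up to a constant."""
    return (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)

def log_binomial_kernel(p, x, n):
    """Log of p^x (1-p)^(n-x), the binomial likelihood up to a constant."""
    return x * math.log(p) + (n - x) * math.log(1 - p)

# Illustrative numbers only: Be(2, 3) prior, 7 successes in 10 trials.
post_a, post_b = beta_binomial_update(2, 3, x=7, n=10)
print(f"posterior: Be({post_a},{post_b})")

# Prior kernel times likelihood kernel matches the posterior kernel.
p = 0.37
lhs = log_beta_kernel(p, 2, 3) + log_binomial_kernel(p, 7, 10)
rhs = log_beta_kernel(p, post_a, post_b)
print(f"kernels agree: {math.isclose(lhs, rhs)}")
```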

Why is it ok to take posteriors and turn them into priors?

It’s the incremental nature of accumulated knowledge

Eg Zambon study:

Stage  Prior    Data (y,m)  Posterior
0      Be(1,1)  (0,0)       Be(1,1)
1      Be(1,1)  (1,1)       Be(2,1)
2      Be(2,1)  (1,2)       Be(2,2)
3      Be(2,2)  (1,3)       Be(2,3)
4      Be(2,3)  (2,4)       Be(3,3)
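A minimal sketch of the incremental updating in this table. The individual outcomes 1, 0, 0, 1 are an assumption chosen to match the cumulative counts (y,m) = (1,1), (1,2), (1,3), (2,4). Each stage uses the previous posterior as its prior, and the final line checks that the result matches a single update from Be(1,1) with all the data at once. With y = 2 successes out of m = 4, both routes give Be(3,3) at stage 4.

```python
# Hypothetical individual outcomes matching the cumulative (y, m) counts.
observations = [1, 0, 0, 1]

a, b = 1, 1          # uniform Be(1,1) starting prior
y = m = 0            # cumulative successes and trials
history = [(0, (a, b), (y, m), (a, b))]

for stage, obs in enumerate(observations, start=1):
    prior = (a, b)
    # One Bernoulli observation: add it to a if success, to b if failure.
    a, b = a + obs, b + (1 - obs)
    y, m = y + obs, m + 1
    history.append((stage, prior, (y, m), (a, b)))

for stage, prior, data, posterior in history:
    print(f"{stage}  Be{prior}  {data}  Be{posterior}")

# Updating once with all the data gives the same answer.
batch = (1 + sum(observations), 1 + len(observations) - sum(observations))
print(f"sequential == batch: {(a, b) == batch}")
```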

Effective sample sizes

You can think of the parameters of the beta(a,b) as representing

a best guess of the proportion, a/(a+b)

a “sample size” that the prior is equivalent to, (a+b)

This is an easy way to transform published results into beta priors: take the point estimate (MLE, say) and the sample size and transform to get a and b.

(So a uniform prior is like adding one positive and one negative value to your data set: is this fair???)
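The transformation can be sketched as follows, assuming the simplest reading: set a/(a+b) to the point estimate and a+b to the sample size. The function name is hypothetical. Applied to y = 629 out of m = 791 it gives Be(629,162), which differs from Be(630,163) by exactly the one positive and one negative "observation" contributed by a uniform prior.

```python
def beta_from_estimate(point_estimate, sample_size):
    """Beta parameters with mean = point_estimate and a + b = sample_size."""
    if not 0 < point_estimate < 1:
        raise ValueError("point estimate must be strictly between 0 and 1")
    a = point_estimate * sample_size
    b = (1 - point_estimate) * sample_size
    return a, b

a, b = beta_from_estimate(629 / 791, 791)
print(f"a = {a:.1f}, b = {b:.1f}")
print(f"best guess a/(a+b) = {a / (a + b):.3f}, 'sample size' a+b = {a + b:.0f}")
```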

Other converting methods

Take a point estimate and CI and convert to 2 parameters to represent your prior.

Eg the infectious period is a popular parameter in infectious disease epidemiology: the average time from infection to recovery

For no good reason, often assumed to be exponential with mean λ, say

Fraser et al (2009) Science 324:1557--61 suggest estimate of generation period of 1.91 with 95%CI (1.3,2.71)

Board work
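One possible way to do this conversion, sketched under a stated assumption: the prior for λ is taken to be lognormal, with the published 95% CI limits matched to its 2.5% and 97.5% quantiles. The choice of a lognormal is mine for illustration; the board work may well use a different two-parameter family (a gamma, say). The function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_from_ci(lower, upper, level=0.95):
    """Log-scale mean and sd of a lognormal whose central interval is (lower, upper)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    log_mean = (math.log(lower) + math.log(upper)) / 2
    log_sd = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_mean, log_sd

log_mean, log_sd = lognormal_from_ci(1.3, 2.71)

# Check what the fitted prior implies on the original scale.
log_scale = NormalDist(log_mean, log_sd)
implied_median = math.exp(log_mean)
implied_lower = math.exp(log_scale.inv_cdf(0.025))
implied_upper = math.exp(log_scale.inv_cdf(0.975))

print(f"log-scale mean {log_mean:.3f}, log-scale sd {log_sd:.3f}")
print(f"implied median {implied_median:.2f}, "
      f"95% interval ({implied_lower:.2f}, {implied_upper:.2f})")
```

The implied median will not equal the published point estimate of 1.91 exactly, because only the two interval limits are matched. How close it comes is a quick check on whether the chosen family is a reasonable fit.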

Two final thoughts on priors

I mentioned U(-∞, ∞) as a non-informative prior. What’s the density function for U(-∞, ∞)?

Board work


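A prior such as U(-∞, ∞) is called an improper prior as it does not have a proper density function: no constant density can integrate to 1 over the whole real line.

Improper priors sometimes give proper posteriors: this depends on whether the integral of the likelihood over the parameter is finite.

A prior that is not improper is called a proper prior.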
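A small numerical sketch of an improper prior giving a proper posterior. The setting is an assumption chosen for illustration: normal data with known standard deviation and a flat prior on the mean μ, so the posterior is proportional to the likelihood. The data values are made up. The likelihood is summed over a wide grid; the sum is finite, and the resulting posterior has mean equal to the sample mean and standard deviation equal to sd/√n, as the standard result says.

```python
import math
from statistics import NormalDist, mean

data = [4.1, 5.3, 4.8, 5.9, 5.0]   # hypothetical observations
known_sd = 1.0                     # assumed known

def likelihood(mu):
    """Normal likelihood of the data as a function of the mean mu."""
    return math.prod(NormalDist(mu, known_sd).pdf(y) for y in data)

# Riemann sum over a grid from -20 to 30, far wider than the data.
step = 0.01
grid = [-20 + step * i for i in range(5001)]
weights = [likelihood(mu) for mu in grid]

normaliser = sum(weights) * step   # finite, so the posterior is proper
post_mean = sum(mu * w for mu, w in zip(grid, weights)) * step / normaliser
post_var = sum((mu - post_mean) ** 2 * w
               for mu, w in zip(grid, weights)) * step / normaliser

print(f"integral of likelihood: {normaliser:.3e}")
print(f"posterior mean {post_mean:.3f} (sample mean {mean(data):.3f})")
print(f"posterior sd {math.sqrt(post_var):.3f} "
      f"(sd/sqrt(n) = {known_sd / math.sqrt(len(data)):.3f})")
```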


Just because a prior is flat in one representation does not mean it is flat in another

Eg for an exponential model (for survival analysis say)

Board work
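A simulation sketch of this point for the exponential model. It assumes a flat prior on the rate λ over (0, 2); the upper limit is an arbitrary choice so that the prior is proper and can be sampled. The mean survival time is 1/λ. If the prior were also flat in the mean, intervals of equal width would get equal prior probability; instead the mass falls off quickly, because the implied density of the mean is proportional to 1/mean².

```python
import random

rng = random.Random(2)

# Flat prior on the rate: lambda ~ U(0, 2). The upper limit is an assumption.
rates = [rng.uniform(0.0, 2.0) for _ in range(100000)]
# Transform each draw to the mean survival time 1/lambda (skip an exact zero).
means = [1.0 / r for r in rates if r > 0]

def prior_mass(low, high):
    """Fraction of simulated means falling in [low, high)."""
    return sum(low <= m < high for m in means) / len(means)

# Three intervals of equal width 0.5 on the mean scale.
mass_a = prior_mass(0.5, 1.0)   # exact value 1/2
mass_b = prior_mass(1.0, 1.5)   # exact value 1/6
mass_c = prior_mass(1.5, 2.0)   # exact value 1/12

print(f"P(0.5 <= mean < 1.0) = {mass_a:.3f}")
print(f"P(1.0 <= mean < 1.5) = {mass_b:.3f}")
print(f"P(1.5 <= mean < 2.0) = {mass_c:.3f}")
```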
