
Bayesian Model Selection in Factorial Designs

Seminal work is by Box and Meyer.

Intuitive formulation and analytical approach, but the devil is in the details!

We will look at the simplifying assumptions as we step through Box and Meyer's approach.

One of the hottest areas in statistics for several years.

Bayesian Model Selection in Factorial Designs

There are $2^{2^{k-p}-1}$ possible (fractional) factorial models (each of the $2^{k-p}-1$ estimable effects may be in or out of the model), denoted as a set $\{M_l\}$.

To simplify later calculations, we usually assume that the only active effects are main effects, two-way interactions, or three-way interactions. This assumption is already in place for low-resolution fractional factorials, where higher-order interactions are aliased with lower-order effects.

Bayesian Model Selection in Factorial Designs

Each $M_l$ denotes a set of active effects (both main effects and interactions) in a hierarchical model.

We will code $X_{ik} = +1$ for the high level of effect $k$ and $X_{ik} = -1$ for the low level of effect $k$.
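As a minimal sketch of this $\pm 1$ coding (ours, not from the slides; all names are illustrative), the following Python builds a full $2^k$ design matrix with intercept, main-effect, and two-way interaction columns:

```python
# Sketch: +/-1 design matrix for a full 2^k factorial (illustrative names).
import itertools
import numpy as np

def design_matrix(k):
    # All 2^k runs, each factor coded -1 (low) or +1 (high)
    runs = np.array(list(itertools.product([-1, 1], repeat=k)))
    cols = [np.ones(len(runs))]                        # intercept column
    names = ["I"]
    for j in range(k):                                 # main effects A, B, ...
        cols.append(runs[:, j])
        names.append(chr(ord("A") + j))
    for i, j in itertools.combinations(range(k), 2):   # two-way interactions
        cols.append(runs[:, i] * runs[:, j])
        names.append(names[1 + i] + names[1 + j])
    return np.column_stack(cols), names

X, names = design_matrix(3)   # 8 runs; columns I, A, B, C, AB, AC, BC
```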

Bayesian Model Selection in Factorial Designs

We will assume that, given model $M_l$, the response variables follow a linear model with normal errors. $X_i$ and $\beta$ are model-specific, but we will use a saturated model in what follows:

$$Y_i \sim N(X_i'\beta,\ \sigma^2), \quad \text{i.e.} \quad Y \sim N(X\beta,\ \sigma^2 I)$$

Bayesian Model Selection in Factorial Designs

The likelihood for the data given the parameters has the following form:

$$L(\beta,\sigma,Y) = \prod_{i=1}^{2^{k-p}} \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\!\left(-\frac{1}{2\sigma^{2}}\left(Y_i - X_i'\beta\right)^{2}\right) = \left(\frac{1}{2\pi\sigma^{2}}\right)^{m/2} \exp\!\left(-\frac{1}{2\sigma^{2}}(Y - X\beta)'(Y - X\beta)\right)$$

where $m = 2^{k-p}$ is the number of runs.
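A quick numerical sanity check (ours, with data simulated purely for illustration) that the product form and the matrix form of the likelihood agree on the log scale:

```python
# Sketch: verify product form == matrix (quadratic-form) version of the
# log-likelihood; X, beta, sigma, Y are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 4
X = rng.choice([-1.0, 1.0], size=(m, n))
beta = rng.normal(size=n)
sigma = 1.5
Y = X @ beta + rng.normal(scale=sigma, size=m)

resid = Y - X @ beta
log_prod = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - resid**2 / (2 * sigma**2))        # term-by-term product form
log_matrix = (-m / 2) * np.log(2 * np.pi * sigma**2) \
             - (resid @ resid) / (2 * sigma**2)       # matrix form
assert np.isclose(log_prod, log_matrix)
```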

Bayesian Paradigm

Unlike in classical inference, we assume the parameters $\Theta$ are random variables that have a prior distribution $f_{\Theta}(\theta)$, rather than being fixed unknown constants.

In classical inference, we estimate $\theta$ by maximizing the likelihood $L(\theta \mid y)$.

Bayesian Paradigm

Estimation using the Bayesian approach relies on updating our prior distribution for $\Theta$ after collecting our data $y$. The posterior density, by an application of Bayes' rule, is proportional to the product of the familiar data density and the prior density:

$$f_{\Theta \mid Y}(\theta \mid y) \propto f_{Y \mid \Theta}(y \mid \theta)\, f_{\Theta}(\theta)$$

Bayesian Paradigm

The Bayes estimate of $\Theta$ minimizes the Bayes risk, the expected value (with respect to the prior) of the loss function $L(\theta, \hat{\theta})$.

Under squared error loss, the Bayes estimate is the mean of the posterior distribution:

$$\hat{\theta}(y) = E_{\Theta \mid Y}\!\left[\Theta \mid Y = y\right]$$
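As a standard worked example (ours, not from the slides): if $y \mid \theta \sim N(\theta, \sigma^2)$ with prior $\theta \sim N(0, \tau^2)$, the posterior is normal and the Bayes estimate under squared error loss shrinks $y$ toward the prior mean:

$$\hat{\theta}(y) = E[\Theta \mid Y = y] = \frac{\tau^{2}}{\tau^{2} + \sigma^{2}}\, y$$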

Bayesian Model Selection in Factorial Designs

The Bayesian prior for models is quite straightforward: if $r$ of the $n$ candidate effects are in the model, each active with prior probability $\pi$, then

$$L(\pi) = C_1\, \pi^{r} (1-\pi)^{n-r} = C_1 (1-\pi)^{n} \left(\frac{\pi}{1-\pi}\right)^{r}$$
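A one-line sketch of this prior on the log scale, with $r$, $n$, and $\pi$ as defined above (function name is ours):

```python
# Sketch: unnormalized model prior pi^r (1-pi)^(n-r), on the log scale.
import numpy as np

def log_model_prior(r, n, pi):
    return r * np.log(pi) + (n - r) * np.log(1 - pi)
```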

Bayesian Model Selection in Factorial Designs

Since we're using a Bayesian approach, we need priors for $\beta$ and $\sigma$ as well:

$$\beta_0 \sim N\!\left(0,\ \sigma^{2}/\varepsilon\right), \quad \varepsilon = 10^{-6}$$

$$\beta_j \sim N\!\left(0,\ \gamma^{2}\sigma^{2}\right), \quad j = 1, \dots, n-1$$

$$\sigma \sim g(\sigma), \qquad g(\sigma) \propto \sigma^{-a}$$

Bayesian Model Selection in Factorial Designs

For non-orthogonal designs, it is common to use Zellner's g-prior for $\beta$:

$$\beta \sim N\!\left(0,\ \gamma^{2}\sigma^{2}\left(X'X\right)^{-1}\right)$$

Note that we have not assigned priors to $\gamma$ or $\pi$.
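A minimal sketch of the g-prior covariance (ours), assuming $\gamma$, $\sigma$, and the design matrix $X$ are supplied:

```python
# Sketch: covariance matrix of Zellner's g-prior for beta.
import numpy as np

def zellner_cov(X, gamma, sigma):
    # Cov(beta) = gamma^2 * sigma^2 * (X'X)^{-1}
    return gamma**2 * sigma**2 * np.linalg.inv(X.T @ X)
```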

Bayesian Model Selection in Factorial Designs

We can combine $f(\beta, \sigma, M)$ and $f(Y \mid \beta, \sigma, M)$ to obtain the full likelihood $L(\beta, \sigma, M, Y)$:

$$L(\beta,\sigma,M,Y) = C \left(\frac{\pi}{1-\pi}\right)^{r} \left(\frac{1}{\gamma}\right)^{n-1} \left(\frac{1}{2\pi\sigma^{2}}\right)^{(n+m+a)/2} \exp\!\left(-\frac{1}{2\sigma^{2}}\, Q(\beta)\right)$$

Bayesian Model Selection in Factorial Designs

$$Q(\beta) = (Y - X\beta)'(Y - X\beta) + \beta'\Gamma\beta, \qquad \Gamma = \begin{pmatrix} \varepsilon & 0 \\ 0 & \frac{1}{\gamma^{2}} I_{n-1} \end{pmatrix}$$
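A direct transcription of $Q(\beta)$ and $\Gamma$ into Python (identifiers are ours):

```python
# Sketch: the penalized quadratic form Q(beta) and the matrix Gamma,
# following the slide's definitions.
import numpy as np

def make_gamma(n, gamma, eps=1e-6):
    G = np.eye(n) / gamma**2       # 1/gamma^2 on the diagonal for effects
    G[0, 0] = eps                  # near-zero weight: vague prior on intercept
    return G

def Q(beta, X, Y, G):
    resid = Y - X @ beta
    return resid @ resid + beta @ G @ beta
```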

Bayesian Model Selection in Factorial Designs

Our goal is to derive the posterior distribution of $M$ given $Y$, which first requires integrating out $\beta$ and $\sigma$:

$$L(M \mid Y) \propto L(M,Y) = \int_0^{\infty}\!\!\int_{\mathbb{R}^n} L(\beta,\sigma,M,Y)\, d\beta\, d\sigma = \left(\frac{\pi}{1-\pi}\right)^{r} \left(\frac{1}{\gamma}\right)^{n-1} \frac{\left|X'X + \Gamma\right|^{-1/2}}{\left[\,Q\!\left((X'X+\Gamma)^{-1}X'Y\right)\right]^{(n-1+a)/2}}$$
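A sketch of this closed form on the log scale (ours; it transcribes the displayed expression, taking $n$ as the model's parameter count as in the saturated case the slides use, and assuming `X_full` carries the intercept in column 0; `pi_`, `gamma`, and `a` are the slide's $\pi$, $\gamma$, and $a$):

```python
# Sketch: log of the unnormalized marginal likelihood L(M, Y) for a model
# given by a list of active (non-intercept) column indices.
import numpy as np

def log_marginal(Y, X_full, active, pi_, gamma, a, eps=1e-6):
    cols = [0] + [j for j in active if j != 0]   # always keep the intercept
    X = X_full[:, cols]
    n = X.shape[1]
    G = np.eye(n) / gamma**2
    G[0, 0] = eps
    A = X.T @ X + G                              # X'X + Gamma
    beta_hat = np.linalg.solve(A, X.T @ Y)       # (X'X + Gamma)^{-1} X'Y
    resid = Y - X @ beta_hat
    Qhat = resid @ resid + beta_hat @ G @ beta_hat
    r = n - 1                                    # active non-intercept effects
    return (r * np.log(pi_ / (1 - pi_))
            - r * np.log(gamma)
            - 0.5 * np.linalg.slogdet(A)[1]      # |X'X + Gamma|^{-1/2}
            - 0.5 * (n - 1 + a) * np.log(Qhat))
```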

Bayesian Model Selection in Factorial Designs

The first factor is a penalty for model complexity (smaller is better). The second factor, the power of $Q$, is a measure of model fit (smaller is better).

$$L(M \mid Y) \propto L(M,Y) = \left(\frac{\pi}{1-\pi}\right)^{r} \left(\frac{1}{\gamma}\right)^{n-1} \frac{\left|X'X + \Gamma\right|^{-1/2}}{\left[\,Q\!\left((X'X+\Gamma)^{-1}X'Y\right)\right]^{(n-1+a)/2}}$$

Bayesian Model Selection in Factorial Designs

$\pi$ and $\gamma$ are still present. We will fix $\pi$; the method is robust to the choice of $\pi$.

$\gamma$ is selected to minimize the probability that no factors are active.

$$L(M \mid Y) \propto L(M,Y) = \left(\frac{\pi}{1-\pi}\right)^{r} \left(\frac{1}{\gamma}\right)^{n-1} \frac{\left|X'X + \Gamma\right|^{-1/2}}{\left[\,Q\!\left((X'X+\Gamma)^{-1}X'Y\right)\right]^{(n-1+a)/2}}$$

Bayesian Model Selection in Factorial Designs

With $L(M \mid Y)$ in hand, we can evaluate $P(M_i \mid Y)$ for every $M_i$ under any prior choice of $\pi$, provided the number of models $M_i$ is not burdensome. This is in part why we assume the eligible $M_i$ include only lower-order effects; a sketch of the enumeration follows below.
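A sketch of this enumeration (ours), reusing `log_marginal` from the earlier sketch; `candidate_cols` would hold the indices of the eligible low-order effect columns:

```python
# Sketch: enumerate all candidate models, score each with the closed-form
# marginal likelihood, and normalize to get P(M_i | Y).
import itertools
import numpy as np

def model_posteriors(Y, X_full, candidate_cols, pi_, gamma, a):
    models, scores = [], []
    for r in range(len(candidate_cols) + 1):
        for active in itertools.combinations(candidate_cols, r):
            models.append(active)
            scores.append(log_marginal(Y, X_full, list(active), pi_, gamma, a))
    scores = np.array(scores)
    probs = np.exp(scores - scores.max())   # stabilize before normalizing
    return models, probs / probs.sum()
```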

Bayesian Model Selection in Factorial Designs

Greedy search or MCMC algorithms are used to select models when they cannot be itemized.

Selection criteria include Bayes factors and the Schwarz criterion, also known as the Bayesian Information Criterion (BIC).

Refer to the R package BMA and its function bic.glm for fitting more general models.
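For reference, the standard definition (not specific to these slides): for a model with $r$ free parameters, maximized likelihood $\hat{L}$, and $m$ observations, the Schwarz criterion is

$$\mathrm{BIC} = -2\log\hat{L} + r\log m,$$

with smaller values preferred; it approximates $-2$ times the log marginal likelihood used above.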

Bayesian Model Selection in Factorial Designs

For each effect, we sum the probabilities of all $M_i$ that contain that effect to obtain a marginal posterior probability that the effect is active.

These marginal probabilities are relatively robust to the choice of $\pi$.
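Continuing the sketch above, the marginal activity probability of each effect is a sum over the models that contain it:

```python
# Sketch: marginal posterior probability that each effect is active,
# summing the model probabilities from model_posteriors above.
def effect_probabilities(models, probs, candidate_cols):
    return {c: sum(p for m, p in zip(models, probs) if c in m)
            for c in candidate_cols}
```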

Case Study

Violin data* ($2^4$ factorial design with $n = 11$ replications)

Response: Decibels

Factors:

– A: Pressure (Low/High)
– B: Placement (Near/Far)
– C: Angle (Low/High)
– D: Speed (Low/High)

*Carla Padgett, STAT 706 taught by Don Edwards

Case Study

Fractional Factorial Design:

• A, B, and D significant

• AB marginal


Bayesian Model Selection:

• A, B, D, AB, AD, BD significant
• All others negligible