Designs with Randomization Restrictions

RCBD with a complete factorial in each block
– A: Cooling Method
– B: Temperature

Conduct ab experiments in each block


All factors are crossed

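$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \rho_k + (\alpha\rho)_{ik} + (\beta\rho)_{jk} + (\alpha\beta\rho)_{ijk}$$

$$i = 1, \ldots, a; \quad j = 1, \ldots, b; \quad k = 1, \ldots, n$$

(Here $\alpha_i$ and $\beta_j$ are the effects of factors A and B, and $\rho_k$ is the block effect.)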


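By convention, we assume there is no block-by-treatment interaction (the usual RCBD assumption), so that the block interaction terms serve as the error:

$$\varepsilon_{ijk} = (\alpha\rho)_{ik} + (\beta\rho)_{jk} + (\alpha\beta\rho)_{ijk}$$

Note that this is different from “pooling”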


A similar example uses a Latin Square design

The treatment is in fact a factorial experiment

With n = ab treatments, the n - 1 treatment df decompose as n - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)
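The df identity above is easy to check numerically. The following is a minimal sketch; the function name and the level counts a = 3, b = 4 are hypothetical and chosen only for illustration.

```python
def factorial_df(a, b):
    """Split the treatment df of an a x b factorial into A, B and A x B."""
    return {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1)}


a, b = 3, 4                 # hypothetical numbers of levels
n = a * b                   # number of treatments (side of the Latin square)
parts = factorial_df(a, b)
assert sum(parts.values()) == n - 1
print(parts)                # {'A': 2, 'B': 3, 'AB': 6}
```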

Split Plot Design

Two factor experiment in which a CRD within block is not feasible

Example (observational study)
– Blocks: Lake
– Whole plot: Stream
– Whole plot factor: Lampricide
– Split plot factor: Fish species
– Response: Lamprey scars

Agricultural Example
– Block: Field
– Whole Plot Factor: Tilling method
– Split Plot Factor: Seed variety

Whole plot and whole plot factor are confounded

This is true at split plot level as well, though confounding is thought to be less serious

One version of the model (See ex. 24.1):

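$$Y_{ijk} = \mu + \alpha_i + \rho_k + (\alpha\rho)_{ik} + \beta_j + (\alpha\beta)_{ij} + (\beta\rho)_{jk} + (\alpha\beta\rho)_{ijk}$$

($\alpha_i$: whole plot factor A; $\beta_j$: split plot factor B; $\rho_k$: block.)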

EMS Table--Whole Plot

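| Source | df | EMS |
|---|---|---|
| Whole Plot (A) | $a-1$ | $\sigma^2 + b\sigma^2_{\alpha\rho} + \dfrac{nb\sum_i \alpha_i^2}{a-1}$ |
| Block | $n-1$ | $\sigma^2 + ab\,\sigma^2_{\rho}$ |
| A x Block | $(a-1)(n-1)$ | $\sigma^2 + b\sigma^2_{\alpha\rho}$ |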

EMS Table--Split Plot

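| Source | df | EMS |
|---|---|---|
| Split Plot (B) | $b-1$ | $\sigma^2 + a\sigma^2_{\beta\rho} + \dfrac{an\sum_j \beta_j^2}{b-1}$ |
| Block x B | $(b-1)(n-1)$ | $\sigma^2 + a\sigma^2_{\beta\rho}$ |
| AB | $(a-1)(b-1)$ | $\sigma^2 + \sigma^2_{\alpha\beta\rho} + \dfrac{n\sum_i\sum_j (\alpha\beta)_{ij}^2}{(a-1)(b-1)}$ |
| Block x AB | $(a-1)(b-1)(n-1)$ | $\sigma^2 + \sigma^2_{\alpha\beta\rho}$ |

($\sigma^2$ is the observation-level error variance, which cannot be estimated separately here.)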

Note that there are no degrees of freedom for error

Block and Block x Treatment interactions cannot be tested
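The claim that no degrees of freedom remain for error can be checked by simple bookkeeping. This is a sketch only: the function name and the sizes a = 3, b = 4, n = 5 are hypothetical, and the source list assumes the fully crossed model with blocks, whole plot factor A, and split plot factor B.

```python
def split_plot_df(a, b, n):
    """df for a split plot in n blocks (a whole plot levels, b split plot levels)."""
    return {
        "Block": n - 1,
        "A": a - 1,
        "A x Block": (a - 1) * (n - 1),
        "B": b - 1,
        "B x Block": (b - 1) * (n - 1),
        "AB": (a - 1) * (b - 1),
        "AB x Block": (a - 1) * (b - 1) * (n - 1),
    }


a, b, n = 3, 4, 5                        # hypothetical design sizes
df = split_plot_df(a, b, n)
error_df = (a * b * n - 1) - sum(df.values())
print(error_df)                          # 0: every df is used by a model term
```

Because the model terms already account for all abn - 1 degrees of freedom, any test has to use one of the block interactions as its denominator.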

In an alternative formulation, SP x Block and SP x WP x Block are combined to form the Split Plot Error. Note the unusual subscript—a contrivance that yields the correct df.

$$Y_{ijk} = \mu + \alpha_i + \rho_k + (\alpha\rho)_{ik} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{jk(i)}$$

Yandell presents an alternate model

Useful when whole plots are replicated and no blocks are present

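$$Y_{ijk} = \mu + \alpha_i + \eta_{k(i)} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{jk(i)}$$

($\eta_{k(i)}$ is the whole plot error for whole plot $k$ within level $i$ of A; $\varepsilon_{jk(i)}$ is the split plot error.)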

EMS Table--Whole Plot

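| Source | df | EMS |
|---|---|---|
| A | $a-1$ | $\sigma^2 + b\sigma^2_{\eta} + \dfrac{nb\sum_i \alpha_i^2}{a-1}$ |
| Whole Plot Error | $a(n-1)$ | $\sigma^2 + b\sigma^2_{\eta}$ |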

EMS Table--Split Plot

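| Source | df | EMS |
|---|---|---|
| Split Plot (B) | $b-1$ | $\sigma^2 + \dfrac{an\sum_j \beta_j^2}{b-1}$ |
| AB | $(a-1)(b-1)$ | $\sigma^2 + \dfrac{n\sum_i\sum_j (\alpha\beta)_{ij}^2}{(a-1)(b-1)}$ |
| Split Plot Error | $a(b-1)(n-1)$ | $\sigma^2$ |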

Yandell considers the cases where the whole plot factor or the split plot factor does not appear:
– Split plot factor missing: the whole plots look like an RCBD (me) or a CRD (Yandell); the subplots are subsamples.
– Whole plot factor missing: the whole plots look like a one-way random effects design; the subplots again look like either an RCBD or a CRD.

Yandell has nice notes on LSMeans in Ex. 23.4

Split Split Plot Design

We can also construct a split split plot design (in the obvious way)

Montgomery example
– Block: Day
– Whole Plot: Technician receives batch
– Split Plot: Three dosage strengths formulated from batch
– Split split plot: Four wall thicknesses tested from each dosage strength formulation


Surgical Glove Example
– Block: Load of latex pellets
– Whole Plot: Latex preparation method
– Split Plot: Coagulant dip
– Split Split Plot: Heat treatment


A model version that facilitates testing:

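$$Y_{ijkl} = \mu + \alpha_i + \eta_{l(i)} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{jl(i)} + \tau_k + (\alpha\tau)_{ik} + (\beta\tau)_{jk} + (\alpha\beta\tau)_{ijk} + \delta_{kl(ij)}$$

($\eta$, $\varepsilon$ and $\delta$ are the whole plot, split plot and split split plot errors; $\tau_k$ is the split split plot factor.)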

Split Plot Design with Covariates

This discussion is most appropriate for the nested whole plots example

Often, researchers would like to include covariates confounded with factors


Example (Observational study)
– Whole Plot: School
– Whole Plot Factor: School District
– Split Plot Factor: Math Course
– Split Plot: Class
– Split Plot covariate: Teacher Rating
– Whole Plot covariate: School Rating
– Whole Plot Factor covariate: School District Rating
– Response: % Math Proficient (HSAP)


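Whole Plot Covariate
– $X_{ijk} = X_{ik}$
– $X_{ijk} = X_i$ occurs frequently in practice

Split Plot Covariate

$$X_{ijk} = \underbrace{(X_{ijk} - \bar{X}_{i \cdot k})}_{\text{SP Covariate}} + \underbrace{\bar{X}_{i \cdot k}}_{\text{WP Covariate}}$$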
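Splitting a split plot covariate into a whole plot mean and a within-whole-plot deviation can be sketched as follows. The function name, the dictionary layout keyed by (i, j, k), and the data values are all hypothetical; the index convention assumed is i = whole plot factor level, j = split plot factor level, k = whole plot within level i.

```python
from statistics import mean


def split_covariate(x):
    """x maps (i, j, k) -> covariate value; returns (whole plot part, split plot part)."""
    groups = {}
    for (i, j, k), value in x.items():
        groups.setdefault((i, k), []).append(value)      # collect split plots per whole plot
    wp_mean = {ik: mean(values) for ik, values in groups.items()}
    wp = {ijk: wp_mean[(ijk[0], ijk[2])] for ijk in x}   # whole plot mean, repeated over j
    sp = {ijk: x[ijk] - wp[ijk] for ijk in x}            # deviation from the whole plot mean
    return wp, sp


# Made-up covariate values for two whole plots (k = 1, 2) at level i = 1
x = {(1, 1, 1): 2.0, (1, 2, 1): 4.0, (1, 1, 2): 5.0, (1, 2, 2): 9.0}
wp, sp = split_covariate(x)
print(wp[(1, 1, 1)], sp[(1, 1, 1)])   # 3.0 -1.0
```

The split plot part sums to zero within each whole plot, so the two pieces carry separate information: one varies only between whole plots, the other only within them.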


Model

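$$Y_{ijk} = \mu + \alpha_i + \theta\,\bar{X}_{i \cdot k} + \eta_{k(i)} + \beta_j + \phi\,(X_{ijk} - \bar{X}_{i \cdot k}) + (\alpha\beta)_{ij} + \varepsilon_{jk(i)}$$

($\theta$ and $\phi$ denote the whole plot and split plot covariate slopes.)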


A Whole Plot covariate’s Type I MS would be tested against Whole Plot Error (with 1 fewer df because of confounding)

Split Plot Covariate is not confounded with any model terms (though it is confounded with the error term), so no adjustments are necessary

Repeated Measures Design

Read Yandell 25.1-25.3. Chapter 26 generally covers multivariate approaches to repeated measures; skip it.

We will study the traditional approach first, and then consider more sophisticated repeated measures correlation patterns


Looks like Yandell’s split plot design
– The whole plot structure looks like a nested design
– The split plot structure looks much the same

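$$Y_{ikm} = \mu + \alpha_i + r_{m(i)} + \tau_k + (\alpha\tau)_{ik} + \varepsilon_{km(i)}$$

$$i = 1, \ldots, a; \quad m = 1, \ldots, n; \quad k = 1, \ldots, t$$

($\alpha_i$: group; $r_{m(i)}$: subject within group; $\tau_k$: repeated measures factor.)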


Fuel Cell Example
– Response: Current
– Group: Control/Added H2O
– Subject(Group): Daily Experimental Run or Fuel Cell
– Repeated Measures Factor: Voltage


| Source | df |
|---|---|
| Group | a-1 |
| Subject(Group) | a(n-1) |
| Repeated Measures Factor | t-1 |
| Group x Repeated Measures | (a-1)(t-1) |
| Error | a(t-1)(n-1) |
| Total | atn-1 |
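The degrees of freedom in this table add up to atn - 1, which can be confirmed with a short helper. This is a sketch for a balanced design only; the function name and the sizes a = 2, n = 3, t = 8 are hypothetical.

```python
def repeated_measures_df(a, n, t):
    """df for a groups, n subjects per group, and t repeated measures (balanced)."""
    rows = {
        "Group": a - 1,
        "Subject(Group)": a * (n - 1),
        "Repeated Measures Factor": t - 1,
        "Group x Repeated Measures": (a - 1) * (t - 1),
        "Error": a * (t - 1) * (n - 1),
    }
    total = a * t * n - 1
    assert sum(rows.values()) == total   # the sources exhaust the total df
    rows["Total"] = total
    return rows


print(repeated_measures_df(a=2, n=3, t=8))
```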


A great deal of work has been conducted on repeated measures design over the last 15 years
– Non-normal data
– More complex covariance structure

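Under the traditional model, two measurements on the same subject ($k \ne k'$) share the subject effect, so:

$$\mathrm{Cov}(Y_{ikm}, Y_{ik'm}) = \mathrm{Cov}(r_{m(i)}, r_{m(i)}) = \sigma_P^2$$

$$\mathrm{Corr}(Y_{ikm}, Y_{ik'm}) = \frac{\sigma_P^2}{\sigma_P^2 + \sigma^2}$$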
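The covariance pattern implied by the traditional model can be written out directly. This is a minimal sketch, assuming the subject effect has variance sigma_P^2 and the independent error has variance sigma^2; the function names and the numerical values are hypothetical.

```python
def implied_cov(sigma_p2, sigma2, t):
    """Covariance of one subject's t measurements: sigma_P^2 everywhere, plus sigma^2 on the diagonal."""
    return [[sigma_p2 + (sigma2 if k == kp else 0.0) for kp in range(t)]
            for k in range(t)]


def within_subject_corr(sigma_p2, sigma2):
    """Correlation between any two measurements on the same subject."""
    return sigma_p2 / (sigma_p2 + sigma2)


print(implied_cov(3.0, 1.0, 3))        # [[4.0, 3.0, 3.0], [3.0, 4.0, 3.0], [3.0, 3.0, 4.0]]
print(within_subject_corr(3.0, 1.0))   # 0.75
```

Every pair of time points gets the same correlation no matter how far apart they are, which is the restriction that the more complex covariance structures relax.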


Fuel Cell Example
– Repeated Measures Factor: Voltage
– Response: Current
– Group: Control/Added H2O
– Subject: Fuel Cell


$$Y_{jkm} = \mu + T_j + R_{m(j)} + V_k + TV_{jk} + \varepsilon_{jkm}$$

$$j = 1, 2; \quad m(1) = 1, 2, 3; \quad m(2) = 1, 2; \quad k = 1, \ldots, 8$$


$Y_i = Y_{jm} = (Y_{j1m}, \ldots, Y_{j8m})'$

$\beta = (\mu, T_1, V_1, \ldots, V_7, TV_{11}, \ldots, TV_{17})'$

Mixed Models

The general mixed model is

$$Y_i = X_i\beta + Z_i\gamma + \varepsilon_i, \qquad \varepsilon_i \sim N(0, R), \qquad \gamma \sim N(0, G)$$


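For our example, we have no random effects (no $Z_i$ or $\gamma$) separate from the repeated measures effects captured in $R$. $X_1 = X_{1(1)}$ has the form below (assume $V_8 = TV_{18} = 0$). The columns are the intercept, $T_1$, $V_1, \ldots, V_7$, and $TV_{11}, \ldots, TV_{17}$; the rows are the eight voltages.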

1|1|1 0 0 0 0 0 0|1 0 0 0 0 0 0

1|1|0 1 0 0 0 0 0|0 1 0 0 0 0 0

…

1|1|0 0 0 0 0 0 1|0 0 0 0 0 0 1

1|1|0 0 0 0 0 0 0|0 0 0 0 0 0 0
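The matrix above can be generated mechanically from the coding rules. The sketch below uses a hypothetical function name and assumes, by analogy with V8 = TV18 = 0, that group 2 is the reference level (T2 = 0); the slide itself shows only a group 1 subject.

```python
def design_matrix(group, n_volt=8):
    """Rows of X for one subject: [intercept, T1, V1..V7, TV11..TV17]."""
    t1 = 1 if group == 1 else 0       # assumption: group 2 is the reference level
    rows = []
    for k in range(1, n_volt + 1):
        v = [1 if k == level else 0 for level in range(1, n_volt)]   # V8 = 0
        tv = [t1 * vk for vk in v]                                    # TV18 = 0
        rows.append([1, t1] + v + tv)
    return rows


for row in design_matrix(group=1):
    print(row)
```

For a group 1 subject this reproduces the pattern shown: the first seven rows each switch on one voltage indicator and the matching interaction column, and the last row contains only the intercept and T1.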

For many models we encounter, $R = \sigma^2 I$

In repeated measures models, R can have a lot more structure. E.g., for t timepoints, an AR(1) covariance structure would be:

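$$R = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho^{t-1} \\ \rho & 1 & \cdots & \rho^{t-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{t-1} & \rho^{t-2} & \cdots & 1 \end{pmatrix}$$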
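An AR(1) covariance matrix has entries sigma^2 * rho^|k - k'|, so it is fully determined by two parameters. A minimal sketch, with a hypothetical function name and made-up parameter values:

```python
def ar1_cov(sigma2, rho, t):
    """t x t AR(1) covariance matrix with entries sigma2 * rho ** |k - k'|."""
    return [[sigma2 * rho ** abs(k - kp) for kp in range(t)] for k in range(t)]


print(ar1_cov(sigma2=2.0, rho=0.5, t=3))
# [[2.0, 1.0, 0.5], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0]]
```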

Repeated Measures Structures

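Writing $\sigma_{kk'}$ for the $(k, k')$ element of $R$:

• Toeplitz: $\sigma_{kk'} = \sigma_{|k-k'|}$

• Unstructured: $\sigma_{kk'} = \sigma_{k'k}$ (no other constraint)

• Compound Symmetric: $\sigma_{kk} = \sigma^2$ and $\sigma_{kk'} = \rho\sigma^2$ for $k \ne k'$

• Banded Toeplitz: $\sigma_{kk'} = \sigma_{|k-k'|}$ if $|k-k'| \le r$, and $\sigma_{kk'} = 0$ if $|k-k'| > r$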
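These structures differ mainly in how many parameters they spend on the t x t matrix. The sketch below builds three of them and counts parameters; the function names are hypothetical, and the banding convention (lags up to and including r are nonzero) is an assumption that may differ from a given package's definition.

```python
def toeplitz_cov(sigmas):
    """sigmas[d] is the covariance at lag d; the matrix is t x t with t = len(sigmas)."""
    t = len(sigmas)
    return [[sigmas[abs(k - kp)] for kp in range(t)] for k in range(t)]


def banded_toeplitz_cov(sigmas, r):
    """Toeplitz, with covariances at lags greater than r set to zero (assumed convention)."""
    t = len(sigmas)
    return [[sigmas[abs(k - kp)] if abs(k - kp) <= r else 0.0 for kp in range(t)]
            for k in range(t)]


def compound_symmetric_cov(sigma2, rho, t):
    """Constant variance sigma2 and constant covariance rho * sigma2."""
    return [[sigma2 if k == kp else rho * sigma2 for kp in range(t)] for k in range(t)]


def n_cov_params(structure, t, r=None):
    """Number of free covariance parameters for t time points."""
    if structure == "banded toeplitz":
        return r + 1
    return {"unstructured": t * (t + 1) // 2, "toeplitz": t, "compound symmetric": 2}[structure]


print(banded_toeplitz_cov([4.0, 2.0, 1.0], r=1))
# [[4.0, 2.0, 0.0], [2.0, 4.0, 2.0], [0.0, 2.0, 4.0]]
print(n_cov_params("unstructured", 8), n_cov_params("toeplitz", 8))   # 36 8
```

With eight time points, the unstructured form needs 36 parameters against 8 for Toeplitz, which is why the choice of structure matters for small samples.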

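G almost always has a diagonal structure

Ex:

$$G = \begin{pmatrix} \sigma_\alpha^2 I_a & 0 & 0 \\ 0 & \sigma_\beta^2 I_b & 0 \\ 0 & 0 & \sigma_{\alpha\beta}^2 I_{ab} \end{pmatrix}$$

Regardless of the form for R and G, we can write $Y_i \sim N(X_i\beta, \; Z_i G Z_i' + R)$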

For the entire sample we have

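$$Y \sim N(X\beta, \; ZGZ' + R^*), \quad \text{where}$$

$$Y' = (Y_1', \ldots, Y_n'), \quad X' = (X_1', \ldots, X_n'), \quad Z' = (Z_1', \ldots, Z_n'), \quad R^* = I_n \otimes R$$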
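The full-sample error covariance R* = I_n ⊗ R is block diagonal, with one copy of R per subject. A minimal sketch of that construction follows; the function name and the 2 x 2 matrix are hypothetical, and it assumes every subject has the same number of time points.

```python
def block_diag_repeat(R, n):
    """Return I_n (Kronecker product) R as a nested list: n copies of R on the diagonal."""
    t = len(R)
    out = [[0.0] * (n * t) for _ in range(n * t)]
    for s in range(n):                    # subject index
        for k in range(t):
            for kp in range(t):
                out[s * t + k][s * t + kp] = R[k][kp]
    return out


R = [[2.0, 1.0], [1.0, 2.0]]              # made-up within-subject covariance
for row in block_diag_repeat(R, n=2):
    print(row)
```

The zeros off the diagonal blocks encode the assumption that measurements on different subjects are uncorrelated.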

Restricted MLE

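If $V = ZGZ' + R^*$ were known, the MLE for $\beta$ would be $\hat\beta = (X'V^{-1}X)^{-1}X'V^{-1}Y$

We would estimate the residuals as $e = Y - \hat{Y} = Y - X\hat\beta = Y - HY = (I - H)Y = PY$, where $H = X(X'V^{-1}X)^{-1}X'V^{-1}$.

Because $HX = X$, $PY$ has mean zero; it has a multivariate normal distribution that is a function of $(G, R)$ only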

The profile likelihood for the parameters of G and R would be based on the distribution of the residuals.

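$P$ has rank $n - q$, where $n$ is the length of $Y$ and $q$ is the rank of $X$ (the number of independent fixed-effect parameters); accounting for the degrees of freedom used to estimate the fixed effects reduces the bias of the variance component estimates.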

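The profile REML estimates of the parameters of G and R maximize:

$$l(G, R) = -\frac{1}{2}(n - q)\log(2\pi) - \frac{1}{2}\log|PVP'| - \frac{1}{2}e'V^{-1}e$$

(Because $PVP'$ is not of full rank, $|PVP'|$ is read as the product of its nonzero eigenvalues.)

Then compute

$$\hat\beta = (X'\hat{V}^{-1}X)^{-1}X'\hat{V}^{-1}Y$$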

Case Study

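To choose between non-hierarchical (non-nested) models, we select the best model based on the Akaike Information Criterion, where $\hat{L}$ is the maximized likelihood and $q$ is the number of random-effect (covariance) parameters estimated:

$$AIC = \ln\hat{L} - q \qquad \text{or} \qquad AIC = -2\ln\hat{L} + 2q$$

Larger is better for the first form; smaller is better for the second.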
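The two forms differ only by a factor of -2, so they always rank models the same way. The sketch below illustrates this with made-up log-likelihoods and parameter counts for three hypothetical models labelled A, B and C; these numbers are not results from the case study.

```python
def aic_larger_is_better(loglik, q):
    """First form: ln(L-hat) - q."""
    return loglik - q


def aic_smaller_is_better(loglik, q):
    """Second form: -2 ln(L-hat) + 2q."""
    return -2.0 * loglik + 2.0 * q


# Hypothetical fits: (maximized log-likelihood, number of covariance parameters)
fits = {"A": (-120.0, 2), "B": (-112.0, 2), "C": (-105.0, 8)}

best_small = min(fits, key=lambda m: aic_smaller_is_better(*fits[m]))
best_large = max(fits, key=lambda m: aic_larger_is_better(*fits[m]))
print(best_small, best_large)   # C C
```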

Autocorrelation was strong

A Toeplitz model worked best

Voltage effect, as expected, was strong

Treatment effect was marginal

Voltage x Treatment effect was strong to moderate

A note on fitting

REML essentially uses OLS to estimate the mean parameters, then uses these estimated mean parameters to estimate mean-zero residuals, then uses maximum likelihood to estimate variance components from the residuals. The variance component estimates are then used in GLS (more general than WLS) to re-estimate the mean parameters. This results in unbiased estimates of variance components.

I’ve seen versions that plug in WLS estimates of the mean parameters assuming the variance components are known to create a likelihood for the variance components from the residuals. The mean parameters are then estimated using GLS, as above.

Maximum likelihood simply estimates the mean parameters and the variance components simultaneously. In either case the random effects $\gamma$ are not part of the likelihood. They can be predicted after the population parameters are estimated, using Bayes' rule; the resulting predictions are called “BLUPs,” for best linear unbiased predictors.
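The practical difference between ML and REML is easiest to see in the simplest possible model, independent observations with a common mean and variance. That model is an assumption made here purely for illustration and is not one of the designs above. In that case ML divides the residual sum of squares by n, while REML divides by n - 1 to account for the one estimated mean parameter. The function name and data are hypothetical.

```python
def ml_and_reml_variance(y):
    """Variance estimates for Y_i = mu + e_i with independent, equal-variance errors."""
    n = len(y)
    ybar = sum(y) / n                          # estimate of the single mean parameter
    ss = sum((v - ybar) ** 2 for v in y)       # residual sum of squares
    return ss / n, ss / (n - 1)                # (ML, REML)


ml, reml = ml_and_reml_variance([4.0, 6.0, 8.0, 10.0])
print(ml, reml)   # ML is 5.0; REML is 20/3, slightly larger
```

The ML estimate is biased downward because it ignores the degree of freedom spent on the mean; the REML divisor corrects for that.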

PROC MIXED models repeated measures effects with a REPEATED statement, while GLIMMIX uses RANDOM _RESIDUAL_ . The default fitting method for PROC MIXED for the normal linear mixed model is METHOD=REML. PROC GLIMMIX uses METHOD=RSPL; these are equivalent. METHOD=MSPL in PROC GLIMMIX is equivalent to METHOD=ML in PROC MIXED. PROC GLIMMIX has more optimization methods available, and uses a different default optimization method from PROC MIXED.

PROC GLM uses Method-of-Moments to estimate variance components. It constructs appropriate F-tests, but doesn’t build randomness of effects into tests or estimation of main effects, lsmeans, contrasts, etc. PROC MIXED has some more complex covariance structures (Kronecker-type for spatiotemporal models) that GLIMMIX lacks. Some of PROC GLIMMIX’s most useful features are not available in PROC MIXED: COVTEST, LSMESTIMATE, OUTPUT.
