introduction to randomized block designs

Post on 20-Mar-2022

Page 1: Introduction to Randomized block designs

Reduced slides

Introduction to Randomized block designs

Accounting for predicted but random variance

[Figure: layout of four blocks (Block 1 to Block 4), each containing treatments A, B, C and D]

Blocking

• Aim:

– Reduce unexplained variation without increasing the size of the experiment.

• Approach:

– Group experimental units (“replicates”) into blocks.

– Blocks are usually spatial units, with one experimental unit from each treatment in each block.
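The blocking approach can be sketched in code. This is a minimal illustration rather than anything taken from the slides: it places every treatment once in every block and randomizes the order independently within each block. The function name `randomize_blocks`, the treatment labels and the seed are assumptions made for the example.

```python
import random


def randomize_blocks(treatments, n_blocks, seed=None):
    """Return a dict mapping block number -> randomly ordered treatments.

    Every treatment appears exactly once in every block, which is the
    defining property of a randomized complete block design.
    """
    rng = random.Random(seed)  # the seed only makes the example repeatable
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        layout[block] = order
    return layout


layout = randomize_blocks(["A", "B", "C", "D"], n_blocks=4, seed=1)
for block, order in layout.items():
    print(f"Block {block}: {' '.join(order)}")
```

In a completely randomized design the shuffle would instead be applied once to all experimental units, ignoring block membership.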

Walter & O’Dowd (1992)

• Effects of domatia (cavities on leaves) on number of mites

• Two treatments (Factor A):

– shaving domatia removes domatia from leaves

– normal domatia as control

• Required 14 leaves for each treatment

[Figure: control leaves and shaved-domatia leaves]

Completely randomized design:

– 28 leaves randomly allocated between the 2 treatments (14 leaves each)

Completely randomized ANOVA

• Factor A with p groups (p = 2 for domatia)

• n replicates within each group (n = 14 leaves)

Source      General df   Example df
Factor A    p-1          1
Residual    p(n-1)       26
Total       pn-1         27

Walter & O’Dowd (1992)

• Required 14 leaves for each treatment

• Set up as blocked design

– paired leaves (14 pairs) chosen

– 1 leaf in each pair shaved, 1 leaf in each pair control

[Figure: control leaves and shaved-domatia leaves arranged in pairs; each pair is 1 block]

Rationale for blocking

• Micro-temperature, humidity, leaf age, etc. more similar within a block than between blocks

• Variation in response variable (mite number) between leaves within a block (leaf pair) < variation between leaves in different blocks

Rationale for blocking

• Some of the unexplained (residual) variation in the response variable from the completely randomized design is now explained by differences between blocks

• More precise estimate of treatment effects than if leaves were chosen completely randomly

Null hypotheses

• No main effect of Factor A

– H0: $\mu_1 = \mu_2 = \dots = \mu_i = \dots = \mu$

– H0: $\alpha_1 = \alpha_2 = \dots = \alpha_i = \dots = 0$ (where $\alpha_i = \mu_i - \mu$)

– no effect of shaving domatia, pooling blocks

• Factor A usually fixed

Null hypotheses

• No effect of factor B (blocks):

– no difference between blocks (leaf pairs), pooling treatments

• Blocks usually random factor:

– sample of blocks from population of blocks

– H0: $\sigma_\beta^2 = 0$

Randomized blocks ANOVA

• Factor A with p groups (p = 2 treatments for domatia)

• Factor B with q blocks (q = 14 pairs of leaves)

Source              General df    Example df
Factor A            p-1           1
Factor B (blocks)   q-1           13
Residual            (p-1)(q-1)    13
Total               pq-1          27

Notice that the residual df is (p-1)(q-1), not pq(n-1).
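The degrees of freedom for the two designs follow directly from p, n and q. The short sketch below reproduces the example columns for the domatia experiment; the function names `crd_df` and `rcb_df` are made up for this illustration.

```python
def crd_df(p, n):
    """Degrees of freedom for a completely randomized design."""
    return {"Factor A": p - 1, "Residual": p * (n - 1), "Total": p * n - 1}


def rcb_df(p, q):
    """Degrees of freedom for an unreplicated randomized block design."""
    return {
        "Factor A": p - 1,
        "Blocks": q - 1,
        "Residual": (p - 1) * (q - 1),
        "Total": p * q - 1,
    }


crd = crd_df(p=2, n=14)  # domatia example: 2 treatments, 14 leaves each
rcb = rcb_df(p=2, q=14)  # domatia example: 2 treatments, 14 leaf pairs
print("Completely randomized:", crd)
print("Randomized block:     ", rcb)
```

Both designs use 28 leaves and have 27 total df, but the blocked design moves 13 of the 26 residual df into the block term.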

Randomized block ANOVA

• Randomized block ANOVA is a 2 factor factorial design

– BUT no replicates (n) within each cell (treatment-block combination), i.e. unreplicated 2 factor design

– No measure of within-cell variation

– No test for treatment by block interaction

General Randomized Block Mean Square calculation

If factor A fixed and factor B (Blocks) random:

Expected mean squares

MSwatering (factor A): $\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum (\alpha_i)^2 / (p-1)$

MSBlocks: $\sigma^2 + n \sigma_\beta^2$

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

Testing null hypotheses

• Factor A fixed and blocks random

• If H0 no effects of factor A is true:

– then F-ratio MSA / MSResidual ≈ 1

• If H0 no variance among blocks is true:

– no F-ratio for test unless no interaction assumed

– if blocks fixed, then F-ratio MSB / MSResidual ≈ 1
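The reasoning behind these F-ratios can be written out from the expected mean squares. The block below is a sketch in the slides' notation, assuming the symbols lost in extraction are $\sigma^2$ (error variance), $\sigma_{\alpha\beta}^2$ (interaction variance), $\sigma_\beta^2$ (block variance) and a summation over treatments.

```latex
\[
\frac{E(MS_A)}{E(MS_{Residual})}
  = \frac{\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum_i \alpha_i^2 / (p-1)}
         {\sigma^2 + \sigma_{\alpha\beta}^2}
  = 1 \quad \text{when all } \alpha_i = 0
\]
\[
\frac{E(MS_{Blocks})}{E(MS_{Residual})}
  = \frac{\sigma^2 + n \sigma_\beta^2}
         {\sigma^2 + \sigma_{\alpha\beta}^2}
  = 1 \quad \text{when } \sigma_\beta^2 = 0
        \text{ only if } \sigma_{\alpha\beta}^2 = 0
\]
```

The first ratio equals 1 under the null hypothesis whether or not an interaction is present. The second does so only when the interaction variance is zero, which is why the test for random blocks requires the no-interaction assumption.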

Walter & O’Dowd (1992)

• Factor A (treatment - shaved and unshaved domatia) - fixed

• Blocks (14 pairs of leaves) - random

Source      df   MS      F       P
Treatment   1    31.34   11.32   0.005
Block       13   1.77    0.64    0.784 ??
Residual    13   2.77

Should this (the F-test for blocks) be reported??
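The F-ratios in this table can be checked from the mean squares. The sketch below uses only the values printed on the slide; the p-values are not recomputed because that would need an F-distribution routine from outside the standard library.

```python
# Mean squares and df as reported in the slide's ANOVA table.
ms = {"Treatment": 31.34, "Block": 1.77, "Residual": 2.77}
df = {"Treatment": 1, "Block": 13, "Residual": 13}

f_treatment = ms["Treatment"] / ms["Residual"]
# The block ratio is only a valid test if no interaction is assumed.
f_block = ms["Block"] / ms["Residual"]

print(f"F(treatment) = {f_treatment:.2f} on {df['Treatment']},{df['Residual']} df")
print(f"F(block)     = {f_block:.2f} on {df['Block']},{df['Residual']} df")
```

The treatment ratio agrees with the tabled 11.32 to within rounding of the printed mean squares. The block mean square is smaller than the residual mean square, so the block F-ratio falls below 1.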

Randomized Block vs Completely Randomized designs

• Total number of experimental units same in both designs

– 28 leaves in total for domatia experiment

• Test of factor A (treatments) has fewer residual df in block design:

– reduced power of test

RCB vs CR designs

• MSResidual smaller in block design if blocks explain some of variation in Y:

– increased power of test

• If decrease in MSResidual (unexplained variation) outweighs loss of df, then block design is better:

– when blocks explain much of variation in Y

Assumptions

• Normality of response variable

– boxplots etc.

• No interaction between blocks and factor A, otherwise

– MSResidual increases proportionally more than MSA, reducing the power of the F-ratio test for A (treatments)

– interpretation of main effects may be difficult, just like replicated factorial ANOVA

Interaction plots

[Figure: two interaction plots of Y against Block, one panel labelled "No interaction" and one labelled "Interaction"]

Worked Example – seastar colors

• Comparison of numbers of purple vs orange seastars along the CA coast

• Data: number of purple and orange seastars collected at 7 random locations

• Compare models (block vs completely random vs paired t test)

sea star colors all sites two sample

[Figure: Number of Seastars as a function of color and site]

Any obvious problem with the data??

Diagnostics: Log transform helps normality and homogeneity of variance assumptions

Model 1: One factor ANOVA

Why the difference?

SE=0.184

SE=0.181

Model 2: Paired t test

• Accounts for site specific (block) differences

• But no way to assess site (block) differences

[Figure: values (1.0 to 3.5) of LORANGE and LPURPLE plotted against Index of Case]

Model 3: Randomized Block Design - using least squares

• Accounts for and assesses (with a caveat) site specific effects

1) Compare to paired t test: same p-value for Color, but the paired t test gives no Site effect

2) Compare to single factor ANOVA (look at p-value for Color). Here the tradeoff between df and partitioning of variance makes for a more powerful test

Be careful

Any hint of Interaction (site*color)? If not then how does this change our interpretation of results?

[Figure: interaction plot of LNUMBER (1.0 to 3.5) against SITE (Govpt, BoatStair, Hazards, Shell Beach, Cayucos, PSN), grouped by COLOR (Purple, Orange)]

If factor A fixed and factor B (Blocks) random:

MSA: $\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum (\alpha_i)^2 / (p-1)$

MSBlocks: $\sigma^2 + n \sigma_\beta^2$

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

Model 3: Randomized Block Design - using (restricted) Maximum Likelihood Estimation

• Accounts for site specific effects

1) Variance component used to calculate percent of variance associated with the random effect

2) P-value for Color is identical to that from the Least Squares Estimation (this will always be true for balanced designs)

Identical to least squares solution

Model 3: Mixed Model Solution

• Also accounts for site specific effects

Identical to least squares solution and REML

Examples of randomized block designs

• Effect of feeding time (pre, post) on metabolic rate in otters. Each otter is measured twice (pre, post). Hence otter ID is the random effect unless????

• Effect of Health Care reform on percentage of insured people in counties of CA. Each county is measured twice (pre, post). Hence county is the random effect unless??

• Effect of watering regime (0, 1, 2, 4, 6 times weekly in replicate plots). Each treatment (ttt) is in each of 10 plots. Plots are random effect.

• Effect of gender on grades in replicated classrooms. Grades for males and females are measured in each of 20 classrooms. Classrooms (teachers) are a random effect unless??

Full set of slides

Introduction to Randomized block designs

Accounting for predicted but random variance

[Figure: layout of four blocks (Block 1 to Block 4), each containing treatments A, B, C and D]

Blocking

• Aim:

– Reduce unexplained variation without increasing the size of the experiment.

• Approach:

– Group experimental units (“replicates”) into blocks.

– Blocks are usually spatial units, with one experimental unit from each treatment in each block.

Walter & O’Dowd (1992)

• Effects of domatia (cavities on leaves) on number of mites

• Two treatments (Factor A):

– shaving domatia removes domatia from leaves

– normal domatia as control

• Required 14 leaves for each treatment

[Figure: control leaves and shaved-domatia leaves]

Completely randomized design:

– 28 leaves randomly allocated between the 2 treatments (14 leaves each)

Completely randomized ANOVA

• Factor A with p groups (p = 2 for domatia)

• n replicates within each group (n = 14 leaves)

Source      General df   Example df
Factor A    p-1          1
Residual    p(n-1)       26
Total       pn-1         27

Walter & O’Dowd (1992)

• Required 14 leaves for each treatment

• Set up as blocked design

– paired leaves (14 pairs) chosen

– 1 leaf in each pair shaved, 1 leaf in each pair control

[Figure: control leaves and shaved-domatia leaves arranged in pairs; each pair is 1 block]

Rationale for blocking

• Micro-temperature, humidity, leaf age, etc. more similar within a block than between blocks

• Variation in response variable (mite number) between leaves within a block (leaf pair) < variation between leaves in different blocks

Rationale for blocking

• Some of the unexplained (residual) variation in the response variable from the completely randomized design is now explained by differences between blocks

• More precise estimate of treatment effects than if leaves were chosen completely randomly

Null hypotheses

• No main effect of Factor A

– H0: $\mu_1 = \mu_2 = \dots = \mu_i = \dots = \mu$

– H0: $\alpha_1 = \alpha_2 = \dots = \alpha_i = \dots = 0$ (where $\alpha_i = \mu_i - \mu$)

– no effect of shaving domatia, pooling blocks

• Factor A usually fixed

Null hypotheses

• No effect of factor B (blocks):

– no difference between blocks (leaf pairs), pooling treatments

• Blocks usually random factor:

– sample of blocks from population of blocks

– H0: $\sigma_\beta^2 = 0$

Randomized blocks ANOVA

• Factor A with p groups (p = 2 treatments for domatia)

• Factor B with q blocks (q = 14 pairs of leaves)

Source              General df    Example df
Factor A            p-1           1
Factor B (blocks)   q-1           13
Residual            (p-1)(q-1)    13
Total               pq-1          27

Notice that the residual df is (p-1)(q-1), not pq(n-1).

Randomized block ANOVA

• Randomized block ANOVA is a 2 factor factorial design

– BUT no replicates (n) within each cell (treatment-block combination), i.e. unreplicated 2 factor design

– No measure of within-cell variation

– No test for treatment by block interaction

Example – effect of watering on plant growth

• Factor 1: Watering, no watering

• Factor 2: Blocks (1-4). One replicate of each treatment (watering, no watering) in each of 4 plots

• Replication: 1 plant for each watering/block combination (8 total)

[Figure legend: No water, Water]

Results

[Figure: Growth (0 to 15) plotted against Treatment (No Water, Water)]

Block   Treatment   Growth
1       No Water    6
1       Water       10
2       No Water    4
2       Water       6
3       No Water    11
3       Water       15
4       No Water    5
4       Water       8
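The randomized block partition for these eight growth values can be computed by hand. The following is a minimal sketch using only the data table above and the standard sums-of-squares formulas for an unreplicated two-factor layout; the variable names are chosen for this illustration.

```python
# Growth data from the watering example (block -> growth per treatment).
data = {
    1: {"No Water": 6, "Water": 10},
    2: {"No Water": 4, "Water": 6},
    3: {"No Water": 11, "Water": 15},
    4: {"No Water": 5, "Water": 8},
}

blocks = sorted(data)
treatments = ["No Water", "Water"]
p, q = len(treatments), len(blocks)

values = [data[b][t] for b in blocks for t in treatments]
grand_mean = sum(values) / len(values)
treat_mean = {t: sum(data[b][t] for b in blocks) / q for t in treatments}
block_mean = {b: sum(data[b][t] for t in treatments) / p for b in blocks}

# Sums of squares for an unreplicated two-factor (randomized block) layout.
ss_total = sum((y - grand_mean) ** 2 for y in values)
ss_treat = q * sum((m - grand_mean) ** 2 for m in treat_mean.values())
ss_block = p * sum((m - grand_mean) ** 2 for m in block_mean.values())
ss_resid = ss_total - ss_treat - ss_block

ms_treat = ss_treat / (p - 1)
ms_block = ss_block / (q - 1)
ms_resid = ss_resid / ((p - 1) * (q - 1))
f_treat = ms_treat / ms_resid

print(f"SS: treatment {ss_treat:.3f}, block {ss_block:.3f}, residual {ss_resid:.3f}")
print(f"MS: treatment {ms_treat:.3f}, block {ms_block:.3f}, residual {ms_resid:.3f}")
print(f"F(treatment) = {f_treat:.2f} on {p - 1},{(p - 1) * (q - 1)} df")
```

Most of the total variation (72.375 of 94.875) lies between blocks, which leaves a small residual and a large treatment F-ratio.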

Results

[Figure: Growth (0 to 15) plotted against Block (1 to 4); same data table as above]

Results

[Figure: Growth (0 to 15) plotted against Block (1 to 4), grouped by Treatment (No Water, Water); same data table as above]

If factor A fixed and factor B (Blocks) random:

Expected mean squares

MSwatering

MSBlocks

MSResidual

[Figure: Growth (0 to 15) plotted against Block (1 to 4), grouped by Treatment (No Water, Water); same data table as above]

EMSResidual = $\sigma^2 + \sigma_{\alpha\beta}^2$: Why is this not simply $\sigma^2$?

Residual

• Cannot separately estimate $\sigma^2$ and $\sigma_{\alpha\beta}^2$:

– no replicates within each block-treatment combination

• MSResidual estimates $\sigma^2 + \sigma_{\alpha\beta}^2$

If factor A fixed and factor B (Blocks) random:

Expected mean squares

MSwatering

MSBlocks

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

[Figure: Growth (0 to 15) plotted against Block (1 to 4); same data table as above]

EMSBlocks = $\sigma^2 + n \sigma_\beta^2$

If factor A fixed and factor B (Blocks) random:

Expected mean squares

MSwatering

MSBlocks: $\sigma^2 + n \sigma_\beta^2$

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

[Figure: Growth (0 to 15) plotted against Treatment (No Water, Water); same data table as above]

EMSwatering = $\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum (\alpha_i)^2 / (p-1)$

Why does the EMS for watering include $\sigma_{\alpha\beta}^2$, which is the effect of the interaction? Block is a random effect, hence there are unsampled combinations of block and watering that could affect the estimates of EMSwatering.

General Randomized Block Mean Square calculation

If factor A fixed and factor B (Blocks) random:

Expected mean squares

MSwatering (factor A): $\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum (\alpha_i)^2 / (p-1)$

MSBlocks: $\sigma^2 + n \sigma_\beta^2$

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

Testing null hypotheses

• Factor A fixed and blocks random

• If H0 no effects of factor A is true:

– then F-ratio MSA / MSResidual ≈ 1

• If H0 no variance among blocks is true:

– no F-ratio for test unless no interaction assumed

– if blocks fixed, then F-ratio MSB / MSResidual ≈ 1

Walter & O’Dowd (1992)

• Factor A (treatment - shaved and unshaved domatia) - fixed

• Blocks (14 pairs of leaves) - random

Source      df   MS      F       P
Treatment   1    31.34   11.32   0.005
Block       13   1.77    0.64    0.784 ??
Residual    13   2.77

Should this (the F-test for blocks) be reported??

Explanation

            Blocks
Treatment   1   2   3   4   5   6   7   8   9   10  11  12  13
Shaved      1   1   1   1   1   1   1   1   1   1   1   1   1
Control     1   1   1   1   1   1   1   1   1   1   1   1   1

Cells represent the possible effect of the Block by Treatment interaction but:

1) There is only one replicate per cell, therefore

2) No way to estimate variance term for each cell, therefore

3) No way to estimate the variance associated with the interaction, therefore

4) The residual term estimates $\sigma^2 + \sigma_{\alpha\beta}^2$

Randomized Block vs Completely Randomized designs

• Total number of experimental units same in both designs

– 28 leaves in total for domatia experiment

• Test of factor A (treatments) has fewer residual df in block design:

– reduced power of test

RCB vs CR designs

• MSResidual smaller in block design if blocks explain some of variation in Y:

– increased power of test

• If decrease in MSResidual (unexplained variation) outweighs loss of df, then block design is better:

– when blocks explain much of variation in Y

Assumptions

• Normality of response variable

– boxplots etc.

• No interaction between blocks and factor A, otherwise

– MSResidual increases proportionally more than MSA, reducing the power of the F-ratio test for A (treatments)

– interpretation of main effects may be difficult, just like replicated factorial ANOVA

Checks for interaction

• No real test because no within-cell variation measured

• Tukey’s test for non-additivity:

– detects some forms of interaction

• Plot treatment values against block (“interaction plot”)
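Tukey's test for non-additivity can be sketched with the standard one-degree-of-freedom formula. The code below is an illustration, not the slides' own calculation: it applies the test to the watering example from earlier in the full slide set, and the function name `tukey_nonadditivity` is an assumption. The resulting F-ratio would be compared with an F distribution on 1 and (p-1)(q-1)-1 df, which is not computed here.

```python
def tukey_nonadditivity(table):
    """Tukey's one-df test for non-additivity.

    `table` is a list of rows (treatments), each a list of values (blocks),
    with exactly one observation per treatment-block cell.
    Returns (SS_nonadditivity, SS_remainder, F, df_remainder).
    """
    p, q = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (p * q)
    # Estimated treatment and block effects (deviations from the grand mean).
    a = [sum(row) / q - grand for row in table]
    b = [sum(table[i][j] for i in range(p)) / p - grand for j in range(q)]

    # Residual SS from the additive (no-interaction) model.
    ss_resid = sum(
        (table[i][j] - grand - a[i] - b[j]) ** 2
        for i in range(p) for j in range(q)
    )
    # One-df component for an interaction proportional to a[i] * b[j].
    cross = sum(table[i][j] * a[i] * b[j] for i in range(p) for j in range(q))
    ss_nonadd = cross ** 2 / (sum(x * x for x in a) * sum(x * x for x in b))

    df_rem = (p - 1) * (q - 1) - 1
    ss_rem = ss_resid - ss_nonadd
    f_ratio = ss_nonadd / (ss_rem / df_rem)
    return ss_nonadd, ss_rem, f_ratio, df_rem


# Watering example: rows are No Water and Water, columns are blocks 1 to 4.
table = [[6, 4, 11, 5], [10, 6, 15, 8]]
ss_nonadd, ss_rem, f_ratio, df_rem = tukey_nonadditivity(table)
print(f"SS non-additivity = {ss_nonadd:.3f}, SS remainder = {ss_rem:.3f}")
print(f"F = {f_ratio:.2f} on 1,{df_rem} df")
```

The test only detects interactions of this multiplicative form, which is why the slide says it detects "some forms" of interaction.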

Interaction plots

[Figure: two interaction plots of Y against Block, one panel labelled "No interaction" and one labelled "Interaction"]

Growth of Plantago

• Growth of five genotypes (3 fast, 2 slow) of Plantago major (ribwort)

• Poorter et al. (1990)

• One replicate seedling of each genotype placed in each of 7 plastic containers in growth chamber

– Genotypes (1, 2, 3, 4, 5) are factor A

– Containers (1 to 7) are blocks

– Response variable is total plant weight (g) after 12 days

Poorter et al. (1990)

[Figure: genotypes 1 to 5 arranged within Container 1 and Container 2; similarly for containers 3, 4, 5, 6 and 7]

Source     df   MS      F      P
Genotype   4    0.125   3.81   0.016
Block      6    0.118
Residual   24   0.033
Total      34

Conclusions:

• Large variation between containers (= blocks) so block design probably better than completely randomized design

• Significant difference in growth between genotypes

Mussel recruitment and seastars

• Effect of increased mussel (Mytilus spp.) recruitment on seastar numbers

• Robles et al. (1995)

– Two treatments: 30-40 L of Mytilus (0.5-3.5 cm long) added, no Mytilus added

– Four matched pairs (blocks) of mussel beds chosen

– Treatments randomly assigned to mussel beds within pair

– Response variable: % change in seastar numbers

[Figure: four pairs of mussel beds; each pair is 1 block, with one mussel bed with added mussels (+) and one mussel bed without added mussels (-)]

Source      df   MS        F       P
Blocks      3    62.82
Treatment   1    5237.21   45.50   0.007
Residual    3    115.09

Conclusions:

• Relatively little variation between blocks so a completely randomized design probably better because treatments would have 1,6 df

• Significant treatment effect - more seastars where mussels added
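The conclusions about whether blocking paid off can be checked roughly from the three ANOVA tables in the slides. The sketch below pools the block and residual sums of squares to approximate the residual mean square that would result if blocks were ignored. This is an approximation adopted for illustration, not an analysis from the slides, and a genuinely completely randomized experiment would have allocated units differently.

```python
# (df, MS) for blocks, treatment and residual, copied from the slides' tables.
tables = {
    "Domatia (Walter & O'Dowd 1992)": {
        "block": (13, 1.77), "treat": (1, 31.34), "resid": (13, 2.77)},
    "Plantago (Poorter et al. 1990)": {
        "block": (6, 0.118), "treat": (4, 0.125), "resid": (24, 0.033)},
    "Mussels (Robles et al. 1995)": {
        "block": (3, 62.82), "treat": (1, 5237.21), "resid": (3, 115.09)},
}


def pooled_residual(block, resid):
    """Pool block and residual SS, as if blocks had been ignored."""
    df = block[0] + resid[0]
    ss = block[0] * block[1] + resid[0] * resid[1]
    return df, ss / df


results = {}
for name, t in tables.items():
    df_pool, ms_pool = pooled_residual(t["block"], t["resid"])
    f_block = t["treat"][1] / t["resid"][1]   # F from the blocked analysis
    f_pool = t["treat"][1] / ms_pool          # approximate F ignoring blocks
    results[name] = (f_block, f_pool, df_pool)
    print(f"{name}: F blocked = {f_block:.2f} (resid df {t['resid'][0]}), "
          f"F unblocked = {f_pool:.2f} (resid df {df_pool})")
```

For Plantago the pooled F-ratio is smaller than the blocked one, consistent with the conclusion that blocking was worthwhile. For the mussel data, and by this rough measure the domatia data, the pooled F-ratio is larger and has more residual df, consistent with blocks explaining little of the variation.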

Worked Example – seastar colors

• Comparison of numbers of purple vs orange seastars along the CA coast

• Data: number of purple and orange seastars collected at 7 random locations

• Compare models (block vs completely random vs paired t test)

sea star colors all sites two sample

[Figure: Number of Seastars as a function of color and site]

Any obvious problem with the data??

Diagnostics: Log transform helps normality and homogeneity of variance assumptions

Model 1: One factor ANOVA

Why the difference?

SE=0.184

SE=0.181

Model 2: Paired t test

• Accounts for site specific (block) differences

• But no way to assess site (block) differences

[Figure: values (1.0 to 3.5) of LORANGE and LPURPLE plotted against Index of Case]

Model 3: Randomized Block Design - using least squares

• Accounts for and assesses (with a caveat) site specific effects

1) Compare to paired t test: same p-value for Color, but the paired t test gives no Site effect

2) Compare to single factor ANOVA (look at p-value for Color). Here the tradeoff between df and partitioning of variance makes for a more powerful test

Be careful
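The paired t test and the randomized block ANOVA give the same p-value for a two-level factor because the block-design F-ratio equals the square of the paired t statistic. The seastar data are not reproduced in this transcript, so the sketch below illustrates the identity with the growth values from the watering example earlier in the full slide set.

```python
import math
import statistics

no_water = [6, 4, 11, 5]   # blocks 1 to 4
water = [10, 6, 15, 8]

# Paired t test on the within-block differences.
diffs = [w - nw for w, nw in zip(water, no_water)]
q = len(diffs)
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(q))

# Randomized block ANOVA for p = 2 treatments.
values = no_water + water
grand = statistics.mean(values)
ss_treat = q * sum((statistics.mean(g) - grand) ** 2 for g in (no_water, water))
block_means = [(nw + w) / 2 for nw, w in zip(no_water, water)]
ss_block = 2 * sum((m - grand) ** 2 for m in block_means)
ss_total = sum((y - grand) ** 2 for y in values)
ms_resid = (ss_total - ss_treat - ss_block) / (q - 1)  # (p-1)(q-1) = q-1 df
f_stat = ss_treat / ms_resid                           # MS treatment has 1 df

print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
```

The paired test works only on the differences, so it has no term from which a block (site) effect could be assessed.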

Any hint of Interaction (site*color)? If not then how does this change our interpretation of results?

[Figure: interaction plot of LNUMBER (1.0 to 3.5) against SITE (Govpt, BoatStair, Hazards, Shell Beach, Cayucos, PSN), grouped by COLOR (Purple, Orange)]

If factor A fixed and factor B (Blocks) random:

MSA: $\sigma^2 + \sigma_{\alpha\beta}^2 + n \sum (\alpha_i)^2 / (p-1)$

MSBlocks: $\sigma^2 + n \sigma_\beta^2$

MSResidual: $\sigma^2 + \sigma_{\alpha\beta}^2$

Model 3: Randomized Block Design - using (restricted) Maximum Likelihood Estimation

• Accounts for site specific effects

1) Variance component used to calculate percent of variance associated with the random effect

2) P-value for Color is identical to that from the Least Squares Estimation (this will always be true for balanced designs)

Identical to least squares solution
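The idea of a variance component and the percent of variance it represents can be illustrated from mean squares alone. The sketch below uses the ANOVA (method-of-moments) estimator rather than REML itself, applied to the Plantago table from earlier in the slides because the seastar output is not reproduced in this transcript. It assumes no block by treatment interaction, and it takes the multiplier n in the expected mean square for blocks to be the number of observations per block (an assumption about the slides' notation).

```python
def block_variance_component(ms_block, ms_resid, obs_per_block):
    """ANOVA (method-of-moments) estimate of the among-block variance.

    Solves E(MS_block) = sigma^2 + n * sigma_beta^2 with MS_resid standing
    in for sigma^2, so it assumes no block-by-treatment interaction.
    Negative estimates are set to zero.
    """
    return max(0.0, (ms_block - ms_resid) / obs_per_block)


# Plantago table from the slides: 5 genotypes in each container (block).
ms_block, ms_resid, p = 0.118, 0.033, 5
var_block = block_variance_component(ms_block, ms_resid, p)
percent_block = 100 * var_block / (var_block + ms_resid)

print(f"block variance component = {var_block:.4f}")
print(f"percent of random variance due to blocks = {percent_block:.1f}%")
```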

Model 3: Mixed Model Solution

• Also accounts for site specific effects

Identical to least squares solution and REML

Examples of randomized block designs

• Effect of feeding time (pre, post) on metabolic rate in otters. Each otter is measured twice (pre, post). Hence otter ID is the random effect unless????

• Effect of Health Care reform on percentage of insured people in counties of CA. Each county is measured twice (pre, post). Hence county is the random effect unless??

• Effect of watering regime (0, 1, 2, 4, 6 times weekly in replicate plots). Each treatment (ttt) is in each of 10 plots. Plots are random effect.

• Effect of gender on grades in replicated classrooms. Grades for males and females are measured in each of 20 classrooms. Classrooms (teachers) are a random effect unless??

Sphericity assumption

This is for reference – much more important for repeated measures

Block   Treat 1   Treat 2   Treat 3   etc.
1       y11       y21       y31
2       y12       y22       y32
3       y13       y23       y33
etc.

Block   T1 - T2    T2 - T3    T1 - T3    etc.
1       y11-y21    y21-y31    y11-y31
2       y12-y22    y22-y32    y12-y32
3       y13-y23    y23-y33    y13-y33
etc.

Sphericity assumption

• Pattern of variances and covariances within and between treatments:

– sphericity of variance-covariance matrix

• Equal variances of differences between all pairs of treatments:

– variance of (T1 - T2)’s = variance of (T2 - T3)’s = variance of (T1 - T3)’s etc.

• If assumption not met:

– F-ratio test produces too many Type I errors
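The variances of the pairwise treatment differences can be inspected directly. The sketch below follows the layout of the tables above (one difference per block for each pair of treatments). The response values are hypothetical, invented for the illustration and not taken from the slides.

```python
import statistics
from itertools import combinations

# Hypothetical responses (not from the slides): rows are blocks, columns treatments.
treatment_names = ["T1", "T2", "T3"]
y = [
    [10.0, 12.0, 15.0],
    [8.0, 11.0, 12.0],
    [11.0, 12.0, 17.0],
    [9.0, 13.0, 14.0],
]

diff_variances = {}
for i, j in combinations(range(len(treatment_names)), 2):
    diffs = [row[i] - row[j] for row in y]  # one difference per block
    label = f"{treatment_names[i]} - {treatment_names[j]}"
    diff_variances[label] = statistics.variance(diffs)

for label, var in diff_variances.items():
    print(f"variance of ({label}) = {var:.3f}")
```

Under sphericity these variances estimate the same quantity, so large discrepancies among them suggest the assumption is doubtful.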

Sphericity assumption

• Applies to randomized block

– also repeated measures designs

• Epsilon ($\varepsilon$) statistic indicates degree to which sphericity is not met

– the further $\varepsilon$ is from 1, the more the variances of treatment differences differ

• Two versions of $\varepsilon$

– Greenhouse-Geisser $\varepsilon$

– Huynh-Feldt $\varepsilon$

Dealing with non-sphericity

If $\varepsilon$ is not close to 1 and sphericity is not met, there are 2 approaches:

– Adjusted ANOVA F-tests

• df for F-ratio tests from ANOVA adjusted downwards (made more conservative) depending on value of $\varepsilon$

– Multivariate ANOVA (MANOVA)

• treatments considered as multiple response variables in MANOVA
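The adjusted F-test approach can be sketched as follows: the Greenhouse-Geisser epsilon is computed from the double-centred treatment covariance matrix, and both df of the treatment F-test are multiplied by it. The covariance matrices are hypothetical and the function names are assumptions for this illustration; a real analysis would normally rely on a statistics package.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand = sum(row_means) / k
    # Double-centre the covariance matrix (rows and columns sum to zero).
    centred = [
        [cov[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_sq = sum(c * c for row in centred for c in row)
    return trace ** 2 / ((k - 1) * sum_sq)


def adjusted_df(epsilon, p, q):
    """Scale both df of the treatment F-test (p treatments, q blocks)."""
    return epsilon * (p - 1), epsilon * (p - 1) * (q - 1)


# Hypothetical covariance matrices (not from the slides).
compound_symmetry = [[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]]
decaying = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]]

eps_cs = greenhouse_geisser_epsilon(compound_symmetry)
eps_decay = greenhouse_geisser_epsilon(decaying)
df1, df2 = adjusted_df(eps_decay, p=3, q=7)

print(f"epsilon (equal covariances)    = {eps_cs:.3f}")
print(f"epsilon (decaying covariances) = {eps_decay:.3f}")
print(f"adjusted df = {df1:.2f}, {df2:.2f} (unadjusted 2, 12)")
```

The second matrix mimics responses that are more strongly correlated when they are closer together, which pulls epsilon below 1 and shrinks both df.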

Sphericity assumption

• Assumption of sphericity probably OK for randomized block designs:

– treatments randomly applied to experimental units within blocks

• Assumption of sphericity probably also OK for repeated measures designs:

– if the order in which each “subject” receives the treatments is randomized (e.g. rats and drugs)

Sphericity assumption

• Assumption of sphericity probably not OK for repeated measures designs involving time:

– because response variable for times closer together more correlated than for times further apart

– sphericity unlikely to be met

– use Greenhouse-Geisser adjusted tests or MANOVA