SJS SDI_5 1
Design of Statistical Investigations
Stephen Senn
5. Orthogonal Designs
Randomised Blocks
SJS SDI_5 2
Blocks
• So far we have ignored differences in experimental units
• Some subsets of units may be similar to each other but different from other subsets
• Such similar subsets are called blocks
• The presence of blocks can be exploited
– By design
– And by analysis
SJS SDI_5 3
Randomised Block Design
• We identify blocks of experimental material
• We allocate treatments within blocks at random in such a way that
– each treatment appears in every block
– if a treatment appears m times in one block it appears m times in all blocks
– but subject to no further restriction
• Referred to as a randomised block design
SJS SDI_5 4
# to create randomized blocks
n.b<-6 # number of blocks
n.t<-3 # number of treatments
n.r<-2 # number of replicates
#create vector of treatments
treat<-c(rep(seq(1,n.t),n.r))
#create vector of blocks
block<-seq(1,n.b)
#create one permuted block
unit<-sample(treat)
#create other permuted blocks
#and join them
for(i in 1:(n.b-1)){
  unit<-rbind(unit,sample(treat))
}
design.frame<-data.frame(block,unit)
design.frame #print design
Note use of sample function
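The same construction can be sketched in R without the explicit loop. This is only an illustration: the helper name make.design and the seed are assumptions that do not appear on the slides, and the final check confirms the defining property that every treatment appears n.r times in each block.

```r
# Sketch of a randomised block design generator.
# make.design is a hypothetical helper name, not part of the original slides.
make.design <- function(n.b, n.t, n.r) {
  treat <- rep(seq_len(n.t), n.r)            # treatments to allocate within one block
  # one independent random permutation per block; t() puts blocks in rows
  unit <- t(replicate(n.b, sample(treat)))
  data.frame(block = seq_len(n.b), unit = unit)
}

set.seed(1)  # arbitrary seed, used only to make the printed design reproducible
design.frame <- make.design(n.b = 6, n.t = 3, n.r = 2)
design.frame

# count how often each treatment occurs in each block (treatments in rows, blocks in columns)
counts <- apply(as.matrix(design.frame[, -1]), 1, function(x) tabulate(x, nbins = 3))
all(counts == 2)
```

Each call to sample(treat) permutes the treatments independently within a block, which is the "no further restriction" part of the definition.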
SJS SDI_5 5
Randomised Blocks: SPlus Output
> design.frame
  block unit.1 unit.2 unit.3 unit.4 unit.5 unit.6
1     1      1      3      3      2      1      2
2     2      1      2      3      3      2      1
3     3      2      3      3      2      1      1
4     4      3      2      1      3      2      1
5     5      2      3      2      1      1      3
6     6      1      3      3      2      1      2
SJS SDI_5 6
Exp_5
Graff-Lonnevig and Browaldh (1990), Senn and Auclair (1990)
Cross-over trial of single doses of 12 μg formoterol compared with 200 μg salbutamol in 13 asthmatic children. The main outcome measure was peak expiratory flow (PEF) 8 hours after treatment. Two sequences were used, with a wash-out in between.
Sequence   Period 1     Wash-out   Period 2
for/sal    formoterol              salbutamol
sal/for    salbutamol              formoterol
SJS SDI_5 7
Design Points
• Treatments are given in two periods
• Washout is used to allow possible carry-over to disappear
• Two sequences were used
– Permits blinding
– Avoids associating particular treatment with particular period
• We shall assume patients were randomised to the two sequences
SJS SDI_5 8
Exp_5: The Data
SEQ      Patient  Formoterol  Salbutamol
forsal   1        310         270
salfor   2        385         370
salfor   3        400         310
forsal   4        310         260
salfor   5        410         380
forsal   6        370         300
forsal   7        410         390
salfor   9        320         290
forsal   10       250         210
forsal   11       380         350
salfor   12       340         260
salfor   13       220         90
forsal   14       330         365
We shall ignore the sequence information for the moment.
If we have assigned patients at random to the two possible sequences, this is a randomised blocks design
SJS SDI_5 9
Questions
• What do we note about the precision of measurement?
– What possible explanation is there?
• What do we note about the patient numbers?
– What possible explanation is there?
SJS SDI_5 10
Blocks in a Cross-over
• In this design the units are episodes of treatment
• As the graphs that follow will show, there is a correlation between results from the same patient
• Patients form the blocks of the experiment
– Naturally
– And by design
SJS SDI_5 11
[Figure: FEV1 by Patient for Formoterol and Salbutamol. x-axis: Patient (0 to 14); y-axis: Forced Expiratory Volume in One Second (FEV1), ml (100 to 400).]
SJS SDI_5 12
[Figure: Salbutamol Readings Plotted Against Formoterol Readings, By Sequence. x-axis: FEV1 Formoterol (100 to 400); y-axis: FEV1 Salbutamol (100 to 400).]
SJS SDI_5 13
Points and Questions
• The graph plots the salbutamol reading against the formoterol reading
• Each point represents a patient
– triangles: salbutamol/formoterol sequence
– squares: formoterol/salbutamol sequence
• All the points except one are to the right of the line of equality
– What does this suggest?
SJS SDI_5 14
Blocking
• From field trials in agriculture
• A block was a set of plots of presumed similar fertility
• Design trick was to use each treatment within a given block
– Compare like with like
– Eliminate a source of variation
• Now used to describe any set of similar units used in a design
SJS SDI_5 15
Blocks - Examples
• Centres in a multi-centre trial
– Units are patients
• Cars in a fuel consumption experiment
– Units are runs
• Patients in a cross-over trial
– Units are episodes of treatment
• Fermentation tanks in a plant
– Units are runs
SJS SDI_5 16
Model for Randomised Blocks
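Basic model

$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \quad i = 1, \ldots, v, \quad j = 1, \ldots, r \qquad (5.1)$$

Quantity to be minimised

$$\sum_{i=1}^{v} \sum_{j=1}^{r} \left( y_{ij} - \mu - \tau_i - \beta_j \right)^2 \qquad (5.2)$$

Assume for simplicity every treatment appears once in each block

Normal equations, obtained by differentiating the sum of squares with respect to the unknown parameters and setting equal to zero

$$-2 \sum_{i=1}^{v} \sum_{j=1}^{r} \left( y_{ij} - \hat{\mu} - \hat{\tau}_i - \hat{\beta}_j \right) = 0 \qquad (5.3)$$

$$-2 \sum_{j=1}^{r} \left( y_{ij} - \hat{\mu} - \hat{\tau}_i - \hat{\beta}_j \right) = 0, \quad i = 1, \ldots, v \qquad (5.4)$$

$$-2 \sum_{i=1}^{v} \left( y_{ij} - \hat{\mu} - \hat{\tau}_i - \hat{\beta}_j \right) = 0, \quad j = 1, \ldots, r \qquad (5.5)$$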
SJS SDI_5 17
Some Notation
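$$\sum_{i=1}^{v} \sum_{j=1}^{r} y_{ij} = G \qquad \text{Total of all observations}$$

$$vr = N \qquad \text{Number of observations}$$

$$\sum_{j=1}^{r} y_{ij} = T_i \qquad \text{Total on treatment } i$$

$$\sum_{i=1}^{v} y_{ij} = B_j \qquad \text{Total in block } j$$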
SJS SDI_5 18
Solutions
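$$G = N\hat{\mu} + r \sum_{i=1}^{v} \hat{\tau}_i + v \sum_{j=1}^{r} \hat{\beta}_j \;\Rightarrow\; \hat{\mu} = \frac{G}{N} - \frac{\sum_{i=1}^{v} \hat{\tau}_i}{v} - \frac{\sum_{j=1}^{r} \hat{\beta}_j}{r} \qquad (5.6)$$

$$T_i = r\hat{\mu} + r\hat{\tau}_i + \sum_{j=1}^{r} \hat{\beta}_j \;\Rightarrow\; \hat{\tau}_i = \frac{T_i}{r} - \hat{\mu} - \frac{\sum_{j=1}^{r} \hat{\beta}_j}{r} \qquad (5.7)$$

$$B_j = v\hat{\mu} + \sum_{i=1}^{v} \hat{\tau}_i + v\hat{\beta}_j \;\Rightarrow\; \hat{\beta}_j = \frac{B_j}{v} - \hat{\mu} - \frac{\sum_{i=1}^{v} \hat{\tau}_i}{v} \qquad (5.8)$$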
SJS SDI_5 19
Identifiability
• Model (5.1) is over-parameterised
• Not all effects are identifiable
• However contrasts of the form below are uniquely identifiable
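Treatment contrasts (these are of particular interest):

$$\hat{\tau}_h - \hat{\tau}_k = \frac{T_h}{r} - \frac{T_k}{r} = \bar{y}_{h\cdot} - \bar{y}_{k\cdot}$$

Block contrasts (these are not):

$$\hat{\beta}_p - \hat{\beta}_q = \frac{B_p}{v} - \frac{B_q}{v} = \bar{y}_{\cdot p} - \bar{y}_{\cdot q}$$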
SJS SDI_5 20
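Identifiability continued
• Predictions are also identifiable

$$\hat{Y}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j = \frac{G}{N} + \left( \frac{T_i}{r} - \frac{G}{N} \right) + \left( \frac{B_j}{v} - \frac{G}{N} \right)$$

$$= \frac{T_i}{r} + \frac{B_j}{v} - \frac{G}{N} = \bar{y}_{i\cdot} + \bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot}$$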
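Identifiability can be illustrated numerically with the Exp_5 data. The sketch below, written in R as an assumption about how one might check this (it is not taken from the slides), fits the same model under two different parameter constraints. The individual coefficients differ, but the treatment contrast and the predicted values do not.

```r
# Exp_5 PEF data, patients in the order 1-7, 9-14
formoterol <- c(310, 385, 400, 310, 410, 370, 410, 320, 250, 380, 340, 220, 330)
salbutamol <- c(270, 370, 310, 260, 380, 300, 390, 290, 210, 350, 260, 90, 365)
patient <- factor(rep(c(1:7, 9:14), times = 2))
treat   <- factor(rep(c("formoterol", "salbutamol"), each = 13))
pef     <- c(formoterol, salbutamol)

# same model, two different sets of identifying constraints
fit.a <- lm(pef ~ patient + treat,
            contrasts = list(patient = "contr.treatment", treat = "contr.treatment"))
fit.b <- lm(pef ~ patient + treat,
            contrasts = list(patient = "contr.sum", treat = "contr.sum"))

# the intercepts (not identifiable) differ between the two fits
c(coef(fit.a)["(Intercept)"], coef(fit.b)["(Intercept)"])

# the contrast formoterol minus salbutamol is the same in both
contrast.a <- unname(-coef(fit.a)["treatsalbutamol"])  # coefficient is salbutamol minus formoterol
contrast.b <- unname(2 * coef(fit.b)["treat1"])        # sum coding: effects are +t and -t
c(contrast.a, contrast.b, mean(formoterol) - mean(salbutamol))

# predictions are also the same
all.equal(fitted(fit.a), fitted(fit.b))
```

All three versions of the contrast equal the difference between the treatment means, as the contrast formula requires.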
SJS SDI_5 21
Exp_5: Means
Patient  Formoterol  Salbutamol  Mean
1        310         270         290
2        385         370         377.5
3        400         310         355
4        310         260         285
5        410         380         395
6        370         300         335
7        410         390         400
9        320         290         305
10       250         210         230
11       380         350         365
12       340         260         300
13       220         90          155
14       330         365         347.5
Mean     341.15      295.77      318.46
SJS SDI_5 23
Predicted Value and Residual
Patient 7, Formoterol

$$\hat{y}_{1,7} = 341.15 + 400 - 318.46 = 422.69$$

$$e_{1,7} = y_{1,7} - \hat{y}_{1,7} = 410 - 422.69 = -12.69$$

Note that the data are laid out in columns for treatments and rows for blocks (patients) for convenience, but our notation suggested rows for treatments and columns for blocks. Our subscripts reflect the latter convention. Note also that, since patient 8 is missing, there is a potential ambiguity regarding subscripts for patients 9 onwards.
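The calculation for patient 7 extends to every cell. A minimal R sketch follows, assuming the data are held as a patients-by-treatments matrix. The object names y.hat and e are illustrative, and indexing rows by patient label sidesteps the subscript ambiguity caused by the missing patient 8.

```r
# Exp_5 PEF data, patients in the order given on the data slide
patient    <- c(1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14)
formoterol <- c(310, 385, 400, 310, 410, 370, 410, 320, 250, 380, 340, 220, 330)
salbutamol <- c(270, 370, 310, 260, 380, 300, 390, 290, 210, 350, 260, 90, 365)

y <- cbind(formoterol, salbutamol)   # rows = blocks (patients), columns = treatments
rownames(y) <- patient

treat.mean <- colMeans(y)
block.mean <- rowMeans(y)
grand.mean <- mean(y)

# predicted value = treatment mean + block mean - grand mean
y.hat <- outer(block.mean, treat.mean, "+") - grand.mean
e     <- y - y.hat                   # residuals

round(y.hat["7", "formoterol"], 2)   # 422.69
round(e["7", "formoterol"], 2)       # -12.69
```

The residuals sum to zero within every patient and within every treatment, which is a quick check that the fit has been computed correctly.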
SJS SDI_5 24
Exp_5: Predicted Values
Patient  Formoterol  Salbutamol  Mean
1        312.69      267.31      290
2        400.19      354.81      377.5
3        377.69      332.31      355
4        307.69      262.31      285
5        417.69      372.31      395
6        357.69      312.31      335
7        422.69      377.31      400
9        327.69      282.31      305
10       252.69      207.31      230
11       387.69      342.31      365
12       322.69      277.31      300
13       177.69      132.31      155
14       370.19      324.81      347.5
Mean     341.15      295.77      318.46
SJS SDI_5 25
Exp_5: Residuals
Patient  Formoterol  Salbutamol  Mean
1        -2.69       2.69        0
2        -15.19      15.19       0
3        22.31       -22.31      0
4        2.31        -2.31       0
5        -7.69       7.69        0
6        12.31       -12.31      0
7        -12.69      12.69       0
9        -7.69       7.69        0
10       -2.69       2.69        0
11       -7.69       7.69        0
12       17.31       -17.31      0
13       42.31       -42.31      0
14       -40.19      40.19       0
Mean     0.00        0.00        0.00
SJS SDI_5 26
Sums of Squares
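$$S_{\min} = \sum_{i=1}^{v} \sum_{j=1}^{r} \left( y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot} \right)^2$$

$$= \sum_{i=1}^{v} \sum_{j=1}^{r} \left[ (y_{ij} - \bar{y}_{\cdot\cdot}) - (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) - (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot}) \right]^2$$

Expanding we get...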
SJS SDI_5 27
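$$\sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{\cdot\cdot})^2 + r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$$

$$- 2 \sum_{i=1}^{v} \sum_{j=1}^{r} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) - 2 \sum_{j=1}^{r} \sum_{i=1}^{v} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) + 0$$

(The final cross-product term, between treatment and block deviations, is zero.)

However

$$\sum_{i=1}^{v} \sum_{j=1}^{r} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) = r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$

and

$$\sum_{j=1}^{r} \sum_{i=1}^{v} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) = v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$$

Hence we get...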
SJS SDI_5 28
ANOVA Identity
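$$S_{\min} = \sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{\cdot\cdot})^2 - r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 - v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$$

$$\sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 + S_{\min}$$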
SJS SDI_5 29
ANOVA Table
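Source      d.f.           Sum of Squares                                                                                  Distribution under null
Treatments  $v-1$          $r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$                                    $\sigma^2 \chi^2_{v-1}$
Blocks      $r-1$          $v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$                                   $\sigma^2 \chi^2_{r-1}$
Residual    $N-v-r+1$      $\sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2$   $\sigma^2 \chi^2_{N-v-r+1}$
Total       $N-1$          $\sum_{i=1}^{v} \sum_{j=1}^{r} (y_{ij} - \bar{y}_{\cdot\cdot})^2$                                 $\sigma^2 \chi^2_{N-1}$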
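The null distributions in the table lead to the usual F tests. As a sketch, assuming independent Normal errors with common variance $\sigma^2$ (an assumption the slide does not state explicitly), and writing MS for a sum of squares divided by its degrees of freedom, the ratios are:

```latex
\begin{align*}
MS_{\text{Treat}} &= \frac{SS_{\text{Treat}}}{v-1}, \qquad
MS_{\text{Blocks}} = \frac{SS_{\text{Blocks}}}{r-1}, \qquad
MS_{\text{Res}} = \frac{SS_{\text{Res}}}{N-v-r+1} \\[4pt]
F_{\text{Treat}} &= \frac{MS_{\text{Treat}}}{MS_{\text{Res}}} \sim F_{\,v-1,\;N-v-r+1}
  \quad \text{under the null hypothesis of no treatment effect} \\[4pt]
F_{\text{Blocks}} &= \frac{MS_{\text{Blocks}}}{MS_{\text{Res}}} \sim F_{\,r-1,\;N-v-r+1}
  \quad \text{under the null hypothesis of no block effect}
\end{align*}
```

The unknown $\sigma^2$ cancels in each ratio, which is why the tests can be carried out without knowing it. Note that $N-v-r+1 = (v-1)(r-1)$ when every treatment appears once in each block.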
SJS SDI_5 30
Computational Approaches
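Reminder

In general

$$\sum_{i=1}^{n} (X_i - \bar{X}_{\cdot})^2 = \sum_{i=1}^{n} X_i^2 - 2\bar{X}_{\cdot} \sum_{i=1}^{n} X_i + n\bar{X}_{\cdot}^2$$

$$= \sum_{i=1}^{n} X_i^2 - 2\,\frac{\left( \sum_{i=1}^{n} X_i \right)^2}{n} + n \left( \frac{\sum_{i=1}^{n} X_i}{n} \right)^2$$

$$= \sum_{i=1}^{n} X_i^2 - \frac{\left( \sum_{i=1}^{n} X_i \right)^2}{n} = \sum_{i=1}^{n} X_i^2 - \frac{T_{\cdot}^2}{n}$$

where $T_{\cdot} = \sum_{i=1}^{n} X_i$ is the total.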
SJS SDI_5 31
Computational Approaches (cont)
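Thus to calculate the Total Sum of Squares we may proceed as follows

$$\sum_{i=1}^{v} \sum_{j=1}^{r} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{v} \sum_{j=1}^{r} Y_{ij}^2 - \frac{G^2}{N} = S - S_0$$

where $S = \sum_{i=1}^{v} \sum_{j=1}^{r} Y_{ij}^2$ and $S_0 = G^2 / N$.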
SJS SDI_5 32
Computational Approaches (cont)
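Sum of squares between blocks

$$v \sum_{j=1}^{r} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = v \left( \sum_{j=1}^{r} \bar{y}_{\cdot j}^2 - r\,\bar{y}_{\cdot\cdot}^2 \right)$$

$$= v \left( \sum_{j=1}^{r} \frac{T_{\cdot j}^2}{v^2} - r\,\frac{T_{\cdot\cdot}^2}{(vr)^2} \right)$$

$$= \sum_{j=1}^{r} \frac{T_{\cdot j}^2}{v} - \frac{T_{\cdot\cdot}^2}{vr} = \sum_{j=1}^{r} \frac{T_{\cdot j}^2}{v} - \frac{G^2}{N}$$

$$= S_B - S_0$$

Here $T_{\cdot j}$ is the total in block $j$ (written $B_j$ earlier) and $T_{\cdot\cdot} = G$.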
SJS SDI_5 33
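Computational Approaches (cont)

Sum of squares between treatments

$$r \sum_{i=1}^{v} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = r \left( \sum_{i=1}^{v} \bar{y}_{i\cdot}^2 - v\,\bar{y}_{\cdot\cdot}^2 \right)$$

$$= r \left( \sum_{i=1}^{v} \frac{T_{i\cdot}^2}{r^2} - v\,\frac{T_{\cdot\cdot}^2}{(vr)^2} \right)$$

$$= \sum_{i=1}^{v} \frac{T_{i\cdot}^2}{r} - \frac{T_{\cdot\cdot}^2}{vr} = \sum_{i=1}^{v} \frac{T_{i\cdot}^2}{r} - \frac{G^2}{N}$$

$$= S_T - S_0$$

Here $T_{i\cdot}$ is the total on treatment $i$ (written $T_i$ earlier) and $T_{\cdot\cdot} = G$.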
SJS SDI_5 34
Exp_5: Calculation 1
Squares
Patient  Formoterol  Salbutamol  Total
1        96100       72900       169000
2        148225      136900      285125
3        160000      96100       256100
4        96100       67600       163700
5        168100      144400      312500
6        136900      90000       226900
7        168100      152100      320200
9        102400      84100       186500
10       62500       44100       106600
11       144400      122500      266900
12       115600      67600       183200
13       48400       8100        56500
14       108900      133225      242125
Total    1555725     1219625     2775350
SJS SDI_5 35
Exp_5: Calculation 2
Patient  Formoterol  Salbutamol  Total     Squares
1        310         270         580       336400
2        385         370         755       570025
3        400         310         710       504100
4        310         260         570       324900
5        410         380         790       624100
6        370         300         670       448900
7        410         390         800       640000
9        320         290         610       372100
10       250         210         460       211600
11       380         350         730       532900
12       340         260         600       360000
13       220         90          310       96100
14       330         365         695       483025
Total    4435        3845        8280      5504150
Squares  19669225    14784025    34453250
SJS SDI_5 36
Exp_5: Calculation 3
$$S_0 = 2636861.538, \quad S = 2775350$$

$$S_B = 2752075, \quad S_T = 2650250$$

$$SS_{\text{Total}} = S - S_0 = 2775350 - 2636861.538 = 138488.4615$$

$$SS_{\text{Blocks}} = S_B - S_0 = 2752075 - 2636861.538 = 115213.4615$$

$$SS_{\text{Treat}} = S_T - S_0 = 2650250 - 2636861.538 = 13388.46154$$

$$SS_{\text{Residual}} = SS_{\text{Total}} - SS_{\text{Blocks}} - SS_{\text{Treat}} = 138488.4615 - 115213.4615 - 13388.46154 = 9886.538462$$
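The three calculation steps can be reproduced in a few lines of R. This is a sketch that follows the slide notation ($S_0$, $S$, $S_B$, $S_T$) and assumes the data are held as a patients-by-treatments matrix. The variable names are illustrative.

```r
# Exp_5 PEF data, patients in the order 1-7, 9-14
formoterol <- c(310, 385, 400, 310, 410, 370, 410, 320, 250, 380, 340, 220, 330)
salbutamol <- c(270, 370, 310, 260, 380, 300, 390, 290, 210, 350, 260, 90, 365)
y <- cbind(formoterol, salbutamol)   # rows = patients (blocks), columns = treatments

v <- ncol(y)                         # number of treatments
r <- nrow(y)                         # number of blocks
N <- v * r
G <- sum(y)                          # grand total

S0 <- G^2 / N                        # correction factor
S  <- sum(y^2)                       # raw sum of squares (Calculation 1)
SB <- sum(rowSums(y)^2) / v          # from squared block totals (Calculation 2)
ST <- sum(colSums(y)^2) / r          # from squared treatment totals (Calculation 2)

SS.total  <- S  - S0
SS.blocks <- SB - S0
SS.treat  <- ST - S0
SS.resid  <- SS.total - SS.blocks - SS.treat

# mean squares and F test for treatments
MS.treat <- SS.treat / (v - 1)
MS.resid <- SS.resid / (N - v - r + 1)
F.treat  <- MS.treat / MS.resid
p.treat  <- pf(F.treat, v - 1, N - v - r + 1, lower.tail = FALSE)

round(c(total = SS.total, blocks = SS.blocks, treat = SS.treat, resid = SS.resid), 2)
round(c(F = F.treat, p = p.treat), 4)
```

The sums of squares should agree with the hand calculation, and the F ratio and P-value with the treatment line of the software output on the following slides.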
SJS SDI_5 37
Exp_5: Analysis using Excel

ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Rows                 115213.5  12  9601.122  11.65357  7.93E-05  2.686633
Columns              13388.46  1   13388.46  16.25053  0.001666  4.747221
Error                9886.538  12  823.8782
Total                138488.5  25

This uses the data analysis menu of Excel
SJS SDI_5 38
Exp_5: ANOVA Analysis using SPlus
(Data input details omitted)
#ANOVA just fitting treat
fit1<-aov(pef~treat)
summary(fit1)
#ANOVA fitting treat and patient
fit2<-aov(pef~patient+treat)
summary(fit2)
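Since the data input details are omitted, here is one possible way to set the data up, sketched in R rather than SPlus. The data frame name exp5 and the long layout are assumptions made for illustration. The essential point is that patient and treat must be factors, otherwise aov would treat the patient number as a numeric covariate.

```r
# Exp_5 PEF data, patients in the order 1-7, 9-14
formoterol <- c(310, 385, 400, 310, 410, 370, 410, 320, 250, 380, 340, 220, 330)
salbutamol <- c(270, 370, 310, 260, 380, 300, 390, 290, 210, 350, 260, 90, 365)
patient.id <- c(1:7, 9:14)

# long format: one row per episode of treatment (the experimental unit)
exp5 <- data.frame(
  patient = factor(rep(patient.id, times = 2)),
  treat   = factor(rep(c("formoterol", "salbutamol"), each = length(patient.id))),
  pef     = c(formoterol, salbutamol)
)

# ANOVA just fitting treat
fit1 <- aov(pef ~ treat, data = exp5)
summary(fit1)

# ANOVA fitting patient (blocks) and treat
fit2 <- aov(pef ~ patient + treat, data = exp5)
summary(fit2)
```

R labels the columns of the summary slightly differently from SPlus (for example "Sum Sq" rather than "Sum of Sq"), but the entries correspond.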
SJS SDI_5 39
Exp_5: SPlus Results
> summary(fit1)
          Df Sum of Sq  Mean Sq F Value     Pr(F)
    treat  1   13388.5 13388.46 2.56853 0.1220902
Residuals 24  125100.0  5212.50
> #ANOVA just fitting treat and patient
fit2 <- aov(pef ~ patient + treat)
> summary(fit2)
          Df Sum of Sq  Mean Sq  F Value       Pr(F)
  patient 12  115213.5  9601.12 11.65357 0.000079348
    treat  1   13388.5 13388.46 16.25053 0.001665618
Residuals 12    9886.5   823.88
SJS SDI_5 40
Questions
• Has the treatment sum of squares changed in fitting “patient”?
• Are the degrees of freedom for treatment different?
• What has changed?
• Why has it changed?
• What is the net effect?