SJS SDI_8 1
Design of Statistical Investigations
Stephen Senn
8 Factorial Designs
SJS SDI_8 2
Introduction
• So far we have been looking at complications with blocking structure
• However, we now introduce complications in treatment structure
• We now look at factorial designs
• These are designs in which there are two or more “dimensions” to the treatments
SJS SDI_8 3
Exp_8
(From Clarke and Kempson)
The yield of a chemical reaction is presumed to depend on two things
A: The amount (low or high) in the mixture of a certain chemical
B: The presence or absence of a catalyst
An experiment is run to determine the importance of these in affecting yield.
SJS SDI_8 4
Treatments in Terms of Factors
                        Chemical A
                        Low            High
Catalyst B   Absent     Treatment 1    Treatment 2
             Present    Treatment 3    Treatment 4
SJS SDI_8 5
Terminology
• A and B are factors
• “low” and “high” are levels of the factor A
• “absence” and “presence” are levels of the factor B
• An experiment studying combinations of factors is called a factorial experiment
• If all four combinations are studied, then this is a 2 x 2 or 2^2 factorial.
SJS SDI_8 6
Usual Notation for 2^2 Factorials
• A and B are the factors
• a and b are the higher levels
• ab = the combination of both factors at higher level
• a = A at higher level, B at lower level
• b = A at lower level, B at higher level
• (1) = both factors at lower level
SJS SDI_8 7
Main Effects and Interactions
The main effect of a factor is the average change in response (averaged over all levels of the other factors) produced by a change in the level of that factor. Thus the main effect of A is the average of the difference between a and (1) and the difference between ab and b.
The interaction between two factors A and B is the difference between the effect of A at the higher level of B (ab - b) and the effect of A at the lower level of B (a - (1)).
Sometimes, by convention, this double difference is divided by 2.
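The definitions above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (chosen only for hand-checking; it is not part of the original slides), and the four response values are made up for illustration.

```python
# Illustrative responses for the four treatment combinations of a 2^2 factorial.
# These numbers are hypothetical; they do not come from the slides.
y = {"(1)": 10.0, "a": 14.0, "b": 12.0, "ab": 20.0}

# Effect of A at each level of B
effect_of_A_at_low_B = y["a"] - y["(1)"]    # a - (1)
effect_of_A_at_high_B = y["ab"] - y["b"]    # ab - b

# Main effect: average of the two simple effects
main_effect_A = (effect_of_A_at_low_B + effect_of_A_at_high_B) / 2
main_effect_B = ((y["b"] - y["(1)"]) + (y["ab"] - y["a"])) / 2

# Interaction: double difference, optionally divided by 2 by convention
interaction_AB = effect_of_A_at_high_B - effect_of_A_at_low_B
interaction_AB_halved = interaction_AB / 2

print(main_effect_A, main_effect_B, interaction_AB, interaction_AB_halved)
```

With these illustrative values the effect of A is 4 at the lower level of B and 8 at the higher level, so the main effect of A is 6 and the interaction is 4 (or 2 under the halving convention).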
SJS SDI_8 8
2^2 Factorials: Definition of Effects
                  (1)   a    b    ab
Main effect A      -    +    -    +
Main effect B      -    -    +    +
Interaction AB     +    -    -    +
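The sign table can be generated from the level codes. The sketch below (Python, used here only as a checking aid, with -1 for the lower level and +1 for the higher level as an assumed coding) shows that the interaction signs are the product of the two main-effect signs and that the three contrasts are mutually orthogonal.

```python
# Level codes (A, B) for each treatment combination: -1 = lower, +1 = higher.
levels = {"(1)": (-1, -1), "a": (+1, -1), "b": (-1, +1), "ab": (+1, +1)}

A = [a for a, b in levels.values()]
B = [b for a, b in levels.values()]
AB = [a * b for a, b in levels.values()]   # interaction sign = product of signs


def dot(u, v):
    """Sum of elementwise products; zero means the contrasts are orthogonal."""
    return sum(x * y for x, y in zip(u, v))


for name, signs in (("A", A), ("B", B), ("AB", AB)):
    print(name, ["+" if s > 0 else "-" for s in signs])

print(dot(A, B), dot(A, AB), dot(B, AB))
```

The three dot products are all zero, which is the orthogonality that later allows the treatment sum of squares to be split into separate parts for each effect.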
SJS SDI_8 9
Exp_9 (Clarke and Kempson)
• Factor S: source of supply of a particular material
  – Two sources
    • s when first is used
• Factor M: the speed of running a machine
  – Two speeds
    • m whenever higher is used
• Experiment run on five days
• Response: Average number of defectives per batch
SJS SDI_8 10
Exp_9 (Clarke and Kempson)
• The days determine the block structure of the experiment
• The treatment structure is that of a 2^2 factorial
  – S x M
  – (1), s, m, sm
SJS SDI_8 11
Exp_9 Data
             Block
Treatment    I       II      III     IV      V
(1)          5.3     5.7     5.1     5.3     5.6
m            11.8    13.0    12.6    12.1    11.5
s            20.0    19.0    20.3    19.5    20.2
ms           26.7    24.1    25.7    26.0    25.5
SJS SDI_8 12
Exp_9 Analysis
DATA
Treatment       I         II        III       IV        V         Total      Total squared
(1)             5.3       5.7       5.1       5.3       5.6       27         729
m               11.8      13        12.6      12.1      11.5      61         3721
s               20        19        20.3      19.5      20.2      99         9801
ms              26.7      24.1      25.7      26        25.5      128        16384
Total           63.8      61.8      63.7      62.9      62.8      315        30635
Total squared   4070.44   3819.24   4057.69   3956.41   3943.84   19847.62   6133.52
In the bottom row, 19847.62 is the sum of the squared block totals and 6133.52 is the sum of the 20 squared observations.
SJS SDI_8 13
Exp_9 Analysis Continued
Grand total    G   = 315
Total          S   = 6133.52
Treatments     S_T = 6127
Blocks         S_B = 4961.905

Source       df   SS        MS         F
Blocks        4   0.655     0.16375    0.335038
Treatment     3   1165.75   388.5833   795.0554
Residuals    12   5.865     0.48875
Total        19   1172.27
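The quantities G, S, S_T and S_B and the resulting ANOVA table can be reproduced by hand from the data. The sketch below uses Python purely as a calculator (the slides themselves use SPlus); the correction term G^2/N is not named on the slide and is introduced here as an assumption about how the sums of squares were obtained, which the printed figures confirm.

```python
# Exp_9 data: rows are treatments, columns are blocks (days) I to V.
data = {
    "(1)": [5.3, 5.7, 5.1, 5.3, 5.6],
    "m":   [11.8, 13.0, 12.6, 12.1, 11.5],
    "s":   [20.0, 19.0, 20.3, 19.5, 20.2],
    "ms":  [26.7, 24.1, 25.7, 26.0, 25.5],
}
n_treat = len(data)                        # 4 treatments
n_block = len(next(iter(data.values())))   # 5 blocks
n_obs = n_treat * n_block                  # 20 observations

G = sum(sum(row) for row in data.values())                    # grand total
S = sum(y * y for row in data.values() for y in row)          # sum of squared observations
S_T = sum(sum(row) ** 2 for row in data.values()) / n_block   # squared treatment totals / 5
block_totals = [sum(col) for col in zip(*data.values())]
S_B = sum(t * t for t in block_totals) / n_treat              # squared block totals / 4
correction = G * G / n_obs                                    # G^2 / N

ss = {
    "Blocks": S_B - correction,
    "Treatment": S_T - correction,
    "Total": S - correction,
}
ss["Residuals"] = ss["Total"] - ss["Blocks"] - ss["Treatment"]
df = {"Blocks": n_block - 1, "Treatment": n_treat - 1,
      "Residuals": (n_block - 1) * (n_treat - 1), "Total": n_obs - 1}
ms = {k: ss[k] / df[k] for k in ("Blocks", "Treatment", "Residuals")}
F = {k: ms[k] / ms["Residuals"] for k in ("Blocks", "Treatment")}

for source in ("Blocks", "Treatment", "Residuals", "Total"):
    print(source, df[source], round(ss[source], 3))
```

Each sum of squares is an uncorrected quantity minus G^2/N = 4961.25, and the residual is obtained by subtraction.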
SJS SDI_8 14
Treatment Structure
• The above analysis uses a one dimensional treatment structure
  – Single factor with four unordered levels
• We wish, however, to distinguish between constituent factors
• This can be done as follows
SJS SDI_8 15
Factorial Analysis
             Treatment                   Products
             M      S      MS    Totals  M        S        MS
(1)          -1     -1     1     27      -27      -27      27
m            1      -1     -1    61      61       -61      -61
s            -1     1      -1    99      -99      99       -99
ms           1      1      1     128     128      128      128
Value                                    63       139      -5
Divisor                                  20       20       20
SS                                       198.45   966.05   1.25     (sum 1165.75)
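The contrast calculation in this table can be sketched as follows. Python is used only to check the arithmetic; the reading of the divisor 20 as (number of replicates) x (sum of squared coefficients) = 5 x 4 is an assumption that reproduces the slide's figures.

```python
# Treatment totals over the five blocks, from the Exp_9 data.
totals = {"(1)": 27, "m": 61, "s": 99, "ms": 128}

# Contrast coefficients: -1 at the lower level, +1 at the higher level.
signs = {
    "M": {"(1)": -1, "m": +1, "s": -1, "ms": +1},
    "S": {"(1)": -1, "m": -1, "s": +1, "ms": +1},
}
# Interaction coefficients are the products of the main-effect coefficients.
signs["MS"] = {t: signs["M"][t] * signs["S"][t] for t in totals}

replicates = 5  # each treatment total is a sum over 5 blocks
results = {}
for effect, coef in signs.items():
    value = sum(coef[t] * totals[t] for t in totals)
    divisor = replicates * sum(c * c for c in coef.values())  # 5 x 4 = 20
    results[effect] = (value, divisor, value ** 2 / divisor)

for effect, (value, divisor, ss) in results.items():
    print(effect, value, divisor, round(ss, 2))
print("sum of SS:", round(sum(r[2] for r in results.values()), 2))
```

Because the three contrasts are orthogonal, their sums of squares add up to the 3 df treatment sum of squares from the earlier analysis.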
SJS SDI_8 16
ANOVA (Factorial)
ANOVA With Factorial Structure
Source        df   SS        MS         F
Blocks         4   0.655     0.16375    0.335038
M              1   198.45    198.45     406.0358
S              1   966.05    966.05     1976.573
Interaction    1   1.25      1.25       2.557545
TREAT          3   1165.75   388.5833   795.0554
Residuals     12   5.865     0.48875
Total         19   1172.27
SJS SDI_8 17
Exp_9 SPlus
#Input data
#Observations are entered treatment by treatment: (1), m, s, ms, five blocks each
Block<-factor(rep(1:5,4))
Supply<-factor(rep(c(1,2),each=10))
Machine<-factor(rep(rep(c(1,2),each=5),2))
#Create new factor treatment with 4 levels
Treat<-ifelse((Supply==1 & Machine==1),1,0)
Treat<-ifelse((Supply==1 & Machine==2),2,Treat)
Treat<-ifelse((Supply==2 & Machine==1),3,Treat)
Treat<-ifelse((Supply==2 & Machine==2),4,Treat)
Treat<-factor(Treat)
Y<-c(5.3,5.7,5.1,5.3,5.6,
     11.8,13,12.6,12.1,11.5,
     20,19,20.3,19.5,20.2,
     26.7,24.1,25.7,26,25.5)
SJS SDI_8 18
Exp_9 SPlus: Treatment as Factor with 4 Levels
> fit1 <- aov(Y ~ Block + Treat)
> summary(fit1)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637   0.3350 0.8491633
    Treat  3  1165.750 388.5833 795.0554 0.0000000
Residuals 12     5.865   0.4888
SJS SDI_8 19
Exp_9 SPlus: Two equivalent statements using two factors with interactions
> fit2 <- aov(Y ~ Block + Supply * Machine)
> summary(fit2)
               Df Sum of Sq  Mean Sq  F Value     Pr(F)
         Block  4     0.655   0.1637    0.335 0.8491633
        Supply  1   966.050 966.0500 1976.573 0.0000000
       Machine  1   198.450 198.4500  406.036 0.0000000
Supply:Machine  1     1.250   1.2500    2.558 0.1357512
     Residuals 12     5.865   0.4888

> fit3 <- aov(Y ~ Block + Supply + Machine + Supply:Machine)
> summary(fit3)
               Df Sum of Sq  Mean Sq  F Value     Pr(F)
         Block  4     0.655   0.1637    0.335 0.8491633
        Supply  1   966.050 966.0500 1976.573 0.0000000
       Machine  1   198.450 198.4500  406.036 0.0000000
Supply:Machine  1     1.250   1.2500    2.558 0.1357512
     Residuals 12     5.865   0.4888
SJS SDI_8 20
Exp_9 SPlus: Two equivalent statements using two factors without interactions
> fit4 <- aov(Y ~ Block + Supply + Machine)
> summary(fit4)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637    0.299 0.8732822
   Supply  1   966.050 966.0500 1765.095 0.0000000
  Machine  1   198.450 198.4500  362.593 0.0000000
Residuals 13     7.115   0.5473

> fit5 <- aov(Y ~ Block + Supply * Machine - Supply:Machine)
> summary(fit5)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
    Block  4     0.655   0.1637    0.299 0.8732822
   Supply  1   966.050 966.0500 1765.095 0.0000000
  Machine  1   198.450 198.4500  362.593 0.0000000
Residuals 13     7.115   0.5473
SJS SDI_8 21
Wilkinson and Rogers Notation
This is a common notation
A = main effect of factor A, B = main effect of factor B
A:B = interaction of A and B, A:B:C = three factor interaction of A, B and C
+ sign used to add effects, - sign used to subtract them
A*B = A + B + A:B = main effects of A and B and their interaction
A*B*C = A + B + C + A:B + A:C + B:C + A:B:C
NB In their original paper (Applied Statistics, 1973, 22, 392-399), W&R used . instead of the : used in SPlus
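The expansion rule for * can be illustrated mechanically. The following is a small Python sketch (not SPlus, and not how SPlus parses formulae internally): the helper names expand_star and remove_terms are hypothetical and exist only for this illustration.

```python
from itertools import combinations


def expand_star(factors):
    """Expand A*B*... into main effects and all interactions, lowest order first."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


def remove_terms(terms, dropped):
    """Apply the '-' operator: drop the named terms from a model."""
    return [t for t in terms if t not in dropped]


print(expand_star(["A", "B"]))
print(expand_star(["A", "B", "C"]))
print(remove_terms(expand_star(["Supply", "Machine"]), {"Supply:Machine"}))
```

The last line mirrors the Exp_9 models: Supply * Machine - Supply:Machine reduces to Supply + Machine, which is why the two statements without interactions give identical ANOVA tables.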
SJS SDI_8 22
Exp_9 Design Notes
• The two factors and their interaction are orthogonal
  – consequence of treatment combinations chosen
• They are also orthogonal to the blocks
  – This is a consequence of how they were applied
    • Each combination in each day of the week
  – This increases efficiency
    • Effectively treatments are compared within blocks
SJS SDI_8 23
Exp_10 (Senn Example 7.1)
• Cross-over comparing two formulations at two doses
  – Solution and Suspension
  – 12g and 24g per puff
• Four periods
• Four sequences in a Latin Square used
• 16 Patients allocated at random
  – 4 to each sequence
SJS SDI_8 24
Treatment Combinations
Treatment    Formulation    Dose
A            Suspension     12g
B            Suspension     24g
C            Solution       12g
D            Solution       24g

Sequences    Patients
ABDC         3,5,12,13
BCAD         4,6,10,16
CDBA         2,8,9,14
DACB         1,7,11,15
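That these four sequences form a Latin square (each treatment once in every sequence and once in every period) can be checked directly. A minimal sketch in Python, used here only as a checking aid:

```python
# The four treatment sequences; position in the string = period 1 to 4.
sequences = ["ABDC", "BCAD", "CDBA", "DACB"]
treatments = set("ABCD")

# Rows: every sequence contains each treatment exactly once.
rows_ok = all(len(seq) == 4 and set(seq) == treatments for seq in sequences)
# Columns: in every period each treatment is given by exactly one sequence.
cols_ok = all(set(period) == treatments for period in zip(*sequences))
is_latin_square = rows_ok and cols_ok

# Period in which each sequence receives each treatment.
period_of = {seq: {t: seq.index(t) + 1 for t in seq} for seq in sequences}

print(is_latin_square)
for t in "ABCD":
    print(t, [period_of[seq][t] for seq in sequences])
```

The printed period lists (for example periods 1, 3, 4, 2 for treatment A across the four sequences) are what is needed when the data are entered treatment by treatment and a period factor has to be constructed to match.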
SJS SDI_8 25
Exp_10 Design Notes
• Two dimensional block structure
  – 16 Patients x 4 periods
• Treatment structure factorial
  – Formulations x doses
• Treatments allocated in a way that is orthogonal to block structure
• Latin square (“replicated” 4 times)
  – Actually the patient changes
SJS SDI_8 26
Exp_10 SPlus Data Entry
#Input data
patient<-factor(rep(c(3, 5, 12, 13, 4, 6, 10, 16, 2, 8, 9, 14, 1, 7, 11, 15),4))
treat<-factor(rep(c("sus12","sus24","sol12","sol24"),each=16))
form<-factor(rep(c("sus","sol"),each=32))
dose<-factor(rep(c(12,24,12,24),each=16))
period<-factor(rep(c(1,3,4,2,2,1,3,4,4,2,1,3,3,4,2,1),each=4))
fev1<-c(2.7, 2.5, 2.6, 2, 3.7, 0.9, 2.5, 2, 1.3, 2.2, 1.8, 1.9, 1.7, 2.2, 3.3, 2.2,
        1.7, 2.4, 2.5, 2.2, 3.6, 1.4, 2.6, 2.5, 1.3, 2.2, 1.9, 2.2, 1.7, 1.9, 3.7, 2.3,
        2.2, 2.4, 2.4, 2.6, 3.7, 2.4, 2.6, 2.2, 1.4, 2.3, 1, 2.2, 1.6, 1.8, 3.6, 2.4,
        2.6, 2.4, 2.5, 2.6, 3.6, 1.1, 2.4, 2.7, 1.3, 2.3, 2.7, 2.1, 2, 2.6, 3.3, 2.5)
SJS SDI_8 27
Exp_10 SPlus Analysis
#fit treat as a factor
fit1<-aov(fev1~patient+period+treat)
summary(fit1)
model.tables(fit1, type="effects", se=T, cterms="treat")

#use the factorial approach with dose and form
fit2<-aov(fev1~patient+period+form*dose)
summary(fit2)
model.tables(fit2, type="effects", se=T, cterms=c("form","dose","form:dose"))
SJS SDI_8 28
SPlus Results 1
summary(fit1)
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
  patient 15  22.27234 1.484823 14.46822 0.0000000
   period  3   0.08547 0.028490  0.27760 0.8412298
    treat  3   0.36172 0.120573  1.17487 0.3307357
Residuals 42   4.31031 0.102626

Tables of effects
 treat
   sol12   sol24    sus12    sus24
 0.00156 0.12031 -0.07969 -0.04219

Standard errors of effects
            treat
         0.080088
replic. 16.000000
SJS SDI_8 29
SPlus Results 2
          Df Sum of Sq  Mean Sq  F Value     Pr(F)
  patient 15  22.27234 1.484823 14.46822 0.0000000
   period  3   0.08547 0.028490  0.27760 0.8412298
     form  1   0.23766 0.237656  2.31574 0.1355644
     dose  1   0.09766 0.097656  0.95157 0.3349054
form:dose  1   0.02641 0.026406  0.25730 0.6146314
Residuals 42   4.31031 0.102626
SJS SDI_8 30
 form
      sol       sus
 0.060938 -0.060938

 dose
        12       24
 -0.039063 0.039063

 form:dose
Dim 1 : form
Dim 2 : dose
           12        24
sol -0.020313  0.020313
sus  0.020313 -0.020313

Standard errors of effects
             form      dose form:dose
         0.056631  0.056631  0.080088
replic. 32.000000 32.000000 16.000000
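The factorial sums of squares, effects and standard errors for Exp_10 can be recovered from the treatment totals, because the treatments are orthogonal to patients and periods. The sketch below is in Python rather than SPlus and is only a hand check; the residual mean square is copied from the ANOVA output rather than recomputed.

```python
from math import sqrt

# FEV1 values in data-entry order: 16 values each for sus12, sus24, sol12, sol24.
fev1 = [
    2.7, 2.5, 2.6, 2, 3.7, 0.9, 2.5, 2, 1.3, 2.2, 1.8, 1.9, 1.7, 2.2, 3.3, 2.2,
    1.7, 2.4, 2.5, 2.2, 3.6, 1.4, 2.6, 2.5, 1.3, 2.2, 1.9, 2.2, 1.7, 1.9, 3.7, 2.3,
    2.2, 2.4, 2.4, 2.6, 3.7, 2.4, 2.6, 2.2, 1.4, 2.3, 1, 2.2, 1.6, 1.8, 3.6, 2.4,
    2.6, 2.4, 2.5, 2.6, 3.6, 1.1, 2.4, 2.7, 1.3, 2.3, 2.7, 2.1, 2, 2.6, 3.3, 2.5,
]
labels = ["sus12", "sus24", "sol12", "sol24"]
totals = {lab: sum(fev1[16 * i:16 * (i + 1)]) for i, lab in enumerate(labels)}
n_obs = len(fev1)  # 64

# Contrasts on treatment totals (higher-coded level minus lower-coded level).
contrasts = {
    "form": (totals["sol12"] + totals["sol24"]) - (totals["sus12"] + totals["sus24"]),
    "dose": (totals["sus24"] + totals["sol24"]) - (totals["sus12"] + totals["sol12"]),
    "form:dose": (totals["sol24"] - totals["sol12"]) - (totals["sus24"] - totals["sus12"]),
}
ss = {k: c ** 2 / n_obs for k, c in contrasts.items()}     # 1 df sums of squares
effects = {k: c / n_obs for k, c in contrasts.items()}     # deviations from the grand mean

residual_ms = 0.102626             # copied from the ANOVA table, 42 df
se_main = sqrt(residual_ms / 32)   # each main-effect level mean is based on 32 values
se_cell = sqrt(residual_ms / 16)   # each form:dose cell mean is based on 16 values

for k in contrasts:
    print(k, round(contrasts[k], 1), round(ss[k], 5), round(effects[k], 6))
print(round(se_main, 6), round(se_cell, 6))
```

The three 1 df sums of squares add to the 3 df sum of squares for treat in the first analysis, and the effects agree in size with the tabulated form, dose and form:dose effects.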
SJS SDI_8 31
Questions
• To what extent do you think that the model for analysis is appropriate?
• What sort of distribution might number of defectives have?
• How else might one analyse the data
  – If one knew the batch sizes?
  – If one did not?
• What further problems might there be?
According to C&K, in Exp_9 the response is the mean number of faulty items per batch, based on ten batches.