Chapter 6 The $2^k$ Factorial Design
6.1 Introduction
• A special case of the general factorial design (Chapter 5)
• k factors, each with only two levels
• Levels:
– quantitative (temperature, pressure, …) or qualitative (machine, operator, …)
– high and low
• Each replicate has $2 \times 2 \times \cdots \times 2 = 2^k$ observations
• Assumptions: (1) the factors are fixed, (2) the design is completely randomized, and (3) the usual normality assumptions are satisfied
• Widely used in factor screening experiments
6.2 The $2^2$ Factorial Design
• Two factors, A and B, and each factor has two levels, low and high.
• Example: the concentration of reactant vs. the amount of catalyst (Page 208)
• “–” and “+” denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different
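• Average effect of a factor = the change in response produced by a change in the level of that factor, averaged over the levels of the other factors.
• (1), a, b and ab: the totals of the n replicates taken at each treatment combination.
• The main effects:
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\{[ab - b] + [a - (1)]\} = \frac{1}{2n}[ab + a - b - (1)]$$
$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\{[ab - a] + [b - (1)]\} = \frac{1}{2n}[ab + b - a - (1)]$$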
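• The interaction effect:
$$AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\} = \frac{1}{2n}[ab + (1) - a - b] = \frac{ab + (1)}{2n} - \frac{a + b}{2n}$$
• In that example, A = 8.33, B = -5.00 and AB = 1.67
• Analysis of Variance
• The total effects (contrasts):
$$\text{Contrast}_A = ab + a - b - (1)$$
$$\text{Contrast}_B = ab + b - a - (1)$$
$$\text{Contrast}_{AB} = ab + (1) - a - b$$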
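• Sum of squares:
$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}$$
$$SS_B = \frac{[ab + b - a - (1)]^2}{4n}$$
$$SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$
$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{\cdots}^2}{4n}$$
$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$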
Response: Conversion
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model             291.67        3      97.22       24.82      0.0002
A                 208.33        1     208.33       53.19    < 0.0001
B                  75.00        1      75.00       19.15      0.0024
AB                  8.33        1       8.33        2.13      0.1828
Pure Error         31.33        8       3.92
Cor Total         323.00       11

Std. Dev.    1.98     R-Squared         0.9030
Mean        27.50     Adj R-Squared     0.8666
C.V.         7.20     Pred R-Squared    0.7817
PRESS       70.50     Adeq Precision   11.669

The F-test for the “model” source is testing the significance of the overall model; that is, is either A, B, or AB or some combination of these effects important?
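
The effect, sum-of-squares and F calculations for the $2^2$ example can be sketched in a few lines of Python. The replicate values below are an assumption: they are recalled from the textbook example and are not listed in these notes, although they do reproduce the effects and ANOVA entries quoted above. Variable names are illustrative only.

```python
# ASSUMPTION: replicate data recalled from the textbook example (n = 3 per cell);
# these values are not given in the notes themselves.
data = {
    "(1)": [28, 25, 27],
    "a":   [36, 32, 32],
    "b":   [18, 19, 23],
    "ab":  [31, 30, 29],
}
n = 3
tot = {label: sum(ys) for label, ys in data.items()}  # totals (1), a, b, ab

# Contrasts, effects (Contrast / 2n) and sums of squares (Contrast^2 / 4n)
contrast = {
    "A":  tot["ab"] + tot["a"] - tot["b"] - tot["(1)"],
    "B":  tot["ab"] + tot["b"] - tot["a"] - tot["(1)"],
    "AB": tot["ab"] + tot["(1)"] - tot["a"] - tot["b"],
}
effect = {term: c / (2 * n) for term, c in contrast.items()}
ss = {term: c ** 2 / (4 * n) for term, c in contrast.items()}

# Total and error sums of squares, then F = MS_effect / MS_error (each effect has 1 df)
all_y = [y for ys in data.values() for y in ys]
ss_total = sum(y * y for y in all_y) - sum(all_y) ** 2 / (4 * n)
ss_error = ss_total - sum(ss.values())
ms_error = ss_error / (4 * (n - 1))
f_value = {term: ss[term] / ms_error for term in ss}

for term in ("A", "B", "AB"):
    print(f"{term:>2}: effect={effect[term]:6.2f}  SS={ss[term]:7.2f}  F={f_value[term]:6.2f}")
print(f"SS_T={ss_total:.2f}  SS_E={ss_error:.2f}")
```

The p-values in the table would additionally need the F distribution, which is left out here to keep the sketch dependency-free.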
• Table of plus and minus signs:
I A B AB
(1) + – – +
a + + – –
b + – + –
ab + + + +
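• The regression model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$
– $x_1$ and $x_2$ are coded variables that represent the two factors, i.e. $x_1$ (or $x_2$) takes only the values –1 and +1:
$$x_1 = \frac{\text{Conc} - (\text{Conc}_{low} + \text{Conc}_{high})/2}{(\text{Conc}_{high} - \text{Conc}_{low})/2}$$
$$x_2 = \frac{\text{Catalyst} - (\text{Catalyst}_{low} + \text{Catalyst}_{high})/2}{(\text{Catalyst}_{high} - \text{Catalyst}_{low})/2}$$
– Use the least squares method to estimate the coefficients
– For that example,
$$\hat{y} = 27.5 + \left(\frac{8.33}{2}\right)x_1 + \left(\frac{-5.00}{2}\right)x_2$$
– Model adequacy: residuals (Pages 213~214)
• Response surface plot:
– Figure 6.3
$$\hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}$$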
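
As a rough check on how the coded model turns into the natural-unit model, the sketch below substitutes the coding formulas into the fitted equation. The factor ranges (15 to 25 for concentration, 1 to 2 for catalyst) are an assumption taken from the textbook example and do not appear in these notes; the function and variable names are illustrative.

```python
# ASSUMPTION: low/high levels recalled from the textbook example, not stated in the notes.
conc_low, conc_high = 15.0, 25.0   # reactant concentration
cat_low, cat_high = 1.0, 2.0       # amount of catalyst

def coded(value, low, high):
    """Map a natural-unit value to the coded scale (low -> -1, high -> +1)."""
    return (value - (low + high) / 2) / ((high - low) / 2)

# Coded-model coefficients: grand average and half of each effect estimate
b0, b1, b2 = 27.5, 8.33 / 2, -5.00 / 2

def predict(conc, catalyst):
    """Predicted conversion from the coded regression model."""
    return b0 + b1 * coded(conc, conc_low, conc_high) + b2 * coded(catalyst, cat_low, cat_high)

# Substituting the coding into the model gives the natural-unit coefficients
mid1, half1 = (conc_low + conc_high) / 2, (conc_high - conc_low) / 2
mid2, half2 = (cat_low + cat_high) / 2, (cat_high - cat_low) / 2
slope_conc = b1 / half1
slope_cat = b2 / half2
intercept = b0 - b1 * mid1 / half1 - b2 * mid2 / half2

print(f"y_hat = {intercept:.2f} + {slope_conc:.4f} Conc + ({slope_cat:.2f}) Catalyst")
```

Small differences in the last digit relative to the equation above come from using the rounded effect 8.33 rather than the unrounded estimate.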
6.3 The $2^3$ Design
• Three factors, A, B and C, and each factor has two levels. (Figure 6.4 (a))
• Design matrix (Figure 6.4 (b))
• (1), a, b, ab, c, ac, bc, abc
• 7 degrees of freedom: 1 for each main effect and 1 for each interaction
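• Estimate main effect:
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n}$$
$$A = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]$$
• Estimate two-factor interaction: one-half the difference between the average A effects at the two levels of B
$$AB = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n} = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)]$$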
• Three-factor interaction:
$$ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\} = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]$$
• Contrast: Table 6.3
– Equal number of plus and minus signs in every column except I
– The inner product of any two columns = 0
– I is an identity element
– The product of any two columns yields another column
– Orthogonal design
• Sum of squares: $SS = (\text{Contrast})^2 / (8n)$
Table of – and + Signs for the $2^3$ Factorial Design (pg. 218)

Treatment                     Factorial Effect
Combination    I     A     B     AB    C     AC    BC    ABC
(1)            +     –     –     +     –     +     +     –
a              +     +     –     –     –     –     +     +
b              +     –     +     –     –     +     –     +
ab             +     +     +     +     –     –     –     –
c              +     –     –     +     +     –     –     +
ac             +     +     –     –     +     +     –     –
bc             +     –     +     –     +     –     +     –
abc            +     +     +     +     +     +     +     +
Contrast             24    18    6     14    2     4     4
Effect               3.00  2.25  0.75  1.75  0.25  0.50  0.50
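
The sign table and its properties can be generated rather than typed. The following is a minimal sketch: it builds the columns in standard order as products of the factor columns, checks balance and orthogonality, and converts the tabulated contrasts into effects and sums of squares. The value n = 2 is inferred from Effect = Contrast/(4n) in the table above, and all names are illustrative.

```python
from itertools import combinations, product
from math import prod

factors = "ABC"
k = len(factors)

# Runs in standard order: A changes fastest, so reverse each product tuple.
runs = [levels[::-1] for levels in product((-1, 1), repeat=k)]
labels = ["".join(f.lower() for f, s in zip(factors, run) if s > 0) or "(1)"
          for run in runs]
effects = [label.upper() for label in labels[1:]]  # A, B, AB, C, AC, BC, ABC

def column(effect):
    """Sign column of an effect = elementwise product of its factor columns."""
    return [prod(run[factors.index(f)] for f in effect) for run in runs]

cols = {e: column(e) for e in effects}

# Every effect column is balanced, and any two columns are orthogonal.
balanced = all(sum(col) == 0 for col in cols.values())
orthogonal = all(sum(x * y for x, y in zip(cols[e1], cols[e2])) == 0
                 for e1, e2 in combinations(effects, 2))

# Contrasts copied from the table; n = 2 is inferred, not stated in the notes.
n = 2
contrasts = dict(zip(effects, [24, 18, 6, 14, 2, 4, 4]))
effect_est = {e: c / (4 * n) for e, c in contrasts.items()}
ss = {e: c ** 2 / (8 * n) for e, c in contrasts.items()}

print(labels)
print(balanced, orthogonal)
print(effect_est)
```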
• Example 6.1
A = gap, B = flow, C = power, y = etch rate
• The regression model and response surface:
– The regression model:
$$\hat{y} = 776.0625 + \left(\frac{-101.625}{2}\right)x_1 + \left(\frac{306.125}{2}\right)x_3 + \left(\frac{-153.625}{2}\right)x_1 x_3$$
– Response surface and contour plot (Figure 6.7)
6.4 The General $2^k$ Design
• k factors and each factor has two levels
• Interactions:
– $\binom{k}{2}$ two-factor interactions
– $\binom{k}{3}$ three-factor interactions
– …
– one k-factor interaction
• The standard order for a $2^4$ design: (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd
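• The general approach for the statistical analysis:
– Estimate factor effects
– Form initial model (full model)
– Perform analysis of variance (Table 6.9)
– Refine the model
– Analyze residuals
– Interpret results
• Contrasts, effects and sums of squares:
$$\text{Contrast}_{AB\cdots K} = (a \pm 1)(b \pm 1)\cdots(k \pm 1)$$
$$AB\cdots K = \frac{2}{n 2^k}\,\text{Contrast}_{AB\cdots K}$$
$$SS_{AB\cdots K} = \frac{1}{n 2^k}\,(\text{Contrast}_{AB\cdots K})^2$$
– In each set of parentheses the sign is negative if the factor is included in the effect and positive if it is not; after expanding, "1" is replaced by (1).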
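
To make the contrast rule concrete, here is a small sketch that expands the product $(a \pm 1)(b \pm 1)\cdots(k \pm 1)$ symbolically and then applies the effect and sum-of-squares formulas. The helper names `contrast_terms` and `estimate` are hypothetical, and the treatment totals used in the demonstration are an assumption (recalled from the textbook's $2^2$ example, not listed in these notes).

```python
from itertools import product

def contrast_terms(effect, factors):
    """Expand (a±1)(b±1)...: minus for factors in the effect, plus otherwise.

    Returns {treatment label: coefficient}; the constant term is labeled "(1)".
    """
    terms = {}
    # From each factor's parenthesis pick either the letter or the ±1 term.
    for choice in product((True, False), repeat=len(factors)):
        label, coef = "", 1
        for f, take_letter in zip(factors, choice):
            if take_letter:
                label += f.lower()
            elif f in effect:
                coef = -coef  # the "1" carries a minus sign for factors in the effect
        terms[label or "(1)"] = coef
    return terms

def estimate(effect, factors, totals, n):
    """Return (effect estimate, sum of squares) from treatment totals."""
    contrast = sum(c * totals[t] for t, c in contrast_terms(effect, factors).items())
    denom = n * 2 ** len(factors)
    return 2 * contrast / denom, contrast ** 2 / denom

# ASSUMPTION: 2^2 treatment totals with n = 3, recalled from the textbook example.
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}
print(contrast_terms("ABC", "ABC"))
print(estimate("A", "AB", totals, 3))
```

Expanding with `effect="ABC"` gives the same signs as the ABC column of the $2^3$ table, which is a quick way to confirm the rule.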
6.5 A Single Replicate of the $2^k$ Design
• These are $2^k$ factorial designs with one observation at each corner of the “cube”
• An unreplicated $2^k$ factorial design is also sometimes called a “single replicate” of the $2^k$
• If the factor levels are spaced too closely, the chance increases that the noise will overwhelm the signal in the data
• Lack of replication causes potential problems in statistical testing
– Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)
– With no replication, fitting the full model results in zero degrees of freedom for error
• Potential solutions to this problem
– Pooling high-order interactions to estimate error (sparsity of effects principle)
– Normal probability plotting of effects (Daniel, 1959)
• Example 6.2 (A single replicate of the $2^4$ design)
– A $2^4$ factorial was used to investigate the effects of four factors on the filtration rate of a resin
– The factors are A = temperature, B = pressure, C = concentration of formaldehyde, D = stirring rate
• Estimates of the effects

        Term        Effect     SumSqr     % Contribution
Model   Intercept
Error   A           21.625    1870.56     32.6397
Error   B            3.125      39.0625    0.681608
Error   C            9.875     390.062     6.80626
Error   D           14.625     855.563    14.9288
Error   AB           0.125       0.0625    0.00109057
Error   AC         -18.125    1314.06     22.9293
Error   AD          16.625    1105.56     19.2911
Error   BC           2.375      22.5625    0.393696
Error   BD          -0.375       0.5625    0.00981515
Error   CD          -1.125       5.0625    0.0883363
Error   ABC          1.875      14.0625    0.245379
Error   ABD          4.125      68.0625    1.18763
Error   ACD         -1.625      10.5625    0.184307
Error   BCD         -2.625      27.5625    0.480942
Error   ABCD         1.375       7.5625    0.131959

Lenth's ME 6.74778    Lenth's SME 13.699
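
The SumSqr and % Contribution columns follow directly from the effect estimates, and the quoted Lenth margin of error can be approximately reproduced as well. The sketch below assumes the usual definition of Lenth's pseudo standard error (1.5 times a trimmed median of the absolute effects, with m/3 degrees of freedom) and a tabulated t quantile; whether Design-Expert computes it exactly this way is an assumption, although the result agrees with the value shown.

```python
from statistics import median

# Effect estimates copied from the table above (single replicate, n = 1, k = 4)
effects = {
    "A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
    "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
    "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
    "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375,
}
n, k = 1, 4

# SS = Contrast^2 / (n 2^k) and Effect = 2 Contrast / (n 2^k)
#   =>  SS = n * 2^(k-2) * Effect^2
ss = {term: n * 2 ** (k - 2) * e ** 2 for term, e in effects.items()}
ss_total = sum(ss.values())
pct = {term: 100 * s / ss_total for term, s in ss.items()}

# Lenth's pseudo standard error (ASSUMPTION: standard definition)
abs_eff = [abs(e) for e in effects.values()]
s0 = 1.5 * median(abs_eff)
pse = 1.5 * median([e for e in abs_eff if e < 2.5 * s0])
T_975_DF5 = 2.5706            # tabulated t quantile, df = 15 / 3 = 5
margin_of_error = T_975_DF5 * pse

for term in sorted(pct, key=pct.get, reverse=True)[:5]:
    print(f"{term:>4}: SS={ss[term]:9.2f}  {pct[term]:6.2f}%")
print(f"PSE={pse:.3f}  ME={margin_of_error:.3f}")
```

Effects whose absolute value exceeds the margin of error (here A, C, D, AC and AD) are the ones that stand out on the probability plots.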
• The normal probability plot of the effects
[Figure: Design-Expert normal plot of the effects for Filtration Rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate). Horizontal axis: Effect; vertical axis: Normal % probability. Labeled points: A, C, D, AC, AD.]
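
The coordinates of such a normal probability plot can be computed without plotting software. This is a sketch under one common convention, the plotting position (i − 0.5)/m; Design-Expert may use a slightly different one, so the percentages are illustrative rather than a reproduction of the figure.

```python
from statistics import NormalDist

# Effect estimates from Example 6.2
effects = {
    "A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
    "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
    "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
    "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375,
}

ordered = sorted(effects.items(), key=lambda item: item[1])
m = len(ordered)
std_normal = NormalDist()

points = []
for i, (term, est) in enumerate(ordered, start=1):
    p = (i - 0.5) / m              # ASSUMPTION: (i - 0.5)/m plotting position
    z = std_normal.inv_cdf(p)      # matching standard normal quantile
    points.append((term, est, 100 * p, z))
    print(f"{term:>4}  effect={est:8.3f}  cum.prob={100 * p:5.1f}%  z={z:6.2f}")
```

Negligible effects fall close to a straight line through the middle of the plot, while AC at the low end and C, D, AD and A at the high end sit well away from it.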
[Figure: Design-Expert interaction graphs for Filtration Rate. Left: A (Temperature) × C (Concentration), with B: Pressure = 0.00 and D: Stirring Rate = 0.00. Right: A (Temperature) × D (Stirring Rate), with B: Pressure = 0.00 and C: Concentration = 0.00. Horizontal axis: A: Temperature in coded units from -1.00 to 1.00; vertical axis: Filtration Rate.]
• B is not significant and all interactions involving B are negligible
• Design projection: $2^4$ design ⇒ $2^3$ design in A, C and D
• ANOVA table (Table 6.13)
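
Design projection can be illustrated by dropping the B column from the 16 runs and counting how often each remaining combination occurs. A minimal sketch, with illustrative variable names:

```python
from collections import Counter
from itertools import product

factors = "ABCD"
runs = list(product((-1, 1), repeat=len(factors)))  # the 16 runs of the 2^4 design

# Keep only the columns for A, C and D (i.e. ignore B)
keep = [factors.index(f) for f in "ACD"]
projected = Counter(tuple(run[i] for i in keep) for run in runs)

for combo, count in sorted(projected.items()):
    print(combo, count)
```

Each of the 8 combinations of A, C and D appears twice, so the projected design behaves like a replicated $2^3$ and leaves degrees of freedom for error.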
Response: Filtration Rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model            5535.81        5     1107.16      56.74    < 0.0001
A                1870.56        1     1870.56      95.86    < 0.0001
C                 390.06        1      390.06      19.99      0.0012
D                 855.56        1      855.56      43.85    < 0.0001
AC               1314.06        1     1314.06      67.34    < 0.0001
AD               1105.56        1     1105.56      56.66    < 0.0001
Residual          195.12       10       19.51
Cor Total        5730.94       15

Std. Dev.     4.42     R-Squared         0.9660
Mean         70.06     Adj R-Squared     0.9489
C.V.          6.30     Pred R-Squared    0.9128
PRESS       499.52     Adeq Precision   20.841
• The regression model:
Final Equation in Terms of Coded Factors:
Filtration Rate = +70.06250
    +10.81250 * Temperature
     +4.93750 * Concentration
     +7.31250 * Stirring Rate
     -9.06250 * Temperature * Concentration
     +8.31250 * Temperature * Stirring Rate
• Residual Analysis (P. 235)
• Response surface (P. 236)
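
The final equation can be evaluated at any point in coded units. The sketch below stores the coefficients by term and also checks that each coefficient is half of the corresponding effect estimate from Example 6.2; the function name `predict` is illustrative.

```python
from math import prod

# Coefficients of the final equation in coded factors, keyed by the factors in each term
coef = {
    (): 70.0625,
    ("A",): 10.8125,
    ("C",): 4.9375,
    ("D",): 7.3125,
    ("A", "C"): -9.0625,
    ("A", "D"): 8.3125,
}

def predict(A, C, D):
    """Predicted filtration rate at coded levels A, C, D (each between -1 and +1)."""
    x = {"A": A, "C": C, "D": D}
    # prod() of an empty sequence is 1, which handles the intercept term
    return sum(b * prod(x[f] for f in term) for term, b in coef.items())

# Regression coefficient = effect estimate / 2
effect_A, effect_AC = 21.625, -18.125
print(coef[("A",)] == effect_A / 2, coef[("A", "C")] == effect_AC / 2)

print(predict(1, -1, 1))    # high temperature, low concentration, high stirring rate
print(predict(0, 0, 0))     # design center: the grand average
```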
• Half-normal plot: the absolute values of the effect estimates plotted against their cumulative normal probabilities.
[Figure: Design-Expert half-normal plot of the effects for Filtration Rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate). Horizontal axis: |Effect|; vertical axis: Half Normal % probability. Labeled points: A, C, D, AC, AD.]
• Example 6.3 (Data transformation in a Factorial Design)
A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill
• The half-normal probability plot of the effect estimates
[Figure: Design-Expert half-normal plot of the effects for adv._rate (A: load, B: flow, C: speed, D: mud). Horizontal axis: |Effect|; vertical axis: Half Normal % probability. Labeled points: B, C, D, BC, BD.]
• Residual analysis
[Figure: Design-Expert residual plots for adv._rate. Left: normal plot of residuals (Residual vs. Normal % probability). Right: Residuals vs. Predicted.]
• The residual plots indicate that there are problems with the equality of variance assumption
• The usual approach to this problem is to employ a transformation on the response
• In this example, $y^* = \ln y$
[Figure: Design-Expert half-normal plot of the effects for Ln(adv._rate) (A: load, B: flow, C: speed, D: mud). Horizontal axis: |Effect|; vertical axis: Half Normal % probability. Labeled points: B, C, D.]
• Three main effects are large
• No indication of large interaction effects
• What happened to the interactions?
Response: adv._rate    Transform: Natural log    Constant: 0.000
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model               7.11        3       2.37      164.82    < 0.0001
B                   5.35        1       5.35      371.49    < 0.0001
C                   1.34        1       1.34       93.05    < 0.0001
D                   0.43        1       0.43       29.92      0.0001
Residual            0.17       12       0.014
Cor Total           7.29       15

Std. Dev.    0.12     R-Squared         0.9763
Mean         1.60     Adj R-Squared     0.9704
C.V.         7.51     Pred R-Squared    0.9579
PRESS        0.31     Adeq Precision   34.391
• Following the log transformation
Final Equation in Terms of Coded Factors:
Ln(adv._rate) = +1.60 + 0.58 * B + 0.29 * C + 0.16 * D
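
One way to see what happened to the interactions is to back-transform the fitted model. A model that is additive in ln(y) is multiplicative in y, so the effect of B on the original scale depends on the levels of C and D even though no interaction term is present. The sketch below uses only the coefficients of the final equation; the function names are illustrative.

```python
from math import exp

def predict_ln(B, C, D):
    """Fitted model on the log scale (coded factors)."""
    return 1.60 + 0.58 * B + 0.29 * C + 0.16 * D

def predict_rate(B, C, D):
    """Back-transformed prediction on the original advance-rate scale."""
    return exp(predict_ln(B, C, D))

settings = [(c, d) for c in (-1, 1) for d in (-1, 1)]

# Ratio of predictions at high vs. low B: the same at every (C, D)
ratios = [predict_rate(1, c, d) / predict_rate(-1, c, d) for c, d in settings]

# Difference of predictions at high vs. low B: changes with (C, D),
# which is what showed up as BC and BD interactions before transforming
diffs = [predict_rate(1, c, d) - predict_rate(-1, c, d) for c, d in settings]

for (c, d), r, delta in zip(settings, ratios, diffs):
    print(f"C={c:+d} D={d:+d}  ratio={r:.3f}  difference={delta:.2f}")
```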
[Figure: Design-Expert residual plots for Ln(adv._rate). Left: normal plot of residuals (Residual vs. Normal % probability). Right: Residuals vs. Predicted.]
• Example 6.4:
– Two factors (A and C) affect the mean number of defects
– A third factor (B) affects variability
– Residual plots were useful in identifying the dispersion effect
– The magnitude of the dispersion effects:
$$F_i^* = \ln\frac{S^2(i^+)}{S^2(i^-)}$$
where $S(i^+)$ and $S(i^-)$ are the standard deviations of the residuals at the high and low levels of effect i
– When the variances of the residuals at the positive and negative levels are equal, this statistic has an approximate normal distribution
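
The dispersion statistic is straightforward to compute once the residuals are split by the sign of a factor. In the sketch below the residuals are made up purely for illustration (they are not the Example 6.4 data), and `dispersion_stat` is a hypothetical helper name.

```python
from math import log
from statistics import variance

def dispersion_stat(residuals, signs):
    """F* = ln( S^2(i+) / S^2(i-) ) for one effect column of -1/+1 signs."""
    high = [r for r, s in zip(residuals, signs) if s > 0]
    low = [r for r, s in zip(residuals, signs) if s < 0]
    return log(variance(high) / variance(low))

# HYPOTHETICAL residuals for 8 runs in standard order (illustration only)
residuals = [-0.2, 0.3, -2.1, 1.8, 0.1, -0.4, 2.4, -1.9]
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]

print(f"F*_A = {dispersion_stat(residuals, A):.2f}")
print(f"F*_B = {dispersion_stat(residuals, B):.2f}")
```

In this invented data set the residuals spread out much more at the high level of B, so the statistic for B is large while the one for A stays near zero.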
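6.7 $2^k$ Designs are Optimal Designs
• Consider a $2^2$ design with one replicate.
• Fit the following model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$
• Matrix form:
$$y = X\beta + \varepsilon, \qquad
y = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}$$
• The LS estimation:
$$\hat{\beta} = (X'X)^{-1}X'y = \frac{1}{4}\begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix}$$
• D-optimal criterion, |X’X|: the volume of the joint confidence region that contains all coefficients is inversely proportional to the square root of |X’X|.
• G-optimal design: minimizes the maximum prediction variance over the design region,
$$\min \max \text{Var}(\hat{y}), \qquad \text{Var}(\hat{y}) = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$$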
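
The quantities behind both criteria are easy to check numerically for the $2^2$ design. The sketch below builds X, confirms that X'X = 4I (so |X'X| = 256 and $(X'X)^{-1} = I/4$), and scans the scaled prediction variance over a grid on the square. It is a plain-Python illustration with made-up variable names, not a general optimal-design routine.

```python
from itertools import product
from math import prod

# Model matrix for rows (1), a, b, ab and columns 1, x1, x2, x1*x2
X = [[1, x1, x2, x1 * x2] for x2, x1 in product((-1, 1), repeat=2)]
p = len(X[0])

# X'X: orthogonal columns give a diagonal matrix with 4 on the diagonal
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
is_diagonal = all(XtX[i][j] == 0 for i in range(p) for j in range(p) if i != j)
det_XtX = prod(XtX[i][i] for i in range(p))   # valid because X'X is diagonal

def scaled_pred_var(x1, x2):
    """Var(y_hat) / sigma^2 = x'(X'X)^{-1} x, using (X'X)^{-1} = I/4."""
    x = [1, x1, x2, x1 * x2]
    return sum(v * v / XtX[i][i] for i, v in enumerate(x))

grid = [i / 10 for i in range(-10, 11)]
max_var = max(scaled_pred_var(u, v) for u in grid for v in grid)

print(XtX)
print(det_XtX, max_var)
```

The maximum of Var(ŷ)/σ² over the region is 1, attained at the four corners, which equals p/N = 4/4 for this design.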
6.8 The Addition of Center Points to the $2^k$ Design
• Based on the idea of replicating some of the runs in a factorial design
• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:
First-order model (interaction):
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \varepsilon$$
Second-order model:
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$
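$\bar{y}_F = \bar{y}_C \Rightarrow$ no "curvature", where $\bar{y}_F$ is the average of the $n_F$ factorial runs and $\bar{y}_C$ is the average of the $n_C$ center-point runs
The hypotheses are:
$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0 \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0$$
$$SS_{\text{Pure Quad}} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$
This sum of squares has a single degree of freedom
To detect the possibility of the quadratic effects: add center points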
• Example 6.6
Refer to the original experiment shown in Table 6.10. Suppose that four center points ($n_C = 4$) are added to this experiment, and at the points x1 = x2 = x3 = x4 = 0 the four observed filtration rates were 73, 75, 66, and 69. The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06. Since these two averages are very similar, we suspect that there is no strong curvature present.
Usually between 3 and 6 center points will work well
Design-Expert provides the analysis, including the F-test for pure quadratic curvature
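
The curvature check for Example 6.6 can be sketched by hand. Two assumptions are made here: the factorial average is taken as 70.0625 (the intercept of the fitted model, which rounds to the 70.06 quoted above), and the error mean square is estimated from the four center points alone, giving 3 degrees of freedom. Design-Expert's own output may pool error differently.

```python
from statistics import mean, variance

center = [73, 75, 66, 69]            # filtration rates at the center points
n_f, ybar_f = 16, 70.0625            # ASSUMPTION: unrounded factorial average
n_c, ybar_c = len(center), mean(center)

# Single-degree-of-freedom sum of squares for pure quadratic curvature
ss_pure_quad = n_f * n_c * (ybar_f - ybar_c) ** 2 / (n_f + n_c)

# ASSUMPTION: pure error estimated from the center points only (n_c - 1 df)
ms_error = variance(center)
f0 = ss_pure_quad / ms_error

print(f"ybar_C = {ybar_c:.2f}")
print(f"SS_PureQuad = {ss_pure_quad:.4f}, MS_E = {ms_error:.2f}, F0 = {f0:.4f}")
```

An F ratio this far below 1 is consistent with the conclusion that there is no strong curvature.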
• If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model