-
Hae-Jin Choi School of Mechanical Engineering,
Chung-Ang University
2. Experiments with One Factor
(Ch.3. Analysis of Variance ANOVA)
1 DOE and Optimization
-
Experiments with One Factor
Plasma etching process:
-
Plasma etching process:
An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate.
The response variable is etch rate.
She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power.
The experimenter chooses 4 levels of RF power: 160W, 180W, 200W, and 220W
The experiment is replicated 5 times, with the runs made in random order
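As a minimal sketch of how such a randomized run order could be generated, the Python snippet below shuffles the 4 x 5 = 20 runs. The function name `randomized_run_order` and the fixed seed are illustrative assumptions, not part of the original experiment.

```python
import random

def randomized_run_order(levels, replicates, seed=None):
    """Return a completely randomized run order as (run number, level) pairs."""
    # One entry per wafer: every level appears `replicates` times.
    runs = [level for level in levels for _ in range(replicates)]
    rng = random.Random(seed)  # a seed is used only to make the sketch reproducible
    rng.shuffle(runs)
    return list(enumerate(runs, start=1))

power_levels_w = [160, 180, 200, 220]  # RF power levels from the slide
run_order = randomized_run_order(power_levels_w, replicates=5, seed=1)

for run, power in run_order:
    print(f"run {run:2d}: {power} W")
```

Each power level still appears exactly five times; only the sequence in which the wafers are processed is randomized.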
-
Plasma etching process:
-
Does changing the power change the
mean etch rate?
We would like to have a systematic way
to answer these questions
Analysis of Variance (ANOVA)
-
The Analysis of Variance (ANOVA)
In general, there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order: a completely randomized design (CRD)
N = an total runs
We consider the fixed effects case; the random effects case will be discussed later
Objective is to test hypotheses about the equality of the a treatment means
-
Statistical Hypothesis
A statistical hypothesis is a statement about the parameters of a
probability distribution. For example, we may think that the mean
values of the a distributions are equal. This may be stated formally as
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a \quad \text{(null hypothesis)}$$
$$H_1: \text{at least one mean is different} \quad \text{(alternative hypothesis)}$$
Type I error: the null hypothesis is rejected when it is true.
Type II error: the null hypothesis is not rejected when it is false.
-
Basic Single-factor ANOVA model
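$y_{i.}$: the total of the observations for the ith factor level
$\bar{y}_{i.}$: the average of the observations under the ith factor level
$y_{..}$: the grand total of all observations
$\bar{y}_{..}$: the grand average of all observations
$$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a$$
$$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N$$
The factor effects $\tau_i$ are constrained so that
$$\sum_{i=1}^{a} \tau_i = 0$$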
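The dot notation can be illustrated with a short Python sketch. The raw etch rates below are an assumption: the data table did not survive in these notes, so the values are recalled from the standard textbook version of this example. Their level averages (551.2, 587.4, 625.4, 707.0) match the ones used in the contrast calculations later in this deck.

```python
# ASSUMED data (not present in the slide text): etch rate by RF power in W,
# five wafers per power level.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

N = sum(len(obs) for obs in etch_rate.values())                  # N = a * n
level_totals = {p: sum(obs) for p, obs in etch_rate.items()}     # y_i.
level_averages = {p: level_totals[p] / len(etch_rate[p])         # ybar_i.
                  for p in etch_rate}
grand_total = sum(level_totals.values())                         # y_..
grand_average = grand_total / N                                  # ybar_..

for p in etch_rate:
    print(f"{p} W: total = {level_totals[p]}, average = {level_averages[p]:.1f}")
print(f"grand total = {grand_total}, grand average = {grand_average:.2f}")
```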
-
Basic Single-factor ANOVA model
$$y_{ij} = \mu + \tau_i + \epsilon_{ij}$$
[Figure: observations at factor levels i = 1, i = 2, i = 3, i = 4]
-
Basic Single-factor ANOVA model
$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$
$\mu$ = a parameter common to all factor levels, called the overall mean,
$\tau_i$ = the deviation at the ith factor level from the overall mean, called the ith factor effect,
$\epsilon_{ij}$ = random error, $\epsilon_{ij} \sim \mathrm{NID}(0, \sigma^2)$
The random error typically results from measurement error, the effects of variables not included in the experiment, and so on. The errors are assumed to be
(1) normally and independently distributed random variables
(2) with zero mean
(3) with a variance $\sigma^2$ that is constant for all levels of the factor.
-
The Analysis of Variance
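Total variability is measured by the total sum of
squares:
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$
or
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an}$$
The basic ANOVA partitioning is:
$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$$
$$= n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
$$SS_T = SS_{Treatments} + SS_E$$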
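The partition drops a cross-product term when the square is expanded. The short derivation below, written with the same symbols, sketches why that term is zero: the deviations of the observations from their own factor-level average always sum to zero.

```latex
\begin{align*}
\sum_{i=1}^{a}\sum_{j=1}^{n}\left[(\bar{y}_{i.}-\bar{y}_{..})+(y_{ij}-\bar{y}_{i.})\right]^2
&= n\sum_{i=1}^{a}(\bar{y}_{i.}-\bar{y}_{..})^2
 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i.})^2 \\
&\quad + 2\sum_{i=1}^{a}(\bar{y}_{i.}-\bar{y}_{..})\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i.}), \\
\sum_{j=1}^{n}(y_{ij}-\bar{y}_{i.})
&= y_{i.} - n\,\bar{y}_{i.} = y_{i.} - n\,\frac{y_{i.}}{n} = 0 .
\end{align*}
```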
-
The Analysis of Variance
$SS_{Treatments}$ is the sum of squares due to the factor, that is, the sum of squares of the differences between the factor-level averages and the grand average. It measures the differences between factor levels.
$$SS_{Treatments} = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2$$
or
$$SS_{Treatments} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{an}$$
A large value of $SS_{Treatments}$ reflects large differences in treatment means
A small value of $SS_{Treatments}$ likely indicates no differences in treatment means
-
The Analysis of Variance
$SS_E$ is the sum of squares due to error, that is, the sum of squares of the differences between the observations within a factor level and that factor-level average. It measures the random error.
$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
or
$$SS_E = SS_T - SS_{Treatments}$$
If $SS_{Treatments}$ is large, it is due to differences among the means at the
different factor levels.
Thus, by comparing the magnitude of $SS_{Treatments}$ to $SS_E$ we can see how
much variability is due to changing factor levels and how much is due to
error.
This comparison is facilitated if we first divide these sums by their numbers
of degrees of freedom.
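The three sums of squares can be checked numerically. The sketch below evaluates both the definitional and the computational ("or") forms and confirms the partition. The raw etch rates are an assumption (recalled from the standard textbook version of the plasma etching example, since the data table is missing from these notes); their level averages agree with the 551.2, 587.4, 625.4 and 707.0 quoted later in this deck.

```python
# ASSUMED data (not present in the slide text): etch rate by RF power in W.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}
a, n = len(etch_rate), 5          # a levels, n replicates per level
N = a * n

all_obs = [y for obs in etch_rate.values() for y in obs]
grand_total = sum(all_obs)                                   # y_..
grand_avg = grand_total / N                                  # ybar_..
level_avg = {p: sum(obs) / n for p, obs in etch_rate.items()}  # ybar_i.

# Definitional forms
ss_total = sum((y - grand_avg) ** 2 for y in all_obs)
ss_treatments = n * sum((m - grand_avg) ** 2 for m in level_avg.values())
ss_error = sum((y - level_avg[p]) ** 2
               for p, obs in etch_rate.items() for y in obs)

# Computational forms
ss_total_alt = sum(y * y for y in all_obs) - grand_total ** 2 / N
ss_treatments_alt = (sum(sum(obs) ** 2 / n for obs in etch_rate.values())
                     - grand_total ** 2 / N)

print(f"SS_T = {ss_total:.2f}")
print(f"SS_Treatments = {ss_treatments:.2f}")
print(f"SS_E = {ss_error:.2f}")
print(f"SS_Treatments + SS_E = {ss_treatments + ss_error:.2f}")
```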
-
The Analysis of Variance
While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.
A mean square is a sum of squares divided by its degrees of freedom:
$$df_{Total} = df_{Treatments} + df_{Error}$$
$$an - 1 = (a - 1) + a(n - 1)$$
$$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \qquad MS_E = \frac{SS_E}{a(n - 1)}$$
If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal, because both then estimate the error variance $\sigma^2$.
If treatment means differ, the treatment mean square will tend to be larger than the error mean square.
-
ANOVA Table
When the null hypothesis is true, the ratio of $MS_{Treatments}$ to $MS_E$ follows an $F_{u,v}$ distribution,
which is completely determined by u, the numerator degrees of
freedom, and v, the denominator degrees of freedom.
In analysis of variance, u is equal to (a-1), the degrees of freedom
of $MS_{Treatments}$, and v is equal to a(n-1), the degrees of freedom of $MS_E$.
The test statistic is
$$F_0 = \frac{MS_{Treatments}}{MS_E}$$
If $F_0 > F_{\alpha, a-1, a(n-1)}$, we may conclude that the factor-level means are
different
-
ANOVA Table
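The reference distribution for $F_0$ is the $F_{a-1, a(n-1)}$ distribution
Reject the null hypothesis (equal treatment means) if
$$F_0 > F_{\alpha, a-1, a(n-1)}$$
The P-value is the probability, computed assuming the null hypothesis is TRUE, of obtaining a test statistic at least as large as the observed $F_0$. Equivalently, it is the smallest significance level $\alpha$ (probability of a Type I error) at which the null hypothesis would be rejected.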
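A minimal sketch of the F test for the plasma etching experiment follows. Two things are assumptions here: the raw etch rates (recalled from the standard textbook version of this example, because the data table is missing from these notes) and the hard-coded critical value 3.24, which is the tabulated upper 5% point of the F distribution with 3 and 16 degrees of freedom.

```python
# ASSUMED data (not present in the slide text): etch rate by RF power in W.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}
a, n = len(etch_rate), 5
N = a * n

grand_avg = sum(sum(obs) for obs in etch_rate.values()) / N
level_avg = {p: sum(obs) / n for p, obs in etch_rate.items()}

ss_treatments = n * sum((m - grand_avg) ** 2 for m in level_avg.values())
ss_error = sum((y - level_avg[p]) ** 2
               for p, obs in etch_rate.items() for y in obs)

df_treatments = a - 1            # numerator degrees of freedom
df_error = a * (n - 1)           # denominator degrees of freedom
ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error
f0 = ms_treatments / ms_error

F_CRIT_05_3_16 = 3.24            # tabulated F(0.05; 3, 16), an assumed table lookup
reject_h0 = f0 > F_CRIT_05_3_16

print(f"MS_Treatments = {ms_treatments:.2f}, MS_E = {ms_error:.2f}")
print(f"F0 = {f0:.2f}, reject H0 at alpha = 0.05: {reject_h0}")
```

With these assumed data the test statistic is far above the critical value, so the hypothesis of equal mean etch rates at the four power levels would be rejected.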
-
Plasma etching process:
-
Plasma etching process:
-
Why Does the ANOVA Work?
We are sampling from normal populations, so
$$\frac{SS_{Treatments}}{\sigma^2} \sim \chi^2_{a-1} \ \text{ if } H_0 \text{ is true, and } \ \frac{SS_E}{\sigma^2} \sim \chi^2_{a(n-1)}$$
Cochran's theorem gives the independence of
these two chi-square random variables
So
$$F_0 = \frac{SS_{Treatments}/(a-1)}{SS_E/[a(n-1)]} \sim \frac{\chi^2_{a-1}/(a-1)}{\chi^2_{a(n-1)}/[a(n-1)]} \sim F_{a-1, a(n-1)}$$
Finally,
$$E(MS_{Treatments}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1} \quad \text{and} \quad E(MS_E) = \sigma^2$$
Therefore an upper-tail F test is appropriate.
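The claim that $F_0$ follows the $F_{a-1, a(n-1)}$ distribution under the null hypothesis can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original slides: it draws all observations from one standard normal population (so every $\tau_i = 0$), uses a = 4 and n = 5 as in the etching experiment, and compares against the tabulated value $F_{0.05,3,16} = 3.24$. The rejection rate should land near 0.05.

```python
import random

def f_statistic(groups):
    """F0 = MS_Treatments / MS_E for equally sized groups."""
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a                      # valid because group sizes are equal
    ms_treatments = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_error = sum((y - m) ** 2
                   for g, m in zip(groups, means) for y in g) / (a * (n - 1))
    return ms_treatments / ms_error

rng = random.Random(0)      # fixed seed, chosen only for reproducibility
a, n, trials = 4, 5, 5000
F_CRIT_05_3_16 = 3.24       # tabulated F(0.05; 3, 16), an assumed table lookup

exceed = 0
for _ in range(trials):
    # H0 is true by construction: all levels share the same mean and variance.
    groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(a)]
    if f_statistic(groups) > F_CRIT_05_3_16:
        exceed += 1

rejection_rate = exceed / trials
print(f"fraction of simulated F0 above 3.24: {rejection_rate:.3f}")
```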
-
Residual Analysis
We define a residual as the difference between the actual
observation $y_{ij}$ and the factor-level mean $\bar{y}_{i.}$. Therefore, the residual is
$$e_{ij} = y_{ij} - \bar{y}_{i.}$$
the difference between an observation and the corresponding
factor-level mean.
The normality assumption can be checked by plotting the residuals
on normal probability paper.
Residual Plot
Normal Probability Plot
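A short sketch of the residual calculation is given below. The raw etch rates are an assumption (recalled from the standard textbook version of the plasma etching example, since the data table is missing from these notes); the level averages they produce match those quoted in the contrast slides.

```python
# ASSUMED data (not present in the slide text): etch rate by RF power in W.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

# e_ij = y_ij - ybar_i. for every observation, grouped by factor level
residuals = {}
for power, obs in etch_rate.items():
    level_avg = sum(obs) / len(obs)
    residuals[power] = [y - level_avg for y in obs]

for power, e in residuals.items():
    formatted = ", ".join(f"{x:6.1f}" for x in e)
    print(f"{power} W: {formatted}")
```

By construction the residuals within each factor level sum to zero, so a residual plot shows only the spread around each level average.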
-
Residual Analysis
Plasma etching process
-
Residual Plot
Plasma etching process
-
Residual Plot
To check the assumption of zero mean and equal variances at each
factor level, plot the residuals against the factor levels and compare
the spread in the residuals.
The independence assumption can be checked by plotting the
residuals against the run order in which the experiment was
performed.
A pattern in this plot, such as sequences of positive and negative
residuals, may indicate that the observations are not independent.
-
Normal Probability Plot
-
Normal Probability Plot
A normal probability plot is a graph of the cumulative distribution
of the residuals on normal probability paper, that is, graph paper with the ordinate
scaled so that the cumulative normal distribution plots as a straight
line.
To construct a normal probability plot, arrange the residuals in
increasing order and plot the kth of these ordered residuals versus
the cumulative probability point,
$$P_k = (k - 0.5)/N$$
on normal probability paper. If the underlying error distribution is
normal, this plot will resemble a straight line
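The construction can be sketched in Python without plotting. The snippet orders the residuals, assigns $P_k = (k - 0.5)/N$, and converts each $P_k$ to a standard normal quantile, which plays the role of the scaled ordinate of normal probability paper. The raw etch rates are an assumption (recalled from the standard textbook version of this example, since the data table is missing from these notes).

```python
from statistics import NormalDist

# ASSUMED data (not present in the slide text): etch rate by RF power in W.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}

# Residuals e_ij = y_ij - ybar_i., arranged in increasing order
ordered = sorted(y - sum(obs) / len(obs)
                 for obs in etch_rate.values() for y in obs)
N = len(ordered)

standard_normal = NormalDist()          # mean 0, standard deviation 1
points = []
for k, e in enumerate(ordered, start=1):
    p_k = (k - 0.5) / N                 # cumulative probability point
    z_k = standard_normal.inv_cdf(p_k)  # position on the normal probability scale
    points.append((e, p_k, z_k))

for e, p_k, z_k in points:
    print(f"residual = {e:6.1f}   P_k = {p_k:.3f}   z_k = {z_k:6.3f}")
```

Plotting `z_k` against the ordered residuals gives the normal probability plot; points that fall close to a straight line support the normality assumption.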
-
Practical Interpretation of Results
Regression Model
Contrasts
-
Regression Model
Plasma etching process
-
Contrasts
We might suspect at the outset of the experiment that 200W and
220W produce the same etch rate, implying that we would like to
test the hypothesis:
$$H_0: \mu_3 = \mu_4$$
$$H_1: \mu_3 \neq \mu_4$$
or, equivalently,
$$\mu_3 - \mu_4 = 0$$
This hypothesis could be tested by investigating an appropriate
linear combination of factor-level averages, say
$$\bar{y}_{3.} - \bar{y}_{4.} = 0$$
-
Contrasts
If we had suspected that the average of the 160W and 180W levels did not
differ from the average of the 200W and 220W levels, the hypothesis
would have been
$$H_0: \mu_1 + \mu_2 = \mu_3 + \mu_4$$
$$H_1: \mu_1 + \mu_2 \neq \mu_3 + \mu_4$$
or, equivalently,
$$\mu_1 + \mu_2 - \mu_3 - \mu_4 = 0$$
which implies the linear combination
$$\bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.} = 0$$
-
Contrasts
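In general, the comparison of factor-level means of interest will
imply a linear combination of factor-level averages such as
$$C = \sum_{i=1}^{a} c_i \bar{y}_{i.}$$
with the restriction that
$$\sum_{i=1}^{a} c_i = 0.$$
Such linear combinations are
called contrasts. The sum of squares of any contrast is
$$SS_C = \frac{\left(\sum_{i=1}^{a} c_i \bar{y}_{i.}\right)^2}{\frac{1}{n}\sum_{i=1}^{a} c_i^2}$$
With equal sample sizes the same quantity can be written in terms of the factor-level totals, since $\bar{y}_{i.} = y_{i.}/n$: $SS_C = \left(\sum_{i=1}^{a} c_i y_{i.}\right)^2 / \left(n\sum_{i=1}^{a} c_i^2\right)$.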
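The contrast formula translates directly into a small helper. In the sketch below the function name `contrast_sum_of_squares` is an illustrative assumption; the factor-level averages (551.2, 587.4, 625.4, 707.0) and n = 5 are the values used for the plasma etching example in this deck.

```python
def contrast_sum_of_squares(coeffs, averages, n):
    """Return (C, SS_C) for a contrast in the factor-level averages."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    c = sum(ci * yi for ci, yi in zip(coeffs, averages))
    ss_c = c ** 2 / (sum(ci ** 2 for ci in coeffs) / n)
    return c, ss_c

averages = [551.2, 587.4, 625.4, 707.0]   # 160 W, 180 W, 200 W, 220 W
contrasts = {
    "C1": [1, -1, 0, 0],     # H0: mu1 = mu2
    "C2": [1, 1, -1, -1],    # H0: mu1 + mu2 = mu3 + mu4
    "C3": [0, 0, 1, -1],     # H0: mu3 = mu4
}

results = {name: contrast_sum_of_squares(c, averages, n=5)
           for name, c in contrasts.items()}
for name, (c, ss_c) in results.items():
    print(f"{name} = {c:8.1f}   SS_{name} = {ss_c:10.2f}")
```

These three contrasts are mutually orthogonal, so their sums of squares add up to the treatment sum of squares for this example.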
-
Contrasts
Plasma etching problem
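Hypothesis                                  Contrast
$H_0: \mu_1 = \mu_2$                        $C_1 = \bar{y}_{1.} - \bar{y}_{2.}$
$H_0: \mu_1 + \mu_2 = \mu_3 + \mu_4$        $C_2 = \bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.}$
$H_0: \mu_3 = \mu_4$                        $C_3 = \bar{y}_{3.} - \bar{y}_{4.}$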
-
Contrasts
Plasma etching problem
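$$C_1 = +1(551.2) - 1(587.4) = -36.2$$
$$SS_{C_1} = \frac{(-36.2)^2}{\frac{1}{5}(2)} = 3276.10$$
$$C_2 = +1(551.2) + 1(587.4) - 1(625.4) - 1(707.0) = -193.8$$
$$SS_{C_2} = \frac{(-193.8)^2}{\frac{1}{5}(4)} = 46{,}948.05$$
$$C_3 = +1(625.4) - 1(707.0) = -81.6$$
$$SS_{C_3} = \frac{(-81.6)^2}{\frac{1}{5}(2)} = 16{,}646.40$$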
-
Contrasts
Plasma etching problem
A contrast is tested by comparing its sum of squares to the error mean square, $F_0 = SS_C / MS_E$. Under the null hypothesis the resulting statistic is distributed as F with 1 and a(n-1) degrees of freedom.
A contrast always has a single degree of freedom, because it compares two quantities (two groups of factor-level averages) through one equation.
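As a sketch of the contrast tests for the plasma etching problem, the snippet below forms $F_0 = SS_C / MS_E$ for the three contrasts. Two inputs are assumptions: the raw etch rates (recalled from the standard textbook version of this example, because the data table is missing from these notes) and the hard-coded critical value 4.49, the tabulated upper 5% point of the F distribution with 1 and 16 degrees of freedom.

```python
# ASSUMED data (not present in the slide text): etch rate by RF power in W.
etch_rate = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}
a, n = len(etch_rate), 5
averages = [sum(obs) / n for obs in etch_rate.values()]     # ybar_i.

# Error mean square with a(n - 1) degrees of freedom
ss_error = sum((y - avg) ** 2
               for obs, avg in zip(etch_rate.values(), averages) for y in obs)
ms_error = ss_error / (a * (n - 1))

contrasts = {
    "C1": [1, -1, 0, 0],     # H0: mu1 = mu2
    "C2": [1, 1, -1, -1],    # H0: mu1 + mu2 = mu3 + mu4
    "C3": [0, 0, 1, -1],     # H0: mu3 = mu4
}
F_CRIT_05_1_16 = 4.49        # tabulated F(0.05; 1, 16), an assumed table lookup

f_ratios = {}
for name, coeffs in contrasts.items():
    c = sum(ci * yi for ci, yi in zip(coeffs, averages))
    ss_c = c ** 2 / (sum(ci ** 2 for ci in coeffs) / n)     # 1 degree of freedom
    f_ratios[name] = ss_c / ms_error
    verdict = "reject H0" if f_ratios[name] > F_CRIT_05_1_16 else "do not reject H0"
    print(f"{name}: SS = {ss_c:10.2f}, F0 = {f_ratios[name]:7.2f}, {verdict}")
```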