ANOVA - Pennsylvania State University
ANOVA
An Old Research Question
The impact of TV on high-school grade
Watch or not watch
Two groups
The impact of TV hours on high-school grade
Exactly how much TV watching would make a
difference?
Multiple groups
Not watch, watch a little, watch regularly
Then we could have
something like this
What Should We Do?
Should t-Test Be Used?
Multiple comparison
Increasing the chance of
Type I error
[Figure: plot with vertical axis from 0 to 1 and horizontal axis from 0 to 50]
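A minimal sketch of why repeated t-tests inflate the chance of a Type I error. It assumes every comparison is tested at the same level α and that the comparisons are independent, which is only an approximation for pairwise t-tests that share groups. Under that assumption the chance of at least one Type I error in c comparisons is 1 − (1 − α)^c. The function name is illustrative.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one Type I error across `comparisons` tests.

    Assumes the tests are independent, which is only an approximation
    for pairwise t-tests that share groups.
    """
    return 1.0 - (1.0 - alpha) ** comparisons


for c in (1, 3, 15, 50):
    print(f"{c:>2} comparisons: {familywise_error_rate(0.05, c):.3f}")
```

The rate climbs quickly as comparisons are added, which is the motivation for replacing many t-tests with a single overall test.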
Multiple Comparison Is
Common
In particular in factorial design
Single factor
Multiple levels: previous example
Multiple factors
Impact of TV watching and library visit
Terminology
Factor
The independent variable that designates the
groups being compared
TV watching and library visit
Levels
Individual conditions or values that make up
a factor
Factorial design
A study that combines two or more factors
The research study uses two factors
One factor uses two levels of therapy technique (I
versus II)
The second factor uses three levels of time
(before, after, and 6 months after).
Figure 12.2
Two-Factor Research Design
Also notice that the therapy factor uses two
separate groups (independent measures)
and the time factor uses the same group for
all three levels (repeated measures).
We have 15 comparisons!
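The count of 15 can be checked by listing the cells of the design. This is a small sketch: the two therapy levels and three time levels come from the example, while the variable names are illustrative.

```python
from itertools import combinations

therapies = ["I", "II"]
times = ["before", "after", "6 months after"]

# Each cell of the design is one (therapy, time) condition.
cells = [(therapy, time) for therapy in therapies for time in times]

# Every unordered pair of cells is one possible mean comparison.
pairs = list(combinations(cells, 2))
print(len(cells), "cells,", len(pairs), "pairwise comparisons")
```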
How to deal with this
problem?
Analysis of Variance
Analysis of variance
Also called ANOVA
Used to evaluate mean differences between
two or more treatments (advantage over t-test)
Uses sample data as basis for drawing
general conclusions about populations
Analysis of Variance
Null hypothesis: the level or value on the
factor does not affect the dependent variable
In the population, this is equivalent to saying that
the means of the groups do not differ from each
other
Alternative hypothesis: there is at least one
mean difference among the populations. Two possibilities:
All means are different from every other mean
Some means are not different from each other,
but at least one mean differs from the others
$H_0: \mu_1 = \mu_2 = \mu_3$
ANOVA: Statistics
F test
F-ratio: based on variance instead of sample mean
difference
Numerator: Variance caused by differences among sample
means
Denominator: Variance expected if there is no treatment
effect
$$t = \frac{\text{obtained difference between sample means}}{\text{difference expected by chance}}$$
$$F = \frac{\text{variance (difference) between sample means}}{\text{variance (difference) expected with no treatment effect (error by chance)}}$$
Logic of ANOVA
A study with three
treatments
Sources of Variability
Between Treatments
Systematic differences caused
by treatments
Random, unsystematic differences
Individual differences
Experimental (measurement) error
What if the null
hypothesis is true?
F-Ratio
The ratio of the variance between treatments
to the variance within treatments
(treatment effects + chance) / (chance)
If there is no treatment effect, F should be close to 1
Otherwise, F should be noticeably larger than 1.
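A rough simulation of the case where the null hypothesis is true. All treatments are drawn from one normal population, so any difference between sample means is due to chance alone. The population mean and standard deviation (50 and 10), the group sizes, and the function names are assumptions chosen only for illustration.

```python
import random


def f_ratio(groups):
    """Independent-measures F: MS between treatments / MS within treatments."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate_null(n_sims=2000, k=3, n=10, seed=0):
    """Draw every treatment from the same population (no treatment effect)."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(50, 10) for _ in range(n)] for _ in range(k)])
        for _ in range(n_sims)
    ]


null_f_values = simulate_null()
average_f = sum(null_f_values) / len(null_f_values)
print("average F with no treatment effect:", round(average_f, 2))
```

Individual F-ratios vary from sample to sample, but their average stays near 1 when only chance separates the treatments.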
Experimental Design
Simple experiments
Single factor
Between-subjects design
Within-subjects design
Factorial experiments
More factors
2 x 2
These designs all involve multiple treatments,
so ANOVA would be needed.
Numerator of F-ratio
Denominator of F-ratio
Logic of Repeated-Measures
ANOVA
Comparing variance
Between-treatments vs. within-treatments
Removing the difference between subjects
$$F = \frac{\text{variance between treatments (without individual differences)}}{\text{variance expected by chance (without individual differences)}}$$
$$F = \frac{\text{variance between treatments}}{\text{variance expected by chance}}$$
ANOVA
Notation and Formulas
k: the number of treatments
n: the number of scores in each treatment
N: the total number of scores in the study
$\Sigma X$ or T: the sum of the scores for each
treatment
G: the sum of all the scores in the study
$G = \Sigma(\Sigma X) = \Sigma T$
$\Sigma X^2$, SS, $s^2$, df
Figure 12.4 ANOVA
Calculation Structure and
Sequence
Figure 12.5 Partitioning SS
for Independent-measures
ANOVA
ANOVA equations
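$$SS_{\text{total}} = \Sigma X^2 - \frac{G^2}{N}$$
$$SS_{\text{within treatments}} = \Sigma SS_{\text{inside each treatment}}$$
$$SS_{\text{between treatments}} = \Sigma \frac{T^2}{n} - \frac{G^2}{N}$$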
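The three SS equations can be sketched on a small data set. The scores below are hypothetical (k = 3 treatments, n = 5 scores each) and are not taken from the slides. The helper name `ss` is illustrative.

```python
# Hypothetical scores for k = 3 treatments with n = 5 scores each.
treatments = {
    "A": [0, 1, 3, 1, 0],
    "B": [4, 3, 6, 3, 4],
    "C": [1, 2, 2, 0, 0],
}

scores = [x for group in treatments.values() for x in group]
N = len(scores)  # total number of scores
G = sum(scores)  # grand total of all scores


def ss(values):
    """SS = sum of X^2 minus (sum of X)^2 / n for one set of scores."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


ss_total = ss(scores)
ss_within = sum(ss(group) for group in treatments.values())
ss_between = sum(sum(g) ** 2 / len(g) for g in treatments.values()) - G ** 2 / N

print(ss_total, ss_within, ss_between)
```

For these scores SS total equals SS within plus SS between, which is the partition of variability that the analysis relies on.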
Degrees of Freedom Analysis
Total degrees of freedom
$df_{total} = N - 1$
Within-treatments degrees of freedom
$df_{within} = N - k$
Between-treatments degrees of freedom
$df_{between} = k - 1$
Figure 12.6 Partitioning
Degrees of Freedom
Mean Squares and F-ratio
$$MS_{within} = s^2_{within} = \frac{SS_{within}}{df_{within}}$$
$$MS_{between} = s^2_{between} = \frac{SS_{between}}{df_{between}}$$
$$F = \frac{s^2_{between}}{s^2_{within}} = \frac{MS_{between}}{MS_{within}}$$
ANOVA Summary Table
Source               SS   df   MS   F
Between Treatments   40    2   20   10
Within Treatments    20   10    2
Total                60   12
• Concise method for presenting ANOVA results
• Helps organize and direct the analysis process
• Convenient for checking computations
• “Standard” statistical analysis program output
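As a sketch of how the summary table supports checking computations, the df, MS, and F columns can be rebuilt from the two SS values. The SS values come from the table. k = 3 and N = 13 are inferred from its df column (2 and 12), and the function name is illustrative.

```python
def anova_summary(ss_between, ss_within, k, n_total):
    """Fill in the df, MS, and F columns of a one-way ANOVA summary table."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ss_total": ss_between + ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# SS values from the table; k = 3 and N = 13 are inferred from df = 2 and 12.
table = anova_summary(ss_between=40, ss_within=20, k=3, n_total=13)
print(table)
```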
Distribution of F-ratios
If the null hypothesis is true, the value of F
will be around 1.00
Because F-ratios are computed from two
variances, they are always positive numbers
Table of F values is organized by two df
df numerator (between) shown in table columns
df denominator (within) shown in table rows
Figure 12.7
Distribution of F-ratios
ANOVA Test
Uses the same four steps that have been
used in earlier hypothesis tests.
Computation of the test statistic F is done
in stages
Compute $SS_{total}$, $SS_{between}$, $SS_{within}$
Compute $df_{total}$, $df_{between}$, $df_{within}$
Compute $MS_{between}$ and $MS_{within}$
Compute F
Measuring Effect size for
ANOVA
Compute percentage of variance accounted
for by the treatment conditions
In published reports of ANOVA, effect size is
usually called $\eta^2$ (“eta squared”)
Same concept as $r^2$ (proportion of variance explained)
$$\eta^2 = \frac{SS_{\text{between treatments}}}{SS_{\text{total}}}$$
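A short sketch of the effect-size calculation. The first function is the definition of η². The second recovers the same quantity from a reported F and its degrees of freedom, using the fact that SS between / SS within = F × df between / df within. Both function names are illustrative. The sample call uses the reporting example F(3, 20) = 6.45.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Proportion of total variability accounted for by the treatments."""
    return ss_between / ss_total


def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Same quantity recovered from a reported F and its df.

    Uses SS_between / SS_within = F * df_between / df_within.
    """
    return (f_value * df_between) / (f_value * df_between + df_within)


print(round(eta_squared(40, 60), 3))             # SS values from the summary table
print(round(eta_squared_from_f(6.45, 3, 20), 3))  # reported F(3, 20) = 6.45
```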
In the Literature
Treatment means and standard deviations
are presented in text, table or graph
Results of ANOVA are summarized, including
F and df
p-value
$\eta^2$
• E.g., F(3, 20) = 6.45, p < .01, $\eta^2$ = 0.492
Example
For each
experiment
N = 14
Experiment A
Source SS df MS F
Between Treatments
Within Treatments
Total
Experiment B
Source SS df MS F
Between Treatments
Within Treatments
Total
post hoc Tests
ANOVA compares all individual mean
differences simultaneously, in one test
A significant F-ratio indicates that at least one
difference in means is statistically significant
Does not indicate which means differ significantly
from each other!
post hoc tests are follow up tests done to
determine exactly which mean differences
are significant, and which are not
Tukey’s Honestly Significant
Difference
A single value that determines the minimum
difference between treatment means that is
necessary to claim statistical significance: a
difference large enough that $p < \alpha_{experimentwise}$
Honestly Significant Difference (HSD)
$$HSD = q\sqrt{\frac{MS_{within}}{n}}$$
A vs. B: $|M_A - M_B|$ = 2.44 > HSD, significant
B vs. C: $|M_B - M_C|$ = 1.56 < HSD, not significant
A vs. C: $|M_A - M_C|$ = 4.00 > HSD, significant
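A small sketch of the HSD comparisons. The treatment means (3, 5.44, 7) and HSD = 2.36 are the values reported in this document. The `tukey_hsd` helper implements the formula for equal sample sizes, and its q argument has to be looked up in a studentized range table. The function and variable names are illustrative.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(q: float, ms_within: float, n: int) -> float:
    """HSD = q * sqrt(MS_within / n), for equal sample sizes n."""
    return q * sqrt(ms_within / n)


# Treatment means and HSD as reported in this document.
means = {"A": 3.00, "B": 5.44, "C": 7.00}
hsd = 2.36

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    results[(a, b)] = diff > hsd
    verdict = "significant" if diff > hsd else "not significant"
    print(f"{a} vs. {b}: |difference| = {diff:.2f} -> {verdict}")
```

Any pair whose absolute mean difference exceeds the single HSD value is declared significant, so only one threshold is needed for all comparisons.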
The Scheffé Test
The Scheffé test is one of the safest of all
possible post hoc tests
Uses an F-ratio to evaluate significance of the
difference between two treatment conditions
$$F_{A \text{ versus } B} = \frac{MS_{between}}{MS_{within}}$$
$MS_{between}$ (between A & B) is calculated with the SS of the two groups
A & B F(2,24) = 3.36
B & C F(2,24) = 1.36
A & C F(2,24) = 9.00
With df = 2, 24 and α = .05,
the critical value for F is 3.40
Only the difference between A and C is significant.
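A sketch of the Scheffé calculation for equal sample sizes. SS between is computed from the two treatment totals only, but it is divided by the df between of the whole study (k − 1 = 2), which is why each F-ratio is reported with df = 2, 24. The inputs n = 9, MS within = 4.00, and the totals T = 27, 49, 63 are assumptions: they are not stated in the slides, but they are consistent with the reported means, df, and F-ratios. The function name is illustrative.

```python
def scheffe_f(t_a, t_b, n, k, ms_within):
    """Scheffé F-ratio for one pair of treatments with equal n.

    SS_between uses the two treatment totals only, but is divided by
    the df_between of the whole study (k - 1).
    """
    pair_total = t_a + t_b
    ss_between = t_a ** 2 / n + t_b ** 2 / n - pair_total ** 2 / (2 * n)
    ms_between = ss_between / (k - 1)
    return ms_between / ms_within


# Assumed inputs: n = 9, MS_within = 4.00, treatment totals T = n * M.
totals = {"A": 27, "B": 49, "C": 63}
f_critical = 3.40  # from the text: df = 2, 24 and alpha = .05

for a, b in (("A", "B"), ("B", "C"), ("A", "C")):
    f_value = scheffe_f(totals[a], totals[b], n=9, k=3, ms_within=4.00)
    print(f"{a} & {b}: F(2, 24) = {f_value:.2f}, significant: {f_value > f_critical}")
```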
Relationship between ANOVA
and t tests
For two independent samples, either t or F
can be used
Both always result in the same decision
$F = t^2$
For any value of α, $(t_{critical})^2 = F_{critical}$
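The relationship between t and F can be checked numerically. The two samples below are hypothetical, and the helper names are illustrative. The t statistic uses the pooled variance, which is the same quantity as MS within when k = 2.

```python
from math import sqrt

# Hypothetical scores for two independent samples.
sample_1 = [3, 5, 4, 6, 2]
sample_2 = [7, 9, 6, 8, 10]


def mean(values):
    return sum(values) / len(values)


def ss(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


n1, n2 = len(sample_1), len(sample_2)
m1, m2 = mean(sample_1), mean(sample_2)

# Independent-measures t with pooled variance.
pooled_variance = (ss(sample_1) + ss(sample_2)) / (n1 + n2 - 2)
t = (m1 - m2) / sqrt(pooled_variance / n1 + pooled_variance / n2)

# One-way ANOVA on the same two samples (k = 2, so df_between = 1).
grand_mean = mean(sample_1 + sample_2)
ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
ms_within = pooled_variance  # SS_within / (N - k) equals the pooled variance
f = (ss_between / 1) / ms_within

print(round(t ** 2, 4), round(f, 4))
```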
Figure 12.10
Distribution of t and F statistics
Independent Measures ANOVA
Assumptions
The observations within each sample must
be independent
The populations from which the samples are
selected must be normal
The populations from which the samples are
selected must have equal variances
(homogeneity of variance)
Violating the assumption of homogeneity of
variance risks invalid test results
To Report ANOVA Result
The subjects averaged $M_A$ = 3, $M_B$ = 5.44, and $M_C$ = 7 in the three treatments, respectively. ANOVA indicated a significant difference, F(2, 24) = 9.15, p < .05, $\eta^2$ = ….
Post hoc analysis (Tukey’s HSD) indicated significant differences between Treatments A and B, as well as between Treatments A and C (HSD = 2.36).
or
Post hoc analysis (Scheffé) indicated a significant difference between Treatments A and C only, $F_{A \text{ vs. } C}$(2, 24) = 9.00, p < .05.
Homework
12.22