Chapter 17
Comparing Multiple Population Means: One-factor ANOVA
What if we have more than 2 conditions/groups?
Interest - the effects of 3 drugs on depression - Prozac, Zoloft, and Elavil
Select 24 people with depression, randomly assign (blindly) to one of four conditions: 1) Prozac, 2) Zoloft, 3) Elavil, and 4) Placebo
After 1 month of drug therapy, we measure depression
Research Design and Data
Prozac   Zoloft   Elavil   Placebo
  10       14       19       21
   8       12       15       27
  15       18       14       20
  12       16       16       23
   9       13       18       15
   6       17       20       22
Multiple t-tests?
Differences between drugs?
Prozac vs. Zoloft     Prozac vs. Elavil
Prozac vs. Placebo    Zoloft vs. Elavil
Zoloft vs. Placebo    Elavil vs. Placebo
6 separate t-tests
Probability Theory (Revisited)
The probability of making a correct decision when the null is true is 1 - α (generally .95)
If each test is independent, the probability of making the correct decision across all 6 tests is the product of those probabilities, or
(.95)(.95)(.95)(.95)(.95)(.95) = .735
Type 1 error & multiple t-tests
Thus, the probability of at least one type 1 error is not α, but $1 - (1 - \alpha)^C$, where C is the number of comparisons
Or, in the present case, 1 - .735 = .265
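The arithmetic above can be checked with a short Python sketch. The function name `familywise_error_rate` is only an illustrative label, and the calculation assumes, as the slide does, that the comparisons are independent.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Probability of at least one type 1 error across independent tests."""
    return 1 - (1 - alpha) ** comparisons

k = 4                              # Prozac, Zoloft, Elavil, Placebo
comparisons = k * (k - 1) // 2     # number of pairwise t-tests
rate = familywise_error_rate(0.05, comparisons)

print(comparisons)                 # 6
print(round(rate, 3))              # 0.265
```

With only 4 groups the chance of at least one false rejection is already about five times the nominal α.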
t statistic as a ratio
$$t = \frac{\text{obtained difference}}{\text{difference expected by chance (“error”)}}$$
Denominator: Easy – Pool Variance
Numerator: Hmmm…
Differences in the t test
M1 – M2 or MD
Can we subtract multiple means from one another?
M1 – M2 – M3 – M4 = ????
M4 – M1 – M2 – M3 = ????
Is there another statistic that tells us how much things differ from one another?
What statistic describes how scores differ from one another?
Variance
How does a set of means differ from one another? Answer – variance between means/groups
t statistic as a ratio
$$t = \frac{\text{obtained difference}}{\text{difference expected by chance (“error”)}}$$
$$t = \frac{\text{variance between means/groups}}{\text{pooled variance}}$$
F statistic
$$F = \frac{\text{between-groups variance estimate}}{\text{within-groups variance estimate}}$$
$$F = \frac{\text{Mean-square Treatment (MST or MSB)}}{\text{Mean-square Error (MSE or MSW)}} = \frac{s^2_B}{s^2_W}$$
ANOVA
Analysis of Variance, or ANOVA, allows us to compare multiple group means without compromising α
And, even though an ANOVA uses variances and the F statistic, it helps test hypotheses about means
F statistic
Between-groups variance (MST or MSB) is based on the variability between the groups
Within-groups variance (MSE or MSW) is a measure of the variability within the groups
– if there is no difference between these 2 measures of variability (because there are no differences between groups), F will be close to 1
– if there is greater variability between groups (because of differences between groups), F will be greater than 1
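Between-groups variance (MST, MSB, or $s^2_B$), k groups
$$\text{MSB} = \frac{\text{SSB}}{df_B} = \frac{\sum n_i (M_i - M_G)^2}{k - 1}$$
where $M_i$ is the mean of the ith group, $n_i$ is the number of scores in that group, and $M_G$ is the grand mean (the mean of all scores)
Within-groups variance (MSE, MSW, or $s^2_W$), k groups
$$\text{MSW} = \frac{\text{SSW}}{df_W} = \frac{SS(x_1) + SS(x_2) + \cdots + SS(x_k)}{N - k}$$
SST (Sums of Squares Total)
The sums of squares total can be used either as a check, or to calculate SSW
$$\text{SST} = \sum x^2 - \frac{(\sum x)^2}{N} = \text{SSB} + \text{SSW}$$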
An ANOVA Table
The results of an ANOVA are often presented in a table:
Source     SS    df    MS    F
Between
Within
Total
For example:
Source     SS    df    MS      F
Between    180    2    90.0    36.00
Within      30   12     2.5
Total      210   14
Procedure for Completing an ANOVA
1. Arrange Data by Group
2. Compute for each group (k groups): $\sum x$, $\sum x^2$, $M$, $SS(x)$, $n$
3. Compute the grand mean ($M_G$) by adding all the scores and dividing by N: $M_G = \sum x / N$
4. Compute $\text{SSB} = \sum n_i (M_i - M_G)^2$
5. Compute $\text{SSW} = SS(x_1) + SS(x_2) + \cdots + SS(x_k)$
6. Compute $\text{SST} = \sum x^2 - (\sum x)^2 / N$
7. Compute df: $df_B = k - 1$, $df_W = N - k$, $df_T = N - 1$
8. Fill in ANOVA table
9. Compute MS (SS/df)
10. Compute F = MSB/MSW
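As a rough illustration, the ten steps can be sketched in plain Python. The function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the slides; the scores are the depression data from the Research Design and Data slide.

```python
def one_way_anova(groups):
    """Follow the ten-step procedure for a list of groups of raw scores."""
    k = len(groups)                                  # number of groups
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)                        # N
    means = [sum(g) / len(g) for g in groups]        # step 2: group means
    grand_mean = sum(all_scores) / n_total           # step 3

    # Step 4: between-groups sum of squares
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Step 5: within-groups sum of squares (each group's SS, added up)
    ss_w = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    # Step 6: total sum of squares, used here as a check
    ss_t = sum(x ** 2 for x in all_scores) - sum(all_scores) ** 2 / n_total

    df_b, df_w = k - 1, n_total - k                  # step 7
    ms_b, ms_w = ss_b / df_b, ss_w / df_w            # step 9
    return {"SSB": ss_b, "SSW": ss_w, "SST": ss_t,
            "dfB": df_b, "dfW": df_w,
            "MSB": ms_b, "MSW": ms_w, "F": ms_b / ms_w}  # step 10

drug_scores = [
    [10, 8, 15, 12, 9, 6],      # Prozac
    [14, 12, 18, 16, 13, 17],   # Zoloft
    [19, 15, 14, 16, 18, 20],   # Elavil
    [21, 27, 20, 23, 15, 22],   # Placebo
]

table = one_way_anova(drug_scores)
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```

Computing SST separately from SSB and SSW mirrors the check in the procedure: the two routes should agree up to rounding.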
1. ANOVA Calculations
Prozac   Zoloft   Elavil   Placebo
  10       14       19       21
   8       12       15       27
  15       18       14       20
  12       16       16       23
   9       13       18       15
   6       17       20       22
2. ANOVA Calculations
Prozac (Group 1): 10, 8, 15, 12, 9, 6
$\sum X_1 = 60$, $\sum X_1^2 = 650$, $M_1 = 10$, $SS(X_1) = 50$, $n_1 = 6$
Zoloft (Group 2): 14, 12, 18, 16, 13, 17
$\sum X_2 = 90$, $\sum X_2^2 = 1378$, $M_2 = 15$, $SS(X_2) = 28$, $n_2 = 6$
Elavil (Group 3): 19, 15, 14, 16, 18, 20
$\sum X_3 = 102$, $\sum X_3^2 = 1762$, $M_3 = 17$, $SS(X_3) = 28$, $n_3 = 6$
Placebo (Group 4): 21, 27, 20, 23, 15, 22
$\sum X_4 = 128$, $\sum X_4^2 = 2808$, $M_4 = 21.33$, $SS(X_4) = 77.33$, $n_4 = 6$
3. ANOVA Calculations
$M_G = \sum x / N = (\sum X_1 + \sum X_2 + \sum X_3 + \sum X_4) / (n_1 + n_2 + n_3 + n_4)$
= (60 + 90 + 102 + 128)/(6 + 6 + 6 + 6) = 380/24 = 15.83
4. ANOVA Calculations
$\text{SSB} = \sum n_i (M_i - M_G)^2$
$= 6(10 - 15.83)^2 + 6(15 - 15.83)^2 + 6(17 - 15.83)^2 + 6(21.33 - 15.83)^2$
= 6(34.03) + 6(.69) + 6(1.36) + 6(30.25)
= 204.18 + 4.14 + 8.16 + 181.5 ≈ 398.00
5. ANOVA Calculations
$\text{SSW} = SS(X_1) + SS(X_2) + \cdots + SS(X_k)$ = 50 + 28 + 28 + 77.33 = 183.33
6. ANOVA Calculations
$\text{SST} = \sum X^2 - (\sum X)^2 / N$
$= (650 + 1378 + 1762 + 2808) - (60 + 90 + 102 + 128)^2 / 24$
= 6598 - 144400/24
= 6598 - 6016.67 = 581.33
Check
SST = SSB + SSW
581.33 = 398 + 183.33
581.33 = 581.33
7. ANOVA Calculations
dfB = k -1 = 4 -1 = 3
dfW = N - k = 24 - 4 = 20
dfT = N - 1 = 23
8. ANOVA Calculations
Fill in SS and df, then compute MS = SS/df and F = MSB/MSW:
MSB = 398.00/3 = 132.67
MSW = 183.33/20 = 9.17
F = 132.67/9.17 = 14.47
Source     SS       df    MS       F
Between    398.00    3    132.67   14.47
Within     183.33   20      9.17
Total      581.33   23
Hypothesis test of Anti-depressants
1. State and Check Assumptions
– About the population
  Normally distributed? - don’t know
  Homogeneity of variance – we’ll check
– About the sample
  Independent random sample? – yes
  Independent samples
– About the data
  Interval level
Hypothesis test of Anti-depressants
2. Hypotheses
HO : μProzac = μZoloft = μElavil = μPlacebo
HA : the null is wrong
That’s an Odd HA
You might think that the alternative hypothesis should look like this:
HA : μProzac ≠ μZoloft ≠ μElavil ≠ μPlacebo
Accepting this alternative indicates that all of the means are unequal, which is not what ANOVA determines
What does ANOVA determine?
That at least one of the means is different from at least one other mean
Since that is a difficult statement to write, we say “the null is wrong”
Hypothesis test of Anti-depressants
3. Choose test statistic
– 4 groups
– independent samples
One-factor ANOVA
Hypothesis test of Anti-depressants
4. Set Significance Level
α = .05
Critical Value
Non-directional hypothesis with dfB = k – 1 and dfW = N – k
dfB = 4 – 1 = 3 and dfW = 24 – 4 = 20
From Table D, Fcrit(3, 20) = 3.10, so we reject HO if F ≥ 3.10
Hypothesis test of Anti-depressants
5. Compute Statistic
Source     SS       df    MS       F
Between    398.00    3    132.67   14.47
Within     183.33   20      9.17
Total      581.33   23
Hypothesis test of Anti-depressants
6. Draw Conclusions
– because our F (14.47) falls within the rejection region, we reject the HO, and
– conclude that at least one condition (drug or placebo) differs from at least one other in mean depression score
Violations of Assumptions
As with t-tests, ANOVA is fairly robust to violations of normality and homogeneity of variance, but if there are severe violations of these assumptions, use a Kruskal-Wallis H test (a non-parametric alternative)
Procedure for completing a Kruskal-Wallis H test
1. Arrange data in columns, 1 group per column, skipping columns between groups
2. Rank all the scores, assigning the lowest rank (1) to the lowest score (put ranks in the column next to the raw scores)
3. Sum the ranks in each column ($\sum T_j$)
4. Square the sum of the ranks of each column, $(\sum T_j)^2$
5. Compute SSB
6. Compute H
7. Compute df = k - 1
8. H is distributed as a $\chi^2$
– Look up critical value in $\chi^2$ (chi-square) table with appropriate df
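The formulas for the SSB and H steps did not survive in this transcript, so the sketch below substitutes the standard textbook form, which may differ in layout from the slide: SSB is the between-groups sum of squares computed on the ranks, and H = 12·SSB / (N(N + 1)). The function names are illustrative, tied scores share the average rank, and no tie correction is applied.

```python
def rank_scores(scores):
    """Rank from lowest (1) to highest; tied scores share the average rank."""
    return [sum(1 for y in scores if y < x)
            + (sum(1 for y in scores if y == x) + 1) / 2
            for x in scores]

def kruskal_wallis_h(groups):
    """Return (H, df) using H = 12 * SSB / (N * (N + 1)) on the ranks."""
    scores = [x for g in groups for x in g]
    ranks = rank_scores(scores)              # step 2: rank all scores together
    n_total = len(scores)
    mean_rank = (n_total + 1) / 2            # grand mean of the ranks

    ss_b = 0.0
    start = 0
    for g in groups:
        t_j = sum(ranks[start:start + len(g)])   # step 3: rank sum for group j
        ss_b += len(g) * (t_j / len(g) - mean_rank) ** 2
        start += len(g)

    h = 12 * ss_b / (n_total * (n_total + 1))
    return h, len(groups) - 1                # df = k - 1

drug_scores = [
    [10, 8, 15, 12, 9, 6],      # Prozac
    [14, 12, 18, 16, 13, 17],   # Zoloft
    [19, 15, 14, 16, 18, 20],   # Elavil
    [21, 27, 20, 23, 15, 22],   # Placebo
]
h, df = kruskal_wallis_h(drug_scores)
print(f"H = {h:.2f}, df = {df}")
```

The resulting H would then be compared with the chi-square critical value for df = k - 1, as in the last step of the procedure.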
Dependent Samples (more than 2 conditions)
Experiments are often conducted comparing more than 2 conditions
– ANOVA
– Kruskal-Wallis H
Samples are often related - “dependent samples” (within-subjects, repeated measures, etc.)
Dependent Samples ANOVA
SS(T) = SS(B) + SS(Bl) + SS(E)
Calculate SS(T), SS(B), and SS(Bl)
SS(E) = SS(T) - SS(B) - SS(Bl)
Why “Blocks”?
A dependent samples ANOVA is sometimes referred to as a “Randomized-Block” design
Each group of related measurements, either within-subject or with matching, is a “Block” of measurements
SS(Bl)
Sum of Squares Blocks - the sum of the squared deviations of each block mean from the grand mean
$SS(Bl) = \sum k (M_i - M_G)^2$, where $M_i$ here is the mean of the ith block and k is the number of conditions, or
$SS(Bl) = \sum Bl^2 / k - N M_G^2$, where
Bl = sum of the scores in a block
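The two forms of SS(Bl) are equivalent. A short derivation, assuming n blocks of k scores each (so N = nk), and writing $M_{Bl} = Bl/k$ for a block mean (a symbol introduced here only for clarity):

```latex
\begin{aligned}
SS(Bl) &= \sum k\,(M_{Bl} - M_G)^2 \\
       &= k \sum M_{Bl}^2 \;-\; 2k\,M_G \sum M_{Bl} \;+\; nk\,M_G^2 \\
       &= \frac{\sum Bl^2}{k} \;-\; 2N M_G^2 \;+\; N M_G^2
          \qquad \text{since } \textstyle\sum M_{Bl} = \frac{\sum Bl}{k} = \frac{N M_G}{k} \\
       &= \frac{\sum Bl^2}{k} \;-\; N M_G^2
\end{aligned}
```

The second form is usually easier by hand because it needs only the block totals and the grand mean.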
Procedure for Completing a dependent samples ANOVA
1. Arrange data where columns are conditions, rows are blocks (subjects or matched subjects)
2. Compute for each column (condition): $n$, $\sum X$, $\sum X^2$, $M$, $SS(X)$, $s^2$
3. Total the scores in the rows in a new column to the right (Block Totals)
4. Square the block totals in the next column
5. Compute the grand mean ($M_G$) by adding all the scores and dividing by N: $M_G = \sum X / N$
6. Compute $SS(B) = \sum n_i (M_i - M_G)^2$
7. Compute $SS(T) = \sum X^2 - N M_G^2$
8. Compute $SS(Bl) = \sum Bl^2 / k - N M_G^2$
9. Compute SS(E) = SS(T) - SS(B) - SS(Bl)
10. Compute df: $df_B = k - 1$, $df_{Bl} = n - 1$, $df_E = (N - k) - (n - 1)$, $df_T = N - 1$
11. Fill in ANOVA table
12. Compute MS (SS/df)
13. Compute F = MSB/MSE
Dependent Samples ANOVA table
Source     SS    df    MS    F
Between
Blocks
Error
Total
Example
A researcher is interested in the effects of three new sleep-aids: Sleep E-Z, Zonked, and NockOut
He selects 5 subjects, and they take each of the 3 new drugs in a random order
The number of hours slept per night on each of the new sleep-aids is recorded
Data
Subject   Sleep E-Z   Zonked   NockOut
1         6           5        8
2         5           6        7
3         6           6        9
4         7           7        6
5         4           5        8
Hypothesis Test – Sleep aids
1. State and Check Assumptions
– Population
  Normally distributed – not sure, assume for time being
  Homogeneity of variance – not sure, but we’ll check sample variances
– Sample
  Dependent samples
  Random assignment
– Data
  Interval/Ratio
Hypothesis Test – Sleep aids
2. State Null and Alternative Hypotheses
HO : μ1 = μ2 = μ3 (the population means are equal)
HA : HO is wrong (at least one of the means differs; we can’t say “μ1 ≠ μ2 ≠ μ3” because this means “all the means differ from one another”)
Hypothesis Test – Sleep aids
3. Choose Test Statistic
– Parameter of interest – means
– Number of groups – 3
– One factor (or IV being manipulated)
– Dependent samples
One-factor ANOVA for Dependent Samples (F)
Hypothesis Test – Sleep aids
4. Set Significance Level
α = .05
F = MSB/MSE, dfB = k – 1, dfE = (N – k) – (n – 1), where N = total number of observations, k = number of groups/conditions, n = number of subjects/blocks
dfB = 3 – 1 = 2, dfE = (15 – 3) – (5 – 1) = 8
Fcrit(2, 8) = 4.46
If our F ≥ 4.46, we Reject HO
Hypothesis Test – Sleep aids
5. Compute test Statistic
Computations
Sub      S E-Z   Z      NO     Bl    Bl²
1        6       5      8      19    361
2        5       6      7      18    324
3        6       6      9      21    441
4        7       7      6      20    400
5        4       5      8      17    289
n        5       5      5            ΣBl² = 1815
ΣX       28      29     38
ΣX²      162     171    294
M        5.6     5.8    7.6
SS(X)    5.2     2.8    5.2
s²       1.3     0.7    1.3
The sample variances are similar, so homogeneity of variance looks okay
Computations
$M_G = \sum X / N$ = (28 + 29 + 38)/15 = 6.333
Computations
$SS(B) = \sum n_i (M_i - M_G)^2$
$= 5(5.6 - 6.33)^2 + 5(5.8 - 6.33)^2 + 5(7.6 - 6.33)^2$
= 12.133
Computations
$SS(T) = \sum X^2 - (\sum X)^2 / N$
$= (162 + 171 + 294) - (28 + 29 + 38)^2 / 15$
= 627 - 601.67 = 25.33
Computations
$SS(Bl) = \sum Bl^2 / k - N M_G^2$
$= (361 + 324 + 441 + 400 + 289)/3 - 15(6.33)^2$
= 1815/3 - 601.67 = 3.33
Computations
SS(E) = SS(T) - SS(B) - SS(Bl) = 25.33 - 12.13 - 3.33 = 9.87
Computations
dfB = k - 1 = 3 - 1 = 2
dfBl = n - 1 = 5 - 1 = 4
dfE = (N - k) - (n - 1) = (15 - 3) - (5 - 1) = 12 - 4 = 8
dfT = N -1 = 15 -1 = 14
Computations
Fill in SS and df, then compute MS = SS/df and F = MSB/MSE:
Source     SS      df    MS     F
Between    12.13    2    6.07   4.93
Blocks      3.33    4     .83
Error       9.87    8    1.23
Total      25.33   14
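The sleep-aid computations can be reproduced with a short Python sketch that follows the dependent samples procedure. The function name `dependent_samples_anova` and the dictionary keys are illustrative; the data are the five subjects' hours of sleep from the Data slide.

```python
def dependent_samples_anova(rows):
    """rows: one list per block (subject), one score per condition."""
    n = len(rows)                  # number of blocks
    k = len(rows[0])               # number of conditions
    n_total = n * k                # N
    grand_mean = sum(sum(r) for r in rows) / n_total
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    ss_b = sum(n * (m - grand_mean) ** 2 for m in col_means)
    ss_t = sum(x ** 2 for r in rows for x in r) - n_total * grand_mean ** 2
    # Block totals squared, divided by k, minus N times the grand mean squared
    ss_bl = sum(sum(r) ** 2 for r in rows) / k - n_total * grand_mean ** 2
    ss_e = ss_t - ss_b - ss_bl

    df_b, df_bl = k - 1, n - 1
    df_e = (n_total - k) - (n - 1)
    ms_b, ms_e = ss_b / df_b, ss_e / df_e
    return {"SS(B)": ss_b, "SS(Bl)": ss_bl, "SS(E)": ss_e, "SS(T)": ss_t,
            "dfB": df_b, "dfBl": df_bl, "dfE": df_e,
            "MSB": ms_b, "MSE": ms_e, "F": ms_b / ms_e}

sleep_hours = [      # columns: Sleep E-Z, Zonked, NockOut
    [6, 5, 8],
    [5, 6, 7],
    [6, 6, 9],
    [7, 7, 6],
    [4, 5, 8],
]

result = dependent_samples_anova(sleep_hours)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Carried at full precision, F comes out near 4.92; the 4.93 in the table arises from dividing the already-rounded mean squares (6.07/1.23). Either value exceeds the critical value of 4.46.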
Hypothesis Test
6. Draw Conclusions
– Since our F > 4.46, we Reject HO, accept HA
– And conclude that at least one of the sleep-aids resulted in a different amount of sleep than at least one other
Dependent samples ANOVA
What if we violate one of the assumptions?
Friedman test
– means (or distribution) are of interest
– more than 2 groups/conditions
– dependent samples
– concerns about normality, homogeneity of variance, etc.
Friedman Fr
1. Arrange data in columns, 1 group/condition per column (conditions = columns = k)
2. Place correlated measures (matched, repeated, etc.) across conditions in the same rows (n rows)
3. Rank the scores in each row from 1 to k, assigning the lowest rank (1) to the lowest score (put ranks in the column next to the raw scores)
4. Sum the ranks of each column ($\sum T_k$)
5. Compute the mean of the Ts, $\bar{T}$
6. Compute S
7. Compute the Friedman test statistic Fr
8. Compute df = k - 1
9. Look up critical value in $\chi^2$ table or use Excel to find p
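The formulas for S and Fr are not preserved in this transcript. The sketch below assumes the standard form, S = Σ(Tj - T̄)² and Fr = 12·S / (n·k·(k + 1)), which may be laid out differently on the original slide. The function names are illustrative, tied scores within a row share the average rank, and no tie correction is applied. The data are the sleep-aid scores from the dependent samples example.

```python
def rank_scores(scores):
    """Rank from lowest (1) to highest; tied scores share the average rank."""
    return [sum(1 for y in scores if y < x)
            + (sum(1 for y in scores if y == x) + 1) / 2
            for x in scores]

def friedman_fr(rows):
    """rows: one list per block (subject), one score per condition."""
    n, k = len(rows), len(rows[0])
    ranked = [rank_scores(r) for r in rows]                   # step 3
    col_sums = [sum(r[j] for r in ranked) for j in range(k)]  # step 4
    t_bar = sum(col_sums) / k                                 # step 5
    s = sum((t - t_bar) ** 2 for t in col_sums)               # step 6 (assumed form)
    fr = 12 * s / (n * k * (k + 1))                           # step 7 (assumed form)
    return fr, k - 1, col_sums                                # df = k - 1

sleep_hours = [      # columns: Sleep E-Z, Zonked, NockOut
    [6, 5, 8],
    [5, 6, 7],
    [6, 6, 9],
    [7, 7, 6],
    [4, 5, 8],
]

fr, df, col_sums = friedman_fr(sleep_hours)
print(col_sums)          # [8.0, 9.0, 13.0]
print(round(fr, 2), df)  # 2.8 2
```

Fr is then compared with the chi-square critical value for df = k - 1, as in the final step.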