ANOVA: Two Way Repeated Measures 1
Hands-on Data Analysis with R University of Neuchatel, 10 May 2016
Bernadetta Tarigan, Dr. sc. ETHZ
Tarigan Statistical Consulting & Coaching
statistical-coaching.ch
Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL
Analysis of Variance (ANOVA)
Intro
1-way ANOVA & Repeated Measures
2-way ANOVA & Repeated Measures
Introduction
Variable type data
Variable of interest: guessed, nearGuessed and notGuessed
Goal: TO EXPLAIN
1. What is the distribution of the guessed variables in every approach & heuristic, and what is the best way to compare them?
2. For each heuristic, how to compare the relation between guessed package and guessed library?
3. For each heuristic, how to compare guessed, near guessed and not guessed to get some parameter of successfulness?
Do the numbers in the groups come from the same population?
Response: one quantitative variable
Explanatory: one categorical variable with 4 levels
# groups = 4
Do the numbers in the groups come from the same population?
Response: one quantitative variable
Explanatory: two categorical variables with 4 levels and 2 levels
# groups = 8
• Comparing groups = comparing populations = comparing distributions
• Distribution: parametric or nonparametric?
• Assumptions of ANOVA
  – Populations are from the Normal family
  – Populations have equal variance
  – Observations across the groups are iid (independent and identically distributed)
  – Different sizes of groups are allowed
• Normal distribution: symmetric around its mean
• Comparing distributions = comparing means of the groups
ANOVA: parametric approach
ANOVA: Comparing 𝒌 groups, with 𝒌 ≥ 𝟑
ANOVA
Response = always one quantitative variable
Explanatory = always categorical, one or more variables
One way
- 1 categorical variable
- with 𝑘 levels
- #groups = 𝑘
Two ways
- 2 categorical variables
- with 𝑘1 & 𝑘2 levels
- #groups = 𝑘1 ∙ 𝑘2
...
Between subjects: different subjects / participants in each group
Within subjects / repeated measures: same subjects / participants
ANOVA and its siblings
ANCOVA
Covariance -> explanatory variables are not only categorical but also quantitative
MANOVA
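Multi -> there is more than one response variable
MANCOVA
Multi + Covariance -> more than one response variable, with explanatory variables that are both categorical and quantitative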
ANOVA hypotheses
1-way = 1 categorical explanatory variable (levels = $k$)
Null hypothesis: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
– All population means are equal
– That is, no variation in means between groups
Alternative hypothesis: $H_1: \mu_i \neq \mu_j$ for at least one pair $(i, j)$
• At least one population mean is different from the others
• That is, there is variation between groups
• Does not mean that all population means are different (some pairs may be the same)
Illustration of the hypotheses
All population means are the same: the null hypothesis is true
(no variation between groups)
At least one mean is different: the null hypothesis is NOT true
(variation is present between groups)
How to test the hypothesis of equal means?
Analyze the data variance: variability of all subjects in all groups
• Large variation between groups, small variation within groups
• Large variation between groups, large variation within groups
Group means may look different, but large variation within groups makes the evidence weak!
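The two situations on this slide can be sketched numerically. The course works in R; the snippet below uses plain Python purely for illustration, and the data sets `tight` and `noisy` are made-up values (not from the course) chosen so that both have the same group means of 10, 12 and 14.

```python
from statistics import mean, variance

# Hypothetical data: both scenarios share the group means 10, 12 and 14.
tight = {"A": [9.9, 10.0, 10.1], "B": [11.9, 12.0, 12.1], "C": [13.9, 14.0, 14.1]}
noisy = {"A": [4.0, 10.0, 16.0], "B": [6.0, 12.0, 18.0], "C": [8.0, 14.0, 20.0]}


def summarize(groups):
    """Return (variance of the group means, average within-group variance)."""
    group_means = [mean(values) for values in groups.values()]
    between = variance(group_means)
    within = mean([variance(values) for values in groups.values()])
    return between, within


for name, groups in (("tight", tight), ("noisy", noisy)):
    between, within = summarize(groups)
    print(f"{name}: between-group variance = {between:.2f}, "
          f"within-group variance = {within:.2f}")
```

The spread of the group means is identical in both cases, but in `noisy` it is small compared with the spread inside each group, which is the situation the slide describes as weak evidence.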
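The logic of ANOVA: decompose variability
SST (total variability in data) = SSB (between-group variability, “signal”) + SSW (within-group variability, “noise”)
grand mean: $\bar{x} = \frac{1}{n} \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}$, where $n = \sum_{j=1}^{k} n_j$
group mean: $\bar{x}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{ij}$
$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$$
$$SSB = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2$$
$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$$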
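A minimal sketch of the decomposition, assuming three made-up groups of unequal size (the values are hypothetical, and Python is used here only for illustration; the course itself works in R). It computes the grand mean, the group means and the three sums of squares, then checks that SST = SSB + SSW.

```python
# Hypothetical scores for k = 3 groups of unequal size.
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0],
]

all_values = [x for group in groups for x in group]
n = len(all_values)                              # n = sum of the group sizes n_j
grand_mean = sum(all_values) / n
group_means = [sum(g) / len(g) for g in groups]

# Total variability: every observation against the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)
# Between-group variability ("signal"): group means against the grand mean.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group variability ("noise"): observations against their own group mean.
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print(f"SST = {sst}, SSB = {ssb}, SSW = {ssw}")
print("SST equals SSB + SSW:", abs(sst - (ssb + ssw)) < 1e-9)
```

For this data the grand mean is 5.5 and the group means are 5, 8.5 and 2, giving SSB = 73.5 and SSW = 9, which add up to SST = 82.5.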
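Test: the ratio of SSB to SSW
Under the assumptions and $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, with $\sigma^2$ the common population variance, we get that:
$$\frac{SSB}{\sigma^2} = \frac{1}{\sigma^2} \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2 \sim \chi^2_{k-1}$$
$$\frac{SSW}{\sigma^2} = \frac{1}{\sigma^2} \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \sim \chi^2_{n-k}$$
$$F = \frac{SSB/(k-1)}{SSW/(n-k)} \sim F_{(k-1;\,n-k)}$$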
ANOVA Table
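Source of Variation | SS (Sum of Squares) | df    | Mean SS (aka variance) | F ratio
--------------------|---------------------|-------|------------------------|------------------------------------
Between groups      | SSB                 | k − 1 | MSB = SSB / (k − 1)    | F = MSB / MSW (signal-to-noise ratio)
Within groups       | SSW                 | n − k | MSW = SSW / (n − k)    |
Total               | SST = SSB + SSW     | n − 1 |                        |

Interpretation:
• signal ≈ noise ⇒ 𝐹 ≈ 1
• signal ≫ noise ⇒ 𝐹 ≫ 1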
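The entries of the ANOVA table can be computed by hand in a few lines. The following is a sketch in plain Python rather than the R used in the course; the function name `one_way_anova` and the data are hypothetical, chosen only to make the arithmetic easy to follow. A p-value would additionally need the F distribution, which this sketch leaves out.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups."""
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total number of observations
    grand_mean = sum(x for g in groups for x in g) / n
    group_means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msb = ssb / (k - 1)                               # mean square between groups
    msw = ssw / (n - k)                               # mean square within groups
    return {
        "SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
        "df_between": k - 1, "df_within": n - k, "df_total": n - 1,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


# Hypothetical data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:>10}: {value:g}")
```

For this data SSB = 26 and SSW = 6, so F = (26/2) / (6/6) = 13 up to floating-point rounding: the signal is much larger than the noise.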
It is all simple in R…
Repeated Measures: reducing the noise
$$F = \frac{MSB}{MSW}$$
$$F = \frac{MSB}{MS_{error}}$$
MSB is the “signal”; the denominator ($MSW$ or $MS_{error}$) is the “noise”.
The noise is discounted, which means the F value increases, leading to an increase in the power of the test to detect significant differences between groups.
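The effect of discounting the noise can be sketched as follows. This is plain Python for illustration (the course uses R), with made-up data: four hypothetical subjects, each measured under three conditions, with large differences between subjects. The subject-to-subject variability is split off from the within-group sum of squares, and only the remainder is used as the error term.

```python
# Hypothetical data: rows = subjects, columns = conditions (same subjects throughout).
data = [
    [10.0, 12.0, 14.0],
    [20.0, 21.0, 25.0],
    [30.0, 33.0, 33.0],
    [40.0, 42.0, 44.0],
]
n_subjects = len(data)
k = len(data[0])                                   # number of conditions

grand_mean = sum(sum(row) for row in data) / (n_subjects * k)
condition_means = [sum(row[j] for row in data) / n_subjects for j in range(k)]
subject_means = [sum(row) / k for row in data]

ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
ss_conditions = n_subjects * sum((m - grand_mean) ** 2 for m in condition_means)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_within = ss_total - ss_conditions               # noise if subjects are ignored
ss_error = ss_within - ss_subjects                 # noise after removing subjects

msb = ss_conditions / (k - 1)
# Treating the columns as independent groups: df = n - k with n = n_subjects * k.
f_between = msb / (ss_within / (n_subjects * k - k))
# Repeated measures: df of the error term = (n_subjects - 1) * (k - 1).
f_repeated = msb / (ss_error / ((n_subjects - 1) * (k - 1)))

print(f"F ignoring subjects:  {f_between:.3f}")
print(f"F repeated measures: {f_repeated:.3f}")
```

Here the same signal (MSB = 16) gives F below 1 when subjects are ignored and F = 24 once the subject variability is removed. The gain is this large only because the hypothetical subjects differ so strongly from one another.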
Example
Model Validation
The same as what we have seen for the simple linear regression model:
1. Homoscedasticity of the residuals in each group
2. Normal distribution of the residuals in each group
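As a rough numeric companion to the first check, the sketch below computes the residuals (observation minus group mean) and compares their variance across groups. It is plain Python for illustration only, the data are hypothetical, and the max/min ratio is just a convenient summary, not a formal test. The normality check is usually done graphically (for example with a Q-Q plot of the residuals) and is not shown here.

```python
from statistics import mean, variance

# Hypothetical data: three groups of unequal size.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0, 10.0],
    "C": [1.0, 2.0, 3.0],
}

# Residual of each observation = observation minus the mean of its own group.
residuals = {
    name: [x - mean(values) for x in values]
    for name, values in groups.items()
}

# Sample variance of the residuals in each group.
spread = {name: variance(res) for name, res in residuals.items()}
ratio = max(spread.values()) / min(spread.values())

for name, value in spread.items():
    print(f"group {name}: residual variance = {value:.3f}")
print(f"largest / smallest variance = {ratio:.3f}")
```

A ratio close to 1 is what homoscedasticity would look like; how large a ratio is still acceptable is a judgment call that this sketch does not settle.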
2-way: 2 factors
Total variability = between-group variability of Factor 1 + between-group variability of Factor 2 (“signal”) + residuals of the subjects (“noise”)
Any interaction between the 2 factors?
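One way to look at this question numerically is to add an interaction term to the two-way decomposition. The sketch below assumes a balanced design (the same number of observations in every cell) and uses made-up data for two hypothetical factors with two levels each; it is plain Python for illustration, whereas the course works in R.

```python
from itertools import product
from statistics import mean

# Hypothetical balanced design: 2 levels of A x 2 levels of B, 2 observations per cell.
cells = {
    ("a1", "b1"): [1.0, 3.0],
    ("a1", "b2"): [3.0, 5.0],
    ("a2", "b1"): [5.0, 7.0],
    ("a2", "b2"): [13.0, 15.0],
}
levels_a = sorted({a for a, _ in cells})
levels_b = sorted({b for _, b in cells})
r = 2                                               # observations per cell

grand = mean([x for obs in cells.values() for x in obs])
cell_mean = {key: mean(obs) for key, obs in cells.items()}
mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}

ss_a = r * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
ss_b = r * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
# Interaction: what the cell means do beyond the two additive main effects.
ss_ab = r * sum(
    (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
    for a, b in product(levels_a, levels_b)
)
ss_resid = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)
ss_total = sum((x - grand) ** 2 for obs in cells.values() for x in obs)

df_ab = (len(levels_a) - 1) * (len(levels_b) - 1)
df_resid = r * len(cells) - len(cells)
f_interaction = (ss_ab / df_ab) / (ss_resid / df_resid)

print(f"SS_A = {ss_a}, SS_B = {ss_b}, SS_AB = {ss_ab}, SS_resid = {ss_resid}")
print(f"F for the interaction = {f_interaction}")
```

For this data the four parts are 98, 50, 18 and 8, which add up to the total of 174, and the interaction F ratio is (18/1) / (8/4) = 9. With unbalanced cells this simple additive split no longer holds.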
Repeated Measures: noise reduction