7. statistical power · 2018. 11. 21. · statistical power analysis for the behavioral sciences d...
![Page 1: 7. Statistical power · 2018. 11. 21. · Statistical power analysis for the behavioral sciences d = 0.01 d = 0.2 d = 0.5 d = 0.8 d = 1.2 d = 2. Effect size depends on the standard](https://reader035.vdocuments.us/reader035/viewer/2022071100/5fd97ea38c20543a421ee376/html5/thumbnails/1.jpg)
P-values and statistical tests
7. Statistical power
Hand-outs available at http://is.gd/statlec
Marek Gierliński
Division of Computational Biology
Statistical power: what is it about?
Two populations (alternative hypothesis)
Effect size
Sample size
Two samples
Statistical significance
How does our ability to call a change “significant” depend on the effect size and the sample size?
Effect size
Effect size describes the alternative hypothesis
Effect size for two sample means
Cohen's d:
$d = \frac{M_1 - M_2}{SD}$
Pooled standard deviation:
$SD = \sqrt{\frac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}$
t-statistic:
$t = \frac{M_1 - M_2}{SE}$
Relation between the two:
$d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$
Example in the figure: $d = 1.1$
$M$: mean, $SD$: standard deviation
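As a rough sketch of how these formulas fit together, the R snippet below computes Cohen's d with the pooled standard deviation and checks that it agrees with the value recovered from the t-statistic. The data are simulated, and the names `cohens_d`, `x` and `y` are illustrative assumptions rather than anything from the hand-out.

```r
# Illustrative sketch: Cohen's d for two samples (simulated data, hypothetical names)
cohens_d <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  # Pooled standard deviation of the two samples
  sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / sd_pooled
}

set.seed(1)
x <- rnorm(8, mean = 30, sd = 5)
y <- rnorm(12, mean = 20, sd = 5)

d <- cohens_d(x, y)

# Equal-variance t-statistic from the built-in test
t_stat <- unname(t.test(x, y, var.equal = TRUE)$statistic)

# Recover d from t using d = t * sqrt((n1 + n2) / (n1 * n2))
d_from_t <- t_stat * sqrt((length(x) + length(y)) / (length(x) * length(y)))

print(c(d = d, d_from_t = d_from_t))
```

The two printed values should coincide, because the standard error of the equal-variance t-test is the pooled SD multiplied by the same square-root factor.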
Effect size for two sample means
Cohen, J. (1988). Statistical power analysis for the behavioral sciences
[Figure: panels for d = 0.01, d = 0.2, d = 0.5, d = 0.8, d = 1.2 and d = 2]
Effect size depends on the standard deviation
Fold change = 2
Difference between means = 20 g
[Figure: three panels with d = 1.4, d = 5.4 and d = 26]
Effect size does not depend on the sample size
Effect size = 0.8
n = 5, t = 1.3, p = 0.2
n = 20, t = 2.5, p = 0.02
n = 25, t = 4.0, p = 0.0001
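For two groups of equal size n, the relation d = t·sqrt((n1 + n2)/(n1·n2)) reduces to t = d·sqrt(n/2), so a fixed effect size produces a larger t and a smaller p-value as n grows. A minimal sketch, assuming the observed d is exactly 0.8 and using group sizes chosen only for illustration:

```r
# Sketch: t and p for a fixed effect size at several group sizes
d <- 0.8
n <- c(5, 10, 20, 50, 100)          # illustrative group sizes (assumption)

t_stat <- d * sqrt(n / 2)           # expected t for two equal groups of size n
df <- 2 * n - 2                     # degrees of freedom of the two-sample t-test
p_val <- 2 * pt(-abs(t_stat), df)   # two-sided p-value

result <- data.frame(n = n, d = d, t = round(t_stat, 2), p = signif(p_val, 2))
print(result)
```

The effect size column stays constant while t and p change with n, which is the point of the slide.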
Comparing two samples

| Statistic | Formula | Description |
|---|---|---|
| Difference | $\Delta M = M_1 - M_2$ | Absolute difference between sample means |
| Ratio | $M_1 / M_2$ | Often used as logarithm |
| Cohen's d | $d = \frac{M_1 - M_2}{SD}$ | Effect size; takes spread in data into account |
| t-statistic | $t = \frac{M_1 - M_2}{SE}$ | Directly relates to statistical significance; takes spread of data and sample size into account |

$M$: mean, $SD$: standard deviation, $SE$: standard error
Effect size describes the alternative hypothesis
Effect size is not related to statistical significance
Effect size in ANOVA
For the purpose of this calculation we only consider groups of equal sizes, $n$
Test statistic:
$F = \frac{MS_B}{MS_W}$
H0: $MS_B = MS_W$
H1: $MS_B = MS_W + n\,MS_A$, where $MS_A$ is the added variance
Cohen's f:
$f^2 = \frac{MS_A}{MS_W}$
$f^2 = \frac{F - 1}{n}$
[Figure: example with $f = 1$]
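The second expression for Cohen's f follows from the first by substituting the alternative hypothesis into the test statistic. A short derivation, reading $MS_B$, $MS_W$ and $MS_A$ as the between-group, within-group and added variance terms (the subscripts are my reading of the slide, and the equality holds for the expected values under H1):

```latex
\begin{align}
F &= \frac{MS_B}{MS_W}
   = \frac{MS_W + n\,MS_A}{MS_W}
   = 1 + n\,\frac{MS_A}{MS_W}
   = 1 + n f^2 \\
f^2 &= \frac{F - 1}{n}
\end{align}
```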
Effect size in ANOVA
[Figure: two panels, each labelled $f = 1$]
Effect size

| Data | Statistical test | Effect size | Formula |
|---|---|---|---|
| Two sets, size $n_1$ and $n_2$ | t-test | Cohen's $d$ | $d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$ |
| $k$ groups of $n$ points each | ANOVA | Cohen's $f$ | $f = \sqrt{\frac{F - 1}{n}}$ |
| 2×2 contingency table | Fisher's exact | Odds ratio | $\frac{p_1 / q_1}{p_2 / q_2}$ |
| Paired data $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ | Significance of correlation | Pearson's $r$ | $r = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{x_i - M_x}{SD_x} \cdot \frac{y_i - M_y}{SD_y}$ |
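Two of the effect sizes in this table are easy to check numerically. The sketch below evaluates Pearson's r from the standardised-product formula and compares it with R's built-in `cor()`, then computes an odds ratio from two proportions. The paired data are simulated, the proportions are illustrative, and taking q = 1 - p is an assumption about the notation.

```r
# Pearson's r from the standardised-product formula (simulated paired data)
set.seed(2)
n <- 20
x <- rnorm(n)
y <- 0.6 * x + rnorm(n, sd = 0.8)

r_formula <- sum((x - mean(x)) / sd(x) * (y - mean(y)) / sd(y)) / (n - 1)
r_builtin <- cor(x, y)

# Odds ratio from two proportions, assuming q = 1 - p
p1 <- 0.85   # illustrative value
p2 <- 0.70   # illustrative value
odds_ratio <- (p1 / (1 - p1)) / (p2 / (1 - p2))

print(c(r_formula = r_formula, r_builtin = r_builtin, odds_ratio = odds_ratio))
```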
Statistical power
t-test
Statistical testing
Statistical model
Null hypothesis H0: no effect
All other assumptions
Significance level $\alpha = 0.05$
p-value: probability of observing an effect at least this large when H0 is true
$p < \alpha$: Reject H0 (at your own risk). Effect is real
$p \ge \alpha$: Accept H0 (!!!)
Statistical test against H0
Data
This table

| | H0 is true (no effect) | H0 is false (effect) |
|---|---|---|
| H0 rejected (positive) | type I error $\alpha$, false positive | correct decision, true positive |
| H0 accepted (negative) | correct decision, true negative | type II error $\beta$, false negative |
Gedankenexperiment
Draw 100,000 pairs of samples $(x, y)$ of size $n = 5$
Find $t = (M_1 - M_2)/SE$ for each pair
Build sampling distribution of $t$
H0: there is no effect. $x$ from $\mu_1 = 20$ g, $y$ from $\mu_2 = 20$ g
H1: there is an effect. $x$ from $\mu_1 = 20$ g, $y$ from $\mu_2 = 30$ g
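The thought experiment can be run directly. The following is a sketch under stated assumptions: the population standard deviation is taken as 5 g (it is not given on this slide), normal populations are assumed, and fewer pairs than 100,000 are drawn to keep the run short. The helper name `sim_t` is hypothetical.

```r
# Sketch of the Gedankenexperiment; sd = 5 g is an assumption
set.seed(42)
n_sim <- 20000   # fewer than the 100,000 pairs on the slide, for speed
n <- 5

# t-statistic for one simulated pair of samples of size n
sim_t <- function(mu1, mu2, n, sd = 5) {
  x <- rnorm(n, mu1, sd)
  y <- rnorm(n, mu2, sd)
  sd_pooled <- sqrt((var(x) + var(y)) / 2)   # pooled SD for equal group sizes
  se <- sd_pooled * sqrt(2 / n)              # standard error of the difference
  (mean(x) - mean(y)) / se
}

t_h0 <- replicate(n_sim, sim_t(20, 20, n))   # H0: no effect
t_h1 <- replicate(n_sim, sim_t(20, 30, n))   # H1: 10 g difference

t_crit <- qt(0.975, df = 2 * n - 2)          # two-sided critical value, alpha = 0.05
false_pos <- mean(abs(t_h0) > t_crit)        # fraction of false positives under H0
power_sim <- mean(abs(t_h1) > t_crit)        # fraction of correct rejections under H1

print(c(false_pos = false_pos, power = power_sim))
```

Histograms of `t_h0` and `t_h1` give the two sampling distributions; the false-positive fraction should sit near the significance level.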
One alternative hypothesis
Null hypothesis, $\alpha = 0.05$: acceptance region $1 - \alpha$ (TN), two rejection regions of $\alpha/2$ each (FP)
Alternative hypothesis, $\beta = 0.21$: area $\beta$ in the acceptance region (FN), area $1 - \beta$ in the rejection regions (TP)

| | H0 true | H0 false |
|---|---|---|
| reject | FP ($\alpha$) | TP |
| accept | TN | FN ($\beta$) |

Power of the test
$P = 1 - \beta$
Probability that we correctly reject H0
Statistical power
The probability of correctly rejecting the null hypothesis
The probability of detecting an effect which is really there
Multiple alternative hypotheses
$\mu_2 = 22$ g, $d = 0.4$, $P = 0.08$
$\mu_2 = 24$ g, $d = 0.8$, $P = 0.20$
$\mu_2 = 26$ g, $d = 1.2$, $P = 0.39$
$\mu_2 = 28$ g, $d = 1.6$, $P = 0.60$
$\mu_2 = 30$ g, $d = 2.0$, $P = 0.79$
$\beta$, $P = 1 - \beta$
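These powers can be approximated analytically with `power.t.test` from base R. A sketch, assuming a first population mean of 20 g, n = 5 per group, and a standard deviation of 5 g (inferred from d = 0.4 at a 2 g difference, not stated explicitly):

```r
# Sketch: power for each alternative hypothesis; sd = 5 g is an inferred assumption
mu1 <- 20
mu2 <- c(22, 24, 26, 28, 30)
sd_pop <- 5

d <- (mu2 - mu1) / sd_pop   # effect size for each alternative

# Power of a two-sided, two-sample t-test with n = 5 per group
power <- sapply(mu2 - mu1, function(delta) {
  power.t.test(n = 5, delta = delta, sd = sd_pop, sig.level = 0.05)$power
})

print(data.frame(mu2 = mu2, d = d, power = round(power, 2)))
```

The printed powers should come out close to the values listed for each alternative, up to rounding and the approximation `power.t.test` uses by default.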
Power curve
$\beta$, $P = 1 - \beta$
Power: probability of detecting an effect when there is an effect
n = 5
Power curves
[Figure: power $P$ against effect size $d = \frac{M_1 - M_2}{SD}$ for n = 2, 3, 5, 10 and 30, with levels $P = 0.6$, $P = 0.8$ and $P = 0.95$ marked]
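Curves of this kind can be drawn with base R alone. The sketch below tabulates power over a grid of effect sizes for the group sizes named in the figure and plots one curve per n. The grid of d values and the plotting choices are my own assumptions.

```r
# Sketch: power curves for a two-sample, two-sided t-test
n_values <- c(2, 3, 5, 10, 30)
d_grid <- seq(0.2, 3, by = 0.2)   # illustrative grid of effect sizes

# Rows: effect sizes, columns: group sizes
power_tab <- sapply(n_values, function(n) {
  sapply(d_grid, function(d) power.t.test(n = n, delta = d, sd = 1)$power)
})
dimnames(power_tab) <- list(d = d_grid, n = n_values)

matplot(d_grid, power_tab, type = "l", lty = 1,
        xlab = "Effect size d", ylab = "Power")
abline(h = c(0.8, 0.95), lty = 2)   # commonly requested power levels
legend("bottomright", legend = paste("n =", n_values),
       col = seq_along(n_values), lty = 1)
```

Reading across a horizontal line at a chosen power shows how quickly the detectable effect size shrinks as n increases.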
How to do it in R?
# Find sample size required to detect the effect size d = 1
> power.t.test(delta=1, sig.level=0.05, power=0.8, type="two.sample", alternative="two.sided")
Two-sample t test power calculation
n = 16.71477
delta = 1
sd = 1
sig.level = 0.05
power = 0.8
alternative = two.sided
> power.t.test(delta=1, sig.level=0.05, power=0.95, type="two.sample", alternative="two.sided")
Two-sample t test power calculation
n = 26.98922
delta = 1
sd = 1
sig.level = 0.05
power = 0.95
alternative = two.sided
Statistical power
ANOVA
One alternative hypothesis
Null hypothesis, $\alpha = 0.05$
Sampling distribution of F, $\mu_1 = \mu_2 = \mu_3 = \mu_4 = 20$ g
Acceptance region: $1 - \alpha$; rejection region: $\alpha$
Alternative hypothesis, $\beta = 0.20$
Sampling distribution of F, $\mu_1 = \mu_2 = 20$ g, $\mu_3 = \mu_4 = 25$ g
Areas under the curve: $\beta$ (acceptance region) and $1 - \beta$ (rejection region)
Power curves
[Figure: ANOVA power curves for n = 2, 3, 5, 10 and 30, with levels $P = 0.6$, $P = 0.8$ and $P = 0.95$ marked]
How to do it in R?
> library(pwr)
# Find sample size required to detect a "large" effect size f = 0.4
> pwr.anova.test(k=4, f=0.4, sig.level=0.05, power=0.8)
Balanced one-way analysis of variance power calculation
k = 4
n = 18.04262
f = 0.4
sig.level = 0.05
power = 0.8
NOTE: n is number in each group
Statistical power
Fisher's test
Power test for proportion
# Find sample size required to detect observed proportions
> power.prop.test(p1=0.85, p2=0.70, power=0.8)
Two-sample comparison of proportions power calculation
n = 120.4719
p1 = 0.85
p2 = 0.7
sig.level = 0.05
power = 0.8
alternative = two.sided
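Counts:

| | Dead | Alive | Total |
|---|---|---|---|
| Drug A | 68 | 12 | 80 |
| Drug B | 70 | 30 | 100 |
| Total | 138 | 42 | 180 |

Proportions in rows:

| | Dead | Alive | Total |
|---|---|---|---|
| Drug A | 0.85 | 0.15 | 1 |
| Drug B | 0.70 | 0.30 | 1 |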
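The row proportions and the sample-size calculation can be reproduced from the raw counts. A minimal sketch using only base R; the object names `counts`, `props` and `res` are illustrative.

```r
# Counts from the table: rows are drugs, columns are outcomes
counts <- matrix(c(68, 12,
                   70, 30),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Drug A", "Drug B"), c("Dead", "Alive")))

# Proportions within each row
props <- prop.table(counts, margin = 1)
p1 <- props["Drug A", "Dead"]
p2 <- props["Drug B", "Dead"]

# Sample size per group needed to detect this difference with power 0.8
res <- power.prop.test(p1 = p1, p2 = p2, power = 0.8)

print(props)
print(ceiling(res$n))   # round up to a whole number of subjects per group
```

Note that `power.prop.test` is based on a test of two proportions, so it serves here as an approximation for planning an experiment analysed with Fisher's test.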
Worked example
Tumour growth in mice
Pilot experiment
WT and 4 KO mice
Observe tumour growth
Measure volume after 10 days
Power analysis
How many replicates do we need to...
1) detect a 2-fold change between conditions? (power in t-test)
2) detect the observed effect in ANOVA? (power in ANOVA)
How many replicates to detect a 2-fold change between WT and a KO?
Estimate standard deviation
Standard error of SD
$SE_{SD} = \frac{SD}{\sqrt{2(n - 1)}}$
Optimistic case: $SD = 75$
Pessimistic case: $SD = 160$
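The standard error of a sample standard deviation, SD/sqrt(2(n - 1)), is simple to evaluate. In the sketch below the pilot SD of 100 is a hypothetical value chosen for illustration (the hand-out does not state the observed SD here), and n = 6 is the number of replicates per group mentioned later in the example.

```r
# Standard error of a sample standard deviation (normal-data approximation)
se_sd <- function(sd, n) sd / sqrt(2 * (n - 1))

sd_obs <- 100   # hypothetical pilot SD, not a value from the hand-out
n <- 6          # replicates per group in the pilot

se <- se_sd(sd_obs, n)
print(c(sd = sd_obs, se = se, lower = sd_obs - se, upper = sd_obs + se))
```

With so few replicates the uncertainty is roughly a third of the SD itself, which is why it makes sense to plan for both an optimistic and a pessimistic value.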
Two scenarios, SD1 = 75 and SD2 = 160
Cohen's d:
$d_1 = \frac{\Delta M}{SD_1} = \frac{200}{75} = 2.7$
$d_2 = \frac{\Delta M}{SD_2} = \frac{200}{160} = 1.25$
[Figure: power curves for n = 2, 3, 4, 5, 6 and 10, with the optimistic and pessimistic cases marked]
R power calculations
# Optimistic case, SD = 75
> power.t.test(delta=200, sd=75, power=0.8)
Two-sample t test power calculation
n = 3.484297
delta = 200
sd = 75
sig.level = 0.05
power = 0.8
alternative = two.sided
# Pessimistic case, SD = 160
> power.t.test(delta=200, sd=160, power=0.8)
Two-sample t test power calculation
n = 11.09423
delta = 200
sd = 160
sig.level = 0.05
power = 0.8
alternative = two.sided
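`power.t.test` returns a fractional n, whereas group sizes have to be whole numbers. One way to see what the rounding costs is to compute the achieved power at the neighbouring integers. A sketch for the pessimistic case:

```r
# Sketch: achieved power at whole-number group sizes (pessimistic case, SD = 160)
n_exact <- power.t.test(delta = 200, sd = 160, power = 0.8)$n

power_n11 <- power.t.test(n = 11, delta = 200, sd = 160)$power
power_n12 <- power.t.test(n = 12, delta = 200, sd = 160)$power

print(c(n_exact = n_exact, power_n11 = power_n11, power_n12 = power_n12))
```

Because the exact solution lies just above 11, n = 11 falls marginally short of the requested 0.8, while n = 12 exceeds it.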
How many replicates to detect the observed effect in ANOVA?
ANOVA power curves
[Figure: ANOVA power curves for n = 2, 3, 4, 5, 6 and 10]
From ANOVA on our data we have $F = 2.31$
Then, we find the observed effect size:
$f = \sqrt{\frac{F - 1}{n}} = 0.47$
How many replicates do we need?
> library(pwr)
> tumour <- read.table("http://is.gd/mouse_tumour", header=TRUE)
# Here n = 6 and k = 5
> tum.aov <- aov(Volume ~ Group, data=tumour) # perform ANOVA
> F <- summary(tum.aov)[[1]]$F[1] # Extract F value
> f <- sqrt((F - 1)/6) # Effect size: Cohen's f
# What is the power of this experiment?
> pwr.anova.test(k=5, n=6, f=f)
k = 5
n = 6
f = 0.4670469
sig.level = 0.05
power = 0.4293041
# How many replicates to get power of 0.8?
> pwr.anova.test(k=5, f=f, power=0.8)
k = 5
n = 11.93119
f = 0.4670469
sig.level = 0.05
power = 0.8
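If the data file or the `pwr` package is not to hand, the same numbers can be approximated with base R from the noncentral F distribution. This is a sketch of the calculation as I understand it for a balanced one-way ANOVA, with noncentrality parameter k·n·f²; the function name `anova_power` is hypothetical, and the value of f is copied from the output above.

```r
# Sketch: power of a balanced one-way ANOVA from the noncentral F distribution
anova_power <- function(k, n, f, sig.level = 0.05) {
  df1 <- k - 1                            # between-group degrees of freedom
  df2 <- k * (n - 1)                      # within-group degrees of freedom
  ncp <- k * n * f^2                      # noncentrality parameter
  f_crit <- qf(1 - sig.level, df1, df2)   # critical F under H0
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

f <- 0.4670469   # Cohen's f as printed in the output above

# Power of the pilot design: k = 5 groups, n = 6 mice each
power_now <- anova_power(k = 5, n = 6, f = f)

# Fractional n per group at which power reaches 0.8
n_needed <- uniroot(function(n) anova_power(5, n, f) - 0.8,
                    interval = c(2, 100), tol = 1e-8)$root

print(c(power = power_now, n_needed = n_needed, n_rounded = ceiling(n_needed)))
```

Rounding the fractional result up gives the whole number of mice per group.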
Conclusions from our example
- Request power of 0.8
- To detect 2-fold change between WT and a KO in a pessimistic case we need 11 mice in each group
- To detect a change across all groups (ANOVA) we need 12 mice in each group
- We recommend an experiment with at least 12 mice in each group
Hand-outs available at http://is.gd/statlec