Why you need power analysis even if you think you don’t
Paul Johnson (1), Sarah Barry (2), Heather Ferguson (1), Pie Müller (3)
1: Boyd Orr Centre, IBAHCM, University of Glasgow
2: Robertson Centre for Biostatistics, University of Glasgow
3: Swiss Tropical and Public Health Institute
14/12/2015 @paulcdjo 1
Power
Narrow definition: The probability of a significant result given some effect size
• Applies only to null hypothesis significance testing (NHST)
Broad definition: The information we expect to gain from a study that uses statistical inference; “informativeness”
• Applies to all modes of statistical inference:
• Null hypothesis significance testing (NHST)
• Estimating an effect & confidence interval
• Information theoretic (AIC, BIC, etc)
• Bayesian
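The narrow definition can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a two-group comparison of means with a known standard deviation and a two-sided z-test, and the function name `simulated_power` and all parameter values are hypothetical choices, not taken from the talk.

```python
import random
from statistics import NormalDist

def simulated_power(effect, sd, n_per_group, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power as the fraction of simulated studies that reach significance.

    Assumes two groups with normally distributed data, a known standard
    deviation `sd`, and a two-sided z-test on the difference in means.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two group means.
    se = sd * (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            hits += 1
    return hits / n_sims

# Same assumed effect size, two candidate sample sizes.
print(simulated_power(effect=0.5, sd=1.0, n_per_group=20))
print(simulated_power(effect=0.5, sd=1.0, n_per_group=64))
```

The same loop extends to the broad definition: instead of counting significant results, record the width of each simulated confidence interval and summarise how precise the estimates are.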
Reproducibility
“Reproducibility is the ability of an entire experiment or study to be duplicated… Reproducibility is one of the main principles of the scientific method.”
Wikipedia, 08/12/15
“non-reproducible single occurrences are of no significance to science”
Karl Popper, The Logic of Scientific Discovery, 1935
Low power means low reproducibility
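• Suppose 1000 studies each test a null hypothesis (H0) at the 5% significance level with 30% power, and H0 is false in 10% of them
• In the 100 studies where H0 is false: 30% × 100 = 30 significant results, all true positives
• In the 900 studies where H0 is true: 5% × 900 = 45 significant results, all false positives
• P(effect is real | significant) = 30/(30 + 45) = 40%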
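The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch; the function name `positive_predictive_value` is a hypothetical label, and the 80% power comparison is an added illustration rather than a figure from the talk.

```python
def positive_predictive_value(n_studies, prior_true, alpha, power):
    """Return P(effect is real | result is significant)."""
    n_real = n_studies * prior_true   # studies where H0 is false
    n_null = n_studies - n_real       # studies where H0 is true
    true_pos = power * n_real         # significant and true
    false_pos = alpha * n_null        # significant but false
    return true_pos / (true_pos + false_pos)

# Values from the slide: 1000 studies, H0 false in 10%, alpha = 5%, power = 30%.
ppv_low = positive_predictive_value(1000, 0.10, 0.05, 0.30)

# Assumed comparison: the same studies run at 80% power.
ppv_high = positive_predictive_value(1000, 0.10, 0.05, 0.80)

print(f"{ppv_low:.2f}, {ppv_high:.2f}")  # prints "0.40, 0.64"
```

Raising power from 30% to 80% leaves the 45 false positives unchanged but increases the true positives from 30 to 80, so a larger share of significant results reflect real effects.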
A crisis of irreproducibility?
• Human genetics
• Psychology
• Neuroscience
Irreproducibility in human genetics
Nature Genetics 29, 306-309 (2001)
Typical GWAS sample sizes from current Nature Genetics issue:
• 23,000
• 27,000
• 116,000
Irreproducibility in psychology
Figure: original study effect size versus replication effect size (correlation coefficients).
Open Science Collaboration, Science 2015;349:aac4716
Reproducibility Project: Psychology
• Aimed to replicate 100 studies
• Result: 39/100 replicated original result
Irreproducibility in neuroscience
Nature Reviews Neuroscience 14, 365-376 (2013)
“Our results indicate that the median statistical power in neuroscience is 21%.”
“The consequences of this [very low power] include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles.”
A crisis of irreproducibility?
• Human genetics
• Psychology
• Neuroscience
• Ecology?
A crisis of irreproducibility in ecology?
• No study has directly assessed reproducibility across ecology
• Are known causes of irreproducibility prevalent in ecological research?
Low power in ecological research?
• Jennions & Møller (2003) A survey of the statistical power of research in behavioral ecology and animal behavior. Behavioral Ecology
  • Average power 40-47% for medium effect sizes
• Taborsky (2010) Sample size in the study of behaviour. Ethology
  • Power analysis rarely used in behavioural research
• Smith et al. (2011) Power rangers: no improvement in the statistical power of analyses published in Animal Behaviour. Animal Behaviour
  • Average power 23-26% for medium effect sizes
What can we do about it?
Obstacles to using power analysis, and solutions

Obstacle: Power analysis doesn’t apply to my study
Solution: It does if you are using statistical inference. Broaden your definition of power analysis.

Obstacle: I don’t know what parameter values to assume (pilot data not available)
Solution: Assume a plausible worst-case scenario based on experience, or do a pilot study

Obstacle: I don’t know the true effect size
Solution: You don’t need to. This is the job of the study. Think instead of
• The smallest effect worth detecting (for NHST)
• The desired precision of the estimate (for estimation)

Obstacle: My model is too complicated for power analysis
Solution:
• Simplify to the bare essentials
• Use simulations
• Ask for help

Obstacle: Power analysis is hard & no one is forcing me to do it
Solution: Power analysis should be promoted by
• Journal editors
• Funders
• Group leaders
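Both alternatives to knowing the true effect size translate directly into a sample size once the study is simplified to its bare essentials. The sketch below assumes a comparison of two group means, a known standard deviation, and the normal approximation; the function names and the example values are hypothetical, not from the talk.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_for_power(smallest_effect, sd, power=0.80, alpha=0.05):
    """NHST: sample size per group to detect the smallest effect worth detecting."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / smallest_effect) ** 2)

def n_per_group_for_precision(half_width, sd, conf=0.95):
    """Estimation: sample size per group so that the confidence interval for
    the difference in means has the desired half-width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(2 * (z * sd / half_width) ** 2)

# Assumed example: smallest effect worth detecting is half a standard deviation.
print(n_per_group_for_power(smallest_effect=0.5, sd=1.0))   # prints 63

# Assumed example: estimate the difference to within +/- 0.5 standard deviations.
print(n_per_group_for_precision(half_width=0.5, sd=1.0))    # prints 31
```

Neither function needs the true effect size. The inputs are a judgement about what is worth detecting or how precise the estimate must be, plus a plausible (ideally worst-case) standard deviation. For models too complicated for a closed-form formula, simulation is the usual fallback.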
Conclusions
• Underpowered studies are probably common in ecology, so irreproducibility is probably rife
• Time and money spent on underpowered research are wasted
• We should stop looking for excuses to avoid power analysis
• We should collaborate with statisticians who know how to do power analysis
• Journal editors, funders & group leaders should encourage power analysis
Funding acknowledgement: AvecNet - African Vector Control: New Tools