TRANSCRIPT
May the “Power” (Statistical) Be with You!
Dr. Mickey Shachar, C.O.R.E. Webinar Series
October 13, 2015
• RQ: Is there a statistically significant difference in students’ academic performance in Math between the classes of Dr. Adam and Dr. Eve?
• Hnull: There is no statistically significant difference in students’ academic performance in Math between the classes of Dr. Adam and Dr. Eve.
You are the Dean and receive the following report:
• Report: An Independent Samples t test was run to compare the means of a Math test between Dr. Eve’s class, M = 90.96 (SD = 12.60), and Dr. Adam’s class, M = 89.32 (SD = 15.38), yielding a statistically significant difference, t(1358) = 2.164, p = .031. Hence we reject Hnull and conclude that Dr. Eve’s students outperformed Dr. Adam’s students.
• What should the Dean do based on these accurate, true results?
  - A: Critique Dr. Adam on his students’ low performance and set a deadline and a minimum score for him to meet.
  - B: Promote Dr. Eve and let Dr. Adam eat his heart out.
  - C: The results are subject to chance due to small sample size; we need to rerun the study with a larger sample.
  - D: Attend Dr. Shachar’s C.O.R.E. Power Webinar
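The Dean’s report can be reproduced from its summary statistics alone. A minimal sketch in Python using SciPy; the two equal groups of 680 students each are an assumption (they are consistent with the reported df = 1358):

```python
# Reproducing the report's Independent Samples t test from summary statistics.
# Equal group sizes of 680 are an assumption, matching the reported df = 1358.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(
    mean1=90.96, std1=12.60, nobs1=680,   # Dr. Eve's class
    mean2=89.32, std2=15.38, nobs2=680,   # Dr. Adam's class
)
print(f"t = {t:.3f}, p = {p:.3f}")        # close to the reported t(1358) = 2.164, p = .031
```

The slight discrepancy from the reported t = 2.164 suggests the actual group sizes were not exactly equal.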
Problems with hypothesis significance testing based on p values:
• The p-value depends essentially on two things: the size of the effect and the size of the sample. One would get a ‘significant’ result either if the effect were very big (despite having only a small sample) or if the sample were very big (even if the actual effect size were tiny).
• We are looking at “statistical significance” and not at “practical significance”.
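This dependence on sample size is easy to demonstrate numerically. A sketch with hypothetical numbers: the same small mean difference (0.5 points, SD = 15) is “non-significant” with 50 students per group but “significant” with 10,000 per group:

```python
# Same effect size, different sample sizes: only n changes, yet "significance" flips.
# All numbers here are hypothetical, chosen to illustrate the point.
from scipy.stats import ttest_ind_from_stats

results = {}
for n in (50, 10_000):
    t, p = ttest_ind_from_stats(
        mean1=100.5, std1=15, nobs1=n,    # hypothetical group A
        mean2=100.0, std2=15, nobs2=n,    # hypothetical group B
    )
    results[n] = p
    print(f"n = {n:>6} per group: t = {t:.3f}, p = {p:.4f}")
```

The effect (d ≈ 0.033) never changes; only the sample size does.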
• If only the null hypothesis is available and is rejected, at most the conclusion is that “the difference is not zero”.
• When the President asks the Five-Star General to estimate the war casualties, can he give “not zero” as a satisfactory answer?!
• We should be concerned not only with whether a null hypothesis is false, but also with how false it is.
• In other words, if the difference is not zero, how large a difference should one expect?
• The larger the effect size (the difference between the Hnull and Halt means), the greater the power of the test.
A-priori power analysis allows you to decide, in the process of designing an experiment/study:
• How large a sample is needed to enable statistical judgments that are accurate and reliable, and
• How likely your statistical test will be able to detect effects of a given size in a particular situation.
• Without these calculations, sample size may be too high or too low.
  - If sample size is too low, the experiment will lack precision.
  - If sample size is too large, time and resources will be wasted.
Post-hoc power analysis allows you to decide, after the study was executed:
• Whether the study attained an acceptable power, and
• Whether the results have practical significance.
APA publication requirements:
• All study publications should report, in addition to p values, the effect sizes (ES) and their confidence intervals (CI).
Power Analysis Topics
• Error Types
  - Type I = alpha; Type II = beta; Power = 1 - beta
• Effect Size
• Power Analysis
• The null hypothesis is either true or false.
• The null hypothesis is either rejected or not rejected.
• Only 4 possible things can happen:

                      State of the World: H0    State of the World: H1
  Our Decision: H0    Correct Acceptance        Type II Error (beta)
  Our Decision: H1    Type I Error (alpha)      Correct Rejection
Common acceptance in the social sciences:
• Type I error (alpha) must be kept at or below .05.
• Type II error (beta) must be kept low as well.
• “Statistical power,” which is equal to 1 - beta, must be kept correspondingly high.
• Ideally, power should be at least .80 to detect a reasonable departure from the null hypothesis.
• Effect size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect (Becker, 2000).
  - Unlike significance tests, these indices are independent of sample size.
• There is a wide array of formulas used to measure ES:
  - as the standardized difference between two means: ‘d’ or ‘g’;
  - as the correlation between the independent variable (IV) classification and the individual scores on the dependent variable (DV): ‘r’;
  - others: OR, HR, RR, etc.
In its simplest form, effect size, denoted by the symbol ‘d’, is the mean difference between groups in standard-score form, i.e., the ratio of the difference between the means to the standard deviation: d = (M1 - M2) / SD.
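Applying this definition to the Dean’s report gives a rough sketch of the effect involved. Note that the pooled-SD form of ‘d’ used here is one of several variants, and equal group sizes are an assumption, so other calculators (e.g., G*Power) may report a somewhat different value:

```python
# Cohen's d = (M1 - M2) / SD_pooled, using the report's summary statistics.
import math

m1, s1 = 90.96, 12.60      # Dr. Eve's class
m2, s2 = 89.32, 15.38      # Dr. Adam's class
sd_pooled = math.sqrt((s1**2 + s2**2) / 2)   # equal group sizes assumed
d = (m1 - m2) / sd_pooled
print(f"d = {d:.3f}")      # a "small" effect by the conventions below
```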
Conventions    ‘d’ (standardized difference of means)    ‘r’ (correlation)
‘Small’        0.2                                       0.1
‘Medium’       0.5                                       0.3
‘Large’        0.8                                       0.5
The factors influencing power in a statistical test:
• What kind of statistical test is being performed.
  - You will need to calculate a different effect size per test type!
• Sample size. In general, the larger the sample size, the larger the power.
• The size of experimental effects. If the null hypothesis is wrong by a substantial amount, power will be higher than if it is wrong by a small amount.
• The level of error in experimental measurements. Anything that enhances the accuracy and consistency of measurement can increase statistical power.
• To ensure a statistical test will have adequate power, one usually must perform special analyses prior to running the experiment, to calculate how large an N is required.
• The question is, “How large an N is necessary to produce reasonably high power in this situation, while maintaining alpha at a reasonably low value?”
To determine the sample size needed, we play with four factors:
1. Obtain “ES”. Where do we find it?
   - Lit review
   - Pilot study
   - An “educated conjecture”
2. Define alpha (≤ .05)
3. Define power (1 - beta): .80
4. Calculate sample size (using a statistical power calculator such as G*Power)
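Step 4 can also be done without a dedicated calculator. A sketch using SciPy’s noncentral t distribution, the same machinery behind G*Power’s a-priori calculation for a two-sided independent-samples t test (the function names and the simple search loop are mine):

```python
# A-priori sample size for a two-sided, two-sample t test with equal groups.
import math
from scipy import stats

def power_two_sample_t(d, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample t test via the noncentral t distribution."""
    df = 2 * n_per_group - 2
    nc = d * math.sqrt(n_per_group / 2)          # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # Probability the test statistic lands in the rejection region under H1
    return (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)

def required_n(d, power=0.80, alpha=0.05):
    """Smallest per-group n reaching the target power (simple linear search)."""
    n = 2
    while power_two_sample_t(d, n, alpha) < power:
        n += 1
    return n

print(required_n(0.5))   # "medium" effect, alpha .05, power .80 -> 64 per group
```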
Now that we are done with our study, we need to check how well the actual results we found do in terms of power. Again, we play with four factors:
1. Input “ES” from our study
2. Define alpha (≤ .05)
3. Input sample size from our study
4. Calculate power (can use G*Power)
For our t test, with ES = .091, alpha = .05, and sample size N = 1360, we have obtained a dismal power of .388!
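This figure can be checked directly. A sketch using SciPy’s noncentral t distribution, assuming two equal groups of 680 (total N = 1360, consistent with the reported df = 1358):

```python
# Post-hoc power for the deck's t test: d = .091, alpha = .05, N = 1360,
# assuming two equal groups of 680 (matching the reported df = 1358).
import math
from scipy import stats

d, alpha, n = 0.091, 0.05, 680
df = 2 * n - 2
nc = d * math.sqrt(n / 2)                  # noncentrality parameter
t_crit = stats.t.ppf(1 - alpha / 2, df)
power = (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)
print(round(power, 3))                     # ~0.388, the "dismal" power on the slide
```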
• Hypothesis testing based on p values provides only statistical significance.
• Power analysis is crucial for your study:
  - A-priori: to determine the required sample size.
  - Post-hoc: to calculate and examine power from the actual research study, and to examine the practical significance of the research findings.
• If you fired Dr. Adam, reinstate him!!!
• “G*Power” v. 3.1.9.2 (2015). Buchner, Erdfelder, Faul, & Lang.
  - Free software download: http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3
• Eveland, J. D. (2008). Using “G*Power” for Statistical Power and Sample Size Analysis.
  - Download instructions to follow for PPT.
• Becker, L. A. (2000). Effect Sizes. Retrieved from: http://www.uccs.edu/lbecker/effect-size.html
Attention Faculty, Students, Alumni and Guest Speakers in Business, Health Sciences, and Education:
• Have you wanted to present your ongoing scholarly and professional work to a general audience?
• CORE Grand Rounds provides a platform for professional development and increased engagement, where you can receive constructive feedback from peers and scholars-in-training.
• Email Dr. Bernice B. Rumala at [email protected] to sign up.
• To receive more information about C.O.R.E., please visit the C.O.R.E. webpage at: www.trident.edu/webinars/core
• For further information about Trident’s doctoral programs in educational leadership, business, and health sciences, please visit: https://www.trident.edu/degrees/doctoral/
• If you have any comments for C.O.R.E., you may email Dr. Bernice B. Rumala, C.O.R.E. Chair, at: [email protected]