
Upload: oliver-manning

Post on 29-Dec-2015


Page 1: Introduction to Sample Size Determination: “How powerful do I need to be, anyway?” Dennis G. Fisher, Ph.D. Center for Behavioral Research and Services

Introduction to Sample Size Determination: “How powerful do I need to be, anyway?”

Dennis G. Fisher, Ph.D.
Center for Behavioral Research and Services
California State University, Long Beach

Page 2:

Power and Design

The One-Group Pretest-Posttest Design.

Must have 2 points in time (about 6 months apart for the administration of the instrument).

Must have a method of linking time 1 responses to time 2 responses.

Three levels of measurement considerations.

Page 3:

Interval (Ratio) Measurement

Equal intervals (interval) with true zero (ratio).

Dependent-samples t-test.

“On how many occasions during the last 30 days have you had alcoholic beverages to drink (more than just a few sips)?”

Ratio scale.

Page 4:

Dependent-Samples t-test

$H_0: \mu_d = 0$
$H_a: \mu_d \neq 0$
$\alpha = .05$
$d$ = difference scores (between time 1 and time 2)
$s_{\bar{d}}$ = standard error of the difference scores.

$t = \frac{\bar{d}}{s_{\bar{d}}}$
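As a minimal sketch of the statistic on this slide, the mean difference divided by its standard error, the Python below uses only the standard library. The function name `paired_t` and the five pairs of scores are hypothetical, chosen for illustration only.

```python
import math
import statistics


def paired_t(time1, time2):
    """Return (t, df) for the dependent-samples t-test on linked scores."""
    if len(time1) != len(time2):
        raise ValueError("time1 and time2 must have the same length")
    diffs = [b - a for a, b in zip(time1, time2)]  # d = time 2 minus time 1
    n = len(diffs)
    d_bar = statistics.mean(diffs)        # mean of the difference scores
    s_d = statistics.stdev(diffs)         # SD of the differences (n - 1 denominator)
    se = s_d / math.sqrt(n)               # standard error of the mean difference
    return d_bar / se, n - 1


# Hypothetical drinking occasions in the last 30 days for five linked respondents.
time1 = [6, 4, 8, 5, 7]
time2 = [4, 3, 5, 5, 4]
t, df = paired_t(time1, time2)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t is compared with the critical value of the t distribution on n - 1 degrees of freedom.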

Page 5:

Sample Size Determination for Dependent-Samples t-test

Formula for sample size determination.

$\delta$ = hypothesized difference (how do you know this?)
$\sigma$ = hypothesized standard deviation of the differences
$z_\alpha$ and $z_\beta$ are the standard normal values for the chosen alpha and beta levels.
If $\alpha$ = .05 then $z_\alpha$ = 1.96 for a two-tailed test.
If power = .80 then $z_\beta$ = 0.84; if power = .90 then $z_\beta$ = 1.28.

$n = \left[ \frac{(z_\alpha + z_\beta)\,\sigma}{\delta} \right]^2$
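The sample size formula on this slide, n = [(z_alpha + z_beta) * sigma / delta]^2, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a replacement for dedicated software: the function name `paired_sample_size` and the example values of delta and sigma are hypothetical, and the z values come from the standard normal distribution in the standard library.

```python
import math
from statistics import NormalDist


def paired_sample_size(delta, sigma, alpha=0.05, power=0.80, two_tailed=True):
    """Number of pairs needed, using n = ((z_alpha + z_beta) * sigma / delta) ** 2."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail_area)  # 1.96 for two-tailed alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for power = .80
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                            # round up to whole pairs


# Hypothetical planning values: a mean change of 2 occasions, SD of differences 5.
n_80 = paired_sample_size(delta=2, sigma=5, power=0.80)
n_90 = paired_sample_size(delta=2, sigma=5, power=0.90)
print(n_80, n_90)
```

Because delta sits in the denominator and is squared, halving the hypothesized difference roughly quadruples the required number of pairs.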

Page 6:

Ordinal Level of Measurement

“How much do you think people risk harming themselves (physically or in other ways) if they take one or two drinks of alcohol nearly every day?”

No risk, Slight risk, Moderate risk, Great risk

Ordering but not equal intervals.

Wilcoxon paired-sample test (aka signed-rank test)

Page 7:

Wilcoxon Paired-Sample Test

$H_0$: Perceived risk at time 1 is the same as perceived risk at time 2.

$H_a$: Perceived risk at time 1 is not the same as perceived risk at time 2.

$Z = \frac{W_1 - W_e}{\sigma_w}$

$W_e = \frac{n(n+1)}{4}$

$\sigma_w = \sqrt{\frac{(2n+1)\,W_e}{6}}$

Page 8:

Wilcoxon Signed-Rank Test

$W_1$ = Smaller of the two rank sums.
$W_e$ = Expected sum of rank scores.
$\sigma_w$ = Standard deviation of the sum of rank scores.
Tied pairs (no difference between time 1 and time 2) are eliminated from the analysis.
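The normal approximation for the signed-rank test, Z = (W_1 - W_e) / sigma_w with W_e = n(n+1)/4 and sigma_w = sqrt((2n+1) W_e / 6), can be sketched as follows. The function name `wilcoxon_z`, the 1 to 4 coding of the risk categories, and the eight pairs of responses are hypothetical; giving tied absolute differences their average rank is the usual convention and is assumed here.

```python
import math


def wilcoxon_z(time1, time2):
    """Return (Z, n) for the signed-rank test using the normal approximation."""
    # Pairs with no change carry no information and are dropped.
    diffs = [b - a for a, b in zip(time1, time2) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("every pair is tied; the statistic is undefined")

    # Rank the absolute differences; equal values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w1 = min(w_positive, w_negative)            # smaller of the rank sums
    w_e = n * (n + 1) / 4                       # expected rank sum under H0
    sigma_w = math.sqrt((2 * n + 1) * w_e / 6)  # SD of the rank sum under H0
    return (w1 - w_e) / sigma_w, n


# Hypothetical coding: 1 = No risk, 2 = Slight, 3 = Moderate, 4 = Great risk.
time1 = [1, 2, 1, 2, 3, 2, 4, 3]
time2 = [2, 3, 3, 4, 2, 2, 4, 4]
z, n = wilcoxon_z(time1, time2)
print(f"Z = {z:.2f} based on n = {n} non-tied pairs")
```

With so few pairs the normal approximation is rough; the example is only meant to show how the pieces of the formula fit together.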

Page 9:

Nominal Scale of Measurement: McNemar’s Chi-Square Test

“I plan to get drunk sometime in the next year.” False / True

                     Time 2
                  False    True
Time 1   False    f11      f12
         True     f21      f22

$\chi^2 = \frac{(f_{12} - f_{21})^2}{f_{12} + f_{21}}$
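McNemar's statistic uses only the two discordant cells, chi-square = (f12 - f21)^2 / (f12 + f21) on 1 degree of freedom. A minimal sketch follows, assuming rows are time 1 and columns are time 2, so that f12 counts False-then-True answers; the function name `mcnemar_chi2` and the 40 paired responses are hypothetical.

```python
def mcnemar_chi2(time1, time2):
    """Return (chi2, f12, f21) from paired True/False answers, no continuity correction."""
    f12 = sum(1 for a, b in zip(time1, time2) if not a and b)  # False at time 1, True at time 2
    f21 = sum(1 for a, b in zip(time1, time2) if a and not b)  # True at time 1, False at time 2
    if f12 + f21 == 0:
        raise ValueError("no discordant pairs; the statistic is undefined")
    return (f12 - f21) ** 2 / (f12 + f21), f12, f21


# Hypothetical answers from 40 linked respondents.
time1 = [False] * 22 + [True] * 18
time2 = [False] * 10 + [True] * 12 + [False] * 4 + [True] * 14
chi2, f12, f21 = mcnemar_chi2(time1, time2)
print(f"f12 = {f12}, f21 = {f21}, chi-square = {chi2:.2f}")
```

The result is compared with 3.84, the critical chi-square value for 1 degree of freedom at alpha = .05. Respondents who gave the same answer both times (f11 and f22) do not enter the statistic.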

Page 10:

McNemar’s Chi-Square

Power calculation (Miettinen, 1968)

[Equation: sample size as a function of $z_\alpha$ and $z_\beta$; the remaining terms are not recoverable from the transcript.]
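For orientation, one commonly quoted form of Miettinen's approximation for the number of pairs is n = [z_alpha * psi + z_beta * sqrt(psi^2 - delta^2 (3 + psi) / 4)]^2 / (psi * delta^2), where psi is the expected proportion of discordant pairs and delta is the difference between the two discordant proportions. The notation, the function name `miettinen_pairs`, and the example proportions are assumptions made here, not taken from the slide, so the result should be checked against Miettinen (1968) or a dedicated power program before use.

```python
import math
from statistics import NormalDist


def miettinen_pairs(p12, p21, alpha=0.05, power=0.80):
    """Approximate number of pairs for McNemar's test (two-tailed).

    p12, p21: hypothesized proportions of all pairs falling in each discordant cell.
    """
    psi = p12 + p21        # proportion of discordant pairs
    delta = p12 - p21      # difference between the discordant proportions
    if delta == 0:
        raise ValueError("p12 and p21 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    root = math.sqrt(psi ** 2 - delta ** 2 * (3 + psi) / 4)
    n = (z_alpha * psi + z_beta * root) ** 2 / (psi * delta ** 2)
    return math.ceil(n)


# Hypothetical planning values: 25% of pairs change False to True, 10% True to False.
n_pairs = miettinen_pairs(p12=0.25, p21=0.10)
print(n_pairs)
```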

Page 11:

Computer Programs

nQuery Advisor
Power and Precision
Statistica Power Analysis
Power and Sample Size (PASS)
SAS – The SAS Power and Sample Size Application
SPSS – SamplePower – Stand-alone product

Page 12:

What do you need to know before you use the computer program?

What is alpha? (What p value do you want? Usual value .05)
What is beta? (Actually 1-beta, or what power do you want? Usual values .8, .85, .9)
What is your estimate of effect? (e.g., difference between means, etc.) How do you find this information?
What is your estimate of variance? (or SD, etc.)
Obtain approximately 150% of required sample at time 1 to account for loss to follow-up.
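The 150% guideline is simple arithmetic: recruiting 1.5 times the required sample still leaves enough linked pairs if about one-third of respondents are lost by time 2. A small sketch, with the function name `recruit_at_time1` and the required sample of 50 as hypothetical values:

```python
import math


def recruit_at_time1(required_n, inflation=1.5):
    """Number to recruit at time 1 so that required_n remain after loss to follow-up."""
    return math.ceil(required_n * inflation)


# Hypothetical: the power analysis calls for 50 linked pairs at time 2.
to_recruit = recruit_at_time1(50)
print(to_recruit)
```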

Page 13:

How to Increase Statistical Power

1. Add Subjects
Simple and direct, but also expensive.
2. Add more subjects to the group that is cheaper or easier to recruit.
If you can only add to one group, then do it even though it will not be as efficient as keeping sample sizes equal between the two groups. The efficiency of this approach drops off sharply once the larger group is more than twice the size of the smaller one.
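The diminishing return from enlarging only one group can be illustrated with the variance of a difference between two means, which is proportional to 1/n1 + 1/n2. The sketch below assumes equal variances in the two groups (an assumption made here, not stated on the slide) and holds the smaller group fixed while the other grows to k times its size; the function name `relative_variance` is hypothetical.

```python
def relative_variance(k):
    """Variance of the difference in means with 1:k allocation, relative to 1:1.

    The smaller group is held fixed; equal group variances are assumed.
    """
    return (1 + 1 / k) / 2


for k in (1, 2, 3, 4, 10):
    print(f"larger group = {k}x: relative variance = {relative_variance(k):.3f}")
```

Doubling the larger group removes a quarter of the variance, but no amount of further enlargement can remove more than half, which is why the gains flatten out beyond about 2x.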

Page 14:

Choose Less Stringent Alpha Level

Using a one-tailed test at alpha = .05 is the equivalent of changing alpha from .05 to .10 for a two-tailed test. If you specify a one-tailed test a priori (and your thesis chair agrees), you can greatly increase power.
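The equivalence between a one-tailed test at .05 and a two-tailed test at .10 shows up directly in the critical values of the standard normal distribution. A short check using the Python standard library (variable names are illustrative):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse of the standard normal CDF

one_tailed_05 = z(1 - 0.05)      # all of alpha in one tail
two_tailed_10 = z(1 - 0.10 / 2)  # alpha = .10 split across two tails
two_tailed_05 = z(1 - 0.05 / 2)  # the conventional two-tailed cutoff

print(f"one-tailed .05: {one_tailed_05:.3f}")
print(f"two-tailed .10: {two_tailed_10:.3f}")
print(f"two-tailed .05: {two_tailed_05:.3f}")
```

The first two cutoffs coincide at about 1.645, compared with 1.96 for the two-tailed .05 test, so a smaller observed effect in the predicted direction reaches significance.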

Page 15:

Increase Effect Size

1. By strengthening the intervention - increase dose, increase number of sessions, use multiple modalities, etc.

2. By weakening the comparison group – use no-treatment control.

3. Use extreme groups.

Page 16:

Use as Few Groups as Possible

The more groups, the more the total sample will be split into smaller cell sizes. The more groups, the smaller the number of subjects for any specific comparison or contrast. Student-Newman-Keuls is more powerful than Tukey HSD in these situations.

Page 17:

Use Covariates or Blocking Variables

If the blocking variable is correlated with the dependent variable, then the power will increase with the size of the correlation.
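One way to see the size of the gain, under the usual analysis-of-covariance assumptions (which are an assumption here, not stated on the slide), is that a covariate correlated $\rho$ with the dependent variable leaves approximately the following error variance, ignoring the degree of freedom spent on the covariate:

```latex
\sigma^2_{\text{adjusted}} \approx \sigma^2_{y}\,\left(1 - \rho^2\right)
```

A correlation of .5 therefore removes about 25% of the error variance, and a correlation of .7 removes about 49%.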

Page 18:

Use Cross-Over, Repeated Measures, Within-Subject Design

These designs can greatly increase power if there is a high correlation between the adjacent measures. For example, if the time 1 measure is highly correlated with the time 2 measure, then power is increased by using this kind of a design.
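The reason correlation helps is that the standard deviation of the difference scores shrinks as the time 1 and time 2 measures become more correlated: the variance of a difference is sd1^2 + sd2^2 - 2 * rho * sd1 * sd2. A minimal sketch, with the function name `sd_of_difference` and the standard deviations of 5 as hypothetical values:

```python
import math


def sd_of_difference(sd1, sd2, rho):
    """SD of (time 2 - time 1) scores given each SD and their correlation."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2 - 2 * rho * sd1 * sd2)


# Hypothetical: both measurements have SD = 5.
for rho in (0.0, 0.5, 0.8):
    print(f"rho = {rho}: SD of differences = {sd_of_difference(5, 5, rho):.2f}")
```

A smaller SD of the differences means a smaller standard error for the mean change, so fewer pairs are needed to detect the same change.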

Page 19:

For n-way ANOVA, Hypothesize Main Effects Instead of Interactions

The Main Effects tests have more power than the Interaction tests.

Page 20:

Measurement Issues

The Dependent Variable should be sensitive to change as a result of the intervention.

The greater the reliability of the DV, the lower the model error, the greater the power. This means that assessing the reliability is important, as well as quality control procedures to reduce administration variability.

Page 21:

Direct Measures Instead of Indirect

The use of proximal instead of distal measures will increase power.

For instance, if an intervention increases knowledge that is hoped to lead to behavior change, which in turn leads to change in physiological measures, there will be more power to assess the intervention if the DV is a change in a measure of knowledge.

Page 22:

References

Kuzma, J. W., & Bohnenblust, S. E. (2001). Basic statistics for the health sciences. Mountain View, CA: Mayfield.

Miettinen, O. S. (1968). On the matched-pairs design in the case of all-or-none responses. Biometrics, 24, 339-352.

Norman, G. R., & Streiner, D. L. (1998). Biostatistics: The bare essentials. Hamilton, Ontario: B. C. Decker.

Zar, J. H. (1984). Biostatistical analysis, second edition. Englewood Cliffs, NJ: Prentice-Hall.