
Part 2: Attrition: Bias and Loss of Power

Relevant Papers

Graham, J.W. (2009). Missing data analysis: making it work in the real world. Annual Review of Psychology, 60, 549-576.

Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6, 330-351.

Hedeker, D., & Gibbons, R.D. (1997). Application of random-effects pattern-mixture models for missing data in longitudinal studies. Psychological Methods, 2, 64-78.

Graham, J.W., & Collins, L.M. (2010, forthcoming). Using Modern Missing Data Methods with Auxiliary Variables to Mitigate the Effects of Attrition on Statistical Power. Chapter 10 in Graham (2010, forthcoming), Missing Data: Analysis and Design. New York: Springer.

Relevant Papers

Graham, J.W., Palen, L.A., et al. (2008). Attrition: MAR & MNAR missingness, and estimation bias. Annual Meetings of the Society for Prevention Research, San Francisco, CA. (available upon request)

also see: Graham, J.W., (2010, forthcoming). Simulations with Missing Data. Chapter 9 in Graham (2010, forthcoming), Missing Data: Analysis and Design. New York: Springer.

What if the cause of missingness is MNAR?

Problems with this statement

MAR & MNAR are widely misunderstood concepts

I argue that the cause of missingness is never purely MNAR

The cause of missingness is virtually never purely MAR either.

MAR vs MNAR

"Pure" MCAR, MAR, MNAR never occur in field research

Each requires untenable assumptions, e.g., that all possible correlations and partial correlations are r = 0

MAR vs MNAR

Better to think of MAR and MNAR as forming a continuum

MAR vs MNAR is NOT even the dimension of interest

MAR vs MNAR: What IS the Dimension of Interest?

How much estimation bias is there when the cause of missingness cannot be included in the model?

Bottom Line ...

All missing data situations are partly MAR and partly MNAR

Sometimes it matters: bias affects statistical conclusions

Often it does not matter: bias has tolerably little effect on statistical conclusions

(Collins, Schafer, & Kam, Psych Methods, 2001)

Methods: "Old" vs MAR vs MNAR

MAR methods (MI and ML) are ALWAYS at least as good as, and usually better than, "old" methods (e.g., listwise deletion)

Methods designed to handle MNAR missingness are NOT always better than MAR methods

Yardstick for Measuring Bias

$$\text{Standardized Bias} = \frac{(\text{average parameter estimate}) - (\text{population value})}{\text{Standard Error (SE)}} \times 100$$

|bias| < 40 is considered small enough to be tolerable (t-value off by less than 0.4)
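The yardstick above can be sketched in a few lines of Python. This is only an illustration: the function names and the example numbers are hypothetical, and the standard error is simply passed in as a value rather than computed in any particular way.

```python
from statistics import mean

def standardized_bias(estimates, population_value, standard_error):
    """100 * (average parameter estimate - population value) / SE."""
    return 100.0 * (mean(estimates) - population_value) / standard_error

def is_tolerable(bias, threshold=40.0):
    """Apply the |bias| < 40 yardstick from the slide."""
    return abs(bias) < threshold

# Hypothetical values for illustration only (not taken from any study).
estimates = [0.52, 0.49, 0.55, 0.51, 0.53]  # estimates across simulation replications
bias = standardized_bias(estimates, population_value=0.50, standard_error=0.10)
print(round(bias, 1), is_tolerable(bias))
```

With these made-up numbers the average estimate is 0.02 above the population value, which is one fifth of a standard error, so the standardized bias is about 20 and falls inside the tolerable range.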

A little background for Collins, Schafer, & Kam (2001; CSK)

Example model of interest: X → Y
  X = Program (prog vs control)
  Y = Cigarette Smoking
  Z = Cause of missingness: say, Rebelliousness (or smoking itself)

Factors to be considered:
  % Missing (e.g., % attrition)
  rYZ
  rZR

rYZ

Correlation between the cause of missingness (Z), e.g., rebelliousness (or smoking itself), and the variable of interest (Y), e.g., Cigarette Smoking

rZR

Correlation between the cause of missingness (Z), e.g., rebelliousness (or smoking itself), and missingness on the variable of interest, e.g., missingness on the Smoking variable

Missingness on Smoking (often designated R or RY) is a dichotomous variable:
  R = 1: Smoking variable not missing
  R = 0: Smoking variable missing
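As a rough sketch of how rYZ and rZR could be computed in a simulation, where the full Y values are known before missingness is imposed, the following uses an ordinary Pearson correlation for both. The toy data and the helper name `pearson` are hypothetical. Note that with R coded 1 = not missing, a Z that drives dropout gives a negative rZR, so the sketch assumes the values quoted on the slides are magnitudes.

```python
import math

def pearson(a, b):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data for illustration only.
z = [1, 2, 3, 4, 5, 6, 7, 8]  # cause of missingness, e.g., rebelliousness
y = [0, 1, 1, 2, 2, 3, 4, 4]  # smoking (complete values, as in a simulation)
r = [1, 1, 1, 1, 0, 1, 0, 0]  # R = 1: smoking not missing, R = 0: missing

r_yz = pearson(y, z)  # how strongly Z is related to the variable of interest
r_zr = pearson(z, r)  # how strongly Z is related to missingness on Y
print(round(r_yz, 2), round(abs(r_zr), 2))
```

In real data the complete Y is not available, so rYZ can only be estimated from the cases where Y was observed.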

CSK Study Design (partial)

Simulations manipulated:
  amount of missingness (25% vs 50%)
  rYZ (r = .40, r = .90)

rZR held constant: r = .45 with 50% missing (applies to "MNAR-Linear" missingness)

CSK Results (partial) (MNAR Missingness)

25% missing, rYZ = .40 ... no problem
25% missing, rYZ = .90 ... no problem
50% missing, rYZ = .40 ... no problem
50% missing, rYZ = .90 ... problem

* "no problem" = bias does not interfere with inference

These results apply to the regression coefficient for X → Y with "MNAR-Linear" missingness (see CSK, 2001, Table 2)

But Even CSK Results Too Conservative

Not considered by CSK: rZR
In their simulation rZR = .45

Even with 50% missing and rYZ = .90 bias can be acceptably small

Graham et al. (2008): Bias acceptably small (standardized bias < 40) as long as rZR < .24

rZR < .24 Very Plausible

Study                                        rZR
-------------------------------------------  ----
HealthWise (Caldwell, Smith, et al., 2004)   .106
AAPT (Hansen & Graham, 1991)                 .093
Botvin1                                      .044
Botvin2                                      .078
Botvin3                                      .104

All of these yield standardized bias < 10 (estimated)

CSK and Follow-up Simulations

Results very promising

Suggest that even MNAR biases are often tolerably small

But these simulations still too narrow

Beginnings of a Taxonomy of Attrition

Causes of Attrition on Y (main DV)

Case 1: not Program (P), not Y, not PY interaction
Case 2: P only
Case 3: Y only . . . (CSK scenario)
Case 4: P and Y only

Graham, J. W. (2009). Annual Review of Psychology.

Beginnings of a Taxonomy of Attrition

Causes of Attrition on Y (main DV)

Case 5: PY interaction only
Case 6: P + PY interaction
Case 7: Y + PY interaction
Case 8: P, Y, and PY interaction
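The eight cases are simply every combination of three possible causes of attrition (P, Y, and the PY interaction) being present or absent. A small sketch, with hypothetical names, that enumerates them in the same order as the slides:

```python
from itertools import product

# Three possible causes of attrition on Y, ordered so that the enumeration
# below reproduces the case numbering used on the slides.
CAUSES = ("PY interaction", "Y", "P")

def attrition_cases():
    """Return {case number: causes present} for all 2**3 combinations."""
    cases = {}
    flag_sets = product((False, True), repeat=len(CAUSES))
    for number, flags in enumerate(flag_sets, start=1):
        present = tuple(c for c, on in zip(CAUSES, flags) if on)
        cases[number] = tuple(reversed(present))  # list P first, as on the slides
    return cases

cases = attrition_cases()
for number, present in cases.items():
    label = ", ".join(present) if present else "none of P, Y, PY interaction"
    print(f"Case {number}: {label}")
```

Cases 1-4 are the combinations without the PY interaction; Cases 5-8 are the same four combinations with it added.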

Taxonomy of Attrition

Cases 1-4: often little or no problem

Cases 5-8: jury still out (more research needed)
  Very likely not as much of a problem as previously thought
  Use diagnostics to shed light

Use of Missing Data Diagnostics

Diagnostics based on pretest data not much help
  Hard to predict missing distal outcomes from differences on pretest scores

Longitudinal Diagnostics can be much more helpful

Hedeker & Gibbons (1997)

Plot main DV over time for four groups:
  for Program and Control
  for those with and without last wave of data

Much can be learned
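The four-group diagnostic can be sketched as a table of means: the main DV averaged at each wave, separately for Program and Control and for those with and without data at the last wave. The record layout, the labels "stayer" and "dropper", and the toy numbers below are assumptions made for illustration; the resulting means are what one would plot over time.

```python
from collections import defaultdict

def diagnostic_means(records, last_wave):
    """Mean of the DV per wave for each (condition, stayer/dropper) group.

    records: list of dicts with 'condition' and 'scores' (wave -> value or None).
    A case is a 'stayer' if it has data at last_wave, otherwise a 'dropper'.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        has_last = rec["scores"].get(last_wave) is not None
        status = "stayer" if has_last else "dropper"
        for wave, value in rec["scores"].items():
            if value is not None:
                key = (rec["condition"], status, wave)
                sums[key] += value
                counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical toy data for illustration only.
records = [
    {"condition": "program", "scores": {0: 5.0, 1: 4.0, 3: 3.0}},
    {"condition": "program", "scores": {0: 5.0, 1: 3.0, 3: None}},
    {"condition": "control", "scores": {0: 5.0, 1: 5.0, 3: 4.0}},
    {"condition": "control", "scores": {0: 6.0, 1: 6.0, 3: None}},
]
means = diagnostic_means(records, last_wave=3)
for key in sorted(means):
    print(key, means[key])
```

Comparing the dropper and stayer trajectories within each condition is what reveals whether dropout is related to the DV differently in Program and Control.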

Empirical Examples

Hedeker & Gibbons (1997)
  Drug treatment of psychiatric patients

Hansen & Graham (1991)
  Adolescent Alcohol Prevention Trial (AAPT)
  Alcohol, smoking, other drug prevention among normal adolescents (7th – 11th grade)

Empirical Example Used by Hedeker & Gibbons (1997)

IV: Drug Treatment vs. Placebo Control
DV: Inpatient Multidimensional Psychiatric Scale (IMPS)
  1 = normal
  2 = borderline mentally ill
  3 = mildly ill
  4 = moderately ill
  5 = markedly ill
  6 = severely ill
  7 = among the most extremely ill

[Figure, from Hedeker & Gibbons (1997): IMPS (low = better outcomes; y-axis 2.5 to 5.5) by Weeks of Treatment (0, 1, 3, 6), plotted for Placebo Control and Drug Treatment]

Longitudinal Diagnostics: Hedeker & Gibbons Example

Treatment: droppers do BETTER than stayers
Control: droppers do WORSE than stayers
Example of Program X DV interaction
But in this case, the pattern would lead to suppression bias
Not as bad for internal validity in the presence of a significant program effect

AAPT (Hansen & Graham, 1991)

IV: Normative Education Program vs Information Only Control

DV: Cigarette Smoking (3-item scale)
  Measured at one-year intervals, 7th grade – 11th grade

[Figure: AAPT. Cigarette Smoking (high = more smoking; arbitrary scale) by grade, with two lines labeled Control and two lines labeled Program]

Longitudinal Diagnostics: AAPT Example

Treatment: droppers do WORSE than stayers (a little steeper increase)
Control: droppers do WORSE than stayers (a little steeper increase)
Little evidence for Prog X DV interaction
Very likely MAR methods allow good conclusions (CSK scenario holds)

Use of Auxiliary Variables

Reduces attrition bias
Restores some power lost due to attrition

What Is an Auxiliary Variable?

A variable correlated with the variables in your model
  but not part of the model
  not necessarily related to missingness
  used to "help" with missing data estimation

Best auxiliary variables: the same variable as the main DV, but measured at waves not used in the analysis model

Model of Interest

[Path diagram: X → Y, with a residual (res) on Y]

Benefit of Auxiliary Variables

Example from Graham & Collins (2010)

X  Y  Z
1  1  1    500 complete cases
1  0  1    500 cases missing Y

(1 = observed, 0 = missing)

X, Y variables in the model (Y sometimes missing)

Z is auxiliary variable


Effective sample size (N')

Analysis involving N cases, with auxiliary variable(s), gives statistical power equivalent to N' complete cases without auxiliary variables


It matters how highly Y and Z (the auxiliary variable) are correlated

For example (N = 500 gives the power of N' complete cases):

rYZ    N     N'    increase
.40    500   542    8%
.60    500   608   22%
.80    500   733   47%
.90    500   839   68%
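The percentage increases follow directly from the N' values. The sketch below only recomputes them from the numbers on the slide; it does not reproduce how Graham & Collins derived N' itself, and the names used are hypothetical.

```python
# (rYZ -> N') pairs as reported on the slide; N = 500 in every row.
N = 500
effective_n = {0.40: 542, 0.60: 608, 0.80: 733, 0.90: 839}

def percent_increase(n_effective, n=N):
    """Gain in effective sample size relative to N, in percent."""
    return 100.0 * (n_effective - n) / n

gains = {r_yz: round(percent_increase(n_eff)) for r_yz, n_eff in effective_n.items()}
for r_yz, gain in gains.items():
    print(f"rYZ = {r_yz:.2f}: N' = {effective_n[r_yz]} ({gain}% increase)")
```

The gain grows faster than rYZ itself: moving from .40 to .80 doubles the correlation but raises the gain from 8% to 47%.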

[Figure: Effective Sample Size by rYZ. Effective Sample Size (y-axis, 500 to 1000) plotted against rYZ (x-axis, 0.1 to 0.9)]

Conclusions

Attrition CAN be bad for internal validity
But often it's NOT nearly as bad as feared

Don't rush to conclusions, even with rather substantial attrition

Examine evidence (especially longitudinal diagnostics) before drawing conclusions

Use MI and ML missing data procedures!
Use good auxiliary variables to minimize the impact of attrition

Part 3: Illustration of Missing Data Analysis: Multiple Imputation with NORM and Proc MI

Multiple Imputation: Basic Steps

Impute

Analyze

Combine results

Imputation and Analysis

Impute 40 datasets
  a missing value gets a different imputed value in each dataset

Analyze each dataset with USUAL procedures, e.g., SAS, SPSS, LISREL, EQS, STATA, HLM

Save parameter estimates and SEs

Combine the Results: Parameter Estimates to Report

Average of estimate (b-weight) over 40 imputed datasets

Combine the Results: Standard Errors to Report

Weighted sum of:
  “within imputation” variance
    average squared standard error
    usual kind of variability
  “between imputation” variance
    sample variance of parameter estimates over 40 datasets
    variability due to missing data
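The combining step can be sketched with the standard pooling formulas known as Rubin's rules. The slide does not spell out the weights; the sketch assumes the usual total variance, within + (1 + 1/m) × between, where m is the number of imputed datasets. The function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance

def pool(estimates, standard_errors):
    """Pool one parameter over m imputed datasets (Rubin's rules).

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)                   # average b-weight
    within = mean([se ** 2 for se in standard_errors])  # average squared SE
    between = variance(estimates)                       # sample variance (m - 1)
    total = within + (1 + 1 / m) * between              # assumed weighting
    return pooled_estimate, math.sqrt(total)

# Hypothetical results from m = 5 imputed datasets (the slides use 40).
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
standard_errors = [0.10, 0.10, 0.10, 0.10, 0.10]
b, se = pool(estimates, standard_errors)
print(round(b, 3), round(se, 4))
```

The pooled SE is larger than the 0.10 reported within each dataset because the between-imputation term adds the uncertainty that comes from the missing data.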

Materials for SPSS Regression

Starting place: http://methodology.psu.edu
  downloads (you will need to get a free user ID to download all our free software)
  missing data software
    Joe Schafer's Missing Data Programs
    John Graham's Additional NORM Utilities
    http://mcgee.hhdev.psu.edu/missing/index.html

(this mcgee website is currently down, but I hope to have it up again in the Fall). Please email me with any questions.

exit for sample analysis
