Missing Data Issues in RCTs: What to Do When Data Are Missing? Analytic and Technical Support for Advancing Education Evaluations REL Directors Meeting February 8, 2008


Page 1

Missing Data Issues in RCTs: What to Do When Data Are Missing?

Analytic and Technical Support for Advancing Education Evaluations

REL Directors Meeting

February 8, 2008

Page 2

What Are We Doing?

Purpose: To provide guidance on how to deal with missing baseline and longitudinal follow-up data (including attrition from data collection)

Audience:
– Researchers measuring the impact of an educational intervention on student achievement using a randomized controlled trial (RCT) design
– Of particular benefit to study teams just entering the impact analysis phase of their projects

Page 3

Scope of the Monograph

Restricted to Missing Data Issues in RCTs:
– Missing data on outcome variables, control variables, and subgroup variables
– Excludes quasi-experimental studies; does not address attrition from the intervention

Focus on Analysis Strategies: We take the design and study sample as already fixed

Provide Practical Guidance: Intent is to provide solutions that RELs can actually implement (not cutting-edge or overly costly methods)

Page 4

Today’s Meeting

Status of the Monograph:
– We are just writing the first draft, due March 1st
– No solutions today – just an informal discussion

What do we want to accomplish today?
– Let you know where the paper is heading
– Get your feedback

What’s not here? What help do you need?

Page 5

General Approach to ITT Impact Estimation in the REL Studies

The RCTs are multi-site cluster randomized controlled trials with random assignment at the classroom or school level

Given this design, RELs are using two- or three-level hierarchical linear models (HLMs) to estimate impacts, e.g., with three levels: students at level 1, classrooms at level 2, and schools at level 3

Covariates are usually included at the student level and sometimes at the class or school level
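
As a rough illustration only, a three-level model of this kind might be written as below. The notation is assumed here and is not taken from the REL studies: Y is the outcome for student i in classroom j in school k, T is the treatment indicator (shown at the classroom level; it would carry only the school subscript under school-level assignment), X is a student-level covariate, and the last three terms are school, classroom, and student error components.

```latex
Y_{ijk} = \beta_0 + \beta_1 T_{jk} + \beta_2 X_{ijk} + v_k + u_{jk} + e_{ijk},
\qquad
v_k \sim N(0, \tau^2_{\text{school}}),\quad
u_{jk} \sim N(0, \tau^2_{\text{class}}),\quad
e_{ijk} \sim N(0, \sigma^2)
```

In this sketch, \beta_1 is the ITT impact estimate. A missing Y_{ijk} or X_{ijk} removes that student from the estimation sample unless one of the methods discussed later is applied.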

Page 6

Data Used in RCTs Conducted by the RELs

Outcome Measures:
– Most common outcomes are student test score measures
– But other outcomes include student attitudes, teacher practice, and teacher knowledge

Baseline Measures:
– Most RCTs are collecting baseline data to define subgroups and improve the precision of the impact estimates
– These data include pre-intervention test scores and demographic characteristics, e.g., age, gender, race, ethnicity, Limited English Proficiency, and eligibility for Free or Reduced Price Lunches

Page 7

Most Common Types of Data Collection for Creating Outcome Measures

1. Student tests conducted for the study (n = 20)

2. Teacher surveys (n = 16)

3. Student tests required by the state (n = 11)

4. Student surveys (n = 6)

5. Teacher tests (n = 5)

Page 8

Missing Data and Measurement Issues

Missing Data (focus of this presentation):
– Studies are likely to encounter missing values for one or more outcome variables, subgroup variables, or control variables

Other Measurement Issues (included in paper):
– Baseline measures collected after random assignment
– Outcome measures collected over time and perhaps on average later for one experimental group than the other
– Others? Pooling state test scores across states?

Page 9

Student Tests for the Study

Reasons for Missing Data:
– The student’s parent did not give consent
– The student failed to attend school on testing day
– The student transferred to another school
– The student’s classroom was unavailable for testing (e.g., fire drill)
– The student refused to take the test

Missing data rates should be low if tests are administered in the usual classroom setting (most will be)

Page 10

Student Tests Required by the State

Reasons for Missing Data:
– The student was absent and there was no follow-up testing
– The student was exempt from taking the state test
– The student’s school failed to provide test score data
– The student transferred to another school

Missing data rates should be low since data can be collected at district or state level, and only a small fraction of the sample will transfer outside of the district

Page 11

Student Surveys

Reasons for Missing Data:
– The student’s parent did not give consent
– The student’s teacher failed to administer the survey
– The student did not attend school on the day of the survey
– The student chose not to complete the survey
– The student transferred to another school

Missing data rates should be low if the survey is administered in school, but could be high if the survey is administered outside of school

Page 12

Teacher Surveys

Reasons for Missing Data:
– The teacher refused to complete the survey
– The teacher was on temporary leave
– The teacher left the school and never received the survey

Missing data rates can be high because teachers are busy and may have little incentive to complete the survey (unless required by the school or district)

Page 13

Teacher Tests

Some REL RCTs are testing teachers on their knowledge using tests conducted online or in school

Reasons for Missing Data:
– The teacher refused to complete the test
– The teacher was on temporary leave
– The teacher left the school and never received the test

Missing data rates can be high because teachers may find such tests offensive or onerous (unless required by the school or district)

Page 14

Why Worry About Missing Data?

Standard software typically drops cases with missing values on any analysis variable, but this can lead to biased impact estimates

To avoid this problem, researchers may choose to drop the variables for which some cases are missing, but this can have negative consequences

Page 15

Dropping Cases with Missing Data

Non-Response Bias: Dropping cases with missing data can lead to “non-response” bias if there is a relationship between the outcome and “missingness” (e.g., if student achievement is lower for students exempted from the state test)

Biased Impact Estimates: Dropping cases with missing data can lead to biased impact estimates if the rate of missing data or the mechanism behind the “missingness” differs between treatment and control (give example)
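
A small made-up illustration of the second point, written as a Python sketch. The scores are hypothetical and chosen so that the true impact is zero; the only difference between the groups is that the lowest scorer is missing in the treatment group.

```python
from statistics import mean


def impact_estimate(treatment, control):
    """Difference in mean outcomes after dropping missing (None) scores."""
    t = [y for y in treatment if y is not None]
    c = [y for y in control if y is not None]
    return mean(t) - mean(c)


# Hypothetical scores: identical in both groups, so the true impact is 0.
full_treatment = [40, 50, 60, 70, 80]
full_control = [40, 50, 60, 70, 80]

# Assume the lowest scorer is exempted from the test in the treatment group only.
observed_treatment = [None, 50, 60, 70, 80]

true_impact = impact_estimate(full_treatment, full_control)
naive_impact = impact_estimate(observed_treatment, full_control)
```

Here `true_impact` is 0, while `naive_impact` is 5 points, an "impact" created entirely by which cases were dropped.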

Page 16

Dropping Variables from the Analysis

Dropping either outcome variables (dependent variables) or subgroup variables because the data are missing for some cases is like throwing the baby out with the bath water!

Dropping control variables (independent variables) because data are missing will reduce the precision of the impact estimates (since controls are included to increase precision)
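
To see roughly why precision falls, consider the simplest case as an assumption: random assignment of n individual students (ignoring clustering), a fraction P assigned to treatment, outcome variance \sigma^2, and covariates that explain a share R^2 of that variance. The symbols are introduced here for illustration only.

```latex
\operatorname{Var}(\hat{\beta}_1) \approx \frac{\sigma^2 \, (1 - R^2)}{n \, P \, (1 - P)}
```

Dropping a strong covariate such as the pre-intervention test score lowers R^2 and so raises the variance of the impact estimate. Clustered designs have analogous terms at each level of the model.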

Page 17

Assessing the Size of the Problem

Learn why certain data are missing: In some cases this may shed light on whether non-response bias is likely to be severe

Compare respondents to nonrespondents using data available for both: For student outcomes, it is especially important to compare the two groups on their pre-intervention test scores
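
One way such a comparison could be summarized is a standardized difference in mean pre-intervention scores. This is a minimal sketch; the function name and the data are hypothetical, and in practice the comparison would usually be run separately for the treatment and control groups.

```python
from statistics import mean, stdev


def baseline_gap(pretest, responded):
    """Mean pretest score of respondents minus that of nonrespondents,
    in units of the full-sample standard deviation."""
    resp = [x for x, r in zip(pretest, responded) if r]
    nonresp = [x for x, r in zip(pretest, responded) if not r]
    return (mean(resp) - mean(nonresp)) / stdev(pretest)


# Hypothetical baseline scores and follow-up response flags.
pretest = [45, 50, 55, 60, 65, 70]
responded = [False, False, True, True, True, True]

gap = baseline_gap(pretest, responded)
```

A large positive `gap` would suggest that the students lost to follow-up were lower achieving at baseline, which is a warning sign for non-response bias.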

Page 18

Addressing the Problem: The Toolbox

All these methods rely on the information that we do have for the sample, but they vary in their assumptions and technical approach

Some methods work only for baseline covariates, some work only for outcome measures, and some work for both

Page 19

Different Methods for Baseline Covariates and Outcome Variables

Baseline Covariates (Control Variables) Only:
– Dummy variable indicators for cases with missing data
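
A minimal sketch of the dummy variable approach for one covariate, assuming missing values are coded as None. The function name and the choice of 0 as the fill value are illustrative; any constant works as long as the indicator is also in the model.

```python
def add_missing_indicator(values, fill=0.0):
    """Return (filled_values, missing_flags) for one baseline covariate."""
    filled = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags


# Hypothetical pretest scores with two missing cases.
pretest = [52.0, None, 61.0, None, 47.0]
pretest_filled, pretest_missing = add_missing_indicator(pretest)
```

Both `pretest_filled` and `pretest_missing` would then enter the impact model as covariates, so that no case is dropped because of the missing pretest.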

Outcome Variables Only:
– Weighting methods: Re-weight respondents to better represent the population of interest (e.g., weight “up” groups with high rates of missing data)
– Bounding the impacts: Make assumptions about nonrespondents that maximize and minimize the estimated impacts (e.g., make best and worst case assumptions for the true values of the outcome when data are missing)
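
The two outcome-only methods above can be sketched as follows. This is an illustration under stated assumptions, not a recommended implementation: weights are computed within hypothetical cells defined by a baseline characteristic, and the bounds fill missing outcomes with the lowest and highest possible scores (here 0 and 100).

```python
from collections import Counter
from statistics import mean


def nonresponse_weights(cells, responded):
    """Weight each respondent by (cell size / respondents in cell);
    nonrespondents get weight 0."""
    total = Counter(cells)
    resp = Counter(c for c, r in zip(cells, responded) if r)
    return [total[c] / resp[c] if r else 0.0
            for c, r in zip(cells, responded)]


def impact_bounds(treatment, control, lo, hi):
    """Worst- and best-case impacts, filling missing (None) outcomes
    with the extreme values lo and hi."""
    def fill(ys, value):
        return [value if y is None else y for y in ys]
    lower = mean(fill(treatment, lo)) - mean(fill(control, hi))
    upper = mean(fill(treatment, hi)) - mean(fill(control, lo))
    return lower, upper


# Hypothetical cells: half of the LEP students are missing the outcome.
cells = ["LEP", "LEP", "LEP", "LEP", "non-LEP", "non-LEP"]
responded = [True, True, False, False, True, True]
weights = nonresponse_weights(cells, responded)

# Hypothetical outcomes on a 0-100 scale, one missing case per group.
bounds = impact_bounds([60, 70, None, 80], [50, None, 60, 70], lo=0, hi=100)
```

In this example the responding LEP students each get a weight of 2.0, so the weighted sample matches the original cell sizes. The bounds come out at (-17.5, 32.5), which shows how wide best and worst case bounds can be even with modest missing data rates.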

Page 20

Methods for Both Types of Variables

Imputation-based Procedures:

– Mean value imputation: Replace missing values with the mean for the variable

– Regression imputation: Replace missing values with a predicted value from a regression model

– Stochastic regression imputation: Add a randomly drawn residual to the predicted value so that the imputed values preserve the variance of the variable. Stochastic regression imputation can be implemented as a single imputation or a multiple imputation

Model-based Procedures: This is a broad class of procedures that includes maximum-likelihood-based methods, the EM algorithm, and pattern-mixture models (we are still investigating)
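
The three imputation-based procedures listed above can be sketched with a single predictor. Everything here is an illustrative assumption: the function names, the use of one covariate, and the choice to draw residuals by resampling the observed ones (drawing from a normal distribution is another common option).

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def impute(x, y, method="mean", rng=None):
    """Fill missing (None) values of y using cases where y is observed.
    method is "mean", "regression", or "stochastic"."""
    rng = rng or random.Random()
    obs = [(a, b) for a, b in zip(x, y) if b is not None]
    xs = [a for a, _ in obs]
    ys = [b for _, b in obs]
    if method == "mean":
        ybar = mean(ys)
        return [ybar if b is None else b for b in y]
    intercept, slope = fit_line(xs, ys)
    residuals = [b - (intercept + slope * a) for a, b in obs]
    filled = []
    for a, b in zip(x, y):
        if b is None:
            b = intercept + slope * a
            if method == "stochastic":
                b += rng.choice(residuals)  # resample an observed residual
        filled.append(b)
    return filled


# Hypothetical data: x is a pretest score, y a posttest with one missing value.
x = [1, 2, 3, 4, 5]
y = [2, 5, 6, 9, None]

by_mean = impute(x, y, "mean")
by_regression = impute(x, y, "regression")
by_stochastic = impute(x, y, "stochastic", rng=random.Random(0))
```

Mean imputation ignores the pretest and fills in the observed mean, while regression imputation uses the fitted line. Calling the stochastic version several times with different seeds gives several completed data sets, which is the starting point for multiple imputation.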

Page 21

How Do We Plan To Recommend Strategies?

Our current thinking is to assess alternatives on the basis of four criteria:

1. Accessibility: Is this “tool” something that can be done with standard software?

2. Bias reduction: How effective is the method likely to be in addressing non-response bias?

3. Correct inference: Will the tool produce standard errors that are at least “not too biased”?

4. Power: Does this method generate estimates that are reasonably precise (relative to alternative options)?

Page 22

Discussion

Time for your feedback:
– What’s not here?
– What help do you need?