What you don’t see can’t matter?
The effects of unobserved differences on causal attributions
Robert Coe
CEM, Durham University
Randomised Controlled Trials in the Social Sciences, York, September 2009
The problem …
• Evaluations of education policy initiatives often use regression methods to control for initial differences
• They often seem to make clear and confident causal claims about programme effects
• They seldom seem to consider other possible explanations (e.g. unobserved differences)
• If regression models incorporate lots of explanatory variables and have a high R², does it matter if there might have been initial unobserved differences?
Are you a researcher or a policy maker?
• Would you rather have an answer to your question that is
– Simple, clear and confident, but probably wrong
– Complex, nuanced, and carefully justified, but completely useless for making a decision?
Model simulated
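[Path diagram linking the measured variable (M), the unmeasured variable (U), the outcome variable (Y) and the treatment group (X); the paths are labelled p, q, r, s and d.]
Observed relationship: $Y = \beta_0 M + d\,(X - \bar{X}) + e$
True relationship: $Y = \beta_0 M + \beta_1 U + e$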
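The link between the two relationships can be made explicit with the standard omitted-variable argument. The sketch below is not taken from the slides. It writes the two coefficients of the true relationship as $\beta_0$ and $\beta_1$, introduces $\gamma$, $\delta$ and $v$ as assumed notation for an auxiliary regression of U on M and X, and assumes that the error $e$ is uncorrelated with M, U and X (so X has no genuine effect on Y).

```latex
\begin{align*}
U &= \gamma M + \delta\,(X - \bar{X}) + v
   && \text{(auxiliary regression of } U \text{ on } M \text{ and } X\text{)} \\
Y &= \beta_0 M + \beta_1 U + e
   && \text{(true relationship)} \\
  &= (\beta_0 + \beta_1 \gamma)\, M + \beta_1 \delta\,(X - \bar{X}) + (\beta_1 v + e) \\
\Rightarrow\quad d &= \beta_1\, \delta
   && \text{(apparent effect of } X \text{ when } U \text{ is omitted)}
\end{align*}
```

Under these assumptions the apparent effect d is non-zero whenever U still predicts Y after controlling for M ($\beta_1 \neq 0$) and group membership still predicts U after controlling for M ($\delta \neq 0$). If Y, M and U are all standardised, with r, p and q the Y–M, Y–U and M–U correlations, then $\beta_1 = (p - rq)/(1 - q^2)$.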
The simulation
1. Generate random variables Y, M, U with desired inter-correlations (p, q, r)
2. Assign group membership (X) to be correlated with U
3. Apply regression model (Y on M and X) to estimate spurious ‘effect’ of X on Y if U were omitted
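The three steps can be sketched in Python as follows. This is a minimal illustration, not the code used for the talk. It assumes NumPy, standardised normal variables, r, p and q as the Y–M, Y–U and M–U correlations, and one particular way of assigning groups: a split at zero on a latent score whose correlation with U is s. The function name `spurious_effect` and its defaults are hypothetical.

```python
import numpy as np


def spurious_effect(r, p, q, s, n=100_000, seed=0):
    """Estimate the apparent 'effect' of X on Y when U is omitted.

    r: corr(Y, M), p: corr(Y, U), q: corr(M, U) (assumed labelling).
    s: corr between U and a latent score used to form groups (assumption).
    """
    rng = np.random.default_rng(seed)

    # Step 1: draw standardised Y, M, U with the desired inter-correlations.
    corr = np.array([[1.0, r, p],
                     [r, 1.0, q],
                     [p, q, 1.0]])
    y, m, u = rng.multivariate_normal(np.zeros(3), corr, size=n).T

    # Step 2: assign group membership X so that it is correlated with U.
    # Assumption: X = 1 when a latent score (correlated s with U) is above 0.
    latent = s * u + np.sqrt(1.0 - s**2) * rng.standard_normal(n)
    x = (latent > 0).astype(float)

    # Step 3: regress Y on M and centred X, leaving U out of the model.
    design = np.column_stack([np.ones(n), m, x - x.mean()])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    # Coefficient on X: group difference in Y (in SD units of Y), given M.
    return coef[2]


if __name__ == "__main__":
    for s in (0.0, 0.2, 0.4, 0.6, 0.8):
        d = spurious_effect(r=0.8, p=0.7, q=0.6, s=s)
        print(f"s = {s:.1f}  apparent effect = {d:.3f}")
```

Because X is a two-group indicator formed from the latent score, its correlation with U is somewhat smaller than s, so values from this sketch need not match those in the talk. Y has no dependence on X in the generated data, so any non-zero coefficient on X is spurious by construction.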
Example evaluations
• Effects of study support on achievement
– MacBeath et al. (2001) ‘The Impact of Study Support: A report of a longitudinal study into the impact of participation in out-of-school-hours learning on the academic attainment, attitudes and school attendance of secondary school students’. Published by the Department for Education and Skills.
• Effects of gifted and talented provision on achievement
– Kendall et al. (2005) ‘Excellence in Cities: The National Evaluation of a Policy to Raise Standards in Urban Schools 2000-2003’. Published by the Department for Education and Skills.
• Effects of Assisted Places on achievement
– Power et al. (2006) ‘The Educational and Career Trajectories of Assisted Place Holders’. Published by the Sutton Trust.
Summary of the examples
Study | MacBeath et al. (2001) | Kendall et al. (2005) | Power et al. (2006)
Intervention / programme | Study support (Y11 Easter School) | Gifted & Talented provision | Assisted Places Scheme
Outcome(s) | GCSE English; GCSE maths | KS3 average level; GCSE capped 8 score | A-level points
Covariates | KS3 SATs average, gender, school type | Prior attainment, FSM status, gender, ethnicity | Parents’ SES and education
R² in the model | 63%; 70% | 66%; 80% | 30%
Estimate of the effect, from regression model | 0.18; 0.11 | 0.30; 0.22 | 0.51
Results of the simulation
[Figure: Estimates of spurious 'effect' of Gifted & Talented provision at KS3 (R² = 0.64, r_ym = 0.8). Horizontal axis: s_ux, the correlation between unmeasured variable(s) and treatment group membership, from 0.0 to 0.8. Vertical axis: apparent effect size, E, from 0.00 to 0.50. One line for each of four parameter settings: p_uy = 0.7, q_um = 0.6; p_uy = 0.7, q_um = 0.7; p_uy = 0.6, q_um = 0.6; p_uy = 0.6, q_um = 0.7.]
r_ym = correlation between outcome and measured variable(s)
p_uy = correlation between outcome and unmeasured variable(s)
q_um = correlation between measured and unmeasured variable(s)
Summary of simulations
Study | MacBeath et al. (2001) | Kendall et al. (2005) | Power et al. (2006)
Possible relevant unmeasured variable(s) | Socioeconomic status; Motivation; Self-discipline | Attainment used to identify G&T status | Performance on entrance test
Range of possible spurious effects | 0.0 – 0.19 | 0.0 – 0.45; -0.06 – 0.40 | 0.0 – 1.0
Range of likely spurious effects | 0.05 – 0.13 | 0.16 – 0.32; 0.10 – 0.21 | 0.3 – 0.5
Best guess at spurious effect | 0.08 | 0.26; 0.17 | 0.37
Interpretations
Interpretation given by the researchers
– MacBeath et al. (2001): ‘Study support can improve attainment in Maths and English by half a grade’
– Kendall et al. (2005): ‘Pupils designated as gifted and talented had higher levels of attainment at the end of Key Stages 3 and 4 than those of otherwise similar pupils not designated.’
– Power et al. (2006): ‘AP holders did better attending a private school than if they had gone to a state school’
Justified conclusion, taking account of bias due to omitted factors
– MacBeath et al. (2001): Possible small residual effect (0.10) on English but pretty much no genuine effect on maths
– Kendall et al. (2005): Any genuine effect for both outcomes is very close to zero
– Power et al. (2006): Possible positive effect (0.14), but much caution and uncertainty surrounds this
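The figures in the justified conclusions appear to follow from subtracting the best-guess spurious effect from the regression estimate reported for each outcome. The sketch below reproduces that arithmetic using the numbers from the two summary tables. It assumes that the single study-support best guess (0.08) applies to both GCSE outcomes and that the two Gifted & Talented best guesses (0.26; 0.17) refer to KS3 and GCSE in that order. The dictionary names and keys are hypothetical labels.

```python
# Regression estimates reported in the evaluations (summary of the examples).
reported = {
    "Study support, GCSE English": 0.18,
    "Study support, GCSE maths": 0.11,
    "G&T, KS3 average level": 0.30,
    "G&T, GCSE capped 8 score": 0.22,
    "Assisted Places, A-level points": 0.51,
}

# Best guess at the spurious effect (summary of simulations).
# Assumption: 0.08 is applied to both study-support outcomes.
best_guess_spurious = {
    "Study support, GCSE English": 0.08,
    "Study support, GCSE maths": 0.08,
    "G&T, KS3 average level": 0.26,
    "G&T, GCSE capped 8 score": 0.17,
    "Assisted Places, A-level points": 0.37,
}

# Residual effect once the best-guess bias is removed, rounded to 2 d.p.
adjusted = {
    outcome: round(reported[outcome] - best_guess_spurious[outcome], 2)
    for outcome in reported
}

if __name__ == "__main__":
    for outcome, value in adjusted.items():
        print(f"{outcome}: {value:.2f}")
```

On this reading the residual effects are 0.10 for GCSE English and 0.14 for A-level points, matching the values quoted in the justified conclusions, with the remaining three at 0.05 or below.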
Conclusions
• Even a high R² from the variables you have does not necessarily mean you can ignore those you have not measured
• The size of spurious effects is quite sensitive to assumptions about parameter values, which are themselves contentious
• Which is all the more reason to consider unobserved variables
• The examples considered are otherwise of relatively high quality
Recommendations
• Need to replicate, simplify application, extend to other methods
• Be more cautious about making – and believing – causal claims based on statistical control
• Before interpreting as causal:
– List possible alternative factors
– Evaluate the case for their impact
• Use stronger designs