Non-Experimental Data: Natural Experiments
and more on IV
Non-Experimental Data
• Refers to all data that has not been collected as part of experiment
• Quality of analysis depends on how well one can deal with problems of:
– Omitted variables
– Reverse causality
– Measurement error
– Selection
• Or… how close one can get to experimental conditions
Natural/ ‘Quasi’ Experiments
• Used to refer to situation that is not experimental but is ‘as if’ it was
• Not a precise definition – saying your data is a ‘natural experiment’ makes it sound better
• Refers to case where variation in X is ‘good variation’ (directly or indirectly via instrument)
• A Famous Example: London, 1854
The Case of the Broad Street Pump
• Regular cholera epidemics in 19th century London
• Widely believed to be caused by ‘bad air’
• John Snow thought ‘bad water’ was cause
• Experimental design would be to randomly give some people good water and some bad water
• Ethical Problems with this
Soho Outbreak August/September 1854
• People closest to Broad Street Pump most likely to die
• But breathe same air so does not resolve air vs. water hypothesis
• Nearby workhouse had own well and few deaths
• Nearby brewery had own well and no deaths (workers all drank beer)
Why is this a Natural experiment?
• Variation in water supply ‘as if’ it had been randomly assigned – other factors (‘air’) held constant
• Can then estimate treatment effect using difference in means
• Or run regression of death on water source, distance to pump, other factors
• Strongly suggests water the cause
• Woman died in Hampstead, niece in Islington
What’s that got to do with it?
• Aunt liked taste of water from Broad Street pump
• Had it delivered every day
• Niece had visited her
• Investigation of well found contamination by sewer
• This is non-experimental data but analysed in a way that makes a very powerful case – no theory either
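The difference-in-means estimate mentioned for the Snow example can be sketched as follows. The data are simulated: the death probabilities, the sample size and the 50/50 split are invented for illustration and are not Snow's figures.

```python
import random

random.seed(1854)

# Hypothetical death probabilities by water source (illustrative only).
P_DEATH_PUMP, P_DEATH_OTHER = 0.30, 0.05

households = []
for _ in range(2000):
    pump = random.random() < 0.5              # water source 'as if' randomly assigned
    p = P_DEATH_PUMP if pump else P_DEATH_OTHER
    died = 1 if random.random() < p else 0
    households.append((pump, died))

def mean(values):
    return sum(values) / len(values)

treated = [died for pump, died in households if pump]
control = [died for pump, died in households if not pump]

# With a binary treatment, the difference in means equals the OLS slope
# from a regression of the outcome on a treatment dummy.
effect = mean(treated) - mean(control)
print(f"death rate, pump users:  {mean(treated):.3f}")
print(f"death rate, other water: {mean(control):.3f}")
print(f"estimated effect:        {effect:.3f}")
```

The comparison is only informative because the other factors ('air') are the same for both groups by construction.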
Methods for Analysing Data from Natural Experiments
• If data is ‘as if’ it were experimental then can use all techniques described for experimental data
– OLS (perhaps Snow case)
– IV to get appropriate units of measurement
• Will say more about IV than OLS
– IV perhaps more common
– If can use OLS, nothing more to say
– With IV there is more to say – weak instruments
Conditions for Instrument Validity
• To be valid instrument:
– Must be correlated with X – testable
– Must be uncorrelated with ‘error’ – untestable – have to argue case for this assumption
• These conditions guaranteed with instrument for experimental data
• But more problematic for data from quasi-experiments
Bombs, Bones and Breakpoints: The Geography of Economic Activity
Davis and Weinstein, AER, 2002
• Existence of agglomerations (e.g. cities) a puzzle
• Land and labour costs higher so why don’t firms relocate to increase profits?
• Must be some compensatory productivity effect
• Different hypotheses about this:
– Locational fundamentals
– Increasing returns (Krugman) – path-dependence
Testing these Hypotheses
• Consider a temporary shock to city population
• Locational fundamentals theory would predict no permanent effect
• Increasing returns would suggest permanent effect
• Would like to do experiment of randomly assigning shocks to city size
• This is not going to happen
The Davis-Weinstein idea
• Use US bombing of Japanese cities in WW2
• This is a ‘natural experiment’ not a true experiment because:
– WW2 not caused by desire to test theories of economic geography
– Pattern of US bombing not random
• Sample is 303 Japanese cities, data is:
– Population before and after bombing
– Measures of destruction
Basic Equation
$$\Delta s_{i,60-47} = \beta_0 + \beta_1 \Delta s_{i,47-40} + \beta_2 x_i + \varepsilon_i$$
• Δs_{i,47-40} is change in population between just before and just after the war
• Δs_{i,60-47} is change in population over the later period
• How to test hypotheses:
– Locational fundamentals predicts β1=-1
– Increasing returns predicts β1=0
The IV approach
• Δs_{i,47-40} might be influenced by both permanent and temporary factors
• Only want part that is transitory shock caused by war damage
• Instrument Δs_{i,47-40} by measures of death and destruction
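A minimal simulation of why the instrument matters here. Everything in it is an assumption made for illustration: the data-generating process, the parameter values, a single destruction measure in [0, 1], and the omission of the control variables x. It is not the Davis-Weinstein data or specification. The true β1 is set to -1, and an unobserved permanent growth trend contaminates the wartime change.

```python
import random

random.seed(1945)
N_CITIES = 303
BETA1 = -1.0   # assumed truth: the war shock is fully reversed

def cov(a, b):
    """Sample covariance (divisor N); cov(a, a) is the variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

destruction, ds_47_40, ds_60_47 = [], [], []
for _ in range(N_CITIES):
    z = random.random()                        # hypothetical destruction measure
    trend = random.gauss(0, 0.2)               # permanent, unobserved growth factor
    shock = -0.8 * z + random.gauss(0, 0.05)   # transitory shock caused by war damage
    destruction.append(z)
    ds_47_40.append(shock + trend)
    ds_60_47.append(BETA1 * shock + trend + random.gauss(0, 0.05))

# OLS mixes the permanent and transitory parts of the wartime change.
b_ols = cov(ds_47_40, ds_60_47) / cov(ds_47_40, ds_47_40)
# IV uses only the variation in the wartime change associated with destruction.
b_iv = cov(destruction, ds_60_47) / cov(destruction, ds_47_40)

print(f"OLS estimate of beta1: {b_ols:.2f}")
print(f"IV estimate of beta1:  {b_iv:.2f}")
```

In this set-up OLS is pulled towards zero by the permanent component, while IV should land near the assumed value of -1.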
The First-Stage: Correlation of Δs_{i,47-40} with Z
Why Do We Need First-Stage?
• Establishes instrument relevance – correlation of X and Z
• Gives an idea of how strong this correlation is – ‘weak instrument’ problem
• In this case reported first-stage is not obviously the one implicit in the IV estimates that follow
– That would be bad practice
The IV Estimates
Why Are these other variables included?
• Potential criticisms of instrument exogeneity
– Government post-war reconstruction expenses correlated with destruction and had an effect on population growth
– US bombing heavier of cities of strategic importance (perhaps they had higher growth rates)
• Inclusion of the extra variables designed to head off these criticisms
• Assumption is that of exogeneity conditional on the inclusion of these variables
• Conclusion favours locational fundamentals view
An additional piece of supporting evidence….
• Always trying to build a strong evidence base – many potential ways to do this, not just estimating equations
The Problem of Weak Instruments
• Say that instruments are ‘weak’ if correlation between X and Z low (after inclusion of other exogenous variables)
• Rule of thumb - If F-statistic on instruments in first-stage less than 10 then may be problem (will explain this a bit later)
Why Do Weak Instruments Matter?
• A whole range of problems tend to arise if instruments are weak
• Asymptotic problems:
– High asymptotic variance
– Small departures from instrument exogeneity lead to big inconsistencies
• Finite-Sample Problems:
– Small-sample distribution may be very different from asymptotic one
  • May be large bias
  • Computed variance may be wrong
  • Distribution may be very different from normal
Asymptotic Problems I: Low precision
• Asymptotic variance of IV estimator is larger the weaker the instruments
• Intuition – variance in any estimator tends to be lower the bigger the variation in X – think of σ²(X’X)⁻¹
• IV only uses variation in X that is associated with Z
• As instruments get weaker using less and less variation in X
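For the case of one regressor and one instrument with homoskedastic errors, the point can be made explicit with the standard textbook expressions. The symbol ρ_XZ, the correlation between X and Z, is notation introduced here; the OLS expression is shown only as the benchmark that uses all the variation in X.

$$\text{Avar}(\hat{\beta}_{OLS}) = \frac{\sigma^2}{N\sigma_X^2}, \qquad \text{Avar}(\hat{\beta}_{IV}) = \frac{\sigma^2}{N\sigma_X^2\,\rho_{XZ}^2}$$

So a correlation of ρ_XZ = 0.1 inflates the asymptotic variance by a factor of 1/0.1² = 100 relative to the benchmark.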
Asymptotic Problems II: Small Departures from Instrument Exogeneity Lead to Big Inconsistencies
• Suppose true causal model is
y=Xβ+Zγ+ε
So possibly direct effect of Z on y.
• Instrument exogeneity is γ=0.
• Obviously want this to be zero but might hope that no big problem if ‘close to zero’ – a small deviation from exogeneity
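But this will not be the case if instruments weak… consider just-identified case
$$\hat{\beta}_{IV} = (Z'X)^{-1} Z'y$$
$$\hat{\beta}_{IV} = \beta + (Z'X)^{-1} Z'Z\gamma + (Z'X)^{-1} Z'\varepsilon$$
$$\text{plim}\,\hat{\beta}_{IV} = \beta + \left[\text{plim}\,\frac{1}{N}Z'X\right]^{-1}\left[\text{plim}\,\frac{1}{N}Z'Z\right]\gamma = \beta + \Sigma_{ZX}^{-1}\Sigma_{ZZ}\gamma$$
• If instruments weak then Σ_ZX small so Σ_ZX⁻¹ large so γ multiplied by a large number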
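The result can be checked numerically in the scalar case. The sketch below uses an assumed data-generating process with a unit-variance instrument, so that Σ_ZZ = 1 and Σ_ZX equals the first-stage coefficient (called `pi` here, a name introduced for the example). The predicted inconsistency is then γ/π.

```python
import random

def iv_slope(y, x, z):
    """Just-identified IV slope with an intercept: cov(z, y) / cov(z, x)."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    czy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    czx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return czy / czx

def iv_error(pi, gamma, beta=1.0, n=100_000, seed=0):
    """Simulate y = beta*x + gamma*z + eps and return the IV estimate minus beta."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        eps = rng.gauss(0, 0.5)
        xi = pi * zi + 0.5 * eps + rng.gauss(0, 1)   # x is endogenous
        y.append(beta * xi + gamma * zi + eps)
        x.append(xi)
        z.append(zi)
    return iv_slope(y, x, z) - beta

GAMMA = 0.02   # small direct effect of z on y (assumed)
error_strong = iv_error(pi=1.0, gamma=GAMMA)
error_weak = iv_error(pi=0.1, gamma=GAMMA)
print(f"strong instrument: error {error_strong:.3f}, predicted {GAMMA / 1.0:.3f}")
print(f"weak instrument:   error {error_weak:.3f}, predicted {GAMMA / 0.1:.3f}")
```

The same small γ produces an error roughly ten times larger when the first-stage coefficient is ten times smaller.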
An Example: The Return to Education
• Economists long-interested in whether investment in human capital a ‘good’ investment
• Some theory shows that coefficient on s in regression:
y=β0+β1s+β2x+ε
is measure of rate of return to education
• OLS estimates around 8% – suggests very good investment
• Might be liquidity constraints
• Might be bias
Potential Sources of Bias
• Most commonly mentioned is ‘ability bias’
• Ability correlated with earnings independent of education
• Ability correlated with education
• If ability omitted from ‘x’ variables then usual formula for omitted variables bias suggests upward bias in OLS estimate
Potential Solution
• Find an instrument correlated with education but uncorrelated with ‘ability’ (or other excluded variables)
• Angrist-Krueger, “Does Compulsory School Attendance Affect Schooling and Earnings?”, QJE 1991, suggest using quarter of birth
• Argue correlated with education because of school start age policies and school leaving laws (instrument relevance)
• Don’t have to accept this – can test it
A graphical version of first-stage (correlation between education and Z)
In this case…
• Their instrument is binary so IV estimator can be written in Wald form
• And this leads to following expression for potential inconsistency:
$$\text{plim}\,\hat{\beta}_{IV} = \frac{E(y|Z=1)-E(y|Z=0)}{E(X|Z=1)-E(X|Z=0)} = \beta + \frac{\gamma}{E(X|Z=1)-E(X|Z=0)}$$
• Note denominator is difference in schooling for those born in first- and other quarters
• Instrument will be ‘weak’ if this difference is small
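The Wald form is easy to compute directly. The sketch below is a minimal illustration on four made-up observations, chosen so that the first-quarter group has 0.1 fewer years of schooling; it is not the Angrist-Krueger data.

```python
def mean(values):
    return sum(values) / len(values)

def wald_estimate(y, x, z):
    """IV estimate for a binary instrument z (coded 0/1).

    Returns the estimate and the first-stage gap E(X|Z=1) - E(X|Z=0).
    """
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    x1 = mean([xi for xi, zi in zip(x, z) if zi == 1])
    x0 = mean([xi for xi, zi in zip(x, z) if zi == 0])
    first_stage_gap = x1 - x0
    return (y1 - y0) / first_stage_gap, first_stage_gap

# Hypothetical toy data: z = 1 for first-quarter births.
z = [1, 1, 0, 0]
x = [11.8, 12.0, 11.9, 12.1]        # years of schooling
y = [2.380, 2.404, 2.390, 2.410]    # log earnings

beta_iv, gap = wald_estimate(y, x, z)
print(f"first-stage gap: {gap:.3f}")
print(f"Wald estimate:   {beta_iv:.3f}")
```

Because the gap in the denominator is only -0.1, any error in the numerator is scaled up tenfold in magnitude, which is the weak-instrument concern.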
Their Results
Interpretation (and Potential Criticism)
• IV estimates not much below OLS estimates (higher in one case)
• Suggests ‘ability bias’ no big deal
• But instrument is weak
• Being born in 1st quarter reduces education by 0.1 years
• Means ‘γ’ will be multiplied by 10
But why should we have γ≠0?
• Remember this would imply a direct effect of quarter of birth on earnings, not just one that works through the effect on education
• Bound, Jaeger and Baker argued that there is evidence that quarter of birth is correlated with:
– Mental and physical health
– Socioeconomic status of parents
• Unlikely that any effects are large but don’t have to be when instruments are weak
An example: UK data
[Figure: Variation in Socioeconomic Status of Parents by Birth Month. Vertical axis: Fraction of Kids with HoH Managerial/Professional. Horizontal axis: Month of Birth of Child (1 to 12).]
Effect is small but significantly different from zero
A Back-of-the-Envelope Calculation
• Being born in first quarter means 0.01 less likely to have a managerial/professional parent
• Being a manager/professional raises log earnings by 0.64
• Correlation between earnings of children and parents 0.4
• Effect on earnings through this route 0.01*0.64*0.4=0.00256 i.e. ¼ of 1 per cent
• Small but weak instrument causes effect on inconsistency of IV estimate to be multiplied by 10 – 0.0256
• Now large relative to OLS estimate of 0.08
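The arithmetic can be reproduced in a few lines. The inputs are the figures quoted on the slides; the signs are an interpretation added here (first-quarter birth lowers both the chance of a managerial/professional parent and years of education), and the variable names are hypothetical.

```python
# Signed inputs, with Z = 1 for first-quarter births.
d_parent_prof = -0.01   # change in P(managerial/professional parent)
prof_premium = 0.64     # log-earnings premium for managers/professionals
intergen_corr = 0.4     # correlation between earnings of children and parents
d_schooling = -0.1      # first-stage difference in years of education
ols_return = 0.08       # OLS estimate of the return to education

# Direct effect of Z on log earnings through parental status.
gamma = d_parent_prof * prof_premium * intergen_corr
# Inconsistency of the IV estimate: gamma divided by the Wald denominator.
inconsistency = gamma / d_schooling

print(f"gamma:                    {gamma:.5f}")
print(f"inconsistency of IV:      {inconsistency:.4f}")
print(f"as share of OLS estimate: {inconsistency / ols_return:.2f}")
```

Under these sign assumptions the two negatives cancel, so the IV estimate would be overstated by about 0.0256, roughly a third of the OLS estimate.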
Summary
• Small deviations from instrument exogeneity lead to big inconsistencies in IV estimate if instruments are weak
• Suspect this is often of great practical importance
• Quite common to use ‘odd’ instrument – argue that ‘no reason to believe’ it is correlated with ε but show correlation with X
Finite Sample Problems
• This is a very complicated topic
• Exact results for special cases, approximations for more general cases
• Hard to say anything that is definitely true but can give useful guidance
• Problems in 3 areas
– Bias
– Incorrect measurement of variance
– Non-normal distribution
• But really all different symptoms of same thing
Review and Reminder
• If ask STATA to estimate equation by IV:
• Coefficients computed using formula given
• Standard errors computed using formula for asymptotic variance
• T-statistics, confidence intervals and p-values computed using assumption that estimator is unbiased with variance as computed and normally distributed
• All are asymptotic results
Difference between asymptotic and finite-sample distributions
• This is normal case
• Only in special cases e.g. linear regression model with normally distributed errors are small-sample and asymptotic distributions the same.
• Difference likely to be bigger
– The smaller the sample size
– The weaker the instruments
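A small Monte Carlo experiment gives a feel for this. The design is an assumption made for illustration: one instrument, sample size 100, error correlation 0.8, and first-stage coefficients of 1.0 and 0.05. Quantile-based summaries are used because the just-identified IV estimator can have very heavy tails.

```python
import random
import statistics

def iv_estimate(y, x, z):
    """Just-identified IV slope with an intercept."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    czy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    czx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return czy / czx

def monte_carlo(pi, beta=1.0, rho=0.8, n=100, reps=1000, seed=0):
    """Return IV estimates from repeated samples with first-stage slope pi."""
    rng = random.Random(seed)
    estimates = []
    for _rep in range(reps):
        y, x, z = [], [], []
        for _obs in range(n):
            zi = rng.gauss(0, 1)
            eps = rng.gauss(0, 1)
            v = rho * eps + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
            xi = pi * zi + v              # x is endogenous when rho != 0
            x.append(xi)
            z.append(zi)
            y.append(beta * xi + eps)
        estimates.append(iv_estimate(y, x, z))
    return estimates

def summarize(estimates):
    """Median and interquartile range of the simulated estimates."""
    q1, med, q3 = statistics.quantiles(estimates, n=4)
    return med, q3 - q1

med_strong, iqr_strong = summarize(monte_carlo(pi=1.0))
med_weak, iqr_weak = summarize(monte_carlo(pi=0.05))
print(f"strong instrument: median {med_strong:.2f}, IQR {iqr_strong:.2f}")
print(f"weak instrument:   median {med_weak:.2f}, IQR {iqr_weak:.2f}")
```

With the strong instrument the estimates should cluster tightly around the true value of 1. With the weak instrument the distribution is typically far more dispersed, and its centre tends to drift towards the OLS value.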
Rule of Thumb for Weak Instruments
• F-test for instruments in first-stage >10
• Stricter than conventional significance e.g. if one instrument F=10 equivalent to t≈3.2 (F=t² with a single instrument)
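The first-stage F-statistic for a single instrument can be computed by hand as a check on the rule of thumb. The sketch below uses simulated data with assumed first-stage coefficients (1.0 for a strong instrument, 0.05 for a weak one) and confirms that F equals the squared t-statistic.

```python
import math
import random

def first_stage(x, z):
    """OLS of x on a constant and a single instrument z.

    Returns (slope, t-statistic, F-statistic); with one instrument F = t^2.
    """
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    szx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = mx - slope * mz
    rss = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    tss = sum((xi - mx) ** 2 for xi in x)
    se = math.sqrt(rss / (n - 2) / szz)
    t_stat = slope / se
    f_stat = (tss - rss) / (rss / (n - 2))   # F-test of 'slope = 0'
    return slope, t_stat, f_stat

def simulate(pi, n=200, seed=1):
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [pi * zi + rng.gauss(0, 1) for zi in z]
    return first_stage(x, z)

results = {label: simulate(pi) for label, pi in [("strong", 1.0), ("weak", 0.05)]}
for label, (slope, t_stat, f_stat) in results.items():
    print(f"{label}: slope={slope:.3f}  t={t_stat:.2f}  F={f_stat:.2f}")

# The threshold F = 10 corresponds to |t| = sqrt(10), about 3.16.
t_threshold = math.sqrt(10)
print(f"|t| equivalent to F=10: {t_threshold:.2f}")
```

A t-statistic of about 3.16 is well above the usual 1.96 cut-off, which is the sense in which the rule is stricter than significance at the 5% level.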
Conclusion
• Natural experiments useful source of knowledge
• Often requires use of IV
• Instrument exogeneity and relevance need justification
• Weak instruments potentially serious
• Good practice to present first-stage regression
• Finding more robust alternative to IV an active research area