causal inference with panel data · 2020-03-04 · two-step strategy: 1.estimate the propensity...
Causal Inference with Panel Data
Yiqing Xu (Stanford University)
Northwestern-Duke Causal Inference Workshop
19 August 2019
Motivation
Cadre Visits on Loans
[Figure: log outstanding loans by quarter(s) before/after a cadre's visit, for 182 treated firms and 542 control firms; the timing of visits is marked.]
Cadre Visits on Loans
[Figure: The Effect of Cadre Visits on Loans. Diff-in-diffs in loans by quarter(s) before/after a cadre's visit, with 95% confidence interval; annotation: "Implicitly Assumed".]
Cadre Visits on Loans
[Figure: The "Effect" of Cadre Visits on Loans. Diff-in-diffs in loans by quarter(s) before/after a cadre's visit, with 95% confidence interval.]
[Figure: The "Effect" of Assault Weapon Ban on Crime Rate. Diff-in-diffs in crime rate (%) by year(s) before/after reform, with 95% confidence interval.]
[Figure: The "Effect" of WTO on Trade. Diff-in-diffs in log trade volume by year(s) before/after both parties are in the WTO, with 95% confidence interval.]
[Figure: The "Effect" of Democratization on Economic Development. Diff-in-diffs in log GDP per capita by year(s) before/after democratization, with 95% confidence interval.]
[Figure: The "Effect" of Tim Geithner Connections on Stock Market Return. Diff-in-diffs in abnormal return (%) by day(s) before/after the news leak, with 95% confidence interval.]
Motivations
Two-way fixed effects (FE) and difference-in-differences (DiD) methods
are commonly used in the social sciences.
• Abundant observational panel data
• Accounting for unobserved unit and time heterogeneity
• Easy to estimate and interpret
[Figure: number of papers published in APSR/AJPS/JOP per year, 2000 to 2016, by search term: "difference-in-differences" and "fixed effects" + "panel data".]
We Try to Answer
1. What about time-varying confounders, e.g. when a “pre-trend” exists?
2. What if the treatment effects are heterogeneous, as they always are?
3. What are the differences between two-way FE and DiD?
4. Are the assumptions credible? What are the hypothetical experiments behind these models?
5. How can we do better than DiD or synthetic control?
What’s Special about Panel Data?
• The fundamental problem of causal inference (Holland 1986)
$\tau_i = Y_{1i} - Y_{0i}$
• A statistical solution makes use of others’ information
e.g. $\text{ATE} = E[Y_1] - E[Y_0]$
• A scientific solution exploits homogeneity or invariance assumptions
e.g. A rock stays a rock.
e.g. The long-run growth rate of the US economy is 2.5%.
• Panel data allow us to construct counterfactuals for treated units using information from both their own past and other units, with the caveat that the treatment assignment mechanism may be complicated
• Panel data are also difficult because of all kinds of interference (SUTVA violations)
Causal Inference with Panel Data
• “Scientific” solution: modeling (but all models are wrong...)
• Statistical solution: similar to the Selection-on-Observable (SOO)
approach, e.g., matching/reweighting
• Panel data make both easier
• Pre-trends are observable → more information for modeling
• The additional dimension helps relax the conventional ignorability
assumption
• And we can do more...
Roadmap
• DiD and synthetic control
• FE/DiD assumptions
• New estimators
  • Matching and reweighting
  • Outcome models
    * Diagnostic tools
  • Hybrid methods
• Conclusions
Quick Review of DiD and Synth
Difference-in-Differences
$$Y_{it} = \tau_{it} D_{it} + \alpha_i + \xi_t + \varepsilon_{it}$$
or
$$\begin{cases} Y^0_{it} = \alpha_i + \xi_t + \varepsilon_{it} \\ Y^1_{it} = Y^0_{it} + \tau_{it} \end{cases}$$
• $\tau_{it}$ is the treatment effect for unit $i$ at time $t$
• $Y^0_{it}$ is a combination of two additive fixed effects and idiosyncratic errors
• $E[\varepsilon_{it}] = 0$ and $\varepsilon_{it} \perp\!\!\!\perp D_{is}$ for all $i, t, s$ (strict exogeneity)
• $\text{ATT} = E[\tau_{it} \mid D_{it} = 1]$ can be non-parametrically identified if there are only two periods (or two treatment histories)
Difference-in-Differences
$$\begin{pmatrix} Y^0_{T,\text{pre}} & ?? \\ Y^0_{C,\text{pre}} & Y^0_{C,\text{post}} \end{pmatrix}$$
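The "??" cell is the treated group's untreated outcome in the post period. As a minimal sketch for the two-group, two-period case (the function name and all numbers are hypothetical), DiD fills it in by adding the controls' change to the treated group's pre-period mean:

```python
from statistics import mean


def did_att(y_treated_pre, y_treated_post, y_control_pre, y_control_post):
    """Two-group, two-period difference-in-differences estimate of the ATT."""
    # Impute the treated group's untreated post-period mean (the "??" cell)
    # as its pre-period mean plus the change observed among controls.
    control_change = mean(y_control_post) - mean(y_control_pre)
    y0_treated_post = mean(y_treated_pre) + control_change
    return mean(y_treated_post) - y0_treated_post


# Hypothetical outcomes for two treated and two control units.
att = did_att(
    y_treated_pre=[5.0, 5.2],
    y_treated_post=[6.1, 6.3],
    y_control_pre=[4.0, 4.4],
    y_control_post=[4.5, 4.9],
)
```

With these made-up numbers the controls rise by 0.5, the imputed cell is 5.6, and the estimate is about 0.6. The imputation is only as good as the parallel trends assumption behind it.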
DiD
[Figure: treatment status by unit and time period (1 to 30); cells are marked "Under Control" or "Under Treatment".]
Semi-parametric DiD
• Abadie (2005) proposes “semi-parametric DiD”
• Assumption: non-parallel outcome dynamics between treated and
controls caused by observed characteristics
• Two-step strategy:
1. estimate the propensity score based on observed covariates; compute
the fitted value
2. run a weighted DiD model
• The idea of using pre-treatment variables to adjust trends is a
precursor to synthetic control
• Strezhnev (2018) extends this approach to incorporate pre-treatment
outcomes
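The two-step strategy can be sketched as follows. This is an illustration under stated assumptions, not Abadie's (2005) implementation: the covariate is discrete, so the propensity score is simply the share treated within each stratum (a logit or probit would be the more usual choice); the data are made up; and the weighting formula used is ATT = E[ΔY / P(D=1) · (D − π(X)) / (1 − π(X))], where ΔY is the post-minus-pre change in the outcome and π(X) is the propensity score.

```python
from collections import defaultdict


def semiparametric_did(rows):
    """Weighted DiD estimate of the ATT.

    rows: list of (x, d, y_pre, y_post) tuples, where x is a discrete
    covariate and d is 1 for treated units and 0 for controls.
    Requires overlap: every stratum of x must contain at least one control.
    """
    n = len(rows)
    share_treated = sum(d for _, d, _, _ in rows) / n

    # Step 1: propensity score = share of treated units in each stratum of x.
    counts = defaultdict(lambda: [0, 0])  # x -> [units, treated units]
    for x, d, _, _ in rows:
        counts[x][0] += 1
        counts[x][1] += d
    pscore = {x: n_treated / n_units for x, (n_units, n_treated) in counts.items()}

    # Step 2: weighted average of outcome changes.
    total = 0.0
    for x, d, y_pre, y_post in rows:
        pi = pscore[x]
        weight = (d - pi) / ((1.0 - pi) * share_treated)
        total += weight * (y_post - y_pre)
    return total / n


def naive_did(rows):
    """Unweighted DiD: mean change among treated minus mean change among controls."""
    treated = [y1 - y0 for _, d, y0, y1 in rows if d == 1]
    control = [y1 - y0 for _, d, y0, y1 in rows if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical data: the untreated trend is +1 when x = 0 and +3 when x = 1,
# the true effect is 2, and treated units are concentrated in x = 1.
rows = (
    [(0, 1, 10.0, 13.0)]          # treated, x = 0: change = 1 + 2
    + [(0, 0, 10.0, 11.0)] * 3    # controls, x = 0: change = 1
    + [(1, 1, 10.0, 15.0)] * 3    # treated, x = 1: change = 3 + 2
    + [(1, 0, 10.0, 13.0)]        # control, x = 1: change = 3
)
att_weighted = semiparametric_did(rows)
att_naive = naive_did(rows)
```

In this constructed example the naive DiD gives 3.0 because treated units sit mostly in the faster-growing stratum, while the reweighted estimate recovers the true effect of 2.0.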
Synthetic Control: Basic Idea
• J + 1 units in periods 1, 2, . . . ,T ; one treated “1”, J controls
• Region “1” is exposed to the intervention after period T0
• We aim to estimate the effect of the intervention on Region “1”
[Figure: schematic of the J + 1 units over periods 1 to T, with the intervention after T0 and the control weights W*.]
Synthetic Control: Insights
• Athey and Imbens (2017): “[a]rguably the most important
innovation in the policy evaluation literature in the last 15 years.”
• A combo of many innovations
• Take advantage of pre-treatment outcomes
• Use cross-sectional instead of temporal correlations in data
• Use a convex combination of donors to construct a counterfactual
• Reserve some pre-treatment periods for testing
Difference-in-Differences (DiD)
[Figure: outcome by time relative to the treatment (-12 to 4) for treated and control units, with T0 marked.]
Synthetic Control (and Many Extensions)
[Figure: outcome by time relative to the treatment (-12 to 4) for treated and control units, with T0 marked.]
Theoretical Justification
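$$Y_{it} = \tau_{it} D_{it} + \theta_t' Z_i + \xi_t + \lambda_i' f_t + \varepsilon_{it}$$
or
$$\begin{cases} Y^0_{it} = \theta_t' Z_i + \xi_t + \lambda_i' f_t + \varepsilon_{it} \\ Y^1_{it} = Y^0_{it} + \tau_{it} \end{cases}$$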
• Suppose there are R time-varying signals ft out there
• Each unit (e.g. country, participant) picks up a fixed linear
combination of these signals based on factor loadings λi
• Since these “confounders” are evidenced in the pre-treatment
outcomes for both treated and controls, we can try to use this
information to “balance on” these confounders
• We will discuss the model-based approach later
Theoretical Justification
$$\begin{cases} Y^0_{it} = \theta_t' Z_i + \xi_t + \lambda_i' f_t + \varepsilon_{it} \\ Y^1_{it} = Y^0_{it} + \tau_{it} \end{cases}$$
• Let $W = (w_2, \dots, w_{J+1})'$ with $w_j \geq 0$ and $w_2 + \cdots + w_{J+1} = 1$.
• Let $Y_i^{K_1}, \dots, Y_i^{K_M}$ be $M > R$ linear functions of pre-intervention outcomes
• Suppose that we can choose $W^*$ such that:
$$Z_1 = \sum_{j=2}^{J+1} w_j^* Z_j, \qquad Y_1^k = \sum_{j=2}^{J+1} w_j^* Y_j^k, \quad k \in \{K_1, \dots, K_M\}$$
• When $T_0$ is large, an approximately unbiased estimator of $\tau_{1t}$ is:
$$\hat{\tau}_{1t} = Y_{1t} - \sum_{j=2}^{J+1} w_j^* Y_{jt}, \quad t \in \{T_0 + 1, \dots, T\}$$
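A short worked step may help show why such weights are useful. This is a sketch of the algebra under the factor model stated on this slide, not the full argument from the synthetic control papers. For any weights that sum to one, the common time effect $\xi_t$ cancels, and the gap in untreated outcomes between unit 1 and its weighted controls is

$$Y^0_{1t} - \sum_{j=2}^{J+1} w_j Y^0_{jt} = \theta_t'\Big(Z_1 - \sum_{j=2}^{J+1} w_j Z_j\Big) + \Big(\lambda_1 - \sum_{j=2}^{J+1} w_j \lambda_j\Big)' f_t + \Big(\varepsilon_{1t} - \sum_{j=2}^{J+1} w_j \varepsilon_{jt}\Big)$$

The first term is zero when the weights reproduce $Z_1$ exactly. The second term involves the unobserved loadings, which cannot be matched directly; matching on many pre-treatment outcomes is the indirect way of making the loading gap small, which is where the requirement of a large $T_0$ comes in.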
Implementation
• Let $X_1 = (Z_1, Y_1^{K_1}, \dots, Y_1^{K_M})'$ be a $(k \times 1)$ vector of pre-intervention characteristics for the treated unit, and let $X_0$, a $(k \times J)$ matrix, collect the same characteristics for the controls.
• The vector $W^*$ is chosen to minimize $\|X_1 - X_0 W\|$, subject to our weight constraints.
• We consider $\|X_1 - X_0 W\|_V = \sqrt{(X_1 - X_0 W)' V (X_1 - X_0 W)}$, where $V$ is some $(k \times k)$ symmetric and positive semidefinite matrix.
• Various ways to choose V (subjective assessment of predictive power
of X , regression, minimize MSPE, cross-validation, etc.).
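One way to carry out this minimization, as a rough sketch: projected gradient descent onto the weight simplex, assuming a diagonal V. The solver choice, function names, and numbers are illustrative assumptions and do not describe how existing synthetic control software solves the problem.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        shift = (cumulative - 1.0) / j
        if u_j - shift > 0:
            theta = shift
    return [max(x - theta, 0.0) for x in v]


def synth_weights(x1, x0, v_diag, n_iter=5000):
    """Minimize (x1 - x0 w)' V (x1 - x0 w) over the simplex, with V = diag(v_diag).

    x1: list of k characteristics of the treated unit.
    x0: k rows, each a list of J donor values (the k x J matrix X0).
    """
    k, n_donors = len(x1), len(x0[0])
    # Step size 1/L, where L = 2 * trace(X0' V X0) bounds the gradient's
    # Lipschitz constant from above.
    lipschitz = 2.0 * sum(
        v_diag[r] * x0[r][j] ** 2 for r in range(k) for j in range(n_donors)
    )
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(n_iter):
        resid = [
            x1[r] - sum(x0[r][j] * w[j] for j in range(n_donors)) for r in range(k)
        ]
        grad = [
            -2.0 * sum(v_diag[r] * x0[r][j] * resid[r] for r in range(k))
            for j in range(n_donors)
        ]
        w = project_to_simplex([w[j] - grad[j] / lipschitz for j in range(n_donors)])
    return w


# Hypothetical example: three characteristics (rows), three donors (columns).
x0 = [[1.0, 0.0, 3.0],
      [0.0, 1.0, 3.0],
      [2.0, 1.0, 0.0]]
x1 = [0.5, 0.5, 1.5]  # equals 0.5 * donor 1 + 0.5 * donor 2
weights = synth_weights(x1, x0, v_diag=[1.0, 1.0, 1.0])

# Estimated effect in one post-treatment period (made-up outcomes).
y1_post, y0_post = 10.0, [12.0, 14.0, 20.0]
gap = y1_post - sum(w_j * y_j for w_j, y_j in zip(weights, y0_post))
```

Because the treated unit here is an exact convex combination of the first two donors, the weights converge to roughly (0.5, 0.5, 0) and the post-period gap to about -3. With real data the fit is rarely exact, and the choice of V matters.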
Example: Proposition 99 on Cigarette Consumption
• In 1988, California first passed comprehensive tobacco control
legislation (cigarette tax, media campaign etc.)
• Using 38 states that had never passed such programs as controls
[Figure: Cigarette Consumption: CA and the Rest of the U.S. Per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and the rest of the U.S.; the passage of Proposition 99 is marked.]
[Figure: Cigarette Consumption: CA and Synthetic CA. Per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and synthetic California; the passage of Proposition 99 is marked.]
Predictor Means: Actual vs. Synthetic California
| Variables | California (real) | California (synthetic) | Average of 38 control states |
|---|---|---|---|
| Ln(GDP per capita) | 10.08 | 9.86 | 9.86 |
| Percent aged 15-24 | 17.40 | 17.40 | 17.29 |
| Retail price | 89.42 | 89.41 | 87.27 |
| Beer consumption per capita | 24.28 | 24.20 | 23.75 |
| Cigarette sales per capita 1988 | 90.10 | 91.62 | 114.20 |
| Cigarette sales per capita 1980 | 120.20 | 120.43 | 136.58 |
| Cigarette sales per capita 1975 | 127.10 | 126.99 | 132.81 |

Note: All variables except lagged cigarette sales are averaged for the 1980-1988 period (beer consumption is averaged 1984-1988).
Smoking Gap Between CA and Synthetic CA
[Figure: gap in per-capita cigarette sales (in packs) between California and synthetic California by year, 1970 to 2000; the passage of Proposition 99 is marked.]
Smoking Gap for CA and 38 Control States
(All States in Donor Pool)
[Figure: gaps in per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and the control states; the passage of Proposition 99 is marked.]
Smoking Gap for CA and 34 Control States
(Pre-Prop. 99 MSPE ≤ 20 Times Pre-Prop. 99 MSPE for CA)
[Figure: gaps in per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and the control states; the passage of Proposition 99 is marked.]
Smoking Gap for CA and 29 Control States
(Pre-Treatment MSPE ≤ 5 Times Pre-Treatment MSPE for CA)
[Figure: gaps in per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and the control states; the passage of Proposition 99 is marked.]
Smoking Gap for CA and 19 Control States
(Pre-Treatment MSPE ≤ 2 Times Pre-Treatment MSPE for CA)
[Figure: gaps in per-capita cigarette sales (in packs) by year, 1970 to 2000, for California and the control states; the passage of Proposition 99 is marked.]
Ratio Post-Treatment MSPE to Pre-Treatment MSPE
(All 38 States in Donor Pool)
[Figure: histogram (frequency) of the post/pre-Proposition 99 mean squared prediction error ratio, x-axis 0 to 120; California is labeled.]
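The ratio in this histogram can be computed directly from each unit's gap series. A minimal sketch with made-up gaps (the placebo unit names and all numbers are hypothetical): each unit is treated in turn as if it had received the intervention, and a permutation-style p-value is the share of units whose post/pre MSPE ratio is at least as large as the treated unit's.

```python
def mspe(gaps):
    """Mean squared prediction error of a series of gaps."""
    return sum(g * g for g in gaps) / len(gaps)


def mspe_ratio(gaps, n_pre):
    """Post-treatment MSPE divided by pre-treatment MSPE."""
    return mspe(gaps[n_pre:]) / mspe(gaps[:n_pre])


def placebo_p_value(gaps_by_unit, treated, n_pre):
    """Share of units (treated included) with a ratio >= the treated unit's."""
    ratios = {unit: mspe_ratio(g, n_pre) for unit, g in gaps_by_unit.items()}
    n_as_extreme = sum(r >= ratios[treated] for r in ratios.values())
    return n_as_extreme / len(ratios)


# Hypothetical gaps (actual minus synthetic): 3 pre-periods, 2 post-periods.
gaps_by_unit = {
    "CA": [1.0, -1.0, 1.0, -10.0, -20.0],
    "placebo_A": [2.0, -2.0, 2.0, 3.0, -3.0],
    "placebo_B": [1.0, 1.0, -1.0, 2.0, -1.0],
    "placebo_C": [3.0, -3.0, 3.0, -4.0, 4.0],
}
ca_ratio = mspe_ratio(gaps_by_unit["CA"], n_pre=3)
p_value = placebo_p_value(gaps_by_unit, treated="CA", n_pre=3)
```

Dividing by the pre-treatment MSPE means that units with a poor pre-treatment fit do not need to be dropped by an arbitrary cutoff, which is the role of the thresholds on the preceding slides.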
Limitations and Potential Solutions
• Deals with only one treated unit at a time
• Multiple treated units, e.g. Acemoglu et al. (2017)
• Inference is hard
• Permutation inference and sensitivity analysis, e.g. Hahn and Shi (2016); Sergio et al. (2017); Chernozhukov (2017)
• Allows too much user discretion, e.g. cherry-picking $Y_i^k$ results in over-rejection (Ferman et al. 2017)
• Slow to implement and sometimes difficult to find a solution
• Other reweighting approaches...
• Model-based methods
Literature
• Difference-in-differences (Ashenfelter and Card 1985; Card and Krueger 1994)
• Synthetic control (Abadie and Gardeazabal 2003; Abadie et al. 2010, 2015)
• Best subset or penalized regression models (e.g. Hsiao et al. 2012; Valero 2015; Doudchenko and Imbens 2016; Chernozhukov et al. 2019)
• Matching and reweighting methods (e.g. Abadie 2005; Imai and Kim 2018; Hazlett and Xu 2018; Strezhnev 2018; Imai et al. 2019)
• Outcome models: factor-augmented and matrix completion methods (e.g. Gobillon and Magnac 2016; Xu 2017; Athey et al. 2019)
• Hybrid approaches (e.g. Ben-Michael, Feller and Rothstein 2018; Arkhangelsky et al. 2019)
• Growing ...
Literature
[Diagram of methods and references]
• Difference-in-Differences (DiD): Ashenfelter & Card (1985), Card & Krueger (1994)
• Semi-parametric DiD: Abadie (2005)
• Synthetic Control Method: Abadie & Gardeazabal (2003), Abadie et al. (2010)
• Interactive Fixed Effects: Bai (2009), Gobillon & Magnac (2016), Xu (2017)
• Matrix Completion: Athey et al. (2018)
• Balancing: Robbins et al. (2017), Hazlett and Xu (2018)
• Regression Methods: Hsiao (2012), Doudchenko & Imbens (2016)
• Panel Matching: Imai and Kim (2018), Imai, Kim and Wang (2018)
• Propensity Score Reweighting: Austin (2011), Blackwell & Glynn (2018), Strezhnev (2018)
• Augmented Synth: Ben-Michael et al. (2018)
• Synthetic DID: Arkhangelsky et al. (2018)
Method families shown: Matching/Reweighting, Outcome Modeling, Hybrid Methods
Annotations: time-varying confounders; multiple treated units, computational and statistical efficiency, robustness, inference
Roadmap
• DiD and synthetic control
• FE/DiD assumptions
• New estimators
  • Matching and reweighting
  • Outcome models
    * Diagnostic tools
  • Hybrid methods
• Conclusions
FE/DiD Assumptions
Assumptions for Twoway FEs
$$Y_{it} = \tau D_{it} + X_{it}'\beta + \alpha_i + \xi_t + \varepsilon_{it}$$
in which $D_{it}$ is dichotomous
1. Functional form
• Additive fixed effects
• Constant and contemporaneous treatment effect
• Linearity in covariates
2. Strict exogeneity
$$\varepsilon_{it} \perp\!\!\!\perp D_{js}, X_{js}, \alpha_j, \xi_s \quad \forall i, j, t, s$$
$$\Rightarrow \{Y_{it}(0), Y_{it}(1)\} \perp\!\!\!\perp D_{js} \mid \mathbf{X}, \boldsymbol{\alpha}, \boldsymbol{\xi} \quad \forall i, j, t, s$$
If there are only two groups, this implies parallel trends:
$$\Rightarrow E[Y_{it}(0) - Y_{it'}(0) \mid \mathbf{X}] = E[Y_{jt}(0) - Y_{jt'}(0) \mid \mathbf{X}], \quad i \in \mathcal{T},\ j \in \mathcal{C},\ \forall t, t'$$
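For a balanced panel without covariates, the two-way FE estimate of τ can be obtained by double-demeaning the outcome and the treatment (subtracting unit means and period means, adding back the grand mean) and regressing one on the other. A minimal sketch with simulated, noise-free data; the function name and all numbers are hypothetical:

```python
def twoway_fe(y, d):
    """Two-way FE estimate of tau for a balanced panel.

    y, d: lists of N rows (units), each with T entries (periods).
    """
    def double_demean(m):
        n, t = len(m), len(m[0])
        unit_mean = [sum(row) / t for row in m]
        time_mean = [sum(m[i][s] for i in range(n)) / n for s in range(t)]
        grand_mean = sum(unit_mean) / n
        return [
            [m[i][s] - unit_mean[i] - time_mean[s] + grand_mean for s in range(t)]
            for i in range(n)
        ]

    y_dd, d_dd = double_demean(y), double_demean(d)
    num = sum(a * b for ry, rd in zip(y_dd, d_dd) for a, b in zip(ry, rd))
    den = sum(b * b for rd in d_dd for b in rd)
    return num / den


# Simulated panel: Y_it = 2 * D_it + alpha_i + xi_t, with no noise.
alpha = [1.0, 5.0, -2.0]
xi = [0.0, 1.0, 3.0]
d = [[0, 1, 1],   # unit 0: treated from period 1
     [0, 0, 1],   # unit 1: treated from period 2
     [0, 0, 0]]   # unit 2: never treated
y = [[2.0 * d[i][t] + alpha[i] + xi[t] for t in range(3)] for i in range(3)]

tau_hat = twoway_fe(y, d)
```

Here the estimate equals 2.0 because the simulated data satisfy the functional form exactly: additive fixed effects and a constant, contemporaneous effect. The shortcomings discussed next arise when those conditions fail.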
Shortcomings of Twoway FEs
$$Y_{it} = \tau D_{it} + X_{it}'\beta + \alpha_i + \xi_t + \varepsilon_{it}$$
Assumptions
1. Functional form
2. Strict exogeneity
Challenges
1. Treatment effect heterogeneity leads to bias
2. Difficult to evaluate strict exogeneity
3. Strict exogeneity means a lot more than what you think
4. A deeper question: what does the fixed effects approach imply from a design-based inference perspective?
Treatment Effect Heterogeneity Causes Bias
$$Y_{it} = \tau_{it} D_{it} + \alpha_i + \varepsilon_{it}$$
Within estimator:
$$\hat{\tau} = \arg\min_{\tau} \sum_i \sum_t \left\{ (Y_{it} - \bar{Y}_i) - \tau (D_{it} - \bar{D}_i) \right\}^2$$
Causal estimand (ATT):
$$\tau = E[\tau_{it} \mid C_i = 1, D_{it} = 1], \quad C_i = 1 \text{ if } \operatorname{var}(D_{it}) \neq 0$$
Proposition:
$$\hat{\tau} \to \frac{E\{C_i \sigma_i^2 [Y_i(1) - Y_i(0)]\}}{E[C_i \sigma_i^2]} = \frac{E[C_i \sigma_i^2 \tau_i]}{E[C_i \sigma_i^2]} \neq \tau$$
i.e. a unit fixed effect model gives the average “variance-weighted” treatment effect, cf. Chernozhukov et al. (2013) Theorem 1
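A small numeric check of the proposition, taking σ_i² to be unit i's within-unit variance of the treatment (an assumption about notation that the slide does not spell out) and using made-up, noise-free data with effects that differ across units but not over time:

```python
def within_estimator(y, d):
    """Unit fixed effects (within) estimator from per-unit outcome and treatment series."""
    num = den = 0.0
    for y_i, d_i in zip(y, d):
        y_bar = sum(y_i) / len(y_i)
        d_bar = sum(d_i) / len(d_i)
        num += sum((a - y_bar) * (b - d_bar) for a, b in zip(y_i, d_i))
        den += sum((b - d_bar) ** 2 for b in d_i)
    return num / den


def treatment_variance(d_i):
    """Within-unit variance of the treatment indicator."""
    d_bar = sum(d_i) / len(d_i)
    return sum((b - d_bar) ** 2 for b in d_i) / len(d_i)


# Hypothetical two-unit panel: Y_it = tau_i * D_it + alpha_i, no noise.
d = [[0, 0, 1, 1], [0, 0, 0, 1]]
tau = [1.0, 5.0]
alpha = [10.0, 20.0]
y = [[tau[i] * d_it + alpha[i] for d_it in d[i]] for i in range(2)]

tau_hat = within_estimator(y, d)

# Variance-weighted average of the unit-level effects.
var = [treatment_variance(d_i) for d_i in d]
tau_weighted = sum(v * t for v, t in zip(var, tau)) / sum(var)

# ATT: average effect over treated observations.
n_treated = [sum(d_i) for d_i in d]
att = sum(n * t for n, t in zip(n_treated, tau)) / sum(n_treated)
```

The within estimator and the variance-weighted average both equal 19/7 (about 2.71), while the ATT is 7/3 (about 2.33). The first unit has the higher treatment variance and so pulls the fixed effects estimate away from the ATT even though it contributes more treated observations.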
Common Practice: Dynamic Treatment Effect Plots
Adhikari and Alm (2016)
1. parametric assumptions
2. arbitrarily chosen base category
3. unreliable tests
Reinterpreting Strict Exogeneity (Imai and Kim 2019)
1. No unobserved time-varying confounder exists
2. Past outcomes don’t directly affect current outcome (no LDV)
3. Past treatments don’t directly affect current outcome (no “carryover
effect”)
4. Past outcomes don’t directly affect current treatment (no “feedback”)
• To relax 2 or 3, “block”/control for past treatments → but how many?
• To relax 4, need instrumental variables (Arellano and Bond 1991) → hard to
justify instruments; bad finite sample properties
• Often end up directly controlling for an arbitrary number of past treatments
and LDVs → Nickell bias
Hypothetical Experiment?
Strict exogeneity implies the following data generating processes:
$$(\alpha_i, \mathbf{X}_i) \rightarrow \mathbf{D}_i \rightarrow \mathbf{Y}_i$$
Treatment statuses are assigned randomly or in one shot, not sequentially!
Example: random assignment within units
[Figure: treatment status by unit (y-axis) and time (x-axis, periods 1 to 30); cells marked Under Control or Under Treatment]
$$(\alpha_i, \mathbf{X}_i) \rightarrow T_{0i} \rightarrow \mathbf{D}_i \rightarrow \mathbf{Y}_i$$
Example: staggered adoption (Athey and Imbens 2018)
[Figure: treatment status by unit (y-axis) and time (x-axis, periods 1 to 35); cells marked Under Control or Under Treatment]
What We Try to Do
$$Y_{it} = \tau D_{it} + X'\beta + \alpha_i + \xi_t + \varepsilon_{it}$$
Assumptions
1. Functional form
2. Strict exogeneity
What we do
? Allow treatment effect heterogeneity
? Develop methods to evaluate strict exogeneity
X Relax the no-time-varying confounder assumption
- Think harder about the hypothetical experiment
Roadmap
• DiD and synthetic control
• FE/DiD assumptions
• New estimators
  • Matching and reweighting
  • Outcome models
  * Diagnostic tools
• Hybrid methods
• Conclusions
Matching and Reweighting
The Matching/Reweighting Approach
Synthetic control
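$$Y_{it}(0) = \lambda_i' f_t + \xi_t + \varepsilon_{it}, \quad E[\varepsilon_{it}] = 0, \ \forall i, t$$
• Choose weights on controls such that $Y_{it} = \sum_{i' \neq i} w^*_{i'} Y_{i't}, \ \forall t \leq T_0$
• As a result: $\lambda_i = \sum_{i' \neq i} w^*_{i'} \lambda_{i'}$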
New Developments
• Propensity score reweighting – extension to Abadie (2005)
• Panel matching
• Panel regression methods
• Balancing: mean balancing and trajectory balancing
Panel Matching (Imai, Kim & Wang 2019)
Assumptions
• Sequential exogeneity (past information can affect today’s treatment)
→ Parallel trends after conditioning (similar to Abadie 2005)
• SUTVA (no spillover effect)
• Limited carryover effect
Procedure
1. Create a matched set for each treated observation based on
treatment history
2. Refine the matched set via any matching or weighting method
3. Compute causal effect via weighted DiD using the refined set
4. Calculate model-based standard errors
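Steps 1 and 3 of the procedure can be sketched on a toy panel. This is a simplified illustration, not the authors' implementation: it skips the refinement step (equal weights within each matched set), estimates only the contemporaneous effect, and omits standard errors. The data, the lag length, and the function names are hypothetical.

```python
# Illustrative sketch of panel matching with equal weights (no refinement).

def matched_set(D, i, t, lags):
    """Units under control at time t whose treatment history over the
    previous `lags` periods is identical to that of treated unit i."""
    history = D[i][t - lags:t]
    return [j for j in D
            if j != i and D[j][t] == 0 and D[j][t - lags:t] == history]

def did_estimate(D, Y, lags):
    """Average DiD over observations that switch from control to treatment."""
    effects = []
    for i in D:
        for t in range(lags, len(D[i])):
            if D[i][t] == 1 and D[i][t - 1] == 0:
                controls = matched_set(D, i, t, lags)
                if not controls:
                    continue  # no match: the observation is dropped
                change_treated = Y[i][t] - Y[i][t - 1]
                change_controls = sum(Y[j][t] - Y[j][t - 1]
                                      for j in controls) / len(controls)
                effects.append(change_treated - change_controls)
    return sum(effects) / len(effects)

# Toy data: common trend of +1 per period, treatment effect of 2.
D = {"a": [0, 0, 1, 1], "b": [0, 0, 0, 0], "c": [0, 0, 0, 1], "d": [1, 0, 0, 0]}
Y = {"a": [1, 2, 5, 6], "b": [0, 1, 2, 3], "c": [5, 6, 7, 10], "d": [5, 4, 5, 6]}

print(matched_set(D, "a", 2, lags=2))
print(did_estimate(D, Y, lags=2))
```

Unit "d" is excluded from the matched set of "a" at the third period because its treatment history over the two preceding periods differs, even though it is under control at that time.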
Example: Democracy and Economic Growth
[Figure: Democracy and Economic Growth: Treatment Status. Country (y-axis) by year (x-axis, 1960 to 2008); cells marked Missing, Autocracy, or Democracy]
• Match based on treatment history for the past L periods
• Refine the matched set based on covariates and pre-treatment
outcomes
[Figure: the number of matched control units]
[Figure: estimated treatment effects]
Panel Matching: Advantages and Limitations
Advantages
• Require sequential exogeneity instead of strict exogeneity
• Allow treatment reversal
• Allow a variety of matching/reweighting methods
Limitations
• Lots of data (w/ info on outcome dynamics) are dropped
• Normally, imbalances remain
• Many choices require user discretion
Panel Regression Methods
• Synthetic control finds a convex combination of controls to mimic
the treated
• Implicitly assumes
  • non-negative weights
  • no intercept shift
  • weights add up to 1
• Other ways to obtain weights
  • Constrained regression
  • Regression based on the best subset
  • Penalized regression
  • Balancing
Constrained Regression
• No intercept shift; weights add up to 1; non-negativity
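$$\hat{\omega}^{constr} = \arg\min_{\omega} \sum_{t \leq T_0} (Y_{1t} - \omega' Y_{Ct})^2 \quad \text{s.t.} \quad \sum_{i \in C} \omega_i = 1 \ \text{and} \ \omega_i \geq 0, \ \forall i \in C$$
• Limitation: $T_0 > N_{co}$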
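The constrained problem above is a least-squares fit over the probability simplex. A minimal sketch follows, assuming NumPy and using projected gradient descent as the solver; the solver choice, the simulated data, and the function names are illustrative assumptions, not something the slides specify.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, len(v) + 1)
    k = ks[u - css / ks > 0][-1]
    return np.maximum(v - css[k - 1] / k, 0.0)

def constrained_weights(y1_pre, yc_pre, n_iter=5000):
    """y1_pre: (T0,) treated outcomes; yc_pre: (T0, Nco) control outcomes.
    Minimises sum_t (Y_1t - w'Y_Ct)^2 subject to w >= 0 and sum(w) = 1."""
    n_co = yc_pre.shape[1]
    w = np.full(n_co, 1.0 / n_co)
    # Step size 1/L, where L = 2 * (largest singular value)^2 bounds the curvature.
    step = 1.0 / (2.0 * np.linalg.norm(yc_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * yc_pre.T @ (yc_pre @ w - y1_pre)
        w = project_to_simplex(w - step * grad)
    return w

# Hypothetical data: the treated unit is an exact convex combination of controls.
rng = np.random.default_rng(0)
yc_pre = rng.normal(size=(30, 5))
w_true = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
y1_pre = yc_pre @ w_true
w_hat = constrained_weights(y1_pre, yc_pre)
print(np.round(w_hat, 4))
```

A general-purpose quadratic programming solver would serve equally well; projected gradient is used here only to keep the example dependency-light.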
Best Subset (Hsiao et al. 2012)
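• Choose the number of weights that are allowed to be non-zero
• Optimize the weights after taking out the intercepts
$$(\hat{\mu}^{subset}, \hat{\omega}^{subset}) = \arg\min_{\mu, \omega} \sum_{t \leq T_0} (Y_{1t} - \mu - \omega' Y_{Ct})^2 \quad \text{s.t.} \quad \sum_{i \in C} \mathbf{1}\{\omega_i \neq 0\} \leq k$$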
• Bottom-up approach: search for the best 1, then the best 2, then
the best 3 ... (greedy)
• Weights can be negative and do not need to add up to 1
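The bottom-up search can be sketched as forward selection: at each step, add the control unit that most reduces the residual sum of squares of an OLS fit with an intercept. This is a hedged illustration assuming NumPy; the simulated data and function names are hypothetical, and how k is chosen is left out.

```python
import numpy as np

def fit_with_intercept(y1_pre, yc_pre, subset):
    """OLS of the treated unit on an intercept and the chosen controls."""
    X = np.column_stack([np.ones(len(y1_pre)), yc_pre[:, subset]])
    coef, *_ = np.linalg.lstsq(X, y1_pre, rcond=None)
    rss = float(np.sum((y1_pre - X @ coef) ** 2))
    return coef, rss

def greedy_subset(y1_pre, yc_pre, k):
    """Forward selection: add the control that most reduces the RSS, k times."""
    chosen = []
    for _ in range(k):
        candidates = [j for j in range(yc_pre.shape[1]) if j not in chosen]
        best = min(candidates,
                   key=lambda j: fit_with_intercept(y1_pre, yc_pre, chosen + [j])[1])
        chosen.append(best)
    coef, _ = fit_with_intercept(y1_pre, yc_pre, chosen)
    return chosen, coef[0], coef[1:]

# Hypothetical data: the treated unit depends on controls 1 and 4 only.
rng = np.random.default_rng(0)
yc_pre = rng.normal(size=(20, 6))
y1_pre = 1.5 + 2.0 * yc_pre[:, 1] - 0.5 * yc_pre[:, 4]
chosen, mu_hat, omega_hat = greedy_subset(y1_pre, yc_pre, k=2)
print(chosen, mu_hat, omega_hat)
```

Note that the recovered weights include a negative one and do not sum to 1, which the unconstrained regression permits.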
Penalized Regression (Doudchenko and Imbens 2016)
• Use elastic net (L1 and L2 regularization) to select weights
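$$(\hat{\mu}^{en}, \hat{\omega}^{en}) = \arg\min_{\mu, \omega} \ (Y_{1,pre} - \mu - \omega' Y_{C,pre})^2 + \lambda \cdot \left( \frac{1 - \alpha}{2} \|\omega\|_2^2 + \alpha \|\omega\|_1 \right)$$
in which λ and α are tuning parameters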
• Allow intercept shift; weights can be negative and do not add up to 1
• Randomization inference: (1) treated unit is randomly selected; (2)
the timing of the onset of the treatment is randomly selected
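For intuition, the elastic-net objective above can be minimised by cyclic coordinate descent with soft-thresholding. The sketch below assumes NumPy, reads the squared term as a sum over pre-treatment periods, and leaves the intercept unpenalised; the solver, the simulated data, and the tuning values are illustrative assumptions (in practice λ and α would be tuned, e.g. by cross-validation).

```python
import numpy as np

def soft_threshold(z, g):
    return np.sign(z) * max(abs(z) - g, 0.0)

def elastic_net_weights(y1_pre, yc_pre, lam, alpha, n_iter=2000):
    """Minimise sum_t (Y_1t - mu - w'Y_Ct)^2
    + lam * ((1 - alpha) / 2 * ||w||_2^2 + alpha * ||w||_1)."""
    n_co = yc_pre.shape[1]
    w = np.zeros(n_co)
    mu = float(np.mean(y1_pre))
    col_ss = np.sum(yc_pre ** 2, axis=0)
    for _ in range(n_iter):
        mu = float(np.mean(y1_pre - yc_pre @ w))   # exact update for the intercept
        for j in range(n_co):
            # Residual with control j's contribution added back in.
            partial = y1_pre - mu - yc_pre @ w + yc_pre[:, j] * w[j]
            z = 2.0 * yc_pre[:, j] @ partial
            w[j] = soft_threshold(z, lam * alpha) / (2.0 * col_ss[j] + lam * (1.0 - alpha))
    return mu, w

# Hypothetical data with a sparse set of relevant controls.
rng = np.random.default_rng(0)
yc_pre = rng.normal(size=(25, 4))
y1_pre = 0.5 + yc_pre @ np.array([0.6, 0.0, -0.3, 0.0]) + 0.05 * rng.normal(size=25)
mu_en, w_en = elastic_net_weights(y1_pre, yc_pre, lam=1.0, alpha=0.5)
print(mu_en, w_en)
```

With `lam=0` the routine reduces to ordinary least squares with an intercept, and with a very large `lam` and `alpha=1` every weight is shrunk to zero.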
Example: Proposition 99 in California
Adapted from Doudchenko and Imbens (2016)
Example: German Reunification
Adapted from Doudchenko and Imbens (2016)
Differences Among Existing Methods
synth cstr Hsiao EN IPW PMatch meanbal tjbal
Methods of getting weights: bal reg reg reg reg match bal bal
Allow short T0: X X X X
Non-negative weights: X X X X X
Weights add up to 1: X X X X
Intercept shift: X X X X X X X
Convex combination: X X X
Multiple treated units: X X X X
Computational efficiency: X X X X X X
Dimension reduction: X X
(Almost) exact balancing: X X
Higher-order features: X
See Doudchenko & Imbens (2016) and Ben-Michael et al. (2018) for detailed discussions.
Balancing: A Simple, Unified Framework
Almost all commonly used panel data models imply a common function
space with $Y^0_{post}$ linear in $Y_{pre}$.
Linearity in Pre-Treatment Outcomes (LPO)
$$E[Y^0_{it} \mid Y_{i,pre}] = (1 \ Y_{i,pre})' \theta_t, \quad T_0 < t \leq T.$$
• Diff-in-Diffs
• Twoway FEs
• ARMA
• IFE, Synth
Basic Idea: Equal means on $Y_{i,pre}$ would imply equal means on $Y^0_{it}$, $t > T_0$,
regardless of $\theta_t$ (Robbins et al. 2017)
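The basic idea can be written out in one line. The derivation below is a sketch that assumes the same $\theta_t$ applies to treated and control units, that the control weights $w_j$ sum to one, and that the weighted control mean of $Y_{pre}$ equals the treated mean; $\mathcal{T}$ and $\mathcal{C}$ denote the treated and control groups.

```latex
\begin{aligned}
\frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} E[Y^0_{it} \mid Y_{i,pre}]
  - \sum_{j \in \mathcal{C}} w_j \, E[Y^0_{jt} \mid Y_{j,pre}]
&= \Big( \frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} (1 \ Y_{i,pre})
  - \sum_{j \in \mathcal{C}} w_j (1 \ Y_{j,pre}) \Big)' \theta_t \\
&= \Big( 1 - \sum_{j \in \mathcal{C}} w_j, \;\;
  \frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} Y_{i,pre}
  - \sum_{j \in \mathcal{C}} w_j Y_{j,pre} \Big)' \theta_t
 = 0, \qquad t > T_0 .
\end{aligned}
```

Both components of the vector in the last line vanish under the stated conditions, so the result holds whatever the value of $\theta_t$.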
Mean Balancing
• Objective: choose weights on controls to get the same average
pre-treatment trajectory for weighted controls as for the treated,
$$\frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} Y_{i,pre} = \sum_{j \in \mathcal{C}} w_j Y_{j,pre}$$
• Conceptually similar to synth: choosing weights to predict
counterfactual averages
• In practice: seek approximate balance, working from the largest toward the
smallest principal components of $Y_{pre} (Y_{pre})'$, with a stopping rule of
minimizing the upper bound of biases (Hazlett & Xu 2018)
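Assumption 1 (Linearity in Pre-Treatment Outcomes)
$$E[Y^0_{it} \mid Y_{i,pre}] = (1 \ Y_{i,pre})' \theta_t, \quad T_0 < t \leq T.$$
Assumption 2 (Conditional Ignorability)
$$Y^0_{it} \perp\!\!\!\perp G_i \mid Y_{i,pre}, \quad \forall t > T_0$$
Assumption 3 (Feasibility)
$$\frac{1}{N_{tr}} \sum_{G_i = 1} Y_{it} = \sum_{G_i = 0} w_i Y_{it}, \ t \leq T_0, \quad w_i \geq 0, \quad \sum_{G_i = 0} w_i = 1$$
Under the above assumptions, we have:
Proposition 4 (Unbiasedness)
$$E[\widehat{ATT}_t \mid Y_{pre}] = ATT_t, \quad \forall t > T_0$$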
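A small numerical check of the proposition: find non-negative weights that sum to one and exactly balance the pre-treatment means, then confirm that the weighted difference in post-treatment outcomes recovers the effect when Y(0) is linear in the pre-treatment outcomes. The sketch assumes NumPy and obtains the weights with an entropy-balancing-style exponential tilt solved by damped Newton steps; that solver, the simulated data, and all names are assumptions made for illustration, not the procedure stated on the slides.

```python
import numpy as np

def mean_balancing_weights(y_pre_co, target, n_iter=100):
    """Weights w_j proportional to exp(z_j'lam), z_j = Y_j,pre - target, chosen so
    that sum_j w_j * Y_j,pre = target. Minimises log sum_j exp(z_j'lam)."""
    z = y_pre_co - target
    lam = np.zeros(z.shape[1])

    def dual(l):
        s = z @ l
        return s.max() + np.log(np.sum(np.exp(s - s.max())))

    def weights(l):
        s = z @ l
        w = np.exp(s - s.max())
        return w / w.sum()

    for _ in range(n_iter):
        w = weights(lam)
        grad = z.T @ w                       # remaining imbalance in means
        if np.linalg.norm(grad) < 1e-10:
            break
        hess = (z * w[:, None]).T @ z - np.outer(grad, grad)
        step = np.linalg.solve(hess + 1e-10 * np.eye(len(lam)), grad)
        t = 1.0
        while dual(lam - t * step) > dual(lam) and t > 1e-8:
            t /= 2.0                         # halve the step until the dual decreases
        lam = lam - t * step
    return weights(lam)

# Hypothetical data: Y(0) after treatment is exactly linear in Y_pre.
rng = np.random.default_rng(0)
y_pre_co = rng.normal(0.0, 1.0, size=(200, 3))
y_pre_tr = rng.normal(0.2, 1.0, size=(40, 3))
theta0, theta = 1.0, np.array([0.5, -0.2, 0.8])
y_post_co = theta0 + y_pre_co @ theta
y_post_tr = theta0 + y_pre_tr @ theta + 2.0   # true effect of 2 for every treated unit

target = y_pre_tr.mean(axis=0)
w = mean_balancing_weights(y_pre_co, target)
att_hat = y_post_tr.mean() - w @ y_post_co
print(att_hat)
```

The routine relies on feasibility: if the treated mean lay outside the convex hull of the control trajectories, no such weights would exist and the iterations would not settle.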
Limitations of Mean Balance
1. Weights that achieve mean balancing can leave treated and control
different on non-linear functions of Ypre
2. Mean balancing limited by number of pre-treatment points
• Few pre-treatment periods = few constraints unless you make more
• With enough periods, anything that matters to $Y^0$ will appear in $Y^0$
and can be balanced on — but with fewer periods, no guarantees
Trajectory Balancing
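• Mean balancing: a simple approach to get the same average trajectory
for weighted controls as for the treated:
$$\frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} Y_{i,pre} = \sum_{j \in \mathcal{C}} w_j Y_{j,pre}$$
• Trajectory balancing: feature mapping $Y_{i,pre} \mapsto \phi(Y_{i,pre})$,
then balance
$$\frac{1}{N_{tr}} \sum_{i \in \mathcal{T}} \phi(Y_{i,pre}) = \sum_{j \in \mathcal{C}} w_j \phi(Y_{j,pre})$$
• Get mean balance on a feature expansion instead
$$\phi(Y_{i,pre}, X_i), \quad \phi: \mathbb{R}^P \mapsto \mathbb{R}^{P'}$$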
Choice of φ()
A good choice of φ() is one that:
• requires little or no user discretion
• includes all continuous functions (at the limit)
• perhaps, prioritizes low frequency, smoother functions
• allows covariates to play a role
Implementation: Gaussian kernel then approximation via principal
components
Implementation
• Instead of $Y_{pre}$, form the kernel matrix
$$K_{i,j} = k([Y_i, X_i], [Y_j, X_j]) = \exp(-\|[Y_i, X_i] - [Y_j, X_j]\|^2 / h)$$
• Replaces each unit’s [Yi,pre ,Xi ] with a vector ki encoding how similar
observation i is to observation 1, 2, ...
• SVD this matrix to obtain components/ eigenvectors
• Choose weights to get mean balance on these, starting from largest
• We choose the number of principal components to include by minimizing the
upper bound of bias in the ATT estimates
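The first three steps can be sketched with NumPy. The data, the bandwidth `h`, and the number of retained components are placeholders chosen for illustration, and the bias-bound rule for choosing the number of components is not implemented. Because K is symmetric, an eigendecomposition is used in place of the SVD.

```python
import numpy as np

def gaussian_kernel_matrix(Z, h):
    """K[i, j] = exp(-||Z_i - Z_j||^2 / h) for rows Z_i = [Y_i,pre, X_i]."""
    sq_norms = np.sum(Z ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(sq_dist, 0.0) / h)   # clip tiny negative rounding errors

def leading_components(K, r):
    """Top-r eigenvectors of the symmetric matrix K, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1][:r]
    return eigvals[order], eigvecs[:, order]

# 50 hypothetical units: three pre-treatment outcomes and two covariates each.
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 5))
K = gaussian_kernel_matrix(Z, h=Z.shape[1])        # bandwidth is an arbitrary choice here
eigvals, components = leading_components(K, r=4)
print(K.shape, components.shape)
```

Each row of `components` then plays the role of a unit's feature vector, and mean balancing weights would be sought on these columns, starting from the one with the largest eigenvalue.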
When Averages Fail and φ()’s Thrive
Intuition: mean balancing is okay but may emphasize “wrong” features
of the pre-treatment trend
• Trajectory balancing gets you similarity of whole trajectories rather
than just equal means at each time point → balance on
“higher-order” features such as variance, curvature, etc.
• Approximately, trajectory balancing gets multivariate distribution of
Ypre for the controls equal to that of the treated, whereas mean
balancing only gets marginals equal
• This can matter when non-linear functions of Ypre are confounders,
especially when T0 is short
When Mean Balancing Fails: A Severe Example
• N = 200 countries with simulated GDP over years $t \in \{1, 2, \ldots, 24\}$
• Two “types” of countries:
Volatile with no growth:
$$GDP_{it} = 5 + a_i \sin(0.2 \pi t) + b_i \cos(0.2 \pi t) + 0.1 \varepsilon_{it}$$
$$\varepsilon_{it} \sim N(0, 1), \quad a_i, b_i \sim U(-1, 1)$$
Or steady growing:
$$GDP_{it} = 4 + c_i \, 1.03^t + 0.1 \varepsilon_{it}$$
$$\varepsilon_{it} \sim N(0, 1), \quad c_i \sim U(0.9, 1.1)$$
• A randomly selected 25% of the stable type take the treatment.
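The data generating process above can be simulated in a few lines. The sketch assumes NumPy, an equal split between the two types (the slide does not state the shares), and reads "stable type" as the steady-growing countries; the seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 200, 24
t = np.arange(1, T + 1)

n_volatile = N // 2          # assumption: half of the countries are volatile
n_steady = N - n_volatile

# Volatile with no growth.
a = rng.uniform(-1, 1, size=(n_volatile, 1))
b = rng.uniform(-1, 1, size=(n_volatile, 1))
volatile = (5 + a * np.sin(0.2 * np.pi * t) + b * np.cos(0.2 * np.pi * t)
            + 0.1 * rng.standard_normal((n_volatile, T)))

# Steady growing.
c = rng.uniform(0.9, 1.1, size=(n_steady, 1))
steady = 4 + c * 1.03 ** t + 0.1 * rng.standard_normal((n_steady, T))

gdp = np.vstack([volatile, steady])          # (N, T) outcome matrix
is_steady = np.arange(N) >= n_volatile

# A randomly selected 25% of the steady-growing type takes the treatment.
treated = np.zeros(N, dtype=bool)
steady_ids = np.flatnonzero(is_steady)
treated[rng.choice(steady_ids, size=n_steady // 4, replace=False)] = True

print(gdp.shape, int(treated.sum()))
```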
[Figure: simulated GDP by year (0 to 24) for controls and for treated units; heavily weighted control units in the pre-treatment years (1 to 8) under mean balancing and under kernel balancing, each panel showing the treated average and the weighted control average]
What Information is Encoded in the Kernel Matrix?
[Figure: scatter plot of the first principal component of K (y-axis) against the log variance of pre-treatment outcomes (x-axis); points shown for treated, controls (all), and controls (weighted)]
Two Extensions
• Incorporating covariates
Assumption 5 (Linearity in $\phi(X_i, Y_{i,pre})$)
$$E[Y^0_{it} \mid X_i, Y_{i,pre}] = \phi(X_i, Y_{i,pre})' \theta_t, \quad T_0 < t \leq T.$$
• Intercept shift and demeaning
Assumption 6 (Parallel Trends)
$$E[Y^0_{it} - Y^0_{is} \mid Y_{i,pre}] = E[Y^0_{it} - Y^0_{is} \mid Y_{i,pre}, G_i], \quad \forall t, s \in \{1, 2, \cdots, T\}$$
Truex (2014): Return to office in China’s Parliament
• Treatment: CEO taking a seat in the
National People’s Congress (NPC)
Outcome: Return on assets (ROA)
• 48 treated firms, 984 controls
Pre-treatment: 2005-2007
Post-treatment: 2008-2010
• Two covariates: state ownership,
revenue in 2007
• Balancing on: roa2005, roa2006,
roa2007, so_portion, rev2007 (and
higher order terms through a kernel
transformation)
[Figure: firms (y-axis) by year (x-axis, 2005 to 2010)]
Balance on Pre-treatment Outcome Trajectories
[Figure: return on assets by year (2005 to 2007) in three panels: top 25% heavily weighted controls with mean balancing; treated; top 25% heavily weighted controls with kernel balancing]
Balance Check
[Figure: balance on roa2005, roa2006, roa2007, so_portion, and rev2007 for unweighted, mean balancing, and kernel balancing. Mean panel: difference in means. Variance panel: (Var_co − Var_tr) / Var_tr]
Truex (2014): Main Results
[Figure: NPC Membership and Return on Assets. Difference in means by year (2005 to 2010) under mean balancing and kernel balancing]
Summary
• Removing time-invariant confounders is costly, i.e., it requires no carryover
effect and no feedback from past Y to current D (Kim & Imai 2018)
• Sequential exogeneity may be more desirable than strict exogeneity
• Panel non-parametric and semi-parametric methods have come a
long way
• Things quickly get more complex when the number of different
treatment histories grows
• Inference is hard especially when there are only a small number of
treated units
Outcome Models
Basic Idea
• In a panel setting, treat Y(1) as missing data
• Predict Y(0) based on an outcome model
• (Use pre-treatment data for model selection)
• Estimate ATT by averaging differences between Y(1) and predicted Y(0)
[Figure: treatment status by unit (y-axis) and time (x-axis, periods 1 to 35); cells marked Under Control or Under Treatment]
$$ATT_s = E[\tau_{it} \mid D_{i,t-s} = 0, \underbrace{D_{i,t-s+1} = D_{i,t-s+2} = \cdots = D_{it} = 1}_{s \text{ periods}}, \forall i \in \mathcal{T}].$$
[Figure: treatment status for 10 units (id) over 10 periods (time); cells of treated units are labeled with time relative to treatment onset; legend: Under Control, Under Treatment]
A New Plot for “Dynamic Treatment Effects”
[Figure: effect on Y against time relative to the treatment; panel labeled No "Pre-trend"]
[Figure: effect on Y against time relative to the treatment; panel labeled A "Pre-trend"]
Model-based Counterfactual Estimators
A model-based counterfactual estimator proceeds in the following steps:
• Step 1. Train the model using observations under the control
condition ($D_{it} = 0$).
• Step 2. Predict the counterfactual outcome $\hat{Y}_{it}(0)$ for each
observation under the treatment condition ($D_{it} = 1$) and obtain an
estimate of the individual treatment effect: $\hat{\tau}_{it} = Y_{it} - \hat{Y}_{it}(0)$.
• Step 3. Generate estimates for the causal quantities of interest
$$ATT = E[\tau_{it} \mid D_{it} = 1, \forall i \in \mathcal{T}, \forall t], \text{ or}$$
$$ATT_s = E[\tau_{it} \mid D_{i,t-s} = 0, \underbrace{D_{i,t-s+1} = D_{i,t-s+2} = \cdots = D_{it} = 1}_{s \text{ periods}}, \forall i \in \mathcal{T}].$$
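The three steps can be sketched with a two-way fixed effects outcome model and no covariates. This is an illustration under stated assumptions, not the implementation in any package: it assumes NumPy, a balanced panel in which every unit and every period has at least one control observation, and a hypothetical function name and simulated data.

```python
import numpy as np

def fe_counterfactual_att(Y, D):
    """Y, D: (N, T) arrays. Step 1: fit Y_it = alpha_i + xi_t on D == 0 cells.
    Step 2: impute Y_it(0) for D == 1 cells. Step 3: average Y_it - Yhat_it(0)."""
    N, T = Y.shape
    unit, time = np.indices((N, T))

    def design(u, s):
        X = np.zeros((len(u), N + T))
        X[np.arange(len(u)), u] = 1.0          # unit dummies
        X[np.arange(len(u)), N + s] = 1.0      # period dummies
        return X

    ctrl = D == 0
    coef, *_ = np.linalg.lstsq(design(unit[ctrl], time[ctrl]), Y[ctrl], rcond=None)
    y0_hat = design(unit[~ctrl], time[~ctrl]) @ coef
    tau_hat = Y[~ctrl] - y0_hat
    return tau_hat.mean(), tau_hat

# Hypothetical staggered adoption: units 0-2 adopt at different times.
rng = np.random.default_rng(0)
N, T = 6, 8
alpha = rng.normal(size=(N, 1))
xi = rng.normal(size=(1, T))
D = np.zeros((N, T), dtype=int)
for i, start in enumerate([4, 5, 6]):
    D[i, start:] = 1
Y = alpha + xi + 2.0 * D                       # noiseless, constant effect of 2

att_hat, tau_hat = fe_counterfactual_att(Y, D)
print(att_hat)
```

Grouping the entries of `tau_hat` by the number of periods since adoption, instead of averaging them all, would give the period-specific quantities ATT_s.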
Review of Three Estimators
We review three estimation strategies:
• FEct:
$$Y_{it}(0) = X_{it} \beta + \alpha_i + \xi_t$$
• IFEct (Gobillon & Magnac 2016; Xu 2017):
$$Y_{it}(0) = X_{it} \beta + \lambda_i' F_t$$
• Matrix Completion (MC) (Athey et al. 2018):
$$Y_{it}(0) = X_{it} \beta + L_{it},$$
where matrix $\{L_{it}\}_{N \times T}$ is a lower-rank matrix approximation of
$\{Y(0)\}_{N \times T}$ with missing values
Remarks:
• DiD is a special case of FEct
• Both IFEct and MC are estimated via iterative algorithms
• Cross-validation is used to choose the tuning parameter
IFEct
Xu (2017) proposes a three-step approach based on a latent factor model:
Control: $Y_{it}(0) = X_{it}' \beta + \alpha_i + \xi_t + \lambda_i' f_t + \varepsilon_{it}$
Treated: $Y_{it}(0) = X_{it}' \beta + \alpha_i + \xi_t + \lambda_i' f_t + \varepsilon_{it}$ (pre)
$Y_{it}(1) = X_{it}' \beta + \alpha_i + \xi_t + \lambda_i' f_t + \varepsilon_{it} + \tau_{it}$ (post)
1. Expectation-Maximization (Gobillon & Magnac 2016); factors: r × T
[Figure: treatment status over time (periods 1 to 30); legend: Treated (Pre), Treated (Post), Controls]
Election Day Registration (EDR) and Voter Turnout
Causal inference is a missing data problem.
[Figure: treatment status by state and year, 1920 to 2012, distinguishing treated (pre), treated (post), and control observations; EDR reform]
83
![Page 94: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/94.jpg)
Main Results
[Figure: estimated effect on turnout (%) against time relative to the treatment; panels: FEct and IFEct]
84
![Page 95: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/95.jpg)
Factors and Factor Loadings
[Figure: two panels. First panel: Factor 1 and Factor 2 (turnout %) by year, 1920 to 2016. Second panel: state-level loadings for Factor 2 plotted against loadings for Factor 1.]
85
![Page 96: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/96.jpg)
Example: Cadre Visits on Loans
[Figure: GSC estimate of the effect of cadre visits, with 95% confidence interval, by quarter(s) before / after a cadre's visit]
86
![Page 97: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/97.jpg)
Example: Property Rights and Land Improvement (Sanford 2019)
• Do property rights lead to improved land quality?
• “Experiment”: granting land titles to peasants in Borgou, Benin
• Use satellite (remote sensing) data to measure land improvement,
i.e., switch from annual crops to perennial crops (bushes and trees)
• Use IFEct to construct counterfactuals
87
![Page 98: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/98.jpg)
Original Satellite Images
88
![Page 99: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/99.jpg)
Treated and Counterfactual Averages
Pre-treatment Outcome
89
![Page 100: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/100.jpg)
Treated and Counterfactual Averages
Post-treatment Outcome (1 Year)
90
![Page 101: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/101.jpg)
Treated and Counterfactual Averages
Post-treatment Outcome (4 Years)
91
![Page 102: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/102.jpg)
Factors and Loadings
92
![Page 103: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/103.jpg)
Geographic Distribution of Heavily-weighted Controls
Larger numbers represent greater dissimilarity
93
![Page 104: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/104.jpg)
Matrix Completion Methods
[Figure: treatment status by state and year, 1920 to 2008: treated states (before EDR), treated states (after EDR), and control states; EDR reform]
• Recall that our main goal is to predict treated counterfactuals
• Taking advantage of the matrix structure, matrix completion methods use non-treated data to achieve this goal
• The basic idea is to find a lower-rank representation of the matrix to impute the “missing data”
• Xu (2017) is a special case of this approach
94
![Page 105: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/105.jpg)
Matrix Completion Methods
• Recall in the baseline DiD setup:
$$\mathbf{Y} = \begin{pmatrix} \mathbf{Y}^0_{T,pre} & ?? \\ \mathbf{Y}^0_{C,pre} & \mathbf{Y}^0_{C,post} \end{pmatrix}$$
• Matrix completion (MC) methods attempt to find a lower-rank representation of Y, which we call L, that makes predictions of missing values in Y
• Athey et al. (2018) generalize Xu (2017) with different ways of constructing L
• Plus, missingness can be arbitrary → accommodate reversible treatments (note: strict exogeneity)
95
![Page 106: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/106.jpg)
Matrix Completion Methods
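• Mathematically,
$$Y_{it} = L_{it} + \alpha_i + \xi_t + X_{it}'\beta + \varepsilon_{it},$$
in which $L_{it}$ is an element of $\mathbf{L}$, an $(N \times T)$ matrix
• We need regularization on L because there are too many parameters:
$$\min_{\mathbf{L}} \ \frac{1}{\#\text{Controls}} \sum_{D_{it}=0} (Y_{it} - L_{it})^2 + \lambda_L \|\mathbf{L}\|_*$$
• The nuclear norm $\|\cdot\|_*$ generally leads to a low-rank solution for L:
$$\|\mathbf{L}\|_* = \sum_{i=1}^{\min(N,T)} \sigma_i(\mathbf{L}),$$
in which $\sigma_i(\mathbf{L})$ are the singular values of L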
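As an illustration of the penalized objective, the sketch below evaluates it and minimizes it with a soft-impute style iteration: fill the treated cells with the current estimate, take an SVD, and shrink the singular values. It is a simplified stand-in, not the estimator of Athey et al.: covariates and the additive fixed effects are omitted, and all function names are hypothetical. Because the squared loss is averaged over the untreated cells, the shrinkage threshold that matches the objective is `lam * (#untreated cells) / 2`.

```python
import numpy as np

def nuclear_norm(L):
    """Sum of the singular values of L."""
    return np.linalg.svd(L, compute_uv=False).sum()

def mc_objective(Y, D, L, lam):
    """Mean squared error on untreated cells plus the nuclear-norm penalty."""
    obs = D == 0
    return ((Y[obs] - L[obs]) ** 2).mean() + lam * nuclear_norm(L)

def soft_impute(Y, D, lam, n_iter=200):
    """Minimize mc_objective by iterated SVD soft-thresholding."""
    obs = D == 0
    thresh = lam * obs.sum() / 2.0        # threshold implied by the averaged loss
    L = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        filled = np.where(obs, Y, L)      # keep observed Y, impute treated cells
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        L = (U * np.maximum(s - thresh, 0.0)) @ Vt
    return L
```

The treated cells of the returned `L` are the imputed counterfactuals; a larger `lam` shrinks more singular values to zero and so yields a lower-rank fit.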
96
![Page 107: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/107.jpg)
IFEct vs. MC
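• Singular value decomposition of L
$$\mathbf{L}_{N \times T} = \mathbf{S}_{N \times N}\, \boldsymbol{\Sigma}_{N \times T}\, \mathbf{R}_{T \times T}$$
• Difference in how $\boldsymbol{\Sigma}_{N \times T}$ is regularized
IFE (best subset):
$$\begin{pmatrix} \sigma_1 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_2 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
MC (nuclear norm):
$$\begin{pmatrix} |\sigma_1 - \lambda_L|_+ & 0 & 0 & \cdots & 0 \\ 0 & |\sigma_2 - \lambda_L|_+ & 0 & \cdots & 0 \\ 0 & 0 & |\sigma_3 - \lambda_L|_+ & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & |\sigma_T - \lambda_L|_+ \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
in which $|a|_+ = \max(a, 0)$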
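The two regularization schemes can be compared directly on a small matrix. The sketch below is illustrative only, with hypothetical helper names: hard thresholding keeps the `r` largest singular values unchanged (the IFE, best-subset case), while soft thresholding subtracts a constant from every singular value and truncates at zero (the MC, nuclear-norm case).

```python
import numpy as np

def hard_threshold(M, r):
    """Keep the r largest singular values, set the rest to zero (IFE-style)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_kept = np.where(np.arange(s.size) < r, s, 0.0)
    return (U * s_kept) @ Vt

def soft_threshold(M, lam):
    """Shrink every singular value by lam, truncating at zero (MC-style)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def singular_values(A):
    return np.linalg.svd(A, compute_uv=False)

M = np.diag([5.0, 3.0, 1.0])                      # singular values 5, 3, 1
print(singular_values(hard_threshold(M, 2)))      # two factors kept as they are
print(singular_values(soft_threshold(M, 2.0)))    # all factors shrunk by 2
```

For this matrix, hard thresholding with r = 2 leaves singular values of roughly 5, 3, 0, whereas soft thresholding with a penalty of 2 leaves roughly 3, 1, 0: both reduce the rank to two, but only the second also shrinks the retained factors.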
97
![Page 108: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/108.jpg)
IFEct vs. MC
The pros and cons of IFEct and MC:
• IFEct works better with a small number of strong factors
• MC works better with a large number of weak factors
[Figure: MSPE against the number of factors in the DGP (1 to 9) for IFEct (known r), IFEct (CV-ed r), and MC; the variance of each factor is also shown]
98
![Page 109: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/109.jpg)
Over-fitting of IFEct and MC
When the true DGP is a two-factor IFE model:
[Figure: two panels showing sigma (Pre), SD (ATT), Bias (ATT), and RMSE (ATT). IFEct panel: against the number of factors in the model (0 to 6). MC panel: against log(tuning parameter), from −4.4 to −9.6.]
99
![Page 110: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/110.jpg)
Inferential Methods
• Non-parametric block bootstrap
  • sample with replacement across units
  • valid when N is large and $N_{tr}/N$ is fixed
• A permutation-based test for sharp nulls (Chernozhukov et al. 2019)
  • e.g. $Y_{it}(1) = Y_{it}(0),\ \forall i \in \mathcal{T},\ t > T_{0i}$
  • randomization over time (by blocks) instead of across units
  • valid if T is large, errors are stationary and weakly dependent, and estimators are consistent or stable
  • exact if errors are i.i.d.
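A minimal sketch of the block bootstrap over units is given below. Each draw resamples whole units (entire time series) with replacement and re-computes the estimate. Two choices here are assumptions made for illustration rather than part of the slides: a plain DiD with a common treatment date stands in for the counterfactual estimator, and treated and control units are resampled separately so that every draw contains both groups.

```python
import numpy as np

def did(Y, treated, T0):
    """Plain DiD for a block design with T0 pre-treatment periods."""
    change = Y[:, T0:].mean(axis=1) - Y[:, :T0].mean(axis=1)
    return change[treated].mean() - change[~treated].mean()

def block_bootstrap_se(Y, treated, T0, n_boot=500, seed=0):
    """Standard error from resampling units (rows) with replacement."""
    rng = np.random.default_rng(seed)
    tr_idx = np.flatnonzero(treated)
    co_idx = np.flatnonzero(~treated)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = np.concatenate([
            rng.choice(tr_idx, size=tr_idx.size, replace=True),
            rng.choice(co_idx, size=co_idx.size, replace=True),
        ])
        draws[b] = did(Y[idx], treated[idx], T0)
    return draws.std(ddof=1)
```

Resampling rows rather than individual cells preserves each unit's serial dependence, which is the point of doing the bootstrap by block.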
100
![Page 111: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/111.jpg)
Algorithm for Testing the Sharp Null
Based on Chernozhukov et al (2019):
• Assuming the sharp null is true: $H_0: Y_{it}(0) = Y_{it}(1),\ \forall i \in \mathcal{T},\ t$.
• Denote $Z_t = (Y_{1t}, Y_{2t}, \cdots, Y_{Nt}, X_{1t}, X_{2t}, \cdots, X_{Nt})'$.
• Let $\pi_h$ denote an i.i.d. block permutation, i.e., a one-to-one mapping $\pi_h: \{b_1, \cdots, b_K\} \mapsto \{b_1, \cdots, b_K\}$, in which $\{b_1, \cdots, b_K\}$ is a partition of $\{1, \cdots, T\}$.
101
![Page 112: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/112.jpg)
Algorithm for Testing the Sharp Null
Step 1. Using a counterfactual estimator with real data, calculate the following test statistic: $S(u) = \sqrt{|O|}\,|\widehat{ATT}|$, in which $|O|$ is the number of treated observations.
Step 2. Generate a random i.i.d. block permutation $\pi_h$; reorganize the data (over the time dimension) based on $\pi_h$ while fixing the treatment assignment matrix.
Step 3. Re-estimate the test statistic using the permuted data, obtaining $S(u_{\pi_h})$ using the same method as in Step 1.
Step 4. Repeat Steps 2-3 H times, obtaining $\{S(u_{\pi_1}), \cdots, S(u_{\pi_H})\}$.
Step 5. Calculate the p-value: $p = 1 - F(S(u))$, in which
$$F(x) = \frac{1}{H} \sum_{h=1}^{H} \mathbf{1}\{S(u_{\pi_h}) < x\}$$
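Steps 1 to 5 can be sketched in a few lines. The version below is a simplified illustration under stated assumptions: there are no covariates (so only the outcome matrix is permuted), the blocks are contiguous and of near-equal length, and a plain DiD for a block design stands in for the counterfactual estimator. The function names are hypothetical.

```python
import numpy as np

def did(Y, D):
    """Plain DiD for a block design described by the treatment matrix D."""
    treated = D.any(axis=1)
    post = D.any(axis=0)
    change = Y[:, post].mean(axis=1) - Y[:, ~post].mean(axis=1)
    return change[treated].mean() - change[~treated].mean()

def sharp_null_pvalue(Y, D, estimator, K, H=200, seed=0):
    """Permute K time blocks of Y, holding D fixed, and compare test statistics."""
    rng = np.random.default_rng(seed)
    n_treated_obs = int(D.sum())

    def stat(Y_any):
        return np.sqrt(n_treated_obs) * abs(estimator(Y_any, D))

    s_obs = stat(Y)                                        # Step 1
    blocks = np.array_split(np.arange(Y.shape[1]), K)      # partition of periods
    s_perm = np.empty(H)
    for h in range(H):                                     # Steps 2 to 4
        order = rng.permutation(K)
        cols = np.concatenate([blocks[k] for k in order])
        s_perm[h] = stat(Y[:, cols])
    return 1.0 - np.mean(s_perm < s_obs)                   # Step 5
```

Only the columns of `Y` are shuffled; the treatment matrix `D` stays in place, so each permuted statistic answers the question of how large the estimate would look if the outcome paths had been ordered differently in time.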
102
![Page 113: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/113.jpg)
Diagnostic Tests
![Page 114: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/114.jpg)
A Simulated Example
Data Generating Process:
• T = 35, N = 200
• Outcome model: a linear interactive fixed effects model with two factors, one drift process and one white noise:
$$Y_{it} = \tau_{it} D_{it} + 5 + 1 \cdot X_{it,1} + 3 \cdot X_{it,2} + \lambda_{i1} f_{1t} + \lambda_{i2} f_{2t} + \alpha_i + \xi_t + \varepsilon_{it}$$
• Treatment assignment: staggered adoption with $T_{0i}$ correlated with the additive and interactive fixed effects.
• Treatment effects: $\tau_{i,t>T_{0i}} = 0.2(t - T_{0i}) + e_{it}$
[Figure: treatment status by unit and time (periods 1 to 35): under control vs. under treatment]
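A data generating process of this form can be simulated as below. The slides fix only N, T, the outcome equation, and the form of the treatment effect; everything else in this sketch is an assumption made to obtain runnable code, including the distributions, the exact drift process, the noise scales, and the rule linking adoption timing to the fixed effects. The function name `simulate_panel` is hypothetical.

```python
import numpy as np

def simulate_panel(N=200, T=35, seed=0):
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1)
    f1 = 0.1 * t + np.cumsum(rng.normal(scale=0.3, size=T))  # assumed drift process
    f2 = rng.normal(size=T)                                   # white noise factor
    lam = rng.normal(size=(N, 2))                             # factor loadings
    alpha = rng.normal(size=N)                                # unit fixed effects
    xi = rng.normal(size=T)                                   # time fixed effects
    X1, X2 = rng.normal(size=(2, N, T))                       # two covariates
    eps = rng.normal(size=(N, T))
    # Assumed assignment rule: units with larger fixed effects and loadings
    # adopt earlier; units at or below the median score are never treated.
    score = alpha + lam.sum(axis=1) + rng.normal(size=N)
    T0 = np.full(N, T)                                        # T0 = T: never treated
    ever = score > np.median(score)
    T0[ever] = np.clip(np.round(26 - 3 * score[ever]), 15, 32).astype(int)
    D = (t[None, :] > T0[:, None]).astype(int)                # staggered adoption
    e = rng.normal(scale=0.5, size=(N, T))
    tau = (0.2 * (t[None, :] - T0[:, None]) + e) * D          # zero when untreated
    Y = (tau * D + 5 + 1 * X1 + 3 * X2
         + lam[:, [0]] * f1 + lam[:, [1]] * f2
         + alpha[:, None] + xi[None, :] + eps)
    return Y, D, tau, T0
```

Because adoption timing depends on the loadings, a two-way fixed effects model is misspecified for these data, which is what makes the example useful for comparing FEct with IFEct and MC.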
103
![Page 115: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/115.jpg)
Dynamic Treatment Effects
[Figure: IFEct estimates of dynamic treatment effects; effect on Y against time relative to the treatment]
104
![Page 116: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/116.jpg)
1. Placebo Test
• Hold out the S periods immediately before the treatment’s onset when fitting the model, and estimate the average “treatment effect” in these periods.
• Test whether the average effect is significantly different from zero
• Robust to model misspecification, but uses only limited information
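The placebo test can be sketched as follows, using a two-way fixed effects model as the outcome model on a block design with a common treatment date. The S placebo periods and all post-treatment periods of the treated units are excluded from fitting, and the average residual in the placebo window is then tested against zero. The normal approximation across treated units used for the p value is a simplification chosen to keep the example short, and the function names are hypothetical.

```python
import math
import numpy as np

def twfe_impute(Y, fit_mask):
    """Fit unit and period effects on cells where fit_mask is True; predict all cells."""
    N, T = Y.shape
    rows = np.arange(N * T)
    X = np.zeros((N * T, N + T))
    X[rows, np.repeat(np.arange(N), T)] = 1.0
    X[rows, N + np.tile(np.arange(T), N)] = 1.0
    m = fit_mask.ravel()
    coef, *_ = np.linalg.lstsq(X[m], Y.ravel()[m], rcond=None)
    return (X @ coef).reshape(N, T)

def placebo_test(Y, treated, T0, S):
    """Average placebo effect in the S periods before onset, with a two-sided p value."""
    N, T = Y.shape
    fit_mask = np.ones((N, T), dtype=bool)
    fit_mask[treated, T0 - S:] = False          # hold out placebo and post periods
    Y0_hat = twfe_impute(Y, fit_mask)
    unit_avg = (Y - Y0_hat)[treated, T0 - S:T0].mean(axis=1)
    est = unit_avg.mean()
    se = unit_avg.std(ddof=1) / math.sqrt(unit_avg.size)
    p_value = math.erfc(abs(est / se) / math.sqrt(2.0))
    return est, p_value
```

A large p value is consistent with no pre-trend in the held-out window; a small one suggests that the outcome model fails to track the treated units even before treatment begins.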
● ●
●●
●● ●
● ●
●
●
●●
●●
●
● ● ● ● ● ●● ●
●●
●● ●
●
●
●
●
●
●
●
●
●
●
●
●
●●
Placebo p value: 0.400
100
−5.0
−2.5
0.0
2.5
5.0
7.5
−30 −20 −10 0 10Time relative to the Treatment
Effe
ct o
n Y
IFEct
105
![Page 117: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/117.jpg)
2. Equivalence Test
• Extension of Hartman and Hidalgo (2018) to a TSCS setting
• $H_0: |ATT_t| > \tau$ vs. $H_1: |ATT_t| \le \tau$
• We calculate the maximal possible τ and compare it with a pre-specified threshold: $0.36 \cdot sd(Y_{it} \mid D_{it} = 0)$
• It has more power as the sample size grows, and is more likely to reject the null (hence, equivalence holds) when a confounder is trivial
• Drawback 1: setting the threshold requires user discretion
• Drawback 2: easy to pass when pre-treatment data are used to fit
the model (IFEct and MC)
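For a single pre-treatment period, the equivalence test can be carried out as two one-sided tests (TOST). The sketch below assumes the estimate is approximately normal with a known standard error, which is an assumption of this illustration; the function names are hypothetical, and `bound` plays the role of the equivalence threshold.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tost_pvalue(est, se, bound):
    """p value for H0: |ATT_t| > bound, as the larger of two one-sided p values."""
    p_upper = norm_cdf((est - bound) / se)          # against ATT_t >= bound
    p_lower = 1.0 - norm_cdf((est + bound) / se)    # against ATT_t <= -bound
    return max(p_upper, p_lower)
```

Equivalence is declared when the returned p value falls below the chosen significance level. With an estimate of 0 and a bound of 0.36, for example, a standard error of 0.1 gives a very small p value, whereas an estimate sitting exactly on the bound gives 0.5 regardless of the standard error.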
106
![Page 118: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/118.jpg)
Equivalence Test
[Figure: MC estimates of dynamic treatment effects; effect on Y against time relative to the treatment; Wald p value: 0.321]
107
![Page 119: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/119.jpg)
Why Not a Wald test?
• A simpler test: conduct a Wald (F) test on pre-treatment residual averages
• However, when there exists a small confounder that induces a negligible bias relative to the ATT, a Wald test will almost always reject the null (that equivalence holds) when there are enough data
• The Equivalence Test avoids this problem
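A small numerical illustration of this point, under assumptions chosen purely for arithmetic convenience: the pre-treatment average equals a fixed bias of 0.08 (in units of SD(Y) = 1), its standard error is 1/sqrt(N), the equivalence bound is 0.36, and both tests use a normal approximation at the 5% level. Sampling noise in the estimate itself is ignored.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def wald_rejects(bias, se, alpha=0.05):
    """Two-sided test of 'pre-treatment average = 0'; True means imbalance is declared."""
    p = math.erfc(abs(bias / se) / math.sqrt(2.0))
    return p < alpha

def equivalence_holds(bias, se, bound, alpha=0.05):
    """Two one-sided tests of '|average| > bound'; True means equivalence is declared."""
    p = max(norm_cdf((bias - bound) / se), 1.0 - norm_cdf((bias + bound) / se))
    return p < alpha

for n in (50, 1000):
    se = 1.0 / math.sqrt(n)
    print(n, wald_rejects(0.08, se), equivalence_holds(0.08, se, 0.36))
```

With these assumed numbers, the Wald test does not reject at N = 50 but does reject at N = 1000, even though the bias is unchanged, while the equivalence test declares equivalence at both sample sizes.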
108
![Page 120: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/120.jpg)
Why Not a Wald Test?
• There exists a time-varying confounder
• We vary its influence on the bias in the ATT
[Figure: proportion declared balanced by the Wald and equivalence tests, against Bias / SD(Y) from 0.0 to 0.4]
109
![Page 121: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/121.jpg)
Why Not a Wald Test?
[Figure: proportion declared balanced by the Wald and equivalence tests, against the number of units (N, 0 to 1000); panels: Bias = 0.08 SD(Y) and Bias = 0.28 SD(Y)]
110
![Page 122: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/122.jpg)
Hainmueller & Hangartner (2019)
Does indirect democracy benefit immigrant minorities?
• Unit of analysis: 1400 Swiss municipalities from 1991 to 2009
• Treatment: Indirect (vs. direct) democracy
• Outcome: Naturalization rate
[Figure: treatment status by unit and year, 1991 to 2009: direct democracy vs. indirect democracy]
111
![Page 123: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/123.jpg)
Hainmueller & Hangartner (2019)
[Figure: placebo test; effect on naturalization rate (%) against time relative to the treatment; placebo p value: 0.404]
112
![Page 124: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/124.jpg)
Xu (2017): EDR on Turnout
Does Election Day Registration increase turnout?
• Unit of analysis: 47 US states from 1920 to 2012
• Treatment: EDR reform
• Outcome: Turnout
[Figure: treatment status by state and year, 1920 to 2012: no EDR vs. EDR]
113
![Page 125: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/125.jpg)
Xu (2017): EDR on Turnout
[Figure: FEct estimates of the effect on turnout (%) against time relative to the treatment. Panels: dynamic treatment effects (Wald p value: 0.129) and placebo test (placebo p value: 0.088).]
114
![Page 126: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/126.jpg)
Xu (2017): EDR on Turnout
[Figure: IFEct estimates of the effect on turnout (%) against time relative to the treatment. Panels: dynamic treatment effects (Wald p value: 0.728) and placebo test (placebo p value: 0.316).]
115
![Page 127: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/127.jpg)
Xu (2017): EDR on Turnout
[Figure: MC estimates of the effect on turnout (%) against time relative to the treatment. Panels: dynamic treatment effects (Wald p value: 0.644) and placebo test (placebo p value: 0.094).]
116
![Page 128: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/128.jpg)
Acemoglu et al. (2019): Democracy on Growth
• Unit of analysis: 184 countries over 51 years (1960-2010)
• Treatment: a dichotomous measure of democracy and autocracy
• Outcome: Log GDP per capita (2000 dollars)
Entering Democracy
117
![Page 129: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/129.jpg)
Acemoglu et al. (2019): Democracy on Growth
[Figure: Entering Democracy. Effect on log GDP per capita against time relative to the treatment. Panels: FEct (Wald p value: 0.111), IFEct (Wald p value: 0.182), and MC (Wald p value: 0.079).]
118
![Page 130: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/130.jpg)
Acemoglu et al. (2019): Democracy on Growth
[Figure: Exiting Democracy. Effect on log GDP per capita against time relative to the treatment. Panels: FEct (Wald p value: 0.890), IFEct (Wald p value: 0.270), and MC (Wald p value: 0.620).]
119
![Page 131: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/131.jpg)
Roadmap
• DiD and synthetic control
• FE/DiD assumptions
• New estimators
• Matching and reweighting
• Outcome models
* Diagnostic tools
• Hybrid methods
• Conclusions
![Page 132: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/132.jpg)
Hybrid Methods
![Page 133: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/133.jpg)
Hybrid Methods
• So far, we’ve surveyed two groups of methods: (1) those constructing
balancing weights; (2) those modeling the conditional outcomes
• Combining the two approaches will likely produce doubly robust
estimators
• Some methods we discussed, including semi-parametric DiD, panel
matching, trajectory balancing, are already doing a simple version of
it (balancing plus regression)
• We review two new methods that formally adopt this idea
• Augmented synthetic control (Ben-Michael et al 2018): modeling first
• Synthetic DiD (Arkhangelsky et al. 2019): weighting first
120
![Page 134: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/134.jpg)
Augmented Synthetic Control
• Assuming one treated unit (unit N)
• Procedure
1. Run an outcome model (FEct, IFEct, MC, etc.) and obtain the model fit $m(X_i)$
2. Balance on the residual averages, obtaining weights $\gamma_i$ for the controls
3. Construct the treated unit's counterfactual using:
$$Y^{aug}_N(0) = m(X_N) + \sum_{i=1}^{N-1} \gamma_i \,(Y_i - m(X_i))$$
• The balancing weights take care of the remaining biases from the outcome model; the estimator is thus doubly robust
• Inference via clustered bootstrap
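The augmented counterfactual in step 3 is a one-line computation once the model predictions and the balancing weights are available. The sketch below takes both as given inputs; it does not show how the outcome model is fit or how the weights are estimated, and the function name is hypothetical.

```python
import numpy as np

def augmented_counterfactual(m_treated, m_controls, y_controls, gamma):
    """Model prediction for the treated unit plus weighted control residuals.

    m_treated:  outcome-model prediction for the treated unit (scalar)
    m_controls: outcome-model predictions for the N-1 controls
    y_controls: observed outcomes of the N-1 controls
    gamma:      balancing weights for the controls
    """
    gamma = np.asarray(gamma, dtype=float)
    residuals = np.asarray(y_controls, dtype=float) - np.asarray(m_controls, dtype=float)
    return m_treated + gamma @ residuals
```

Two limiting cases show the structure: if the outcome model fits the controls perfectly, the residuals vanish and the result is the model prediction alone; if the model predicts zero everywhere, the result is the weighted average of control outcomes, as in plain synthetic control.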
121
![Page 135: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/135.jpg)
Synthetic DiD
• Assuming one treated unit (unit N) and one post-treatment period (period T); weights add up to 1
• Procedure
1. Estimate the “synthetic control weight” for each control unit:
$$\omega^{sc} = \arg\min_{\omega} \sum_{t=1}^{T-1} \Big( \sum_{i=1}^{N-1} \omega_i Y_{it} - Y_{Nt} \Big)^2$$
2. Estimate the “synthetic control weight” for each time period:
$$\lambda^{sc} = \arg\min_{\lambda} \sum_{i=1}^{N-1} \Big( \sum_{t=1}^{T-1} \lambda_t Y_{it} - Y_{iT} \Big)^2$$
3. Estimate a weighted DiD by minimizing:
$$\sum_{i=1}^{N} \sum_{t=1}^{T} (Y_{it} - \mu - \alpha_i - \beta_t - X_{it}\gamma - D_{it}\tau)^2\, \omega_i \lambda_t$$
• If either the SC weights or the outcome model is correct, the causal effect will be identified (doubly robust)
• Inference via jackknife
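With one treated unit, one post-treatment period, and no covariates, the weighted DiD in step 3 reduces to a double difference: the post-period gap between the treated unit and its weighted controls, minus the time-weighted average of the same gap in the pre-treatment periods. The sketch below computes that quantity for given weights. It does not estimate the weights (steps 1 and 2 are constrained least-squares problems), and the function name is hypothetical.

```python
import numpy as np

def sdid_estimate(Y, omega, lam):
    """Weighted double difference for one treated unit and one post period.

    Y:     (N, T) outcomes; the last row is the treated unit and the last
           column is the post-treatment period
    omega: (N-1,) unit weights for the controls, summing to 1
    lam:   (T-1,) time weights for the pre-treatment periods, summing to 1
    """
    Y_co_pre, Y_co_post = Y[:-1, :-1], Y[:-1, -1]
    Y_tr_pre, Y_tr_post = Y[-1, :-1], Y[-1, -1]
    post_gap = Y_tr_post - omega @ Y_co_post
    pre_gap = lam @ (Y_tr_pre - omega @ Y_co_pre)
    return post_gap - pre_gap
```

With uniform weights for both `omega` and `lam`, the function returns the ordinary DiD estimate, which makes explicit that synthetic DiD differs from DiD only through the choice of weights.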
122
![Page 136: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/136.jpg)
Conclusions
![Page 137: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/137.jpg)
Concluding Remarks
• Removing unit fixed effects is costly, i.e., it requires strict exogeneity; the alternative
is sequential exogeneity
• “Parallel trends” can very well be wrong; when T is large, we can assess
the assumption by checking the “pre-trend” using diagnostic tests
• Counterfactual estimators can relax the homogeneity assumption with
little cost
• Both the matching/reweighting approach (dealing with D) and the
model-based approach (dealing with Y) can help with time-varying
confounders with sufficient data
• Be aware of the modeling assumptions when the latter is employed
• A hybrid estimator is doubly robust
123
![Page 138: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/138.jpg)
Practical Recommendations
• Plotting raw data helps us see obvious problems
• Start from conventional estimators (i.e. DiD, FEct) and check the
“pre-trend”
• If they don’t work, try easy fixes, e.g. trimming the data to make
the treated and controls more alike
• If that doesn’t work, either, we need more complex methods
• Always ask yourself first: “What’s the hypothetical experiment?”
124
![Page 139: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/139.jpg)
Packages
• panelView: panel data visualization
• fastplm: fast panel linear fixed effects estimation (coming soon)
• gsynth: IFEct/MC approach with non-reversible treatments
• fect: IFEct/MC methods with diagnostic tests (coming soon)
• tjbal: trajectory balancing
Thank you!
http://yiqingxu.org
github.com/xuyiqing
125
![Page 140: Causal Inference with Panel Data · 2020-03-04 · Two-step strategy: 1.estimate the propensity score based on observed covariates; compute the tted value 2.run a weighted DiD model](https://reader034.vdocuments.us/reader034/viewer/2022050413/5f8a1b1fc390f239c87269e9/html5/thumbnails/140.jpg)
References
• Abadie, Alberto, Alexis Diamond, and Jens Hainmueller (2010). Journal of the American Statistical
Association. June 1, 2010, 105(490): 493–505.
• Imai, Kosuke and In Song Kim (2019). “When Should We Use Unit Fixed Effects Regression Models
for Causal Inference with Longitudinal Data?” American Journal of Political Science, Vol. 62, Iss. 2,
April 2019, pp. 467–490.
• Abadie, Alberto (2005). “Semiparametric Difference-in-Differences Estimators,” Review of Economic
Studies (2005) 72, 1–19.
• Hsiao, Cheng, H. Steve Ching and Shui Ki Wan (2012). “A Panel Data Approach for Program
Evaluation: Measuring the Benefits of Political and Economic Integration of Hong Kong with
Mainland China,” Journal of Applied Econometrics, Vol. 27, Iss. 5, August 2012, pp. 705–740.
• Doudchenko, Nikolay and Guido Imbens (2016). “Balancing, Regression, Difference-In-Differences
and Synthetic Control Methods: A Synthesis.” Working Paper Stanford University.
• Hazlett, Chad and Yiqing Xu (2018). “Trajectory Balancing: A Kernel Method for Causal Inference
with Time-Series Cross-Sectional Data.” Working Paper, UCLA.
• Xu, Yiqing (2017). “Generalized Synthetic Control Method: Causal Inference with Interactive Fixed
Effects Models” Political Analysis, Vol. 25, Iss. 1, January 2017, pp. 57-76.
• Athey, Susan, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens, Khashayar Khosravi (2017).
“Matrix Completion Methods for Causal Panel Data Models.” Working Paper, Stanford University.
• Liu, Licheng, Ye Wang, Yiqing Xu (2019). “A Practical Guide to Counterfactual Estimators for
Causal Inference with Time-Series Cross-Sectional Data.” Working Paper, Stanford University.
• Ben-Michael, Eli, Avi Feller, Jesse Rothstein (2018). “The Augmented Synthetic Control Method.”
Working Paper, UC Berkeley.
• Arkhangelsky, Dmitry, Susan Athey, David A. Hirshberg, Guido W. Imbens and Stefan Wager (2019).
“Synthetic Difference In Differences.” Working Paper, UC Berkeley.
126