publ0055:introductiontoquantitativemethods lecture2 ... · insured health age female non_white...
TRANSCRIPT
PUBL0055: Introduction to Quantitative Methods
Lecture 2: Causality
Jack Blumenau and Benjamin Lauderdale
1 / 49
Motivation
Does health insurance improve health outcomes?A critical issue in public policy is whether and how governments shouldprovide health care to their citizens. But is there a causal effect ofhealth insurance on actual levels of health? We will evaluate thisquestion, and use it as an example to illustrate strengths andweaknesses of experimental and observational research.
• Y (Dependent variable): Health
• What is the self-assessed health of an individual?
• X (Independent variable, or ”treatment”): Health insurance
• Does the individual have health insurance, or not?
2 / 49
Lecture Outline
Causality and counterfactuals
Difference in means and confounding
Randomized experiments and causality
Observational studies and causality
Conclusions
3 / 49
Causality and counterfactuals
Why causality?
• Cause-and-effect relationships are at the heart of some of the mostinteresting social science theories
• We can often express important research topics as a simplequestion: Does X cause Y?
• Does increasing the minimum wage decrease employment?• Does economic development lead to democracy?• Does the race of a politician affect their chances of being elected?
• Establishing causal effects using quantitative data is difficult!
• Today we focus on defining what causal effects are, and how wemight think about measuring them using quantitative data
4 / 49
Counterfactuals
Whenever a person makes a causal statement, they are contrasting whatthey observe (something factual) with what they believe they wouldobserve if a key condition was different (something counterfactual).
• “Austerity caused Brexit” (Fetzer, 2019)
• Observation: Austerity policy in the UK and a vote to leave the EU• Belief: Without austerity, smaller Leave vote
• “Oil wealth inhibits democracy” (Ross, 2013)
• Observation: Oil-rich countries tend to be less democratic• Belief: If countries had less oil they would be more democratic
Problem: We can’t observe counterfactual outcomes! Causal inferencerequires estimating counterfactuals for comparison to realised outcomes.
5 / 49
Potential outcomes and causal effects
We can define the causal effect of an independent variable𝑋 on anoutcome variable 𝑌 by considering the potential outcomes of 𝑌Potential outcomesThe potential outcomes of 𝑌 are the values of 𝑌 that would be realisedfor different values of𝑋. e.g.
• 𝑌𝑖(1) = the value that 𝑌𝑖 would have been if𝑋𝑖 was equal to 1
• 𝑌𝑖(0) = the value that 𝑌𝑖 would have been if𝑋𝑖 was equal to 0
Treatment effectFor any given individual, if we could observe both potential outcomes,the treatment effect of X on Y for that individual can be calculated as
𝑌𝑖(1) − 𝑌𝑖(0)
6 / 49
Potential outcomes and causal effects (example)
What are the potential outcomes in our health insurance example?
• 𝑌𝑖(1) = Health of individual 𝑖 if they have health insurance• 𝑌𝑖(0) = Health of individual 𝑖 if they do not have health insurance
What are the treatment effects?
• If 𝑌𝑖(1) > 𝑌𝑖(0) then insurance improves health• If 𝑌𝑖(1) < 𝑌𝑖(0) then insurance worsens health• If 𝑌𝑖(1) = 𝑌𝑖(0) then insurance has no effect on health
7 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1
5 3 2
2 1
5 4 1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5
3 2
2 1
5 4 1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3
2
2 1
5 4 1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1
5 4 1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5
4 1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4
1
3 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0
3 3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3
3 0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3
0
4 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0
4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4
3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3
1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
Potential outcomes and causal effects (example)
• 𝑋𝑖 = 1 if the individual is insured, and𝑋𝑖 = 0 if uninsured• 𝑌𝑖(1) is the health of the individual if they were insured• 𝑌𝑖(0) is the health of the individual if they were uninsured• The treatment effect for an individual is 𝑌𝑖(1) − 𝑌𝑖(0)
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
8 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1
5 ? ?
2 1
5 ? ?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5
? ?
2 1
5 ? ?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ?
?
2 1
5 ? ?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1
5 ? ?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5
? ?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ?
?
3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0
? 3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ?
3 ?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3
?
4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3 ?4 0
? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3 ?4 0 ?
3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3 ?4 0 ? 3
?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3 ?4 0 ? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
But we cannot observe both potential outcomes for any individual!
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Causal effect
1 1 5 ? ?2 1 5 ? ?3 0 ? 3 ?4 0 ? 3 ?
Average treatment effect (ATE)= ?+?+?+?4 = ?
4 = ?
9 / 49
The Fundamental Problem of Causal Inference
The Fundamental Problem of Causal InfereceWe only ever observe one potential outcome for a given individual, andour observed outcome depends on the status of our explanatoryvariable. We can never directly observe individual causal effects.
10 / 49
11 / 49
Difference in means and confounding
Quantity of Interest
We want to estimate the average treatment effect (ATE):
1𝑁
𝑁∑𝑖=1
{𝑌𝑖(1) − 𝑌𝑖(0)}
But we can’t observe 𝑌𝑖(1) and 𝑌𝑖(0) for any given unit!One alternative is to use the difference-in-means to estimate the ATE:
𝑌𝑋=1 − 𝑌𝑋=0
where 𝑌𝑋=1 and 𝑌𝑋=0 are the average observed outcomes for treatedand control units, respectively.
Question: Is the difference-in-means equal to the ATE?
12 / 49
Using the difference-in-means to estimate the ATE
Question: Is the difference-in-means likely to equal the ATE?
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1Difference-in-means= 5+5
2 − 3+32 = 5 − 3 = 2
No! The difference-in-means is larger than the ATE. Why?
13 / 49
Using the difference-in-means to estimate the ATE
Question: Is the difference-in-means likely to equal the ATE?
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1
Difference-in-means= 5+52 − 3+3
2 = 5 − 3 = 2No! The difference-in-means is larger than the ATE. Why?
13 / 49
Using the difference-in-means to estimate the ATE
Question: Is the difference-in-means likely to equal the ATE?
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1Difference-in-means= 5+5
2 − 3+32 = 5 − 3 = 2
No! The difference-in-means is larger than the ATE. Why?
13 / 49
Using the difference-in-means to estimate the ATE
Question: Is the difference-in-means likely to equal the ATE?
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1Difference-in-means= 5+5
2 − 3+32 = 5 − 3 = 2
No! The difference-in-means is larger than the ATE.
Why?
13 / 49
Using the difference-in-means to estimate the ATE
Question: Is the difference-in-means likely to equal the ATE?
Individual 𝑋𝑖 𝑌𝑖(1) 𝑌𝑖(0) Treatment effect
1 1 5 3 22 1 5 4 13 0 3 3 04 0 4 3 1
Average treatment effect (ATE)= 2+1+0+14 = 4
4 = 1Difference-in-means= 5+5
2 − 3+32 = 5 − 3 = 2
No! The difference-in-means is larger than the ATE. Why?
13 / 49
Confounding
Confounding differences between treatment and control groups meanthat the difference-in-means can give a biased estimate of the ATE.
1. Causal effect of job training on employment
• Finding: People with training are more likely to be employed• Confounding: People with training may be more motivated• Direction of bias: Positive, we overstate the effectiveness of training
2. Causal effect of aspirin on headache symptoms
• Finding: People who take aspirin report worse headache symptoms• Confounding: People who took aspirin had headaches to begin with!• Direction of bias: Negative, we understate the effectiveness of aspirin
→ confounding bias can be positive or negative!
Question: Is confounding likely in the health care example?
14 / 49
Confounding bias example
Does health insurance improve health outcomes?The National Health Interview Survey (NHIS) is an annual survey of theUS population that asks questions about health and health insurance.We are interested in two questions in particular:
• Y (Dependent variable): health
• “Would you say your health in general is excelent (5), very good (4),good (3), fair (2), or poor (1)?”
• X (Independent variable): insured
• “Do you have health insurance?” TRUE = Insured, FALSE = Notinsured
We will also use information from some of the other questions on thesurvey (gender, income, race, etc).
15 / 49
Confounding bias example
Here are the first six rows of the NHIS data:
Table 4: NHIS data
id insured health age female years_educ non_white income
1 FALSE 4 29 TRUE 14 FALSE 19282.932 FALSE 4 35 FALSE 11 FALSE 19282.933 TRUE 3 32 FALSE 12 FALSE 167844.534 TRUE 3 34 TRUE 16 FALSE 167844.535 TRUE 4 45 FALSE 12 FALSE 85985.786 TRUE 4 44 TRUE 12 FALSE 85985.78
16 / 49
Difference-in-means
To calculate the difference in means between the two groups we will use:
• the mean() function• the subsetting ([]) operator
## Mean health level for insured individualsmean(nhis$health[nhis$insured == TRUE])
## [1] 3.943847
## Mean health level for non-insured individualsmean(nhis$health[nhis$insured == FALSE])
## [1] 3.617647
Insured individuals report health levels that are approximately 10%better than non-insured individuals.
17 / 49
Difference-in-means
To calculate the difference in means between the two groups we will use:
• the mean() function• the subsetting ([]) operator
## Mean health level for insured individualsmean(nhis$health[nhis$insured == TRUE])
## [1] 3.943847
## Mean health level for non-insured individualsmean(nhis$health[nhis$insured == FALSE])
## [1] 3.617647
Insured individuals report health levels that are approximately 10%better than non-insured individuals.
17 / 49
Difference-in-means
To calculate the difference in means between the two groups we will use:
• the mean() function• the subsetting ([]) operator
## Mean health level for insured individualsmean(nhis$health[nhis$insured == TRUE])
## [1] 3.943847
## Mean health level for non-insured individualsmean(nhis$health[nhis$insured == FALSE])
## [1] 3.617647
Insured individuals report health levels that are approximately 10%better than non-insured individuals.
17 / 49
Difference-in-means
Questions:
1. Can we interpret the difference in means here as causal?2. How might we assess the possibility of confounding bias here?
18 / 49
Checking “balance”
One implication of confounding: treatment and control units will bedifferent with respect to characteristics other than the treatment.
We can check this by evaluating the degree of balance for treated andcontrol units on different pre-treatment variables.
1. Calculate ��𝑇=1 (average value of pre-treatment variable for treated)2. Calculate ��𝑇=0 (average value of pre-treatment variable for control)3. Large differences between ��𝑇=1 and ��𝑇=0 suggest confounding
For example, for age:mean(nhis$age[nhis$insured == TRUE]) -
mean(nhis$age[nhis$insured == FALSE])
## [1] 2.318222
→ The insured are, on average, 2.3 years older than the uninsured
19 / 49
Checking “balance”
One implication of confounding: treatment and control units will bedifferent with respect to characteristics other than the treatment.
We can check this by evaluating the degree of balance for treated andcontrol units on different pre-treatment variables.
1. Calculate ��𝑇=1 (average value of pre-treatment variable for treated)2. Calculate ��𝑇=0 (average value of pre-treatment variable for control)3. Large differences between ��𝑇=1 and ��𝑇=0 suggest confounding
For example, for age:mean(nhis$age[nhis$insured == TRUE]) -
mean(nhis$age[nhis$insured == FALSE])
## [1] 2.318222
→ The insured are, on average, 2.3 years older than the uninsured
19 / 49
Checking “balance”
insured health age female non_white years_educ income
Uninsured 3.6 41.0 46.4 17.8 11.6 42892.6Insured 4.0 43.3 49.2 15.3 14.3 101315.4Difference 0.3 2.3 2.8 -2.4 2.7 58422.9
Insured individuals are• less likely to be non-white (15.3%) than the non-insured (17.8%)• more likely to be female (49.2%) than the non-insured (46.4%)• older (43) than non-insured (41)• more educated (14.3 years) than the non-insured (12 years)• significantly richer ($101315) than the non-insured ($42893)
Implications:1. Race, gender, age, education and income are all imbalanced withrespect to insurance status.
2. Confounding is very likely a problem in this data.20 / 49
21 / 49
Randomized experiments and causality
Observational studies and randomized experiments
Randomized experiments• Units are randomly assigned todifferent X values
• Researchers directly intervenein the world they study
• Gold standard for causalinference
Observational studies• Units are assigned to X values“by nature”
• Researchers observe, but donot intervene in, the world
• Very commonly used in socialscience research
22 / 49
Randomization and selection bias
We saw previously that, in general, the difference-in-means will not be anunbiased estimate of the ATE because of confounding.
Question: How can we avoid this problem?
Answer: Randomize!
Key intuitionRandomization of treatment doesn’t eliminate differences betweenindividuals, but ensures that the mix of individuals being compared isthe same on average.
23 / 49
Randomization and selection bias
We saw previously that, in general, the difference-in-means will not be anunbiased estimate of the ATE because of confounding.
Question: How can we avoid this problem? Answer: Randomize!
Key intuitionRandomization of treatment doesn’t eliminate differences betweenindividuals, but ensures that the mix of individuals being compared isthe same on average.
23 / 49
Randomization and selection bias
We saw previously that, in general, the difference-in-means will not be anunbiased estimate of the ATE because of confounding.
Question: How can we avoid this problem? Answer: Randomize!
Key intuitionRandomization of treatment doesn’t eliminate differences betweenindividuals, but ensures that the mix of individuals being compared isthe same on average.
23 / 49
Randomization and selection bias
When the treatment,𝑋, is randomly assigned to units…
• … treated and untreated groups will be similar, on average, in termsof all characteristics (both observed and unobserved)
• … the only systematic difference between the two will be the receiptof treatment
• … the average outcome for controls will be similar to what wouldhave happened for the treatment group if they had not been treated
→ we can use the difference in means to estimate the ATE!
24 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
All units
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
1 sample
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
2 samples
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
5 samples
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
100 samples
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
1000 samples
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias (visualization)
0 5 10 15 20
010
2030
40
Potential Outcome (Control)
Cou
nt
0 5 10 15 20
05
1015
2025
3035
6000 samples
Potential Outcome (Treatment)
Cou
nt
4.0 4.5 5.0 5.5 6.0 6.5
050
100
150
Difference in means
Cou
nt
True ATE
25 / 49
Randomization and selection bias
• Randomization of the treatment makes the difference in groupmeans an unbiased estimator of the true ATE
• On average, you can expect a randomized experiment to get theright answer
• This does not guarantee that the answer you get from any particularrandomization will be exactly correct!
26 / 49
Example – Experimental data
Does health insurance improve health outcomes?The RAND Health Insurance Experiment (RAND) was an experimentconducted between 1974 and 1982 in the US. In this experiment,researchers randomly allocated individuals to receive health insurance.
• Y (Dependent variable): health
• “Would you say your health in general is excelent (5), very good (4),good (3), fair (2), or poor (1)?”
• X (Independent variable): insured
• Was the participant randomly allocated to receive health insurance?TRUE = Insured, FALSE = Not insured
We will again use information from some of the other questions on thesurvey (gender, income, race, etc).
27 / 49
Experimental data
Here are the first six rows of the RAND data:
Table 6: RAND data
id insured health age female years_educ non_white income
1 FALSE 4 42 FALSE 12 TRUE 67486.4842 FALSE 4 43 TRUE 12 TRUE 67486.4843 TRUE 2 38 TRUE 12 TRUE 27608.1074 TRUE 3 24 FALSE 12 TRUE 4322.2035 TRUE 4 60 TRUE 9 TRUE 24540.5416 TRUE 4 59 FALSE 4 TRUE 24540.541
28 / 49
Difference-in-means - Experimental data
Let’s calculate the difference in means again using the experimental data:
## Mean health level for insured individualsmean(rand$health[rand$insured == TRUE])
## [1] 3.405373
## Mean health level for non-insured individualsmean(rand$health[rand$insured == FALSE])
## [1] 3.423304
The difference between insured and non-insured individuals almostcompletely disappears in the experimental data!
29 / 49
Results - Experimental data
insured health age female non_white years_educ income
Uninsured 3.4 33.1 56.0 16.9 12.1 32597.2Insured 3.4 33.2 53.4 17.1 12.0 31220.9Difference 0.0 0.1 -2.6 0.2 -0.1 -1376.3
Insured and non-insured individuals• are equally likely to be non-white• are equally likely to be female• have similar ages• have similar years of education• have similar income levels.• and also report very similar health outcomes!
Implications:1. There is less evidence of imbalance in the experimental data2. The experimental effect of insurance is much smaller
30 / 49
Observational and Experimental Results Compared
Health
Age
Female
Non−White
Education
Income
−60 −40 −20 0% difference between non−insured and insured
ExperimentalObservational
31 / 49
Why not always use experiments?
If randomized experiments are the “gold standard” for causal inference,why not always use them?
1. Practical concerns
• It is often infeasible to randomly vary the “treatment” you care about• e.g. Could you randomize the type of electoral system in a country?
2. Ethical concerns
• Experiments involving real people can be ethically dubious• e.g. Manipulating the type of messages people see on Facebook
3. Cost concerns
• Experiments can be expensive in both money and time• Particularly relevant to planning dissertation projects
32 / 49
33 / 49
Observational studies and causality
Observational research designs
Researchers using observational data cannot rely on randomization tosolve the selection bias problem so they must employ alternative designs:
1. Cross-sectional designs
• Compare outcomes for treated and control units, possibly adjustingfor pre-treatment characteristics
2. Before and after designs
• Compare outcomes before and after treatment for treated unitsusing over-time data
3. Difference-in-differences designs
• Compare outcomes before and after treatment for treated units, andcompare to the same change over time for control units
34 / 49
Cross-sectional designs
• Data: One observation per unit (i.e. individuals, countries, firms, etc),many units
• Approach: Compare average outcomes between treatment andcontrol units, often “controlling” for potential confounders
• Assumption: No unmeasured confounding between treated andcontrol units
35 / 49
Statistical control (example)
Our observational difference-in-means estimate is relatively large:
mean(nhis$health[nhis$insured == T]) -mean(nhis$health[nhis$insured == F])
## [1] 0.3262003
But insurance status is imbalanced with respect to income:
mean(nhis$income[nhis$insured == T]) -mean(nhis$income[nhis$insured == F])
## [1] 58422.87
How can we control for income?
36 / 49
Subclassification (example)
Subclassification: calculate differences between treatment and controlwithin levels of a confounding variable.
Imagine that we have just 3 levels of income (low, mid and high).
Caluclate the average health of insured and uninsured for each level:
insured_low_mean <- mean(nhis$health[nhis$insured == T &nhis$income_cat == "Low"])
uninsured_low_mean <- mean(nhis$health[nhis$insured == F &nhis$income_cat == "Low"])
insured_low_mean - uninsured_low_mean
## [1] 0.1010293
37 / 49
Subclassification
Subclassification: calculate differences between treatment and controlwithin levels of a confounding variable.
Imagine that we have just 3 levels of income (low, mid and high).
Caluclate the average health of insured and uninsured for each level:
insured_mid_mean <- mean(nhis$health[nhis$insured == T &nhis$income_cat == "Mid"])
uninsured_mid_mean <- mean(nhis$health[nhis$insured == F &nhis$income_cat == "Mid"])
insured_mid_mean - uninsured_mid_mean
## [1] 0.04954519
37 / 49
Subclassification
Subclassification: calculate differences between treatment and controlwithin levels of a confounding variable.
Imagine that we have just 3 levels of income (low, mid and high).
Caluclate the average health of insured and uninsured for each level:
insured_high_mean <- mean(nhis$health[nhis$insured == T &nhis$income_cat == "High"])
uninsured_high_mean <- mean(nhis$health[nhis$insured == F &nhis$income_cat == "High"])
insured_high_mean - uninsured_high_mean
## [1] 0.1542666
37 / 49
Subclassification
insured_low_mean - uninsured_low_mean
## [1] 0.1010293
insured_mid_mean - uninsured_mid_mean
## [1] 0.04954519
insured_high_mean - uninsured_high_mean
## [1] 0.1542666
Consequence: Once we control for income, the effects of insurance onhealth are much smaller than in the naive difference-in-meanscomparison.
38 / 49
Cross-sectional designs summary
1. Cross-sectional data (one observation per unit) is very common inthe social sciences
2. Using statistical “control” strategies is a common approach forattempting to overcome confounding bias
3. Strong assumptions required: we must control for all potentialconfounders to be confident in the causal claims we are making
4. We will revisit these designs in weeks 4, 5 and 6, particularly in thecontext of regression models
39 / 49
Before and after designs
• Data: Many observations of treated units over time
• Approach: Difference in average outcomes before and aftertreatment
• Assumption: No time varying confounding/selection bias
• Example: A health care reform that provides all citizens withcoverage at a given point in time.
40 / 49
Before and after designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
41 / 49
Before and after designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
Treatment effect?
41 / 49
Before and after designs
Advantages:
• Holds unit-specific features “constant” by comparing outcomeswithin units over time
• Only requires data on treated units
Disadvantages:
• Requires assuming that nothing else changed for treated units
• E.g. Maybe health outcomes are just getting better over time?
• Requires data from multiple time points
42 / 49
Difference-in-differences designs
• Data: Many observations of many units, some of which are treatedat some point in time
• Approach: Difference in average outcomes before and aftertreatment for treated group, compared to same difference forcontrol group
• Assumption: Without treatment, treatment group outcomes wouldhave mirrored those in control group (“parallel trends”)
• Example: A health care reform that provides citizens of some stateswith health care coverage at a given point in time. Healthcare ofcitizens in non-treated states is not affected.
43 / 49
Difference-in-differences designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
Treated states
44 / 49
Difference-in-differences designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
Treated states
Control states
44 / 49
Difference-in-differences designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
Treated states
Control states
Counterfactual
44 / 49
Difference-in-differences designs
0.0
0.5
1.0
1.5
2.0
2.5
Ave
rage
Hea
lth
Pre−treatment Reform Post−treatment
Treatment effect
44 / 49
Difference in differences designs
Advantages:
• Holds unit-specific features “constant” by comparing outcomeswithin units over time
• Removes confounding by shocks that are common to all units(treatment and control)
• Does not assume that outcomes remain constant over time
Disadvantages:
• Requires assuming that nothing else changed only for treated units
• E.g. Did other policies change in treatment states at the same time?
• Requires data from multiple time points and multiple types of units
45 / 49
Observational designs and assumptions
Each of these approaches to using observational data requireassumptions.
• Cross-sectional: No confounding
• Treatment and control groups should be similar in observed andunobserved characteristics
• Before/after: No time-varying confounding
• Without treatment, treatment units would not have changedbetween pre- and post-treatment periods
• Difference-in-differences: Parallel trends
• Without treatment, treatment units would have followed a similartrajectory as control units
46 / 49
External and internal validity
Internal validity: Can we interpret estimates in a study as causal?
• Randomized experiments have high internal validity→ treatmentand control groups are similar on average
• Observational studies have lower internal validity→ pre-treatmentvariables may differ between treatment and control
External validity: How generalizable are the conclusions of a study?
• Randomized experiments tend to have lower external validity→difficult to conduct on representative samples
• Observational studies tend to have higher external validity→easier to use representative samples or the population itself
Picking a research design requires trading off between these goals.
47 / 49
Conclusions
What have we covered?
• Causality can be understood as a process of counterfactualreasoning
• We can understand causal effects as being the differences inpotential outcomes for units with different treatment statuses
• Randomized experiments are the “gold standard” for causalinference
• Observational approaches can also be used for making causalclaims, but they require strong assumptions
• Causal inference is hard. But at least now you know that it is hard.
48 / 49
Seminar
In seminars this week, you will learn about …
1. … working directories
2. … loading data into R
3. … calculating causal effects
4. … thinking critically about causality in empirical research
49 / 49