Methodological Workshop 2:
Ignorability, Selection Bias, and Causal Inference
Yu Xie
University of Michigan
Observed Data
A population with N individuals, from which we draw a sample of size n.
There is an outcome of interest, say Y, that is measured on the real line.
There is an independent variable of interest, say D. For simplicity, let us assume that D is a binary “treatment”: D=1 (T), D=0 (C). This is the simplest case.
Let us call this setup the “canonical case.”
Canonical Case Examined
What is the causal effect of treatment D?
It is the counterfactual effect for the ith individual:
$Y_i^T - Y_i^C$
However, we observe either
$Y_i^T$ when $D_i = 1$, or
$Y_i^C$ when $D_i = 0$.
Conclusion: it is not possible to identify individual-level causal effects without assumptions.
At Another Extreme
We can impose a strong, unrealistic assumption that all individuals are identical (a homogeneity assumption often made in physical science); then we have
$Y_i^T = Y^T$; $Y_i^C = Y^C$
We only need two observations to identify the causal effect: $Y^T$ when $D = 1$ and $Y^C$ when $D = 0$.
Implication: it is population variability that makes “scientific sampling” necessary.
Yu Xie’s “Fundamental Paradox in Social Science”
There is always variability at the individual level.
Causal inference is impossible at the individual level and thus always requires statistical analysis at the group level on the basis of some homogeneity assumption.
Different methods boil down to different comparison groups.
Consider the Usual Case
The population is divided into two subpopulations: $P_1$ if $D_i = 1$, $P_0$ if $D_i = 0$.
Use the following notation:
$q$ = proportion of $P_0$ in $P$
$E(Y_1^T) = E(Y^T \mid D=1)$, $E(Y_1^C) = E(Y^C \mid D=1)$
$E(Y_0^T) = E(Y^T \mid D=0)$, $E(Y_0^C) = E(Y^C \mid D=0)$
By the total expectation rule:
$$
\begin{aligned}
\mathrm{ATE} = E(Y^T - Y^C) &= E(Y_1^T - Y_1^C)(1-q) + E(Y_0^T - Y_0^C)\,q \\
&= E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q,
\end{aligned}
$$
where $\delta_1 = E(Y_1^T - Y_1^C) = \mathrm{TT}$ and $\delta_0 = E(Y_0^T - Y_0^C) = \mathrm{TUT}$.
In Other Words
The standard estimator $E(Y_1^T - Y_0^C)$ contains two sources of bias:
(1) The average difference between $P_1$ and $P_0$ in the absence of treatment (“pre-treatment heterogeneity bias,” or “Type I selection bias”): $E(Y_1^C - Y_0^C)$
(2) The difference in the average treatment effect between $P_1$ and $P_0$ (“treatment-effect heterogeneity bias,” or “Type II selection bias”): $\delta_1 - \delta_0$, i.e., TT minus TUT
Both sources of bias average to zero under randomized assignment.
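The decomposition can be checked numerically. The sketch below is an illustration only, not part of the original slides: it assumes a hypothetical six-person population in which both potential outcomes are known for everyone (never the case with real data), and the function name `decompose` and all numbers are invented for the example.

```python
def mean(values):
    return sum(values) / len(values)


def decompose(population):
    """Split the naive estimator into ATE plus the two selection biases.

    population: list of (y_c, y_t, d) tuples, with BOTH potential outcomes
    known for every individual (a hypothetical, for illustration only).
    """
    treated = [(yc, yt) for yc, yt, d in population if d == 1]    # P1
    untreated = [(yc, yt) for yc, yt, d in population if d == 0]  # P0
    q = len(untreated) / len(population)  # proportion of P0 in P

    # Standard estimator: E(Y^T | D=1) - E(Y^C | D=0)
    naive = mean([yt for _, yt in treated]) - mean([yc for yc, _ in untreated])
    # Type I bias: difference between P1 and P0 in the absence of treatment
    type1 = mean([yc for yc, _ in treated]) - mean([yc for yc, _ in untreated])
    delta1 = mean([yt - yc for yc, yt in treated])    # TT
    delta0 = mean([yt - yc for yc, yt in untreated])  # TUT
    ate = mean([yt - yc for yc, yt, _ in population])
    return {"q": q, "naive": naive, "type1": type1,
            "delta1": delta1, "delta0": delta0, "ate": ate}


# Hypothetical population: (Y^C, Y^T, D)
toy = [(10, 16, 1), (12, 20, 1), (14, 18, 1),
       (8, 10, 0), (9, 12, 0), (10, 11, 0)]
result = decompose(toy)
rebuilt = (result["naive"] - result["type1"]
           - (result["delta1"] - result["delta0"]) * result["q"])
print(result["ate"], rebuilt)
```

In this toy population the treated group has both a higher untreated baseline (Type I) and a larger treatment effect (Type II), so the naive difference overstates the ATE on both counts.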
In Regression Language
$Y_i = \alpha + \delta_i D_i + \varepsilon_i$
There are two types of variability that may cause biases:
(1) Type I selection bias (focusing on $\varepsilon_i$): if $\mathrm{corr}(\varepsilon, D) \neq 0$.
(2) Type II selection bias (focusing on $\delta_i$): if $\mathrm{corr}(\delta, D) \neq 0$.
Selection Bias and Estimands
$$\mathrm{ATE} = E(Y^T - Y^C) = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$$
When Type I selection bias is present but Type II selection bias is absent (say, a homogeneous treatment effect):
$E(Y_1^T - Y_0^C) \neq \mathrm{ATE} = \delta_1 = \delta_0$
When Type I selection bias is absent but Type II selection bias is present:
$E(Y_1^T - Y_0^C) \neq \mathrm{ATE} \neq \delta_1 \neq \delta_0$
Type II selection bias is important.
Type II selection bias cannot be eliminated by a “fixed-effects” approach.
Ignorability and Selection Bias

Ignorability assumed?        Type of selection bias
(Invoking unobservables?)    Type I                                         Type II
Yes (No)                     Propensity score (Rubin et al.)                ?
No (Yes)                     Structural selection model (Heckman et al.)    Non-parametric IV models (Heckman et al.)
IV versus LATE
Exactly the same formula, but different interpretations.
IV interpretation: constant treatment effect.
LATE interpretation: heterogeneous treatment effects, averaged into different groups (strata).
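The slides do not write the shared formula out. For a binary instrument it is commonly expressed as the Wald ratio, and the sketch below assumes that case. The function name `wald_estimate` and the toy data are hypothetical and serve only to show the arithmetic.

```python
def mean(values):
    return sum(values) / len(values)


def wald_estimate(y, d, z):
    """Wald ratio for a binary instrument z (assumed setting, not from the slides):

        (E[Y | Z=1] - E[Y | Z=0]) / (E[D | Z=1] - E[D | Z=0])
    """
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment; ratio undefined")
    return (mean(y1) - mean(y0)) / first_stage


# Hypothetical data: instrument z, treatment d, outcome y
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [12, 14, 10, 4, 12, 6, 4, 2]
estimate = wald_estimate(y, d, z)
print(estimate)
```

The number computed is the same under both readings. What changes is the claim attached to it: a single effect common to everyone (IV), or an average over the subgroup whose treatment status responds to the instrument (LATE).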
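Heckman Selection Model
[Model equations garbled in extraction. Recoverable elements: outcome Y, covariates X, selection covariates Z, treatment D determined by a latent rule, and error terms distributed bivariate normal (BN).]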
Important Role of Unobservables
The treatment of selection bias in economics requires specification of unobserved variables.
Such specifications are subject to dispute.
The issue of unobservables also splits economists and statisticians into two camps.
As a result, not enough attention has been paid to the (1,2) cell, marked by “?”.
Missing Knowledge
We do not know much about the cell marked by “?”.
Most work in economics on selection bias assumes that ignorability does not hold true.
Since we can easily handle Type I selection bias under ignorability, it seems that Type II selection bias under ignorability is a trivial matter.
I will show that this is not true.
Making Sense
In this presentation, I discuss a simple scenario where Type II selection bias (which I call “composition bias”) arises from a common situation in which we assume ignorability.
Ignorability Assumption
Also called “selection on observables.”
Let X denote a vector of observed covariates. The ignorability assumption states:
$D \perp\!\!\!\perp (Y^C, Y^T) \mid X$.
We start with the assumption, although we do not necessarily believe that it is true.
We want to learn as much as the data can tell us.
Under the Ignorability Assumption
The important work by Rosenbaum and Rubin (1984) shows that, when the ignorability assumption holds true, it is sufficient to condition on the propensity score as a function of X. The condition is changed to
$D \perp\!\!\!\perp (Y^C, Y^T) \mid p(D=1 \mid X)$.
In Other Words
There is no bias, conditional on the propensity score:
$E[Y^T - Y^C \mid p(X)] = E[Y_1^T - Y_0^C \mid p(X)]$
Recall Earlier Result
$E(Y^T - Y^C)$ [ATE] $= E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$.
The ignorability assumption thus means:
No Type I selection bias, conditional on p(X):
$E[Y_1^C - Y_0^C \mid p(X)] = 0$
$E[Y_0^C \mid p(X)] = E[Y_1^C \mid p(X)] = E[Y^C \mid p(X)]$
No Type II selection bias, conditional on p(X):
$E[(Y_1^T - Y_1^C) - (Y_0^T - Y_0^C) \mid p(X)] = 0$
$E[Y_1^T - Y_0^C \mid p(X)] = E[Y_1^T - Y_1^C \mid p(X)] = E[(Y^T - Y^C) \mid p(X)]$
Implications
Implication 1: we should conduct propensity-score-specific analysis under ignorability.
Implication 2: the only “interaction” effects that can lead to selection bias (Type II) are those between the treatment status and the propensity score.
Setup
Two requirements:
There are heterogeneous treatment effects.
The heterogeneity in treatment effects is correlated with the propensity of treatment.
Both requirements are accepted in the standard (statistical) approach assuming ignorability.
We wish to show:
(1) treatment-effect heterogeneity => Type II selection bias.
(2) Type II selection bias = composition bias.
(3) This happens without unobservables.
Example I: Market Premium in Contemporary China
We found that the social mechanisms and social consequences of transitioning from the state sector to the market significantly changed over time (Wu and Xie 2003, ASR).
Jann (2005) and Xie and Wu (2005)
Jann argued that there is no statistical difference in returns to education between early entrants and late entrants, and thus that Wu and Xie’s conclusion is incorrect.
Social processes generating the three groups are cumulative, so the three groups are not symmetric.
[Figure: Flow Chart of Labor Market Transitions in China, 1978 – 1996. Time axis (Year): 1978, 1987, 1996; vertical axis: State Sector vs. Market Sector. Experienced Workers (1197) divide into Stayers (1068) and Early Birds (129), with p1 = 0.11 (d=1). Together with New Entrants to the State Sector (522), the Stayers (1590) then divide into Stayers (1337) and Later Entrants (253), with p2 = 0.16 (d=2).]
Xie and Wu’s (2005) Key Results: Market Premium of Late Entry
[Figure: market effect on earnings (y-axis) by propensity score stratum (x-axis, strata 1 to 8), showing observed values and a linear fit. Data labels in the figure: 6.07, 3.33, 3.1, 2.41, .93, .74, .31, -.65.]
Example II: College Returns (Brand and Xie)
Research question: What is the earnings return to college education?
Data set: WLS. Earnings are measured at different points in the life course.
Preliminary Findings: College Graduation Treatment Effect on Earnings by Propensity Score Strata: WLS Men
[Figure: effect on log earnings (y-axis, -0.2 to 1.2) by propensity score stratum (x-axis, strata 1 to 10). Series: lnyr74s1 (Age 35 Earnings Effects), lnwg92s1 (Age 53 Earnings Effects), lnwg04s1 (Age 64 Earnings Effects). Linear fits shown: y = -0.0276x + 0.2501 (R2 = 0.0932); y = -0.0222x + 0.3992 (R2 = 0.3564); y = -0.0636x + 1.1008 (R2 = 0.1433).]
Example III: NSW Data on Job Training
Research question: Does participation in the National Supported Work Demonstration (NSW) improve workers’ wages?
NSW: A temporary employment program designed to help low-skilled workers move into the labor market.
Original NSW data were experimental (random assignment into treatment and control groups).
Re-Analysis in Xie, Perez, and Raudenbush (in progress)
[Figure: PSID Comparison Data, Unweighted Treatment Effect by Propensity Strata. Mean income change (y-axis) by propensity score stratum (x-axis, 0 to 10), showing observed values and a linear fit.]
Main Insights
Selection into treatment is a dynamic process (akin to survival analysis), so that net “composition” changes with the proportion of the subpopulation being treated ($P_1$).
Heterogeneous treatment propensities + associated heterogeneous treatment effects => “composition bias,” i.e., Type II selection bias.
In this setup, we use the marginal proportion of treatment as an “instrument” for the definition of the marginal treatment effect.
Simulation One, Setup

Baseline
Strata   Delta   N of UT   Drawn   DrawnTot   TUT_c    MTE_c   TT_c
0.05     50      100       0       0          5.00     0.00    0.00
0.15     150     100       0       0          15.00    0.00    0.00
0.25     250     100       0       0          25.00    0.00    0.00
0.35     350     100       0       0          35.00    0.00    0.00
0.45     450     100       0       0          45.00    0.00    0.00
0.55     550     100       0       0          55.00    0.00    0.00
0.65     650     100       0       0          65.00    0.00    0.00
0.75     750     100       0       0          75.00    0.00    0.00
0.85     850     100       0       0          85.00    0.00    0.00
0.95     950     100       0       0          95.00    0.00    0.00
SUM              1000      0       0          TUT      MTE     TT
                                              500.00   0.00    0.00
Simulation One

Draw1
Strata   Delta   N of UT   Drawn   DrawnTot   TUT_c     MTE_c     TT_c
0.05     50      99        1       1          5.50      0.50      0.50
0.15     150     97        3       3          16.17     4.50      4.50
0.25     250     95        5       5          26.39     12.50     12.50
0.35     350     93        7       7          36.17     24.50     24.50
0.45     450     91        9       9          45.50     40.50     40.50
0.55     550     89        11      11         54.39     60.50     60.50
0.65     650     87        13      13         62.83     84.50     84.50
0.75     750     85        15      15         70.83     112.50    112.50
0.85     850     83        17      17         78.39     144.50    144.50
0.95     950     81        19      19         85.50     180.50    180.50
SUM              900       100     100        TUT       MTE       TT
                                              481.667   665.000   665.00
Simulation One

Draw2
Strata   Delta   N of UT   Drawn   DrawnTot   TUT_c     MTE_c     TT_c
0.05     50      98        1       2          6.12      0.57      0.54
0.15     150     94        3       6          17.56     5.03      4.77
0.25     250     90        5       10         27.98     13.70     13.10
0.35     350     85        8       15         37.40     26.28     25.39
0.45     450     82        9       18         45.87     42.51     41.50
0.55     550     78        11      22         53.42     62.10     61.30
0.65     650     74        13      26         60.09     84.79     84.65
0.75     750     70        15      30         65.90     110.29    111.40
0.85     850     67        16      33         70.90     138.33    141.42
0.95     950     63        18      37         75.11     168.63    174.57
SUM              800       100     200        TUT       MTE       TT
                                              460.344   652.249   658.62
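The tables can be reproduced with a few lines of arithmetic. The sketch below rests on an assumption inferred from the numbers rather than stated on the slides: in each round, the 100 draws are allocated across strata in proportion to (propensity × number still untreated), using expected (fractional) counts instead of random draws. The function name `simulation_one` is hypothetical.

```python
def simulation_one(n_rounds=2, per_round=100.0):
    """Deterministic version of Simulation One (assumed allocation rule).

    Returns one (TUT, MTE, TT) tuple per round.
    """
    strata = [0.05 + 0.10 * k for k in range(10)]   # propensity of each stratum
    delta = [1000.0 * p for p in strata]            # stratum treatment effect
    untreated = [100.0] * 10                        # "N of UT" at baseline
    treated = [0.0] * 10                            # "DrawnTot" at baseline
    results = []
    for _ in range(n_rounds):
        # Assumption: expected draws are proportional to p * (number untreated)
        weights = [p * n for p, n in zip(strata, untreated)]
        total_weight = sum(weights)
        drawn = [per_round * w / total_weight for w in weights]

        # MTE: average effect among those drawn in this round
        mte = sum(d * k for d, k in zip(delta, drawn)) / per_round

        untreated = [n - k for n, k in zip(untreated, drawn)]
        treated = [t + k for t, k in zip(treated, drawn)]

        # TUT and TT: average effects among the untreated and the treated so far
        tut = sum(d * n for d, n in zip(delta, untreated)) / sum(untreated)
        tt = sum(d * t for d, t in zip(delta, treated)) / sum(treated)
        results.append((tut, mte, tt))
    return results


rounds = simulation_one()
for number, (tut, mte, tt) in enumerate(rounds, start=1):
    print(f"Draw{number}: TUT={tut:.3f} MTE={mte:.3f} TT={tt:.3f}")
```

Under this allocation rule the first two rounds match the SUM rows of the Draw1 and Draw2 tables. The per-stratum counts printed on the slides appear to be rounded versions of the fractional draws.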
Simulation One, Summary
[Figure: Treatment Effects by Round. Treatment effect (y-axis, 0 to 700) by number of cases sampled (x-axis, 0 to 1000). Series: TUT, MTE, TT, TT-TUT.]
Simulation Two (Micro)
A population of 100,000 with 1000 trained per round.
Propensity score (P): uniform (.001 to 0.999)
Heterogeneous treatment effects: $\delta = 1000 \times P$
Simple random sampling without stratification.
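A micro-level version of this setup might look like the sketch below. The slides do not state the selection rule, so the code assumes that in each round untreated individuals are selected without replacement with probability proportional to their propensity score. The function name `micro_simulation`, the seed, and the default number of rounds are hypothetical choices.

```python
import numpy as np


def micro_simulation(n_pop=100_000, per_round=1_000, n_rounds=50, seed=0):
    """Individual-level simulation; returns (MTE, TUT, TOT) per round.

    Assumption (not stated in the slides): selection into training is
    proportional to the propensity score among those still untreated.
    Requires n_rounds * per_round < n_pop so some individuals stay untreated.
    """
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.001, 0.999, size=n_pop)  # propensity scores
    delta = 1000.0 * p                         # individual treatment effects
    treated = np.zeros(n_pop, dtype=bool)
    history = []
    for _ in range(n_rounds):
        candidates = np.flatnonzero(~treated)
        weights = p[candidates] / p[candidates].sum()
        chosen = rng.choice(candidates, size=per_round, replace=False, p=weights)
        treated[chosen] = True
        mte = delta[chosen].mean()    # effect for those trained this round
        tot = delta[treated].mean()   # effect for everyone trained so far
        tut = delta[~treated].mean()  # effect for those still untrained
        history.append((mte, tut, tot))
    return history


# Smaller population than on the slide, to keep the example fast
history = micro_simulation(n_pop=10_000, per_round=100, n_rounds=80, seed=1)
early_mte = float(np.mean([h[0] for h in history[:10]]))
late_mte = float(np.mean([h[0] for h in history[-10:]]))
print(early_mte, late_mte)
```

Because high-propensity (and therefore high-effect) individuals tend to be selected first, the marginal effect drifts downward as the treated share grows, while TOT stays above TUT.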
Summary: Average Treatment Effects Decrease with Marginal P.
[Figure: MTE, TUT, TOT, and TOT-TUT by round (100,000 workers, 1,000 trained per round). Treatment effect (y-axis, 0 to 800) by training round (x-axis, 0 to 120).]
A Small Sample Case of Micro Simulation
[Figure: MTE, TUT, TOT, and TOT-TUT by round (1,000 workers, 50 treated per round). Treatment effect (y-axis, 0 to 800) by training round (x-axis, 0 to 60).]
Discussion
It is not possible to discuss causal inference at the individual level.
Causal inference is possible only at the group level, which requires some sort of homogeneity assumption.
Ignorability is unlikely to be true, but it is needed for causal inference with observational data without strong and unverifiable assumptions.
Solution
Even in this ideal situation (with the ignorability assumption being true), causal effects can be heterogeneous.
This can be handled with hierarchical models (Bayesian or not) assuming homogeneous effects (or structure) within subgroups.
However:
Conclusion 1
(1) Any estimand (something that is to be estimated) in causal inference is essentially a weighted mean by “composition.”
(2) There is a “composition bias,” which is a form of selection bias (Type II), as we change the marginal proportion of the population treated. (Bad news for those looking for “external validity.”) We do not need controversial “unobservables” for this to happen.
Conclusion 2
Discovering patterns of heterogeneous treatment effects (under ignorability) is informative to our understanding of social processes.
Examples: Xie and Wu (2005), Tsai and Xie (2008), Brand and Xie (2007), Xie, Perez, and Raudenbush (in progress).
Conclusion 3
Observed patterns of heterogeneous treatment effects (under ignorability) can help us question the ignorability assumption and understand potential unobserved selection processes.
Examples: Xie and Wu (2005), Tsai and Xie (2008), Brand and Xie (2007), Bruch and Xie (in progress).
Modeling Heterogeneous Treatment Effects AND Selection
Heckman’s Marginal Treatment Effects (MTE) approach.
It is very general, but highly demanding in terms of richness of data.
Not only do we need an exclusion restriction, we also need full support of the exclusion restriction over the whole range of the latent tendency of being treated.
Marginal Treatment Effects
Focus on the treatment effects for those who are at the margin of being treated.
$\Delta^{MTE}(u_D) = E(Y_1 - Y_0 \mid U_D = u_D)$
The term $U_D$ can be interpreted as latent resistance to participate.
Originally attributable to Bjorklund and Moffitt (1987).
Usefulness of MTE
Cornerstone of Heckman’s recent work on heterogeneous treatment effects.
It provides a linkage to LIV and unifies all other estimands (e.g., Heckman, Urzua, and Vytlacil 2006).
Treatment heterogeneity is specified at the level of the latent tendency/resistance to participate.
Some homogeneity is still assumed.
More Detailed Empirical Example
“Social Selection and Returns to College Education.”
Collaborative work with Jennie Brand, with her as the first author (Brand and Xie, in progress).
“Economic” or “Positive” Selection
Individuals who are most likely to benefit from college are the most likely to attend college.
Theory of comparative advantage.
Empirical support: Willis and Rosen (1979); recent series of papers by Heckman and colleagues.
“Social” or “Negative” Selection
Individuals who are most likely to benefit from college are the least likely to attend college.
Theory rooted in social stratification research.
Social Stratification Research
Education is the main factor in the reproduction of SES and in upward mobility (Blau & Duncan 1967; Featherman & Hauser 1978)
Social reproduction theory (Bourdieu 1977; Bowles and Gintis 1976; Collins 1971; MacLeod 1989)
Differences in educational attainment by origins (Mare 1980, 1981, 1995; Hout, Raftery, & Bell 1993; Lucas 2001)
The higher the level of educational attainment, the less dependence between origins and destinations (Yamaguchi 1983; Hout 1988; DiPrete & Grusky 1990)
Hypothetical Model: Origins, Education, and Destinations
[Figure: diagram relating socioeconomic origins to socioeconomic destinations for college-educated workers and less-educated workers, with the “Benefit of a college degree” marked at two points.]
Main Argument
There is no blanket answer to the question of whether the selection is positive or negative.
The answer depends on what is being controlled.
With adequate controls for relevant factors, we may observe positive, rather than negative, selection.
Research Description
Three panel data sources:
National Longitudinal Survey of Youth 1979 (NLSY79)
National Longitudinal Study of the Class of 1972 (NLS72)
Wisconsin Longitudinal Study (WLS57)
Analyses are separate for men and women.
Treatment Effects as a Function of Propensity Scores: A Hierarchical Linear Model
Propensity score: $P(X) = P(d = 1 \mid X)$
Group individuals into propensity score strata.
Level 1: Estimate the treatment effect specific to balanced propensity score strata.
Level 2: Pool the results and examine the trend in the variation of effects.
A trend in the heterogeneous effects provides a clear depiction of the direction of selection.
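The two levels can be sketched in a deliberately simplified form. The code below is not the hierarchical linear model used in the analysis: it assumes equal-width propensity strata, uses a plain difference in means at Level 1, fits an unweighted least-squares line at Level 2, and skips the covariate-balance checks. All function names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def assign_stratum(p, n_strata):
    """Equal-width propensity bins, indexed 0 .. n_strata - 1 (an assumption)."""
    return min(int(p * n_strata), n_strata - 1)


def level1_effects(records, n_strata):
    """Level 1: treated-minus-untreated mean outcome within each stratum.

    records: list of (propensity, d, y). Every stratum must contain at least
    one treated and one untreated case, otherwise mean() divides by zero.
    """
    effects = []
    for s in range(n_strata):
        in_stratum = [(d, y) for p, d, y in records
                      if assign_stratum(p, n_strata) == s]
        treated = [y for d, y in in_stratum if d == 1]
        control = [y for d, y in in_stratum if d == 0]
        effects.append(mean(treated) - mean(control))
    return effects


def level2_trend(effects):
    """Level 2: unweighted least-squares line of effect on stratum rank (1, 2, ...)."""
    ranks = list(range(1, len(effects) + 1))
    x_bar, y_bar = mean(ranks), mean(effects)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ranks, effects))
    sxx = sum((x - x_bar) ** 2 for x in ranks)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: the effect falls from 5 to 1 as the propensity rises
records = []
for s, p in enumerate([0.1, 0.3, 0.5, 0.7, 0.9]):
    effect = 5 - s
    records += [(p, 0, 10), (p, 0, 12), (p, 1, 10 + effect), (p, 1, 12 + effect)]

effects = level1_effects(records, n_strata=5)
slope, intercept = level2_trend(effects)
print(effects, slope, intercept)
```

A negative Level 2 slope, as in this toy data, corresponds to effects that shrink as the propensity for treatment rises; a positive slope would indicate the opposite pattern.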
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: NLSY79 Men
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 5), with linear fits:
lnwg90s1, Age 25-28 Earnings Effects: y90 = -0.03x + 0.48, R2 = 0.05
lnwg94s1, Age 29-32 Earnings Effects: y94 = -0.11x + 0.85, R2 = 0.26
lnwg98s1, Age 33-36 Earnings Effects: y98 = -0.06x + 0.76, R2 = 0.53
lnwg02s1, Age 37-40 Earnings Effects: y02 = -0.03x + 0.60, R2 = 0.10]
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: NLS72 Men
[Figure: effect on log earnings 1986 (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 9), with linear fit:
lnwg86s1, Age 32 Earnings Effects: y86 = -0.03x + 0.58, R2 = 0.02]
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: WLS57 Men
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 9), with linear fits:
lnyr74s1, Age 35 Earnings Effects: y74 = -0.03x + 0.28, R2 = 0.11
lnwg92s1, Age 53 Earnings Effects: y92 = -0.01x + 0.46, R2 = 0.02
lnwg04s1, Age 64 Earnings Effects: y04 = -0.09x + 1.20, R2 = 0.22]
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: NLSY79 Women
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 6), with linear fits:
lnwg90s2, Age 25-28 Earnings Effects: y90 = 0.00x + 0.38, R2 = 0.00
lnwg94s2, Age 29-32 Earnings Effects: y94 = -0.04x + 0.41, R2 = 0.10
lnwg98s2, Age 33-36 Earnings Effects: y98 = -0.17x + 0.88, R2 = 0.56
lnwg02s2, Age 37-40 Earnings Effects: y02 = -0.08x + 0.84, R2 = 0.22]
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: NLS72 Women
[Figure: effect on log earnings 1986 (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 10), with linear fit:
lnwg86s1, Age 32 Earnings Effects: y86 = -0.06x + 0.60, R2 = 0.10]
Main Results: College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: WLS57 Women
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, strata 1 to 7), with linear fits:
lnyr74s2, Age 35 Earnings Effects: y74 = -0.19x + 1.26, R2 = 0.50
lnwg92s2, Age 53 Earnings Effects: y92 = -0.06x + 0.80, R2 = 0.29
lnwg04s2, Age 64 Earnings Effects: y04 = -0.05x + 0.78, R2 = 0.11]
Why do some studies find evidence for economic selection?
Omitted variable bias
Propensity Score Covariates
Family Background: Parents’ income; Mother’s education; Father’s education; Intact family; Number of siblings; Rural residence; Proximity to college/univ.; Race; Religion
Ability and Academics: Ability / IQ; Class rank / HS grades; College-prep. program
Social-Psychological: Teachers’ encouragement; Parents’ encouragement; Friends’ plans
Descriptive Statistics: Mental Ability by Propensity Score Strata
Full Set of Covariates: WLS57 Men
[Figure: IQ percentile (y-axis, 20 to 100) by propensity score stratum (x-axis, strata 1 to 9) for non-college graduates and college graduates, each with a linear fit.]
Descriptive Statistics
Mental Ability by Propensity Score Strata
Small Set of Covariates: WLS57 Men
[Figure: IQ percentile (y-axis, 20 to 100) by propensity score strata 1-8 (x-axis). Series: non-college graduates and college graduates, each with a linear fit. Annotations: 24 point difference; 29 point difference.]
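The two mental-ability figures above compare college graduates and non-graduates within propensity score strata. The sketch below illustrates, on synthetic data, why a gap in ability is expected to remain within strata when ability is left out of the propensity model. It is a hedged illustration, not the authors' analysis: the variables `family` and `ability`, the logistic selection model, and the helper `ability_gap_by_stratum` are all assumptions. Because equal-sized strata depend only on rank order, stratifying on `family` stands in for a small-set propensity score (which is monotone in `family` here), and stratifying on `full_index` stands in for the full-set score.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def ability_gap_by_stratum(people, score_key, n_strata):
    """Rank people by the given score, cut into equal-sized strata, and
    return the graduate-minus-non-graduate gap in mean ability per stratum."""
    ranked = sorted(people, key=lambda r: r[score_key])
    size = len(ranked) // n_strata
    gaps = []
    for s in range(n_strata):
        # The last stratum absorbs any remainder.
        end = (s + 1) * size if s < n_strata - 1 else len(ranked)
        chunk = ranked[s * size:end]
        grads = [r["ability"] for r in chunk if r["d"] == 1]
        nongrads = [r["ability"] for r in chunk if r["d"] == 0]
        gaps.append(mean(grads) - mean(nongrads))
    return gaps

# Synthetic population (an assumption): college completion d depends on
# both family background and ability, each in standardized units.
rng = random.Random(1)
people = []
for _ in range(20000):
    family = rng.gauss(0, 1)
    ability = rng.gauss(0, 1)
    index = family + ability
    p = 1 / (1 + math.exp(-index))  # true propensity given both covariates
    d = 1 if rng.random() < p else 0
    people.append({"family": family, "full_index": index,
                   "ability": ability, "d": d})

full_gaps = ability_gap_by_stratum(people, "full_index", 5)   # ability included
small_gaps = ability_gap_by_stratum(people, "family", 5)      # ability omitted

print("mean within-stratum ability gap, full set:  %.2f" % mean(full_gaps))
print("mean within-stratum ability gap, small set: %.2f" % mean(small_gaps))
```

In this simulation the within-stratum ability gap is clearly larger when strata are formed without ability, which is the kind of imbalance that can produce omitted variable bias in the stratum-specific earnings effects. Some gap remains even with the full set because five strata are coarse.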
Results, Full Set of Covariates
College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: WLS57 Men
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score strata 1-9 (x-axis). Series: lnyr74s1, age 35 earnings effects. Fitted line: y74 = -0.03x + 0.28, R² = 0.11.]
Results, Small Set of Covariates
College Graduation Treatment Effect on Earnings by Propensity Score Strata: WLS57 Men
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score strata 1-7 (x-axis). Series: lnyr74s5, age 35 earnings effects.]
Results, Full Set of Covariates
College Graduation Treatment Effect on Log Earnings by Propensity Score Strata: WLS57 Women
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score strata 1-7 (x-axis). Series: lnyr74s2, age 35 earnings effects. Fitted line: y74 = -0.19x + 1.26, R² = 0.50.]
Results, Small Set of Covariates
College Graduation Treatment Effect on Earnings by Propensity Score Strata: WLS57 Women
[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score strata 1-9 (x-axis). Series: lnyr74s6, age 35 earnings effects.]
Why is there social selection?
Forms of Heterogeneity
Pre-treatment heterogeneity
Treatment effect heterogeneity
Heterogeneous treatments
• {d = 1 | p(d=1|X) = j} ≠ {d = 1 | p(d=1|X) = k}, where j ≠ k
• Low propensity students utilize college as a means for economic mobility
Heterogeneous Treatments
College Majors for WLS57 Men
Low propensity men
• High proportion of majors: Business, Education
High propensity men
• High proportion of majors: Sciences, Humanities
Ratio of Monetary to Non-monetary Importance in Selecting a Career by Propensity Score Strata: NLS72 Men
[Figure: relative importance of monetary returns (y-axis, 0.7 to 1.1) by propensity score strata 1-9 (x-axis).]
Ratio of Monetary to Non-monetary Importance in Selecting a Career by Propensity Score Strata: NLS72 Women
[Figure: relative importance of monetary returns (y-axis, 0.7 to 1.1) by propensity score strata 1-10 (x-axis).]
Value of College by Propensity Score Strata: WLS57 Men
[Figure: value of college (y-axis, 40 to 100) by propensity score strata 1-9 (x-axis). Series: non-college graduates and college graduates, each with a linear fit.]
Value of College by Propensity Score Strata: WLS57 Women
[Figure: value of college (y-axis, 40 to 100) by propensity score strata 1-7 (x-axis). Series: non-college graduates and college graduates, each with a linear fit.]
Summary
Main results
Robust evidence for social selection
• NLS79, NLS72, and WLS57
• Men and women
• Early, mid-, and late career returns
Exploratory results
Why do some prior studies find evidence for economic selection?
• Omitted variable bias
Why is there social selection?
• Low propensity students: View college as a means for economic mobility
References
Brand, Jennie, and Yu Xie. 2007. “Who Benefits Most From College? Evidence for Negative Selection in Heterogeneous Economic Returns to Higher Education.”
Rosenbaum, Paul R., and Donald B. Rubin. 1984. “Reducing Bias in Observational Studies Using Subclassification on the Propensity Score.” Journal of the American Statistical Association 79:516-524.
Tsai, Shu-Ling, and Yu Xie. 2008. “Changes in Earnings Returns to Higher Education in Taiwan since the 1990s.” Population Review.
Xie, Yu, Steve Raudenbush, and Tony Perez. In Progress. “Weighting in Causal Inference.”
Xie, Yu, and Xiaogang Wu. 2005. “Market Premium, Social Process, and Statisticism.” American Sociological Review 70:865-870.