
Methodological Workshop 2:

Ignorability, Selection Bias, and Causal Inference

Yu Xie
University of Michigan

Observed Data

A population with N individuals, from which we draw a sample of size n.

There is an outcome of interest, say Y, that is measured on the real line.

There is an independent variable of interest, say D. For simplicity, let us assume that D is a binary “treatment”: D=1 (T), D=0 (C). This is the simplest case.

Let us call this setup the “canonical case.”

Canonical Case Examined

What is the causal effect of treatment D?
It is the counterfactual effect for the ith individual:

$Y_i^T - Y_i^C$

However, we observe either

$Y_i^T$ when $D_i = 1$, or
$Y_i^C$ when $D_i = 0$.

Conclusion: it is not possible to identify individual-level causal effects without assumptions.

At Another Extreme

We can impose a strong, unrealistic assumption that all individuals are identical (a homogeneity assumption often made in physical science); then we have

$Y_i^T = Y^T$; $Y_i^C = Y^C$

We only need two observations to identify the causal effect: $Y^T$ when $D = 1$ and $Y^C$ when $D = 0$.

Implication: it is population variability that makes “scientific sampling” necessary.

Yu Xie’s “Fundamental Paradox in Social Science”

There is always variability at the individual level.

Causal inference is impossible at the individual level and thus always requires statistical analysis at the group level on the basis of some homogeneity assumption.

Different methods boil down to different comparison groups.

Consider the Usual Case

The population P is divided into two subpopulations: $P_1$ if $D_i = 1$, $P_0$ if $D_i = 0$.
Use the following notation:

q = proportion of $P_0$ in P
$E(Y_1^T) = E(Y^T \mid D=1)$, $E(Y_1^C) = E(Y^C \mid D=1)$
$E(Y_0^T) = E(Y^T \mid D=0)$, $E(Y_0^C) = E(Y^C \mid D=0)$

By the total expectation rule:

ATE = $E(Y^T - Y^C)$
= $E(Y_1^T - Y_1^C)(1-q) + E(Y_0^T - Y_0^C)\,q$
= $E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$,

where $\delta_1 = E(Y_1^T - Y_1^C)$ = TT,
$\delta_0 = E(Y_0^T - Y_0^C)$ = TUT.
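As a quick numerical check of this decomposition, the sketch below builds a hypothetical population in which both potential outcomes are known (something never available in real data) and confirms that the two sides agree. The data-generating process, the variable names, and the seed are illustrative assumptions, not part of the workshop.

```python
import numpy as np

def decompose_ate(y_t, y_c, d):
    """Return (ATE, naive difference, Type I bias, Type II bias term)."""
    d = d.astype(bool)
    q = 1.0 - d.mean()                       # proportion untreated (P0)
    delta1 = (y_t[d] - y_c[d]).mean()        # TT
    delta0 = (y_t[~d] - y_c[~d]).mean()      # TUT
    ate = (y_t - y_c).mean()
    naive = y_t[d].mean() - y_c[~d].mean()   # E(Y1^T - Y0^C)
    type1 = y_c[d].mean() - y_c[~d].mean()   # E(Y1^C - Y0^C)
    type2 = (delta1 - delta0) * q            # (delta1 - delta0) q
    return ate, naive, type1, type2

rng = np.random.default_rng(0)               # arbitrary seed
n = 10_000
ability = rng.normal(size=n)                 # hypothetical trait driving selection
y_c = 10 + 2 * ability + rng.normal(size=n)  # outcome without treatment
y_t = y_c + 1 + 0.5 * ability                # effect grows with the trait
d = (ability + rng.normal(size=n) > 0).astype(int)

ate, naive, type1, type2 = decompose_ate(y_t, y_c, d)
print(f"ATE = {ate:.3f}; naive - Type I - Type II = {naive - type1 - type2:.3f}")
```

The identity holds exactly for sample means, so the two printed numbers agree up to floating-point rounding whatever the data-generating process.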

In Other Words

The standard estimator $E(Y_1^T - Y_0^C)$ contains two sources of bias:

(1) The average difference between $P_1$ and $P_0$ in the absence of treatment (“pre-treatment heterogeneity bias,” or “Type I selection bias”):

$E(Y_1^C - Y_0^C)$

(2) The difference in the average treatment effect between $P_1$ and $P_0$ (“treatment-effect heterogeneity bias,” or “Type II selection bias”):

$\delta_1 - \delta_0$

Both sources of bias average to zero under randomized assignment.

In Regression Language

$Y_i = \alpha + \delta_i D_i + \varepsilon_i$

There are two types of variability that may cause biases:

(1) Type I selection bias (focusing on $\varepsilon_i$): if $\mathrm{corr}(\varepsilon, D) \neq 0$.

(2) Type II selection bias (focusing on $\delta_i$): if $\mathrm{corr}(\delta, D) \neq 0$.
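To connect the regression form to the earlier decomposition, a short derivation may help. It writes the model as $Y_i = \alpha + \delta_i D_i + \varepsilon_i$; the Greek symbols are a notational assumption here (intercept, individual treatment effect, and disturbance), and q is the proportion untreated.

```latex
\begin{align*}
E(Y \mid D=1) - E(Y \mid D=0)
  &= E(\delta \mid D=1)
     + \left[ E(\varepsilon \mid D=1) - E(\varepsilon \mid D=0) \right] \\
  &= \underbrace{E(\delta)}_{\text{ATE}}
     + \underbrace{\left[ E(\varepsilon \mid D=1) - E(\varepsilon \mid D=0) \right]}_{\text{Type I}}
     + \underbrace{\left[ E(\delta \mid D=1) - E(\delta \mid D=0) \right] q}_{\text{Type II}}
\end{align*}
```

The second line uses $E(\delta) = E(\delta \mid D=1)(1-q) + E(\delta \mid D=0)\,q$. The Type I term vanishes when $\varepsilon$ is unrelated to D, and the Type II term vanishes when $\delta$ is unrelated to D.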

Selection Bias and Estimands

ATE = $E(Y^T - Y^C)$
= $E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$

When Type I selection bias is present, but Type II selection bias is absent (say, a homogeneous treatment effect $\delta$):

$E(Y_1^T - Y_0^C) \neq \delta$

When Type I selection bias is absent, but Type II selection bias is present:

$E(Y_1^T - Y_0^C) \neq$ ATE $\neq \delta_1 \neq \delta_0$

Type II selection bias is important.
Type II selection bias cannot be eliminated by a “fixed-effects” approach.

Ignorability and Selection Bias

                                  Type of Selection Bias
Ignorability assumed?
(Invoking unobservables?)         Type I                               Type II
Yes (No)                          Propensity Score (Rubin et al.)      ?
No (Yes)                          Structural Selection Model           Non-parametric IV Models
                                  (Heckman et al.)                     (Heckman et al.)

IV versus LATE

Exactly the same formula, but different interpretations.

IV interpretation: constant treatment effect.
LATE interpretation: heterogeneous treatment effects, averaged into different groups (strata).
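The shared formula can be illustrated with the Wald ratio for a binary instrument Z. The sketch below is a constructed example, not taken from the workshop: the three latent groups (always-takers, never-takers, compliers), their sizes, and their outcomes are hypothetical, and the usual LATE assumptions (a valid instrument, no defiers) hold by construction.

```python
import numpy as np

def wald_estimate(y, d, z):
    """Ratio of the Z-contrast in Y to the Z-contrast in D."""
    z = z.astype(bool)
    return (y[z].mean() - y[~z].mean()) / (d[z].mean() - d[~z].mean())

# (group, size, untreated outcome Y^C, treatment effect); all values hypothetical
groups = [
    ("always", 20, 12.0, 5.0),
    ("never", 40, 8.0, 1.0),
    ("complier", 40, 10.0, 3.0),
]

y, d, z = [], [], []
for name, size, y_c, effect in groups:
    for i in range(size):
        z_i = i % 2                                   # half of each group gets Z = 1
        d_i = {"always": 1, "never": 0, "complier": z_i}[name]
        y.append(y_c + effect * d_i)                  # observed outcome
        d.append(d_i)
        z.append(z_i)

y, d, z = np.array(y), np.array(d), np.array(z)
print(wald_estimate(y, d, z))   # the complier effect, 3.0, up to floating-point rounding
```

Read as IV with a constant effect, the ratio would be everyone's effect; read as LATE, it is the average effect among compliers only, while the effects for always-takers (5.0) and never-takers (1.0) are not recovered.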

Heckman Selection Model

[Diagram: variables X, Z, Y, and d, with treatment d determined by a latent rule; disturbances ~ BN (bivariate normal).]

Important Role of Unobservables

The treatment of selection bias in economics requires specification of unobserved variables.

Such specifications are subject to dispute.
The issue of unobservables also splits economists and statisticians into two camps.

As a result, not enough attention has been paid to the (1,2) cell, marked by “?”.

Missing Knowledge

We do not know much about the cell marked by “?”.

Most work in economics on selection bias assumes that ignorability does not hold true.

Since we can easily handle Type I selection bias under ignorability, it seems that Type II selection bias under ignorability is a trivial matter.

I will show that this is not true.

Making Sense

In this presentation, I discuss a simple scenario where Type II selection bias (which I call “composition bias”) arises from a common situation in which we assume ignorability.

Ignorability Assumption

Also called “selection on observables.”
Let X denote a vector of observed covariates. The ignorability assumption states: $D \perp (Y^C, Y^T) \mid X$.
We start with the assumption, although we do not necessarily believe that this is true.
We want to learn as much as the data can tell us.

Under the Ignorability Assumption

The important work by Rosenbaum and Rubin (1984) shows that, when the ignorability assumption holds true, it is sufficient to condition on the propensity score as a function of X. The condition becomes

$D \perp (Y^C, Y^T) \mid p(D=1 \mid X)$.

In Other Words

There is no bias, conditional on the propensity score:

$E[Y^T - Y^C \mid p(X)] = E[Y_1^T - Y_0^C \mid p(X)]$

Recall Earlier Result

$E(Y^T - Y^C)$ [ATE] = $E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$.

The ignorability assumption thus means:

No Type I selection bias, conditional on p(X):
$E[Y_1^C - Y_0^C \mid p(X)] = 0$
$E[Y_0^C \mid p(X)] = E[Y_1^C \mid p(X)] = E[Y^C \mid p(X)]$

No Type II selection bias, conditional on p(X):
$E[(Y_1^T - Y_1^C) - (Y_0^T - Y_0^C) \mid p(X)] = 0$
$E[Y_1^T - Y_0^C \mid p(X)] = E[Y_1^T - Y_1^C \mid p(X)] = E[(Y^T - Y^C) \mid p(X)]$

Implications

Implication 1: we should conduct propensity-score-specific analysis under ignorability.

Implication 2: the only “interaction” effects that can lead to selection bias (Type II) are those between the treatment status and the propensity score.
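Implication 1 can be sketched as a stratum-by-stratum comparison. The example below is hypothetical throughout: the propensity score is taken as known (in practice it is estimated from X), ignorability holds by construction, and the treatment effect is set to 1000 times the propensity score purely for illustration.

```python
import numpy as np

def stratum_effects(y, d, pscore, n_strata=10):
    """Treated-minus-untreated mean outcome within equal-width propensity strata."""
    stratum = np.minimum((pscore * n_strata).astype(int), n_strata - 1)
    effects = []
    for k in range(n_strata):
        in_k = stratum == k
        treated = in_k & (d == 1)
        control = in_k & (d == 0)
        effects.append(y[treated].mean() - y[control].mean())
    return np.array(effects)

rng = np.random.default_rng(1)                   # arbitrary seed
n = 200_000
pscore = rng.uniform(0.001, 0.999, size=n)       # assumed known propensity score
d = (rng.uniform(size=n) < pscore).astype(int)   # treatment depends on pscore only
y_c = rng.normal(size=n)                         # untreated outcome, unrelated to D
y = y_c + d * 1000.0 * pscore                    # observed outcome

effects = stratum_effects(y, d, pscore)
for k, e in enumerate(effects, start=1):
    print(f"stratum {k}: estimated effect = {e:.1f}")
```

Under these assumptions the estimated effects rise across strata, so the interaction between treatment status and the propensity score named in Implication 2 is visible directly in the stratum-specific estimates.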

Setup

Two requirements:
There are heterogeneous treatment effects.
The heterogeneity in treatment effects is correlated with the propensity of treatment.

Both requirements are accepted in the standard (statistical) approach assuming ignorability.

We wish to show:
(1) Treatment-effect heterogeneity => Type II selection bias.
(2) Type II selection bias = composition bias.
(3) This happens without unobservables.

Example I: Market Premium in Contemporary China

We found that the social mechanisms and social consequences of transitioning from the state sector to the market significantly changed over time (Wu and Xie 2003, ASR).

Jann (2005) and Xie and Wu (2005)

Jann argued that there is no statistical difference in returns to education between early entrants and late entrants. Thus, Wu and Xie’s conclusion is incorrect.

Social processes generating the three groups are cumulative so that the three groups are not symmetric.

[Figure: Flow Chart of Labor Market Transitions in China, 1978 – 1996. Of 1,197 experienced workers in the state sector in 1978, 129 had moved to the market sector by 1987 (“Early Birds,” d=1, p1=0.11) and 1,068 stayed. With 522 new entrants to the state sector, 1,590 workers were in the state sector in 1987; by 1996, 253 had moved to the market sector (“Later Entrants,” d=2, p2=0.16) and 1,337 stayed.]

Xie and Wu’s (2005) Key Results: Market Premium of Late Entry

[Figure: market effect on earnings (y-axis) by propensity score stratum 1–8 (x-axis); observed values and a linear fit. Point labels, in extracted order: 6.07, 3.33, 3.1, 2.41, .93, .74, .31, -.65.]

Example II: College Returns (Brand and Xie)

Research question: What’s the earnings return to college education?
Data set: WLS. Earnings are measured at different points in the life course.

Preliminary Findings: College Graduation Treatment Effect on Earnings by Propensity Score Strata: WLS Men

[Figure: effect on log earnings (y-axis) by propensity score stratum 1–10 (x-axis). Series: lnyr74s1 (age 35 earnings effects), lnwg92s1 (age 53 earnings effects), lnwg04s1 (age 64 earnings effects). Fitted lines shown: y = -0.0276x + 0.2501 (R² = 0.0932); y = -0.0222x + 0.3992 (R² = 0.3564); y = -0.0636x + 1.1008 (R² = 0.1433).]

Example III: NSW Data on Job Training

Research question: Does participation in the National Supported Work Demonstration (NSW) improve workers’ wages?

NSW: A temporary employment program designed to help low-skilled workers move into the labor market.

Original NSW data were experimental (random assignment into treatment and control groups).

Re-Analysis in Xie, Perez, and Raudenbush (in progress)

[Figure: PSID Comparison Data, Unweighted Treatment Effect by Propensity Strata. Mean income change (y-axis) by propensity score stratum (x-axis); observed values and a linear fit.]

Main Insights

Selection into treatment is a dynamic process (akin to survival analysis), so that net “composition” changes with the proportion of the subpopulation being treated ($P_1$).

Heterogeneous treatment propensities + associated heterogeneous treatment effects lead to “composition bias,” that is, Type II selection bias.

In this setup, we use the marginal proportion of treatment as an “instrument” for the definition of the marginal treatment effect.

Simulation One, Setup

Baseline

Strata  Delta  N of UT  Drawn  DrawnTot  TUT_c   MTE_c   TT_c
0.05       50      100      0         0    5.00    0.00   0.00
0.15      150      100      0         0   15.00    0.00   0.00
0.25      250      100      0         0   25.00    0.00   0.00
0.35      350      100      0         0   35.00    0.00   0.00
0.45      450      100      0         0   45.00    0.00   0.00
0.55      550      100      0         0   55.00    0.00   0.00
0.65      650      100      0         0   65.00    0.00   0.00
0.75      750      100      0         0   75.00    0.00   0.00
0.85      850      100      0         0   85.00    0.00   0.00
0.95      950      100      0         0   95.00    0.00   0.00
SUM               1000      0         0

TUT = 500.00, MTE = 0.00, TT = 0.00

Simulation One

Draw 1

Strata  Delta  N of UT  Drawn  DrawnTot  TUT_c   MTE_c    TT_c
0.05       50       99      1         1    5.50    0.50    0.50
0.15      150       97      3         3   16.17    4.50    4.50
0.25      250       95      5         5   26.39   12.50   12.50
0.35      350       93      7         7   36.17   24.50   24.50
0.45      450       91      9         9   45.50   40.50   40.50
0.55      550       89     11        11   54.39   60.50   60.50
0.65      650       87     13        13   62.83   84.50   84.50
0.75      750       85     15        15   70.83  112.50  112.50
0.85      850       83     17        17   78.39  144.50  144.50
0.95      950       81     19        19   85.50  180.50  180.50

TUT = 481.667, MTE = 665.000, TT = 665.00

Simulation One

Draw 2

Strata  Delta  N of UT  Drawn  DrawnTot  TUT_c   MTE_c    TT_c
0.05       50       98      1         2    6.12    0.57    0.54
0.15      150       94      3         6   17.56    5.03    4.77
0.25      250       90      5        10   27.98   13.70   13.10
0.35      350       85      8        15   37.40   26.28   25.39
0.45      450       82      9        18   45.87   42.51   41.50
0.55      550       78     11        22   53.42   62.10   61.30
0.65      650       74     13        26   60.09   84.79   84.65
0.75      750       70     15        30   65.90  110.29  111.40
0.85      850       67     16        33   70.90  138.33  141.42
0.95      950       63     18        37   75.11  168.63  174.57

TUT = 460.344, MTE = 652.249, TT = 658.62
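The baseline and the first two draws appear consistent with a simple rule, sketched below as an assumption rather than as the author's actual procedure: each round treats 100 cases, allocated across strata in proportion to (stratum propensity × untreated cases remaining), with fractional counts kept internally and Delta equal to 1000 times the stratum propensity. TUT, MTE, and TT are then composition-weighted means of Delta over the untreated, the newly treated, and all cases treated so far.

```python
import numpy as np

def simulate_rounds(n_rounds=2, per_round=100.0):
    """Return a list of (TUT, MTE, TT) after each round of treatment."""
    p = np.linspace(0.05, 0.95, 10)       # stratum propensity scores
    delta = 1000.0 * p                    # stratum treatment effects (assumed rule)
    untreated = np.full(p.shape, 100.0)   # N of UT at baseline
    treated = np.zeros_like(p)            # DrawnTot at baseline
    results = []
    for _ in range(n_rounds):
        weights = p * untreated                       # chance of being drawn
        drawn = per_round * weights / weights.sum()   # fractional counts per stratum
        untreated = untreated - drawn
        treated = treated + drawn
        tut = (delta * untreated).sum() / untreated.sum()
        mte = (delta * drawn).sum() / drawn.sum()
        tt = (delta * treated).sum() / treated.sum()
        results.append((tut, mte, tt))
    return results

for r, (tut, mte, tt) in enumerate(simulate_rounds(), start=1):
    print(f"draw {r}: TUT = {tut:.3f}, MTE = {mte:.3f}, TT = {tt:.2f}")
```

Because the counts are kept fractional, the per-stratum columns differ slightly from the rounded integers shown in the tables. The sketch is meant for the early rounds only; it does not guard against drawing more cases from a stratum than remain untreated, which can happen late in the process.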

Simulation One, Summary

[Figure: Treatment Effects by Round. Treatment effect (y-axis, 0 to 700) by number of cases sampled (x-axis, 0 to 1000); series: TUT, MTE, TT, TT-TUT.]

Simulation Two (Micro)

A population of 100,000 with 1000 trained per round.

Propensity score (P): uniform (.001 to 0.999)

Heterogeneous treatment effects: $\delta$ = 1000*P

Simple random sampling without stratification.
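A minimal sketch of such a micro simulation follows. The slides do not spell out the sampling rule, so the one used here is an assumption: in each round, untreated individuals are drawn without replacement with probability proportional to their propensity score. The default sizes follow the smaller variant (1,000 workers, 50 per round) to keep the run short; the function name and seed are arbitrary.

```python
import numpy as np

def micro_simulation(n_workers=1000, per_round=50, seed=0):
    """Return (delta, history); history holds (MTE, TUT, TT) for each round."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.001, 0.999, size=n_workers)   # individual propensity scores
    delta = 1000.0 * p                              # individual treatment effects
    untreated = np.ones(n_workers, dtype=bool)
    history = []
    while untreated.any():
        idx = np.flatnonzero(untreated)
        k = min(per_round, idx.size)
        # Assumed rule: selection probability proportional to propensity.
        chosen = rng.choice(idx, size=k, replace=False, p=p[idx] / p[idx].sum())
        untreated[chosen] = False
        mte = delta[chosen].mean()                  # effect among the newly treated
        tt = delta[~untreated].mean()               # effect among all treated so far
        tut = delta[untreated].mean() if untreated.any() else float("nan")
        history.append((mte, tut, tt))
    return delta, history

delta, history = micro_simulation()
for r, (mte, tut, tt) in enumerate(history[:3], start=1):
    print(f"round {r}: MTE = {mte:.1f}, TUT = {tut:.1f}, TT = {tt:.1f}")
```

Once everyone has been treated, TT equals the population mean of the individual effects and TUT is undefined, which the sketch reports as NaN.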

Summary: Average Treatment Effects Decrease with Marginal P.

[Figure: MTE, TUT, TOT, and TOT-TUT by round (100,000 workers, 1,000 trained per round). Treatment effect (y-axis, 0 to 800) by training round (x-axis).]

A Small Sample Case of Micro Simulation

[Figure: MTE, TUT, TOT, and TOT-TUT by round (1,000 workers, 50 treated per round). Treatment effect (y-axis, 0 to 800) by training round (x-axis).]

Discussion

It is not possible to discuss causal inference at the individual level.

Causal inference is possible only at the group level, which requires some sort of homogeneity assumption.

Ignorability is unlikely to be true, but it is needed for causal inference with observational data unless we make strong and unverifiable assumptions.

Solution

Even in this ideal situation (with the ignorability assumption being true), causal effects can be heterogeneous.

This can be handled with hierarchical models (Bayesian or not) assuming homogeneous effects (or structure) within subgroups.

However --

Conclusion 1

(1) Any estimand (something that is to be estimated) in causal inference is essentially a weighted mean by “composition.”

(2) There is a “composition bias,” which is a form of selection bias (Type II), as we change the marginal proportion of the population treated. (Bad news for those looking for “external validity.”) We do not need controversial “unobservables” for this to happen.
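Point (1) can be written out explicitly. As an illustrative formalization (not taken from the slides), let $\delta(p)$ be the average treatment effect among units with propensity score p, and let $f(p)$ be the density of p in the population. Under ignorability:

```latex
\begin{align*}
\text{ATE} &= \int \delta(p)\, f(p)\, dp, \\
\text{TT}  &= \frac{\int \delta(p)\, p\, f(p)\, dp}{\int p\, f(p)\, dp}, \\
\text{TUT} &= \frac{\int \delta(p)\, (1-p)\, f(p)\, dp}{\int (1-p)\, f(p)\, dp}.
\end{align*}
```

All three average the same $\delta(p)$ with different composition weights, so they differ whenever $\delta(p)$ varies with p, and any change in who is treated shifts the weights.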


Discovering patterns of heterogeneous treatment effects (under ignorability) is informative to our understanding of social processes.
Examples: Xie and Wu (2005), Tsai and Xie (2008), Brand and Xie (2007), Xie, Perez, and Raudenbush (in progress).


Observed patterns of heterogeneous treatment effects (under ignorability) can help us question the ignorability assumption and understand potential unobserved selection processes.
Examples: Xie and Wu (2005), Tsai and Xie (2008), Brand and Xie (2007), Bruch and Xie (in progress).

Modeling Heterogeneous Treatment Effects AND Selection

Heckman’s Marginal Treatment Effects (MTE) approach.

It is very general, but highly demanding in terms of richness of data.

Not only do we need an exclusion restriction, we also need full support of the exclusion restriction over the whole range of the latent tendency of being treated.

Marginal Treatment Effects

Focus on the treatment effects for those who are at the margin of being treated.

The term $U_D$ can be interpreted as latent resistance to participate.

Originally attributable to Bjorklund and Moffitt (1987).

$\Delta^{MTE}(u_D) = E(Y_1 - Y_0 \mid U_D = u_D)$

Usefulness of MTE

Cornerstone of Heckman’s recent work on heterogeneous treatment effects.

It provides a linkage to LIV and unifies all other estimands (e.g., Heckman, Urzua, and Vytlacil 2006).

Treatment heterogeneity is specified at the level of the latent tendency/resistance to participate.

Some homogeneity is still assumed.

More Detailed Empirical Example

“Social Selection and Returns to College Education.”

Collaborative with Jennie Brand, with her as the first author (Brand and Xie, in progress).

“Economic” or “Positive” Selection

Individuals who are most likely to benefit from college are the most likely to attend college.
Theory of comparative advantage

Empirical support:
Willis and Rosen (1979)
Recent series of papers by Heckman and colleagues

“Social” or “Negative” Selection

Individuals who are most likely to benefit from college are the least likely to attend college.

Theory rooted in social stratification research

Social Stratification Research

Education is the main factor in the reproduction of SES and in upward mobility (Blau & Duncan 1967; Featherman & Hauser 1978)

Social reproduction theory (Bourdieu 1977; Bowles and Gintis 1976; Collins 1971; MacLeod 1989)
• Differences in educational attainment by origins (Mare 1980, 1981, 1995; Hout, Raftery, & Bell 1993; Lucas 2001)

The higher the level of educational attainment, the less dependence between origins and destinations (Yamaguchi 1983; Hout 1988; DiPrete & Grusky 1990)

Hypothetical Model: Origins, Education, and Destinations

[Diagram relating socioeconomic origins to socioeconomic destinations for college-educated workers and less-educated workers, with the “benefit of a college degree” marked at two points.]

Main Argument

There is no blanket answer to the question of whether the selection is positive or negative.

The answer depends on what is being controlled.

With adequate controls for relevant factors, we may observe positive, rather than negative, selection.

Research Description

Three panel data sources:
National Longitudinal Survey of Youth 1979 (NLSY79)
National Longitudinal Study of the Class of 1972 (NLS72)
Wisconsin Longitudinal Study (WLS57)

Analyses are separate for men and women.

Treatment Effects as a Function of Propensity Scores: A Hierarchical Linear Model

Propensity score: $P(X) = P(d = 1 \mid X)$
Group individuals into propensity score strata.

Level 1: Estimate the treatment effect specific to balanced propensity score strata.

Level 2: Pool the results and examine the trend in the variation of effects.

A trend in the heterogeneous effects provides a clear depiction of the direction of selection.
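The Level 2 step can be sketched as a weighted linear fit of stratum-specific effects on stratum rank. The effect estimates and standard errors below are made-up placeholders, and weighting by the inverse standard error (so that squared residuals are weighted by inverse variance) is an assumption about how the pooling might be done.

```python
import numpy as np

def level2_trend(effects, std_errors):
    """Fit effect = intercept + slope * stratum by variance-weighted least squares."""
    strata = np.arange(1, len(effects) + 1)
    # np.polyfit applies weights to unsquared residuals, so 1/se weights
    # the squared residuals by 1/se**2.
    slope, intercept = np.polyfit(strata, effects, deg=1, w=1.0 / np.asarray(std_errors))
    return slope, intercept

# Hypothetical Level 1 results for five propensity score strata
effects = np.array([0.45, 0.42, 0.39, 0.36, 0.33])
std_errors = np.array([0.10, 0.06, 0.05, 0.06, 0.10])

slope, intercept = level2_trend(effects, std_errors)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A negative slope means the estimated effect shrinks as the propensity to be treated rises, the pattern the slides call negative (social) selection; a positive slope would indicate positive (economic) selection.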

Main Results

College Graduation Treatment Effect on Log Earnings by Propensity Score Strata

[Figures: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis), one panel per sample, with the fitted linear trends listed below.]

NLSY79 Men (strata 1–5; series lnwg90s1, lnwg94s1, lnwg98s1, lnwg02s1):
y90 = -0.03x + 0.48, R² = 0.05 (age 25-28 earnings effects)
y94 = -0.11x + 0.85, R² = 0.26 (age 29-32 earnings effects)
y98 = -0.06x + 0.76, R² = 0.53 (age 33-36 earnings effects)
y02 = -0.03x + 0.60, R² = 0.10 (age 37-40 earnings effects)

NLS72 Men (strata 1–9; series lnwg86s1):
y86 = -0.03x + 0.58, R² = 0.02 (age 32 earnings effects)

WLS57 Men (strata 1–9):
y74 = -0.03x + 0.28, R² = 0.11
y92 = -0.01x + 0.46, R² = 0.02
y04 = -0.09x + 1.20, R² = 0.22

NLSY79 Women (strata 1–6; series lnwg90s2, lnwg94s2, lnwg98s2, lnwg02s2):
y90 = 0.00x + 0.38, R² = 0.00 (age 25-28 earnings effects)
y94 = -0.04x + 0.41, R² = 0.10 (age 29-32 earnings effects)
y98 = -0.17x + 0.88, R² = 0.56 (age 33-36 earnings effects)
y02 = -0.08x + 0.84, R² = 0.22 (age 37-40 earnings effects)

NLS72 Women (strata 1–10; series lnwg86s1):
y86 = -0.06x + 0.60, R² = 0.10 (age 32 earnings effects)

WLS57 Women (strata 1–7):
y74 = -0.19x + 1.26, R² = 0.50
y92 = -0.06x + 0.80, R² = 0.29
y04 = -0.05x + 0.78, R² = 0.11


Why do some studies find evidence for economic selection?

Omitted variable bias

Propensity Score Covariates

Family Background
• Parents’ income
• Mother’s education
• Father’s education
• Intact family
• Number of siblings
• Rural residence
• Proximity to college/univ.
• Race
• Religion

Ability and Academics
• Ability / IQ
• Class rank / HS grades
• College-prep. program

Social-Psychological
• Teachers’ encouragement
• Parents’ encouragement
• Friends’ plans

Descriptive Statistics: Mental Ability by Propensity Score Strata

Full Set of Covariates: WLS57 Men

[Figure: IQ percentile (y-axis, 20 to 100) by propensity score stratum 1–9 (x-axis) for non-college graduates and college graduates, each with a linear fit.]

Small Set of Covariates: WLS57 Men

[Figure: IQ percentile (y-axis, 20 to 100) by propensity score stratum 1–8 (x-axis); annotations mark a 24 point difference and a 29 point difference.]

Results, Full Set of Covariates
College Graduation Treatment Effect on Log Earnings

[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, 1 to 9). Fitted line: y74 = -0.03x + 0.28, R² = 0.11. Legend: lnyr74s1, Age 35 Earnings Effects.]

Results, Small Set of Covariates
College Graduation Treatment Effect on Earnings

[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, 1 to 7).]


Results, Full Set of Covariates
College Graduation Treatment Effect on Log Earnings

[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, 1 to 7). Fitted line: y74 = -0.19x + 1.26, R² = 0.50.]


Results, Small Set of Covariates
College Graduation Treatment Effect on Earnings

[Figure: effect on log earnings (y-axis, -0.3 to 1.3) by propensity score stratum (x-axis, 1 to 9).]


Why is there social selection?

Forms of Heterogeneity
  Pre-treatment heterogeneity
  Treatment effect heterogeneity
  Heterogeneous treatments
    • {d = 1 | p(d = 1 | X) = j} ≠ {d = 1 | p(d = 1 | X) = k}, where j ≠ k
    • Low propensity students utilize college as a means for economic mobility

Heterogeneous Treatments
College Majors for WLS57 Men

Low propensity men
  High proportion of majors: Business, Education

High propensity men
  High proportion of majors: Sciences, Humanities

Ratio of Monetary to Non-monetary Importance in Selecting a Career by Propensity Score Strata: NLS72 Men

[Figure: relative importance of monetary returns (y-axis, 0.7 to 1.1) by propensity score stratum (x-axis, 1 to 9).]

Ratio of Monetary to Non-monetary Importance in Selecting a Career by Propensity Score Strata: NLS72 Women

[Figure: relative importance of monetary returns (y-axis, 0.7 to 1.1) by propensity score stratum (x-axis, 1 to 10).]

Value of College by Propensity Score Strata: WLS57 Men

[Figure: value of college (y-axis, 40 to 100) by propensity score stratum (x-axis, 1 to 9).]

Value of College by Propensity Score Strata: WLS57 Women

[Figure: value of college (y-axis, 40 to 100) by propensity score stratum (x-axis, 1 to 7).]

Summary

Main results
  Robust evidence for social selection
    • NLS79, NLS72, and WLS57
    • Men and women
    • Early, mid-, and late career returns

Exploratory results
  Why do some prior studies find evidence for economic selection?
    • Omitted variable bias
  Why is there social selection?
    • Low propensity students: View college as a means for economic mobility

References

Brand, Jennie, and Yu Xie. 2007. “Who Benefits Most From College? Evidence for Negative Selection in Heterogeneous Economic Returns to Higher Education.”

Rosenbaum, Paul R., and Donald B. Rubin. 1984. “Reducing Bias in Observational Studies Using Subclassification on the Propensity Score.” Journal of the American Statistical Association 79:516-524.

Tsai, Shu-Ling, and Yu Xie. 2008. “Changes in Earnings Returns to Higher Education in Taiwan since the 1990s.” Population Review.

Xie, Yu, Steve Raudenbush, and Tony Perez. In Progress. “Weighting in Causal Inference.”

Xie, Yu, and Xiaogang Wu. 2005. “Market Premium, Social Process, and Statisticism.” American Sociological Review 70:865-870.
