Source: jenni.uchicago.edu/econ312/Slides/HBK2_matching-STATIC_2019-04-02a_jbb.pdf (Apr 02, 2019)

Matching
from Econometric Evaluation of Social Programs, Part II

James J. Heckman (University of Chicago; American Bar Foundation; University College Dublin, Ireland)
Edward J. Vytlacil (Columbia University)

Econ 312, Spring 2019

Matching

The method of matching assumes that selection into treatment is based on potential outcomes, so that (Y0, Y1) and D are not independent:

(Y0, Y1) ⊥̸⊥ D,

so Pr(D = 1 | Y0, Y1) depends on Y0, Y1.

It assumes access to variables Q such that conditioning on Q removes the dependence:

(Y0, Y1) ⊥⊥ D | Q. (Q-1)

Thus, Pr(D = 1 | Q, Y0, Y1) = Pr(D = 1 | Q).

Comparisons between treated and untreated can be made at all points in the support of Q such that

0 < Pr(D = 1 | Q) < 1. (Q-2)

The method does not explicitly model choices of treatment or the subjective evaluations of participants, nor does it draw the distinction between the variables in the outcome equations (X) and the variables in the choice equations (Z) that is central to the IV method and the method of control functions.

In principle, condition (Q-1) can be satisfied using a set of variables Q distinct from all or some of the components of X and Z.

The conditioning variables do not have to be exogenous.
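
To make conditions (Q-1) and (Q-2) concrete, the following is a minimal simulation sketch rather than anything taken from the slides. It assumes a discrete Q with three values, a treatment probability that depends only on Q, and a constant treatment effect of 2. The function names `simulate`, `naive_difference`, and `matching_ate` are hypothetical. The raw treated-minus-untreated difference mixes the effect with differences in Q, while averaging within-cell contrasts over the distribution of Q does not.

```python
import numpy as np

def simulate(n=200_000, seed=0):
    """Simulated data in which (Y0, Y1) is independent of D given a discrete Q."""
    rng = np.random.default_rng(seed)
    q = rng.integers(0, 3, size=n)          # Q takes the values 0, 1, 2
    p = np.array([0.2, 0.5, 0.8])[q]        # Pr(D = 1 | Q), strictly inside (0, 1)
    d = rng.random(n) < p                   # treatment depends on Q only, so (Q-1) holds
    y0 = 1.0 * q + rng.normal(size=n)       # untreated outcome depends on Q
    y1 = y0 + 2.0                           # assumed constant treatment effect of 2
    y = np.where(d, y1, y0)                 # observed outcome
    return q, d, y

def naive_difference(d, y):
    """Raw difference in mean outcomes between treated and untreated."""
    return y[d].mean() - y[~d].mean()

def matching_ate(q, d, y):
    """Average the within-cell contrasts over the distribution of Q.

    Assumes every cell contains both treated and untreated units, as in (Q-2).
    """
    effect = 0.0
    for value in np.unique(q):
        cell = q == value
        contrast = y[cell & d].mean() - y[cell & ~d].mean()
        effect += cell.mean() * contrast
    return effect
```

Calling `simulate()` and passing the result to both functions shows the raw difference well above 2 and the cell-weighted estimate close to 2 under these assumed parameters.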

From condition (Q-1), we recover the distributions of Y0 and Y1 given Q, Pr(Y0 ≤ y0 | Q = q) = F0(y0 | Q = q) and Pr(Y1 ≤ y1 | Q = q) = F1(y1 | Q = q), but not the joint distribution F(y0, y1 | Q = q), because we do not observe the same persons in the treated and untreated states.

This is a standard evaluation problem common to all econometric estimators.

Methods for determining which variables belong in Q rely on untested exogeneity assumptions, which we discuss in this section.
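
The recovery of the two marginal distributions can be written out as a short derivation. It is a sketch under the assumption, not stated on this slide, that the observed outcome is Y = D Y1 + (1 − D) Y0; condition (Q-2) is what guarantees that both conditioning events have positive probability at q.

```latex
\begin{align*}
F_1(y_1 \mid Q = q) &= \Pr(Y_1 \le y_1 \mid Q = q) \\
 &= \Pr(Y_1 \le y_1 \mid Q = q, D = 1) && \text{by (Q-1)} \\
 &= \Pr(Y \le y_1 \mid Q = q, D = 1), && \text{since } Y = Y_1 \text{ when } D = 1, \\
F_0(y_0 \mid Q = q) &= \Pr(Y \le y_0 \mid Q = q, D = 0) && \text{by the same steps.}
\end{align*}
```

Both right-hand sides involve only observed quantities. No analogous step delivers F(y0, y1 | Q = q), because no observed event contains Y0 and Y1 for the same person.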

OLS is a special case of matching that focuses on the identification of certain conditional means.

In OLS, linear functional forms are maintained as exact representations or valid approximations.

Considering a common coefficient model, OLS writes

Y = Qα + Dβ + U, (Q-3)

where β is the treatment effect and

E(U | Q, D) = 0. (Q-4)

The assumption is made that the variance-covariance matrix of (Q, D) is of full rank:

Var(Q, D) full rank. (Q-5)

Under these conditions, we can identify β even though D and U are dependent: D ⊥̸⊥ U.

Controlling for the observable Q eliminates any spurious mean dependence between D and U: E(U | D) ≠ 0 but E(U | D, Q) = 0.

(Q-4) is the linear regression counterpart to (Q-1).

(Q-5) is the linear regression counterpart to (Q-2).

Failure of (Q-5) would mean that using a nonparametric estimator, we might perfectly predict D given Q, and that Pr(D = 1 | Q = q) = 1 or 0.
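
The role of Q in the regression (Q-3) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' example: Q is a single standard normal variable, D depends on Q through a threshold rule, and the values 1.5 and 2.0 for the coefficients on Q and D are arbitrary illustrative choices. The helper names `simulate_ols` and `ols` are hypothetical. Regressing Y on D alone leaves Q in the error term, which is correlated with D; including Q recovers the coefficient on D.

```python
import numpy as np

def simulate_ols(n=100_000, seed=1):
    """Data from Y = Q*alpha + D*beta + U with E(U | Q, D) = 0."""
    rng = np.random.default_rng(seed)
    q = rng.normal(size=n)
    d = (q + rng.normal(size=n) > 0).astype(float)   # D depends on Q
    u = rng.normal(size=n)                           # independent of (Q, D)
    alpha, beta = 1.5, 2.0                           # assumed illustrative values
    y = alpha * q + beta * d + u
    return q, d, y

def ols(columns, y):
    """Least-squares coefficients; the first entry is the intercept."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

With `q, d, y = simulate_ols()`, the last entry of `ols([q, d], y)` is close to 2.0, while the slope in `ols([d], y)` is much larger because it also picks up the mean difference in Q between treated and untreated.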

(Q-5)′: If the goal of the analysis is to identify β, in place of (Q-4), we can get by with

(Q-4)′: E(U | Q, D) = E(U | Q).

Assuming Var(D | Q) > 0, we can identify β even if we cannot separate Qα from E(U | Q).
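
The identification argument under (Q-4)′ can be spelled out with a worked equation. It is a sketch that takes the common coefficient model (Q-3) as given and evaluates it at a point q where both values of D occur, which is what Var(D | Q = q) > 0 provides.

```latex
\begin{align*}
E(Y \mid Q = q, D = 1) &= q\alpha + \beta + E(U \mid Q = q), \\
E(Y \mid Q = q, D = 0) &= q\alpha + E(U \mid Q = q), \\
E(Y \mid Q = q, D = 1) - E(Y \mid Q = q, D = 0) &= \beta .
\end{align*}
```

The term E(U | Q = q) appears in both conditional means and cancels in the difference, so β is recovered even though the sum of Qα and E(U | Q) is not decomposed.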

Matching can be implemented as a nonparametric method.

When this is done, the procedure does not require specification of the functional form of the outcome equations.

It enforces the requirement that (Q-2) be satisfied by estimating functions pointwise in the support of Q.

To link our notation in this section to that in the rest of the chapter, we assume that Q = (X, Z) and that X and Z are the same except where otherwise noted.

Thus we invoke assumptions (M-1) and (M-2), even though in principle we can use a more general conditioning set.
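
Pointwise estimation on the support where (Q-2) holds can be sketched as follows. This is an illustrative implementation for a discrete conditioning variable only, with hypothetical function names `overlap_cells` and `pointwise_contrasts`; it is not the estimator used by the authors. Cells in which only treated or only untreated units are observed are dropped rather than filled in by extrapolation.

```python
import numpy as np

def overlap_cells(q, d, min_count=1):
    """Values of a discrete Q at which both treated and untreated units are observed."""
    q = np.asarray(q)
    d = np.asarray(d, dtype=bool)
    keep = []
    for value in np.unique(q):
        cell = q == value
        if (cell & d).sum() >= min_count and (cell & ~d).sum() >= min_count:
            keep.append(value.item())   # store plain Python scalars
    return keep

def pointwise_contrasts(q, d, y):
    """Treated-minus-untreated mean outcome at each point of common support."""
    q = np.asarray(q)
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=bool)
    return {
        value: y[(q == value) & d].mean() - y[(q == value) & ~d].mean()
        for value in overlap_cells(q, d)
    }
```

For example, with `q = [0, 0, 1, 1, 2, 2]` and `d = [1, 0, 1, 1, 0, 1]`, the cell `q == 1` contains only treated units, so no contrast is reported there.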

(Y0, Y1) ⊥⊥ D | X. (M-1)

0 < Pr(D = 1 | X = x) < 1. (M-2)
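
One consequence of (M-1) for mean gains can be written as a one-line derivation. This is a sketch; the labels TT, ATE, and TUT (treatment on the treated, average treatment effect, treatment on the untreated) are standard shorthand introduced here for illustration, and (M-2) is what makes both conditioning events well defined at x.

```latex
\begin{align*}
\underbrace{E(Y_1 - Y_0 \mid X = x, D = 1)}_{\text{TT}(x)}
 = \underbrace{E(Y_1 - Y_0 \mid X = x)}_{\text{ATE}(x)}
 = \underbrace{E(Y_1 - Y_0 \mid X = x, D = 0)}_{\text{TUT}(x)} .
\end{align*}
```

Both equalities follow because, under (M-1), the conditional distribution of (Y0, Y1) given X does not change when D is added to the conditioning set.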

Assumptions (M-1) and (M-2) introduced in Section 2, or (Q-1) and (Q-2), rule out the possibility that after conditioning on X (or Q), agents possess more information about their choices than econometricians, and that the unobserved information helps to predict the potential outcomes.

Put another way, the method allows for potential outcomes to affect choices but only through the observed variables, Q, that predict outcomes.

This is the reason why Heckman and Robb (1985, 1986b) call the method selection on observables.

This section establishes the following points.

(1) Matching assumptions (M-1) and (M-2) generically imply a flat MTE in uD, i.e., they assume that E(Y1 − Y0 | X = x, UD = uD) does not depend on uD.

Thus the unobservables central to the Roy model and its extensions and the unobservables central to the modern IV literature are assumed to be absent once the analyst conditions on X.

(M-1) implies that all mean treatment parameters are the same.

(2) Even if we weaken (M-1) and (M-2) to mean independence instead of full independence, generically the MTE is flat in uD under the assumptions of the nonparametric generalized Roy model, so again all mean treatment parameters are the same.

(3) We show that IV and matching make distinct identifying assumptions even though they both invoke conditional independence assumptions.

(4) We compare matching with IV and control function (sample selection) methods.

Matching assumes that conditioning on observables eliminates the dependence between (Y0, Y1) and D.

The control function principle models the dependence.

(5) We present some examples that demonstrate that if the assumptions of the method of matching are violated, the method can produce substantially biased estimators of the parameters of interest.

(6) We show that standard methods for selecting the conditioning variables used in matching assume exogeneity.

This is a property shared with many econometric estimators, as noted in Part I, section 5.2.

Violations of the exogeneity assumption can produce biased estimators.
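
Point (5) in the list above can be illustrated with a small simulation of selection on unobservables. This is a hypothetical design chosen for illustration, not one of the authors' examples: a binary observed x, an unobservable v that drives the choice, and outcome errors that both load on v with assumed weights 0.4 and 0.8, so that (M-1) fails when only x is conditioned on. The function names `roy_simulation` and `stratified_matching` are likewise assumptions.

```python
import numpy as np

def roy_simulation(n=200_000, seed=2):
    """Choice driven by an unobservable v that also enters both potential outcomes."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)                # observed conditioning variable
    v = rng.normal(size=n)                        # unobserved by the analyst
    u0 = 0.4 * v + rng.normal(size=n)             # both outcome errors load on v,
    u1 = 0.8 * v + 0.6 * rng.normal(size=n)       # so (M-1) fails given x alone
    y0 = x + u0
    y1 = x + 1.0 + u1                             # population average gain is 1 by construction
    d = 0.5 * x + v > 0                           # treatment choice depends on v
    y = np.where(d, y1, y0)                       # observed outcome
    return x, d, y, y1 - y0

def stratified_matching(x, d, y):
    """Within-cell treated-minus-untreated contrasts, averaged over x."""
    estimate = 0.0
    for value in np.unique(x):
        cell = x == value
        estimate += cell.mean() * (y[cell & d].mean() - y[cell & ~d].mean())
    return estimate
```

In this design the simulated gains `y1 - y0` are available, so the average gain, the average gain among the treated, and the matching estimate can be compared directly. The three differ, and the matching estimate lies above both of the others.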

Nonparametric versions of matching embodying (M-2) avoid the problem of making inferences outside the support of the data.

This problem is implicit in any application of least squares.

Figure 1 shows the support problem that can arise in linear least squares when the linearity of the regression is used to extrapolate estimates determined in one empirical support to new supports.
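
The support problem in least squares can be sketched numerically. This is not a reproduction of Figure 1; it assumes, purely for illustration, a true regression function x² observed without noise on the interval [0, 1], and a new point x = 3 outside that interval. A straight line fitted on the observed support approximates the function reasonably there but is far off once it is extended to the new support.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def true_regression(x):
    """Assumed nonlinear regression function, for illustration only."""
    return x ** 2

# Observed support: an evenly spaced grid on [0, 1].
x_obs = np.linspace(0.0, 1.0, 101)
slope, intercept = fit_line(x_obs, true_regression(x_obs))

# Largest approximation error of the fitted line inside the observed support.
in_support_error = np.max(
    np.abs(slope * x_obs + intercept - true_regression(x_obs))
)

# Error when the same line is extrapolated to a point outside the support.
x_new = 3.0
out_of_support_error = abs(slope * x_new + intercept - true_regression(x_new))
```

Comparing `in_support_error` with `out_of_support_error` shows how much the linearity assumption is being asked to carry once predictions leave the region where data exist.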


This problem is implicit in any application of least squares.

Figure 1 shows the support problem that can arise in linearleast squares when the linearity of the regression is used toextrapolate estimates determined in one empirical support tonew supports.

13 / 163

Page 47: Econometric Evaluation of Social Programs Part IIjenni.uchicago.edu/econ312/Slides/HBK2_matching-STATIC_2019-04-02a_jbb.pdfApr 02, 2019  · From condition (Q-1), we recover the distributions

Matching References References

(6) We show that standard methods for selecting theconditioning variables used in matching assume exogeneity.

This is a property shared with many econometric estimators, asnoted in Part I, section 5.2.

Violations of the exogeneity assumption can produce biasedestimators.

Nonparametric versions of matching embodying (M-2) avoidthe problem of making inferences outside the support of thedata.

This problem is implicit in any application of least squares.

Figure 1 shows the support problem that can arise in linearleast squares when the linearity of the regression is used toextrapolate estimates determined in one empirical support tonew supports.

13 / 163

Page 48: Econometric Evaluation of Social Programs Part IIjenni.uchicago.edu/econ312/Slides/HBK2_matching-STATIC_2019-04-02a_jbb.pdfApr 02, 2019  · From condition (Q-1), we recover the distributions

Figure 1: The Least Squares Extrapolation Problem Avoided by Using Nonparametric Regression or Matching

[Figure: Y plotted against X, showing the data, the true function Y = g(X) + U, and the least squares approximating line Y = ΠX + V.]

Careful attention to support problems is a virtue of any nonparametric method, including, but not unique to, nonparametric matching.

Heckman et al. (1998) show that the bias from neglecting the problem of limited support can be substantial.

See also the discussion in Heckman et al. (1999).

We now show that matching implies that, conditional on X, the marginal return is assumed to be the same as the average return (marginal = average).

This is a strong behavioral assumption implicit in the statistical conditional independence assumption (M-1).

It says that the marginal participant has the same return as the average participant.

Matching Assumption (M-1) Implies a Flat MTE

An immediate consequence of (M-1) is that the MTE does not depend on UD.

This is so because (Y0,Y1) ⊥⊥ D | X implies that (Y0,Y1) ⊥⊥ UD | X and hence that

$$\Delta^{\mathrm{MTE}}(x, u_D) = E(Y_1 - Y_0 \mid X = x, U_D = u_D) = E(Y_1 - Y_0 \mid X = x). \tag{1}$$

This, in turn, implies that ∆MTE conditional on X is flat in uD, so that matching invokes assumption (C-1).

Under our assumptions for the generalized Roy model, it assumes that E(Y | P(Z) = p) is linear in p. Thus the method of matching assumes that mean marginal returns and average returns are the same and all mean treatment effects are the same given X.

(C-1) D ⊥⊥ ∆ ⇒ E(∆ | UD) = E(∆), ∆MTE(uD) is constant in uD, and ∆MTE = ∆ATE = ∆TT = ∆LATE, i.e., E(β | D = 1) = E(β), because β ⊥⊥ D.
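The linearity of E(Y | P(Z) = p) can be traced in two steps. The sketch below keeps the conditioning on X implicit and assumes the usual index representation D = 1[U_D ≤ P(Z)], with U_D uniform on [0, 1] and Z independent of (Y_0, Y_1, U_D) given X; these are the standing assumptions of the generalized Roy model rather than anything new.

```latex
\begin{align*}
E(Y \mid P(Z) = p)
  &= E(Y_0) + E\big((Y_1 - Y_0)\,\mathbf{1}[U_D \le p]\big) \\
  &= E(Y_0) + \int_0^p \Delta^{\mathrm{MTE}}(u_D)\, du_D .
\end{align*}
% Under (C-1), \Delta^{MTE}(u_D) = \Delta^{ATE} for every u_D, so
\begin{equation*}
E(Y \mid P(Z) = p) = E(Y_0) + \Delta^{\mathrm{ATE}}\, p ,
\qquad
\frac{\partial E(Y \mid P(Z) = p)}{\partial p} = \Delta^{\mathrm{ATE}} .
\end{equation*}
```

A flat MTE therefore shows up as a straight line in p, and any curvature in E(Y | P(Z) = p) is evidence against (C-1).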

However, one can still distinguish marginal from average effects of the observables (X) using matching.

See Carneiro (2002).

It is sometimes said that the matching assumptions are “for free” (see, e.g., Gill and Robins, 2001) because one can always replace unobserved F1(Y1 | X = x, D = 0) with observed F1(Y1 | X = x, D = 1) and unobserved F0(Y0 | X = x, D = 1) with observed F0(Y0 | X = x, D = 0).

Such substitutions do not contradict any observed data.
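The substitution just described is what a matching estimator does mechanically. The following is a minimal sketch for a discrete X, using a hypothetical data-generating process in which (M-1) holds by construction; the variable names, cell probabilities, and outcome equations are assumptions made for illustration, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical design: given X, treatment D is assigned independently of
# (Y0, Y1), so (M-1) holds by construction.
x = rng.integers(0, 3, size=n)            # discrete X in {0, 1, 2}
p_x = np.array([0.2, 0.5, 0.8])[x]        # Pr(D = 1 | X)
d = rng.uniform(size=n) < p_x
y0 = 1.0 + 0.5 * x + rng.normal(size=n)
y1 = y0 + 0.3 + 0.2 * x                   # the gain varies with X only
y = np.where(d, y1, y0)                   # observed outcome

ate_hat = 0.0
tt_hat = 0.0
for v in np.unique(x):
    cell = x == v
    # Within the cell, observed treated and untreated means stand in for
    # the unobserved counterfactual means.
    diff = y[cell & d].mean() - y[cell & ~d].mean()
    ate_hat += diff * cell.mean()                  # weight by Pr(X = v)
    tt_hat += diff * (cell & d).sum() / d.sum()    # weight by Pr(X = v | D = 1)

ate_true = float((y1 - y0).mean())
tt_true = float((y1 - y0)[d].mean())
naive = float(y[d].mean() - y[~d].mean())  # ignores X entirely

print(f"ATE: matched {ate_hat:.3f}, true {ate_true:.3f}")
print(f"TT:  matched {tt_hat:.3f}, true {tt_true:.3f}")
print(f"unconditional difference in means: {naive:.3f}")
```

The cell-by-cell substitution recovers both parameters here only because the simulated selection depends on X alone. The same code runs without complaint when selection also depends on unobservables, which is why the substitution can never be contradicted by the observed data.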

While the claim is true, it ignores the counterfactual states generated under the matching assumptions.

The assumed absence of selection on unobservables is not a “for free” assumption, and produces fundamentally different counterfactual states for the same model under matching and selection assumptions.

To explore these issues in depth, consider a nonparametric regression model more general than the linear regression model (Q-3).

Without assumption (M-1), a nonparametric regression of Y on D conditional on X identifies a nonparametric mean difference:

$$
\begin{aligned}
\Delta^{\mathrm{OLS}}(X) &= E(Y_1 \mid X, D = 1) - E(Y_0 \mid X, D = 0) \\
&= E(Y_1 - Y_0 \mid X, D = 1) + \{E(Y_0 \mid X, D = 1) - E(Y_0 \mid X, D = 0)\}.
\end{aligned}
\tag{2}
$$

The term in braces in the second expression arises from selection on pre-treatment levels of the outcome.

OLS identifies the parameter treatment on the treated (the first term in the second line of (2)) plus a bias term in braces corresponding to selection on the levels.
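Decomposition (2) can be checked by simulation. The sketch below uses a one-factor design patterned on the generalized Roy example that appears in these slides, with X suppressed (a single cell). The distribution of Z and the sample size are assumptions, and the output is not intended to reproduce any table in the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

# Hypothetical one-factor Roy-type design; values chosen for illustration.
alpha, beta = 0.67, 0.2
sigma1, sigma0 = 0.012, -0.050
tau = rng.normal(size=n)
z = rng.normal(0.0, 0.5, size=n)      # assumed distribution of Z
y1 = alpha + beta + sigma1 * tau
y0 = alpha + sigma0 * tau
d = z + tau >= 0                      # D = 1 if Z - V >= 0, with V = -tau
y = np.where(d, y1, y0)               # observed outcome

mean_diff = float(y[d].mean() - y[~d].mean())    # first line of (2)
tt = float((y1 - y0)[d].mean())                  # E(Y1 - Y0 | D = 1)
bias = float(y0[d].mean() - y0[~d].mean())       # term in braces in (2)

print(f"mean difference  = {mean_diff:.4f}")
print(f"TT               = {tt:.4f}")
print(f"selection term   = {bias:.4f}")
print(f"TT + selection   = {tt + bias:.4f}")
```

In this design the people who select into treatment have high tau, which raises Y1 slightly but lowers Y0, so the term in braces is negative and the mean difference understates treatment on the treated.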

The OLS estimator can be represented as a weighted average of ∆MTE.

The weight is given in table 1B, where U1 and U0 for the OLS model are defined as deviations from conditional expectations, U1 = Y1 − E(Y1 | X), U0 = Y0 − E(Y0 | X).

Unlike the weights for ∆TT and ∆ATE, the OLS weights do not necessarily integrate to one and they are not necessarily nonnegative.

Application of IV eliminates the contribution of the second term of equation (2).

The weights for the first term are the same as the weights for ∆TT and hence they integrate to one.

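Table 1: A. Treatment effects and estimands as weighted averages of the marginal treatment effect

$$\mathrm{ATE}(x) = E(Y_1 - Y_0 \mid X = x) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, du_D$$

$$\mathrm{TT}(x) = E(Y_1 - Y_0 \mid X = x, D = 1) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, \omega_{\mathrm{TT}}(x, u_D)\, du_D$$

$$\mathrm{TUT}(x) = E(Y_1 - Y_0 \mid X = x, D = 0) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, \omega_{\mathrm{TUT}}(x, u_D)\, du_D$$

$$\mathrm{PRTE}(x) = E(Y_{a'} \mid X = x) - E(Y_a \mid X = x) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, \omega_{\mathrm{PRTE}}(x, u_D)\, du_D$$

for two policies a and a′ that affect the Z but not the X

$$\mathrm{IV}_J(x) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, \omega^{J}_{\mathrm{IV}}(x, u_D)\, du_D, \quad \text{given instrument } J$$

$$\mathrm{OLS}(x) = \int_0^1 \Delta^{\mathrm{MTE}}(x, u_D)\, \omega_{\mathrm{OLS}}(x, u_D)\, du_D$$

B. Weights

$$\omega_{\mathrm{ATE}}(x, u_D) = 1$$

$$\omega_{\mathrm{TT}}(x, u_D) = \left[\int_{u_D}^{1} f_{P \mid X}(p \mid X = x)\, dp\right] \frac{1}{E(P \mid X = x)}$$

$$\omega_{\mathrm{TUT}}(x, u_D) = \left[\int_{0}^{u_D} f_{P \mid X}(p \mid X = x)\, dp\right] \frac{1}{E((1 - P) \mid X = x)}$$

$$\omega_{\mathrm{PRTE}}(x, u_D) = \left[\frac{F_{P_{a'} \mid X}(u_D \mid x) - F_{P_a \mid X}(u_D \mid x)}{\Delta P(x)}\right], \quad \text{where } \Delta P(x) = E(P_a \mid X = x) - E(P_{a'} \mid X = x)$$

$$\omega^{J}_{\mathrm{IV}}(x, u_D) = \left[\int_{u_D}^{1} \int \big(J(Z) - E(J(Z) \mid X = x)\big)\, f_{J, P \mid X}(j, t \mid X = x)\, dj\, dt\right] \frac{1}{\mathrm{Cov}(J(Z), D \mid X = x)}$$

$$\omega_{\mathrm{OLS}}(x, u_D) = 1 + \frac{E(U_1 \mid X = x, U_D = u_D)\, \omega_1(x, u_D) - E(U_0 \mid X = x, U_D = u_D)\, \omega_0(x, u_D)}{\Delta^{\mathrm{MTE}}(x, u_D)}$$

$$\omega_1(x, u_D) = \left[\int_{u_D}^{1} f_{P \mid X}(p \mid X = x)\, dp\right] \frac{1}{E(P \mid X = x)}$$

$$\omega_0(x, u_D) = \left[\int_{0}^{u_D} f_{P \mid X}(p \mid X = x)\, dp\right] \frac{1}{E((1 - P) \mid X = x)}$$

Source: Heckman and Vytlacil (2005)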
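The ATE, TT and TUT weights in panel B can be computed directly from a sample of propensity scores. The sketch below does this for one X cell and checks that each weight integrates to one over u_D. The Beta(2, 3) distribution of P(Z) and the grid size are hypothetical choices; only the weight formulas come from the table.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical propensity scores P(Z) within a single X cell.
p = np.sort(rng.beta(2.0, 3.0, size=50_000))
u = np.linspace(0.0, 1.0, 2001)

# Empirical Pr(P > u) and Pr(P <= u) on the grid of u_D values.
share_above = 1.0 - np.searchsorted(p, u, side="right") / p.size
share_below = 1.0 - share_above

w_ate = np.ones_like(u)
w_tt = share_above / p.mean()            # Pr(P > u) / E(P)
w_tut = share_below / (1.0 - p.mean())   # Pr(P <= u) / E(1 - P)

def integrate(w, grid):
    # Trapezoid rule written out so the sketch does not depend on a
    # particular NumPy version.
    return float(np.sum(0.5 * (w[1:] + w[:-1]) * np.diff(grid)))

totals = {name: integrate(w, u)
          for name, w in [("ATE", w_ate), ("TT", w_tt), ("TUT", w_tut)]}
for name, total in totals.items():
    print(f"{name} weight integrates to {total:.3f}")
```

The TT weight declines in u_D and the TUT weight rises, so TT emphasizes the MTE of people with low u_D (those most likely to participate) and TUT emphasizes the opposite end.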

The OLS weights for our generalized Roy model example are plotted in figure 2B.

The negative component of the OLS weight leads to a smaller OLS treatment estimate compared to the other treatment effects in table 2.

This table shows the estimated OLS treatment effect for the generalized Roy example.

The large negative selection bias in this example is consistent with comparative advantage as emphasized by Roy (1951) and detected empirically by Willis and Rosen (1979) and Cunha et al. (2005).

People who are good in sector 1 (i.e., receive treatment) may be very poor in sector 0 (those who receive no treatment).

Figure 2: A. Weights for the marginal treatment effect for different parameters

[Figure: the weights ω(uD) for ATE, TT and TUT and the MTE, plotted against uD on [0, 1].]

B. Marginal Treatment Effect vs Linear Instrumental Variables and Ordinary Least Squares Weights

[Figure: the MTE and the IV and OLS weights, plotted against uD on [0, 1].]

Y1 = α + β + U1        U1 = σ1 τ      α = 0.67       σ1 = 0.012
Y0 = α + U0            U0 = σ0 τ      β = 0.2        σ0 = −0.050
D = 1 if Z − V ≥ 0     V = σV τ       τ ∼ N(0, 1)    σV = −1.000
UD = Φ(V / (σV στ))                   Z ∼ N(−0.0026, 0.2700)

Source: Heckman and Vytlacil (2005).

Table 2: Treatment parameters and estimands in the generalized Roy example

Treatment on the Treated                     0.2353
Treatment on the Untreated                   0.1574
Average Treatment Effect                     0.2000
Sorting Gain (a)                             0.0353
Policy Relevant Treatment Effect (PRTE)      0.1549
Selection Bias (b)                          −0.0628
Linear Instrumental Variables (c)            0.2013
Ordinary Least Squares                       0.1725

(a) TT − ATE = E(Y1 − Y0 | D = 1) − E(Y1 − Y0)
(b) OLS − TT = E(Y0 | D = 1) − E(Y0 | D = 0)
(c) Using Propensity Score P(Z) as the instrument.

Note: The model used to create Table 2 is the same as the one used to create Figures 2A and 2B. The PRTE is computed using a policy t characterized as follows:
If Z > 0 then D = 1 if Z(1 + t) − V ≥ 0.
If Z ≤ t then D = 1 if Z − V ≥ 0.
For this example t is set equal to 0.2.

Source: Heckman and Vytlacil (2005)
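The entries of Table 2 are tied together by simple identities, which can be verified with a few lines of arithmetic. The sketch below copies four numbers from the table; the last step, which backs out the treated share from ATE = Pr(D = 1)·TT + Pr(D = 0)·TUT, is an inference from the rounded entries and not a figure reported in the slides.

```python
# Values copied from Table 2.
tt, tut, ate, ols = 0.2353, 0.1574, 0.2000, 0.1725

sorting_gain = tt - ate        # footnote (a): TT - ATE
selection_bias = ols - tt      # footnote (b): OLS - TT

# ATE = Pr(D = 1) * TT + Pr(D = 0) * TUT, solved for Pr(D = 1).
share_treated = (ate - tut) / (tt - tut)

print(f"sorting gain      = {sorting_gain:.4f}")
print(f"selection bias    = {selection_bias:.4f}")
print(f"implied Pr(D = 1) = {share_treated:.3f}")
```

The first two results agree with the Sorting Gain and Selection Bias rows, and the implied treated share is a little above one half.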

Hence the bias in OLS for the parameter treatment on the treated may be negative (E(Y0 | X, D = 1) − E(Y0 | X, D = 0) < 0).

The differences among the policy relevant treatment effects, the conventional treatment effects and the OLS estimand are illustrated in figure 3A and table 2 for the generalized Roy model example.

As is evident from table 2, it is not at all clear that the instrumental variable estimator, with instruments that satisfy classical properties, performs better than nonparametric OLS in identifying the policy relevant treatment effect in this example.

While IV eliminates the term in braces in (2), it reweights the MTE differently from what might be desired for many policy analyses.
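The comparison can be made concrete with the Table 2 entries. This short check only measures how far each estimand lies from the PRTE in this one example; it says nothing about sampling error or about other policies.

```python
# Values copied from Table 2.
prte, iv, ols = 0.1549, 0.2013, 0.1725

gap_iv = abs(iv - prte)     # distance of linear IV from the PRTE
gap_ols = abs(ols - prte)   # distance of OLS from the PRTE

print(f"|IV  - PRTE| = {gap_iv:.4f}")
print(f"|OLS - PRTE| = {gap_ols:.4f}")
```

In this example the OLS estimand is the closer of the two to the PRTE, even though it carries selection bias for treatment on the treated.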

Figure 3: A. Marginal Treatment Effect vs Linear Instrumental Variables,Ordinary Least Squares, and Policy Relevant Treatment Effect Weights: WhenP (Z ) is the Instrument

The Policy is Given at the Base of Table 4. The model parameters are given at the base of

Figure 2.

0 0. 1 0. 2 0. 3 0. 4 0. 5 0. 6 0. 7 0. 8 0. 9 1 −0.04

−0.03

−0.02

−0.01

0

0.01

0.02

0.03

0.04

0.05

0.06

p

Wei

ghts

and

MTE

MTE ωIV(Z) ωOLS ωPRTE

Page 88: Econometric Evaluation of Social Programs Part IIjenni.uchicago.edu/econ312/Slides/HBK2_matching-STATIC_2019-04-02a_jbb.pdfApr 02, 2019  · From condition (Q-1), we recover the distributions

B. Marginal Treatment Effect vs. Linear IV with Z as an Instrument, Linear IV with P(Z(1 + t(1[Z > 0]))) = P(z, t) as an Instrument, and Policy Relevant Treatment Effect Weights, for the Policy Defined at the Base of Table 4

The model parameters are given at the base of Figure 2.

[Plot not reproduced. Horizontal axis: p, from 0 to 1. Vertical axis: weights and MTE, from −0.01 to 0.04. Series: MTE, ωIV(Z), ωIV(P(Z,t)), ωPRTE.]


C. Marginal Treatment Effect vs. IV Policy and Policy Relevant Treatment Effect Weights, for the Policy Defined at the Base of Table 4

[Plot not reproduced. Horizontal axis: p, from 0 to 1. Vertical axis: weights and MTE, from −0.01 to 0.04. Series: MTE, ωIV(B), ωPRTE.]

Source: Heckman and Vytlacil (2005)


Figure 4: A. Plot of E(Y | P(Z) = p)

[Plot not reproduced. Horizontal axis: p, from 0 to 1. Vertical axis: E[Y | P(Z) = p], from 0 to 25. Series: E[Y | P(Z) = p] when C2 holds (agents act on heterogeneity); E[Y | P(Z) = p] when C1 holds (agents do not act on heterogeneity).]


B. Plot of the identified marginal treatment effect from Figure 4A (the derivative).

[Plot not reproduced. Horizontal axis: uD, from 0 to 1. Vertical axis: MTE, from 0 to 0.4. Series: MTE when agents act on heterogeneity; MTE when agents do not act on heterogeneity.]


Note: Parameters for the general heterogeneous case are the same as those used in Figures 2A and 2B. For the homogeneous case we impose U1 = U0 (σ1 = σ0 = 0.012).

Source: Heckman and Vytlacil (2005).


If there is no selection on unobserved variables conditional on covariates, UD ⊥⊥ (Y0, Y1) | X, then E(U1 | X, UD) = E(U1 | X) = 0 and E(U0 | X, UD) = E(U0 | X) = 0, so that the OLS weights are unity and OLS identifies both ATE and the parameter treatment on the treated (TT), which are the same under this assumption.

This condition is an implication of matching condition (M-1).

Given the assumed conditional independence in terms of X, we can identify ATE and TT without use of any instrument Z satisfying assumptions (A-1)–(A-2).

If there is such a Z, the conditional independence condition implies under (A-1)–(A-5) that E(Y | X, P(Z) = p) is linear in p.
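The linearity implication can be checked in a small simulation. The sketch below is illustrative only: it assumes a hypothetical MTE of the form 1 + slope · (0.5 − u_D), no covariates X, and a uniformly distributed P(Z). With slope = 0 there is no selection on unobservables and E(Y | P(Z) = p) should be linear in p; with slope = 2 the conditional mean is 2p − p², so a quadratic fit should pick up curvature near −1.

```python
import numpy as np

def simulate(slope, n=200_000, seed=0):
    """Simulate (P(Z), Y) when the MTE is 1 + slope * (0.5 - u_D)."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.05, 0.95, size=n)       # propensity score P(Z)
    u_d = rng.uniform(0.0, 1.0, size=n)       # U_D, uniform on (0, 1)
    d = p >= u_d                              # D = 1[P(Z) >= U_D]
    y0 = rng.normal(0.0, 1.0, size=n)         # untreated outcome
    gain = 1.0 + slope * (0.5 - u_d) + rng.normal(0.0, 0.5, size=n)
    y = y0 + d * gain                         # observed outcome
    return p, y

def quadratic_coefficient(p, y):
    # Coefficient on p**2 in a least-squares fit of y on (1, p, p**2).
    return float(np.polyfit(p, y, 2)[0])

curv_flat = quadratic_coefficient(*simulate(slope=0.0))
curv_selection = quadratic_coefficient(*simulate(slope=2.0))
print(f"curvature, no selection on unobservables: {curv_flat:+.3f}")
print(f"curvature, selection on the gain:         {curv_selection:+.3f}")
```

A quadratic fit is only one crude way to detect nonlinearity; it is used here because the simulated conditional mean is exactly quadratic.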


The assumptions are:

(A-1) (U0, U1, V) are independent of Z conditional on X (Independence);

(A-2) µD(Z) is a nondegenerate random variable conditional on X (Rank Condition);

(A-3) The distribution of V is continuous;

(A-4) The values of E(Y1) and E(Y0) are finite (Finite Means);

(A-5) 0 < Pr(D = 1 | X) < 1.


The conditional independence assumption invoked in the method of matching has come into widespread use for much the same reason that OLS has come into widespread use.

It is easy to implement with modern software and makes few demands on the data because it assumes the existence of X variables that satisfy the conditional independence assumptions.

The crucial conditional independence assumption is not testable.

As we note below, additional assumptions on the X are required to test the validity of the matching assumptions.


If the sole interest is to identify treatment on the treated, ∆TT, it is apparent from representation (2) that we can weaken (M-1) to

(M-1)′ Y0 ⊥⊥ D | X.

This is possible because E(Y1 | X, D = 1) is known from data on outcomes of the treated, so we only need to construct E(Y0 | X, D = 1).

In this case, MTE is not restricted to be flat in uD and the treatment parameters need not coincide.

A straightforward implication of (M-1)′ in the Roy model, where selection is made solely on the gain, is that persons must sort into treatment status positively in terms of levels of Y1.
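A small simulation illustrates what (M-1)′ buys. In the sketch below, which uses a hypothetical data-generating process and hypothetical variable names, the error in Y0 is independent of D given X, while selection into treatment depends on the individual gain. Cell-by-cell matching on X then recovers TT but not ATE.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 3, size=n)                # discrete covariate X in {0, 1, 2}
g = rng.normal(0.0, 1.0, size=n)              # idiosyncratic gain, known to the agent
y0 = x + rng.normal(0.0, 1.0, size=n)         # Y0: its error is independent of D given X
y1 = y0 + 1.0 + g                             # Y1 = Y0 + gain
d = (g + 0.5 * (x - 1)) > 0                   # selection on the gain and on X only
y = np.where(d, y1, y0)                       # observed outcome

true_tt = float((y1 - y0)[d].mean())          # infeasible: uses both potential outcomes
true_ate = float((y1 - y0).mean())

# Match within cells of X, then average over the distribution of X among the treated.
tt_hat = 0.0
for value in np.unique(x):
    cell = x == value
    contrast = y[cell & d].mean() - y[cell & ~d].mean()
    tt_hat += contrast * (cell & d).sum() / d.sum()

print(f"true TT={true_tt:.3f}  matched TT={tt_hat:.3f}  true ATE={true_ate:.3f}")
```

The matched contrast tracks TT because the untreated mean of Y0 in each cell stands in for E(Y0 | X, D = 1). It overstates ATE here because those who select into treatment have above-average gains.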

We now consider more generally the implications of assuming mean independence of the errors rather than full independence.


Matching and MTE Using Mean Independence Conditions

To identify all mean treatment parameters, one can weaken the assumption (M-1) to the condition that Y0 and Y1 are mean independent of D conditional on X.

However, (Y0, Y1) will be mean independent of D conditional on X without UD being independent of Y0, Y1 conditional on X only if fortuitous balancing occurs, with regions of positive dependence of (Y0, Y1) on UD and regions of negative dependence of (Y0, Y1) on UD just exactly offsetting each other.

Such balancing is not generic in the Roy model and in the generalized Roy model.


In particular, assume that Yj = µj(X) + Uj for j = 0, 1 and further assume that D = 1[Y1 − Y0 ≥ C(Z) + UC].

Let V = UC − (U1 − U0).

Assume (U0, U1, V) ⊥⊥ (X, Z).

Then if UC ⊥⊥ (U1 − U0), and UC has a log concave density, then E(Y1 − Y0 | X, V = v) is decreasing in v, ∆TT(x) > ∆ATE(x), and the matching conditions do not hold.

If UC ⊥⊥ (U1 − U0) but UC does not have a log concave density, then it is still the case that (U1 − U0, V) is negative quadrant dependent.
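The following sketch illustrates these claims numerically under assumptions chosen for convenience: the cost shock UC and the gain shock U1 − U0 are independent standard normals (so UC has a log concave density), and the mean gain and cost are hypothetical constants. It checks that E(U1 − U0 | V = v) falls with v, that TT exceeds ATE, and that one instance of the negative quadrant dependence inequality holds.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
gain_shock = rng.normal(0.0, 1.0, size=n)     # U1 - U0
u_c = rng.normal(0.0, 1.0, size=n)            # U_C, independent of U1 - U0
v = u_c - gain_shock                          # V = U_C - (U1 - U0)

mean_gain = 0.5                               # mu_1(x) - mu_0(x), hypothetical
cost = 0.5                                    # C(z), hypothetical
d = mean_gain + gain_shock >= cost + u_c      # D = 1[Y1 - Y0 >= C(Z) + U_C]

ate = mean_gain
tt = mean_gain + float(gain_shock[d].mean())

# E(U1 - U0 | V = v), approximated by means within deciles of V.
order = np.argsort(v)
decile_means = np.array(
    [chunk.mean() for chunk in np.array_split(gain_shock[order], 10)]
)

# One instance of negative quadrant dependence: Pr(A, B) <= Pr(A) Pr(B).
joint = float(np.mean((gain_shock <= 0) & (v <= 0)))
product = float(np.mean(gain_shock <= 0) * np.mean(v <= 0))

print(f"ATE={ate:.3f}  TT={tt:.3f}")
print("decile means of U1 - U0 by V:", np.round(decile_means, 2))
print(f"joint={joint:.3f}  product of marginals={product:.3f}")
```

Because TT and ATE differ, the mean gain of the treated cannot equal the population mean gain, which is why matching on X alone fails in this design.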


One can show that (U1 − U0, V) being negative quadrant dependent implies that ∆TT(x) > ∆ATE(x), and thus again that the matching conditions cannot hold.

We now develop a more general analysis.

Suppose that we assume selection model (3),

$$D^{*} = \mu_D(Z) - V; \qquad D = 1 \ \text{if } D^{*} \ge 0; \qquad D = 0 \ \text{otherwise}, \tag{3}$$

so that D = 1[P(Z) ≥ UD], where Z is independent of (Y0, Y1) conditional on X, UD = F_{V|X}(V), and P(Z) = F_{V|X}(µD(Z)).

Consider the weaker mean independence assumptions in place of assumption (M-1):


(M-3) E(Y1 | X, D) = E(Y1 | X), E(Y0 | X, D) = E(Y0 | X).


This assumption is all that is needed to identify the mean treatment parameters because under it

$$E(Y \mid X = x, Z = z, D = 1) = E(Y_1 \mid X = x, Z = z, D = 1) = E(Y_1 \mid X = x)$$

and

$$E(Y \mid X = x, Z = z, D = 0) = E(Y_0 \mid X = x, Z = z, D = 0) = E(Y_0 \mid X = x).$$

Thus we can identify all the mean treatment parameters over the support that satisfies (M-2).


Recalling that ∆ = Y1 − Y0, (M-3) implies in terms of UD that

$$E(\Delta \mid X = x, Z = z, U_D \le P(z)) = E(\Delta \mid X = x)$$

$$\Leftrightarrow \; E(\Delta^{MTE}(X, U_D) \mid X = x, U_D \le P(z)) = E(\Delta \mid X = x),$$

and hence

$$E(\Delta^{MTE}(X, U_D) \mid X = x, U_D \le P(z)) = E(\Delta^{MTE}(X, U_D) \mid X = x, U_D > P(z)).$$

If the support of P(Z) is the full unit interval conditional on X = x, then ∆MTE(X, UD) = E(∆ | X = x) for all UD.

If the support of P(Z) is a proper subset of the full unit interval, then generically (M-3) will hold only if ∆MTE(X, UD) = E(∆ | X = x) for all UD, though positive and negative parts could balance out for any particular value of X.
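The step from full support to a flat MTE can be spelled out as follows. This is a sketch, assuming UD is uniform on (0, 1) conditional on X (which follows from UD = F_{V|X}(V) with V continuous) and independent of Z given X. The shorthand \bar{\Delta}(x) is introduced here for E(∆ | X = x) and is not notation from the slides.

```latex
\begin{align*}
\bar{\Delta}(x) &:= E(\Delta \mid X = x), \\
E\bigl(\Delta^{MTE}(X, U_D) \mid X = x,\ U_D \le p\bigr)
  &= \frac{1}{p} \int_0^{p} \Delta^{MTE}(x, u)\, du
   = \bar{\Delta}(x)
   \quad \text{for every } p \text{ in the support of } P(Z), \\
\int_0^{p} \Delta^{MTE}(x, u)\, du &= p\, \bar{\Delta}(x), \\
\Delta^{MTE}(x, p) &= \bar{\Delta}(x)
   \quad \text{(differentiating in } p \text{ when the support is the full unit interval)}.
\end{align*}
```

When the support of P(Z) is only a subset of the unit interval, the integral restriction binds only at those values of p, so deviations of the MTE from \bar{\Delta}(x) that integrate to zero between support points are not ruled out. This is the balancing case mentioned above.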


To see this, note that

$$E_Z\bigl(E(\Delta^{MTE}(X, U_D) \mid X = x, U_D \le P(z)) \mid X = x, D = 1\bigr) = E_Z\bigl(E(\Delta^{MTE}(X, U_D) \mid X = x, U_D > P(z)) \mid X = x, D = 0\bigr).$$

Working with V = F^{-1}_{V|X}(UD), suppose that

D = 1[µD(Z, V) ≥ 0].

Let Ω(z) = {v : µD(z, v) ≥ 0}.

Then (M-3) implies that

$$E(\Delta^{MTE}(X, V) \mid X = x, V \in \Omega(z)) = E(\Delta^{MTE}(X, V) \mid X = x, V \in (\Omega(z))^{c})$$

so we expect that generically under assumption (M-3) we obtain a flat MTE in terms of V = F^{-1}_{V|X}(UD).

We conduct a parallel analysis for the nonseparable choice model and obtain similar conditions.


Matching assumes a flat MTE, i.e., that marginal returns conditional on X and V do not depend on V (alternatively, that marginal returns do not depend on UD given X).

We already noted that IV and matching invoke very different assumptions.

Matching requires no exclusion restrictions whereas IV is based on the existence of exclusion restrictions.

Superficially, we can bridge these literatures by invoking matching with an exclusion condition: (Y0, Y1) is not independent of D given X, but (Y0, Y1) ⊥⊥ D | X, Z.

This looks like an IV condition, but it is not.

49 / 163


We explore the relationship between matching with exclusion and IV and demonstrate a fundamental contradiction between the two identifying conditions.

For an additively separable representation of the outcome equations, $U_1 = Y_1 - E(Y_1 \mid X)$ and $U_0 = Y_0 - E(Y_0 \mid X)$, we establish that if $(U_0, U_1)$ is mean independent of D conditional on (X, Z), as required by IV, but $(U_0, U_1)$ is not mean independent of D conditional on X alone, then $U_0$ is dependent on Z conditional on X, contrary to all assumptions used to justify instrumental variables.
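
One way to see the contradiction is through the law of iterated expectations. The following is a sketch for the $U_0$ component only (taking $U_0$ as the component that fails mean independence given X). It assumes the IV exclusion requirement can be written as $E(U_0 \mid X, Z) = E(U_0 \mid X)$, which is a formalization chosen for this illustration rather than one stated on the slides. For $d \in \{0, 1\}$, mean independence of D given (X, Z) gives

\[
E(U_0 \mid X, D = d) = E\bigl[ E(U_0 \mid X, Z, D = d) \bigm| X, D = d \bigr] = E\bigl[ E(U_0 \mid X, Z) \bigm| X, D = d \bigr].
\]

If $E(U_0 \mid X, Z)$ did not vary with Z, the inner term would equal $E(U_0 \mid X)$, so that $E(U_0 \mid X, D = d) = E(U_0 \mid X)$ for both values of d. That is mean independence of D given X alone, which the matching-with-exclusion condition rules out. Hence $E(U_0 \mid X, Z)$ must depend on Z.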

We next consider how to implement matching.

50 / 163


Implementing the Method of Matching

We draw on Heckman et al. (1998) and Heckman et al. (1999) to describe the mechanics of matching. Todd (2007, 2008) presents a comprehensive treatment of the main issues and a guide to software.

51 / 163


To operationalize the method of matching, we assume two samples: “t” for treatment and “c” for comparison group.

Treatment group members have D = 1 and control group members have D = 0.

Unless otherwise noted, we assume that observations are statistically independent within and across groups.

Simple matching methods are based on the following idea.

For each person i in the treatment group, we find some group of “comparable” persons.

The same individual may be in both treated and control groups if that person is treated at one time and untreated at another.

52 / 163


We denote outcomes for person i in the treatment group by $Y_i^t$, and we match these outcomes to the outcomes of a subsample of persons in the comparison group to estimate a treatment effect.

In principle, we can use a different subsample as a comparison group for each person.

In practice, we can construct matches on the basis of a neighborhood $\xi(X_i)$, where $X_i$ is a vector of characteristics for person i.

Neighbors to treated person i are persons in the comparison sample whose characteristics are in neighborhood $\xi(X_i)$.

Suppose that there are $N_c$ persons in the comparison sample and $N_t$ in the treatment sample.

53 / 163


Thus the persons in the comparison sample who are neighbors to i are persons j for whom $X_j \in \xi(X_i)$, i.e., the set of persons $A_i = \{ j \mid X_j \in \xi(X_i) \}$.

Let $W(i, j)$ be the weight placed on observation j in forming a comparison with observation i, and further assume that the weights sum to one, $\sum_{j=1}^{N_c} W(i, j) = 1$, and that $0 \le W(i, j) \le 1$.

Form a weighted comparison group mean for person i, given by

\[
\bar{Y}_i^c = \sum_{j=1}^{N_c} W(i, j) \, Y_j^c. \tag{4}
\]

The estimated treatment effect for person i is $Y_i^t - \bar{Y}_i^c$.
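
The construction in equation (4) can be sketched numerically. The example below is a minimal illustration, assuming a scalar characteristic, an interval neighborhood of hypothetical half-width `radius`, and equal weights within each neighborhood; the function name and the data are made up for this sketch and are not part of the original slides.

```python
import numpy as np

def neighborhood_weights(x_t, x_c, radius):
    """Equal weights over A_i = {j : |x_c[j] - x_t[i]| <= radius}.

    Returns an (N_t, N_c) matrix whose rows lie in [0, 1] and sum to one.
    """
    x_t = np.asarray(x_t, dtype=float)
    x_c = np.asarray(x_c, dtype=float)
    in_nbhd = np.abs(x_c[None, :] - x_t[:, None]) <= radius  # shape (N_t, N_c)
    counts = in_nbhd.sum(axis=1, keepdims=True)
    if np.any(counts == 0):
        raise ValueError("some treated person has an empty neighborhood")
    return in_nbhd / counts

# Hypothetical data: two treated persons, three comparison persons.
x_t = np.array([1.0, 4.0])
x_c = np.array([0.5, 1.5, 4.2])
y_t = np.array([10.0, 12.0])
y_c = np.array([8.0, 9.0, 11.0])

W = neighborhood_weights(x_t, x_c, radius=1.0)
y_bar_c = W @ y_c          # weighted comparison group mean, one per treated person
effects = y_t - y_bar_c    # estimated treatment effect for each treated person
```

Here the first treated person has two neighbors, each weighted 0.5, and the second has a single neighbor with weight 1.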

54 / 163


This selects a set of comparison group members associated with i and the mean of their outcomes.

Unlike IV or the control function approach, the method of matching identifies counterfactuals for each treated member.

Heckman et al. (1997) and Heckman et al. (1999) survey a variety of alternative matching schemes proposed in the literature.

Todd (2007, 2008) provides a comprehensive survey.

In this chapter, we briefly consider two widely used methods.

55 / 163


The nearest-neighbor matching estimator defines $A_i$ such that only one j is selected so that it is closest to $X_i$ in some metric:

\[
A_i = \Bigl\{ j \Bigm| \min_{j \in \{1, \ldots, N_c\}} \lVert X_i - X_j \rVert \Bigr\},
\]

where “‖ ‖” is a metric measuring distance in the X characteristics space.

The Mahalanobis metric is one widely used metric for implementing the nearest-neighbor matching estimator.

This metric defines neighborhoods for i as

\[
\lVert \cdot \rVert = (X_i - X_j)' \, \Sigma_c^{-1} \, (X_i - X_j),
\]

where $\Sigma_c$ is the covariance matrix in the comparison sample.
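
As a rough illustration of nearest-neighbor matching under the Mahalanobis metric, the sketch below computes the quadratic form for every treated-comparison pair and picks the closest comparison person. The function name and the toy data are assumptions made for this example, and the covariance matrix is estimated with NumPy's default sample covariance.

```python
import numpy as np

def mahalanobis_nearest_neighbor(X_t, X_c):
    """Return the index of the nearest comparison person for each treated
    person, plus the full (N_t, N_c) matrix of Mahalanobis distances."""
    X_t = np.asarray(X_t, dtype=float)
    X_c = np.asarray(X_c, dtype=float)
    # Covariance matrix estimated in the comparison sample.
    sigma_c = np.atleast_2d(np.cov(X_c, rowvar=False))
    sigma_inv = np.linalg.inv(sigma_c)
    diff = X_t[:, None, :] - X_c[None, :, :]                  # (N_t, N_c, k)
    # Quadratic form (X_i - X_j)' Sigma_c^{-1} (X_i - X_j) for every pair.
    dist = np.einsum("ijk,kl,ijl->ij", diff, sigma_inv, diff)
    return dist.argmin(axis=1), dist

# Hypothetical comparison sample on the corners of a square, two treated persons.
X_c = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
X_t = np.array([[0.1, 0.2], [1.9, 1.5]])

matches, dist = mahalanobis_nearest_neighbor(X_t, X_c)
```

In this toy sample the two characteristics are uncorrelated with equal variance, so the Mahalanobis ranking coincides with the Euclidean one; with correlated or differently scaled characteristics the two rankings can differ.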

56 / 163


The weighting scheme for the nearest-neighbor matching estimator is

\[
W(i, j) =
\begin{cases}
1 & \text{if } j \in A_i, \\
0 & \text{otherwise.}
\end{cases}
\]

The nearest neighbor in the metric “‖·‖” is used in the match.

A version of nearest-neighbor matching, called “caliper” matching (Cochran and Rubin, 1973), makes matches to person i only if

\[
\lVert X_i - X_j \rVert < \varepsilon,
\]

where ε is a pre-specified tolerance.

Otherwise, person i is bypassed and no match is made to him or her.
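
The caliper rule can be sketched as a filter on top of nearest-neighbor weights. The example assumes Euclidean distance for simplicity (any metric could be substituted), and the function name, the tolerance `eps`, and the data are hypothetical.

```python
import numpy as np

def caliper_match(X_t, X_c, eps):
    """Nearest-neighbor weights with a caliper.

    Returns W of shape (N_t, N_c) with a single 1 in each matched row, and a
    boolean vector marking which treated persons were matched. Bypassed
    persons have an all-zero row and should be dropped from later averages.
    """
    X_t = np.asarray(X_t, dtype=float)
    X_c = np.asarray(X_c, dtype=float)
    diff = X_t[:, None, :] - X_c[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))       # Euclidean distance (assumed)
    rows = np.arange(len(X_t))
    nearest = dist.argmin(axis=1)
    matched = dist[rows, nearest] < eps           # caliper condition
    W = np.zeros_like(dist)
    W[rows[matched], nearest[matched]] = 1.0
    return W, matched

# Hypothetical scalar characteristic; the second treated person has no
# comparison person within the tolerance and is bypassed.
X_t = np.array([[0.0], [5.0]])
X_c = np.array([[0.2], [1.0], [3.0]])

W, matched = caliper_match(X_t, X_c, eps=0.5)
```

A tighter tolerance improves the quality of the retained matches but bypasses more treated persons, which changes the population the estimate refers to.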

57 / 163


Kernel matching uses the entire comparison sample, so that $A_i = \{1, \ldots, N_c\}$, and sets

\[
W(i, j) = \frac{K(X_j - X_i)}{\sum_{j=1}^{N_c} K(X_j - X_i)},
\]

where K is a kernel.

Kernel matching is a smooth method that reuses and weights the comparison group sample observations differently for each person i in the treatment group with a different $X_i$.

Kernel matching can be defined pointwise at each sample point $X_i$ or for broader intervals.
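
A minimal sketch of kernel weights follows. It assumes a scalar characteristic and a Gaussian kernel with a bandwidth `h`; the bandwidth is not part of the formula above and is introduced here only as an illustrative tuning parameter, as are the function name and data.

```python
import numpy as np

def kernel_weights(x_t, x_c, h):
    """Kernel matching weights W(i, j) for scalar X with a Gaussian kernel.

    Each row is normalized by the sum of kernel values over the whole
    comparison sample, so rows lie in [0, 1] and sum to one.
    """
    x_t = np.asarray(x_t, dtype=float)
    x_c = np.asarray(x_c, dtype=float)
    u = (x_c[None, :] - x_t[:, None]) / h     # scaled differences X_j - X_i
    K = np.exp(-0.5 * u ** 2)                 # Gaussian kernel (unnormalized)
    return K / K.sum(axis=1, keepdims=True)

# Hypothetical data: two treated persons, three comparison persons.
W = kernel_weights([0.0, 2.0], [0.0, 1.0, -1.0], h=1.0)
```

Every comparison observation receives positive weight for every treated person, with the weight declining in the distance between characteristics. A smaller `h` concentrates weight on the closest observations.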

58 / 163


For example, the impact of treatment on the treated can be estimated by forming the mean difference across the i:

\[
\Delta_{TT} = \frac{1}{N_t} \sum_{i=1}^{N_t} \bigl( Y_i^t - \bar{Y}_i^c \bigr)
= \frac{1}{N_t} \sum_{i=1}^{N_t} \Bigl( Y_i^t - \sum_{j=1}^{N_c} W(i, j) \, Y_j^c \Bigr). \tag{5}
\]

This mean can be computed for subsets of the treatment sample, which can be defined in various ways.

More efficient estimators weight the observations accounting for the variance (Abadie and Imbens, 2006; Heckman, 1998; Heckman, Ichimura, and Todd, 1997, 1998; Hirano, Imbens, and Ridder, 2003).
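
Equation (5) reduces to a one-line computation once a weight matrix is available. The sketch below takes the weights as given (from any matching scheme) and also shows the subset version through a boolean mask; the function name, the mask, and the numbers are hypothetical.

```python
import numpy as np

def tt_estimate(y_t, y_c, W, subset=None):
    """Mean of Y_i^t minus the weighted comparison mean, as in equation (5).

    `subset` is an optional boolean mask selecting treated persons.
    """
    y_t = np.asarray(y_t, dtype=float)
    y_c = np.asarray(y_c, dtype=float)
    effects = y_t - np.asarray(W, dtype=float) @ y_c   # person-level effects
    if subset is not None:
        effects = effects[np.asarray(subset, dtype=bool)]
    return effects.mean()

# Hypothetical outcomes and weights for three treated, three comparison persons.
y_t = np.array([10.0, 12.0, 7.0])
y_c = np.array([8.0, 9.0, 11.0])
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

tt = tt_estimate(y_t, y_c, W)
tt_subset = tt_estimate(y_t, y_c, W, subset=[True, True, False])
```

This is the simple unweighted average; it does not implement the variance-weighted estimators cited above.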

59 / 163


Matching assumes that conditioning on X eliminates selection bias.

The method requires no functional form assumptions for outcome equations.

If, however, a functional form assumption is maintained, as in the econometric procedure proposed by Barnow et al. (1980), it is possible to implement the matching assumption using standard regression analysis.

60 / 163


Suppose, for example, that $Y_0$ is linearly related to observables X and an unobservable $U_0$, so that

\[
E(Y_0 \mid X, D = 0) = X\alpha + E(U_0 \mid X, D = 0),
\]

and

\[
E(U_0 \mid X, D = 0) = E(U_0 \mid X)
\]

is linear in X ($E(U_0 \mid X) = \varphi X$).

Under these assumptions, controlling for X via linear regression allows one to identify $E(Y_0 \mid X, D = 1)$ from the data on nonparticipants.

Under assumption (Q-4)′, setting X = Q, this approach justifies OLS equation (Q-3) for identifying treatment effects.
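
The regression version of the argument can be sketched on simulated data: fit a linear regression of the outcome on X among nonparticipants, then use the fitted line to predict the untreated outcome at the participants' X. Everything in the example (the data-generating process, the true effect of 3, the function name) is an assumption made for illustration. Note that the fitted slope estimates the combined coefficient $\alpha + \varphi$, not $\alpha$ alone.

```python
import numpy as np

def predict_y0_for_treated(X_c, y_c, X_t):
    """OLS of Y on X (with intercept) among D = 0, evaluated at the treated X."""
    X_c = np.asarray(X_c, dtype=float)
    X_t = np.asarray(X_t, dtype=float)
    Z_c = np.column_stack([np.ones(len(X_c)), X_c])
    Z_t = np.column_stack([np.ones(len(X_t)), X_t])
    beta, *_ = np.linalg.lstsq(Z_c, np.asarray(y_c, dtype=float), rcond=None)
    return Z_t @ beta

# Simulated data in which the linear model holds and the true effect is 3.
rng = np.random.default_rng(0)
X_c = rng.normal(0.0, 1.0, size=(200, 1))    # nonparticipants
X_t = rng.normal(1.0, 1.0, size=(100, 1))    # participants, shifted distribution
y_c = 1.0 + 2.0 * X_c[:, 0] + rng.normal(0.0, 0.1, size=200)
y_t = 1.0 + 2.0 * X_t[:, 0] + 3.0 + rng.normal(0.0, 0.1, size=100)

y0_hat = predict_y0_for_treated(X_c, y_c, X_t)   # estimate of E(Y0 | X, D = 1)
tt_hat = np.mean(y_t - y0_hat)                   # close to 3 in this simulation
```

The estimate is accurate here only because the simulated outcome really is linear in X; where the treated X lie outside the range of the comparison X, the prediction rests on extrapolating the fitted line.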

61 / 163


Such functional form assumptions are not strictly required to implement the method of matching.

Moreover, in practice, users of the method of Barnow et al. (1980) do not impose the common support condition (M-2) for the distribution of X when generating estimates of the treatment effect.

The distribution of X may be very different in the treatment group (D = 1) and comparison group (D = 0) samples, so that comparability is only achieved by imposing linearity in the parameters and extrapolating over different regions.

One advantage of the method of Barnow et al. (1980) is that it uses data parsimoniously.

If the X are high dimensional, the number of observations in each cell when matching can get very small.
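
The cell-size problem can be seen in a small count. The sketch below is illustrative only: it assumes k independent binary covariates and a treatment assigned by a fair coin, and the function name `cell_summary` is hypothetical. It reports how many covariate cells contain both participants and nonparticipants, which is what exact matching on X needs.

```python
import random
from collections import Counter

def cell_summary(n, k, seed=0):
    """Count matching cells for n observations and k binary covariates."""
    rng = random.Random(seed)
    treated = Counter()
    control = Counter()
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(k))   # one covariate cell
        if rng.random() < 0.5:                           # assumed coin-flip treatment
            treated[x] += 1
        else:
            control[x] += 1
    cells = set(treated) | set(control)
    # A cell is usable for exact matching only if both groups appear in it.
    usable = sum(1 for c in cells if treated[c] > 0 and control[c] > 0)
    return {
        "possible_cells": 2 ** k,
        "occupied_cells": len(cells),
        "cells_with_both_groups": usable,
        "mean_obs_per_possible_cell": n / 2 ** k,
    }
```

Comparing `cell_summary(1000, 2)` with `cell_summary(1000, 15)` shows the pattern: with the same sample size, the number of possible cells grows from 4 to 32,768, and few occupied cells contain both groups.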

Another solution to this problem that reduces the dimension of the matching problem without imposing arbitrary linearity assumptions is based on the probability of participation or the “propensity score,” P(X) = Pr(D = 1 | X).

Rosenbaum and Rubin (1983) demonstrate that under assumptions (M-1) and (M-2),

(Y0, Y1) ⊥⊥ D | P(X) for X ∈ χc, (6)

for some set χc, where it is assumed that (M-2) holds in the set.

Conditioning either on P(X) or on X produces conditional independence.
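
A sketch of the standard argument behind (6) may help. It uses only (M-1), (Y0, Y1) ⊥⊥ D | X, together with the fact that P(X) is a function of X; the layout below is ours and is not taken from the slides.

```latex
\begin{align*}
\Pr(D = 1 \mid Y_0, Y_1, P(X))
  &= E\big[\Pr(D = 1 \mid Y_0, Y_1, X) \,\big|\, Y_0, Y_1, P(X)\big]
     && \text{(iterated expectations)} \\
  &= E\big[\Pr(D = 1 \mid X) \,\big|\, Y_0, Y_1, P(X)\big]
     && \text{(by (M-1))} \\
  &= E\big[P(X) \,\big|\, Y_0, Y_1, P(X)\big] = P(X).
\end{align*}
```

The right-hand side does not depend on (Y0, Y1), and the same steps without (Y0, Y1) give Pr(D = 1 | P(X)) = P(X), so D is independent of (Y0, Y1) given P(X).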

Conditioning on P(X) reduces the dimension of the matching problem down to matching on the scalar P(X).

The analysis of Rosenbaum and Rubin (1983) assumes that P(X) is known rather than estimated.

Heckman et al. (1998), Hahn (1998), and Hirano et al. (2003) present the asymptotic distribution theory for the kernel matching estimator in the cases in which P(X) is known and in which it is estimated both parametrically and nonparametrically.
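
To make matching on the scalar P(X) concrete, here is a minimal sketch of a kernel matching estimator of treatment on the treated. It is an illustration under stated assumptions, not the estimator analyzed in the papers cited above: P(X) is treated as known, the data-generating process (two covariates, a logistic participation rule, a constant effect of 2.0), the Epanechnikov kernel, the bandwidth, and all function names are choices made for this example.

```python
import math
import random

def propensity(x1, x2):
    """Assumed participation probability P(X); treated as known in this sketch."""
    return 1.0 / (1.0 + math.exp(-(x1 + x2)))

def simulate(n=3000, effect=2.0, seed=1):
    """Hypothetical data in which D depends on X only, so (M-1) holds."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        p = propensity(x1, x2)
        d = 1 if rng.random() < p else 0
        y0 = 3.0 * x1 + x2 ** 2 + rng.gauss(0.0, 1.0)   # Y0 is not linear in X
        data.append((p, d, y0 + effect * d))            # keep only (P, D, Y)
    return data

def kernel_matching_tt(data, bandwidth=0.05):
    """For each participant, compare Y with a kernel-weighted mean of
    nonparticipant outcomes at nearby values of the scalar P(X)."""
    controls = [(p, y) for p, d, y in data if d == 0]
    gaps = []
    for p_i, d_i, y_i in data:
        if d_i == 0:
            continue
        num = den = 0.0
        for p_j, y_j in controls:
            u = (p_j - p_i) / bandwidth
            if abs(u) < 1.0:
                w = 0.75 * (1.0 - u * u)                # Epanechnikov weight
                num += w * y_j
                den += w
        if den > 0.0:                                   # skip if no nearby controls
            gaps.append(y_i - num / den)
    return sum(gaps) / len(gaps)

def naive_difference(data):
    """Raw participant-minus-nonparticipant mean, ignoring P(X)."""
    treated = [y for _, d, y in data if d == 1]
    untreated = [y for _, d, y in data if d == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)
```

The estimator never looks at X directly: only the scalar P(X) enters the comparison. When P(X) has to be estimated, the sampling distribution changes, which is the subject of the distribution theory cited above.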

Conditioning on P identifies all treatment parameters but, as we have seen, it imposes the assumption of a flat MTE.

Marginal returns and average returns are the same.

A consequence of (6) is that

E(Y1 | D = 0, P(X)) = E(Y1 | D = 1, P(X)) = E(Y1 | P(X)),

E(Y0 | D = 1, P(X)) = E(Y0 | D = 0, P(X)) = E(Y0 | P(X)).
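
These two equalities can be combined into a short worked step. The derivation below assumes the usual observed-outcome notation Y = D Y1 + (1 − D) Y0, which is not restated on this slide.

```latex
\begin{align*}
E(Y_1 - Y_0 \mid D = 1, P(X))
  &= E(Y_1 \mid D = 1, P(X)) - E(Y_0 \mid D = 1, P(X)) \\
  &= E(Y \mid D = 1, P(X)) - E(Y \mid D = 0, P(X)) \\
  &= E(Y_1 \mid P(X)) - E(Y_0 \mid P(X)) = E(Y_1 - Y_0 \mid P(X)).
\end{align*}
```

The same steps apply with D = 0 on the left-hand side, so at each value of P(X) treatment on the treated, treatment on the untreated, and the average treatment effect coincide, and each equals the observable contrast in the second line.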

Support condition (M-2) has the unattractive feature that if the analyst has too much information about the decision of who takes treatment, so that P(X) = 1 or 0, the method breaks down at such values of X because people cannot be compared at a common X.

The method of matching assumes that, given X, some unspecified randomization in the economic environment allocates people to treatment.

This justifies assumption (Q-5) in the OLS example.

The fact that the cases P(X) = 1 and P(X) = 0 must be eliminated suggests that methods for choosing X based on the fit of the model to data on D are potentially problematic, as we discuss below.
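
The breakdown at P(X) = 1 or 0 can be illustrated with a small check. This is a sketch with hypothetical names and toy data: P(X) is estimated by the participation frequency within each cell of a discrete X, and only cells with 0 < P(X) < 1 are retained.

```python
from collections import defaultdict

def estimated_propensity(records):
    """records: iterable of (x, d), with hashable x and d in {0, 1}.
    Returns the participation frequency in each cell of x."""
    counts = defaultdict(lambda: [0, 0])        # [nonparticipants, participants]
    for x, d in records:
        counts[x][d] += 1
    return {x: n1 / (n0 + n1) for x, (n0, n1) in counts.items()}

def common_support(records):
    """Cells of x in which both participants and nonparticipants are observed."""
    return {x for x, p in estimated_propensity(records).items() if 0.0 < p < 1.0}

# Toy data (assumed): a coarse covariate leaves both groups in every cell.
coarse = [("a", 1), ("a", 0), ("b", 1), ("b", 0), ("b", 0)]

# Adding a variable that predicts D perfectly (here, D itself) gives
# P(X) = 0 or 1 in every cell, so no one can be compared at a common X.
fine = [((g, d), d) for g, d in coarse]
```

With the coarse covariate every cell survives, while with the perfectly predictive covariate `common_support(fine)` is empty. A covariate set chosen to maximize the fit to D moves toward the second case.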

Offsetting these disadvantages, the method of matching with a known conditioning set that produces condition (M-2) does not require separability of outcome or choice equations, exogeneity of conditioning variables, exclusion restrictions, or adoption of specific functional forms of outcome equations.

Such features are commonly used in conventional selection (control function) methods and conventional applications of IV, although, as we have demonstrated, recent work in semiparametric estimation relaxes these assumptions.

The method of matching does not strictly require (M-1).

One can get by with the weaker mean independence assumptions (M-3) in place of the stronger condition (M-1).

However, if (M-3) is invoked, the assumption that one can replace X by P(X) does not follow from the analysis of Rosenbaum and Rubin (1983), and is an additional new assumption.

Methods for implementing matching are provided in Heckman et al. (1998) and are discussed extensively in Heckman et al. (1999).

See Todd (1999, 2007, 2008) for software and extensive discussion of the mechanics of matching.

We now contrast the identifying assumptions used in the method of control functions with those used in matching.

Comparing Matching and Control Functions Approaches

The method of matching eliminates the dependence between (Y0, Y1) and D, (Y0, Y1) ⊥̸⊥ D, by assuming access to conditioning variables X such that (M-1) is satisfied: (Y0, Y1) ⊥⊥ D | X.

By conditioning on observables, one can identify the distributions of Y0 and Y1 over the support of X satisfying (M-2).

Other methods model the dependence that gives rise to the spurious relationship and in this way attempt to eliminate it.

IV involves exclusion and a different type of conditional independence, (Y0, Y1) ⊥⊥ Z | X, as well as a rank condition (Pr(D = 1 | X, Z) depends on Z).

The instrument Z plays the role of the implicit randomization used in matching by allocating people to treatment status in a way that does not depend on (Y0, Y1).

We have already established that matching and IV make very different assumptions.

Thus, in general, a matching assumption that (Y0, Y1) ⊥⊥ D | X, Z neither implies nor is implied by (Y0, Y1) ⊥⊥ Z | X.
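
One direction of this non-implication can be illustrated by simulation. In the sketch below, every modeling choice is an assumption made for the example: there is no X, the instrument Z is a fair coin independent of an unobserved θ, participation depends on both θ and Z, and the treatment effect is a constant 1.0. The IV independence condition holds by construction, while (Y0, Y1) ⊥⊥ D | Z fails because θ drives both choices and outcomes.

```python
import random

def simulate(n=40000, effect=1.0, seed=2):
    """Hypothetical model with selection on an unobservable theta."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)          # unobserved by the analyst
        z = rng.randint(0, 1)                # instrument, independent of theta
        d = 1 if theta + z > 0.5 else 0      # choice depends on theta and Z
        y = theta + rng.gauss(0.0, 1.0) + effect * d
        rows.append((z, d, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def wald_iv(rows):
    """Ratio of the Z-contrast in Y to the Z-contrast in D."""
    y1 = mean(y for z, _, y in rows if z == 1)
    y0 = mean(y for z, _, y in rows if z == 0)
    d1 = mean(d for z, d, _ in rows if z == 1)
    d0 = mean(d for z, d, _ in rows if z == 0)
    return (y1 - y0) / (d1 - d0)

def match_on_z(rows):
    """Participant-minus-nonparticipant mean of Y within each Z cell,
    averaged over cells with weights equal to cell shares."""
    total = 0.0
    for cell in (0, 1):
        treated = [y for z, d, y in rows if z == cell and d == 1]
        untreated = [y for z, d, y in rows if z == cell and d == 0]
        weight = (len(treated) + len(untreated)) / len(rows)
        total += weight * (mean(treated) - mean(untreated))
    return total
```

In this design `wald_iv(simulate())` should be close to the assumed effect, while `match_on_z` is far above it, because conditioning on Z does nothing to remove the dependence of D on θ.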

One special case where they are equivalent is when treatment status is assigned by randomization with full compliance (letting ξ = 1 denote assignment to treatment, ξ = 1 ⇒ A = 1 and ξ = 0 ⇒ A = 0) and Z = ξ, so that the instrument is the assignment mechanism.

A = 1 if the person actually receives treatment, and A = 0 otherwise.
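
The equivalence in this special case can be written out in one line. The sketch below assumes, for illustration, that there are no covariates X, that the treatment indicator is actual receipt A, and that the observed outcome is Y = A Y1 + (1 − A) Y0; these conventions are ours and are not stated on the slide.

```latex
\begin{align*}
\frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{\Pr(A = 1 \mid Z = 1) - \Pr(A = 1 \mid Z = 0)}
  &= \frac{E(Y \mid A = 1) - E(Y \mid A = 0)}{1 - 0} \\
  &= E(Y_1 \mid \xi = 1) - E(Y_0 \mid \xi = 0) \\
  &= E(Y_1) - E(Y_0).
\end{align*}
```

The first equality uses full compliance with Z = ξ, so that A = Z; the last uses randomization of ξ. The left-hand side is the IV (Wald) contrast and the first right-hand side is the simple comparison of recipients with nonrecipients, so the two approaches deliver the same parameter here.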

The method of control functions explicitly models the dependence between (Y0, Y1) and D and attempts to eliminate it.

Matzkin (2007) provides a comprehensive review of these methods.

We present a summary of some of the general principles underlying the method of control functions, the method of control variates, replacement functions, and proxy approaches as they apply to the selection problem.

All of these methods attempt to eliminate the θ in (U-1) that produces the dependence captured in (U-2).

In this section, we relate matching to the form of the control function introduced in Heckman (1980) and Heckman and Robb (1985, 1986a).

This version was used in our analysis of local instrumental variables (LIV), where we compare LIV with control function approaches and show that LIV and LATE estimate derivatives of the control functions.

(U-1)  (Y0,Y1) ⊥⊥ D | X, Z, θ,

but

(U-2)  (Y0,Y1) ⊥̸⊥ D | X, Z.
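A small simulation may help fix ideas about how (U-1) and (U-2) can hold at the same time. The sketch below is not from the source: it assumes a hypothetical discrete θ that shifts both the choice equation and Y0, omits X and Z, and uses illustrative names (`theta`, `eta`, `eps`). Because it is a simulation, Y0 is visible for everyone, which is never true in real data.

```python
import random
from statistics import fmean

random.seed(1)

def simulate(n=60000):
    """Draw (theta, D, Y0) where theta drives both choice and outcome."""
    rows = []
    for _ in range(n):
        theta = random.choice((-1, 0, 1))   # unobserved by the analyst
        eta = random.gauss(0.0, 1.0)        # choice shock, independent of eps
        eps = random.gauss(0.0, 1.0)        # outcome shock
        d = 1 if theta + eta > 0 else 0     # hypothetical treatment choice
        y0 = theta + eps                    # untreated potential outcome
        rows.append((theta, d, y0))
    return rows

def gap(rows):
    """Mean of Y0 among D = 1 minus mean of Y0 among D = 0."""
    treated = [y for (_, d, y) in rows if d == 1]
    untreated = [y for (_, d, y) in rows if d == 0]
    return fmean(treated) - fmean(untreated)

rows = simulate()
# Without conditioning on theta, Y0 and D are dependent, in the spirit of (U-2).
raw_gap = gap(rows)
# Within each theta cell the gap is close to zero, in the spirit of (U-1).
cell_gaps = {t: gap([r for r in rows if r[0] == t]) for t in (-1, 0, 1)}
print(raw_gap, cell_gaps)
```

In this design the raw gap is clearly positive while the within-cell gaps are near zero, which is the sense in which conditioning on θ removes the dependence.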

We analyze conditional means because of their familiarity.

Using the fact that E(1(Y ≤ y) | X) = F(y | X), the analysis applies to marginal distributions as well.
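The indicator identity can be checked numerically. The following is a minimal sketch under an assumed data-generating process (X in {0, 1} and Y | X = x ~ N(x, 1)); the function name `conditional_cdf` and the data are hypothetical and serve only to show that a conditional mean of 1(Y ≤ y) estimates F(y | X).

```python
import random
from statistics import NormalDist, fmean

random.seed(0)

def conditional_cdf(pairs, x, y):
    """Estimate F(y | X = x) as the cell mean of the indicator 1(Y <= y)."""
    return fmean(1.0 if yi <= y else 0.0 for (xi, yi) in pairs if xi == x)

# Hypothetical data: X in {0, 1} and Y | X = x ~ N(x, 1).
pairs = [(x, random.gauss(x, 1.0)) for x in (0, 1) for _ in range(20000)]

for x in (0, 1):
    estimate = conditional_cdf(pairs, x, 0.5)
    truth = NormalDist(mu=x, sigma=1.0).cdf(0.5)
    print(x, round(estimate, 3), round(truth, 3))
```

Any method stated for conditional means can therefore be applied to the indicator outcome to recover a conditional distribution function, one value of y at a time.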

Thus we work with conditional expectations of (Y0,Y1) given (X, Z, D), where Z is assumed to include at least one variable not in X.

Conventional applications of the control function method assume additive separability, which is not required in matching.

Strictly speaking, additive separability is not required in the application of control functions either.

What is required is a model relating the outcome unobservables to the observables and the unobservables in the choice of treatment equation.

Various assumptions give operational content to (U-1).

For the additively separable case (7), the control function for mean outcomes models the conditional expectations of Y1 and Y0 given X, Z, and D as

E(Y1 | Z, X, D = 1) = µ1(X) + E(U1 | Z, X, D = 1)

E(Y0 | Z, X, D = 0) = µ0(X) + E(U0 | Z, X, D = 0).

Y1 = µ1(X) + U1 (7)

Y0 = µ0(X) + U0.

In the traditional method of control functions, the analyst models E(U1 | Z, X, D = 1) and E(U0 | Z, X, D = 0).

If these functions can be independently varied against µ1(X) and µ0(X) respectively, one can identify µ1(X) and µ0(X) up to constant terms.

It is not required that X or Z be stochastically independent of U1 or U0, although conventional methods often assume this.
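To make the idea of modeling E(U1 | Z, X, D = 1) concrete, here is a minimal simulation sketch of a control-function regression. Everything specific in it is an assumption for illustration and not the source's specification: there is no X (so µ1 is a constant `MU1`), the choice rule is D = 1(Z ≥ V) with V standard normal, U1 = RHO1·V + e, and the functional form of the control term is treated as known. Under these assumptions the intercept is pinned down by the parametric form.

```python
import random
from statistics import NormalDist, fmean

random.seed(2)
STD_NORMAL = NormalDist()
MU1, RHO1 = 2.0, 0.8   # hypothetical true intercept and selection loading

def control_term(z):
    """E(V | V <= z) for standard normal V, i.e. -phi(z) / Phi(z)."""
    return -STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)

# Simulate the treated subsample: D = 1(Z >= V), Y1 = MU1 + U1, U1 = RHO1*V + e.
ys, ks = [], []
for _ in range(50000):
    z = random.gauss(0.0, 1.0)
    v = random.gauss(0.0, 1.0)
    if z >= v:                      # only treated units reveal Y1
        e = random.gauss(0.0, 0.6)
        ys.append(MU1 + RHO1 * v + e)
        ks.append(control_term(z))

# OLS of Y on a constant and the control term (single-regressor formulas).
k_bar, y_bar = fmean(ks), fmean(ys)
s_ky = sum((k - k_bar) * (y - y_bar) for k, y in zip(ks, ys))
s_kk = sum((k - k_bar) ** 2 for k in ks)
slope = s_ky / s_kk
intercept = y_bar - slope * k_bar

print("naive mean of Y | D = 1:", round(y_bar, 3))
print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
```

The naive treated mean is pulled away from `MU1` by selection, while the regression that includes the control term recovers values close to `MU1` and `RHO1` in this design.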

Assume that (U0,U1,V) ⊥⊥ (X,Z) and adopt equation (3) as the treatment choice model augmented so that X and Z are determinants of treatment choice, using V as the latent variable that generates D given X, Z: D = 1(µD(Z) − V ≥ 0).

Let UD = F_{V|X}(V) and P(Z) = F_{V|X}(µD(Z)).

In this notation, the control functions are

E(U1 | Z, D = 1) = E(U1 | µD(Z) ≥ V) = E(U1 | P(Z) ≥ UD) = K1(P(Z))

and

E(U0 | Z, D = 0) = E(U0 | µD(Z) < V) = E(U0 | P(Z) < UD) = K0(P(Z)),

so the control function only depends on the propensity score P(Z).

The key assumption needed to represent the control function solely as a function of P(Z) is

(U0,U1,V) ⊥⊥ (X,Z). (CF-1)
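The claim that the control function depends on Z only through P(Z) can be illustrated by simulation. The sketch below assumes, purely for illustration, a two-component Z with index µD(Z) = z[0] + z[1], V standard normal and independent of Z, and U1 = RHO1·V + noise; the names `mu_d`, `propensity` and `simulated_k1` are hypothetical.

```python
import random
from statistics import NormalDist, fmean

random.seed(3)
STD_NORMAL = NormalDist()
RHO1 = 0.8  # hypothetical loading of U1 on V

def mu_d(z):
    """Hypothetical choice index built from two components of Z."""
    return z[0] + z[1]

def propensity(z):
    """P(Z) = F_V(mu_D(Z)) with V ~ N(0, 1)."""
    return STD_NORMAL.cdf(mu_d(z))

def simulated_k1(z, n=50000):
    """Monte Carlo estimate of E(U1 | Z = z, D = 1), with D = 1(mu_D(Z) >= V)."""
    u1_treated = []
    for _ in range(n):
        v = random.gauss(0.0, 1.0)
        if mu_d(z) >= v:
            u1_treated.append(RHO1 * v + random.gauss(0.0, 0.5))
    return fmean(u1_treated)

z_a, z_b = (1.0, -0.5), (0.0, 0.5)    # different Z, same index, same P(Z)
z_c = (1.0, 0.5)                      # higher P(Z)
for z in (z_a, z_b, z_c):
    print(z, round(propensity(z), 3), round(simulated_k1(z), 3))
```

The two Z values with the same propensity score give essentially the same control function value, while the Z value with a higher P(Z) gives a value closer to zero.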

This assumption is not strictly required, but it is traditional and useful in relating LIV to selection models and selection models to matching (this section).

Under this condition

E(Y1 | Z, X, D = 1) = µ1(X) + K1(P(Z)),

E(Y0 | Z, X, D = 0) = µ0(X) + K0(P(Z)),

with lim_{P→1} K1(P) = 0 and lim_{P→0} K0(P) = 0.

It is assumed that Z can be independently varied for all X, and the limits are obtained by changing Z while holding X fixed.

These limit results state that when the values of X, Z are such that the probability of being in a sample (D = 1 or D = 0, respectively) is 1, there is no selection bias and one can separate out µ1(X) from K1(P(Z)) and µ0(X) from K0(P(Z)).
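As a worked special case, the limits can be verified in closed form under a normality assumption. This parametrization is an assumption for illustration, not one the source imposes: V ~ N(0,1) and U_j = ρ_j V + ε_j with ε_j mean zero and independent of V, where φ and Φ denote the standard normal density and distribution function.

```latex
% Illustrative assumption: V ~ N(0,1), U_j = \rho_j V + \varepsilon_j, j = 0, 1,
% so that U_D = \Phi(V) and D = 1 exactly when U_D \le P(Z).
\begin{align*}
K_1(P) &= E(U_1 \mid U_D \le P)
        = \rho_1\, E\bigl(V \mid V \le \Phi^{-1}(P)\bigr)
        = -\rho_1\, \frac{\phi\bigl(\Phi^{-1}(P)\bigr)}{P}, \\
K_0(P) &= E(U_0 \mid U_D > P)
        = \rho_0\, E\bigl(V \mid V > \Phi^{-1}(P)\bigr)
        = \rho_0\, \frac{\phi\bigl(\Phi^{-1}(P)\bigr)}{1 - P}, \\
\lim_{P \to 1} K_1(P) &= 0
  \quad\text{since } \phi\bigl(\Phi^{-1}(P)\bigr) \to 0 \text{ and } P \to 1, \\
\lim_{P \to 0} K_0(P) &= 0
  \quad\text{since } \phi\bigl(\Phi^{-1}(P)\bigr) \to 0 \text{ and } 1 - P \to 1 .
\end{align*}
```

In this special case the selection terms vanish exactly where everyone (or no one) is treated, so the conditional means reduce to µ1(X) and µ0(X) in those limit sets.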

This is the same identification-at-infinity condition that is required to identify ATE and TT in IV for models with heterogeneous responses.

Unlike the method of matching based on (M-1), the method of control functions allows the marginal treatment effect to be different from the average treatment effect and from the conditional effect of treatment on the treated.
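A small numerical sketch can show how the three parameters separate under a control-function model. The normal specification below is an assumption made only for illustration (no X, V ~ N(0,1), U1 = RHO1·V + e1, U0 = RHO0·V + e0, D = 1 when UD = Φ(V) ≤ P); the constants and the function names `mte` and `tt` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
DELTA_MU = 1.0          # hypothetical mu1 - mu0, which equals ATE here
RHO1, RHO0 = 0.2, 0.7   # hypothetical loadings of U1 and U0 on V

def mte(u_d):
    """E(Y1 - Y0 | UD = u_d) under the assumed normal model, 0 < u_d < 1."""
    return DELTA_MU + (RHO1 - RHO0) * STD_NORMAL.inv_cdf(u_d)

def tt(p):
    """E(Y1 - Y0 | UD <= p): treatment on the treated at propensity p, 0 < p < 1."""
    c = STD_NORMAL.inv_cdf(p)
    return DELTA_MU - (RHO1 - RHO0) * STD_NORMAL.pdf(c) / p

ATE = DELTA_MU
for p in (0.1, 0.5, 0.9):
    print(p, round(mte(p), 3), round(tt(p), 3), ATE)
```

With RHO1 different from RHO0 there is selection on gains and the three quantities differ; setting RHO1 equal to RHO0 in this sketch makes them coincide.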

Although conventional practice has been to derive the functional forms of K0(P) and K1(P) by making distributional assumptions about (U0,U1,V) such as normality or other conventional distributional assumptions, this is not an intrinsic feature of the method and there are many nonnormal and semiparametric versions of this method.

See Powell (1994) for a survey.

In its semiparametric implementation, the method of control functions requires an exclusion restriction (a variable in Z not in X) to achieve nonparametric identification.

Without any functional-form assumptions one cannot rule out a worst case analysis where, for example, if X = Z, then K1(P(X)) = τµ(X) where τ is a scalar.

In this situation, there is perfect collinearity between the control function and the conditional mean of the outcome equation, and it is impossible to separately identify either.
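The worst case can be made concrete with a toy calculation. In the sketch below, the linear `mu`, the grid `xs`, and the two parameter configurations are hypothetical; it shows two different pairs (µ1, K1) that generate identical conditional means of the observed outcome, and a Gram determinant of zero for the two regressors.

```python
def mu(x):
    """Hypothetical outcome mean function."""
    return 1.0 + 0.5 * x

def observed_mean(x, scale, tau):
    """E(Y1 | X = x, D = 1) when mu1 = scale * mu and K1(P(x)) = tau * mu1(x)."""
    mu1 = scale * mu(x)
    return mu1 + tau * mu1

xs = [0.0, 1.0, 2.0, 3.0]
model_a = [observed_mean(x, scale=1.0, tau=1.0) for x in xs]   # mu1 = mu, K1 = mu
model_b = [observed_mean(x, scale=2.0, tau=0.0) for x in xs]   # mu1 = 2*mu, K1 = 0

# Gram determinant of the regressors (mu1(x), K1(P(x))) under model A:
# a zero determinant means the two columns are perfectly collinear.
m = [mu(x) for x in xs]
k = [1.0 * v for v in m]
gram_det = (sum(a * a for a in m) * sum(b * b for b in k)
            - sum(a * b for a, b in zip(m, k)) ** 2)
print(model_a == model_b, gram_det)
```

Since both configurations imply the same observed conditional means, no amount of data on (Y, X, D = 1) alone distinguishes them, which is why an excluded variable in Z is needed.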

Even though this case is not generic, it is possible.

The method of matching does not require an exclusion restriction, but at the cost of ruling out essential heterogeneity.

In the general case, the method of control functions requires that in certain limit sets of Z, P(Z) = 1 and P(Z) = 0 in order to achieve full nonparametric identification.

The conventional method of matching does not invoke such limit set arguments.

All methods of evaluation, including matching and control functions, require that treatment parameters be defined on a common support that is the intersection of the supports of X given D = 1 and X given D = 0: Supp(X | D = 1) ∩ Supp(X | D = 0).

This is the requirement for any estimator that seeks to identify treatment effects by comparing samples of treated persons with samples of untreated persons.
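For a discrete X, the common support restriction amounts to a set intersection. The following minimal sketch uses hypothetical helper names (`common_support`, `restrict_to_support`) and made-up rows; note that sample supports only estimate the population supports that the condition refers to.

```python
def common_support(x_treated, x_untreated):
    """Values of a discrete X observed in both the D = 1 and D = 0 samples."""
    return sorted(set(x_treated) & set(x_untreated))

def restrict_to_support(rows, support):
    """Keep (x, d, y) rows whose x lies in the common support."""
    allowed = set(support)
    return [row for row in rows if row[0] in allowed]

# Hypothetical sample of (x, d, y) triples.
rows = [(1, 1, 5.0), (2, 1, 6.0), (2, 0, 4.0), (3, 0, 3.5), (3, 1, 6.5), (4, 0, 3.0)]
support = common_support(
    [x for (x, d, _) in rows if d == 1],
    [x for (x, d, _) in rows if d == 0],
)
trimmed = restrict_to_support(rows, support)
print(support, len(trimmed))
```

Treatment parameters computed on `trimmed` are defined only for the X values in `support`, not for the full population.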

In this version of the method of control functions, P(Z) is a conditioning variable used to predict U1 conditional on D and U0 conditional on D.

In the method of matching, it is used as a conditioning variable to eliminate the stochastic dependence between (U0,U1) and D.

In the method of LATE or LIV, P(Z) is used as an instrument.

In the method of control functions, as conventionally applied, (U0,U1) ⊥⊥ (X,Z), but this assumption is not intrinsic to the method.

This assumption plays no role in matching if the correct conditioning set is known.

However, as noted below, exogeneity plays a key role in devising algorithms to select the conditioning variables.

In addition, exogeneity is helpful in making out-of-sample forecasts.

The method of control functions does not require that (U0, U1) ⊥⊥ D | (X, Z), which is a central requirement of matching.

Equivalently, the method of control functions does not require

(U0, U1) ⊥⊥ V | (X, Z), or that (U0, U1) ⊥⊥ V | X,

whereas matching does and typically equates X and Z.

Thus matching assumes access to a richer set of conditioning variables than is assumed in the method of control functions.

The method of control functions allows the outcome unobservables to depend on D even after conditioning on (X, Z), and it models this dependence.

The method of matching assumes no such dependence on D.

Thus in this regard, and maintaining all of the assumptions invoked for control functions in this section, matching is a special case of the method of control functions in which, under assumptions (M-1) and (M-2),

E(U1 | X, D = 1) = E(U1 | X)

E(U0 | X, D = 0) = E(U0 | X).

In the method of control functions, in the case where (X, Z) ⊥⊥ (U0, U1, V), where Z can include some or all of the elements of X, the conditional expectation of Y given X, Z, D is

$$
\begin{aligned}
E(Y \mid X, Z, D) &= E(Y_1 \mid X, Z, D = 1)\,D + E(Y_0 \mid X, Z, D = 0)\,(1 - D) \\
&= \mu_0(X) + [\mu_1(X) - \mu_0(X)]\,D \\
&\quad + E(U_1 \mid P(Z), D = 1)\,D + E(U_0 \mid P(Z), D = 0)\,(1 - D) \\
&= \mu_0(X) + K_0(P(Z)) + [\mu_1(X) - \mu_0(X) + K_1(P(Z)) - K_0(P(Z))]\,D.
\end{aligned}
\tag{8}
$$

Here K1(P(Z)) = E(U1 | P(Z), D = 1) and K0(P(Z)) = E(U0 | P(Z), D = 0) are the control functions.

The coefficient on D in the final equation combines µ1(X) − µ0(X) with K1(P(Z)) − K0(P(Z)).

It does not correspond to any treatment effect.

To identify µ1(X) − µ0(X), one must isolate it from K1(P(Z)) − K0(P(Z)).
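
To make the control functions in (8) concrete, the following worked case adds two assumptions that this slide does not make: (U0, U1, V) are jointly normal with Var(V) = 1, and the choice rule is D = 1[µD(Z) + V ≥ 0], the sign convention adopted in the simulation section of these slides. Writing σj for the standard deviation of Uj and ρjV for its correlation with V, a sketch of the implied control functions is:

$$
\begin{aligned}
P(Z) &= \Pr(V \ge -\mu_D(Z)) = \Phi(\mu_D(Z)), \\
K_1(p) &= E(U_1 \mid P(Z) = p, D = 1) = \sigma_1 \rho_{1V}\, \frac{\phi(\Phi^{-1}(p))}{p}, \\
K_0(p) &= E(U_0 \mid P(Z) = p, D = 0) = -\sigma_0 \rho_{0V}\, \frac{\phi(\Phi^{-1}(p))}{1 - p}.
\end{aligned}
$$

Under these added assumptions, the coefficient on D in (8) exceeds µ1(X) − µ0(X) by K1(p) − K0(p) = φ(Φ⁻¹(p)) [σ1ρ1V / p + σ0ρ0V / (1 − p)], which is zero at every p only when ρ1V = ρ0V = 0.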

Under assumptions (M-1) and (M-2) of the method of matching, the conditional expectation of Y given P(X) and D is

$$
\begin{aligned}
E(Y \mid P(X), D) &= \mu_0(P(X)) + E(U_0 \mid P(X)) \\
&\quad + [\mu_1(P(X)) - \mu_0(P(X)) + E(U_1 \mid P(X)) - E(U_0 \mid P(X))]\,D.
\end{aligned}
\tag{9}
$$

The coefficient on D in this expression is now interpretable and is the average treatment effect (conditional on P(X)).

If we assume that (U0, U1) ⊥⊥ X, which is not strictly required, we reach a more familiar representation

$$
E(Y \mid P(X), D) = \mu_0(P(X)) + [\mu_1(P(X)) - \mu_0(P(X))]\,D, \tag{10}
$$

since E(U1 | P(X)) = E(U0 | P(X)) = 0.

A parallel derivation can be made conditioning on X instead of P(X).
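
As a rough illustration of (10), the sketch below simulates data in which D depends on X alone, so that the matching assumptions hold by construction, and then takes the treated-minus-untreated mean of Y at each value of P(X). The three covariate cells, their propensity scores, and the functions µ0(X) = 1 + X and µ1(X) = 2 + 1.5X are hypothetical choices made for this example, not part of the original slides.

```python
import random
from collections import defaultdict

# Hypothetical design: three covariate cells with known propensity scores P(X).
P_OF_X = {0: 0.3, 1: 0.5, 2: 0.7}

def simulate(n, seed=0):
    """Draw (P(X), D, Y) with D depending on X only, so (U0, U1) is independent of D given X."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.choice([0, 1, 2])
        d = 1 if rng.random() < P_OF_X[x] else 0
        y0 = 1.0 + x + rng.gauss(0, 1)          # mu0(X) = 1 + X (assumed)
        y1 = 2.0 + 1.5 * x + rng.gauss(0, 1)    # mu1(X) = 2 + 1.5 X (assumed)
        rows.append((P_OF_X[x], d, y1 if d else y0))
    return rows

def coefficient_on_d(rows):
    """Treated-minus-untreated mean of Y at each value of P(X)."""
    totals = defaultdict(lambda: {0: [0.0, 0], 1: [0.0, 0]})  # p -> d -> [sum of y, count]
    for p, d, y in rows:
        totals[p][d][0] += y
        totals[p][d][1] += 1
    return {
        p: t[1][0] / t[1][1] - t[0][0] / t[0][1]
        for p, t in sorted(totals.items())
    }

estimates = coefficient_on_d(simulate(60_000))
for p, est in estimates.items():
    print(f"P(X) = {p}: estimated effect {est:.2f}")
```

In this design the true conditional effects are µ1(X) − µ0(X) = 1 + 0.5X, that is 1.0, 1.5 and 2.0 in the three cells, and the within-cell contrasts should land close to those values up to sampling noise.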

Under the assumptions that justify matching, the treatment effects ATE or TT (conditional on P(X)) are identified from the coefficient on D in either (9) or (10).

Condition (M-2) guarantees that D is not perfectly predictable by X (or P(X)), so the variation in D identifies the treatment parameter.
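
A sample analogue of this requirement can be checked cell by cell. The sketch below reads (M-2) as saying that the share of treated units lies strictly between 0 and 1 at each value of the conditioning variable; the function name and the toy data are hypothetical, and passing the check in a sample does not by itself establish the population condition.

```python
from collections import defaultdict

def overlap_report(rows):
    """rows: iterable of (cell, d) pairs with d in {0, 1}.
    Returns {cell: (share_treated, has_overlap)}."""
    counts = defaultdict(lambda: [0, 0])   # cell -> [n_untreated, n_treated]
    for cell, d in rows:
        counts[cell][d] += 1
    report = {}
    for cell, (n0, n1) in counts.items():
        share = n1 / (n0 + n1)
        # D is perfectly predicted in a cell when the treated share is 0 or 1.
        report[cell] = (share, 0.0 < share < 1.0)
    return report

toy = [("a", 1), ("a", 0), ("a", 1), ("b", 1), ("b", 1), ("c", 0)]
report = overlap_report(toy)
for cell, (share, ok) in report.items():
    print(cell, round(share, 2), "overlap" if ok else "no variation in D")
```

In the toy data only cell "a" contains both treated and untreated units, so it is the only cell in which a treated-minus-untreated contrast can be formed.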

The coefficient on D in equation (8) for the more general control function model does not correspond to any treatment parameter, whereas the coefficients on D in equations (9) and (10) correspond to treatment parameters under the assumptions of the matching model.

Under assumption (CF-1), µ1(P(X)) − µ0(P(X)) = ATE and ATE = TT = MTE, so the method of matching identifies all of the (conditional on P(X)) mean treatment parameters.

Under the assumptions justifying matching, when means of Y1 and Y0 are the parameters of interest, and X satisfies (M-1) and (M-2), the bias terms vanish.

They do not vanish in the more general case considered by the method of control functions.

This is the mathematical counterpart of the randomization implicit in matching: conditional on X or P(X), (U0, U1) are random with respect to D.

The method of control functions allows these error terms to be nonrandom with respect to D and models the dependence.

In the absence of functional form assumptions, it requires an exclusion restriction (a variable in Z not in X) to separate out K0(P(Z)) from the coefficient on D.
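
To illustrate how modeling the dependence removes the bias, here is a minimal sketch of a classical control function estimator. It leans on assumptions that go beyond this slide: joint normality of (U0, U1, V) with Var(V) = 1, the choice rule D = 1[Z + V ≥ 0], constant µ1 and µ0, a scalar Z excluded from the outcome equations, and a known propensity score P(Z) = Φ(Z). All parameter values and function names are hypothetical. Within each D group, Y is regressed on the normal selection term, and the intercepts estimate µ1 and µ0.

```python
import random
from statistics import NormalDist

N01 = NormalDist()

def simulate(n, mu1=1.0, mu0=0.0, s1=1.0, s0=1.0, r1=0.7, r0=0.4, seed=0):
    """Normal selection model with D = 1[Z + V >= 0]; Z is excluded from the outcomes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z, v = rng.gauss(0, 1), rng.gauss(0, 1)
        u1 = s1 * (r1 * v + (1 - r1 ** 2) ** 0.5 * rng.gauss(0, 1))
        u0 = s0 * (r0 * v + (1 - r0 ** 2) ** 0.5 * rng.gauss(0, 1))
        d = 1 if z + v >= 0 else 0
        p = N01.cdf(z)                       # propensity score P(Z), treated as known
        rows.append((p, d, mu1 + u1 if d else mu0 + u0))
    return rows

def ols_intercept(xs, ys):
    """Intercept from a simple regression of y on a constant and x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return my - (sxy / sxx) * mx

def control_function_ate(rows):
    """Estimate mu1 - mu0 from the intercepts of Y on the selection term, by D group."""
    treated = [(N01.pdf(N01.inv_cdf(p)) / p, y) for p, d, y in rows if d == 1]
    untreated = [(-N01.pdf(N01.inv_cdf(p)) / (1 - p), y) for p, d, y in rows if d == 0]
    return ols_intercept(*zip(*treated)) - ols_intercept(*zip(*untreated))

def naive_contrast(rows):
    """Treated-minus-untreated mean of Y, ignoring selection on unobservables."""
    y1 = [y for _, d, y in rows if d == 1]
    y0 = [y for _, d, y in rows if d == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

rows = simulate(50_000)
print(f"naive contrast:   {naive_contrast(rows):.2f}")
print(f"control function: {control_function_ate(rows):.2f}")
```

With µ1 − µ0 = 1 in this design, the naive contrast should sit well above 1 because both selection terms push it upward, while the control function estimate should be close to 1. Because the sketch imposes normality, it combines a functional form assumption with the excluded Z; it is not an example of identification from the exclusion restriction alone.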

Matching produces identification without exclusion restrictions, whereas identification with exclusion restrictions is a central feature of the control function method in the absence of functional form assumptions.

The fact that the control function approach allows for more general dependencies among the unobservables and the conditioning variables than the matching approach allows is implicitly recognized in the work of Rosenbaum (1995) and Robins (1997).

Their “sensitivity analyses” for matching when there are unobserved conditioning variables are, in their essence, sensitivity analyses using control functions.

Aakvik et al. (2005), Carneiro et al. (2003) and Cunha et al. (2005) explicitly model the relationship between matching and selection models using factor structure models, treating the omitted conditioning variables as unobserved factors and estimating their distribution.

Abbring and Heckman discuss this work in Part III.

Comparing Matching and Classical Control Function Methods for a Generalized Roy Model

Figure 5, developed in connection with our discussion of instrumental variables, shows the contrast between the shape of the MTE and the OLS matching estimand as a function of p for the extended Roy model.

The MTE(p) shows its typical declining shape associated with diminishing returns, and the assumptions justifying matching are violated.

Matching attempts to impose a flat MTE(p) and therefore flattens the estimated MTE(p) compared to its true value.

Figure 5: Treatment Parameters and OLS/Matching as a function of P(Z) = p

[Figure: curves TT(p), MTE(p), ATE(p), TUT(p) and Matching(p) plotted against p on the horizontal axis (0 to 1); the vertical axis runs from -5 to 5.]

Matching understates marginal returns at low levels of p (associated with unobservables that make it likely to participate in treatment) and overstates marginal returns at high levels of p.

To further illustrate the bias in matching and how the control function eliminates it, we perform sensitivity analyses under different assumptions about the parameters of the underlying selection model.

In particular, we assume that the data are generated by the model of equations (11) and (12), where µD(Z) = Zγ, µ0(X) = µ0, µ1(X) = µ1, and

$$
\begin{aligned}
(U_0, U_1, V)' &\sim N(0, \Sigma), \\
\operatorname{corr}(U_j, V) &= \rho_{jV}, \\
\operatorname{Var}(U_j) &= \sigma_j^2, \quad j = 0, 1.
\end{aligned}
$$

We assume in this section that D = 1[µD(Z) + V ≥ 0], in conformity with the examples presented in Heckman and Navarro (2004), from which we draw.

This reformulation of choice model (3) simply entails a change in the sign of V.

We assume that Z ⊥⊥ (U0, U1, V).

$$
Y_1 = \mu_1(X, U_1) \tag{11}
$$

$$
Y_0 = \mu_0(X, U_0). \tag{12}
$$

Using the selection formulae, we can write the biases conditional on $P(Z) = p$ using propensity score matching in a generalized Roy model as

$$\text{Bias TT}(Z = z) = \text{Bias TT}(P(Z) = p) = \sigma_0 \rho_{0V} M(p)$$

$$\text{Bias ATE}(Z = z) = \text{Bias ATE}(P(Z) = p) = M(p)\left[\sigma_1 \rho_{1V} (1 - p) + \sigma_0 \rho_{0V}\, p\right],$$

where $M(p) = \dfrac{\phi(\Phi^{-1}(1-p))}{p(1-p)}$, $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of a standard normal random variable, and the propensity score $P(z)$ is evaluated at $P(z) = p$.

We assume that $\mu_1 = \mu_0$ so that the true average treatment effect is zero.
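The two bias expressions follow from the means of a truncated standard normal. The following is a sketch of that step, not a derivation taken from the slides. It assumes the normalization $\operatorname{Var}(V) = 1$, which the use of $\Phi$ in $M(p)$ suggests but which is not stated here, and it writes $c = z\gamma$ for the index evaluated at $Z = z$.

```latex
\begin{align*}
p &= \Pr(V \ge -c) = \Phi(c), \qquad c = \Phi^{-1}(p) = -\Phi^{-1}(1-p), \\
E(U_j \mid D = 1, P(Z) = p) &= \sigma_j \rho_{jV}\, E(V \mid V \ge -c)
  = \sigma_j \rho_{jV}\, \frac{\phi(\Phi^{-1}(1-p))}{p}, \\
E(U_j \mid D = 0, P(Z) = p) &= \sigma_j \rho_{jV}\, E(V \mid V < -c)
  = -\sigma_j \rho_{jV}\, \frac{\phi(\Phi^{-1}(1-p))}{1-p}, \\
\text{Bias TT} &= E(U_0 \mid D = 1, p) - E(U_0 \mid D = 0, p)
  = \sigma_0 \rho_{0V}\, \phi(\Phi^{-1}(1-p)) \left[\frac{1}{p} + \frac{1}{1-p}\right]
  = \sigma_0 \rho_{0V}\, M(p), \\
\text{Bias ATE} &= E(U_1 \mid D = 1, p) - E(U_0 \mid D = 0, p)
  = M(p)\left[\sigma_1 \rho_{1V} (1-p) + \sigma_0 \rho_{0V}\, p\right].
\end{align*}
```

The second and third lines use $E(U_j \mid V) = \sigma_j \rho_{jV} V$ under joint normality with unit variance for $V$, together with the symmetry $\phi(\Phi^{-1}(p)) = \phi(\Phi^{-1}(1-p))$.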


We simulate the mean bias for TT (Table 3) and ATE (Table 4) for different values of the $\rho_{jV}$ and $\sigma_j$.

The results in the tables show that, as we let the variances of the outcome equations grow, the mean bias can become substantial.

Larger correlations between the outcomes and the unobservables generating choices produce larger biases.

These tables demonstrate the greater generality of the control function approach, which models the bias rather than assuming it away by conditioning.


Table 3: Mean Bias for Treatment on the Treated

  ρ0V    Average Bias (σ0 = 1)    Average Bias (σ0 = 2)
-1.00          -1.7920                  -3.5839
-0.75          -1.3440                  -2.6879
-0.50          -0.8960                  -1.7920
-0.25          -0.4480                  -0.8960
 0.00           0.0000                   0.0000
 0.25           0.4480                   0.8960
 0.50           0.8960                   1.7920
 0.75           1.3440                   2.6879
 1.00           1.7920                   3.5839

Bias TT = $\rho_{0V}\, \sigma_0\, M(p)$, where $M(p) = \dfrac{\phi(\Phi^{-1}(1-p))}{p(1-p)}$.

Source: Heckman and Navarro (2004)
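Because the TT bias is linear in $\rho_{0V}\sigma_0$, every entry of Table 3 is a multiple of the mean of $M(p)$. The sketch below is illustrative only: the function names are hypothetical, `TABLE3_MEAN_M` is simply the table entry for $\rho_{0V} = 1$, $\sigma_0 = 1$ read as that mean, and the distribution of $p$ used in the original simulation is not reproduced.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def m_factor(p):
    """M(p) = phi(Phi^{-1}(1 - p)) / (p * (1 - p)) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    cutoff = _STD_NORMAL.inv_cdf(1.0 - p)
    return _STD_NORMAL.pdf(cutoff) / (p * (1.0 - p))


def bias_tt(p, rho_0v, sigma_0):
    """Bias of matching for TT at propensity score p."""
    return rho_0v * sigma_0 * m_factor(p)


# Assumption: the Table 3 entry for rho_0V = 1, sigma_0 = 1 is the mean of M(p).
TABLE3_MEAN_M = 1.7920


def table3_mean_bias(rho_0v, sigma_0):
    """Mean TT bias implied by linearity in rho_0V * sigma_0."""
    return rho_0v * sigma_0 * TABLE3_MEAN_M
```

Under these assumptions, `table3_mean_bias(-0.75, 2.0)` agrees with the tabulated value up to rounding in the fourth decimal place.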


Table 4: Mean Bias for Average Treatment Effect (σ0 = 1)

Columns: ρ0V. Rows: ρ1V.

ρ1V (σ1 = 1)
  ρ0V    -1.00    -0.75    -0.50    -0.25        0     0.25     0.50     0.75     1.00
-1.00  -1.7920  -1.5680  -1.3440  -1.1200  -0.8960  -0.6720  -0.4480  -0.2240        0
-0.75  -1.5680  -1.3440  -1.1200  -0.8960  -0.6720  -0.4480  -0.2240        0   0.2240
-0.50  -1.3440  -1.1200  -0.8960  -0.6720  -0.4480  -0.2240        0   0.2240   0.4480
-0.25  -1.1200  -0.8960  -0.6720  -0.4480  -0.2240        0   0.2240   0.4480   0.6720
    0  -0.8960  -0.6720  -0.4480  -0.2240        0   0.2240   0.4480   0.6720   0.8960
 0.25  -0.6720  -0.4480  -0.2240        0   0.2240   0.4480   0.6720   0.8960   1.1200
 0.50  -0.4480  -0.2240        0   0.2240   0.4480   0.6720   0.8960   1.1200   1.3440
 0.75  -0.2240        0   0.2240   0.4480   0.6720   0.8960   1.1200   1.3440   1.5680
 1.00        0   0.2240   0.4480   0.6720   0.8960   1.1200   1.3440   1.5680   1.7920

ρ1V (σ1 = 2)
  ρ0V    -1.00    -0.75    -0.50    -0.25        0     0.25     0.50     0.75     1.00
-1.00  -2.6879  -2.2399  -1.7920  -1.3440  -0.8960  -0.4480        0   0.4480   0.8960
-0.75  -2.4639  -2.0159  -1.5680  -1.1200  -0.6720  -0.2240   0.2240   0.6720   1.1200
-0.50  -2.2399  -1.7920  -1.3440  -0.8960  -0.4480        0   0.4480   0.8960   1.3440
-0.25  -2.0159  -1.5680  -1.1200  -0.6720  -0.2240   0.2240   0.6720   1.1200   1.5680
    0  -1.7920  -1.3440  -0.8960  -0.4480        0   0.4480   0.8960   1.3440   1.7920
 0.25  -1.5680  -1.1200  -0.6720  -0.2240   0.2240   0.6720   1.1200   1.5680   2.0159
 0.50  -1.3440  -0.8960  -0.4480        0   0.4480   0.8960   1.3440   1.7920   2.2399
 0.75  -1.1200  -0.6720  -0.2240   0.2240   0.6720   1.1200   1.5680   2.0159   2.4639
 1.00  -0.8960  -0.4480        0   0.4480   0.8960   1.3440   1.7920   2.2399   2.6879

Bias ATE = $\rho_{1V}\, \sigma_1\, M_1(p) - \rho_{0V}\, \sigma_0\, M_0(p)$, where

$M_1(p) = \dfrac{\phi(\Phi^{-1}(1-p))}{p}$ and $M_0(p) = -\dfrac{\phi(\Phi^{-1}(1-p))}{1-p}$.

Source: Heckman and Navarro (2004)
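The decomposition in the table note can be checked numerically at any single value of $p$. The sketch below is illustrative, with hypothetical function names. It assumes $\operatorname{Var}(V) = 1$ and takes $M_1(p) = \phi(\Phi^{-1}(1-p))/p$ and $M_0(p) = -\phi(\Phi^{-1}(1-p))/(1-p)$, the truncated-normal forms under which the decomposition coincides with $M(p)[\sigma_1\rho_{1V}(1-p) + \sigma_0\rho_{0V}\,p]$.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def _phi_at_cutoff(p):
    """phi(Phi^{-1}(1 - p)): the normal density at the selection cutoff."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(1.0 - p))


def m1(p):
    """Assumed form of M1(p): mean of V among the treated (Var(V) = 1)."""
    return _phi_at_cutoff(p) / p


def m0(p):
    """Assumed form of M0(p): mean of V among the untreated (Var(V) = 1)."""
    return -_phi_at_cutoff(p) / (1.0 - p)


def bias_ate_decomposed(p, rho_1v, sigma_1, rho_0v, sigma_0):
    """ATE bias written as rho_1V*sigma_1*M1(p) - rho_0V*sigma_0*M0(p)."""
    return rho_1v * sigma_1 * m1(p) - rho_0v * sigma_0 * m0(p)


def bias_ate_compact(p, rho_1v, sigma_1, rho_0v, sigma_0):
    """ATE bias written as M(p)*[sigma_1*rho_1V*(1-p) + sigma_0*rho_0V*p]."""
    m = _phi_at_cutoff(p) / (p * (1.0 - p))
    return m * (sigma_1 * rho_1v * (1.0 - p) + sigma_0 * rho_0v * p)
```

The two functions return the same value for any admissible inputs. At $p = 0.5$ with $\sigma_1 = \sigma_0$ and $\rho_{1V} = -\rho_{0V}$ the two selection terms cancel, which is the pattern of zeros in the first panel of Table 4.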


Even if the correlation between the observables and the unobservables ($\rho_{jV}$) is small, so that one might think that selection on unobservables is relatively unimportant, we still obtain substantial biases if we do not control for relevant omitted conditioning variables.

Only for special values of the parameters do we avoid bias by matching.

These examples also demonstrate that sensitivity analyses can be conducted for analysis based on control function methods even when they are not fully identified.

Vijverberg (1993) provides an example.


The Informational Requirements of Matching and the Bias When They Are Not Satisfied

In this section, we present some examples of when matching “works” and when it breaks down.

This section is based on Heckman and Navarro (2004).

In particular, we show how matching on some but not all of the relevant information can make the bias from matching worse for standard treatment parameters.

These examples also introduce factor models that play a key role in the analysis of Abbring and Heckman in Part III.


The method of matching assumes that the econometrician has access to and uses all of the relevant information in the precise sense defined there.

That means that the $X$ that guarantees conditional independence (M-1) is available and is used.

The concept of relevant information is a delicate one, and it is difficult to find the true conditioning set.


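Assume that the economic model generating the data is a generalized Roy model of the form

$$D^* = Z\gamma + V, \quad \text{where } Z \perp\!\!\!\perp V \text{ and } V = \alpha_{V1} f_1 + \alpha_{V2} f_2 + \varepsilon_V,$$

$$D = \begin{cases} 1 & \text{if } D^* \ge 0 \\ 0 & \text{otherwise,} \end{cases}$$

and

$$Y_1 = \mu_1 + U_1, \quad \text{where } U_1 = \alpha_{11} f_1 + \alpha_{12} f_2 + \varepsilon_1,$$

$$Y_0 = \mu_0 + U_0, \quad \text{where } U_0 = \alpha_{01} f_1 + \alpha_{02} f_2 + \varepsilon_0.$$

We remind the reader that, contrary to the analysis throughout the rest of this chapter, we add $V$ and do not subtract it in the decision equation.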


This is the familiar representation.

By a change in sign in $V$, we can go back and forth between the specification used in this section and the specification used in other sections of the chapter.

In this specification, $(f_1, f_2, \varepsilon_V, \varepsilon_1, \varepsilon_0)$ are assumed to be mean-zero random variables that are mutually independent of each other and of $Z$, so that all the correlation among the elements of $(U_0, U_1, V)$ is captured by $f = (f_1, f_2)$.

Models that take this form are known as factor models and have been applied in the context of selection models by Aakvik et al. (2005), Carneiro et al. (2001, 2003), and Hansen et al. (2004), among others.

We keep implicit any dependence on $X$, which may be general.
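Since the $\varepsilon$ terms are mutually independent, the covariances among $(U_0, U_1, V)$ come only from the shared factors. For example, $\operatorname{Cov}(U_1, V) = \alpha_{11}\alpha_{V1}\sigma^2_{f_1} + \alpha_{12}\alpha_{V2}\sigma^2_{f_2}$. The sketch below computes the implied $\sigma_j$ and $\rho_{jV}$ from the loadings. The function name, argument layout, and returned keys are hypothetical and are not part of the source.

```python
from math import sqrt


def implied_moments(alpha_0, alpha_1, alpha_v, var_f,
                    var_eps_0, var_eps_1, var_eps_v):
    """Second moments of (U0, U1, V) implied by the two-factor model.

    alpha_j holds the loadings of equation j on (f1, f2);
    var_f holds (Var(f1), Var(f2)).
    """
    def factor_cov(a, b):
        # Only the common factors contribute: the eps terms are independent.
        return sum(x * y * v for x, y, v in zip(a, b, var_f))

    var_u0 = factor_cov(alpha_0, alpha_0) + var_eps_0
    var_u1 = factor_cov(alpha_1, alpha_1) + var_eps_1
    var_v = factor_cov(alpha_v, alpha_v) + var_eps_v
    return {
        "sigma_0": sqrt(var_u0),
        "sigma_1": sqrt(var_u1),
        "sigma_v": sqrt(var_v),
        "cov_u0_u1": factor_cov(alpha_0, alpha_1),
        "rho_0v": factor_cov(alpha_0, alpha_v) / sqrt(var_u0 * var_v),
        "rho_1v": factor_cov(alpha_1, alpha_v) / sqrt(var_u1 * var_v),
    }
```

With all loadings and variances equal to one, each of $U_0$, $U_1$, $V$ has variance 3 and each pairwise covariance is 2, so $\rho_{0V} = \rho_{1V} = 2/3$. Setting both loadings in `alpha_v` to zero removes the selection on unobservables entirely.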


For general values of the factor loadings, with none equal to zero ($\alpha_{ij} \neq 0$), the minimal relevant information for this model is

$$I_R = \{f_1, f_2\}.$$

Recall that we assume independence between $Z$ and all error terms.

If the econometrician has access to $I_R$ and uses it, (M-1) is satisfied conditional on $I_R$.

Note that $I_R$ plays the role of $\theta$ in (U-1).

In the case where the economist knows $I_R$, the economist’s information set $\sigma(I_E)$ contains the relevant information ($\sigma(I_E) \supseteq \sigma(I_R)$).


The agent’s information set may include different variables.

If we assume that $\varepsilon_0, \varepsilon_1$ are shocks to outcomes not known to the agent at the time treatment decisions are made, but the agent knows all other aspects of the model, the agent’s information is

$$I_A = \{f_1, f_2, Z, \varepsilon_V\}.$$

Under perfect certainty, the agent’s information set includes $\varepsilon_1$ and $\varepsilon_0$:

$$I_A = \{f_1, f_2, Z, \varepsilon_V, \varepsilon_1, \varepsilon_0\}.$$

In either case, not all of the information available to the agent is required to satisfy conditional independence (M-1).

All three information sets guarantee conditional independence, but only the first, $I_R$, is minimal relevant.


The observing economist may know some variables not in $I_A$, $I_{R^*}$, or $I_R$ but may not know all of the variables in $I_R$.

In the following subsections, we study what happens when the matching assumption that $\sigma(I_E) \supseteq \sigma(I_R)$ does not hold.

That is, we analyze what happens to the bias from matching as the amount of information used by the econometrician is changed.

In order to get closed-form expressions for the biases of the treatment parameters, we make the additional assumption that

$$(f_1, f_2, \varepsilon_V, \varepsilon_1, \varepsilon_0) \sim N(0, \Sigma),$$

where $\Sigma$ is a matrix with $(\sigma^2_{f_1}, \sigma^2_{f_2}, \sigma^2_{\varepsilon_V}, \sigma^2_{\varepsilon_1}, \sigma^2_{\varepsilon_0})$ on the diagonal and zero in all the off-diagonal elements.


This assumption links matching models to conventional normal selection models of the sort developed in Part I.

However, the examples based on this specification illustrate more general principles.

We now analyze various commonly encountered cases.


The Economist Uses the Minimal Relevant Information: $\sigma(I_R) \subseteq \sigma(I_E)$

We begin by analyzing the case in which the information used by the economist is $I_E = \{Z, f_1, f_2\}$, so that the econometrician has access to a relevant information set that is larger than the minimal relevant information set.

In this case, it is straightforward to show that matching identifies all of the mean treatment parameters with no bias.

The matching estimator has population mean

$$E(Y_1 \mid D = 1, I_E) - E(Y_0 \mid D = 0, I_E) = \mu_1 - \mu_0 + (\alpha_{11} - \alpha_{01}) f_1 + (\alpha_{12} - \alpha_{02}) f_2,$$

and all of the mean treatment parameters collapse to this same expression since, conditional on knowing $f_1$ and $f_2$, there is no selection because $(\varepsilon_0, \varepsilon_1) \perp\!\!\!\perp V$.
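A small Monte Carlo check of this population mean is sketched below. It is illustrative only. All parameter values and function names are hypothetical, the $\varepsilon$ shocks are drawn as independent standard normals, and the conditioning set is held fixed at one point $(z, f_1, f_2)$, so that only $\varepsilon_V$ drives selection.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
MU_1, MU_0 = 1.0, 0.5
ALPHA_1 = (1.0, 0.5)    # (alpha_11, alpha_12)
ALPHA_0 = (0.3, -0.2)   # (alpha_01, alpha_02)
ALPHA_V = (1.0, 1.0)    # (alpha_V1, alpha_V2)
GAMMA = 1.0


def population_contrast(f1, f2):
    """mu_1 - mu_0 + (alpha_11 - alpha_01) f1 + (alpha_12 - alpha_02) f2."""
    return (MU_1 - MU_0
            + (ALPHA_1[0] - ALPHA_0[0]) * f1
            + (ALPHA_1[1] - ALPHA_0[1]) * f2)


def matched_contrast(z, f1, f2, n=200_000, seed=0):
    """Simulated E(Y1 | D=1, I_E) - E(Y0 | D=0, I_E) at I_E = {z, f1, f2}."""
    rng = random.Random(seed)
    index = z * GAMMA + ALPHA_V[0] * f1 + ALPHA_V[1] * f2
    sum_1 = sum_0 = 0.0
    n_1 = n_0 = 0
    for _ in range(n):
        eps_v = rng.gauss(0.0, 1.0)
        eps_1 = rng.gauss(0.0, 1.0)
        eps_0 = rng.gauss(0.0, 1.0)
        if index + eps_v >= 0:
            # D = 1: the treated outcome Y1 is observed.
            sum_1 += MU_1 + ALPHA_1[0] * f1 + ALPHA_1[1] * f2 + eps_1
            n_1 += 1
        else:
            # D = 0: the untreated outcome Y0 is observed.
            sum_0 += MU_0 + ALPHA_0[0] * f1 + ALPHA_0[1] * f2 + eps_0
            n_0 += 1
    return sum_1 / n_1 - sum_0 / n_0
```

For instance, `matched_contrast(0.2, 0.5, -1.0)` should lie close to `population_contrast(0.5, -1.0)`, up to simulation noise, because $\varepsilon_1$ and $\varepsilon_0$ are independent of the only remaining source of selection, $\varepsilon_V$.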


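Recall that, for arbitrary choices of $\alpha_{11}, \alpha_{01}, \alpha_{12}, \alpha_{02}$, $I_R = \{f_1, f_2\}$ and the economist needs less information to achieve (M-1) than is contained in $I_E$.

In this case, the analysis of Rosenbaum and Rubin (1983) tells us that knowledge of $(Z, f_1, f_2)$ and knowledge of $P(Z, f_1, f_2)$ are equally useful in identifying all of the treatment parameters conditional on $P$.

If we write the propensity score as

$$P(I_E) = \Pr\left(\frac{\varepsilon_V}{\sigma_{\varepsilon_V}} > \frac{-Z\gamma - \alpha_{V1} f_1 - \alpha_{V2} f_2}{\sigma_{\varepsilon_V}}\right) = 1 - \Phi\left(\frac{-Z\gamma - \alpha_{V1} f_1 - \alpha_{V2} f_2}{\sigma_{\varepsilon_V}}\right) = p,$$

the event $\left(D^* \gtrless 0, \text{ given } f = \tilde{f} \text{ and } Z = z\right)$ can be written as

$$\frac{\varepsilon_V}{\sigma_{\varepsilon_V}} \gtrless \Phi^{-1}\left(1 - P(z, \tilde{f})\right),$$

where $\Phi$ is the cdf of a standard normal random variable and $f = (f_1, f_2)$.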
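As a small numerical illustration of the propensity score and the threshold just described, the sketch below evaluates both for one point. It assumes a scalar $Z$; the function names `propensity` and `threshold` and all parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def propensity(z, f1, f2, gamma, aV1, aV2, sigma_eV):
    """P(z, f) = 1 - Phi((-z*gamma - aV1*f1 - aV2*f2) / sigma_eV)."""
    index = z * gamma + aV1 * f1 + aV2 * f2
    return 1.0 - STD_NORMAL.cdf(-index / sigma_eV)


def threshold(p):
    """Phi^{-1}(1 - p): D = 1 exactly when eps_V / sigma_eV exceeds this value."""
    return STD_NORMAL.inv_cdf(1.0 - p)


# Hypothetical values: the index is 0.5 + 0.8 - 0.3 = 1.0 and sigma_eV = 2.0.
p = propensity(z=0.5, f1=1.0, f2=-0.5, gamma=1.0, aV1=0.8, aV2=0.6, sigma_eV=2.0)
print(p, threshold(p))
```

The threshold recovered from $p$ equals the standardized index with its sign reversed, which is why conditioning on $P$ carries the same information about selection as conditioning on the index itself.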


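We abuse notation slightly by using $z$ as the realized fixed value of $Z$ and $\tilde{f}$ as the realized value of $f$.

The population matching condition (M-1) implies that

$$\begin{aligned}
& E\left(Y_1 \mid D = 1, P(I_E) = P(z, \tilde{f})\right) - E\left(Y_0 \mid D = 0, P(I_E) = P(z, \tilde{f})\right) \\
&\quad = \mu_1 - \mu_0 + E\left(U_1 \mid D = 1, P(I_E) = P(z, \tilde{f})\right) - E\left(U_0 \mid D = 0, P(I_E) = P(z, \tilde{f})\right) \\
&\quad = \mu_1 - \mu_0 + E\left(U_1 \,\Big|\, \frac{\varepsilon_V}{\sigma_{\varepsilon_V}} > \Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right) - E\left(U_0 \,\Big|\, \frac{\varepsilon_V}{\sigma_{\varepsilon_V}} \le \Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right) \\
&\quad = \mu_1 - \mu_0.
\end{aligned}$$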


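This expression is equal to all of the treatment parameters discussed in this chapter, since

$$E\left(U_1 \,\Big|\, \frac{\varepsilon_V}{\sigma_{\varepsilon_V}} > \Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right) = \frac{\operatorname{Cov}(U_1, \varepsilon_V)}{\sigma_{\varepsilon_V}} M_1\left(P(z, \tilde{f})\right)$$

and

$$E\left(U_0 \,\Big|\, \frac{\varepsilon_V}{\sigma_{\varepsilon_V}} \le \Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right) = \frac{\operatorname{Cov}(U_0, \varepsilon_V)}{\sigma_{\varepsilon_V}} M_0\left(P(z, \tilde{f})\right),$$

where

$$M_1\left(P(z, \tilde{f})\right) = \frac{\phi\left(\Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right)}{P(z, \tilde{f})}, \qquad M_0\left(P(z, \tilde{f})\right) = -\frac{\phi\left(\Phi^{-1}\left(1 - P(z, \tilde{f})\right)\right)}{1 - P(z, \tilde{f})},$$

where $\phi$ is the density of a standard normal random variable.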
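The two functions $M_1$ and $M_0$ are the means of a standard normal variable truncated above and below the threshold $\Phi^{-1}(1 - p)$. A minimal sketch, using only the Python standard library, that evaluates them and compares them with simulated truncated means; the function names, the value $p = 0.3$, and the simulation settings are illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean

N01 = NormalDist()  # standard normal distribution


def M1(p):
    """phi(Phi^{-1}(1 - p)) / p: mean of a standard normal above the threshold."""
    return N01.pdf(N01.inv_cdf(1.0 - p)) / p


def M0(p):
    """-phi(Phi^{-1}(1 - p)) / (1 - p): mean of a standard normal at or below it."""
    return -N01.pdf(N01.inv_cdf(1.0 - p)) / (1.0 - p)


def simulated_truncated_means(p, n=200_000, seed=1):
    """Monte Carlo means of standard normal draws above and below Phi^{-1}(1 - p)."""
    rng = random.Random(seed)
    c = N01.inv_cdf(1.0 - p)
    draws = [rng.gauss(0, 1) for _ in range(n)]
    above = [x for x in draws if x > c]
    below = [x for x in draws if x <= c]
    return fmean(above), fmean(below)


p = 0.3
sim_above, sim_below = simulated_truncated_means(p)
print(M1(p), sim_above)
print(M0(p), sim_below)
```

Since the untruncated mean is zero, the two pieces satisfy $p\,M_1(p) + (1 - p)\,M_0(p) = 0$, which is a convenient sanity check on any implementation.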


As a consequence of the assumptions about mutual independence of the errors

$$\operatorname{Cov}(U_i, \varepsilon_V) = \operatorname{Cov}(\alpha_{i1} f_1 + \alpha_{i2} f_2 + \varepsilon_i, \varepsilon_V) = 0, \quad i = 0, 1.$$

In the context of the generalized Roy model, the case considered in this subsection is the one matching is designed to solve.

Even though a selection model generates the data, the fact that the information used by the econometrician includes the minimal relevant information makes matching a correct solution to the selection problem.

We can estimate the treatment parameters with no bias since, as a consequence of our assumptions, $(U_0, U_1) \perp\!\!\!\perp D \mid (f, Z)$, which is exactly what matching requires.

The minimal relevant information set is even smaller.


For arbitrary factor loadings, we only need to know $(f_1, f_2)$ to secure conditional independence.

We can define the propensity score solely in terms of $f_1$ and $f_2$, and the Rosenbaum-Rubin result still goes through.

Our analysis in this section focuses on treatment parameters conditional on particular values of $P(Z, f) = P(z, \tilde{f})$, i.e., for fixed values of $p$, but we could condition more finely.

Conditioning on $P(z, \tilde{f})$ defines the treatment parameters more coarsely.

We can use either fine or coarse conditioning to construct the unconditional treatment effects.


In this example, using more information than what is in the relevant information set (i.e., using $Z$) is harmless.

But this is not generally true.

If $Z \not\perp\!\!\!\perp (U_0, U_1, V)$, adding $Z$ to the conditioning set can violate conditional independence assumption (M-1):

$$(Y_0, Y_1) \perp\!\!\!\perp D \mid (f_1, f_2),$$

but

$$(Y_0, Y_1) \not\perp\!\!\!\perp D \mid (f_1, f_2, Z).$$

Adding extra variables can destroy the crucial conditional independence property of matching.

We present an example of this point below.


We first consider a case where $Z \perp\!\!\!\perp (U_0, U_1, V)$ but the analyst conditions on $Z$ and not $(f_1, f_2)$.

In this case, there is selection on the unobservables that are not conditioned on.


The Economist does not Use All of the Minimal Relevant Information

Next, suppose that the information used by the econometrician is

$$I_E = \{Z\},$$

and there is selection on the unobservable (to the analyst) $f_1$ and $f_2$, i.e., the factor loadings $\alpha_{ij}$ are all nonzero.

Recall that we assume that $Z$ and the $f$ are independent.

In this case, the event $\left(D^* \gtrless 0, Z = z\right)$ is characterized by

$$\frac{\alpha_{V1} f_1 + \alpha_{V2} f_2 + \varepsilon_V}{\sqrt{\alpha_{V1}^2 \sigma_{f_1}^2 + \alpha_{V2}^2 \sigma_{f_2}^2 + \sigma_{\varepsilon_V}^2}} \gtrless \Phi^{-1}\left(1 - P(z)\right).$$
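One way to see where the normalizing term in the denominator comes from is the short derivation below. It assumes, as the normal factor model of this section appears to, that $f_1$, $f_2$, and $\varepsilon_V$ are mutually independent, mean-zero normal variables and that $D^* = Z\gamma + \alpha_{V1} f_1 + \alpha_{V2} f_2 + \varepsilon_V$. The symbol $\sigma_W$ is shorthand introduced here and does not appear in the slides.

```latex
\begin{aligned}
\operatorname{Var}\left(\alpha_{V1} f_1 + \alpha_{V2} f_2 + \varepsilon_V\right)
  &= \alpha_{V1}^2 \sigma_{f_1}^2 + \alpha_{V2}^2 \sigma_{f_2}^2 + \sigma_{\varepsilon_V}^2
  \equiv \sigma_W^2, \\
P(z) = \Pr\left(D^* > 0 \mid Z = z\right)
  &= \Pr\left(\frac{\alpha_{V1} f_1 + \alpha_{V2} f_2 + \varepsilon_V}{\sigma_W} > \frac{-z\gamma}{\sigma_W}\right)
  = 1 - \Phi\left(\frac{-z\gamma}{\sigma_W}\right), \\
\text{so that} \quad \frac{-z\gamma}{\sigma_W} &= \Phi^{-1}\left(1 - P(z)\right).
\end{aligned}
```

When the factors are not conditioned on, they become part of the composite unobservable in the choice equation, which is why their variances enter the standardization.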


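The bias for the different treatment parameters is given by

$$\text{Bias TT}(Z = z) = \text{Bias TT}(P(Z) = P(z)) = \eta_0 M(P(z)), \tag{13}$$

where $M(P(z)) = M_1(P(z)) - M_0(P(z))$.

$$\text{Bias ATE}(Z = z) = \text{Bias ATE}(P(Z) = P(z)) = M(P(z)) \left\{ \eta_1 \left[1 - P(z)\right] + \eta_0 P(z) \right\}, \tag{14}$$

where

$$\eta_1 = \frac{\alpha_{V1} \alpha_{11} \sigma_{f_1}^2 + \alpha_{V2} \alpha_{12} \sigma_{f_2}^2}{\sqrt{\alpha_{V1}^2 \sigma_{f_1}^2 + \alpha_{V2}^2 \sigma_{f_2}^2 + \sigma_{\varepsilon_V}^2}}, \qquad \eta_0 = \frac{\alpha_{V1} \alpha_{01} \sigma_{f_1}^2 + \alpha_{V2} \alpha_{02} \sigma_{f_2}^2}{\sqrt{\alpha_{V1}^2 \sigma_{f_1}^2 + \alpha_{V2}^2 \sigma_{f_2}^2 + \sigma_{\varepsilon_V}^2}}.$$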
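Expressions (13) and (14) are easy to evaluate numerically. The sketch below codes them and compares the TT bias in (13) with a Monte Carlo estimate of $E(U_0 \mid D = 1, z) - E(U_0 \mid D = 0, z)$. It assumes independent standard normal $f_1$, $f_2$, $\varepsilon_0$, $\varepsilon_V$ (all variances equal to one) and treats $z\gamma$ as a single scalar; the function names and parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

N01 = NormalDist()


def M(p):
    """M(p) = M1(p) - M0(p) = phi(Phi^{-1}(1 - p)) / (p * (1 - p))."""
    return N01.pdf(N01.inv_cdf(1.0 - p)) / (p * (1.0 - p))


def eta(a_j1, a_j2, aV1, aV2, var_f1, var_f2, var_eV):
    """eta_j for outcome j with loadings (a_j1, a_j2) when only Z is conditioned on."""
    num = aV1 * a_j1 * var_f1 + aV2 * a_j2 * var_f2
    return num / sqrt(aV1**2 * var_f1 + aV2**2 * var_f2 + var_eV)


def bias_tt(p, eta0):
    return eta0 * M(p)  # expression (13)


def bias_ate(p, eta1, eta0):
    return M(p) * (eta1 * (1.0 - p) + eta0 * p)  # expression (14)


def simulated_bias_tt(z_gamma, a01, a02, aV1, aV2, n=200_000, seed=2):
    """Monte Carlo E(U0 | D=1, z) - E(U0 | D=0, z) with unit variances throughout."""
    rng = random.Random(seed)
    u0_treated, u0_untreated = [], []
    for _ in range(n):
        f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
        e0, eV = rng.gauss(0, 1), rng.gauss(0, 1)
        u0 = a01 * f1 + a02 * f2 + e0
        if z_gamma + aV1 * f1 + aV2 * f2 + eV > 0:
            u0_treated.append(u0)
        else:
            u0_untreated.append(u0)
    return fmean(u0_treated) - fmean(u0_untreated)


# Hypothetical parameter values.
a01, a02, aV1, aV2, z_gamma = 0.5, 0.25, 1.0, 1.0, 0.4
scale = sqrt(aV1**2 + aV2**2 + 1.0)      # std. dev. of the composite choice error
p_z = 1.0 - N01.cdf(-z_gamma / scale)    # P(z) when only Z is conditioned on
eta0 = eta(a01, a02, aV1, aV2, 1.0, 1.0, 1.0)
formula_tt = bias_tt(p_z, eta0)
simulated_tt = simulated_bias_tt(z_gamma, a01, a02, aV1, aV2)
print(formula_tt, simulated_tt)
```

When $\eta_1 = \eta_0$, the braces in (14) reduce to $\eta_0$ and the ATE bias coincides with the TT bias, which gives a quick consistency check between the two formulas.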


It is not surprising that matching on sets of variables that exclude the relevant conditioning variables produces bias for the conditional (on $P(z)$) treatment parameters.

The advantage of working with a closed form expression for the bias is that it allows us to answer questions about the magnitude of this bias under different assumptions about the information available to the analyst, and to present some simple examples.

We next use expressions (13) and (14) as benchmarks against which to compare the relative size of the bias when we enlarge the econometrician’s information set beyond $Z$.


Adding Information to the Econometrician’s Information Set $I_E$: Using Some but not All the Information from the Minimal Relevant Information Set $I_R$

Suppose that the econometrician uses more information but not all of the information in the minimal relevant information set.

He still reports values of the parameters conditional on specific $p$ values but now the model for $p$ has different conditioning variables.

For example, the data set assumed in the preceding section might be augmented or else the econometrician decides to use information previously available.


In particular, assume that the econometrician’s information set is

$$I_E' = \{Z, f_2\},$$

and that he uses this information set.

Under Conditions 1 and 2 presented below, the biases for the treatment parameters conditional on values of $P = p$ are reduced in absolute value, relative to their values under $I_E = \{Z\}$, by changing the conditioning set in this way.

But these conditions are not generally satisfied, so that adding extra information does not necessarily reduce bias and may actually increase it.


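To show how this happens in our model, we define expressions comparable to $\eta_1$ and $\eta_0$ for this case:

$$\eta_1' = \frac{\alpha_{V1} \alpha_{11} \sigma_{f_1}^2}{\sqrt{\alpha_{V1}^2 \sigma_{f_1}^2 + \sigma_{\varepsilon_V}^2}}, \qquad \eta_0' = \frac{\alpha_{V1} \alpha_{01} \sigma_{f_1}^2}{\sqrt{\alpha_{V1}^2 \sigma_{f_1}^2 + \sigma_{\varepsilon_V}^2}}.$$

We compare the biases under the two cases using formulae (13)–(14), suitably modified, but keeping $p$ fixed at a specific value even though this implies different conditioning sets in terms of $(z, \tilde{f})$.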


Condition 1. The bias produced by using matching to estimate TT is smaller in absolute value for any given $p$ when the new information set $\sigma(I_E')$ is used if

$$|\eta_0| > |\eta_0'|.$$


There is a similar result for ATE:

Condition 2. The bias produced by using matching to estimate ATE is smaller in absolute value for any given $p$ when the new information set $\sigma(I_E')$ is used if

$$|\eta_1 (1 - p) + \eta_0 p| > |\eta_1' (1 - p) + \eta_0' p|.$$
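Conditions 1 and 2 can be checked directly for any parameter configuration. The following sketch does so for one hypothetical set of loadings in which $f_2$ matters a great deal for both outcomes, so that conditioning on it reduces both biases. The function names and all numerical values are assumptions made for illustration.

```python
from math import sqrt


def eta(a_j1, a_j2, aV1, aV2, var_f1, var_f2, var_eV):
    """eta_j under I_E = {Z}: both factors are left in the unobservables."""
    num = aV1 * a_j1 * var_f1 + aV2 * a_j2 * var_f2
    return num / sqrt(aV1**2 * var_f1 + aV2**2 * var_f2 + var_eV)


def eta_prime(a_j1, aV1, var_f1, var_eV):
    """eta'_j under I'_E = {Z, f2}: only f1 is left in the unobservables."""
    return aV1 * a_j1 * var_f1 / sqrt(aV1**2 * var_f1 + var_eV)


def condition_1(eta0, eta0_p):
    """TT bias shrinks in absolute value at every p."""
    return abs(eta0) > abs(eta0_p)


def condition_2(p, eta1, eta0, eta1_p, eta0_p):
    """ATE bias shrinks in absolute value at the given p."""
    return abs(eta1 * (1 - p) + eta0 * p) > abs(eta1_p * (1 - p) + eta0_p * p)


# Hypothetical loadings: alpha_12 = alpha_02 = 2.0, so f2 is important in both outcomes.
common = dict(aV1=1.0, aV2=1.0, var_f1=1.0, var_f2=1.0, var_eV=1.0)
eta1 = eta(1.0, 2.0, **common)
eta0 = eta(0.5, 2.0, **common)
eta1_p = eta_prime(1.0, aV1=1.0, var_f1=1.0, var_eV=1.0)
eta0_p = eta_prime(0.5, aV1=1.0, var_f1=1.0, var_eV=1.0)

print(condition_1(eta0, eta0_p))
print(condition_2(0.5, eta1, eta0, eta1_p, eta0_p))
```

With other loadings, for example a small or zero loading of $f_2$ in the outcome equations, the same functions can return `False`, which is the situation in which adding $f_2$ to the conditioning set fails to reduce the bias.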


Proof.

These conditions are a direct consequence of formulae (13) and (14), modified to allow for the different covariance structure produced by the information structure assumed in this section (replacing $\eta_0$ with $\eta_0'$, $\eta_1$ with $\eta_1'$).


It is important to notice that we condition on the same value of $p$ in deriving these expressions although the variables in $P$ are different across different specifications of the model.

Propensity-score matching defines them conditional on $P = p$, so we are being faithful to that method.


These conditions do not always hold.

In general, whether or not the bias will be reduced by adding conditioning variables depends on the relative importance of the additional information in both outcome equations and on the signs of the terms inside the absolute value.

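Consider whether Condition (1) is satisfied in general.

Assume $\eta_0 > 0$ for all $\alpha_{02}, \alpha_{V2}$.

Then $\eta_0 > \eta_0'$ if

$$\eta_0 = \frac{\alpha_{V1}\alpha_{01}\sigma^2_{f_1} + \alpha_{V2}^2\left(\dfrac{\alpha_{02}}{\alpha_{V2}}\right)\sigma^2_{f_2}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \alpha_{V2}^2\sigma^2_{f_2} + \sigma^2_{\varepsilon_V}}} > \frac{\alpha_{V1}\alpha_{01}\sigma^2_{f_1}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \sigma^2_{\varepsilon_V}}} = \eta_0'.$$

When $\alpha_{02}/\alpha_{V2} = 0$, clearly $\eta_0 < \eta_0'$: the numerators coincide while the denominator of $\eta_0$ is larger.

In that case, adding information to the conditioning set increases bias.

We can vary $\alpha_{02}/\alpha_{V2}$ holding all of the other parameters constant and hence can make the left-hand side arbitrarily large.

As $\alpha_{02}$ increases, there is some critical value $\alpha_{02}^*$ beyond which $\eta_0 > \eta_0'$.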
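The critical value can be written out. Let $D$ and $D'$ be the denominators of $\eta_0$ and $\eta_0'$. Solving $\eta_0 > \eta_0'$ for $\alpha_{02}$, under the added assumptions $\alpha_{V2} > 0$ and $\sigma^2_{f_2} > 0$, gives $\alpha_{02}^* = \alpha_{V1}\alpha_{01}\sigma^2_{f_1}(D/D' - 1)/(\alpha_{V2}\sigma^2_{f_2})$. The sketch below evaluates this expression; the function names and the parameter values (unit loadings and unit variances) are illustrative assumptions.

```python
import math


def eta0_without_f2(a_v1, a_v2, a_01, a_02, s2_f1=1.0, s2_f2=1.0, s2_ev=1.0):
    """eta_0 when f2 is NOT in the conditioning set."""
    num = a_v1 * a_01 * s2_f1 + a_v2 * a_02 * s2_f2
    den = math.sqrt(a_v1**2 * s2_f1 + a_v2**2 * s2_f2 + s2_ev)
    return num / den


def eta0_with_f2(a_v1, a_01, s2_f1=1.0, s2_ev=1.0):
    """eta_0' when f2 IS in the conditioning set."""
    return a_v1 * a_01 * s2_f1 / math.sqrt(a_v1**2 * s2_f1 + s2_ev)


def critical_alpha02(a_v1, a_v2, a_01, s2_f1=1.0, s2_f2=1.0, s2_ev=1.0):
    """alpha_02 at which eta_0 equals eta_0' (assumes a_v2 > 0 and s2_f2 > 0)."""
    d = math.sqrt(a_v1**2 * s2_f1 + a_v2**2 * s2_f2 + s2_ev)
    d_prime = math.sqrt(a_v1**2 * s2_f1 + s2_ev)
    return a_v1 * a_01 * s2_f1 * (d / d_prime - 1.0) / (a_v2 * s2_f2)


# Illustrative values: unit loadings, unit variances.
a_star = critical_alpha02(a_v1=1.0, a_v2=1.0, a_01=1.0)
print("critical alpha_02:", a_star)  # equals sqrt(1.5) - 1 for these values

for a_02 in (0.0, a_star - 0.1, a_star + 0.1):
    reduced = eta0_without_f2(1.0, 1.0, 1.0, a_02) > eta0_with_f2(1.0, 1.0)
    print("alpha_02 =", round(a_02, 4), "-> adding f2 reduces TT bias:", reduced)
```

For these values the threshold is $\sqrt{1.5} - 1 \approx 0.22$, so a loading of $f_2$ in $Y_0$ below that level means that conditioning on $f_2$ raises $|\eta_0|$ rather than lowering it.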

If we assumed that $\eta_0 < 0$, however, the opposite conclusion would hold, and the conditions for reduction in bias would be harder to meet as the relative importance of the new information is increased.

Similar expressions can be derived for ATE and MTE, in which the direction of the effect depends on the signs of the terms in the absolute value.

Figures 6A and 6B illustrate the point that adding some but not all information from the minimal relevant set might increase the point-wise bias and the unconditional or average bias for TT and ATE, respectively.

Values of the parameters of the model are presented at the base of the figures.

Figure 6: A. Bias for Treatment on the Treated; B. Bias for Average Treatment Effect (axis label: P)

Note: Using proxy $\tilde{Z}$ for $f_2$ increases the bias. $\mathrm{corr}(\tilde{Z}, f_2) = 0.5$.

Model:

$V = Z + f_1 + f_2 + \varepsilon_V$; $Y_1 = 2f_1 + 0.1f_2 + \varepsilon_1$; $Y_0 = f_1 + 0.1f_2 + \varepsilon_0$

$\varepsilon_V \sim N(0, 1)$; $\varepsilon_1 \sim N(0, 1)$; $\varepsilon_0 \sim N(0, 1)$

$f_1 \sim N(0, 1)$; $f_2 \sim N(0, 1)$

Source: Heckman and Navarro (2005)

In these figures, we compare conditioning on $P(z)$, which in general is not guaranteed to eliminate bias, with conditioning on $P(z)$ and $f_2$ but not $f_1$.

Adding $f_2$ to the conditioning set increases bias.

The fact that the point-wise (and overall) bias might increase when adding some but not all information from $I_R$ is a feature that is not shared by the method of control functions.

Because the method of control functions models the stochastic dependence of the unobservables in the outcome equations on the observables, changing the variables observed by the econometrician to include $f_2$ does not generate bias.

It only changes the control function used.

That is, by adding $f_2$ we change the control function from

$$K_1(P(Z) = P(z)) = \eta_1 M_1(P(z))$$

$$K_0(P(Z) = P(z)) = \eta_0 M_0(P(z))$$

to

$$K_1'(P(Z, f_2) = P(z, f_2)) = \eta_1' M_1(P(z, f_2))$$

$$K_0'(P(Z, f_2) = P(z, f_2)) = \eta_0' M_0(P(z, f_2))$$

but do not generate any bias in using the control function estimator.

This is a major advantage of this method.
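To make the role of $M_1$ and $M_0$ concrete, the sketch below assumes the joint-normal selection model, under which $M_1(p) = \phi(\Phi^{-1}(1-p))/p$ and $M_0(p) = -\phi(\Phi^{-1}(1-p))/(1-p)$. These functional forms are not restated in this part of the slides and are an assumption here, as are the function names and the numeric $\eta$ values.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def m1(p: float) -> float:
    """Selection term for the treated: phi(Phi^{-1}(1 - p)) / p."""
    c = _STD_NORMAL.inv_cdf(1.0 - p)
    return _STD_NORMAL.pdf(c) / p


def m0(p: float) -> float:
    """Selection term for the untreated: -phi(Phi^{-1}(1 - p)) / (1 - p)."""
    c = _STD_NORMAL.inv_cdf(1.0 - p)
    return -_STD_NORMAL.pdf(c) / (1.0 - p)


def control_functions(eta1: float, eta0: float, p: float):
    """Return (K1, K0) = (eta1 * M1(p), eta0 * M0(p))."""
    return eta1 * m1(p), eta0 * m0(p)


# Hypothetical coefficients before and after adding f2 to the conditioning set.
k1, k0 = control_functions(eta1=1.2, eta0=0.6, p=0.5)
k1_new, k0_new = control_functions(eta1=1.4, eta0=0.7, p=0.5)
print("without f2:", k1, k0)
print("with f2:   ", k1_new, k0_new)
```

In practice the argument also changes from $P(z)$ to $P(z, f_2)$; the same $p$ is used in both calls only to isolate the change in the $\eta$ coefficients. Under the assumed forms, $p\,M_1(p) + (1-p)\,M_0(p) = 0$, which reflects the mean-zero unobservable.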

It controls for the bias of the omitted conditioning variables by modeling it.

Of course, if the model for the bias term is not valid, neither is the correction for the bias.

Semiparametric selection estimators are designed to protect the analyst against model misspecification (see, e.g., Powell, 1994).

Matching evades this problem by assuming that the analyst always knows the correct conditioning variables and that they satisfy (M-1).

In actual empirical settings, agents rarely know the relevant information set.

Instead they use proxies.

Adding Information to the Econometrician's Information Set: Using Proxies for the Relevant Information

Suppose that instead of knowing some part of the minimal relevant information set, such as $f_2$, the analyst has access to a proxy for it.

In particular, assume that he has access to a variable $\tilde{Z}$ that is correlated with $f_2$ but that is not the full minimal relevant information set.

That is, define the econometrician's information to be

$$\tilde{I}_E = \{Z, \tilde{Z}\},$$

and suppose that he uses it so $I_E = \tilde{I}_E$.

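In order to obtain closed-form expressions for the biases we assume that

$$\tilde{Z} \sim N\left(0, \sigma^2_{\tilde{Z}}\right), \quad \mathrm{corr}(\tilde{Z}, f_2) = \rho, \quad \text{and} \quad \tilde{Z} \perp\!\!\!\perp (\varepsilon_0, \varepsilon_1, \varepsilon_V, f_1).$$

We define expressions comparable to $\eta$ and $\eta'$:

$$\tilde{\eta}_1 = \frac{\alpha_{11}\alpha_{V1}\sigma^2_{f_1} + \alpha_{12}\alpha_{V2}(1-\rho^2)\sigma^2_{f_2}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \alpha_{V2}^2\sigma^2_{f_2}(1-\rho^2) + \sigma^2_{\varepsilon_V}}}$$

$$\tilde{\eta}_0 = \frac{\alpha_{01}\alpha_{V1}\sigma^2_{f_1} + \alpha_{02}\alpha_{V2}(1-\rho^2)\sigma^2_{f_2}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \alpha_{V2}^2\sigma^2_{f_2}(1-\rho^2) + \sigma^2_{\varepsilon_V}}}.$$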

By substituting $\tilde{I}_E$ for $I_E'$ and $\tilde{\eta}_j$ for $\eta_j'$ ($j = 0, 1$) in Conditions (1) and (2), we can obtain results for the bias in this case.

Whether $\tilde{I}_E$ will be bias-reducing depends on how well it spans $I_R$ and on the signs of the terms in the absolute values in those conditions.

In this case, however, there is another parameter to consider: the correlation $\rho$ between $\tilde{Z}$ and $f_2$.

If $|\rho| = 1$ we are back to the case of $\tilde{I}_E = I_E'$ because $\tilde{Z}$ is a perfect proxy for $f_2$.

Because we know that the bias at a particular value of $p$ might either increase or decrease when $f_2$ is used as a conditioning variable but $f_1$ is not, we know that it is not possible to determine whether the bias increases or decreases as we change the correlation between $f_2$ and $\tilde{Z}$.

That is, we know that going from $\rho = 0$ to $|\rho| = 1$ might change the bias in any direction.

Use of a better proxy in this correlational sense may produce a more biased estimate.
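The dependence on $\rho$ can be traced numerically from the proxy-based coefficient for $Y_0$, which has numerator $\alpha_{01}\alpha_{V1}\sigma^2_{f_1} + \alpha_{02}\alpha_{V2}(1-\rho^2)\sigma^2_{f_2}$. The sketch below is illustrative only: the function name and the two loadings of $Y_0$ on $f_2$ are hypothetical, and all other loadings and variances are set to one.

```python
import math


def eta_proxy(a_j1, a_j2, a_v1, a_v2, rho, s2_f1=1.0, s2_f2=1.0, s2_ev=1.0):
    """Proxy-based coefficient for outcome j when corr(proxy, f2) = rho."""
    resid = (1.0 - rho**2) * s2_f2  # variance of f2 not explained by the proxy
    num = a_j1 * a_v1 * s2_f1 + a_j2 * a_v2 * resid
    den = math.sqrt(a_v1**2 * s2_f1 + a_v2**2 * resid + s2_ev)
    return num / den


# Two hypothetical loadings of Y0 on f2; everything else is set to 1.
for a_02 in (0.1, 1.0):
    path = [eta_proxy(1.0, a_02, 1.0, 1.0, rho) for rho in (0.0, 0.5, 1.0)]
    print("alpha_02 =", a_02, "->", [round(v, 4) for v in path])
```

At $\rho = 0$ the expression reduces to the coefficient obtained without $f_2$, and at $|\rho| = 1$ to the one obtained with $f_2$. For the small loading the coefficient rises with $|\rho|$, and for the large loading it falls, so a better proxy can move the TT bias in either direction.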

It is straightforward to derive conditions under which the bias generated when the econometrician's information is $\tilde{I}_E$ is smaller than when it is $I_E'$.

That is, it can be the case that knowing the proxy variable $\tilde{Z}$ is better than knowing the actual variable $f_2$.

Returning to the analysis of treatment on the treated as an example (i.e., Condition (1)), the bias in absolute value (at a fixed value of $p$) is reduced when $\tilde{Z}$ is used instead of $f_2$ if

$$\left| \frac{\alpha_{01}\alpha_{V1}\sigma^2_{f_1} + \alpha_{02}\alpha_{V2}(1-\rho^2)\sigma^2_{f_2}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \alpha_{V2}^2\sigma^2_{f_2}(1-\rho^2) + \sigma^2_{\varepsilon_V}}} \right| < \left| \frac{\alpha_{01}\alpha_{V1}\sigma^2_{f_1}}{\sqrt{\alpha_{V1}^2\sigma^2_{f_1} + \sigma^2_{\varepsilon_V}}} \right|.$$

Figures 7A and 7B use the same true model as in the previous section to illustrate the two points being made here.
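As a worked check of this inequality, take the parameter values listed with Figure 6, read as $\alpha_{V1} = \alpha_{V2} = 1$, $\alpha_{01} = 1$, $\alpha_{02} = 0.1$ with unit variances, and $\rho = 0.5$ as in the figure note. Here $\tilde{\eta}_0$ denotes the proxy-based coefficient; the arithmetic below is a sketch under that reading of the model.

```latex
\begin{align*}
\eta_0 &= \frac{1 + 0.1}{\sqrt{1 + 1 + 1}} \approx 0.635
  && \text{(conditioning on } Z \text{ only)} \\
\tilde{\eta}_0 &= \frac{1 + 0.1\,(1 - 0.25)}{\sqrt{1 + (1 - 0.25) + 1}} \approx 0.648
  && \text{(adding the proxy, } \rho = 0.5\text{)} \\
\eta_0' &= \frac{1}{\sqrt{1 + 1}} \approx 0.707
  && \text{(adding } f_2 \text{ itself)}
\end{align*}
```

Under these values $|\tilde{\eta}_0| < |\eta_0'|$, so the proxy produces less TT bias than $f_2$ itself, while $|\tilde{\eta}_0| > |\eta_0|$, so using the proxy still increases the bias relative to conditioning on $Z$ alone.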

Figure 7: A. Bias for Treatment on the Treated

B. Bias for Average Treatment Effect

Note: Using proxy Z for f2 increases the bias. Correlation(Z, f2) = 0.5.

Model:

V = Z + f1 + f2 + εV; Y1 = 2f1 + 0.1f2 + ε1; Y0 = f1 + 0.1f2 + ε0

εV ∼ N(0, 1); ε1 ∼ N(0, 1); ε0 ∼ N(0, 1)

f1 ∼ N(0, 1); f2 ∼ N(0, 1)

Source: Heckman and Navarro (2005)


Namely, using a proxy for an unobserved relevant variable might increase the bias.

On the other hand, it might be better in terms of bias to use a proxy than to use the actual variable, f2.

However, as Figures 8A and 8B show, by changing α02 from 0.1 to 1, using a proxy might increase the bias versus using the actual variable f2.

Notice that the bias need not be universally negative or positive but depends on p.


Figure 8: A. Bias for Treatment on the Treated

B. Bias for Average Treatment Effect

Note: Using proxy Z for f2 increases the bias. Correlation(Z, f2) = 0.5.

Model:

V = Z + f1 + f2 + εV; Y1 = 2f1 + 0.1f2 + ε1; Y0 = f1 + f2 + ε0

εV ∼ N(0, 1); ε1 ∼ N(0, 1); ε0 ∼ N(0, 1)

f1 ∼ N(0, 1); f2 ∼ N(0, 1)

Source: Heckman and Navarro (2005)


The point of these examples is that matching makes very knife-edge assumptions.

If the analyst gets the right conditioning set, (M-1) is satisfied and there is no bias.

But determining the correct information set is not a trivial task.

Having good proxies in the standard usage of that term can create substantial bias in estimating treatment effects.

Half a loaf may be worse than none.


The Case of a Discrete Outcome Variable

Heckman and Navarro (2004) construct parallel examples for cases including discrete dependent variables.

In particular, they consider nonnormal, nonseparable equations for odds ratios and probabilities.

The proposition that matching identifies the correct treatment parameter if the econometrician’s information set includes all the minimal relevant information is true more generally, provided that any additional extraneous information used is exogenous in a sense to be defined precisely in the next section.


On the Use of Model Selection Criteria to Choose Matching Variables

We have already shown by way of example that adding more variables from a minimal relevant information set, but not all variables in it, may increase bias.

By a parallel argument, adding additional variables to the relevant conditioning set may make the bias worse.

Although we have used our prototypical Roy model as our point of departure, the point is more general.


There is no rigorous rule for choosing the conditioning variables that produce (M-1).

Adding variables that are statistically significant in the treatment choice equation is not guaranteed to select a set of conditioning variables that satisfies condition (M-1).

Adding f2 when it determines D may increase bias at any selected value of p.


The existing literature (e.g., Heckman, Ichimura, Smith, and Todd, 1998) proposes selecting the set of conditioning variables by a goodness of fit criterion (λ), where a higher λ means a better fit in the equation predicting D.

The intuition behind such criteria is that by using some measure of goodness of fit as a guiding principle one is using information relevant to the decision process.

Using f2 improves goodness of fit of the model for D, but increases bias for the parameters.

In general, such a rule is deficient if f1 is not known or is not used.


An implicit assumption underlying such procedures is that the added conditioning variables X are exogenous in the following sense:


(Y0, Y1) ⊥⊥ D | I_int, X (E-1)

where I_int is interpreted as the variables initially used as conditioning variables before X is added.

Failure of exogeneity is a failure of (M-1) for the augmented conditioning set, and matching estimators based on the augmented information set (I_int, X) are biased when the condition is not satisfied.
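Spelled out for means, as a short worked step: the lines below use only the standard fact that conditional independence implies conditional mean independence. They assume the conditional means exist and that both D = 1 and D = 0 occur with positive probability given (I_int, X).

```latex
% (E-1) implies that each potential outcome is mean independent of D:
\[
E(Y_j \mid D = 1, I_{int}, X) = E(Y_j \mid D = 0, I_{int}, X) = E(Y_j \mid I_{int}, X),
\qquad j = 0, 1,
\]
% so the selection term for treatment on the treated is zero:
\[
E(Y_0 \mid D = 1, I_{int}, X) - E(Y_0 \mid D = 0, I_{int}, X) = 0 .
\]
```

When (E-1) fails, the difference in the second line is generally nonzero, and it is this term that a matching estimator based on (I_int, X) carries as bias.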


Exogeneity assumption (E-1) is not usually invoked in the matching literature, which largely focuses on problem P-1, evaluating a program in place, rather than extrapolating to new environments (P-2).

Indeed, the robustness of matching to such exogeneity assumptions is trumpeted as one of the virtues of the method.

In this section, we show some examples that illustrate the general point that standard model selection criteria fail to produce correctly specified conditioning sets unless some version of exogeneity condition (E-1) is satisfied.


In the literature, the use of model selection criteria is justified in two different ways.

Sometimes it is claimed that they provide a relative guide.

Sets of variables with better goodness of fit in predicting D (a higher λ in the notation of Table 5) are alleged to be better than sets of variables with lower λ in the sense that they generate lower biases.

However, we have already shown that this is not true.

We know that enlarging the analyst’s information from I_int = {Z} to I′_int = {Z, f2} will improve fit since f2 is also in I_A and I_R.

But going from I_int to I′_int might increase the bias.


Table 5: Goodness of fit statistics λ

                      Goodness of fit statistics λ      Average bias
Variables in probit   Correct in-sample    Pseudo R2    TT        ATE
                      prediction rate
Z                     66.88%               0.1284       1.1380    1.6553
Z, f2                 75.02%               0.2791       1.2671    1.9007
Z, f1, f2             83.45%               0.4844       0.0000    0.0000
Z, S1                 77.38%               0.3282       0.9612    1.3981
Z, S2                 92.25%               0.7498       0.9997    1.4541

Source: Heckman and Navarro (2004)


So it is not true that combinations of variables that increase some measure of fit λ necessarily reduce the bias.

Table 5 illustrates this point using our normal example.

Going from row 1 to row 2 (adding f2) improves goodness of fit and increases the unconditional or overall bias for all three treatment parameters, because (E-1) is violated.
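A small Monte Carlo can illustrate the same pattern without relying on the numbers in Table 5. The sketch below is not the authors' procedure and does not attempt to reproduce the table. It assumes the model printed under Figure 7 with treatment rule D = 1[V > 0], builds Z so that Corr(Z, f2) = 0.5, predicts D from the sign of the linear index E[V | conditioning set] in place of a fitted probit, and approximates conditioning by equal-frequency cells. All function names, the sample size, and the number of cells are hypothetical choices.

```python
import numpy as np

RHO, ALPHA_02, N, K = 0.5, 0.1, 200_000, 15  # assumed illustrative values


def simulate(seed=0):
    rng = np.random.default_rng(seed)
    f1, f2, eta, eps_v, eps_0 = rng.standard_normal((5, N))
    z = RHO * f2 + np.sqrt(1.0 - RHO ** 2) * eta  # Var(Z) = 1, Corr(Z, f2) = RHO
    v = z + f1 + f2 + eps_v                       # choice index from the figure notes
    d = v > 0                                     # assumed treatment rule
    y0 = f1 + ALPHA_02 * f2 + eps_0               # untreated outcome
    return z, f2, d, y0


def quantile_cells(x, k=K):
    """Assign each observation to one of k equal-frequency cells (labels 0..k-1)."""
    edges = np.quantile(x, np.linspace(0.0, 1.0, k + 1)[1:-1])
    return np.digitize(x, edges)


def tt_bias(cells, d, y0):
    """Treated-weighted average of E[Y0 | D=1, cell] - E[Y0 | D=0, cell]."""
    n_cells = int(cells.max()) + 1
    n1 = np.bincount(cells, weights=d.astype(float), minlength=n_cells)
    n0 = np.bincount(cells, weights=(~d).astype(float), minlength=n_cells)
    s1 = np.bincount(cells, weights=y0 * d, minlength=n_cells)
    s0 = np.bincount(cells, weights=y0 * ~d, minlength=n_cells)
    ok = (n1 > 0) & (n0 > 0)  # keep cells holding both treated and untreated units
    gap = s1[ok] / n1[ok] - s0[ok] / n0[ok]
    return float(np.sum(gap * n1[ok]) / np.sum(n1[ok]))


def hit_rate(index, d):
    """Share of units whose treatment status matches the sign of the index."""
    return float(np.mean((index > 0) == d))


def run():
    z, f2, d, y0 = simulate()
    cz, cf2 = quantile_cells(z), quantile_cells(f2)
    return {
        # E[V | Z] is proportional to Z, so its sign is the sign of Z.
        "Z": (hit_rate(z, d), tt_bias(cz, d, y0)),
        # E[V | Z, f2] = Z + f2; cells are the cross of the two partitions.
        "Z, f2": (hit_rate(z + f2, d), tt_bias(cz * K + cf2, d, y0)),
    }


if __name__ == "__main__":
    for name, (fit, bias) in run().items():
        print(f"{name:6s}  prediction rate: {fit:.3f}  average TT bias: {bias:.3f}")
```

Under these assumptions, the richer set (Z, f2) should predict D better and still show the larger treated-average gap in Y0, which is the pattern the slides describe for rows 1 and 2 of Table 5.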


The following rule of thumb argument is sometimes invoked as an absolute standard against which to compare alternative models.

In versions of the argument, the analyst asserts that there is a combination of variables I′′ that satisfies (M-1) and hence produces zero bias and a value of λ = λ′′ larger than that of any other I.

In our examples, conditioning on Z, f1, f2 generates zero bias.

We can exclude Z and still obtain zero bias.

Because Z is a determinant of D, this shows immediately that the best fitting model does not necessarily identify the minimal relevant information set.
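The point can be illustrated with a simulation sketch. It assumes the model printed under Figure 7 with treatment rule D = 1[V > 0] and Corr(Z, f2) = 0.5, and every name in it is hypothetical. It gauges selection on Y0 by the coefficient on D in a least-squares regression of Y0 on the conditioning variables and D, which is a convenient check in this linear normal setup rather than a matching estimator. Fit is measured by how often the sign of E[V | conditioning set] agrees with D.

```python
import numpy as np


def simulate(n=200_000, rho=0.5, alpha_02=0.1, seed=1):
    rng = np.random.default_rng(seed)
    f1, f2, eta, eps_v, eps_0 = rng.standard_normal((5, n))
    z = rho * f2 + np.sqrt(1.0 - rho ** 2) * eta  # Var(Z) = 1, Corr(Z, f2) = rho
    d = (z + f1 + f2 + eps_v) > 0                 # assumed treatment rule
    y0 = f1 + alpha_02 * f2 + eps_0               # untreated outcome
    return z, f1, f2, d, y0


def selection_gap(y0, d, *controls):
    """Coefficient on D from OLS of Y0 on a constant, the controls, and D."""
    X = np.column_stack([np.ones_like(y0), *controls, d.astype(float)])
    coef, *_ = np.linalg.lstsq(X, y0, rcond=None)
    return float(coef[-1])


def hit_rate(index, d):
    """Share of units whose treatment status matches the sign of the index."""
    return float(np.mean((index > 0) == d))


def run(rho=0.5):
    z, f1, f2, d, y0 = simulate(rho=rho)
    return {
        # Without Z: E[V | f1, f2] = f1 + f2 + E[Z | f2] = f1 + (1 + rho) * f2.
        "f1, f2": (hit_rate(f1 + (1.0 + rho) * f2, d),
                   selection_gap(y0, d, f1, f2)),
        # With Z: E[V | Z, f1, f2] = Z + f1 + f2.
        "Z, f1, f2": (hit_rate(z + f1 + f2, d),
                      selection_gap(y0, d, z, f1, f2)),
    }


if __name__ == "__main__":
    for name, (fit, gap) in run().items():
        print(f"{name:10s}  prediction rate: {fit:.3f}  gap in Y0 by D: {gap:+.4f}")
```

In this sketch both conditioning sets leave a gap near zero, while the set that includes Z predicts D better, so ranking sets by fit alone would favor the larger set even though the smaller one already removes the selection on Y0.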


In this example, including Z is innocuous because there is still zero bias and the added conditioning variables satisfy (E-1), where I_int = (f1, f2).

In general, such a rule is not innocuous if Z is not exogenous.

If goodness of fit is used as a rule to choose variables on which to match, there is no guarantee that it produces a desirable conditioning set.

If we include in the conditioning set variables X that violate (E-1), they may improve the fit of predicted probabilities but worsen the bias.


Heckman and Navarro (2004) produce a series of examples that have the following feature.

Variables S (shown at the base of Table 5) that improve the prediction of D but are correlated with (U0, U1) are added to the information set.

Their particular examples use imperfect proxies (S1, S2) for (f1, f2).

The point is more general.

The S variables fail exogeneity and produce greater bias for TT and ATE, but they improve the prediction of D as measured by the correct in-sample prediction rate and the pseudo-R².

See the bottom two rows of Table 5.
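The mechanism can be illustrated with a deliberately simple simulation. This is a sketch under assumptions of our own and does not reproduce the Heckman and Navarro (2004) design or the Table 5 figures: the factors are binary, the treatment effect is a constant 1 (so ATE and TT coincide), selection on f1 and f2 offsets exactly when nothing is conditioned on, and a single hypothetical proxy S1 equals f1 except for a 20 percent misclassification rate.

```python
import numpy as np


def matching_ate(y, d, cells):
    """Exact-cell matching: within-cell treated-minus-untreated mean
    differences, weighted by each cell's share of the sample."""
    total = 0.0
    for c in np.unique(cells):
        in_cell = cells == c
        diff = y[in_cell & (d == 1)].mean() - y[in_cell & (d == 0)].mean()
        total += in_cell.mean() * diff
    return total


def hit_rate(d, cells):
    """In-sample share of D predicted correctly by each cell's majority."""
    hits = 0.0
    for c in np.unique(cells):
        in_cell = cells == c
        p = d[in_cell].mean()
        hits += max(p, 1.0 - p) * in_cell.sum()
    return hits / len(d)


def proxy_example(n=200_000, seed=0, flip=0.2):
    """Assumed design: U0 = U1 = f1 + f2 + noise, D driven by f1 - f2."""
    rng = np.random.default_rng(seed)
    f1 = rng.integers(0, 2, n)
    f2 = rng.integers(0, 2, n)
    y0 = f1 + f2 + rng.normal(size=n)
    y1 = y0 + 1.0  # constant treatment effect of 1 (assumption)
    # f1 raises and f2 lowers the chance of treatment, so their
    # selection effects cancel in the unconditional comparison.
    d = (f1 - f2 + rng.normal(size=n) > 0).astype(int)
    y = np.where(d == 1, y1, y0)
    # Imperfect proxy for f1: misclassified with probability `flip`.
    s1 = np.where(rng.random(n) < flip, 1 - f1, f1)
    one_cell = np.zeros(n, dtype=int)  # no conditioning variables
    return {
        "bias_none": matching_ate(y, d, one_cell) - 1.0,
        "bias_s1": matching_ate(y, d, s1) - 1.0,
        "hit_none": hit_rate(d, one_cell),
        "hit_s1": hit_rate(d, s1),
    }


if __name__ == "__main__":
    for name, value in proxy_example().items():
        print(f"{name}: {value:.3f}")
```

In this construction, conditioning on S1 raises the correct prediction rate for D but moves the matching estimate away from the true effect. S1 absorbs part of the selection on f1 and leaves the offsetting selection on f2 uncorrected.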


We next turn to the method of randomization, which is frequently held up as an ideal approach for evaluating social programs.

Randomization attempts to use random assignment to achieve the conditional independence assumed in matching.

Aakvik, A., J. J. Heckman, and E. J. Vytlacil (2005). Estimating treatment effects for discrete outcomes when responses to treatment vary: An application to Norwegian vocational rehabilitation programs. Journal of Econometrics 125(1–2), 15–51.

Abadie, A. and G. W. Imbens (2006, January). Large sample properties of matching estimators for average treatment effects. Econometrica 74(1), 235–267.

Barnow, B. S., G. G. Cain, and A. S. Goldberger (1980). Issues in the analysis of selectivity bias. In E. W. Stromsdorfer and G. Farkas (Eds.), Evaluation Studies Review Annual, Volume 5, pp. 42–59. Beverly Hills, California: Sage Publications.

Carneiro, P. (2002). Heterogeneity in the Returns to Schooling: Implications for Policy Evaluation. Ph.D. thesis, University of Chicago.

Carneiro, P., K. Hansen, and J. J. Heckman (2001, Fall). Removing the veil of ignorance in assessing the distributional impacts of social policies. Swedish Economic Policy Review 8(2), 273–301.

Carneiro, P., K. Hansen, and J. J. Heckman (2003, May). Estimating distributions of treatment effects with an application to the returns to schooling and measurement of the effects of uncertainty on college choice. International Economic Review 44(2), 361–422.

Cochran, W. G. and D. B. Rubin (1973). Controlling bias in observational studies: A review. Sankhyā 35(Series A, Part 4), 417–446.

Cunha, F., J. J. Heckman, and S. Navarro (2005, April). Separating uncertainty from heterogeneity in life cycle earnings, The 2004 Hicks Lecture. Oxford Economic Papers 57(2), 191–261.

Gill, R. D. and J. M. Robins (2001, December). Causal inference for complex longitudinal data: The continuous case. Annals of Statistics 29(6), 1785–1811.

Hahn, J. (1998, March). On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica 66(2), 315–331.

Hansen, K. T., J. J. Heckman, and K. J. Mullen (2004, July–August). The effect of schooling and ability on achievement test scores. Journal of Econometrics 121(1–2), 39–98.

Heckman, J. J. (1980). Addendum to sample selection bias as a specification error. In E. Stromsdorfer and G. Farkas (Eds.), Evaluation Studies Review Annual, Volume 5. Beverly Hills: Sage Publications.

Heckman, J. J. (1998). The effects of government policies on human capital investment, unemployment and earnings inequality. In Third Public GAAC Symposium: Labor Markets in the USA and Germany, Volume 5. Bonn, Germany: German-American Academic Council Foundation.

Heckman, J. J., H. Ichimura, J. Smith, and P. E. Todd (1998, September). Characterizing selection bias using experimental data. Econometrica 66(5), 1017–1098.

Heckman, J. J., H. Ichimura, and P. E. Todd (1997). How details make a difference: Semiparametric estimation of the partially linear regression model. Unpublished manuscript, University of Chicago, Department of Economics.

Heckman, J. J., H. Ichimura, and P. E. Todd (1998, April). Matching as an econometric evaluation estimator. Review of Economic Studies 65(2), 261–294.

Heckman, J. J., R. J. LaLonde, and J. A. Smith (1999). The economics and econometrics of active labor market programs. In O. C. Ashenfelter and D. Card (Eds.), Handbook of Labor Economics, Volume 3A, Chapter 31, pp. 1865–2097. New York: North-Holland.

Heckman, J. J. and S. Navarro (2004, February). Using matching, instrumental variables, and control functions to estimate economic choice models. Review of Economics and Statistics 86(1), 30–57.

Heckman, J. J. and R. Robb (1985). Alternative methods for evaluating the impact of interventions. In J. J. Heckman and B. S. Singer (Eds.), Longitudinal Analysis of Labor Market Data, Volume 10, pp. 156–245. New York: Cambridge University Press.

Heckman, J. J. and R. Robb (1986a). Alternative methods for solving the problem of selection bias in evaluating the impact of treatments on outcomes. In H. Wainer (Ed.), Drawing Inferences from Self-Selected Samples, pp. 63–107. New York: Springer-Verlag. Reprinted in 2000, Mahwah, NJ: Lawrence Erlbaum Associates.

Heckman, J. J. and R. Robb (1986b). Postscript: A rejoinder to Tukey. In H. Wainer (Ed.), Drawing Inferences from Self-Selected Samples, pp. 111–114. New York: Springer-Verlag. Reprinted in 2000, Mahwah, NJ: Lawrence Erlbaum Associates.

Heckman, J. J. and E. J. Vytlacil (2005, May). Structural equations, treatment effects and econometric policy evaluation. Econometrica 73(3), 669–738.

Hirano, K., G. W. Imbens, and G. Ridder (2003, July). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4), 1161–1189.

Matzkin, R. L. (2007). Nonparametric identification. In J. J. Heckman and E. E. Leamer (Eds.), Handbook of Econometrics, Volume 6B. Amsterdam: Elsevier.

Powell, J. L. (1994). Estimation of semiparametric models. In R. Engle and D. McFadden (Eds.), Handbook of Econometrics, Volume 4, pp. 2443–2521. Amsterdam: Elsevier.

Robins, J. M. (1997). Causal inference from complex longitudinal data. In M. Berkane (Ed.), Latent Variable Modeling and Applications to Causality. Lecture Notes in Statistics, pp. 69–117. New York: Springer-Verlag.

Rosenbaum, P. R. (1995). Observational Studies. New York: Springer-Verlag.

Rosenbaum, P. R. and D. B. Rubin (1983, April). The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55.

Roy, A. (1951, June). Some thoughts on the distribution of earnings. Oxford Economic Papers 3(2), 135–146.

Todd, P. E. (1999, October). A practical guide to implementing matching estimators. Unpublished manuscript, University of Pennsylvania, Department of Economics. Prepared for the IADB meeting in Santiago, Chile.

Todd, P. E. (2007). Evaluating social programs with endogenous program placement and selection of the treated. In Handbook of Development Economics. Amsterdam: Elsevier. Forthcoming.

Todd, P. E. (2008). Matching estimators. In S. Durlauf and L. E. Blume (Eds.), The New Palgrave Dictionary of Economics. New York: Palgrave Macmillan. Forthcoming.

Vijverberg, W. P. M. (1993, May–June). Measuring the unidentified parameter of the extended Roy model of selectivity. Journal of Econometrics 57(1–3), 69–89.

Willis, R. J. and S. Rosen (1979, October). Education and self-selection. Journal of Political Economy 87(5, Part 2), S7–S36.

163 / 163

Page 484: Econometric Evaluation of Social Programs Part IIjenni.uchicago.edu/econ312/Slides/HBK2_matching-STATIC_2019-04-02a_jbb.pdfApr 02, 2019  · From condition (Q-1), we recover the distributions

Matching References References

We next turn to the method of randomization, which isfrequently held up to be an ideal approach for evaluating socialprograms.

Randomization attempts to use a random assignment toachieve the conditional independence assumed in matching.

Aakvik, A., J. J. Heckman, and E. J. Vytlacil (2005). Estimatingtreatment effects for discrete outcomes when responses totreatment vary: An application to Norwegian vocationalrehabilitation programs. Journal of Econometrics 125(1–2),15–51.

Abadie, A. and G. W. Imbens (2006, January). Large sampleproperties of matching estimators for average treatment effects.Econometrica 74(1), 235–267.

Barnow, B. S., G. G. Cain, and A. S. Goldberger (1980). Issues inthe analysis of selectivity bias. In E. W. Stromsdorfer andG. Farkas (Eds.), Evaluation Studies Review Annual, Volume 5,pp. 42–59. Beverly Hills, California: Sage Publications.

Carneiro, P. (2002). Heterogeneity in the Returns to Schooling:Implications for Policy Evaluation. Ph. D. thesis, University ofChicago.

Carneiro, P., K. Hansen, and J. J. Heckman (2001, Fall). Removingthe veil of ignorance in assessing the distributional impacts ofsocial policies. Swedish Economic Policy Review 8(2), 273–301.

Carneiro, P., K. Hansen, and J. J. Heckman (2003, May).Estimating distributions of treatment effects with an applicationto the returns to schooling and measurement of the effects ofuncertainty on college choice. International EconomicReview 44(2), 361–422.

Cochran, W. G. and D. B. Rubin (1973). Controlling bias inobservational studies: A review. Sankyha 35(Series A, Part 4),417–446.

Cunha, F., J. J. Heckman, and S. Navarro (2005, April). Separatinguncertainty from heterogeneity in life cycle earnings, The 2004Hicks Lecture. Oxford Economic Papers 57(2), 191–261.

Gill, R. D. and J. M. Robins (2001, December). Causal inference forcomplex longitudinal data: The continuous case. Annals ofStatistics 29(6), 1785–1811.

Hahn, J. (1998, March). On the role of the propensity score inefficient semiparametric estimation of average treatment effects.Econometrica 66(2), 315–331.

Hansen, K. T., J. J. Heckman, and K. J. Mullen (2004,July–August). The effect of schooling and ability on achievementtest scores. Journal of Econometrics 121(1–2), 39–98.

Heckman, J. J. (1980). Addendum to sample selection bias as aspecification error. In E. Stromsdorfer and G. Farkas (Eds.),Evaluation Studies Review Annual, Volume 5. Beverly Hills: SagePublications.

Heckman, J. J. (1998). The effects of government policies onhuman capital investment, unemployment and earnings inequality.In Third Public GAAC Symposium: Labor Markets in the USAand Germany, Volume 5. Bonn, Germany: German-AmericanAcademic Council Foundation.

Heckman, J. J., H. Ichimura, J. Smith, and P. E. Todd (1998,September). Characterizing selection bias using experimentaldata. Econometrica 66(5), 1017–1098.

Heckman, J. J., H. Ichimura, and P. E. Todd (1997). How detailsmake a difference: Semiparametric estimation of the partiallylinear regression model. Unpublished manuscript, University ofChicago, Department of Economics.

Heckman, J. J., H. Ichimura, and P. E. Todd (1998, April).Matching as an econometric evaluation estimator. Review ofEconomic Studies 65(2), 261–294.

Heckman, J. J., R. J. LaLonde, and J. A. Smith (1999). Theeconomics and econometrics of active labor market programs. InO. C. Ashenfelter and D. Card (Eds.), Handbook of LaborEconomics, Volume 3A, Chapter 31, pp. 1865–2097. New York:North-Holland.

Heckman, J. J. and S. Navarro (2004, February). Using matching,instrumental variables, and control functions to estimate economicchoice models. Review of Economics and Statistics 86(1), 30–57.

Heckman, J. J. and R. Robb (1985). Alternative methods forevaluating the impact of interventions. In J. J. Heckman andB. S. Singer (Eds.), Longitudinal Analysis of Labor Market Data,Volume 10, pp. 156–245. New York: Cambridge University Press.

Heckman, J. J. and R. Robb (1986a). Alternative methods forsolving the problem of selection bias in evaluating the impact oftreatments on outcomes. In H. Wainer (Ed.), Drawing Inferencesfrom Self-Selected Samples, pp. 63–107. New York:Springer-Verlag. Reprinted in 2000, Mahwah, NJ: LawrenceErlbaum Associates.

Heckman, J. J. and R. Robb (1986b). Postscript: A rejoinder toTukey. In H. Wainer (Ed.), Drawing Inferences from Self-SelectedSamples, pp. 111–114. New York: Springer-Verlag. Reprinted in2000, Mahwah, NJ: Lawrence Erlbaum Associates.

Heckman, J. J. and E. J. Vytlacil (2005, May). Structuralequations, treatment effects and econometric policy evaluation.Econometrica 73(3), 669–738.

Hirano, K., G. W. Imbens, and G. Ridder (2003, July). Efficientestimation of average treatment effects using the estimatedpropensity score. Econometrica 71(4), 1161–1189.

Matzkin, R. L. (2007). Nonparametric identification. In J. J.Heckman and E. E. Leamer (Eds.), Handbook of Econometrics,Volume 6B. Amsterdam: Elsevier.

Powell, J. L. (1994). Estimation of semiparametric models. InR. Engle and D. McFadden (Eds.), Handbook of Econometrics,Volume 4, pp. 2443–2521. Amsterdam: Elsevier.

Robins, J. M. (1997). Causal inference from complex longitudinaldata. In M. Berkane (Ed.), Latent Variable Modeling andApplications to Causality. Lecture Notes in Statistics, pp. 69–117.New York: Springer-Verlag.

Rosenbaum, P. R. (1995). Observational Studies. New York:Springer-Verlag.

Rosenbaum, P. R. and D. B. Rubin (1983, April). The central roleof the propensity score in observational studies for causal effects.Biometrika 70(1), 41–55.

Roy, A. (1951, June). Some thoughts on the distribution ofearnings. Oxford Economic Papers 3(2), 135–146.

Todd, P. E. (1999, October). A practical guide to implementingmatching estimators. Unpublished manuscript, University ofPennsylvania, Department of Economics. Prepared for the IADBmeeting in Santiago, Chile.

Todd, P. E. (2007). Evaluating social programs with endogenousprogram placement and selection of the treated. In Handbook ofDevelopment Economics. Amsterdam: Elsevier. Forthcoming.

Todd, P. E. (2008). Matching estimators. In S. Durlauf and L. E.Blume (Eds.), The New Palgrave Dictionary of Economics. NewYork: Palgrave Macmillan. Forthcoming.

Vijverberg, W. P. M. (1993, May-June). Measuring the unidentifiedparameter of the extended roy model of selectivity. Journal ofEconometrics 57(1–3), 69–89.

Willis, R. J. and S. Rosen (1979, October). Education andself-selection. Journal of Political Economy 87(5, Part 2),S7–S36.

163 / 163