cesis seminar series feb 2014 pardis nabavi
DESCRIPTION
Estimation of multivalued treatment effects under conditional independence Matias D. Cattaneo, David M. Drukker, Ashley D. Holland. CESIS Seminar series Feb 2014 Pardis Nabavi. Multivalued treatment effects. Estimating causal effects of multivalued treatments under ignorability - PowerPoint PPT PresentationTRANSCRIPT
1
Estimation of multivalued treatment effectsunder conditional independence
Matias D. Cattaneo, David M. Drukker, Ashley D. Holland
CESIS Seminar seriesFeb 2014
Pardis Nabavi
2
Multivalued treatment effects• Estimating causal effects of
multivalued treatments under ignorability
possible treatment levels j = 0, 1, ..., J yi is the observed outcome variable wi denotes the treatment level di(j) = 1(wi = j), xi is a kx × 1 vector of covariates
3
Ignorability Assumption• (Selection-on-observables) y(j) ⊥⊥ d(j)|x.
• : implies that the distribution of each potential outcome y(j) is independent of the random treatment d(j), conditional on the covariates x.
• (No-empty-cell) 0 < pmin < pj(x) with pj(x) = P[w = j|x].
• : says that for every possible x in the population, there is a strictly positive probability that someone with that covariate pattern could be assigned to each treatment level.
4
Command• Implements two semiparametric estimators for
multivalued treatment effects discussed in Cattaneo (2010)
• Poparms (short for potential-outcomes parameters)
• The estimators implemented in poparms use a multinomial-logit model (MLM) to predict the probability of each treatment level as function of covariates.
5
Two semiparametric estimators• inverse-probability weighted (IPW) • efficient-influence-function (EIF)
• selection-on-observables assumption to identify and estimate the parameters of the potential-outcomes distributions
• after controlling for observed covariates, the potential-outcome distributions are independent of the treatment level administered, and therefore rules out that some unobservable factor correlated with treatment assignment affects the potential-outcome distributions
6
propensity score
• The only unknown functions for the IPW estimators are the conditional probability functions pj(x), j = 0, 1, . . . , J, forming the generalized propensity score p(x), which can be estimated parametrically or nonparametrically.
7
Some features of the implemented procedures• Under standard regularity conditions, the IPW and
EIF estimators are consistent, asymptotically normal and semiparametric efficient when nonparametric estimators are employed to approximate the unknown functions introduced above
• Because we construct the joint VCE, we can conduct joint inference on the mean and quantile of the potential-outcome distributions, and hence we are also able to conduct valid inference on many other treatment effect parameters of interest.
• Both the IPW and EIF estimators can be used to construct simple inference procedures for joint means and quantiles of the potential-outcome distributions, and combinations thereof.
8
Some features of the implemented procedures• It’s recommended to employ the EIF estimator in
practice because it enjoys the so-called double-robust property when viewed from a (flexible) parametric implementation point of view
• While the IPW estimator may be preferred over the EIF because of its simplicity, it is important to note that the VCE for the IPW estimator requires implementing all the ingredients of the EIF estimator. Thus, the IPW estimator is simpler only as a point estimator
• In poparms, the EIF estimator is the default.
9
Syntax• poparms (treatvar gpsvars) (depvar cvars) [If ][in]
[,quantiles(numlist) vce(vcemethod [, vceoptions]) ipw]
• treatvar is categorical variable indicating treatment
• gpsvars are the covariates in the equations for the generalized propensity score
• depvar is the outcome variable• cvars are the covariates on which conditional
moments are calculated in the efficient influence-function estimator.
• gpsvars and cvars may contain factor variables
10
bfit• bfit subcmd sorts a set of fitted candidate
regression models by an information criterion, puts the best-fitting model in ereturn, and displays a table showing the ranking of the models.
• The Bayesian information criterion is the default and the Akaike information criterion may optionally be specified.
11
A Numerical Example• estimate the population effects of giving harder
tests to students on student performance
a class with normal tests (w=0), a class whose tests included some hard questions
(w=1), or a class whose tests included even more hard
questions (w=2)
• Control variables Parental status = pindex Student’s enviroment = eindex
12
Select a model for the predicted probabilities• From a theoretical perspective, it is assumed that
a sufficient number of terms in a polynomial of the covariates are included in the MLM.
• The practical implication of this approach is that we must find an MLM for treatment probabilities before using poparms. We use our bfit command to automate this procedure.
13
subcommand with the bfit• we specify the logit subcommand with the bfit
command to cause bfit to search among candidate MLMs. The covariates pindex and eindex follow the outcome variable w. The option corder(3) specifies that bfit should fit a sequence of models that builds up to a fully interacted, third-order polynomial in the 2 covariates.
• We specify the option base(0) so that w=0 will be the base category for the the candidate MLMs. Finally, we specify option aic to select the model that minimizes the AIC instead of using the default Bayesian information criterion (BIC).
14
Command• bfit logit w pindex eindex, corder(3) base(0)
sort(aic)• Default corder is 2.
15
16
Control predicted probabilities • Before we proceed, we need to check that
all of the predicted probabilities are sufficiently greater than 0 and less than 1.
• For each treatment level j, an overlap plot depicts the estimated density of the predicted probabilities for treatment level j conditional on each possible treatment level.
17
18
Do the small minimums for phat0 likely indicate a problem?
• conditional densities of the predicted probability
19
No evidence that there is any mass of observations with predictedprobabilities close to either 0 or 1
20
Similar for for treatment levels 1 and 2
21
No mass too close to 0 or 1!
22
Same here!
23
Select a model for the conditional means
• specify the regress subcommand• bfit regress spmeasure pindex eindex, corder(3)
24
Check the results
25
Selecting the model• bfit selected a fully interacted second-order
polynomial. As the output presents no indication of any problems, we accept this model as our model for the conditional mean.
• Note that, for simplicity, we employ the same flexible specification for both regression functions.
26
poparms• poparms takes 2 equations:
model for the predicted probabilities
model for the conditional expectations
27
Output:
28
See the parameter names
29
Average treatment effects
30
Margin
31
Margin Plot
32
Quantile effect
• Or joint mean and quantile effect
33
Quantile effects
34
Thanks for listening to me!
More information:Cattaneo, M.D., Drukker,D.M. & Holland, A.D. (2013), “Estimation of multivalued treatment effects under
conditional independence”, Stata Journal