binary instrumental variable model for causal inference...instrumental variables are not a panacea...

21
Binary Instrumental Variable Model For Causal Inference Jimmy Nguyen Faculty Advisor: Eric Tchetgen Tchetgen Postdoctorate Mentor: Linbo Wang Department of Biostatistics Harvard T.H. Chan School of Public Health Pipelines Into Biostatistics 2016 Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 1 / 19

Upload: others

Post on 31-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Binary Instrumental Variable Model For Causal Inference

Jimmy NguyenFaculty Advisor: Eric Tchetgen Tchetgen

Postdoctorate Mentor: Linbo Wang

Department of BiostatisticsHarvard T.H. Chan School of Public Health

Pipelines Into Biostatistics 2016

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 1 / 19

Page 2: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Outline

1 Background

2 Simulation Study & Application

3 Discussion

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 2 / 19

Page 3: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Introduction

In public health studies, we often want to determine a causal linkbetween a treatment (A) and effect (Y)

What effects does introducing vaccine regimen (A) to a populationhave on average disease status (Y)?How does moving from a high-poverty to low-poverty neighborhood(A) affect psychological outcomes in children (Y)?What effects does post-secondary education (A) have on expectedearnings (Y)?

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 3 / 19

Page 4: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Study Design Challenges

Double-blind randomized trials set the gold standard, but aren’talways possible for ethical or practical reasons

Smoking cessation studies

Observational studies are often done in lieu of randomized trials

Challenge with observational studies: unmeasured confounders

A

Treatment

U

Unmeasured Confounders

Y

Outcome

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 4 / 19

Page 5: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Instrumental Variables

Approach: Study the effect of A on Y indirectly through Z

Z

IV

A

Treatment

U

Unmeasured Confounders

Y

Outcome

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 5 / 19

Page 6: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Are IVs a dream come true?

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 6 / 19

Page 7: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Spoilers: IVs are not a dream come true

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 7 / 19

Page 8: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Average Causal Effect

Average causal effect (ACE) = E[Ya=1 − Ya=0]

Ya: a person’s outcome if, possibly contrary to fact, that he or shewas to receive treatment A

Idea: ACE is the population average causal effect if one were to forceeveryone to take the active treatment a = 1 vs. the control treatmenta = 0

Z

IV

A

Treatment

U

Unmeasured Confounders

Y

Outcome

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 8 / 19

Page 9: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Example: Randomized Vaccine Trial

Z

Assignment

A

Vaccine Regimen

U

Unmeasured Confounders

Y

Disease Status

ACE is average difference in disease incidence if everybody took thevaccine regimen (Ya=1) vs. nobody taking the vaccine regimen (Ya=0)

Z randomized, but A is not

So, U confounds E (Y |A = 1)− E (Y |A = 0)

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 9 / 19

Page 10: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Key Assumptions

Z

IV

A

Treatment

UUnmeasuredConfounders

YOutcome

X

Covariates

Key assumption: Z affects Y only through A

We assume this DAG holds for binary Z , A, and Y .To measure average causal effects, one of two must be true:

E[Ya=1 − Ya=0|U,X ] does not depend on UE[Az=1 − Az=0|U,X ] does not depend on U

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 10 / 19

Page 11: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Background

Estimating Equation

If our assumptions hold, we can estimate

E[Y1 − Y0] = E

[E[Y |Z = 1,X ]− E[Y |Z = 0,X ]

E[A|Z = 1,X ]− E[A|Z = 0,X ]

]∈ (−1, 1)

When Y is binary, regressing each component individually andplugging in is impractical

All 4 regression models must be correctly specifiedEnd result may not lie between (-1,1)

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 11 / 19

Page 12: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Our semi-parametric approach

We propose to solve for βa = E [Y1 − Y0] in three steps:

1 Fit the logistic model logit f̂ (Z = 1|X ) = γTX by regressing Zagainst X

2 Fit the model E [A|Z = 1,X ]− E [A|Z = 0,X ] = tanh(αTX ) bysolving for α in the estimating equation∑

i

(Xi )(W − tanh(αTX )) = 0

where W = A(−1)1−Z/f̂ (Z |X )

3 Solve for the target estimator E [Y |Z = 1,X ]− E [Y |Z = 0,X ] = βaby fitting the linear regression∑

i

(1

Zi

)(R − β0 − βaZi )f̂ (Z |Xi )

−1 = 0

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 12 / 19

Page 13: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Simulation Conditions & Data Generation

We used an intercept-only model for α0, i.e.E [A|Z = 1,X ]− E [A|Z = 0,X ] = tanh(α0). This equivalent toassuming no interaction between A and covariates X .

Data generating procedure:

X ∼ Bernoulli(p0), e.g. p0 = 0.5Z |X ∼ Bernoulli(expit(γTX ))Pr(A|Z , x) = Zi ∗ tanh(α0)A|Z , x ∼ Bernoulli(Pr(A|Z , x))Y ∼ µ1A + µ2 ∗ (A− Pr(A|Z , x)) + µT

3∗X + ε, where ε ∼ N (0, 1)

Although we simulate continuous outcome Y here, our model will inprinciple work for binary Y as well

The target quantity is set to µ1 = βa = 1.62

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 13 / 19

Page 14: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Simulation Results: true βa = 1.62, intercept-only model

Figure:Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 14 / 19

Page 15: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Card(1993) Education Dataset

Z

Lives Near4yr College

A

Post-HSEducation

UUnmeasuredConfounders

Y

Salary > $537.50(median)

X

Experience, Race,Region of Residence

Data are from the National Longitudinal Survey (NLS) of YoungMen, with 3010 participants.

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 15 / 19

Page 16: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Results: Intercept-only model, Z ,A,Y binary

Estimates with 95% confidence intervals:

α0 = 0.08(0.0386, 0.124)β̂a = 0.25(−0.3772, 0.8772)

Interpretation: the marginal change in the probability of earning asalary greater than $537.50 was 0.25 when comparing an interventionthat forces the population to have post high school education, to onethat forces the population to have no more than high school education

However: very large confidence interval containing 0, so this is not asignificant result

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 16 / 19

Page 17: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Simulation Study & Application

Results: Intercept-only model, Z ,A binary, Y continuous

Method βa (95% CI)

IV 190.5173 (-144.175, 525.209)

OLS 68.2869 (46.346, 90.228)

Fit Y as continuous linear model on binary A and covariates X

OLS appears to underestimate the causal effect, while IVs are lessefficient

Possible negative confounding, even though we would expect earningsto rise with post-secondary education

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 17 / 19

Page 18: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Discussion

Discussion

βa = E

[E[Y |Z = 1,X ]− E[Y |Z = 0,X ]

E[A|Z = 1,X ]− E[A|Z = 0,X ]

]∈ (−1, 1)

In binary IV model, some estimates oftanh(α̂0) = E[A|Z = 1,X ]− E[A|Z = 0,X ] were close to 0, causingwide variability in the estimates

When the IV model for the denominator was allowed to depend on X ,we observed some instability in cases where the fitted values wereclose to 0

Next steps:

Devise link to ensure that estimates of the denominator are boundedaway from 0Investigate continuous case further

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 18 / 19

Page 19: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Discussion

Conclusion

Instrumental variables are not a panacea for unmeasured confounding,as they require certain assumptions about the data to properlyestablish causality

When such assumptions are met, instrumental variables can behelpful in inferring causal effects in observational studies whereunmeasured confounders are present

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 19 / 19

Page 20: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Discussion

Acknowledgements

My thanks to

Harvard T.H. Chan School of Public Health

Eric Tchetgen Tchetgen

Rebecca Betensky

Linbo Wang

Jessica Boyle

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 19 / 19

Page 21: Binary Instrumental Variable Model For Causal Inference...Instrumental variables are not a panacea for unmeasured confounding, as they require certain assumptions about the data to

Discussion

Questions?

?

Jimmy Nguyen (HSPH) Binary IV Model for Causal Inference PIB 2016 19 / 19