reliable decision support using counterfactual models

74
Suchi Saria Assistant Professor Computer Science, Applied Math & Stats and Health Policy Institute for Computational Medicine Reliable Decision Support using Counterfactual Models w/ Peter Schulam, PhD candidate

Upload: others

Post on 21-Oct-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reliable Decision Support using Counterfactual Models

Suchi Saria Assistant Professor

Computer Science, Applied Math & Stats and Health Policy

Institute for Computational Medicine

Reliable Decision Support using Counterfactual Models

w/ Peter Schulam, PhD candidate

Page 2: Reliable Decision Support using Counterfactual Models

Example: Customer Churn

P

Cancels Account |

!

Page 3: Reliable Decision Support using Counterfactual Models

Example: Customer Churn

,

!

,

!

,

!

,

!

SupervisedLearning

Page 4: Reliable Decision Support using Counterfactual Models

Example: Customer Churn

,

!

,

!

,

!

,

!

SupervisedLearning

Supervised ML models can be biased for decision-making problems!

Page 5: Reliable Decision Support using Counterfactual Models

,

! ,

!

,

! ,

!

Ad emails, discounts, etc.

Ad emails, discounts, etc.

Past actions determined by some policy.

Why?

Page 6: Reliable Decision Support using Counterfactual Models

,

! ,

!

,

! ,

!

Ad emails, discounts, etc.

Ad emails, discounts, etc.

Actions determined by a policy based on your learned model

Why?

Page 7: Reliable Decision Support using Counterfactual Models

P

Cancels Account |

!

P

Cancels Account |

!

⇡train,

⇡test(P̂ ),

6=

Why?

Supervised ML leads to models that are unstable to shifts in the policy between the train and test

Page 8: Reliable Decision Support using Counterfactual Models

Example: Risk MonitoringAdverse

Event Onset

Is the patient at risk of a septic shock?

Page 9: Reliable Decision Support using Counterfactual Models

• Rise in Temperature and Rise in WBC are indicators of sepsis and death

• But, doctors in H1 aggressively treat patients with high temperature

• As doctors treat treat more aggressively, supervised learning model learns high temperature is associated with low risk.

Dyagilev and Saria, Machine Learning 2015

Page 10: Reliable Decision Support using Counterfactual Models

Increasing discrepancy in physician prescription behavior in train vs. test environment

Treat based on temp

Treat based on WBC

Dyagilev and Saria, Machine Learning 2015

Predictive model trained using classical supervised ML createsunsafe scenarios where sick patients are overlooked.

Page 11: Reliable Decision Support using Counterfactual Models

• Clone the customer; give a 10% and 20% discount code to each clone

• Choose the outcome that has the better outcome

{ }Y (d10) Y (d20),

Outcome under 10% discount.

Run an experiment: observe outcome under diff scenarios

Page 12: Reliable Decision Support using Counterfactual Models

{ }Y (d10) Y (d20),

Outcome under 20% discount.

Run an experiment: observe outcome under diff scenarios

• Clone the customer; give a 10% and 20% discount code to each clone

• Choose the outcome that has the better outcome

Page 13: Reliable Decision Support using Counterfactual Models

• Factual: outcome observed in the datavs.

• Counterfactual: outcome is unobserved

{ }Y (d10) Y (d20),

Can we learn models of these outcomes from observational data?

Page 14: Reliable Decision Support using Counterfactual Models

Potential Outcomes

{Y (a) : a 2 A}

Set of actionsRandom variable

Action

Potential outcomes model the observed outcome under each possible action (or intervention)

Rubin, 1974 Neyman et al., 1923 Rubin, 2005

Page 15: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 16: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 17: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 18: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 19: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 20: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 21: Reliable Decision Support using Counterfactual Models

Sequential Decisions in Continuous-Time

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Page 22: Reliable Decision Support using Counterfactual Models

Counterfactual GP

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

?

Page 23: Reliable Decision Support using Counterfactual Models

Counterfactual GP

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

E[Y ( ) | H = h]

Page 24: Reliable Decision Support using Counterfactual Models

Counterfactual GP

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

E[Y ( ) | H = h]

E[Y ( ) | H = h]

Page 25: Reliable Decision Support using Counterfactual Models

Counterfactual GP

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

E[Y ( ) | H = h]

E[Y ( ) | H = h]

E[Y ( ) | H = h]

Page 26: Reliable Decision Support using Counterfactual Models

• Counterfactual models: See Schulam and Saria, NIPS 2017 for discussion of related work.

Related Work

Dudik et al., 2011 Paduraru et al. 2013Jiang and Li, 2016

• Off-policy evaluation: Re-weighting to evaluate reward for a policy when learning from offline data.

e.g.

Brodersen et al., 2015 ads; single interventionBottou et al., 2013

Taubman et al.,2009 epidemiology; multiple sequential interventions

Xu, Xu, Saria, 2016 sparse, irregularly sampled longitudinal data; functional outcomesLok et al., 2008

Schulam Saria, 2017

Page 27: Reliable Decision Support using Counterfactual Models

Critical Assumptions• To learn the potential outcome models, we will use three

important assumptions:

• (1) Consistency

• Links observed outcomes to potential outcomes

• (2) Treatment Positivity

• Ensures that we can learn potential outcome models

• (3) No unmeasured confounders (NUC)

• Ensures that we do not learn biased modelsRubin, 1974 Neyman et al., 1923 Rubin, 2005

Page 28: Reliable Decision Support using Counterfactual Models

(1) Consistency• Consider a dataset containing observed outcomes,

observed treatments, and covariates:

• E.g.: blood pressure, exercise, BMI

• Consistency allows us to replace the observed response with the potential outcome of the observed treatment

• Under consistency our dataset satisfies

{yi, ai,xi}ni=1

Y , Y (a) | A = a

{yi, ai,xi}ni=1 , {yi(ai), ai,xi}ni=1

Page 29: Reliable Decision Support using Counterfactual Models

(2) Positivity• When working with observational data, for any set of

covariates we need to assume a non-zero probability of seeing each treatment

• Otherwise, in general, cannot learn a conditional model of the potential outcomes given those covariates

• Formally, we assume that

x

PObs(A = a | X = x) > 0 8a 2 A, 8x 2 X

Page 30: Reliable Decision Support using Counterfactual Models

(3) No Unmeasured Confounders (NUC)• Formally, NUC is an statistical independence assertion:

Y (a) ? A | X = x : 8a 2 A, 8x 2 X

Page 31: Reliable Decision Support using Counterfactual Models

(3) No Unmeasured Confounders (NUC)• Formally, NUC is an statistical independence assertion:

Y (a) ? A | X = x : 8a 2 A, 8x 2 X

xBMI yBP

Exerc

xBMI yBP

Exerc

xBMI yBP

Exerc

Page 32: Reliable Decision Support using Counterfactual Models

Learning Potential Outcome Models• Assumptions allow estimation of potential outcomes from

(observational) data:

(A3)(A1)

P(Y (a) | X = x) = P(Y (a) | X = x, A = a)

= P(Y | X = x, A = a)

Estimation requires a statistical model for estimating conditionals

• To simulate data from a new policy, we need to learn the potential outcome models

• If we have an observational dataset where assumptions 1-3 hold, then this is possible!

UAI Tutorial:Saria and Soleimani, 2017

Page 33: Reliable Decision Support using Counterfactual Models

Observational Traces

Timing between measurements is

irregular and random

Creatinine is a test used to measure kidney function.

Page 34: Reliable Decision Support using Counterfactual Models

Observational Traces

And so are times between treatments

Page 35: Reliable Decision Support using Counterfactual Models

Challenges w/ Observational Traces

In the discrete-time setting, we did not treat the timing of

events as random

Page 36: Reliable Decision Support using Counterfactual Models

Counterfactual GP• Collection of Gaussian processes

n

{Yt(a) : t 2 [0, ⌧ ]} : a 2 Co

Fixed time period Set of finite sequences of

actions

Page 37: Reliable Decision Support using Counterfactual Models

Learning from Observational Traces

●●

●●

●● ●●

●●●●

●●

●●●

●● ●●● ●

● ● ●

●●

●●

●●

● ●●

●●

●●

●● ●

tss pfvc pdlco rvsp

0

25

50

75

0 5 10 15 0 5 10 15 0 5 10 15 0 5 10 15Years Since Diagnosis

Mar

ker V

alue Medication

PrednisoneMethotrexCyclophosphamide Cytoxan

Page 38: Reliable Decision Support using Counterfactual Models

Learning from Observational Traces

●●

●●

●● ●●

●●●●

●●

●●●

●● ●●● ●

● ● ●

●●

●●

●●

● ●●

●●

●●

●● ●

tss pfvc pdlco rvsp

0

25

50

75

0 5 10 15 0 5 10 15 0 5 10 15 0 5 10 15Years Since Diagnosis

Mar

ker V

alue Medication

PrednisoneMethotrexCyclophosphamide Cytoxan

Treatments administered according to unknown policy

(i.e. not an RCT)

Page 39: Reliable Decision Support using Counterfactual Models

Learning from Observational Traces

●●

●●

●● ●●

●●●●

●●

●●●

●● ●●● ●

● ● ●

●●

●●

●●

● ●●

●●

●●

●● ●

tss pfvc pdlco rvsp

0

25

50

75

0 5 10 15 0 5 10 15 0 5 10 15 0 5 10 15Years Since Diagnosis

Mar

ker V

alue Medication

PrednisoneMethotrexCyclophosphamide Cytoxan

Learning is especially difficult because there is time-

dependent feedback between actions and outcomes

Robins 1986

Page 40: Reliable Decision Support using Counterfactual Models

Learning Models from Observational Traces• Road map:

• (1) Establish assumptions that connect probabilistic of observational traces to target counterfactual model

• (2) Posit probabilistic model of observational traces

• (3) Derive maximum likelihood estimator

P ({Ys[a] : s > t} | Ht)

Schulam and Saria, NIPS 2017

Page 41: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• We use a marked point process (MPP):

• Points model the event times: measurements or actions

• Mark models the type of event

{(Ti, Xi)}1i=1

X = (R [ {?})⇥ (C [ {?})⇥ {0, 1}⇥ {0, 1}

Schulam and Saria, NIPS 2017

Page 42: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• We use a marked point process (MPP):

• Points model the event times: measurements or actions

• Mark models the type of event

{(Ti, Xi)}1i=1

X = (R [ {?})⇥ (C [ {?})⇥ {0, 1}⇥ {0, 1}zy

Did we measure an outcome?

Page 43: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• We use a marked point process (MPP):

• Points model the event times: measurements or actions

• Mark models the type of event

{(Ti, Xi)}1i=1

X = (R [ {?})⇥ (C [ {?})⇥ {0, 1}⇥ {0, 1}zy

Did we take an action?

za

Page 44: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• We use a marked point process (MPP):

• Points model the event times: measurements or actions

• Mark models the type of event

{(Ti, Xi)}1i=1

X = (R [ {?})⇥ (C [ {?})⇥ {0, 1}⇥ {0, 1}zy

What is the value of the outcome?

zay

Page 45: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• We use a marked point process (MPP):

• Points model the event times: measurements or actions

• Mark models the type of event

{(Ti, Xi)}1i=1

X = (R [ {?})⇥ (C [ {?})⇥ {0, 1}⇥ {0, 1}zy

What action did we take?

zay a

Page 46: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• Parameterize MPP using hazard and mark density:

Schulam and Saria, NIPS 2017

Page 47: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• Parameterize MPP using hazard and mark density:

Probability of event happening at this time

Probability of mark given event time

Schulam and Saria, NIPS 2017

Page 48: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• Parameterize MPP using hazard and mark density:

Probability of event happening at this time

Probability of mark given event time

Star denotes dependence on

history

Schulam and Saria, NIPS 2017

Page 49: Reliable Decision Support using Counterfactual Models

Modeling Observational Traces

• Parameterize MPP using hazard and mark density:

• Estimate MPP by maximizing probability of traces

`(✓) =nX

j=1

log p⇤✓(yj | tj , zyj) +nX

j=1

log �⇤✓(t)p

⇤✓(aj , zyj , zaj | tj , yj)�

Z ⌧

0�⇤✓(s)ds

Model the conditional probability of the outcome using a GP

Schulam and Saria, NIPS 2017

Page 50: Reliable Decision Support using Counterfactual Models

Recovering the CGP• When does the MPP GP recover the CGP?

• In addition to Consistency, we define two assumptions

Schulam and Saria, NIPS 2017

Page 51: Reliable Decision Support using Counterfactual Models

Recovering the CGP• When does the MPP GP recover the CGP?

• In addition to Consistency, we define two assumptions

• Continuous-time NUC

• Analogue of NUC for MPP

Schulam and Saria, NIPS 2017

Page 52: Reliable Decision Support using Counterfactual Models

Recovering the CGP• When does the MPP GP recover the CGP?

• In addition to Consistency, we define two assumptions

• Continuous-time NUC

• Analogue of NUC for MPP

• Non-informative measurement times

• Measurement and action times are conditionally independent of potential outcomes

Schulam and Saria, NIPS 2017

Page 53: Reliable Decision Support using Counterfactual Models

Reliable Decisions with CGPs

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

Should we treat?

Page 54: Reliable Decision Support using Counterfactual Models

Classical Supervised Model

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

P ({Ys : s > t} | Ht)

History Ht

Page 55: Reliable Decision Support using Counterfactual Models

Counterfactual GP

●●●

●●

40

60

80

100

120

0 5 10 15Years Since First Symptom

PFVC

Lung

Cap

acity

History Ht

P ({Ys(a) : s > t} | Ht)

Page 56: Reliable Decision Support using Counterfactual Models

Simulated Data• Simulate observational traces from multiple regimes

• Traces are treated by policies unknown to learners

• In regimes A and B, policies satisfy our assumptions

• In regime C, policy violates our assumptions

• Simulate three training sets (regimes A, B, and C)

• Simulate one common test set (regime A)

Page 57: Reliable Decision Support using Counterfactual Models

Results• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 58: Reliable Decision Support using Counterfactual Models

Results• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

CGP risk scores are stable across regime A and B training data

Page 59: Reliable Decision Support using Counterfactual Models

Results

Baseline GP scores change

• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 60: Reliable Decision Support using Counterfactual Models

Results

CGP relative risk across patients is also stable across training data A and B

• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 61: Reliable Decision Support using Counterfactual Models

Results

Baseline GP’s relative risk changes

• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 62: Reliable Decision Support using Counterfactual Models

Results

CGP AUC is constant across regimes A and B

• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 63: Reliable Decision Support using Counterfactual Models

Results

Baseline GP’s AUC is unstable

• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Normalize predictions to [0, 1]

Page 64: Reliable Decision Support using Counterfactual Models

Simulated Data• Simulate observational traces from three regimes

• Traces are treated by policies unknown to learners

• In regimes A and B, policies satisfy our assumptions

• In regime C, policy violates our assumptions

• Simulate three training sets (regimes A, B, and C)

• Simulate one common test set (regime A)

Page 65: Reliable Decision Support using Counterfactual Models

Results• Risk scores:

• Use Baseline and CGP to predict final severity marker

• Negate predictions and normalize to [0, 1]

CGP risk scores are unstable if the policy in the training data violates our assumptions

Page 66: Reliable Decision Support using Counterfactual Models

Medical Decision-Support using CGPs

• Dialysis is expensive, but necessary when kidneys fail

• Important questions for decision-making:

• (1) Will this individual be okay if I remove dialysis?

• (2) Will this individual benefit from dialysis?

• CGP can help to answer these questions

Page 67: Reliable Decision Support using Counterfactual Models

Medical Decision-SupportCounterfactual (no treatment)

Factual

Page 68: Reliable Decision Support using Counterfactual Models

Medical Decision-Support

Counterfactual (CVVHD)

Page 69: Reliable Decision Support using Counterfactual Models

A Real ICU Patient with AKI

1. Irregularly sampled2. Unaligned signals3. Cross correlations

0 100 200 300 400 500Time (hours)

20

40

60

80

BUN

0 100 200 300 400 500Time (hours)

3.5

4.0

4.5

5.0

5.5

Potassium

0 100 200 300 400 500Time (hours)

60

80

100

120

HR

0 100 200 300 400 500Time (hours)

1

2

3

4

Creatinine

0 100 200 300 400 500Time (hours)

7

8

9

10

11Calcium

0 100 200 300 400 500Time (hours)

80

100

120

140

160

Blood Pressure

Page 70: Reliable Decision Support using Counterfactual Models

Continuous-time actions, continuous-time multi-variate trajectories

Input x(t) convolved with impulse-response h(t) to generate response ⇢(t)

Input⇢(t) = x(t) ⇤ h(t)

Response

0.0

0.5

1.0

�1

0

1

2

�0.4

0.0

0.4

0.8

1.2

0 5 10 15 20

0.0

0.5

1.0

0 1 2 3 4 5�1

0

1

2

0 5 10 15 20�0.5

0.0

0.5

1.0

1.52nd order

3rd order

0.0

0.5

1.0

�1

0

1

2

�0.5

0.0

0.5

1.0

0 5 10 15 20

0.0

0.5

1.0

0 1 2 3 4 5�1

0

1

2

0 5 10 15 20

�0.5

0.0

0.5

1.0

1.50.0

0.5

1.0

�1

0

1

2

�0.5

0.0

0.5

1.0

0 5 10 15 20

0.0

0.5

1.0

0 1 2 3 4 5�1

0

1

2

0 5 10 15 20

�0.5

0.0

0.5

1.0

1.5

complex roots

2nd order

x(t) h(t) ⇢(t)

⇢(t) = x(t) ⇤ h(t) =Z 1

�1x(⌧)h(t� ⌧)d⌧

h(t) =↵�

� � ↵(e�↵t � e��t)1(t � 0)Example:

To allow sharing across signals: gd(t) = ⇢0(t)| {z }shared

+(1� ) ⇢d(t)| {z }signal-specific 2 [0, 1]

Similar ideas in pharmacokinetics:Cutler, 1978

Shargel et al. 2005

Rich et al., 2016

Soleimani, Subbaswamy, Saria, UAI 2017

Page 71: Reliable Decision Support using Counterfactual Models

Quantitative Results

Better relative performance at longer prediction horizons

For horizon 7: on test regions with treatment, 15% than BART and 8% better than LSTM

1 2 3 4 5 6 7Prediction Horizon (days)

0.6

0.7

0.8

0.9

1.0N

RM

SE

Proposed model

RNN

BART

Soleimani, Subbaswamy, Saria, UAI 2017

Proposed ModelLSTMBART

Page 72: Reliable Decision Support using Counterfactual Models

Conclusions• Use counterfactual objectives for training predictive models

• Assumptions are critical for counterfactual models

• But they are not statistically testable

• Can we develop formal sensitivity analyses?

• Are the other structural assumptions where CGP’s can be learned?

• Counterfactual reasoning is orthogonal to other efforts in interpretability and accountability

• Counterfactual objective tells us what to fit

• Interpretable models: how to parameterize for transparency

Page 73: Reliable Decision Support using Counterfactual Models

Key References• Potential Outcomes

• Neyman 1923 & Neyman et al. 1990 (English)

• Rubin 2005

• Treatment-Confounder Feedback and G-computation

• Robins 1986

• Robins and Hernán 2009

• Counterfactual Reasoning and Reliable Decision Support

• Schulam and Saria, NIPS 2017

• Soleimani, Subbaswamy, and Saria, UAI 2017

• Xu, Xu and Saria, JMLR 2017

• Dyagilev and Saria, Maching Learning Journal 2017

• Saria and Soleimani, UAI Tutorial 2017

• Saria and Schulam, NIPS Tutorial 2016

Dyagilev and Saria, Machine Learning 2015

Soleimani, Subbaswamy, Saria, UAI 2017

Schulam and Saria, NIPS 2017

Xu, Xu, Saria, MLHC 2016 (JMLR-to appear)

Robins 1986

Rubin, 1974

Neyman et al., 1923

Rubin, 2005

Soleimani and Saria, UAI 2017

Robins and Hernan 2009

Page 74: Reliable Decision Support using Counterfactual Models

Thank [email protected]

www.suchisaria.com@suchisaria

All references throughout the slides are active links and clickable.For errors and edits, please contact: [email protected] Thanks!