On the Causes of Effects

Stephen E. Fienberg

Department of Statistics, Machine Learning Department, CyLab, and i-Lab

Carnegie Mellon University

Séminaire de philosophie des mathématiques – Paris Diderot

November 30, 2010

2 Paris 11-30-10

Frachon, I. et al. (2010) PLOS One. 5 (4), e10128.

The Data

Benfluorex Use    Cases    Controls    Totals
Yes                  19           3        22
No                    8          51        59
Totals               27          54        81

Odds Ratio = (19×51)/(3×8) = 40.4

Adjusted Odds Ratio = 17.1 (from logistic regression)
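As a quick check of the arithmetic, here is a minimal Python sketch that computes the crude (unadjusted) odds ratio from the four cells of the table. The function name `odds_ratio` and the variable names are illustrative assumptions, not taken from the study. The adjusted value of 17.1 comes from a logistic regression with covariates that are not shown on the slide, so it is not reproduced here.

```python
def odds_ratio(a, b, c, d):
    """Cross-product (odds) ratio for the 2x2 table [[a, b], [c, d]]."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when an off-diagonal cell is zero")
    return (a * d) / (b * c)

# Rows: benfluorex use (yes, no); columns: (cases, controls).
exposed_cases, exposed_controls = 19, 3
unexposed_cases, unexposed_controls = 8, 51

crude_or = odds_ratio(exposed_cases, exposed_controls,
                      unexposed_cases, unexposed_controls)
print(round(crude_or, 1))  # 40.4  (969 / 24 = 40.375)
```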

Hypothetical Toxic Tort Case

•  A woman with unexplained valvular heart disease sues the manufacturer of Benfluorex, claiming that it caused her illness.

•  Dr. Frachon testifies for plaintiff, based on her study, and claims that the medication causes valvular heart disease.

•  The manufacturer’s expert testifies that their clinical trials failed to suggest this as a side effect.

•  How should the judge rule?

Causes of Effects versus Effects of Causes

•  The judge wants to know the cause of the woman’s heart disease---the cause of the effect.

•  Dr. Frachon mistakenly testified about the scientific question, “Can Benfluorex be shown to cause heart disease?”, as if she had carried out a clinical trial, i.e., about the effects of a cause.

•  But the data come from a retrospective case-control study.

•  What would have happened had the woman not taken Benfluorex?

Statistical Question

•  Is a question about “The Causes of Effects” essentially the same as one about “The Effects of Causes”?

•  If not, how do they differ?

Comparing Causal Questions

•  Dawid contrasts:
    – EoC: I have a headache. Will taking aspirin help?
    – CoE: Was it the aspirin I took 30 minutes ago that caused my headache to disappear?

•  Different from direct versus indirect effects and from general versus specific causation:
    – Does taking aspirin relieve headaches?
    – If I take an aspirin for my migraine headache at this conference today, will I get relief?

J. S. Mill

Induction is mainly a process for finding the causes of effects: and … in the more perfect of the sciences, we ascend, by generalization from particulars, to the tendencies of causes considered singly, and then reason downward from those separate tendencies, to the effect of the same causes when combined.

…as a general rule, the effects of causes are far more accessible to our study than the causes of effects...

Defining Causation Statistically

•  Not simply “if x, then y.”

•  Wikipedia: The belief that events occur in predictable ways and that one event leads to another.

•  The “but for test” in law: ‘But for the defendant’s act, the harm would not have occurred.’ Counterfactuals.

•  Multiple technical definitions.

Definitions From Philosophy

•  (PR) C is a cause of E just in case: P(E | C) > P(E | ~C).

•  (Reich) C_t is a cause of E_t′ if and only if:
    – P(E_t′ | C_t) > P(E_t′ | ~C_t); and
    – There is no further event B_t″, occurring at a time t″ earlier than or simultaneous with t, that screens E_t′ off from C_t.

•  (Cart) C causes E if and only if:
    – P(E | C & B) > P(E | ~C & B) for every background context B (no Yule-Simpson paradox).
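To see how the (PR) and (Cart) conditions can disagree, the following Python sketch checks both on a small table of counts. The counts, the context labels `b1` and `b2`, and the helper `prob_effect` are hypothetical, invented purely for illustration. They are chosen so that C raises the probability of E in aggregate but lowers it within every background context, which is the Yule-Simpson reversal that (Cart) rules out.

```python
from fractions import Fraction

# HYPOTHETICAL counts invented for illustration; they are not from the talk.
# (context B, C present?) -> (units showing effect E, total units)
counts = {
    ("b1", True): (80, 100),
    ("b1", False): (9, 10),
    ("b2", True): (2, 10),
    ("b2", False): (30, 100),
}

def prob_effect(c_present, context=None):
    """P(E | C) or P(E | ~C), optionally also conditioning on one context B."""
    cells = [v for (b, c), v in counts.items()
             if c == c_present and (context is None or b == context)]
    with_effect = sum(e for e, _ in cells)
    total = sum(n for _, n in cells)
    return Fraction(with_effect, total)

contexts = sorted({b for b, _ in counts})

# (PR): P(E | C) > P(E | ~C), ignoring context.
pr_holds = prob_effect(True) > prob_effect(False)

# (Cart): P(E | C & B) > P(E | ~C & B) for every background context B.
cart_holds = all(prob_effect(True, b) > prob_effect(False, b) for b in contexts)

print(pr_holds, cart_holds)  # True False
```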

Assessing the Effects of Causes

•  Rubin/Holland: Average Causal Effect
    – Counterfactuals: We are interested in the effect of the treatment you actually receive and what would have happened had you received the alternative.
    – Treatments, x=1 and x=0, and potential outcomes, Y(1) and Y(0).
    – Y_i(1) - Y_i(0) = the causal effect of x=1 relative to x=0 for unit i

Average Causal Effect

•  ACE = E[Y(1) - Y(0) | x=1] = E[Y(1) | x=1] - E[Y(0) | x=1]

•  But counterfactuals are not observable, so we look at the prima facie ACE:
    – FACE = E[Y(1) | x=1] - E[Y(0) | x=0]
    – We estimate FACE using samples of treated and untreated.
    – FACE = ACE + bias

•  Under randomization, E(bias) = 0!!
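Subtracting the two definitions gives bias = FACE - ACE = E[Y(0) | x=1] - E[Y(0) | x=0]. The Python sketch below illustrates this on a tiny finite population in which both potential outcomes are known for every unit. The six units and the helper names `mean` and `decompose` are hypothetical, made up for illustration. It first evaluates one self-selected assignment, then averages the bias over all equally likely assignments of three units to treatment, a simple stand-in for randomization.

```python
from fractions import Fraction
from itertools import combinations

# HYPOTHETICAL potential outcomes (Y(0), Y(1)) for six units, invented for illustration.
units = [(0, 1), (0, 1), (1, 1), (0, 0), (1, 1), (0, 1)]

def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))

def decompose(treated):
    """Return (ACE, FACE, bias) for a given set of treated unit indices."""
    control = [i for i in range(len(units)) if i not in treated]
    ace = mean(units[i][1] - units[i][0] for i in treated)  # E[Y(1) - Y(0) | x=1]
    face = mean(units[i][1] for i in treated) - mean(units[i][0] for i in control)
    bias = mean(units[i][0] for i in treated) - mean(units[i][0] for i in control)
    return ace, face, bias

# Self-selection: the two units that would recover anyway (Y(0) = 1) take treatment.
ace, face, bias = decompose({2, 4})

# Randomization: average the bias over all ways to treat 3 of the 6 units.
assignments = [set(t) for t in combinations(range(len(units)), 3)]
expected_bias = mean(decompose(t)[2] for t in assignments)

print(ace, face, bias, expected_bias)  # 0 1 1 0
```

In the self-selected case the prima facie effect is 1 even though the true effect on the treated is 0, so the whole of FACE is bias.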

ACE and Statistical Models

•  ACE appears to be universal, i.e. model independent.

•  Expectations are with respect to the distribution of individuals as well as the r.v.’s for the effects.
    – Akin to sampling theory and the Fisher-Kempthorne randomization view of the analysis of experiments.

•  Why shouldn’t we think of causal effects as embedded within statistical models?

ACE vs. Odds Ratio

•  If we replace ACE by
    – E[Y(1) | x=1] / E[Y(0) | x=1]
  or by
    – (E[Y(1) | x=1] E[Y(0) | x=0]) / (E[Y(0) | x=1] E[Y(1) | x=0])
  then we are back to the odds ratio as a measure of causal effect.

This seems more appropriate for the categorical data setting.

The Magic Odds Ratio

•  Crucial Property of Odds Ratio: It is unchanged by rescaling of rows and columns.

•  Validity of analyzing data obtained from a retrospective study as if they were prospective (Farewell, 1979).
    – True only if key response and explanatory variables are binary.
    – Then we are looking at adjusted odds ratios!
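The invariance property is easy to verify numerically. The Python sketch below rescales the rows and columns of the benfluorex case-control table and confirms that the cross-product ratio does not change. The helper names `odds_ratio` and `rescale` and the scaling factors are illustrative assumptions. Multiplying the controls column by 10 mimics sampling ten times as many controls, which is why the ratio of cases to controls chosen by the investigator does not affect the odds ratio.

```python
from fractions import Fraction

def odds_ratio(table):
    """Cross-product ratio of a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return Fraction(a * d, b * c)

def rescale(table, row_factors=(1, 1), col_factors=(1, 1)):
    """Multiply row i by row_factors[i] and column j by col_factors[j]."""
    return [[cell * row_factors[i] * col_factors[j]
             for j, cell in enumerate(row)]
            for i, row in enumerate(table)]

# Rows: benfluorex use (yes, no); columns: (cases, controls).
observed = [[19, 3], [8, 51]]

# Arbitrary illustrative factors.
more_controls = rescale(observed, col_factors=(1, 10))
rows_rescaled = rescale(observed, row_factors=(7, 2))

same = odds_ratio(observed) == odds_ratio(more_controls) == odds_ratio(rows_rescaled)
print(same)  # True
```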

Assessing Causes of Effects

•  Was it the aspirin I took 30 minutes ago that caused my headache to disappear?

•  Recovery rates (from randomized trial): no aspirin 12%; aspirin 30%.
    – Odds Ratio: α = (30×88)/(12×70) = 3.143

•  Potential responses:
    – R1 to aspirin; R0 to no aspirin

•  Probability of Causation (Dawid):
    – PC = Pr(R0=0 | R1=1)

Assessing the Causes of Effects

•  Probability of Causation:
    – PC = Pr(R0=0 | R1=1)

            R0=0    R0=1    Total
R1=0        88-x    x-18       70
R1=1           x    30-x       30
Total         88      12      100

18 ≤ x ≤ 30

PC = x/30, which yields PC ≥ 60%.

•  Could do better if we could “adjust” for a latent covariate (genetics?).
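The bounds on x follow from requiring every cell of the joint table of (R1, R0) to be non-negative. A minimal Python sketch of that reasoning is given below, assuming only the two marginal recovery counts per 100 patients are known. The function name `pc_bounds` and its parameters are illustrative, not part of Dawid's presentation.

```python
from fractions import Fraction

def pc_bounds(recover_treated, recover_untreated, n=100):
    """Bounds on PC = Pr(R0=0 | R1=1) given only the two marginal recovery counts.

    recover_treated   -- number out of n with R1=1 (recover with aspirin)
    recover_untreated -- number out of n with R0=1 (recover without aspirin)
    """
    if not 0 < recover_treated <= n or not 0 <= recover_untreated <= n:
        raise ValueError("counts must lie in [0, n], with recover_treated > 0")
    # x = number with R1=1 and R0=0; every cell of the joint table must be >= 0.
    x_min = max(0, recover_treated - recover_untreated)
    x_max = min(recover_treated, n - recover_untreated)
    return (x_min, x_max,
            Fraction(x_min, recover_treated), Fraction(x_max, recover_treated))

x_min, x_max, pc_low, pc_high = pc_bounds(30, 12)
print(x_min, x_max, float(pc_low), float(pc_high))  # 18 30 0.6 1.0
```

The interval for PC is wide because the trial identifies only the margins of the table, never the joint distribution of R1 and R0 for the same person.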

Eyewitness Testimony

•  Extensive cognitive theory on unreliability; experimental testing in lab and other settings.
    – All in the spirit of effects of causes.

•  In criminal trials, eyewitness testimony may be a crucial element of proof.

•  Experts for the defense invoke the psychological theory and evidence.

•  How does this relate to the case at hand?
    – From general to particular?
    – Causes of effects?

Measuring Discrimination

•  Employees of a major retailer file a “class action” lawsuit against the company for sex discrimination in hiring, promotion, and pay.

•  Plaintiffs’ expert uses company data to run regressions (pay) and logistic regressions (hiring and promotion) and uses the “coefficient for sex” to measure discrimination, “adjusting for” education, etc.

•  Defendant’s expert does something similar but with more explanatory variables.

Discrimination Law

•  To identify the presence or absence of discrimination we typically observe an individual’s gender and a particular outcome (e.g., hiring) and try to determine whether that outcome would have been different had the individual been of a different gender.

•  In other words, to measure discrimination we must answer the truly unobservable counterfactual question: What would have happened to a woman had she been a man?

Statistical Evidence of What?

•  We want to know the cause of the effects:
    – Different rates of hiring, pay, promotion.
    – Is it company policy, educational background, marketplace factors, etc.?

•  Analysis models are “prospective” but data are observational:
    – Unobservable counterfactuals.
    – Do models capture the company processes?
    – Is the pay regression model “reversible”?

Battle of Discrimination Experts

•  Experts battle over which variables belong in the model, and over the granularity of the analysis.
    – e.g., store level, district level, aggregate at company level.

•  Other experts discuss “implicit discrimination” and societal effects!

•  But should they be measuring the probability of causation (PC)? How?

Science Versus Policymaking

•  Social scientists need to accumulate information prospectively, especially via experimentation.
    – This is “getting the science right”!

•  When policymakers are choosing a policy to implement, they look retrospectively.
    – This requires “getting the right science”!
    – Mixing EoC and CoE? We still may prefer experimental over observational evidence.

•  Evaluating an implemented policy, however, involves assessing the causes of effects.

Bayesians v. Frequentists

•  Today’s discussion applies equally to Bayesians and frequentists:
    – It is not how one does the analysis statistically, but which analysis framework one uses.

•  See Rubin (1978) for why Bayesians should randomize to assess the effects of causes.

•  But for causes of effects, a Bayesian can put a distribution over values of x.

Moral of the Story

•  Assessing the effects of causes is not necessarily the same as assessing the causes of effects.

•  Good science, and especially experimental evidence, helps us assess the effects of causes.

•  Assessing the causes of effects, as in judicial decision-making or policy assessment, may require different tools and forms of statistical analysis.

References

•  Blank, R. M., Dabady, M. and Citro, C. F., eds. (2004) Measuring Racial Discrimination. NRC Panel on Methods for Assessing Discrimination. National Academy Press.

•  Dawid, A. P. (2000) Causal inference without counterfactuals (with discussion). J. Amer. Statist. Assoc. 95, 407–448.

•  Dawid, A. P. (2007) Fundamentals of statistical causality. Dept. Stat. Sci., University College London, RR No. 279.

•  Dawid, A. P. Assessing the causes of effects. Undated ms.

•  Dempster, A. P. (1988) Employment discrimination and statistical science (with discussion). Statist. Sci. 3 (2), 149–195.

References II

•  Faigman, D. L. (2010) A preliminary exploration of the problem of reasoning from general scientific data to individualized legal decisionmaking. Brook. L. Rev. 75, 1115-.

•  Farewell, V. (1979) Some results on the estimation of logistic models based on retrospective data. Biometrika, 66 (1), 27–32.

•  Hitchcock, C. R. (2001) A Tale of Two Effects. The Philosophical Review 110, 361–396.

•  Hitchcock, C. R. (2010) Probabilistic causation. rev. The Stanford Encyclopedia of Philosophy. Online.

References III

•  Holland, P. W. (1986) Statistics and causal inference. J. Amer. Statist. Assoc. 81, 945–960.

•  Holland, P. W. (1993) What comes first, cause or effect? In G. Keren and G. Lewis, eds., A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues. Lawrence Erlbaum, 273–282.

•  Mill, J. S. (1843) The Collected Works of John Stuart Mill, Volume VII - A System of Logic Ratiocinative & Inductive.

•  Pearl, J. (2009) Causality: Models, Reasoning, and Inference. 2nd ed. Cambridge University Press.

References IV

•  Rubin, D. B. (1974) Estimating causal effects of treatments in randomized and non-randomized studies. J. Educ. Psychol. 66, 688–701.

•  Rubin, D. B. (1978) Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6, 34–58.

•  Sfer, A. M. (2005) Randomization and Causality. Ph.D. thesis, Facultad de Ciencias Economicas, Universidad Nacional de Tucuman.

•  Spirtes, P., Glymour, C. and Scheines, R. (2001) Causation, Prediction and Search. 2nd ed. MIT Press.
