causal reasoning for decision aiding systems cognitive systems laboratory ucla

CAUSAL REASONING FOR DECISION AIDING SYSTEMS

COGNITIVE SYSTEMS LABORATORY, UCLA

Judea Pearl, Mark Hopkins, Blai Bonet, Chen Avin, Ilya Shpitser

PRESENTATIONS

Judea Pearl: Robustness of Causal Claims

Ilya Shpitser and Chen Avin: Experimental Testability of Counterfactuals

Blai Bonet: Logic-based Inference on Bayes Networks

Mark Hopkins: Inference using Instantiations

Chen Avin: Inference in Sensor Networks

Blai Bonet: Report from Probabilistic Planning Competition

FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES

Probability and statistics deal with static relations:

Data → joint distribution → inferences from passive observations

Causal analysis deals with changes (dynamics):

Data, causal assumptions, experiments → causal model →

1. Effects of interventions

2. Causes of effects

3. Explanations

TYPICAL CAUSAL MODEL

[Figure: causal diagram over the variables X, Y, Z, mapping INPUT to OUTPUT]

TYPICAL CLAIMS

1. Effects of potential interventions

2. Claims about attribution (responsibility)

3. Claims about direct and indirect effects

4. Claims about explanations

ROBUSTNESS: MOTIVATION

The effect of smoking on cancer is, in general, non-identifiable (from observational studies).

[Figure: Smoking (x) → Cancer (y), with unobserved Genetic Factors (u) affecting both]

In linear systems, $y = \alpha x + \epsilon$, the coefficient $\alpha$ is non-identifiable.


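Z is an instrumental variable: cov(z,u) = 0

[Figure: Price of Cigarettes (Z) → Smoking (x) → Cancer (y), with unobserved u affecting x and y; $\beta$ labels Z → x and $\alpha$ labels x → y]

$R_{xz} = \beta, \quad R_{yz} = \alpha\beta$

$\alpha = R_{yz} / R_{xz}$ is identifiable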
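The instrumental-variable ratio can be checked numerically. The following is a minimal sketch, assuming that R_ab denotes the regression slope of a on b and that all noise terms are standard normal; the coefficient values, the sample size, and the function names `simulate_iv` and `slope` are illustrative choices, not taken from the slides.

```python
import numpy as np

def simulate_iv(alpha=0.5, beta=1.0, n=200_000, seed=0):
    """Sample z -> x -> y with an unobserved common cause u of x and y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # instrument, independent of u
    u = rng.normal(size=n)                  # unobserved confounder
    x = beta * z + u + rng.normal(size=n)
    y = alpha * x + 1.5 * u + rng.normal(size=n)
    return z, x, y

def slope(a, b):
    """Regression slope of a on b: cov(a, b) / var(b)."""
    return np.cov(a, b)[0, 1] / np.var(b, ddof=1)

z, x, y = simulate_iv()
ols = slope(y, x)                 # confounded by u, so it misses alpha
iv = slope(y, z) / slope(x, z)    # the ratio R_yz / R_xz
print(f"OLS slope: {ols:.3f}  IV ratio: {iv:.3f}  true alpha: 0.5")
```

Under these assumed parameters the plain regression of y on x converges to 1.0 rather than 0.5, while the ratio recovers alpha up to sampling error.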


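Problem with Instrumental Variables: The model may be wrong!

[Figure: Price of Cigarettes (Z) → Smoking (x) → Cancer (y), with unobserved u affecting x and y]

$\alpha = R_{yz} / R_{xz}$ holds only if the model is correct.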

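Solution: Invoke several instruments

[Figure: Price of Cigarettes (Z1) and Peer Pressure (Z2) → Smoking (x) → Cancer (y), with unobserved u affecting x and y]

$\alpha_1 = R_{yz_1} / R_{xz_1}, \quad \alpha_2 = R_{yz_2} / R_{xz_2}$

Surprise: $\alpha_1 = \alpha_2$ implies the model is likely correct

[Figure: Price of Cigarettes (Z1), Peer Pressure (Z2), Anti-smoking Legislation (Z3), ..., Zn → Smoking (x) → Cancer (y), with unobserved u affecting x and y]

Greater surprise: $\alpha_1 = \alpha_2 = \alpha_3 = \dots = \alpha_n = q$

Claim $\alpha = q$ is highly likely to be correct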
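The equality surprise can be illustrated with two instruments. This is a sketch under assumed parameters: `gamma2` is a hypothetical direct effect of the second instrument on y, introduced only to break the model, and all coefficients and noise distributions are illustrative.

```python
import numpy as np

def iv_estimates(gamma2=0.0, alpha=0.5, n=200_000, seed=1):
    """Return (alpha_1, alpha_2), the IV ratios obtained from z1 and z2.

    gamma2 is a hypothetical direct effect z2 -> y; any nonzero value
    violates the instrument assumption for z2.
    """
    rng = np.random.default_rng(seed)
    z1, z2, u = rng.normal(size=(3, n))     # two instruments, one confounder
    x = z1 + z2 + u + rng.normal(size=n)
    y = alpha * x + gamma2 * z2 + 1.5 * u + rng.normal(size=n)
    cov = lambda a, b: np.cov(a, b)[0, 1]
    return cov(y, z1) / cov(x, z1), cov(y, z2) / cov(x, z2)

a1_ok, a2_ok = iv_estimates(gamma2=0.0)     # both instruments valid
a1_bad, a2_bad = iv_estimates(gamma2=0.7)   # z2 is not a valid instrument
print(f"valid model:    alpha_1={a1_ok:.3f}  alpha_2={a2_ok:.3f}")
print(f"violated model: alpha_1={a1_bad:.3f}  alpha_2={a2_bad:.3f}")
```

When both instruments are valid the two ratios agree up to sampling error. When the second one is not, they disagree, which is why agreement across independent estimands lends support to the claim.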


Symptoms do not act as instruments

[Figure: Smoking (x) → Cancer (y) → Symptom (s), with unobserved u affecting x and y]

$\alpha$ remains non-identifiable

Why? Taking a noisy measurement (s) of an observed variable (y) cannot add new information
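Non-identifiability can be made concrete by exhibiting two parameterizations with different effects of x on y that imply the same covariance matrix over the observed variables (x, y, s). The sketch below assumes a linear model in which confounding is represented by a covariance c between the error terms of x and y; the function `implied_covariance` and all numeric values are illustrative.

```python
import numpy as np

def implied_covariance(alpha, c, v, g=0.9, s_noise=0.5):
    """Covariance of (x, y, s) implied by the linear model
        x = e_x,  y = alpha*x + eps,  s = g*y + e_s,
    with var(e_x) = 1, cov(e_x, eps) = c (confounding), var(eps) = v.
    """
    B = np.zeros((3, 3))
    B[1, 0] = alpha                      # x -> y
    B[2, 1] = g                          # y -> s
    omega = np.array([[1.0, c,   0.0],
                      [c,   v,   0.0],
                      [0.0, 0.0, s_noise]])
    A = np.linalg.inv(np.eye(3) - B)
    return A @ omega @ A.T

sigma_1 = implied_covariance(alpha=0.5, c=0.3, v=1.09)   # confounded, small effect
sigma_2 = implied_covariance(alpha=0.8, c=0.0, v=1.00)   # unconfounded, larger effect
print(np.allclose(sigma_1, sigma_2))
```

The two models differ in alpha yet agree on every observable second moment, including those involving the symptom s, so no function of the covariance matrix can tell them apart.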


Adding many symptoms does not help.

[Figure: Smoking (x) → Cancer (y) → Symptoms S1, S2, ..., Sn, with unobserved u affecting x and y]

$\alpha$ remains non-identifiable


[Figure: edge x → y with coefficient $\alpha$]

Given a parameter $\alpha$ in a general graph

Find if $\alpha$ can evoke an equality surprise $\alpha_1 = \alpha_2 = \dots = \alpha_n$ associated with several independent estimands of $\alpha$

Formulate: Surprise, over-identification, independence

Robustness: The degree to which $\alpha$ is robust to violations of model assumptions

ROBUSTNESS: FORMULATION

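Bad attempt: Parameter $\alpha$ is robust (over-identified) if:

$\alpha = f_1(\Sigma) = f_2(\Sigma)$

$f_1, f_2$: Two distinct functions

If the model induces a constraint $g(\Sigma) = 0$, then $f_i(\Sigma)$ and $f_i(\Sigma) + t[g(\Sigma)]$ are distinct functions with the same value (for any $t$ with $t[0] = 0$), so every identified parameter would qualify.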


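[Figure: chain x → y → z with coefficients b (x → y) and c (y → z) and error terms e_x, e_y, e_z]

$x = e_x$

$y = bx + e_y$

$z = cy + e_z$

$R_{yx} = b, \quad R_{zx} = bc, \quad R_{zy} = c$

Constraint: $R_{zx} = R_{yx} R_{zy}$

(b) $b = R_{yx} = R_{zx} / R_{zy}$

(c) $c = R_{zy} = R_{zx} / R_{yx}$

y → z is irrelevant to the derivation of b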
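The chain relations can be verified from the covariance matrix the model implies. This sketch assumes that R_ij is the regression slope of variable i on variable j and that the error terms are mutually independent; the coefficient values and error variances are arbitrary illustrative choices.

```python
import numpy as np

b, c = 0.7, 0.4                        # illustrative coefficients
B = np.zeros((3, 3))                   # variable order: x, y, z
B[1, 0] = b                            # x -> y
B[2, 1] = c                            # y -> z
omega = np.diag([1.0, 2.0, 0.5])       # independent errors, arbitrary variances
A = np.linalg.inv(np.eye(3) - B)
sigma = A @ omega @ A.T                # covariance matrix implied by the model

def R(i, j):
    """Regression slope of variable i on variable j."""
    return sigma[i, j] / sigma[j, j]

X, Y, Z = 0, 1, 2
R_yx, R_zx, R_zy = R(Y, X), R(Z, X), R(Z, Y)

g = R_zx - R_yx * R_zy                 # the model's constraint, zero here
b_direct = R_yx                        # does not rely on anything about y -> z
b_ratio = R_zx / R_zy                  # relies on the y -> z part of the model
b_padded = R_yx + 3.0 * g              # f + t[g] with t(v) = 3v: same value again
print(R_yx, R_zx, R_zy, g)
```

All three expressions for b agree numerically, but only the ratio draws on additional assumptions. The padded version shows why counting distinct functions alone does not measure robustness: it is built from the same assumptions plus the constraint.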

RELEVANCE: FORMULATION

Definition 8 Let A be an assumption embodied in model M, and p a parameter in M. A is said to be relevant to p if and only if there exists a set of assumptions S in M such that S and A sustain the identification of p but S alone does not sustain such identification.

Theorem 2 An assumption A is relevant to p if and only if A is a member of a minimal set of assumptions sufficient for identifying p.


Definition 5 (Degree of over-identification) A parameter p (of model M) is identified to degree k (read: k-identified) if there are k minimal sets of assumptions each yielding a distinct estimand of p.
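The relationship between relevance and minimal assumption sets can be explored on a toy example. The sketch below assumes a hypothetical oracle `identifies` that says which assumption sets suffice to identify p; the assumption names A1 to A4 and the oracle itself are invented for illustration. It computes relevance both directly from Definition 8 and through the minimal-set characterization of Theorem 2.

```python
from itertools import combinations

ASSUMPTIONS = ("A1", "A2", "A3", "A4")

def identifies(s):
    """Hypothetical oracle: p is identified from {A1, A2} or from {A3}."""
    s = set(s)
    return {"A1", "A2"} <= s or {"A3"} <= s

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def minimal_sufficient_sets():
    """Sufficient sets with no sufficient proper subset."""
    sufficient = [s for s in subsets(ASSUMPTIONS) if identifies(s)]
    return [s for s in sufficient if not any(t < s for t in sufficient)]

def relevant_by_definition(a):
    """Definition 8: some S fails alone but succeeds together with a."""
    others = [x for x in ASSUMPTIONS if x != a]
    return any(identifies(s | {a}) and not identifies(s)
               for s in subsets(others))

minimal = minimal_sufficient_sets()
relevant_thm = set().union(*minimal)                   # Theorem 2 route
relevant_def = {a for a in ASSUMPTIONS if relevant_by_definition(a)}
num_minimal_sets = len(minimal)                        # upper bound on degree k
print(minimal, relevant_thm, relevant_def, num_minimal_sets)
```

In this toy model A4 never contributes to identification and is irrelevant under both routes. The degree k of Definition 5 equals the number of minimal sets only if each yields a distinct estimand, which the oracle here does not model.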


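[Figure: chain x → y → z with coefficients b (x → y) and c (y → z)]

Minimal assumption sets for c.

[Figure: three graphs G1, G2, G3 over x, y, z, each containing the edge y → z labeled c]

Minimal assumption sets for b.

[Figure: graph over x, y, z containing the edge x → y labeled b]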

FROM MINIMAL ASSUMPTION SETS TO MAXIMAL EDGE SUPERGRAPHS

FROM PARAMETERS TO CLAIMS

Definition: A claim C is identified to degree k in model M (graph G), if there are k edge supergraphs of G that permit the identification of C, each yielding a distinct estimand.

e.g., Claim: (Total effect) $TE(x,z) = q$

[Figure: graph G, the chain x → y → z, and two edge supergraphs of G, one per estimand]

$TE(x,z) = R_{zx}$

$TE(x,z) = R_{yx} R_{zy \cdot x}$


Nonparametric:

$TE(x,z) = P(z \mid x)$

$TE(x,z) = \sum_{y} P(y \mid x) \sum_{x'} P(z \mid x', y) P(x')$
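For the linear chain x → y → z with coefficients b and c, the total effect of x on z is bc, and it can be read off either as the slope R_zx or as the product of R_yx with the partial slope R_zy·x (the coefficient of y when z is regressed on y and x). The sketch below checks that the two estimands coincide when the chain model holds; the coefficient values and error variances are illustrative assumptions.

```python
import numpy as np

b, c = 0.7, 0.4                        # illustrative coefficients
B = np.zeros((3, 3))                   # variable order: x, y, z
B[1, 0] = b                            # x -> y
B[2, 1] = c                            # y -> z
omega = np.diag([1.0, 2.0, 0.5])       # independent errors, arbitrary variances
A = np.linalg.inv(np.eye(3) - B)
sigma = A @ omega @ A.T                # covariance matrix implied by the model

X, Y, Z = 0, 1, 2
R_zx = sigma[Z, X] / sigma[X, X]
R_yx = sigma[Y, X] / sigma[X, X]

# Partial slope R_zy.x: coefficient of y when z is regressed on (y, x).
coef = np.linalg.solve(sigma[np.ix_([Y, X], [Y, X])], sigma[[Y, X], Z])
R_zy_x = coef[0]

te_1 = R_zx                            # first estimand of TE(x, z)
te_2 = R_yx * R_zy_x                   # second estimand of TE(x, z)
print(te_1, te_2, b * c)
```

Each estimand stays valid under a different edge supergraph of the chain, so their agreement in data is evidence for the claim that does not hinge on a single set of assumptions.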

CONCLUSIONS

1. A formal definition of the ROBUSTNESS of causal claims.

2. Graphical criteria and algorithms for computing the degree of robustness of a given causal claim.
