BAYESIAN NETWORKS

Upload: augustine-edwards

Post on 01-Jan-2016


SOME APPLICATIONS OF BN

Medical diagnosis
Troubleshooting of hardware/software systems
Fraud/uncollectible debt detection
Data mining
Analysis of genetic sequences
Data interpretation, computer vision, image understanding

MORE COMPLICATED SINGLY-CONNECTED BELIEF NET

[Figure: belief net over the nodes Battery, Radio, SparkPlugs, Gas, Starts, Moves]

Region = {Sky, Tree, Grass, Rock}

[Figure: image regions R1, R2, R3, R4, linked by an Above relation]


CALCULATION OF JOINT PROBABILITY

[Figure: Burglary and Earthquake are the parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls]

P(B) = 0.001
P(E) = 0.002

B    E    P(A|B,E)
T    T    0.95
T    F    0.94
F    T    0.29
F    F    0.001

A    P(J|A)
T    0.90
F    0.05

A    P(M|A)
T    0.70
F    0.01

P(J∧M∧A∧¬B∧¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E) = 0.9 x 0.7 x 0.001 x 0.999 x 0.998 ≈ 0.000628

P(x1∧x2∧…∧xn) = Π_{i=1,…,n} P(xi | parents(Xi))

This product defines every entry of the full joint distribution table.
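The chain-rule product can be checked with a few lines of Python. This is a minimal sketch: the CPT values are the ones on the slide, while the names `pr` and `joint` are illustrative helpers and not part of the lecture.

```python
# CPT values copied from the slide; each entry is P(variable = T | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                      # keyed by A
P_M = {True: 0.70, False: 0.01}                      # keyed by A


def pr(p_true, value):
    """Return P(X = value) for a Boolean X with P(X = T) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Product of P(x_i | parents(X_i)) over all five variables."""
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))


# John calls, Mary calls, the alarm rings, no burglary, no earthquake.
p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"{p:.6f}")  # 0.000628
```

Calling `joint` on all 32 assignments fills in the full joint table, and the entries sum to 1.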

WHAT DOES THE BN ENCODE?

Burglary ⊥ Earthquake
JohnCalls ⊥ MaryCalls | Alarm
JohnCalls ⊥ Burglary | Alarm
JohnCalls ⊥ Earthquake | Alarm
MaryCalls ⊥ Burglary | Alarm
MaryCalls ⊥ Earthquake | Alarm

[Figure: Burglary and Earthquake are the parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls]

A node is independent of its non-descendants, given its parents
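One way to see these independence statements is to test them numerically on the joint distribution. The sketch below assumes the CPT values from the CALCULATION OF JOINT PROBABILITY slide; `prob` and the two gap variables are hypothetical names used only for this check.

```python
from itertools import product

# CPT values from the burglary network; entries are P(variable = T | parents).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}
NAMES = ("b", "e", "a", "j", "m")


def pr(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))


def prob(**fixed):
    """Probability of a partial assignment, summing the joint over the rest."""
    total = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[k] == v for k, v in fixed.items()):
            total += joint(**world)
    return total


# JohnCalls and MaryCalls given Alarm: P(j,m|a) should equal P(j|a) P(m|a).
independent_gap = 0.0
for a, j, m in product([True, False], repeat=3):
    lhs = prob(j=j, m=m, a=a) / prob(a=a)
    rhs = (prob(j=j, a=a) / prob(a=a)) * (prob(m=m, a=a) / prob(a=a))
    independent_gap = max(independent_gap, abs(lhs - rhs))

# Without conditioning on Alarm the two calls are dependent.
dependent_gap = abs(prob(j=True, m=True) - prob(j=True) * prob(m=True))

print(independent_gap, dependent_gap)
```

The first gap is zero up to floating-point error, while the second is clearly nonzero: the two calls are only independent once Alarm is known.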

PROBABILISTIC INFERENCE

Probabilistic inference is the following problem.

Given:
A belief state P(X1,…,Xn) in some form (e.g., a Bayes net or a joint probability table)
A query variable indexed by q
Some subset of evidence variables indexed by e1,…,ek

Find: P(Xq | Xe1,…,Xek)
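When the belief state is available as a joint table, the query can be answered by brute-force enumeration. The sketch below does this for the burglary network, using the CPT values from the CALCULATION OF JOINT PROBABILITY slide; the function name `query` and the choice of P(Burglary | JohnCalls, MaryCalls) as the example are assumptions made for illustration.

```python
from itertools import product

# CPT values from the burglary network; entries are P(variable = T | parents).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}
NAMES = ("b", "e", "a", "j", "m")


def pr(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))


def query(q, evidence):
    """Return P(q = T | evidence) by summing the full joint table."""
    weight = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[name] == val for name, val in evidence.items()):
            weight[world[q]] += joint(**world)
    return weight[True] / (weight[True] + weight[False])


posterior = query("b", {"j": True, "m": True})
print(round(posterior, 3))  # 0.284
```

Enumeration touches all 2^n rows, so it is only practical for small networks.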

TOP-DOWN INFERENCE: RECURSIVE COMPUTATION OF ALL MARGINALS DOWNSTREAM OF EVIDENCE

[Figure: the burglary network with the CPTs P(B), P(E), P(A|B,E), P(J|A), P(M|A) from the CALCULATION OF JOINT PROBABILITY slide]

P(A|E) = P(A|B,E) P(B) + P(A|¬B,E) P(¬B)

P(J|E) = P(J|A,E) P(A|E) + P(J|¬A,E) P(¬A|E)

P(M|E) = P(M|A,E) P(A|E) + P(M|¬A,E) P(¬A|E)
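The recursion can be run directly on the burglary CPTs. This is a small sketch under the assumption that the evidence is Earthquake = T; because John's and Mary's calls depend on the earthquake only through the alarm, P(J|A,E) reduces to P(J|A). The variable names are illustrative.

```python
# CPT values from the burglary network; entries are P(variable = T | parents).
P_B = 0.001
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (B, E)
P_J = {True: 0.90, False: 0.05}                      # keyed by A
P_M = {True: 0.70, False: 0.01}                      # keyed by A

# Evidence: Earthquake = T. Burglary is independent of Earthquake,
# so P(B | E) = P(B) and the first marginal only mixes over B.
p_a = P_A[(True, True)] * P_B + P_A[(False, True)] * (1.0 - P_B)

# Each child marginal reuses the parent marginal computed just above it.
p_j = P_J[True] * p_a + P_J[False] * (1.0 - p_a)
p_m = P_M[True] * p_a + P_M[False] * (1.0 - p_a)

print(f"P(A|E)={p_a:.5f}  P(J|E)={p_j:.5f}  P(M|E)={p_m:.5f}")
```

Each marginal costs one pass over the parent's values, which is why the method is cheap when the ancestors form a polytree.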

TOP-DOWN INFERENCE

Only works if the graph of ancestors of a variable is a polytree
Evidence given on ancestor(s) of the query variable
Efficient: O(d 2^k) time, where d is the number of ancestors of a variable, with k a bound on the number of parents
Evidence on an ancestor cuts off the influence of the portion of the graph above the evidence node

QUERYING THE BN

The BN gives P(T|C)
P(C|T) can be computed using Bayes' rule:

P(A|B) = P(B|A) P(A) / P(B)

[Figure: Cavity → Toothache]

P(C) = 0.1

C    P(T|C)
T    0.4
F    0.01111

QUERYING THE BN

The BN gives P(T|C)
What about P(C|T)?

P(Cavity|Toothache) = P(Toothache|Cavity) P(Cavity) / P(Toothache)    [Bayes’ rule]

The denominator is computed by summing out the numerator over Cavity and ¬Cavity

Querying a BN is just applying Bayes’ rule on a larger scale…

[Figure: Cavity → Toothache, with P(C) = 0.1, P(T|C) = 0.4, P(T|¬C) = 0.01111]
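As a worked instance, the numbers on the slide can be plugged into Bayes' rule. This is a minimal sketch; the variable names are illustrative.

```python
# Values from the slide.
p_cavity = 0.1
p_toothache_given = {True: 0.4, False: 0.01111}   # keyed by Cavity

# Numerator: P(Toothache | Cavity) P(Cavity).
numerator = p_toothache_given[True] * p_cavity

# Denominator: the same product summed over Cavity and not-Cavity.
p_toothache = numerator + p_toothache_given[False] * (1.0 - p_cavity)

p_cavity_given_toothache = numerator / p_toothache
print(round(p_cavity_given_toothache, 3))  # 0.8
```

A toothache raises the probability of a cavity from 0.1 to about 0.8.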

NAÏVE BAYES MODELS

P(Cause,Effect1,…,Effectn) = P(Cause) Π_i P(Effecti | Cause)

[Figure: Cause is the parent of Effect1, Effect2, …, Effectn]

NAÏVE BAYES CLASSIFIER

P(Class,Feature1,…,Featuren) = P(Class) Π_i P(Featurei | Class)

[Figure: Class is the parent of Feature1, Feature2, …, Featuren]

Given features, what class?
Example classes: Spam / Not Spam, English / French / Latin
Example features: Word occurrences

P(C|F1,…,Fk) = P(C,F1,…,Fk) / P(F1,…,Fk)
             = 1/Z P(C) Π_i P(Fi|C)
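The classifier formula can be sketched in a few lines of Python. All numbers below are hypothetical and chosen only to make the example concrete; the class labels, the word list and the function name `posterior` are likewise assumptions. Log-probabilities are used so that products over many features do not underflow.

```python
import math

# Hypothetical parameters, for illustration only.
PRIOR = {"spam": 0.4, "ham": 0.6}                    # P(Class)
WORD_GIVEN_CLASS = {                                 # P(word present | Class)
    "spam": {"offer": 0.7, "meeting": 0.1, "free": 0.6},
    "ham": {"offer": 0.1, "meeting": 0.5, "free": 0.2},
}


def posterior(features):
    """Return P(Class | features); features maps word -> present (bool)."""
    log_score = {}
    for label, prior in PRIOR.items():
        score = math.log(prior)
        for word, present in features.items():
            p = WORD_GIVEN_CLASS[label][word]
            score += math.log(p if present else 1.0 - p)
        log_score[label] = score
    # Z is the sum of the unnormalized scores over all classes.
    z = sum(math.exp(s) for s in log_score.values())
    return {label: math.exp(s) / z for label, s in log_score.items()}


result = posterior({"offer": True, "meeting": False, "free": True})
print(result)
```

A word that is left out of `features` contributes no term to the product, so an unobserved feature is simply skipped.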


COMMENTS ON NAÏVE BAYES MODELS

Very scalable (thousands or millions of features!), easy to implement

Easily handles missing data: just ignore the feature

Conditional independence of features given the class is the main weakness. What if two features were actually correlated? What if many were?
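The effect of correlated features can be seen in an extreme case: the same feature counted several times. The sketch below uses hypothetical numbers and a hypothetical helper `spam_posterior`; it is meant only to show how the independence assumption double-counts evidence.

```python
# Hypothetical parameters, for illustration only.
PRIOR = {"spam": 0.5, "ham": 0.5}
P_WORD = {"spam": 0.8, "ham": 0.4}   # P(word present | class)


def spam_posterior(copies):
    """P(spam | the same word entered as `copies` separate features)."""
    spam = PRIOR["spam"] * P_WORD["spam"] ** copies
    ham = PRIOR["ham"] * P_WORD["ham"] ** copies
    return spam / (spam + ham)


once = spam_posterior(1)     # the word counted as one feature
thrice = spam_posterior(3)   # three perfectly correlated copies of it
print(round(once, 3), round(thrice, 3))  # 0.667 0.889
```

The three copies carry no new information, yet the posterior moves from about 0.667 to about 0.889, so correlated features make the classifier overconfident.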

VARIABLE ELIMINATION: PROBABILISTIC INFERENCE IN GENERAL NETWORKS

[Figure: network over the nodes Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy]

Basic idea: Eliminate “nuisance” variables one at a time via marginalization

Example: P(J)

Elimination order: C,D,I,H,G,S,L

[Figure: the same network annotated with its CPTs: P(C), P(D|C), P(I), P(G|I,D), P(S|I), P(L|G), P(J|S,L), P(H|G,J)]

ELIMINATING C

[Figure: the full network, with the factors P(C) and P(D|C) marked]

C IS ELIMINATED, GIVING A NEW FACTOR OVER D

[Figure: remaining nodes Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy]

P(D) = Σ_c P(D|c) P(c)

ELIMINATING D

[Figure: remaining nodes Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy, with the factors P(D) and P(G|I,D) marked]

D IS ELIMINATED, GIVING A NEW FACTOR OVER G, I

[Figure: remaining nodes Intelligence, Grade, SAT, Letter, Job, Happy]

P(G|I) = Σ_d P(G|I,d) P(d)

ELIMINATING I

[Figure: remaining nodes Intelligence, Grade, SAT, Letter, Job, Happy, with the factors P(I), P(G|I) and P(S|I) marked]

I IS ELIMINATED, PRODUCING A NEW FILL EDGE AND FACTOR OVER G AND S

[Figure: remaining nodes Grade, SAT, Letter, Job, Happy, with a new undirected fill edge between Grade and SAT]

P(G,S) = Σ_i P(i) P(G|i) P(S|i)

ELIMINATING H

[Figure: remaining nodes Grade, SAT, Letter, Job, Happy, with the factor P(H|G,J) marked]

f_GJ(G,J) = Σ_h P(h|G,J) = 1

H IS ELIMINATED, PRODUCING A NEW FILL EDGE AND FACTOR OVER G, J

[Figure: remaining nodes Grade, SAT, Letter, Job, with the factor f_GJ(G,J) marked]

ELIMINATING G

[Figure: remaining nodes Grade, SAT, Letter, Job, with the factors f_GJ(G,J), P(G,S) and P(L|G) marked]

G IS ELIMINATED, MAKING A NEW TERNARY FACTOR OVER S, L, J AND A NEW FILL EDGE

f_SLJ(S,L,J) = Σ_g P(g,S) P(L|g) f_GJ(g,J)

ELIMINATING S

[Figure: remaining nodes SAT, Letter, Job, with the factors f_SLJ(S,L,J) and P(J|S,L) marked]

S IS ELIMINATED, CREATING A NEW FACTOR OVER L, J

f_LJ(L,J) = Σ_s f_SLJ(s,L,J) P(J|s,L)

ELIMINATING L

[Figure: remaining nodes Letter, Job, with the factor f_LJ(L,J) marked]

L IS ELIMINATED, GIVING A NEW FACTOR OVER J (WHICH TURNS OUT TO BE P(J))

P(J) = Σ_l f_LJ(l,J)

[Figure: the single remaining node Job, with the factor P(J)]


JOINT DISTRIBUTION

P(X) = P(C)P(D|C)P(I)P(G|I,D)P(S|I)P(L|G) P(J|L,S)P(H|G,J)

Apply elimination ordering C,D,I,H,G,S,L

GOING THROUGH VE

P(X) = P(C) P(D|C) P(I) P(G|I,D) P(S|I) P(L|G) P(J|L,S) P(H|G,J)

Apply elimination ordering C,D,I,H,G,S,L

Eliminate C:
f_D(D) = Σ_C P(C) P(D|C)
Σ_C P(X) = f_D(D) P(I) P(G|I,D) P(S|I) P(L|G) P(J|L,S) P(H|G,J)

Eliminate D:
f_GI(G,I) = Σ_D f_D(D) P(G|I,D)
Σ_{C,D} P(X) = f_GI(G,I) P(I) P(S|I) P(L|G) P(J|L,S) P(H|G,J)

Eliminate I:
f_GS(G,S) = Σ_I f_GI(G,I) P(I) P(S|I)
Σ_{C,D,I} P(X) = f_GS(G,S) P(L|G) P(J|L,S) P(H|G,J)

Eliminate H:
f_GJ(G,J) = Σ_H P(H|G,J)
What values does this factor store?
Σ_{C,D,I,H} P(X) = f_GS(G,S) P(L|G) P(J|L,S) f_GJ(G,J)

Eliminate G:
f_SLJ(S,L,J) = Σ_G f_GS(G,S) P(L|G) f_GJ(G,J)
Σ_{C,D,I,H,G} P(X) = f_SLJ(S,L,J) P(J|L,S)

Eliminate S:
f_LJ(L,J) = Σ_S f_SLJ(S,L,J) P(J|L,S)
Σ_{C,D,I,H,G,S} P(X) = f_LJ(L,J)

Eliminate L:
f_J(J) = Σ_L f_LJ(L,J)
Σ_{C,D,I,H,G,S,L} P(X) = f_J(J)
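The walkthrough can be reproduced with a small factor implementation. The sketch below is one possible rendering, not the lecture's code: the slides give no CPT values for this network, so the CPTs are filled with random numbers (clearly hypothetical), all variables are assumed Boolean, and the helper names `random_cpt`, `multiply`, `sum_out` and `eliminate` are made up for the example. A brute-force sum over the joint serves as a cross-check.

```python
import random
from itertools import product

BOOL = (True, False)
# Parent sets read off P(X) = P(C)P(D|C)P(I)P(G|I,D)P(S|I)P(L|G)P(J|L,S)P(H|G,J).
PARENTS = {"C": [], "D": ["C"], "I": [], "G": ["I", "D"], "S": ["I"],
           "L": ["G"], "J": ["L", "S"], "H": ["G", "J"]}


def random_cpt(child, parents, rng):
    """Hypothetical CPT P(child | parents), stored as (scope, table)."""
    table = {}
    for pv in product(BOOL, repeat=len(parents)):
        p = rng.uniform(0.1, 0.9)
        table[(True, *pv)] = p
        table[(False, *pv)] = 1.0 - p
    return (child, *parents), table


def multiply(f, g):
    """Pointwise product of two factors over the union of their scopes."""
    (fs, ft), (gs, gt) = f, g
    scope = fs + tuple(v for v in gs if v not in fs)
    table = {}
    for values in product(BOOL, repeat=len(scope)):
        world = dict(zip(scope, values))
        table[values] = (ft[tuple(world[v] for v in fs)]
                         * gt[tuple(world[v] for v in gs)])
    return scope, table


def sum_out(var, f):
    """Marginalize var out of factor f."""
    fs, ft = f
    i = fs.index(var)
    table = {}
    for values, p in ft.items():
        key = values[:i] + values[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fs[:i] + fs[i + 1:], table


def eliminate(factors, order):
    """Variable elimination: multiply the factors touching var, then sum it out."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        combined = touching[0]
        for f in touching[1:]:
            combined = multiply(combined, f)
        factors.append(sum_out(var, combined))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result


def brute_force_p_j(cpts):
    """P(J = T) by summing the product of all CPTs over every assignment."""
    names = list(PARENTS)
    total = 0.0
    for values in product(BOOL, repeat=len(names)):
        world = dict(zip(names, values))
        if world["J"]:
            p = 1.0
            for fs, ft in cpts:
                p *= ft[tuple(world[v] for v in fs)]
            total += p
    return total


rng = random.Random(0)
cpts = [random_cpt(child, parents, rng) for child, parents in PARENTS.items()]
scope, table = eliminate(cpts, ["C", "D", "I", "H", "G", "S", "L"])
print(scope, table[(True,)], brute_force_p_j(cpts))
```

The two printed probabilities agree up to floating-point error, and the intermediate factor created when H is summed out stores 1 in every row.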

ORDER-DEPENDENCE

ELIMINATION ORDER MATTERS

[Figure: the network over Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy, shown before and after Grade is removed]

If we were to eliminate G first, we’d create a factor over D, I, L, H, and J (their distribution becomes coupled)

f_DILHJ(D,I,L,H,J) = Σ_g P(g|I,D) P(L|g) P(H|g,J)
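The cost of an ordering can be inspected without any numbers by tracking only the scopes of the factors that elimination creates. The sketch below assumes the factorization P(X) = P(C)P(D|C)P(I)P(G|I,D)P(S|I)P(L|G)P(J|L,S)P(H|G,J) from the JOINT DISTRIBUTION slide; under that factorization, eliminating G first produces a scope that contains J as well as D, I, L and H, because G appears in P(H|G,J). The function name `trace` is illustrative.

```python
# One scope per CPT in P(X) = P(C)P(D|C)P(I)P(G|I,D)P(S|I)P(L|G)P(J|L,S)P(H|G,J).
CPT_SCOPES = [{"C"}, {"D", "C"}, {"I"}, {"G", "I", "D"}, {"S", "I"},
              {"L", "G"}, {"J", "L", "S"}, {"H", "G", "J"}]


def trace(order, scopes=CPT_SCOPES):
    """Return (variable, scope of the new factor) for each elimination step."""
    scopes = [set(s) for s in scopes]
    created = []
    for var in order:
        touching = [s for s in scopes if var in s]
        scopes = [s for s in scopes if var not in s]
        # The new factor ranges over every neighbour of var in those factors.
        new_scope = set().union(*touching) - {var}
        scopes.append(new_scope)
        created.append((var, new_scope))
    return created


good = trace(["C", "D", "I", "H", "G", "S", "L"])
bad = trace(["G", "C", "D", "I", "H", "S", "L"])

for name, steps in (("C,D,I,H,G,S,L", good), ("G first", bad)):
    widest = max(len(scope) for _, scope in steps)
    print(name, "-> largest new factor has", widest, "variables")
```

With Boolean variables a factor over 5 variables has 32 rows instead of the 8 rows needed for 3, and the gap grows exponentially on larger networks.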

COMPLEXITY

In polytree networks where each node has at most k parents, O(n 2^k) with top-down ordering
In other networks, intermediate factors may involve more than k variables
Worst case O(2^n)
Good ordering heuristics exist, e.g. min-neighbors, min-fill

Exact inference on non-polytree networks is NP-hard!

VARIABLE ELIMINATION WITH EVIDENCE

[Figure: network over the nodes Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy]

Two-step process:
1. Find P(X,e) with VE
2. Normalize by P(e)

Example: P(J|H)

1. Run VE, enforcing H=T when H is eliminated.
2. This produces P(J,H=T) (a factor over J)
3. P(J=T|H=T) = P(J=T,H=T) / (P(J=T,H=T)+P(J=F,H=T))
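The three steps can be sketched in Python as follows. As before, the CPT values are random and purely hypothetical because the slides do not give numbers for this network, all variables are assumed Boolean, and the helper names are made up. Evidence is enforced here by restricting every factor to H=T before the other variables are summed out, which has the same effect as fixing H=T at the moment H would be eliminated.

```python
import random
from itertools import product

BOOL = (True, False)
PARENTS = {"C": [], "D": ["C"], "I": [], "G": ["I", "D"], "S": ["I"],
           "L": ["G"], "J": ["L", "S"], "H": ["G", "J"]}


def random_cpt(child, parents, rng):
    """Hypothetical CPT P(child | parents), stored as (scope, table)."""
    table = {}
    for pv in product(BOOL, repeat=len(parents)):
        p = rng.uniform(0.1, 0.9)
        table[(True, *pv)] = p
        table[(False, *pv)] = 1.0 - p
    return (child, *parents), table


def multiply(f, g):
    (fs, ft), (gs, gt) = f, g
    scope = fs + tuple(v for v in gs if v not in fs)
    table = {}
    for values in product(BOOL, repeat=len(scope)):
        world = dict(zip(scope, values))
        table[values] = (ft[tuple(world[v] for v in fs)]
                         * gt[tuple(world[v] for v in gs)])
    return scope, table


def sum_out(var, f):
    fs, ft = f
    i = fs.index(var)
    table = {}
    for values, p in ft.items():
        key = values[:i] + values[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fs[:i] + fs[i + 1:], table


def restrict(var, value, f):
    """Keep only the rows of f where var == value, then drop var from the scope."""
    fs, ft = f
    if var not in fs:
        return f
    i = fs.index(var)
    table = {v[:i] + v[i + 1:]: p for v, p in ft.items() if v[i] == value}
    return fs[:i] + fs[i + 1:], table


def ve_with_evidence(cpts, order, evidence):
    """Return the unnormalized factor P(query, evidence)."""
    factors = list(cpts)
    for var, value in evidence.items():
        factors = [restrict(var, value, f) for f in factors]
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        combined = touching[0]
        for f in touching[1:]:
            combined = multiply(combined, f)
        factors.append(sum_out(var, combined))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result


def brute_force(cpts, j_value):
    """P(J = j_value, H = T) by direct summation over the joint."""
    names = list(PARENTS)
    total = 0.0
    for values in product(BOOL, repeat=len(names)):
        world = dict(zip(names, values))
        if world["H"] and world["J"] == j_value:
            p = 1.0
            for fs, ft in cpts:
                p *= ft[tuple(world[v] for v in fs)]
            total += p
    return total


rng = random.Random(1)
cpts = [random_cpt(child, parents, rng) for child, parents in PARENTS.items()]

# Steps 1 and 2: VE with H = T gives P(J, H=T), a factor over J.
scope, p_j_and_h = ve_with_evidence(cpts, ["C", "D", "I", "G", "S", "L"], {"H": True})

# Step 3: normalize by P(H=T) = P(J=T, H=T) + P(J=F, H=T).
p_h = sum(p_j_and_h.values())
posterior = {j: p_j_and_h[(j,)] / p_h for j in BOOL}
print(posterior)
```

Unlike the run without evidence, the final factor does not sum to 1; its total is P(H=T), which is exactly the normalizer needed in step 3.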

RECAP

Exact inference techniques
Top-down inference: linear time when the ancestors of the query variable form a polytree and the evidence is on ancestors
Bottom-up inference in Naïve Bayes models
General inference using Variable Elimination

(We’ll come back to approximation techniques in a week.)

NEXT TIME

Learning Bayes nets
R&N 20.1-2