
Bayesian Networks: Exact Inference by Variable Elimination

Emma Rollon and Javier Larrosa

Q1-2015-2016

Emma Rollon and Javier Larrosa Bayesian Networks Q1-2015-2016 1 / 25


Recall the most usual queries

Let $X = E \cup Q \cup Z$, where $E$ are the evidence variables, $e$ are the observed values, $Q$ are the variables of interest, and $Z$ are the rest of the variables.

Probability of Evidence ($Q = \emptyset$):

$$\Pr(e) = \sum_{Z} \Pr(e, Z)$$

Prior Marginals ($E = \emptyset$):

$$\Pr(Q) = \sum_{Z} \Pr(Q, Z)$$

Posterior Marginals ($E \neq \emptyset$):

$$\Pr(Q \mid e) = \sum_{Z} \Pr(Q, Z \mid e)$$

Most Probable Explanation (MPE) ($E \neq \emptyset$, $Z = \emptyset$):

$$q^{*} = \arg\max_{Q} \Pr(Q \mid e)$$

Maximum a Posteriori Probability (MAP):

$$q^{*} = \arg\max_{Q} \Pr(Q \mid e) = \arg\max_{Q} \sum_{Z} \Pr(Q, Z \mid e)$$


In this session

We are going to see a generic algorithm for inference (inference = query answering).

It is called Variable Elimination because it eliminates, one by one, those variables which are irrelevant for the query.

It relies on some basic operations on a class of functions known as factors.

It uses an algorithmic technique called dynamic programming.

We will illustrate it for the computation of prior and posterior marginals:

$$P(Y) = \sum_{Z} P(Y, Z)$$

$$P(Y \mid e) = \sum_{Z} P(Y, Z \mid e)$$

It can be easily extended to other queries.


Factors

A factor $f$ is a function over a set of variables $\{X_1, X_2, \ldots, X_k\}$, called its scope, mapping each instantiation of these variables to a real number.

X  Y  Z  f(X, Y, Z)
0  0  0  5
0  0  1  3
0  0  2  1
0  1  0  5
0  1  1  2
0  1  2  7
1  0  0  1
1  0  1  4
1  0  2  6
1  1  0  2
1  1  1  4
1  1  2  3

Observation: Conditional Probability Tables (CPTs) are factors.

We define three operations on factors:

Product: $f(X, Y, Z) \times g(Q, Y, R)$

Conditioning: $f(X, y, Z)$

Marginalization: $\sum_{y \in Y} f(X, y, Z)$


Factors: Product

X  Y  f(X, Y)
0  0  1
0  1  3
1  0  2
1  1  1

Y  Z  g(Y, Z)
0  0  4
0  1  3
1  0  1
1  1  2

X  Y  Z  f(X, Y) × g(Y, Z)
0  0  0  1 · 4
0  0  1  1 · 3
0  1  0  3 · 1
0  1  1  3 · 2
1  0  0  2 · 4
1  0  1  2 · 3
1  1  0  1 · 1
1  1  1  1 · 2

Factor multiplication is commutative and associative. It is time and space exponential in the number of variables in the resulting factor.
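The product above can be reproduced with a few lines of code. This is a minimal sketch, not the authors' implementation: it assumes a factor is stored as a `(scope, table)` pair, where `table` maps value tuples to numbers, and the name `factor_product` is hypothetical.

```python
from itertools import product


def factor_product(f, g):
    """Multiply two factors stored as (scope, table) pairs (assumed representation)."""
    f_scope, f_table = f
    g_scope, g_table = g
    # Result scope: f's variables followed by g's variables not already present.
    scope = f_scope + tuple(v for v in g_scope if v not in f_scope)
    # Recover each variable's domain from the values that appear in the tables.
    domains = {}
    for sc, table in (f, g):
        for assignment in table:
            for var, val in zip(sc, assignment):
                domains.setdefault(var, set()).add(val)
    table = {}
    # One entry per instantiation of the result scope: exponential in its size.
    for values in product(*(sorted(domains[v]) for v in scope)):
        row = dict(zip(scope, values))
        f_key = tuple(row[v] for v in f_scope)
        g_key = tuple(row[v] for v in g_scope)
        table[values] = f_table[f_key] * g_table[g_key]
    return scope, table


# The two factors from the slide.
f = (("X", "Y"), {(0, 0): 1, (0, 1): 3, (1, 0): 2, (1, 1): 1})
g = (("Y", "Z"), {(0, 0): 4, (0, 1): 3, (1, 0): 1, (1, 1): 2})
scope, table = factor_product(f, g)
```

The loop visits every instantiation of the result scope once, which is where the exponential time and space cost comes from.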


Factors: Conditioning

X  Y  f(X, Y)
0  0  1
0  1  3
1  0  2
1  1  1

Y  f(0, Y)
0  1
1  3
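Conditioning can be sketched as a filter over the rows of the table. The `(scope, table)` representation and the function name `condition` are assumptions made for illustration only.

```python
def condition(factor, var, value):
    """Fix `var` to `value`: keep the matching rows and drop `var` from the scope."""
    scope, table = factor
    if var not in scope:
        return factor  # nothing to do: the factor does not mention var
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {
        key[:idx] + key[idx + 1:]: val
        for key, val in table.items()
        if key[idx] == value
    }
    return new_scope, new_table


# f(X, Y) from the slide, conditioned on X = 0.
f = (("X", "Y"), {(0, 0): 1, (0, 1): 3, (1, 0): 2, (1, 1): 1})
f_x0 = condition(f, "X", 0)
```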


Factors: Marginalization

X  Y  f(X, Y)
0  0  1
0  1  3
1  0  2
1  1  1

Y  $\sum_{x \in X} f(x, Y)$
0  1 + 2
1  3 + 1

Factor marginalization is commutative. It is time exponential in the number of variables in the original factor, and space exponential in the number of variables in the resulting factor.
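Marginalization can be sketched as accumulating rows that agree on the remaining variables. As before, the `(scope, table)` representation and the name `sum_out` are illustrative assumptions.

```python
def sum_out(factor, var):
    """Sum a factor over all values of `var`, removing it from the scope."""
    scope, table = factor
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    # Every row of the original factor is read once (time), but only one
    # entry per instantiation of the remaining variables is stored (space).
    for key, val in table.items():
        reduced = key[:idx] + key[idx + 1:]
        new_table[reduced] = new_table.get(reduced, 0) + val
    return new_scope, new_table


# f(X, Y) from the slide, summed over X.
f = (("X", "Y"), {(0, 0): 1, (0, 1): 3, (1, 0): 2, (1, 1): 1})
f_y = sum_out(f, "X")
```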


Factors: Example

Given the following BN:

compute: $\Pr(S \mid I) \times \Pr(I)$, $\Pr(G \mid i_1, D)$, $\sum_{G} \Pr(L \mid G)$, $\sum_{L} \Pr(L \mid G)$


The basic property: distributivity

When dealing with integers, we know that:

$$\sum_{i=1}^{n} K \times i = K \times \sum_{i=1}^{n} i$$

The same happens when dealing with factors (e.g., $D(X) = \{x_0, x_1\}$):

$$
\begin{aligned}
\sum_{X} f(Z) \times g(X, Y) &= f(Z) \times g(x_0, Y) + f(Z) \times g(x_1, Y) \\
&= f(Z) \times \big( g(x_0, Y) + g(x_1, Y) \big) = f(Z) \times \sum_{X} g(X, Y)
\end{aligned}
$$
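A quick numeric check of this identity, as a sketch with made-up integer tables for $f(Z)$ and $g(X, Y)$ (the values are hypothetical and chosen only for illustration):

```python
from itertools import product

# Hypothetical factors over Boolean variables.
f = {0: 2, 1: 5}                                   # f(Z), keyed by z
g = {(0, 0): 1, (0, 1): 4, (1, 0): 3, (1, 1): 2}   # g(X, Y), keyed by (x, y)

# Left-hand side: multiply first, then sum over X (8 multiplications).
lhs = {
    (z, y): sum(f[z] * g[(x, y)] for x in (0, 1))
    for z, y in product((0, 1), repeat=2)
}

# Right-hand side: sum over X first, then multiply (4 multiplications).
rhs = {
    (z, y): f[z] * sum(g[(x, y)] for x in (0, 1))
    for z, y in product((0, 1), repeat=2)
}
```

Both dictionaries hold the same factor over $(Z, Y)$; the right-hand side simply does less work, which is the saving that Variable Elimination exploits.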


VE: prior marginal on a chain

We will illustrate VE in the simplest case (a chain with no evidence):

Boolean variables $X_i$

The BN is a chain:

$$\Pr(X_1, X_2, X_3, X_4) = \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3)$$

We have no evidence ($E = \emptyset$)

The only variable of interest is $X_4$

Thus, we want to compute:

$$\Pr(X_4) = \sum_{X_1} \sum_{X_2} \sum_{X_3} \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3)$$


VE: prior marginal on a chain

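Elimination of $X_1$:

$$
\begin{aligned}
\Pr(X_4) &= \sum_{X_1} \sum_{X_2} \sum_{X_3} \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3) \\
&= \sum_{X_2} \sum_{X_3} \sum_{X_1} \Pr(X_1)\,\Pr(X_2 \mid X_1)\,\Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3) \\
&= \sum_{X_2} \sum_{X_3} \Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3) \Big( \sum_{X_1} \Pr(X_1)\,\Pr(X_2 \mid X_1) \Big) \\
&= \sum_{X_2} \sum_{X_3} \Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3)\,\lambda_1(X_2)
\end{aligned}
$$

where $\lambda_1(X_2) = \sum_{X_1} \Pr(X_1)\,\Pr(X_2 \mid X_1)$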


VE: prior marginal on a chain

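Elimination of $X_2$:

$$
\begin{aligned}
\Pr(X_4) &= \sum_{X_2} \sum_{X_3} \Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3)\,\lambda_1(X_2) \\
&= \sum_{X_3} \sum_{X_2} \Pr(X_3 \mid X_2)\,\Pr(X_4 \mid X_3)\,\lambda_1(X_2) \\
&= \sum_{X_3} \Pr(X_4 \mid X_3) \Big( \sum_{X_2} \Pr(X_3 \mid X_2)\,\lambda_1(X_2) \Big) \\
&= \sum_{X_3} \Pr(X_4 \mid X_3)\,\lambda_2(X_3)
\end{aligned}
$$

where $\lambda_2(X_3) = \sum_{X_2} \Pr(X_3 \mid X_2)\,\lambda_1(X_2)$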


VE: prior marginal on a chain

Elimination of $X_3$:

$$\Pr(X_4) = \sum_{X_3} \Pr(X_4 \mid X_3)\,\lambda_2(X_3) = \lambda_3(X_4)$$

Question:

What would be the time and space complexity of this execution?
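The three elimination steps on the chain can be sketched as a loop that carries a single $\lambda$ vector forward. The CPT numbers below are hypothetical, and `chain_marginal` and `brute_force` are illustrative names. Each step only touches one table over two variables, so the work grows linearly with the length of the chain, whereas the brute-force sum enumerates every joint instantiation.

```python
from itertools import product

# Hypothetical CPTs for a Boolean chain X1 -> X2 -> X3 -> X4.
prior = [0.6, 0.4]                       # Pr(X1)
# cpts[k][a][b] = Pr(X_{k+2} = b | X_{k+1} = a)
cpts = [
    [[0.7, 0.3], [0.2, 0.8]],            # Pr(X2 | X1)
    [[0.9, 0.1], [0.5, 0.5]],            # Pr(X3 | X2)
    [[0.4, 0.6], [0.1, 0.9]],            # Pr(X4 | X3)
]


def chain_marginal(prior, cpts):
    """Eliminate X1, X2, ... in order; lam holds the current lambda factor."""
    lam = prior
    for cpt in cpts:
        # lambda_next(b) = sum_a lambda(a) * Pr(next = b | current = a)
        lam = [
            sum(lam[a] * cpt[a][b] for a in range(len(lam)))
            for b in range(len(cpt[0]))
        ]
    return lam


def brute_force(prior, cpts):
    """Sum the full joint over X1, X2, X3 for comparison."""
    result = [0.0, 0.0]
    for x1, x2, x3, x4 in product((0, 1), repeat=4):
        result[x4] += (
            prior[x1] * cpts[0][x1][x2] * cpts[1][x2][x3] * cpts[2][x3][x4]
        )
    return result


ve = chain_marginal(prior, cpts)
bf = brute_force(prior, cpts)
```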


VE: prior marginal on general networks

Joint probability distribution:

$$P(C, D, I, G, L, S, H, J) = P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I)\,P(D \mid C)\,P(C)$$

Goal: $P(G, L, S, H, J)$

Elimination order: $C, D, I$


VE: prior marginal on general networks

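Elimination of $C$:

$$
\begin{aligned}
&\sum_{I, D, C} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I)\,P(D \mid C)\,P(C) \\
&= \sum_{I, D} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I) \Big( \sum_{C} P(D \mid C)\,P(C) \Big) \\
&= \sum_{I, D} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I)\,\lambda_C(D)
\end{aligned}
$$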


VE: prior marginal on general networks

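Elimination of $D$:

$$
\begin{aligned}
&\sum_{I, D} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I)\,\lambda_C(D) \\
&= \sum_{I} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I) \Big( \sum_{D} P(G \mid D, I)\,\lambda_C(D) \Big) \\
&= \sum_{I} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,\lambda_D(G, I)
\end{aligned}
$$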


VE: prior marginal on general networks

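Elimination of $I$:

$$
\begin{aligned}
&\sum_{I} P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,\lambda_D(G, I) \\
&= P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G) \Big( \sum_{I} P(S \mid I)\,P(I)\,\lambda_D(G, I) \Big) \\
&= P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,\lambda_I(S, G)
\end{aligned}
$$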
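The scopes of the intermediate factors $\lambda_C$, $\lambda_D$, $\lambda_I$ can be derived without any numbers, by tracking only which variables each factor mentions. The sketch below does this for the factorization used in these slides; the function name `eliminate_scopes` and the encoding of each CPT scope as a string of one-letter variable names are assumptions made for brevity.

```python
def eliminate_scopes(scopes, order):
    """Return {var: scope of the lambda factor created when var is eliminated}."""
    scopes = [frozenset(s) for s in scopes]
    lambdas = {}
    for var in order:
        # Factors mentioning var are multiplied together, then var is summed out.
        touching = [s for s in scopes if var in s]
        merged = frozenset().union(*touching) - {var}
        lambdas[var] = merged
        scopes = [s for s in scopes if var not in s] + [merged]
    return lambdas


# One string per CPT: P(H|G,J), P(J|L,S), P(L|G), P(S|I), P(I), P(G|D,I), P(D|C), P(C)
cpt_scopes = ["HGJ", "JLS", "LG", "SI", "I", "GDI", "DC", "C"]
lambdas = eliminate_scopes(cpt_scopes, "CDI")
```

For the order $C, D, I$ this yields the scopes $\{D\}$, $\{G, I\}$ and $\{S, G\}$, matching $\lambda_C(D)$, $\lambda_D(G, I)$ and $\lambda_I(S, G)$ in the derivation.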


VE: posterior marginal on general networks

Given the joint probability:

$$P(C, D, I, G, L, S, H, J) = P(H \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid I)\,P(I)\,P(G \mid D, I)\,P(D \mid C)\,P(C)$$

and evidence $\{I = i, H = h\}$, we want to compute:

$$P(J, S, L \mid i, h) = \frac{P(J, S, L, i, h)}{P(i, h)} = \frac{\sum_{C, D, G} P(C, D, i, G, L, S, h, J)}{\sum_{J, S, L, C, D, G} P(C, D, i, G, L, S, h, J)}$$

using elimination order $C, D, G$.

Trick

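Compute the unnormalized factor $P(J, S, L, i, h)$ and normalize it: the normalizing constant $P(i, h)$ is the sum of its entries over $J, S, L$.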


VE: posterior marginal on general networks

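Elimination of $C$:

$$
\begin{aligned}
&\sum_{C, D, G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,P(G \mid D, i)\,P(D \mid C)\,P(C) \\
&= \sum_{D, G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,P(G \mid D, i) \Big( \sum_{C} P(D \mid C)\,P(C) \Big) \\
&= \sum_{D, G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,P(G \mid D, i)\,\lambda_C(D)
\end{aligned}
$$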


VE: posterior marginal on general networks

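Elimination of $D$:

$$
\begin{aligned}
&\sum_{D, G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,P(G \mid D, i)\,\lambda_C(D) \\
&= \sum_{G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i) \Big( \sum_{D} P(G \mid D, i)\,\lambda_C(D) \Big) \\
&= \sum_{G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,\lambda_D(G)
\end{aligned}
$$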


VE: posterior marginal on general networks

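Elimination of $G$:

$$
\begin{aligned}
&\sum_{G} P(h \mid G, J)\,P(J \mid L, S)\,P(L \mid G)\,P(S \mid i)\,P(i)\,\lambda_D(G) \\
&= P(J \mid L, S)\,P(S \mid i)\,P(i) \Big( \sum_{G} P(h \mid G, J)\,P(L \mid G)\,\lambda_D(G) \Big) \\
&= P(J \mid L, S)\,P(S \mid i)\,P(i)\,\lambda_G(J, L)
\end{aligned}
$$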


VE: Summary

Condition all factors on evidence.

Eliminate/marginalize each non-query variable X :

Move the factors mentioning X to the right.

Push the summation over X to the right, so that it covers only those factors.

Replace the summed expression by a new λ factor.

Normalize to get distribution.
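The summary above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' code: factors are `(scope, table)` pairs, `domains` maps each variable to its list of values, and the three-variable network $A \to B$, $A \to C$ with its CPT numbers is entirely hypothetical.

```python
from itertools import product


def condition(factor, evidence):
    """Keep rows consistent with the evidence and drop the evidence variables."""
    scope, table = factor
    keep = [i for i, v in enumerate(scope) if v not in evidence]
    new_table = {}
    for key, val in table.items():
        if all(evidence.get(v, x) == x for v, x in zip(scope, key)):
            new_table[tuple(key[i] for i in keep)] = val
    return tuple(scope[i] for i in keep), new_table


def multiply(f, g, domains):
    """Factor product over the union of both scopes."""
    (fs, ft), (gs, gt) = f, g
    scope = fs + tuple(v for v in gs if v not in fs)
    table = {}
    for values in product(*(domains[v] for v in scope)):
        row = dict(zip(scope, values))
        table[values] = ft[tuple(row[v] for v in fs)] * gt[tuple(row[v] for v in gs)]
    return scope, table


def sum_out(factor, var):
    """Marginalize var out of a factor."""
    scope, table = factor
    i = scope.index(var)
    out = {}
    for key, val in table.items():
        reduced = key[:i] + key[i + 1:]
        out[reduced] = out.get(reduced, 0.0) + val
    return scope[:i] + scope[i + 1:], out


def variable_elimination(factors, domains, order, evidence=None):
    evidence = evidence or {}
    # Step 1: condition all factors on the evidence.
    factors = [condition(f, evidence) for f in factors]
    # Step 2: eliminate each non-query variable in the given order.
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f, domains)
        factors = rest + [sum_out(prod, var)]  # the new lambda factor
    # Step 3: multiply what is left and normalize.
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    total = sum(result[1].values())
    return result[0], {k: v / total for k, v in result[1].items()}


# Hypothetical network A -> B, A -> C; query B given C = 1, eliminating A.
domains = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
p_a = (("A",), {(0,): 0.3, (1,): 0.7})
p_b_a = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})
p_c_a = (("A", "C"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.5, (1, 1): 0.5})
scope, posterior = variable_elimination(
    [p_a, p_b_a, p_c_a], domains, order=["A"], evidence={"C": 1}
)
```

The elimination order is passed in explicitly because the algorithm is correct for any order; only the size of the intermediate factors changes.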


VE: Exercise

Given the following BN:

Eliminate Letter and Difficulty from the network.


VE: Some tricks

Given a BN, a set of query variables $Q$, and a set of evidence variables $E$:

Pruning edges: the effect of conditioning is to remove any edge from a node in $E$ to its children; each child's CPT is replaced by the CPT conditioned on the observed value.

Pruning nodes: we can remove any leaf node (with its CPT) from the network as long as it does not belong to $Q \cup E$.
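Node pruning can be sketched on the network structure alone. The parent sets below are read off the factorization used in the earlier slides; the function name `prune_leaves` is hypothetical, and applying the rule repeatedly until no removable leaf remains is an assumption of this sketch (removing a leaf can turn its parents into new leaves).

```python
def prune_leaves(parents, keep):
    """Repeatedly drop leaf nodes that are neither query nor evidence variables."""
    parents = dict(parents)
    while True:
        # A node is a leaf if it is not listed as a parent of any remaining node.
        has_children = {p for ps in parents.values() for p in ps}
        removable = [v for v in parents if v not in has_children and v not in keep]
        if not removable:
            return parents
        for v in removable:
            del parents[v]


# Structure implied by P(H|G,J) P(J|L,S) P(L|G) P(S|I) P(I) P(G|D,I) P(D|C) P(C).
parents = {
    "C": [], "D": ["C"], "I": [], "G": ["D", "I"],
    "S": ["I"], "L": ["G"], "J": ["L", "S"], "H": ["G", "J"],
}

# Illustrative query: Q = {S}, E = {D}.
pruned = prune_leaves(parents, keep={"S", "D"})
```

For this query the nodes H, J, L and G are removed in turn, leaving only C, D, I and S to be processed by Variable Elimination.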


Complexity and Elimination Order

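$$P(C, X_1, \ldots, X_n) = P(C)\,P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_n \mid C)$$

Eliminate $X_n$ first:

$$
\begin{aligned}
&\sum_{C, X_1, X_2, \ldots, X_{n-1}} P(C)\,P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_{n-1} \mid C) \Big( \sum_{X_n} P(X_n \mid C) \Big) \\
&= \sum_{C, X_1, X_2, \ldots, X_{n-1}} P(C)\,P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_{n-1} \mid C)\,\lambda_{X_n}(C)
\end{aligned}
$$

Eliminate $C$ first:

$$
\sum_{X_1, X_2, \ldots, X_n} \Big( \sum_{C} P(C)\,P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_n \mid C) \Big) = \sum_{X_1, X_2, \ldots, X_n} \lambda_C(X_1, X_2, \ldots, X_n)
$$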
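To make the difference between the two orders concrete, the table sizes can be counted directly. The count below assumes that every variable is Boolean (an assumption made here for concreteness) and writes $|\cdot|$ for the number of entries of a factor.

```latex
\begin{aligned}
\text{$X_n$ first:}\quad
  & |P(X_n \mid C)| = 2^{2} = 4,
  && |\lambda_{X_n}(C)| = 2^{1} = 2, \\
\text{$C$ first:}\quad
  & |P(C)\,P(X_1 \mid C) \cdots P(X_n \mid C)| = 2^{n+1},
  && |\lambda_C(X_1, \ldots, X_n)| = 2^{n}.
\end{aligned}
```

Under this assumption, eliminating the leaves first never builds a table with more than 4 entries, whereas eliminating $C$ first with $n = 20$ builds a product with $2^{21} = 2{,}097{,}152$ entries. Both orders give the same result; only the cost differs.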
