
PGM 2002/03 Tirgul 4

Exact Inference

Inference in Simple Chains

How do we compute P(X2)?

Chain: X1 → X2

$$P(x_2) = \sum_{x_1} P(x_1, x_2) = \sum_{x_1} P(x_1)\, P(x_2 \mid x_1)$$

Inference in Simple Chains (cont.)

How do we compute P(X3)?

we already know how to compute P(X2)...

Chain: X1 → X2 → X3

$$P(x_2) = \sum_{x_1} P(x_1, x_2) = \sum_{x_1} P(x_1)\, P(x_2 \mid x_1)$$

$$P(x_3) = \sum_{x_2} P(x_2, x_3) = \sum_{x_2} P(x_2)\, P(x_3 \mid x_2)$$


How do we compute P(Xn)?

Compute P(X1), P(X2), P(X3), … in turn. Each term is computed from the previous one:

$$P(x_{i+1}) = \sum_{x_i} P(x_i)\, P(x_{i+1} \mid x_i)$$

Complexity: each step costs O(|Val(Xi)| · |Val(Xi+1)|) operations. Compare this to naïve evaluation, which requires summing over the joint values of n-1 variables.

Chain: X1 → X2 → X3 → ... → Xn
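The forward iteration can be sketched in a few lines of Python. This is a minimal illustration, not code from the tirgul: the function name `forward_marginals`, the list-of-lists layout of the conditional tables, and the numbers are all assumptions made for the example.

```python
def forward_marginals(prior, transitions):
    """Return [P(X1), P(X2), ..., P(Xn)] for a chain X1 -> X2 -> ... -> Xn.

    prior[a]             = P(X1 = a)
    transitions[i][a][b] = P(X_{i+2} = b | X_{i+1} = a)
    """
    marginals = [list(prior)]
    for cpd in transitions:
        prev = marginals[-1]
        n_next = len(cpd[0])
        # P(x_{i+1}) = sum over x_i of P(x_i) * P(x_{i+1} | x_i)
        nxt = [sum(prev[a] * cpd[a][b] for a in range(len(prev)))
               for b in range(n_next)]
        marginals.append(nxt)
    return marginals


# Hypothetical numbers, chosen only for illustration.
prior = [0.6, 0.4]
step = [[0.7, 0.3],
        [0.2, 0.8]]
marginals = forward_marginals(prior, [step, step])  # P(X1), P(X2), P(X3)
```

Each pass through the loop touches every (x_i, x_{i+1}) pair once, which is where the O(|Val(Xi)| · |Val(Xi+1)|) cost per step comes from.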


Suppose that we observe the value of X2 =x2

How do we compute P(X1|x2)? Recall that it suffices to compute P(X1,x2); normalizing over the values of X1 then gives P(X1|x2).

Chain: X1 → X2

$$P(x_1, x_2) = P(x_2 \mid x_1)\, P(x_1)$$


Suppose that we observe the value of X3 =x3

How do we compute P(X1,x3)?

How do we compute P(x3|x1)?

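Chain: X1 → X2 → X3

$$P(x_1, x_3) = P(x_1)\, P(x_3 \mid x_1)$$

$$\begin{aligned}
P(x_3 \mid x_1) &= \sum_{x_2} P(x_2, x_3 \mid x_1) \\
&= \sum_{x_2} P(x_2 \mid x_1)\, P(x_3 \mid x_1, x_2) \\
&= \sum_{x_2} P(x_2 \mid x_1)\, P(x_3 \mid x_2)
\end{aligned}$$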


Suppose that we observe the value of Xn =xn

How do we compute P(X1,xn)?

We compute P(xn|xn-1), P(xn|xn-2), … iteratively

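Chain: X1 → X2 → X3 → ... → Xn

$$P(x_1, x_n) = P(x_1)\, P(x_n \mid x_1)$$

$$\begin{aligned}
P(x_n \mid x_i) &= \sum_{x_{i+1}} P(x_{i+1}, x_n \mid x_i) \\
&= \sum_{x_{i+1}} P(x_{i+1} \mid x_i)\, P(x_n \mid x_{i+1})
\end{aligned}$$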


Suppose that we observe the value of Xn =xn

We want to find P(Xk|xn )

How do we compute P(Xk,xn )?

We compute P(Xk ) by forward iterations

We compute P(xn | Xk ) by backward iterations

Chain: X1 → X2 → ... → Xk → ... → Xn

$$P(x_k, x_n) = P(x_k)\, P(x_n \mid x_k)$$
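Combining the two passes can be sketched as follows. The function names (`forward`, `backward`, `posterior`), the table layout and the numbers are assumptions made for this illustration; the sketch also assumes a chain with at least two variables.

```python
def forward(prior, transitions, k):
    """P(X_k) for 1-based k, using k-1 forward steps.

    transitions[j-1][a][b] = P(X_{j+1} = b | X_j = a)
    """
    p = list(prior)
    for cpd in transitions[:k - 1]:
        p = [sum(p[a] * cpd[a][b] for a in range(len(p)))
             for b in range(len(cpd[0]))]
    return p


def backward(transitions, k, xn):
    """P(X_n = xn | X_k), as a list indexed by the value of X_k."""
    n = len(transitions) + 1
    n_vals = len(transitions[-1][0])
    # Start from the indicator of the observed value of X_n.
    beta = [1.0 if v == xn else 0.0 for v in range(n_vals)]
    for j in range(n - 1, k - 1, -1):      # j = n-1, ..., k (1-based)
        cpd = transitions[j - 1]           # P(X_{j+1} | X_j)
        # P(x_n | x_j) = sum over x_{j+1} of P(x_{j+1} | x_j) * P(x_n | x_{j+1})
        beta = [sum(cpd[a][b] * beta[b] for b in range(len(beta)))
                for a in range(len(cpd))]
    return beta


def posterior(prior, transitions, k, xn):
    """P(X_k | X_n = xn), obtained by normalizing P(X_k, x_n)."""
    joint = [p * l for p, l in zip(forward(prior, transitions, k),
                                   backward(transitions, k, xn))]
    z = sum(joint)                         # this is P(x_n)
    return [j / z for j in joint]


# Hypothetical numbers, chosen only for illustration: a chain X1 -> X2 -> X3.
prior = [0.6, 0.4]
step = [[0.7, 0.3],
        [0.2, 0.8]]
post = posterior(prior, [step, step], k=2, xn=0)   # P(X2 | X3 = 0)
```

The normalizing constant `z` equals P(x_n), so the same two passes also yield the probability of the evidence.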

Elimination in Chains

We now try to understand the simple chain example from first principles

Using the definition of probability, we have

Chain: A → B → C → D → E

$$P(e) = \sum_d \sum_c \sum_b \sum_a P(a,b,c,d,e)$$


By chain decomposition, we get

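Chain: A → B → C → D → E

$$\begin{aligned}
P(e) &= \sum_d \sum_c \sum_b \sum_a P(a,b,c,d,e) \\
&= \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d)
\end{aligned}$$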


Rearranging terms ...

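Chain: A → B → C → D → E

$$\begin{aligned}
P(e) &= \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d) \\
&= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a)
\end{aligned}$$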


Now we can perform innermost summation

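This summation is exactly the first step in the forward iteration described earlier

Chain: A → B → C → D → E (eliminated so far: A)

$$\begin{aligned}
P(e) &= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a) \\
&= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b)
\end{aligned}$$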


Rearranging and then summing again, we get

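Chain: A → B → C → D → E (eliminated so far: A, B)

$$\begin{aligned}
P(e) &= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b) \\
&= \sum_d \sum_c P(d \mid c)\, P(e \mid d) \sum_b P(c \mid b)\, p(b) \\
&= \sum_d \sum_c P(d \mid c)\, P(e \mid d)\, p(c)
\end{aligned}$$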

Elimination in Chains with Evidence

Similarly, we understand the backward pass

We write the query in explicit form

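Chain: A → B → C → D → E

$$\begin{aligned}
P(a, e) &= \sum_b \sum_c \sum_d P(a,b,c,d,e) \\
&= \sum_b \sum_c \sum_d P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d)
\end{aligned}$$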


Eliminating d, we get

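Chain: A → B → C → D → E (eliminated so far: D)

$$\begin{aligned}
P(a, e) &= \sum_b \sum_c \sum_d P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d) \\
&= \sum_b \sum_c P(a)\, P(b \mid a)\, P(c \mid b) \sum_d P(d \mid c)\, P(e \mid d) \\
&= \sum_b \sum_c P(a)\, P(b \mid a)\, P(c \mid b)\, P(e \mid c)
\end{aligned}$$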


Eliminating c, we get

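Chain: A → B → C → D → E (eliminated so far: D, C)

$$\begin{aligned}
P(a, e) &= \sum_b \sum_c P(a)\, P(b \mid a)\, P(c \mid b)\, P(e \mid c) \\
&= \sum_b P(a)\, P(b \mid a) \sum_c P(c \mid b)\, P(e \mid c) \\
&= \sum_b P(a)\, P(b \mid a)\, P(e \mid b)
\end{aligned}$$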


Finally, we eliminate b

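Chain: A → B → C → D → E (eliminated so far: D, C, B)

$$\begin{aligned}
P(a, e) &= \sum_b P(a)\, P(b \mid a)\, P(e \mid b) \\
&= P(a) \sum_b P(b \mid a)\, P(e \mid b) \\
&= P(a)\, P(e \mid a)
\end{aligned}$$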

Variable Elimination

General idea: Write the query in the form

$$P(X_n, \mathbf{e}) = \sum_{x_k} \cdots \sum_{x_3} \sum_{x_2} \prod_i P(x_i \mid pa_i)$$

Iteratively:
Move all irrelevant terms outside of the innermost sum
Perform the innermost sum, getting a new term
Insert the new term into the product
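The loop described above can be sketched generically in Python. This is one possible illustration under stated assumptions, not the tirgul's implementation: a factor is represented as a pair `(vars, table)` where `vars` is a tuple of variable names and `table` maps value tuples to numbers, and the helper names `multiply`, `sum_out` and `eliminate` are hypothetical.

```python
from itertools import product


def multiply(f, g, domains):
    """Pointwise product of two factors over the union of their scopes."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = tuple(f_vars) + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for values in product(*(domains[v] for v in out_vars)):
        env = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(env[v] for v in f_vars)]
                           * g_tab[tuple(env[v] for v in g_vars)])
    return out_vars, out_tab


def sum_out(f, var):
    """Perform the innermost sum: remove var from the factor's scope."""
    f_vars, f_tab = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_tab = {}
    for values, p in f_tab.items():
        key = values[:idx] + values[idx + 1:]
        out_tab[key] = out_tab.get(key, 0.0) + p
    return out_vars, out_tab


def eliminate(factors, order, domains):
    """Eliminate the variables in `order`; return one factor over the rest."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        # Terms that do not mention var stay outside the innermost sum.
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f, domains)
        # The summed-out result is the new term inserted into the product.
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    return result


# Hypothetical chain A -> B -> C with made-up numbers.
domains = {"A": (0, 1), "B": (0, 1), "C": (0, 1)}
cpd = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}
f_a = (("A",), {(0,): 0.6, (1,): 0.4})
f_b = (("A", "B"), dict(cpd))   # P(b | a), keyed by (a, b)
f_c = (("B", "C"), dict(cpd))   # P(c | b), keyed by (b, c)
p_c = eliminate([f_a, f_b, f_c], ["A", "B"], domains)
```

On a chain, eliminating A and then B reproduces the forward iteration: the intermediate factors are P(b) and then P(c).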

A More Complex Example

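“Asia” network

Nodes: Visit to Asia (V), Smoking (S), Tuberculosis (T), Lung Cancer (L), Abnormality in Chest (A), Bronchitis (B), X-Ray (X), Dyspnea (D)

Edges: V → T, S → L, S → B, T → A, L → A, A → X, A → D, B → D

We want to compute P(d)
Need to eliminate: v, s, x, t, l, a, b

Initial factors:

$$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t,l)\, P(x \mid a)\, P(d \mid a,b)$$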


We want to compute P(d)
Need to eliminate: v, s, x, t, l, a, b

Eliminate: v

Compute:

$$f_v(t) = \sum_v P(v)\, P(t \mid v)$$

$$\Rightarrow\; f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t,l)\, P(x \mid a)\, P(d \mid a,b)$$

Note: f_v(t) = P(t). In general, the result of elimination is not necessarily a probability term.


We want to compute P(d)
Need to eliminate: s, x, t, l, a, b

Eliminate: s

Compute:

$$f_s(b,l) = \sum_s P(s)\, P(b \mid s)\, P(l \mid s)$$

$$\Rightarrow\; f_v(t)\, f_s(b,l)\, P(a \mid t,l)\, P(x \mid a)\, P(d \mid a,b)$$

Summing over s results in a factor with two arguments, f_s(b,l). In general, the result of elimination may be a function of several variables.


We want to compute P(d)
Need to eliminate: x, t, l, a, b

Eliminate: x

Compute:

$$f_x(a) = \sum_x P(x \mid a)$$

$$\Rightarrow\; f_v(t)\, f_s(b,l)\, f_x(a)\, P(a \mid t,l)\, P(d \mid a,b)$$

Note: f_x(a) = 1 for all values of a!


We want to compute P(d)
Need to eliminate: t, l, a, b

Eliminate: t

Compute:

$$f_t(a,l) = \sum_t f_v(t)\, P(a \mid t,l)$$

$$\Rightarrow\; f_s(b,l)\, f_x(a)\, f_t(a,l)\, P(d \mid a,b)$$


We want to compute P(d)
Need to eliminate: l, a, b

Eliminate: l

Compute:

$$f_l(a,b) = \sum_l f_s(b,l)\, f_t(a,l)$$

$$f_s(b,l)\, f_x(a)\, f_t(a,l)\, P(d \mid a,b) \;\Rightarrow\; f_l(a,b)\, f_x(a)\, P(d \mid a,b)$$


We want to compute P(d)
Need to eliminate: a, b

Eliminate: a, b

Compute:

$$f_a(b,d) = \sum_a f_l(a,b)\, f_x(a)\, P(d \mid a,b) \qquad f_b(d) = \sum_b f_a(b,d)$$

$$f_l(a,b)\, f_x(a)\, P(d \mid a,b) \;\Rightarrow\; f_a(b,d) \;\Rightarrow\; f_b(d)$$

Variable Elimination

We now understand variable elimination as a sequence of rewriting operations

Actual computation is done in elimination step

Exactly the same computation procedure applies to Markov networks

Computation depends on the order of elimination

We will return to this issue in detail

Dealing with evidence

How do we deal with evidence?

Suppose we get evidence V = t, S = f, D = t

We want to compute P(L, V = t, S = f, D = t)

Dealing with Evidence

We start by writing the factors:

$$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t,l)\, P(x \mid a)\, P(d \mid a,b)$$

Since we know that V = t, we don’t need to eliminate V. Instead, we can replace the factors P(V) and P(T|V) with

$$f_{P(V)} = P(V = t) \qquad f_{P(T \mid V)}(T) = P(T \mid V = t)$$

These “select” the appropriate parts of the original factors given the evidence

Note that f_{P(V)} is a constant, and thus does not appear in the elimination of other variables
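Selecting the part of a factor that agrees with the evidence can be sketched as a small helper. The `(vars, table)` factor layout and the name `restrict` are assumptions made for this illustration, and the table entries are hypothetical numbers rather than values taken from the slides.

```python
def restrict(factor, var, value):
    """Fix var = value in a factor and drop var from its scope."""
    f_vars, f_tab = factor
    if var not in f_vars:
        return factor
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    # Keep only the rows that agree with the evidence.
    out_tab = {vals[:idx] + vals[idx + 1:]: p
               for vals, p in f_tab.items() if vals[idx] == value}
    return out_vars, out_tab


# Hypothetical tables for P(V) and P(T | V); "t"/"f" stand for true/false.
p_v = (("V",), {("t",): 0.01, ("f",): 0.99})
p_t_given_v = (("T", "V"), {("t", "t"): 0.05, ("f", "t"): 0.95,
                            ("t", "f"): 0.01, ("f", "f"): 0.99})

f_pv = restrict(p_v, "V", "t")            # empty scope: a constant
f_ptv = restrict(p_t_given_v, "V", "t")   # a factor over T only
```

The restricted P(V) has an empty scope, which matches the remark that it is a constant and plays no role when other variables are eliminated.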

Dealing with Evidence

Given evidence V = t, S = f, D = t
Compute P(L, V = t, S = f, D = t)

Initial factors, after setting evidence:

$$f_{P(v)}\, f_{P(s)}\, f_{P(t \mid v)}(t)\, f_{P(l \mid s)}(l)\, f_{P(b \mid s)}(b)\, P(a \mid t,l)\, P(x \mid a)\, f_{P(d \mid a,b)}(a,b)$$


Eliminating x, we get

$$f_{P(v)}\, f_{P(s)}\, f_{P(t \mid v)}(t)\, f_{P(l \mid s)}(l)\, f_{P(b \mid s)}(b)\, P(a \mid t,l)\, f_x(a)\, f_{P(d \mid a,b)}(a,b)$$



Eliminating t, we get

$$f_{P(v)}\, f_{P(s)}\, f_{P(l \mid s)}(l)\, f_{P(b \mid s)}(b)\, f_t(a,l)\, f_x(a)\, f_{P(d \mid a,b)}(a,b)$$




Eliminating a, we get

$$f_{P(v)}\, f_{P(s)}\, f_{P(l \mid s)}(l)\, f_{P(b \mid s)}(b)\, f_a(b,l)$$




Eliminating a, we get

$$f_{P(v)}\, f_{P(s)}\, f_{P(l \mid s)}(l)\, f_{P(b \mid s)}(b)\, f_a(b,l)$$

Eliminating b, we get

$$f_{P(v)}\, f_{P(s)}\, f_{P(l \mid s)}(l)\, f_b(l)$$

Complexity of variable elimination

Suppose in one elimination step we compute

$$f_x(y_1, \ldots, y_k) = \sum_x f'_x(x, y_1, \ldots, y_k)$$

$$f'_x(x, y_1, \ldots, y_k) = \prod_{i=1}^{m} f_i(x, y_{i,1}, \ldots, y_{i,l_i})$$

This requires

$$m \cdot |\mathrm{Val}(X)| \cdot \prod_i |\mathrm{Val}(Y_i)|$$

multiplications: for each value of x, y1, …, yk, we do m multiplications

$$|\mathrm{Val}(X)| \cdot \prod_i |\mathrm{Val}(Y_i)|$$

additions: for each value of y1, …, yk, we do |Val(X)| additions

Complexity is exponential in the number of variables in the intermediate factor!
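The two counts can be evaluated directly. The helper below is a sketch; the function name `elimination_step_cost` and its argument layout are assumptions made for this illustration.

```python
from math import prod


def elimination_step_cost(x_card, y_cards, m):
    """Operation counts for one elimination step, using the formulas above.

    x_card  = |Val(X)|
    y_cards = [|Val(Y_1)|, ..., |Val(Y_k)|]
    m       = number of factors multiplied together
    """
    table_size = x_card * prod(y_cards)   # number of entries of f'_x
    multiplications = m * table_size
    additions = table_size
    return multiplications, additions


# Eliminating s in the Asia network: X = s, Y = {b, l}, three factors,
# all variables binary.
cost_s = elimination_step_cost(2, [2, 2], 3)
```

These counts are slightly generous: combining m numbers takes only m-1 multiplications, and summing |Val(X)| numbers takes |Val(X)|-1 additions. Either way, the cost grows with the size of the intermediate table, which is exponential in the number of variables it mentions.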

Numeric example: “Green” Network

Nodes: Rain (R), Hiker (H), Gazlan (G), Car (C), Pollution (P), Waste (W)

Edges (as implied by the conditional tables): R → H, R → G, H → C, C → P, G → W, H → W

             R0     R1
P(Ri)        0.8    0.2

             R0     R1
P(H0|R)      0.2    0.7
P(H1|R)      0.8    0.3

             R0     R1
P(G0|R)      0.1    0.3
P(G1|R)      0.9    0.7

             C0     C1
P(P0|C)      0.8    0.1
P(P1|C)      0.2    0.9

             H0     H1
P(C0|H)      0.95   0.4
P(C1|H)      0.05   0.6

             G0H0   G0H1   G1H0   G1H1
P(W0|G,H)    0.9    0.8    0.7    0.15
P(W1|G,H)    0.1    0.2    0.3    0.85

Example:

Suppose we wish to calculate P(c0|p1,w0), using the algorithm shown in class.

Answer:

First, we’ll calculate P(C,p1,w0).

We have the initial factors:

P(R) = fR(R) =
  r0   0.8
  r1   0.2

Initial Factors

P(G|R) = fG(G,R) =
  r0 g0   0.1
  r0 g1   0.9
  r1 g0   0.3
  r1 g1   0.7

P(H|R) = fH(H,R) =
  h0 r0   0.2
  h0 r1   0.7
  h1 r0   0.8
  h1 r1   0.3

Initial Factors (cont.)

P(w0|G,H) = fW(w0,G,H) =
  w0 h0 g0   0.9
  w0 h0 g1   0.7
  w0 h1 g0   0.8
  w0 h1 g1   0.15

P(C|H) = fC(C,H) =
  h0 c0   0.95
  h0 c1   0.05
  h1 c0   0.4
  h1 c1   0.6

Initial Factors (cont.)

P(p1|C) = fP(p1,C) =
  p1 c0   0.2
  p1 c1   0.9

We will now choose an elimination order: G, R, H

$$P(C, p_1, w_0) = f_P \sum_H f_C \sum_R f_H\, f_R \sum_G f_G\, f_W$$

Calculation

The steps are:

g_wghr = fW × fG =
  w0 g0 h0 r0   0.9 × 0.1 = 0.09
  w0 g0 h0 r1   0.9 × 0.3 = 0.27
  w0 g0 h1 r0   0.8 × 0.1 = 0.08
  w0 g0 h1 r1   0.8 × 0.3 = 0.24
  w0 g1 h0 r0   0.7 × 0.9 = 0.63
  w0 g1 h0 r1   0.7 × 0.7 = 0.49
  w0 g1 h1 r0   0.15 × 0.9 = 0.135
  w0 g1 h1 r1   0.15 × 0.7 = 0.105

Calculation (cont.)

m_hwr = Σ_G g_wghr =
  w0 h0 r0   0.09 + 0.63 = 0.72
  w0 h0 r1   0.27 + 0.49 = 0.76
  w0 h1 r0   0.08 + 0.135 = 0.215
  w0 h1 r1   0.24 + 0.105 = 0.345

g_whr = fH × fR × m_hwr =
  w0 h0 r0   0.8 × 0.2 × 0.72 = 0.1152
  w0 h0 r1   0.2 × 0.7 × 0.76 = 0.1064
  w0 h1 r0   0.8 × 0.8 × 0.215 = 0.1376
  w0 h1 r1   0.2 × 0.3 × 0.345 = 0.0207

Calculation (cont.)

m_hw = Σ_R g_whr =
  w0 h0   0.1152 + 0.1064 = 0.2216
  w0 h1   0.1376 + 0.0207 = 0.1583

g_whc = fC × m_hw =
  w0 h0 c0   0.2216 × 0.95 = 0.21052
  w0 h0 c1   0.2216 × 0.05 = 0.01108
  w0 h1 c0   0.1583 × 0.4 = 0.06332
  w0 h1 c1   0.1583 × 0.6 = 0.09498

m_wc = Σ_H g_whc =
  w0 c0   0.21052 + 0.06332 = 0.27384
  w0 c1   0.01108 + 0.09498 = 0.10606

Calculation (cont.)

g_wcp = fP × m_wc =
  w0 c0 p1   0.2 × 0.27384 = 0.054768
  w0 c1 p1   0.9 × 0.10606 = 0.095454

Now, we have:

$$P(c_0 \mid p_1, w_0) = \frac{P(c_0, p_1, w_0)}{P(p_1, w_0)} = \frac{0.054768}{0.054768 + 0.095454} \approx 0.3646$$
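The arithmetic of this example can be re-checked with a short script. The dictionary layout and variable names are choices made for this sketch; the numbers are the initial factors listed earlier, with index 0/1 standing for the values r0/r1, g0/g1, and so on.

```python
# Initial factors of the "Green" network after fixing W = w0 and P = p1.
f_R = {0: 0.8, 1: 0.2}                                         # key: r
f_G = {(0, 0): 0.1, (1, 0): 0.9, (0, 1): 0.3, (1, 1): 0.7}     # key: (g, r)
f_H = {(0, 0): 0.2, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.3}     # key: (h, r)
f_W = {(0, 0): 0.9, (0, 1): 0.7, (1, 0): 0.8, (1, 1): 0.15}    # key: (h, g)
f_C = {(0, 0): 0.95, (0, 1): 0.05, (1, 0): 0.4, (1, 1): 0.6}   # key: (h, c)
f_P = {0: 0.2, 1: 0.9}                                         # key: c

vals = (0, 1)

# Eliminate G: m_hwr(h, r) = sum_g fW(h, g) * fG(g, r)
m_hwr = {(h, r): sum(f_W[h, g] * f_G[g, r] for g in vals)
         for h in vals for r in vals}

# Eliminate R: m_hw(h) = sum_r fH(h, r) * fR(r) * m_hwr(h, r)
m_hw = {h: sum(f_H[h, r] * f_R[r] * m_hwr[h, r] for r in vals)
        for h in vals}

# Eliminate H: m_wc(c) = sum_h fC(h, c) * m_hw(h)
m_wc = {c: sum(f_C[h, c] * m_hw[h] for h in vals) for c in vals}

# Multiply in the evidence factor for P = p1, then normalize over C.
g_wcp = {c: f_P[c] * m_wc[c] for c in vals}
posterior_c0 = g_wcp[0] / (g_wcp[0] + g_wcp[1])
```

Up to floating-point rounding, the intermediate tables agree with the ones computed by hand, and `posterior_c0` rounds to 0.3646.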

The Computational Cost

Computing g_wghr = fW × fG requires 8 multiplications.

Computing m_hwr = Σ_G g_wghr requires 4 additions.

Computing g_whr = fH × fR × m_hwr requires 8 multiplications.

Computing m_hw = Σ_R g_whr requires 2 additions.

Computing g_whc = fC × m_hw requires 4 multiplications.

Computing m_wc = Σ_H g_whc requires 2 additions.

Computing g_wcp = fP × m_wc requires 2 multiplications.

For a total of: 8+8+4+2 = 22 multiplications

and 4+2+2 = 8 additions.

Elimination order does matter

We can choose another elimination order, say: R,G,H

For a total of 16+4+4+2 = 26 multiplications,

and 4+2+2 = 8 additions.
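Both totals can be reproduced by tracking only the scopes of the factors. The sketch below is an illustration under stated assumptions: the function name `count_operations` is hypothetical, every variable is taken to be binary, and the counting convention is the one used in this example (combining m factors costs m-1 multiplications per table entry, and summing out a binary variable costs one addition per entry of the result).

```python
def count_operations(scopes, order, card=2):
    """Count (multiplications, additions) for an elimination order.

    scopes: one set of variable names per factor, with evidence
            variables already removed.
    """
    scopes = [frozenset(s) for s in scopes]
    mults = adds = 0
    for var in order:
        touching = [s for s in scopes if var in s]
        if not touching:
            continue
        rest = [s for s in scopes if var not in s]
        joint = frozenset().union(*touching)
        entries = card ** len(joint)
        mults += (len(touching) - 1) * entries
        adds += (card - 1) * (entries // card)
        scopes = rest + [joint - {var}]
    # Multiply whatever factors remain over the query variable(s).
    joint = frozenset().union(*scopes)
    mults += (len(scopes) - 1) * card ** len(joint)
    return mults, adds


# Scopes of fR, fG, fH, fW, fC, fP after setting W = w0 and P = p1.
green = [{"R"}, {"G", "R"}, {"H", "R"}, {"G", "H"}, {"C", "H"}, {"C"}]
cost_grh = count_operations(green, ["G", "R", "H"])
cost_rgh = count_operations(green, ["R", "G", "H"])
```

Eliminating R first forces three factors to be multiplied over the scope {R, G, H}, which is where the extra multiplications come from.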
