probabilistic inference lecture 6 – part 2

Post on 22-Feb-2016


Probabilistic Inference
Lecture 6 – Part 2

M. Pawan Kumar
pawan.kumar@ecp.fr

Slides available online http://cvc.centrale-ponts.fr/personnel/pawan/

MRF

[Figure: MRF over variables V1 to V9 with data d1 to d9]

A is conditionally independent of B given C if there is no path from A to B when C is removed

MRF

[Figure: MRF over variables V1 to V9 with data d1 to d9]

Va is conditionally independent of Vb given Va’s neighbors

Pairwise MRF

[Figure: pairwise MRF over variables V1 to V9 with data d1 to d9]

Unary Potential ψ1(v1,d1)

Pairwise Potential ψ56(v5,v6)

Probability P(v,d) = Πa ψa(va,da) Π(a,b) ψab(va,vb) / Z

Z is known as the partition function
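The probability formula above can be checked by brute force on a tiny model. The following is a minimal sketch, assuming two variables with two labels each; the function name `joint_distribution` and the potential values are illustrative assumptions, and the data d is taken as already folded into the unary tables.

```python
from itertools import product

def joint_distribution(unary, pairwise):
    """Return ({labeling: P(labeling)}, Z) for a pairwise MRF by enumeration.

    unary[a][i]            : psi_a(l_i), with observed data d_a folded in
    pairwise[(a, b)][i][j] : psi_ab(l_i, l_j)
    """
    sizes = [len(u) for u in unary]
    scores = {}
    for v in product(*(range(h) for h in sizes)):
        s = 1.0
        for a, i in enumerate(v):
            s *= unary[a][i]                  # product of unary potentials
        for (a, b), table in pairwise.items():
            s *= table[v[a]][v[b]]            # product of pairwise potentials
        scores[v] = s
    Z = sum(scores.values())                  # partition function
    return {v: s / Z for v, s in scores.items()}, Z

# Hypothetical potentials, chosen only for illustration.
unary = [[2.0, 5.0], [2.0, 4.0]]
pairwise = {(0, 1): [[3.0, 1.0], [1.0, 3.0]]}
P, Z = joint_distribution(unary, pairwise)
print(Z, P)
```

Enumeration costs O(h^n) for n variables with h labels each, which is what the belief propagation material later in the lecture avoids.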

Inference

Maximum a Posteriori (MAP) Estimation: maxv P(v)

Energy Minimization: minv Q(v), where P(v) = exp(-Q(v))/Z

Computing Marginals

P(va = li) = Σv P(v)δ(va = li)

P(va = li, vb = lk) = Σv P(v)δ(va = li)δ(vb = lk)
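As a small illustration of these definitions, the sketch below builds P(v) from a table of energies Q(v), confirms that the MAP labeling is the minimum-energy labeling, and computes a marginal by summing over all labelings. The energy values and the helper name `marginal` are assumptions made up for this example.

```python
import math

# Hypothetical energies Q(v) for two binary variables (illustrative values).
Q = {(0, 0): 1.0, (0, 1): 2.5, (1, 0): 0.3, (1, 1): 1.7}

Z = sum(math.exp(-q) for q in Q.values())          # partition function
P = {v: math.exp(-q) / Z for v, q in Q.items()}    # P(v) = exp(-Q(v)) / Z

map_labeling = max(P, key=P.get)          # max_v P(v)
min_energy_labeling = min(Q, key=Q.get)   # min_v Q(v)

def marginal(P, a, i):
    """P(v_a = l_i) = sum_v P(v) * delta(v_a = l_i)."""
    return sum(p for v, p in P.items() if v[a] == i)

print(map_labeling, min_energy_labeling, marginal(P, 0, 1))
```

Because exp(-Q) is monotonically decreasing in Q and Z is a constant, the two optimization problems always pick the same labeling.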

Outline

• Belief Propagation on Chains

• Belief Propagation on Trees

• Loopy Belief Propagation

Overview

Va Vb Vc Vd

Compute the marginal probability for Vd

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

Compute (unnormalized) distribution

m(vb) = Σva Ψa(va)Ψab(va,vb)

Overview

Va Vb Vc Vd

Compute the marginal probability for Vd

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

Compute (unnormalized) distribution

m(vc) = Σvb Ψb(vb)Ψbc(vb,vc)m(vb)

Overview

Va Vb Vc Vd

Compute the marginal probability for Vd

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

Compute (unnormalized) distribution

Σvc Ψc(vc)Ψcd(vc,vd)m(vc)

(Unnormalized) Marginals !!

Overview

Va Vb Vc Vd

Compute the marginal probability for Vc

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

P(v) = P(va|vb)P(vb|vc)P(vd|vc)P(vc)

Several common terms !!

Overview

Va Vb Vc Vd

Compute the marginal probability for Vb

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

P(v) = P(va|vb)P(vb|vc)P(vd|vc)P(vc)

P(v) = P(va|vb)P(vc|vb)P(vd|vc)P(vb)

Overview

Va Vb Vc Vd

Compute the marginal probability for Va

P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)

P(v) = P(va|vb)P(vb|vc)P(vd|vc)P(vc)

P(v) = P(va|vb)P(vc|vb)P(vd|vc)P(vb)

P(v) = P(vb|va)P(vc|vb)P(vd|vc)P(va)

Belief Propagation on Chains

Compute exact marginals

Avoids re-computing common terms

Two Variables

[Figure: two-node chain Va, Vb with two labels per variable; the potentials are written on the nodes and edges]

Unary Potentials ψa(li)

Pairwise Potentials ψab(li,lk)

Marginal Probability P(vb = lj) = Σi ψa(li)ψb(lj)ψab(li,lj)/Z

Un-normalized Marginal Probability P’(vb = lj) = Σi ψa(li)ψb(lj)ψab(li,lj)

P’(vb = lj) = ψb(lj)Σi ψa(li)ψab(li,lj)

Mab;j = Σi ψa(li)ψab(li,lj)

Mab;0 = 2 x 3 + 5 x 1 = 11

Mab;1 = 2 x 1 + 5 x 3 = 17

P’(vb = lj) = ψb(lj)Mab;j

P’(vb = l0) = 22 P’(vb = l1) = 68

Marginal Probability P(vb = lj) = ψb(lj)Mab;j/Z

Z = Σj P’(vb = lj) = 90

P(vb = l0) = 0.244… P(vb = l1) = 0.755…

O(h²)!! Same as brute-force
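The two-variable computation can be reproduced in a few lines. This is a sketch under an assumption: the potential tables below were chosen so that they give the numbers quoted on the slides (11, 17, 22, 68, 90), since the figure itself is not available here.

```python
# Assumed potential tables (chosen to match the slide's numbers).
psi_a = [2, 5]               # psi_a(l_i)
psi_b = [2, 4]               # psi_b(l_j)
psi_ab = [[3, 1], [1, 3]]    # psi_ab[i][j] = psi_ab(l_i, l_j)

# Message from Va to Vb: M_ab;j = sum_i psi_a(l_i) psi_ab(l_i, l_j)
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in range(2)) for j in range(2)]

# Un-normalized marginal P'(v_b = l_j) = psi_b(l_j) M_ab;j
unnorm = [psi_b[j] * M_ab[j] for j in range(2)]
Z = sum(unnorm)
P_b = [u / Z for u in unnorm]

print(M_ab, unnorm, Z, P_b)
```

Each of the h entries of the message sums over h labels, which is where the O(h²) cost comes from.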

Three Variables

[Figure: three-node chain Va, Vb, Vc with two labels per variable; the potentials are written on the nodes and edges]

P’(vc = lk) = Σj Σi ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(vc = lk) = ψc(lk)Σj Σi ψa(li)ψb(lj)ψab(li,lj)ψbc(lj,lk)

P’(vc = lk) = ψc(lk)Σj ψb(lj)Σi ψa(li)ψab(li,lj)ψbc(lj,lk)

P’(vc = lk) = ψc(lk)Σj ψb(lj)ψbc(lj,lk)Σi ψa(li)ψab(li,lj)

Mab;j = Σi ψa(li)ψab(li,lj), with Mab;0 = 11 and Mab;1 = 17

P’(vc = lk) = ψc(lk)Σj ψb(lj)ψbc(lj,lk)Mab;j

Mbc;k = Σj ψb(lj)ψbc(lj,lk)Mab;j

Mbc;0 = 4 x 2 x 11 + 2 x 2 x 17 = 156

Mbc;1 = 146

P’(vc = lk) = ψc(lk)Mbc;k

NOTE: Mbc;k “includes” Mab;j

Z = 156 x 3 + 146 x 6 = 1344

P(vc = 0) = 0.35

P(vc = 1) = 0.65

O(nh²) Better than brute-force
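The left-to-right pass for P(vc) can be sketched as follows. The potential tables are an assumption: they were picked so that the messages match the values quoted on the slides (11, 17, 156, 146 and Z = 1344), because the figure with the actual tables is not available here.

```python
# Assumed potential tables (chosen to reproduce the slide's messages).
psi_a = [2, 5]
psi_b = [4, 2]
psi_c = [3, 6]
psi_ab = [[3, 1], [1, 3]]    # psi_ab[i][j]
psi_bc = [[2, 1], [2, 3]]    # psi_bc[j][k]

# M_ab;j = sum_i psi_a(l_i) psi_ab(l_i, l_j)
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in range(2)) for j in range(2)]

# M_bc;k = sum_j psi_b(l_j) psi_bc(l_j, l_k) M_ab;j  (reuses M_ab)
M_bc = [sum(psi_b[j] * psi_bc[j][k] * M_ab[j] for j in range(2))
        for k in range(2)]

unnorm_c = [psi_c[k] * M_bc[k] for k in range(2)]   # P'(v_c = l_k)
Z = sum(unnorm_c)
P_c = [u / Z for u in unnorm_c]

print(M_ab, M_bc, Z, P_c)
```

Each message costs O(h²) and there is one message per edge, so the pass is O(nh²) instead of the O(h^n) of enumerating every labeling.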

Three Variables

What about P(vb = lj)?

P’(vb = lj) = Σk Σi ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(vb = lj) = ψb(lj)Σk Σi ψa(li)ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(vb = lj) = ψb(lj)Σk ψc(lk)Σi ψa(li)ψab(li,lj)ψbc(lj,lk)

P’(vb = lj) = ψb(lj)Σk ψc(lk)ψbc(lj,lk)Σi ψa(li)ψab(li,lj)

P’(vb = lj) = ψb(lj)Mab;jΣk ψc(lk)ψbc(lj,lk)

Mcb;j = Σk ψc(lk)ψbc(lj,lk), with Mcb;0 = 12 and Mcb;1 = 24

NOTE: Mcb;j does not “include” Mbc;k

P’(vb = lj) = ψb(lj)Mab;jMcb;j

Z = 11 x 12 x 4 + 17 x 24 x 2 = 1344

P(vb = 0) = 0.39

P(vb = 1) = 0.61

O(nh²) Better than brute-force
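For the middle variable, one message arrives from each side. The sketch below uses assumed potential tables, chosen so that the results match the slide's numbers (11, 17, 12, 24 and Z = 1344).

```python
# Assumed potential tables (chosen to reproduce the slide's messages).
psi_a = [2, 5]
psi_b = [4, 2]
psi_c = [3, 6]
psi_ab = [[3, 1], [1, 3]]    # psi_ab[i][j]
psi_bc = [[2, 1], [2, 3]]    # psi_bc[j][k]

# Message from the left: M_ab;j = sum_i psi_a(l_i) psi_ab(l_i, l_j)
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in range(2)) for j in range(2)]

# Message from the right: M_cb;j = sum_k psi_c(l_k) psi_bc(l_j, l_k)
M_cb = [sum(psi_c[k] * psi_bc[j][k] for k in range(2)) for j in range(2)]

# P'(v_b = l_j) = psi_b(l_j) M_ab;j M_cb;j
unnorm_b = [psi_b[j] * M_ab[j] * M_cb[j] for j in range(2)]
Z = sum(unnorm_b)
P_b = [u / Z for u in unnorm_b]

print(M_cb, unnorm_b, Z, P_b)
```

The partition function obtained here equals the one obtained from the marginal of Vc, as it must, since both normalize the same joint distribution.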

Three Variables

What about P(va = li)?

P’(va = li) = Σj Σk ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(va = li) = ψa(li)Σj Σk ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(va = li) = ψa(li)Σj ψb(lj)Σk ψc(lk)ψab(li,lj)ψbc(lj,lk)

P’(va = li) = ψa(li)Σj ψb(lj)ψab(li,lj)Σk ψc(lk)ψbc(lj,lk)

P’(va = li) = ψa(li)Σj ψb(lj)ψab(li,lj)Mcb;j

Mba;i = Σj ψb(lj)ψab(li,lj)Mcb;j, with Mba;0 = 192 and Mba;1 = 192

NOTE: Mba;i “includes” Mcb;j

P’(va = li) = ψa(li)Mba;i

Z = 192 x 2 + 192 x 5 = 1344

P(va = 0) = 0.29

P(va = 1) = 0.71

O(nh²) Better than brute-force
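The right-to-left pass for P(va) can be sketched in the same way. The potential tables are assumed (chosen so that the messages come out as 12, 24 and 192, 192 as on the slides); in particular, which label is called 0 within each table is an assumption, so the two probabilities may appear in the opposite order from the figure.

```python
# Assumed potential tables (chosen to reproduce the slide's messages).
psi_a = [2, 5]
psi_b = [4, 2]
psi_c = [3, 6]
psi_ab = [[3, 1], [1, 3]]    # psi_ab[i][j]
psi_bc = [[2, 1], [2, 3]]    # psi_bc[j][k]

# M_cb;j = sum_k psi_c(l_k) psi_bc(l_j, l_k)
M_cb = [sum(psi_c[k] * psi_bc[j][k] for k in range(2)) for j in range(2)]

# M_ba;i = sum_j psi_b(l_j) psi_ab(l_i, l_j) M_cb;j  (reuses M_cb)
M_ba = [sum(psi_b[j] * psi_ab[i][j] * M_cb[j] for j in range(2))
        for i in range(2)]

unnorm_a = [psi_a[i] * M_ba[i] for i in range(2)]   # P'(v_a = l_i)
Z = sum(unnorm_a)
P_a = [u / Z for u in unnorm_a]

print(M_cb, M_ba, Z, P_a)
```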

Belief Propagation on Chains

Start from left, go to right

For current edge (a,b), compute

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Repeat till the end of the chain

Start from right, go to left

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Repeat till the end of the chain

Belief Propagation on Chains

P’(va = li) = ψa(li)ΠnMna;i

P’(va = li,vb = lj) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j

Normalize to compute true marginals
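The two passes and the final normalization can be put together in one routine. This is a minimal sketch, not the lecture's own code: the name `chain_marginals`, the list-based layout, and the example potentials are all assumptions.

```python
def chain_marginals(unary, pairwise):
    """Exact marginals on a chain by a left-to-right and a right-to-left pass.

    unary[a][i]       : psi_a(l_i)
    pairwise[a][i][j] : potential of edge (a, a+1), psi(l_i, l_j)
    """
    n = len(unary)
    fwd = [[1.0] * len(u) for u in unary]   # message arriving from the left
    bwd = [[1.0] * len(u) for u in unary]   # message arriving from the right
    for a in range(n - 1):                  # start from left, go to right
        for k in range(len(unary[a + 1])):
            fwd[a + 1][k] = sum(unary[a][i] * pairwise[a][i][k] * fwd[a][i]
                                for i in range(len(unary[a])))
    for a in range(n - 1, 0, -1):           # start from right, go to left
        for k in range(len(unary[a - 1])):
            bwd[a - 1][k] = sum(unary[a][i] * pairwise[a - 1][k][i] * bwd[a][i]
                                for i in range(len(unary[a])))
    marginals = []
    for a in range(n):
        # P'(v_a = l_i) = psi_a(l_i) * product of incoming messages
        un = [unary[a][i] * fwd[a][i] * bwd[a][i] for i in range(len(unary[a]))]
        Z = sum(un)
        marginals.append([u / Z for u in un])
    return marginals

# Assumed potentials for a three-variable chain.
unary = [[2.0, 5.0], [4.0, 2.0], [3.0, 6.0]]
pairwise = [[[3.0, 1.0], [1.0, 3.0]], [[2.0, 1.0], [2.0, 3.0]]]
marginals = chain_marginals(unary, pairwise)
print(marginals)
```

On long chains the raw messages can grow or shrink quickly; rescaling each message to sum to 1 is a common safeguard and does not change the normalized marginals.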

Outline

• Belief Propagation on Chains

• Belief Propagation on Trees

• Loopy Belief Propagation

Pearl, 1988

Belief Propagation on Trees

[Figure: tree with Va and Vb each connected to Vc, and Vc connected to Vd]

P’(vd = lo) = ΣkΣj Σi ψa(li)ψb(lj)ψc(lk)ψd(lo)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)

P’(vd = lo) = ψd(lo)ΣkΣj Σi ψa(li)ψb(lj)ψc(lk)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)

P’(vd = lo) = ψd(lo)Σkψc(lk)Σj Σi ψa(li)ψb(lj)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)

P’(vd = lo) = ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj Σi ψa(li)ψb(lj)ψac(li,lk)ψbc(lj,lk)

P’(vd = lo) = ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)Σi ψa(li)ψac(li,lk)ψbc(lj,lk)

P’(vd = lo) = ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)ψbc(lj,lk)Σi ψa(li)ψac(li,lk)

Mac;k = Σi ψa(li)ψac(li,lk)

P’(vd = lo) = ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)ψbc(lj,lk)Mac;k

Mbc;k = Σj ψb(lj)ψbc(lj,lk)

P’(vd = lo) = ψd(lo)Σkψc(lk)ψcd(lk,lo)Mbc;kMac;k

Mcd;o = Σkψc(lk)ψcd(lk,lo)Mbc;kMac;k

P’(vd = lo) = ψd(lo)Mcd;o

P’(vc = lk) = ψc(lk)Mac;kMbc;kMdc;k

P’(vb = lj) = ψb(lj)Mcb;j

P’(va = li) = ψa(li)Mca;i

Belief Propagation on Trees

Start from leaf, go towards root

For current edge (a,b), compute

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Repeat till the root is reached

Start from root, go towards leaves

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Repeat till the leaves are reached

Belief Propagation on Trees

P’(va = li) = ψa(li)ΠnMna;i

P’(va = li,vb = lj) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j

Normalize to compute true marginals
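On a tree the same message formula applies along every edge. The sketch below is one possible arrangement, not the lecture's code: instead of an explicit leaf-to-root and root-to-leaf schedule it computes each message recursively and caches it, which visits the edges in an equivalent order. The name `tree_marginals`, the dictionary layout, and the potential values are assumptions.

```python
def tree_marginals(unary, pairwise):
    """Exact marginals on a tree.

    unary[a][i]            : psi_a(l_i), keyed by node name
    pairwise[(a, b)][i][j] : psi_ab(l_i, l_j), one entry per tree edge
    """
    nbrs = {a: [] for a in unary}
    for (a, b) in pairwise:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def psi(a, b, i, j):
        # Look the edge up in whichever orientation it was stored.
        if (a, b) in pairwise:
            return pairwise[(a, b)][i][j]
        return pairwise[(b, a)][j][i]

    memo = {}

    def message(a, b):
        """M_ab;k = sum_i psi_a(l_i) psi_ab(l_i, l_k) prod_{n != b} M_na;i"""
        if (a, b) not in memo:
            out = []
            for k in range(len(unary[b])):
                total = 0.0
                for i in range(len(unary[a])):
                    term = unary[a][i] * psi(a, b, i, k)
                    for n in nbrs[a]:
                        if n != b:
                            term *= message(n, a)[i]
                    total += term
                out.append(total)
            memo[(a, b)] = out
        return memo[(a, b)]

    result = {}
    for a in unary:
        un = list(unary[a])
        for n in nbrs[a]:
            un = [u * m for u, m in zip(un, message(n, a))]
        Z = sum(un)
        result[a] = [u / Z for u in un]
    return result

# Tree from the slides (a-c, b-c, c-d) with hypothetical potentials.
unary = {'a': [2.0, 5.0], 'b': [4.0, 2.0], 'c': [3.0, 6.0], 'd': [1.0, 3.0]}
pairwise = {('a', 'c'): [[3.0, 1.0], [1.0, 3.0]],
            ('b', 'c'): [[2.0, 1.0], [2.0, 3.0]],
            ('c', 'd'): [[1.0, 2.0], [4.0, 1.0]]}
marginals = tree_marginals(unary, pairwise)
print(marginals)
```

The recursion terminates because a tree has no cycles; on a graph with loops the same function would recurse forever, which is why loopy belief propagation needs initialization and iteration instead.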

Outline

• Belief Propagation on Chains

• Belief Propagation on Trees

• Loopy Belief Propagation

Pearl, 1988; Murphy et al., 1999

Loopy Belief Propagation

Initialize all messages to 1

In some order of edges, update messages

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Until convergence: rate of change in messages < threshold

Loopy Belief Propagation

[Figure: four-node cycle Va, Vb, Vc, Vd with messages Mab, Mbc, Mcd, Mda passed around the loop]

Mbc contains Mab

Mcd contains Mbc

Mda contains Mcd

Overcounting!!

Loopy Belief Propagation

Initialize all messages to 1

In some order of edges, update messages

Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i

Until convergence: rate of change in messages < threshold

Convergence is not guaranteed!!

Loopy Belief Propagation

B’a(i) = ψa(li)ΠnMna;i

B’ab(i,j) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j

Normalize to compute beliefs Ba(i), Bab(i,j)

At convergence Σj Bab(i,j) = Ba(i)
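The loopy procedure can be sketched as below. Several details are assumptions rather than part of the slides: the name `loopy_bp`, the fixed sequential edge order, the iteration cap, the use of the largest absolute change in any message as the convergence test, and the rescaling of each message to sum to 1 (added to keep values bounded; it does not change the normalized beliefs).

```python
def loopy_bp(unary, pairwise, max_iters=200, tol=1e-9):
    """Return (beliefs, converged) for a pairwise MRF that may contain loops."""
    nbrs = {a: [] for a in unary}
    for (a, b) in pairwise:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def psi(a, b, i, j):
        if (a, b) in pairwise:
            return pairwise[(a, b)][i][j]
        return pairwise[(b, a)][j][i]

    # One message per directed edge, initialized to 1.
    msgs = {(a, b): [1.0] * len(unary[b]) for a in unary for b in nbrs[a]}
    converged = False
    for _ in range(max_iters):
        change = 0.0
        for (a, b) in list(msgs):
            new = []
            for k in range(len(unary[b])):
                total = 0.0
                for i in range(len(unary[a])):
                    term = unary[a][i] * psi(a, b, i, k)
                    for n in nbrs[a]:
                        if n != b:
                            term *= msgs[(n, a)][i]
                    total += term
                new.append(total)
            s = sum(new)
            new = [x / s for x in new]      # rescale (assumption, see lead-in)
            change = max(change,
                         max(abs(x - y) for x, y in zip(new, msgs[(a, b)])))
            msgs[(a, b)] = new
        if change < tol:
            converged = True
            break

    beliefs = {}
    for a in unary:
        un = list(unary[a])                 # B'_a(i) = psi_a(l_i) prod_n M_na;i
        for n in nbrs[a]:
            un = [u * m for u, m in zip(un, msgs[(n, a)])]
        Z = sum(un)
        beliefs[a] = [u / Z for u in un]
    return beliefs, converged

# Four-node cycle a-b-c-d-a with hypothetical potentials.
unary = {'a': [1.0, 2.0], 'b': [1.0, 1.0], 'c': [1.0, 1.0], 'd': [1.0, 1.0]}
edge = [[2.0, 1.0], [1.0, 2.0]]
pairwise = {('a', 'b'): edge, ('b', 'c'): edge,
            ('c', 'd'): edge, ('d', 'a'): edge}
beliefs, converged = loopy_bp(unary, pairwise)
print(converged, beliefs)
```

On a graph without loops this routine returns the exact marginals; on the cycle the beliefs are approximations, and the `converged` flag should be checked because the iteration is not guaranteed to settle in general.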
