probabilistic inference lecture 6 – part 2
Post on 22-Feb-2016
Probabilistic Inference, Lecture 6 – Part 2
M. Pawan Kumar, pawan.kumar@ecp.fr
Slides available online http://cvc.centrale-ponts.fr/personnel/pawan/
MRF
[Figure: MRF with random variables V1, ..., V9 and data d1, ..., d9]
A is conditionally independent of B given C if
there is no path from A to B when C is removed
MRF
Va is conditionally independent of Vb given Va’s neighbors
Pairwise MRF
Unary Potential ψ1(v1,d1)
Pairwise Potential ψ56(v5,v6)
Probability P(v,d) = (1/Z) Πa ψa(va,da) Π(a,b) ψab(va,vb)
Z is known as the partition function
Inference
Maximum a Posteriori (MAP) Estimation: maxv P(v)
Energy Minimization: minv Q(v), where P(v) = exp(-Q(v))/Z
Computing Marginals:
P(va = li) = Σv P(v)δ(va = li)
P(va = li, vb = lk) = Σv P(v)δ(va = li)δ(vb = lk)
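The three inference tasks above can be illustrated by brute-force enumeration on a tiny model. The following is a minimal sketch, not the lecture's code: the 3-variable chain, its energy values, and all names (`unary`, `pairwise`, `energy`, `marginal`) are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Hypothetical chain of 3 variables with 2 labels each (illustrative energies).
unary = {0: [1.0, 0.5], 1: [0.2, 0.7], 2: [0.3, 0.6]}      # Q_a(l)
pairwise = {(0, 1): [[0.0, 1.0], [1.0, 0.0]],
            (1, 2): [[0.0, 1.0], [1.0, 0.0]]}              # Q_ab(l, l')

def energy(v):
    """Q(v): sum of the unary and pairwise energies of labelling v."""
    q = sum(unary[a][v[a]] for a in unary)
    q += sum(tab[v[a]][v[b]] for (a, b), tab in pairwise.items())
    return q

# Enumerate all h^n labellings: P(v) = exp(-Q(v)) / Z.
labellings = list(itertools.product(range(2), repeat=3))
Z = sum(math.exp(-energy(v)) for v in labellings)
prob = {v: math.exp(-energy(v)) / Z for v in labellings}

# MAP estimation and energy minimization select the same labelling.
v_map = max(labellings, key=lambda v: prob[v])
v_min = min(labellings, key=energy)

def marginal(a, l):
    """P(v_a = l) = sum_v P(v) * delta(v_a = l)."""
    return sum(p for v, p in prob.items() if v[a] == l)

print(v_map, v_min, [marginal(0, l) for l in range(2)])
```

Enumeration costs O(h^n) for n variables with h labels, which is the cost that belief propagation avoids on chains and trees.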
Outline
• Belief Propagation on Chains
• Belief Propagation on Trees
• Loopy Belief Propagation
Overview
Va Vb Vc Vd
Compute the marginal probability for Vd
P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)
Compute (unnormalized) distribution
Σva Ψa(va)Ψab(va,vb) = m(vb), a function of vb
Σvb Ψb(vb)Ψbc(vb,vc)m(vb) = m(vc), a function of vc
Σvc Ψc(vc)Ψcd(vc,vd)m(vc): (Unnormalized) Marginals !!
Overview
Va Vb Vc Vd
Compute the marginal probability for Vd
P(v) = P(va|vb)P(vb|vc)P(vc|vd)P(vd)
Compute the marginal probability for Vc
P(v) = P(va|vb)P(vb|vc)P(vd|vc)P(vc)
Compute the marginal probability for Vb
P(v) = P(va|vb)P(vc|vb)P(vd|vc)P(vb)
Compute the marginal probability for Va
P(v) = P(vb|va)P(vc|vb)P(vd|vc)P(va)
Several common terms !!
Belief Propagation on Chains
Compute exact marginals
Avoids re-computing common terms
Two Variables
[Figure: chain Va - Vb with two labels per variable. Unary potentials 2 and 5 on Va, 2 and 4 on Vb; pairwise potentials 3, 1, 1, 3 on the edge. The messages 11 and 17 are written next to the labels of Vb.]
Unary Potentials ψa(li)
Pairwise Potentials ψab(li,lk)
Marginal Probability P(vb = lj) = Σi ψa(li)ψb(lj)ψab(li,lj)/Z
Un-normalized Marginal Probability P’(vb = lj) = Σi ψa(li)ψb(lj)ψab(li,lj)
= ψb(lj)Σi ψa(li)ψab(li,lj)
Mab;j = Σi ψa(li)ψab(li,lj)
Mab;0 = 2 x 3 + 5 x 1 = 11
Mab;1 = 2 x 1 + 5 x 3 = 17
P’(vb = lj) = ψb(lj)Mab;j
P’(vb = l0) = 22 P’(vb = l1) = 68
Z = Σj P’(vb = lj) = 90
Marginal Probability P(vb = lj) = ψb(lj)Mab;j/Z
P(vb = l0) = 0.244… P(vb = l1) = 0.755…
O(h^2) !! Same as brute-force
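The two-variable computation can be checked in a few lines. This is a minimal sketch: the potentials are read off from the arithmetic on the slides, but which number belongs to label l0 and which to l1 (for Va and for the pairwise table) is an assumption, since the figure is not recoverable from the transcript. The names `psi_a`, `psi_b`, `psi_ab` and `M_ab` are illustrative.

```python
# Potentials consistent with the slide arithmetic (label order assumed).
psi_a = [5, 2]                # psi_a[i]
psi_b = [2, 4]                # psi_b[j]
psi_ab = [[1, 3], [3, 1]]     # psi_ab[i][j]
h = 2

# Message from Va to Vb: M_ab[j] = sum_i psi_a[i] * psi_ab[i][j]
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in range(h)) for j in range(h)]

# Un-normalized marginal of Vb, partition function, normalized marginal.
unnorm_b = [psi_b[j] * M_ab[j] for j in range(h)]
Z = sum(unnorm_b)
marg_b = [p / Z for p in unnorm_b]

# Brute-force check: sum the full product over the label of Va.
brute_b = [sum(psi_a[i] * psi_b[j] * psi_ab[i][j] for i in range(h))
           for j in range(h)]

print(M_ab)      # [11, 17]
print(unnorm_b)  # [22, 68]
print(Z)         # 90
```

With only two variables the message computation and the brute-force sum do the same O(h^2) work; the saving appears once messages are reused along a longer chain.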
Three Variables
[Figure: chain Va - Vb - Vc with two labels per variable. Unary potentials 2 and 5 on Va, 2 and 4 on Vb, 3 and 6 on Vc; pairwise potentials 3, 1, 1, 3 on edge (a,b) and 2, 1, 2, 3 on edge (b,c). Messages are written next to the labels as they are computed.]
P’(vc = lk) = Σj Σi ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψc(lk)Σj Σi ψa(li)ψb(lj)ψab(li,lj)ψbc(lj,lk)
= ψc(lk)Σj ψb(lj)Σi ψa(li)ψab(li,lj)ψbc(lj,lk)
= ψc(lk)Σj ψb(lj)ψbc(lj,lk)Σi ψa(li)ψab(li,lj)
= ψc(lk)Σj ψb(lj)ψbc(lj,lk)Mab;j, with Mab;0 = 11 and Mab;1 = 17
Mbc;k = Σj ψb(lj)ψbc(lj,lk)Mab;j
Mbc;0 = 4 x 2 x 11 + 2 x 2 x 17 = 156
Mbc;1 = 146
P’(vc = lk) = ψc(lk)Mbc;k
NOTE: Mbc;k “includes” Mab;j
Z = 156 x 3 + 146 x 6 = 1344
P(vc = 0) = 0.35
P(vc = 1) = 0.65
O(nh^2) Better than brute-force
Three Variables
What about P(vb = lj)?
P’(vb = lj) = Σk Σi ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψb(lj)Σk Σi ψa(li)ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψb(lj)Σk ψc(lk)Σi ψa(li)ψab(li,lj)ψbc(lj,lk)
= ψb(lj)Σk ψc(lk)ψbc(lj,lk)Σi ψa(li)ψab(li,lj)
= ψb(lj)Mab;jΣk ψc(lk)ψbc(lj,lk)
Mcb;j = Σk ψc(lk)ψbc(lj,lk)
NOTE: Mcb;j does not “include” Mbc;k
P’(vb = lj) = ψb(lj)Mab;jMcb;j
Mcb;0 = 12 Mcb;1 = 24
Z = 11 x 12 x 4 + 17 x 24 x 2 = 1344
P(vb = 0) = 0.39
P(vb = 1) = 0.61
O(nh^2) Better than brute-force
Three Variables
What about P(va = li)?
P’(va = li) = Σj Σk ψa(li)ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψa(li)Σj Σk ψb(lj)ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψa(li)Σj ψb(lj)Σk ψc(lk)ψab(li,lj)ψbc(lj,lk)
= ψa(li)Σj ψb(lj)ψab(li,lj)Σk ψc(lk)ψbc(lj,lk)
= ψa(li)Σj ψb(lj)ψab(li,lj)Mcb;j
Mba;i = Σj ψb(lj)ψab(li,lj)Mcb;j
NOTE: Mba;i “includes” Mcb;j
P’(va = li) = ψa(li)Mba;i
Mba;0 = 192 Mba;1 = 192
Z = 192 x 2 + 192 x 5 = 1344
P(va = 0) = 0.71
P(va = 1) = 0.29
O(nh^2) Better than brute-force
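All messages and marginals of the three-variable example can be reproduced as follows. This is a sketch under stated assumptions: the potentials are reconstructed from the products shown on the slides, and the assignment of each number to label 0 or 1 is chosen so that the messages, Z and the rounded marginals on the slides come out; the figure itself is not available, so that assignment remains an assumption. All variable names are illustrative.

```python
# Potentials reconstructed from the slide arithmetic (label order assumed).
psi_a, psi_b, psi_c = [5, 2], [4, 2], [3, 6]
psi_ab = [[1, 3], [3, 1]]     # psi_ab[i][j]
psi_bc = [[2, 1], [2, 3]]     # psi_bc[j][k]
R = range(2)

# Left-to-right pass.
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in R) for j in R]
M_bc = [sum(psi_b[j] * psi_bc[j][k] * M_ab[j] for j in R) for k in R]

# Right-to-left pass.
M_cb = [sum(psi_c[k] * psi_bc[j][k] for k in R) for j in R]
M_ba = [sum(psi_b[j] * psi_ab[i][j] * M_cb[j] for j in R) for i in R]

# Un-normalized marginals: unary potential times all incoming messages.
un_a = [psi_a[i] * M_ba[i] for i in R]
un_b = [psi_b[j] * M_ab[j] * M_cb[j] for j in R]
un_c = [psi_c[k] * M_bc[k] for k in R]
Z = sum(un_c)

print(M_ab, M_bc, M_cb, M_ba)   # [11, 17] [156, 146] [12, 24] [192, 192]
print(Z)                        # 1344
print([round(p / Z, 2) for p in un_a])   # [0.71, 0.29]
print([round(p / Z, 2) for p in un_b])   # [0.39, 0.61]
print([round(p / Z, 2) for p in un_c])   # [0.35, 0.65]
```

Each of the four messages is computed once and reused, and every un-normalized marginal sums to the same Z.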
Belief Propagation on Chains
Start from left, go to right
For current edge (a,b), compute
Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i
Repeat till the end of the chain
Start from right, go to left
Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i
Repeat till the end of the chain
Belief Propagation on Chains
P’(va = li,vb = lj) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j
P’(va = li) = ψa(li)ΠnMna;i
Normalize to compute true marginals
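The two passes and the marginal formula can be written as one short routine. The following is a minimal sketch, assuming every variable has the same number of labels h and that potentials are given as plain lists; the function name `chain_bp` and its argument layout are hypothetical.

```python
def chain_bp(unary, pairwise):
    """Sum-product belief propagation on a chain.

    unary[a][i]       = psi_a(l_i) for variable a = 0..n-1
    pairwise[a][i][k] = psi_{a,a+1}(l_i, l_k) for edge (a, a+1)
    Returns (marginals, Z).
    """
    n, h = len(unary), len(unary[0])
    fwd = [[1.0] * h for _ in range(n)]   # fwd[a]: message into a from a-1
    bwd = [[1.0] * h for _ in range(n)]   # bwd[a]: message into a from a+1

    # Start from left, go to right.
    for a in range(n - 1):
        for k in range(h):
            fwd[a + 1][k] = sum(unary[a][i] * pairwise[a][i][k] * fwd[a][i]
                                for i in range(h))
    # Start from right, go to left.
    for a in range(n - 1, 0, -1):
        for i in range(h):
            bwd[a - 1][i] = sum(unary[a][k] * pairwise[a - 1][i][k] * bwd[a][k]
                                for k in range(h))

    # P'(v_a = l_i) = psi_a(l_i) * product of incoming messages.
    unnorm = [[unary[a][i] * fwd[a][i] * bwd[a][i] for i in range(h)]
              for a in range(n)]
    Z = sum(unnorm[0])
    return [[p / Z for p in row] for row in unnorm], Z


# Usage with potentials consistent with the three-variable arithmetic
# in these slides (label order is an assumption).
marginals, Z = chain_bp([[5, 2], [4, 2], [3, 6]],
                        [[[1, 3], [3, 1]], [[2, 1], [2, 3]]])
print(Z)  # 1344.0
```

Each pass does O(h^2) work per edge, so the total cost is O(nh^2) for all n marginals.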
Outline
• Belief Propagation on Chains
• Belief Propagation on Trees
• Loopy Belief Propagation
Pearl, 1988
Belief Propagation on Trees
[Figure: tree with Va and Vb each connected to Vc, and Vc connected to Vd; messages are written on the edges as they are computed]
P’(vd = lo) = ΣkΣj Σi ψa(li)ψb(lj)ψc(lk)ψd(lo)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)
= ψd(lo)ΣkΣj Σi ψa(li)ψb(lj)ψc(lk)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)
= ψd(lo)Σkψc(lk)Σj Σi ψa(li)ψb(lj)ψac(li,lk)ψbc(lj,lk)ψcd(lk,lo)
= ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj Σi ψa(li)ψb(lj)ψac(li,lk)ψbc(lj,lk)
= ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)Σi ψa(li)ψac(li,lk)ψbc(lj,lk)
= ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)ψbc(lj,lk)Σi ψa(li)ψac(li,lk)
Mac;k = Σi ψa(li)ψac(li,lk)
= ψd(lo)Σkψc(lk)ψcd(lk,lo)Σj ψb(lj)ψbc(lj,lk)Mac;k
Mbc;k = Σj ψb(lj)ψbc(lj,lk)
= ψd(lo)Σkψc(lk)ψcd(lk,lo)Mbc;kMac;k
Mcd;o = Σkψc(lk)ψcd(lk,lo)Mbc;kMac;k
= ψd(lo)Mcd;o
P’(vc = lk) = ψc(lk)Mac;kMbc;kMdc;k
P’(vb = lj) = ψb(lj)Mcb;j
P’(va = li) = ψa(li)Mca;i
Belief Propagation on Trees
Start from leaf, go towards root
For current edge (a,b), compute
Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i
Repeat till the root is reached
Start from root, go towards leaves
Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i
Repeat till the leaves are reached
Belief Propagation on Trees
P’(va = li,vb = lj) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j
P’(va = li) = ψa(li)ΠnMna;i
Normalize to compute true marginals
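The message update and the marginal formula carry over to any tree. The sketch below is one possible way to organise it, not the lecture's implementation: instead of an explicit leaf-to-root and root-to-leaf schedule it computes each message recursively and caches it, which visits the same messages in a valid order. The function name `tree_bp`, the dictionary layout, and the potentials in the usage example are all hypothetical.

```python
from functools import lru_cache

def tree_bp(unary, pairwise):
    """Sum-product belief propagation on a tree.

    unary[a]            = list of psi_a(l_i)
    pairwise[(a, b)]    = table psi_ab[i][k], stored once per undirected edge
    Returns a dict of normalized marginals, one list per variable.
    """
    nbrs = {a: [] for a in unary}
    for a, b in pairwise:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def psi(a, b, i, k):
        # Look the table up in whichever orientation the edge was stored.
        if (a, b) in pairwise:
            return pairwise[(a, b)][i][k]
        return pairwise[(b, a)][k][i]

    @lru_cache(maxsize=None)
    def message(a, b):
        """M_ab;k = sum_i psi_a(l_i) psi_ab(l_i, l_k) prod_{n != b} M_na;i."""
        out = []
        for k in range(len(unary[b])):
            total = 0.0
            for i in range(len(unary[a])):
                term = unary[a][i] * psi(a, b, i, k)
                for n in nbrs[a]:
                    if n != b:
                        term *= message(n, a)[i]
                total += term
            out.append(total)
        return tuple(out)

    marginals = {}
    for a in unary:
        unnorm = list(unary[a])
        for n in nbrs[a]:
            unnorm = [u * m for u, m in zip(unnorm, message(n, a))]
        Z = sum(unnorm)
        marginals[a] = [u / Z for u in unnorm]
    return marginals


# Usage on the tree of the slides (Va - Vc, Vb - Vc, Vc - Vd) with
# illustrative potentials that do not come from the lecture.
unary = {'a': [5, 2], 'b': [4, 2], 'c': [3, 6], 'd': [1, 2]}
pairwise = {('a', 'c'): [[1, 3], [3, 1]],
            ('b', 'c'): [[2, 1], [2, 3]],
            ('c', 'd'): [[1, 2], [3, 1]]}
print(tree_bp(unary, pairwise)['d'])
```

Because of the cache, each directed edge's message is computed exactly once, as in the two-pass schedule.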
Outline
• Belief Propagation on Chains
• Belief Propagation on Trees
• Loopy Belief Propagation
Pearl, 1988; Murphy et al., 1999
Loopy Belief Propagation
Initialize all messages to 1
In some order of edges, update messages
Mab;k = Σiψa(li)ψab(li,lk)Πn≠bMna;i
Until Convergence
Rate of changes in messages < threshold
Loopy Belief Propagation
[Figure: cycle Va - Vb - Vc - Vd - Va with messages Mab, Mbc, Mcd, Mda passed around the loop]
Mbc contains Mab
Mcd contains Mbc
Mda contains Mcd
Overcounting!!
Loopy Belief Propagation
Convergence of the message updates is Not Guaranteed !!
Loopy Belief Propagation
B’ab(i,j) = ψa(li)ψb(lj)ψab(li,lj)Πn≠bMna;iΠn≠aMnb;j
B’a(i) = ψa(li)ΠnMna;i
Normalize to compute beliefs Ba(i), Bab(i,j)
At convergence Σj Bab(i,j) = Ba(i)
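The loopy procedure can be sketched as below. This is a minimal illustration under assumptions, not a reference implementation: messages start at 1 and are updated in a fixed edge order as on the slides, but each updated message is also rescaled to sum to 1 (an addition that only prevents numerical overflow and does not change the normalized beliefs), and the stopping rule uses the largest absolute change in any message. The names `loopy_bp`, `max_iters` and `tol`, and the potentials in the usage example, are hypothetical.

```python
def loopy_bp(unary, pairwise, max_iters=100, tol=1e-9):
    """Loopy sum-product; returns (beliefs, converged)."""
    nbrs = {a: [] for a in unary}
    for a, b in pairwise:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def psi(a, b, i, k):
        # Look the table up in whichever orientation the edge was stored.
        if (a, b) in pairwise:
            return pairwise[(a, b)][i][k]
        return pairwise[(b, a)][k][i]

    # Initialize all messages to 1, one message per directed edge.
    msg = {(a, b): [1.0] * len(unary[b]) for a in unary for b in nbrs[a]}

    converged = False
    for _ in range(max_iters):
        change = 0.0
        for (a, b) in list(msg):                      # fixed order of edges
            new = []
            for k in range(len(unary[b])):
                total = 0.0
                for i in range(len(unary[a])):
                    term = unary[a][i] * psi(a, b, i, k)
                    for n in nbrs[a]:
                        if n != b:
                            term *= msg[(n, a)][i]
                    total += term
                new.append(total)
            s = sum(new)
            new = [x / s for x in new]                # rescaling (assumption)
            change = max(change,
                         max(abs(x - y) for x, y in zip(new, msg[(a, b)])))
            msg[(a, b)] = new
        if change < tol:                              # may never happen
            converged = True
            break

    beliefs = {}
    for a in unary:
        unnorm = list(unary[a])
        for n in nbrs[a]:
            unnorm = [u * m for u, m in zip(unnorm, msg[(n, a)])]
        s = sum(unnorm)
        beliefs[a] = [u / s for u in unnorm]
    return beliefs, converged


# Usage on the 4-cycle Va - Vb - Vc - Vd - Va with illustrative potentials.
unary = {'a': [5, 2], 'b': [4, 2], 'c': [3, 6], 'd': [1, 2]}
pairwise = {('a', 'b'): [[1, 3], [3, 1]], ('b', 'c'): [[2, 1], [2, 3]],
            ('c', 'd'): [[1, 2], [3, 1]], ('d', 'a'): [[2, 1], [1, 2]]}
beliefs, converged = loopy_bp(unary, pairwise)
print(converged, beliefs['a'])
```

On a graph without loops this routine returns the exact marginals; on a graph with loops the beliefs are only approximations, and the `converged` flag should be checked since convergence is not guaranteed.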