convergent message-passing algorithms for inference over general graphs with convex free energies...

Post on 17-Dec-2015

214 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Convergent Message-Passing Algorithms for Inference over General Graphs with Convex

Free Energies

Tamir Hazan, Amnon Shashua

School of Computer Science and EngineeringHebrew University of Jerusalem

2

Input:

Graphical Models - Background

)(),...,( 1 xxxp n ?)( ixpOutput:

3

Input: )(),...,( 1 xxxp n ?)( ixpOutput:

Graphical Models - Background

}2,1{ }4,3,2{ }5,4,2{

1x 2x 3x 4x 5x

4

Input: )(),...,( 1 xxxp n ?)( ixpOutput:

aiNc

iiciai xmxn\)(

)()(

iaNj

jajax

aiia xnxmia \)(\

)()(:)( xx

Belief Propagation:

Graphical Models - Background

}2,1{ }4,3,2{ }5,4,2{

1x 2x 3x 4x 5x

5

BP - Variational Methods

i

iixbbbHdbHxxb

i

)()1()()(ln)(min,

2x32 d

Bethe free energy: Approximating the marginals )(),( ii xbxb

6

BP - Variational Methods

Stationary Bethe free energy = BP fixed points (Yedidia, Freeman, Weiss ’01)

When the factor graph has no cycles Bethe is convex over the marginalization constraints and BP is exact.

i

iixbbbHdbHxxb

i

)()1()()(ln)(min,

Bethe free energy: Approximating the marginals )(),( ii xbxb

Factor graph has cycles: Bethe is non-convex and BP might not converge.

ixxii xbxb

\)()(

2x32 d

7

BP - Variational Methods

i

iixbHcbHcxxb )()()(ln)(min

TRW free energy (Wainwright, Jaakkola, Willsky ’02):

c Weighted number of spanning trees through an edge α

)(1

iNi cc

TRW free energy is convex over the marginalization constraints.

Convergent message passing algorithm for cliques of size 2(Globerson & Jaakkola, ’07)

8

“convex free energy” is convex (over the marginalization constraints) for all factor graphs if there exists such that0,,c ii cc

)(

c

Ni

icc

)(

ciNiii cc

and

Claim: (Pakzad and Anantharam ‘02, Heskes ’04, Weiss et al. ’07)

Convex Free Energies

i

iixbHcbHcxxb )()()(ln)(min

9

“convex free energy” is convex (over the marginalization constraints) for all factor graphs if there exists such that0,,c ii cc

)(

c

Ni

icc

)(

ciNiii cc

and

Claim: (Pakzad and Anantharam ‘02, Heskes ’04, Weiss et al. ’06)

Convex Free Energies

i

iixbHcbHcxxb )()()(ln)(min

i iNi

iiiix

aa bHbHcbHcbHcxbx)(,

))()(()()()()(lnmin

10

Background - SummaryBethe free energy TRW free energy convex free energy

0,,c ii cc

non-convex for general graphs

specific convexification of free energy

general form of convexification with parameters

Contributions:

1) Convergent message passing algorithm for convex free energy

2) Heuristic for choosing good convex free energy.

Belief Propagation Globerson Jaakkola ’07 message passing algorithm for |α|=2

?message passing algorithm for |α|≥2

11

strictly convex& differentiable

n

ii

Rbbhbf

m1

)()(min

Strictly convex& proper ( for some b) )(bhi

Convex Belief Propagation

12

strictly convex& differentiable

n

ii

Rbbhbf

m1

)()(min

Strictly convex& proper ( for some b) )(bhi

i iNiiii

xaa bHbHcbHcbHcxbx

)(

))()(()()()()(lnmin

i marginals of xi agree

Convex Belief Propagation

13

Convex Belief Propagation

strictly convex& differentiable

n

ii

Rbbhbf

m1

)()(min

Strictly convex& proper ( for some b) )(bhi

i iNiiii

xaa bHbHcbHcbHcxbx

)(

))()(()()()()(lnmin

i marginals of xi agree

))()(()()( iiii

i

bHbHcbHcbh Whenever marginals of xi agree

Otherwise

14

)()(min bhbf ib

Output:

For t=1,2,...

For i=1,2,...,n

)( *bfii

*b

ij ji

)()(minarg)(

* bhbbfb iiT

hdomainb i

Convex Message Passing

* We also have a similar parallel message passing algorithm

1h

ih

nh

1b

b

mb

1,i

mi,

,1

,n

15

)()(min bhbf ib

Output:

For t=1,2,...

For i=1,2,...,n

)( *bfii

*b

ij ji

)()(minarg)(

* bhbbfb iiT

hdomainb i

Convex Message Passing

* We also have a similar parallel message passing algorithm

1h

ih

nh

1b

b

mb

1,i

mi,

,1

,n

16

)(bh

)(bg

*b

Primer on Fenchel Duality

slope

)(* h

)(* g

)(max)(* bhxh T

b

)(min)(* bgbg T

b

convex conjugate of h(b)

concave conjugate of g(b)

Primal )()( bgbh )()( ** hg Dual

17

)(bh

)(bg

*b

Primer on Fenchel Duality

)(slope ** bg

)( ** h

)( ** g

)(),()(max)()(min **** bgandhgbgbhb

More Conveniently: )()(minmax)()(min *

hbbgbhbg Tb

b

)( ** bg

*b

18

Sequential message-passingFenchel duality theorem

)()(minmax)()(min *

hbbgbhbg Tb

b )( ** bg

19

Generalized Fenchel duality theorem

n

i ii

n

i iT

b

n

i i hbbfbhbfk

1

*

1,...,1)()()(minmax)()(min

1

Sequential message-passingFenchel duality theorem

)()(minmax)()(min *

hbbgbhbg Tb

b )( ** bg

20

Generalized Fenchel duality theorem

Sequential message-passingFenchel duality theorem

)()(minmax)()(min *

hbbgbhbg Tb

b )( ** bg

Block update approach: Optimize i Set

ij ji

)()(minmax *iii

Ti

Tb hbbbf

i

n

i ii

n

i iT

b

n

i i hbbfbhbfk

1

*

1,...,1)()()(minmax)()(min

1

21

Generalized Fenchel duality theorem

Sequential message-passingFenchel duality theorem

)()(minmax *iii

Ti

Tb hbbbf

i

)()(minmax)()(min *

hbbgbhbg Tb

b )( ** bg

Block update approach: Optimize i Set

ij ji

n

i ii

n

i iT

b

n

i i hbbfbhbfk

1

*

1,...,1)()()(minmax)()(min

1

22

Generalized Fenchel duality theorem

Sequential message-passingFenchel duality theorem

)( ** bg

)()(min)()(minmax * bhbbfhbbbf iT

biiiT

iT

bi

)()(minmax)()(min *

hbbgbhbg Tb

b

ii bf )( *

Block update approach: Optimize i Set

ij ji

n

i ii

n

i iT

b

n

i i hbbfbhbfk

1

*

1,...,1)()()(minmax)()(min

1

23

Generalized Fenchel duality theorem

Sequential message-passingFenchel duality theorem

)()(min)()(minmax * bhbbfhbbbf iT

biiiT

iT

bi

)()(minmax)()(min *

hbbgbhbg Tb

b )( ** bg

ii bf )( *

Algorithm: Repeat until convergence

Output: *b

Block update approach: Optimize i Set

ij ji

n

i ii

n

i iT

b

n

i i hbbfbhbfk

1

*

1,...,1)()()(minmax)()(min

1

24

Sequential message-passing

Bregman’s successive projection algorithm (Hildreth, Dykstra, Csiszar, Tseng)

otherwise

Cbifbh i

i

0)(

Output:

For t=1,2,...

For i=1,2,...,n

)( *bfii

*b

ij ji

)()(minarg)(

* bhbbfb iiT

hdomainb i

)()(min bhbf ib

)(min bfiCb

1h

ih

nh

1b

b

mb

1,i

mi,

,1

,n

25

Convex Belief Propagation

n

ii

Rbbhbf

m1

)()(min

i iNiiii

xaa bHbHcbHcbHcxbx

)(

))()(()()()()(lnmin

i marginals of xi agree

Output:

For t=1,2,...

For i=1,2,...,n

)( *bfii *b

ij ji

)()(minarg)(

* bhbbfb iiT

hdomainb i

1h

ih

nh

1b

b

mb

1,i

mi ,

,1

,n

26

Convex Belief Propagation

Our algorithm:

ia

i

x

c

iaNjajaaiia xnxm

\

ˆ/1

\)(

)()(:)(x

x

)(

ˆ/ˆ )()(iN

icc

ii xmxb ii

i

c

ii

ii

iNjji xm

xbxnxxn

icic

)(

)()()()(

ˆ

\)(

)()(minarg)(

* bhbbfb iiT

hdomainb i

Closed form solution for

Belief propagation:

)(

)()(iNa iiaii xmxb

)(

)()(

ii

iiiai xm

xbxn

iaNj

jajax

aiia xnxmia \)(\

)()(:)( xx

27

Convex Belief Propagation

Our algorithm:

ia

i

x

c

iaNjajaaiia xnxm

\

ˆ/1

\)(

)()(:)(x

x

)(

ˆ/ˆ )()(iN

icc

ii xmxb ii

i

c

ii

ii

iNjji xm

xbxnxxn

icic

)(

)()()()(

ˆ

\)(

)()(minarg)(

* bhbbfb iiT

hdomainb i

Closed form solution for

Belief propagation:

)(

)()(iNa iiaii xmxb

)(

)()(

ii

iiiai xm

xbxn

iaNj

jajax

aiia xnxmia \)(\

)()(:)( xx

28

3) For general and (non-convex!) our algorithm,1, ccc i

1) when f(),h i() are non-convex if the algorithm

converges then it is to a stationary point.

2) Set (Bethe free energy, non-convex!)

our message passing algorithm = belief propagation.

Convex Belief Propagation

Interesting Notes:

i ib bhbf )()(min

,0,1,1 iii cdcc

aiNci

ciai xmxn

i\)(

)(:)(

iaNj

jajx

ac

iia xnxmia

a\)(\

/1 )()(:)(x

x

Open question =? “Fractional BP” (Wiegerinck, Heskes ‘03)

,0ic

29

Determine Convex Energy

i

iix

aa bHcbHcxbx )()()()(lnmin

Heuristic: Find convex energy while is close to 1 as possible c

Motivation: - Bethe approximation1c

)(c-1c

iNi

30

Experiments – Ising Model

gridnnEvery variable has two values

variables2n}1,1{ix

i

ijfield

interaction

0ij attractive

0ij mixed

)exp(),( jiijjiij xxxx )exp()( iiii xx

i iiEji jiijn xxxxxp )(),(),...,(

,1

31

Experiments – Efficiency

Grid size

Seconds

32

Approximating Marginals – Ising model

convex free energies Vs. Bethe free energy

Marginal Error

Interaction strength

33

convex free energies Vs. Bethe free energy

Marginal Error

Interaction strength

Approximating Marginals – Ising model

34

Approximating Marginals – Random Graph

10 vertices. Edge density 30% (almost tree), 50% (far from tree)

TRW free energy Vs. L2-convex free energy

35

• Novel perspective of message passing as primal-dual optimization for the general class of

• Convex free energy mapped to

• Heuristic for setting convex free energy by “convexifying” Bethe free energy

• Algorithm applies to general clique sizes; BP is a particular case of our algorithm

• Region Graphs energies (Kikuchi)

• Future work: Theoretical foundation on our heuristic for setting the convex free energy.

Summary

i ib bhbf )()(min

i ib bhbf )()(min

top related