What Energy Functions Can Be Minimized Using Graph Cuts?
Shai Bagon
Advanced Topics in Computer Vision
June 2010
What is an Energy Function?
E: suggested solution → a number
For a given problem, e.g. image segmentation:
[Figure: candidate segmentations labeled with energies 237 and -20]
Useful Energy function:
1. Good solution ↔ low energy
2. Tractable: can be minimized
Families of Functions or Outline
• F2 submodular
• Non submodular
• F3
• Beyond F3
Foreground Selection
Let
yi – color of ith pixel
xi ϵ {0,1} – BG/FG labels (variables)
Given BG/FG scribbles:
Pr(xi|yi) = how likely each pixel is to be FG/BG
Pr(xm|xn) = adjacent pixels should have the same label
F2 energy:
E(x) = ∑i Ei(xi) + ∑ij Eij(xi,xj)
[Figure: graphical model with pixel colors yi, labels xi, and neighboring labels xm, xn]
Submodular
Known concept from set-functions:
E(x) = ∑i Ei(xi) + ∑ij Eij (xi, xj), xi ϵ {0,1}
f(x) + f(y) ≥ f(x ∪ y) + f(x ∩ y), ∀ x, y ⊆ S
For |S| = 2: f(0,1) + f(1,0) ≥ f(0,0) + f(1,1)
Eij(xi,xj):
          xj = 0   xj = 1
xi = 0      A        B
xi = 1      C        D
What does it mean?
B+C-A-D ≥ 0
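The condition B+C-A-D ≥ 0 can be checked mechanically for any pairwise table. The following is a minimal sketch, assuming the table layout A = Eij(0,0), B = Eij(0,1), C = Eij(1,0), D = Eij(1,1); the function name `is_submodular` and the two example tables are illustrative and not taken from the slides.

```python
def is_submodular(table):
    """table[a][b] = Eij(xi=a, xj=b) for a, b in {0, 1}."""
    (A, B), (C, D) = table
    # Disagreeing labels (B, C) must cost at least as much as agreeing ones (A, D).
    return B + C - A - D >= 0


# Hypothetical examples: a smoothness term and its opposite.
potts = [[0, 1], [1, 0]]       # penalizes different labels
anti_potts = [[1, 0], [0, 1]]  # penalizes equal labels

print(is_submodular(potts), is_submodular(anti_potts))
```

A smoothness term that prefers equal labels passes the test, while a term that rewards disagreement does not.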
How to Minimize?
E(x) = ∑i Ei(xi) + ∑ij Eij (xi, xj), xi ϵ {0,1}
Local “beliefs”: data term
Prior knowledge: smoothness term
F2 submodular
Graph Partitioning
A weighted graph G = (V, E, w), with edge weights wij
Special nodes: s, t
s-t cut: a partition (S, T) with S ∪ T = V, S ∩ T = ∅, s ϵ S, t ϵ T
Cost of a cut: Cut(S, T) = ∑ wij over edges with i ϵ S, j ϵ T
Nice property: 1:1 mapping
s-t cut ↔ {0,1}^(|V|-2)
Graph Partitioning - Energy
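E(x) = ∑i Ei(xi) + ∑ij Eij (xi, xj)
Eij(xi,xj):
          xj = 0   xj = 1
xi = 0      A        B
xi = 1      C        D
Decomposition of the pairwise table (rows xi = 0, 1; columns xj = 0, 1):
[A B; C D] = A + [0 0; C-A C-A] + [0 D-C; 0 D-C] + [0 B+C-A-D; 0 0]
[Figure: graph on nodes s, i, j, t with edge weights Ei(0), Ej(1), C-A, D-C, B+C-A-D]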
Graph Partitioning - Energy
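E(x) = ∑i Ei(xi) + ∑ij Eij (xi, xj)
Labeling from the cut: xv = 0 if v ϵ S, xv = 1 if v ϵ T
Example with xi = 0, xj = 1, where B = Eij(0,1):
E(x) = Cut(S, T) + A = ∑ wij (i ϵ S, j ϵ T) + A = Ei(0) + Ej(1) + (D-C) + (B+C-A-D) + A
[Figure: graph on nodes s, i, j, t with edge weights Ei(0), Ej(1), C-A, D-C, B+C-A-D and a cut separating i from j]
s-t cut ↔ binary assignment
cut cost ↔ energy of assignment
min cut ↔ energy minimization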
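The mapping from energy to cut can be verified by brute force on a small example. The sketch below assumes the convention xv = 0 if v ϵ S and xv = 1 if v ϵ T, and folds the unary terms and the C-A, D-C contributions into a single coefficient per node (a reparameterization chosen here to keep all capacities non-negative; the slides draw Ei(0) and Ej(1) as separate edges). The names `build_graph`, `cut_cost`, and the example numbers are hypothetical.

```python
from itertools import product


def build_graph(unary, pairwise):
    """unary[i] = (Ei(0), Ei(1)); pairwise[(i, j)] = ((A, B), (C, D)).
    Returns (edges, const) with edges[(u, v)] = capacity >= 0."""
    theta = [e1 - e0 for e0, e1 in unary]   # cost of xi = 1 relative to xi = 0
    const = sum(e0 for e0, _ in unary)
    edges = {}
    for (i, j), ((A, B), (C, D)) in pairwise.items():
        assert B + C - A - D >= 0, "pairwise term must be submodular"
        const += A
        theta[i] += C - A
        theta[j] += D - C
        edges[(i, j)] = edges.get((i, j), 0) + (B + C - A - D)  # cut when xi=0, xj=1
    for i, w in enumerate(theta):
        if w >= 0:
            edges[('s', i)] = w      # cut when i is in T (xi = 1)
        else:
            edges[(i, 't')] = -w     # cut when i is in S (xi = 0)
            const += w
    return edges, const


def cut_cost(edges, x):
    def in_T(v):
        return v == 't' or (v != 's' and x[v] == 1)
    return sum(w for (u, v), w in edges.items() if not in_T(u) and in_T(v))


def energy(unary, pairwise, x):
    return (sum(unary[i][xi] for i, xi in enumerate(x))
            + sum(t[x[i]][x[j]] for (i, j), t in pairwise.items()))


unary = [(2, 5), (4, 1), (3, 3)]
pairwise = {(0, 1): ((0, 3), (2, 1)), (1, 2): ((1, 4), (2, 0))}
edges, const = build_graph(unary, pairwise)
for x in product((0, 1), repeat=3):
    assert cut_cost(edges, x) + const == energy(unary, pairwise, x)
```

Every one of the 2^3 assignments has cut cost plus a constant equal to its energy, so the minimum cut corresponds to the minimum-energy assignment.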
Recap
F2 submodular:
E(x) = ∑i Ei(xi) + ∑ij Eij (xi, xj)
Eij(1,0)+Eij(0,1)≥Eij(0,0)+Eij(1,1)
Mapping from energy to graph partition
Min Energy = computing min-cut
Global optimum in polynomial time for submodular functions!
Next…
Multi-label F2
E(x)=∑i Ei(xi) + ∑ij Eij(xi,xj) s.t. xi ϵ {1,…,L}
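– Fusion moves: solving binary sub-problems
– Applications to stereo, stitching, segmentation…
Fusion: current labeling ● suggested labeling = fused labeling
“Alpha expansion”: the suggested labeling is a single label α everywhere
Solve binary problem: xi = 0 (keep the current label) or xi = 1 (take the suggested label)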
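A fusion move can be sketched as follows. This is an illustration under stated assumptions: the binary sub-problem is solved here by exhaustive search, which only works for a handful of pixels (in practice it would be handed to a graph-cut or QPBO solver), and the names `fuse`, `alpha_expansion_sweep`, the Potts pairwise term, and the toy costs are all hypothetical.

```python
from itertools import product


def energy(labels, unary, pairwise, edges):
    return (sum(unary[i][l] for i, l in enumerate(labels))
            + sum(pairwise(labels[i], labels[j]) for i, j in edges))


def fuse(current, proposal, unary, pairwise, edges):
    """Pick, per pixel, the current label (xi = 0) or the proposed one (xi = 1)."""
    best = list(current)
    best_e = energy(best, unary, pairwise, edges)
    for x in product((0, 1), repeat=len(current)):   # brute-force binary problem
        cand = [p if xi else c for xi, c, p in zip(x, current, proposal)]
        e = energy(cand, unary, pairwise, edges)
        if e < best_e:
            best, best_e = cand, e
    return best


def alpha_expansion_sweep(labels, num_labels, unary, pairwise, edges):
    """One sweep of fusion moves with constant proposals."""
    for alpha in range(num_labels):
        labels = fuse(labels, [alpha] * len(labels), unary, pairwise, edges)
    return labels


unary = [[0, 2, 3], [2, 0, 3], [3, 2, 0], [3, 2, 0]]   # 4 pixels, 3 labels
edges = [(0, 1), (1, 2), (2, 3)]
potts = lambda a, b: 0 if a == b else 1
fused = fuse([0, 0, 0, 0], [1, 1, 2, 2], unary, potts, edges)
```

Because the all-zeros and all-ones binary assignments reproduce the current and the suggested labeling, the fused labeling never has higher energy than either input.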
Stereo matching (see http://vision.middlebury.edu/stereo/)
[Figure: ground truth; pairwise MRF [Boykov et al. ‘01]]
slide by Carsten Rother, ICCV’09
Input:
Panoramic stitching
slide by Carsten Rother, ICCV’09
Panoramic stitching
slide by Pushmeet Kohli, ICCV’09
AutoCollage
http://research.microsoft.com/en-us/um/cambridge/projects/autocollage/ [Rother et al. Siggraph ‘05]
Next…
Multi-label F2
E(x)=∑i Ei(xi) + ∑ij Eij(xi,xj) s.t. xi ϵ {1,…,L}
– Fusion moves: solving binary sub-problems
– Applications to stereo, stitching, segmentation…
Non-submodular
Beyond pair-wise interactions: F3
Merging Regions
[Figure: input image, regions (Ncuts), “edge” prob., with examples of a “weak” edge and a “strong” edge]
pi – prob. of boundary i being an edge
GOAL: Find labeling xi ϵ {0,1} that max:
Pr(x) = ∏_{i: xi=1} pi · ∏_{i: xi=0} (1 - pi)
Taking -log, min:
-log Pr(x) = -∑_{i: xi=1} log pi - ∑_{i: xi=0} log(1 - pi)
Merging Regions
-∑_{i: xi=1} log pi - ∑_{i: xi=0} log(1 - pi)
= -∑_{i: xi=1} log pi + ∑_{i: xi=1} log(1 - pi) - ∑_{i: xi=1} log(1 - pi) - ∑_{i: xi=0} log(1 - pi)
  (adding and subtracting the same number)
= ∑_{i: xi=1} log((1 - pi)/pi) - ∑_i log(1 - pi)
= C + ∑_i wi xi
where wi := log((1 - pi)/pi) and C = -∑_i log(1 - pi)
wi > 0 ⟺ pi < 1/2: likely to be merged
wi < 0 ⟺ pi > 1/2: likely to be an edge
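The rewriting of the negative log-probability as a constant plus a linear term can be checked numerically. This is a small sketch assuming wi = log((1 - pi)/pi) and C = -∑_i log(1 - pi); the function names and the probabilities in `p` are made up for illustration.

```python
import math
from itertools import product


def neg_log_prob(x, p):
    """-log Pr(x), with factor pi when xi = 1 and (1 - pi) when xi = 0."""
    return -sum(math.log(pi) if xi else math.log(1 - pi) for xi, pi in zip(x, p))


def linear_form(x, p):
    """C + sum_i wi * xi."""
    C = -sum(math.log(1 - pi) for pi in p)
    w = [math.log((1 - pi) / pi) for pi in p]
    return C + sum(wi * xi for wi, xi in zip(w, x))


p = [0.9, 0.2, 0.5, 0.7]   # hypothetical edge probabilities
for x in product((0, 1), repeat=len(p)):
    assert math.isclose(neg_log_prob(x, p), linear_form(x, p), abs_tol=1e-12)

weights = [math.log((1 - pi) / pi) for pi in p]
```

Boundaries with pi > 1/2 receive a negative weight, so labeling them as edges (xi = 1) lowers the energy.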
Merging Regions
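Solving for edges:
argmin_x C + ∑_i wi xi
Consistency constraints: no “dangling” edge
At a junction J of boundaries x1, x2, x3 (weights wi on boundaries xi):
x1 x2 x3 | EJ
 0  0  0 | 0
 1  1  1 | 0
 0  1  1 | 0
 0  0  1 | λ
No longer pair-wise: F3
EJ(x1, x2, x3): cubic in x1, x2, x3; equals λ for a dangling edge (exactly one xi = 1) and 0 otherwise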
Minimization trick
-x1 x2 x3 = min_{z ϵ {0,1}} -z (x1 + x2 + x3 - 2)
Freedman D., Turek MW, Graph cuts with many pixel interactions: theory and applications to shape modeling. Image Vision Computing 2010
In general, for K variables:
-∏_{i=1..K} xi = min_{z ϵ {0,1}} -z (∑_{i=1..K} xi - (K - 1))
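The minimization trick replaces a negative product of K binary variables by a minimum over one auxiliary binary variable z of an expression in which z only multiplies single variables. A brute-force check of the identity -∏ xi = min_z -z(∑ xi - (K - 1)) might look like the sketch below; the function names are illustrative.

```python
from itertools import product


def neg_product(x):
    """-x1 * x2 * ... * xK for binary x."""
    return -int(all(x))


def via_auxiliary(x):
    """min over z in {0, 1} of -z * (sum(x) - (K - 1))."""
    K = len(x)
    return min(-z * (sum(x) - (K - 1)) for z in (0, 1))


for K in range(2, 6):
    for x in product((0, 1), repeat=K):
        assert neg_product(x) == via_auxiliary(x)
```

When all xi are 1 the bracket equals 1 and z = 1 gives -1; otherwise the bracket is at most 0 and z = 0 gives 0.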
Merging Regions
The resulting energy:
min_{x,z} E(x, z) = ∑_n wn xn + ∑_{(l,m,n) ϵ J} [pairwise terms in xl, xm, xn and the auxiliary variable zlmn]
+ Pair-wise
- Non submodular!
Quadratic Pseudo-Boolean Optimization
[Figure: graph with terminals s, t and two nodes per variable, i and its complement ī]
Kolmogorov V., Rother C., Minimizing non-submodular functions with graph cuts – a review. PAMI’07
+ All edges with positive capacities
- No constraint xī = 1 - xi
Labeling rule (partial labeling):
yi = 0 if i ϵ S, ī ϵ T
yi = 1 if i ϵ T, ī ϵ S
yi = ? otherwise
Quadratic Pseudo-Boolean Optimization
Properties of partial labeling y:
1. Let z = FUSE(y, x); then E(z) ≤ E(x)
2. y is a subset of an optimal labeling y*
y is complete when:
1. E is submodular
2. There exists a flipping of variables that makes E submodular (e.g. inference in trees)
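The labeling rule and the FUSE operation can be written down in a few lines. The sketch below assumes each variable i has a node ('x', i) and a complement node ('not_x', i), that a variable is labeled 0 when its own node is on the source side S and its complement is not, and that FUSE(y, x) overwrites x with the labeled entries of y. These conventions and names are assumptions made for illustration; the min-cut itself is not computed here.

```python
def partial_labeling(n, S):
    """S: set of graph nodes on the source side of the cut."""
    y = []
    for i in range(n):
        node_in_S = ('x', i) in S
        comp_in_S = ('not_x', i) in S
        if node_in_S and not comp_in_S:
            y.append(0)       # i in S, complement in T
        elif comp_in_S and not node_in_S:
            y.append(1)       # i in T, complement in S
        else:
            y.append(None)    # unlabeled, shown as "?" on the slides
    return y


def fuse(y, x):
    """Keep labeled entries of y, fall back to x where y is unlabeled."""
    return [xi if yi is None else yi for yi, xi in zip(y, x)]


# Hypothetical cut: variable 0 -> 0, variable 1 -> 1, variable 2 unlabeled.
S = {('x', 0), ('not_x', 1), ('x', 2), ('not_x', 2)}
y = partial_labeling(3, S)
z = fuse(y, [1, 1, 1])
```

Property 1 then says that z never has higher energy than the labeling x it was fused with.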
Quadratic Pseudo-Boolean Optimization
[Figure: labels of nodes p, q, r, s, t. QPBO leaves nodes unlabeled (?); probing node p fixes p = 0 and p = 1 and reruns QPBO for each case]
What can we say about variables?
• r -> is always 0
• s -> is always equal to q
• t -> is 0 when q = 1
slide by Pushmeet Kohli, ICCV’09
QPBO - Probing
• Probe nodes in an order until energy unchanged
• Simplified energy preserves global optimality and (sometimes) gives the global minimum
slide by Pushmeet Kohli, ICCV’09
QPBO - Probing
Merging Regions
Result using QPBO-P:
[Figure: input image, regions (Ncuts), result]
Recap
• F3 and more
– Minimization trick
• Non submodular
– QPBO approx.
– partial labeling
Beyond F3…
[Kohli et al. CVPR ‘07, ‘08, PAMI ’08, IJCV ‘09]
Image Segmentation
E(X) = ∑i ci xi + ∑i,j dij |xi - xj|
E: {0,1}^n → R
0 → fg, 1 → bg
n = number of pixels
[Boykov and Jolly ‘01] [Blake et al. ‘04] [Rother et al. ‘04]
[Figure: image, unary cost, segmentation]
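With dij ≥ 0 the segmentation energy E(X) = ∑ ci xi + ∑ dij |xi - xj| is submodular, so a single min-cut gives a global minimum. The following is a rough sketch using a small Edmonds-Karp max-flow written for illustration (not the solver used in the cited papers). It assumes xi = 1 means node i ends up on the sink side; the function names and the toy costs are hypothetical.

```python
from collections import defaultdict, deque
from itertools import product


def min_cut_labels(c, d):
    """Minimize sum_i c[i]*xi + sum_{(i,j)} d[(i,j)]*|xi - xj| with d >= 0."""
    n = len(c)
    s, t = n, n + 1
    cap = defaultdict(lambda: defaultdict(int))
    for i, ci in enumerate(c):
        if ci >= 0:
            cap[s][i] += ci      # paid when xi = 1 (i on the sink side)
        else:
            cap[i][t] += -ci     # paid when xi = 0 (i on the source side)
    for (i, j), w in d.items():
        cap[i][j] += w
        cap[j][i] += w

    def reachable():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, r in cap[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:            # push flow along the augmenting path
            cap[u][v] -= flow
            cap[v][u] += flow
    # Nodes still reachable from s in the residual graph form the S side.
    return [0 if i in parent else 1 for i in range(n)]


def energy(x, c, d):
    return (sum(ci * xi for ci, xi in zip(c, x))
            + sum(w * abs(x[i] - x[j]) for (i, j), w in d.items()))


c = [4, -3, 2, -5]
d = {(0, 1): 2, (1, 2): 3, (2, 3): 1}
x = min_cut_labels(c, d)
assert energy(x, c, d) == min(energy(b, c, d) for b in product((0, 1), repeat=4))
```

The final assertion compares the min-cut labeling against exhaustive search, which is only feasible for toy sizes but confirms the construction.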
Pn Potts Potentials
[Figure: patch dictionary (tree)]
h(Xp) = 0 if xi = 0 for all i ϵ p; Cmax otherwise
[slide credits: Kohli]
Pn Potts Potentials
E(X) = ∑i ci xi + ∑i,j dij |xi - xj| + ∑p hp(Xp)
h(Xp) = 0 if xi = 0 for all i ϵ p; Cmax otherwise
E: {0,1}^n → R
0 → fg, 1 → bg
n = number of pixels
[slide credits: Kohli]
Image Segmentation
E(X) = ∑i ci xi + ∑i,j dij |xi - xj| + ∑p hp(Xp)
[Figure: image, pairwise segmentation, final segmentation]
E: {0,1}^n → R
0 → fg, 1 → bg
n = number of pixels
[slide credits: Kohli]
Application: Recognition and Segmentation
from [Kohli et al. ‘08]
[Figure panels: image; unaries only, TextonBoost [Shotton et al. ‘06]; pairwise CRF only [Shotton et al. ‘06]; Pn Potts; one super-pixelization; another super-pixelization]
Robust (soft) Pn Potts model
h(xp) = 0 if xi = 0 for all i ϵ p; f(∑_{i ϵ p} xi) otherwise
from [Kohli et al. ‘08]
[Figure: Pn Potts potential next to the robust Pn Potts potential]
Application: Recognition and Segmentation
From [Kohli et al. ‘08]
[Figure panels: image; unaries only, TextonBoost [Shotton et al. ‘06]; pairwise CRF only [Shotton et al. ‘06]; Pn Potts; robust Pn Potts; robust Pn Potts (different f); one super-pixelization; another super-pixelization]
Same idea for surface-based stereo [Bleyer ‘10]
[Figure panels: one input image; ground truth depth; stereo with hard segmentation; stereo with robust Pn Potts]
This approach gets the best result on the Middlebury Teddy image pair
How is it done…
Most general binary function:
H(X) = F(∑ xi), F concave
[Figure: plot of H(X) against ∑ xi, a concave curve starting at 0]
The transformation is to a submodular pair-wise MRF, hence the optimization is globally optimal
[slide credits: Kohli]
Higher order to Quadratic
• Start with Pn Potts model:
f(x) = 0 if all xi = 0; C1 otherwise, x ϵ {0,1}^n
min_x f(x) = min_{x, a ϵ {0,1}} C1 a + C1 (1-a) ∑ xi
Left-hand side: higher order function. Right-hand side: quadratic submodular function.
∑xi = 0 → a = 0, f(x) = 0
∑xi > 0 → a = 1, f(x) = C1
[slide credits: Kohli]
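The reduction of the Pn Potts potential to a quadratic function with one auxiliary variable a can be confirmed by enumeration. This is a minimal sketch assuming C1 > 0; the function names and the value of C1 are illustrative.

```python
from itertools import product


def pn_potts(x, C1):
    """0 if all xi = 0, C1 otherwise."""
    return 0 if sum(x) == 0 else C1


def quadratic(x, a, C1):
    """C1*a + C1*(1 - a)*sum(x): only products a*xi, each with coefficient -C1."""
    return C1 * a + C1 * (1 - a) * sum(x)


C1 = 5
for n in range(1, 6):
    for x in product((0, 1), repeat=n):
        assert pn_potts(x, C1) == min(quadratic(x, a, C1) for a in (0, 1))
```

Minimizing over a picks a = 0 when every xi is 0 and a = 1 otherwise, and the negative coefficient on each product a*xi is what keeps the quadratic form submodular.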
Higher order to Quadratic
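min_x f(x) = min_{x, a ϵ {0,1}} C1 a + C1 (1-a) ∑ xi
Left-hand side: higher order function. Right-hand side: quadratic submodular function.
[Figure: plot against ∑xi (ticks 1, 2, 3) of the constant C1 and the line C1 ∑xi]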
[slide credits: Kohli]
Higher order to Quadratic
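min_x f(x) = min_{x, a ϵ {0,1}} C1 a + C1 (1-a) ∑ xi
Left-hand side: higher order submodular function. Right-hand side: quadratic submodular function.
[Figure: plot against ∑xi (ticks 1, 2, 3) of the constant C1 (a = 1) and the line C1 ∑xi (a = 0)]
Lower envelope of concave functions is concave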
[slide credits: Kohli]
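The lower-envelope view extends from the Pn Potts potential to a robust one. As an illustration only, the sketch below assumes a truncated-linear choice f(s) = min(theta * s, c_max) with s = ∑xi; the slides do not fix f, and the names `theta`, `c_max`, and both functions are hypothetical.

```python
def robust_potential(s, theta, c_max):
    """Truncated-linear potential of s = sum of xi (an assumed form of f)."""
    return min(theta * s, c_max)


def lower_envelope(s, theta, c_max):
    """min over a in {0, 1} of c_max*a + theta*(1 - a)*s."""
    return min(c_max * a + theta * (1 - a) * s for a in (0, 1))


theta, c_max = 2, 7
for s in range(0, 10):
    assert robust_potential(s, theta, c_max) == lower_envelope(s, theta, c_max)
```

The branch a = 0 is the line theta * s and the branch a = 1 is the constant c_max, so their minimum is concave in s. Setting theta = c_max = C1 gives back the Pn Potts case for integer s.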
Summary
• Submodular F2
• F3 and beyond: minimization trick
• Non submodular
– QPBO(P)
• Beyond F3
– Robust HOP