Part 1 MLSS 2011


TRANSCRIPT

  • Slide 1/33

    Markov Random Fields for Computer Vision (Part 1)

    Machine Learning Summer School (MLSS 2011)

    Stephen Gould <[email protected]>
    Australian National University

    13-17 June, 2011

  • Slide 7/33

    Pixel Labeling

    Label every pixel in an image with a class label from some pre-defined set, i.e., $y_p \in \mathcal{L}$.

    - Interactive figure-ground segmentation (Boykov and Jolly, 2001; Boykov and Funka-Lea, 2006)
    - Surface context (Hoiem et al., 2005)
    - Semantic labeling (He et al., 2004; Shotton et al., 2006; Gould et al., 2009)
    - Stereo matching (Scharstein and Szeliski, 2002)
    - Image denoising (Felzenszwalb and Huttenlocher, 2004; Szeliski et al., 2008)

  • Slide 8/33

    Digital Photo Montage

    (Agarwala et al., 2004)

  • Slide 9/33

    Digital Photo Montage

    demonstration

  • Slide 10/33

    Tutorial Overview

    Part 1. Pairwise conditional Markov random fields for the pixel labeling problem (45 minutes)

    Part 2. Pseudo-boolean functions and graph-cuts (1 hour)

    Part 3. Higher-order terms and inference as integer programming (30 minutes)

    Please ask lots of questions.

  • Slide 12/33

    Probability Review

    Bayes Rule

    $$\underbrace{P(y \mid x)}_{\text{posterior}} = \frac{\overbrace{P(x \mid y)}^{\text{likelihood}} \; \overbrace{P(y)}^{\text{prior}}}{P(x)}$$

    Maximum a posteriori (MAP) inference: $y^\star = \operatorname{argmax}_y P(y \mid x)$.

    Conditional Independence

    Random variables $y$ and $x$ are conditionally independent given $z$ if $P(y, x \mid z) = P(y \mid z)\, P(x \mid z)$.

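As a quick worked illustration of MAP inference with Bayes rule (the numbers are chosen arbitrarily for this note, not taken from the slides): suppose $P(y{=}1) = 0.3$, $P(x \mid y{=}1) = 0.8$ and $P(x \mid y{=}0) = 0.1$. Then

$$P(y{=}1 \mid x) = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.1 \times 0.7} = \frac{0.24}{0.31} \approx 0.77,$$

so MAP inference returns $y^\star = 1$ even though the prior favours $y = 0$.
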
  • Slide 13/33

    Graphical Models

    We can exploit conditional independence assumptions to represent probability distributions in a way that is both compact and efficient for inference.

    This tutorial is all about one particular representation, called a Markov Random Field (MRF), and the associated inference algorithms that are used in computer vision.

    [Figure: two views of the same four-node graph over A, B, C, D with edges A-B, B-D, D-C, C-A: the conditional independencies it encodes, e.g. $a \perp d \mid b, c$, and the factorization $\frac{1}{Z}\,\psi(a, b)\,\psi(b, d)\,\psi(d, c)\,\psi(c, a)$.]

  • Slide 14/33

    Graphical Models

    [Figure: the same four-node graph with edges A-B, B-D, D-C, C-A.]

    $$P(a, b, c, d) = \frac{1}{Z}\,\psi(a, b)\,\psi(b, d)\,\psi(d, c)\,\psi(c, a) = \frac{1}{Z} \exp\{-\theta(a, b) - \theta(b, d) - \theta(d, c) - \theta(c, a)\}$$

    where $\theta = -\log \psi$.

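A minimal brute-force sketch of the factorization above (Python/NumPy; the potential values are arbitrary, not from the slides). It checks that the normalized product of edge potentials equals the exponentiated negative sum of $\theta = -\log\psi$, and that the four-cycle really does encode $a \perp d \mid b, c$.

```python
import itertools
import numpy as np

# Arbitrary positive edge potentials for the four-cycle A-B, B-D, D-C, C-A
# over binary variables (values are illustrative only).
psi_ab = np.array([[2.0, 1.0], [1.0, 3.0]])
psi_bd = np.array([[1.0, 2.0], [2.0, 1.0]])
psi_dc = np.array([[3.0, 1.0], [1.0, 2.0]])
psi_ca = np.array([[1.0, 1.0], [2.0, 1.0]])

def score(a, b, c, d):
    # unnormalized probability: product of clique (edge) potentials
    return psi_ab[a, b] * psi_bd[b, d] * psi_dc[d, c] * psi_ca[c, a]

states = list(itertools.product([0, 1], repeat=4))
Z = sum(score(*s) for s in states)
P = {s: score(*s) / Z for s in states}

# Same distribution written as exp of a negative energy with theta = -log(psi).
for (a, b, c, d), p in P.items():
    E = -(np.log(psi_ab[a, b]) + np.log(psi_bd[b, d])
          + np.log(psi_dc[d, c]) + np.log(psi_ca[c, a]))
    assert abs(p - np.exp(-E) / Z) < 1e-12

# Check the conditional independence encoded by the graph:
# a and d are independent given b and c.
for b, c in itertools.product([0, 1], repeat=2):
    p_bc = sum(P[(a, b, c, d)] for a in (0, 1) for d in (0, 1))
    for a, d in itertools.product([0, 1], repeat=2):
        p_ad = P[(a, b, c, d)] / p_bc
        p_a = sum(P[(a, b, c, dd)] for dd in (0, 1)) / p_bc
        p_d = sum(P[(aa, b, c, d)] for aa in (0, 1)) / p_bc
        assert abs(p_ad - p_a * p_d) < 1e-10
```
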
  • Slide 16/33

    Energy Functions

    Let $x$ be some observations (i.e., features from the image) and let $y = (y_1, \ldots, y_n)$ be a vector of random variables. Then we can write the conditional probability of $y$ given $x$ as

    $$P(y \mid x) = \frac{1}{Z(x)} \exp\{-E(y; x)\}$$

    where $Z(x) = \sum_{y \in \mathcal{L}^n} \exp\{-E(y; x)\}$ is called the partition function.

    The energy function $E(y; x)$ usually has some structured form:

    $$E(y; x) = \sum_c \psi_c(y_c; x)$$

    where $\psi_c(y_c; x)$ are clique potentials defined over a subset of random variables $y_c \subseteq y$.

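A brute-force sketch of these definitions (Python/NumPy; function and variable names are mine): the energy is a sum of clique potentials, and $Z(x)$ is obtained by enumerating all $|\mathcal{L}|^n$ labelings, which is only feasible for toy problems.

```python
import itertools
import numpy as np

# cliques: list of (variable indices, potential function psi_c taking the tuple y_c)
def energy(y, cliques):
    return sum(psi(tuple(y[i] for i in idx)) for idx, psi in cliques)

def partition_and_probs(n, labels, cliques):
    states = list(itertools.product(labels, repeat=n))
    energies = np.array([energy(y, cliques) for y in states])
    Z = np.exp(-energies).sum()                        # partition function Z(x)
    return Z, {y: float(np.exp(-e) / Z) for y, e in zip(states, energies)}

# Toy example: 3 binary variables on a chain with soft "agreement" potentials
# and one unary potential on the first variable.
cliques = [((0, 1), lambda yc: 0.0 if yc[0] == yc[1] else 1.0),
           ((1, 2), lambda yc: 0.0 if yc[0] == yc[1] else 1.0),
           ((0,),   lambda yc: 0.5 * yc[0])]
Z, probs = partition_and_probs(3, (0, 1), cliques)
print(Z, probs[(0, 0, 0)], probs[(1, 1, 1)])
```
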
  • Slide 17/33

    Conditional Markov Random Fields

    $$E(y; x) = \sum_c \psi_c(y_c; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in E} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}} + \underbrace{\sum_{c \in C} \psi^H_c(y_c; x)}_{\text{higher-order}}$$

    [Figure: 3 x 3 grid of pixels; each observation $x_i$ is connected to its label $y_i$, and neighbouring labels $y_i$, $y_j$ are connected to each other.]

  • Slide 18/33

    Pixel Neighbourhoods

    [Figure: a 3 x 3 grid of labels $y_1, \ldots, y_9$ shown with the 4-connected neighbourhood $\mathcal{N}_4$ (horizontal and vertical edges) and the 8-connected neighbourhood $\mathcal{N}_8$ (adding diagonal edges).]

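A small sketch (Python, not from the slides) of how the two neighbourhood systems translate into pairwise edge lists for an H x W grid; each edge is generated once by only looking at "forward" neighbours.

```python
def grid_edges(H, W, connectivity=4):
    """Pairwise edges of the N4 or N8 neighbourhood on an H x W grid (row-major pixel ids)."""
    assert connectivity in (4, 8)
    offsets = [(0, 1), (1, 0)]              # right and down neighbours
    if connectivity == 8:
        offsets += [(1, 1), (1, -1)]        # the two diagonal neighbours
    edges = []
    for r in range(H):
        for c in range(W):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    edges.append((r * W + c, rr * W + cc))
    return edges

# e.g. the 3 x 3 grid from the slide: 12 edges for N4, 20 edges for N8
assert len(grid_edges(3, 3, 4)) == 12
assert len(grid_edges(3, 3, 8)) == 20
```
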
  • Slide 21/33

    Binary MRF Example

    Consider the following energy function for two binary random variables, $y_1$ and $y_2$:

    $$E(y_1, y_2) = \theta_1(y_1) + \theta_2(y_2) + \theta_{12}(y_1, y_2)$$

    with $\theta_1(0) = 5$, $\theta_1(1) = 2$, $\theta_2(0) = 1$, $\theta_2(1) = 3$, and $\theta_{12}(0,0) = 0$, $\theta_{12}(0,1) = 3$, $\theta_{12}(1,0) = 4$, $\theta_{12}(1,1) = 0$. Equivalently,

    $$E(y_1, y_2) = 5\bar{y}_1 + 2y_1 + \bar{y}_2 + 3y_2 + 3\bar{y}_1 y_2 + 4 y_1 \bar{y}_2$$

    where $\bar{y}_1 = 1 - y_1$ and $\bar{y}_2 = 1 - y_2$.

    Graphical Model: a single edge between $y_1$ and $y_2$.

    Probability Table

    y1  y2  E   P
    0   0   6   0.244
    0   1   11  0.002
    1   0   7   0.090
    1   1   5   0.664

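The table above can be reproduced directly by enumerating the four joint states (a Python/NumPy sketch; the potential values are read off the slide):

```python
import numpy as np

theta1 = np.array([5.0, 2.0])            # theta_1(y1)
theta2 = np.array([1.0, 3.0])            # theta_2(y2)
theta12 = np.array([[0.0, 3.0],          # theta_12(y1, y2)
                    [4.0, 0.0]])

def energy(y1, y2):
    return theta1[y1] + theta2[y2] + theta12[y1, y2]

E = np.array([[energy(y1, y2) for y2 in (0, 1)] for y1 in (0, 1)])
P = np.exp(-E)
P /= P.sum()                             # Z = sum over all labelings of exp(-E)

print(E)             # [[ 6. 11.]
                     #  [ 7.  5.]]
print(P.round(3))    # [[0.244 0.002]
                     #  [0.09  0.664]]
```
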
  • Slide 22/33

    Compactness of Representation

    Consider a 1 mega-pixel image, e.g., $1000 \times 1000$ pixels. We want to annotate each pixel with a label from $\mathcal{L}$. Let $L = |\mathcal{L}|$.

    There are $L^{10^6}$ possible ways to label such an image.

    A naive encoding (i.e., one big table) would require $L^{10^6} - 1$ parameters.

    A pairwise MRF over $\mathcal{N}_4$ requires $10^6 L$ parameters for the unary terms and $2 \times 1000 \times (1000 - 1) L^2$ parameters for the pairwise terms, i.e., $O(10^6 L^2)$. Even fewer are required if we share parameters.

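For concreteness, a few lines of Python reproducing the counting argument (the label-set size L = 8 is an arbitrary example, not from the slide):

```python
import math

H = W = 1000
n = H * W          # number of pixels
L = 8              # illustrative label-set size

# Naive table over all joint labelings: L^n - 1 parameters. The number is far
# too large to materialize, so report its number of decimal digits instead.
print(round(n * math.log10(L)))          # ~903090 digits

# Pairwise MRF over N4 with no parameter sharing.
unary = n * L                            # 8,000,000
pairwise = 2 * H * (W - 1) * L * L       # 127,872,000
print(unary + pairwise)                  # O(10^6 L^2)
```
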
  • Slide 24/33

    Inference and Energy Minimization

    We are usually interested in finding the most probable labeling,

    $$y^\star = \operatorname{argmax}_y P(y \mid x) = \operatorname{argmin}_y E(y; x).$$

    This is known as maximum a posteriori (MAP) inference or energy minimization.

    A number of techniques can be used to find $y^\star$, including:

    - message-passing (dynamic programming)
    - integer programming (part 3)
    - graph-cuts (part 2)

    However, in general, inference is NP-hard.

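As a concrete instance of the message-passing option, here is a minimal min-sum dynamic program (Python/NumPy sketch, not from the slides) that computes the exact MAP labeling of a chain-structured MRF; grid-structured models need the graph-cut and integer-programming machinery of parts 2 and 3.

```python
import numpy as np

def map_chain(unary, pairwise):
    """Exact MAP for a chain MRF: unary is (n, L); pairwise is (L, L), shared by all edges."""
    n, L = unary.shape
    msg = np.zeros((n, L))                # msg[i, l] = best energy of y_1..y_i with y_i = l
    back = np.zeros((n, L), dtype=int)
    msg[0] = unary[0]
    for i in range(1, n):
        cost = msg[i - 1][:, None] + pairwise + unary[i][None, :]   # (L_prev, L_cur)
        back[i] = cost.argmin(axis=0)
        msg[i] = cost.min(axis=0)
    y = np.zeros(n, dtype=int)            # backtrack the optimal labeling
    y[-1] = int(msg[-1].argmin())
    for i in range(n - 1, 0, -1):
        y[i - 1] = back[i, y[i]]
    return y, float(msg[-1].min())

rng = np.random.default_rng(0)
unary = rng.random((6, 4))                # 6 variables, 4 labels
pairwise = 0.5 * (1.0 - np.eye(4))        # Potts smoothness term
y_star, E_star = map_chain(unary, pairwise)
print(y_star, E_star)
```
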
  • Slide 25/33

    Characterizing Markov Random Fields

    Markov random fields can be categorized along a number of different dimensions:

    - Label space: binary vs. multi-label; homogeneous vs. heterogeneous.
    - Order: unary vs. pairwise vs. higher-order.
    - Structure: chain vs. tree vs. grid vs. general graph; neighbourhood size.
    - Potentials: submodular, convex, compressible.

    These all affect tractability of inference.

  • Slide 26/33

    Markov Random Fields for Pixel Labeling

    $$P(y \mid x) \propto P(x \mid y)\, P(y) = \exp\{-\underbrace{E(y; x)}_{\text{energy}}\}$$

    $$E(y; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in \mathcal{N}_8} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}}$$

    $$\psi^U_i(y_i; x) = -\sum_{\ell \in \mathcal{L}} [[y_i = \ell]] \underbrace{\log P(x_i \mid \ell)}_{\text{likelihood}}$$

    $$\psi^P_{ij}(y_i, y_j; x) = \lambda\, [[y_i \neq y_j]] \quad \text{(Potts prior)}$$

    Here the prior acts to smooth predictions (independent of $x$).

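A short sketch (Python/NumPy; the function and variable names are mine) of evaluating this unary-plus-Potts energy for a candidate labeling; the per-class log-likelihoods would come from whatever appearance model is being used.

```python
import numpy as np

def pixel_labeling_energy(y, log_lik, edges, lam):
    """y: (n,) labels; log_lik: (n, L) values of log P(x_i | label); edges: list of (i, j)."""
    unary = -log_lik[np.arange(len(y)), y].sum()                  # negative log-likelihoods
    pairwise = lam * sum(int(y[i] != y[j]) for i, j in edges)     # Potts penalty
    return unary + pairwise

# Toy usage: a 2 x 2 image with 3 classes and N4 edges.
log_lik = np.log(np.array([[0.7, 0.2, 0.1],
                           [0.6, 0.3, 0.1],
                           [0.2, 0.7, 0.1],
                           [0.1, 0.8, 0.1]]))
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
print(pixel_labeling_energy(np.array([0, 0, 1, 1]), log_lik, edges, lam=0.5))
```
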
  • Slide 27/33

    Prior Strength

    [Figure: pixel labeling results for increasing Potts weight $\lambda$ = 1, 4, 16, 128, 1024.]

  • Slide 28/33

    Interactive Segmentation Model

    Label space: foreground or background,

    $$\mathcal{L} = \{0, 1\}$$

    Unary term: Gaussian mixture models for foreground and background,

    $$\psi^U_i(y_i; x) = \min_k \left\{ \tfrac{1}{2} \log |\Sigma_k| + \tfrac{1}{2} (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k) - \log \lambda_k \right\}$$

    where $(\lambda_k, \mu_k, \Sigma_k)$ are the mixture weights and Gaussian parameters of the colour model for class $y_i$.

    Pairwise term: contrast-dependent smoothness prior,

    $$\psi^P_{ij}(y_i, y_j; x) = \begin{cases} \lambda_0 + \lambda_1 \exp\left( -\dfrac{\|x_i - x_j\|^2}{2\beta^2} \right), & \text{if } y_i \neq y_j \\ 0, & \text{otherwise} \end{cases}$$

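Reading the unary term as the cost of the best (lowest-cost) mixture component, the two terms might be sketched as follows (Python/NumPy; parameter names are placeholders and the mixtures are assumed to have been fit elsewhere, e.g. from user scribbles):

```python
import numpy as np

def gmm_unary(x_i, weights, means, covs):
    """Cost of pixel colour x_i under the class's colour GMM (best single component)."""
    costs = []
    for w, mu, cov in zip(weights, means, covs):
        d = x_i - mu
        costs.append(0.5 * np.log(np.linalg.det(cov))
                     + 0.5 * d @ np.linalg.inv(cov) @ d
                     - np.log(w))
    return min(costs)

def contrast_pairwise(x_i, x_j, lam0, lam1, beta):
    """Penalty paid when y_i != y_j; weak across strong colour edges."""
    return lam0 + lam1 * np.exp(-np.sum((x_i - x_j) ** 2) / (2.0 * beta ** 2))
```
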
  • Slide 30/33

    Stereo Matching Model

    Label space: pixel disparity,

    $$\mathcal{L} = \{0, 1, \ldots, 127\}$$

    Unary term: sum of absolute differences (SAD) or normalized cross-correlation (NCC),

    $$\psi^U_i(y_i; x) = \sum_{(u,v) \in W} |x_{\text{left}}(u, v) - x_{\text{right}}(u - y_i, v)|$$

    Pairwise term: discontinuity preserving prior,

    $$\psi^P_{ij}(y_i, y_j) = \min\{|y_i - y_j|, d_{\max}\}$$

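A direct sketch of the two terms (Python/NumPy; the window size, image names and d_max value are illustrative):

```python
import numpy as np

def sad_unary(left, right, i, j, d, win=2):
    """Sum of absolute differences over a (2*win+1)^2 window for disparity d at pixel (i, j)."""
    h, w = left.shape
    total = 0.0
    for u in range(max(i - win, 0), min(i + win + 1, h)):
        for v in range(max(j - win, 0), min(j + win + 1, w)):
            if 0 <= v - d < w:
                total += abs(float(left[u, v]) - float(right[u, v - d]))
    return total

def disparity_pairwise(d_i, d_j, d_max=4):
    """Truncated linear (discontinuity preserving) prior on neighbouring disparities."""
    return min(abs(d_i - d_j), d_max)
```
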
  • Slide 31/33

    Image Denoising Model

    Label space: pixel intensity or colour,

    $$\mathcal{L} = \{0, 1, \ldots, 255\}$$

    Unary term: squared distance,

    $$\psi^U_i(y_i; x) = \|y_i - x_i\|^2$$

    Pairwise term: truncated L2 distance,

    $$\psi^P_{ij}(y_i, y_j) = \min\{\|y_i - y_j\|^2, d^2_{\max}\}$$

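And the corresponding denoising energy on, say, a 1-D signal (Python/NumPy sketch; the smoothness weight lam is my addition, the slide only gives the two potentials):

```python
import numpy as np

def denoising_energy(y, x, lam=1.0, d_max=20):
    """Squared data term plus truncated quadratic smoothness on neighbouring intensities."""
    y = y.astype(float)
    x = x.astype(float)
    data = np.sum((y - x) ** 2)
    smooth = lam * np.sum(np.minimum(np.diff(y) ** 2, d_max ** 2))
    return data + smooth

x = np.array([10, 12, 11, 200, 198, 199])     # noisy step edge
print(denoising_energy(x, x))                 # copying the observation costs only the prior
```
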
  • Slide 32/33

    Digital Photo Montage Model

    Label space: image index,

    $$\mathcal{L} = \{1, 2, \ldots, K\}$$

    Unary term: none!

    Pairwise term: seam penalty,

    $$\psi^P_{ij}(y_i, y_j; x) = \|x_{y_i}(i) - x_{y_j}(i)\| + \|x_{y_i}(j) - x_{y_j}(j)\|$$

    (or edge-normalized variant)

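A sketch of the seam penalty between two neighbouring pixels (Python/NumPy; the array layout is an assumption):

```python
import numpy as np

def seam_penalty(sources, y_i, y_j, i, j):
    """sources: (K, H, W, 3) aligned source images; i, j: (row, col) of two neighbouring pixels."""
    s = sources.astype(float)
    return (np.linalg.norm(s[y_i][i] - s[y_j][i])
            + np.linalg.norm(s[y_i][j] - s[y_j][j]))
```

The penalty is zero whenever neighbouring pixels are drawn from the same source image, so the labeling is encouraged to switch sources only where the images already agree.
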
  • Slide 33/33

    end of part 1