Jigsaws: joint appearance and shape clustering

John Winn with Anitha Kannan and Carsten Rother

Microsoft Research, Cambridge

Patch models

Used for:
• Object recognition/detection
• Object segmentation

But also:
• Stereo matching, photo stitching
• Texture synthesis
• Super-resolution
• Motion segmentation
• Image/video compression

Patch models

Patch clustering/codebook (e.g. Leibe & Schiele)

Epitome (Jojic et al.): parameter sharing + translation invariant

Issues with fixed patch size/shape

Patch includes background: patches containing the same object are not clustered together

Patch excludes part of object: patch is less discriminative

Patch includes occlusion: occluded and unoccluded objects are not clustered together

Patch size?

Small (single pixel): more sharing, less discriminative

Large (entire image): more discriminative, less sharing

Optimal size/shape?

Depends on:
• object size/shape
• object variability
• size of training set

Aims of jigsaw model

Learn patches (jigsaw pieces) which are

1. Shared: each piece is similar in shape and appearance to many regions of the training images;

2. Discriminative: each piece is as large as possible;

3. Exhaustive: all parts of the training images can be reconstructed from the set of jigsaw pieces.

The Jigsaw model

[Figure: images I1, I2, ..., IN, each with its own offset map L1, L2, ..., LN, all sharing a single jigsaw J]

The Jigsaw model

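Jigsaw J: each jigsaw pixel z has a mean and precision (μ(z), λ(z))

$$P(I \mid J, L) = \prod_{x} \mathrm{Normal}\big(I(x) \mid \mu(x - L(x)),\ \lambda(x - L(x))^{-1}\big)$$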
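The likelihood on this slide treats each image pixel I(x) as Normal, with the mean and precision of the jigsaw pixel at x − L(x). A minimal NumPy sketch of the corresponding log-likelihood for a greyscale image follows. The function name, the array layout, and the wrap-around at the jigsaw edges are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def jigsaw_log_likelihood(image, offsets, mu, lam):
    """Log P(I | J, L) for one greyscale image (illustrative sketch).

    image:   (H, W) float array of pixel intensities I(x).
    offsets: (H, W, 2) int array; offsets[r, c] is L(x) as (row, col).
    mu, lam: (Jh, Jw) float arrays of jigsaw means and precisions.
    """
    H, W = image.shape
    Jh, Jw = mu.shape
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Jigsaw location z = (x - L(x)) mod |J|; the wrap-around is an assumption.
    zr = (rows - offsets[..., 0]) % Jh
    zc = (cols - offsets[..., 1]) % Jw
    m = mu[zr, zc]
    p = lam[zr, zc]
    # Sum over pixels of log Normal(I(x) | mean m, variance 1/p).
    return float(np.sum(0.5 * np.log(p / (2 * np.pi)) - 0.5 * p * (image - m) ** 2))
```

Calling it with an offset map that aligns the image with matching jigsaw pixels should give a higher value than a misaligned offset map, which is what the learning procedure exploits.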

The Jigsaw model

Jigsaw J, with per-pixel parameters (μ, λ)

Potts model:
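A Potts prior on the offset map favours neighbouring pixels that share the same offset, so that images are explained by contiguous jigsaw pieces. A minimal sketch of the corresponding energy follows, assuming 4-connected neighbours and a single penalty weight `gamma` (both assumptions, not stated on the slide). The prior is then proportional to exp(−energy).

```python
import numpy as np

def potts_energy(offsets, gamma=1.0):
    """Potts energy of an offset map (illustrative sketch).

    offsets: (H, W, 2) int array of per-pixel offsets.
    gamma:   hypothetical penalty per neighbouring pair with different offsets.
    """
    # Vertical neighbours: compare each pixel with the one below it.
    differs_down = np.any(offsets[1:, :] != offsets[:-1, :], axis=-1)
    # Horizontal neighbours: compare each pixel with the one to its right.
    differs_right = np.any(offsets[:, 1:] != offsets[:, :-1], axis=-1)
    return gamma * (int(differs_down.sum()) + int(differs_right.sum()))
```

An offset map that is constant over the whole image has zero energy; every boundary between two pieces adds `gamma` per pixel pair it separates.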

Toy example

Training image Jigsaw

Learned using EM + graph cuts
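One half of such an alternating scheme can be sketched as follows: with the offset maps held fixed, each jigsaw pixel's mean is re-estimated as the average of all image pixels currently mapped to it. This is a simplified illustration only. It ignores any prior on the jigsaw and the precision update, and the offset-map update (the graph-cut step) is not shown. The function name and array layout are assumptions.

```python
import numpy as np

def update_jigsaw_mean(images, offset_maps, jigsaw_shape):
    """Re-estimate jigsaw means from fixed offset maps (illustrative sketch).

    images:      list of (H, W) float arrays.
    offset_maps: list of (H, W, 2) int arrays, one per image.
    """
    Jh, Jw = jigsaw_shape
    total = np.zeros(jigsaw_shape)
    count = np.zeros(jigsaw_shape)
    for image, offsets in zip(images, offset_maps):
        H, W = image.shape
        rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
        zr = (rows - offsets[..., 0]) % Jh
        zc = (cols - offsets[..., 1]) % Jw
        # np.add.at accumulates correctly when several pixels hit the same jigsaw pixel.
        np.add.at(total, (zr, zc), image)
        np.add.at(count, (zr, zc), 1)
    # Jigsaw pixels that no image pixel maps to are left at zero.
    return np.divide(total, count, out=np.zeros(jigsaw_shape), where=count > 0)
```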

Dog example

Training image

32x32 jigsaw mean

Dog example

Reconstructed image

Learned segmentation

32x32 jigsaw mean

Epitome reconstruction
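Both the reconstructed image and the learned segmentation can be read off the offset map: the reconstruction copies the jigsaw mean at x − L(x) into each pixel x, and piece boundaries lie wherever neighbouring pixels have different offsets. A minimal sketch, with function names, array layout, and wrap-around indexing chosen for illustration rather than taken from the slides:

```python
import numpy as np

def reconstruct(mu, offsets):
    """Rebuild an image by looking up the jigsaw mean for every pixel."""
    H, W = offsets.shape[:2]
    Jh, Jw = mu.shape
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    return mu[(rows - offsets[..., 0]) % Jh, (cols - offsets[..., 1]) % Jw]

def piece_boundaries(offsets):
    """Boolean (H, W) mask: True where a pixel's offset differs from
    that of its lower or right neighbour."""
    edge = np.zeros(offsets.shape[:2], dtype=bool)
    edge[:-1, :] |= np.any(offsets[1:, :] != offsets[:-1, :], axis=-1)
    edge[:, :-1] |= np.any(offsets[:, 1:] != offsets[:, :-1], axis=-1)
    return edge
```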

Faces example

128x128 jigsaw mean

100 64x64 images. Source: Olivetti face database

Learning the ‘pieces’



Jigsaw J

Faces example

Results of shape clustering on the face images

64x64 jigsaw

Object recognition (preliminary)

Training set: 20 street images (10 labelled)

64x64 jigsaw

Accuracy improves (~1%) if you include an additional 10 unlabelled images when learning the jigsaw.

Allow patches to deform (as in LayoutCRF, CVPR 2006).

Work in progress…

Training larger jigsaws on 100s of images

Incorporating shape clustering into the probabilistic model

Learning additional invariances e.g. to illumination

Object recognition results on MSRC and other datasets

Conclusions

The jigsaw model allows learning the shape and appearance of objects or object parts in images. It can also handle occlusion.

Clustering shape and appearance together is much more powerful for recognition than clustering appearance alone.

Can be used as a ‘plug-and-play’ replacement for fixed-size patches in any existing patch-based system.

Thank you

http://johnwinn.org
