
1

Computer Vision Research

Huttenlocher, Zabih
– Recognition, stereopsis, restoration, learning

Strong algorithmic focus
– Combinatorial optimization
– Geometric algorithms

Application areas
– Techniques we developed have
  • Played an important role at Xerox and Microsoft
  • Resulted in successful startups
– Medical imaging
  • Zabih joint with Radiology department in NYC

2

Markov Random Fields

Many computer vision problems can be formalized using Markov random fields
– Set of sites and neighborhood system
– Estimate label for each site accounting for
  • Goodness of fit of label to observed data at site
  • Consistency of label with neighbors

MRF’s are undirected graphical models
– Probabilistic relational models (directed)

Until recently a formalism used in computer vision, but not very practical
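The two criteria above (fit to the data at each site, consistency with neighbors) are commonly combined into a single energy that is summed over sites and neighboring pairs. The sketch below is one way to write that down; the 4-connected grid, the Potts-style penalty, and the name `mrf_energy` are assumptions made for illustration and do not come from the slides.

```python
def mrf_energy(labels, data_cost, smooth_weight=1.0):
    """Energy of a labeling on a 4-connected grid (illustrative sketch).

    labels[r][c]       -- label index assigned to site (r, c)
    data_cost[r][c][k] -- cost of giving label k to site (r, c)
    smooth_weight      -- assumed Potts penalty for each pair of unequal neighbors
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            # Goodness of fit of the label to the observed data at this site.
            energy += data_cost[r][c][labels[r][c]]
            # Consistency with the right and lower neighbors
            # (each neighboring pair is counted exactly once).
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                energy += smooth_weight
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                energy += smooth_weight
    return energy


# Hypothetical 2x2 example with two labels.
data_cost = [[[0, 5], [0, 5]],
             [[5, 0], [0, 5]]]
fits_data = [[0, 0], [1, 0]]   # follows the data exactly, but is not smooth
constant = [[0, 0], [0, 0]]    # perfectly smooth, but ignores one site's data
```

Raising `smooth_weight` shifts the preferred labeling from `fits_data` toward `constant`, which is the trade-off the estimation problem has to resolve.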

3

Example MRF Problems

Stereopsis
– For an image pair, estimate depth at each pixel
  • Sites are pixels, neighbors are 4-connected grid, labels are depths

Object recognition
– For an image, estimate location of a multi-part flexible object
  • Sites are parts, neighbors are connected parts, labels are locations

4

MRF Algorithms

Underlying graph G=(S,N)
– For tree structure, can solve exactly using variant of Viterbi recurrence
  • But impractical for large label set
– For two labels, can solve exactly using min-cut
– For three or more labels on a grid graph, the problem is NP-hard

Recent algorithmic progress
– For grid graphs, good approximation methods
– For low tree-width graphs, exact methods even for large label sets
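To make the tree case concrete, here is a minimal sketch of a Viterbi-style recurrence on the simplest tree, a chain. The function name `chain_min_energy` and the example costs are hypothetical; the point is only that the exact optimum falls out of a dynamic program whose inner loop touches every pair of labels, which is why large label sets are a problem.

```python
def chain_min_energy(data_cost, pair_cost):
    """Exact minimum-energy labeling of a chain by dynamic programming.

    data_cost[i][a] -- cost of label a at site i
    pair_cost(a, b) -- cost of labels a and b on neighboring sites
    Runs in O(n * m^2) time for n sites and m labels.
    """
    n, m = len(data_cost), len(data_cost[0])
    best = list(data_cost[0])   # best[b]: min cost of sites 0..i ending in label b
    back = []                   # back[i-1][b]: best label at site i-1 given b at site i
    for i in range(1, n):
        new_best, arg = [], []
        for b in range(m):
            a_star = min(range(m), key=lambda a: best[a] + pair_cost(a, b))
            new_best.append(best[a_star] + pair_cost(a_star, b) + data_cost[i][b])
            arg.append(a_star)
        best = new_best
        back.append(arg)
    # Trace back from the cheapest final label.
    label = min(range(m), key=lambda b: best[b])
    energy = best[label]
    labels = [label]
    for arg in reversed(back):
        label = arg[label]
        labels.append(label)
    labels.reverse()
    return energy, labels


# Hypothetical example: 4 sites, 3 labels, linear pairwise cost.
costs = [[0, 2, 4], [3, 0, 3], [4, 2, 0], [0, 3, 3]]
energy, labels = chain_min_energy(costs, lambda a, b: abs(a - b))
```

On a general tree the same recurrence runs from the leaves to the root, with one table per part.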

5

Alpha Expansion Technique [BVZ99]

Use min-cut to efficiently solve a special two-label problem
– Labels “stay the same” or “replace with α”

Iterate over possible values of α
– Each rules out exponentially many labelings

[Figure: input labeling x and a red expansion move from x]
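The outer loop of the expansion technique can be sketched as follows. To keep the example short, the inner two-label problem ("stay the same" or "replace with alpha") is solved here by brute-force enumeration, which is only feasible for a handful of sites; it is a stand-in for the min-cut step of [BVZ99], not an implementation of it. The function names and the toy energy are hypothetical.

```python
from itertools import product


def expansion_move(labels, alpha, energy):
    """Best labeling reachable from `labels` by one alpha-expansion.

    Every site either keeps its label or is replaced with alpha.
    Brute force over all 2^n choices (stand-in for the min-cut step).
    """
    best = list(labels)
    best_e = energy(best)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else x for s, x in zip(switch, labels)]
        e = energy(cand)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e


def alpha_expansion(labels, num_labels, energy):
    """Cycle over alpha until no expansion move lowers the energy."""
    labels = list(labels)
    current = energy(labels)
    improved = True
    while improved:
        improved = False
        for alpha in range(num_labels):
            cand, e = expansion_move(labels, alpha, energy)
            if e < current:
                labels, current, improved = cand, e, True
    return labels, current


# Hypothetical 1-D example: 5 sites, 3 labels, Potts smoothness of weight 2.
observed = [0, 0, 2, 1, 1]


def toy_energy(x):
    data = sum(abs(a - b) for a, b in zip(x, observed))
    smooth = sum(2 for a, b in zip(x, x[1:]) if a != b)
    return data + smooth


result, value = alpha_expansion([0] * 5, 3, toy_energy)
```

A single move already searches 2^n candidate labelings, so the practical value of the technique rests on replacing the enumeration in `expansion_move` with one min-cut computation.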

6

Graph Cuts for MRF’s on Grid

Best stereo algorithms use alpha expansion technique
– Middlebury stereo benchmark

Beyond computer vision: many image compositing, restoration, editing tasks
– E.g., SIGGRAPH, Microsoft

[Figure panels: Ground Truth, Correlation, Alpha Expansion]

7

Tree-Like MRF’s

Object recognition
– Nodes are parts, labels are locations

Small graph, not at all grid-like
– Many labels (millions or more)

Viterbi algorithm for trees
– Still not practical because O(m²n) for n parts and m locations per part
– Fast min convolution techniques make finding best labeling O(mn)

More generally for fan-like graphs
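The drop from O(m²n) to O(mn) comes from computing the inner minimization of the Viterbi recurrence for all m locations at once. The sketch below shows the idea for one simple case, assuming 1-D locations and a pairwise cost that grows linearly with distance, c·|p − q|; the slides do not state which cost is used, and the function names are hypothetical.

```python
def min_convolution_linear(f, c=1.0):
    """Return D with D[p] = min over q of f[q] + c * |p - q|, in O(m) time."""
    d = list(f)
    for p in range(1, len(d)):             # forward pass: best q at or left of p
        d[p] = min(d[p], d[p - 1] + c)
    for p in range(len(d) - 2, -1, -1):    # backward pass: best q at or right of p
        d[p] = min(d[p], d[p + 1] + c)
    return d


def min_convolution_naive(f, c=1.0):
    """Same quantity by direct O(m^2) minimization, for comparison."""
    m = len(f)
    return [min(f[q] + c * abs(p - q) for q in range(m)) for p in range(m)]


# Hypothetical cost table over 7 locations for one part.
f = [5, 2, 7, 9, 1, 6, 8]
fast = min_convolution_linear(f, c=1.0)
slow = min_convolution_naive(f, c=1.0)
```

Both functions return the same table; only the fast version scales to millions of locations per part.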

8

Fan Structured Models [CFH05]

K-fan: let R ⊆ S be a set of reference parts
– And R’ = S - R be the remaining parts
– Complete graph on R and complete bipartite graph on R, R’

Parts are local image patches
– Probability of (oriented) edge at each pixel
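The graph structure described above can be built directly from R and R’. The following sketch is illustrative only: `fan_edges` and the part names are hypothetical, and it takes K to be the number of reference parts, as the name K-fan suggests.

```python
from itertools import combinations


def fan_edges(parts, reference):
    """Edges of a fan-structured model (illustrative sketch).

    parts     -- all parts S
    reference -- reference parts R, a subset of S
    Returns the complete graph on R together with the complete
    bipartite graph between R and the remaining parts R' = S - R.
    """
    ref_set = set(reference)
    ref = [p for p in parts if p in ref_set]
    rest = [p for p in parts if p not in ref_set]
    edges = {frozenset(pair) for pair in combinations(ref, 2)}
    edges |= {frozenset((r, q)) for r in ref for q in rest}
    return edges


# Hypothetical five-part model.
parts = ["a", "b", "c", "d", "e"]
one_fan = fan_edges(parts, ["a"])        # star centered on "a"
two_fan = fan_edges(parts, ["a", "b"])   # "c", "d", "e" each attach to both "a" and "b"
```

With a single reference part the model is a star-shaped tree. Adding reference parts adds edges, but the non-reference parts are never connected to one another.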

9

Models (Weakly Supervised)

Car (Rear) 1-fan

Motorbike 2-fan

Face 1-fan

• Training examples only labeled as positive/negative

10

Recognition Results

High detection accuracy
– Motorbikes 98.6%, Faces 98.2%, Cars 94.4%, Planes 95.0%

Fast running time
– Approx. 2 sec. per image, 2 fans

Exact (global) method for computing highest probability configuration of parts for given image
– No approximations or local search techniques

Single overall optimization problem
– Does not depend on “feature detection”
