Efficient Algorithms for Matching
Pedro Felzenszwalb, Trevor Darrell, Yann LeCun, Alex Berg
Polynomial & exact
Multilinear & approximate
“Fast?” but very good
Happy when things work
First Criticism
• Efficiently computing the wrong solution is not so useful…
First Response
• Even if, say, an algorithm does not solve object recognition, it can still be a useful tool…
Why Matching?
• Ideas hatched before me
– Statistical Pattern Theory (Ulf Grenander)
– Deformable Templates
– Fischler & Elschlager
– Etc., at least by the early 1970s
• “Transform” and “appearance” parameters: M(T, A)
– Used to be continuous, now often discrete
– Very general: Translation / Diffeomorphism / Assignment; Image / Features / “Parts” / etc.
• Matching to estimate transform
– Searching over diffeomorphisms difficult
– Searching over discrete assignments easier?
[Diagram: MODEL, TRANSFORM, IMAGE]
Search for a Transformation
[Figure: Model of Car, “?”, Image]
Find Transformation Using Correspondence
[Figure: Model of Car, Image]
• Search through a discrete set of possible point correspondences
• Objective function should be close to cost of the original model
• Use the discrete correspondences to obtain a continuous transformation if needed
Sometimes…
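The last bullet above (discrete correspondences, then a continuous transformation) can be sketched as a least-squares fit. This is only an illustration: it assumes the transformation is a 2D similarity (rotation, scale, translation) and that the correspondences are already given. The function names `fit_similarity` and `apply_similarity` are made up for this sketch and do not come from the talk.

```python
def fit_similarity(model_pts, image_pts):
    """Least-squares 2D similarity from matched point pairs.

    Points are treated as complex numbers, so the transform is w = a * z + b,
    where a encodes rotation and scale and b encodes translation.
    """
    z = [complex(x, y) for x, y in model_pts]
    w = [complex(x, y) for x, y in image_pts]
    n = len(z)
    z_mean, w_mean = sum(z) / n, sum(w) / n
    zc = [p - z_mean for p in z]  # centered model points
    wc = [q - w_mean for q in w]  # centered image points
    denom = sum(abs(p) ** 2 for p in zc)
    if denom == 0:
        raise ValueError("model points must not all coincide")
    # Closed-form minimizer of sum |a * zc - wc|^2 over complex a.
    a = sum(p.conjugate() * q for p, q in zip(zc, wc)) / denom
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, pt):
    """Map a model point (x, y) into the image with w = a * z + b."""
    w = a * complex(*pt) + b
    return (w.real, w.imag)


# Hypothetical correspondences: model point i matches image point i.
model = [(0, 0), (1, 0), (0, 1)]
image = [(3, 1), (3, 3), (1, 1)]
a, b = fit_similarity(model, image)
print("scale:", abs(a), "translation:", (b.real, b.imag))
```

A richer transform family (affine, thin-plate spline, etc.) would follow the same pattern: the discrete matching supplies point pairs, and a separate fitting step turns them into a continuous map.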
Find Transformation Using Correspondence
[Figure: Model of Car, Image]
Why it works…
Sometimes we can measure consistency of model appearance locally.
Inspired by branch and bound: “If local appearance is inconsistent, any alignment with that appearance is bad.”
My preferred way of motivating local features…
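One way to read the branch-and-bound remark above: before searching over alignments, discard every candidate correspondence whose local appearance is already inconsistent. The sketch below assumes features are fixed-length descriptor tuples compared by Euclidean distance, and that a threshold `max_dist` is available; both are assumptions for illustration, not details from the talk.

```python
import math


def prune_candidates(model_descs, image_descs, max_dist):
    """For each model feature, keep only image features with similar appearance.

    Any alignment that uses a discarded pair would already be bad on local
    appearance alone, so the later search never needs to consider it.
    """
    candidates = []
    for m in model_descs:
        keep = [j for j, d in enumerate(image_descs)
                if math.dist(m, d) <= max_dist]
        candidates.append(keep)
    return candidates


# Hypothetical 2D descriptors; real descriptors would be much longer.
model_descs = [(0.0, 0.0), (1.0, 1.0)]
image_descs = [(0.0, 0.1), (1.0, 0.9), (5.0, 5.0)]
print(prune_candidates(model_descs, image_descs, max_dist=0.5))
```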
Find Transformation Using Correspondence
[Figure: Model of Car, Image]
Sometimes local appearance is not enough, so we model some version of spatial constraints.
Do not make the problem harder than it was…
Linear Assignment
e.g. Hungarian Algorithm
Just Features, no Geometry
Individual feature matches provide most of the solution.
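As a rough sketch of the "just features, no geometry" case, the code below solves a linear assignment problem with a standard O(n^3) potentials-based variant of the Hungarian algorithm. The cost matrix is assumed to hold appearance distances between model features (rows) and image features (columns), with at least as many columns as rows; the function name `hungarian` and the example costs are illustrative only.

```python
def hungarian(cost):
    """Minimum-cost assignment of each row to a distinct column.

    cost is a list of n rows with m >= n columns. Returns a list where
    entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so one more reduced cost becomes zero.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip matches along the augmenting path back to the root.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Hypothetical appearance costs: rows are model features, columns image features.
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
match = hungarian(cost)
print(match, sum(cost[i][j] for i, j in enumerate(match)))
```

In practice a library routine would normally be used instead of a hand-written solver; the point here is only that the feature-only problem is polynomial and exact.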
Quadratic Assignment
(Adding Geometric Constraints)
Individual feature matches provide most of the solution.
Geometric consistency only has to clean things up a little.
In this case we formulate the matching as an Integer Quadratic Programming problem and look for an approximate solution…
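To make the quadratic objective concrete, here is a small sketch with a unary term (feature match cost) plus a pairwise term (geometric distortion between pairs of matches). The pairwise term used here, the difference in inter-point distances, and the swap-based local search are stand-ins chosen for brevity; they are not the approximation method used in the talk, and all names are hypothetical.

```python
import itertools
import math


def matching_cost(assign, unary, model_pts, image_pts, weight=1.0):
    """Unary feature cost plus pairwise distortion for model i -> image assign[i]."""
    cost = sum(unary[i][a] for i, a in enumerate(assign))
    for i, j in itertools.combinations(range(len(assign)), 2):
        d_model = math.dist(model_pts[i], model_pts[j])
        d_image = math.dist(image_pts[assign[i]], image_pts[assign[j]])
        cost += weight * abs(d_model - d_image)  # quadratic in the match indicators
    return cost


def improve_by_swaps(assign, unary, model_pts, image_pts, weight=1.0):
    """Greedy local search: accept any pairwise swap that lowers the cost.

    This only finds a local optimum; it is a simple approximate solver,
    not an exact one.
    """
    assign = list(assign)
    best = matching_cost(assign, unary, model_pts, image_pts, weight)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(len(assign)), 2):
            assign[i], assign[j] = assign[j], assign[i]
            c = matching_cost(assign, unary, model_pts, image_pts, weight)
            if c < best - 1e-12:
                best, improved = c, True
            else:
                assign[i], assign[j] = assign[j], assign[i]  # undo the swap
    return assign, best


# Hypothetical data: the image is the model translated by (5, 5).
model_pts = [(0, 0), (1, 0), (0, 1)]
image_pts = [(5, 5), (6, 5), (5, 6)]
unary = [[0.1, 0.5, 0.5],
         [0.5, 0.1, 0.5],
         [0.5, 0.5, 0.1]]
print(improve_by_swaps([1, 0, 2], unary, model_pts, image_pts))
```

Starting the local search from the linear-assignment solution matches the intuition on the slide: feature matches give most of the answer, and the geometric term only has to repair a few of them.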
Second Meta-Comment
• Even if a problem can have very difficult instances, the effective complexity of certain instances might be quite low. This can be quite difficult to verify formally.
Use Alignment to Compare
Model of Car
• Given alignment, evaluate the model
• Note: we might have been done already (Grauman et al, Zhang et al)
• Actually do some alignment and check the quality of the fit.
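A minimal sketch of "check the quality of the fit", assuming the alignment is available as a function from model coordinates to image coordinates and that quality is measured by the root-mean-square residual between aligned model points and their matched image points. Both the measure and the name `rms_residual` are assumptions made for illustration.

```python
import math


def rms_residual(model_pts, image_pts, transform):
    """RMS distance between transformed model points and matched image points."""
    squared = [math.dist(transform(m), q) ** 2
               for m, q in zip(model_pts, image_pts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical alignment: a pure translation by (5, 5).
def translate(pt):
    return (pt[0] + 5, pt[1] + 5)


model_pts = [(0, 0), (1, 0)]
image_pts = [(5, 5), (6, 6)]
print(rms_residual(model_pts, image_pts, translate))
```

A low residual supports the hypothesis that the model is present; a high one suggests the correspondences were wrong or the object is absent.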
Back to the alignment…
Humans can be very efficient
• Simon Thorpe: animal or not in <=150 ms
– “Feed forward” process
– Difficult to retrain (familiarization does not make a difference)
– Salient parts of images are actually processed more rapidly.
– Support for some styles of current algorithms
– Neurophysiological evidence for some mid-level vision (illusory contours, figure ground, etc.), von der Heydt et al
That’s all fine, but
What is an Object?
• Apple yes, mist no
• A rule of thumb is that objects have some definite spatial support…
• Image/Scene? Some context models treat scenes or images as objects
• Face, eyes, nose, eye-lashes
• We can build SPT models for all of these…
Heuristics
• Take advantage of the data
– Sometimes a single feature is enough
– For efficiency need to weigh this against how often that feature is found
– Many (?) object recognition datasets allow easy discrimination between categories with only very simple features extracted from the whole image, e.g. Pascal and Caltech 101.
– Segmentation or Figure/Ground -- might as well see if there is an object there before trying to recognize it…
An Approach
1. Extract features from an image
2. Look features up in a large database
• Approximate Nearest Neighbor algorithms can make this sub-linear.
• Each entry tells us about which hypotheses [Object,position,pose,...] might be present.
3. Use a “short list” of these to check in more detail using a matching framework
• Simple matching can actually be indexed. Indyk et al, Grauman et al.
4. Finally use a matching to align models and apply more expensive processing
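Steps 1 to 3 above can be sketched as an index lookup followed by voting. The sketch assumes descriptors are short numeric tuples and uses grid quantization as a crude stand-in for an approximate nearest neighbor structure (real systems would more likely use locality-sensitive hashing or tree-based indexes). Hypotheses are plain strings here, where the slide suggests richer [Object, position, pose, ...] entries; all names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def quantize(desc, cell=0.5):
    """Map a descriptor to a grid cell; nearby descriptors usually share a cell."""
    return tuple(math.floor(v / cell) for v in desc)


def build_index(entries, cell=0.5):
    """entries: iterable of (descriptor, hypothesis) pairs from the database."""
    index = defaultdict(list)
    for desc, hypothesis in entries:
        index[quantize(desc, cell)].append(hypothesis)
    return index


def short_list(image_descs, index, k=2, cell=0.5):
    """Vote for hypotheses hit by the image features and return the top k."""
    votes = Counter()
    for desc in image_descs:
        for hypothesis in index.get(quantize(desc, cell), []):
            votes[hypothesis] += 1
    return [hypothesis for hypothesis, _ in votes.most_common(k)]


# Hypothetical database and query features.
database = [((0.1, 0.1), "car"), ((0.9, 0.2), "car"),
            ((2.1, 2.2), "face"), ((3.3, 0.1), "cup")]
index = build_index(database)
query = [(0.2, 0.3), (0.8, 0.4), (2.2, 2.4)]
print(short_list(query, index, k=2))
```

Each lookup costs roughly constant time regardless of database size, which is the sub-linear behavior the slide refers to; the short list would then be passed to the more expensive matching and alignment steps.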