eecs 274 computer vision local invariant features
![Page 1: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/1.jpg)
EECS 274 Computer Vision
Local Invariant Features
![Page 2: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/2.jpg)
Local features
• Matching with Harris Detector
• Scale-invariant Feature Detection
• Scale Invariant Image Descriptors
• Affine-invariant Feature Detection
• Object Recognition
• SIFT Features
• Reading: S Chapter 4
![Page 3: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/3.jpg)
Examples
![Page 4: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/4.jpg)
Features
![Page 5: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/5.jpg)
What local features to use?
![Page 6: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/6.jpg)
Aperture problem
Ambiguity of 1-dimensional motion perception
Stripes moved upward 6 pixels
Stripes moved left 5 pixels
![Page 7: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/7.jpg)
Introduction
Local invariant photometric descriptors
[Figure: image patch mapped to a local descriptor vector]
Local: robust to occlusion/clutter + no segmentation
Photometric: distinctive
Invariant: to image transformations + illumination changes
![Page 8: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/8.jpg)
History - Recognition
Color histogram [Swain 91]
[Figure: color vectors plotted on r, g, b axes]
Each pixel is described by a color vector
Distribution of color vectors is described by a histogram
not robust to occlusion, not invariant, not distinctive
![Page 9: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/9.jpg)
History - Recognition
Eigenimages [Turk 91]
• Each face vector is represented in the eigenimage space
– eigenvectors with the highest eigenvalues = eigenimages
• The new image is projected into the eigenimage space
– determine the closest face
[Figure: face vectors plotted on eigenimage axes v1, v2, v3]
not robust to occlusion, requires segmentation, not invariant, discriminant
![Page 10: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/10.jpg)
History - Recognition
Geometric invariants [Rothwell 92]
• Function with a value independent of the transformation
$$f(x, y) = f(x', y') \quad \text{where} \quad (x', y')^t = T\,(x, y)^t$$
• Invariant for image rotation: distance of two points
• Invariant for planar homography: cross-ratio
local and invariant, not discriminant, requires sub-pixel extraction of primitives
![Page 11: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/11.jpg)
History - Recognition
Problems : occlusion, clutter, image transformations, distinctiveness
Solution : recognition with local photometric invariants
[ Local greyvalue invariants for image retrieval, C. Schmid and R. Mohr, PAMI 1997 ]
![Page 12: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/12.jpg)
Approach
1) Extraction of interest points (characteristic locations)
2) Computation of local descriptors
3) Determining correspondences
4) Selection of similar images
[Figure: image patch mapped to a local descriptor vector]
![Page 13: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/13.jpg)
Matching with interest points
• Extraction of interest points with the Harris detector
• Comparison of points with cross-correlation
• Verification with the fundamental matrix
![Page 14: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/14.jpg)
Moravec corner detector
• Developed for Stanford Cart in 1977
![Page 15: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/15.jpg)
Moravec corner detector
Change of intensity for the shift [u,v]:
$$E(u, v) = \sum_{x, y} w(x, y)\,\big[I(x + u, y + v) - I(x, y)\big]^2$$
where $w(x, y)$ is the window function, $I(x + u, y + v)$ the shifted intensity, and $I(x, y)$ the intensity.
Four shifts: (u,v) = (1,0), (1,1), (0,1), (-1, 1)
Look for local maxima in min(E)
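A minimal sketch of the Moravec measure, assuming a binary square window and an image stored as a list of rows; the function name `moravec_response`, the `half` parameter, and the synthetic test images are illustrative choices, not part of the slides.

```python
def moravec_response(img, x, y, half=1):
    """Minimum over the four shifts of E(u, v) at pixel (x, y).

    img is a list of rows (img[row][col]); the window is binary:
    w = 1 inside the (2*half + 1)^2 square and 0 outside.
    """
    shifts = [(1, 0), (1, 1), (0, 1), (-1, 1)]
    energies = []
    for u, v in shifts:
        e = 0.0
        for yy in range(y - half, y + half + 1):
            for xx in range(x - half, x + half + 1):
                diff = img[yy + v][xx + u] - img[yy][xx]
                e += diff * diff
        energies.append(e)
    return min(energies)


# Synthetic 7x7 images (illustrative only).
corner = [[1.0 if (c >= 3 and r >= 3) else 0.0 for c in range(7)] for r in range(7)]
edge = [[1.0 if c >= 3 else 0.0 for c in range(7)] for r in range(7)]
flat = [[0.5] * 7 for _ in range(7)]

print(moravec_response(corner, 3, 3))
print(moravec_response(edge, 3, 3))
print(moravec_response(flat, 3, 3))
```

Only the corner patch gives a positive min(E): along an edge at least one of the four shifts leaves the window unchanged, so the minimum drops to zero.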
![Page 16: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/16.jpg)
Problems of Moravec detector
• Noisy response due to a binary window function
• Only a set of shifts at every 45 degrees is considered
• Only the minimum of E is taken into account
Harris corner detector (1988) solves these problems.
![Page 17: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/17.jpg)
Harris detector
Based on the idea of auto-correlation
Important difference in all directions => interest point
![Page 18: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/18.jpg)
Interest points
Geometric features: repeatable under transformations
2D characteristics of the signal: high informational content
Comparison of different detectors [Schmid98] => Harris detector
![Page 19: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/19.jpg)
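Harris detector
Auto-correlation function for a point $(x, y)$ and a shift $(u, v)$:
$$E(u, v) = \sum_{x} \sum_{y} w(x, y)\,\big(I(x, y) - I(x + u, y + v)\big)^2$$
Discrete shifts can be avoided with the auto-correlation matrix, with
$$I(x + u, y + v) \approx I(x, y) + \big(I_x(x, y) \;\; I_y(x, y)\big) \begin{pmatrix} u \\ v \end{pmatrix}$$
$$E(u, v) \approx \sum_{x, y} w(x, y) \left( \big(I_x(x, y) \;\; I_y(x, y)\big) \begin{pmatrix} u \\ v \end{pmatrix} \right)^2 = (u \;\; v) \begin{pmatrix} \sum_{x,y} w(x, y)\, I_x(x, y)^2 & \sum_{x,y} w(x, y)\, I_x(x, y) I_y(x, y) \\ \sum_{x,y} w(x, y)\, I_x(x, y) I_y(x, y) & \sum_{x,y} w(x, y)\, I_y(x, y)^2 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$$
$$E(u, v) \approx \mathbf{u}^T A\, \mathbf{u}, \qquad A = w * \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}, \qquad \mathbf{u} = (u, v)^T$$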
![Page 20: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/20.jpg)
Comparison of detectors: Rotation
[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]
repeatability = #good matches/mean(#points)
![Page 21: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/21.jpg)
Comparison of detectors: Perspective
[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]
repeatability = #good matches/mean(#points)
![Page 22: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/22.jpg)
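Harris corner detector
Intensity change in shifting window: eigenvalue analysis
$$E(\mathbf{u}) = E(u, v) \approx \mathbf{u}^T A\, \mathbf{u}, \qquad A = w * \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$
$\lambda_1, \lambda_2$ – eigenvalues of A
Ellipse E(u,v) = const (uncertainty ellipse): the semi-axis of length $(\lambda_{max})^{-1/2}$ lies along the direction of the fastest change, the semi-axis of length $(\lambda_{min})^{-1/2}$ along the direction of the slowest change
Shi and Tomasi use $\min(\lambda_1, \lambda_2)$ to locate good features to track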
![Page 23: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/23.jpg)
Harris corner detector
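Classification of image points using eigenvalues of A:
• Corner: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; E increases in all directions
• Edge: $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$
• Flat: $\lambda_1$ and $\lambda_2$ are small; E is almost constant in all directions
[Figure: regions of the $(\lambda_1, \lambda_2)$ plane labeled corner, edge, flat]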
![Page 24: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/24.jpg)
Harris detection
• Auto-correlation matrix
– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix
• 2 strong eigenvalues => interest point
• 1 strong eigenvalue => contour
• 0 eigenvalue => uniform region
• Interest point detection
– threshold on the eigenvalues
– local maximum for localization
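The eigenvalue rule above can be sketched as follows. This is an illustration under stated assumptions: gradients come from central differences, the window is uniform rather than Gaussian, and the `threshold` value and the function names are hypothetical.

```python
import math

def structure_tensor(img, x, y, half=2):
    """Entries (a, b, c) of A = [[a, b], [b, c]] summed over a square window.

    Gradients use central differences; the window is uniform (an assumption
    made for brevity, a Gaussian window could be used instead).
    """
    a = b = c = 0.0
    for yy in range(y - half, y + half + 1):
        for xx in range(x - half, x + half + 1):
            ix = (img[yy][xx + 1] - img[yy][xx - 1]) / 2.0
            iy = (img[yy + 1][xx] - img[yy - 1][xx]) / 2.0
            a += ix * ix
            b += ix * iy
            c += iy * iy
    return a, b, c

def eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + radius, mean - radius

def classify(img, x, y, threshold=0.5):
    """Label a pixel by counting eigenvalues above a hypothetical threshold."""
    lam_max, lam_min = eigenvalues(*structure_tensor(img, x, y))
    strong = sum(lam > threshold for lam in (lam_max, lam_min))
    return ["uniform region", "contour", "interest point"][strong]


# Synthetic 9x9 images (illustrative only).
corner = [[1.0 if (c >= 4 and r >= 4) else 0.0 for c in range(9)] for r in range(9)]
edge = [[1.0 if c >= 4 else 0.0 for c in range(9)] for r in range(9)]
flat = [[0.5] * 9 for _ in range(9)]

for name, img in (("corner", corner), ("edge", edge), ("flat", flat)):
    print(name, classify(img, 4, 4))
```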
![Page 25: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/25.jpg)
Harris corner detector
Measure of corner response:
$$R = \det(A) - k\,(\operatorname{trace}(A))^2 = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2$$
(k – empirical constant, k = 0.04-0.06)
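As a small worked example of the corner response, the sketch below evaluates R directly from the entries of A without computing eigenvalues. The function name and the sample matrices are illustrative; k = 0.04 is taken from the range quoted on the slide.

```python
def harris_response(a, b, c, k=0.04):
    """R = det(A) - k * trace(A)^2 for A = [[a, b], [b, c]]."""
    det = a * c - b * b
    trace = a + c
    return det - k * trace * trace


print(harris_response(10.0, 0.0, 10.0))  # two strong eigenvalues
print(harris_response(10.0, 0.0, 0.0))   # one strong eigenvalue
print(harris_response(0.1, 0.0, 0.1))    # no strong eigenvalue
```

R is large and positive when both eigenvalues are strong, negative when only one is, and close to zero in a flat region, so a single threshold on R separates corners from the other two cases.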
![Page 26: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/26.jpg)
Example
![Page 27: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/27.jpg)
Example
![Page 28: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/28.jpg)
Example
![Page 29: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/29.jpg)
Good features
Using auto-correlation or Hessian matrix
![Page 30: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/30.jpg)
Local descriptors
Descriptors characterize the local neighborhood of a point
[Figure: image patch mapped to a local descriptor vector]
![Page 31: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/31.jpg)
Local jet
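Convolution of image I with Gaussian derivatives
$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \end{pmatrix}$$
$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} G(x', y', \sigma)\, I(x - x', y - y')\, dx'\, dy'$$
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$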
![Page 32: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/32.jpg)
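N-Jet, local jet
• Invariance to image rotation: differential invariants [Koen87]
$$\begin{pmatrix} L \\ L_x L_x + L_y L_y \\ L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\ L_{xx} + L_{yy} \\ L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy} \\ \vdots \end{pmatrix} = \begin{pmatrix} L \\ L_i L_i \\ L_i L_{ij} L_j \\ L_{ii} \\ L_{ij} L_{ji} \\ \varepsilon_{ij}\,(L_{jkl} L_i L_k L_l - L_{jkk} L_i L_l L_l) \\ L_{iij} L_j L_k L_k - L_{ijk} L_i L_j L_k \\ -\varepsilon_{ij} L_{jkl} L_i L_k L_l \\ L_{ijk} L_i L_j L_k \end{pmatrix}$$
where $\varepsilon_{ij}$ is the antisymmetric epsilon tensor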
![Page 33: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/33.jpg)
Local descriptors
Robustness to illumination changes
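In case of an affine transformation
$$I_1(\mathbf{x}) = a\, I_2(\mathbf{x}) + b$$
$$\begin{pmatrix} \dfrac{L_i L_{ij} L_j}{(L_i L_i)^{3/2}} \\[2ex] \dfrac{L_{ii}}{(L_i L_i)^{1/2}} \\[2ex] \dfrac{L_{ij} L_{ji}}{L_i L_i} \\[2ex] \dfrac{\varepsilon_{ij}\,(L_{jkl} L_i L_k L_l - L_{jkk} L_i L_l L_l)}{(L_i L_i)^2} \\[2ex] \dfrac{L_{iij} L_j L_k L_k - L_{ijk} L_i L_j L_k}{(L_i L_i)^2} \\[2ex] \dfrac{-\varepsilon_{ij} L_{jkl} L_i L_k L_l}{(L_i L_i)^2} \\[2ex] \dfrac{L_{ijk} L_i L_j L_k}{(L_i L_i)^2} \end{pmatrix}$$
or normalization of the image patch with mean and variance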
![Page 34: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/34.jpg)
Determining correspondences
Vector comparison using the Mahalanobis distance
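$$\operatorname{dist}_M(\mathbf{p}, \mathbf{q}) = (\mathbf{p} - \mathbf{q})^T \Lambda^{-1} (\mathbf{p} - \mathbf{q})$$
where $\Lambda$ is the covariance matrix of the descriptor components
[Figure: two local descriptor vectors compared, ( ) =? ( )]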
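A minimal sketch of the vector comparison above, assuming the inverse covariance matrix has already been estimated and is passed in directly; the function name `mahalanobis_sq` and the sample values are illustrative. The value returned is the quadratic form as written on the slide, without a square root.

```python
def mahalanobis_sq(p, q, cov_inv):
    """(p - q)^T cov_inv (p - q), with cov_inv the inverse covariance matrix."""
    d = [pi - qi for pi, qi in zip(p, q)]
    n = len(d)
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(n) for j in range(n))


identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[1.0 / 9.0, 0.0], [0.0, 1.0 / 16.0]]  # variances 9 and 16

print(mahalanobis_sq((1.0, 2.0), (4.0, 6.0), identity))
print(mahalanobis_sq((1.0, 2.0), (4.0, 6.0), scaled))
```

With the identity matrix the measure reduces to the squared Euclidean distance; with per-component variances, each difference is measured in units of that component's standard deviation.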
![Page 35: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/35.jpg)
Selection of similar images
• In a large database
– voting algorithm
– additional constraints
• Rapid access with an indexing mechanism
![Page 36: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/36.jpg)
Voting algorithm
[Figure: vector of local characteristics ( )]
![Page 37: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/37.jpg)
Voting algorithm
[Figure: table of votes cast for each model image]
$I_1$ is the corresponding model image
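The voting idea can be sketched as below. This is a simplified illustration: it uses plain squared Euclidean distance instead of the Mahalanobis distance, a linear scan instead of an index, and a hypothetical acceptance threshold `max_dist_sq`; the descriptors and image names are made up.

```python
from collections import Counter

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def vote(query_descriptors, database, max_dist_sq=1.0):
    """Return vote counts per model image.

    database is a list of (descriptor, image_id) pairs. Each query descriptor
    votes for the image of its nearest database descriptor, provided the
    squared distance is below max_dist_sq (a hypothetical threshold).
    """
    votes = Counter()
    for q in query_descriptors:
        dist, image_id = min((squared_distance(q, d), img) for d, img in database)
        if dist < max_dist_sq:
            votes[image_id] += 1
    return votes


database = [((0.0, 0.0), "I1"), ((1.0, 1.0), "I1"), ((5.0, 5.0), "I2"), ((9.0, 0.0), "I3")]
queries = [(0.1, 0.0), (0.9, 1.1), (5.2, 4.9), (20.0, 20.0)]

votes = vote(queries, database)
print(votes.most_common())
```

The model image with the most votes is selected; the last query descriptor is too far from every database entry and casts no vote.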
![Page 38: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/38.jpg)
Additional constraints
• Semi-local constraints
– neighboring points should match
– angles, length ratios should be similar
• Global constraints
– robust estimation of the image transformation (homography, epipolar geometry)
[Figure: corresponding points labeled 1, 2, 3 in two images]
![Page 39: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/39.jpg)
Results
database with ~1000 images
![Page 40: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/40.jpg)
Results
![Page 41: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/41.jpg)
Harris detector
Interest points extracted with Harris (~ 500 points)
![Page 42: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/42.jpg)
Cross-correlation matching
Initial matches (188 pairs)
![Page 43: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/43.jpg)
Global constraints
Robust estimation of the fundamental matrix
99 inliers 89 outliers
![Page 44: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/44.jpg)
Summary of Harris detector
• Very good results in the presence of occlusion and clutter
– local information
– discriminant greyvalue information
– invariance to image rotation and illumination
• Not invariant to scale and affine changes
• Solution for more general viewpoint changes
– local descriptors invariant to scale and rotation
– extraction of invariant points and regions
![Page 45: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/45.jpg)
Scale Invariant Feature Detection
• Consider two images of the same scene, related by a scale change (i.e. zooming)
• How are their scale space representations related?
Scale Space Theory (Lindeberg ’98):
$$I(x, y) = I'(x', y'), \qquad (x', y') = (s x, s y), \qquad t' = s^2 t$$
$$\text{normalized derivatives:} \quad \partial_\xi = t^{\gamma/2}\, \partial_x, \quad \partial_\eta = t^{\gamma/2}\, \partial_y$$
$$\partial_{\xi^m} L(x, y, t) = s^{m(1 - \gamma)}\, \partial_{\xi'^m} L'(x', y', t')$$
$\gamma$ is a free parameter for the task at hand
Normalized derivatives have the same value at corresponding relative scales
![Page 46: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/46.jpg)
Harris detector + scale changes
![Page 47: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/47.jpg)
Scale Adapted Harris Detector
Many corresponding points at which the scale factor corresponds to the scale change between images
![Page 48: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/48.jpg)
Scale invariant Harris points
• Multi-scale extraction of Harris interest points
• Selection of points at characteristic scale in scale space
Laplacian
Characteristic scale:
- maximum in scale space
- scale invariant
![Page 49: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/49.jpg)
Scale invariant interest points
invariant points + associated regions [Mikolajczyk & Schmid’01] Harris-Laplacian Feature
multi-scale Harris points
selection of points at the characteristic scale with Laplacian
![Page 50: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/50.jpg)
Harris detector – adaptation to scale
![Page 51: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/51.jpg)
Evaluation of scale invariant detectors
repeatability – scale changes
![Page 52: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/52.jpg)
SIFT: Overview
• 1999
• Generates image features, “keypoints”
– invariant to image scaling and rotation
– partially invariant to change in illumination and 3D camera viewpoint
– many can be extracted from typical images
– highly distinctive
![Page 53: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/53.jpg)
Algorithm overview
1. Scale-space extrema detection
– Uses difference-of-Gaussian function
2. Keypoint localization
– Sub-pixel location and scale fit to a model
3. Orientation assignment
– 1 or more for each keypoint
4. Keypoint descriptor
– Created from local image gradients
![Page 54: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/54.jpg)
Scale space
• Definition:
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$
where
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}$$
![Page 55: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/55.jpg)
Scale space
• Keypoints are detected using scale-space extrema in difference-of-Gaussian function D
• D definition:
$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$
• Efficient to compute
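A minimal sketch of the difference-of-Gaussian computation, assuming a separable Gaussian truncated at 3σ and clamped image borders; these implementation choices and the function names are assumptions for illustration, not the method of any particular SIFT implementation.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(img, sigma):
    """L = G * I by separable convolution; borders are clamped (an assumption)."""
    kernel = gaussian_kernel(sigma)
    r = len(kernel) // 2
    h, w = len(img), len(img[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    rows = [[sum(kernel[i + r] * img[y][clamp(x + i, w)] for i in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[i + r] * rows[clamp(y + i, h)][x] for i in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def difference_of_gaussian(img, sigma, k):
    """D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)."""
    coarse, fine = blur(img, k * sigma), blur(img, sigma)
    return [[c - f for c, f in zip(crow, frow)] for crow, frow in zip(coarse, fine)]


# A single bright pixel on a dark 15x15 background (illustrative only).
dot = [[1.0 if (r == 7 and c == 7) else 0.0 for c in range(15)] for r in range(15)]
d = difference_of_gaussian(dot, 1.6, math.sqrt(2.0))
print(d[7][7])
```

D is cheap because both blurred images are needed for the pyramid anyway, so the only extra work is one subtraction per pixel.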
![Page 56: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/56.jpg)
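Relationship of D to $\sigma^2 \nabla^2 G$
• Close approximation to scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$
• Diffusion equation:
$$\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G$$
• Approximate ∂G/∂σ:
$$\frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}$$
– giving,
$$G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^2 \nabla^2 G$$
• When D has scales differing by a constant factor it already incorporates the σ² scale normalization required for scale-invariance
• e.g., $k = \sqrt{2}$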
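The approximation above can be checked numerically. The sketch below compares the difference of two Gaussians with (k − 1)σ²∇²G at a point, using the analytic Laplacian of the 2D Gaussian; the function names and the sample values of σ and k are illustrative.

```python
import math

def gaussian(x, y, sigma):
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

def laplacian_of_gaussian(x, y, sigma):
    """Analytic Laplacian of the 2D Gaussian: G * (x^2 + y^2 - 2 sigma^2) / sigma^4."""
    return gaussian(x, y, sigma) * (x * x + y * y - 2.0 * sigma ** 2) / sigma ** 4

def dog(x, y, sigma, k):
    """G(x, y, k*sigma) - G(x, y, sigma)."""
    return gaussian(x, y, k * sigma) - gaussian(x, y, sigma)

def scaled_log(x, y, sigma, k):
    """(k - 1) * sigma^2 * Laplacian of G."""
    return (k - 1.0) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)


for point in ((0.0, 0.0), (1.0, 0.0)):
    print(point, dog(*point, 1.6, 1.01), scaled_log(*point, 1.6, 1.01))
```

The two columns agree to within a few percent for k close to 1. For larger k the finite difference is a coarser approximation of ∂G/∂σ, but the factor (k − 1) stays constant over all scales, so it does not affect the location of extrema.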
![Page 57: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/57.jpg)
Scale space construction
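[Figure: scale-space pyramid; Gaussian images at scales σ, kσ, 2σ, 2kσ, 2k²σ and difference-of-Gaussian images at scales σ, kσ, 2σ, 2kσ]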
![Page 58: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/58.jpg)
Scale space
• A collection of images obtained by progressively smoothing the input image
• Analogous to gradually reducing image resolution
• See Vedaldi’s implementation (http://www.vlfeat.org/) for details
• Discretized scales
$$\sigma(s, o) = \sigma_0\, 2^{\,o + s/S}, \qquad o \in [o_{min}, \ldots, o_{min} + O - 1], \quad s \in [0, \ldots, S - 1], \quad \sigma_0 = 1.6$$
S: number of scales per octave, O: number of octaves
$$O = \log_2(\min(w, h)), \qquad S = 3$$
$$\mathrm{DoG}(\sigma(s, o)) = I(\sigma(s + 1, o)) - I(\sigma(s, o))$$
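The scale schedule can be tabulated with a few lines. This sketch assumes o_min = 0 and rounds the octave count down; both choices, and the function names, are illustrative assumptions rather than details given on the slide.

```python
import math

def sigma(s, o, sigma0=1.6, S=3):
    """sigma(s, o) = sigma0 * 2^(o + s/S)."""
    return sigma0 * 2.0 ** (o + s / S)

def num_octaves(w, h):
    """O = log2(min(w, h)), rounded down here (the rounding is an assumption)."""
    return int(math.floor(math.log2(min(w, h))))


for o in range(2):
    print("octave", o, [round(sigma(s, o), 3) for s in range(3)])
print("octaves for a 640x480 image:", num_octaves(640, 480))
```

Stepping s through a full octave (s = S) lands exactly on the first scale of the next octave, which is why each octave can start from a downsampled copy of the previous one.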
![Page 59: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/59.jpg)
Scale space images
…
first octave
…
…
second octave
…
…
third octave
…
fourth octave
…
…
![Page 60: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/60.jpg)
Difference-of-Gaussian images
…
first octave
…
…
second octave
…
…
third octave
…
fourth octave
…
…
![Page 61: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/61.jpg)
Frequency of sampling
• There is no minimum
• Best frequency determined experimentally
![Page 62: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/62.jpg)
Prior smoothing for each octave
• Increasing σ increases robustness, but at a computational cost
• σ = 1.6 a good tradeoff
• Doubling the image initially increases number of keypoints
![Page 63: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/63.jpg)
Finding extrema
• Sample point D(x,y,σ) is selected only if it is a minimum or a maximum of these points
DoG scale space
Extrema in this image
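A minimal sketch of the extremum test, assuming the comparison set is the 26 neighbors in the 3x3x3 cube around the sample (8 in the same DoG image, 9 in each adjacent scale), as in Lowe's SIFT paper. The DoG stack is assumed to be a nested list indexed as `dog[scale][row][col]`; the function name is illustrative.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger or strictly smaller than all
    26 neighbors in the 3x3x3 cube spanning the adjacent scales."""
    center = dog[s][y][x]
    neighbors = [dog[s + ds][y + dy][x + dx]
                 for ds in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (ds, dy, dx) != (0, 0, 0)]
    return center > max(neighbors) or center < min(neighbors)


def make_stack(center_value):
    """3x3x3 stack of zeros with the given value at the center (illustrative)."""
    stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    stack[1][1][1] = center_value
    return stack

print(is_extremum(make_stack(1.0), 1, 1, 1))
print(is_extremum(make_stack(0.0), 1, 1, 1))
```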
![Page 64: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/64.jpg)
Localization
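• 3D quadratic function is fit to the local sample points
• Start with Taylor expansion with sample point as the origin
$$D(X) = D + \frac{\partial D}{\partial X}^T X + \frac{1}{2} X^T \frac{\partial^2 D}{\partial X^2} X$$
– where $X = (x, y, \sigma)^T$
• Take the derivative with respect to X, and set it to 0, giving
$$0 = \frac{\partial D}{\partial X} + \frac{\partial^2 D}{\partial X^2} \hat{X}$$
• $\hat{X} = -\left(\dfrac{\partial^2 D}{\partial X^2}\right)^{-1} \dfrac{\partial D}{\partial X}$ is the location of the keypoint
• This is a 3x3 linear system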
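The 3x3 system can be solved directly. The sketch below uses Cramer's rule to stay dependency-free; the function names are illustrative, and the test data come from a synthetic quadratic whose true peak is known, not from a real DoG stack.

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def refine_keypoint(gradient, hessian):
    """Solve hessian * offset = -gradient by Cramer's rule.

    gradient is dD/dX and hessian is d2D/dX2, both evaluated at the sample
    point, with X = (x, y, sigma). Returns the offset X_hat.
    """
    d = det3(hessian)
    if abs(d) < 1e-12:
        raise ValueError("singular Hessian")
    rhs = [-g for g in gradient]
    offset = []
    for col in range(3):
        m = [row[:] for row in hessian]
        for row in range(3):
            m[row][col] = rhs[row]
        offset.append(det3(m) / d)
    return offset


# Synthetic check: D(X) = -|X - X*|^2 with X* = (0.2, -0.1, 0.3), so at the
# origin the gradient is 2 X* and the Hessian is -2 I.
gradient = [0.4, -0.2, 0.6]
hessian = [[-2.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, -2.0]]
print(refine_keypoint(gradient, hessian))
```

For an exactly quadratic D the offset recovers the true peak in one step; on real data the fit is only local, which is why an offset larger than half a sample spacing suggests the extremum lies closer to a neighboring sample.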
![Page 65: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/65.jpg)
Localization
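• Hessian and derivative approximated by finite differences,
$$\begin{pmatrix} \dfrac{\partial^2 D}{\partial x^2} & \dfrac{\partial^2 D}{\partial x \partial y} & \dfrac{\partial^2 D}{\partial x \partial \sigma} \\[2ex] \dfrac{\partial^2 D}{\partial x \partial y} & \dfrac{\partial^2 D}{\partial y^2} & \dfrac{\partial^2 D}{\partial y \partial \sigma} \\[2ex] \dfrac{\partial^2 D}{\partial x \partial \sigma} & \dfrac{\partial^2 D}{\partial y \partial \sigma} & \dfrac{\partial^2 D}{\partial \sigma^2} \end{pmatrix} \begin{pmatrix} x \\ y \\ \sigma \end{pmatrix} = - \begin{pmatrix} \dfrac{\partial D}{\partial x} \\[2ex] \dfrac{\partial D}{\partial y} \\[2ex] \dfrac{\partial D}{\partial \sigma} \end{pmatrix}$$
– example (k indexes the scale level, i and j the pixel, with i along y):
$$\frac{\partial D}{\partial \sigma} = \frac{D_{k+1}^{i,j} - D_{k-1}^{i,j}}{2}$$
$$\frac{\partial^2 D}{\partial \sigma^2} = \frac{D_{k-1}^{i,j} - 2 D_{k}^{i,j} + D_{k+1}^{i,j}}{1}$$
$$\frac{\partial^2 D}{\partial \sigma \partial y} = \frac{\big(D_{k+1}^{i+1,j} - D_{k+1}^{i-1,j}\big) - \big(D_{k-1}^{i+1,j} - D_{k-1}^{i-1,j}\big)}{4}$$
$$0 = \frac{\partial D}{\partial X} + \frac{\partial^2 D}{\partial X^2} \hat{X}$$
• If $\hat{X}$ is > 0.5 in any dimension, process repeated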
![Page 66: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/66.jpg)
Filtering
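• Contrast (use prev. equation):
$$D(\hat{X}) = D + \frac{1}{2} \frac{\partial D}{\partial X}^T \hat{X}$$
– If $|D(\hat{X})| < 0.03$, throw it out
• Edge-iness:
– Use ratio of principal curvatures to throw out poorly defined peaks
– Curvatures come from Hessian:
$$H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}$$
– Ratio of Trace(H)² and Determinant(H)
$$\operatorname{Tr}(H) = D_{xx} + D_{yy}, \qquad \operatorname{Det}(H) = D_{xx} D_{yy} - (D_{xy})^2, \qquad \text{ratio} = \frac{\operatorname{Tr}(H)^2}{\operatorname{Det}(H)}$$
– If ratio > (r+1)²/r, throw it out (SIFT uses r=10)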
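The two filters can be combined into one predicate, sketched below. The function name and argument layout are illustrative; the rejection of a non-positive determinant is an added guard (curvatures of opposite sign), not something stated on the slide.

```python
def keep_keypoint(d_value, gradient, offset, dxx, dyy, dxy, contrast=0.03, r=10.0):
    """Apply the contrast and edge tests to one refined keypoint."""
    # Contrast: D(X_hat) = D + 0.5 * (dD/dX)^T X_hat
    d_hat = d_value + 0.5 * sum(g * o for g, o in zip(gradient, offset))
    if abs(d_hat) < contrast:
        return False
    # Edge test on the 2x2 spatial Hessian.
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # added guard: principal curvatures have opposite signs
        return False
    return trace * trace / det <= (r + 1.0) ** 2 / r


zero = (0.0, 0.0, 0.0)
print(keep_keypoint(0.10, zero, zero, -1.0, -1.0, 0.0))   # well-defined peak
print(keep_keypoint(0.01, zero, zero, -1.0, -1.0, 0.0))   # low contrast
print(keep_keypoint(0.10, zero, zero, -10.0, -0.1, 0.0))  # edge-like
```

Working with the trace and determinant avoids computing the eigenvalues of H: for curvature ratio r the quantity Tr(H)²/Det(H) equals (r+1)²/r, and it grows as the two curvatures become more unequal.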
![Page 67: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/67.jpg)
Orientation assignment
• Descriptor computed relative to keypoint’s orientation achieves rotation invariance
• Precomputed along with magnitude for all levels (useful in descriptor computation)
$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2}$$
$$\theta(x, y) = \operatorname{atan2}\big(L(x, y+1) - L(x, y-1),\; L(x+1, y) - L(x-1, y)\big)$$
• Multiple orientations assigned to keypoints from an orientation histogram
– Significantly improve stability of matching
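A minimal sketch of the per-pixel gradient magnitude and orientation from pixel differences, assuming the smoothed image L is stored as a list of rows indexed `L[y][x]`; the function name and the ramp images are illustrative.

```python
import math

def gradient_at(L, x, y):
    """Gradient magnitude m and orientation theta at (x, y), with L[y][x]."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    return math.hypot(dx, dy), math.atan2(dy, dx)


# Intensity ramps along x and along y (illustrative only).
ramp_x = [[float(c) for c in range(5)] for r in range(5)]
ramp_y = [[float(r) for c in range(5)] for r in range(5)]

print(gradient_at(ramp_x, 2, 2))
print(gradient_at(ramp_y, 2, 2))
```

Using the two-argument arctangent keeps the full 0 to 2π range of directions and avoids a division by zero when the horizontal difference vanishes.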
![Page 68: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/68.jpg)
Keypoint images
![Page 69: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/69.jpg)
Keypoint Selection
Finding extrema with DoG
Removing |D(X)| < 0.03
Removing |D(X)| < 0.03
![Page 70: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/70.jpg)
Select canonical orientation
• Create histogram of local gradient directions computed at selected scale
• Assign canonical orientation at peak of smoothed histogram
• Each key specifies stable 2D coordinates (x, y, scale, orientation)
[Figure: orientation histogram over 0 to 2π]
![Page 71: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/71.jpg)
Descriptor
• Descriptor has 3 dimensions (x,y,θ)
• Orientation histogram of gradient magnitudes
• Position and orientation of each gradient sample rotated relative to keypoint orientation
![Page 72: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/72.jpg)
Descriptor
• Weight magnitude of each sample point by Gaussian weighting function
• Distribute each sample to adjacent bins by trilinear interpolation (avoids boundary effects)
![Page 73: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/73.jpg)
Descriptor
• Best results achieved with 4x4x8 = 128 descriptor size
• Normalize to unit length
– Reduces effect of illumination change
• Cap each element to 0.2, normalize again
– Reduces non-linear illumination changes
– 0.2 determined experimentally
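The normalize, cap, renormalize sequence can be sketched as follows. The function name is illustrative and the 4-element input stands in for a real 128-element descriptor.

```python
import math

def normalize_descriptor(vec, cap=0.2):
    """Normalize to unit length, cap each element at `cap`, normalize again."""
    def unit(v):
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v] if norm > 0 else list(v)

    capped = [min(a, cap) for a in unit(vec)]
    return unit(capped)


desc = normalize_descriptor([3.0, 4.0, 0.0, 0.0])
print(desc)
```

The first normalization cancels a uniform change in contrast; the cap then limits the influence of a few very large gradient magnitudes, so the final vector depends more on the distribution of orientations than on individual strong edges.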
![Page 74: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/74.jpg)
Object detection
• Create a database of keypoints from training images
• Match keypoints to a database
– Nearest neighbor search
![Page 75: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/75.jpg)
PCA-SIFT
• Different descriptor (same keypoints)
• Apply PCA to the gradient patch
• Descriptor size is 20 (instead of 128)
• More robust, faster
![Page 76: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/76.jpg)
SIFT: Summary
• Scale space
• Difference-of-Gaussian
• Localization
• Filtering
• Orientation assignment
• Descriptor, 128 elements
![Page 77: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/77.jpg)
Object recognition
• Definition: Identify an object and determine its pose and model parameters
• Commercial object recognition
– $4 billion/year industry for inspection and assembly
– Almost entirely based on template matching
• Upcoming applications
– Mobile robots, toys, user interfaces
– Location recognition
– Digital camera panoramas, 3D scene modeling
![Page 78: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/78.jpg)
Invariant local features
• Image content => local feature coordinates invariant to translation, rotation, scale
• Technical details regarding
– Keypoint matching (using 1st and 2nd nearest neighbors)
– Efficient nearest neighbor indexing
– Clustering with Hough transform (model location, orientation, scale)
– Account for affine distortion
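Matching with the 1st and 2nd nearest neighbors can be sketched as a distance ratio test. This is an illustration only: it uses a linear scan rather than an index, the function name is hypothetical, and the default `max_ratio = 0.8` is an assumed value, not one given on the slide. The database must hold at least two descriptors.

```python
def match_keypoint(query, database, max_ratio=0.8):
    """Return the index of the nearest database descriptor, or None.

    The match is accepted only if the nearest distance is clearly smaller
    than the second-nearest one; max_ratio = 0.8 is an assumed default.
    """
    dists = sorted((sum((a - b) ** 2 for a, b in zip(query, d)) ** 0.5, i)
                   for i, d in enumerate(database))
    (d1, best), (d2, _) = dists[0], dists[1]
    return best if d1 < max_ratio * d2 else None


print(match_keypoint((1.0, 0.0), [(0.0, 0.0), (10.0, 10.0)]))
print(match_keypoint((0.05, 0.0), [(0.0, 0.0), (0.1, 0.0)]))
```

A query that sits about equally close to two database descriptors is ambiguous and is rejected, which removes many false matches at the cost of a few correct ones.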
Features
![Page 79: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/79.jpg)
Advantages of invariant local features
• Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
• Distinctiveness: individual features can be matched to a large database of objects
• Quantity: many features can be generated for even small objects
• Efficiency: close to real-time performance
![Page 80: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/80.jpg)
![Page 81: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/81.jpg)
![Page 82: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/82.jpg)
Experimental evaluation
![Page 83: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/83.jpg)
Scale change (factor 2.5)
Harris-Laplace DoG
![Page 84: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/84.jpg)
Viewpoint change (60 degrees)
Harris-Affine (Harris-Laplace)
![Page 85: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/85.jpg)
Descriptors - conclusion
• SIFT + steerable perform best
• Performance of the descriptor independent of the detector
• Errors due to imprecision in region estimation, localization
![Page 86: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/86.jpg)
Image retrieval
… > 5000 images
change in viewing angle
![Page 87: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/87.jpg)
Matches
22 correct matches
![Page 88: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/88.jpg)
Image retrieval
… > 5000 images
change in viewing angle
+ scale change
![Page 89: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/89.jpg)
Matches
33 correct matches
![Page 90: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/90.jpg)
Multiple panoramas from an unordered image set
![Page 91: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/91.jpg)
Image registration and blending
![Page 92: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/92.jpg)
Location recognition
![Page 93: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/93.jpg)
Robot localization
• Joint work with Stephen Se, Jim Little
![Page 94: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/94.jpg)
Recognizing panoramas
• Matthew Brown and David Lowe
• Recognize overlap from an unordered set of images and automatically stitch together
• SIFT features provide initial feature matching
• Image blending at multiple scales hides the seams
Panorama automatically assembled from 143 images
![Page 95: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/95.jpg)
Map continuously built over time
![Page 96: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/96.jpg)
Planar recognition
• Planar surfaces can be reliably recognized at a rotation of 60° away from the camera
• Affine fit approximates perspective projection
• Only 3 points are needed for recognition
![Page 97: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/97.jpg)
3D object recognition
• Extract outlines with background subtraction
![Page 98: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/98.jpg)
3D object recognition
• Only 3 keys are needed for recognition, so extra keys provide robustness
• Affine model is no longer as accurate
![Page 99: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/99.jpg)
Recognition under occlusion
![Page 100: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/100.jpg)
Comparison to template matching
• Costs of template matching
– 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations
– Does not easily handle partial occlusion and other variation without large increase in template numbers
– Viola & Jones cascade must start again for each qualitatively different template
• Costs of local feature approach
– 3000 evaluations (reduction by factor of 10,000)
– Features are more invariant to illumination, 3D rotation, and object variation
– Use of many small subtemplates increases robustness to partial occlusion and other variations
![Page 101: EECS 274 Computer Vision Local Invariant Features](https://reader035.vdocuments.us/reader035/viewer/2022062410/56649e605503460f94b5b9e2/html5/thumbnails/101.jpg)
More local features
• Harris Laplace
• Harris Affine
• Hessian detector
• Hessian Laplace
• Hessian Affine
• MSER (Maximally Stable Extremal Regions)
• SURF (Speeded-Up Robust Features)