10/16/2019 Feature Detection 1
Scale Invariant Feature
Transform - SIFT
Raul Queiroz Feitosa
10/16/2019 Feature Detection 2
Outline
Overview
Keypoint detector
Descriptor
Matching
Alternative Approaches
10/16/2019 Feature Detection 3
The man behind the method
David G. Lowe, "Distinctive image features from scale-invariant keypoints,"
International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
Example:
10/16/2019 Feature Detection 4
The Correspondence Problem
10/16/2019 Feature Detection 5
The Correspondence Problem
Introduction: it is difficult to locate an accurate match …
in uniform neighborhoods
close to edges
10/16/2019 Feature Detection 6
The Correspondence Problem
Promising image entities
Before starting the search for correspondences, it is
convenient to locate entities (interest points or keypoints)
with good characteristics for an accurate match:
1. Distinctiveness: locally separable
2. Invariance: identifiable under the expected geometric and
radiometric distortions
3. Stability: appears in both images
4. Seldomness: globally separable
10/16/2019 Feature Detection 7
Advantages
Invariance to
Image scale,
Rotation in the image plane.
Robustness to
Affine distortion,
Change in 3D viewpoint,
Noise.
Locality: features are local, so they are robust to occlusion and clutter.
Distinctiveness: individual features can be matched to a large database of objects.
Quantity: many features can be generated even for small objects.
10/16/2019 Feature Detection 8
Basic Steps
1. Scale-space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor construction
5. Matching keypoints
These steps fall into three stages: interest-point detection, description, and matching.
10/16/2019 Feature Detection 9
Gaussian Image Pyramid
The scale-space representation:
The input image is successively smoothed with a Gaussian
kernel and subsampled, forming an image pyramid.
[Figure: image pyramid with axes x, y, and σ (scale)]
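A minimal sketch of the smooth-and-subsample idea, assuming NumPy is available. The names gaussian_kernel, gaussian_blur, and gaussian_pyramid are illustrative, and the pyramid keeps a single image per octave, which is simpler than the multi-scale octaves used by SIFT.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian smoothing with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(np.asarray(image, dtype=float), r, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gaussian_pyramid(image, num_octaves=3, sigma=1.6):
    """Smooth, store, then keep every second pixel, once per octave."""
    levels = []
    current = np.asarray(image, dtype=float)
    for _ in range(num_octaves):
        current = gaussian_blur(current, sigma)
        levels.append(current)
        current = current[::2, ::2]  # subsample by a factor of 2
    return levels
```

Each level has half the width and height of the previous one, so a 32x32 input yields 32x32, 16x16, and 8x8 images.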
10/16/2019 Feature Detection 10
DoG Pyramid
Overview:
[Figure: DoG pyramid with scale levels σ, √2σ, 2σ, 2√2σ, 4σ]
10/16/2019 Feature Detection 11
Building the Pyramids
Algorithm:
1. The image size is doubled
The original image is assumed to carry a blur of at least σ=0.5, so the
doubled image has σ=1.0 relative to its new pixel spacing. Only a small
additional smoothing is then needed to reach the base scale σ=1.6.
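A short worked example of the smoothing needed after doubling. It assumes, as in Lowe's paper, that the input image already has a blur of σ=0.5, and it relies on the fact that cascaded Gaussian blurs add in variance. The helper name extra_blur is illustrative.

```python
import math

def extra_blur(sigma_target, sigma_current):
    """Sigma of the Gaussian that takes an image from sigma_current to sigma_target."""
    if sigma_target < sigma_current:
        raise ValueError("smoothing cannot reduce the existing blur")
    # Variances of cascaded Gaussian filters add up.
    return math.sqrt(sigma_target ** 2 - sigma_current ** 2)

assumed_input_blur = 0.5                     # assumption: blur already in the input
doubled_blur = 2.0 * assumed_input_blur      # the same blur on the doubled pixel grid
first_sigma = extra_blur(1.6, doubled_blur)  # about 1.249
```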
10/16/2019 Feature Detection 12
Building the Pyramids
Algorithm:
2. Build the Gaussian pyramid
[Figure: Gaussian pyramid with the first, second, and third octaves]
10/16/2019 Feature Detection 13
Building the Pyramids
Algorithm:
3. Build the DoG pyramid - D(x,y,σ)
[Figure: DoG pyramid with the first, second, and third octaves]
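A minimal sketch of the DoG step for one octave, assuming NumPy and assuming the Gaussian images of the octave are already stacked by increasing σ. The function name difference_of_gaussians is illustrative.

```python
import numpy as np

def difference_of_gaussians(gaussian_stack):
    """Subtract each Gaussian image from the next one in scale.

    gaussian_stack has shape (num_scales, height, width), ordered by
    increasing sigma; the result has one level fewer.
    """
    stack = np.asarray(gaussian_stack, dtype=float)
    if stack.ndim != 3 or stack.shape[0] < 2:
        raise ValueError("need at least two same-sized Gaussian images")
    return stack[1:] - stack[:-1]
```

With five Gaussian images per octave this produces four DoG images, which are the levels searched for extrema.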
10/16/2019 Feature Detection 15
SIFT Keypoint Detector
Detecting Extrema:
Scale-space extrema detection algorithm:
4. Find extrema in 3D (over x, y, and σ)
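A sketch of the 3D extremum test, assuming NumPy and a DoG stack indexed as dog[scale, row, col]. In Lowe's formulation a sample is compared with its 8 neighbors at the same scale and the 9 neighbors in each adjacent scale, 26 in total. The function names are illustrative, and the plain loops favor clarity over speed.

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or strictly below all 26 neighbors."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = cube[1, 1, 1]
    neighbors = np.delete(cube.ravel(), 13)  # index 13 is the center of a 3x3x3 cube
    return bool(center > neighbors.max() or center < neighbors.min())

def find_extrema(dog):
    """Return (scale, row, col) for every interior extremum of a DoG stack."""
    dog = np.asarray(dog, dtype=float)
    found = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                if is_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found
```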
10/16/2019 Feature Detection 16
SIFT Keypoint Detector
10/16/2019 Feature Detection 17
Accurate Keypoint Localization
From difference-of-Gaussian local
extrema detection we obtain
approximate values for keypoints.
Originally these approximations were
used directly.
For an improvement in matching and
stability, fitting to a 3D quadratic
function is used.
This is especially important to localize
keypoints detected in higher octaves.
[Figure: samples of f(x) and the fitted curve, marking the rough location and the accurate location along x]
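A sketch of the 3D quadratic fit, assuming NumPy and a DoG stack indexed as dog[scale, row, col]. Following Lowe's paper, the offset of the refined extremum is minus the inverse Hessian times the gradient, with both estimated here by central differences. The function name refine_extremum is illustrative.

```python
import numpy as np

def refine_extremum(dog, s, y, x):
    """Fit a 3D quadratic around dog[s, y, x].

    Returns (offset, value): the offset is ordered (d_scale, d_row, d_col)
    in units of samples, and value is the fitted function at the refined point.
    """
    d = np.asarray(dog, dtype=float)
    c = d[s, y, x]
    # Gradient by central differences.
    g = 0.5 * np.array([
        d[s + 1, y, x] - d[s - 1, y, x],
        d[s, y + 1, x] - d[s, y - 1, x],
        d[s, y, x + 1] - d[s, y, x - 1],
    ])
    # Hessian by central differences.
    dss = d[s + 1, y, x] - 2 * c + d[s - 1, y, x]
    dyy = d[s, y + 1, x] - 2 * c + d[s, y - 1, x]
    dxx = d[s, y, x + 1] - 2 * c + d[s, y, x - 1]
    dsy = 0.25 * (d[s + 1, y + 1, x] - d[s + 1, y - 1, x]
                  - d[s - 1, y + 1, x] + d[s - 1, y - 1, x])
    dsx = 0.25 * (d[s + 1, y, x + 1] - d[s + 1, y, x - 1]
                  - d[s - 1, y, x + 1] + d[s - 1, y, x - 1])
    dyx = 0.25 * (d[s, y + 1, x + 1] - d[s, y + 1, x - 1]
                  - d[s, y - 1, x + 1] + d[s, y - 1, x - 1])
    h = np.array([[dss, dsy, dsx], [dsy, dyy, dyx], [dsx, dyx, dxx]])
    offset = -np.linalg.solve(h, g)
    value = c + 0.5 * g.dot(offset)
    return offset, value
```

The returned value is the quantity later compared against the contrast threshold.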
10/16/2019 Feature Detection 18
Filtering out weak keypoints
Keypoints with low contrast are discarded
For image intensities normalized to [0, 1], discard a
keypoint if $|D(\hat{x})| < 0.03$
Keypoints along edges are discarded
How?
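A sketch of the two filters. The contrast test uses the 0.03 threshold from the slide. The edge test follows the principal-curvature ratio criterion of Lowe's paper, Tr(H)²/Det(H) < (r+1)²/r on the 2x2 spatial Hessian of D; the value r = 10 comes from that paper, not from this slide. The function names are illustrative.

```python
def passes_contrast(value_at_extremum, threshold=0.03):
    """Keep a keypoint only if |D(x_hat)| reaches the threshold (intensities in [0, 1])."""
    return abs(value_at_extremum) >= threshold

def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if the ratio of principal curvatures is below r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign (or a flat direction): reject
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r
```

An edge has one large and one small curvature, which makes the trace-to-determinant ratio large, so the keypoint is rejected.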
10/16/2019 Feature Detection 19
SIFT Examples
original image
The initial 832 keypoint locations at extrema of DoG. Vectors indicate scale, orientation, and location.
After applying a threshold on minimum contrast, 729 keypoints remain.
The final 536 keypoints after a threshold on the ratio of principal curvatures.
10/16/2019 Feature Detection 21
Assigning an Orientation
Take the image L at the scale closest to the keypoint's scale.
Calculate the magnitude and orientation of the gradient around the keypoint according to
$m(x,y) = \sqrt{(L(x+1,y) - L(x-1,y))^2 + (L(x,y+1) - L(x,y-1))^2}$
$\theta(x,y) = \tan^{-1}\big((L(x,y+1) - L(x,y-1)) / (L(x+1,y) - L(x-1,y))\big)$
Compute the orientation histogram with 36 bins. Each sample is weighted by its gradient magnitude and by a
Gaussian circular window with σ equal to 1.5 times the scale of the keypoint.
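A sketch of the gradient computation by pixel differences, assuming NumPy and an image array indexed as L[y, x]. It uses arctan2 rather than a plain arctangent so that the orientation covers the full circle; that choice and the function name are assumptions of this example.

```python
import numpy as np

def gradient_magnitude_orientation(L):
    """Pixel-difference gradients on the interior of L (indexed L[y, x]).

    Returns magnitude and orientation arrays (radians in [0, 2*pi)) that are
    two rows and two columns smaller than L.
    """
    L = np.asarray(L, dtype=float)
    dx = L[1:-1, 2:] - L[1:-1, :-2]   # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]   # L(x, y+1) - L(x, y-1)
    magnitude = np.sqrt(dx ** 2 + dy ** 2)
    orientation = np.mod(np.arctan2(dy, dx), 2.0 * np.pi)
    return magnitude, orientation
```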
10/16/2019 Feature Detection 22
Assigning an Orientation
[Figure: gradients and their weights accumulated into orientation bins; axes: orientation (0 to 360 degrees) and frequency]
10/16/2019 Feature Detection 23
Assigning an Orientation
The highest peak in the orientation histogram is found, along with any other peak within 80% of the highest peak → more than one orientation may be assigned to a keypoint.
A parabola is fit to the 3 histogram values closest to each peak and its maximum is taken → more accurate peak detection.
Thus each keypoint has 4 dimensions:
x location,
y location,
σ scale, and
orientation
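A sketch of the orientation assignment, assuming NumPy. The Gaussian window weights are passed in as an array, and each bin is assumed to be centered at (i + 0.5) times the bin width. Both function names are illustrative.

```python
import numpy as np

def orientation_histogram(magnitude, orientation, weights, num_bins=36):
    """Accumulate weighted gradient magnitudes into orientation bins (radians in)."""
    frac = np.mod(orientation, 2 * np.pi) / (2 * np.pi)
    bins = np.floor(frac * num_bins).astype(int) % num_bins
    votes = np.asarray(magnitude, dtype=float) * np.asarray(weights, dtype=float)
    return np.bincount(bins.ravel(), weights=votes.ravel(), minlength=num_bins)

def dominant_orientations(hist, peak_ratio=0.8):
    """Angles in degrees of local peaks within peak_ratio of the highest peak."""
    hist = np.asarray(hist, dtype=float)
    n = len(hist)
    width = 360.0 / n
    angles = []
    for i in range(n):
        left, center, right = hist[i - 1], hist[i], hist[(i + 1) % n]
        if not (center > left and center > right):
            continue  # not a local peak (the histogram is circular)
        if center < peak_ratio * hist.max():
            continue
        # Vertex of the parabola through the three values, as a bin offset.
        shift = 0.5 * (left - right) / (left - 2.0 * center + right)
        angles.append(((i + 0.5 + shift) * width) % 360.0)
    return angles
```

A keypoint with two returned angles is duplicated, giving one keypoint per orientation at the same location and scale.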
10/16/2019 Feature Detection 24
SIFT Descriptor
1. Compute the gradient magnitude and orientation at each image
sample point around the keypoint. Magnitudes are weighted by
a Gaussian window with σ equal to one half the width of the
descriptor window.
To achieve orientation invariance,
the coordinates of the descriptor
and the gradient orientations are
rotated relative to the keypoint
orientation.
10/16/2019 Feature Detection 25
SIFT Descriptor
2. Form orientation histograms summarizing the contents over 4x4
subregions, with the length of each arrow given by the sum of
the gradient magnitudes near that direction within the
region.
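A simplified sketch of the histogram step, assuming NumPy and a 16x16 patch whose gradients are already weighted and rotated relative to the keypoint orientation. Each sample votes for a single bin, so the interpolation between neighboring bins is left out. The 8 orientation bins per subregion follow Lowe's paper, and the function name is illustrative.

```python
import numpy as np

def descriptor_histograms(magnitude, orientation, grid=4, num_bins=8):
    """Sum gradient magnitudes into a grid x grid array of orientation histograms.

    magnitude and orientation are square patches (for example 16x16);
    orientation is in radians. Returns a vector of grid * grid * num_bins values.
    """
    magnitude = np.asarray(magnitude, dtype=float)
    orientation = np.asarray(orientation, dtype=float)
    cell = magnitude.shape[0] // grid
    frac = np.mod(orientation, 2 * np.pi) / (2 * np.pi)
    bins = np.floor(frac * num_bins).astype(int) % num_bins
    hist = np.zeros((grid, grid, num_bins))
    for y in range(grid * cell):
        for x in range(grid * cell):
            hist[y // cell, x // cell, bins[y, x]] += magnitude[y, x]
    return hist.ravel()
```

With a 4x4 grid and 8 bins the result has 4 x 4 x 8 = 128 elements.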
10/16/2019 Feature Detection 26
SIFT Descriptor
3. Avoiding boundary effects between histograms
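Each sample's contribution to a bin is weighted by 1 – d in each of the 3
dimensions (x, y, θ), where d is the distance of the sample to the center
of the bin, measured in units of the bin spacing.
[Figure: distances dx, dy, and dθ of a sample to a bin center; the bin spacing is 1 in x and y]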
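A small sketch of the weighting, assuming the three per-dimension weights are multiplied together (trilinear interpolation, as described in Lowe's paper) and that distances are expressed in bin spacings. The function name trilinear_weight is illustrative.

```python
def trilinear_weight(dx, dy, dtheta):
    """Product of (1 - d) over x, y, and orientation; each d is in bin spacings."""
    for d in (dx, dy, dtheta):
        if not 0.0 <= d <= 1.0:
            raise ValueError("distances must lie in [0, 1] bin spacings")
    return (1.0 - dx) * (1.0 - dy) * (1.0 - dtheta)

# A sample at distances (0.25, 0.5, 0.1) contributes 0.75 * 0.5 * 0.9 = 0.3375
# of its weighted gradient magnitude to that bin.
example = trilinear_weight(0.25, 0.5, 0.1)
```

A sample exactly at a bin center gets weight 1, and the weight falls to 0 one bin spacing away, so histograms change smoothly as a sample moves between bins.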
10/16/2019 Feature Detection 27
SIFT Descriptor
4. Ensuring invariance to illumination
The vector is normalized to unit length; its components are thresholded to be no
larger than 0.2, and then the vector is normalized again.
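A sketch of the normalization step, assuming NumPy and a descriptor with non-negative entries. The function name normalize_descriptor is illustrative.

```python
import numpy as np

def normalize_descriptor(vector, clip=0.2):
    """Normalize to unit length, clip large components, then normalize again."""
    v = np.asarray(vector, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:
        return v  # an all-zero descriptor is returned unchanged
    v = np.minimum(v / norm, clip)
    return v / np.linalg.norm(v)
```

The first normalization cancels a uniform change in contrast, and the clipping reduces the influence of a few very large gradient magnitudes.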
10/16/2019 Feature Detection 28
SIFT Descriptor
“This figure shows a 2×2 descriptor array computed from an 8×8
set of samples, whereas the experiments in this paper use 4×4
descriptors computed from a 16×16 sample array.” Lowe
10/16/2019 Feature Detection 29
SIFT Descriptor
Keypoint Descriptor provides invariance to
Scale (by using the DoG pyramid)
Illumination (by using the gradients + normalization)
Rotation (by rotating the descriptor relative to the keypoint orientation)
3D camera viewpoint (to a certain extent)
10/16/2019 Feature Detection 31
SIFT- Keypoint Matching
Up to this point we have:
Found rough approximations for features by looking at
the difference-of-Gaussians
Localized the keypoint more accurately
Thresholded poor keypoints
Determined the orientation of a keypoint
Calculated a 128-element feature vector for each keypoint
How to find corresponding keypoints on a pair (or
more) of images?
10/16/2019 Feature Detection 32
SIFT- Keypoint Matching
Keypoint Matching
The dissimilarity measure is given by the Euclidean
distance between descriptors.
Independently match all keypoints in all octaves in one
image with all keypoints in all octaves in other image.
Take the closest neighbor.
If the ratio of the distance to the closest neighbor to the
distance to the second-closest neighbor is greater than 0.8,
discard the match.
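A sketch of the matching rule by brute-force search, assuming NumPy and descriptors stored one per row. The function name match_descriptors is illustrative; for large databases, Lowe's paper uses an approximate nearest-neighbor search instead.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_ratio=0.8):
    """Return (index_a, index_b) pairs that pass the nearest-neighbor ratio test."""
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    if desc_b.shape[0] < 2:
        raise ValueError("the ratio test needs at least two candidates")
    matches = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)  # Euclidean distances
        nearest, second = np.argsort(dist)[:2]
        # Keep the match only if closest / second-closest <= max_ratio.
        if dist[nearest] <= max_ratio * dist[second]:
            matches.append((i, int(nearest)))
    return matches
```

A descriptor with two almost equally close candidates is ambiguous, so it is dropped rather than matched.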
10/16/2019 Feature Detection 33
SIFT- Keypoint Matching
Example of Matching
10/16/2019 Feature Detection 34
Outline
Overview
Keypoint detector
Descriptor
Matching
Alternative Approaches
SIFT Related References
SIFT
LOWE, D.G. Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, v. 60, n. 2, pp. 91-110, 2004.
SUSAN, S.; JAIN, A.; SHARMA, A.; VERMA, A.; JAIN, S. Fuzzy match index for scale-invariant feature transform (SIFT) features with application to face recognition with weak supervision, IET Image Processing, v. 9, n. 11, pp. 951-958, 2015.
SURF
BAY, H.; ESS, A.; TUYTELAARS, T.; VAN GOOL, L. Speeded-up robust features (SURF), Journal of Computer Vision and Image Understanding, v. 110, n. 3, pp. 346-359, 2008.
SANGINETO, E. Pose and Expression Independent Facial Landmark Localization Using Dense-SURF and the Hausdorff Distance, IEEE Trans. Pattern Analysis and Machine Intelligence, v. 35, n. 3, pp. 624-637, 2013.
DAISY
TOLA, E.; LEPETIT, V.; FUA, P. DAISY: An efficient dense descriptor applied to wide-baseline stereo, IEEE Transactions on Pattern Analysis and Machine Intelligence, v. 32, n. 5, pp. 825-830, 2009.
HoG
DALAL, N.; TRIGGS, B. Histograms of Oriented Gradients for Human Detection, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, v. 1, pp. 886-893, June 2005.
DÉNIZ, O.; BUENO, G.; SALIDO, J.; DE LA TORRE, F. Face recognition using Histograms of Oriented Gradients, Pattern Recognition Letters, v. 32, n. 12, pp. 1598-1603, September 2011.
10/16/2019 Feature Detection 35
Exercise with SIFT
Download the SIFTDemo.
Read the README file.
Try the MATLAB function match using as input
arguments any stereo image pair contained in the
package.
Try the MATLAB functions sift and showkeys.
10/16/2019 Feature Detection 37
Next Topic
Recognition