scale invariant feature tranform- siftraul/imageanalysis/sift.pdf · 10/16/2019 feature detection 3...

10/16/2019 Feature Detection 1

Scale Invariant Feature

Tranform- SIFT

Raul Queiroz Feitosa


Outline

Overview

Keypoint detector

Descriptor

Matching

Alternative Approaches


The man behind the method

David G. Lowe, "Distinctive image features from scale-invariant keypoints,"

International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

Example:


The Correspondence Problem



Introduction: difficult to locate an accurate match …

in uniform

neighborhoods close to edges



Promising image entities

Before starting the search for correspondences, it is

convenient to locate entities, (interest points or keypoints)

with good characteristics for an accurate match:

1. Distinctiveness: locally separable

2. Invariance: identifiable under the expected geometric and

radiometric distortions

3. Stability: appears in both images

4. Seldomness: globally separable

Invariance to

Image scale,

Rotation on the plane.

Robustness to

Affine distortion,

Change in 3D viewpoint,

Noise.

Locality: feature are local, so robust to occlusion and clutter

Distinctiveness: individual features can be matched to a large database of objects

Quantity: many features can be generated for even small objects.


Advantages


Basic Steps

1. Scale-space extrema detection

2. Keypoint localization

3. Orientation assignment

4. Keypoint descriptor construction

5. Matching keypoints

Interest-point detection

Description

Matching

Gaussian Image Pyramid

The scale-space representation:

The input image is successively smoothed with a Gaussian

kernel and subsampled, forming an image pyramid.


x

y

σ (scale)


DoG Pyramid

Overview:

σ

σ2

σ2

σ22

σ4


Building the Pyramids

Algorithm:

1. The image size is doubled

This implies in a smoothing corresponding to σ=0.5. Thus, the

first smoothing should use σ=0.5 and from then on σ=1.6.



Algorithm:

2. Build the Gaussian pyramid

first

octave

second

octave

third

octave



Algorithm:

3. Build the DoG pyramid - D(x,y,σ)

first

octave

second

octave

third

octave


Outline

Overview

Keypoint detector

Descriptor

Matching



SIFT Keypoint Detector

Detecting Extrema:

Scale-space extrema detection - algorithm :

4. Extrema in 3D


SIFT Keypoint Detector


Accurate Keypoint Localization

From difference-of-Gaussian local

extrema detection we obtain

approximate values for keypoints.

Originally these approximations were

used directly.

For an improvement in matching and

stability, fitting to a 3D quadratic

function is used.

This is specially important to localize

keyponts detected on higher octaves.

accurate

location rough

location

samples

fitted curve x

f (x)


Filtering out weak keypoints

Keypoints with low contrast are discarded

For image intensities normalized in [0 1], discard a

keypoint if

Keypoints along edges are discarded

How?

03.0 )ˆ( xD


SIFT Examples

original image

The initial 832

keypoints

locations at

extrema of DoG.

Vectors

indicating scale,

orientation, and

location.

After applying

a threshold on

minimum contrast,

729 keypoints

remain.

The final 536

keypoints after

threshold on

ratio of

principal

curvatures.


Outline

Overview

Keypoint detector

Descriptor

Matching



Assigning an Orientation

Take the image L at the scale closest to the keypoint.

Calculate magnitude and orientations of gradient around the key point according to

Compute the orientation histogram with 36 bins. Each sample is weighted by gradient magnitude and

Gaussian circular window with a σ equal to 1.5 times scale of keypoint

22))1,()1,(()),1(),1((),( yxLyxLyxLyxLyxm

))),1(),1(/())1,()1,(((tan),(1

yxLyxLyxLyxLyx



0

10

20

30

. . . 35

0

36

0

freq

uen

cy

. . .

gradients

weights

wei

gh

ts

orientation

bins



Highest peak in orientation histogram is found along with any other peaks within 80% of highest peak → more than one orientation may be assigned to a key point.

A parabola is fit to the 3 closest histogram values to each peak and its maximum is taken → more accurate peak detection.

Thus each keypoint has 4 dimensions:

x location,

y location,

σ scale, and

orientation


SIFT Descriptor

1. Compute the gradient magnitude and orientation at each image

sample point around the keypoint. Magnitudes are weighted by

a Gaussian window, with σ =1/2 width of the descriptor

window.

to achieve orientation invariance,

the coordinates of the descriptor

and the gradient orientations are

rotated relative to the keypoint

orientation.


SIFT Descriptor

2. Form orientation histograms summarizing the contents over 4x4

subregions, with the length of each arrow given by the sum of

the gradient magnitudes near that direction within the

region.


SIFT Descriptor

3. Avoiding boundary effects between histograms

Weight equal to 1 – d , for each of the 3 dimensions where d is

the distance of a sample to the center of a bin

dx=1

dy=

1

dx

dy

dθ


SIFT Descriptor

4. Ensuring invariance to illumination

Vector is normalized to 1; its components are thresholded to no

larger than 0.2 and then the vector is normalized again.


SIFT Descriptor

“This figure shows a 2×2 descriptor array computed from an 8×8

set of samples, whereas the experiments in this paper use 4×4

descriptors computed from a 16×16 sample array.” Lowe


SIFT Descriptor

Keypoint Descriptor provides invariance to

Scale (by using the DoG pyramid)

Illumination (by using the gradients + normalization)

Rotation (by rotating the description relative to the main direction)

3D camera viewpoint (to a certain extent)


Outline

Overview

Keypoint detector

Descriptor

Matching



SIFT- Keypoint Matching

Up to this point we have:

Found rough approximations for features by looking at

the difference-of-Gaussians

Localized the keypoint more accurately

Thresholded poor keypoints

Determined the orientation of a keypoint

Calculated a 128 feature vector for each keypoint

How to find corresponding keypoints on a pair (or

more) of images?



Keypoint Matching

The dissimilarity measure is given by the Euclidean

distance between descriptors.

Independently match all keypoints in all octaves in one

image with all keypoints in all octaves in other image.

Take the closest neighbor.

If the ratio of closest nearest neighbor with second

closest nearest neighbor, is greater than 0.8, discard

them.



Example of Matching


Outline

Overview

Keypoint detector

Descriptor

Matching


SIFT Related References

SIFT LOWE, D.G. Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, v. 60, n. 2, pp. 91-

110, 2004.

SUSAN, S., JAIN, A., SHARMA, A., VERMA, A., JAIN, S., Fuzzy match index for scale-invariant feature transform (SIFT) features

with application to face recognition with wek supervision, IET Image Processing, v. 9, n. 11, pp. 951-958m 2915

SURF BAY, H.; ESS, A.; TUYTELAARS, T.; VAN GOOL, L. Speeded-up robust features (SURF). Journal of Computer Vision and

Image Understanding, v.110, n.3, p. 346-359, 2008.

SANGINETO, E., Pose and Expression Independent Facil Landmark Localization Using Dense-SURF and the Hausdorff Distance, IEEE

Trans. Pattern Analysis and Machine Intelligence, v. 35, n. 3, p. 624-637, 2013.

DAISY TOLA, E.; LEPETIT, V.; FUA, DAISY: An efficient dense descriptor applied to wide-baseline stereo. IEEE Transactions on Pattern

Analysis and Machine Intelligence, v.32, n.5, p. 825-830, 2009.

HoG DALAL, N. and B. TRIGGS. Histograms of Oriented Gradients for Human Detection, IEEE Computer Society Conference on Computer

Vision and Pattern Recognition, Vol. 1 (June 2005), pp. 886–893.

DÉNIZ, O, BUENO, G., SALIDO, J. Dela TORRE, F., Face recognition using Histograms of Oriented Gradients, Pattern Recognition

Letters, v. 32, n.12, (September 2011), pp. 1598-1603.


https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf








http://ieeexplore.ieee.org.ez370.periodicos.capes.gov.br/stamp/stamp.jsp?tp=&arnumber=7302658




































http://ac.els-cdn.com/S1077314207001555/1-s2.0-S1077314207001555-main.pdf?_tid=5dc7b7de-0a33-11e7-98b8-00000aacb362&acdnat=1489660319_046b9bff4b402f5d5add90efa9c8abf7







http://ieeexplore.ieee.org.ez370.periodicos.capes.gov.br/stamp/stamp.jsp?arnumber=6180175












https://infoscience.epfl.ch/record/138785/files/tola_daisy_pami_1.pdf







http://vc.cs.nthu.edu.tw/home/paper/codfiles/hkchiu/201205170946/Histograms of Oriented Gradients for Human Detection.pdf

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1467360











http://www.sciencedirect.com/science/article/pii/S0167865511000122








Exercise with SIFT

Download the SIFTDemo.

Read the README file.

Try the MATLAB function match using as input

arguments any image stereo pair contained in the

package.

Try the MATLAB functions sift and showkeys.


http://www.ele.puc-rio.br/~raul/ImageAnalysis/siftDemoV4.zip


Next Topic

Recognition

scale invariant feature tranform- siftraul/imageanalysis/sift.pdf · 10/16/2019 feature detection 3...

Documents