Lecture 10 - Interest Point Detectors & RANSAC
TRANSCRIPT
Interest Point Detectors & RANSAC
Instructor - Simon Lucey
16-423 - Designing Computer Vision Apps
Today
• Fundamental Matrix
• Interest Point Detectors
• Matching Regions
• RANSAC
Reminder: Essential Matrix
• Rank 2
• 5 degrees of freedom
• Non-linear constraints between elements
Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
Reminder: Epipolar Lines

Recovering Epipolar Lines
Equation of a line:
ax + by + c = 0
or l^T x = 0, where l = [a, b, c]^T and x = [x, y, 1]^T (homogeneous coordinates)
Recovering Epipolar Lines
Equation of a line: l^T x = 0
Now consider the epipolar constraint x2^T E x1 = 0.
This has the form l^T x2 = 0, where l = E x1.
So the epipolar lines are l2 = E x1 in image 2 and l1 = E^T x2 in image 1.
Decomposition of E
Essential matrix: E = [τ]x Ω (translation cross-product matrix times rotation)
To recover translation and rotation use the matrix:
W = [0 -1 0; 1 0 0; 0 0 1]
We take the SVD E = U L V^T and then we set
[τ]x = U L W U^T,  Ω = U W^{-1} V^T
Four Interpretations
To get the different solutions, we multiply τ by -1 and/or substitute W^T for W.
The Fundamental Matrix
Now consider two cameras that are not normalized (intrinsic matrices K1 and K2).
By a similar procedure to before, we get the relation
x2^T F x1 = 0
where F = K2^{-T} E K1^{-1}
Relation between essential and fundamental: E = K2^T F K1
Estimation of the Fundamental Matrix
When the fundamental matrix is correct, the epipolar line induced by a point in the first image should pass through the matching point in the second image, and vice-versa.
This suggests the criterion: minimize the sum of squared distances from each point to the epipolar line induced by its match,
Σ_i [ dist(x2_i, F x1_i)² + dist(x1_i, F^T x2_i)² ]
If l = [a, b, c]^T and x = [x, y, 1]^T, then dist(x, l) = |ax + by + c| / sqrt(a² + b²).
Unfortunately, there is no closed form solution for this quantity.
The 8 Point Algorithm
Approach:
• solve for the fundamental matrix using homogeneous coordinates
• closed form solution (but to the wrong problem!)
• known as the 8 point algorithm
Start with the fundamental matrix relation
x2^T F x1 = 0
Writing out in full:
[x2 y2 1] [f11 f12 f13; f21 f22 f23; f31 f32 f33] [x1 y1 1]^T = 0
or
x2 x1 f11 + x2 y1 f12 + x2 f13 + y2 x1 f21 + y2 y1 f22 + y2 f23 + x1 f31 + y1 f32 + f33 = 0
The 8 Point Algorithm
Can be written as: a^T f = 0
where
a = [x2 x1, x2 y1, x2, y2 x1, y2 y1, y2, x1, y1, 1]^T and f = [f11, f12, f13, f21, f22, f23, f31, f32, f33]^T
Stacking together constraints from at least 8 pairs of points, we get the system of equations A f = 0.
Minimum direction problem of the form: find the minimum of ||A f||² subject to ||f|| = 1.
To solve, compute the SVD A = U L V^T and then set f to the last column of V.
Fitting Concerns
• This procedure does not ensure that the solution is rank 2. Solution: set the last singular value to zero.
• Can be unreliable because of numerical problems to do with the data scaling – better to re-scale the data first.
• Needs 8 points in general position (cannot all be planar).
• Fails if there is not sufficient translation between the views.
• Use this solution to start non-linear optimisation of the true criterion (must ensure the non-linear constraints are obeyed).
• There is also a 7 point algorithm.
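The full pipeline above (build A, solve the minimum-direction problem via SVD, enforce rank 2, and re-scale the data first) can be sketched in numpy. `eight_point` is an illustrative name, and the Hartley-style normalization constants are conventional choices, not something fixed by the slides; the convention is x2^T F x1 = 0 as above.

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate the fundamental matrix F (x2^T F x1 = 0) from >= 8 matches.

    x1, x2 : (N, 2) arrays of corresponding pixel coordinates.
    """
    def normalize(x):
        # Re-scale the data first (zero mean, sqrt(2) average norm) for stability.
        mean = x.mean(axis=0)
        scale = np.sqrt(2.0) / np.mean(np.linalg.norm(x - mean, axis=1))
        T = np.array([[scale, 0.0, -scale * mean[0]],
                      [0.0, scale, -scale * mean[1]],
                      [0.0, 0.0, 1.0]])
        xh = np.column_stack([x, np.ones(len(x))]) @ T.T
        return xh, T

    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each match gives one row a^T of the system A f = 0.
    A = np.column_stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1))])
    # Minimum-direction problem: f is the last right singular vector of A.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2: set the last singular value to zero.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization.
    F = T2.T @ F @ T1
    return F / F[2, 2]
```

With noise-free synthetic correspondences from two cameras with sufficient translation, the recovered F satisfies the epipolar constraint to numerical precision.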
Today
• Fundamental Matrix
• Interest Point Detectors
• Matching Regions
• RANSAC
[Figure: scatter plots of I(x, y) against I(x + d, y + d) for offsets d = 1, 8, 16 and 50 – the correlation between pixel values falls off as the offset grows. Simoncelli & Olshausen 2001]
Pixel Coherence
• The pixel coherence assumption states that pixels within natural images are heavily correlated with one another within a local neighborhood (e.g., +/- 5 pixels).
• For example,
CS143 Intro to Computer Vision, Sept 2007 © Michael J. Black
Image Filtering
3 4 3
2 ? 5
5 4 2
What assumptions are you making to infer the center value?
“Typically assume that pixels within +/- 3, 5 or 7 pixels are highly correlated.”
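A minimal numerical version of the filtering example above: under the coherence assumption, one reasonable guess for the missing center is simply the average of its eight known neighbours (averaging is one possible choice, not the only one).

```python
import numpy as np

# The 3x3 patch from the slide, with the unknown center marked as NaN.
patch = np.array([[3.0, 4.0, 3.0],
                  [2.0, np.nan, 5.0],
                  [5.0, 4.0, 2.0]])

# Pixel coherence suggests the center resembles its neighborhood,
# so estimate it as the mean of the 8 known neighbors.
center = np.nanmean(patch)
print(center)  # 3.5
```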
Estimating Gradients
• Images are a discretely sampled representation of a continuous signal.
(Black)
Estimating Gradients
• What if I want to know I(x0 + Δx) given that I have only the appearance at x0, i.e. I(x0)?
• Again, simply take the Taylor series approximation:
I(x0 + Δx) ≈ I(x0) + ∇I(x0)^T Δx
(Black)
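The Taylor approximation above can be checked numerically on a sampled 1-D signal, using a central-difference gradient estimate (the sine test signal is purely illustrative).

```python
import numpy as np

# Densely sample a smooth 1-D "image" I(x).
x = np.linspace(0.0, 2.0 * np.pi, 1000)
I = np.sin(x)

# Central-difference estimate of dI/dx at every sample.
dI = np.gradient(I, x)

# First-order Taylor prediction: I(x0 + dx) ~= I(x0) + I'(x0) * dx.
i0 = 300
dx = x[1] - x[0]            # predict one sample ahead
pred = I[i0] + dI[i0] * dx
actual = I[i0 + 1]
err = abs(pred - actual)    # small for small dx
print(err)
```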
Interest Point Detectors
• Interest point detectors, in general, detect corners.
• A dense approach (using all pixels) will be far too slow.
• Matching lines suffers from the “aperture problem” (i.e. has the line moved along itself?).
• Corners are:
  • relatively sparse
  • well-localized
  • good for matching
  • reasonably cheap to compute
  • appear quite robustly
  • useful for geometry.
Interest Point Detectors
• The approach is based on the concept of auto-correlation over a local neighborhood N(x, y):
A(Δx, Δy) = Σ_{(xk, yk) ∈ N(x, y)} ||I(xk, yk) − I(xk + Δx, yk + Δy)||²
Interest Point Detectors
• The same auto-correlation in vector form, with local neighborhood N(x):
A(Δx) = Σ_{xk ∈ N(x)} ||I(xk) − I(xk + Δx)||²
Interest Point Detectors
• Which patches match the auto-correlation responses?
A(Δx) = Σ_{xk ∈ N(x)} ||I(xk) − I(xk + Δx)||²
[Figure: example 25×25 patches alongside their auto-correlation surfaces A(Δx)]
Harris Detector
• How can one characterize the three situations (flat region, edge, corner) from
A(Δx) = Σ_{xk ∈ N(x)} ||I(xk) − I(xk + Δx)||² ?
• Harris & Stephens (1988) proposed to simplify this through a linear approximation,
A(Δx) ≈ Σ_{i ∈ N} ||I(xi) − I(xi) − (∂I(xi)/∂x)^T Δx||² ≈ Δx^T H Δx
• where
H = Σ_{i ∈ N} (∂I(xi)/∂x)(∂I(xi)/∂x)^T
Harris Corner Detector
Make the decision based on the image structure tensor
H = Σ_{i ∈ N} (∂I(xi)/∂x)(∂I(xi)/∂x)^T
Harris Detector
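A compact numpy sketch of a detector built from the structure tensor H above, using the Harris & Stephens score R = det(H) − k·trace(H)². The window size, k = 0.04, and the wrap-around border handling are illustrative choices, not prescribed by the slides.

```python
import numpy as np

def harris_response(im, k=0.04, w=1):
    """Harris corner score R = det(H) - k * trace(H)^2 at every pixel,
    where H is the structure tensor summed over a (2w+1)x(2w+1) window."""
    Iy, Ix = np.gradient(im.astype(float))  # central-difference gradients

    def window_sum(a):
        # Sum over the local neighborhood N via shifted copies.
        # (np.roll wraps at the borders -- acceptable for a sketch.)
        out = np.zeros_like(a)
        for dy in range(-w, w + 1):
            for dx in range(-w, w + 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    Sxx = window_sum(Ix * Ix)
    Syy = window_sum(Iy * Iy)
    Sxy = window_sum(Ix * Iy)
    # R > 0: corner, R < 0: edge, R ~ 0: flat region.
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
```

On a synthetic bright square, the score is positive at the square's corners, negative along its edges, and near zero in flat regions.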
Effects of Scale
• Consider a single row or column of the image – plotting intensity as a function of position gives a signal.
• Where is the edge?
Source: S. Seitz
Search Multiple Scales
• Smooth the signal f by convolving it with a Gaussian g to give f * g.
• To find edges, look for peaks in the derivative d/dx (f * g).
Source: S. Seitz
SIFT Detector
• Filter with difference of Gaussian filters at increasing scales
• Build image stack (scale space)
• Find extrema in this 3D volume
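The three steps above can be sketched in plain numpy; the particular σ values and the contrast threshold are illustrative choices, not Lowe's exact parameters.

```python
import numpy as np

def gaussian_blur(im, sigma):
    """Separable Gaussian blur (1-D kernel along rows, then columns)."""
    r = int(3 * sigma)
    xs = np.arange(-r, r + 1)
    k = np.exp(-xs ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, im)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, out)

def dog_extrema(im, sigmas=(1.0, 1.6, 2.56, 4.1), thresh=0.01):
    """Build a difference-of-Gaussian stack and find extrema in the 3D volume.

    Returns (scale_index, y, x) triples where the DoG value is an extremum
    over its 26 scale-space neighbors and exceeds the contrast threshold.
    """
    stack = np.stack([gaussian_blur(im, s) for s in sigmas])  # scale space
    dog = stack[1:] - stack[:-1]                              # DoG stack
    pts = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                v = dog[s, y, x]
                nb = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if abs(v) > thresh and (v >= nb.max() or v <= nb.min()):
                    pts.append((s, y, x))
    return pts
```

A Gaussian blob centered in a synthetic image produces a DoG extremum at (or very near) the blob center.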
Coarse to Fine

Excerpt from Mikolajczyk & Schmid, “Scale & Affine Invariant Interest Point Detectors”:

scale estimates the characteristic length of the corresponding image structures, in a similar manner as the notion of characteristic length is used in physics. The selected scale is characteristic in the quantitative sense, since it measures the scale at which there is maximum similarity between the feature detection operator and the local image structures. This scale estimate will (for a given image operator) obey perfect scale invariance under rescaling of the image pattern.

Given a point in an image and a scale selection operator we compute the operator responses for a set of scales σn (Fig. 1). The characteristic scale corresponds to the local extremum of the responses. Note that there might be several maxima or minima, that is several characteristic scales corresponding to different local structures centered on this point. The characteristic scale is relatively independent of the image resolution. It is related to the structure and not to the resolution at which the structure is represented. The ratio of the scales at which the extrema are found for corresponding points is the actual scale factor between the point neighborhoods. In Mikolajczyk and Schmid (2001) we compared several differential operators and we noticed that the scale-adapted Harris measure rarely attains maxima over scales in a scale-space representation. If too few interest points are detected, the image content is not reliably represented. Furthermore, the experiments showed that Laplacian-of-Gaussians finds the highest percentage of correct characteristic scales to be found.

|LoG(x, σn)| = σn² |Lxx(x, σn) + Lyy(x, σn)|   (3)

Figure 1. Example of characteristic scales. The top row shows two images taken with different focal lengths. The bottom row shows the response Fnorm(x, σn) over scales, where Fnorm is the normalized LoG (cf. Eq. (3)). The characteristic scales are 10.1 and 3.89 for the left and right image, respectively. The ratio of scales corresponds to the scale factor (2.5) between the two images. The radius of displayed regions in the top row is equal to 3 times the characteristic scale.

When the size of the LoG kernel matches with the size of a blob-like structure, the response attains an extremum. The LoG kernel can therefore be interpreted as a matching filter (Duda and Hart, 1973). The LoG is well adapted to blob detection due to its circular symmetry, but it also provides a good estimation of the characteristic scale for other local structures such as corners, edges, ridges and multi-junctions. Many previous results confirm the usefulness of the Laplacian function for scale selection (Chomat et al., 2000; Lindeberg, 1993, 1998; Lowe, 1999).
2.2. Harris-Laplace Detector
In the following we explain in detail our scale invariant feature detection algorithm. The Harris-Laplace detector uses the scale-adapted Harris function (Eq. (2)) to localize points in scale-space. It then selects the points for which the Laplacian-of-Gaussian, Eq. (3), attains a maximum over scale. We propose two algorithms. The first one is an iterative algorithm which detects simultaneously the location and the scale of characteristic regions. The second one is a simplified algorithm, which is less accurate but more efficient.
SIFT Detector
• Identified corners
• Remove those on edges
• Remove those where contrast is low
Today
• Fundamental Matrix
• Interest Point Detectors
• Matching Regions
• RANSAC
Matching Regions
• Once we have detected local regions, we must then match them.
• SSD (sum of squared differences) between the patch at (x1, y1) in image 1 and the patch at (x2, y2) in image 2:
SSD = Σ_{[Δx, Δy]^T ∈ N} ||I1(x1 + Δx, y1 + Δy) − I2(x2 + Δx, y2 + Δy)||²_2
• Small difference values → similar patches.
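A brute-force sketch of matching under the SSD criterion above. In practice one would restrict the search (e.g. to the epipolar line) or compare descriptors rather than scanning the whole image; the function name and window size are illustrative.

```python
import numpy as np

def best_match(im1, im2, x1, y1, w=3):
    """Find the (x2, y2) in im2 whose (2w+1)x(2w+1) patch minimizes the
    SSD against the patch centered at (x1, y1) in im1."""
    p1 = im1[y1 - w:y1 + w + 1, x1 - w:x1 + w + 1]
    best_ssd, best_xy = np.inf, None
    H, W = im2.shape
    for y2 in range(w, H - w):
        for x2 in range(w, W - w):
            p2 = im2[y2 - w:y2 + w + 1, x2 - w:x2 + w + 1]
            ssd = np.sum((p1 - p2) ** 2)  # sum of squared differences
            if ssd < best_ssd:
                best_ssd, best_xy = ssd, (x2, y2)
    return best_xy
```

On a second image that is just a shifted copy of the first, the minimum-SSD location recovers the shift exactly.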
Problems with SSD
• SSD measures are sensitive to both
  • geometric and
  • photometric variation.
• Common practice is to use descriptors.
[Figure: image pair I1, I2 illustrating these variations]
Planar Affine Patch Assumption
[Figure: the same planar patch seen in “View 1” and “View 2”]
• A number of techniques have been proposed to detect affine covariant regions (Schmid et al. 2004).
Rotation Invariance
• Estimation of the dominant orientation
  – extract gradient orientation
  – histogram over gradient orientation
  – peak in this histogram
• Rotate patch in dominant direction
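The orientation-histogram recipe above can be sketched as follows; 36 bins and magnitude weighting are common choices rather than requirements.

```python
import numpy as np

def dominant_orientation(patch, nbins=36):
    """Histogram gradient orientations (weighted by gradient magnitude)
    and return the center of the peak bin."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)        # gradient magnitude
    ang = np.arctan2(gy, gx)      # gradient orientation in [-pi, pi]
    hist, edges = np.histogram(ang, bins=nbins,
                               range=(-np.pi, np.pi), weights=mag)
    peak = hist.argmax()
    return 0.5 * (edges[peak] + edges[peak + 1])
```

A linear intensity ramp gives a dominant orientation along its gradient direction, up to the histogram's bin width.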
SIFT Descriptor
1. Compute image gradients
2. Pool into local histograms
3. Concatenate histograms
4. Normalize histograms
More on this in future lectures.
Why Pool?
• Convolution (∗) with an “average” filter: “histogram”, “pooling” and “blurring” are closely related operations.
MATLAB Example
• Example in MATLAB,
>> h = fspecial('gaussian', [25, 25], 3);
>> resp = imfilter(im, h);
im ∗ h → resp
Other Descriptors
• Since SIFT, a plethora of other descriptors have been proposed.
• SIFT is sometimes problematic to use in practice as it is protected under existing patents.
• In OpenCV alone there are:
  • SURF (patent protected)
  • BRIEF
  • FREAK
  • ORB
  • BRISK
  • FAST
• See the following link for a tutorial in OpenCV.
Today
• Fundamental Matrix
• Interest Point Detectors
• Matching Regions
• RANSAC
Robust Estimation
• The least squares criterion is not robust to outliers.
• For example, the two outliers here cause the fitted line to be quite wrong.
• One approach to fitting under these circumstances is to use RANSAC – “RANdom SAmple Consensus”.
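A minimal RANSAC sketch for the line-fitting example above; the iteration count, inlier threshold, and final least-squares refit are illustrative choices.

```python
import numpy as np

def ransac_line(pts, n_iters=200, thresh=0.1, rng=None):
    """Fit y = a*x + b robustly: repeatedly sample a minimal set (2 points),
    fit a candidate line, and keep the one with the largest consensus set."""
    rng = rng if rng is not None else np.random.default_rng(0)
    best_inliers = None
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)
        (xa, ya), (xb, yb) = pts[i], pts[j]
        if xa == xb:
            continue  # vertical sample, skip
        a = (yb - ya) / (xb - xa)
        b = ya - a * xa
        resid = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
        inliers = resid < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit by least squares on the final consensus set.
    a, b = np.polyfit(pts[best_inliers, 0], pts[best_inliers, 1], 1)
    return a, b, best_inliers
```

With a few gross outliers added to points near y = 2x + 1, RANSAC recovers the underlying line while plain least squares would be pulled away.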
RANSAC
Fitting a homography with RANSAC
Original images Initial matches Inliers from RANSAC
Things to try in iOS - GPUImage
See - https://github.com/BradLarson/GPUImage for more details.
More to read…
• Prince – Chapter 13, Sections 2–3.
• Corke – Chapter 13, Section 3.
• E. Rublee et al., “ORB: an efficient alternative to SIFT or SURF”, ICCV 2011.