Robust Computer Vision

Midterm

Fundamental Matrix estimation with the

Levenberg-Marquardt algorithm

Eric Wengrowski

March 31, 2016


Inputs and Expected Outputs

Our goal is to estimate the fundamental matrix using only 2 stereo images with noisy matches. The stereo images are displayed below. The fundamental matrix encodes the epipolar geometry between the two views; once it is known, matched pixels can be mapped to 3D coordinates, up to a projective ambiguity.

left = rgb2gray(imread('midterm_left.bmp'));

right = rgb2gray(imread('midterm_right.bmp'));

Stereo Matching with SURF Features

The assignment called for locally maximum Harris corners as interest points, but for simplicity and convenience SURF points were used instead. SURF points are detected by approximating the Laplacian of Gaussian with box filters. For orientation assignment, SURF uses wavelet responses in the horizontal and vertical directions within a neighbourhood of size 6s, where s is the scale of the keypoint. The dominant orientation is estimated by summing all responses within a sliding orientation window of 60 degrees. For feature description, SURF again uses wavelet responses in the horizontal and vertical directions. A neighbourhood of size 20s x 20s is taken around the keypoint and divided into 4x4 subregions. For each subregion, the horizontal and vertical wavelet responses are summed into a 4-vector, and concatenating the 16 subregion vectors gives a 64-dimensional feature vector. Each subregion contributes a vector of the form:

$$v = \left(\sum d_x,\ \sum d_y,\ \sum |d_x|,\ \sum |d_y|\right)$$
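As a rough illustration of how the 64 dimensions arise, the sketch below assembles a SURF-style descriptor from a 20x20 grid of responses. It assumes 5x5 samples per subregion, and `dx` and `dy` are deterministic placeholders rather than real wavelet responses; this is not the code used by `extractFeatures`.

```matlab
% Placeholder "wavelet responses" on a 20x20 sample grid (assumption:
% real SURF would compute Haar wavelet responses from the image here).
[u, v] = meshgrid(1:20, 1:20);
dx = sin(u / 3);
dy = cos(v / 4);

descriptor = zeros(1, 64);
k = 1;
for r = 0:3                       % 4 rows of subregions
    for c = 0:3                   % 4 columns of subregions
        rows = r * 5 + (1:5);     % 5x5 samples per subregion
        cols = c * 5 + (1:5);
        bx = dx(rows, cols);
        by = dy(rows, cols);
        % v = (sum dx, sum dy, sum |dx|, sum |dy|) for this subregion
        descriptor(k:k+3) = [sum(bx(:)), sum(by(:)), ...
                             sum(abs(bx(:))), sum(abs(by(:)))];
        k = k + 4;
    end
end

% Normalize to unit length so the descriptor is invariant to contrast
descriptor = descriptor / norm(descriptor);
```

The 16 subregions each contribute 4 values, giving 16 x 4 = 64 entries.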

% SURF

% Find SURF features

left_points = detectSURFFeatures(left);

right_points = detectSURFFeatures(right);

% Extract the features

[f_left,vpts_left] = extractFeatures(left,left_points);

[f_right,vpts_right] = extractFeatures(right,right_points);

% Visualize 100 strongest SURF features, including their scales and orientation

% which were determined during the descriptor extraction process.

figure

title('100 Strongest SURF points in each image')

subplot(1,2,1)

imshow(left); hold on;

strongestPoints = left_points.selectStrongest(100);

strongestPoints.plot('showOrientation',true);

hold off

subplot(1,2,2)

imshow(right); hold on;

strongestPoints = right_points.selectStrongest(100);

strongestPoints.plot('showOrientation',true);

hold off

Figure 1: Left and right stereo images of the same scene given as input.


Figure 2: Explanation of the SURF feature vector, taken from OpenCV's documentation: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_surf_intro/py_surf_intro.html

Figure 3: 100 Strongest SURF points detected in each image.

% Retrieve the locations of matched points.

indexPairs = matchFeatures(f_left,f_right) ;

matchedPoints_left = vpts_left(indexPairs(:,1));

matchedPoints_right = vpts_right(indexPairs(:,2));

% Display the matching points.

% The data still includes several outliers.

figure(1), showMatchedFeatures(left,right,matchedPoints_left,matchedPoints_right);

legend('matched points left','matched points right');

% Matched Points in each image

figure, title('Matched SURF points in each image')

subplot(1,2,1)

imshow(left); hold on;

strongestPoints = matchedPoints_left;

strongestPoints.plot('showOrientation',false);

hold off

subplot(1,2,2)

imshow(right); hold on;

strongestPoints = matchedPoints_right;

strongestPoints.plot('showOrientation',false);

hold off

RANSAC

% RANSAC

[~,inliersIndex] = estimateFundamentalMatrix(matchedPoints_left,matchedPoints_right,...
    'Method','RANSAC','NumTrials',10000,'DistanceThreshold',10,...
    'DistanceType','Sampson','Confidence',99);

numInliers = sum(inliersIndex);

% Display the matching points after RANSAC

figure(2), showMatchedFeatures(left,right,matchedPoints_left(inliersIndex),...
    matchedPoints_right(inliersIndex));

legend('matched points left inliers','matched points right inliers');
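The 'Sampson' distance type used in the RANSAC call is a first-order approximation of the geometric error of a correspondence with respect to F. A minimal sketch of that quantity for a single correspondence is given below. The matrix `F` and the points `x` and `xp` are made-up toy values, and this is an illustration of the formula rather than the toolbox's internal implementation.

```matlab
% Toy fundamental matrix for a pure horizontal translation (assumption):
% F = [t]_x with t = (1, 0, 0), so corresponding points share a row.
F  = [0 0 0; 0 0 -1; 0 1 0];

x  = [100; 50; 1];     % homogeneous point in the left image
xp = [ 90; 53; 1];     % candidate match in the right image (3 px row offset)

sq2 = @(w) w(1)^2 + w(2)^2;   % squared norm of the first two components

% Sampson distance: (x'^T F x)^2 / (|Fx|_{1,2}^2 + |F^T x'|_{1,2}^2)
sampsonDist = @(F, x, xp) (xp' * F * x)^2 / (sq2(F * x) + sq2(F' * xp));

d = sampsonDist(F, x, xp);    % 9 / 2 = 4.5 for these toy values
```

A perfect match (here, any `xp` with the same row coordinate as `x`) gives a distance of zero.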

Despite the use of RANSAC, the matched SURF points are still not perfect. Therefore, it would not be suitable to use a subset of matches with the 8-point algorithm for fundamental matrix estimation. Instead, we want to use a robust estimator to find the fundamental matrix. We employ the Levenberg-Marquardt algorithm, which considers all of the points and finds the fundamental matrix F that locally minimizes the reprojection error.

F estimation with Levenberg Marquardt

Levenberg-Marquardt is a heuristic blend of gradient descent and Gauss-Newton iteration. All of these belong to the class of iterative optimization algorithms that search for local extrema. Levenberg-Marquardt works by dynamically changing the damping parameter λ, which controls the effective step size:

1. Do an update.

2. Evaluate the error at the new parameter vector.

3. If the error has increased as a result of the update, then retract the step (i.e. reset the parameters to their previous values) and increase λ by a factor of 10 or some such significant factor. Then go to (1) and try an update again.

4. If the error has decreased as a result of the update, then accept the step (i.e. keep the parameters at their new values) and decrease λ by a factor of 10 or so.

Figure 4: Composite view of the stereo images and point matches before RANSAC. Despite having only 25-30 interest points remaining after matching, there are still significant outliers.

Levenberg-Marquardt is particularly popular in the computer vision community because it is relatively robust and computationally efficient for sparse estimation problems with relatively few data points, such as this stereo matching example.
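The four-step damping schedule described above can be sketched on a toy curve-fitting problem. The example below fits y = a*exp(b*t) to synthetic, noise-free data; the model, data, starting point and initial λ are all assumptions chosen for illustration, and this is not the levmar implementation used for the actual estimate.

```matlab
% Synthetic data generated from a = 2.0, b = -0.7 (assumed ground truth)
t = (0:0.5:4)';
y = 2.0 * exp(-0.7 * t);

residual = @(p) p(1) * exp(p(2) * t) - y;                   % model minus data
jacobian = @(p) [exp(p(2) * t), p(1) * t .* exp(p(2) * t)]; % d(residual)/dp

p      = [1; 0];          % initial guess
lambda = 1e-3;            % initial damping
err    = sum(residual(p).^2);

for iter = 1:100
    r = residual(p);
    J = jacobian(p);
    % Step 1: damped Gauss-Newton update (Levenberg's form)
    step  = -(J' * J + lambda * eye(2)) \ (J' * r);
    p_new = p + step;
    % Step 2: evaluate the error at the new parameter vector
    err_new = sum(residual(p_new).^2);
    if err_new < err
        % Step 4: error decreased, accept the step and reduce the damping
        p      = p_new;
        err    = err_new;
        lambda = lambda / 10;
    else
        % Step 3: error did not decrease, reject the step and raise the damping
        lambda = lambda * 10;
    end
end
```

With a small λ the update behaves like Gauss-Newton; with a large λ it behaves like a short gradient descent step, which is why the rejected-step branch makes the next attempt more cautious.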

Figure 5: The 25-30 matched SURF points are shown on each image individually, but significant outliers still remain. Notice the top-leftmost SURF point in the left image. We see no reasonable match in the right image.

% Levenberg-Marquardt algorithm

% Solved externally using fundest

fileID = fopen(['C:\Users\Eric\Development\matlab_sandbox\' ...
    'Classes\RobustComputerVision\matches.txt'],'w');

% Write only the RANSAC inliers, one correspondence per line

inliers_left = matchedPoints_left(inliersIndex);

inliers_right = matchedPoints_right(inliersIndex);

for i = 1:numInliers

fprintf(fileID, '%e \t %e \t %e \t %e \n', ...
    inliers_left.Location(i,1), inliers_left.Location(i,2), ...
    inliers_right.Location(i,1), inliers_right.Location(i,2));

end

fclose(fileID);

Following the notation of Hartley and Zisserman, the measurement vector stacks all $n$ point correspondences:

$$X = (X_1^T, \ldots, X_n^T)^T$$

and each correspondence collects the points measured in 2D (one in each of the camera images):

$$X_i = (x_i^T, {x'_i}^T)^T$$

We take the covariance of the 2D image noise to be diagonal with $\sigma_x = \sigma_y = 2$ pixels.

The 3D-to-2D camera matrices are denoted $P$ and $P'$ for the left and right cameras. For simplicity's sake, we set

$$P = [I_3 \mid 0]$$

and

$$P' = [M \mid m]$$

so we are only solving for 1 transformation. Because we have more than 8 noisy point matches, the problem is overdetermined and no $P'$ satisfies every match exactly. So we employ the Levenberg-Marquardt algorithm to solve for the $P'$ that minimizes the reprojection errors of each of our 3D point estimates back into the 2D image planes.

Figure 6: After RANSAC, the most significant outliers are pruned out, and we have 15-16 relatively strong matches remaining. But notice that there is still not a perfect consensus for the geometric transform of the image points. In particular, notice the points on the rightmost column in the (blue) right image. The mapping between these points is not totally perfect with respect to the rightmost column in the (red) left image.

The LM algorithm solves for a parameter vector $\mathbf{P}$, where

$$\mathbf{P} = (a^T, b_1^T, \ldots, b_n^T)^T$$

Here $a = p'$ holds the 12 entries of $P'$, and $b_i = (X_i, Y_i, T_i)^T$ is a 3-vector parametrizing the $i$th 3D point $(X_i, Y_i, 1, T_i)^T$. So in total, there are $3n + 12$ parameters (3 for each matching interest point and 12 for the transformation between the 2 cameras).
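To make the parametrization concrete, the sketch below counts the parameters and projects one parametrized 3D point into both cameras. The values of `M`, `m` and `b` are hypothetical, and `n = 17` is simply the match count quoted in the Figure 7 caption.

```matlab
n = 17;                          % number of correspondences (from Figure 7)
numParams = 3 * n + 12;          % 3 per 3D point + 12 entries of P'

% Hypothetical second camera P' = [M | m]; the first camera is P = [I | 0]
M  = eye(3);
m  = [1; 0; 0];
P  = [eye(3), zeros(3, 1)];
Pp = [M, m];

% One 3D point parametrized by b = (X, Y, T), i.e. homogeneous (X, Y, 1, T)
b  = [0.2; -0.1; 0.5];
Xh = [b(1); b(2); 1; b(3)];

% Project into each image and dehomogenize
x  = P  * Xh;   x  = x(1:2)  / x(3);    % left image point
xp = Pp * Xh;   xp = xp(1:2) / xp(3);   % right image point
```

Because $P = [I_3 \mid 0]$, the left projection is just $(X, Y)$, so the first two entries of each $b_i$ can be read as the point's estimated left-image coordinates. The reprojection error that LM minimizes is the distance between `x`, `xp` and the measured points.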

Once the LM algorithm has solved for $P' = [M \mid m]$, we have $M$ and $m$.

Figure 7: Robust fundamental matrix estimates using the Levenberg-Marquardt algorithm and 17 matching interest points. We see here that the computational cost is negligible for relatively few interest points. The Levenberg-Marquardt estimation was performed using the C++ libraries levmar (http://users.ics.forth.gr/~lourakis/levmar/) and fundest (http://users.ics.forth.gr/~lourakis/fundest/).

We can now solve for the fundamental matrix F:

$$F = [m]_\times M$$
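Here $[m]_\times$ is the skew-symmetric matrix that implements the cross product with $m$. As a small sanity check under assumed values, the sketch below builds F from a hypothetical `M` and `m` and verifies the epipolar constraint on a synthetic 3D point; none of these numbers come from the actual estimate.

```matlab
% Skew-symmetric cross-product matrix: skew(v) * w equals cross(v, w)
skew = @(v) [    0  -v(3)   v(2);
              v(3)      0  -v(1);
             -v(2)   v(1)      0];

% Hypothetical second camera P' = [M | m]
M = [0.9 -0.1  5;
     0.1  0.9 -3;
     0    0    1];
m = [10; 2; 1];

F = skew(m) * M;

% Synthetic 3D point (X, Y, 1, T) projected into both cameras
Xh = [0.2; -0.1; 1; 0.5];
x  = [eye(3), zeros(3, 1)] * Xh;   % left image point (homogeneous)
xp = [M, m] * Xh;                  % right image point (homogeneous)

% Epipolar constraint: x'^T F x should vanish up to round-off
epipolarResidual = xp' * F * x;
```

F built this way always has rank 2, because $[m]_\times$ itself has rank 2.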

% LM Estimate of F

F = [3.208634e-07 1.179766e-05 -0.003055175;

-5.946336e-06 -1.996869e-05 0.003977885;

0.001530039 0.00502452 -0.9934454];

The covariance of $P'$ is a 12x12 matrix:

$$\Sigma_{P'} = \left(U - \sum_{i=1}^{n} W_i V_i^{-1} W_i^T\right)^{+}$$

where $^{+}$ denotes the pseudo-inverse. When computing it, we must remember to set the last 5 singular values to 0, because 2 cameras with $n$ point matches have only $3n + 7$ degrees of freedom, so 5 of the $3n + 12$ parameters are redundant. Following the text, we set $\|F\| = 1$. We can then compute the 9x12 Jacobian matrix $J$ of $F$ with respect to $P'$.
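One way to obtain such a Jacobian is by finite differences. The sketch below is only an illustration under stated assumptions: `M` and `m` are hypothetical, the 12 entries of $P'$ are stacked in row-major order, and `SigmaP` is a placeholder covariance rather than the rank-deficient matrix described above. The result is propagated to first order as $\Sigma_F = J \Sigma_{P'} J^T$.

```matlab
skew    = @(v) [0 -v(3) v(2); v(3) 0 -v(1); -v(2) v(1) 0];
unitvec = @(w) w / norm(w);

% Map a 3x4 camera P' = [M | m] to F = [m]_x M
Fof  = @(Pp) skew(Pp(:, 4)) * Pp(:, 1:3);
% Map the 12-vector p (row-major entries of P') to vec(F) with ||F|| = 1
fvec = @(p) unitvec(reshape(Fof(reshape(p, 4, 3)'), 9, 1));

% Hypothetical camera used as the linearization point
M  = [0.9 -0.1 5; 0.1 0.9 -3; 0 0 1];
m  = [10; 2; 1];
p0 = reshape([M, m]', 12, 1);

% Central-difference Jacobian of normalized F with respect to p (9 x 12)
h = 1e-6;
J = zeros(9, 12);
for k = 1:12
    e = zeros(12, 1);
    e(k) = h;
    J(:, k) = (fvec(p0 + e) - fvec(p0 - e)) / (2 * h);
end

% First-order covariance propagation with a placeholder covariance
SigmaP = 1e-4 * eye(12);          % assumption, for illustration only
SigmaF = J * SigmaP * J';         % 9 x 9 covariance of the entries of F
```

Because F is held at unit norm, every column of `J` is orthogonal to vec(F), which gives a quick consistency check on the computation.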
