Stereo Vision
John Morris
These slides were adapted from a set of lectures written by Mircea Nicolescu, University of Nevada at Reno
[Cover image: Iolanthe in the Bay of Islands]


Page 1:

John Morris
These slides were adapted from a set of lectures written by Mircea Nicolescu, University of Nevada at Reno

Stereo Vision

Iolanthe in the Bay of Islands

Page 2:

Stereo Vision
• Goal
  − Recovery of 3D scene structure
  − Using two or more images, each acquired from a different viewpoint in space
  − Using multiple cameras or one moving camera
  − The term binocular vision is used when two cameras are employed
  − More than 2 cameras can be used – acquisition of complete 3D models
• Stereophotogrammetry
  − Using stereo vision systems to measure properties (dimensions here) of a scene

Page 3:

Stereo Vision – Terminology
− Fixation point
  − Point of intersection of the optical axes of the two cameras
− Baseline
  − Distance between the camera optical centres
− Epipolar plane
  − Plane passing through the optical centres and a point in the scene
− Epipolar line
  − Intersection of the epipolar plane with the image plane
− Conjugate pair or Corresponding points
  − A point in the scene visible to both cameras (binocularly visible) will be projected to a point in each image
− Disparity
  − Distance between corresponding points when the two images are superimposed
− Disparity map
  − Disparities of all points form the disparity map
  − Usual output from a stereo matching algorithm
  − Often displayed as an image

Page 4:

Stereo Vision
• Camera configuration
• Parallel optical axes
• Parallel image planes

Note: Virtual image planes (in front of the optical centre)

Page 5:

Stereo Vision – Verging axes
• Camera configuration
• Verging optical axes

Page 6:

Triangulation
Principle underlying stereo vision
• Any visible point in the scene must lie on the line that passes through
  − the optical centre (centre of projection) and
  − the projection of the point on the image plane
• We can backproject this line into the scene
• With two cameras, we have two such lines
• The intersection of these two lines is the (3D) location of the point
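The backprojection-and-intersection principle above can be sketched in a few lines of NumPy. This is a minimal sketch, not from the slides: the camera centres, ray directions, and the least-squares closest-approach formulation are illustrative. With noisy correspondences the two backprojected rays rarely meet exactly, so a common choice is the midpoint of their closest approach:

```python
import numpy as np

def triangulate(c1, d1, c2, d2):
    """Return the midpoint of the closest approach of two rays.

    c1, c2: camera optical centres; d1, d2: directions of the
    back-projected rays.  With noisy matches the rays rarely
    intersect exactly, so the midpoint of the shortest connecting
    segment is used as the 3D location of the point.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # Solve for the ray parameters s, t minimising |(c1+s*d1)-(c2+t*d2)|
    A = np.stack([d1, -d2], axis=1)               # 3x2 system
    (s, t), *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    return 0.5 * ((c1 + s * d1) + (c2 + t * d2))

# Two cameras on the x-axis; both back-projected rays pass through
# the scene point (0, 0, 10), so triangulation recovers it exactly
P = triangulate(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 10.0]))
```

Here both rays intersect exactly, so the midpoint is the intersection itself; with real matches the residual distance between the rays is a useful quality measure for the correspondence.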

Page 7:

Stereo Vision
• Two problems
  − Correspondence problem
  − Reconstruction problem
• Correspondence problem
  − Finding conjugate pairs of corresponding or matched points in each image
  − These points are projections of the same scene point
  − Triangulation depends on these conjugate pairs

Page 8:

Stereo Vision
• Correspondence problem
  − Ambiguous correspondence between points in the two images may lead to several different consistent interpretations of the scene
  − The problem is fundamentally ill-posed

[Figure: each image has 3 scene points, representing some features in the scene; if you can’t solve the correspondence problem, then all of these points could be scene points!]

Page 9:

Reconstruction
− Having found the corresponding points, we can compute the disparity map
− Disparity maps are commonly expressed in pixels, i.e. the number of pixels between corresponding points in the two images
− The disparity map can be converted to a 3D map of the scene if the geometry of the imaging system is known
− Critical parameters: baseline, camera focal length, pixel size

Page 10:

Reconstruction
• Determining depth
  − To recover the position of P from its projections, pl and pr:
  − In general, the two cameras are related by a rotation, R, and a translation, T:

      Pr = R(Pl − T)

  − With parallel camera optical axes, Zr = Zl = Z and Xr = Xl − T, so we have:

      Z = fT / d

where d = xl − xr is the disparity – the difference in position between the corresponding points in the two images, commonly measured in pixels

Page 11:

Reconstruction
• Recovering depth

    Z = fT / d,  where T is the baseline

If d′ is measured in pixels, then d = xl − xr = d′p, where p is the width of a pixel in the image plane, and we have

    Z = Tf / (d′p)

Note the reciprocal relationship between disparity and depth! This is particularly relevant when considering the accuracy of stereo photogrammetry.
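A minimal numeric sketch of Z = Tf/(d′p); the baseline, focal length, and pixel size below are illustrative values, not taken from the slides:

```python
def depth_from_disparity(d_pixels, baseline_m=0.12, focal_m=0.008, pixel_m=10e-6):
    """Z = T*f / (d'*p): all lengths in metres, disparity d' in pixels.

    Illustrative camera: 12 cm baseline, 8 mm focal length, 10 um pixels.
    """
    if d_pixels <= 0:
        raise ValueError("zero disparity corresponds to a point at infinity")
    return baseline_m * focal_m / (d_pixels * pixel_m)

# The reciprocal relationship between disparity and depth:
z_near = depth_from_disparity(48)   # larger disparity -> closer point
z_far = depth_from_disparity(24)    # halving the disparity doubles the depth
```

The reciprocal form means depth resolution degrades quadratically with distance: a one-pixel disparity error matters far more for distant points than for near ones.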

Page 12:

Stereo Vision
• Configuration parameters
  − Intrinsic parameters
    − Characterize the transformation from image plane coordinates to pixel coordinates in each camera
    − Parameters intrinsic to each camera
  − Extrinsic parameters (R, T)
    − Describe the relative position and orientation of the two cameras
    − Can be determined from the extrinsic parameters (Rl, Tl) and (Rr, Tr) of each camera:

        R = Rr Rlᵀ,  T = Tl − Rᵀ Tr

Page 13:

Correspondence Problem
• Why is the correspondence problem difficult?
  − Some points in each image will have no corresponding points in the other image
    − They are not binocularly visible, or
    − They are only monocularly visible (these two are equivalent!)
  − Cameras have different fields of view
  − Occlusions may be present
  − A stereo system must be able to determine parts that should not be matched

Page 14:

The Correspondence Problem
• Methods for establishing correspondences
  − Two issues
    − How to select candidate matches?
    − How to determine the goodness of a match?
  − Two main classes of correspondence (matching) algorithms:
    − Correlation-based
      − Attempt to establish a correspondence by matching image intensities – usually over a window of pixels in each image
      − Dense disparity maps: a distance is found for all binocularly visible (BV) image points, except occluded (monocularly visible, MV) points
    − Feature-based
      − Attempt to establish a correspondence by matching sparse sets of image features – usually edges
      − Disparity map is sparse: the number of points is related to the number of image features identified

Page 15:

Correlation-Based Methods
• Match image sub-windows in the two images using image correlation
  − The oldest technique for finding correspondence between image pixels
• Scene points must have the same intensity in each image
  − Assumes:
    a) All objects are perfect Lambertian scatterers, i.e. the reflected intensity does not depend on angle – objects scatter light uniformly in all directions
       • Informally – matte surfaces only
    b) Fronto-planar surfaces
       − (Visible) surfaces of all objects are perpendicular to the camera optical axes

Page 16:

Correlation-Based Methods

Page 17:

Correlation-Based Methods
• Usually, we normalize c(d) by dividing it by the standard deviations of both Il and Ir (normalized cross-correlation, c(d) ∈ [0, 1]):

    c(d) = Σk,l (Il(i+k, j+l) − Īl)(Ir(i+k−d, j+l) − Īr)
           / √[ Σk,l (Il(i+k, j+l) − Īl)² · Σk,l (Ir(i+k−d, j+l) − Īr)² ]

where Īl and Īr are the average pixel values in the left and right windows.

• An alternative similarity measure is the sum of squared differences (SSD):

    c(d) = Σk,l [ Il(i+k, j+l) − Ir(i+k−d, j+l) ]²

• In fact, experiment shows that the simpler sum of absolute differences (SAD) is just as good:

    c(d) = Σk,l | Il(i+k, j+l) − Ir(i+k−d, j+l) |
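The SAD cost can be sketched as a window search. A minimal sketch: the synthetic image pair, window size, and search range below are illustrative, and the disparity shift is applied along the horizontal (column) axis of the arrays:

```python
import numpy as np

def sad(left, right, i, j, d, w=2):
    """c(d): sum of absolute differences between the (2w+1)x(2w+1)
    window at (i, j) in the left image and the window displaced d
    pixels along the horizontal (column) axis in the right image."""
    L = left[i - w:i + w + 1, j - w:j + w + 1].astype(np.int64)
    R = right[i - w:i + w + 1, j - w - d:j + w + 1 - d].astype(np.int64)
    return int(np.abs(L - R).sum())

def best_disparity(left, right, i, j, d_max, w=2):
    """Pick the disparity in [0, d_max] with the smallest SAD cost."""
    return min(range(d_max + 1), key=lambda d: sad(left, right, i, j, d, w))

# Synthetic pair: the right image is the left image shifted by a
# known disparity of 3 pixels (xr = xl - 3)
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(20, 30))
right = np.roll(left, -3, axis=1)
d = best_disparity(left, right, 10, 15, d_max=6)   # recovers d = 3
```

The random texture makes the minimum unambiguous; on a uniform region every candidate window would score near zero, which is exactly the false-match problem discussed on the following slides.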

Page 18:

Correlation-Based Methods
• Improvements
  − Instead of using the image intensity values, the accuracy of correlation is improved by using thresholded, signed gradient magnitudes at each pixel:
    − Compute the gradient magnitude at each pixel in the two images without smoothing
    − Map the gradient magnitude values into three values: −1, 0, 1 (by thresholding the gradient magnitude)
    − More sensitive correlations are produced this way
  − … plus several dozen more – see Scharstein & Szeliski, 2001 for a review
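The two steps above might be sketched as follows. Assumptions for illustration: central-difference gradients, the threshold value, and using the sign of the horizontal gradient to produce the ±1 labels (the slide does not spell out how the sign is assigned):

```python
import numpy as np

def thresholded_gradient(img, t=20.0):
    """Map each pixel to -1, 0 or +1 from its (unsmoothed) gradient.

    Central-difference gradients, no smoothing; pixels whose gradient
    magnitude exceeds t are labelled with the sign of the horizontal
    gradient, all others 0.  Threshold and sign convention are
    illustrative choices.
    """
    img = img.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0   # d/dx
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0   # d/dy
    mag = np.hypot(gx, gy)
    out = np.zeros(img.shape, dtype=np.int8)
    strong = mag > t
    out[strong] = np.sign(gx[strong]).astype(np.int8)
    return out

# A vertical step edge produces a strong horizontal gradient at the
# boundary and zeros elsewhere
step = np.zeros((5, 10))
step[:, 5:] = 100.0
labels = thresholded_gradient(step, t=20.0)
```

Correlating these ternary maps instead of raw intensities makes matching insensitive to overall brightness differences between the two cameras.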

Page 19:

Correlation-Based Methods
• Comments
  − Correlation-based methods depend on the image window in one image having a distinctive structure that occurs infrequently in the search region of the other image
    − i.e. in one image, we can find unique features in each window that match only one window in the other
  − How to choose the size of the window, W?
    − too small:
      − may not capture enough image structure, and
      − may be too noise-sensitive → many false matches
    − too large:
      − makes matching less sensitive to noise (desired), but
      − decreases precision (blurs the disparity map)
  − An adaptive search window has been proposed

Page 20:

Correlation-Based Methods

Page 21:

Correlation-Based Methods

Page 22:

Correlation-Based Methods
• Comments
  − How to choose the size and location of the search region, R(pl)?
    − If the distance of the fixation point from the cameras is much larger than the baseline, the location of R(pl) can be chosen to be the same as the location of pl
    − The size (extent) of R(pl) can be estimated from the maximum range of distances we expect to find in the scene
    − We will see that the search region can always be reduced to a line
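Estimating the extent of the search region from the expected depth range can be sketched by inverting the depth formula Z = Tf/(d′p) to d′ = Tf/(Zp). All numeric values below (baseline, focal length, pixel size, depth range) are illustrative:

```python
def disparity_search_range(z_min_m, z_max_m, baseline_m=0.12,
                           focal_m=0.008, pixel_m=10e-6):
    """Return (d'_min, d'_max) in pixels for scene depths in
    [z_min_m, z_max_m], by inverting Z = T*f/(d'*p).

    Far points give small disparities, near points large ones,
    so the bounds come from z_max and z_min respectively.
    """
    d_at = lambda z: baseline_m * focal_m / (z * pixel_m)
    return d_at(z_max_m), d_at(z_min_m)

# Scene known to lie between 2 m and 8 m from the cameras:
d_lo, d_hi = disparity_search_range(2.0, 8.0)   # search 12..48 pixels
```

Restricting the search to this disparity interval (along the epipolar line) both speeds up matching and removes many potential false matches.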

Page 23:

Feature-Based Methods
• Main idea
  − Look for a feature in one image that matches a feature in the other
  − Typical features used are:
    − edge points
    − line segments
    − corners (junctions)

Page 24:

Feature-Based Methods
• A set of features is used for matching
  − A line feature descriptor, for example, could contain:
    − length, l
    − orientation, θ
    − coordinates of the midpoint, m
    − average intensity along the line, i
• Similarity measures are based on matching feature descriptors:

    S = 1 / [ w0(ll − lr)² + w1(θl − θr)² + w2(ml − mr)² + w3(il − ir)² ]

where w0, ..., w3 are weights (determining the weights that yield the best matches is a nontrivial task).
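Matching line descriptors with a weighted similarity score can be sketched as follows. A minimal sketch: the reciprocal weighted sum-of-squared-differences form of the score, the weights, and all sample descriptor values are illustrative assumptions:

```python
import math

def line_similarity(da, db, w=(1.0, 1.0, 1.0, 1.0)):
    """Similarity between two line-feature descriptors.

    A descriptor is (length, orientation, midpoint, intensity); the
    score is the reciprocal of the weighted sum of squared descriptor
    differences, so identical descriptors give the largest score.
    """
    (l1, th1, (x1, y1), i1) = da
    (l2, th2, (x2, y2), i2) = db
    s = (w[0] * (l1 - l2) ** 2 + w[1] * (th1 - th2) ** 2
         + w[2] * ((x1 - x2) ** 2 + (y1 - y2) ** 2)
         + w[3] * (i1 - i2) ** 2)
    return math.inf if s == 0.0 else 1.0 / s

# Pick the best-matching candidate for a line found in the left image
left_line = (40.0, 0.50, (100.0, 80.0), 120.0)
cand_a = (41.0, 0.52, (103.0, 80.0), 118.0)   # nearly the same line
cand_b = (15.0, 1.60, (200.0, 30.0), 60.0)    # very different line
best = max((cand_a, cand_b), key=lambda d: line_similarity(left_line, d))
```

In practice the weights would be tuned (or learned) per scene, which is the nontrivial task the slide mentions.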

Page 25:

Feature-Based Methods

Page 26:

Correlation vs. feature-based approaches
• Correlation methods
  − Easier to implement
  − Provide a dense disparity map (useful for reconstructing surfaces)
  − Need textured images to work well (many false matches otherwise)
  − Don’t work well when viewpoints are very different, due to
    − change in illumination direction – violates the Lambertian scattering assumption
    − foreshortening – a perspective problem: surfaces are not fronto-planar
• Feature-based methods
  − Suitable when good features can be extracted from the scene
  − Faster than correlation-based methods
  − Provide sparse disparity maps
    − OK for applications like visual navigation
  − Relatively insensitive to illumination changes

Page 27:

Other correspondence algorithms

• Dynamic programming (Gimel’Farb)
  − Finds a ‘path’ through an image which provides the best (least-cost) match
  − Can allow for occlusions (Birchfield and Tomasi)
  − Generally provides better results than area-based correlation
  − Faster than correlation
• Graph Cut (Zabih et al.)
  − Seems to provide the best results
  − Very slow; not suitable for real-time applications
• Concurrent Stereo Matching
  − Examines all possible matches in parallel (Delmas, Gimel’Farb, Morris, work in progress)
  − Uses a model of image noise instead of arbitrary weights in cost functions
  − Suitable for real-time parallel hardware implementation

Some of these will be considered in detail later