lecture 11: structure from motion cs6670: computer vision noah snavely
Post on 21-Dec-2015
230 views
TRANSCRIPT
![Page 1: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/1.jpg)
Lecture 11: Structure from motion
CS6670: Computer VisionNoah Snavely
![Page 2: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/2.jpg)
Announcements
• Project 2 out, due next Wednesday, October 14– Artifact due Friday, October 16
• Questions?
![Page 3: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/3.jpg)
Readings
• Szeliski, Chapter 7.2• My thesis, Chapter 3
http://www.cs.cornell.edu/~snavely/publications/thesis/thesis.pdf
![Page 4: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/4.jpg)
Energy minimization via graph cuts
Labels (disparities)
d1
d2
d3
edge weight
![Page 5: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/5.jpg)
Other uses of graph cuts
![Page 6: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/6.jpg)
Agarwala et al., Interactive Digital Photo Montage, SIGGRAPH 2004
Other uses of graph cuts
![Page 7: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/7.jpg)
Panoramic stitching
• Problem: copy every pixel in the output image from one of the input images (without blending)
• Make transitions between two images seamless
![Page 8: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/8.jpg)
Panoramic stitching
• Can be posed as a labeling problem: label each pixel in the output image with one of the input images
Input images:
Output panorama:
![Page 9: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/9.jpg)
Panoramic stitching
• Number of labels: k = number of input images• Objective function (in terms of labeling L)
Input images:
Output panorama:
![Page 10: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/10.jpg)
Panoramic stitching
Infinite cost of labeling a pixel (x,y) with image I if I doesn’t cover (x,y)
Else, cost = 0
![Page 11: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/11.jpg)
Photographing long scenes with multi-viewpoint panoramas
![Page 12: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/12.jpg)
Today: Drift
copy of first image
(xn,yn)
(x1,y1)
– add another copy of first image at the end– this gives a constraint: yn = y1– there are a bunch of ways to solve this problem
• add displacement of (y1 – yn)/(n - 1) to each image after the first
• compute a global warp: y’ = y + ax• run a big optimization problem, incorporating this constraint
– best solution, but more complicated– known as “bundle adjustment”
![Page 13: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/13.jpg)
Global optimization
• Minimize a global energy function:– What are the variables?
• The translation tj = (xj, yj) for each image Ij
– What is the objective function?• We have a set of matched features pi,j = (ui,j, vi,j)
– We’ll call these tracks
• For each point match (pi,j, pi,j+1): pi,j+1 – pi,j = tj+1 – tj
I1 I2 I3 I4
p1,1p1,2 p1,3
p2,2
p2,3 p2,4
p3,3p3,4 p4,4p4,1
track
![Page 14: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/14.jpg)
Global optimization
I1 I2 I3 I4
p1,1p1,2 p1,3
p2,2
p2,3 p2,4
p3,3p3,4 p4,4p4,1
p1,2 – p1,1 = t2 – t1
p1,3 – p1,2 = t3 – t2
p2,3 – p2,2 = t3 – t2
…
v4,1 – v4,4 = y1 – y4
minimizewij = 1 if track i is visible in images j and j+1 0 otherwise
![Page 15: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/15.jpg)
Global optimization
I1 I2 I3 I4
p1,1p1,2 p1,3
p2,2
p2,3 p2,4
p3,3p3,4 p4,4p4,1
A2m x 2n 2n x 1
x2m x 1
b
![Page 16: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/16.jpg)
Global optimization
Defines a least squares problem: minimize• Solution:• Problem: there is no unique solution for ! (det = 0)• We can add a global offset to a solution and get the same error
A2m x 2n 2n x 1
x2m x 1
b
![Page 17: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/17.jpg)
Ambiguity in global location
• Each of these solutions has the same error• Called the gauge ambiguity• Solution: fix the position of one image (e.g., make the origin of the 1st image (0,0))
(0,0)
(-100,-100)
(200,-200)
![Page 18: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/18.jpg)
Solving for camera parameters
Projection equation
1333
100
'0
'0
xxcy
cx
yfs
xfs
s
sv
su
p txR
Recap: a camera is described by several parameters• Translation t of the optical center from the origin of world coords• Rotation R of the image plane• focal length f, principle point (x’c, y’c), pixel size (sx, sy)• blue parameters are called “extrinsics,” red are “intrinsics”
K
![Page 19: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/19.jpg)
Solving for camera rotation
• Instead of spherically warping the images and solving for translation, we can directly solve for the rotation Rj of each camera
• Can handle tilt / twist
![Page 20: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/20.jpg)
Solving for rotations
R1R2
f
I1
I2
p12 = (u12, v12)
p11 = (u11, v11)
(u11, v11, f) = p11
R1p11
R2p22
![Page 21: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/21.jpg)
Solving for rotations
minimize
![Page 22: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/22.jpg)
3D rotations
• How many degrees of freedom are there?• How do we represent a rotation?– Rotation matrix (too many degrees of freedom)– Euler angles (e.g. yaw, pitch, and roll) – bad idea– Quaternions (4-vector on unit sphere)
• Usually involves non-linear optimization
![Page 23: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/23.jpg)
Revisiting stereo
• How do we find the epipolar lines?• Need to calibrate the cameras (estimate
relative position, orientation)
p p'epipolar line
Image I Image J
![Page 24: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/24.jpg)
Solving for rotations and translations
• Structure from motion (SfM)• Unlike with panoramas, we often need to
solve for structure (3D point positions) as well as motion (camera parameters)
![Page 25: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/25.jpg)
p1,1
p1,2p1,3
Image 1
Image 2
Image 3
x1
x4
x3
x2
x5x6
x7
R1,t1
R2,t2
R3,t3
![Page 26: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/26.jpg)
Structure from motion
• Input: images with points in correspondence pi,j = (ui,j,vi,j)
• Output• structure: 3D location xi for each point pi• motion: camera parameters Rj , tj
• Objective function: minimize reprojection error
Reconstruction (side) (top)
![Page 27: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/27.jpg)
p1,1
p1,2p1,3
Image 1
Image 2
Image 3
x1
x4
x3
x2
x5x6
x7
R1,t1
R2,t2
R3,t3
![Page 28: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/28.jpg)
SfM objective function• Given point x and rotation and translation R, t
• Minimize sum of squared reprojection errors:
predicted image location
observedimage location
![Page 29: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/29.jpg)
Solving structure from motion• Minimizing g is difficult– g is non-linear due to rotations, perspective division– lots of parameters: 3 for each 3D point, 6 for each camera– difficult to initialize– gauge ambiguity: error is invariant to a similarity
transform (translation, rotation, uniform scale)
• Many techniques use non-linear least-squares (NLLS) optimization (bundle adjustment)– Levenberg-Marquardt is one common algorithm for NLLS– Lourakis, The Design and Implementation of a Generic
Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm, http://www.ics.forth.gr/~lourakis/sba/
– http://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm
![Page 30: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/30.jpg)
Extensions to SfM• Can also solve for intrinsic parameters (focal
length, radial distortion, etc.)• Can use a more robust function than squared
error, to avoid fitting to outliers
• For more information, see: Triggs, et al, “Bundle Adjustment – A Modern Synthesis”, Vision Algorithms 2000.
![Page 31: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/31.jpg)
Photo Tourism• Structure from motion on Internet photo
collections
![Page 32: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/32.jpg)
Photo Tourism
![Page 33: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/33.jpg)
Scene reconstruction
Feature detectionFeature detection
Pairwisefeature matching
Pairwisefeature matching
Incremental structure
from motion
Incremental structure
from motion
Correspondence estimation
Correspondence estimation
![Page 34: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/34.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 35: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/35.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 36: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/36.jpg)
Feature detectionDetect features using SIFT [Lowe, IJCV 2004]
![Page 37: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/37.jpg)
Feature matchingMatch features between each pair of images
![Page 38: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/38.jpg)
Feature matchingRefine matching using RANSAC [Fischler & Bolles 1987] to estimate fundamental matrices between pairs
![Page 39: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/39.jpg)
Image connectivity graph
(graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)
![Page 40: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/40.jpg)
Incremental structure from motion
![Page 41: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/41.jpg)
Incremental structure from motion
![Page 42: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d555503460f94a33327/html5/thumbnails/42.jpg)
Incremental structure from motion