![Page 1: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/1.jpg)
Lecture 22: Structure from motion
CS6670: Computer Vision
Noah Snavely
![Page 2: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/2.jpg)
Readings
• Szeliski, Chapter 7.1 – 7.4
![Page 3: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/3.jpg)
Announcements
• Project 2b due on Tuesday by 10:59pm
• Final project proposals due today at 11:59pm
• Course survey
![Page 4: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/4.jpg)
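Alpha Blending
Encoding blend weights: $I(x,y) = (\alpha R, \alpha G, \alpha B, \alpha)$
color at $p = \dfrac{(\alpha_1 R_1, \alpha_1 G_1, \alpha_1 B_1) + (\alpha_2 R_2, \alpha_2 G_2, \alpha_2 B_2) + (\alpha_3 R_3, \alpha_3 G_3, \alpha_3 B_3)}{\alpha_1 + \alpha_2 + \alpha_3}$
Implement this in two steps:
1. accumulate: add up the ($\alpha$-premultiplied) RGB$\alpha$ values at each pixel
2. normalize: divide each pixel’s accumulated RGB by its $\alpha$ value
Q: what if $\alpha = 0$?
Optional: see Blinn (CGA, 1994) for details: http://ieeexplore.ieee.org/iel1/38/7531/00310740.pdf?isNumber=7531&prod=JNL&arnumber=310740&arSt=83&ared=87&arAuthor=Blinn%2C+J.F.
[Figure: overlapping images I1, I2, I3 with a pixel p in the overlap]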
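The two steps can be sketched in plain Python. This is only an illustration: the nested-list image layout and the function name `blend` are assumptions, and returning black where the accumulated alpha is zero is just one possible answer to the question on the slide.

```python
def blend(images):
    """Blend same-sized images stored as rows of premultiplied (aR, aG, aB, a) tuples."""
    height, width = len(images[0]), len(images[0][0])

    # Step 1 (accumulate): sum all four premultiplied channels per pixel.
    acc = [[[0.0, 0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for img in images:
        for y in range(height):
            for x in range(width):
                for ch in range(4):
                    acc[y][x][ch] += img[y][x][ch]

    # Step 2 (normalize): divide accumulated RGB by accumulated alpha.
    out = []
    for row in acc:
        out_row = []
        for r, g, b, a in row:
            if a > 0.0:
                out_row.append((r / a, g / a, b / a))
            else:
                # No image contributes here; black is an assumed convention.
                out_row.append((0.0, 0.0, 0.0))
        out.append(out_row)
    return out
```

For example, half-weighted red and half-weighted blue at the same pixel blend to `(0.5, 0.0, 0.5)`.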
![Page 5: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/5.jpg)
Blend weights
How should we set the alpha values of I1, I2, I3?
Simplest choice: set all alpha values to one (gives discontinuities)
Better choice: use feathering to ramp alpha values to zero near the edges
[Figure: images I1, I2, I3 overlapping at pixel p, with a feathered alpha channel]
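One way to build a feathered alpha channel is a linear ramp on the distance to the nearest image border. The sketch below assumes that ramp shape; the name `feather_alpha` and the `ramp` width parameter are hypothetical, and other falloff profiles are equally valid.

```python
def feather_alpha(width, height, ramp):
    """Alpha map that is 0 on the border and rises linearly to 1 over `ramp` pixels."""
    alpha = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance in pixels to the nearest image border.
            d = min(x, y, width - 1 - x, height - 1 - y)
            row.append(min(1.0, d / ramp))
        alpha.append(row)
    return alpha
```

Multiplying each image's RGB by this map before accumulation gives the premultiplied form used for blending.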
![Page 6: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/6.jpg)
What about more than two views?
• The geometry of three views is described by a 3 x 3 x 3 tensor called the trifocal tensor
• The geometry of four views is described by a 3 x 3 x 3 x 3 tensor called the quadrifocal tensor
• After this it starts to get complicated…
![Page 7: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/7.jpg)
New approach
• These matrices, tensors, etc., model geometry as a matrix that depends (in complicated ways) on the camera parameters alone
• Instead, we will explicitly model both cameras and points
![Page 8: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/8.jpg)
Large-scale structure from motion
Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).
Total reconstruction time: 23 hours
Number of cores: 352
![Page 9: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/9.jpg)
Questions?
![Page 10: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/10.jpg)
Structure from motion
• Given many images, how can we
a) figure out where they were all taken from?
b) build a 3D model of the scene?
This is (roughly) the structure from motion problem
![Page 11: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/11.jpg)
Structure from motion
• Input: images with points in correspondence $p_{i,j} = (u_{i,j}, v_{i,j})$
• Output
– structure: 3D location $x_i$ for each point $p_i$
– motion: camera parameters $R_j$, $t_j$, possibly $K_j$
• Objective function: minimize reprojection error
[Figure: reconstruction, side and top views]
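Written out, the reprojection error is usually a sum of squared image-space residuals. The form below is a sketch under assumed notation: $\pi$ (the projection of a 3D point into camera $j$) and the visibility indicator $w_{i,j}$ (1 if point $i$ is observed in camera $j$, 0 otherwise) are symbols introduced here, not taken from the slides.

```latex
\min_{\{x_i\},\,\{R_j, t_j\}} \;
\sum_{i} \sum_{j} w_{i,j} \,
\bigl\| \pi(x_i;\, R_j, t_j) - p_{i,j} \bigr\|^2
```

If the intrinsics $K_j$ are also unknown, they join $R_j$, $t_j$ as arguments of $\pi$ and as optimization variables.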
![Page 12: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/12.jpg)
Camera calibration and triangulation
• Suppose we know 3D points
– And have matches between these points and an image
– How can we compute the camera parameters?
• Suppose we have known camera parameters, each of which observes a point
– How can we compute the 3D location of that point?
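For the second question (triangulation), one simple option is the midpoint method: each camera contributes a ray through the observed point, and the estimate is the midpoint of the shortest segment joining two rays. This is a sketch of that method only, not necessarily the approach used in the lecture. It assumes ray origins and directions are already expressed in world coordinates, and the name `triangulate_midpoint` is hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points on rays c1 + s*d1 and c2 + t*d2."""
    def dot(p, q):
        return sum(x * y for x, y in zip(p, q))

    w = [b - a for a, b in zip(c1, c2)]  # vector from c1 to c2
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is not determined")

    # Least-squares ray parameters minimizing |c1 + s*d1 - (c2 + t*d2)|^2.
    s = (c * d - b * e) / denom
    t = (b * d - a * e) / denom

    p1 = [o + s * k for o, k in zip(c1, d1)]
    p2 = [o + t * k for o, k in zip(c2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))
```

With noise-free rays the two closest points coincide with the true 3D point; with noisy observations the midpoint is a compromise between them.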
![Page 13: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/13.jpg)
Structure from motion
• SfM solves both of these problems at once
• A kind of chicken-and-egg problem
– (but solvable)
![Page 14: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/14.jpg)
Photo Tourism
![Page 15: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/15.jpg)
First step: how to get correspondence?
• Feature detection and matching
![Page 16: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/16.jpg)
Feature detection
Detect features using SIFT [Lowe, IJCV 2004]
![Page 17: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/17.jpg)
Feature detection
Detect features using SIFT [Lowe, IJCV 2004]
![Page 18: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/18.jpg)
Feature matching
Match features between each pair of images
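The slide does not say how candidate matches are accepted. A common choice, described in Lowe's SIFT paper, is nearest-neighbour matching with a ratio test; the sketch below assumes that choice. Descriptors are plain tuples of numbers here, and the function name and the `ratio=0.8` default are illustrative.

```python
def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is a distinctive nearest neighbour of desc_a[i]."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    matches = []
    for i, d in enumerate(desc_a):
        # Rank candidates in the other image by squared descriptor distance.
        ranked = sorted((sq_dist(d, e), j) for j, e in enumerate(desc_b))
        if len(ranked) < 2:
            continue
        (best, j), (second, _) = ranked[0], ranked[1]
        # Ratio test on true distances, applied to squared values via ratio**2.
        if best < (ratio ** 2) * second:
            matches.append((i, j))
    return matches
```

Ambiguous features, whose two best candidates are almost equally close, are dropped rather than matched.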
![Page 19: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/19.jpg)
Feature matching
Refine matching using RANSAC to estimate fundamental matrix between each pair
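The RANSAC loop itself is independent of the model being fitted. To keep the sketch short it fits a 2D translation between matched keypoints, which is a deliberate stand-in: the pipeline on the slide fits a fundamental matrix, which needs a larger minimal sample and an epipolar residual. All names and default values below are assumptions.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """matches: list of ((x1, y1), (x2, y2)). Returns (model, inlier_indices)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a 2D translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches whose residual under this model is small.
        inliers = [
            k for k, ((ax, ay), (bx, by)) in enumerate(matches)
            if ((ax + dx - bx) ** 2 + (ay + dy - by) ** 2) ** 0.5 <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers
```

Matches outside the best consensus set are treated as outliers and discarded before reconstruction.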
![Page 20: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/20.jpg)
Image connectivity graph
(graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)
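A connectivity graph like this can be built from the pairwise matching results: images are nodes, and an edge joins two images that share enough verified matches. The sketch below assumes a dictionary of match counts per image pair and a hypothetical `min_matches` threshold, then extracts connected components, since only images in the same component can be reconstructed together.

```python
def connected_components(num_images, match_counts, min_matches=16):
    """match_counts maps (i, j) image-index pairs to their number of verified matches."""
    adjacency = {i: set() for i in range(num_images)}
    for (i, j), count in match_counts.items():
        if count >= min_matches:
            adjacency[i].add(j)
            adjacency[j].add(i)

    seen, components = set(), []
    for start in range(num_images):
        if start in seen:
            continue
        # Depth-first traversal from each unvisited image.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components
```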
![Page 21: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/21.jpg)
Structure from motion
[Figure: Camera 1 (R1,t1), Camera 2 (R2,t2), and Camera 3 (R3,t3) observing 3D points X1 to X7; p1,1, p1,2, p1,3 label the observations of X1 in cameras 1, 2, 3]
minimize f(R, T, P)
non-linear least squares
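The quantity being minimized can be evaluated directly once a camera model is fixed. The sketch below assumes a simple pinhole camera with a single focal length, no distortion, and the camera looking down its +z axis; real pipelines may use different conventions. The function names and the layout of `observations` are hypothetical.

```python
def project(point, rotation, translation, focal):
    """Pinhole projection: move the point into the camera frame, then divide by depth."""
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    return (focal * cam[0] / cam[2], focal * cam[1] / cam[2])

def reprojection_error(points, cameras, observations):
    """Sum of squared residuals; observations maps (i, j) to the measured (u, v)
    of point i in camera j, and cameras[j] is (rotation, translation, focal)."""
    total = 0.0
    for (i, j), (u, v) in observations.items():
        rotation, translation, focal = cameras[j]
        pu, pv = project(points[i], rotation, translation, focal)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

A non-linear least squares solver would adjust the points and cameras to drive this value down; only the evaluation is shown here.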
![Page 22: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/22.jpg)
Problem size
• What are the variables?
• How many variables per camera?
• How many variables per point?
• Trevi Fountain collection: 466 input photos + > 100,000 3D points = very large optimization problem
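A rough count shows the scale. The per-camera figure below is an assumption (3 rotation + 3 translation + 1 focal length = 7); the slide leaves it as a question, and models with distortion terms use more. Each 3D point contributes 3 unknowns.

```python
def num_unknowns(num_cameras, num_points, params_per_camera=7, params_per_point=3):
    """Total number of scalar unknowns in the joint optimization."""
    return num_cameras * params_per_camera + num_points * params_per_point

# Trevi Fountain figures from the slide, with 100,000 points as a lower bound.
trevi = num_unknowns(466, 100_000)
```

Under these assumptions `trevi` is 303,262, and nearly all of the unknowns come from the points rather than the cameras.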
![Page 23: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/23.jpg)
Incremental structure from motion
![Page 24: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/24.jpg)
Incremental structure from motion
![Page 25: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/25.jpg)
Incremental structure from motion
![Page 26: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/26.jpg)
Photo Explorer
![Page 27: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/27.jpg)
Demo
![Page 28: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/28.jpg)
![Page 29: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/29.jpg)
![Page 30: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/30.jpg)
![Page 31: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/31.jpg)
![Page 32: Lecture 22: Structure from motion CS6670: Computer Vision Noah Snavely](https://reader033.vdocuments.us/reader033/viewer/2022042608/56649d565503460f94a34eb2/html5/thumbnails/32.jpg)
Questions?