lecture 11: structure from motion, part 2 cs6670: computer vision noah snavely

55
Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Post on 15-Jan-2016

223 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Lecture 11: Structure from motion, part 2

CS6670: Computer VisionNoah Snavely

Page 2: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Announcements

• Project 2 due tonight at 11:59pm– Artifact due tomorrow (Friday) at 11:59pm

• Questions?

Page 3: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Final project

• Form your groups and submit a proposal

• Work on final project

• Final project presentations the last few classes

• Final report due the last day of classes

Page 4: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Final project

• One-person project: implement a recent research paper

Page 5: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Example one person projects• Seam carving

– SIGGRAPH 2007

Page 6: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Example one person projects

• What does the sky tell us about the camera?– Lalonde, et al, ECCV 2008

Page 7: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Other ideas

• Will post more ideas online

• Look at recent ICCV/CVPR/ECCV/SIGGRAPH papers

Page 8: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Example two/three person projects

• Automatic calibration of your camera’s clock

• In roughly what year was a given photo taken?

• Matching historical images to modern-day images

• Matching images on news sites / blogs for article matching

• Will post others online soon…

Page 9: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Final projects

• Think about your groups and project ideas

• Proposals will be due on Tuesday, Oct. 27

Page 10: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Scene reconstruction

Feature detectionFeature detection

Pairwisefeature matching

Pairwisefeature matching

Incremental structure

from motion

Incremental structure

from motion

Correspondence estimation

Correspondence estimation

Page 11: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 12: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 13: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 14: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Feature matchingMatch features between each pair of images

Page 15: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Feature matchingRefine matching using RANSAC [Fischler & Bolles 1987] to estimate fundamental matrices between pairs

Page 16: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Image connectivity graph

(graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)

Page 17: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Incremental structure from motion

Page 18: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Incremental structure from motion

Page 19: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Incremental structure from motion

Page 20: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Problem size

• Trevi Fountain collection 466 input photos + > 100,000 3D points = very large optimization problem

Page 21: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Photo Tourism overview

Scene reconstruction

Scene reconstruction

Photo ExplorerPhoto

ExplorerInput photographs

Page 22: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Photo Explorer

Page 23: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Can we reconstruct entire cities?

Page 24: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely
Page 25: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely
Page 26: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Gigantic matching problem

• 1,000,000 images 500,000,000,000 pairs– Matching all of these on a 1,000-node cluster would

take more than a year, even if we match 10,000 every second

– And involves TBs of data

• The vast majority (>99%) of image pairs do not match

• There are better ways of finding matching images (more on this later)

Page 27: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Gigantic SfM problem

• Largest problem size we’ve seen:– 15,000 cameras– 4 million 3D points– more than 12 million parameters– more than 25 million equations

• Huge optimization problem• Requires sparse least squares techniques

Page 28: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Building Rome in a Day

Rome, Italy. Reconstructed 150,000 in 21 hours on 496 machines

Colosseum

St. Peter’s Basilica

Trevi Fountain

Page 29: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352

Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352

Page 30: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Dubrovnik

Dubrovnik, Croatia. 4,619 images (out of an initial 57,845).Total reconstruction time: 23 hoursNumber of cores: 352

Page 31: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

San Marco Square

San Marco Square and environs, Venice. 14,079 photos, out of an initial 250,000. Total reconstruction time: 3 days. Number of cores: 496.

Page 32: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely
Page 33: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely
Page 34: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely
Page 36: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Questions?

Page 37: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

A few more preliminaries• Let’s revisit homogeneous coordinates• What is the geometric intuition?

– a point in the image is a ray in projective space

• Each point (x,y) on the plane is represented by a ray (sx,sy,s)– all points on the ray are equivalent: (x, y, 1) (sx, sy, s)

(0,0,0)

(sx,sy,s)

image plane

(x,y,1)y

x-z

Page 38: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Projective lines• What does a line in the image correspond to in

projective space?

• A line is a plane of rays through origin– all rays (x,y,z) satisfying: ax + by + cz = 0

z

y

x

cba0 :notationvectorin

• A line is also represented as a homogeneous 3-vector ll p

Page 39: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

l

Point and line duality– A line l is a homogeneous 3-vector– It is to every point (ray) p on the line: l p=0

p1p2

What is the intersection of two lines l1 and l2 ?

• p is to l1 and l2 p = l1 l2

Points and lines are dual in projective space• given any formula, can switch the meanings of points

and lines to get another formula

l1

l2

p

What is the line l spanned by rays p1 and p2 ?

• l is to p1 and p2 l = p1 p2

• l is the plane normal

.

Page 40: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Ideal points and lines

• Ideal point (“point at infinity”)– p (x, y, 0) – parallel to image plane– It has infinite image coordinates

(sx,sy,0)-y

x-z image plane

• Ideal line• l (a, b, 0) – parallel to image plane

(a,b,0)-y

x-z image plane

• Corresponds to a line in the image (finite coordinates)– goes through image origin (principle point)

Page 41: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Two-view geometry

• Where do epipolar lines come from?

epipolar plane

epipolar lineepipolar lineepipolar lineepipolar line

0

3d point lies somewhere along r

(projection of r)

Page 42: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix

• This epipolar geometry is described by a special 3x3 matrix , called the fundamental matrix

• Epipolar constraint on corresponding points:

• The epipolar line of point p in image J is:

epipolar plane

epipolar lineepipolar lineepipolar lineepipolar line

0

(projection of ray)

Page 43: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix

• Two special points: e1 and e2 (the epipoles): projection of one camera into the other

epipolar plane

epipolar lineepipolar lineepipolar lineepipolar line

0

(projection of ray)

Page 44: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix

• Two special points: e1 and e2 (the epipoles): projection of one camera into the other

0

Page 45: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Rectified case

• Images have the same orientation, t parallel to image planes

• Where are the epipoles?

Page 46: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Epipolar geometry demo

Page 47: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix

• Why does F exist?• Let’s derive it…

0

Page 48: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

0

: intrinsics of camera 1 : intrinsics of camera 2

: rotation of image 2 w.r.t. camera 1

: ray through p in camera 1’s (and world) coordinate system

: ray through q in camera 2’s coordinate system

Page 49: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

• , , and are coplanar• epipolar plane can be represented as

0

Page 50: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

0

Page 51: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

• One more substitution:– Cross product with t can be represented as a 3x3 matrix

0

Page 52: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

0

Page 53: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – calibrated case

0

{the Essential matrix

Page 54: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Fundamental matrix – uncalibrated case

0

the Fundamental matrix

Page 55: Lecture 11: Structure from motion, part 2 CS6670: Computer Vision Noah Snavely

Properties of the Fundamental Matrix

• is the epipolar line associated with

• is the epipolar line associated with

• and

• is rank 2

• How many parameters does F have?55

T