structure from motion - cvg€¦ · • obtain reliable matches using matching or tracking and...

109
Structure from Motion Read Chapter 7 in Szeliski’s book

Upload: others

Post on 03-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Structure from Motion

Read Chapter 7 in Szeliski’s book

Page 2: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Schedule (tentative)

2

# date topic1 Sep.22 Introduction and geometry2 Sep.29 Invariant features 3 Oct.6 Camera models and calibration4 Oct.13 Multiple-view geometry5 Oct.20 Model fitting (RANSAC, EM, …)6 Oct.27 Stereo Matching7 Nov.3 2D Shape Matching 8 Nov.10 Segmentation9 Nov.17 Shape from Silhouettes10 Nov.24 Structure from motion11 Dec.1 Optical Flow 12 Dec.8 Tracking (Kalman, particle filter)13 Dec.15 Object category recognition 14 Dec.22 Specific object recognition

Page 3: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Today’s class

• Structure from motion

– factorization – sequential

– bundle adjustment

Page 4: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Factorization

• Factorise observations in structure of the scene and motion/calibration of the camera

• Use all points in all images at the same time

Affine factorisation Projective factorisation

Page 5: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Affine cameraThe affine projection equations are

1

=

j

j

j

yi

xi

ij

ij

ZYX

PP

yx

10001

1

=

j

j

j

yi

xi

ij

ij

ZYX

PP

yx

~~

4

4

=

=

j

j

j

yi

xi

ij

ijy

iij

xiij

ZYX

PP

yx

PyPx

how to find the origin? or for that matter a 3D reference point?

affine projection preserves center of gravity

∑−=i

ijijij xxx~ ∑−=i

ijijij yyy~

Page 6: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Orthographic factorizationThe orthographic projection equations are

where njmijiij ,...,1,,...,1,Mm === P

All equations can be collected for all i and j

where

[ ]n

mmnmm

n

n

M,...,M,M,,

mmm

mmmmmm

212

1

21

22221

11211

=

=

= M

P

PP

Pm

MPm =

M ~~

m

=

=

=

j

j

j

jyi

xi

iij

ijij

ZYX

,PP,

yx

P

Note that P and M are resp. 2mx3 and 3xn matrices and therefore the rank of m is at most 3

(Tomasi Kanade’92)

Page 7: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Orthographic factorizationFactorize m through singular value decomposition

An affine reconstruction is obtained as follows

TVUm Σ=

TVMUP Σ== ~,~

(Tomasi Kanade’92)

[ ]n

mmnmm

n

n

M,...,M,M

mmm

mmmmmm

min 212

1

21

22221

11211

P

PP

Closest rank-3 approximation yields MLE!

Page 8: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

0~~1~~1~~

1

1

1

=

=

=

−−

−−

−−

TT

TT

TT

yi

xi

yi

yi

xi

xi

PP

PP

PP

AA

AA

AA

0~~1~~1~~

=

=

=

T

T

T

yi

xi

yi

yi

xi

xi

PP

PP

PP

C

C

C

A metric reconstruction is obtained as follows

Where A is computed from

Orthographic factorizationFactorize m through singular value decomposition

An affine reconstruction is obtained as follows

TVUm Σ=

TVMUP Σ== ~,~

MAMAPP ~,~ 1 == −

0

1

1

=

=

=

T

T

T

yi

xi

yi

yi

xi

xi

PP

PP

PP 3 linear equations per view on symmetric matrix C (6DOF)

A can be obtained from Cthrough Cholesky factorisationand inversion

(Tomasi Kanade’92)

Page 9: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Examples

Tomasi Kanade’92,Poelman & Kanade’94

Page 10: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Examples

Tomasi Kanade’92,Poelman & Kanade’94

Page 11: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Examples

Tomasi Kanade’92,Poelman & Kanade’94

Page 12: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Examples

Tomasi Kanade’92,Poelman & Kanade’94

Page 13: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Perspective factorization

The camera equations

for a fixed image i can be written in matrix form as

where

mjmijiijij ,...,1,,...,1,Mmλ === P

MPm iii =Λ

[ ] [ ]( )imiii

mimiii

λ,...,λ,λdiagM,...,M,M , m,...,m,m

21

2121

=Λ== Mm

Page 14: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Perspective factorization

All equations can be collected for all i as

wherePMm =

=

Λ

ΛΛ

=

mnn P

PP

P

m

mm

m...

,...

2

1

22

11

In these formulas m are known, but Λi,P and Mare unknown

Observe that PM is a product of a 3mx4 matrix and a 4xn matrix, i.e. it is a rank-4 matrix

Page 15: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Perspective factorization algorithm

Assume that Λi are known, then PM is known.

Use the singular value decomposition PM=UΣ VT

In the noise-free caseΣ=diag(σ1,σ2,σ3,σ4,0, … ,0)

and a reconstruction can be obtained by setting:

P=the first four columns of UΣ.M=the first four rows of V.

Page 16: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Iterative perspective factorization

When Λi are unknown the following algorithm can be used:

1. Set λij=1 (affine approximation).

2. Factorize PM and obtain an estimate of P and M. If σ5 is sufficiently small then STOP.

3. Use m, P and M to estimate Λi from the cameraequations (linearly) mi Λi=PiM

4. Goto 2.

In general the algorithm minimizes the proximity measure P(Λ,P,M)=σ5

Note that structure and motion recovered up to an arbitrary projective transformation

Page 17: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Further Factorization work

Factorization with uncertainty

Factorization for dynamic scenes

(Irani & Anandan, IJCV’02)

(Costeira and Kanade ‘94)(Bregler et al. ‘00, Brand ‘01)

(Yan and Pollefeys, ‘05/’06)

Page 18: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Multi-Camera Factorizations

(Static) affine cameras Rigidly moving object Camera calibration using rigid motion 2D feature point trajectories as input No feature point correspondences between different camera views

required

rank ≤13all tracks of all affine cameras form rank 13 subspace!

juxtapose x, y coordinates for all points and cameras

for a

ll fr

ames

(Angst & Pollefeys ICCV09)

Page 19: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and
Page 20: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Factorization technique 3rd order tensor with rank 13 constraint

C : camera matrices M : motion matrix S : Shape matrix

Multi-Camera Factorizations

(Angst & Pollefeys ICCV09)

Page 21: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Experiment on real data

Minimal cases

extra camera: 1 point

(Angst & Pollefeys ICCV09)

Multi-Camera Factorizations

Page 22: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

practical structure and motion recovery from images

• Obtain reliable matches using matching or tracking and 2/3-view relations

• Compute initial structure and motion• Refine structure and motion• Auto-calibrate• Refine metric structure and motion

Page 23: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Initialize Motion (P1,P2 compatibel with F)

Sequential Structure and Motion Computation

Initialize Structure (minimize reprojection error)

Extend motion(compute pose through matches seen in 2 or more previous views)

Extend structure(Initialize new structure,refine existing structure)

Page 24: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Computation of initial structure and motion

according to Hartley and Zisserman “this area is still to some extend a

black-art”

All features not visible in all images⇒ No direct method (factorization not applicable)⇒ Build partial reconstructions and assemble

(more views is more stable, but less corresp.)

1) Sequential structure and motion recovery

2) Hierarchical structure and motion recovery

Page 25: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sequential structure and motion recovery

• Initialize structure and motion from two views

• For each additional view– Determine pose– Refine and extend structure

• Determine correspondences robustly by jointly estimating matches and epipolar geometry

Page 26: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Initial structure and motion

[ ][ ][ ]eeaFeP

0IPT

x +=

=

2

1

Epipolar geometry ↔ Projective calibration

012 =FmmT

compatible with F

Yields correct projective camera setup(Faugeras´92,Hartley´92)

Obtain structure through triangulationUse reprojection error for minimizationAvoid measurements in projective space

Page 27: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Compute Pi+1 using robust approach (6-point RANSAC)Extend and refine reconstruction

)x,...,X(xPx 11 −= iii

2D-2D

2D-3D 2D-3D

mimi+1

M

new view

Determine pose towards existing structure

Page 28: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Compute P with 6-point RANSAC

• Generate hypothesis using 6 points

• Count inliers

• Projection error ( )( ) ?x,x...,,xXP 11 td iii <−

• Back-projection error ( ) ijtd jiij <∀< ?,x,xF• Re-projection error ( )( ) td iiii <− x,x,x...,,xXP 11

• 3D error ( )( ) ?X,xP 3-1

Dii td <Λ

• Projection error with covariance ( )( ) td iii <−Λ x,x...,,xXP 11

• Expensive testing? Abort early if not promising• Verify at random, abort if e.g. P(wrong)>0.95

(Chum and Matas, BMVC’02)

Page 29: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Calibrated structure from motion

• Equations more complicated, but less degeneracies

• For calibrated cameras:– 5-point relative motion (5DOF)

Nister CVPR03

– 3-point pose estimation (6DOF)Haralick et al. IJCV94

D. Nistér, An efficient solution to the five-point relative pose problem, In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), Volume 2, pages. 195-202, 2003.

R. Haralick, C. Lee, K. Ottenberg, M. Nolle. Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem. Int’l Journal of Computer Vision, 13, 3, 331-356, 1994.

Page 30: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

• Linear equations for 5 points

• Linear solution space

• Non-linear constraints

5-point relative motion

10 cubic polynomials

scale does not matter, choose

(Nister, CVPR03)

Page 31: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

5-point relative motion

• Perform Gauss-Jordan elimination on polynomials

-z

-z

-z

represents polynomial of degree n in z

(Nister, CVPR03)

Page 32: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Three points perspective pose –p3p

(Haralick et al., IJCV94)

All techniques yield 4th order polynomial

Haralick et al. recommends using Finsterwalder’stechnique as it yields the best results numerically

19031841

Page 33: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Minimal solversLot’s of recent activity using Groebner bases:• Henrik Stewénius, David Nistér, Fredrik Kahl, Frederik

Schaffalitzky: A Minimal Solution for Relative Pose with Unknown Focal Length, CVPR 2005.

• H. Stewénius, D. Nistér, M. Oskarsson, and K. Åström. Solutions to minimal generalized relative pose problems. Omnivis 2005.

• D. Nistér, A Minimal solution to the generalised 3-point pose problem, CVPR 2004

• Martin Bujnak, Zuzana Kukelova, Tomás Pajdla: A general solution to the P4P problem for camera with unknown focal length. CVPR 2008.

• Brian Clipp, Christopher Zach, Jan-Michael Frahm and Marc Pollefeys, A New Minimal Solution to the Relative Pose of a Calibrated Stereo Camera with Small Field of View Overlap, ICCV 2009.

• Zuzana Kukelova, Martin Bujnak, Tomás Pajdla: Automatic Generator of Minimal Problem Solvers. ECCV 2008.

• F. Fraundorfer, P. Tanskanen and M. Pollefeys. A Minimal Case Solution to the Calibrated Relative Pose Problem for the Case of Two Known Orientation Angles. ECCV 2010.

• …

Page 34: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Minimal relative pose with know vertical

40

-g

Fraundorfer, Tanskanen and Pollefeys, ECCV2010

5 linear unknowns → linear 5 point algorithm3 unknowns → quartic 3 point algorithm

Vertical direction can often be estimated• inertial sensor• vanishing point

Page 35: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Incremental Structure from Motion

Photo Tourism

Noah Snavely, Steve Seitz, Rick Szeliski

University of Washington

Richard Szeliski

Page 36: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Incremental structure from motion

• Automatically select an initial pair of images

Page 37: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Incremental structure from motion

Page 38: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Incremental structure from motion

Page 39: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Hierarchical structure and motion recovery

• Compute 2-view• Compute 3-view• Stitch 3-view reconstructions• Merge and refine reconstruction

FT

H

PM

Page 40: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Hierarchical structure from motion

• Farenzena, Fusiello, and Gherardi,Structure-and-motion Pipeline on a Hierarchical Cluster Tree, 3DIM 2009, October 2009

Richar

ICVSS 2010 46

Page 41: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Hierarchical structure from motion

Richar

ICVSS 2010 47

Page 42: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Hierarchical structure from motion

Richar

ICVSS 2010 48

Page 43: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Stitching 3-view reconstructions

Different possibilities1. Align (P2,P3)with(P’1,P’2) ( ) ( )-1

23-1

12H

HP',PHP',Pminarg AA dd +

2. Align X,X’ (and C’C’) ( )∑j

jjAd HX',XminargH

3. Minimize reproj. error ( )( )∑

∑+

jjj

jjj

d

d

x',HXP'

x,X'PHminarg 1-

H

4. MLE (merge) ( )∑j

jjd x,PXminargXP,

Page 44: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

More issues

• Repetition ambiguities

• Large scale reconstructions

Marc PollefeysICVSS 2010 50

Page 45: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Disambiguating visual relations using loop constraints

51

(Zach et al CVPR‘10)

Page 46: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Repetition and symmetry detection

52

(Wu et al ECCV‘10)

Page 47: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

53

•Challenge: Error accumulation yields drift of relative scale, orientation and position

•Solution:Cancel drift by closing loops (e.g. at intersections)Need to visually recognize locations

Video-only large-scale reconstruction?

Page 48: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Solving 3D puzzles with VIPs

SIFT features•Extracted from 2D images•Variation due to viewpoint

VIP features•Extracted from 3D model•Viewpoint invariant

54

(Wu et al., CVPR08)

Page 49: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

am

55

Page 50: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Where am I? What am I looking at?

3D Database

xSIFT VIP

56

Image query

Location recognition (Baatz et al., ECCV2010)

Page 51: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

57

Projective ambiguity

reconstruction from uncalibrated images only⇒ projective ambiguity on reconstruction

´M´M))((Mm 1 PTPTP === − Similarity(7DOF)

Affine(12DOF)

Projective(15DOF)

Page 52: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

58

Euclidean projection matrix

=

1yy

xx

cfcsf

K

Factorization of Euclidean projection matrix

Intrinsics:

Extrinsics: ( )t,RNote: every projection matrix can be factorized, but only meaningful for euclidean projection matrices

(camera geometry)

(camera motion)

Page 53: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

59

The Absolute Quadric

Eliminate extrinsics from equation

Equivalent to projection of quadric

))(Ω)((Ω *1* TTTTT PTTTPTPPKK −−==

Absolute Quadric also exists in projective world

Absolute Quadric

T´Ω´´ * PP=Transforming world so thatreduces ambiguity to metric

** ΩΩ´ →

2 2 2 0A B C+ + =

0AX BY CZ D+ + + =

Page 54: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

60

ω*

Ω*

Absolute Quadric = calibration object which is always present but can only be observed through constraints on the intrinsics

Tii

Tiii Ωω KKPP == ∗∗

Absolute quadric and self-calibration

Projection equation:

• Translate constraints on K • through projection equation• to constraints on Ω*

Page 55: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

• Don’t treat all constraints equal

61

Practical linear self-calibration

( ) ( )( )( )( ) 0Ω

0Ω0Ω

0ΩΩ

23T

13T

12T

22T

11T

===

=−

∗∗

PPPPPP

PPPP(Pollefeys et al., ECCV‘02)

2

* 2

ˆ 0 0ˆ0 0

0 0 1

f

f

= Ω ≈

P PT TKK

( ) ( )( ) ( ) 0ΩΩ

0ΩΩ

33T

22T

33T

11T

=−=−

∗∗

∗∗

PPPPPPPP

(only rough aproximation, but still usefull to avoid degenerate configurations)

(relatively accurate for most cameras)

91

91

1.011.0

101.01

2.01

=

1yy

xx

cfcsf

K

after normalization!

when fixating point at image-center not only absolute quadric diag(1,1,1,0) satisfies ICCV98 eqs., but also diag(1,1,1,a), i.e. real or imaginary spheres!

Recent related work (Gurdjos et al. ICCV‘09)

Page 56: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

62

Upgrade from projective to metric

• Locate Ω* in projective reconstruction (self-calibration)Problem: not always unique (Critical Motion Sequences)

• Transform projective reconstruction • to bring Ω* in canonical position

Page 57: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Refining structure and motion

• Minimize reprojection error

– Maximum Likelyhood Estimation (if error zero-mean Gaussian noise)

– Huge problem but can be solved efficiently(Bundle adjustment)

( )∑∑= =

m

k

n

iikD

ik 1 1

2

kiM,P

MP,mmin

Page 58: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Non-linear least-squares

• Newton iteration• Levenberg-Marquardt • Sparse Levenberg-Marquardt

(P)X f= (P)X argminP

f−

Page 59: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Newton iteration

Taylor approximation

∆+≈∆+ J)(P )(P 00 ff PXJ∂∂

=Jacobian

)(PX 1f−

∆−=∆−−≈− JJ)(PX)(PX 001 eff

( ) 0T-1T

0TT JJJJJJ ee =∆⇒=∆⇒

∆+=+ i1i PP ( ) 0T-1T JJJ e=∆

( ) 01-T-11-T JJJ eΣΣ=∆

normal eq.

Page 60: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Levenberg-Marquardt

0TT JNJJ e=∆=∆

0TJN' e=∆

Augmented normal equations

Normal equations

J)λdiag(JJJN' TT +=

30 10λ −=

10/λλ :success 1 ii =+

ii λ10λ :failure = solve again

accept

λ small ~ Newton (quadratic convergence)

λ large ~ descent (guaranteed decrease)

Page 61: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Levenberg-Marquardt

Requirements for minimization• Function to compute f• Start value P0

• Optionally, function to compute J(but numerical ok, too)

Page 62: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Richard Szeliski

Bundle Adjustment

• What makes this non-linear minimization hard?– many more parameters: potentially slow– poorer conditioning (high correlation)– potentially lots of outliers– gauge (coordinate) freedom

Page 63: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and
Page 64: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and
Page 65: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Chained transformations

• Think of the optimization as a set of transformations along with their derivatives

Richard Szeliski

Page 66: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Richard Szeliski

Levenberg-Marquardt

• Iterative non-linear least squares [Press’92]– Linearize measurement equations

– Substitute into log-likelihood equation: quadratic cost function in ∆m

Page 67: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Richard Szeliski

Robust error models

• Outlier rejection– use robust penalty applied

to each set of jointmeasurements

– for extremely bad data, use random sampling [RANSAC, Fischler & Bolles, CACM’81]

Page 68: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Richard Szeliski

• Iterative non-linear least squares [Press’92]– Solve for minimum

Hessian:

error:

Levenberg-Marquardt

Page 69: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Chained transformations

• Only some derivatives are non-zero:j th camera, i th point

Richard Szeliski

Page 70: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Multi-camera rig

• Easy extension of generalized bundler:

Richard Szeliski

Page 71: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Richard Szeliski

Lots of parameters: sparsity

• Only a few entries in Jacobian are non-zero

Page 72: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Exploiting sparsity

• Schur complement to get reduced camera matrix

Richard Szeliski

Page 73: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse Levenberg-Marquardt

• complexity for solving– prohibitive for large problems

(100 views 10,000 points ~30,000 unknowns)

• Partition parameters– partition A – partition B (only dependent on A and itself)

0T-1 JN' e=∆3N

Page 74: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

residuals:normal equations:

with

note: tie points should be in partition A

Page 75: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

normal equations:

modified normal equations:

solve in two parts:

Page 76: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

U1

U2

U3

WT

W

V

P1 P2 P3 M

Jacobian of has sparse block structure

=J == JJN T

12xm 3xn(in general

much larger)

im.pts. view 1

( )( )∑∑= =

m

k

n

iikD

1 1

2ki MP,m

Needed for non-linear minimization

Page 77: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

• Eliminate dependence of camera/motion parameters on structure parametersNote in general 3n >> 11m

WT V

U-WV-1WT

− −

NI0WVI 1

11xm 3xn

Allows much more efficient computationse.g. 100 views,10000 points,

solve ±1000x1000, not ±30000x30000

Often still band diagonaluse sparse linear algebra algorithms

Page 78: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

normal equations:

modified normal equations:

solve in two parts:

Page 79: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Sparse bundle adjustment

• Covariance estimation

-1WVY =( )

1a

Tb

1-a

VYYWWVU

+

+∑=∑−=∑

Yaab ∑=∑ -

Page 80: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Direct vs. iterative solvers

• How do they work?• When should you use them?

Richard Szeliski

Page 81: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Large reduced camera matrices

• Use iterative preconditioned conjugate gradient– [Jeong, Nister, Steedly, Szeliski, Kweon, CVPR 2009]– [Agarwal, Snavely, Seitz, and Szeliski, ECCV 2009]– Block diagonal works well– Sometimes partial Cholesky does too– Direct solvers for small problems

Richard Szeliski

Page 82: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Large-scale structure from motion

Richard Szeliski

Page 83: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Large-scale reconstruction

• Most of the models shown so far have had ~500 images

• How do we scale from 100s to 10,000s of images?

• Observation: Internet collections represent very non-uniform samplings of viewpoint

Richard Szeliski

Page 84: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Skeletal graphs for efficient structure from motion

Noah Snavely, Steve Seitz,Richard Szeliski

CVPR’2008

Page 85: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Skeletal set

• Goal: select a smaller set of images to reconstruct, without sacrificing quality of reconstruction

• Two problems:– How do we measure quality?– How do we find a subset of images with

bounded quality loss?

Page 86: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Skeletal set

Goal: given an image graph ,• select a small set S of important

images to reconstruct, bounding the loss in quality of the reconstruction

• Reconstruct the skeletal set S • Estimate the remaining images with

much faster pose estimation steps

Page 87: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Properties of the skeletal set

• Should touch all parts ofDominating set

• Should form a single reconstructionConnected dominating set

• Should result in an accurate reconstruction ?

Page 88: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Example

Full graph Skeletal graph

Page 89: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Representing information in a graphWhat kind of information?– No absolute information about camera

positions– Each edge provides information about the

relative positions of two images

– … but not all edges are equally informativeWe model information with the uncertainty (covariance) in pairwise camera positions

Page 90: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Estimating certainty

• Usually measured with covariance matrix

• SfM has thousands or millions of parameters

• We only measure covariance in camera positions (3x3 matrix for every camera)

Page 91: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Pantheon

Full graph Skeletal graph (t=16)

Page 92: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Skeletal reconstruction

101 images

After adding leaves579 images

After final optimization579 images

Pantheon

Page 93: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Trafalgar Square

2973 images registered (277 in skeletal set)

Page 94: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Trafalgar Square

Page 95: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Running time

0

20

40

60

80

100

120

140

160

180

Stonehenge St. Peters Pantheon Pisa Trafalgar

Full reconstruction

Skeletal reconstruction

Skeletal reconstruction + BA

(~10 days) (~30 days)

hour

s

Page 96: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Building Rome in a Day

Sameer Agarwal, Noah Snavely,Ian Simon, Steven M. Seitz,

Richard SzeliskiICCV’2009

Page 97: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Distributed matching pipeline

Page 98: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Query expansion

Page 99: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Results: Dubrovnik

Page 100: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and
Page 101: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Results: Rome

Page 102: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Changchang Wu’s SfM code

for iconic graph• uses 5-point+RANSAC for 2-view initialization• uses 3-point+RANSAC for adding views• performs bundle adjustmentFor additional images• use 3-point+RANSAC pose estimation

Page 103: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Rome on a cloudless day

(Frahm et al. ECCV 2010)• GIST & clustering (1h35)

SIFT & Geometric verification (11h36)

SfM & Bundle (8h35)

Dense Reconstruction (1h58)

Some numbers• 1PC• 2.88M images• 100k clusters• 22k SfM with 307k images• 63k 3D models• Largest model 5700 images• Total time 23h53

Page 104: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Structure from motion: limitations

• Very difficult to reliably estimate metricstructure and motion unless:– large (x or y) rotation or– large field of view and depth variation

• Camera calibration important for Euclidean reconstructions

• Need good feature tracker

Page 105: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Related problems

• On-line structure from motion and SLaM (Simultaneous Localization and Mapping)– Kalman filter (linear)– Particle filters (non-linear)

Page 106: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Visual SLAM

• Real-time Stereo Visual SLAM

113

(Clipp et al., IROS2010)

Page 107: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Open challenges

• Large scale structure from motion– Complete building– Complete city

Page 108: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

Open source vode available, e.g.

Christopher Zach (SfM, Bundle adjustment)http://www.inf.ethz.ch/personal/chzach/opensource.html

Photo Tourism Bundler:http://phototour.cs.washington.edu/bundler/

Page 109: Structure from Motion - CVG€¦ · • Obtain reliable matches using matching or tracking and 2/3-view relations • Compute initial structure and motion • Refine structure and

116

Next week:Optical Flow