agenda - carnegie mellon university16720.courses.cs.cmu.edu/lec/transformations.pdf164 computer...


Post on 23-Jan-2020


Page 1: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix

Agenda

• Rotations

• Camera calibration

• Homography

• Ransac

Page 2:

[Excerpt: Computer Vision: Algorithms and Applications (September 3, 2010 draft), p. 164]

Transformation      Matrix      Size   # DoF   Preserves
translation         [I | t]     2×3    2       orientation
rigid (Euclidean)   [R | t]     2×3    3       lengths
similarity          [sR | t]    2×3    4       angles
affine              [A]         2×3    6       parallelism
projective          [H~]        3×3    8       straight lines

Table 3.5 Hierarchy of 2D coordinate transformations. Each transformation also preserves the properties listed in the rows below it, i.e., similarity preserves not only angles but also parallelism and straight lines. The 2×3 matrices are extended with a third [0^T 1] row to form a full 3×3 matrix for homogeneous coordinate transformations.

...examples of such transformations, which are based on the 2D geometric transformations shown in Figure 2.4. The formulas for these transformations were originally given in Table 2.1 and are reproduced here in Table 3.5 for ease of reference.

In general, given a transformation specified by a formula x' = h(x) and a source image f(x), how do we compute the values of the pixels in the new image g(x), as given in (3.88)? Think about this for a minute before proceeding and see if you can figure it out.

If you are like most people, you will come up with an algorithm that looks something like Algorithm 3.1. This process is called forward warping or forward mapping and is shown in Figure 3.46a. Can you think of any problems with this approach?

procedure forwardWarp(f, h, out g):
    For every pixel x in f(x)
        1. Compute the destination location x' = h(x).
        2. Copy the pixel f(x) to g(x').

Algorithm 3.1 Forward warping algorithm for transforming an image f(x) into an image g(x') through the parametric transform x' = h(x).
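A minimal Python sketch of Algorithm 3.1, assuming the image is a list of rows and h maps integer pixel coordinates (x, y) to real-valued destinations. The names forward_warp and scale2 are hypothetical, and rounding to the nearest destination pixel is one possible choice that the algorithm itself does not specify.

```python
def forward_warp(f, h, out_shape, fill=0):
    """Forward-warp image f (a list of rows) through h: (x, y) -> (x', y')."""
    rows, cols = out_shape
    g = [[fill] * cols for _ in range(rows)]
    for y, row in enumerate(f):
        for x, value in enumerate(row):
            xp, yp = h(x, y)                  # 1. destination location x' = h(x)
            xi, yi = round(xp), round(yp)     # assumption: snap to nearest pixel
            if 0 <= xi < cols and 0 <= yi < rows:
                g[yi][xi] = value             # 2. copy f(x) to g(x')
    return g


def scale2(x, y):
    """Hypothetical transform: uniform scaling by 2."""
    return 2 * x, 2 * y


src = [[1, 2],
       [3, 4]]
dst = forward_warp(src, scale2, (4, 4))
```

In this example only four of the sixteen destination pixels receive a value; the rest keep the fill value, which is one way to see the holes that forward warping can leave.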

Geometric Transformations

[Figure: x and y axes]

Let’s define families of transformations by the properties that they preserve

Page 3:

Rotations

Definition: an orthogonal transformation preserves dot products

$a^T b = T(a)^T T(b)$ where $T(a) = Aa$, $a \in \mathbb{R}^n$, $A \in \mathbb{R}^{n \times n}$

$a^T b = a^T A^T A b \iff A^T A = I$

[can conclude by setting a, b = coordinate vectors]

Linear transformations that preserve distances and angles

Defn: A is a rotation matrix if $A^T A = I$, $\det(A) = 1$
Defn: A is a reflection matrix if $A^T A = I$, $\det(A) = -1$

Page 4:

2D Rotations

$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

1 DOF

Page 5:

3D Rotations

Think of as change of basis where ri = r(i,:) are orthonormal basis vectors

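$R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$

[Figure: rotated coordinate frame with axes r1, r2, r3]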

How many DOFs?

3 = (2 to point r1 + 1 to rotate along r1)

Page 6:

3D Rotations

Lots of parameterizations that try to capture 3 DOFs

Helpful one for vision: axis-angle representation

Represent a 3D rotation with a unit vector that represents the axis of rotation, and an angle of rotation about that vector

[Embedded slides from 05-3DTransformations.key, February 9, 2015]

Shears

$A = \begin{bmatrix} 1 & h_{xy} & h_{xz} & 0 \\ h_{yx} & 1 & h_{yz} & 0 \\ h_{zx} & h_{zy} & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Shears y into x

Rotations
• 3D rotations are fundamentally more complex than in 2D!
• 2D: amount of rotation
• 3D: amount and axis of rotation

[Figure: 2D -vs- 3D rotation]

Page 7:

Recall: cross-product

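Dot product:

$a \cdot b = \|a\| \, \|b\| \cos\theta$

Cross product:

$a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k$

Cross product matrix:

$a \times b = \hat{a} b = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$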
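As a quick check of the cross product matrix, the following Python sketch builds the skew-symmetric matrix of a and compares its product with b against the component formula. The helper names hat, matvec and cross are hypothetical, chosen for illustration.

```python
def hat(a):
    """Skew-symmetric cross product matrix of a 3-vector a."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2],
            [a3, 0.0, -a1],
            [-a2, a1, 0.0]]


def matvec(M, v):
    """Matrix-vector product for lists of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cross(a, b):
    """Cross product from the determinant expansion."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
via_matrix = matvec(hat(a), b)   # matrix form of a x b
direct = cross(a, b)             # component form of a x b
```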

Page 8:

Approach

[Figure: a point x and the rotation axis ω]

$\omega \in \mathbb{R}^3$, $\|\omega\| = 1$

Page 9:

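Approach

[Figure: x rotated by angle θ about the axis ω, with components x∥ (parallel to ω) and x⊥ (perpendicular to ω)]

$\omega \in \mathbb{R}^3$, $\|\omega\| = 1$

1. Write x as the sum of its components parallel and perpendicular to ω

2. Rotate the perpendicular component by a 2D rotation of θ in the plane orthogonal to ω

$R = I + \hat{\omega} \sin\theta + \hat{\omega}\hat{\omega}(1 - \cos\theta)$

[Rx can simplify to cross and dot product computations]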
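The bracketed remark can be illustrated with a small Python sketch. Using the identity that applying the cross product matrix of a unit ω twice to x gives ω(ω·x) − x, the axis-angle formula applied to x reduces to x cos θ + (ω × x) sin θ + ω(ω·x)(1 − cos θ). The function name rodrigues_rotate is hypothetical, and ω is assumed to already have unit length.

```python
import math


def rodrigues_rotate(x, omega, theta):
    """Rotate x by theta about the unit axis omega using only dot and cross products."""
    dot = sum(w * xi for w, xi in zip(omega, x))
    cross = [omega[1] * x[2] - omega[2] * x[1],
             omega[2] * x[0] - omega[0] * x[2],
             omega[0] * x[1] - omega[1] * x[0]]
    c, s = math.cos(theta), math.sin(theta)
    # Perpendicular part rotates in-plane; parallel part omega * (omega . x) is unchanged.
    return [xi * c + ci * s + wi * dot * (1.0 - c)
            for xi, ci, wi in zip(x, cross, omega)]


z_axis = [0.0, 0.0, 1.0]
rotated = rodrigues_rotate([1.0, 0.0, 0.0], z_axis, math.pi / 2)   # x-axis goes to y-axis
on_axis = rodrigues_rotate([0.0, 0.0, 2.0], z_axis, 1.234)         # points on the axis stay put
```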

Page 10:

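Exponential map

[Figure: x rotated by angle θ about the axis ω, with components x∥ and x⊥]

$\omega \in \mathbb{R}^3$, $\|\omega\| = 1$

$R = \exp(\hat{v})$, where $v = \omega\theta$

$= I + \hat{v} + \frac{1}{2!}\hat{v}^2 + \ldots$

[standard Taylor series expansion of exp(x) at x = 0 as $1 + x + \frac{1}{2!}x^2 + \ldots$]

Implication: we can approximate the change in position due to a small rotation as $v \times x$, where $v = \omega\theta$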
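One way to connect this series to the axis-angle formula is sketched below, assuming $\hat{\omega}$ denotes the cross product matrix of the unit axis $\omega$. For a unit axis, $\hat{\omega}^3 = -\hat{\omega}$ (and hence $\hat{\omega}^4 = -\hat{\omega}^2$), so the odd and even powers collapse into the sine and cosine series:

$$
\begin{aligned}
\exp(\theta\hat{\omega}) &= I + \theta\hat{\omega} + \frac{\theta^2}{2!}\hat{\omega}^2 + \frac{\theta^3}{3!}\hat{\omega}^3 + \frac{\theta^4}{4!}\hat{\omega}^4 + \ldots \\
&= I + \left(\theta - \frac{\theta^3}{3!} + \ldots\right)\hat{\omega} + \left(\frac{\theta^2}{2!} - \frac{\theta^4}{4!} + \ldots\right)\hat{\omega}^2 \\
&= I + \hat{\omega}\sin\theta + \hat{\omega}^2 (1 - \cos\theta)
\end{aligned}
$$

Keeping only the first-order term gives $Rx \approx x + \theta\hat{\omega}x = x + v \times x$, which is the small-rotation approximation.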

Page 11:

Agenda

• Rotations

• Camera calibration

• Homography

• Ransac

Page 12:

Perspective projection

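[Figure: pinhole camera with center of projection (COP); the scene point (X, Y, Z) projects to the image point (x, y, 1); axes x, y, z form a right-handed coordinate system]

$x = \frac{f}{Z} X$

$y = \frac{f}{Z} Y$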

Page 13:

Perspective projection revisited

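Given (X,Y,Z) and f, compute (x,y) and lambda:

$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$

$\lambda x = fX$

$\lambda = Z$

$x = \frac{\lambda x}{\lambda} = \frac{fX}{Z}$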
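The computation can be sketched in a few lines of Python. The function name project_point is hypothetical, and the point is assumed to have nonzero depth Z.

```python
def project_point(point, f):
    """Perspective projection: returns (x, y, lam) for a 3D point (X, Y, Z)."""
    X, Y, Z = point
    lam = Z                      # third row of diag(f, f, 1) applied to (X, Y, Z)
    if lam == 0:
        raise ValueError("point lies in the plane Z = 0 and has no finite projection")
    return f * X / lam, f * Y / lam, lam


x, y, lam = project_point((2.0, 4.0, 8.0), f=2.0)
```

Here x = 2·2/8 and y = 2·4/8, with λ equal to the depth of the point.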

Page 14:

Special case: f = 1

[Figure: COP, scene point (X, Y, Z), and image point (x, y, 1)]

$Z \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$

Natural geometric intuition:
• 3D point is obtained by scaling ray pointed at image coordinate
• Scale factor = true depth of point

[Aside: given an image with a focal length ‘f’, resize by ‘1/f’ to obtain unit-focal-length image]

Page 15:

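Homogeneous notation

$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \sim \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$

$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \equiv \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$

For now, think of above as shorthand notation for

$\exists \lambda$ s.t. $\lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$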

Page 16:

Camera projection

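$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

From left to right, the factors are:
• Camera intrinsic matrix K (can include skew & non-square pixel size)
• Camera extrinsics (rotation and translation)
• 3D point in world coordinates

[Figure: camera and world coordinate frame with axes r1, r2, r3, related by the translation T]

Aside: homogeneous notation is shorthand for $x = \frac{\lambda x}{\lambda}$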

Page 17:

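Fancier intrinsics

$x_s = s_x x$, $y_s = s_y y$   (non-square pixels)

$x' = x_s + o_x$, $y' = y_s + o_y$   (shifted origin)

$x'' = x' + s_\theta y'$   (skewed image axes)

$K = \begin{bmatrix} s_x & s_\theta & o_x \\ 0 & s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix}$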

Page 18:

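Notation

$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

$= K_{3 \times 3} \begin{bmatrix} R_{3 \times 3} & T_{3 \times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

$= M_{3 \times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$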

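Claims (without proof):
1. A 3x4 matrix ‘M’ can be a camera matrix iff its left 3x3 block is nonsingular, i.e., the determinant of that block is not zero (a determinant is not defined for the non-square M itself)
2. M is determined only up to a scale factor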

[Using Matlab’s rows x columns]
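To make the notation concrete, here is a minimal Python sketch that assembles M = K [R T] and projects a world point. The numeric values of K, R and T are made up for illustration, and the helper names matmul, camera_matrix and project are hypothetical. The last lines check the second claim by showing that a rescaled M produces the same pixel.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def camera_matrix(K, R, T):
    """Assemble the 3x4 matrix M = K [R | T]."""
    RT = [list(R[i]) + [T[i]] for i in range(3)]
    return matmul(K, RT)


def project(M, P):
    """Project the 3D point P with M and divide out lambda."""
    Ph = list(P) + [1.0]
    u, v, lam = (sum(m * p for m, p in zip(row, Ph)) for row in M)
    return u / lam, v / lam


# Illustrative values only (not taken from the slides).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [0.0, 0.0, 5.0]

M = camera_matrix(K, R, T)
pixel = project(M, (1.0, -1.0, 5.0))
pixel_scaled = project([[3.0 * m for m in row] for row in M], (1.0, -1.0, 5.0))
```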

Page 19:

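Notation (more)

$M_{3 \times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} A_{3 \times 3} & b_{3 \times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = A_{3 \times 3} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b_{3 \times 1}$

$M = \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix}$, $A = \begin{bmatrix} a_1^T \\ a_2^T \\ a_3^T \end{bmatrix}$, $b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$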

Page 20:

Applying the projection matrix

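$\lambda = \begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3$

$x = \frac{1}{\lambda} \left( \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1 \right)$

$y = \frac{1}{\lambda} \left( \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 \right)$

Set of 3D points that project to x = 0: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1 = 0$

Set of 3D points that project to y = 0: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 = 0$

Set of 3D points that project to x = inf or y = inf: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3 = 0$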

Page 21:

Rows of the projection matrix describe the 3 planes defined by the image coordinate system

[Figure: image plane with axes x and y, the COP, and the vectors a1, a2, a3]

Page 22:

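Other geometric properties

[Figure: image point (x, y), scene point (X, Y, Z), and the COP]

What’s the set of (X,Y,Z) points that project to the same (x,y)?

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \lambda w + \tilde{b}$ where $w = A^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$, $\tilde{b} = -A^{-1} b$

What’s the position of COP / pinhole?

$A \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b = 0 \Rightarrow \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = -A^{-1} b$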
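A small Python sketch of the COP computation, assuming M is stored as a list of three rows and split as M = [A b]. Rather than forming the inverse of A explicitly, it solves A C = −b with Cramer's rule; the helper names det3, solve3 and center_of_projection are hypothetical and the matrix values are illustrative.

```python
def det3(A):
    """Determinant of a 3x3 matrix stored as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(A, rhs):
    """Solve A x = rhs for a nonsingular 3x3 matrix using Cramer's rule."""
    d = det3(A)
    if d == 0:
        raise ValueError("A is singular")
    solution = []
    for k in range(3):
        # Replace column k of A by the right-hand side.
        Ak = [[rhs[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(det3(Ak) / d)
    return solution


def center_of_projection(M):
    """COP of M = [A b]: the point with A C + b = 0."""
    A = [row[:3] for row in M]
    b = [row[3] for row in M]
    return solve3(A, [-v for v in b])


# Illustrative camera matrix (not taken from the slides).
M = [[500.0, 0.0, 320.0, 1600.0],
     [0.0, 500.0, 240.0, 1200.0],
     [0.0, 0.0, 1.0, 5.0]]
cop = center_of_projection(M)
```

For this M the COP sits 5 units behind the world origin along Z, and substituting it back gives A C + b = 0 in every row.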

Page 23:

Affine Cameras

• Example: Weak-perspective projection model
• Projection defined by 8 parameters
• Parallel lines are projected to parallel lines
• The transformation can be written as a direct linear transformation

Image coordinates (x,y) are an affine function of world coordinates (X,Y,Z)

$m_3^T = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}$

$x = \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1$

$y = \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2$

Affine transformations = linear transformations plus an offset

Page 24:

Geometric Transformations

Euclidean (trans + rot) preserves lengths + angles

Euclidean

Affine

Projective

Affine: preserves parallel lines

Projective: preserves lines

Page 25:

Agenda

• Rotations

• Camera calibration

• Homography

• Ransac

Page 26:

Calibration: Recover M from scene points P1,..,PN and the corresponding projections in the image plane p1,..,pN

Find M that minimizes the distance between the actual points in the image, pi, and their predicted projections MPi

Problems:
• The projection is (in general) non-linear
• M is defined up to an arbitrary scale factor

Page 27:

PnP = Perspective n-Point

Page 28:

The math for the calibration procedure follows a recipe that is used in many (most?) problems involving camera geometry, so it’s worth remembering:

Write relation between image point, projection matrix, and point in space:

$p_i \equiv M P_i$

Write non-linear relations between coordinates:

$u_i = \frac{m_1^T P_i}{m_3^T P_i}$, $v_i = \frac{m_2^T P_i}{m_3^T P_i}$

Make them linear:

$m_1^T P_i - (m_3^T P_i) u_i = 0$

$m_2^T P_i - (m_3^T P_i) v_i = 0$

Page 29:

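Write them in matrix form:

$\begin{bmatrix} P_i^T & 0 & -u_i P_i^T \\ 0 & P_i^T & -v_i P_i^T \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Put all the relations for all the points into a single matrix:

$\begin{bmatrix} P_1^T & 0 & -u_1 P_1^T \\ 0 & P_1^T & -v_1 P_1^T \\ \vdots & \vdots & \vdots \\ P_N^T & 0 & -u_N P_N^T \\ 0 & P_N^T & -v_N P_N^T \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}$

In noise-free case: $Lm = 0$ (vector of 0’s)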
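The two rows contributed by each point can be sketched in Python as follows. The sketch assumes m stacks the rows of M (m1, m2, m3) into a 12-vector and uses a made-up camera matrix to generate noise-free observations; dlt_rows and project are hypothetical helper names. It only builds L and checks Lm = 0, it does not solve for m.

```python
def dlt_rows(P, u, v):
    """Two rows of L for the 3D point P and its observed image point (u, v)."""
    Ph = list(P) + [1.0]                       # homogeneous P_i (4-vector)
    zero = [0.0] * 4
    row_u = Ph + zero + [-u * p for p in Ph]   # [P^T  0  -u P^T]
    row_v = zero + Ph + [-v * p for p in Ph]   # [0  P^T  -v P^T]
    return [row_u, row_v]


def project(M, P):
    """Noise-free projection used here only to synthesize observations."""
    Ph = list(P) + [1.0]
    m1P, m2P, m3P = (sum(a * b for a, b in zip(row, Ph)) for row in M)
    return m1P / m3P, m2P / m3P


# Illustrative camera matrix and scene points (not taken from the slides).
M = [[500.0, 0.0, 320.0, 1600.0],
     [0.0, 500.0, 240.0, 1200.0],
     [0.0, 0.0, 1.0, 5.0]]
points = [(1.0, -1.0, 5.0), (0.0, 2.0, 3.0), (-2.0, 1.0, 7.0)]

L = []
for P in points:
    u, v = project(M, P)
    L.extend(dlt_rows(P, u, v))

m = [entry for row in M for entry in row]      # m = (m1, m2, m3) stacked
residual = [sum(l * x for l, x in zip(row, m)) for row in L]
```

Each point adds two rows, so N points give a 2N x 12 matrix L, and the residual Lm is zero up to floating-point rounding when the observations are noise-free.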

Page 30:

What about noisy case?

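$\min_{\|m\|^2 = 1} \|Lm\|^2$

Solution: the right singular vector of L with the smallest singular value (equivalently, the eigenvector of $L^T L$ with the smallest eigenvalue)

Is this the right error to minimize?

If not, what is?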

Page 31:

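[Figure: scene points P1, ..., Pi in the (x, y, z) frame, their observed image points (u1, v1), ..., (ui, vi), and the predicted projection MPi]

Ideal error

$\text{Error}(M) = \sum_i \left( u_i - \frac{m_1 \cdot P_i}{m_3 \cdot P_i} \right)^2 + \left( v_i - \frac{m_2 \cdot P_i}{m_3 \cdot P_i} \right)^2$

Initialize nonlinear optimization with “algebraic” solution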
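The ideal (reprojection) error can be written as a short Python function. This is a sketch that assumes the error is summed over all points; the function name reprojection_error and the numeric values are made up for illustration.

```python
def reprojection_error(M, points, observations):
    """Sum of squared distances between observed (u, v) and predicted projections."""
    total = 0.0
    for P, (u, v) in zip(points, observations):
        Ph = list(P) + [1.0]
        m1P, m2P, m3P = (sum(a * b for a, b in zip(row, Ph)) for row in M)
        total += (u - m1P / m3P) ** 2 + (v - m2P / m3P) ** 2
    return total


# Illustrative values (not taken from the slides).
M = [[500.0, 0.0, 320.0, 1600.0],
     [0.0, 500.0, 240.0, 1200.0],
     [0.0, 0.0, 1.0, 5.0]]
points = [(1.0, -1.0, 5.0), (0.0, 2.0, 3.0)]
exact = [(370.0, 190.0), (320.0, 365.0)]   # noise-free projections of the points
noisy = [(371.0, 190.0), (320.0, 363.0)]   # off by 1 pixel in u and 2 pixels in v

err_exact = reprojection_error(M, points, exact)
err_noisy = reprojection_error(M, points, noisy)
```

Unlike the algebraic error, this quantity is measured in squared pixels, so the noisy case evaluates to 1² + 2².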

Page 32:

Radial Lens Distortions

Page 33:

Radial Lens Distortions

No Distortion Barrel Distortion Pincushion Distortion

Page 34:
Page 35:

Correcting Radial Lens Distortions

Before After

http://www.grasshopperonline.com/barrel_distortion_correction_software.html

Page 36:

Overall approach

Minimize reprojection error: Error(M, k’s)

Initialize with algebraic solution (approaches in literature based on various assumptions)

Page 37:

Revisiting homographies

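Place world coordinate frame on object plane

$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}$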

Page 38:

Projection of planar points

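$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}$

$= \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$

$= \begin{bmatrix} f r_{11} & f r_{12} & f t_x \\ f r_{21} & f r_{22} & f t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$

Convert between 2D location on object plane and image coordinate with a 3x3 matrix H
(Above holds for any intrinsic matrix K)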

Page 39:

Two-views of a plane

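Image correspondences

$\lambda_1 \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = H_1 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$

$\lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H_2 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$

$\begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$

[Aside: H usually invertible]

[LHS and RHS are related by a scale factor]

$\begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H_2 H_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$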

Page 40:

Computing homography projections

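$\begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$

Given (x1,y1) and H, how do we compute (x2,y2)?

$x_2 = \frac{\lambda x_2}{\lambda} = \frac{a x_1 + b y_1 + c}{g x_1 + h y_1 + i}$

Is this operation linear in H or (x1,y1)?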
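A minimal Python sketch of this computation, with H stored as a list of three rows. The function name apply_homography and the example matrix are hypothetical. The last lines show that rescaling H leaves the result unchanged, which is one way to see that the mapping is not linear in H or in (x1, y1) once the division by λ is included.

```python
def apply_homography(H, x1, y1):
    """Map (x1, y1) through the 3x3 homography H and divide out lambda."""
    (a, b, c), (d, e, f), (g, h, i) = H
    lam = g * x1 + h * y1 + i
    if lam == 0:
        raise ValueError("point maps to infinity")
    return (a * x1 + b * y1 + c) / lam, (d * x1 + e * y1 + f) / lam


# Illustrative homography (not taken from the slides).
H = [[2.0, 0.0, 1.0],
     [0.0, 2.0, -1.0],
     [0.5, 0.0, 1.0]]
p2 = apply_homography(H, 1.0, 0.0)
p2_scaled = apply_homography([[5.0 * v for v in row] for row in H], 1.0, 0.0)
```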

Page 41:

Estimating homographies

Image correspondences

Given corresponding 2D points in left and right image, estimate H

How many corresponding points needed? How many degrees of freedom in H?

$x_2 (g x_1 + h y_1 + i) = a x_1 + b y_1 + c$

...

Homogeneous linear system: $A H(:) = \begin{bmatrix} 0 \\ 0 \\ \vdots \end{bmatrix}$

Page 42:

Estimating homographies

Image correspondences

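Given corresponding 2D points in left and right image, estimate H

H is determined only up to scale factor (8 DOFs)
Need 4 points minimum. How to handle more points?

$A H(:) = \begin{bmatrix} 0 \\ 0 \\ \vdots \end{bmatrix}$

$\min_{\|H(:)\|^2 = 1} \|A H(:)\|^2$

Solution: the right singular vector of A with the smallest singular value (equivalently, the eigenvector of $A^T A$ with the smallest eigenvalue)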
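The rows of A can be sketched in Python as below. The sketch assumes the unknowns are ordered row by row as (a, b, c, d, e, f, g, h, i), which differs from Matlab's column-major H(:), and it only builds A and checks that a known H satisfies the system; the singular vector computation itself is not shown. The names homography_rows and the example H are hypothetical.

```python
def homography_rows(p1, p2):
    """Two rows of A from one correspondence p1 = (x1, y1) -> p2 = (x2, y2)."""
    x1, y1 = p1
    x2, y2 = p2
    # From x2 (g x1 + h y1 + i) = a x1 + b y1 + c, and the same for y2 with d, e, f.
    return [[x1, y1, 1.0, 0.0, 0.0, 0.0, -x2 * x1, -x2 * y1, -x2],
            [0.0, 0.0, 0.0, x1, y1, 1.0, -y2 * x1, -y2 * y1, -y2]]


# Illustrative homography used to synthesize exact correspondences.
H = [[2.0, 0.0, 1.0],
     [0.0, 2.0, -1.0],
     [0.5, 0.0, 1.0]]
h_vec = [v for row in H for v in row]           # row-major ordering (assumption)

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
A = []
for x1, y1 in corners:
    lam = H[2][0] * x1 + H[2][1] * y1 + H[2][2]
    x2 = (H[0][0] * x1 + H[0][1] * y1 + H[0][2]) / lam
    y2 = (H[1][0] * x1 + H[1][1] * y1 + H[1][2]) / lam
    A.extend(homography_rows((x1, y1), (x2, y2)))

residual = [sum(a * h for a, h in zip(row, h_vec)) for row in A]
```

Four correspondences give an 8 x 9 matrix A, matching the 8 degrees of freedom; additional correspondences simply append more rows.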

Page 43:

“Frontalizing” planes using homographies

Estimate homography on (at least) 4 pairs of corresponding points (e.g., corners of quad/rect)

Apply homography on all (x,y) coordinates inside target rectangle to compute source pixel location

Page 44:

“Frontalizing” planes using homographies

Page 45:

Special case of 2 views: rotations about camera center

[Excerpt: Lecture 4, Planar Scenes and Homography, p. 5]

...cues (parallax) can only be recovered when T is nonzero. Looking at the homography equation, the limit of H as d approaches infinity is R. Thus any pair of images of an arbitrary scene captured by a purely rotating camera is related by a planar homography.

A planar panorama can be constructed by capturing many overlapping images at different rotations, picking an image to be a reference, and then finding corresponding points between the overlapping images. The pairwise homographies are derived from the corresponding points, forming a mosaic that typically is shaped like a “bow-tie,” as images farther away from the reference are warped outward to fit the homography. The figure below is from Pollefeys and Hartley & Zisserman.

4.7. Second Derivation of Homography Constraint

The homography constraint, element by element, in homogeneous coordinates is as follows:

$\begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} \iff x_2 \sim H x_1$

In inhomogeneous coordinates ($x_2' = x_2 / z_2$ and $y_2' = y_2 / z_2$),

Can be modeled as planar transformations, regardless of scene geometry!

[Excerpt from the homework handout]

(a) incline L.jpg (img1) (b) incline R.jpg (img2) (c) img2 warped to img1’s frame

Figure 5: Example output for Q6.1: Original images img1 and img2 (left and center) and img2 warped to fit img1 (right). Notice that the warped image clips out of the image. We will fix this in Q6.2.

H2to1 = computeH(p1, p2)

Inputs: p1 and p2 should be 2×N matrices of corresponding (x, y)^T coordinates between two images.
Outputs: H2to1 should be a 3×3 matrix encoding the homography that best matches the linear equation derived above for Equation 8 (in the least squares sense). Hint: Remember that a homography is only determined up to scale. The Matlab functions eig() or svd() will be useful. Note that this function can be written without an explicit for-loop over the data points.

6 Stitching it together: Panoramas (30 pts)

We can also use homographies to create a panorama image from multiple views of the same scene. This is possible for example when there is no camera translation between the views (e.g., only rotation about the camera center), as we saw in Q4.2.

First, you will generate panoramas using matched point correspondences between images using the BRIEF matching you implemented in Q2.4. We will assume that there is no error in your matched point correspondences between images (although there might be some errors).

In the next section you will extend the technique to use (potentially noisy) keypoint matches.

You will need to use the provided function warp_im = warpH(im, H, out_size), which warps image im using the homography transform H. The pixels in warp_im are sampled at coordinates in the rectangle (1, 1) to (out_size(2), out_size(1)). The coordinates of the pixels in the source image are taken to be (1, 1) to (size(im,2), size(im,1)) and transformed according to H.

• Q6.1 (15pts) In this problem you will implement and use the function (stub provided in matlab/imageStitching.m):

[panoImg] = imageStitching(img1, img2, H2to1)

on two images from the Duquesne Incline. This function accepts two images and the output from the homography estimation function. This function will:

Figure 6: Final panorama view, with homography estimated with RANSAC.

• a folder matlab containing all the .m and .mat files you were asked to write and generate

• a pdf named writeup.pdf containing the results, explanations and images asked for in the assignment along with the answers to the questions on homographies.

Submit all the code needed to make your panorama generator run. Make sure all the .m files that need to run are accessible from the matlab folder without any editing of the path variable. If you downloaded and used a feature detector for the extra credit, include the code with your submission and mention it in your writeup. You may leave the data folder in your submission, but it is not needed. Please zip your homework as usual and submit it using blackboard.

Appendix: Image Blending

Note: This section is not for credit and is for informational purposes only.

For overlapping pixels, it is common to blend the values of both images. You can simply average the values but that will leave a seam at the edges of the overlapping images. Alternatively, you can obtain a blending value for each image that fades one image into the other. To do this, first create a mask like this for each image you wish to blend:

mask = zeros(size(im,1), size(im,2));
% mark the image border pixels
mask(1,:) = 1; mask(end,:) = 1; mask(:,1) = 1; mask(:,end) = 1;
% distance of every pixel to the nearest border pixel
mask = bwdist(mask, 'city');
% normalize so the largest weight is 1
mask = mask/max(mask(:));

The function bwdist computes the distance transform of the binarized input image, so this mask will be zero at the borders and 1 at the center of the image. You can warp this mask just as you warped your images. How would you use the mask weights to compute a linear combination of the pixels in the overlap region? Your function should behave well where one or both of the blending constants are zero.

Page 46:

Derivation


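$\begin{bmatrix} X_2 \\ Y_2 \\ Z_2 \end{bmatrix} = R \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}$

$\lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = K_2 \begin{bmatrix} X_2 \\ Y_2 \\ Z_2 \end{bmatrix}$, where $K_2 = \begin{bmatrix} f_2 & 0 & 0 \\ 0 & f_2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

$\begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = K_2 R K_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$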
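The intermediate step can be spelled out as follows. This is a sketch that assumes the first view obeys the same projection model with its own intrinsics $K_1$ and scale $\lambda_1$, an equation not shown explicitly on the slide:

$$
\begin{aligned}
\lambda_1 \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = K_1 \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}
&\;\Rightarrow\; \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix} = \lambda_1 K_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} \\
\lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = K_2 R \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}
&\;\Rightarrow\; \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \frac{\lambda_1}{\lambda_2} K_2 R K_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}
\end{aligned}
$$

The depth of the scene point enters only through the ratio $\lambda_1 / \lambda_2$, which homogeneous equality ignores, so the same $H = K_2 R K_1^{-1}$ applies to every pixel.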

Page 47:

Take-home points for homographies

• If camera rotates about its center, then the images are related by a homography irrespective of scene depth.

• If the scene is planar, then images from any two cameras are related by a homography.

• Homography mapping is a 3x3 matrix with 8 degrees of freedom.

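$\begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$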

Page 48:

Matching features

What do we do about the “bad” matches?

Page 49:

General problem: we are trying to fit a (geometric) model to noisy data

How about we choose the average vector (least-squares soln)? Why will/won’t this work?

Page 50:

Let’s generalize the problem a bit

Estimate best model (a line) that fits data $\{x_i, y_i\}$

$\min_{w,b} \sum_i (y_i - f_{w,b}(x_i))^2$

$f_{w,b}(x_i) = w x_i + b$

[Figure: data points in the x-y plane]
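For a line, the least-squares minimizer has a closed form, sketched here in Python. The function name fit_line and the data are made up for illustration; the second fit shows how a single gross outlier drags the estimate away from the true line y = 2x + 1.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, mean_y - w * mean_x


xs = [0.0, 1.0, 2.0, 3.0]
w_clean, b_clean = fit_line(xs, [1.0, 3.0, 5.0, 7.0])     # points exactly on y = 2x + 1
w_outlier, b_outlier = fit_line(xs, [1.0, 3.0, 5.0, 27.0])  # last point is a gross outlier
```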

Page 51:

Let’s generalize the problem a bit

“Least-squares” solution

[Figure: least-squares line fit to the data in the x-y plane]

Page 52:

RANSAC Line Fitting Example

Sample two points

Page 53:

RANSAC Line Fitting Example

Fit Line

Page 54:

RANSAC Line Fitting Example

Total number of points within a threshold of line.

Page 55:

RANSAC Line Fitting Example

Repeat, until get a good result

Page 56:

RANSAC Line Fitting Example

Repeat, until get a good result

Page 57:

RANSAC Line Fitting Example

Repeat, until get a good result

Page 58:

RAndom SAmple Consensus

Select one match, count inliers

Page 59:

RAndom SAmple Consensus

Select one match, count inliers

Page 60:

Least squares fit

Find “average” translation vector for the largest group of inliers

Page 61:

RANSAC for estimating transformation

RANSAC loop:
1. Select feature pairs (at random)
2. Compute transformation T (exact)
3. Compute inliers (point matches where $\|p_i' - T p_i\|^2 < \varepsilon$)
4. Keep largest set of inliers

5. Re-compute least-squares estimate of transformation T using all of the inliers
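The five steps can be sketched in Python for the simpler line-fitting model used in the earlier example, rather than for a homography. This is an illustrative sketch under stated assumptions: the "transformation" is a line y = w x + b fit exactly from two sampled points, the inlier test uses the squared vertical residual, and the names ransac_line and fit_line, the threshold, the iteration count and the seed are all hypothetical choices.

```python
import random


def fit_line(points):
    """Least-squares fit of y = w x + b to a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    w = sxy / sxx
    return w, mean_y - w * mean_x


def ransac_line(points, threshold, iterations, seed=0):
    """RANSAC for a line; returns the refit (w, b) and the largest inlier set."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)      # 1. random minimal sample
        if x1 == x2:
            continue                                    # vertical pair: no y = w x + b model
        w = (y2 - y1) / (x2 - x1)                       # 2. exact model from the sample
        b = y1 - w * x1
        inliers = [(x, y) for x, y in points
                   if (y - (w * x + b)) ** 2 < threshold]   # 3. squared residual test
        if len(inliers) > len(best_inliers):            # 4. keep the largest inlier set
            best_inliers = inliers
    return fit_line(best_inliers), best_inliers         # 5. least-squares refit on inliers


# Ten points on y = 2x + 1 plus two gross outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(2.0, 30.0), (7.0, -20.0)]
(w, b), inliers = ransac_line(data, threshold=0.01, iterations=50)
```

For a homography the structure is the same, but each sample needs four correspondences instead of two points, and the refit in step 5 uses all inlier correspondences.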

Page 62:

RANSAC for estimating transformation

RANSAC loop:
1. Select feature pairs (at random)
2. Compute transformation T (exact)
3. Compute inliers (point matches where $\|p_i' - T p_i\|^2 < \varepsilon$)
4. Keep largest set of inliers

5. Re-compute least-squares estimate of transformation T using all of the inliers

Recall homography estimation: how do we estimate with all inlier points?

$Ah = 0$, $A \in \mathbb{R}^{8 \times 9}$, $h \in \mathbb{R}^9$, $0 \in \mathbb{R}^8$


Page 64:

RANSAC for alignment

Page 65:

RANSAC for alignment

Page 66:

RANSAC for alignment

Page 67:

Planar object recognition

(What transformation is used? How many pairs must be selected in the initial step?)