visual recognition: image formationrurtasun/courses/visualrecognition/lecture2.pdfraquel urtasun...
TRANSCRIPT
![Page 1: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/1.jpg)
Visual Recognition: Image Formation
Raquel Urtasun
TTI Chicago
Jan 5, 2012
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 1 / 61
![Page 2: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/2.jpg)
Today’s lecture ...
Fundamentals of image formation
You should know about this already...
... so we will go fast on it
Read about it if you are not familiar
This will be almost all the geometry we will see in this class
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 2 / 61
![Page 3: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/3.jpg)
Material
Chapter 2 of Rich Szeliski book
Available online here
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 3 / 61
![Page 4: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/4.jpg)
How is an image created?
The image formation process that produced a particular image depends on
lighting conditions
scene geometry,
surface properties
camera optics
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 4 / 61
![Page 5: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/5.jpg)
Geometric primitives and transformations
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 5 / 61
![Page 6: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/6.jpg)
What are we going to see?
Basic 2D and 3D primitives:
points
lines
planes
How 3D features are projected into 2D features
See [Hartley and Zisserman] book for more details
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 6 / 61
![Page 7: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/7.jpg)
2D primitives: 2D points
2D points, e.g., pixel coordinate in an image, can be defined asp = (x , y) ∈ <2
p =
[xy
]
2D points can also be represented using homogeneous coordinatesp = (x , y , w) ∈ P2, with P2 = <3 − (0, 0, 0), the perspective 2D space.
In homogeneous coordinates vectors that differ only by scale areequivalent.
A homogeneous vector can be converted into an inhomogeneous one bydividing through by the last element
p = (x ; y ; w) = w(x ; y ; 1) = w p
with p an augmented vector p = (x , y , 1).
Homogeneous points whose last element is 0 are called ideal points orpoints at infinity and do not have an inhomogeneous representation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 7 / 61
![Page 8: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/8.jpg)
2D primitives: 2D points
2D points, e.g., pixel coordinate in an image, can be defined asp = (x , y) ∈ <2
p =
[xy
]
2D points can also be represented using homogeneous coordinatesp = (x , y , w) ∈ P2, with P2 = <3 − (0, 0, 0), the perspective 2D space.
In homogeneous coordinates vectors that differ only by scale areequivalent.
A homogeneous vector can be converted into an inhomogeneous one bydividing through by the last element
p = (x ; y ; w) = w(x ; y ; 1) = w p
with p an augmented vector p = (x , y , 1).
Homogeneous points whose last element is 0 are called ideal points orpoints at infinity and do not have an inhomogeneous representation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 7 / 61
![Page 9: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/9.jpg)
2D primitives: 2D points
2D points, e.g., pixel coordinate in an image, can be defined asp = (x , y) ∈ <2
p =
[xy
]
2D points can also be represented using homogeneous coordinatesp = (x , y , w) ∈ P2, with P2 = <3 − (0, 0, 0), the perspective 2D space.
In homogeneous coordinates vectors that differ only by scale areequivalent.
A homogeneous vector can be converted into an inhomogeneous one bydividing through by the last element
p = (x ; y ; w) = w(x ; y ; 1) = w p
with p an augmented vector p = (x , y , 1).
Homogeneous points whose last element is 0 are called ideal points orpoints at infinity and do not have an inhomogeneous representation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 7 / 61
![Page 10: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/10.jpg)
2D primitives: 2D points
2D points, e.g., pixel coordinate in an image, can be defined asp = (x , y) ∈ <2
p =
[xy
]
2D points can also be represented using homogeneous coordinatesp = (x , y , w) ∈ P2, with P2 = <3 − (0, 0, 0), the perspective 2D space.
In homogeneous coordinates vectors that differ only by scale areequivalent.
A homogeneous vector can be converted into an inhomogeneous one bydividing through by the last element
p = (x ; y ; w) = w(x ; y ; 1) = w p
with p an augmented vector p = (x , y , 1).
Homogeneous points whose last element is 0 are called ideal points orpoints at infinity and do not have an inhomogeneous representation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 7 / 61
![Page 11: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/11.jpg)
2D primitives: 2D points
2D points, e.g., pixel coordinate in an image, can be defined asp = (x , y) ∈ <2
p =
[xy
]
2D points can also be represented using homogeneous coordinatesp = (x , y , w) ∈ P2, with P2 = <3 − (0, 0, 0), the perspective 2D space.
In homogeneous coordinates vectors that differ only by scale areequivalent.
A homogeneous vector can be converted into an inhomogeneous one bydividing through by the last element
p = (x ; y ; w) = w(x ; y ; 1) = w p
with p an augmented vector p = (x , y , 1).
Homogeneous points whose last element is 0 are called ideal points orpoints at infinity and do not have an inhomogeneous representation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 7 / 61
![Page 12: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/12.jpg)
2D primitives: 2D lines
2D lines l = (a, b, c) can be represented in homogeneous coordinates
p · l = ax + by + c = 0
If we normalize such that l = (nx , ny , d) = (n, d) with ||n|| = 1, then n isthe normal, perpendicular to the line and d is its distance to the origin.
Exception is the line at infinity, i.e., l = (0, 0, 1), which includes all (ideal)points at infinity.
We can also express in polar coordinates, i.e., l = (d , θ) withn = (cos θ, sin θ).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 8 / 61
![Page 13: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/13.jpg)
2D primitives: 2D lines
2D lines l = (a, b, c) can be represented in homogeneous coordinates
p · l = ax + by + c = 0
If we normalize such that l = (nx , ny , d) = (n, d) with ||n|| = 1, then n isthe normal, perpendicular to the line and d is its distance to the origin.
Exception is the line at infinity, i.e., l = (0, 0, 1), which includes all (ideal)points at infinity.
We can also express in polar coordinates, i.e., l = (d , θ) withn = (cos θ, sin θ).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 8 / 61
![Page 14: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/14.jpg)
2D primitives: 2D lines
2D lines l = (a, b, c) can be represented in homogeneous coordinates
p · l = ax + by + c = 0
If we normalize such that l = (nx , ny , d) = (n, d) with ||n|| = 1, then n isthe normal, perpendicular to the line and d is its distance to the origin.
Exception is the line at infinity, i.e., l = (0, 0, 1), which includes all (ideal)points at infinity.
We can also express in polar coordinates, i.e., l = (d , θ) withn = (cos θ, sin θ).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 8 / 61
![Page 15: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/15.jpg)
2D primitives: 2D lines
2D lines l = (a, b, c) can be represented in homogeneous coordinates
p · l = ax + by + c = 0
If we normalize such that l = (nx , ny , d) = (n, d) with ||n|| = 1, then n isthe normal, perpendicular to the line and d is its distance to the origin.
Exception is the line at infinity, i.e., l = (0, 0, 1), which includes all (ideal)points at infinity.
We can also express in polar coordinates, i.e., l = (d , θ) withn = (cos θ, sin θ).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 8 / 61
![Page 16: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/16.jpg)
2D lines and 2D points
When using homogeneous coordinates ...
We can compute the intersection of two lines
p = l1 × l2
with × the cross product.
The cross product a× b is defined as a vector c that is perpendicular toboth a and b, with a direction given by the right-hand rule and a magnitudeequal to the area of the parallelogram that the vectors span.
c = |a| · |b| · sin θ · n
The line joining two points can be expressed as
l = p1 × p2
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61
![Page 17: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/17.jpg)
2D lines and 2D points
When using homogeneous coordinates ...
We can compute the intersection of two lines
p = l1 × l2
with × the cross product.
The cross product a× b is defined as a vector c that is perpendicular toboth a and b, with a direction given by the right-hand rule and a magnitudeequal to the area of the parallelogram that the vectors span.
c = |a| · |b| · sin θ · n
The line joining two points can be expressed as
l = p1 × p2
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61
![Page 18: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/18.jpg)
2D lines and 2D points
When using homogeneous coordinates ...
We can compute the intersection of two lines
p = l1 × l2
with × the cross product.
The cross product a× b is defined as a vector c that is perpendicular toboth a and b, with a direction given by the right-hand rule and a magnitudeequal to the area of the parallelogram that the vectors span.
c = |a| · |b| · sin θ · n
The line joining two points can be expressed as
l = p1 × p2
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61
![Page 19: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/19.jpg)
2D primitives: 2D conics
2D conic is a curve obtained by intersecting a cone (i.e., a right circularconical surface) with a plane and can be written using a quadric equation
pQp = 0
Q expresses the type of quadric.
Useful to represent human body or basic primitives and in geometry andcamera calibration
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 10 / 61
![Page 20: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/20.jpg)
2D primitives: 2D conics
2D conic is a curve obtained by intersecting a cone (i.e., a right circularconical surface) with a plane and can be written using a quadric equation
pQp = 0
Q expresses the type of quadric.
Useful to represent human body or basic primitives and in geometry andcamera calibration
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 10 / 61
![Page 21: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/21.jpg)
2D primitives: 2D conics
2D conic is a curve obtained by intersecting a cone (i.e., a right circularconical surface) with a plane and can be written using a quadric equation
pQp = 0
Q expresses the type of quadric.
Useful to represent human body or basic primitives and in geometry andcamera calibration
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 10 / 61
![Page 22: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/22.jpg)
3D primitives: 3D points
Point coordinates in 3D can be written using inhomogeneous coordinatesp = (x , y , z) ∈ <3.
And also in homogeneous coordinates p = (x , y , z , w) ∈ P3.
It is useful to denote a 3D point using the augmented vector p = (x , y , z , 1)with p = w x.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 11 / 61
![Page 23: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/23.jpg)
3D primitives: 3D points
Point coordinates in 3D can be written using inhomogeneous coordinatesp = (x , y , z) ∈ <3.
And also in homogeneous coordinates p = (x , y , z , w) ∈ P3.
It is useful to denote a 3D point using the augmented vector p = (x , y , z , 1)with p = w x.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 11 / 61
![Page 24: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/24.jpg)
3D primitives: 3D points
Point coordinates in 3D can be written using inhomogeneous coordinatesp = (x , y , z) ∈ <3.
And also in homogeneous coordinates p = (x , y , z , w) ∈ P3.
It is useful to denote a 3D point using the augmented vector p = (x , y , z , 1)with p = w x.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 11 / 61
![Page 25: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/25.jpg)
3D primitives: 3D plane
In homogeneous coordinates, m = (a, b, c , d), the plane is
p · m = ax + by + cz + d = 0
If we normalize such that m = (nx , ny , nz , d) = (n, d) with ||n|| = 1, then nis the normal, perpendicular to the plane, and d is its distance to the origin.
The plane at infinity m = (0, 0, 0, 1), containing all the points at infinity,cannot be normalized.
We can express in spherical coordinates n = (cos θ, cosφ, sin θ cosφ, sinφ)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 12 / 61
![Page 26: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/26.jpg)
3D primitives: 3D plane
In homogeneous coordinates, m = (a, b, c , d), the plane is
p · m = ax + by + cz + d = 0
If we normalize such that m = (nx , ny , nz , d) = (n, d) with ||n|| = 1, then nis the normal, perpendicular to the plane, and d is its distance to the origin.
The plane at infinity m = (0, 0, 0, 1), containing all the points at infinity,cannot be normalized.
We can express in spherical coordinates n = (cos θ, cosφ, sin θ cosφ, sinφ)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 12 / 61
![Page 27: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/27.jpg)
3D primitives: 3D plane
In homogeneous coordinates, m = (a, b, c , d), the plane is
p · m = ax + by + cz + d = 0
If we normalize such that m = (nx , ny , nz , d) = (n, d) with ||n|| = 1, then nis the normal, perpendicular to the plane, and d is its distance to the origin.
The plane at infinity m = (0, 0, 0, 1), containing all the points at infinity,cannot be normalized.
We can express in spherical coordinates n = (cos θ, cosφ, sin θ cosφ, sinφ)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 12 / 61
![Page 28: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/28.jpg)
3D primitives: 3D plane
In homogeneous coordinates, m = (a, b, c , d), the plane is
p · m = ax + by + cz + d = 0
If we normalize such that m = (nx , ny , nz , d) = (n, d) with ||n|| = 1, then nis the normal, perpendicular to the plane, and d is its distance to the origin.
The plane at infinity m = (0, 0, 0, 1), containing all the points at infinity,cannot be normalized.
We can express in spherical coordinates n = (cos θ, cosφ, sin θ cosφ, sinφ)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 12 / 61
![Page 29: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/29.jpg)
3D primitives: 3D line
One possible representation is to use two points on the line, (p,q), then anypoint can be expressed as a linear combination of these two points
r = (1− λ)p + λq
If 0 ≤ λ ≤ 1, then we get the line segment joining p and q.
In homogeneous coordinates,
r = µp + λq
When a point is at infinity, i.e., q = (dx , dy , dz , 0) = (d, 0), then r = p + λd
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 13 / 61
![Page 30: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/30.jpg)
3D primitives: 3D line
One possible representation is to use two points on the line, (p,q), then anypoint can be expressed as a linear combination of these two points
r = (1− λ)p + λq
If 0 ≤ λ ≤ 1, then we get the line segment joining p and q.
In homogeneous coordinates,
r = µp + λq
When a point is at infinity, i.e., q = (dx , dy , dz , 0) = (d, 0), then r = p + λd
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 13 / 61
![Page 31: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/31.jpg)
3D primitives: 3D line
One possible representation is to use two points on the line, (p,q), then anypoint can be expressed as a linear combination of these two points
r = (1− λ)p + λq
If 0 ≤ λ ≤ 1, then we get the line segment joining p and q.
In homogeneous coordinates,
r = µp + λq
When a point is at infinity, i.e., q = (dx , dy , dz , 0) = (d, 0), then r = p + λd
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 13 / 61
![Page 32: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/32.jpg)
3D primitives: 3D line
One possible representation is to use two points on the line, (p,q), then anypoint can be expressed as a linear combination of these two points
r = (1− λ)p + λq
If 0 ≤ λ ≤ 1, then we get the line segment joining p and q.
In homogeneous coordinates,
r = µp + λq
When a point is at infinity, i.e., q = (dx , dy , dz , 0) = (d, 0), then r = p + λd
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 13 / 61
![Page 33: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/33.jpg)
3D primitives: 3D conics
Can be written using a quadric equation
pQp = 0
Q expresses the type of quadric.
Useful to represent human body in 3D or basic primitives
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 14 / 61
![Page 34: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/34.jpg)
2D Transformations
Translation: can be written as p′ = p + t, or
p′ =[
I t]
p
with I the 2× 2 identity matrix, or
p′ =
[I t
0T 1
]p
where 0 is the zero vector
Which representation is more useful?
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 15 / 61
![Page 35: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/35.jpg)
2D Transformations
Translation: can be written as p′ = p + t, or
p′ =[
I t]
p
with I the 2× 2 identity matrix, or
p′ =
[I t
0T 1
]p
where 0 is the zero vector
Which representation is more useful?
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 15 / 61
![Page 36: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/36.jpg)
2D Transformations
2D Rigid Body Motion: can be written as p′ = Rp + t, or
p′ =[
R t]
p
with
R =
[cos θ − sin θsin θ cos θ
]is an orthonormal rotation matrix RRT = I, and |R| = 1
Can also be written in homogeneous coordinates.
Also called 2D Euclidean transformation
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 16 / 61
![Page 37: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/37.jpg)
2D Transformations
Similarity transform: is p′ = sRp + t, with s an scale factor.
Also written as
p′ =[sR t
]p =
[a −b txb a ty
]p
where we no longer require that a2 + b2 = 1.
Preserves angles between lines.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 17 / 61
![Page 38: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/38.jpg)
2D Transformations
Affine is p′ = Ap, with A an arbitrary 2× 3 matrix, i.e.,
p′ =
[a00 a01 a02a10 a11 a12
]p
Parallel lines remain parallel under affine transformations.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 18 / 61
![Page 39: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/39.jpg)
2D Transformations
Projective operates on homogeneous coordinates
p′ = Hp
with H an arbitrary 3× 3 matrix.
Also known as perspective transform or homography.
H is homogeneous, i.e., it is only defined up to a scale.
Two H matrices that differ only by scale are equivalent.
Perspective transformations preserve straight lines.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 19 / 61
![Page 40: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/40.jpg)
Hierarchy of Transformations
A set of (potentially restricted) 3× 3 matrices operating on 2Dhomogeneous coordinate vectors.
They form a nested set of groups, i.e., they are closed under compositionand have an inverse that is a member of the same group.
Each (simpler) group is a subset of the more complex group below it.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 20 / 61
![Page 41: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/41.jpg)
Hierarchy of 2D Transformations
They can be applied in series
Other transformations exist, e.g., stretch/squash
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 21 / 61
![Page 42: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/42.jpg)
Hierarchy of 3D Transformations
Same as the 2D hierarchy.
Check the book chapter for the exact definition.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 22 / 61
![Page 43: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/43.jpg)
Representing Rotations
Representing 2D rotations in Euler angles is not a problem.
However, it is a problem in 3D.
Alternative representations: axis angles, quaternions.
Let’s see some of this representations.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 23 / 61
![Page 44: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/44.jpg)
Euler angles: definition
The most popular parameterization of orientation space.
A general rotation is described as a sequence of rotations about threemutually orthogonal coordinate axes fixed in the space.
The rotations are applied to the space and not to the axis.
Figure: Principal rotation matrices: Rotation along the x-axis. [Souce:Watt 95]
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 24 / 61
![Page 45: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/45.jpg)
Euler angles: definition
The most popular parameterization of orientation space.
A general rotation is described as a sequence of rotations about threemutually orthogonal coordinate axes fixed in the space.
The rotations are applied to the space and not to the axis.
Figure: Principal rotation matrices: Rotation along the y-axis. [Souce:Watt 95]
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 24 / 61
![Page 46: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/46.jpg)
Euler angles: definition
The most popular parameterization of orientation space.
A general rotation is described as a sequence of rotations about threemutually orthogonal coordinate axes fixed in the space.
The rotations are applied to the space and not to the axis.
Figure: Principal rotation matrices: Rotation along the z-axis. [Souce:Watt 95]
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 24 / 61
![Page 47: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/47.jpg)
Euler angles: composition
General rotations can be done by composing rotations over these axis.
For example, let’s create a rotation matrix R(θx , θy , θz) in terms of the jointangles θx , θy , θz .
R(θx , θy , θz) = Rx · Ry · Rz =
cycz cy sz −sy 0
sxsycz − cxsz sxsy sz + cxcz sxcy 0cxsycz + sxsz cxsy sz − sxcz cxcy 0
0 0 0 1
with si = sin(θi ), and ci = cos(θi ).
Matrix multiplication is not conmutative, the order is important
Rx · Ry · Rz 6= Rz · Ry · Rx
Rotations are assumed to be relative to fixed world axes, rather than local tothe object.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 25 / 61
![Page 48: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/48.jpg)
Euler angles: composition
General rotations can be done by composing rotations over these axis.
For example, let’s create a rotation matrix R(θx , θy , θz) in terms of the jointangles θx , θy , θz .
R(θx , θy , θz) = Rx · Ry · Rz =
cycz cy sz −sy 0
sxsycz − cxsz sxsy sz + cxcz sxcy 0cxsycz + sxsz cxsy sz − sxcz cxcy 0
0 0 0 1
with si = sin(θi ), and ci = cos(θi ).
Matrix multiplication is not conmutative, the order is important
Rx · Ry · Rz 6= Rz · Ry · Rx
Rotations are assumed to be relative to fixed world axes, rather than local tothe object.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 25 / 61
![Page 49: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/49.jpg)
Euler angles: composition
General rotations can be done by composing rotations over these axis.
For example, let’s create a rotation matrix R(θx , θy , θz) in terms of the jointangles θx , θy , θz .
R(θx , θy , θz) = Rx · Ry · Rz =
cycz cy sz −sy 0
sxsycz − cxsz sxsy sz + cxcz sxcy 0cxsycz + sxsz cxsy sz − sxcz cxcy 0
0 0 0 1
with si = sin(θi ), and ci = cos(θi ).
Matrix multiplication is not conmutative, the order is important
Rx · Ry · Rz 6= Rz · Ry · Rx
Rotations are assumed to be relative to fixed world axes, rather than local tothe object.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 25 / 61
![Page 50: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/50.jpg)
Euler angles: composition
General rotations can be done by composing rotations over these axis.
For example, let’s create a rotation matrix R(θx , θy , θz) in terms of the jointangles θx , θy , θz .
R(θx , θy , θz) = Rx · Ry · Rz =
cycz cy sz −sy 0
sxsycz − cxsz sxsy sz + cxcz sxcy 0cxsycz + sxsz cxsy sz − sxcz cxcy 0
0 0 0 1
with si = sin(θi ), and ci = cos(θi ).
Matrix multiplication is not conmutative, the order is important
Rx · Ry · Rz 6= Rz · Ry · Rx
Rotations are assumed to be relative to fixed world axes, rather than local tothe object.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 25 / 61
![Page 51: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/51.jpg)
Euler angles: drawbacks I
Gimbal lock: This results when two axes effectively line up, resulting in atemporary loss of a degree of freedom.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 26 / 61
![Page 52: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/52.jpg)
Euler angles: drawbacks I
Gimbal lock: This results when two axes effectively line up, resulting in atemporary loss of a degree of freedom. This is a singularity in theparameterization. θ1 and θ3 become associated with the same DOF.
R(θ1,π
2, θ3) =
0 0 −1 0
sin(θ1 − θ3) cos(θ1 − θ3) 0 0cos(θ1 − θ3) sin(θ1 − θ3) 0 0
0 0 0 1
Figure: Singular locations of the Euler angles parametrization (at β = ±π/2)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 26 / 61
![Page 53: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/53.jpg)
Euler angles: drawbacks II
The parameterization is non-linear.The parameterization is modular R(θ) = R(θ + 2πn), with n ∈ Z .
The parameterization is not unique
∃[θ4, θ5, θ6] such that R(θ1, θ2, θ3) = R(θ4, θ5, θ6)
with θi 6= θ3+i for all i ∈ {1, 2, 3}.
Figure: Example of two routes for the animation of the block letter R [Watt]
.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 27 / 61
![Page 54: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/54.jpg)
Euler angles: drawbacks II
The parameterization is non-linear.The parameterization is modular R(θ) = R(θ + 2πn), with n ∈ Z .The parameterization is not unique
∃[θ4, θ5, θ6] such that R(θ1, θ2, θ3) = R(θ4, θ5, θ6)
with θi 6= θ3+i for all i ∈ {1, 2, 3}.
Figure: Example of two routes for the animation of the block letter R [Watt]
.Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 27 / 61
![Page 55: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/55.jpg)
Euler angles: drawbacks II
The parameterization is non-linear.The parameterization is modular R(θ) = R(θ + 2πn), with n ∈ Z .The parameterization is not unique
∃[θ4, θ5, θ6] such that R(θ1, θ2, θ3) = R(θ4, θ5, θ6)
with θi 6= θ3+i for all i ∈ {1, 2, 3}.
Figure: Example of two routes for the animation of the block letter R [Watt]
.Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 27 / 61
![Page 56: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/56.jpg)
Axis Angles
A rotation can be represented by a rotation axis n, and an angle θ.
Or equivalently by a 3D vector w = θn.
It’s a minimal representation.
It has a singularity... when does it happen?
Figure: Rotation around an axis n by an angle θ
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 28 / 61
![Page 57: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/57.jpg)
Axis Angles
A rotation can be represented by a rotation axis n, and an angle θ.
Or equivalently by a 3D vector w = θn.
It’s a minimal representation.
It has a singularity... when does it happen?
Figure: Rotation around an axis n by an angle θ
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 28 / 61
![Page 58: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/58.jpg)
Axis Angles
A rotation can be represented by a rotation axis n, and an angle θ.
Or equivalently by a 3D vector w = θn.
It’s a minimal representation.
It has a singularity... when does it happen?
Figure: Rotation around an axis n by an angle θ
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 28 / 61
![Page 59: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/59.jpg)
Axis Angles
A rotation can be represented by a rotation axis n, and an angle θ.
Or equivalently by a 3D vector w = θn.
It’s a minimal representation.
It has a singularity... when does it happen?
Figure: Rotation around an axis n by an angle θ
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 28 / 61
![Page 60: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/60.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 61: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/61.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 62: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/62.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 63: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/63.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 64: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/64.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 65: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/65.jpg)
Quaternions
Quaternions were invented by W.R.Hamilton in 1843.
A quaternion has 4 components
q = [qw , qx , qy , qz ]T
They are extensions of complex numbers a + ib to a 3D imaginary space, ijk.
q = qw + qx i + qy j + qzk
With the additional properties
i2 = j2 = ijk = −1
i = jk = −kj, j = ki = −ik, k = ij = −ji
To represent rotations, only unit length quaternions are used
||q||2 =√q2w + q2x + q2y + q2z
This forms the surface of a 4D hypersphere of radius 1.Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 29 / 61
![Page 66: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/66.jpg)
Quaternions
Figure: Unit quaternions live on the unit sphere ||q|| = 1. Smooth trajectorythrough 3 quaternions. The antipodal point to q2, namely −q2, represents thesame rotation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 30 / 61
![Page 67: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/67.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 68: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/68.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 69: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/69.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 70: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/70.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 71: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/71.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 72: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/72.jpg)
Quaternions
Quaternions form a group whose underlying set is the four dimensionalvector space R4, with a multiplication operator ◦ that combines both thedot product and cross product of vectors.
The identity rotation is encoded as q = [1, 0, 0, 0]T .
The quaternion q = [qw , qx , qy , qz ]T encodes a rotation of θ = 2 cos−1(qw )along the unit axis v = [qx , qy , qz ].
Also a quaternion can represent a rotation by an angle θ around the v axis as
q = [cosθ
2, sin
θ
2v]
with v = v|v| .
If v is unit length, then q will also be.
Proof: simple exercise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 31 / 61
![Page 73: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/73.jpg)
Quaternion to rotational matrix
To convert a quaternion q = [qw , qx , qy , qz ] to a rotational matrixsimply compute
1− 2q2y − 2q2z 2qxqy + 2qwqz 2qxqz − 2qwqy 02qxqy − 2qwqz 1− 2q2x − 2q2z 2qyqz + 2qwqx 02qxqz + 2qwqy 2qyqz − 2qwqx 1− 2q2x − 2q2y 0
0 0 0 1
A matrix can also easily be converted to quaternion. See referencesfor the exact algorithm.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 32 / 61
![Page 74: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/74.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 75: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/75.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 76: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/76.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 77: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/77.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 78: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/78.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 79: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/79.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 80: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/80.jpg)
Interpretation of quaternions
Any incremental movement along one of the orthogonal axes in curved spacecorresponds to an incremental rotation along an axis in real space (distancesalong the hypersphere correspond to angles in 3D space).
Moving in some arbitrary direction corresponds to rotating around somearbitrary axis.
If you move too far in one direction, you come back to where you started(corresponding to rotating 360 degrees around any one axis).
A distance of x along the surface of the hypersphere corresponds to arotation of angle 2x radians.
This means that moving along a 90 degree arc on the hyperspherecorresponds to rotating an object by 180 degrees.
Traveling 180 degrees corresponds to a 360 degree rotation, thus getting youback to where you started.
This implies that q and -q correspond to the same orientation.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 33 / 61
![Page 81: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/81.jpg)
Quaternion operations: dot product
The dot product of quaternions is simple their vector dot product
p · q = pwqw + pxqx + pyqy + pzqz = |p||q| cosφ
The angle between two quaternions in 4D space is half the angle onewould need to rotate from one orientation to the other in 3D space.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 34 / 61
![Page 82: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/82.jpg)
Quaternion operations: multiplication
Multiplication on quaternions can be done by expanding them intocomplex numbers
pq =< s · t − v ·wT , sw + tv + v ×w >
where p = [s, v]T , and q = [t,w].
If p represents a rotation and q represents a rotation, then pqrepresents p rotated by q.
Note that two unit quaternions multiplied together will result inanother unit quaternion.
Quaternions extend the planar rotations of complex numbers to 3Drotations in space.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 35 / 61
![Page 83: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/83.jpg)
Quaternion operations: multiplication
Multiplication on quaternions can be done by expanding them intocomplex numbers
pq =< s · t − v ·wT , sw + tv + v ×w >
where p = [s, v]T , and q = [t,w].
If p represents a rotation and q represents a rotation, then pqrepresents p rotated by q.
Note that two unit quaternions multiplied together will result inanother unit quaternion.
Quaternions extend the planar rotations of complex numbers to 3Drotations in space.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 35 / 61
![Page 84: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/84.jpg)
Quaternion operations: multiplication
Multiplication on quaternions can be done by expanding them intocomplex numbers
pq =< s · t − v ·wT , sw + tv + v ×w >
where p = [s, v]T , and q = [t,w].
If p represents a rotation and q represents a rotation, then pqrepresents p rotated by q.
Note that two unit quaternions multiplied together will result inanother unit quaternion.
Quaternions extend the planar rotations of complex numbers to 3Drotations in space.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 35 / 61
![Page 85: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/85.jpg)
Quaternion operations: multiplication
Multiplication on quaternions can be done by expanding them intocomplex numbers
pq =< s · t − v ·wT , sw + tv + v ×w >
where p = [s, v]T , and q = [t,w].
If p represents a rotation and q represents a rotation, then pqrepresents p rotated by q.
Note that two unit quaternions multiplied together will result inanother unit quaternion.
Quaternions extend the planar rotations of complex numbers to 3Drotations in space.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 35 / 61
![Page 86: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/86.jpg)
Quaternion operations: others
Inverse of a quaternion q = [s, v]T
q−1 =1
|q|2[s,−v]T
Any multiple of a quaternion gives the same rotation because theeffects of the magnitude are divided out.
Very good for interpolation, Slerp.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 36 / 61
![Page 87: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/87.jpg)
Redundancy of the parameterizations
The parameterizations that we have seen:
Rotational matrix: 9 DOF. It has 6 extra DOF.Axis angles: 3 DOF for the scaled version and 4 DOF for thenon-scaled. The latter has one extra DOF.Quaternions: 4 DOF, 1 extra DOF.
From the Euler theorem we know that an arbitrary rotation can bedescribed with only 3 DOF, so those parameters extra are redundant.
For rotational matrix, we can impose additional constraints if thematrix represent a rigid transform.
|a| = |b| = |c | = 1
a = b × c, b = c × a, c = a× b
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 37 / 61
![Page 88: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/88.jpg)
Redundancy of the parameterizations
The parameterizations that we have seen:
Rotational matrix: 9 DOF. It has 6 extra DOF.Axis angles: 3 DOF for the scaled version and 4 DOF for thenon-scaled. The latter has one extra DOF.Quaternions: 4 DOF, 1 extra DOF.
From the Euler theorem we know that an arbitrary rotation can bedescribed with only 3 DOF, so those parameters extra are redundant.
For rotational matrix, we can impose additional constraints if thematrix represent a rigid transform.
|a| = |b| = |c | = 1
a = b × c, b = c × a, c = a× b
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 37 / 61
![Page 89: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/89.jpg)
Redundancy of the parameterizations
The parameterizations that we have seen:
Rotational matrix: 9 DOF. It has 6 extra DOF.Axis angles: 3 DOF for the scaled version and 4 DOF for thenon-scaled. The latter has one extra DOF.Quaternions: 4 DOF, 1 extra DOF.
From the Euler theorem we know that an arbitrary rotation can bedescribed with only 3 DOF, so those parameters extra are redundant.
For rotational matrix, we can impose additional constraints if thematrix represent a rigid transform.
|a| = |b| = |c | = 1
a = b × c, b = c × a, c = a× b
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 37 / 61
![Page 90: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/90.jpg)
Which rotation representation is better?
The axis/angle representation is minimal, and hence does not require anyadditional constraints.
If the angle is expressed in degrees, it is easier to understand the pose, andeasier to express exact rotations and derivatives are easy to compute.
Quaternions do not have discontinuities. It is also easier to interpolatebetween rotations and to chain rigid transformations.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 38 / 61
![Page 91: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/91.jpg)
Which rotation representation is better?
The axis/angle representation is minimal, and hence does not require anyadditional constraints.
If the angle is expressed in degrees, it is easier to understand the pose, andeasier to express exact rotations and derivatives are easy to compute.
Quaternions do not have discontinuities. It is also easier to interpolatebetween rotations and to chain rigid transformations.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 38 / 61
![Page 92: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/92.jpg)
Which rotation representation is better?
The axis/angle representation is minimal, and hence does not require anyadditional constraints.
If the angle is expressed in degrees, it is easier to understand the pose, andeasier to express exact rotations and derivatives are easy to compute.
Quaternions do not have discontinuities. It is also easier to interpolatebetween rotations and to chain rigid transformations.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 38 / 61
![Page 93: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/93.jpg)
3D to 2D projections
How are 3D primitives projected onto the image plane?
We can do this using a linear 3D to 2D projection matrix
Different types. The most commonly used:
OrthographyPerspective
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 39 / 61
![Page 94: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/94.jpg)
Orthographic Projection
An orthographic projection simply drops the z component of p = (x , y , z) toobtain the 2D point q
q =[
I | 0]
p
Using homogeneous coordinates
q =
1 0 0 00 1 0 00 0 0 1
p
Is an approximate model for long focal length lenses and objects whosedepth is shallow relative to their distance to the camera.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 40 / 61
![Page 95: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/95.jpg)
Orthographic Projection
An orthographic projection simply drops the z component of p = (x , y , z) toobtain the 2D point q
q =[
I | 0]
p
Using homogeneous coordinates
q =
1 0 0 00 1 0 00 0 0 1
p
Is an approximate model for long focal length lenses and objects whosedepth is shallow relative to their distance to the camera.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 40 / 61
![Page 96: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/96.jpg)
Orthographic Projection
An orthographic projection simply drops the z component of p = (x , y , z) toobtain the 2D point q
q =[
I | 0]
p
Using homogeneous coordinates
q =
1 0 0 00 1 0 00 0 0 1
p
Is an approximate model for long focal length lenses and objects whosedepth is shallow relative to their distance to the camera.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 40 / 61
![Page 97: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/97.jpg)
Orthographic Projection
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 41 / 61
![Page 98: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/98.jpg)
Scaled Orthographic Projection
Scaled orthography is actually more commonly used to fit to the image
q =[sI | 0
]p
This model is equivalent to first projecting the world points onto a localfronto-parallel image plane and then scaling this image using regularperspective projection.
The scaling can be the same for all parts of the scene or it can be differentfor objects that are being modeled independently.
It is a popular model for reconstructing the 3D shape of objects far awayfrom the camera,
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 42 / 61
![Page 99: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/99.jpg)
Scaled Orthographic Projection
Scaled orthography is actually more commonly used to fit to the image
q =[sI | 0
]p
This model is equivalent to first projecting the world points onto a localfronto-parallel image plane and then scaling this image using regularperspective projection.
The scaling can be the same for all parts of the scene or it can be differentfor objects that are being modeled independently.
It is a popular model for reconstructing the 3D shape of objects far awayfrom the camera,
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 42 / 61
![Page 100: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/100.jpg)
Scaled Orthographic Projection
Scaled orthography is actually more commonly used to fit to the image
q =[sI | 0
]p
This model is equivalent to first projecting the world points onto a localfronto-parallel image plane and then scaling this image using regularperspective projection.
The scaling can be the same for all parts of the scene or it can be differentfor objects that are being modeled independently.
It is a popular model for reconstructing the 3D shape of objects far awayfrom the camera,
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 42 / 61
![Page 101: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/101.jpg)
Scaled Orthographic Projection
Scaled orthography is actually more commonly used to fit to the image
q =[sI | 0
]p
This model is equivalent to first projecting the world points onto a localfronto-parallel image plane and then scaling this image using regularperspective projection.
The scaling can be the same for all parts of the scene or it can be differentfor objects that are being modeled independently.
It is a popular model for reconstructing the 3D shape of objects far awayfrom the camera,
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 42 / 61
![Page 102: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/102.jpg)
Scaled Orthographic Projection
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 43 / 61
![Page 103: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/103.jpg)
Perspective Projection
Points are projected onto the image plane by dividing them by their zcoordinate, i.e.,
q = Pz(p) =
x/zy/z
1
In homogeneous coordinates, the projection has a simple linear form,
q =
1 0 0 00 1 0 00 0 1 0
p
We have dropped the w component, thus it is not possible to recover thedistance of the 3D point from the image.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 44 / 61
![Page 104: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/104.jpg)
Perspective Projection
Points are projected onto the image plane by dividing them by their zcoordinate, i.e.,
q = Pz(p) =
x/zy/z
1
In homogeneous coordinates, the projection has a simple linear form,
q =
1 0 0 00 1 0 00 0 1 0
p
We have dropped the w component, thus it is not possible to recover thedistance of the 3D point from the image.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 44 / 61
![Page 105: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/105.jpg)
Perspective Projection
Points are projected onto the image plane by dividing them by their zcoordinate, i.e.,
q = Pz(p) =
x/zy/z
1
In homogeneous coordinates, the projection has a simple linear form,
q =
1 0 0 00 1 0 00 0 1 0
p
We have dropped the w component, thus it is not possible to recover thedistance of the 3D point from the image.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 44 / 61
![Page 106: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/106.jpg)
Perspective Projection
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 45 / 61
![Page 107: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/107.jpg)
Perspective Projection
[Source: S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 46 / 61
![Page 108: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/108.jpg)
Camera Intrinsics
Once we projected the 3D point through an ideal pinhole using a projectionmatrix, we must still transform the result according to the pixel sensorspacing and the relative position of the sensor plane to the origin.
Image sensors return pixel values indexed by integer pixel coord (xs , ys).
To map pixels to 3D coordinates, we first scale the (xs , ys) by the pixelspacings (sx , sy ).
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 47 / 61
![Page 109: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/109.jpg)
Camera Intrinsics
Once we projected the 3D point through an ideal pinhole using a projectionmatrix, we must still transform the result according to the pixel sensorspacing and the relative position of the sensor plane to the origin.
Image sensors return pixel values indexed by integer pixel coord (xs , ys).
To map pixels to 3D coordinates, we first scale the (xs , ys) by the pixelspacings (sx , sy ).
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 47 / 61
![Page 110: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/110.jpg)
Camera Intrinsics
Once we projected the 3D point through an ideal pinhole using a projectionmatrix, we must still transform the result according to the pixel sensorspacing and the relative position of the sensor plane to the origin.
Image sensors return pixel values indexed by integer pixel coord (xs , ys).
To map pixels to 3D coordinates, we first scale the (xs , ys) by the pixelspacings (sx , sy ).
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 47 / 61
![Page 111: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/111.jpg)
Camera Intrinsics
Once we projected the 3D point through an ideal pinhole using a projectionmatrix, we must still transform the result according to the pixel sensorspacing and the relative position of the sensor plane to the origin.
Image sensors return pixel values indexed by integer pixel coord (xs , ys).
To map pixels to 3D coordinates, we first scale the (xs , ys) by the pixelspacings (sx , sy ).
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 47 / 61
![Page 112: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/112.jpg)
Camera Intrinsics
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
The combined 2D to 3D projection can then be written as
p =[
Rs | cs] sx 0 0
0 sy 00 0 1
xsys1
= Ms ps
The matrix Ms has 8 unknowns: 3 rotation, 3 translation, 2 scaling.
We have ignored the possible skewing between the axes.
In general only 7 dof, since the distance of the sensor from the origin cannotbe teased apart from the sensor spacing
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 48 / 61
![Page 113: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/113.jpg)
Camera Intrinsics
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
The combined 2D to 3D projection can then be written as
p =[
Rs | cs] sx 0 0
0 sy 00 0 1
xsys1
= Ms ps
The matrix Ms has 8 unknowns: 3 rotation, 3 translation, 2 scaling.
We have ignored the possible skewing between the axes.
In general only 7 dof, since the distance of the sensor from the origin cannotbe teased apart from the sensor spacing
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 48 / 61
![Page 114: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/114.jpg)
Camera Intrinsics
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
The combined 2D to 3D projection can then be written as
p =[
Rs | cs] sx 0 0
0 sy 00 0 1
xsys1
= Ms ps
The matrix Ms has 8 unknowns: 3 rotation, 3 translation, 2 scaling.
We have ignored the possible skewing between the axes.
In general only 7 dof, since the distance of the sensor from the origin cannotbe teased apart from the sensor spacing
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 48 / 61
![Page 115: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/115.jpg)
Camera Intrinsics
We then describe the orientation of the sensor relative to the cameraprojection center Oc with an origin cs and a 3D rotation Rs .
The combined 2D to 3D projection can then be written as
p =[
Rs | cs] sx 0 0
0 sy 00 0 1
xsys1
= Ms ps
The matrix Ms has 8 unknowns: 3 rotation, 3 translation, 2 scaling.
We have ignored the possible skewing between the axes.
In general only 7 dof, since the distance of the sensor from the origin cannotbe teased apart from the sensor spacing
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 48 / 61
![Page 116: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/116.jpg)
Camera Calibration
We know some 3D points and we want to obtain the camera intrinsics andextrinsics.
The relationship between the 3D pixel center p and the 3D camera-centeredpoint pc is given by an unknown scaling s, i.e., p = spc .
We can therefore write the complete projection as
ps = αM−1s pc = Kpc
with K the calibration matrix that describes the camera intrinsicparameters.
Combining with the external parameters we have
ps = K[
R | t]
pw = Ppw
with P = K[R|t] the camera matrix.
Problem of identification, so K is assumed upper-triangular when estimatedfrom 3D measurements.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 49 / 61
![Page 117: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/117.jpg)
Camera Calibration
We know some 3D points and we want to obtain the camera intrinsics andextrinsics.
The relationship between the 3D pixel center p and the 3D camera-centeredpoint pc is given by an unknown scaling s, i.e., p = spc .
x
X? X?
X?
We can therefore write the complete projection as
ps = αM−1s pc = Kpc
with K the calibration matrix that describes the camera intrinsicparameters.
Combining with the external parameters we have
ps = K[
R | t]
pw = Ppw
with P = K[R|t] the camera matrix.
Problem of identification, so K is assumed upper-triangular when estimatedfrom 3D measurements.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 49 / 61
![Page 118: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/118.jpg)
Camera Calibration
We know some 3D points and we want to obtain the camera intrinsics andextrinsics.
The relationship between the 3D pixel center p and the 3D camera-centeredpoint pc is given by an unknown scaling s, i.e., p = spc .
We can therefore write the complete projection as
ps = αM−1s pc = Kpc
with K the calibration matrix that describes the camera intrinsicparameters.
Combining with the external parameters we have
ps = K[
R | t]
pw = Ppw
with P = K[R|t] the camera matrix.
Problem of identification, so K is assumed upper-triangular when estimatedfrom 3D measurements.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 49 / 61
![Page 119: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/119.jpg)
Camera Calibration
We know some 3D points and we want to obtain the camera intrinsics andextrinsics.
The relationship between the 3D pixel center p and the 3D camera-centeredpoint pc is given by an unknown scaling s, i.e., p = spc .
We can therefore write the complete projection as
ps = αM−1s pc = Kpc
with K the calibration matrix that describes the camera intrinsicparameters.
Combining with the external parameters we have
ps = K[
R | t]
pw = Ppw
with P = K[R|t] the camera matrix.
Problem of identification, so K is assumed upper-triangular when estimatedfrom 3D measurements.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 49 / 61
![Page 120: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/120.jpg)
Camera Calibration
We know some 3D points and we want to obtain the camera intrinsics andextrinsics.
The relationship between the 3D pixel center p and the 3D camera-centeredpoint pc is given by an unknown scaling s, i.e., p = spc .
We can therefore write the complete projection as
ps = αM−1s pc = Kpc
with K the calibration matrix that describes the camera intrinsicparameters.
Combining with the external parameters we have
ps = K[
R | t]
pw = Ppw
with P = K[R|t] the camera matrix.
Problem of identification, so K is assumed upper-triangular when estimatedfrom 3D measurements.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 49 / 61
![Page 121: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/121.jpg)
Camera Calibration
Thus it is typically assumed to be
K =
fx s cx0 fy cy0 0 1
pw = Ppw
with typically fx = fy and s = 0.
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 50 / 61
![Page 122: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/122.jpg)
Camera Matrix
We can put intrinsics and extrinsics together in a 3× 4 camera matrix
P = K[R|t]
Or in homogeneous coordinates we have an invertible matrix
P =
[K 00T 1
] [R t0T 1
]= KE
with E a 3D rigid body (Euclidean) transformation.
We can map a 3D point in homogeneous coordinates into the image plane as
ps ∼ Ppw
After multiplication by P, the vector is divided by the third element of thevector to obtain the normalized form ps = (xs , ys , 1, d).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 51 / 61
![Page 123: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/123.jpg)
Camera Matrix
We can put intrinsics and extrinsics together in a 3× 4 camera matrix
P = K[R|t]
Or in homogeneous coordinates we have an invertible matrix
P =
[K 00T 1
] [R t0T 1
]= KE
with E a 3D rigid body (Euclidean) transformation.
We can map a 3D point in homogeneous coordinates into the image plane as
ps ∼ Ppw
After multiplication by P, the vector is divided by the third element of thevector to obtain the normalized form ps = (xs , ys , 1, d).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 51 / 61
![Page 124: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/124.jpg)
Camera Matrix
We can put intrinsics and extrinsics together in a 3× 4 camera matrix
P = K[R|t]
Or in homogeneous coordinates we have an invertible matrix
P =
[K 00T 1
] [R t0T 1
]= KE
with E a 3D rigid body (Euclidean) transformation.
We can map a 3D point in homogeneous coordinates into the image plane as
ps ∼ Ppw
After multiplication by P, the vector is divided by the third element of thevector to obtain the normalized form ps = (xs , ys , 1, d).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 51 / 61
![Page 125: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/125.jpg)
Camera Matrix
We can put intrinsics and extrinsics together in a 3× 4 camera matrix
P = K[R|t]
Or in homogeneous coordinates we have an invertible matrix
P =
[K 00T 1
] [R t0T 1
]= KE
with E a 3D rigid body (Euclidean) transformation.
We can map a 3D point in homogeneous coordinates into the image plane as
ps ∼ Ppw
After multiplication by P, the vector is divided by the third element of thevector to obtain the normalized form ps = (xs , ys , 1, d).
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 51 / 61
![Page 126: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/126.jpg)
Photometric image formation
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 52 / 61
![Page 127: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/127.jpg)
Lighting: Point Source
To produce an image, the scene must be illuminated with one or more lightsources.
A point light source originates at a single location in space.
In addition to its location, a point light source has an intensity and a colorspectrum, i.e., a distribution over wavelengths L(λ).
The intensity of a light source falls off with the square of the distancebetween the source and the object being lit, because the same light is beingspread over a larger (spherical) area.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 53 / 61
![Page 128: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/128.jpg)
Lighting: More complex sources
A more complex light distribution can often be represented using anenvironment map.
This representation maps incident light directions v to color values (orwavelengths λ)
L(v, λ)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 54 / 61
![Page 129: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/129.jpg)
Reflectance and shading
When light hits an objects surface, it is scattered and reflected.
The Bidirectional Reflectance Distribution Function (BRDF) is a 4Dfunction f (θi , φi , θr , φr , λ) that describes how much of each wavelength λarriving at an incident direction vi is emitted in a reflected direction vr .
It is reciprocal, we can exchange vi and vr .
Figure: (a) Light scatters when it hits a surface. (b) The BRDF is parameterizedby the angles that the incident, vi and reflected, vr , light ray directions make withthe surface coordinate frame (dx , dy ,n).
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 55 / 61
![Page 130: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/130.jpg)
BRDF and light
For an isotropic material fr (θi , θr , |φi − φr |, λ) or fr (vi , vr ,n, λ)
The amount of light exiting a surface point p in a direction vr under a givenlighting condition is
Lr (vr , λ) =
∫Li (vi , λ)fr (vi , vr ,n, λ)max(0, cos θi )dvi
If the light sources are a discrete set of point light sources, then the integralis a sum
Lr (vr , λ) =∑i
Li (λ)fr (vi , vr ,n, λ)max(0, cos θi )dvi
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 56 / 61
![Page 131: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/131.jpg)
Diffuse Reflection
Also known as Lambertian or matte reflection.
Scatters light uniformly in all directions and is the phenomenon we mostnormally associate with shading.
In this case the BRDF is constant
fr (vi , vr ,n, λ) = fr (λ)
The amount of light depends on the angle between the incident lightdirection and the surface normal, θi . The shading equation for diffusereflection is then
Ld(vr , λ) =∑i
Li (λ)fd(λ)max(0, vi · n)
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 57 / 61
![Page 132: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/132.jpg)
Specularities
The second major component of the BRDF is specular reflection, whichdepends on the direction of the outgoing light.
Incident light rays are reflected in a direction that is rotated by 180 aroundthe surface normal n.
The amount of light reflected in a given direction vr thus depends on theangle between the view direction vr and the specular direction si .
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 58 / 61
![Page 133: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/133.jpg)
Phong Model
Combines the diffuse and specular components of reflection with anotherterm, which he called the ambient illumination.
This term accounts for the fact that objects are generally illuminated notonly by point light sources but also by a general diffuse illuminationcorresponding to inter-reflection (e.g., the walls in a room) or distantsources such as the sky.
The ambient term does not depend on surface orientation, but depends onthe color of both the ambient illumination and the object
L = ka(λ)La(λ) + kd(λ)∑i
Li (λ)max(0, vi · n) + Ls
There exists more models, we just mentioned the most used ones.
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 59 / 61
![Page 134: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/134.jpg)
Typical Shading
Figure: Diffuse (smooth shading) and specular (shiny highlight) reflection, as wellas darkening in the grooves and creases due to reduced light visibility andinterreflections. Photo from the Caltech Vision Lab
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 60 / 61
![Page 135: Visual Recognition: Image Formationrurtasun/courses/VisualRecognition/lecture2.pdfRaquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 9 / 61. 2D primitives: 2D conics 2D conic is](https://reader035.vdocuments.us/reader035/viewer/2022081607/5ed7d7a28e533201dc4aebb3/html5/thumbnails/135.jpg)
Next class ... some image fundamentals
Raquel Urtasun (TTI-C) Visual Recognition Jan 5, 2012 61 / 61