![Page 1: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/1.jpg)
Video Modeling(Basics: Camera Models and Motion Models)
Two-Dimensional Motion Estimation(Fundamentals & Basic Techniques)
Chapter 5 (parts) and Chapter 6
![Page 2: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/2.jpg)
2
Outline (chapter 5)
• Camera projection• 3D motion • Projection of 3-D motion• 2D motion due to rigid object motion
– Projective mapping
• Approximation of projective mapping– Affine model– Bilinear model
![Page 3: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/3.jpg)
3
Pinhole Camera
Cameracenter
Imageplane
2-Dimage
3-Dpoint
The image of an object is reversed from its 3-D position. The object appears smaller when it is farther away.
![Page 4: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/4.jpg)
4
Pinhole Camera Model:Perspective Projection
ZyxZYFy
ZXFx
ZY
Fy
ZX
Fx
torelatedinversely are ,
, ,
All points in this ray will have the same image
![Page 5: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/5.jpg)
5
Approximate Model:Orthographic Projection
images) cmicroscopi i.e. Z,Ffor is Yy and X(xobject. theof distance the tocompared
small isobject in theation withdepth vari theas long as used beCan const whenF/Z with icorthograph scaled sY,y sX,actually x
, ,)(far very isobject When the
ZZYyXx
Z
![Page 6: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/6.jpg)
6
Rigid Object Motion
zyxzyx TTT ,,:;,,:][;)]([':centerobject theon wrp. translatiandRotation
TRCTCXRX
![Page 7: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/7.jpg)
7
Rigid Object Motion
],,[ zyx TTTT
zyxzyx TTT ,,:;,,:][;)]([':centerobject theon wrp. translatiandRotation
TRCTCXRX
z
y
x
TTT
ZYX
rrrrrrrrr
ZYX
987
654
321
'''
![Page 8: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/8.jpg)
8
Flexible Object Motion
• Two ways to describe– Decompose into multiple, but connected rigid sub-objects– Global motion plus local motion in sub-objects– Ex. Human body consists of many parts each undergo a rigid
motion– Elastic motion can’t easily decomposed– Mesh models
![Page 9: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/9.jpg)
9
3-D Motion -> 2-D Motion
2-D MV
3-D MV
![Page 10: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/10.jpg)
Sample (2D) Motion Field
![Page 11: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/11.jpg)
11
Occlusion Effect
Motion is undefined in occluded regions
![Page 12: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/12.jpg)
12
Typical Camera Motions
![Page 13: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/13.jpg)
13
2-D Motion Corresponding to Camera Motion
Camera zoom Camera rotation around Z-axis (roll)
![Page 14: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/14.jpg)
14
2-D Motion Corresponding to Rigid Object Motion
• General case:
• Projective mapping:(if not planar – planar patches)
FTZFryrxrFTZFryrxr
Fy
FTZFryrxrFTZFryrxrFx
TTT
ZYX
rrrrrrrrr
ZYX
z
y
z
x
z
y
x
)()(
'
)()('
'''
987
654
987
321
Projection ePerspectiv
987
654
321
ycxcybxbby
ycxcyaxaax
21
210
21
210
1',
1'
:c)bYaX(Z object When the planar is surface
![Page 15: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/15.jpg)
15
Projective Mapping
Two features of projective mapping:• Chirping: increasing perceived spatial frequency for far away objects• Converging (Keystone): parallel lines converge in distance
![Page 16: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/16.jpg)
16
Affine and Bilinear Model
• Approximations to Projective Mapping– Affine (6 parameters):
• Good for mapping triangles to triangles
– Bilinear (8 parameters):
• Good for mapping blocks to quadrangles– Polynomial models (biquadratic – 12 parameters)
ybxbbyaxaa
yxdyxd
y
x
210
210
),(),(
xybybxbbxyayaxaa
yxdyxd
y
x
3210
3210
),(),(
![Page 17: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/17.jpg)
17
Motion Field Corresponding toDifferent 2-D Motion Models
Translation
Bilinear Perspective
Affine
![Page 18: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/18.jpg)
Outline (Chapter 6)二维运动估计
• Optical flow equation and ambiguity in motion estimation• General methodologies in motion estimation
– Motion representation– Motion estimation criterion– Optimization methods– Gradient descent methods
• Pixel-based motion estimation• Block-based motion estimation
– EBMA algorithm
![Page 19: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/19.jpg)
19
2-D Motion vs. Optical Flow
On the left, a sphere is rotating under a constant ambient illumination, but the observed image does not change.
On the right, a point light source is rotating around a stationary sphere, causing the highlight point on the sphere to rotate.
• 2-D Motion: Projection of 3-D motion, depending on 3D object motion and projection operator• Optical flow: “Perceived” 2-D motion based on changes in image pattern, also depends on illumination and object surface texture
![Page 20: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/20.jpg)
20
Optical Flow Equation
• When illumination condition is unknown, the best one can do it to estimate optical flow.
• Constant intensity assumption -> Optical flow equation
0or 0or 0
:equation flow optical thehave we two,above theCompare
),,(),,(
:expansion sTaylor' using But,
),,(),,(:"assumptionintensity constant "Under
ttv
yv
xd
td
yd
x
dy
dy
dx
tyxdtdydx
tyxdtdydx
Tyxtyx
tyxtyx
tyx
v
![Page 21: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/21.jpg)
21
Optical Flow Equation
),,( ofector gradient v spatial theis ],[
vector velocity theis )v, ( where
0or 0
equation flow optic isThat
y
tyxyx
vtt
vy
vx
TT
x
Tyx
v
![Page 22: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/22.jpg)
22
Ambiguities in Motion Estimation
• Optical flow equation only constrains the flow vector in the gradient direction
• The flow vector in the tangent direction ( ) is under-determined
• In regions with constant brightness ( ), the flow is indeterminate -> Motion estimation is unreliable in regions with flat texture, more reliable near edges
nv
0
0
tv
vv
n
ttnn
eev
tv
![Page 23: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/23.jpg)
23
Ambiguities in Motion Estimation
nv
0
tv
• The flow vector in the tangent direction ( Vt ) is under-determined (aperture problem)
![Page 24: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/24.jpg)
24
Ambiguities in Motion Estimation
• The flow vector in the tangent direction ( Vt ) is under-determined (aperture problem)
![Page 25: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/25.jpg)
25
General Considerations for Motion Estimation
• Two categories of approaches:– Feature based (more often used in object tracking, 3D
reconstruction from 2D)– Intensity based (based on constant intensity assumption)
(more often used for motion compensated prediction, required in video coding, frame interpolation) -> Our focus
• Three important questions– How to represent the motion field?– What criteria to use to estimate motion parameters?– How to search motion parameters?
![Page 26: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/26.jpg)
26
Motion Representation
Global:Entire motion field is represented by a few global parameters
Pixel-based:One MV at each pixel, with some smoothness constraint between adjacent MVs.
Region-based:Entire frame is divided into regions, each region corresponding to an object or sub-object with consistent motion, represented by a few parameters.
Block-based:Entire frame is divided into blocks, and motion in each block is characterized by a few parameters.
Also mesh-based (flow of corners, approximated inside)
![Page 27: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/27.jpg)
27
Notations
Anchor frame:
Target frame:
Motion parameters:
Motion vector at a pixel in the anchor frame:
Motion field:
Mapping function:
)(1 x
)(2 x
)(xdxaxd ),;(
a
xaxdxaxw ),;();(
TLaaa ],...,,[ 21a
Backward motion estimation
Forward motion estimation
![Page 28: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/28.jpg)
28
Motion Estimation Criterion
• To minimize the displaced frame difference (DFD)
(MSE)Error SquareMean ,min)());(()(
(MAD) Difference AbsoluteMean ,min)());(()(
212
12
MSE-DFD
MAD-DFD
x
x
E
E
xaxdxa
xaxdxa
min)()();()()(
0)(
)()( small is if 0
2
121OF
121
12111
x
pT
T
tttyx
E
dt
ddt
dy
dx
xxaxdxa
d
xx
• To satisfy the optical flow equation
![Page 29: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/29.jpg)
29
Motion Estimation Criterion
10 1 1 1 2 1
12
2
( ( ( )) ( ) ) ( ( ( ) ( )) ( ))
( ) ( )( ) ( )( )
( )( )( )( ) ( )
T
x x
x x x x
y
xx x
v x x y x tv
y tx y y
d x x x x x
Block size: 8*8 16*16
This is homework
![Page 30: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/30.jpg)
30
Motion Estimation Criterion
• Two above methods fail– Constant intensity assumption is not true everywhere (say shadows)– Flat texture: many motions satisfy the assumption (ill-posed)
• Need to impose additional smoothness constraint using regularization technique (Important in pixel- and block-based representation)
– Penalty term for smoothness, measure motion flow differences between adjacent pixels
– use weights to include a priori knowledge of importance of smoothness
min)()(
);();()(
DFD
2
aa
aydaxdax y
ssDFD
Ns
EwEw
Ex
![Page 31: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/31.jpg)
31
Relation Among Different Criteria
• OF criterion is good only if motion is small. • OF criterion can often yield closed-form solution as the objective
function is quadratic in MVs.• When the motion is not small, can iterate the solution based on
the OF criterion to satisfy the DFD criterion.• Bayesian criterion can be reduced to the DFD criterion plus
motion smoothness constraint• More in the textbook
![Page 32: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/32.jpg)
32
Optimization Methods
• Exhaustive search – Typically used for the DFD criterion with p=1 (MAD)– Guarantees reaching the global optimal– Computation required may be unacceptable when number of
parameters to search simultaneously is large!– Fast search algorithms reach sub-optimal solution in shorter time
• Gradient-based search– Typically used for the DFD or OF criterion with p=2 (MSE)
• the gradient can often be calculated analytically• When used with the OF criterion, closed-form solution may be obtained
– Reaches the local optimal point closest to the initial solution• Multi-resolution search
– Search from coarse to fine resolution, faster than exhaustive search– Avoid being trapped into a local minimum
![Page 33: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/33.jpg)
33
Gradient Descent Method
• Iteratively update the current estimate in the direction opposite the gradient direction.
• The solution depends on the initial condition. Reaches the local minimum closest to the initial condition
• Choice of step side:– Fixed step size: Step size must be small to avoid oscillation, requires many iterations– Steepest gradient descent (adjust step size optimally)
Not a good initial
A good initial
Appropriatestepsize
Stepsize too big
![Page 34: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/34.jpg)
34
Newton’s Method
• Newton’s method
– Converges faster than 1st order method (I.e. requires fewer number of iterations to reach convergence)
– Requires more calculation in each iteration– More prone to noise (gradient calculation is subject to noise, more so with 2nd order
than with 1st order)– May not converge if \alpha >=1. Should choose \alpha appropriate to reach a good
compromise between guaranteeing convergence and the convergence rate.
![Page 35: Video Processing Communications Yao Wang Chapter-5-6a](https://reader034.vdocuments.us/reader034/viewer/2022042600/577cce6e1a28ab9e788e0ffa/html5/thumbnails/35.jpg)
35
Newton-Raphson Method
• Newton-Ralphson method – Approximate 2nd order gradient with product of 1st order gradients– Applicable when the objective function is a sum of squared errors– Only needs to calculate 1st order gradients, yet converge at a rate similar to
Newton’s method.