7/23/2019 A Statistical Approach for Object Motion Estimation With MPEG Motion Vectors ([email protected])
2004 IEEE International Conference on Multimedia and Expo (ICME)
A Statistical Approach for Object Motion Estimation with MPEG Motion Vectors
Xiaodong Yu, Ping Xue and Qi Tian
Nanyang Technological University, School of Electrical and Electronic Engineering, Singapore
Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore
{exdyu, epxue}@ntu.edu.sg, [email protected]
Abstract
In this paper we propose a statistical approach to estimate object motion with MPEG motion vectors. A model with two normal distribution terms is applied to represent the simplified object motion. One term models the noise embedded in the motion vector field produced in the encoding stage and the other term models the randomness of the true object motion. Experiments with vehicle motion estimation from MPEG traffic video are used to evaluate the proposed algorithm. The influence of time window, frame size and reference frame distance is investigated. The vehicle speeds can be estimated with high accuracy, up to 85%-92%.
1. Introduction
Object motion estimation is a classic problem in the computer vision field. In recent years, with the popularity of MPEG videos, much research effort has been devoted to estimating object motion with MPEG motion vectors. Although the MPEG motion vector was originally designed to minimize the motion prediction error in coding, it also embeds rich motion information among frames [1]. Since motion vectors are readily available in MPEG streams, we need neither fully decode the compressed video stream nor calculate the optical flow, so a great deal of computation can be saved.
Motion-vector-based object motion estimation is composed of two components: motion segmentation and object tracking. It is assumed that objects are rigid or their parts are rigidly connected to one another and that objects have continuous motion [1]. Thus an object can be segmented from the background by clustering motion vectors according to their similarities in direction or amplitude [2,3,10]. In the next step, motion parameters are derived from the motion vectors associated with this object for tracking. Such algorithms are analogues of those in the optical flow field and they all rely on the success of moving object segmentation. However, the granularity of the motion vector field limits the performance of motion-vector-based object segmentation. To solve this problem, researchers have proposed several approaches. Eng and Ma [5] used unbiased fuzzy clustering to replace the well-known fuzzy c-means clustering. They found that this algorithm was sensitive to the existence of small motion vector clusters and resulted in accurate identification of small objects. Babu and Ramakrishnan [6] accumulated and interpolated motion vectors over a few frames to enrich the motion information. Nevertheless, these approaches are ineffective if the object is too small. For example, wherever two or more objects smaller than a macroblock contribute distinct motions within a macroblock, the encoded motion vector cannot represent the motion correctly [4], hence motion segmentation is infeasible. Furthermore, if an object is similar in size to one or two macroblocks, one or two motion vectors cannot provide sufficient information to distinguish object motion from noisy vectors, so it is still difficult to segment this object from the background. These problems motivate us to seek another way to estimate object motion with motion vectors. We argue that it is possible to extract some useful motion information at the macro level even when the objects are too small for individual object motion to be estimated, provided these objects follow some kind of common motion pattern.
In this paper we propose a statistical model to estimate the mean object motion with MPEG motion vectors under the stationary assumption. Two normal distribution terms are used to model the randomness of the object motion and the noise embedded in the motion vector field, respectively. By applying the statistical analysis within a time window, we alleviate the granularity of the motion vector field at the cost of instantaneous motion information.
The rest of this paper is organized as follows. In Section 2, we formulate our research question and propose a statistical model. Then we present the test bed for the proposed model in Section 3. In Section 4 we present experimental results of the model and of the influential factors discussed in Section 2, using the test bed. Finally, a conclusion and a discussion of future work are given in Section 5.
2. Theoretical analysis
In this paper, we assume that the object motions are homogeneous in both the spatial and temporal domains, and we call this the stationary assumption. This assumption requires that object motions are similar to one another in terms of amplitude or direction and that their motions change slowly.
In light of the difficulties of object segmentation with motion vectors discussed in Section 1, we aim to describe the objects' motion with a few statistical parameters within a short period rather than identifying the instantaneous motion of every single object. The period within which the stationary assumption is satisfied is defined as the time window. We define the displacement that is associated with object motion as the true object motion, the displacement that is not as noise, and the motion derived directly from the raw motion vector field, i.e., the true object motion plus noise, as the observed object motion,
$$X'_i = X_i + n_i, \qquad (1)$$
where $X'_i$, $X_i$ and $n_i$ are variables of the observed object motion, the true object motion and noise respectively, and $i$ denotes the i-th sample, i.e., the i-th motion vector within a time window. The variables in the statistical model can be either amplitudes or directions of motion vectors. They can be extracted either directly from the motion vector field or from the motion vector field after some transformation, e.g., camera calibration, as long as the stationary assumption is still satisfied after such a transformation.
The statistical model for the true object motion is application-tailored. In this paper, we are interested in the amplitude of object motion. Under the stationary assumption, we can expect that within a time window, most true object motions concentrate around a center value, i.e., the mean, and that they are symmetric about the center in a bell shape. This reasoning coincides with the experimental observations (see Figure 2). Combining the reasoning and the experimental observations, we propose to model the true object motion with a normal distribution in this paper. For the model of noise, we employ an additive, zero-mean, constant-variance normal distribution, which has been used to model the noise in optical flow [7]. Thus we have
$$X_i \sim N(\mu, \sigma_x^2), \qquad n_i \sim N(0, \sigma_n^2), \qquad X'_i \sim N(\mu, \sigma_x^2 + \sigma_n^2), \qquad (2)$$
where $\mu$ and $\sigma_x^2$ are the mean and variance of the true object motion, and $\sigma_n^2$ is the variance of the noise. We approximate $\mu$ with the sample mean $\bar{X}'$ of the observed motion from $N$ samples in a time window $T$:
$$\bar{X}' = \frac{1}{N} \sum_{i=1}^{N} X'_i. \qquad (3)$$
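The two-term model and the sample-mean estimate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names and the parameter values (`mu=20.0`, `sigma_x=2.0`, `sigma_n=3.0`) are hypothetical and are not taken from the experiments, and the true motion and the noise are assumed to be independent.

```python
import random


def simulate_observed_motion(mu, sigma_x, sigma_n, n_samples, rng):
    """Draw observed samples X'_i = X_i + n_i under the two-term normal model."""
    observed = []
    for _ in range(n_samples):
        true_motion = rng.gauss(mu, sigma_x)  # X_i ~ N(mu, sigma_x^2)
        noise = rng.gauss(0.0, sigma_n)       # n_i ~ N(0, sigma_n^2), assumed independent of X_i
        observed.append(true_motion + noise)
    return observed


def sample_mean(samples):
    """Sample mean of the observed motion within one time window."""
    return sum(samples) / len(samples)


# Hypothetical parameters chosen only for illustration.
rng = random.Random(0)
observed = simulate_observed_motion(
    mu=20.0, sigma_x=2.0, sigma_n=3.0, n_samples=5000, rng=rng
)
mu_hat = sample_mean(observed)
print(f"estimated mean motion: {mu_hat:.3f}")
```

With many samples in the window, the estimate should land close to the chosen mean even though each individual observation is noisy.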
The mean of the true object motion is a parameter of interest to users in applications because it represents the dominant motion characteristics. It is desirable to improve its estimation accuracy. This can be achieved by either reducing the variance of the estimation error or improving the signal-to-noise ratio (SNR). The estimation error of the sample mean, $\bar{X}' - \mu$, follows the normal distribution,
$$\bar{X}' - \mu \sim N\!\left(0, \frac{\sigma_x^2 + \sigma_n^2}{N}\right), \qquad (4)$$
and the SNR is given by
(5)
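If the $N$ observed samples in a window are independent draws from the two-term normal model, the variance of the sample-mean error is $(\sigma_x^2 + \sigma_n^2)/N$. The sketch below checks this empirically over many simulated windows. The parameter values, the number of windows and the function name are hypothetical choices for illustration only.

```python
import random
import statistics


def estimation_errors(mu, sigma_x, sigma_n, n_samples, n_windows, rng):
    """Return the sample-mean error (mean of window minus mu) for many simulated windows."""
    errors = []
    for _ in range(n_windows):
        window = [
            rng.gauss(mu, sigma_x) + rng.gauss(0.0, sigma_n)
            for _ in range(n_samples)
        ]
        errors.append(sum(window) / n_samples - mu)
    return errors


# Hypothetical model parameters, not values from the paper.
rng = random.Random(1)
mu, sigma_x, sigma_n = 20.0, 2.0, 3.0
results = {}
for n in (10, 40, 160):
    errs = estimation_errors(mu, sigma_x, sigma_n, n, n_windows=2000, rng=rng)
    empirical = statistics.pvariance(errs)
    predicted = (sigma_x ** 2 + sigma_n ** 2) / n
    results[n] = (empirical, predicted)
    print(f"N={n:4d}  empirical={empirical:.4f}  predicted={predicted:.4f}")
```

The error variance shrinks roughly in proportion to 1/N, which is why a longer time window (more motion vectors) gives a steadier estimate as long as the stationary assumption still holds.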
Now let us characteristically analyze the influential factors in
We present a case study of the traffic monitoring application in this section and use it to test the proposed model and the influential factors discussed above. It is extended from our previous work [8]. In this application, we estimate vehicle speed with motion vectors in MPEG traffic video. The traffic video is collected from a Skycam, i.e., a camera mounted high up with a much wider view. Figure 1 shows a sample image of such video with the motion vectors. Most of the vehicles in the traffic video are smaller than a single macroblock and there are only one or two motion vectors associated with each vehicle. Hence, this is a good example to show the advantage of the proposed method over conventional clustering-based counterparts. Within a short period of time, the amplitudes and the directions of vehicle speeds should be similar and will not change significantly. Thus, the stationary assumption is easily satisfied in this application.
Due to the perspective effect, motion vectors from different vehicles with similar speed may present different amplitudes. Hence camera calibration is employed to obtain the displacement of the vehicles in the ground plane from MPEG motion vectors. In this way, the variable of the object motion in the statistical model and equations (1)-(6) is the mapped motion vector, or equivalently
Figure 1. Sample image from Skycam and motion vectors (scaled by 10)
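The mapping from an image-plane motion vector to a ground-plane displacement can be sketched as follows. This is an illustration under assumptions, not the calibration procedure used in the test bed: it assumes the calibration is available as a 3x3 planar homography `h` (hypothetical), that a motion vector spans `ref_distance` frames, and that the elapsed time is therefore `ref_distance / frame_rate`.

```python
import math


def apply_homography(h, x, y):
    """Map image point (x, y) to ground-plane coordinates with a 3x3 homography."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def ground_speed(h, block_center, motion_vector, frame_rate, ref_distance):
    """Ground-plane speed implied by one motion vector (ground units per second)."""
    x, y = block_center
    dx, dy = motion_vector
    gx0, gy0 = apply_homography(h, x, y)
    gx1, gy1 = apply_homography(h, x + dx, y + dy)
    displacement = math.hypot(gx1 - gx0, gy1 - gy0)
    # Assumption: the vector covers ref_distance frames at the given frame rate.
    elapsed = ref_distance / frame_rate
    return displacement / elapsed


# Hypothetical calibration: a pure scaling of 0.1 metre per pixel.
H_SCALE = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
speed = ground_speed(
    H_SCALE, block_center=(100.0, 50.0), motion_vector=(3.0, 4.0),
    frame_rate=25.0, ref_distance=3,
)
print(f"speed: {speed:.3f} m/s")
```

With a real perspective homography the bottom row is not (0, 0, 1), so the same image-plane vector maps to different ground displacements at different image positions, which is the perspective effect the calibration is meant to remove.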
4. Experimental results
We test the proposed model and the impact of the influential factors with the test bed described in Section 3. The test videos are two MPEG videos collected from two Skycams respectively. Each of them is 5 minutes long and includes 6 lanes, representing various traffic conditions at a certain place. They are digitized by an MPEG card in MPEG-1 format at resolution 352x288, frame rate 25 fps, reference frame distance $D_f = 3$ and constant bitrate 1150 kbps. The variable of object motion is the speed of the vehicle in this case study. The mean speed of each lane is calculated and compared with ground truth independently. Ground truth is obtained manually at 2-second intervals.
First of all, we test the normal approximation of object motion. Figure 2.a shows the distribution of the estimated speed within a lane for a 30-second test sequence. It is bell-shaped and symmetric about its mean. The normal fits demonstrate that the normal distribution properly approximates the speed distributions. To test our assumption objectively, we conduct the Ryan-Joiner (similar to Shapiro-Wilk) normality test. The plots and the test results are presented in Figure 2.b. As the plot shows, the ordered observed values and the respective cumulative frequencies lie almost along a straight line, so it is safe to assume that the vehicle speed is normally distributed.
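A correlation-based normality statistic in the style of the Ryan-Joiner test can be sketched as below. It computes the correlation between the ordered samples and approximate normal scores; values close to 1 indicate that the normal probability plot is nearly a straight line. The plotting positions `(i - 3/8) / (n + 1/4)` are an assumption of this sketch, no critical values are included, and the simulated speeds are hypothetical data rather than the measurements behind Figure 2.

```python
import math
import random
from statistics import NormalDist


def ryan_joiner_statistic(samples):
    """Correlation between ordered samples and approximate normal scores."""
    n = len(samples)
    ordered = sorted(samples)
    std_normal = NormalDist()
    # Approximate expected normal order statistics (assumed plotting positions).
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mean_x = sum(ordered) / n
    mean_b = sum(scores) / n
    cov = sum((x - mean_x) * (b - mean_b) for x, b in zip(ordered, scores))
    var_x = sum((x - mean_x) ** 2 for x in ordered)
    var_b = sum((b - mean_b) ** 2 for b in scores)
    return cov / math.sqrt(var_x * var_b)


rng = random.Random(2)
normal_speeds = [rng.gauss(60.0, 8.0) for _ in range(500)]   # hypothetical speeds
skewed_values = [rng.expovariate(1.0) for _ in range(500)]   # clearly non-normal data
rj_normal = ryan_joiner_statistic(normal_speeds)
rj_skewed = ryan_joiner_statistic(skewed_values)
print(f"normal-like sample: {rj_normal:.4f}")
print(f"skewed sample:      {rj_skewed:.4f}")
```

A formal test would compare the statistic against a critical value for the chosen significance level and sample size; this sketch only contrasts a normal-like sample with a skewed one.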
Figure 3. The impact of time window on speed estimation. (a) and (b) show the standard deviation of the speed estimation error and the mean accuracy of speed estimation at different time windows, respectively.
Finally, we test the influence of frame size and reference frame distance on speed estimation. We re-encode the two test videos at two frame sizes, CIF and QCIF, and three reference frame distances, $D_f = 3$ (normal), $D_f = 6$ and $D_f = 12$. Then we evaluate the test bed with the re-encoded videos. Figure 4.a illustrates the mean amplitude of motion vectors in the test videos. We find that the mean motion vector increases approximately linearly with the reference frame distance and the square root of the frame size. With a larger reference frame distance or a larger frame size, motion vectors become longer. Consequently, the influence of half-pixel error is suppressed and the accuracy of speed estimation is improved. Note that the mean motion vector with $D_f = 12$ in CIF size is slightly smaller than double the mean motion vector with $D_f = 12$ in QCIF size and double the one with $D_f = 6$ in CIF size. Meanwhile, the mean accuracy of speed estimation with $D_f = 12$ in CIF size deteriorates compared with the others in CIF size. A similar observation is also reported in Gonzales, Yeo and Kuo's experiments [9]. The reason is that with a larger reference frame distance, a larger motion vector is selected and thus more bits are needed to code this motion vector. When these additional bits are not sufficiently compensated by the corresponding savings in coding smaller macroblock residual errors, a sub-optimal, shorter motion vector is used instead of the optimal, longer one. This is an inherent limitation of the MPEG motion vector based approach.
Figure 4. The mean motion vectors (a) and the mean accuracies of speed estimation (b) for test videos in different frame sizes and reference frame distances. T = 60 s.
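The scaling argument can be made concrete with a short sketch. It assumes the idealized relation stated in the text (mean vector length proportional to the reference frame distance and to the square root of the frame area) and treats the half-pixel error as a worst-case rounding error of a quarter pixel for half-pixel precision. Both function names and the base vector length of 2 pixels are hypothetical, and the rate-related shortening of long vectors discussed above is ignored.

```python
import math

QCIF = (176, 144)
CIF = (352, 288)


def expected_mv_length(base_length, base_df, base_size, df, size):
    """Scale a mean vector length linearly with D_f and with the square root of frame area."""
    base_area = base_size[0] * base_size[1]
    area = size[0] * size[1]
    return base_length * (df / base_df) * math.sqrt(area / base_area)


def relative_half_pel_error(mv_length, precision=0.5):
    """Worst-case rounding error (half a precision step) relative to the vector length."""
    return (precision / 2.0) / mv_length


# Hypothetical baseline: mean vector of 2 pixels at D_f = 3 in QCIF.
base = 2.0
longer = expected_mv_length(base, base_df=3, base_size=QCIF, df=6, size=CIF)
print(f"scaled vector length: {longer:.1f} pixels")
print(f"relative error at baseline: {relative_half_pel_error(base):.4f}")
print(f"relative error when scaled: {relative_half_pel_error(longer):.4f}")
```

Under these assumptions, doubling both the reference frame distance and the linear frame dimension quadruples the vector length and cuts the relative rounding error to a quarter.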
5. Conclusion and future work
In this paper, an algorithm that estimates object motion from MPEG compressed video with a statistical model was presented. This algorithm complements the existing clustering-based approaches in small-object scenarios where the latter are ineffective. Theoretical analysis and experimental evaluation were conducted to investigate the influential factors of the proposed model.
Unlike clustering-based object motion estimation techniques, we did not attempt to segment and track every object. Instead, we tried to estimate a few statistical motion parameters for all objects within a time window. In this way, motion estimation accuracy can be substantially improved with proper spatial and temporal processing (85%-92% in our test bed) and the granularity of the motion vector field is alleviated.
Although the test bed in this paper is a typical application for traffic monitoring, the approach is applicable in other scenarios where the objects are small while moving in a common pattern. It can also be extended to a moving camera by compensating for the camera motion. These are included in our future work.
References
[1] Nevenka Dimitrova and Forouzan Golshani, Motion Recovery for Video Content Classification, ACM Transactions on Information Systems, Vol. 13, No. 4, October 1995, pp. 408-439.
[2] F. Bartolini, V. Cappellini, and C. Giani, Motion Estimation and Tracking for Urban Traffic Monitoring, Proceedings of IEEE International Conference on Image Processing, 1996, pp. 87-90.
[3] Heitou Zen, Tameharu Hasegawa, Shinji Ozawa, Moving Object Detection from MPEG Coded Picture, Proceedings of IEEE International Conference on Image Processing, Vol. IV, pp. 25-29, Oct. 1999.
[4] Kyongil Yoon, Daniel DeMenthon, David Doermann, Event Detection from MPEG Video in the Compressed Domain, International Conference on Pattern Recognition, pp. 1819-1825, Volume 1, Barcelona, Spain, September 03-08, 2000.
[5] How-Lung Eng, Kai-Kuang Ma, Motion Trajectory Extraction Based on Macroblock Motion Vectors for Video Indexing, International Conference on Image Processing, pp. 284-288, 1999.
[6] R.V. Babu, K.R. Ramakrishnan, Compressed Domain Motion Segmentation for Video Object Extraction, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 4, 2002, pp. 3788-3791.
[7] Christophe Garcia, Georgios Tziritas, Optimal Projection of 2-D Displacements for 3-D Translational Motion Estimation, Image and Vision Computing, Vol. 20, pp. 793-804, 2002.
[8] Xiaodong Yu, Lingyu Duan, Qi Tian, Highway Traffic Information Extraction from Skycam MPEG Video, Proceedings of IEEE 5th Intelligent Transportation Systems Conference, pp. 37-42, Sep. 3-6, 2002.
[9] C.A. Gonzales, H. Yeo and C.J. Kuo, Requirements for Motion Estimation Search Range in MPEG-2 Coded Video, IBM Journal of Research and Development, Vol. 43, No. 4, July 1999.
[10] Jim Wang and Ze-Nian Li, Kernel-based Multiple Cue Algorithm for Object Segmentation, IS&T/SPIE Symp. on Electronic Image and Video Communications and Processing, 2000.