real-time object tracking
DESCRIPTION
Related article: Wonsang You, M.S. Houari Sabirin, and Munchurl Kim, "Real-time detection and tracking of multiple objects with partial decoding in H.264/AVC bitstream domain," Proceedings of SPIE, N. Kehtarnavaz and M.F. Carlsohn, San Jose, CA, USA: SPIE, 2009, pp. 72440D-72440D-12.
TRANSCRIPT
Real-time Detection and Tracking of Multiple Objects with Partial Decoding in H.264/AVC Bitstream Domain
Wonsang You
University of Augsburg, Germany
Electronic Imaging, 19 January 2009
MOTIVATION
Real-time Object Detection and Tracking in H.264|AVC Bitstream
Pixel Domain Approach
• Categories of object detection and tracking approaches
  – Pixel domain approach
  – Compressed domain approach
• Pixel domain approach
  – Uses raw pixel data
  – High accuracy
  – High computational complexity
  – Requires additional computation for compressed videos
• Compressed domain approach
  – Exploits encoded information (DCT coefficients, motion vectors, etc.)
  – Poor performance
    • Applicable to simple scenarios
    • Weak under occlusion
Compressed Domain Approach
• Basic idea
  – Exploit encoded information (DCT coefficients, motion vectors, etc.)
• Advantages
  – Remarkably fast processing time
  – Operates directly on compressed videos
• Disadvantages
  – Unreliability of encoded information
  – Sparse assignment of block-based data
  – Poor performance
    • Applicable to simple scenarios
    • Weak under occlusion
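Related Work in the Compressed Domain Approach
• Basic solution
  – Use a low-resolution image built from DCT coefficients
  – Unfortunately, this is impossible for H.264/AVC bitstreams (intra blocks are spatially predicted, so the transform coefficients encode prediction residuals rather than pixel values)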
Our Solution for H.264/AVC Bitstreams
• Basic idea
  – We use partially decoded pixel data instead of low-resolution images.
• Advantages
  – Reliable performance in more natural scenes
    • Articulated objects such as humans
    • Objects changing in size
    • Objects with monotonous color or a chaotic set of motion vectors
  – Occlusion handling
  – Detecting and tracking multiple objects against a stationary background
  – Real-time processing
  – Partial decoding in I-frames
    • Previously considered impossible
    • Due to the spatial prediction dependency on neighboring blocks
Overview of the Proposed Algorithm
• Extraction phase
  – Probabilistic spatiotemporal macroblock filtering
  – Roughly extracting the block-level region of objects
  – Constructing the approximate object trajectories in each P-frame
• Refinement phase
  – Accurately refining the object trajectories
  – Background subtraction and partial decoding in I-frames
  – Motion interpolation in P-frames
EXTRACTION PHASE
Probabilistic Spatiotemporal Macroblock Filtering
• Probabilistic Spatiotemporal Macroblock Filtering
  – Block-based filtering of background parts (BGs)
  – Uses spatial and temporal properties of macroblocks
  – Rapid segmentation of object regions and tracking of each object
Block Clustering
• Block clustering
  – Removing skip macroblocks
  – Eliminating probable background parts
  – Clustering the remaining macroblocks (MBs) into several fragments
• Block group (BG)
  – Set of non-skip blocks
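The clustering step above can be sketched as a connected-component pass over the macroblock grid. This is a minimal sketch under stated assumptions, not the authors' implementation: the `skip` grid layout, the function name `cluster_block_groups`, and the use of 4-connectivity are all hypothetical choices made for illustration.

```python
from collections import deque

def cluster_block_groups(skip):
    """Group non-skip macroblocks into connected block groups (BGs).

    skip: 2-D list of booleans, True where the macroblock is a skip
    macroblock (hypothetical input layout, one entry per macroblock).
    Returns a list of BGs, each a sorted list of (row, col) positions.
    """
    rows, cols = len(skip), len(skip[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if skip[r][c] or seen[r][c]:
                continue
            # Breadth-first search over 4-connected non-skip neighbors
            # (the connectivity is an assumption, not from the slides).
            queue = deque([(r, c)])
            seen[r][c] = True
            group = []
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not skip[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            groups.append(sorted(group))
    return groups

# Toy 3x4 frame: S = skip macroblock, N = non-skip macroblock
S, N = True, False
frame = [
    [N, N, S, S],
    [S, S, S, N],
    [S, S, S, N],
]
print(cluster_block_groups(frame))
# [[(0, 0), (0, 1)], [(1, 3), (2, 3)]]
```

Each returned list plays the role of one BG. A real decoder would fill `skip` from the macroblock types parsed out of the bitstream.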
[Figure: block groups (BGs), labeled B1 to B8 and F1]
Spatial Filtering
• Filtering out block groups that are likely to be background
  – Removing BGs that
    • consist of a single macroblock
    • have all-zero integer transform (IT) coefficients
• Active block group (ABG)
  – A BG that remains after spatial filtering
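The two removal criteria above can be sketched as a simple filter over the BGs. This is an illustrative sketch only: the data layout (`block_groups`, `it_coeffs`) and the function name `spatial_filter` are hypothetical, and the code assumes a BG is removed when either criterion holds, which the slides do not state explicitly.

```python
def spatial_filter(block_groups, it_coeffs):
    """Keep only the BGs that survive spatial filtering (the ABGs).

    block_groups: list of BGs, each a list of (row, col) macroblock positions.
    it_coeffs: dict mapping (row, col) to that macroblock's integer
        transform coefficients (hypothetical layout, a flat list of ints).
    """
    active = []
    for bg in block_groups:
        if len(bg) == 1:
            continue  # one-macroblock BG: treated as background
        if all(c == 0 for mb in bg for c in it_coeffs[mb]):
            continue  # every IT coefficient in the BG is zero
        active.append(bg)
    return active

bgs = [[(0, 0)], [(1, 1), (1, 2)], [(3, 0), (3, 1)]]
coeffs = {
    (0, 0): [5, 0, 1],
    (1, 1): [0, 0, 0], (1, 2): [0, 0, 0],
    (3, 0): [0, 2, 0], (3, 1): [0, 0, 0],
}
print(spatial_filter(bgs, coeffs))
# [[(3, 0), (3, 1)]]
```

The first BG is dropped for having one macroblock, the second for having only zero coefficients, so a single ABG remains.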
[Figure: ABGs, the BGs remaining after spatial filtering]
Temporal Filtering
• Filtering out ABGs that are likely to be background
  – Removing background ABGs
    • Based on the temporal consistency of each ABG over a given period
  – Fragments with high occurrence probability are considered part of an object
[Figure: ABGs remaining after temporal filtering]
Temporal Filtering
• Observing the occurrence of ABGs during a finite period
  – ABGs that occur frequently during the period are judged to be "Real Object"
  – The occurrence probability is measured
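The idea of judging ABGs by how often they occur can be illustrated with a plain frequency count. Note that this swaps in a simple empirical occurrence rate for the occurrence probability used in the talk, whose exact definition is not given on the slides. The input layout, the function name `temporal_filter`, and the threshold value are all hypothetical.

```python
def temporal_filter(occurrences, threshold=0.8):
    """Select ABG trains that occur often enough to count as real objects.

    occurrences: dict mapping a train id to a list of booleans, one per
        frame of the observation period (True if an ABG of that train
        was observed in the frame). Hypothetical layout.
    threshold: minimum occurrence rate; 0.8 is an arbitrary placeholder.
    """
    real = {}
    for train_id, seen in occurrences.items():
        rate = sum(seen) / len(seen)  # empirical occurrence rate
        if rate >= threshold:
            real[train_id] = rate
    return real

# Six-frame observation period
observed = {
    "train_1": [True, True, True, True, True, True],
    "train_2": [True, False, False, True, False, False],
    "train_3": [True, True, True, False, True, True],
}
print(sorted(temporal_filter(observed)))
# ['train_1', 'train_3']
```

Here `train_2` appears in only two of six frames and is discarded as background, while the other two trains survive.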
[Figure: ABGs observed over frames 1T to 6T; several ABGs within the observation period are marked "Real object". Occurrence probability: $P_l = P(R \mid G_l^1, G_l^2, \ldots, G_l^L)$]
Temporal Filtering
[Figure: ABGs observed over frames 1T to 6T within the observation period, captioned "Criteria for survival of an ABG as an object"]
REFINEMENT PHASE
Background Subtraction in I-frames
[Figure: panels (a)-(d), with regions labeled D_l and S_l and reference blocks A, B, C, D]
Reference blocks (A-D) are substituted into the background image.
Partial Decoding in I-frames
ROI Refinement in I-frames
Motion Interpolation in P-frames
[Figure: frame sequence I P P P I, with the ROI R shown at times t2, t3, t4]
Assumption: within one GOP, the object moves slowly and with nearly uniform motion.
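Under the uniform-motion assumption, the ROI in each P-frame can be estimated by interpolating between the ROIs of the two bounding I-frames. The sketch below uses plain linear interpolation as one way to do this. The `(x, y, width, height)` ROI representation and the function name `interpolate_rois` are hypothetical, and the slides do not confirm that the authors interpolate in exactly this way.

```python
def interpolate_rois(roi_start, roi_end, num_p_frames):
    """Linearly interpolate ROIs for the P-frames between two I-frames.

    roi_start, roi_end: (x, y, width, height) of the refined ROI in the
        I-frames that bound the GOP (hypothetical representation).
    num_p_frames: number of P-frames between the two I-frames.
    """
    steps = num_p_frames + 1
    rois = []
    for k in range(1, steps):
        t = k / steps  # fraction of the way through the GOP
        rois.append(tuple(a + t * (b - a) for a, b in zip(roi_start, roi_end)))
    return rois

# Three P-frames between two I-frames, as in the I P P P I example
for roi in interpolate_rois((10, 20, 32, 48), (18, 20, 32, 48), 3):
    print(roi)
# (12.0, 20.0, 32.0, 48.0)
# (14.0, 20.0, 32.0, 48.0)
# (16.0, 20.0, 32.0, 48.0)
```

The object moves 8 pixels to the right across the GOP, so each P-frame ROI is shifted by 2 pixels relative to the previous frame.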
ROI Refinement in P-frames
In the ROI prediction stage, the predicted ROIs vary significantly over P-frames, so ROI refinement is also needed for P-frames.
Occlusion Handling
Comparing the hue color histograms of two objects
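A hue-histogram comparison of this kind can be sketched with the Python standard library. The slides do not say which similarity measure is used, so histogram intersection is an assumption here, as are the bin count, the pixel representation, and the function names.

```python
import colorsys

def hue_histogram(pixels, bins=16):
    """Normalized hue histogram of an iterable of (r, g, b) pixels in 0..255."""
    hist = [0.0] * bins
    count = 0
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h lies in [0, 1); clamp the bin index defensively
        hist[min(int(h * bins), bins - 1)] += 1.0
        count += 1
    return [v / count for v in hist] if count else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Synthetic pixel sets standing in for object regions (hypothetical data)
red_object = [(200, 10, 10)] * 50 + [(180, 30, 20)] * 50
blue_object = [(10, 10, 200)] * 100
red_again = [(210, 20, 15)] * 100

ref = hue_histogram(red_object)
print(histogram_intersection(ref, hue_histogram(red_again)))    # 1.0
print(histogram_intersection(ref, hue_histogram(blue_object)))  # 0.0
```

After two objects separate from an occlusion, each one could be matched to the pre-occlusion track whose histogram gives the higher similarity.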
Experimental Results (1/3)
[Figure (a): spatial filtering rate (0 to 1) versus frame (0 to 900)]
[Figure (b): active group trains, merged active trains, and real objects (0 to 3.5) versus frame (0 to 900)]
[Figure: decoding time in seconds (0 to 0.1) over I-frames 1 to 91, partial decoding versus full decoding]
Indoor sequence: 49.5 frames/sec
Outdoor sequence: 37.12 frames/sec
Experimental Results (2/3)
[Figure: panels (a)-(l)]
Experimental Results (3/3)
[Figure: panels (a)-(l)]