multi-object detection and tracking from a moving platform
Multi-Object Detection and Tracking from a Moving Platform
1- Analysis and detection:
• Registration across video group of frames (VGoF)
• Detection and segmentation of motion blobs (background models, shadow)
2- Representation and tracking:
• Video object representation (shape, color descriptors, geometric models)
• Object tracking (prediction, correspondence, occlusion resolution, etc.)
3- Access and event modeling:
• Efficient data structures for video queries in high-dimensional feature space
• High-level event representation
Multi-Object Tracking
[Block diagram: Moving Object Detection & Feature Extraction; Data Association (Correspondence); Prediction; Update Trajectories; Context]
Tracking
1. Detect moving objects in stabilized frames.
2. Predict locations of the current set of objects.
3. Match predictions to actual measurements.
4. Update object trajectories.
5. Update the image-stabilized reference coordinate system.
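A minimal sketch of steps 2 to 4 of the loop above, under loud assumptions: prediction is a constant-velocity extrapolation and matching is a greedy nearest-neighbour assignment with a distance gate. Both are swapped-in simplifications for illustration, not the multi-hypothesis association described later in the deck. The names `predict`, `step`, and `gate` are hypothetical. Detection (step 1) and the coordinate-system update (step 5) are left out.

```python
import math

def predict(track):
    """Constant-velocity guess for the next position (assumed motion model)."""
    x1, y1 = track[-1]
    if len(track) < 2:
        return (x1, y1)
    x0, y0 = track[-2]
    return (2 * x1 - x0, 2 * y1 - y0)

def step(tracks, detections, gate=20.0):
    """One iteration: predict, match, update.

    tracks: dict of integer id -> list of (x, y) positions in the stabilized frame.
    detections: list of (x, y) centroids found in the current frame.
    gate: hypothetical maximum prediction-to-detection distance for a match.
    """
    # Step 2: predict the location of every current object.
    predictions = {tid: predict(tr) for tid, tr in tracks.items()}

    # Step 3: greedy nearest-neighbour matching, closest pairs first.
    pairs = sorted(
        (math.dist(p, d), tid, k)
        for tid, p in predictions.items()
        for k, d in enumerate(detections)
    )
    used_tracks, used_dets = set(), set()
    for dist, tid, k in pairs:
        if dist > gate:
            break  # every remaining pair is even farther apart
        if tid in used_tracks or k in used_dets:
            continue
        # Step 4: extend the matched trajectory.
        tracks[tid].append(detections[k])
        used_tracks.add(tid)
        used_dets.add(k)

    # Unmatched detections start new trajectories (entering objects).
    next_id = max(tracks, default=-1) + 1
    for k, d in enumerate(detections):
        if k not in used_dets:
            tracks[next_id] = [d]
            next_id += 1
    return tracks
```

Tracks that receive no match are simply left unchanged here; a fuller tracker would carry the prediction forward or mark the object as exiting.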
Multi-object Detection and Tracking Unit
[Block diagram: VGoF Registration Into Common Coordinate System; Update Coord System; Object States]
Dynamic State Estimation for Tracking
[Block diagram: Dynamic System; Measurement System; State Estimator. Signals: system state, measurements, state estimate, state uncertainties, system noise, measurement noise]
System Errors
• Agile motion
• Distraction/clutter
• Occlusion
• Changes in lighting
• Changes in pose
• Shadow
(Object or background models are often inadequate or inaccurate)
Measurement Errors
• Camera noise
• Framegrabber noise
• Compression artifacts
• Perspective projection
State Error
• Position
• Appearance: color, shape, texture, etc.
• Support map
Motion Detection - 3D Spatiotemporal Volume
Spatio-temporal volume of the hall monitor sequence: (a) left: entire volume; (b) middle: cut taken at vertical position y0; (c) right: cut taken at vertical position y1.
Gerald Kuhne, “Motion-based segmentation and classification of Video Objects”, Dissertation, Univ. of Mannheim, 2002
Motion Detection - Structure and Flux Tensor Approach
Typical Approach: threshold trace(J)
Problem: trace(J) fails to capture the nature of gradient changes, which leads to ambiguities between stationary and moving features
Alternative Approach: analyze the eigenvalues and the associated eigenvectors of J
Problem: eigendecomposition at every pixel is too computationally expensive for real-time performance
Proposed Solution: flux tensor (time derivative of J)
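\[
J = \begin{bmatrix}
\int_\Omega \frac{\partial I}{\partial x}\frac{\partial I}{\partial x}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial x}\frac{\partial I}{\partial y}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial x}\frac{\partial I}{\partial t}\,d\mathbf{x} \\[4pt]
\int_\Omega \frac{\partial I}{\partial y}\frac{\partial I}{\partial x}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial y}\frac{\partial I}{\partial y}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial y}\frac{\partial I}{\partial t}\,d\mathbf{x} \\[4pt]
\int_\Omega \frac{\partial I}{\partial t}\frac{\partial I}{\partial x}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial t}\frac{\partial I}{\partial y}\,d\mathbf{x} &
\int_\Omega \frac{\partial I}{\partial t}\frac{\partial I}{\partial t}\,d\mathbf{x}
\end{bmatrix}
\]
\[
\operatorname{trace}(J) = \int_\Omega \lVert \nabla I \rVert^{2}\,d\mathbf{x}
\]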
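To illustrate why thresholding trace(J) confuses stationary and moving features, here is a small NumPy sketch. It rests on two assumptions: the neighbourhood Ω is a cubic box, and the flux tensor trace is the box sum of the squared temporal derivatives of the gradient (I_xt² + I_yt² + I_tt²), the form usually given in the flux tensor literature. The slide itself only says "time derivative of J". The function names and the `size` parameter are hypothetical.

```python
import numpy as np

def box_sum(vol, size=3):
    """Sum over a size^3 neighbourhood (Omega) with separable 1-D box filters."""
    kernel = np.ones(size)
    out = vol
    for axis in range(vol.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out

def structure_tensor_trace(video, size=3):
    """trace(J): local sum of the squared spatiotemporal gradient magnitude.

    video: array of shape (T, H, W), each axis at least `size` long.
    """
    It, Iy, Ix = np.gradient(video.astype(float))  # derivatives along t, y, x
    return box_sum(Ix**2 + Iy**2 + It**2, size)

def flux_tensor_trace(video, size=3):
    """Assumed flux tensor trace: local sum of I_xt^2 + I_yt^2 + I_tt^2."""
    It, Iy, Ix = np.gradient(video.astype(float))
    Ixt = np.gradient(Ix, axis=0)  # temporal derivative of each gradient component
    Iyt = np.gradient(Iy, axis=0)
    Itt = np.gradient(It, axis=0)
    return box_sum(Ixt**2 + Iyt**2 + Itt**2, size)
```

On a video containing only a static edge, `structure_tensor_trace` is positive along the edge while `flux_tensor_trace` is zero everywhere, so a threshold on the latter responds to motion only and needs no per-pixel eigendecomposition.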
Motion Detection - Flux Tensor vs Gaussian Mixture
Multi-object Tracking Stages
Probabilistic Bayesian framework
Features Used in Data Association: Proximity and Appearance-based
Data Association Strategy: Multi-hypothesis testing
Gating Strategies: Absolute and Relative
Discontinuity Resolution: Prediction (Kalman filter), or Appearance models
Filtering: Temporal consistency check and Spatio-temporal cluster check
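The Kalman-filter prediction named under discontinuity resolution can be sketched as follows. This is a minimal constant-velocity filter over the state [x, y, vx, vy]; the motion model, the noise levels `q` and `r`, the initial covariance, and the class name are all assumptions made for illustration, not values taken from the slides.

```python
import numpy as np

class ConstantVelocityKalman:
    """Hypothetical 2-D constant-velocity Kalman filter for one object centroid."""

    def __init__(self, x, y, q=1.0, r=4.0):
        self.state = np.array([x, y, 0.0, 0.0])        # [x, y, vx, vy]
        self.P = np.eye(4) * 100.0                     # state uncertainty (assumed)
        self.F = np.array([[1, 0, 1, 0],               # x  <- x + vx
                           [0, 1, 0, 1],               # y  <- y + vy
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],               # only position is measured
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q                         # system noise (assumed)
        self.R = np.eye(2) * r                         # measurement noise (assumed)

    def predict(self):
        """Advance one frame; returns the predicted (x, y)."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]

    def update(self, z):
        """Correct the prediction with a measured centroid z = (x, y)."""
        innovation = np.asarray(z, dtype=float) - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Usage: track an object moving 2 px/frame in x, then coast through a gap.
kf = ConstantVelocityKalman(0.0, 5.0)
for t in range(1, 11):
    kf.predict()
    kf.update((2.0 * t, 5.0))
gap_1 = kf.predict()   # no measurement available (e.g. occlusion)
gap_2 = kf.predict()   # prediction keeps extrapolating along the track
```

Calling `predict` without a following `update` is what bridges a discontinuity: the position keeps moving along the estimated velocity while the covariance `P` grows to reflect the missing evidence.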
Association Strategy
• Multi-hypothesis testing with delayed decision: many matches are kept, with evidence-based pruning
• Support for multiple interactions: one-to-one object matches, many-to-one, one-to-many, many-to-many, one-to-none, or none-to-one matches
• Corresponding low-level object tracking events:
  • Segmentation errors
  • Group interactions (merge/split)
  • Occlusion
  • Fragmentation
  • Entering object
  • Exiting object
ObjectMatchGraph
Match Confidence Computation
Match confidence quantifies correspondence goodness-of-fit.
The confidence value has two components:
• Similarity confidence (Conf_sim)
• Separation confidence (Conf_sep)
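\[
\mathrm{Conf}_{sim}(\Omega_{1,i}, \Omega_{1,j}) = 1 - \frac{D(\Omega_{1,i}, \Omega_{1,j})}{\mathrm{MaxDist}}
\]
\[
\mathrm{Conf}_{sep}(\Omega_{1,i}, \Omega_{1,j}) =
\begin{cases}
1 \\[4pt]
0.5 - \dfrac{D(\Omega_{1,i}, \Omega_{1,j}) - D(\Omega_{1,i}, \Omega_{1,j^*})}{2 \times \max\bigl(D(\Omega_{1,i}, \Omega_{1,j}),\, D(\Omega_{1,i}, \Omega_{1,j^*})\bigr)}
\end{cases}
\]
where \(\Omega_{1,j^*}\) is the closest competitor in terms of distance.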
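A short sketch of the two confidence components follows. The slide does not state the conditions attached to the two cases of Conf_sep, so the code assumes the value 1 applies when the candidate has no competitor, and it returns 0.5 when both distances are zero to avoid dividing by zero. Both choices, like the function names, are assumptions.

```python
def conf_sim(dist, max_dist):
    """Similarity confidence: 1 for identical objects, 0 at the maximum distance."""
    return 1.0 - dist / max_dist

def conf_sep(dist, competitor_dist=None):
    """Separation confidence between a match and its closest competitor.

    dist: D between object i and candidate j.
    competitor_dist: D between object i and the closest competitor j*,
        or None when no competitor exists (assumed meaning of the '1' case).
    """
    if competitor_dist is None:
        return 1.0                      # assumption: unchallenged match
    largest = max(dist, competitor_dist)
    if largest == 0:
        return 0.5                      # assumption: perfect tie between j and j*
    return 0.5 - (dist - competitor_dist) / (2.0 * largest)
```

With these definitions a candidate much closer than its competitor scores near 1, an exact tie scores 0.5, and a candidate much farther than its competitor scores near 0.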
[Figure: Node i and Node j, each described by bounding box, support map, centroid, area, etc., joined by a Link carrying Conf(i,j)]
Trajectory Segment Generation
• Trace links in the ObjectMatchGraph to generate possible trajectory segments
• SegmentList: linked list of inner nodes (objects/cells)
• Trajectory labeling: extracted trajectory segments are labeled using a modified connected components labeling
• Trajectory linking: trajectories are formed by linking unfiltered segments sharing the same label
[Figure: ObjectMatchGraph with node types Source, Split, Merge, Sink, Source-Split, Inner, Sink-Merge, Split-Merge, and Single; labels: Segment, Trajectory]
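Tracing links into segments can be sketched as below. The node type names come from the slide, but their definitions in terms of parent and child counts are inferred from the names and should be read as assumptions, as should the function names and the rule that a segment runs from one non-Inner node through a chain of Inner nodes to the next non-Inner node.

```python
from collections import defaultdict

def node_type(n_parents, n_children):
    """Classify a node by its link counts (definitions assumed from the names)."""
    if n_parents == 0 and n_children == 0:
        return "Single"
    if n_parents == 0:
        return "Source-Split" if n_children > 1 else "Source"
    if n_children == 0:
        return "Sink-Merge" if n_parents > 1 else "Sink"
    if n_parents > 1 and n_children > 1:
        return "Split-Merge"
    if n_children > 1:
        return "Split"
    if n_parents > 1:
        return "Merge"
    return "Inner"  # exactly one parent and one child

def trace_segments(links):
    """Split a match graph into segments.

    links: iterable of (parent, child) pairs between consecutive frames.
    Returns (types, segments); each segment is a list of node ids.
    """
    children, parents = defaultdict(list), defaultdict(list)
    nodes = {}  # dict used as an insertion-ordered set
    for p, c in links:
        children[p].append(c)
        parents[c].append(p)
        nodes[p] = nodes[c] = True

    types = {n: node_type(len(parents[n]), len(children[n])) for n in nodes}

    segments = []
    for n in nodes:
        if types[n] == "Inner":
            continue  # segments only start at non-Inner nodes
        for c in children[n]:
            seg = [n, c]
            while types[seg[-1]] == "Inner":
                seg.append(children[seg[-1]][0])  # follow the single outgoing link
            segments.append(seg)
    return types, segments
```

For the links a→b, b→c, c→d, c→e this yields one segment ending at the split node c and two segments leaving it; labeling and linking those segments into trajectories is a separate step not shown here.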
Data Hierarchy
Levels: Trajectory, Macrosegment, Segment, Node (Object-Region)
Node
• Type
• Centroid
• Bounding Box
• Area
• Support Map
• ImRGB
• Parents
• Children
Trajectory
• Type
• Label
• Start_frame
• Start_position
• End_frame
• End_position
• Length
• Displacement
• Diagonal
• Segments
Segment
• Type
• Label
• Consistency
• Start_frame, object, child, nodetype
• End_frame, object, nodetype
• Objects
• Centers
• Trajectory_type
• Trajectory_displacement
• Trajectory_length
• Trajectory_boundingbox
• Parents
• Children
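One way to picture the hierarchy is as nested records. The sketch below uses Python dataclasses with a subset of the field names listed above. The field types, the omission of Support Map, ImRGB, and the Macrosegment level, and the reading of Length as path length and Displacement as straight-line distance are all assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Node:
    """One detected object region in one frame."""
    type: str                                # e.g. "Inner", "Split", "Merge"
    centroid: Tuple[float, float]
    bounding_box: Tuple[int, int, int, int]  # assumed (x, y, width, height)
    area: int
    parents: List[int] = field(default_factory=list)
    children: List[int] = field(default_factory=list)

@dataclass
class Segment:
    """A chain of nodes between two graph events."""
    type: str
    label: int
    start_frame: int
    end_frame: int
    objects: List[Node] = field(default_factory=list)

    @property
    def centers(self):
        return [n.centroid for n in self.objects]

@dataclass
class Trajectory:
    """Segments sharing one label, linked in temporal order."""
    type: str
    label: int
    segments: List[Segment] = field(default_factory=list)

    @property
    def centers(self):
        return [c for s in self.segments for c in s.centers]

    @property
    def start_frame(self):
        return min(s.start_frame for s in self.segments)

    @property
    def end_frame(self):
        return max(s.end_frame for s in self.segments)

    @property
    def length(self):
        """Assumed meaning: summed distance along consecutive centers."""
        pts = self.centers
        return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

    @property
    def displacement(self):
        """Assumed meaning: straight-line distance from first to last center."""
        pts = self.centers
        return math.dist(pts[0], pts[-1])
```

Under these assumed meanings, comparing length with displacement separates objects that travel across the scene from ones that jitter in place.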
Need for Local Registration
Exp Results: DARPA ET01 Video Frame #50
[Figure panels: Registered Frame; Motion Detection Results; Foreground Mask; Tracking Results]
Exp Results - NGA Crystal View HD Video Frame #787 in Coord. #740
[Figure panels: (c) Predictions; (d) After occlusion handling]
Future Work - Trajectory Matching and Filtering
• Establishing trajectory continuity (object ID matching) across moving coordinate systems
• Customizing trajectory analysis for airborne video tracking with misregistration error, large platform motion, zooming, etc.
• Maintaining temporal consistency of trajectories
• Removing periodic clustered trajectories
• Resolving discontinuous trajectories
• Trajectory display and visualization: video vs mosaic
Future Work – Performance Optimization and Tuning
• Moving object detector filters
• Flux tensor fixed optimal threshold learning or continuous adaptive thresholding
• Morphological post processing filters
• Real-time versus offline MATLAB (approximate, excluding I/O time):
  • Flux tensor detection: 4 sec/frame
  • Object tracking: 2 sec/frame (for around 10 objects)
Future Work - Near Term Performance Improvements
• Frame-to-frame registration accuracy is difficult to maintain across a hundred frames or more (a few seconds of video)
• Reducing false motion trajectories caused by registration errors that arise from scene structure
• Maintaining a common coordinate system for registering long airborne video sequence
• Tracking through large platform motion
• Dealing with large camera field-of-view changes
• Platform motion jitter
Future Work - Longer Term Performance Improvements
• Filtering periodic motions produced by clutter, etc.
• Shadows (e.g. false detections, shape distortions, merges)
• Sudden illumination changes (e.g. due to cloud movements)
• Glare from specular surfaces (e.g. windshields, water surfaces)
• Perspective distortion (e.g. object size, shape and position)
• Trajectory gaps and distortion due to occlusion
• Poor video quality (e.g. low resolution, low color saturation)