
Multiple Moving Target Detection, Tracking, and Recognition from a Moving Observer

Fenghui Yao and Ali Sekmen, Department of Computer Science
Mohan J. Malkani, Department of Electrical and Computer Engineering
Tennessee State University, 3500 John A Merritt Blvd, Nashville, TN 37215, USA
{fyao, asekmen, mmalkani}@tnstate.edu

Abstract - This paper describes an algorithm for multiple moving target detection, tracking, and recognition from a moving observer. When the camera is placed on a moving observer, the whole background of the scene appears to be moving, and the actual motion of the targets must be distinguished from the background motion. To do this, an affine motion model between consecutive frames is estimated, and then the moving targets can be extracted. Next, target tracking employs a similarity measure based on the joint feature-spatial space. Finally, target recognition is performed by matching the moving targets with a target database. The average processing time is 680 ms per frame, which corresponds to a processing rate of 1.5 frames per second. The algorithm was tested on the Vivid datasets provided by the Air Force Research Laboratory, and experimental results show that this method is efficient and fast for real-time application.

    I. INTRODUCTION

Detection and tracking of moving objects in an image sequence is one of the basic tasks in computer vision. The detected moving object trajectory can either be of interest in its own right or be used as input for higher-level analysis such as motion pattern understanding and moving behavior recognition. Applications include surveillance, homeland security, protection of vital infrastructure, and advanced human-machine communication. Moving object detection and tracking has therefore received increasing attention, and many algorithms have been proposed. Among these, one interesting approach is the particle filter [1], which has been used and extended many times [2] [3] [4]. The particle filter was developed to track objects in clutter, where the posterior density and observation density are often non-Gaussian. The key idea of particle filtering is to approximate the probability distribution by a weighted sample set. Each sample consists of an element representing the hypothetical state of an object and a corresponding probability. The state of an object may be the control points of a contour [1], the position, shape, and motion of an elliptical region [2], or specific model parameters [3]. That is, the methods of [2] [3] are model-based. Ross's approach [4] is a model-free, statistical detection method which uses both edge and color information. The common assumption of these methods [1] [2] [3] is that the background does not move, i.e., the image sequences come from a stationary camera. Tian et al. [5] developed a real-time algorithm to detect salient motion in complex environments by combining temporal difference imaging and temporally filtered optical flow. The image sequence used in this method is also from a stationary camera.

The works of Smith and Brady [7] and Kang et al. [6] employ image sequences from moving platforms. Kang et al. developed an approach for tracking moving objects observed by both stationary and pan-tilt-zoom cameras. Smith and Brady's approach employs the image sequence from a camera mounted on a vehicle to detect other moving vehicles; it uses special-purpose hardware to implement real-time target detection and tracking. The COMETS system detects targets from a moving observer (an autonomous helicopter) but does not perform tracking [8]. Yang et al.'s tracker works for image sequences from both stationary and moving platforms, but it detects and tracks only a single target [9]. Literature [10] proposes a detection-based multiple object tracking method, literature [11] shows a multiple object tracking method based on a multiple hypotheses graph representation, and literature [12] demonstrates a distributed Bayesian multiple target tracker. However, they all employ image sequences from stationary observers.

As shown above, there are only a few works that discuss multiple moving target detection and tracking from a moving observer, and few works deal with target recognition at the same time. This paper introduces a method for moving target detection, tracking, and recognition from a moving observer.

II. MOVING TARGET DETECTION FROM A MOVING OBSERVER

The entire configuration is shown in Fig. 1. The output of the moving target detection is sent to the target tracking, and the tracked targets are sent to target recognition. This section describes moving target detection; target tracking and target recognition are discussed in Sections III and IV, respectively.

The moving observer usually means a camera mounted on a ground vehicle or on an airborne platform such as a helicopter or an unmanned aerial vehicle (UAV). In this work, the video sequences are generated by an airborne camera. In airborne video, everything (target and background) appears to be moving over time due to the camera motion. Before employing frame differencing (a simple motion detection method for stationary platforms) to detect motion images, it is necessary to conduct

motion compensation first.

978-1-4244-2184-8/08/$25.00 (c) 2008 IEEE. Proceedings of the 2008 IEEE International Conference on Information and Automation, June 20-23, 2008, Zhangjiajie, China

Two-frame background motion estimation is achieved by fitting a global parametric motion model (affine or projective) to sparse optic flow. Here, we use the affine transformation model.
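Once the affine background model of Eq. (1) is estimated (Section II.B below), motion compensation and frame differencing can be sketched as follows. This is a minimal NumPy illustration, not the paper's OpenCV implementation; the nearest-neighbor warping and the difference threshold of 20 are assumptions:

```python
import numpy as np

def compensate_and_diff(prev, cur, A, t, thresh=20):
    """Warp the previous frame with the affine background model
    [X; Y] = A [x; y] + t, then threshold the absolute frame difference.
    Pixels above `thresh` are moving-target candidates."""
    h, w = cur.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every current-frame pixel, find where it came
    # from in the previous frame.
    Ainv = np.linalg.inv(np.asarray(A, float))
    shifted = np.stack([xs - t[0], ys - t[1]])      # (2, h, w)
    src = np.einsum('ij,jhw->ihw', Ainv, shifted)   # previous-frame coords
    sx = np.clip(np.rint(src[0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(src[1]).astype(int), 0, h - 1)
    warped = prev[sy, sx]
    return np.abs(cur.astype(int) - warped.astype(int)) > thresh
```

With a correct background model, only truly moving pixels survive the differencing.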

    A. Optic Flow Detection

Sparse optic flow is obtained by applying the Lucas-Kanade algorithm [13]. The number of optic flow vectors is controlled to lie in the range of 200 to 1000. Other methods, such as matching Harris corners, Moravec features, or SUSAN corners between frames, or matching SIFT features, are all applicable here. The main factors to consider are computation cost and robustness. Experimental results show that the Lucas-Kanade method is the most reliable and reasonably fast.
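The core of the Lucas-Kanade step can be illustrated with a minimal, single-window NumPy sketch (the paper uses the OpenCV implementation over many feature points; the window size and synthetic frames below are illustrative assumptions):

```python
import numpy as np

def lucas_kanade_patch(I0, I1, center, win=9):
    """Estimate the optic flow (vx, vy) of one square window between two
    frames by solving the 2x2 Lucas-Kanade least-squares system
        [sum Ix*Ix  sum Ix*Iy] [vx]   [-sum Ix*It]
        [sum Ix*Iy  sum Iy*Iy] [vy] = [-sum Iy*It].
    `center` is given as (row, col)."""
    y, x = center
    r = win // 2
    Iy, Ix = np.gradient(I0.astype(float))    # spatial gradients
    It = I1.astype(float) - I0.astype(float)  # temporal difference
    sl = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    ix, iy, it = Ix[sl].ravel(), Iy[sl].ravel(), It[sl].ravel()
    A = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = -np.array([ix @ it, iy @ it])
    return np.linalg.solve(A, b)              # (vx, vy)
```

A real tracker repeats this per feature point (with image pyramids for larger motions) to obtain the 200 to 1000 flow vectors mentioned above.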

    B. Affine Parameter Estimation

The 2-D affine transformation is described as follows:

    [Xi]   [a1  a2] [xi]   [a5]
    [Yi] = [a3  a4] [yi] + [a6] ,        (1)

where (xi, yi) are the locations of feature points in the previous frame, and (Xi, Yi) are the locations of feature points in the current frame.

Theoretically, three pairs of matched feature points are enough to determine the six affine parameters. How these three pairs are selected affects the precision of the affine parameter estimation. To reduce the estimation error, the parameters can be solved by the least-squares method based on all matched feature points; however, the computation cost of the least-squares method is heavy. To reduce both computation time and estimation error, this work uses an algorithm similar to LMedS (Least Median of Squares) [14]. Details are as follows. (i) Randomly select N pairs of matched feature points from the previous frame and the current frame, and further randomly select M triplets from the N pairs of matched feature points, where M < N.
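The triplet-sampling estimation can be sketched as follows. This NumPy version fits M random triplets and keeps the parameters with the least median residual over all N pairs; the residual definition and the sampling counts are assumptions, since the remaining steps of the paper's procedure are lost to the page break:

```python
import numpy as np

def fit_affine(src, dst):
    """Solve Eq. (1), [X; Y] = [[a1, a2], [a3, a4]] [x; y] + [a5; a6],
    in the least-squares sense from >= 3 point pairs."""
    n = len(src)
    G = np.zeros((2 * n, 6))
    G[0::2, 0:2] = src; G[0::2, 4] = 1.0   # X-equations
    G[1::2, 2:4] = src; G[1::2, 5] = 1.0   # Y-equations
    p, *_ = np.linalg.lstsq(G, dst.reshape(-1), rcond=None)
    return np.array([[p[0], p[1]], [p[2], p[3]]]), p[4:6]

def lmeds_affine(src, dst, m=50, seed=None):
    """LMedS-style estimation: fit m random triplets and keep the model
    whose point-to-point residuals have the least median."""
    rng = np.random.default_rng(seed)
    best, best_med = None, np.inf
    for _ in range(m):
        idx = rng.choice(len(src), 3, replace=False)
        A, t = fit_affine(src[idx], dst[idx])
        med = np.median(np.linalg.norm(src @ A.T + t - dst, axis=1))
        if med < best_med:
            best, best_med = (A, t), med
    return best
```

The median criterion makes the estimate robust to outlier matches, e.g. flow vectors that landed on moving targets rather than on the background.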


III. TARGET TRACKING

A. Similarity Measure in Joint Feature-Spatial Space

The similarity between two targets Ix and Iy, each represented by sample points in the joint feature-spatial space, is measured as

    J(Ix, Iy) = (1/MN) SUM(i=1..N) SUM(j=1..M) K(xi, yj) Gh(ui, vj) ,        (3)

where {xi, ui} (i = 1, ..., N) and {yj, vj} (j = 1, ..., M) are the spatial locations and feature vectors of the sample points of Ix and Iy, K is a spatial kernel, and Gh is a feature kernel with bandwidth h.

J(Ix, Iy) is symmetric and bounded by zero and one. This similarity is based on the average separation criterion in cluster analysis [15], except that it replaces the distance with a kernelized one. This similarity measure has been applied to single target tracking [16] [17].
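A direct sketch of Eq. (3) with Gaussian spatial and feature kernels follows; the kernel forms and the bandwidths sigma_s and sigma_f are assumptions, and the paper's HSV features would play the role of the u, v vectors:

```python
import numpy as np

def joint_similarity(xs, us, ys, vs, sigma_s=10.0, sigma_f=0.2):
    """Eq. (3): average, over all pairs of sample points, of a spatial
    kernel K times a feature kernel G. xs, ys are (N, 2)/(M, 2) pixel
    locations; us, vs are the corresponding feature vectors."""
    d_s = np.linalg.norm(xs[:, None, :] - ys[None, :, :], axis=2)
    d_f = np.linalg.norm(us[:, None, :] - vs[None, :, :], axis=2)
    K = np.exp(-d_s ** 2 / (2 * sigma_s ** 2))   # spatial kernel
    G = np.exp(-d_f ** 2 / (2 * sigma_f ** 2))   # feature kernel
    return float(np.mean(K * G))
```

The value is symmetric in the two sample sets and lies in (0, 1], as stated above.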

    B. Modified Similarity Measure for Multiple Target Tracking

In multiple target tracking, the similarity between the target Tk, represented by the hull Hk in the (t-1)-th frame, and the target Tl in the t-th frame depends not only on the joint feature-spatial space but also on the distance between them. Therefore the similarity measure in Eq. (3) is modified as follows:

    S(Tk, Tl) = J(Ix^(t-1,k), Iy^(t,l)) * G(s^(t-1)(Pc^(t-1,k)), Pc^(t,l)) ,        (4)

where Ix^(t-1,k) and Pc^(t-1,k) are the distribution of target samples inside the hull Hk and the target center in the (t-1)-th frame, Iy^(t,l) and Pc^(t,l) are the distribution of target samples inside Hl and the target center in the t-th frame, respectively, and s^(t-1) is the affine transformation model from the (t-1)-th frame to the t-th frame.
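One way to read Eq. (4) is as the Eq. (3) similarity down-weighted by the distance between the motion-compensated previous center and the current center; the Gaussian form of G and its bandwidth in this sketch are assumptions:

```python
import numpy as np

def modified_similarity(J_kl, A, t, c_prev, c_cur, sigma_c=20.0):
    """Eq. (4): weight the joint feature-spatial similarity J_kl by a
    kernel on the distance between the affinely predicted previous
    center s(Pc) = A Pc + t and the current center."""
    pred = np.asarray(A, float) @ np.asarray(c_prev, float) + np.asarray(t, float)
    d2 = float(np.sum((pred - np.asarray(c_cur, float)) ** 2))
    return J_kl * np.exp(-d2 / (2 * sigma_c ** 2))
```

Candidates near the motion-compensated position keep their appearance similarity almost unchanged, while distant candidates are strongly suppressed.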

To verify the robustness of this similarity measure, four targets extracted from aerial images, as shown in Fig. 3 (a) — a gray truck (GT), red sedan (RS), blue sedan (BS), and gray sedan (GS), from left to right — are employed for similarity testing. These four targets are rotated in the range of 0 to 180 degrees, with a 5-degree increment per rotation. The similarity measures between the generated images and the gray truck in Fig. 3 (a) are calculated using 500 random sample points from each image. The similarity measures for GT-RS, GT-BS, GT-GS, and GT-GT are shown in Fig. 3 (b). The similarity measure variances for GT-RS, GT-BS, GT-GS, and GT-GT matching are 4.34x10^-6, 1.04x10^-5, 1.29x10^-5, and 5.04x10^-5, respectively. These results show that the similarity in Eq. (4) is robust to rotation and scaling. To reduce the computation time, there is no need to use all points inside the target hull; the sample points can be chosen randomly from the samples inside the target hull.

    C. Tracking Graph Management

The multiple target tracker needs to handle all the problems listed in Fig. 2. The algorithms to deal with these problems are as follows.

1) Missing detection prediction: Targets that are under tracking up to the frame right before the current frame may be missed at the current frame because of detector failure. Missing detections at the i-th frame are estimated from the detection results obtained in the image frames prior to the current frame, by applying estimators. According to the position and velocity of the target in previous image frames, its new position and velocity in the new frame can be estimated by a Kalman filter, a recursive Bayesian estimator, or a particle filter. In this work, the Kalman filter is employed. From the previous state (xc^(i-1,k), yc^(i-1,k), v^(i-1,k), theta^(i-1,k)), the next state (xc^(i,k), yc^(i,k), v^(i,k), theta^(i,k)) is estimated, where (xc^(i-1,k), yc^(i-1,k)) is the center of the target Tk (which is missing at the i-th frame) at the (i-1)-th frame, and (v^(i-1,k), theta^(i-1,k)) are the average velocity and direction over the past frames. Fig. 4 (c) shows a missing detection at frame 21, which will be estimated.
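The missing-detection prediction can be sketched with a standard Kalman filter. The paper's state uses speed and direction; the sketch below uses an equivalent Cartesian constant-velocity state (x, y, vx, vy), and the noise covariances are assumptions:

```python
import numpy as np

def kalman_cv(dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity model: state [x, y, vx, vy], measurement [x, y]."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)
    return F, H, q * np.eye(4), r * np.eye(2)

def predict(x, P, F, Q):
    # Time update: propagate state and covariance one frame forward.
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, H, R):
    # Measurement update with a detected target center z = (x, y).
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.asarray(z, float) - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

When a target is missed, only the predict step runs, which extrapolates the center along the estimated velocity.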

2) New target detection: New targets usually appear near the four image borders, not in the interior area. If a target is detected and tracked for more than 2 frames, it is considered a new target. Currently, a 20-pixel-wide band along the four borders is cleared to zero, to remove pixels that should not be involved in generating the frame difference. Inside it, a 40-pixel-wide band along the four borders is the area where new targets may emerge. Fig. 4 (a) shows 3 newly detected targets at frame 6.

3) False detection filtering: Targets that emerge in the interior area of the image and are not linked to any target in the previous frame or the next frame are false detections, and they are filtered out. Fig. 4 (b) shows a false detection at frame 9, which will be filtered out.

Fig. 3 Robustness of similarity measure. (a) Four targets extracted from aerial image (from left to right: gray truck, red sedan, blue sedan, and white sedan); (b) Similarity measures between the targets in (a) using 500 samples, plotted against the rotation angle theta (0 to 170 degrees) for GT-RS, GT-BS, GT-GS, and GT-GT.


Fig. 4 Target detection results. (a) Three targets (grey truck, grey sedan, and red sedan) are detected at frame 6; (b) Three targets and a false detection (lower red ellipse) at frame 9; (c) Missing detection (red sedan) at frame 21.

Fig. 5 Target detection results. (a) Merging detection at frame 162; (b) Mask image showing target merging at frame 162; (c) Trajectory for six targets from frame 1 to frame 162.


4) Disappearing detection: For targets that are close to the four borders, if they are not detected and tracked for 2 frames, they have disappeared from the monitoring range of the camera.

5) Merge detection: If two or more targets in the previous frame are linked to the same target in the current frame, target merging has occurred. In this case the graph manager will separate them. Fig. 5 (a) shows target merging and (b) shows its mask image (the merging detection is marked by the red circle in the middle) for another input image sequence. The principal axis of the mask image of the merged targets is calculated and used as the boundary to separate them.
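The principal-axis separation can be sketched with PCA on the foreground pixel coordinates. Here the merged blob is split by the sign of each pixel's projection onto the principal axis through the centroid, which separates two targets lying along that axis; this reading of the boundary construction is an assumption:

```python
import numpy as np

def split_merged_mask(mask):
    """Split a merged binary blob into two pixel groups using the
    principal axis of the blob: project centered pixel coordinates onto
    the major eigenvector of their covariance and split by sign."""
    pts = np.argwhere(mask > 0).astype(float)     # (row, col) of foreground
    centered = pts - pts.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(centered.T))  # eigenvalues ascending
    major = vecs[:, -1]                           # principal axis direction
    side = centered @ major
    return pts[side >= 0], pts[side < 0]
```

For two vehicles merged into one elongated blob, the principal axis runs through both, so the sign of the projection assigns each pixel to one side.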

6) Split detection: If a target in the previous frame is linked to two targets in the current frame, and the split targets remain tracked for 2 frames, a split has occurred.

The target graph manager maintains the trajectory of each target. Fig. 5 (c) shows the target trajectories from frame 1 to frame 162 for the six targets.

IV. TARGET RECOGNITION

As indicated in Fig. 1, the moving target recognition subsystem accepts the tracked targets from the tracker. For each target, it performs matching with the target patterns in the database. The target database stores the target name, the target region represented by its hull, and the image data. For the image data of a target, the pixels beyond the hull region are cleared to zero (refer to Fig. 3 (a)). The similarity measure for target recognition is based on Eq. (3). For a recognized target, this subsystem outputs the target name and updates its model image data. For an unknown target, this subsystem registers it in the database.
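The matching-or-registering logic of this subsystem can be sketched as follows; `sim` stands in for the Eq. (3) similarity, and the acceptance threshold and naming scheme are assumptions:

```python
def recognize_or_register(target, database, sim, threshold=0.05):
    """Return the database name of the best match above `threshold`,
    updating its stored model; otherwise register `target` as unknown."""
    best = max(database.items(), key=lambda kv: sim(target, kv[1]), default=None)
    if best is not None and sim(target, best[1]) > threshold:
        database[best[0]] = target     # update the model data of the match
        return best[0]
    name = f"unknown_{len(database)}"
    database[name] = target            # register the unknown target
    return name
```

Updating the stored model on every successful match keeps the database current as a target's appearance changes over the sequence.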

    V. EXPERIMENT RESULTS

The above algorithm is implemented using MS Visual C++ 6.0 and Intel OpenCV, running on the Windows platform. The threshold used in missing detection is set at 3, and the threshold for new target detection and disappearing target detection is set at 5. The calculation of the modified similarity measure employs 500 randomly selected pixels inside the target hull, and the HSV feature is used in Eq. (4). The test video sequences are from the AFRL Vivid database. Fig. 6 shows some target detection and tracking results. The first column from the left shows the detected and tracked targets (shown by global number and circled by green ellipses) up to frame 48. The second column shows that tracking is lost because of the dynamic observer movement (red ellipses show the detected targets). The third column shows the five targets under tracking and a newly detected target (shown by the yellow ellipse). The fourth column shows target merging, which is split into two targets. Fig. 7 shows some target tracking and

Fig. 6 Target detection and tracking results at frames 48, 108, 198, and 342, respectively.

Fig. 7 Target tracking and recognition results at frames 30, 144, 244, and 636, respectively.


recognition results. From left to right: (i) blue sedan, (ii) gray pick-up truck, (iii) white sedan and gray pick-up truck, and (iv) white sedan and gray pick-up truck. In (iii), the gray pick-up truck is wrongly recognized as a blue sedan because it is partially hidden by trees, and in (iv) the white sedan and gray pick-up truck are both wrongly recognized as blue sedans because they are both partially hidden by trees. The average execution times for target detection, tracking, and recognition, on a Windows Vista machine with a 2.33 GHz Intel Core2 CPU and 2 GB memory, are shown in Table 1.

TABLE 1 AVERAGE PROCESSING TIME FOR TARGET DETECTION, TRACKING AND RECOGNITION

Processing task                    Time (ms)
Target detection                   316.1
Target tracking and recognition    363.7

VI. CONCLUSIONS

This paper proposed an algorithm for multiple moving target detection, tracking, and recognition from a moving observer. The moving observer is a manned/unmanned aerial vehicle with a mounted camera. The proposed algorithm first estimates the motion model between two consecutive image frames, which is used to remove the moving background. Then it employs a similarity measure for target tracking based on the joint feature-spatial space, where the joint feature-spatial space combines the HSV feature and geometry information. The similarity calculation employs 500 randomly selected pixels. On a Windows Vista machine with a 2.33 GHz Intel Core2 CPU and 2 GB memory, the average processing time is 680 ms, which leads to a 1.5 frame/s processing rate. The experimental results show that the proposed algorithm is efficient and fast.

ACKNOWLEDGEMENT

This work was partially supported by a grant from AFRL under the Minority Leaders Program, contract No. TENN 06-S567-07-C2. The authors would also like to thank AFRL for providing the datasets used in this research.

    REFERENCES

[1] M. Isard and A. Blake, "CONDENSATION - Conditional Density Propagation for Visual Tracking," International Journal of Computer Vision, vol. 29, no. 1, pp. 5-28, 1998.

[2] K. Nummiaro, E. Koller-Meier, and L. V. Gool, "An Adaptive Color-based Particle Filter," Image and Vision Computing, vol. 21, pp. 99-110, 2002.

[3] D. Tweed and A. Calway, "Tracking Many Objects Using Subordinated Condensation," in 13th British Machine Vision Conference (BMVC 2002), 2002.

[4] M. Ross, "Model-free, Statistical Detection and Tracking of Moving Objects," in 13th International Conference on Image Processing (ICIP 2006), Atlanta, GA, Oct. 8-11, 2006.

[5] Y. L. Tian and A. Hampapur, "Robust Salient Motion Detection with Complex Background for Real-time Video Surveillance," IEEE Computer Society Workshop on Motion and Video Computing, Breckenridge, Colorado, Jan. 5-6, 2005.

[6] J. Kang, I. Cohen, G. Medioni, and C. Yuan, "Detection and Tracking of Moving Objects from a Moving Platform in Presence of Strong Parallax," IEEE International Conference on Computer Vision (ICCV), Beijing, China, Oct. 2005.

[7] S. M. Smith and J. M. Brady, "ASSET-2: Real-Time Motion Segmentation and Shape Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, Aug. 1995.

[8] A. Ollero, J. Ferruz, et al., "Motion Compensation and Object Detection for Autonomous Helicopter Visual Navigation in the COMETS System," in Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA, USA, April 26 - May 1, 2004.

[9] C. Yang, R. Duraiswami, and L. Davis, "Efficient Mean-Shift Tracking via a New Similarity Measure," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2005), San Diego, CA, USA, June 20-25, 2005, pp. 176-183.

[10] M. Han, A. Sethi, and Y. Gong, "A Detection-based Multiple Object Tracking Method," in Proceedings of the 2004 IEEE International Conference on Image Processing (ICIP 2004), Singapore, October 24-27, 2004.

[11] A. Chia, W. Huang, and L. Li, "Multiple Objects Tracking with Multiple Hypotheses Graph Representation," in Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), Hong Kong, August 20-24, 2006.

[12] W. Qu, D. Schonfeld, and M. Mohamed, "Distributed Bayesian Multiple-Target Tracking in Crowded Environments Using Multiple Collaborative Cameras," EURASIP Journal on Advances in Signal Processing, vol. 2007, Article ID 38373.

[13] B. D. Lucas and T. Kanade, "An Iterative Image Registration Technique with an Application to Stereo Vision," in 7th International Joint Conference on Artificial Intelligence, 1981, pp. 674-679.

[14] S. Araki, T. Matsuoka, et al., "Real-time Tracking of Multiple Moving Object Contours in a Moving Camera Image Sequence," IEICE Transactions on Information and Systems, vol. E83-D, no. 7, July 2000.

[15] A. R. Webb, Statistical Pattern Recognition, John Wiley & Sons, UK, 2nd Edition, 2002.

[16] A. Elgammal, R. Duraiswami, and L. S. Davis, "Probabilistic Tracking in Joint Feature-Spatial Spaces," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Wisconsin, USA, June 16-22, 2003.

[17] C. Yang, R. Duraiswami, and L. Davis, "Efficient Mean-Shift Tracking via a New Similarity Measure," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, June 20-25, 2005.
