

2009 IEEE Student Conference on Research and Development (SCOReD), UPM Serdang, Malaysia, 16-18 November 2009

Study and Implementation of Color-based Object Tracking in Monocular Image Sequences

Muhammad Owais Mehmood

Undergraduate Student, Department of Electronic Engineering NED University of Engineering & Technology, Karachi, Pakistan

[email protected]

Abstract- Kernel tracking of density-based appearance models is implemented in this paper for real-time object tracking applications. First, a region of interest (ROI) is selected in real time to create a model. Then the search object is matched and located using the mean-shift algorithm. Experimental results show that this method can perform object tracking with adaptation to scale and translation, robustness to noise, and handling of occlusion.

Keywords- Machine vision, Image matching, Image motion analysis, Image Processing, Pattern Recognition

I. INTRODUCTION

Object tracking is the assignment of a consistent label to an object being tracked across different frames of a video. It is a challenging yet important task in computer vision and finds applications in action recognition, multimedia indexing, automated surveillance, human-machine interfaces, and vehicular guidance systems.

Numerous approaches have been proposed for object tracking, including point tracking, kernel tracking and silhouette tracking [1]. This paper focuses on the implementation of kernel tracking of density-based appearance models. Kernel tracking computes the motion of an object, represented as a primitive region, across a sequence of frames. The mean-shift algorithm is used to implement color-density-based tracking with color histograms.

The implementation uses approaches from [2] and [3]. The result is a fast and lightweight object tracking solution which is robust to noise and to variations in object size. Experiments were performed on real-time data obtained from live camera feeds. Different objects and different background settings were used to test the robustness of the system.

II. MEAN-SHIFT ALGORITHM

Mean-shift is a tool for finding modes in a set of data samples that manifest an underlying probability density function (PDF). In other words, given a PDF, which in this case is a normalized histogram, the mean-shift algorithm moves toward a peak of the PDF.

From the object to be tracked, a normalized PDF is created, in this case a normalized color histogram. As in Figure 1, an object with a similar color histogram is searched for in the vicinity of the object's location in the previous frame. This histogram matching can be done using the Bhattacharyya coefficient [2]. If the histogram match is close enough, the system assumes it is the same object; otherwise the search continues. Repeating this process frame after frame results in tracking.
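The histogram comparison described above can be illustrated with a minimal sketch. It assumes the standard definition of the Bhattacharyya coefficient, the sum over bins of sqrt(p_u * q_u); the function names and the toy four-bin histograms are hypothetical and are not taken from the paper's implementation.

```python
import math


def normalize(hist):
    """Scale raw bin counts so that they sum to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram has no samples")
    return [count / total for count in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms: 1 means identical, 0 means no overlap."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


# Toy 4-bin color histograms (hypothetical values for illustration only).
target = normalize([8, 1, 1, 0])
similar = normalize([7, 2, 1, 0])
different = normalize([0, 1, 1, 8])

score_similar = bhattacharyya_coefficient(target, similar)
score_different = bhattacharyya_coefficient(target, different)
```

A tracker built this way would accept a candidate when the coefficient exceeds some threshold; the threshold value itself is application-dependent and is not specified in the text.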

Figure 1. Overview of the process: (1) choose a reference model; (2) represent the model by a PDF in color feature space; (3) identify the model in the present frame; (4) search for the model in the next consecutive frame; (5) repeat the process.

Proceedings of 2009 IEEE Student Conference on Research and Development (SCOReD 2009), 16-18 Nov. 2009, UPM Serdang, Malaysia

978-1-4244-5187-6/09/$26.00 ©2009 IEEE

III. CONTINUOUSLY ADAPTIVE MEAN-SHIFT ALGORITHM

Mean-shift can efficiently handle real data analysis without any prior assumptions about the shape of the distribution. There is only one parameter, the window size, which has to be set. However, choosing this parameter is non-trivial, and inappropriate window sizes can cause false positives or false negatives. Therefore, the continuously adaptive mean-shift algorithm (CAMSHIFT) is employed, as in [3].

It is called continuously adaptive because, in addition to performing mean-shift, it also adjusts the size and angle of the window fitted to the object. The scale and orientation are fitted as follows: CAMSHIFT not only keeps the window centered over the area with the highest probability, it also finds the match by starting at the object's previous location and calculating the center of gravity of the PDF probability values within a rectangle. This process is repeated until the rectangle is centered on the center of gravity of the object.
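The re-centering loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's VC++/OpenCV code: the probability image is a plain list of rows, the window has a fixed size, and the helper names `window_moments` and `mean_shift_window` are hypothetical.

```python
import math


def window_moments(prob, x0, y0, w, h):
    """Zeroth and first moments of the probability values inside the window."""
    rows, cols = len(prob), len(prob[0])
    m00 = m10 = m01 = 0.0
    for y in range(max(0, y0), min(rows, y0 + h)):
        for x in range(max(0, x0), min(cols, x0 + w)):
            v = prob[y][x]
            m00 += v
            m10 += x * v
            m01 += y * v
    return m00, m10, m01


def mean_shift_window(prob, x0, y0, w, h, max_iter=10):
    """Move a fixed-size window until it is centered on the local center of gravity."""
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, x0, y0, w, h)
        if m00 == 0:
            break  # no probability mass under the window, nothing to follow
        cx, cy = m10 / m00, m01 / m00
        # Top-left corner that puts the window center on the centroid.
        nx = int(math.floor(cx - (w - 1) / 2.0 + 0.5))
        ny = int(math.floor(cy - (h - 1) / 2.0 + 0.5))
        # Keep the window inside the image.
        nx = max(0, min(nx, cols - w))
        ny = max(0, min(ny, rows - h))
        if (nx, ny) == (x0, y0):
            break  # converged
        x0, y0 = nx, ny
    return x0, y0


# Toy 8x8 probability image with a 3x3 blob at rows 4..6, columns 4..6.
prob = [[0.0] * 8 for _ in range(8)]
for y in range(4, 7):
    for x in range(4, 7):
        prob[y][x] = 1.0

# Start from a window that only touches one corner of the blob.
final_corner = mean_shift_window(prob, 2, 2, 3, 3)
```

In this toy case the window starts at corner (2, 2), overlapping a single blob pixel, and each iteration pulls it further onto the blob until the corner stops moving.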

Because the window size adapts continuously, the system is robust to changes in the scale of the object's color distribution and to translation. In a monocular feed, translation toward or away from the camera changes the apparent object size; hence, it is handled by CAMSHIFT as well.

Moreover, the CAMSHIFT algorithm is based on the HSV color system and relies only on the hue value (separating out the brightness factor), which gives it much greater tolerance to illumination variations.
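A small sketch may help show why a hue-only model tolerates brightness changes. It uses Python's standard `colorsys` module rather than the OpenCV conversion used in the actual implementation; the bin count of 16 and the sample pixel values are assumptions made for illustration.

```python
import colorsys


def hue_histogram(pixels, bins=16):
    """Normalized 1D hue histogram of (r, g, b) pixels with components in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        index = min(int(h * bins), bins - 1)  # h lies in [0, 1)
        hist[index] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# The same two colors at full and at half brightness (hypothetical samples).
bright = [(200, 40, 40), (40, 200, 40)]
dim = [(100, 20, 20), (20, 100, 20)]

bright_model = hue_histogram(bright)
dim_model = hue_histogram(dim)
```

Halving every RGB component changes the value (brightness) channel but leaves the hue unchanged, so both sets of pixels produce the same hue histogram.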

IV. CAMSHIFT ALGORITHM

Basic mean-shift tracking starts with the selection of a region of interest (ROI), whose normalized color histogram is used as the target model. In the next frame, a candidate histogram is created at the same location.

Let $q_u$ be the normalized color histogram of the target ROI and $p_u$ be the candidate normalized color histogram at the location $x$, the same location as that of the target in the previous image. Let $x_i$ be the position of a pixel $i$ in the ROI around $x$; each such pixel provides a weighted sample. If $u$ is the histogram bin into which the color of pixel $i$ falls, then the weight contributed by the pixel is

$$w_i = \sqrt{\frac{q_u}{p_u}}$$

The mean shift vector is given as

$$M(x) = \frac{\sum_i w_i x_i}{\sum_i w_i} - x$$

and the position is then updated by

$$x \leftarrow \frac{\sum_i w_i x_i}{\sum_i w_i}$$

where $\frac{\sum_i w_i x_i}{\sum_i w_i}$ is called the weighted ratio. The current position is then updated and the iteration is repeated. The weighted ratio is biased towards the mode of the normalized color histogram and hence tends toward the correct position. However, basic mean-shift fails when the window size needs to vary. CAMSHIFT instead calculates the centroid of the color probability distribution within its 2D window of calculation, re-centers the window, and then calculates the area for the next window size [3]. This overcomes the limitation associated with the basic mean-shift algorithm.
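The weight and position update can be written out as a short sketch. It assumes each pixel is given as a position together with the index of its histogram bin; the names `pixel_weights` and `mean_shift_position` and the toy data are hypothetical and chosen only to make the formulas concrete.

```python
import math


def pixel_weights(bin_indices, q, p):
    """w_i = sqrt(q_u / p_u), where u is the histogram bin of pixel i."""
    weights = []
    for u in bin_indices:
        # p[u] is nonzero for pixels that were counted in the candidate histogram;
        # the guard only protects against inconsistent inputs.
        weights.append(math.sqrt(q[u] / p[u]) if p[u] > 0 else 0.0)
    return weights


def mean_shift_position(positions, weights):
    """New position: the weighted mean of the pixel positions."""
    total = sum(weights)
    if total == 0:
        return None
    new_x = sum(w * x for w, (x, _y) in zip(weights, positions)) / total
    new_y = sum(w * y for w, (_x, y) in zip(weights, positions)) / total
    return new_x, new_y


# Three pixels in the candidate window: one in bin 0, two in bin 1.
positions = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
bin_indices = [0, 1, 1]
q = [0.2, 0.8]            # target histogram (toy values)
p = [1.0 / 3, 2.0 / 3]    # candidate histogram built from the three pixels

weights = pixel_weights(bin_indices, q, p)
new_position = mean_shift_position(positions, weights)
```

Because bin 1 is under-represented in the candidate relative to the target, its pixels receive larger weights, and the updated position moves toward them compared with the unweighted mean of the pixel positions.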

Figure 2. Object selected in (a) and tracked correctly in (b), (c) and (d) with variations in size and orientation.

The CAMSHIFT algorithm can be summarized as follows. A 1D color histogram is calculated from the H (hue) channel of the HSV color model. HSV is preferred over RGB because RGB is much more sensitive to lighting changes. As in the mean-shift algorithm, probability distributions of both the target and candidate models are built, with the ROI location being the same in both frames; however, the candidate model's ROI is slightly larger. Mean-shift is then performed for a set number of iterations, and the zeroth moment (area or size) and the mean location are stored. The mean location serves as the center of the search window/ROI for the next frame, and the search window size is a function of the zeroth moment calculated previously. This is repeated for all frames in the sequence. Further details are available in [3].
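The last step of the summary, deriving the next search window from the stored mean location and zeroth moment, might look like the sketch below. The text only states that the window size is a function of the zeroth moment; the square-root rule, the `scale` factor of 2 and the `min_side` floor used here are assumptions for illustration, and the exact rule should be taken from [3].

```python
import math


def next_search_window(center, m00, max_value=1.0, scale=2.0, min_side=3):
    """Square search window for the next frame, centered on the mean location.

    The side length grows with the square root of the zeroth moment, i.e. with
    the linear size of the tracked blob. `scale` and `min_side` are hypothetical
    tuning parameters, not values taken from the paper.
    """
    side = int(math.floor(scale * math.sqrt(m00 / max_value) + 0.5))
    side = max(min_side, side)
    cx, cy = center
    x0 = int(math.floor(cx - (side - 1) / 2.0 + 0.5))
    y0 = int(math.floor(cy - (side - 1) / 2.0 + 0.5))
    return x0, y0, side, side


# A 3x3 blob of probability 1.0 (zeroth moment 9) centered at (5, 5),
# and the same object after it has grown to a 6x6 blob (zeroth moment 36).
small_window = next_search_window((5.0, 5.0), 9.0)
large_window = next_search_window((5.0, 5.0), 36.0)
```

The returned corner may fall outside the image for large windows, so a real tracker would clip the window to the frame before the next mean-shift pass.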

V. RESULTS

Figures 2 and 3 show the real-time tracking results for rigid and non-rigid objects, respectively. Tracking is performed correctly despite variations in orientation, shape and size.

Figure 3. Object tracked with variations in shape and size


Figure 4. (a) Object being tracked while leaving the field of view (FOV) of the camera. (b) Object leaves the FOV. (c) Object re-enters the FOV and is tracked consistently.

As long as the occlusion is not 100%, CAMSHIFT continues to track an object [3]. In addition, experiments with the implementation showed that when the object moves out of the scene (100% occlusion) and re-enters from the same point/area, the algorithm recovers the track, as in Figure 4. However, if the re-entry point differs from the exit point, the system loses track completely. Figure 5 shows tracking results with an input feed obtained from a lower-quality webcam. The images are blurred, yet the object is correctly tracked.

VI. CONCLUSION

Object tracking is an important task for various computer and machine vision applications, ranging from industrial tasks, e.g., placing objects automatically or aligning them to a moving part, to complex tasks, e.g., localization and mapping in a robot vision application.

To conclude, in this paper object tracking using mean-shift was reviewed, and real-time tracking was performed using continuously adaptive mean-shift, which resulted in correct correspondences even with noise, variations in size and orientation, and occlusion. The actual application was implemented in Visual C++ using the OpenCV libraries.

REFERENCES

[1] A. Yilmaz, O. Javed, and M. Shah, "Object tracking: A survey," ACM Comput. Surv., vol. 38, no. 4, pp. 13+, 2006. [Online]. Available: http://dx.doi.org/10.1145/1177352.1177355

[2] D. Comaniciu and P. Meer, "Mean shift: a robust approach toward feature space analysis," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, no. 5, pp. 603-619, 2002. [Online]. Available: http://dx.doi.org/10.1109/34.1000236

[3] G. R. Bradski, "Computer vision face tracking for use in a perceptual user interface," Intel Technology Journal, no. Q2, 1998. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.7673

[4] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 25, no. 5, pp. 564-577, 2003. [Online]. Available: http://dx.doi.org/10.1109/TPAMI.2003.1195991

[5] D. Comaniciu, V. Ramesh, and P. Meer, "Real-time tracking of non-rigid objects using mean shift," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2000, pp. 142-149. [Online]. Available: http://dx.doi.org/10.1109/CVPR.2000.854761

[6] Y. Cheng, "Mean shift, mode seeking, and clustering," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 17, no. 8, pp. 790-799, 1995. [Online]. Available: http://dx.doi.org/10.1109/34.400568

[7] Jue Wang, Yingqing Xu, Heung-Yeung Shum, Michael F. Cohen, “Video Tooning”, SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pp. 574-583, 2004. [Online]. Available: http://doi.acm.org/10.1145/1186562.1015763

[8] Computer Vision Application with C# by Arif Khan. (undated). [Online]. Viewed 2009 July-August. Available: http://www.codeproject.com/script/Articles/MemberArticles.aspx?amid=4058454

Figure 5. Tracking of the red object is performed in noisy, blurred images.
