IEEE TRANSACTIONS ON IMAGE PROCESSING 1
Robust Superpixel Tracking
Fan Yang, Huchuan Lu, and Ming-Hsuan Yang
I. TRACKING RESULTS
We present the tracking results of the compared state-of-the-art trackers
and our SPT tracker on all sequences in Figures 1 and 2.
II. ERROR PLOTS
The tracking error plots in terms of center position are
shown in Figure 3. All the tracking results can be found at
http://www.umiacs.umd.edu/~fyang/spt.html.
III. SEGMENTATION RESULTS
Figure 4 shows the foreground and background segmentation and
tracking results on the liquor sequence. Figure 5 shows the
corresponding results on the racecar sequence.
IV. ADDITIONAL RESULTS
We present the tracking results of our SPT tracker on the
box and board sequences from the PROST dataset in Figures 6
and 7. As shown in the figures, our tracker successfully keeps
track of the target objects throughout both sequences.
V. FAILURE CASE
Since our tracker relies only on color features for superpixel
segmentation, it is likely to fail when the color of the target
is indistinguishable from the background. We show one failure
case of our tracker on the skating1 sequence from the VTD
dataset in Figure 8. When the skater enters the dark areas,
almost all pixels of the skater and the surrounding region
are indistinguishable. In such cases, the proposed appearance
model cannot obtain a good MAP estimate, which leads to
tracking failure. To address this problem, other
complementary features can be incorporated into our framework.
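As a rough illustration of why color alone can fail in this situation, the sketch below compares coarse color histograms of a target region and its surroundings using the Bhattacharyya coefficient. It is not the appearance model of the paper: the RGB binning, the function names and the sample pixel values are all assumptions made for illustration.

```python
import math


def color_histogram(pixels, bins=8):
    """Normalized joint histogram of (r, g, b) pixels with values in 0..255.

    The RGB color space and the number of bins are illustrative assumptions.
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel to a bin index and flatten the 3-D index.
        idx = ((r * bins) // 256) * bins * bins \
            + ((g * bins) // 256) * bins \
            + (b * bins) // 256
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms: 1 = identical, 0 = disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Hypothetical pixel samples: a dark target in a dark scene, and a
    # brightly colored target in the same scene.
    dark_target = [(10, 12, 8)] * 50
    dark_background = [(20, 15, 5)] * 50
    bright_target = [(200, 30, 30)] * 50

    bg = color_histogram(dark_background)
    print(bhattacharyya_coefficient(color_histogram(dark_target), bg))
    print(bhattacharyya_coefficient(color_histogram(bright_target), bg))
```

A coefficient close to 1 means the two regions share almost the same color distribution, which is the condition under which a purely color-based model has little evidence to separate target from background.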
VI. ANALYSIS OF PARAMETER SENSITIVITY
We present tracking results of our tracker on two sequences,
bird2 and transformer, under different parameter settings and
demonstrate that our tracker is not very sensitive to these
parameters. We use 200, 300 and 400 for the number of superpixels,
5, 10 and 15 for the update frequency, and 0.51, 0.515 and 0.52 for
the occlusion detection threshold, resulting in 27 combinations of
parameters.
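The 27 combinations are the Cartesian product of the three value lists. A minimal sketch of how such a grid could be enumerated is given below; the constant and function names are hypothetical, and the tracker itself is not reproduced here.

```python
from itertools import product

# Values listed in the text for the three parameters that are varied.
NUM_SUPERPIXELS = (200, 300, 400)
UPDATE_FREQUENCY = (5, 10, 15)
OCCLUSION_THRESHOLD = (0.51, 0.515, 0.52)


def parameter_grid():
    """Return every (superpixels, update frequency, threshold) combination."""
    return list(product(NUM_SUPERPIXELS, UPDATE_FREQUENCY, OCCLUSION_THRESHOLD))


if __name__ == "__main__":
    grid = parameter_grid()
    print(len(grid))  # 3 * 3 * 3 combinations
    for num_superpixels, update_frequency, threshold in grid:
        # A real experiment would run the tracker once per combination here.
        print(num_superpixels, update_frequency, threshold)
```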
The quantitative comparisons in terms of average error of
center location and number of successfully tracked frames are
shown in Tables I, II and III.
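The two measures can be sketched as follows. This supplement does not restate how a frame is counted as successfully tracked, so the overlap criterion used here (intersection over union above 0.5) is an assumption, as are the (x, y, width, height) box format and the function names.

```python
import math


def center_error(pred_box, gt_box):
    """Euclidean distance in pixels between the centers of two boxes.

    Boxes are assumed to be (x, y, width, height) tuples.
    """
    px = pred_box[0] + pred_box[2] / 2.0
    py = pred_box[1] + pred_box[3] / 2.0
    gx = gt_box[0] + gt_box[2] / 2.0
    gy = gt_box[1] + gt_box[3] / 2.0
    return math.hypot(px - gx, py - gy)


def overlap_ratio(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def summarize(pred_boxes, gt_boxes, min_overlap=0.5):
    """Return (average center error, number of successfully tracked frames).

    min_overlap is an assumed success threshold, not taken from this text.
    """
    if len(pred_boxes) != len(gt_boxes) or not pred_boxes:
        raise ValueError("need one predicted box per ground-truth box")
    pairs = list(zip(pred_boxes, gt_boxes))
    errors = [center_error(p, g) for p, g in pairs]
    successes = sum(1 for p, g in pairs if overlap_ratio(p, g) > min_overlap)
    return sum(errors) / len(errors), successes
```

Calling `summarize(predicted, ground_truth)` on two equally long lists of boxes yields one pair of numbers per sequence and parameter setting, which is the form of the entries reported in the tables.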
The results show that our tracker is not sensitive to parameter
changes. It consistently performs well as long as the
parameters stay within a reasonable range.
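One simple way to inspect this claim is to look at the spread of the reported values over all 27 settings. The sketch below copies the numbers from Tables I, II and III and computes the minimum, maximum and mean per sequence; the dictionary layout and the function name are illustrative choices, not part of the original evaluation.

```python
# Values copied from Tables I-III, keyed by number of superpixels. Within each
# row the column order is (5, 0.51), (5, 0.515), (5, 0.52), (10, 0.51), ...,
# (15, 0.52), i.e. (update frequency, occlusion detection threshold).
AVG_CENTER_ERROR = {
    "transformer": {
        200: [12, 13, 12, 13, 12, 12, 10, 12, 12],
        300: [13, 13, 12, 14, 13, 13, 12, 13, 13],
        400: [10, 14, 12, 11, 12, 13, 12, 12, 12],
    },
    "bird2": {
        200: [16, 23, 18, 18, 20, 19, 15, 16, 21],
        300: [16, 18, 19, 16, 17, 16, 15, 18, 20],
        400: [15, 16, 17, 16, 17, 16, 15, 15, 18],
    },
}

SUCCESSFUL_FRAMES = {
    "transformer": {
        200: [124] * 9,
        300: [124] * 9,
        400: [124] * 9,
    },
    "bird2": {
        200: [93, 55, 84, 87, 69, 71, 92, 94, 61],
        300: [96, 85, 71, 86, 90, 87, 96, 69, 72],
        400: [94, 91, 87, 89, 88, 93, 96, 92, 75],
    },
}


def spread(per_superpixel_rows):
    """Return (min, max, mean) over all parameter settings of one sequence."""
    values = [v for row in per_superpixel_rows.values() for v in row]
    return min(values), max(values), sum(values) / len(values)


if __name__ == "__main__":
    for name in ("transformer", "bird2"):
        print(name, "error:", spread(AVG_CENTER_ERROR[name]))
        print(name, "frames:", spread(SUCCESSFUL_FRAMES[name]))
```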
lemming #332 lemming #399 lemming #997 lemming #1299
liquor #734 liquor #778 liquor #1413 liquor #1722
singer1 #13 singer1 #81 singer1 #174 singer1 #211
basketball #35 basketball #236 basketball #678 basketball #725
woman #43 woman #60 woman #234 woman #312
transformer #17 transformer #52 transformer #73 transformer #124
Fig. 1. Tracking results on the public datasets by the IVT, Frag, MIL, PROST, VTD, TLD, Struck, HT and SPT methods. The best four trackers in terms of errors of center location are shown.
bolt #20 bolt #92 bolt #222 bolt #350
bird1 #100 bird1 #185 bird1 #224 bird1 #268
bird2 #11 bird2 #19 bird2 #63 bird2 #92
girl #117 girl #206 girl #731 girl #911
surfing1 #28 surfing1 #46 surfing1 #174 surfing1 #217
racecar #51 racecar #161 racecar #519 racecar #704
Fig. 2. Tracking results on our own datasets by the IVT, Frag, MIL, VTD, ℓ1, TLD, Struck, HT and SPT methods. The best four trackers in terms of errors of center location are shown.
TABLE I
QUANTITATIVE COMPARISONS IN TERMS OF AVERAGE ERROR OF CENTER LOCATION (TOP) AND NUMBER OF SUCCESSFULLY TRACKED FRAMES (BOTTOM) ON THE bird2 AND transformer SEQUENCES BY CHANGING PARAMETERS. THE NUMBER OF SUPERPIXELS IS 200. FOR EACH COMBINATION, THE NUMBERS INDICATE UPDATE FREQUENCY AND OCCLUSION DETECTION THRESHOLD IN ORDER.

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  12         13          12         13          12           12          10          12           12
bird2        16         23          18         18          20           19          15          16           21

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  124        124         124        124         124          124         124         124          124
bird2        93         55          84         87          69           71          92          94           61
[Figure 3: twelve error plots, one per sequence: lemming, liquor, singer1, basketball, woman, transformer, bolt, bird1, bird2, girl, surfing1 and racecar. Horizontal axis: Frame #. Vertical axis: Center Errors (in pixels). Legend: IVT, Frag, MIL, VTD, ℓ1, TLD, Struck, HT, SPT.]
Fig. 3. Error plots of the IVT, FragTrack, MILTrack, PROST, VTD, ℓ1, TLD, Struck, HT and SPT methods in terms of the error of center position in pixels.
TABLE II
QUANTITATIVE COMPARISONS IN TERMS OF AVERAGE ERROR OF CENTER LOCATION (TOP) AND NUMBER OF SUCCESSFULLY TRACKED FRAMES (BOTTOM) ON THE bird2 AND transformer SEQUENCES BY CHANGING PARAMETERS. THE NUMBER OF SUPERPIXELS IS 300. FOR EACH COMBINATION, THE NUMBERS INDICATE UPDATE FREQUENCY AND OCCLUSION DETECTION THRESHOLD IN ORDER.

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  13         13          12         14          13           13          12          13           13
bird2        16         18          19         16          17           16          15          18           20

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  124        124         124        124         124          124         124         124          124
bird2        96         85          71         86          90           87          96          69           72
TABLE III
QUANTITATIVE COMPARISONS IN TERMS OF AVERAGE ERROR OF CENTER LOCATION (TOP) AND NUMBER OF SUCCESSFULLY TRACKED FRAMES (BOTTOM) ON THE bird2 AND transformer SEQUENCES BY CHANGING PARAMETERS. THE NUMBER OF SUPERPIXELS IS 400. FOR EACH COMBINATION, THE NUMBERS INDICATE UPDATE FREQUENCY AND OCCLUSION DETECTION THRESHOLD IN ORDER.

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  10         14          12         11          12           13          12          12           12
bird2        15         16          17         16          17           16          15          15           18

Sequence     (5, 0.51)  (5, 0.515)  (5, 0.52)  (10, 0.51)  (10, 0.515)  (10, 0.52)  (15, 0.51)  (15, 0.515)  (15, 0.52)
transformer  124        124         124        124         124          124         124         124          124
bird2        94         91          87         89          88           93          96          92           75
liquor #278 liquor #768 liquor #1287 liquor #1453 liquor #1736
Fig. 4. Results of foreground/background segmentation and tracking across frames on sequence liquor. First row: original images. Second row: confidence maps of corresponding local regions, which are obtained by using the appearance model. Third row: the segmentation results. Fourth row: the final tracking results of each frame.
racecar #17 racecar #57 racecar #298 racecar #445 racecar #644
Fig. 5. Results of foreground/background segmentation and tracking across frames on sequence racecar. First row: original images. Second row: confidence maps of corresponding local regions, which are obtained by using the appearance model. Third row: the segmentation results. Fourth row: the final tracking results of each frame.
Fig. 6. Tracking results of our SPT tracker on the box sequence from the PROST dataset.
Fig. 7. Tracking results of our SPT tracker on the board sequence from the PROST dataset.
Fig. 8. A failure case of our SPT tracker on the skating1 sequence from the VTD dataset.