Boosted Histograms for Improved Object Detection
Ivan Laptev
IRISA/INRIA, Rennes, France
September 07, 2006

Histograms for object recognition
Remarkable success of recognition methods using histograms of local image measurements:
• [Swain & Ballard 1991] - Color histograms
• [Schiele & Crowley 1996] - Receptive field histograms
• [Lowe 1999] - Localized orientation histograms (SIFT)
• [Schneiderman & Kanade 2000] - Localized histograms of wavelet coefficients
• [Leung & Malik 2001] - Texton histograms
• [Belongie et al. 2002] - Shape context
• [Dalal & Triggs 2005] - Dense orientation histograms
Likely explanation: histograms are robust to image variations such as limited geometric transformations and object class variability.

Histograms: What vs. Where
What to measure?
• Color [SB91]
• Gaussian derivatives [SC96]
• Wavelet coefficients [SK00]
• Textons [LM01]
• Gradient orientation [L99, DT05]
Where to measure?
• Whole image [SB91, SC96]
• Pre-defined grid [SK00, BMP02, DT05]
• Key points [L99]
[Figure: example image regions A, B, C, D]
Remarks:
• No guarantee for optimal recognition
• Different regions may have different discriminative power

Idea
AdaBoost:
• Efficient discriminative classifier [Freund & Schapire ’97]
• Good performance for face detection [Viola & Jones ’01]
[Diagram labels: boosting, selected features, weak classifier]
Haar features, Histogram features
SVM, Neural Networks: too heavy
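To make the boosting idea concrete, here is a minimal sketch of discrete AdaBoost with single-feature threshold classifiers (decision stumps) as weak learners. The function names `train_stump`, `adaboost` and `predict` are hypothetical and chosen for illustration; this is a generic textbook formulation, not the implementation used in the talk.

```python
import numpy as np

def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=10):
    """Discrete AdaBoost; labels y must take values in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # uniform initial sample weights
    model = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)     # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)    # weight of this weak classifier
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)           # emphasise misclassified samples
        w /= w.sum()
        model.append((alpha, j, thr, pol))
    return model

def predict(model, X):
    """Sign of the weighted vote of all selected weak classifiers."""
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1) for a, j, t, p in model)
    return np.where(score >= 0, 1, -1)
```

Each round selects one feature, so the number of rounds bounds the number of selected features; this is the sense in which boosting doubles as feature selection.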

Weak learner
Possible approach: 1-dim. projections onto predefined vectors
[Figures: Example 1, Example 2]

Fisher weak learner
Alternative approach:
• Assume Normal distribution of features (hopefully valid at least for some of ~10^5 features!)
• Compute projection direction by FLD (Fisher linear discriminant) from the feature mean and feature covariance
Evidence from real image training data:
[Figure panels: Fisher learner, “1-bin” learner]
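The slide's own equation did not survive extraction. As a hedged sketch, the standard two-class Fisher linear discriminant computes the direction w = (S_pos + S_neg)^-1 (mu_pos - mu_neg) from the class means and covariances. The code below assumes this textbook form; the regularisation term `reg`, the midpoint threshold, and all function names are illustrative assumptions (inside boosting, the statistics would typically use sample weights and the threshold would be chosen to minimise weighted error).

```python
import numpy as np

def fld_direction(X_pos, X_neg, reg=1e-6):
    """Unit-length Fisher direction w = (S_pos + S_neg)^-1 (mu_pos - mu_neg)."""
    mu_p, mu_n = X_pos.mean(axis=0), X_neg.mean(axis=0)
    S = np.cov(X_pos, rowvar=False) + np.cov(X_neg, rowvar=False)
    # reg keeps the scatter matrix invertible (an assumption, not from the slides)
    S = np.atleast_2d(S) + reg * np.eye(X_pos.shape[1])
    w = np.linalg.solve(S, mu_p - mu_n)
    return w / np.linalg.norm(w)

def fld_weak_learner(X_pos, X_neg):
    """Return (w, thr): project onto w and threshold midway between class means."""
    w = fld_direction(X_pos, X_neg)
    thr = 0.5 * (X_pos.mean(axis=0) @ w + X_neg.mean(axis=0) @ w)
    return w, thr

def classify(X, w, thr):
    """+1 for samples whose 1-D projection lies on the positive side."""
    return np.where(X @ w >= thr, 1, -1)
```

Here each row of `X_pos` and `X_neg` would be one histogram feature vector. A "1-bin" learner corresponds to the special case where `w` is a unit vector selecting a single histogram bin.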

Histogram features
• ~10^5 rectangle features
• Histograms over 4 gradient orientations, 4 subdivisions for each rectangle
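One possible reading of such a feature, sketched under several assumptions: orientations are unsigned and quantised into 4 bins, votes are weighted by gradient magnitude, the 4 subdivisions form a 2x2 grid, and each histogram is L1-normalised. None of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import numpy as np

def orientation_histogram(patch, n_bins=4):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(patch.astype(float))     # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)       # fold orientation into [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def rectangle_feature(image, x, y, w, h, n_bins=4):
    """Concatenate the histograms of the 2x2 sub-blocks of rectangle (x, y, w, h).

    The rectangle must be at least 4x4 pixels so every sub-block has a gradient.
    """
    rect = image[y:y + h, x:x + w]
    hw, hh = w // 2, h // 2
    blocks = [rect[:hh, :hw], rect[:hh, hw:], rect[hh:, :hw], rect[hh:, hw:]]
    return np.concatenate([orientation_histogram(b, n_bins) for b in blocks])
```

Enumerating rectangles over many positions and sizes inside the detection window is what would yield a pool on the order of 10^5 candidate features.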

Training data
• Crop and resize
• Perturb annotation
• Increase training set ×10
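A minimal sketch of annotation perturbation: each annotated box is jittered in position and scale to produce ten training boxes per object, matching the ×10 increase. The jitter ranges `max_shift` and `max_scale` and the function name are hypothetical; the slides do not state the perturbation magnitudes.

```python
import numpy as np

def perturb_box(box, n_copies=10, max_shift=0.05, max_scale=0.05, seed=0):
    """Return n_copies jittered versions of box = (x, y, w, h)."""
    rng = np.random.default_rng(seed)
    x, y, w, h = box
    out = []
    for _ in range(n_copies):
        s = 1.0 + rng.uniform(-max_scale, max_scale)   # common scale factor
        dx = rng.uniform(-max_shift, max_shift) * w    # shift relative to box size
        dy = rng.uniform(-max_shift, max_shift) * h
        nw, nh = w * s, h * s
        cx, cy = x + w / 2 + dx, y + h / 2 + dy        # shifted box centre
        out.append((cx - nw / 2, cy - nh / 2, nw, nh))
    return out
```

Each perturbed box would then be cropped from the image and resized to the fixed training window size.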

Training: Selected Features
• 376 of ~10^5 features selected
• 0.999 correct classification
• 10^-5 false positives

Object detection
• Scan and classify image windows at different positions and scales
• Cluster detections in the space-scale space
• Use the cluster size as the detection confidence
[Figure: example detection cluster with Conf. = 5]
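The clustering step can be sketched as a greedy grouping of raw window detections in position and log-scale, with the number of members serving as the confidence. The grouping rule, the `radius` parameter, and the function name are assumptions for illustration; the slides do not specify the clustering algorithm.

```python
import numpy as np

def cluster_detections(dets, radius=0.5):
    """Greedily cluster detections given as (cx, cy, size).

    A detection joins a cluster when its centre distance (in units of the
    cluster's mean size) and its log-scale difference are both below radius.
    Returns one (cx, cy, size, confidence) tuple per cluster, where
    confidence is the number of member detections.
    """
    clusters = []                                  # list of member lists
    for d in dets:
        for members in clusters:
            cx, cy, s = np.mean(members, axis=0)   # current cluster centre
            close = np.hypot(d[0] - cx, d[1] - cy) / s < radius
            similar = abs(np.log(d[2] / s)) < radius
            if close and similar:
                members.append(d)
                break
        else:
            clusters.append([d])                   # no match: start a new cluster
    return [(*np.mean(m, axis=0), len(m)) for m in clusters]
```

Five overlapping windows around one object would collapse into a single detection with confidence 5, while an isolated window would keep confidence 1 and is more likely to be a false positive.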

PASCAL Visual Object Classes Challenge 2005 (VOC’05)
• motorbikes: #217 / #220
• bicycles: #123 / #123
• people: #152 / #149
• cars: #320 / #341

Evaluation criteria
Ground truth annotation
Detection results:
• >50% overlap of bounding box with GT
• One bounding box for each object
• Confidence value for each detection
Precision-Recall (PR) curve
Average Precision (AP) value
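These criteria can be sketched in code as follows. The overlap is taken here as intersection over union, and AP as the 11-point interpolated average of precision over recall levels 0, 0.1, ..., 1; both are common definitions assumed for illustration, since the slide's own formulas were not recovered. The function names and the single-image simplification are also assumptions.

```python
import numpy as np

def overlap(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(dets, gts, min_overlap=0.5):
    """dets: list of (confidence, box); gts: list of ground-truth boxes."""
    dets = sorted(dets, key=lambda d: -d[0])       # most confident first
    used = [False] * len(gts)
    tp = np.zeros(len(dets))
    for i, (_, box) in enumerate(dets):
        ious = [overlap(box, g) for g in gts]
        j = int(np.argmax(ious)) if ious else -1
        # a ground-truth object can be matched by one detection only
        if j >= 0 and ious[j] > min_overlap and not used[j]:
            used[j] = True
            tp[i] = 1
    ctp = np.cumsum(tp)
    precision = ctp / np.arange(1, len(dets) + 1)
    recall = ctp / max(len(gts), 1)
    return precision, recall

def average_precision(precision, recall):
    """11-point interpolated AP (an assumed definition)."""
    ap = 0.0
    for t in np.linspace(0, 1, 11):
        mask = recall >= t
        ap += precision[mask].max() if mask.any() else 0.0
    return ap / 11
```

Sorting by confidence before matching is what lets the cluster-size confidence trace out the PR curve: lowering the confidence threshold adds detections, raising recall at the cost of precision.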

Evaluation of detection
PR-curves for the “Motorbike” validation dataset:
FLD learner
+ 1-bin classifier
[Levi and Weiss, CVPR 2004] “Learning object detection from a small number of examples: The importance of good features”

Results for VOC’05 Challenge
[Figure panels: Bicycles test1, People test1, Cars test1, Motorbikes test1]
Average Precision values:

PASCAL Visual Object Classes Challenge 2006 (VOC’06)

Results for VOC’06 Challenge
Competition "comp3" (train on VOC data), one slide per class:
• Class "bicycle" [figure with examples]
• Class "cow" [figure with examples]
• Class "horse" [figure with examples]
• Class "motorbike" [figure]
• Class "person" [figure]

Results for VOC’06 Challenge
Average Precision values:

| Method | bicycle | bus | car | cat | cow | dog | horse | motorbike | person | sheep |
|---|---|---|---|---|---|---|---|---|---|---|
| Cambridge | 0.249 | 0.138 | 0.254 | 0.151 | 0.149 | 0.118 | 0.091 | 0.178 | 0.030 | 0.131 |
| ENSMP | - | - | 0.398 | - | 0.159 | - | - | - | - | - |
| INRIA_Douze | 0.414 | 0.117 | 0.444 | - | 0.212 | - | - | 0.390 | 0.164 | 0.251 |
| INRIA_Laptev | 0.440 | - | - | - | 0.224 | - | 0.140 | 0.318 | 0.114 | - |
| TUD | - | - | - | - | - | - | - | 0.153 | 0.074 | - |
| TKK | 0.303 | 0.169 | 0.222 | 0.160 | 0.252 | 0.113 | 0.137 | 0.265 | 0.039 | 0.227 |

Final Notes
• All results are obtained with a single set of parameters
• A small number of training samples is sufficient
• Efficient detection: 10 fps on 320x280 images
• Extension to texton/color histogram features is straightforward
Open questions:
• Are other free-shape regions better? How to find them?
• Better weak learner that takes advantage of histogram properties
• View transformations
• Detection tasks in VOC’05 and VOC’06 are far from being solved; it is a challenge!