Post on 21-Dec-2015
Robust Real-Time Object Detection
Paul Viola & Michael Jones
Introduction
Frontal face detection is achieved
Comparatively satisfactory detection rates
Efficient reduction of the false positive rate
Extremely rapid operation: a 384x288 pixel image is processed at 15 frames/second
Contribution of The Paper
Integral image: a new image representation
AdaBoost: effective feature selection and classifier training
Cascade structure of increasingly complex classifiers: dramatic decrease in detection time
Simple Rectangle Features
Why not use pixels directly? Features encode domain knowledge that is hard to learn from a finite quantity of training data
Features operate much faster than pixel-based systems
Integral Image
The discrete analogue of a double integral of the original image: each location holds the sum of the pixels above and to its left
A new image representation for fast calculation of rectangle features
Integral Image
The sum of pixels in rectangle D of the original image can be computed from the integral image with four array references:
P(4) - P(3) - P(2) + P(1)
where P(i) is the integral image value at corner i of the rectangle
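In Python, the one-pass construction and the four-reference rectangle sum can be sketched as follows (NumPy assumed; function names are illustrative):

```python
import numpy as np

def integral_image(img):
    """One pass over the image: cumulative sum over rows, then columns."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] via four lookups:
    P(4) - P(3) - P(2) + P(1), where P(4) is the bottom-right corner,
    P(2)/P(3) the top-right/bottom-left corners just outside the rectangle,
    and P(1) the top-left corner just outside it.
    """
    total = ii[bottom, right]                # P(4)
    if left > 0:
        total -= ii[bottom, left - 1]        # P(3), bottom-left
    if top > 0:
        total -= ii[top - 1, right]          # P(2), top-right
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]       # P(1), top-left
    return total
```

Any rectangle sum (and hence any two-, three-, or four-rectangle feature) then costs a constant number of array references, regardless of the rectangle's size.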
Advantages of Integral Image
Image pyramid: requires a pyramid of scaled images; a fixed-scale detector works on all those images
Forming the pyramid is computationally expensive
Integral image: a single feature can be evaluated at any scale and location in a few operations
The integral image is computed in one pass over the original image
Learning Classification Functions
45,394 features associated with each sub-window
A very small number of these features can be combined to form an effective classifier
A variant of AdaBoost is used both to select features and to train the classifier
How does AdaBoost work?
Combines a mixture of weak classifiers to form a strong one
In each round, the weak learning algorithm returns the single-feature classifier having the minimum weighted classification error
The examples are then re-weighted according to the accuracy of the selected classifier
The final strong classifier is a weighted combination of weak classifiers
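The boosting loop above can be sketched with simple threshold stumps standing in for the rectangle-feature classifiers (a toy illustration of discrete AdaBoost, not the paper's exact weak learner):

```python
import numpy as np

def adaboost(X, y, n_rounds=10):
    """Minimal discrete AdaBoost with threshold stumps.
    X: (n, d) feature values; y: labels in {-1, +1}.
    Each round picks the stump (feature, threshold, polarity) with the
    lowest weighted error, then re-weights the examples.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)          # example weights, initially uniform
    stumps = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = max(err, 1e-10)                     # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)     # weight of this weak classifier
        w *= np.exp(-alpha * y * pred)            # up-weight misclassified examples
        w /= w.sum()
        stumps.append((alpha, j, thr, pol))
    return stumps

def predict(stumps, X):
    """Final strong classifier: weighted vote of the weak classifiers."""
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in stumps)
    return np.where(score >= 0, 1, -1)
```

In the paper, each stump corresponds to one rectangle feature evaluated on the integral image, so selecting a stump per round is exactly how AdaBoost doubles as a feature selector.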
How does AdaBoost work?
First and second features selected by AdaBoost
Attentional Cascade
Increase detection performance & reduce computation time
Calling simpler classifiers before complex ones
A simple two-feature classifier example: 100% detection rate, 40% false positive rate, 60 microprocessor instructions (very efficient)
Attentional Cascade
Training of Cascade of Classifiers
The deeper classifiers are trained with harder examples
Simple classifiers in the first stages, complex ones in the deeper parts of the cascade
Complex classifiers take more time to compute
A typical detection algorithm achieves an 85-95% detection rate and a 10^-5 to 10^-6 false positive rate
The cascade system works as follows:
With a 10-stage classifier, each stage having a 99% detection rate and a 30% false positive rate, the overall system achieves:
0.99^10 ≈ 90% detection rate
0.30^10 ≈ 6 × 10^-6 false positive rate
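The compounding of per-stage rates is simple arithmetic, since a window must pass every stage:

```python
# Per-stage rates multiply across a 10-stage cascade:
d_stage, f_stage, stages = 0.99, 0.30, 10
D = d_stage ** stages   # overall detection rate, ≈ 0.904
F = f_stage ** stages   # overall false positive rate, ≈ 5.9e-6
print(f"D = {D:.3f}, F = {F:.1e}")
```

This is the key asymmetry of the cascade: detection rates close to 1 survive exponentiation almost intact, while modest per-stage false positive rates collapse to near zero.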
Training of Cascade of Classifiers
Requirements
Needs to be determined:
Number of stages
Number of features for each stage
Thresholds for each stage
Practical Implementation
The user selects an acceptable false positive rate f_i and detection rate d_i for each layer
Each layer is trained with AdaBoost; the number of features is increased until the target f_i and d_i are met for that layer
If the overall system targets F and D are not yet met, a new layer is added to the cascade
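The layer-adding loop can be sketched as follows; `train_layer` is a hypothetical callback standing in for AdaBoost training plus rate measurement on a validation set, and all names are illustrative:

```python
def build_cascade(f_target, d_target, F_goal, train_layer):
    """Sketch of the cascade-building loop.
    train_layer(n_features) must return the measured (f, d) of a layer
    trained with that many features on the examples that survive the
    previous layers.
    """
    F, D, cascade = 1.0, 1.0, []
    while F > F_goal:
        n_features, f, d = 0, 1.0, 0.0
        # Grow this layer until its per-layer targets are met
        while f > f_target or d < d_target:
            n_features += 1
            f, d = train_layer(n_features)
        cascade.append(n_features)
        F *= f   # overall rates compound multiplicatively
        D *= d
    return cascade, F, D
```

In the real system each new layer is also trained on the false positives of the cascade so far, which is why deeper layers face harder examples and need more features.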
Results – Structure of Cascade
32 layers – 4297 features
Weeks were spent training the cascade
Layer #            1     2     3     4     5     6     7   …
Features           2     5    20    20    20    50    50   …
False Positives   40%   20%
Detection Rate   100%  100%
Results – Algorithm Details
All sub-windows (training and testing) are variance-normalized to compensate for lighting conditions
Scaling is achieved by scaling the detector rather than the image
A step size of one pixel is used
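Variance normalization can itself use the integral-image trick: a second integral image of squared pixels gives the window variance as E[x^2] - (E[x])^2. A sketch (names are illustrative; in practice the two tables are computed once per frame, not per window):

```python
import numpy as np

def window_mean_std(img, top, left, size):
    """Mean and standard deviation of a size x size sub-window using two
    integral images: one of the pixels, one of the squared pixels.
    """
    ii = img.astype(float).cumsum(0).cumsum(1)
    sq = (img.astype(float) ** 2).cumsum(0).cumsum(1)

    def area_sum(tab):
        b, r = top + size - 1, left + size - 1
        total = tab[b, r]
        if left > 0:
            total -= tab[b, left - 1]
        if top > 0:
            total -= tab[top - 1, r]
        if top > 0 and left > 0:
            total += tab[top - 1, left - 1]
        return total

    n = size * size
    mean = area_sum(ii) / n
    var = area_sum(sq) / n - mean * mean   # E[x^2] - (E[x])^2
    return mean, np.sqrt(max(var, 0.0))
```

With mean and standard deviation available in constant time per window, normalization can be folded into the feature threshold instead of touching the pixels.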
Results
Most of the windows are rejected in the first and second cascade stages
Face detection on a 384x288 image runs in about 0.067 seconds
15 times faster than Rowley-Baluja-Kanade
600 times faster than Schneiderman-Kanade
Results
Tested on 130 images containing 507 faces (the MIT+CMU frontal face test set)