1
Object Class Detection
Christoph Einsiedler
2
Motivation
  Face recognition
  StreetView street address recognition
http://googleonlinesecurity.blogspot.de/2014/04/street-view-and-recaptcha-technology.html
3
Motivation
  Electronic driving aids (traffic sign recognition)
  Image organisation/search (automatic tagging)
http://rossel-vw.de/p_50679/de/models/cc/galerie.html
4
Problem description
  Object Class Detection
    Classification
    Localization
  Face recognition etc. as special cases
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/
5
Problem description
  Robustness
    Big differences between instances of the same category
    Small differences between instances of different categories
  Complexity
    Huge number of categories
6
Algorithms
  Find interest points: SIFT, …
  Interest point description: SIFT, HOG, …
  Image description: Bag-of-features, …
7
Algorithms SIFT
1. Scale-space extrema detection
  Convolution with Gaussian filters at different scales
  Calculation of differences between adjacent scales
  Local extrema of these differences become keypoint candidates
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
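The idea of step 1 can be sketched on a 1-D signal. This is only an illustration under simplifying assumptions: the function names, the sigma values, and the 1-D setting are made up for this example, whereas the method in the linked paper works on 2-D images organised in octaves.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def dog_extrema(signal, sigmas):
    """Return (scale_index, position) pairs that are strict extrema of the
    difference-of-Gaussians stack relative to their neighbours in space and scale."""
    blurred = [blur(signal, s) for s in sigmas]
    dog = [[b - a for a, b in zip(blurred[i], blurred[i + 1])]
           for i in range(len(blurred) - 1)]
    keypoints = []
    for s in range(1, len(dog) - 1):          # needs a scale above and below
        for x in range(1, len(signal) - 1):   # needs a neighbour left and right
            value = dog[s][x]
            neighbours = [dog[s + ds][x + dx]
                          for ds in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (ds, dx) != (0, 0)]
            if value > max(neighbours) or value < min(neighbours):
                keypoints.append((s, x))
    return keypoints
```

A call such as `dog_extrema(signal, [1.0, 1.4, 2.0, 2.8])` returns candidate positions together with the index of the scale at which they were found.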
8
Algorithms SIFT
2. Keypoint localization
  Calculation of interpolated positions
  Removal of keypoints with low contrast
  Removal of poorly localized keypoints on edges
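The two rejection tests of step 2 can be sketched as a single predicate. The function name and argument layout are hypothetical. The default thresholds (contrast 0.03 for pixel values in [0, 1], curvature ratio 10) are the values suggested in the linked paper, and the sub-pixel interpolation is left out here.

```python
def keep_keypoint(dog_value, dxx, dyy, dxy,
                  contrast_threshold=0.03, edge_ratio=10.0):
    """Decide whether a keypoint candidate survives both rejection tests.

    dog_value     -- difference-of-Gaussians response at the candidate
    dxx, dyy, dxy -- second derivatives (2x2 Hessian) of the DoG image there
    """
    # Low contrast: a weak response is sensitive to noise.
    if abs(dog_value) < contrast_threshold:
        return False
    # Edge test: on an edge one principal curvature is much larger than the
    # other, which makes trace^2 / det of the Hessian large.
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of different sign: not a well-defined extremum
        return False
    return trace * trace / det < (edge_ratio + 1) ** 2 / edge_ratio
```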
9
Algorithms SIFT
3. Orientation assignment
  Gradients of the Gaussian-smoothed image are considered (scale invariance)
  Magnitudes and directions are put into a histogram
  Orientation of the highest peak is assigned (rotation invariance)
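A minimal sketch of the orientation histogram follows, assuming a small grey-value patch given as a list of rows. The function name is hypothetical, 36 bins follow the linked paper, and the Gaussian weighting and peak interpolation used there are omitted.

```python
import math

def dominant_orientation(patch, num_bins=36):
    """Return the centre angle (in degrees) of the strongest bin of the
    magnitude-weighted gradient-orientation histogram of a patch.

    Angles are measured in image coordinates (y grows downwards).
    """
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle // bin_width) % num_bins] += magnitude
    peak = max(range(num_bins), key=lambda b: hist[b])
    return (peak + 0.5) * bin_width
```

Rotating the descriptor window by the returned angle is what makes the later description independent of image rotation.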
10
Algorithms SIFT
4. Keypoint descriptor
  (invariance to illumination, viewing angle, …)
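The illumination part of this invariance comes from normalizing the descriptor vector. The sketch below assumes the descriptor is already given as a flat list of histogram values (128 values in the linked paper). The function name is hypothetical, and the clipping value 0.2 is the one reported in that paper.

```python
import math

def normalize_descriptor(vector, clip=0.2):
    """Normalize to unit length, clip large entries, then renormalize.

    Unit length cancels a global change in contrast; clipping limits the
    influence of a few very large gradient magnitudes.
    """
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    clipped = [min(v / norm, clip) for v in vector]
    norm = math.sqrt(sum(v * v for v in clipped)) or 1.0
    return [v / norm for v in clipped]
```

Scaling all input values by a constant factor leaves the result unchanged, which is the invariance the slide refers to.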
11
Algorithms
  Find interest points: SIFT ✓, …
  Interest point description: SIFT ✓, HOG, …
  Image description: Bag-of-features, …
12
Algorithms
HOG
1. Gamma/Color normalization
  Greyscale, RGB and LAB color spaces tested
  Not necessary
http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf
13
Algorithms
HOG
2. Gradient computation
  Different masks tested (e.g. Sobel masks)
  1-D centered mask [-1, 0, 1] works best
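The 1-D centered mask amounts to a simple pixel difference in each direction. The sketch below assumes a grey-value image given as a list of rows. The function name and the choice to leave border pixels at zero are assumptions made for this example.

```python
def centered_gradients(image):
    """Apply the 1-D centred mask [-1, 0, 1] along x and along y.

    Returns two images (gx, gy) of the same size as the input; pixels
    without both neighbours in a direction keep a gradient of 0 there.
    """
    rows, cols = len(image), len(image[0])
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if 0 < x < cols - 1:
                gx[y][x] = image[y][x + 1] - image[y][x - 1]
            if 0 < y < rows - 1:
                gy[y][x] = image[y + 1][x] - image[y - 1][x]
    return gx, gy
```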
14
Algorithms
HOG
3. Orientation binning
  Edge orientation histogram for each cell of the image
  Orientations grouped into 9 bins (0-180°)
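For one cell, the binning can be sketched as follows. The inputs are assumed to be the x and y gradients of the pixels in that cell. The function name is hypothetical, and each pixel votes with its gradient magnitude into a single bin, without the interpolation between neighbouring bins described in the linked paper.

```python
import math

def cell_histogram(gx_cell, gy_cell, num_bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell,
    weighted by gradient magnitude."""
    hist = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    for row_x, row_y in zip(gx_cell, gy_cell):
        for dx, dy in zip(row_x, row_y):
            magnitude = math.hypot(dx, dy)
            # Modulo 180 maps opposite gradient directions to the same bin.
            angle = math.degrees(math.atan2(dy, dx)) % 180.0
            hist[int(angle // bin_width) % num_bins] += magnitude
    return hist
```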
15
Algorithms
HOG
4. Normalization and descriptor blocks
  Cells grouped into blocks (rectangular R-HOG or circular C-HOG)
  Normalization within each block
  Aggregation into one vector
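For rectangular blocks, the normalization and aggregation step might look like the sketch below. It assumes the cell histograms are already available as a grid. The function name, the block size of 2x2 cells with a stride of one cell, and the plain L2 norm are assumptions chosen for illustration.

```python
import math

def hog_descriptor(cell_hists, block_size=2, eps=1e-6):
    """Concatenate L2-normalized, overlapping blocks of cell histograms.

    cell_hists[r][c] is the orientation histogram of the cell in row r,
    column c. Blocks overlap, so each cell contributes to several blocks.
    """
    rows, cols = len(cell_hists), len(cell_hists[0])
    descriptor = []
    for r in range(rows - block_size + 1):
        for c in range(cols - block_size + 1):
            block = []
            for dr in range(block_size):
                for dc in range(block_size):
                    block.extend(cell_hists[r + dr][c + dc])
            norm = math.sqrt(sum(v * v for v in block) + eps * eps)
            descriptor.extend(v / norm for v in block)
    return descriptor
```

With a grid of 16 x 8 cells and 9 bins per cell this gives 15 x 7 blocks of 36 values each, i.e. a vector of length 3780.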
16
Algorithms
  Find interest points: SIFT ✓, …
  Interest point description: SIFT ✓, HOG ✓, …
  Image description: Bag-of-features, …
17
Algorithms
Bag-of-features
  Origins in document classification
  Later also used for object class detection in images
http://www.dtic.mil/dtic/tr/fulltext/u2/a307731.pdf
18
Algorithms
Bag-of-features
  Clustering of local descriptors into visual words
  Create signatures for images
http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/dorko_schmid_obj_class_rec.pdf
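The two steps can be sketched with a plain k-means clustering followed by a histogram of cluster assignments. This is an illustration under assumptions, not the method of the linked paper: the function names are hypothetical, the first k descriptors serve as initial cluster centres, and descriptors are short lists of numbers.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centres, point):
    """Index of the cluster centre closest to the given descriptor."""
    return min(range(len(centres)),
               key=lambda i: squared_distance(centres[i], point))

def kmeans(points, k, iterations=10):
    """Plain Lloyd iterations; the first k points are the initial centres."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centres, p)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centre if a cluster is empty
                centres[i] = [sum(col) / len(group) for col in zip(*group)]
    return centres

def signature(descriptors, centres):
    """Normalized histogram of cluster (visual word) occurrences for one image."""
    counts = [0] * len(centres)
    for d in descriptors:
        counts[nearest(centres, d)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]
```

The centres are learned once from the descriptors of many training images. Afterwards every image, regardless of how many descriptors it has, is represented by a signature of fixed length k.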
19
Algorithms
  Find interest points: SIFT ✓, …
  Interest point description: SIFT ✓, HOG ✓, …
  Image description: Bag-of-features ✓, …
20
Evaluation
  Comparability not easy
  Pascal VOC often used
    Benchmark (training data, test data)
    Images from Flickr
    Manually annotated
    Annual competitions
21
Evaluation
  Classification/Detection Competitions
    Classification
    Localization
  Segmentation Competition
  Action Classification Competition
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/
22
Evaluation
  Classification/detection competition
  20 classes of objects:
Class (example images not reproduced)
aeroplane
bicycle
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/
23
Evaluation
Class (example images not reproduced)
bird
boat
bottle
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/
24
Evaluation
Class
bus
car
cat
chair
cow
diningtable
dog
horse
Class
motorbike
person
potted plant
sheep
sofa
train
tv/monitor
25
Evaluation
  Evaluation measures:
    Recall
    Precision
    Average Precision
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/
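The three measures can be sketched for a ranked list of results. The function names and the input format are assumptions. Pascal VOC itself uses an interpolated variant of average precision, so numbers from this simplified, non-interpolated version are not directly comparable to the benchmark.

```python
def precision_recall(ranked_labels, num_positives):
    """Precision and recall after each position of a ranked result list.

    ranked_labels -- 1 for a correct result, 0 otherwise, sorted by
                     decreasing confidence of the classifier/detector
    num_positives -- total number of positive examples in the test set
    """
    precisions, recalls = [], []
    true_positives = 0
    for rank, label in enumerate(ranked_labels, start=1):
        true_positives += label
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_positives)
    return precisions, recalls

def average_precision(ranked_labels, num_positives):
    """Mean of the precision values at the ranks where a positive is found
    (positives that are never retrieved count as precision 0)."""
    if not num_positives:
        return 0.0
    precisions, _ = precision_recall(ranked_labels, num_positives)
    hits = [p for p, label in zip(precisions, ranked_labels) if label]
    return sum(hits) / num_positives
```

For example, `average_precision([1, 0, 1, 1, 0], 3)` averages the precision values 1, 2/3 and 3/4 obtained at the three correct results.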
26
Evaluation
Pascal VOC 2012 results

| algorithm | mean | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | diningtable | dog | horse | motorbike | person | potted plant | sheep | sofa | train | tv/monitor |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NUSPL_CTX_GPM_SCM | 82.2 | 97.3 | 84.2 | 80.8 | 85.3 | 60.8 | 89.9 | 86.8 | 89.3 | 75.4 | 77.8 | 75.1 | 83.0 | 87.5 | 90.1 | 95.0 | 57.8 | 79.2 | 73.4 | 94.5 | 80.7 |
| NUSPSL_CTX_GPM | 78.6 | 95.5 | 81.1 | 79.4 | 82.5 | 58.2 | 87.7 | 84.1 | 83.1 | 68.5 | 72.8 | 68.5 | 76.4 | 83.3 | 87.5 | 92.8 | 56.5 | 77.8 | 67.0 | 91.2 | 77.6 |
| NLPR_PLS_SSVW | 78.3 | 94.5 | 82.6 | 79.4 | 80.7 | 57.8 | 87.8 | 85.5 | 83.9 | 66.6 | 74.2 | 69.4 | 75.2 | 83.0 | 88.2 | 93.6 | 56.2 | 75.6 | 64.1 | 90.0 | 76.6 |
| NUS_Context_SVM | 78.3 | 95.3 | 81.5 | 78.9 | 81.8 | 57.5 | 87.3 | 83.7 | 82.3 | 68.4 | 75.0 | 68.5 | 75.8 | 82.9 | 86.7 | 92.7 | 56.8 | 77.7 | 66.1 | 90.7 | 77.1 |
| Semi-Semantic Visual Words & Partial Least Squares | 78.3 | 94.5 | 82.6 | 79.4 | 80.7 | 57.8 | 87.8 | 85.5 | 83.9 | 66.6 | 74.2 | 69.4 | 75.2 | 83.0 | 88.2 | 93.6 | 56.2 | 75.6 | 64.1 | 90.0 | 76.6 |
| NUSPSL_CTX_GPM_SVM | 76.7 | 94.3 | 78.5 | 76.4 | 80.0 | 57.0 | 86.3 | 82.1 | 81.5 | 65.6 | 74.7 | 66.5 | 73.4 | 81.9 | 85.4 | 91.9 | 53.2 | 74.0 | 65.1 | 89.5 | 76.1 |
| CVC_UVA_UNITN | 74.3 | 92.0 | 74.2 | 73.0 | 77.5 | 54.3 | 85.2 | 81.9 | 76.4 | 65.2 | 63.2 | 68.5 | 68.9 | 78.2 | 81.0 | 91.6 | 55.9 | 69.4 | 65.4 | 86.7 | 77.4 |
| UvA_UNITN_MostTellingMonkey | 73.4 | 90.1 | 74.1 | 66.6 | 76.0 | 57.0 | 85.6 | 81.2 | 74.5 | 63.5 | 62.7 | 64.5 | 66.6 | 76.5 | 81.3 | 90.8 | 58.7 | 69.5 | 66.3 | 84.7 | 77.3 |
| CVC_CLS | 71.0 | 89.3 | 70.9 | 69.8 | 73.9 | 51.3 | 84.8 | 79.6 | 72.9 | 63.8 | 59.4 | 64.1 | 64.7 | 75.5 | 79.2 | 91.4 | 42.7 | 63.2 | 61.9 | 86.7 | 73.8 |
| MSRA_USTC_HIGH_ORDER_SVM | 70.5 | 92.8 | 74.8 | 69.6 | 76.1 | 47.3 | 83.5 | 76.4 | 76.9 | 59.8 | 54.5 | 63.5 | 67.0 | 75.1 | 78.8 | 90.4 | 43.2 | 63.3 | 60.4 | 85.6 | 71.2 |
27
Thank you for your attention.