pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · spatial pyramid with...
TRANSCRIPT
![Page 1: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/1.jpg)
Outline
1. Bag of visual words model for categorization
• SVM classifier
2. Adding spatial information for localization
3. Databases and challenges
4. Spatial layout
5. Class based segmentation
• Pixel level localization
6. Conclusions and the future
![Page 2: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/2.jpg)
Beyond a bag of visual words
![Page 3: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/3.jpg)
Outline
• Region of Interest (ROI)
• jumping/sliding window for localization
• Spatial tiling
• Histogram of Gradients (HOG)
• Spatial pyramid
• Case study
![Page 4: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/4.jpg)
Region of Interest (ROI)
• Problem of background clutter
• Use a sub-window
– At correct position, no clutter is present
![Page 5: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/5.jpg)
Sliding window detection
– Scale / orientation range to search over
– Speed
– Context
![Page 6: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/6.jpg)
Search over scale
• Smallest scale
• Larger scale
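The search over position and scale can be sketched as a simple window generator. This is only an illustration: the function name `sliding_windows`, the scale set, and the stride fraction are assumptions, not values from the lecture.

```python
from typing import Iterator, Tuple


def sliding_windows(img_w: int, img_h: int, base_w: int, base_h: int,
                    scales=(1.0, 1.5, 2.0),
                    stride_frac: float = 0.25) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) boxes covering the image at each scale.

    stride_frac is a hypothetical parameter: the step between windows,
    expressed as a fraction of the window size.
    """
    for s in scales:
        w, h = int(round(base_w * s)), int(round(base_h * s))
        if w > img_w or h > img_h:
            continue  # window does not fit at this scale
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                yield (x, y, w, h)
```

Each yielded box would then be scored by the classifier; a coarser stride trades localization accuracy for speed.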
![Page 7: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/7.jpg)
Problems with sliding windows …
• aspect ratio
• granularity (finite grid)
• partial occlusion
• multiple responses
See recent work by
• Christoph Lampert et al CVPR 08, ECCV 08
• Bosch et al BMVC 08
![Page 8: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/8.jpg)
Sliding window
• Classifier: SVM with linear kernel
• bag of visual word representation of ROI
• Stronger training: ROI on object instance
Example detections for dog
Lampert et al CVPR 08
![Page 9: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/9.jpg)
More spatial information - tiling
Use spatial grid to define correspondence
If codebook has V visual words, then representation has dimension 4V
Fergus et al ICCV 05
• parameter: number of tiles
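A minimal sketch of the tiled representation, assuming each feature has already been assigned a visual word index and an (x, y) position. The function name `tiled_bow` and its arguments are hypothetical. With a 2 × 2 grid and V words the output has dimension 4V, as stated above.

```python
import numpy as np


def tiled_bow(xs, ys, words, width, height, vocab_size, tiles_per_side=2):
    """Concatenate one bag-of-words histogram per spatial tile."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    words = np.asarray(words, dtype=int)
    # Tile column/row of each feature; clamp points on the far border.
    col = np.minimum((xs / width * tiles_per_side).astype(int), tiles_per_side - 1)
    row = np.minimum((ys / height * tiles_per_side).astype(int), tiles_per_side - 1)
    tile = row * tiles_per_side + col
    hist = np.zeros((tiles_per_side ** 2, vocab_size))
    np.add.at(hist, (tile, words), 1)  # unbuffered add handles repeated indices
    return hist.ravel()
```

The number of tiles is the only parameter of the layout; `tiles_per_side=1` reduces to the plain bag of visual words.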
![Page 10: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/10.jpg)
Ex: Leibe & Schiele 03/04 : Generalized Hough Transform
• Learning: for every cluster, store possible “occurrences”
• Recognition: for new image, let the matched patches vote for possible object positions
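The learning and recognition steps above can be sketched as follows. This is a simplification under stated assumptions: votes are accumulated in a discretized grid rather than the continuous voting space used by Leibe & Schiele, and the names `learn_occurrences`, `vote`, and the `cell` size are hypothetical.

```python
from collections import defaultdict

import numpy as np


def learn_occurrences(training_patches):
    """training_patches: iterable of (cluster_id, (dx, dy)), where (dx, dy)
    is the offset from the patch to the object centre seen in training."""
    occurrences = defaultdict(list)
    for cluster_id, offset in training_patches:
        occurrences[cluster_id].append(offset)
    return occurrences


def vote(matches, occurrences, width, height, cell=10):
    """matches: iterable of (cluster_id, (x, y)) for patches in a new image.
    Returns an accumulator of votes for the object centre."""
    acc = np.zeros((height // cell + 1, width // cell + 1))
    for cluster_id, (x, y) in matches:
        offsets = occurrences.get(cluster_id, [])
        for dx, dy in offsets:
            cx, cy = x + dx, y + dy
            if 0 <= cx < width and 0 <= cy < height:
                # Each stored occurrence shares the patch's vote equally.
                acc[int(cy) // cell, int(cx) // cell] += 1.0 / len(offsets)
    return acc
```

The maximum of the accumulator gives a hypothesis for the object position.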
![Page 11: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/11.jpg)
Ex: Leibe & Schiele 03/04 : Generalized Hough Transform
[Figure panels: Interest Points; Matched Codebook Entries; Probabilistic Voting; Voting Space (continuous); Backprojection of Maximum]
![Page 12: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/12.jpg)
More features – histogram of orientations
Counts in orientation bins can be thought of as visual words
[Figure: image, dominant direction, HOG; histogram axes: frequency vs. orientation]
• tiling
• each tile represents HOG
• dense descriptor
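A rough sketch of a tiled orientation histogram. It is not the Dalal & Triggs descriptor: it uses one global L2 normalization instead of overlapping block normalization, and the bin count and function names are assumptions made for illustration.

```python
import numpy as np


def orientation_histogram(patch, n_bins=9):
    """Gradient-magnitude-weighted histogram of unsigned orientations."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned: 0..180 degrees
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)


def tiled_hog(image, tiles_per_side=2, n_bins=9):
    """One orientation histogram per tile, concatenated and L2-normalized."""
    h, w = image.shape
    rows = np.array_split(np.arange(h), tiles_per_side)
    cols = np.array_split(np.arange(w), tiles_per_side)
    feats = [orientation_histogram(image[r[0]:r[-1] + 1, c[0]:c[-1] + 1], n_bins)
             for r in rows for c in cols]
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-12)
```

Each bin count plays the role of a visual word count, so the tiled vector can be fed to the same classifiers as a tiled bag of words.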
![Page 13: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/13.jpg)
Ex 1: Human (Pedestrian) Detection
Histograms of Oriented Gradients for Human Detection
Dalal & Triggs, CVPR 2005
![Page 14: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/14.jpg)
• training: ROI over pedestrian
• classification: linear SVM on HOG
• NB similarity to SIFT, GIST
![Page 15: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/15.jpg)
Dalal and Triggs, CVPR 2005
![Page 16: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/16.jpg)
Learned model
Slide from Deva Ramanan
$f(x) = w^\top x + b$
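The learned linear model scores a window's feature vector with a single dot product. A minimal sketch, where the function names and the zero threshold are assumptions:

```python
import numpy as np


def linear_score(w, x, b):
    """Evaluate f(x) = w^T x + b for one feature vector."""
    return float(np.dot(w, x) + b)


def detect(window_features, w, b, threshold=0.0):
    """Return indices of windows whose score exceeds the threshold."""
    return [i for i, x in enumerate(window_features)
            if linear_score(w, x, b) > threshold]
```

Because the score is linear in x, the weight vector w has the same layout as the HOG descriptor and can be visualized in the same way.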
![Page 17: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/17.jpg)
Slide from Deva Ramanan
![Page 18: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/18.jpg)
Ex 2: Upper body detector – using HOGs
• Ferrari et al CVPR 08
average training data
![Page 19: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/19.jpg)
More features – dense visual words
Vogel & Schiele 2004, Jurie & Triggs ICCV 05 , Fei-Fei & Perona CVPR 05, Bosch et al ECCV 06
DENSE PATCHES
Textons (Leung & Malik 1999, Varma & Zisserman 2003)
• Row reorder gray values and form a vector of size N^2
• Parameters: N – size of patch; M – distance between patches
SIFT
• 128-dimensional SIFT descriptor
• Parameters: r – radius of patch; M – distance between patches
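Dense sampling of raw gray-value patches can be sketched in a few lines. N and M follow the slide's parameters (patch size and spacing); the function name `dense_patches` and the default values are assumptions.

```python
import numpy as np


def dense_patches(image, N=5, M=3):
    """Extract N x N patches every M pixels and flatten each row by row
    into a vector of length N*N."""
    h, w = image.shape
    vecs = [image[y:y + N, x:x + N].ravel()
            for y in range(0, h - N + 1, M)
            for x in range(0, w - N + 1, M)]
    return np.array(vecs)
```

The resulting vectors would then be clustered (e.g. with k-means) to form the visual vocabulary; for the SIFT variant a 128-dimensional descriptor replaces the raw patch vector.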
![Page 20: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/20.jpg)
More Spatial information – Pyramid kernels
• Divide image into grids of varying resolution, and give more weight to agreement in finer grids.
– 2^l × 2^l grid at level l (1, 4, 16 cells for l = 0, 1, 2)
• Intersect histograms, multiply by weight.
Lazebnik et al. [CVPR 2006]
![Page 21: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/21.jpg)
More Spatial information – Pyramid kernels
Spatial Pyramid Kernels for Geometry/Appearance Matching (Lazebnik et al CVPR’06)
Based on Grauman & Darrell ICCV 05
![Page 22: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/22.jpg)
![Page 23: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/23.jpg)
9 matches × 1 = 9
![Page 24: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/24.jpg)
![Page 25: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/25.jpg)
9 matches × 1 = 9
4 matches × ½ = 2
![Page 26: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/26.jpg)
![Page 27: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/27.jpg)
9 matches × 1 = 9
4 matches × ½ = 2
2 matches × ¼ = 1/2
Total matching weight (value of spatial pyramid kernel): 9 + 2 + 0.5 = 11.5
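The total above is a weighted sum of match counts, where matches at a given grid resolution are counted by histogram intersection. A small sketch reproducing the arithmetic; the function names are hypothetical.

```python
import numpy as np


def histogram_intersection(h1, h2):
    """Number of matches between two histograms: sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())


def weighted_total(matches_per_level, weights):
    """Weighted sum of per-level match counts."""
    return sum(m * w for m, w in zip(matches_per_level, weights))


# The slide's example: 9, 4 and 2 matches with weights 1, 1/2 and 1/4.
total = weighted_total([9, 4, 2], [1.0, 0.5, 0.25])
```

Here `total` evaluates to 11.5, matching the slide.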
![Page 28: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/28.jpg)
Pyramid spatial layout for appearance patches – for images
Represent appearance as dense grid of visual words
l = 0: BoW + l = 1: 4 BoW + l = 2: 16 BoW
![Page 29: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/29.jpg)
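Generalizations
• Use chi-squared kernel instead of histogram intersection
$$K_f(i, j) = \sum_{l \in L} \beta_{lf}\, e^{-\mu\, \chi^2\left(h_{lf}(i),\, h_{lf}(j)\right)}$$
[Figure: pyramid levels l = 0 and l = 1 over image axes x, y]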
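A sketch of the level-weighted chi-squared kernel for one feature type, assuming each image is represented by a list of per-level histograms. The factor ½ in the χ² distance is one common convention and is an assumption here, as are the function names and the default μ.

```python
import numpy as np


def chi2_distance(h1, h2, eps=1e-12):
    """Chi-squared distance between two histograms (assumed 1/2 convention)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))


def pyramid_chi2_kernel(levels_i, levels_j, betas, mu=1.0):
    """K(i, j) = sum over levels l of beta_l * exp(-mu * chi2(h_l(i), h_l(j)))."""
    return sum(beta * np.exp(-mu * chi2_distance(hi, hj))
               for beta, hi, hj in zip(betas, levels_i, levels_j))
```

For identical images every exponential equals 1, so the kernel value is the sum of the level weights.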
![Page 30: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/30.jpg)
Pyramid HOG – for images
Represent local oriented gradients
l = 0: 1 HOG, l = 1: 4 HOG, l = 2: 16 HOG
![Page 31: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/31.jpg)
Pyramid HOG for image regions
Level 0, Level 1, Level 2
![Page 32: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/32.jpg)
Pyramid HOG for image regions
Level 0, Level 1, Level 2
![Page 33: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/33.jpg)
Case study: scene classification
• Vogel & Schiele (VS): Coast, Forest, Mountain, Open country, River, Sky/clouds
• Oliva & Torralba (OT): Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street
• Fei-Fei & Perona (FP): OT categories plus Suburb, Bedroom, Kitchen, Living room, Office
• Lazebnik et al. (LSP): FP categories plus Store, Industrial
![Page 34: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/34.jpg)
Vogel & Schiele DATASET
Coast, Forest, Mountain, Open country, River, Sky/clouds
VS dataset: 702 images, 6 categories
![Page 35: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/35.jpg)
Oliva & Torralba DATASET
Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street
OT dataset: 2688 images, 8 categories
![Page 36: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/36.jpg)
Fei-Fei & Perona DATASET
Suburb, Bedroom, Kitchen, Living room, Office
Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street
FP dataset: 3759 images, 13 categories
![Page 37: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/37.jpg)
Lazebnik, Schmid & Ponce DATASET
Suburb, Bedroom, Kitchen, Living room, Office
Coast, Forest, Mountain, Open country, Highway, Inside city, Tall building, Street
Store, Industrial
LSP dataset: 4385 images, 15 categories
![Page 38: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/38.jpg)
Multi-way Classification
• For each class ‘c’ learn a 1-vs-rest SVM classifier
• Classification of test image I according to: $c^* = \arg\max_c D_c(I)$
• where $D_c(I)$ is the signed distance of I from the decision boundary of the SVM for class c
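The decision rule is an argmax over the per-class SVM outputs. A minimal sketch, assuming the scores D_c(I) have already been computed by the 1-vs-rest classifiers; the class names and score values in the usage line are made up for illustration.

```python
def classify(scores_by_class):
    """Return the class c* with the largest SVM output D_c(I)."""
    return max(scores_by_class, key=scores_by_class.get)


# Illustrative scores only, not results from the lecture.
predicted = classify({"coast": -0.3, "forest": 1.2, "street": 0.4})
```

Note that the outputs of independently trained classifiers are compared directly, so the rule assumes their scales are comparable.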
![Page 39: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/39.jpg)
[Figure: visual vocabulary with words w1, w2, w3, w4, …]
Features
• bag of visual words
• HOG
• spatial pyramid of visual words
• spatial pyramid HOG
Parameters
• vocabulary size V
• level weightings
• feature combination weights
[Figure: pyramid levels l = 0, l = 1, l = 2]
![Page 40: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/40.jpg)
Methodology for learning parameter values
• optimize classification performance on a validation set
• 1 vs rest SVM classifier
[Figure labels: Dataset; Validation set; 100; Learn the models; Others; ½ Training, ½ Testing; Classification]
![Page 41: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/41.jpg)
Optimize vocabulary size V on validation set
[Figure: SVM-BOW performance (%) vs. vocabulary size V; V axis: 200, 500, 1000, 1500, 2000, 2500, 5000; performance axis: 60 to 90]
OT dataset: 2688 images, 8 categories
![Page 42: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/42.jpg)
Spatial pyramid with learnt weights
• Learn level weights – linear combination of kernels
$$K_f(i, j) = \sum_{l \in L} \beta_{lf}\, e^{-\mu\, \chi^2\left(h_{lf}(i),\, h_{lf}(j)\right)}$$
– level weight: $\beta_{lf}$
– base kernel: $e^{-\mu\, \chi^2\left(h_{lf}(i),\, h_{lf}(j)\right)}$
– feature: dense visual words
• if weights common to all classes:
– without optimization: $\beta_0 = 0.25,\ \beta_1 = 0.25,\ \beta_2 = 0.5$
– with optimization: $\beta_0 : \beta_2 = 1.0,\ \beta_1 : \beta_2 = 0.8$
![Page 43: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/43.jpg)
Optimized values for each dataset
• Same number of images as authors
• SVM one against all – χ² kernel for visual words
• Up to L = 2 for spatial pyramid
Per-class values, in slide order:
• Coast: 88.0
• Forest: 100
• Mountain: 89.2
• Op. Country: 78.3
• Highway: 93.6
• Inside City: 91.1
• Street: 90.8
• Tall building: 97.5
[Figure labels: Open country, Mountains, Coast, Open country, Highway, Street]
![Page 44: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/44.jpg)
Spatial pyramid with feature combination
• Lazebnik et al. dense visual words: 81.1
• dense visual words optimized: 83.5
• dense visual words & HOG optimized: 90.2
• feature weights – linear combination of kernels
$$K_{opt}(i, j) = \sum_{f \in F} d_f\, K_f(i, j)$$
– feature weight: $d_f$
– feature kernel: $K_f(i, j)$
Features:
• dense visual words
• HOG
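Combining features amounts to a weighted sum of precomputed Gram matrices, one per feature type. A minimal sketch; the function name and the example weights are assumptions, and in the lecture's setting the weights would be chosen on the validation set.

```python
import numpy as np


def combine_kernels(kernels, weights):
    """K_opt = sum over features f of d_f * K_f, for Gram matrices of equal shape."""
    return sum(d * K for d, K in zip(weights, kernels))


# Toy 2 x 2 Gram matrices standing in for the visual-word and HOG kernels.
K_words = np.eye(2)
K_hog = np.ones((2, 2))
K_opt = combine_kernels([K_words, K_hog], [0.7, 0.3])
```

A non-negative combination of valid kernels is itself a valid kernel, so `K_opt` can be passed to an SVM that accepts a precomputed kernel.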
![Page 45: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/45.jpg)
Take home messages
• Lite use of spatial information
• tiling, spatial pyramid
• Combination of features
• visual words, sparse, dense, HOG
• Learn parameters on validation set
![Page 46: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik](https://reader034.vdocuments.us/reader034/viewer/2022042221/5ec71325f6fd5611092a5b80/html5/thumbnails/46.jpg)
More classifiers …
• SVM Classifier
• good performance
• convex optimization
• Logistic regression
• Adaboost
• e.g. used by Viola & Jones face detector
• slow to learn, fast to test
• Random forests
• fast to learn, fast to test
• Jamie Shotton tutorial