eccv2010: feature learning for image classification, part 1
![Page 1: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/1.jpg)
Part 1: Classical Image Classification Methods
Kai Yu
Dept. of Media Analytics, NEC Laboratories America
Andrew Ng
Computer Science Dept., Stanford University
![Page 2: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/2.jpg)
Outline of Part 1
04/10/23 2
• Local Features, Sampling, Visual Words
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
![Page 3: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/3.jpg)
Outline of Part 1
• Local Features, Sampling, Visual Words
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
![Page 4: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/4.jpg)
Local features
• Distinctive descriptors of local image patches
• Invariant to local translation, scale, …
• and sometimes rotation or general affine transformations
• The most famous choice is the SIFT feature
![Page 5: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/5.jpg)
Sampling local features from images
A set of points
Image credits: F-F. Li, E. Nowak, J. Sivic
![Page 6: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/6.jpg)
Visual words
• Similar points are grouped into one visual word
• Algorithms: k-means, agglomerative clustering, …
• Points from different images are then more easily compared.
Slide credit: Kristen Grauman
![Page 7: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/7.jpg)
Outline of Part 1
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
![Page 8: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/8.jpg)
Bag-of-words (BoW) representation
Analogy to documents
Adapted from tutorial slides by Fei-Fei et al.
![Page 9: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/9.jpg)
BoW for object categorization
• Works pretty well for whole-image classification
Slide credit: Svetlana Lazebnik
Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005)
![Page 10: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/10.jpg)
Unsupervised Dictionary Learning
• Sample local features from the images in an image database
• Run k-means or another clustering algorithm to get the dictionary
• The dictionary is also called a “codebook”
[Figure: sampled features clustered in SIFT space into regions R1, R2, R3]
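The dictionary-learning step above can be sketched with plain Lloyd's k-means. This is a minimal illustration, not the tutorial's implementation: the function names (`learn_codebook`, `nearest_index`), the fixed iteration count, and the toy 2-D points standing in for 128-D SIFT descriptors are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(point, centers[k]))

def learn_codebook(descriptors, num_words, num_iters=20, seed=0):
    """Lloyd's k-means over local descriptors; returns `num_words` centers."""
    rng = random.Random(seed)
    # Initialize the centers with randomly chosen descriptors.
    centers = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(num_iters):
        # Assignment step: group descriptors by their nearest center.
        groups = [[] for _ in centers]
        for d in descriptors:
            groups[nearest_index(d, centers)].append(d)
        # Update step: move each center to the mean of its group.
        for k, group in enumerate(groups):
            if group:  # keep the old center if a cluster went empty
                dim = len(group[0])
                centers[k] = [sum(d[i] for d in group) / len(group)
                              for i in range(dim)]
    return centers

# Toy 2-D "descriptors" (hypothetical data) forming two obvious clusters.
toy_descriptors = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
                   (5.0, 5.1), (5.1, 5.0), (4.9, 5.0)]
codebook = learn_codebook(toy_descriptors, num_words=2)
```

Each returned center plays the role of one visual word; in practice the codebook would have hundreds or thousands of words and be learned from descriptors sampled across many images.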
![Page 11: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/11.jpg)
Compute BoW histogram for each image
Assign SIFT features to clusters (regions R1, R2, R3)
Compute the frequency of each cluster within an image
[Figure: BoW histogram representations over clusters R1, R2, R3]
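The two steps on this slide (assign each feature to its nearest cluster, then count cluster frequencies per image) might look as follows. It is a sketch under assumptions: the three 2-D codebook entries and the two toy images are made up, and `quantize` and `bow_histogram` are illustrative names rather than functions from any library.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(descriptor, codebook):
    """Hard vector quantization: index of the nearest visual word."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Normalized frequency of each visual word within one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[quantize(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

# Hypothetical codebook with three words, mirroring regions R1..R3.
codebook = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
image_a = [(0.1, 0.2), (4.8, 5.1), (5.2, 4.9), (0.3, 4.7)]  # 4 features
image_b = [(0.0, 0.1), (0.2, 0.0)]                          # 2 features

hist_a = bow_histogram(image_a, codebook)  # [0.25, 0.5, 0.25]
hist_b = bow_histogram(image_b, codebook)  # [1.0, 0.0, 0.0]
```

Both images end up as vectors of the same length (the codebook size) even though they contain different numbers of features.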
![Page 12: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/12.jpg)
Interpretation of the BoW histogram
• Summarizes the entire image by its distribution of visual-word occurrences
• Turns bags of different sizes into a fixed-length vector
• Analogous to the bag-of-words representation commonly used for text categorization
![Page 13: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/13.jpg)
Image classification based on BoW histogram
[Figure: dog and bird examples separated by a decision boundary in the BoW histogram vector space]
• Learn a classification model to determine the decision boundary
• Nonlinear SVMs are commonly applied.
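To make "nonlinear" concrete, the sketch below defines two kernels that are often used on histograms (histogram intersection and exponential chi-square) and learns a decision boundary with them. Note the substitution: a kernel perceptron is used here as a simple stand-in for an SVM solver, purely to keep the example self-contained. The toy histograms, the labels (+1 for dog, -1 for bird), and all function names are assumptions.

```python
import math

def intersection_kernel(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel: exp(-gamma * sum (a-b)^2 / (a+b))."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * dist)

def decision_score(x, histograms, labels, alphas, kernel):
    """Kernel expansion f(x) = sum_j alpha_j * y_j * k(x_j, x)."""
    return sum(a * y * kernel(h, x)
               for a, h, y in zip(alphas, histograms, labels))

def train_kernel_perceptron(histograms, labels, kernel, epochs=10):
    """Kernel perceptron (stand-in for an SVM solver); labels are +1 / -1."""
    alphas = [0] * len(histograms)
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(histograms, labels)):
            # Bump the weight of every training example that is misclassified.
            if y * decision_score(x, histograms, labels, alphas, kernel) <= 0:
                alphas[i] += 1
    return alphas

def predict(x, histograms, labels, alphas, kernel):
    """Return +1 or -1 depending on the side of the decision boundary."""
    score = decision_score(x, histograms, labels, alphas, kernel)
    return 1 if score > 0 else -1

# Hypothetical 3-bin BoW histograms: +1 = dog, -1 = bird.
train_hists = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
               [0.1, 0.2, 0.7], [0.2, 0.2, 0.6]]
train_labels = [1, 1, -1, -1]
alphas = train_kernel_perceptron(train_hists, train_labels, intersection_kernel)

dog_like = predict([0.65, 0.25, 0.1], train_hists, train_labels,
                   alphas, intersection_kernel)
bird_like = predict([0.15, 0.15, 0.7], train_hists, train_labels,
                    alphas, intersection_kernel)
```

Swapping `intersection_kernel` for `chi2_kernel` changes the shape of the boundary without touching the training loop, which is the practical appeal of kernel methods on histogram features.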
![Page 14: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/14.jpg)
Issues
• Sampling strategy
• Learning the codebook: size? supervised? …
• Classification: which method? scalability?
• Scalability: how to handle millions of data points?
• How to use spatial information?
![Page 15: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/15.jpg)
Spatial information
• The BoW representation discards spatial layout.
• This increases invariance to scale, translation, and deformation,
• but sacrifices discriminative power, especially when the spatial layout is important.
Slide adapted from Bill Freeman
![Page 16: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/16.jpg)
Spatial pyramid matching
• Compute BoW for image regions at different locations and at various scales
Figure credit: Svetlana Lazebnik
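One way to picture the idea is to concatenate per-cell word counts over successively finer grids (1x1, 2x2, 4x4, …). The sketch below does only that; the per-level weights used in the original spatial pyramid matching kernel are deliberately omitted. The input format (normalized `x`, `y` coordinates plus a visual-word index) and the function name are assumptions for illustration.

```python
def spatial_pyramid_histogram(features, num_words, num_levels=3):
    """Concatenate per-cell word counts over 1x1, 2x2, 4x4, ... grids.

    `features` is a list of (x, y, word) tuples with x, y in [0, 1)
    and word in range(num_words). Level weights are not applied.
    """
    pyramid = []
    for level in range(num_levels):
        cells = 2 ** level  # grid is cells x cells at this level
        counts = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            counts[row * cells + col][word] += 1
        for cell in counts:
            pyramid.extend(cell)
    return pyramid

# Two hypothetical images with the same visual words in swapped positions.
image_a = [(0.2, 0.1, 0), (0.8, 0.9, 1)]
image_b = [(0.2, 0.1, 1), (0.8, 0.9, 0)]

pyr_a = spatial_pyramid_histogram(image_a, num_words=2, num_levels=2)
pyr_b = spatial_pyramid_histogram(image_b, num_words=2, num_levels=2)
# Level 0 (the first num_words entries) is the plain BoW count and is
# identical for both images; the 2x2 level tells them apart.
```

The vector length is `num_words * (1 + 4 + 16 + …)`, so the representation grows quickly with the number of levels.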
![Page 17: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/17.jpg)
A common pipeline for discriminative image classification using BoW
Dictionary Learning: Dense/Sparse SIFT → K-means → dictionary
Image Classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM
![Page 18: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/18.jpg)
Combining multiple descriptors
Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM
Diagram from SurreyUVA_SRKDA, the winning team in PASCAL VOC 2008
![Page 19: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/19.jpg)
Outline of Part 1
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
![Page 20: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/20.jpg)
Topic models for images
[Figure: plate diagram of the graphical model, with nodes c, z, w and plates N, D]
Latent Dirichlet Allocation (LDA)
Fei-Fei et al. ICCV 2005
“beach”
Slide credit: Fei-Fei Li
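For reference, the generative process of standard LDA can be written as below, reading each image as a document and each quantized patch as a word. The symbols $\alpha$, $\beta$, and $\theta_d$ do not appear on the slide; they are the usual textbook notation and are assumed here. The category node c from the slide's diagram is not modeled in this plain version.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha)
  && \text{topic proportions of image } d,\quad d = 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d)
  && \text{topic of patch } n,\quad n = 1,\dots,N \\
w_{d,n} \mid z_{d,n} &\sim \mathrm{Multinomial}(\beta_{z_{d,n}})
  && \text{visual word observed for patch } n
\end{align*}
```

Under this reading, a topic such as “beach” corresponds to a distribution $\beta_k$ over visual words, and an image is described by how much of each topic it contains.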
![Page 21: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/21.jpg)
Part-based Model
Fischler & Elschlager 1973
Rob Fergus ICCV09 Tutorial
![Page 22: ECCV2010: feature learning for image classification, part 1](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b41b8b4a79595c0e8b45b4/html5/thumbnails/22.jpg)
For comprehensive coverage of object categorization models, please visit
Recognizing and Learning Object Categories
Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT)
http://people.csail.mit.edu/torralba/shortCourseRLOC/