![Page 1: 1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University](https://reader035.vdocuments.us/reader035/viewer/2022062421/56649cf35503460f949c10cb/html5/thumbnails/1.jpg)
Part 1: Classical Image Classification Methods
Kai Yu
Dept. of Media Analytics, NEC Laboratories America
Andrew Ng
Computer Science Dept., Stanford University
Outline of Part 2
04/18/23 2
• Local Features, Sampling, Visual Words
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
Local features
• Distinctive descriptors of local image patches
• Invariant to local translation, scale, …
• and sometimes rotation or general affine transformations
• The most famous choice is the SIFT feature
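To make "descriptor of a local image patch" concrete, here is a minimal sketch of a gradient-orientation histogram, the basic ingredient of SIFT-style descriptors. It is a simplification and not SIFT itself: SIFT concatenates such histograms over a grid of sub-regions and adds further weighting and normalization. The function name, the 8-bin choice, and the toy patch are assumptions made for illustration.

```python
import math

def orientation_histogram(patch, n_bins=8):
    """Gradient-orientation histogram of a grayscale patch (list of rows)."""
    hist = [0.0] * n_bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            # Central differences approximate the image gradient.
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * n_bins) % n_bins
            hist[bin_index] += magnitude  # vote weighted by gradient strength
    # L2 normalization removes the effect of a global contrast change.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical 6x6 patch whose intensity increases from left to right.
ramp = [[10 * x for x in range(6)] for _ in range(6)]
brighter = [[v + 50 for v in row] for row in ramp]

descriptor = orientation_histogram(ramp)
same_descriptor = orientation_histogram(brighter)
```

Because only intensity differences enter the histogram, `descriptor` and `same_descriptor` are identical even though the second patch is uniformly brighter. Invariance to translation and scale, as listed on the slide, comes from how and where patches are selected and is not shown here.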
Sampling local features from images
A set of points
Image credits: F-F. Li, E. Nowak, J. Sivic
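One simple way to obtain "a set of points" is dense sampling on a regular grid; the alternative is sparse sampling at detected interest points. The sketch below covers only the dense case. The function names, patch size, and stride are illustrative assumptions, not values from the slides.

```python
def dense_grid(height, width, patch_size, stride):
    """Top-left (row, col) corners of all patches that fit inside the image."""
    return [(r, c)
            for r in range(0, height - patch_size + 1, stride)
            for c in range(0, width - patch_size + 1, stride)]

def extract_patch(image, corner, patch_size):
    """Cut a square patch out of an image stored as a list of rows."""
    r, c = corner
    return [row[c:c + patch_size] for row in image[r:r + patch_size]]

# Hypothetical 8x10 image whose pixel value encodes its position.
image = [[10 * r + c for c in range(10)] for r in range(8)]
corners = dense_grid(height=8, width=10, patch_size=4, stride=2)
last_patch = extract_patch(image, corners[-1], patch_size=4)
```

Each extracted patch would then be passed to a descriptor such as SIFT.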
Visual words
• Similar points are grouped into one visual word
• Algorithms: k-means, agglomerative clustering, …
• Points from different images are then more easily compared
Slide credit: Kristen Grauman
Outline of Part 2
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
Bag-of-words (BoW) representation
Analogy to documents
Adapted from tutorial slides by Fei-Fei et al.
BoW for object categorization
• Works pretty well for whole-image classification
Slide credit: Svetlana Lazebnik
Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005)
Unsupervised Dictionary Learning
image database
• Sample local features from images
• Run k-means or another clustering algorithm to obtain the dictionary
• The dictionary is also called a “codebook”
Figure: SIFT space partitioned into cluster regions R1, R2, R3
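The dictionary-learning step can be sketched with plain Lloyd-style k-means. This is a toy illustration under stated assumptions: descriptors are short tuples rather than 128-dimensional SIFT vectors, the initial centers are drawn at random from the data, and the function names and the fixed iteration count are choices made here, not taken from the slides.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(point, centers[k]))

def kmeans(points, k, n_iter=20, seed=0):
    """Return k cluster centers (the codebook) for a list of descriptors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: group every descriptor with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

# Hypothetical 2-D "descriptors" forming two well-separated groups.
descriptors = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
codebook = sorted(kmeans(descriptors, k=2))
```

On this toy data the codebook settles on the two group means, (0.5, 0.5) and (10.5, 10.5). Each center plays the role of one visual word.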
Compute BoW histogram for each image
• Assign SIFT features to clusters (R1, R2, R3)
• Compute the frequency of each cluster within an image
• Result: BoW histogram representations
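The two steps above, assigning each descriptor to its nearest codeword and counting frequencies, can be sketched as follows. The three-word codebook and the 2-D descriptors are made-up toy values, and the function names are illustrative.

```python
from collections import Counter

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(descriptor, codebook):
    """Vector quantization: index of the nearest codeword."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Relative frequency of each visual word among an image's descriptors."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    total = len(descriptors)
    if total == 0:
        return [0.0] * len(codebook)
    return [counts[k] / total for k in range(len(codebook))]

# Hypothetical codebook with three visual words.
codebook = [(0, 0), (10, 10), (0, 10)]
image_a = [(1, 0), (0, 1), (9, 10), (1, 9)]   # four local features
image_b = [(10, 9), (11, 11)]                 # only two local features

hist_a = bow_histogram(image_a, codebook)     # [0.5, 0.25, 0.25]
hist_b = bow_histogram(image_b, codebook)     # [0.0, 1.0, 0.0]
```

Both images end up as vectors of length three even though they contain different numbers of local features, which is what makes them comparable by a standard classifier.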
Indication of BoW histogram
• Summarize the entire image by its distribution of visual-word occurrences
• Turn bags of different sizes into a fixed-length vector
• Analogous to the bag-of-words representation commonly used for text categorization
Image classification based on BoW histogram
Figure: BoW histogram vector space, with a decision boundary separating “dog” from “bird”
• Learn a classification model to determine the decision boundary
• Nonlinear SVMs are commonly applied
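What makes the SVM nonlinear is its kernel. The slides do not say which kernel is used; the sketch below assumes two kernels that are common choices for histogram data, the histogram intersection kernel and the exponential chi-square kernel, and builds the Gram matrix that a kernel SVM solver with precomputed-kernel support could consume. The function names and the `gamma` default are illustrative, and the SVM training itself is not shown.

```python
import math

def intersection_kernel(h1, h2):
    """Histogram intersection: total mass shared by the two histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel; bins that are empty in both are skipped."""
    distance = sum((a - b) ** 2 / (a + b)
                   for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * distance)

def gram_matrix(histograms, kernel):
    """Pairwise kernel values for a list of BoW histograms."""
    return [[kernel(hi, hj) for hj in histograms] for hi in histograms]

# Hypothetical BoW histograms over a three-word codebook.
dog = [0.5, 0.25, 0.25]
bird = [0.0, 1.0, 0.0]
K = gram_matrix([dog, bird], intersection_kernel)
```

Here `K` is `[[1.0, 0.25], [0.25, 1.0]]`: each normalized histogram fully overlaps with itself, while the two images share only a quarter of their mass.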
Issues
• Sampling strategy
• Learning the codebook: size? supervised? …
• Classification: which method? scalability?
• Scalability: how to handle millions of data points?
• How to use spatial information?
Spatial information
• The BoW representation discards spatial layout.
• This increases invariance to scale, translation, and deformation,
• but sacrifices discriminative power, especially when the spatial layout is important.
Slide adapted from Bill Freeman
Spatial pyramid matching
• Compute BoW for image regions at different locations and at various scales
Figure credit: Svetlana Lazebnik
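A minimal sketch of the pooling idea: count visual words in the whole image, then in each cell of a 2×2 grid, and so on, and concatenate the counts. The per-level weighting used in the published spatial pyramid matching kernel is deliberately omitted, and the input format (normalized coordinates plus a word index) and function name are assumptions made for illustration.

```python
def spatial_pyramid(features, n_words, levels=2):
    """Concatenate per-cell word counts over grids of 1x1, 2x2, ... cells.

    features: list of (x, y, word) with x, y in [0, 1) and word < n_words.
    """
    pyramid = []
    for level in range(levels):
        cells = 2 ** level  # cells per side at this level
        hists = [[0] * n_words for _ in range(cells * cells)]
        for x, y, word in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for h in hists:
            pyramid.extend(h)
    return pyramid

# Two hypothetical images containing the same words in swapped positions.
image_a = [(0.25, 0.25, 0), (0.75, 0.75, 1)]
image_b = [(0.25, 0.25, 1), (0.75, 0.75, 0)]

pyr_a = spatial_pyramid(image_a, n_words=2)
pyr_b = spatial_pyramid(image_b, n_words=2)
```

The first two entries (the whole-image BoW) are identical for both images, so a plain BoW cannot tell them apart. The remaining entries, which come from the 2×2 grid, differ and therefore recover some of the spatial layout.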
A common pipeline for discriminative image classification using BoW
Dictionary Learning: Dense/Sparse SIFT → K-means → dictionary
Image Classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM
Combining multiple descriptors
Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM
Diagram from SurreyUVA_SRKDA, the winning team in PASCAL VOC 2008
Outline of Part 2
• Local Features, Sampling, Visual Words, …
• Discriminative Methods
  - Bag-of-Words (BoW) representation
  - Spatial pyramid matching (SPM)
• Generative Methods
  - Part-based methods
  - Topic models
Topic models for images
Figure: plate diagram of the model, with variables c, z, w and plates N, D
Latent Dirichlet Allocation (LDA)
Fei-Fei et al. ICCV 2005
“beach”
Slide credit: Fei-Fei Li
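For readers unfamiliar with LDA, the sketch below walks through its standard generative process applied to visual words: draw a topic mixture for the image, then draw a topic and a visual word for each local feature. It covers plain LDA only and leaves out the category variable c shown in the diagram. The function names, the two-topic setup, and all probabilities are made-up illustrative values.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_image_words(n_words, alpha, topic_word_probs, rng):
    """Generate the visual words of one image under plain LDA."""
    theta = sample_dirichlet(alpha, rng)      # per-image topic mixture
    topics = range(len(alpha))
    vocabulary = range(len(topic_word_probs[0]))
    words = []
    for _ in range(n_words):
        z = rng.choices(topics, weights=theta)[0]                    # topic
        w = rng.choices(vocabulary, weights=topic_word_probs[z])[0]  # word
        words.append(w)
    return theta, words

# Hypothetical model: 2 topics over a 4-word visual vocabulary.
topic_word_probs = [[0.5, 0.5, 0.0, 0.0],   # topic 0 emits words 0 and 1
                    [0.0, 0.0, 0.5, 0.5]]   # topic 1 emits words 2 and 3
rng = random.Random(0)
theta, words = generate_image_words(20, [1.0, 1.0], topic_word_probs, rng)
```

Learning runs this process in reverse: given the observed visual words of many images, it infers the topic-word distributions and each image's topic mixture.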
Part-based Model
Fischler & Elschlager 1973
Rob Fergus ICCV09 Tutorial
For comprehensive coverage of object categorization models, please visit
Recognizing and Learning Object Categories
Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT)
http://people.csail.mit.edu/torralba/shortCourseRLOC/