kapitel 14 recognition – p. 1 recognition scene understanding / visual object categorization pose...
TRANSCRIPT
![Page 1: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/1.jpg)
Kapitel 14 “Recognition” – p. 1
Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image categorization Bag-of-features models Large-scale image search
Kapitel 14
![Page 2: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/2.jpg)
Kapitel 14 “Recognition” – p. 2
Scene understanding
Scene categorization
•outdoor/indoor
•city/forest/factory/etc.
![Page 3: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/3.jpg)
Kapitel 14 “Recognition” – p. 3
Scene understanding
Object detection
•find pedestrians
![Page 4: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/4.jpg)
Kapitel 14 “Recognition” – p. 4
Scene understanding
mountain
building
tree
banner
marketpeople
street lamp
sky
building
![Page 5: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/5.jpg)
Kapitel 14 “Recognition” – p. 5
Visual object categorization
![Page 6: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/6.jpg)
Kapitel 14 “Recognition” – p. 6
Visual object categorization
![Page 7: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/7.jpg)
Kapitel 14 “Recognition” – p. 7
Visual object categorization
![Page 8: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/8.jpg)
Kapitel 14 “Recognition” – p. 8
Visual object categorization
Recognition is all about modeling variability camera position illumination shape variations within-class variations
![Page 9: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/9.jpg)
Kapitel 14 “Recognition” – p. 9
Visual object categorization
Within-class variations (why are they chairs?)
![Page 10: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/10.jpg)
Kapitel 14 “Recognition” – p. 10
Pose clustering
Working in transformation space - main ideas: Generate many hypotheses of transformation image vs. model,
each built by a tuple of image and model features
Correct transformation hypotheses appear many times
Main steps: Quantize the space of possible transformations
For each tuple of image and model features, solve for the optimal transformation that aligns the matched features
Record a “vote” in the corresponding transformation space bin
Find "peak" in transformation space
![Page 11: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/11.jpg)
Kapitel 14 “Recognition” – p. 11
Pose clustering
Example: Rotation only. A pair of one scene segment and one model segment suffices to generate a transformation hypothesis.
![Page 12: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/12.jpg)
Kapitel 14 “Recognition” – p. 12
Pose clustering
2-tuples of image and model corner points are used to generate hypotheses. (a) The corners found in an image. (b) The four best hypotheses found with the edges drawn in. The nose of the plane and the head of the person do not appear because they were not in the models.
C.F. Olson: Efficient pose clustering using a randomized algorithm. IJCV, 23(2): 131-147, 1997.
![Page 13: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/13.jpg)
Kapitel 14 “Recognition” – p. 13
Object recognition by local features
D.G. Lowe: Distinctive image features from scale-invariant keypoints. IJCV, 60(2): 91-110, 2004
The SIFT features of training images are extracted and storedFor a query image
Extract SIFT features Efficient nearest neighbor indexing Pose clustering by Hough transform For clusters with >2 keypoints (object hypotheses): determine
the optimal affine transformation parameters by least squares method; geometry verification
Input Image Stored
Image
![Page 14: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/14.jpg)
Kapitel 14 “Recognition” – p. 14
Object recognition by local features
Robust feature-based alignment (panorama)
![Page 15: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/15.jpg)
Kapitel 14 “Recognition” – p. 15
Object recognition by local features
Extract features
![Page 16: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/16.jpg)
Kapitel 14 “Recognition” – p. 16
Object recognition by local features
Extract features
Compute putative matches
![Page 17: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/17.jpg)
Kapitel 14 “Recognition” – p. 17
Object recognition by local features
Extract features
Compute putative matches
Loop:
Hypothesize transformation T (small group of putative matches that are related by T)
![Page 18: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/18.jpg)
Kapitel 14 “Recognition” – p. 18
Object recognition by local features
Extract features
Compute putative matches
Loop:
Hypothesize transformation T (small group of putative matches that are related by T)
Verify transformation (search for other matches consistent with T)
![Page 19: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/19.jpg)
Kapitel 14 “Recognition” – p. 19
Object recognition by local features
Extract features
Compute putative matches
Loop:
Hypothesize transformation T (small group of putative matches that are related by T)
Verify transformation (search for other matches consistent with T)
![Page 20: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/20.jpg)
Kapitel 14 “Recognition” – p. 20
Object recognition by local features
Pose clustering by Hough transform (Hypothesize transformation T):
Each of the SIFT keypoints of input image specifies 4 parameters: 2D location, scale, and orientation.
Each matched SIFT keypoint in the database has a record of the keypoint’s parameters relative to the training image in which it was found.
We can create a Hough transform entry predicting the model location, orientation, and scale from the match hypothesis.
![Page 21: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/21.jpg)
Kapitel 14 “Recognition” – p. 21
Object recognition by local features
Recognition: Cluster of 3 corresponding feature pairs
![Page 22: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/22.jpg)
Kapitel 14 “Recognition” – p. 22
Object recognition by local features
![Page 23: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/23.jpg)
Kapitel 14 “Recognition” – p. 23
Object recognition by local features
![Page 24: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/24.jpg)
Kapitel 14 “Recognition” – p. 24
Image categorization
Training Labels
Training Images
Classifier Training
Training
Image Features
Image Features
Testing
Test Image
Trained Classifier
Trained Classifier
Outdoor
Prediction
![Page 25: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/25.jpg)
Kapitel 14 “Recognition” – p. 25
Image categorization
Global histogram (distribution):
color, texture, motion, …
histogram matching distance
![Page 26: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/26.jpg)
Kapitel 14 “Recognition” – p. 26
Image categorization
Cars found by color histogram matching
See Chapter “Inhaltsbasierte Suche in Bilddatenbanken”
![Page 27: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/27.jpg)
Kapitel 14 “Recognition” – p. 27
Bag-of-features models
ObjectObjectBag of Bag of ‘words’‘words’
![Page 28: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/28.jpg)
Kapitel 14 “Recognition” – p. 28
Bag-of-features models
Origin 1: Text recognition by orderless document representation (frequencies of words from a dictionary), Salton & McGill (1983)
US Presidential Speeches Tag Cloudhttp://chir.ag/phernalia/preztags/
![Page 29: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/29.jpg)
Kapitel 14 “Recognition” – p. 29
Bag-of-features models
Origin 2: Texture recognition
Texture is characterized by the repetition of basic elements or textons
For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters
![Page 30: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/30.jpg)
Kapitel 14 “Recognition” – p. 30
Bag-of-features models
Universal texton dictionary
histogram
![Page 31: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/31.jpg)
Kapitel 14 “Recognition” – p. 31
Bag-of-features models
G. Cruska et al.: Visual categorization with bags of keypoints. Proc. of ECCV, 2004.
We need to build a “visual” dictionary!
![Page 32: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/32.jpg)
Kapitel 14 “Recognition” – p. 32
Bag-of-features models
Main step 1: Extract features (e.g. SIFT)
…
![Page 33: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/33.jpg)
Kapitel 14 “Recognition” – p. 33
Bag-of-features models
Main step 2: Learn visual vocabulary (e.g. using clustering)
clustering
![Page 34: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/34.jpg)
Kapitel 14 “Recognition” – p. 34
Bag-of-features models
visual vocabulary (cluster centers)
![Page 35: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/35.jpg)
Kapitel 14 “Recognition” – p. 35
Bag-of-features models
… Source: B. LeibeAppearance codebook
Example: learned codebook
![Page 36: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/36.jpg)
Kapitel 14 “Recognition” – p. 36
Bag-of-features models
Main step 3: Quantize features using “visual vocabulary”
![Page 37: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/37.jpg)
Kapitel 14 “Recognition” – p. 37
Bag-of-features models
Main step 4: Represent images by frequencies of “visual words” (i.e., bags of features)
![Page 38: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/38.jpg)
Kapitel 14 “Recognition” – p. 38
Bag-of-features models
Example: representation basedon learned codebook
![Page 39: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/39.jpg)
Kapitel 14 “Recognition” – p. 39
Bag-of-features models
Main step 5: Apply any classifier to the histogram feature vectors
![Page 40: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/40.jpg)
Kapitel 14 “Recognition” – p. 40
Bag-of-features models
Example:
![Page 41: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/41.jpg)
Kapitel 14 “Recognition” – p. 41
Bag-of-features models
Caltech6 dataset
Dictionary quality and size are very important parameters!
![Page 42: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/42.jpg)
Kapitel 14 “Recognition” – p. 42
Bag-of-features models
Caltech6 dataset
J.C. Niebles, H. Wang, L. Fei-Fei: Unsupervised learning of human action categories using spatial-tempotal words. IJCV, 79(3): 299-318, 2008.
Action recognition
![Page 43: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/43.jpg)
Kapitel 14 “Recognition” – p. 43
Large-scale image search
Query Results from 5k Flickr images
J. Philbin, et al.: Object retrieval with large vocabularies and fast spatial matching. Proc. of CVPR, 2007.
![Page 44: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/44.jpg)
Kapitel 14 “Recognition” – p. 44
Large-scale image search
Mobile tourist guide self-localization object/building recognition photo/video augmentation
Aachen Cathedral
[Quack, Leibe, Van Gool, CIVR’08]
![Page 45: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/45.jpg)
Kapitel 14 “Recognition” – p. 45
Large-scale image search
![Page 46: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/46.jpg)
Kapitel 14 “Recognition” – p. 46
Large-scale image search
Application: Image auto-annotationLeft: Wikipedia imageRight: closest match from Flickr
Moulin Rouge
Tour MontparnasseColosseum
ViktualienmarktMaypole
Old Town Square (Prague)
![Page 47: Kapitel 14 Recognition – p. 1 Recognition Scene understanding / visual object categorization Pose clustering Object recognition by local features Image](https://reader035.vdocuments.us/reader035/viewer/2022081518/55143b7d5503462d4e8b4685/html5/thumbnails/47.jpg)
Kapitel 14 “Recognition” – p. 47
Sources
K. Grauman, B. Leibe: Visual Object Recognition. Morgen & Claypool Publishers, 2011.
R. Szeliski: Computer Vision: Algorithms and Applications. Springer, 2010. (Chapter 14 “Recognition”)
Course materials from others (G. Bebis, J. Hays, S. Lazebnik, …)