![Page 1: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/1.jpg)
Multiclass object detection
![Page 2: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/2.jpg)
Multiclass object detection
![Page 3: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/3.jpg)
![Page 4: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/4.jpg)
![Page 5: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/5.jpg)
Context: objects appear in configurations
![Page 6: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/6.jpg)
Generalization: objects share parts
![Page 7: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/7.jpg)
How many categories?
![Page 8: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/8.jpg)
Slide by Aude Oliva
“Muchas”
![Page 9: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/9.jpg)
How many object categories are there?
Biederman 1987
![Page 10: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/10.jpg)
How many categories?
• Probably this question is not even specific enough to have an answer
![Page 11: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/11.jpg)
Which level of categorization is the right one?
Car is an object composed of: a few doors, four wheels (not all visible at all times), a roof, front lights, windshield
If you are thinking of buying a car, you might want to be a bit more specific about your categorization level.
?
![Page 12: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/12.jpg)
Entry-level categories (Jolicoeur, Gluck, Kosslyn 1984)
• Typical members of a basic-level category are categorized at the expected level.
• Atypical members tend to be classified at a subordinate level.
A bird / An ostrich
![Page 13: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/13.jpg)
We do not need to recognize the exact category
A new class can borrow information from similar categories
![Page 14: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/14.jpg)
So, where is computer vision?
Well…
![Page 15: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/15.jpg)
Multiclass object detection: the not so early days
![Page 16: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/16.jpg)
Multiclass object detection: the not so early days
Using a set of independent binary classifiers was a common strategy:
• Schneiderman-Kanade multiclass object detection
• Viola-Jones extension for dealing with rotations
- two cascades for each view
(a) One detector for each class
There is nothing wrong with this approach if you have access to lots of training data and you do not care about efficiency.
![Page 17: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/17.jpg)
Generalizing Across Categories
Can we transfer knowledge from one object category to another?
Slide by Erik Sudderth
![Page 18: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/18.jpg)
Shared features
• Is learning object class 1000 easier than learning the first?
• Can we transfer knowledge from one object to another?
• Are the shared properties interesting by themselves?
…
![Page 19: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/19.jpg)
Multitask learning
R. Caruana. Multitask Learning. ML 1997
“MTL improves generalization by leveraging the domain-specific information contained in the training signals of related tasks. It does this by training tasks in parallel while using a shared representation”.
vs.
Sejnowski & Rosenberg 1986; Hinton 1986; Le Cun et al. 1989; Suddarth & Kergosien 1990; Pratt et al. 1991; Sharkey & Sharkey 1992; …
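The idea of training tasks in parallel over a shared representation can be illustrated with a toy example. This is a minimal sketch, not Caruana's network: it assumes a single shared linear unit `h = w · x` feeding one linear output head per task, and the names `forward`, `sgd_step`, and `loss` are hypothetical. The point it shows is that the error of every task contributes to the update of the shared weights.

```python
def forward(w, heads, x):
    """Compute the shared representation h and one output per task."""
    h = sum(wi * xi for wi, xi in zip(w, x))   # shared by all tasks
    return h, [v * h for v in heads]           # task-specific heads

def loss(w, heads, x, targets):
    """Sum of squared errors over all tasks."""
    _, outputs = forward(w, heads, x)
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))

def sgd_step(w, heads, x, targets, lr=0.05):
    """One gradient step on the summed loss of all tasks."""
    h, outputs = forward(w, heads, x)
    errors = [o - t for o, t in zip(outputs, targets)]
    # The gradient reaching the shared unit sums the errors of every task.
    grad_h = sum(2 * e * v for e, v in zip(errors, heads))
    new_w = [wi - lr * grad_h * xi for wi, xi in zip(w, x)]
    # Each head only sees its own task's error.
    new_heads = [v - lr * 2 * e * h for v, e in zip(heads, errors)]
    return new_w, new_heads

# Toy data (assumed values): one input, two related tasks.
w, heads = [0.1, -0.2], [0.5, 0.3]
x, targets = [1.0, 2.0], [1.0, -1.0]

before = loss(w, heads, x, targets)
for _ in range(200):
    w, heads = sgd_step(w, heads, x, targets)
after = loss(w, heads, x, targets)
```

Training a separate model per task would instead give each task its own `w`, so nothing learned for one task could help another.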
![Page 20: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/20.jpg)
Multitask learning
Primary task: detect door knobs
Tasks used:
• horizontal location of doorknob
• single or double door
• horizontal location of doorway center
• width of doorway
• horizontal location of left door jamb
• horizontal location of right door jamb
• width of left door jamb
• width of right door jamb
• horizontal location of left edge of door
• horizontal location of right edge of door
R. Caruana. Multitask Learning. ML 1997
![Page 21: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/21.jpg)
Sharing invariances
S. Thrun. Is Learning the n-th Thing Any Easier Than Learning The First? NIPS 1996
Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, … These invariances are common to all object recognition tasks.
Toy world
Without sharing
With sharing
![Page 22: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/22.jpg)
Convolutional Neural Network
Translation invariance is already built into the network
The output neurons share all the intermediate levels
Le Cun et al, 98
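The claim that translation handling is built into the architecture follows from weight sharing: the same kernel is applied at every position. The sketch below is a 1-D illustration under simple assumptions (a hand-picked kernel, zero padding, hypothetical helper names `correlate` and `shift`), not the network from the slide. Shifting the input shifts the response by the same amount, and a global max over the response is unchanged.

```python
def correlate(signal, kernel):
    """Slide one shared kernel over the signal (valid positions only)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def shift(signal, n):
    """Shift right by n samples, padding with zeros on the left."""
    return [0] * n + signal[:len(signal) - n]

signal = [0, 0, 1, 2, 1, 0, 0, 0, 0]
kernel = [1, 2, 1]

response = correlate(signal, kernel)
response_shifted = correlate(shift(signal, 2), kernel)

# Equivariance: the response moves with the input.
same_pattern = response_shifted == shift(response, 2)
# Invariance after global max pooling: the peak value is unchanged.
same_peak = max(response) == max(response_shifted)
```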
![Page 23: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/23.jpg)
Sharing transformations
Miller, E., Matsakis, N., and Viola, P. (2000). Learning from one example through shared densities on transforms. In IEEE Computer Vision and Pattern Recognition.
Transformations are shared and can be learnt from other tasks.
![Page 24: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/24.jpg)
Sharing in constellation models
Pictorial Structures: Fischler & Elschlager, IEEE Trans. Comp. 1973
Constellation Model: Fei-Fei, Fergus, Perona, ICCV 2003
SVM Detectors: Heisele, Poggio, et al., NIPS 2001
Model-Guided Segmentation: Mori, Ren, Efros, & Malik, CVPR 2004
![Page 25: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/25.jpg)
Reusable Parts
Goal: Look for a vocabulary of edges that reduces the number of features.
Krempp, Geman, & Amit “Sequential Learning of Reusable Parts for Object Detection”. TR 2002
Plot: number of features (y-axis) vs. number of classes (x-axis)
Examples of reused parts
![Page 26: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/26.jpg)
Specific feature
Non-shared feature: this feature is too specific to faces.
pedestrian
chair
Traffic light
sign
face
Background class
![Page 27: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/27.jpg)
Shared feature
shared feature
![Page 28: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/28.jpg)
Additive models and boosting
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
• Independent binary classifiers: screen detector, car detector, face detector
• Binary classifiers that share features: screen detector, car detector, face detector
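The contrast between the two designs can be sketched as an additive model in which each weak learner is tagged with the set of classes that share it. This is a simplified illustration, not the training procedure from the cited papers: the `SharedStump` type, the thresholds, and the weights are all hypothetical, and only scoring is shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SharedStump:
    feature: int          # index into the feature vector
    threshold: float
    a: float              # contribution when feature > threshold
    b: float              # contribution otherwise
    classes: frozenset    # classes that share this stump

def score(features, stumps, cls):
    """Additive score for one class: sum of the stumps that include it."""
    total = 0.0
    for s in stumps:
        if cls in s.classes:
            total += s.a if features[s.feature] > s.threshold else s.b
    return total

def features_needed(stumps):
    """Distinct features that must be computed for all classes together."""
    return {s.feature for s in stumps}

# Hypothetical stumps: one shared by all classes, one by two, one specific.
stumps = [
    SharedStump(0, 0.5, 1.0, -1.0, frozenset({"screen", "car", "face"})),
    SharedStump(1, 0.2, 0.8, -0.4, frozenset({"screen", "car"})),
    SharedStump(2, 0.7, 1.5, -0.5, frozenset({"face"})),
]
x = [0.9, 0.1, 0.8]
scores = {c: score(x, stumps, c) for c in ("screen", "car", "face")}
```

With independent classifiers every class would carry its own stumps and features, so the number of features to evaluate grows with the number of classes; sharing lets one feature evaluation serve several detectors.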
![Page 29: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/29.jpg)
50 training samples/class
29 object classes
2000 entries in the dictionary
Results averaged on 20 runs
Error bars = 80% interval
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
Shared features
Class-specific features
![Page 30: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/30.jpg)
Generalization as a function of object similarities
Two plots of area under ROC (y-axis) vs. number of training samples per class (x-axis). Panels: 12 viewpoints; 12 unrelated object classes. Annotations: K = 2.1, K = 4.8.
Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007
![Page 31: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/31.jpg)
Opelt, Pinz, Zisserman, CVPR 2006
Efficiency Generalization
J. Shotton, A. Blake, R. Cipolla. Multi-Scale Categorical Object Recognition Using Contour Fragments. In IEEE Trans. on PAMI, 30(7):1270-1281, July 2008.
![Page 32: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/32.jpg)
Sharing patches
• Bart and Ullman, 2004
For a new class, use only features similar to features that were good for other classes:
Proposed Dog features
![Page 33: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/33.jpg)
Some more references
• Baxter 1996
• Caruana 1997
• Schapire, Singer, 2000
• Thrun, Pratt 1997
• Krempp, Geman, Amit, 2002
• E.L. Miller, Matsakis, Viola, 2000
• Mahamud, Hebert, Lafferty, 2001
• Fink et al. 2003, 2004
• LeCun, Huang, Bottou, 2004
• Holub, Welling, Perona, 2005
• …
![Page 34: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/34.jpg)
Modeling object relationships
![Page 35: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/35.jpg)
The “guess what I am trying to detect” challenge
The detector challenge: by looking at the output of a detector on a random set of images, can you guess which object it is trying to detect?
![Page 36: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/36.jpg)
What object is the detector trying to detect?
The detector challenge: by looking at the output of a detector on a random set of images, can you guess which object it is trying to detect?
![Page 37: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/37.jpg)
1. chair, 2. table, 3. road, 4. road, 5. table, 6. car, 7. keyboard.
![Page 38: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/38.jpg)
The context challenge
How far can you go without using an object detector?
![Page 39: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/39.jpg)
2
1
What are the hidden objects?
![Page 40: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/40.jpg)
What are the hidden objects?
Chance ~ 1/30000
![Page 41: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/41.jpg)
$p(O \mid I) \propto p(I \mid O)\, p(O)$
Object model: $p(I \mid O)$. Context model: $p(O)$.
$I$: image. $O$: objects.
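A small numeric example shows how the context model can resolve an ambiguous object model. This is a sketch with made-up numbers: the likelihood and prior values, and the helper name `posterior`, are assumptions for illustration only.

```python
def posterior(likelihood, prior):
    """Normalize p(I|O) * p(O) over the candidate objects O."""
    unnormalized = {o: likelihood[o] * prior[o] for o in prior}
    z = sum(unnormalized.values())
    return {o: v / z for o, v in unnormalized.items()}

# Object model: local appearance alone cannot separate the two labels.
likelihood = {"car": 0.30, "boat": 0.30}
# Context model: in a street scene, cars are far more likely than boats.
prior_street = {"car": 0.9, "boat": 0.1}

post = posterior(likelihood, prior_street)
```

Because the likelihoods are equal here, the posterior is decided entirely by the context prior.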
![Page 42: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/42.jpg)
$p(O \mid I) \propto p(I \mid O)\, p(O)$
Object model Context model
Full joint, Scene model, Approx. joint
![Page 43: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/43.jpg)
$p(O \mid I) \propto p(I \mid O)\, p(O)$
Object model Context model
Full joint, Scene model, Approx. joint
![Page 44: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/44.jpg)
$p(O \mid I) \propto p(I \mid O)\, p(O)$
Object model Context model
Full joint, Scene model, Approx. joint
Scene model: $p(O) = \sum_{s} \prod_{i} p(O_i \mid S=s)\, p(S=s)$
Example scenes: office, street
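The scene-model prior sums over scenes and, within each scene, treats the objects as independent. A minimal sketch follows, assuming each $O_i$ is a binary present/absent variable and using invented probabilities; the function name `context_prior` and the tables are hypothetical.

```python
from math import prod

def context_prior(objects_present, p_obj_given_scene, p_scene):
    """p(O) = sum_s prod_i p(O_i | S=s) p(S=s) for binary O_i."""
    total = 0.0
    for s, ps in p_scene.items():
        per_object = [
            p_obj_given_scene[s][name] if present
            else 1.0 - p_obj_given_scene[s][name]
            for name, present in objects_present.items()
        ]
        total += ps * prod(per_object)
    return total

# Assumed values for two scenes and two object classes.
p_scene = {"office": 0.5, "street": 0.5}
p_obj_given_scene = {
    "office": {"screen": 0.8, "car": 0.1},
    "street": {"screen": 0.05, "car": 0.7},
}

p_screen_only = context_prior({"screen": True, "car": False},
                              p_obj_given_scene, p_scene)
p_both = context_prior({"screen": True, "car": True},
                       p_obj_given_scene, p_scene)
```

Although objects are independent given the scene, marginalizing over the scene makes them dependent: under these numbers a screen without a car is much more probable than a screen together with a car.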
![Page 45: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/45.jpg)
$p(O \mid I) \propto p(I \mid O)\, p(O)$
Object model Context model
Full joint, Scene model, Approx. joint
![Page 46: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/46.jpg)
Pixel labeling using MRFs
Enforce consistency between neighboring labels, and between labels and pixels
Carbonetto, de Freitas & Barnard, ECCV’04
Oi
![Page 47: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/47.jpg)
Beyond nearest-neighbor grids
• Most MRF/CRF models assume nearest-neighbor graph topology
• This cannot capture long-distance correlations
![Page 48: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/48.jpg)
Object-Object Relationships
Use latent variables to induce long distance correlations between labels in a Conditional Random Field (CRF)
He, Zemel & Carreira-Perpinan (04)
![Page 49: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/49.jpg)
Object-Object Relationships
[Kumar Hebert 2005]
![Page 50: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/50.jpg)
Object-Object Relationships
• Fink & Perona (NIPS 03)
Use output of boosting from other objects at previous iterations as input into boosting for this iteration
![Page 51: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/51.jpg)
Objects in Context
Building, boat, motorbike
Building, boat, person
Water, sky
Road
Most consistent labeling according to object co-occurrences & local label probabilities.
Boat
Building
Water
Road
A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora and S. Belongie. Objects in Context. ICCV 2007
![Page 52: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/52.jpg)
Objects in Context: Contextual Refinement
Contextual model based on co-occurrences
Try to find the most consistent labeling with high posterior probability and high mean pairwise interaction.
Use CRF for this purpose.
Boat
Building
Water
Road
Independent segment classification
Mean interaction of all label pairs
Φ(i,j) is basically the observed label co-occurrences in training set.
Slide by Gokberk Cinbis
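The trade-off between independent segment classification and pairwise label interaction can be sketched with an exhaustive search over labelings. This is not the CRF inference used in the cited work: brute-force enumeration is swapped in for clarity, and the local probabilities, the co-occurrence scores in `phi`, and the function name `best_labeling` are all hypothetical.

```python
from itertools import combinations, product
from math import log

def best_labeling(local_probs, phi):
    """Pick the labeling maximizing unary log-probabilities plus
    pairwise co-occurrence scores over all pairs of segments."""
    def score(assignment):
        unary = sum(log(local_probs[i][c])
                    for i, c in enumerate(assignment))
        pairwise = sum(phi.get(frozenset((a, b)), 0.0)
                       for a, b in combinations(assignment, 2))
        return unary + pairwise
    candidates = product(*[list(p) for p in local_probs])
    return max(candidates, key=score)

# Segment 0 is ambiguous; segment 1 is confidently water.
local_probs = [
    {"boat": 0.4, "motorbike": 0.6},
    {"water": 0.9, "road": 0.1},
]
# Assumed co-occurrence scores (higher means seen together more often).
phi = {
    frozenset(("boat", "water")): 2.0,
    frozenset(("motorbike", "road")): 1.0,
    frozenset(("motorbike", "water")): -2.0,
}

with_context = best_labeling(local_probs, phi)
without_context = best_labeling(local_probs, {})
```

Without the pairwise term the ambiguous segment takes its locally strongest label; with it, the confident water segment pulls the labeling toward the pair that co-occurs most.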
![Page 53: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/53.jpg)
Detecting difficult objects
Office
Maybe there is a mouse
Start recognizing the scene
Torralba, Murphy, Freeman. NIPS 2004.
![Page 54: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/54.jpg)
Detecting difficult objects
First detect simple objects (reliable detectors) that provide strong contextual constraints on the target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
![Page 55: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/55.jpg)
Detecting difficult objects
First detect simple objects (reliable detectors) that provide strong contextual constraints on the target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
![Page 56: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/56.jpg)
BRF for car detection: topology
Torralba Murphy Freeman (2004)
![Page 57: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/57.jpg)
BRF for car detection: results
Torralba Murphy Freeman (2004)
![Page 58: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/58.jpg)
A “car” out of context is less of a car
Car Building Road
b F G b F G b F G
From image
From detectors
Thresholded beliefs
![Page 59: Iccv2009 recognition and learning object categories p2 c02 - recognizing muliple objects in an image - sharing and context](https://reader034.vdocuments.us/reader034/viewer/2022051412/54968cf6b479590d248b4571/html5/thumbnails/59.jpg)
Contextual object relationships
Carbonetto, de Freitas & Barnard (2004)
Kumar, Hebert (2005)
Torralba Murphy Freeman (2004)
Fink & Perona (2003)
E. Sudderth et al (2005)