![Page 1: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/1.jpg)
Li Fei-Fei, PrincetonRob Fergus, MIT
Antonio Torralba, MIT
Recognizing and Learning Recognizing and Learning Object Categories: Year 2007Object Categories: Year 2007
CVPR 2007 Minneapolis, Short Course, June 17CVPR 2007 Minneapolis, Short Course, June 17
![Page 2: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/2.jpg)
AgendaAgenda
• Introduction
• Bag-of-words models
• Part-based models
• Discriminative methods
• Segmentation and recognition
• Datasets & Conclusions
![Page 3: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/3.jpg)
![Page 4: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/4.jpg)
perceptibleperceptible visionvision materialmaterialthingthing
![Page 5: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/5.jpg)
Plato said…• Ordinary objects are classified together if they
`participate' in the same abstract Form, such as the Form of a Human or the Form of Quartz.
• Forms are proper subjects of philosophical investigation, for they have the highest degree of reality.
• Ordinary objects, such as humans, trees, and stones, have a lower degree of reality than the Forms.
• Fictions, shadows, and the like have a still lower degree of reality than ordinary objects and so are not proper subjects of philosophical enquiry.
![Page 6: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/6.jpg)
Bruegel, 1564
![Page 7: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/7.jpg)
How many object categories are there?
Biederman 1987
![Page 8: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/8.jpg)
So what does object recognition involve?
![Page 9: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/9.jpg)
Verification: is that a lamp?
![Page 10: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/10.jpg)
Detection: are there people?
![Page 11: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/11.jpg)
Identification: is that Potala Palace?
![Page 12: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/12.jpg)
Object categorization
mountain
building
tree
banner
vendorpeople
street lamp
![Page 13: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/13.jpg)
Scene and context categorization
• outdoor
• city
• …
![Page 14: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/14.jpg)
Computational photography
![Page 15: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/15.jpg)
Assisted driving
meters
me
ters
Ped
Ped
Car
Lane detection
Pedestrian and car detection
• Collision warning systems with adaptive cruise control, • Lane departure warning systems, • Rear object detection systems,
![Page 16: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/16.jpg)
Improving online search
Query:STREET
Organizing photo collections
![Page 17: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/17.jpg)
Challenges 1: view point variation
Michelangelo 1475-1564
![Page 18: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/18.jpg)
Challenges 2: illumination
slide credit: S. Ullman
![Page 19: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/19.jpg)
Challenges 3: occlusion
Magritte, 1957
![Page 20: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/20.jpg)
Challenges 4: scale
![Page 21: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/21.jpg)
Challenges 5: deformation
Xu, Beihong 1943
![Page 22: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/22.jpg)
Challenges 6: background clutter
Klimt, 1913
![Page 23: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/23.jpg)
History: single object recognition
![Page 24: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/24.jpg)
History: single object recognition
• Lowe, et al. 1999, 2003
• Mahamud and Herbert, 2000
• Ferrari, Tuytelaars, and Van Gool, 2004
• Rothganger, Lazebnik, and Ponce, 2004
• Moreels and Perona, 2005
• …
![Page 25: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/25.jpg)
Challenges 7: intra-class variation
![Page 26: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/26.jpg)
History: early object categorization
![Page 27: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/27.jpg)
• Turk and Pentland, 1991• Belhumeur, Hespanha, &
Kriegman, 1997• Schneiderman & Kanade 2004• Viola and Jones, 2000
• Amit and Geman, 1999• LeCun et al. 1998• Belongie and Malik, 2002
• Schneiderman & Kanade, 2004• Argawal and Roth, 2002• Poggio et al. 1993
![Page 28: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/28.jpg)
![Page 29: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/29.jpg)
Object categorization: Object categorization: the statistical viewpointthe statistical viewpoint
)|( imagezebrap
)( ezebra|imagnopvs.
• Bayes rule:
)(
)(
)|(
)|(
)|(
)|(
zebranop
zebrap
zebranoimagep
zebraimagep
imagezebranop
imagezebrap ⋅=
posterior ratio likelihood ratio prior ratio
![Page 30: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/30.jpg)
Object categorization: Object categorization: the statistical viewpointthe statistical viewpoint
)(
)(
)|(
)|(
)|(
)|(
zebranop
zebrap
zebranoimagep
zebraimagep
imagezebranop
imagezebrap ⋅=
posterior ratio likelihood ratio prior ratio
• Discriminative methods model posterior
• Generative methods model likelihood and prior
![Page 31: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/31.jpg)
Discriminative
• Direct modeling of
Zebra
Non-zebra
Decisionboundary
)|(
)|(
imagezebranop
imagezebrap
![Page 32: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/32.jpg)
• Model and
Generative)|( zebraimagep ) |( zebranoimagep
MiddleLowHigh
MiddleLow
)|( zebranoimagep)|( zebraimagep
![Page 33: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/33.jpg)
Three main issuesThree main issues
• Representation– How to represent an object category
• Learning– How to form the classifier, given training data
• Recognition– How the classifier is to be used on novel data
![Page 34: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/34.jpg)
Representation
– Generative / discriminative / hybrid
![Page 35: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/35.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
![Page 36: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/36.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– Invariances• View point• Illumination• Occlusion• Scale• Deformation• Clutter• etc.
![Page 37: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/37.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– invariances– Part-based or global
w/sub-window
![Page 38: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/38.jpg)
Representation
– Generative / discriminative / hybrid
– Appearance only or location and appearance
– invariances– Parts or global w/sub-
window– Use set of features or
each pixel in image
![Page 39: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/39.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning
Learning
![Page 40: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/40.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)
– Methods of training: generative vs. discriminative
Learning
![Page 41: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/41.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)
– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)
– Level of supervision• Manual segmentation; bounding box; image
labels; noisy labels
Learning
Contains a motorbike
![Page 42: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/42.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)
– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)
– Level of supervision• Manual segmentation; bounding box; image
labels; noisy labels– Batch/incremental (on category and image
level; user-feedback )
Learning
![Page 43: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/43.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)
– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)
– Level of supervision• Manual segmentation; bounding box; image
labels; noisy labels– Batch/incremental (on category and image
level; user-feedback ) – Training images:
• Issue of overfitting• Negative images for discriminative methods
Priors
Learning
![Page 44: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/44.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)
– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)
– Level of supervision• Manual segmentation; bounding box; image
labels; noisy labels– Batch/incremental (on category and image
level; user-feedback ) – Training images:
• Issue of overfitting• Negative images for discriminative methods
– Priors
Learning
![Page 45: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/45.jpg)
– Scale / orientation range to search over – Speed– Context
Recognition
![Page 46: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/46.jpg)
Hoi
em, E
fros
, Her
bert
, 200
6
![Page 47: Cvpr2007 object category recognition p0 - introduction](https://reader036.vdocuments.us/reader036/viewer/2022062313/559b5d2f1a28abc27f8b4817/html5/thumbnails/47.jpg)
OBJECTS
ANIMALS INANIMATEPLANTS
MAN-MADENATURALVERTEBRATE …..
MAMMALS BIRDS
GROUSEBOARTAPIR CAMERA