![Page 1: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/1.jpg)
Recognition by Association: ask not “What is it?”
ask “What is it like?”
Tomasz Malisiewicz and Alyosha Efros
CMU
CVPR’08
![Page 2: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/2.jpg)
Understanding an Image
slide by Fei Fei, Fergus & Torralba
![Page 3: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/3.jpg)
Object naming -> Object categorization
sky
building
flag
wall
banner
bus
cars
bus
face
street lamp
slide by Fei Fei, Fergus & Torralba
![Page 4: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/4.jpg)
Object categorization
sky
building
flag
wall
banner
bus
cars
bus
face
street lamp
![Page 5: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/5.jpg)
Classical View of Categories
• Dates back to Plato & Aristotle
1. Categories are defined by a list of properties shared by all elements in a category
2. Category membership is binary
3. Every member in the category is equal
![Page 6: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/6.jpg)
Problems with Classical View
• Humans don’t do this!
– People don’t rely on abstract definitions / lists of shared properties (Rosch 1973)
• e.g. Are curtains furniture?
– Typicality
• e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.
– Intransitivity
• e.g. car seat is chair, chair is furniture, but …
– Not language-independent
• e.g. “Women, Fire, and Dangerous Things” category in an Australian aboriginal language (Lakoff 1987)
– Doesn’t work even in human-defined domains
• e.g. Is Pluto a planet?
![Page 7: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/7.jpg)
Problems with Visual Categories
Chair
• A lot of categories are functional
Car
• Different views of same object can be visually dissimilar
![Page 8: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/8.jpg)
Categorization in Modern Psychology
• Prototype Theory (Rosch 1973)
– One or more summary representations (prototypes) for each category
– Humans compute similarity between input and prototypes
• Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
– Categories represented in terms of remembered objects (exemplars)
– Similarity is measured between input and all exemplars
– Think non-parametric density estimation
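The contrast between the two theories can be sketched in a few lines. This is only an illustration, not the authors' method: the 2-D feature vectors, the Euclidean distance, the Gaussian kernel, and the names `prototype_classify` and `exemplar_classify` are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prototype_classify(x, exemplars):
    """Prototype theory: compare the input to one mean vector per category."""
    best_label, best_dist = None, float("inf")
    for label, points in exemplars.items():
        dim = len(points[0])
        mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        d = euclidean(x, mean)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def exemplar_classify(x, exemplars, bandwidth=1.0):
    """Exemplar theory: average a kernel similarity over every remembered
    exemplar of each category (a Parzen-window style density estimate).
    The Gaussian kernel and bandwidth are assumptions for this sketch."""
    scores = {}
    for label, points in exemplars.items():
        total = sum(
            math.exp(-euclidean(x, p) ** 2 / (2 * bandwidth ** 2)) for p in points
        )
        scores[label] = total / len(points)
    return max(scores, key=scores.get)

# Category "A" has two distant modes; its mean falls where no exemplar lives.
memory = {
    "A": [(0.0, 0.0), (10.0, 0.0)],
    "B": [(4.0, 1.0), (6.0, 1.0)],
}
query = (5.0, 0.0)
by_prototype = prototype_classify(query, memory)
by_exemplar = exemplar_classify(query, memory)
```

In this toy memory the query sits exactly on the mean of "A" but far from every actual "A" exemplar, so the two rules disagree. That is the situation a multi-modal category (for example, very different views of the same object) creates for a single summary representation.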
![Page 9: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/9.jpg)
Different way of looking at recognition
Car
Car
Car
Road
Building
Input Image
![Page 10: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/10.jpg)
Different way of looking at recognition
![Page 11: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/11.jpg)
What is the ultimate goal?
• Parsing Images
• A “what is it like?” machine
• A kind of “visual memex”
![Page 12: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/12.jpg)
Recognition as Association
LabelMe Dataset
12,905 Object Exemplars
171 unique ‘labels’
http://labelme.csail.mit.edu/
![Page 13: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/13.jpg)
Our Contributions
• Posing Recognition as Association
– Use large number of object exemplars
• Learning Object Similarity
– Different distance function per exemplar
• Recognition-Based Object Segmentation
– Use multiple segmentation approach
![Page 14: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/14.jpg)
Measuring Similarity
![Page 15: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/15.jpg)
Exemplar Representation
Segment from LabelMe
![Page 16: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/16.jpg)
Shape
Centered Mask
Bounding Box Dimensions
Pixel Area
Obj~Obj
![Page 17: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/17.jpg)
Texture
Textons
Top, bottom, left, right boundary
Interior: Bag-of-Words
![Page 18: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/18.jpg)
Color
Mean Color
Color std
Color Histogram
![Page 19: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/19.jpg)
Location
Absolute Position Mask
Top Height
Bot Height
[Figure: height scale from 0 to 1 with marked values .42 and .8]
![Page 20: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/20.jpg)
Distance “Similarity” Functions
• Positive Linear Combinations of Elementary Distances Computed Over 14 Features
[Figure: exemplar Building e and the Building e Distance Function]
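A positive linear combination of elementary distances can be sketched as follows. This is a minimal illustration under stated assumptions: the slides use 14 features, while the example uses two; the per-feature Euclidean distance is a stand-in, and the names `elementary_distances` and `combined_distance` are hypothetical.

```python
import math

def elementary_distances(exemplar_feats, other_feats):
    """One elementary distance per feature.

    Each argument is a list of feature vectors (one vector per feature type).
    Euclidean distance is an assumed stand-in for the per-feature distances.
    """
    return [math.dist(a, b) for a, b in zip(exemplar_feats, other_feats)]

def combined_distance(weights, elementary):
    """Exemplar-specific distance: a non-negative weighted sum of elementary distances."""
    if len(weights) != len(elementary):
        raise ValueError("need exactly one weight per elementary distance")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary))

# Two illustrative features for a focal exemplar and a candidate object.
focal = [(0.0, 0.0), (1.0,)]
candidate = [(3.0, 4.0), (1.5,)]
weights_for_focal = [0.5, 2.0]  # each exemplar would carry its own weight vector

dists = elementary_distances(focal, candidate)
total = combined_distance(weights_for_focal, dists)
```

Because the weights belong to the focal exemplar, the distance is not symmetric: the same pair of objects can score differently depending on which one is treated as the exemplar.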
![Page 21: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/21.jpg)
Learning Object Similarity
• Learn a different distance function for each exemplar in training set
• Formulation is similar to Frome et al. [1,2]
[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS, 2006.
[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.
![Page 22: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/22.jpg)
Non-parametric density estimation
[Figure: points from Class 1, Class 2, and Class 3 plotted against a Color Dimension axis and a Shape Dimension axis]
![Page 23: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/23.jpg)
Non-parametric density estimation
[Figure: points from Class 1, Class 2, and Class 3 plotted against a Color Dimension axis and a Shape Dimension axis]
![Page 24: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/24.jpg)
Non-parametric density estimation
[Figure: points from Class 1, Class 2, and Class 3 plotted against a Color Dimension axis and a Shape Dimension axis]
![Page 25: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/25.jpg)
Learning Distance Functions
[Figure: axes Dshape and Dcolor, with the Focal Exemplar marked]
![Page 26: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/26.jpg)
Learning Distance Functions
[Figure: axes Dshape and Dcolor, with the Focal Exemplar marked; a Decision Boundary separates the “similar” side from the “dissimilar” side; some points are marked Don’t Care]
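One way to picture learning a distance function for a single focal exemplar is the sketch below. The slides do not give the optimization procedure, so everything here is an assumption: a hinge-style loss around the boundary D = 1, batch subgradient descent, projection onto non-negative weights, and the hypothetical name `learn_weights`. "Don't Care" objects are simply left out of both training lists.

```python
def dot(w, d):
    """Weighted sum of elementary distances, i.e. the learned distance D."""
    return sum(wi * di for wi, di in zip(w, d))

def learn_weights(similar, dissimilar, n_features, lr=0.05, epochs=200, margin=0.1):
    """Fit non-negative weights for one focal exemplar.

    similar / dissimilar: lists of elementary-distance vectors measured from
    the focal exemplar to other training objects.
    Goal: D < 1 for similar objects and D > 1 for dissimilar ones.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for d in similar:
            if dot(w, d) > 1.0 - margin:   # too far: push the distance down
                grad = [g + x for g, x in zip(grad, d)]
        for d in dissimilar:
            if dot(w, d) < 1.0 + margin:   # too close: push the distance up
                grad = [g - x for g, x in zip(grad, d)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wi - lr * g) for wi, g in zip(w, grad)]
    return w

# Elementary distances as (Dshape, Dcolor) pairs relative to the focal exemplar.
similar = [[0.1, 0.9], [0.2, 0.5]]      # same shape, varying color
dissimilar = [[0.9, 0.1], [0.8, 0.6]]   # different shape
w = learn_weights(similar, dissimilar, n_features=2)
```

On this toy data the learned weights put most of the mass on the shape distance, which is the behavior the figure suggests: the focal exemplar decides for itself which elementary distances matter.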
![Page 27: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/27.jpg)
Visualizing Distance Functions (Training Set)
Query
Query
Top Neighbors with Tex-Hist Dist
Top Neighbors with Learned Dist
![Page 28: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/28.jpg)
Visualizing Distance Functions (Training Set)
![Page 29: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/29.jpg)
Visualizing Distance Functions (Training Set)
![Page 30: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/30.jpg)
Visualizing Distance Functions (Training Set)
![Page 31: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/31.jpg)
Visualizing Distance Functions (Training Set)
[Figure labels: person, person, person, person, person]
![Page 32: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/32.jpg)
Visualizing Distance Functions (Training Set)
[Figure labels: person, person, person, standing person, woman, person]
Different Label on “similar” side of distance function
![Page 33: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/33.jpg)
Labels Crossing Boundary
![Page 34: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/34.jpg)
“Conventional” Recognition in Test Set
• Compute the similarity between an input and all exemplars
• All exemplars with D < 1 are “associated” with the input
• Most occurring label from associations is propagated onto input
• Association confidence score favors more associations and smaller distances
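The four steps above can be sketched as a small function. The threshold D < 1 and the majority vote come from the slide; the confidence formula (sum of 1 - D over all associations) is an assumption chosen only because it grows with more associations and with smaller distances, and the name `associate` is hypothetical.

```python
from collections import Counter

def associate(distances, labels, threshold=1.0):
    """Label an input from its associated exemplars.

    distances[i] is the input's distance under exemplar i's own distance
    function; labels[i] is that exemplar's label.
    Returns (label, confidence), or (None, 0.0) when nothing is associated.
    """
    hits = [(d, lab) for d, lab in zip(distances, labels) if d < threshold]
    if not hits:
        return None, 0.0
    votes = Counter(lab for _, lab in hits)
    label = votes.most_common(1)[0][0]
    # Assumed score: rewards more associations and smaller distances.
    confidence = sum(threshold - d for d, _ in hits)
    return label, confidence

label, confidence = associate(
    [0.2, 0.7, 1.4, 0.9],
    ["car", "car", "road", "building"],
)
```

Returning `None` for an input with no associations keeps open the option of saying nothing about a segment, rather than forcing it into one of the known labels.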
![Page 35: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/35.jpg)
Performance on labeling perfect segments (test set)
![Page 36: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/36.jpg)
Object Segmentation via Recognition
• Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)
– Mean-Shift and Normalized Cuts
– Use pairs and triplets of adjacent segments
– Generate about 10,000 segments per image
• Enhance training with bad segments
• Apply learned distance functions to bottom-up segments
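The "pairs and triplets of adjacent segments" step can be sketched as an enumeration over a segment adjacency graph. This assumes the base segments (from Mean-Shift or Normalized Cuts in the slides) and their adjacency are already computed; the dictionary format and the name `candidate_segments` are illustrative assumptions.

```python
def candidate_segments(adjacency):
    """Enumerate single segments plus connected pairs and triplets.

    adjacency: dict mapping a segment id to the set of ids it touches.
    Returns a set of frozensets, each one a candidate region to be matched.
    """
    candidates = {frozenset([n]) for n in adjacency}
    # Pairs: every edge of the adjacency graph.
    for a, neighbors in adjacency.items():
        for b in neighbors:
            candidates.add(frozenset([a, b]))
    # Triplets: grow each pair by a segment adjacent to either member.
    pairs = [c for c in candidates if len(c) == 2]
    for pair in pairs:
        for member in pair:
            for extra in adjacency[member]:
                if extra not in pair:
                    candidates.add(pair | {extra})
    return candidates

# Four base segments in a chain: 0 - 1 - 2 - 3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
regions = candidate_segments(chain)
```

Only connected groups are produced, so segments that do not touch are never merged into one hypothesis.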
![Page 37: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/37.jpg)
Example Associations
Bottom-Up Segments
![Page 38: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/38.jpg)
Quantitative Evaluation
Object hypothesis is correct if labels match and OS > .5
*We do not penalize for multiple correct overlapping associations
OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
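The overlap score and the correctness test can be written directly from the definitions above. Representing a segment as a set of pixel coordinates is an assumption made to keep the sketch dependency-free, and `is_correct` is a hypothetical helper name.

```python
def overlap_score(a, b):
    """OS(A, B) = |A intersect B| / |A union B| for two sets of pixel coordinates."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty segments: define the score as 0
    return len(a & b) / len(union)

def is_correct(pred_label, pred_pixels, true_label, true_pixels, threshold=0.5):
    """An object hypothesis is correct if labels match and OS > threshold."""
    return pred_label == true_label and overlap_score(pred_pixels, true_pixels) > threshold

truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted = {(0, 1), (1, 1), (0, 2), (1, 2)}   # shares 2 of 6 pixels in the union
trimmed = {(0, 0), (0, 1), (1, 0)}           # shares 3 of 4 pixels in the union

os_shifted = overlap_score(truth, shifted)
os_trimmed = overlap_score(truth, trimmed)
```

The strict inequality matters at the boundary: a hypothesis with an overlap score of exactly .5 is not counted as correct.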
![Page 39: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/39.jpg)
Toward Image Parsing
![Page 40: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/40.jpg)
Conclusion: Main Points
• Object Association: defining an object in terms of a set of visually similar objects. Trying to get away from classes.
• Learning per-exemplar distances: each object gets to decide on its own distance function. Suddenly, NN distances are meaningful!
• Using multiple segmentations: partition the input image into manageable chunks that can then be matched
![Page 41: 1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros CMU CVPR’08](https://reader035.vdocuments.us/reader035/viewer/2022070307/551b5581550346d41a8b621d/html5/thumbnails/41.jpg)
Thank You
Questions?