Class 5: Attributes and Semantic Features
Rogerio Feris, Feb 21, 2013
EECS 6890 – Topics in Information Processing
Spring 2013, Columbia University
http://rogerioferis.com/VisualRecognitionAndSearch
Thanks for sending the project proposals!
Project update presentations (10 min per group)
March 14
April 11
Details will be provided on the course website
Visual Recognition And Search Columbia University, Spring 2013
Project Report
Plan for Today
Introduction to Semantic Features
Attribute-based Classification and Search
Attributes for Fine-Grained Classification
Relative Attributes
Project Proposal Presentations
Use the scores of semantic classifiers as high-level features
Off-the-shelf Classifiers … (each outputs a Score)
Compact / powerful descriptor with semantic meaning (allows “explaining” the decision)
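A minimal sketch of the idea above, assuming each semantic classifier is a linear model followed by a sigmoid. The classifier names, weights, and the 3-D low-level feature are made up for illustration and are not taken from any of the cited systems.

```python
import math

def linear_score(weights, bias, features):
    """Score of one linear classifier: sigmoid(w . x + b), a value in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def semantic_descriptor(classifiers, features):
    """Concatenate the scores of all semantic classifiers into one vector."""
    return [linear_score(w, b, features) for (_name, w, b) in classifiers]

# Hypothetical bank of semantic classifiers: (name, weights, bias).
CLASSIFIERS = [
    ("water", [2.0, 0.0, 0.0], -1.0),
    ("sand",  [0.0, 2.0, 0.0], -1.0),
    ("sky",   [0.0, 0.0, 2.0], -1.0),
]

low_level = [1.0, 1.0, 0.0]  # made-up low-level feature vector
descriptor = semantic_descriptor(CLASSIFIERS, low_level)

# Each dimension has a name, which is what allows "explaining" a decision.
for (name, _w, _b), score in zip(CLASSIFIERS, descriptor):
    print(f"{name}: {score:.2f}")
```

The resulting vector has one entry per semantic classifier and can be fed to a higher-level classifier in place of the raw features.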
Semantic Features
[Figure labels: Input Image; Water Classifier, Sand Classifier, Sky Classifier; Concatenation / Dimensionality Reduction; Semantic Features; Beach Classifier]
[John Smith et al, Multimedia Semantic Indexing Using Model Vectors, ICME 2003]
Semantic Features (Frame-Level): illustration of early IBM work (multimedia community) describing this concept
System evolved to the IBM Multimedia Analysis and Retrieval System (IMARS)
[Rong Yan et al, Model-Shared Subspace Boosting for Multi-label Classification, KDD 2007]
Discriminative semantic basis
Rapid event modeling, e.g., “accident with high-speed skidding”
Ensemble Learning
Semantic Features (Frame-level)
Descriptor is formed by concatenating the outputs of weakly trained classifiers called classemes (trained with noisy labels)
[L. Torresani et al, Efficient Object Category Recognition Using Classemes, ECCV 2010]
Images used to train the “table” classeme (from Google image search)
Noisy Labels
Classemes (Frame-level)
Compact and Efficient Descriptor, useful for large-scale classification
Features are not really semantic!
Classemes (Frame-level)
Object Bank [Li-Jia Li et al, Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification]
http://vision.stanford.edu/projects/objectbank/
State-of-the-art scene classification results (~7 seconds per image)
Semantic Features (Object Level)
Describing vs. Naming
Bald
Beard
Red Shirt
Modifiers rather than (or in addition to) nouns
Semantic properties that are shared among objects
Attributes are category independent and transferable
Semantic Attributes
Attribute-Based Search
Traditional Approaches: Face Recognition (“Naming”)
Face recognition is very challenging under lighting changes, pose variation, and low-resolution imagery (typical conditions in surveillance scenarios)
Attribute-based People Search (“Describing”)
[Vaquero et al, Attribute-based People Search in Surveillance Environments, WACV 2009]
Rather than relying on face recognition only, a complementary people search framework based on semantic attributes is provided
Query Example:
“Show me all bald people at the 42nd street station last month with dark skin, wearing sunglasses, wearing a red jacket”
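A rough sketch of how such a query could be answered once attribute detectors have scored each detected person. The record layout, the attribute names, the 0.5 threshold, and ranking by the product of scores are all assumptions made for illustration; the slide does not describe how the cited system ranks results, and the location/time part of the query is left out.

```python
def search(detections, query, threshold=0.5):
    """Return ids of detections whose score is at least `threshold` for every
    queried attribute, ranked by the product of the queried attribute scores."""
    hits = []
    for det in detections:
        # A missing attribute counts as score 0.0 (detector did not fire).
        scores = [det["attributes"].get(attr, 0.0) for attr in query]
        if all(s >= threshold for s in scores):
            rank = 1.0
            for s in scores:
                rank *= s
            hits.append((rank, det["id"]))
    hits.sort(reverse=True)  # highest combined score first
    return [det_id for _rank, det_id in hits]

# Made-up attribute detector outputs for three person tracks.
DETECTIONS = [
    {"id": "track-01", "attributes": {"bald": 0.9, "sunglasses": 0.8, "red_jacket": 0.7}},
    {"id": "track-02", "attributes": {"bald": 0.2, "sunglasses": 0.9, "red_jacket": 0.9}},
    {"id": "track-03", "attributes": {"bald": 0.8, "sunglasses": 0.6, "red_jacket": 0.95}},
]

results = search(DETECTIONS, ["bald", "sunglasses", "red_jacket"])
print(results)
```

Note that no training image of the target person is needed: the query is built only from attribute names.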
People Search in Surveillance Videos
People Search based on textual descriptions - It does not require training images for the target suspect.
Robustness: attribute detectors are trained using lots of training images covering different lighting conditions, pose variation, etc.
Works well in low-resolution imagery (typical in video surveillance scenarios)
People Search in Surveillance Videos
Modeling attribute correlations
[Siddiquie, Feris and Davis, “Image Ranking and Retrieval Based on Multi-Attribute Queries”, CVPR 2011]
Attribute-Based Classification
Recognition of Unseen Classes (Zero-Shot Learning)
[Lampert et al, Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer, CVPR 2009]
1) Train semantic attribute classifiers
2) Obtain a classifier for an unseen object (no training samples) by just specifying which attributes it has
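The two steps above can be sketched as follows, assuming step 1 has already produced per-attribute probabilities for a test image. This is a simplified variant of attribute-based prediction: it multiplies the attribute probabilities that agree with each class signature and omits the attribute priors used in the cited paper. The class names, attribute names, and numbers are hypothetical.

```python
def class_score(attribute_probs, signature):
    """Product over attributes of p(attribute present) if the class has the
    attribute, or p(attribute absent) if it does not."""
    score = 1.0
    for attr, has_it in signature.items():
        p = attribute_probs[attr]
        score *= p if has_it else (1.0 - p)
    return score

def zero_shot_classify(attribute_probs, signatures):
    """Pick the class whose attribute signature best matches the image."""
    return max(signatures, key=lambda c: class_score(attribute_probs, signatures[c]))

# Step 2: unseen classes are specified only by which attributes they have.
SIGNATURES = {
    "zebra": {"striped": True,  "four_legged": True,  "aquatic": False},
    "whale": {"striped": False, "four_legged": False, "aquatic": True},
}

# Step 1 output (made up): attribute classifier probabilities for one image.
probs = {"striped": 0.9, "four_legged": 0.8, "aquatic": 0.1}

predicted = zero_shot_classify(probs, SIGNATURES)
print(predicted)  # prints "zebra"
```

No image of a zebra or a whale is used anywhere; only the attribute classifiers are trained on images, and they are shared across classes.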
Attribute-based Classification
[Figure: Flat multi-class classification vs. Attribute-based classification of Unseen categories, using Semantic Attribute Classifiers]
Attribute-based Classification
Face verification [Kumar et al, ICCV 2009]
Bird Categorization [Farrell et al, ICCV 2011]
Animal Recognition [Lampert et al, CVPR 2009]
Person Re-identification [Layne et al, BMVC 2012]
Action recognition [Liu et al, CVPR 2011]
Many more! Significant growth in the past few years
Attribute-based Classification
Note: Several recent methods use the term “attributes” to refer to non-semantic model outputs
In this case attributes are just mid-level features, like PCA, hidden layers in neural nets, … (non-interpretable splits)
Attribute-based Classification
http://rogerioferis.com/VisualRecognitionAndSearch/Resources.html
Attribute-based Classification
Attributes for Fine-Grained Categorization
Machines collaborating with humans to organize visual knowledge, connecting text to images, images to text, and images to images
Easy annotation interface for experts (powered by computer vision)
Visual Query: Fine-grained Bird Categorization
Picture credit: Serge Belongie
Fine-Grained Categorization
Visipedia (http://visipedia.org/)
Is it an African or Indian Elephant?
[Figure: African elephant | Indian elephant]
Example-based Fine-Grained Categorization is Hard!!
Slide Credit: Christoph Lampert
Fine-Grained Categorization
Is it an African or Indian Elephant?
[Figure: African elephant (Larger Ears) | Indian elephant (Smaller Ears)]
Visual distinction of subordinate categories may be quite subtle, usually based on Parts and Attributes
Fine-Grained Categorization
Standard classification methods may not be suitable because the variation between classes is small …
… and intra-class variation is still high.
[Figure: Codebook] [B. Yao, CVPR 2012]
Fine-Grained Categorization
Humans rely on field guides!
Field guides usually refer to parts and attributes of the object
Slide Credit: Pietro Perona
Fine-Grained Categorization
Fine-Grained Categorization
[Branson et al, Visual Recognition with Humans in the Loop, ECCV 2010]
Computer vision reduces the amount of human interaction (minimizes the number of questions)
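One way to picture how vision can cut down the number of questions: start from the class probabilities produced by the vision system, ask the yes/no question that is expected to reduce uncertainty the most, and update the probabilities with the answer. The sketch below assumes binary attribute questions and a table of p(user answers "yes" | class); the classes, questions, and numbers are invented, and this is an illustration of the general idea rather than the exact procedure of the cited paper.

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a class distribution given as {class: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def update(posterior, question, answer, answer_model):
    """Bayes update: multiply by p(answer | class) and renormalize."""
    new = {}
    for c, p in posterior.items():
        p_yes = answer_model[c][question]
        new[c] = p * (p_yes if answer else 1.0 - p_yes)
    total = sum(new.values())
    return {c: p / total for c, p in new.items()}

def expected_entropy(posterior, question, answer_model):
    """Entropy expected after asking `question`, averaged over both answers."""
    p_yes = sum(p * answer_model[c][question] for c, p in posterior.items())
    result = 0.0
    for answer, p_answer in ((True, p_yes), (False, 1.0 - p_yes)):
        if p_answer > 0:
            result += p_answer * entropy(update(posterior, question, answer, answer_model))
    return result

def best_question(posterior, questions, answer_model):
    """Question with the lowest expected remaining entropy."""
    return min(questions, key=lambda q: expected_entropy(posterior, q, answer_model))

# Hypothetical p(user says "yes" | class) for two questions.
ANSWER_MODEL = {
    "cardinal": {"red_body": 0.95, "has_crest": 0.9},
    "blue_jay": {"red_body": 0.05, "has_crest": 0.9},
    "crow":     {"red_body": 0.05, "has_crest": 0.1},
}
# Hypothetical output of the vision system: crow is already unlikely.
vision_prior = {"cardinal": 0.45, "blue_jay": 0.45, "crow": 0.10}

question = best_question(vision_prior, ["red_body", "has_crest"], ANSWER_MODEL)
posterior = update(vision_prior, question, True, ANSWER_MODEL)
print(question, max(posterior, key=posterior.get))
```

Because the vision prior has nearly ruled out one class, the question that only separates that class is not worth asking, so fewer questions are needed overall.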
Fine-Grained Categorization
[Wah et al, Multiclass Recognition and Part Localization with Humans in the Loop, ICCV 2011]
Localized part and attribute detectors.
Questions include asking the user to localize parts.
Fine-Grained Categorization
http://www.vision.caltech.edu/visipedia/CUB-200-2011.html
Fine-Grained Categorization
Video Demo: http://www.youtube.com/watch?v=_ReKVqnDXzA
Fine-Grained Categorization
Like a normal field guide… that you can search and sort, and with visual recognition
See N. Kumar et al, "Leafsnap: A Computer Vision System for Automatic Plant Species Identification", ECCV 2012
Nearly 1 million downloads
40k new users per month
100k active users
1.7 million images taken
100k new images/month
100k users with > 5 images
Users from all over the world
Botanists, educators, kids, hobbyists, photographers, …
Slide Credit: Neeraj Kumar
Check the fine-grained visual categorization workshop: http://www.fgvc.org/
Fine-Grained Categorization
Relative Attributes
[Parikh & Grauman, Relative Attributes, ICCV 2011]
Smiling | ??? | Not smiling
Natural | ??? | Not natural
Slide credit: Parikh & Grauman
Relative Attributes
For each attribute, e.g., “openness”
Supervision consists of:
Ordered pairs
Similar pairs
Slide credit: Parikh & Grauman
Learning Relative Attributes
Learn a ranking function of the Image features, with Learned parameters,
that best satisfies the constraints:
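For reference, the ranking function and constraints can be written as below. The notation follows the cited paper [Parikh & Grauman, ICCV 2011] and is not recoverable from the slide text itself: $\mathbf{x}_i$ is the feature vector of image $i$, $\mathbf{w}_m$ the learned parameters for attribute $a_m$, $O_m$ the set of ordered pairs (image $i$ shows more of the attribute than image $j$), and $S_m$ the set of similar pairs.

```latex
\begin{align}
  r_m(\mathbf{x}_i) &= \mathbf{w}_m^{\top} \mathbf{x}_i \\
  \forall (i,j) \in O_m &: \quad \mathbf{w}_m^{\top} \mathbf{x}_i > \mathbf{w}_m^{\top} \mathbf{x}_j \\
  \forall (i,j) \in S_m &: \quad \mathbf{w}_m^{\top} \mathbf{x}_i = \mathbf{w}_m^{\top} \mathbf{x}_j
\end{align}
```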
Slide credit: Parikh & Grauman
Learning Relative Attributes
Max-margin learning to rank formulation
Based on [Joachims 2002]
[Figure: images numbered 1–6 ordered by Relative Attribute Score; labels: Image, Rank Margin]
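A small sketch of learning such a ranking function for one attribute. It minimizes a regularized objective with squared slack on ordered pairs (margin of 1) and a squared penalty on score differences of similar pairs, using plain gradient descent. This is a simplification for illustration: it is not the solver of [Joachims 2002] or of the relative attributes paper, and the image names, 2-D features, and hyperparameters (`C`, `lr`, `epochs`) are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_ranker(features, ordered, similar, C=1.0, lr=0.05, epochs=1000):
    """Gradient descent on
         0.5*||w||^2
         + C * sum over ordered (i, j) of max(0, 1 - w.(x_i - x_j))^2
         + C * sum over similar (i, j) of (w.(x_i - x_j))^2
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        grad = list(w)  # gradient of the regularizer 0.5*||w||^2
        for i, j in ordered:  # image i should score higher than image j
            d = [a - b for a, b in zip(features[i], features[j])]
            slack = max(0.0, 1.0 - dot(w, d))
            for k in range(dim):
                grad[k] -= 2.0 * C * slack * d[k]
        for i, j in similar:  # images i and j should score about the same
            d = [a - b for a, b in zip(features[i], features[j])]
            gap = dot(w, d)
            for k in range(dim):
                grad[k] += 2.0 * C * gap * d[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

# Made-up 2-D image features; the first dimension tracks the attribute.
FEATURES = {
    "A": [0.9, 0.2], "B": [0.6, 0.8], "C": [0.3, 0.5],
    "D": [0.1, 0.1], "E": [0.32, 0.1],
}
ORDERED = [("A", "B"), ("B", "C"), ("C", "D")]  # A > B > C > D
SIMILAR = [("C", "E")]                          # C ~ E

w = train_ranker(FEATURES, ORDERED, SIMILAR)
scores = {name: dot(w, x) for name, x in FEATURES.items()}
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

The learned score is a real value per image, so any two images can be compared on the attribute, including images never seen in a training pair.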
Slide credit: Parikh & Grauman
Learning Relative Attributes
Each image is converted into a vector of relative attribute scores indicating the strength of each attribute
A Gaussian distribution for each category is built in the relative attribute space. The distribution of unseen categories is estimated based on the specified constraints and the distributions of seen categories
Max-likelihood is then used for classification
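A simplified sketch of these three steps, assuming diagonal Gaussians and covering only one kind of constraint: an unseen category described as lying between two seen categories on every attribute. In that case the sketch places the unseen mean at the midpoint of the two seen means and averages their variances. The cited paper also handles other relations and uses full covariances; the category names and score vectors here are invented.

```python
import math

def fit_gaussian(points):
    """Per-dimension mean and variance (diagonal Gaussian) of score vectors."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    # Small constant keeps the variance strictly positive.
    var = [sum((p[k] - mean[k]) ** 2 for p in points) / n + 1e-6 for k in range(dim)]
    return mean, var

def log_likelihood(x, gaussian):
    """Log density of x under a diagonal Gaussian (mean, var)."""
    mean, var = gaussian
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, mean, var))

def unseen_between(g_low, g_high):
    """Gaussian for an unseen category specified as 'between' two seen ones."""
    (m1, v1), (m2, v2) = g_low, g_high
    mean = [(a + b) / 2 for a, b in zip(m1, m2)]
    var = [(a + b) / 2 for a, b in zip(v1, v2)]
    return mean, var

def classify(x, models):
    """Maximum-likelihood category for a relative attribute score vector."""
    return max(models, key=lambda c: log_likelihood(x, models[c]))

# Made-up relative attribute scores (two attributes) for two seen categories.
models = {
    "seen_low":  fit_gaussian([[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]),
    "seen_high": fit_gaussian([[2.0, 1.9], [2.2, 2.1], [1.8, 2.0]]),
}
# Unseen category: "more than seen_low, less than seen_high" on both attributes.
models["unseen_mid"] = unseen_between(models["seen_low"], models["seen_high"])

label = classify([1.0, 1.1], models)
print(label)
```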
[Figure legend: Blue: Seen class; Green: Unseen class]
Relative Zero-Shot Learning
Slide credit: Parikh & Grauman
Relative Image Description
Slide credit: Kristen Grauman
Whittle Search
http://pub.ist.ac.at/~chl/PnA2012/
http://rogerioferis.com/PartsAndAttributes/
Summary
Semantic attribute classifiers can be useful for:
Describing images of unknown objects [Farhadi et al, CVPR 2009]
Recognizing unseen classes [Lampert et al, CVPR 2009]
Reducing dataset bias (trained across classes)
Effective object search in surveillance videos [Vaquero et al, WACV 2009]
Compact descriptors / Efficient image retrieval [Douze et al, CVPR 2011]
Fine-grained object categorization [Wah et al, ICCV 2011]
Face verification [Kumar et al, ICCV 2009], Action recognition [Liu et al, CVPR 2011], Person re-identification [Layne et al, BMVC 2012] and other classification tasks.
Other applications, such as sentence generation from images [Kulkarni et al, CVPR 2011], image aesthetics prediction [Dhar et al, CVPR 2011], …
Extensive annotation may be required for attribute classifiers
Class-attribute relations may be automatically extracted from textual sources
[Rohrbach et al, What Helps Where – And Why? Semantic Relatedness for Knowledge Transfer, CVPR 2010]; [Berg et al, Automatic Attribute Discovery and Characterization from Noisy Web Data, ECCV 2010].
Semantic Attributes may not be discriminative
Various methods combine semantic attributes with “discriminative attributes” (non-semantic) for classification (e.g., [Farhadi et al, CVPR 2009]). Construction of nameable + discriminative attributes has also been proposed by [Parikh & Grauman, Interactively Building a Discriminative Vocabulary of Nameable Attributes, CVPR 2011]
Summary