Learning Shared Body Plans
Ian Endres, University of Illinois
work with Derek Hoiem, Vivek Srikumar and Ming-Wei Chang
How should we represent multiple related object categories?
Goal: detect, localize, and estimate the pose of a broad range of objects, including new ones
One option: independent detectors
• Basic-level categories: Cat Detector, Dog Detector
• Broad categories: 4-Legged Animal Detector
• Parts: Head Detector
• …
Our previous work: train separate detectors, joint spatial model
Figure labels: Vehicle, Wheel, Animal, Leg, Head, Four-legged, Mammal
Attribute labels: Can run, Can jump, Facing right, Moves on road
Farhadi, Endres, Hoiem (2010)
Jointly trained multi-category models
• Train part/category detectors to jointly predict object structure
– Only need to perform well in context defined by others
• Spatial model encodes likely part positions, number of parts, likely categories, etc.
– Generalizes Felzenszwalb et al.: cross-category sharing, multiple parts with one model, variable size
Deformable Part Models
From Felzenszwalb et al.
Detection with Deformable Part Models
From Felzenszwalb et al.
Shared mixture of deformable parts: Body Plans
Include a body plan for background patches: no appearance models, just a bias
Body Plan Overview
Figure labels: Object Center, Head Anchors, High Scoring Detections
Anchor Point Score
S_a = bias + appearance score - deformation cost
• Appearance score: HOG-based deformable part model (Felzenszwalb et al.)
• Deformation cost: quadratic penalty in position and scale
• Overall score must be greater than 0 to be detected
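The anchor point score can be illustrated with a minimal sketch. The `Anchor` container, its field names, and the per-axis deformation weights are hypothetical; the slides state only that the score is a bias plus an appearance score minus a quadratic deformation cost in position and scale, and that a detection needs a score above 0.

```python
from dataclasses import dataclass


@dataclass
class Anchor:
    """Hypothetical container for one anchor point of a body plan."""
    bias: float
    x: float           # expected part position relative to the object center
    y: float
    log_scale: float   # expected part scale (log units, an assumption)
    wx: float = 1.0    # quadratic deformation weights (assumed non-negative)
    wy: float = 1.0
    ws: float = 1.0


def deformation_cost(anchor, x, y, log_scale):
    """Quadratic penalty for deviating from the anchor in position and scale."""
    dx = x - anchor.x
    dy = y - anchor.y
    ds = log_scale - anchor.log_scale
    return anchor.wx * dx ** 2 + anchor.wy * dy ** 2 + anchor.ws * ds ** 2


def anchor_score(anchor, appearance_score, x, y, log_scale):
    """S_a = bias + appearance score - deformation cost."""
    return anchor.bias + appearance_score - deformation_cost(anchor, x, y, log_scale)


def is_detected(score):
    """A part counts as detected only if its overall score exceeds 0."""
    return score > 0.0
```

Here `appearance_score` stands in for the response of the HOG-based part filter at the candidate location; computing that response is outside the scope of the sketch.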
Inference: Head
Figure: candidate head detections (+); one is selected (✓).
Inference: Leg
Figure sequence: candidate leg detections (+) are selected one at a time (✓) until four legs are chosen.
Search Constraints: Count, Pairwise Exclusion
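One way to picture the leg-selection sequence and the two search constraints (count and pairwise exclusion) is a greedy pass over scored candidates. This is a sketch under assumptions: the slides do not specify the search procedure, and using an intersection-over-union threshold as the exclusion test, the `max_overlap` value, and the function names are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_parts(candidates, max_count, max_overlap=0.5):
    """Greedily pick part detections from a list of (score, box) pairs.

    Count constraint: keep at most max_count detections.
    Pairwise exclusion (assumed form): skip a candidate that overlaps an
    already chosen detection by more than max_overlap.
    Only candidates with a positive score are eligible.
    """
    chosen = []
    for score, box in sorted(candidates, key=lambda c: c[0], reverse=True):
        if len(chosen) >= max_count or score <= 0:
            break
        if all(iou(box, other) <= max_overlap for _, other in chosen):
            chosen.append((score, box))
    return chosen
```

For a leg anchor, `max_count` would be 4; for a head anchor, 1.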
Inference
Score for each body plan:
Overall score for an object hypothesis:
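The two formulas on this slide did not survive extraction, so the following is only a guess at their shape, not the authors' definition. It assumes a body plan's score is its bias plus the scores of its selected anchor detections, and that an object hypothesis takes the best-scoring body plan, with the background plan contributing a bias only. All names are hypothetical.

```python
def body_plan_score(plan_bias, selected_anchor_scores):
    """Assumed form: plan bias plus the scores of the selected anchor detections."""
    return plan_bias + sum(selected_anchor_scores)


def best_body_plan(plans):
    """Pick the highest-scoring body plan for one object hypothesis.

    plans maps a plan name to (bias, list of selected anchor scores).
    A background plan has no anchors, so its score is just its bias.
    """
    scored = {name: body_plan_score(bias, scores)
              for name, (bias, scores) in plans.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

Under this reading, a hypothesis is labeled background whenever no other plan gathers enough positive part evidence to beat the background bias.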
Benefits of Joint Learning
Only consider structures with:
Benefits of Joint Learning
No structures have
(Latent) Max Margin Structured Learning
Equation annotations: highest scoring valid structure, invalid structure loss, soft margin slack
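The equation these annotations point to is not in the transcript. For orientation, the textbook latent structural SVM has the form below; the symbols ($w$, $\Phi$, $\Delta$, $\xi_i$, $C$, and the valid-structure set $\mathcal{V}$) are assumptions chosen to match the three annotations, not the slide's own notation.

```latex
\min_{w,\ \xi \ge 0} \quad \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
\qquad \text{s.t.} \qquad
\underbrace{\max_{h \in \mathcal{V}(x_i, y_i)} w^\top \Phi(x_i, h)}_{\text{highest scoring valid structure}}
\;-\; w^\top \Phi(x_i, \hat{h})
\;\ge\;
\underbrace{\Delta(y_i, \hat{h})}_{\text{invalid structure loss}}
\;-\; \underbrace{\xi_i}_{\text{soft margin slack}}
\qquad \forall i,\ \forall \hat{h} \notin \mathcal{V}(x_i, y_i)
```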
Valid Structures
Positive Examples (figure labels: Leg ×4, Head, Four-legged, Elk)
• Object detectors: 50% overlap with ground truth
• Part detectors: 25% overlap with ground truth
Negative Examples
• Must select BG body plan
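The overlap thresholds for valid structures can be written as a small check. The slide says only "overlap"; treating it as intersection over union, as in the PASCAL VOC detection criterion, is an assumption, and the names below are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Minimum overlap with ground truth, taken from the slide.
MIN_OVERLAP = {"object": 0.5, "part": 0.25}


def is_valid_detection(pred_box, gt_box, kind):
    """True if a predicted box overlaps its ground-truth box enough for its kind."""
    return iou(pred_box, gt_box) >= MIN_OVERLAP[kind]
```

The looser part threshold tolerates less precise localization for small parts such as legs.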
Loss
Positive Examples (figure labels: Leg, Head, Four-legged, Elk)
• False positives: +1
• Duplicate detections: +1
• Missed detections: +1
Negative Examples
• Non-BG body plan: +1
• False positives: +1
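The loss terms add up as plain counts. The sketch below assumes each listed error contributes exactly 1 and that the terms are summed; the function name and argument layout are hypothetical.

```python
def structure_loss(is_positive, false_positives=0, duplicates=0, missed=0,
                   chose_background=False):
    """Count the errors in a predicted structure, one unit per error.

    Positive example: false positives + duplicate detections + missed detections.
    Negative example: +1 for selecting a non-background body plan,
    plus one per false positive.
    """
    if is_positive:
        return false_positives + duplicates + missed
    return (0 if chose_background else 1) + false_positives
```

For example, a positive elk example with one false-positive leg and one missed head has loss 2, while a negative window that correctly selects the background plan has loss 0.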
Optimization
• Latent Structured SVM
– Non-convex; optimized with CCCP
• Stochastic gradient descent based cutting plane optimization
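A single stochastic subgradient step on a structured hinge loss might look like the sketch below. It is a generic illustration, not the authors' optimizer: it assumes the valid structure's features have already been fixed by the latent-completion step of CCCP, and the learning rate, regularization weight, and function names are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def sgd_step(w, phi_valid, phi_violator, loss, lr=0.1, reg=0.0):
    """One subgradient step for a single mined constraint.

    The constraint asks that score(valid) >= score(violator) + loss.
    If it is violated, move w toward phi_valid and away from phi_violator;
    the L2 regularizer (weight reg) always shrinks w.
    """
    violation = loss + dot(w, phi_violator) - dot(w, phi_valid)
    active = violation > 0
    new_w = []
    for wi, pv, pb in zip(w, phi_valid, phi_violator):
        grad = reg * wi + ((pb - pv) if active else 0.0)
        new_w.append(wi - lr * grad)
    return new_w
```

In a cutting-plane setting, `phi_violator` and `loss` would come from a cache of previously mined violated constraints rather than a fresh search at every step.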
Optimization Challenges
1) Expensive search for violated constraints
– Mine many violated constraints at once
– Speeds convergence
2) Large feature vectors (100k+)
– Can't store every mined violated constraint
– Requires careful caching
Experimental Setup
• CORE: Train + Test
– Familiar Categories: Camel, Dog, Elephant, Elk
– Parts: Head, Leg, Torso
– Unfamiliar Categories: Cat, Cow
• Pascal 2008: Test
– Unfamiliar Categories: Cat, Cow, Horse, Sheep
Familiar Objects
Unfamiliar Objects
Mistakes
Object Level Results (AP)
Familiar four-legged parts (AP)
Unfamiliar four-legged parts (AP)
Mixed Supervision
Figure: training examples labeled Four-legged / Dog, each annotated with a different subset of Leg and Head boxes, all feeding into Learning.
Mixed Supervision - Learning
• Unlabeled boxes become latent variables
– Compute most likely position
– No loss for missed detections
Highest Scoring Valid Structure
Loss
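Treating unlabeled boxes as latent variables can be sketched as filling each missing annotation with the best-scoring candidate and leaving unannotated parts out of the missed-detection count. This reading of the bullets is an assumption, as are the dictionary-based data layout and the function names.

```python
def complete_annotation(labeled, candidates):
    """Fill in unlabeled part boxes with the highest-scoring candidate.

    labeled maps a part name to a box, or to None when the box is unlabeled.
    candidates maps a part name to a list of (score, box) pairs.
    """
    completed = {}
    for part, box in labeled.items():
        if box is not None:
            completed[part] = box
        elif candidates.get(part):
            completed[part] = max(candidates[part], key=lambda c: c[0])[1]
        else:
            completed[part] = None
    return completed


def missed_detection_loss(missed, labeled):
    """Count missed detections only for parts that were actually annotated."""
    return sum(n for part, n in missed.items() if labeled.get(part) is not None)
```

With `labeled = {"head": box, "leg": None}`, a missed leg adds nothing to the loss, while a missed head still costs 1.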
Mixed Supervision … Mixed Results (AP)
Conclusions
• Jointly representing related categories leads to better performance and generalization to unfamiliar categories
• Joint training important to get full benefit of spatial model
Thanks