Image Parsing: Unifying Segmentation and Detection
Z. Tu, X. Chen, A.L. Yuille and S.-C. Zhu. ICCV 2003 (Marr Prize) & IJCV 2005. Sanketh Shetty.
![Page 1: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/1.jpg)
Image Parsing: Unifying Segmentation and Detection
Z. Tu, X. Chen, A.L. Yuille and S.-C. Zhu
ICCV 2003 (Marr Prize) & IJCV 2005
Sanketh Shetty
![Page 2: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/2.jpg)
Outline
• Why Image Parsing?
• Introduction to Concepts in DDMCMC
• DDMCMC applied to Image Parsing
• Combining Discriminative and Generative Models for Parsing
• Results
• Comments
![Page 3: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/3.jpg)
Image Parsing
Image I
Parse Structure W
Optimize p(W|I)
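Written out with Bayes' rule, the objective on this slide takes the form below. This is a sketch in standard notation: the symbol $W^*$ for the best parse is an assumption introduced here, and the factorisation into likelihood $p(I \mid W)$ and prior $p(W)$ is the one named on the later "DDMCMC Motivation" slides.

```latex
W^* = \arg\max_{W} \, p(W \mid I)
    = \arg\max_{W} \, p(I \mid W)\, p(W),
\qquad
p(W \mid I) \propto p(I \mid W)\, p(W)
```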
![Page 4: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/4.jpg)
Properties of Parse Structure
• Dynamic and reconfigurable
  – Variable number of nodes and node types
• Defined by a Markov Chain
  – Data Driven Markov Chain Monte Carlo (earlier work in segmentation, grouping and recognition)
![Page 5: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/5.jpg)
Key Concepts
• Joint model for Segmentation & Recognition
  – Combine different modules to obtain cues
• Fully generative explanation for Image generation
  – Uses Generative and Discriminative Models + DDMCMC framework
  – Concurrent Top-Down & Bottom-Up Parsing
![Page 6: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/6.jpg)
Pattern Classes
62 characters
Faces
Regions
![Page 7: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/7.jpg)
MCMC: A Quick Tour
• Key Concepts:
  – Markov Chains
  – Markov Chain Monte Carlo
    • Metropolis-Hastings [Metropolis 1953, Hastings 1970]
    • Reversible Jump [Green 1995]
  – Data Driven Markov Chain Monte Carlo
![Page 8: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/8.jpg)
Markov Chains
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
![Page 9: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/9.jpg)
Markov Chain Monte Carlo
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
![Page 10: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/10.jpg)
Metropolis-Hastings Algorithm
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
![Page 11: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/11.jpg)
Metropolis-Hastings Algorithm
Proposal Distribution
Invariant Distribution
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
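The roles of the proposal distribution and the invariant (target) distribution can be illustrated with a minimal Metropolis-Hastings sketch. Everything here is a toy assumption rather than anything from the slides: the function names, the one-dimensional Gaussian target with mean 3, and the random-walk proposal are chosen only to make the accept/reject step concrete.

```python
import math
import random

def metropolis_hastings(log_target, propose, log_q, x0, n_steps, rng):
    """Generic Metropolis-Hastings sampler (illustrative sketch).

    log_target(x)        : log of the unnormalised invariant density p(x)
    propose(x, rng)      : draws x' from the proposal q(x' | x)
    log_q(x_to, x_from)  : log q(x_to | x_from), up to a constant
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = propose(x, rng)
        # log of the ratio p(x') q(x | x') / (p(x) q(x' | x))
        log_ratio = (log_target(x_new) + log_q(x, x_new)
                     - log_target(x) - log_q(x_new, x))
        # Accept with probability min(1, ratio); otherwise keep the old state.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = x_new
        samples.append(x)
    return samples

# Toy invariant distribution (assumption): Gaussian, mean 3, unit variance.
def log_target(x):
    return -0.5 * (x - 3.0) ** 2

# Toy proposal (assumption): symmetric Gaussian random walk.
def propose(x, rng):
    return x + rng.gauss(0.0, 1.0)

def log_q(x_to, x_from):
    return -0.5 * (x_to - x_from) ** 2

rng = random.Random(0)
samples = metropolis_hastings(log_target, propose, log_q, 0.0, 20000, rng)
kept = samples[2000:]            # discard burn-in
mean = sum(kept) / len(kept)     # should be close to the target mean of 3
print(round(mean, 2))
```

Because the random-walk proposal is symmetric, the two `log_q` terms cancel; they are kept so the same function also works with asymmetric proposals.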
![Page 12: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/12.jpg)
Reversible Jump MCMC
• Many competing models to explain data
  – Need to explore this complicated state space
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
![Page 13: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/13.jpg)
DDMCMC Motivation
Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
![Page 14: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/14.jpg)
DDMCMC Motivation
Generative Model: p(I|W) p(W)
State Space
![Page 15: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/15.jpg)
DDMCMC Motivation
Generative Model: p(I|W) p(W)
State Space
Discriminative Model: q(w_j | I)
Dramatically reduces the search space by focusing sampling on highly probable states.
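The effect of a data-driven proposal can be seen in a toy independence sampler. This is only a sketch under stated assumptions: the 1000-state space, the five "good" states, and the hand-built proposal `q_data` (standing in for a discriminative model q(w_j | I)) are all hypothetical. It compares how often proposals are accepted when drawn uniformly versus when drawn from a proposal concentrated near the high-probability states.

```python
import random
from itertools import accumulate

def acceptance_rate(p, q, n_steps, seed=0):
    """Independence Metropolis-Hastings on states 0..len(p)-1.

    p: unnormalised target weights (all > 0)
    q: normalised proposal probabilities (all > 0)
    Returns the fraction of proposals that were accepted.
    """
    rng = random.Random(seed)
    states = list(range(len(p)))
    cum_q = list(accumulate(q))
    x = rng.choices(states, cum_weights=cum_q)[0]
    accepted = 0
    for _ in range(n_steps):
        y = rng.choices(states, cum_weights=cum_q)[0]
        # Acceptance ratio p(y) q(x) / (p(x) q(y)) for an independence proposal.
        alpha = min(1.0, (p[y] * q[x]) / (p[x] * q[y]))
        if rng.random() < alpha:
            x = y
            accepted += 1
    return accepted / n_steps

N = 1000
# Toy posterior (assumption): almost all mass on states 100..104.
p = [1e-6] * N
for i in range(100, 105):
    p[i] = 1.0

# Blind proposal: uniform over the whole state space.
q_uniform = [1.0 / N] * N

# "Data-driven" proposal (assumption): 90% of the mass on a window 98..107
# that roughly covers the good states, 10% spread uniformly.
q_data = [0.1 / N] * N
for i in range(98, 108):
    q_data[i] += 0.9 / 10

rate_uniform = acceptance_rate(p, q_uniform, 5000)
rate_data = acceptance_rate(p, q_data, 5000)
print(rate_uniform, rate_data)
```

The data-driven proposal does not need to match the posterior exactly; the acceptance step corrects for the mismatch, so an imperfect detector still yields a valid sampler.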
![Page 16: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/16.jpg)
DDMCMC Framework
• Moves:
  – Node Creation
  – Node Deletion
  – Change Node Attributes
![Page 17: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/17.jpg)
Transition Kernel
Satisfies the detailed balance equation
Full Transition Kernel
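In standard MCMC notation, the two statements on this slide can be written as follows. The symbols are assumptions introduced for illustration: $K_a$ denotes the sub-kernel for move type $a$, and $\rho(a : I)$ the probability of selecting that move given the image.

```latex
% Detailed balance for each sub-kernel K_a
p(W \mid I)\, K_a(W' \mid W : I) = p(W' \mid I)\, K_a(W \mid W' : I)

% Full transition kernel as a mixture of sub-kernels
K(W' \mid W : I) = \sum_a \rho(a : I)\, K_a(W' \mid W : I),
\qquad \sum_a \rho(a : I) = 1, \quad \rho(a : I) > 0
```

Since each sub-kernel leaves $p(W \mid I)$ invariant, so does any such mixture of them.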
![Page 18: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/18.jpg)
Convergence to p(W|I)
Monotonically at a geometric rate
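A small numerical check makes the convergence claim concrete. The sketch below is a toy under stated assumptions: a four-state target `p`, a Metropolis kernel with uniform proposals, and the divergence KL(mu_t || p) between the chain's state distribution and the target. The direction of the divergence is a choice made here so that a point-mass start stays finite.

```python
import math

def mh_kernel(p):
    """Row-stochastic Metropolis kernel with a uniform proposal over len(p) states."""
    n = len(p)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Propose j with probability 1/n, accept with min(1, p_j / p_i).
                K[i][j] = (1.0 / n) * min(1.0, p[j] / p[i])
        # Remaining probability mass stays at state i.
        K[i][i] = 1.0 - sum(K[i])
    return K

def step(mu, K):
    """One step of the chain on distributions: mu_{t+1} = mu_t K."""
    n = len(mu)
    return [sum(mu[i] * K[i][j] for i in range(n)) for j in range(n)]

def kl(mu, p):
    """KL(mu || p), skipping states where mu is zero."""
    return sum(m * math.log(m / q) for m, q in zip(mu, p) if m > 0)

p = [0.5, 0.3, 0.15, 0.05]      # toy target distribution (assumption)
K = mh_kernel(p)
mu = [0.0, 0.0, 0.0, 1.0]       # start as a point mass on the least likely state
divergences = []
for _ in range(20):
    divergences.append(kl(mu, p))
    mu = step(mu, K)

for t, d in enumerate(divergences[:6]):
    print(t, d)
```

The recorded divergences never increase from one step to the next, and they shrink quickly toward zero as the chain's distribution approaches the target.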
![Page 19: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/19.jpg)
Criteria for Designing Transition Kernels
![Page 20: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/20.jpg)
Image Generation Model
Regions:
Constant Intensity
Textures
Shading
State of parse graph
![Page 21: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/21.jpg)
62 characters
Faces
3 Regions
![Page 22: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/22.jpg)
Uniform
Designed to penalize high model complexity
![Page 23: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/23.jpg)
Shape Prior
Faces
3 Regions
![Page 24: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/24.jpg)
Shape Prior: Text
![Page 25: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/25.jpg)
Intensity Models
![Page 26: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/26.jpg)
Intensity Model: Faces
![Page 27: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/27.jpg)
Discriminative Cues Used
• Adaboost Trained
  – Face Detector
  – Text Detector
• Adaptive Binarization Cues
• Edge Cues
  – Canny at 3 scales
• Shape Affinity Cues
• Region Affinity Cues
![Page 28: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/28.jpg)
Transition Kernel Design
• Remember
![Page 29: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/29.jpg)
Possible Transitions
1. Birth/Death of a Face Node
2. Birth/Death of Text Node
3. Boundary Evolution
4. Split/Merge Region
5. Change node attributes
![Page 30: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/30.jpg)
Face/Text Transitions
![Page 31: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/31.jpg)
Region Transitions
![Page 32: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/32.jpg)
Change Node Attributes
![Page 33: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/33.jpg)
Basic Control Algorithm
![Page 34: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/34.jpg)
![Page 35: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/35.jpg)
Results
![Page 36: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/36.jpg)
![Page 37: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/37.jpg)
Comments
• Well motivated but very complicated approach to THE HOLY GRAIL problem in vision
  – Good global convergence results for inference with very minor dependence on initial W.
  – Extensible to larger set of primitives and pattern types.
• Many details of the algorithm are missing and it is hard to understand the motivation for choices of values for some parameters
• Unclear if the p(W|I)’s for configurations with different class compositions are comparable.
• Derek’s comment on Adaboost false positives and their failure to report their exact improvement
• No quantitative results/comparison to other algorithms and approaches
  – It should be possible to design a simple experiment to measure performance on recognition/detection/localization tasks.
![Page 38: Image Parsing: Unifying Segmentation and Detection](https://reader035.vdocuments.us/reader035/viewer/2022062502/56814bfa550346895db8f22f/html5/thumbnails/38.jpg)
Thank You