Random Forest and Graph Cut based segmentation of human limbs
Nadezhda Zlateva, IICT-BAS
7 Sept. 2011
Outline
• Human Pose Recognition
• Case Study
• Randomized Decision Tree
• Random Forest
• Experimental results with RF
• Graph Cut
• Experimental results with GC
• Application to hand classification
• Conclusion
• References
Human Pose Recognition
Recognition via conventional intensity cameras or depth cameras
Frame-to-frame point tracking: slow to re-initialize
Pose recognition in parts:
• Body part segmentation
  - Per-pixel classification
• 3D skeletal joint estimation
[1] Shotton et al., 11
Case Study
Upper limb segmentation for hand gesture recognition
Application:
• Sign language interpretation
• Medical environments
  - Robotic medical assistants [Purdue University]
  - CT & MRI review in sterile environments [Sunnybrook Hospital, Toronto]
Binary Decision Tree: Basics
[Figure: binary decision tree with nodes numbered 1-17. A feature vector v enters at the root; split nodes send it down the < or ≥ branch until a leaf node returns category c.]
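The traversal shown in the tree figure can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the `Node` class, its field names, and the toy thresholds and labels are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # A split node uses feature/threshold/left/right; a leaf node uses category.
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None    # branch taken when v[feature] < threshold
    right: Optional["Node"] = None   # branch taken when v[feature] >= threshold
    category: Optional[str] = None   # set only on leaf nodes


def classify(node: Node, v: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the leaf's category."""
    while node.category is None:
        node = node.left if v[node.feature] < node.threshold else node.right
    return node.category


# Toy tree with hypothetical thresholds and labels.
tree = Node(
    feature=0, threshold=0.5,
    left=Node(category="hand"),
    right=Node(
        feature=1, threshold=2.0,
        left=Node(category="forearm"),
        right=Node(category="upper arm"),
    ),
)
```

Calling `classify(tree, [0.7, 1.0])` takes the ≥ branch at the root and the < branch at the second split, so it ends in the "forearm" leaf.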
DT over depth images: Training
• Feature vector: pixel x = [x, y, z]^T of depth image I
• Split function: depth comparison features f_θ as a function of x
• d_I(x): depth at pixel x
• Combination of weak but computationally efficient features
[Figure: two example features, θ1 and θ2]
[1] Shotton, 11
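The feature equation itself did not survive in the transcript. The sketch below assumes it is the depth comparison feature of Shotton et al. [1], f_θ(I, x) = d_I(x + u/d_I(x)) − d_I(x + v/d_I(x)) with θ = (u, v). The (row, col) indexing, the rounding of probe positions, and the `BACKGROUND_DEPTH` constant are illustrative choices, not taken from the slides.

```python
import numpy as np

BACKGROUND_DEPTH = 1e6  # assumed large constant for probes outside the image


def probe(depth: np.ndarray, row: float, col: float) -> float:
    """Depth at (row, col), or a large constant if the probe leaves the image."""
    r, c = int(round(row)), int(round(col))
    if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
        return float(depth[r, c])
    return BACKGROUND_DEPTH


def depth_feature(depth, x, u, v) -> float:
    """f_theta(I, x) = d_I(x + u / d_I(x)) - d_I(x + v / d_I(x)), theta = (u, v)."""
    d = float(depth[x[0], x[1]])
    # Dividing the offsets by the depth at x makes the feature depth invariant.
    a = probe(depth, x[0] + u[0] / d, x[1] + u[1] / d)
    b = probe(depth, x[0] + v[0] / d, x[1] + v[1] / d)
    return a - b


# Toy 5x5 depth map: a near object (depth 2) in front of a far wall (depth 4).
depth = np.full((5, 5), 4.0)
depth[1:4, 1:4] = 2.0
```

With x = (2, 2), u = (0, 4) and v = (0, 0), the first probe lands on the wall and the second on the object, so the feature is positive. Each feature costs only two or three depth lookups, which is why a single one is weak but cheap.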
Randomized DT: Training
1. Randomly select a set of split candidates ϕ = (θ, τ) for tree t, where τ is the set of split thresholds for each θ.
2. Define the set of training pixels Q = {(I, x)} over all training images for tree t. Q is the set of pixels at the root node.
3. Find the best split candidate ϕ* at node n: the one with the largest information gain from splitting Q into Q_left and Q_right.
Randomized DT: Training
4. Recurse for Q_left(ϕ*) and Q_right(ϕ*) until a stop condition is reached:
   - Maximum depth
   - Minimum information gain
   - Minimum number of node pixels
5. Estimate P_t(c|I,x) at each leaf node over body part labels c, using a normalized histogram.
Note:
• Dependent on choice of parameters
• Prone to over-fitting
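Steps 3 and 5 can be sketched as follows. This assumes the information gain is the Shannon-entropy reduction used in [1], with Q_left containing the pixels whose feature value is below τ. The function names and the toy data are hypothetical, and the feature values are taken as precomputed.

```python
import numpy as np


def entropy(labels: np.ndarray, n_classes: int) -> float:
    """Shannon entropy (bits) of the normalized label histogram."""
    if labels.size == 0:
        return 0.0
    p = np.bincount(labels, minlength=n_classes) / labels.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def information_gain(features, labels, tau, n_classes):
    """Gain of splitting Q into Q_left (f < tau) and Q_right (f >= tau)."""
    left = features < tau
    gain = entropy(labels, n_classes)
    for part in (labels[left], labels[~left]):
        gain -= part.size / labels.size * entropy(part, n_classes)
    return gain


def best_split(feature_table, labels, thresholds, n_classes):
    """Return (gain, theta_index, tau) with the largest information gain.

    feature_table[k, i] holds f_theta_k for training pixel i.
    """
    best = (-1.0, -1, 0.0)
    for k, feats in enumerate(feature_table):
        for tau in thresholds:
            g = information_gain(feats, labels, tau, n_classes)
            if g > best[0]:
                best = (g, k, float(tau))
    return best


def leaf_histogram(labels, n_classes):
    """Normalized histogram used as P_t(c | I, x) at a leaf."""
    return np.bincount(labels, minlength=n_classes) / labels.size


# Toy node: 6 pixels, 2 body-part labels, 2 candidate features.
labels = np.array([0, 0, 0, 1, 1, 1])
feature_table = np.array([
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],   # mixes the two labels
    [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],   # separates the two labels at 0.5
])
gain, k, tau = best_split(feature_table, labels, [0.25, 0.5, 0.75], n_classes=2)
```

In the toy data the second feature with τ = 0.5 separates the labels perfectly, so it is selected with a gain of one bit. A full trainer would recurse on the two child sets until one of the stop conditions in step 4 is met.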
Random Forest
Forest: ensemble of T decision trees
• Divide training (depth) images into T subsets, a unique subset for each tree t
• Train each tree
• Classification averages the per-tree distributions: P(c|I,x) = (1/T) Σ_t P_t(c|I,x)
[3] Breiman 01, [1] Shotton et al. 11
Random Forest: Classification
[Figure: pixel x is passed down each of the trees t1 … tT; each tree outputs a label c.]
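A minimal sketch of the forest decision for one pixel, assuming the per-tree leaf distributions P_t(c|I,x) are averaged as in [1] and the most probable label is returned. The numbers are made up for illustration.

```python
import numpy as np


def forest_posterior(tree_posteriors: np.ndarray) -> np.ndarray:
    """Average the per-tree distributions P_t(c | I, x) over the T trees.

    tree_posteriors has shape (T, n_classes); each row sums to 1.
    """
    return tree_posteriors.mean(axis=0)


# Hypothetical leaf distributions reached by one pixel in T = 3 trees.
per_tree = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.6, 0.1],
])
posterior = forest_posterior(per_tree)
label = int(np.argmax(posterior))
```

Here the third tree alone would vote for label 1, but the averaged distribution still favors label 0. Keeping the full distribution rather than a hard vote also leaves per-pixel probabilities available for later processing.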
Random Forest: Toy demo
[2] Shotton et al. 09
Random Forest: Summary
• Improves generalization to new data
• Ensemble of trees gives robustness
• Good for multi-class problems
• Resistant to over-fitting
• Fast training on large data sets
• Efficient classifier
RF: Experiments and results
- Ground truth: 500 labeled depth images (upper limb), 640x480
- Number of trees: T = 3
- Tree depth: 15
- Split candidates: |θ| = 100, |τ| = 20 for each θ
- Random pixels per image: 1000
- 5-fold cross validation => 100 test images, 130 training images per tree
Table 1. Average per class accuracy with RF classification
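The slides do not define the metric. The sketch below assumes "average per class accuracy" means the mean, over classes, of the fraction of that class's pixels that are labeled correctly, as used in [1]. The function name and toy labels are illustrative.

```python
import numpy as np


def average_per_class_accuracy(y_true, y_pred, n_classes: int) -> float:
    """Mean over classes of (correct pixels of class c) / (pixels of class c)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes absent from the ground truth
            per_class.append(float((y_pred[mask] == c).mean()))
    return float(np.mean(per_class))


# Toy example: class 0 is common, class 1 is rare.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
score = average_per_class_accuracy(y_true, y_pred, n_classes=2)
```

Class 0 scores 4/4 and class 1 scores 1/2, so the average is 0.75, while plain pixel accuracy would be 5/6. Unlike plain pixel accuracy, this metric does not let large body parts dominate small ones such as hands.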
RF: Experiments and results
[Figure panels: ground truth & training; per-pixel classification]
Segmentation by Graph Cut: Motivation
RF classification results:
• Fuzzy body part boundaries
• Left/Right uncertainty
Subsequent hand sign recognition requires cleaner hand region segmentation
Graph Cut framework:
• Energy minimization framework
• Binary and multi-label image segmentation
• Combines local and contextual information
Pixel labeling problem
Given:
- Pixels P
- Labels L
- N: pairs of neighboring pixels
- Assignment cost U (unary potential)
- Separation cost B (boundary potential)
Find a labeling f: P → L that minimizes
E(f) = Σ_{p ∈ P} U_p(f_p) + Σ_{(p,q) ∈ N} B_pq(f_p, f_q)
[4] Boykov et al. 01
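To make the two cost terms concrete, the sketch below evaluates the energy of a candidate labeling on a small pixel grid. It assumes a 4-connected neighborhood and a Potts separation cost (a constant penalty whenever two neighbors differ). Both are common choices but are not stated on the slide, and the array layout is hypothetical.

```python
import numpy as np


def labeling_energy(unary: np.ndarray, labels: np.ndarray, weight: float = 1.0) -> float:
    """Sum of assignment costs plus weight * (number of differing neighbor pairs).

    unary has shape (H, W, n_labels); labels has shape (H, W).
    Neighbors are 4-connected; the separation cost is a Potts penalty (assumed).
    """
    rows, cols = np.indices(labels.shape)
    assignment = unary[rows, cols, labels].sum()
    separation = (
        (labels[:, 1:] != labels[:, :-1]).sum()
        + (labels[1:, :] != labels[:-1, :]).sum()
    )
    return float(assignment + weight * separation)


# Toy 2x2 image with 2 labels: only the bottom-right pixel prefers label 1.
unary = np.zeros((2, 2, 2))
unary[..., 1] = 1.0          # label 1 costs 1 everywhere ...
unary[1, 1] = [1.0, 0.0]     # ... except at the bottom-right pixel
smooth = np.zeros((2, 2), dtype=int)
noisy = np.array([[0, 0], [0, 1]])
```

With `weight=1.0` the uniform labeling has energy 1 and the labeling that follows the unary term exactly has energy 2, so the smooth result wins. With `weight=0.25` the order flips, which shows how the separation cost trades data fidelity against spatial coherence.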
Graph Cut: Binary case
• Image as directed graph G(V, E)
• t-link: assignment cost
• n-link: separation cost
Theorem: In a graph G, the maximum possible source-to-sink flow is equal to the capacity of the minimum cut in G. [L. R. Foulds, Graph Theory Applications, Springer-Verlag New York, 1992, pp. 247-248]
Energy minimization problem = min s-t cut on G = max-flow
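The equivalence can be checked on a toy problem. The sketch below builds the s-t graph for a three-pixel chain and computes the max flow with a plain Edmonds-Karp routine, chosen here for brevity. It then compares the result with the minimum energy found by brute force. The graph layout, costs, and function names are assumptions for illustration; practical graph-cut code uses specialized max-flow solvers.

```python
from collections import deque
from itertools import product


def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly augment along shortest s-t paths (BFS)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path left
        # Bottleneck capacity along the path found.
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def build_graph(unary, weight):
    """Nodes: 0 = source (label 0), 1..n = pixels of a 1-D chain, n+1 = sink (label 1)."""
    n = len(unary)
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for p, (u0, u1) in enumerate(unary, start=1):
        cap[0][p] = u1        # t-link, cut when p takes label 1
        cap[p][n + 1] = u0    # t-link, cut when p takes label 0
    for p in range(1, n):
        cap[p][p + 1] = cap[p + 1][p] = weight  # n-links between neighbors
    return cap


def brute_force_min_energy(unary, weight):
    """Minimum of assignment + separation cost over all binary labelings."""
    best = float("inf")
    for labels in product((0, 1), repeat=len(unary)):
        e = sum(unary[p][l] for p, l in enumerate(labels))
        e += weight * sum(a != b for a, b in zip(labels, labels[1:]))
        best = min(best, e)
    return best


unary = [(1.0, 4.0), (3.0, 2.0), (5.0, 1.0)]  # hypothetical (U_p(0), U_p(1))
flow = max_flow(build_graph(unary, weight=1.5), s=0, t=4)
```

Every s-t cut corresponds to one binary labeling, and its capacity equals that labeling's energy. The max flow therefore matches the brute-force minimum energy.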
Graph Cut: Multi-label case
Energy = cut cost: |C| = Σ_{e_ij ∈ C} w_ij
Suboptimal approximation of the minimum energy
Graph Cut: Potentials
Energy function:
• Unary potential: probabilities by RF
• Boundary potential: prior constraints
• Importance weight
[5] Boykov et al. 06
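The exact potentials on this slide were not recovered. One common way to turn classifier probabilities into costs is the negative log-probability, paired with a Potts boundary term. The sketch below assumes both, and assumes the importance weight scales the boundary term. None of this is confirmed by the slides.

```python
import numpy as np


def unary_from_probabilities(prob: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """U_p(c) = -log P(c | I, p): confident RF outputs get a low assignment cost."""
    # Clipping avoids an infinite cost when the forest assigns probability 0.
    return -np.log(np.clip(prob, eps, 1.0))


def potts_boundary(label_p: int, label_q: int, weight: float) -> float:
    """B(l_p, l_q) = weight if the neighboring labels differ, else 0."""
    return weight if label_p != label_q else 0.0


# Hypothetical RF posteriors for 2 pixels over 3 labels.
prob = np.array([[0.8, 0.1, 0.1],
                 [0.0, 0.5, 0.5]])
unary = unary_from_probabilities(prob)
```

The first pixel's most probable label gets the lowest cost, and the zero probability in the second row maps to a large but finite cost.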
Graph Cut: Results
Spatial coherence
Graph Cut: Results
[Figure panels: RF classifications; GC segmentation]
RF & GC for hands
- 63 frames, 500 random pixels, |Omax| = 45
[Figure panels: ground truth; Random Forest; Graph Cut]
- Random Forest: 58.5% per class accuracy
- Graph Cut: 70.9% per class accuracy
Conclusion
• RF: strong classifier
• RF + GC over depth maps: good object segmentation
Future Work
• Increase available data
• Improve pixel label inference
• Estimate upper limb/hand joints
• Recognize finger configuration
References