3D Hand Pose Estimation by Finding Appearance-Based Matches in a Large Database of Training Views
2006.8.17
Outline
Introduction
Proposed Framework
Space Complexity
Synthetic Versus Real Training Data
Edge-Based View Matching
Experimental Results
Future Work
Conclusion
Introduction
Estimate 3D hand pose from a single image by matching the image with a large database.
What are the storage requirements for an adequate database of training views?
What are appropriate similarity measures? How can the matching be done efficiently?
Introduction
The database contains more than 100,000 images, generated from 26 hand shapes.
For real images, skin color detection is used to segment the hand.
Proposed Framework
Model the hand as an articulated object consisting of 16 links: the palm and 15 links corresponding to finger parts.
Proposed Framework
The five joints connecting the fingers to the palm allow rotation with two degrees of freedom (DOFs).
The 10 joints between finger links allow rotation with one DOF.
A total of 20 DOFs describes completely all degrees of freedom in the joint angles.
Proposed Framework
Add the viewing parameter.
Given a hand configuration vector C = (c1, ..., c20) and a viewing parameter vector V = (v1, v2, v3), define the hand pose vector

P = (c1, ..., c20, v1, v2, v3)
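The pose-vector construction above can be sketched as a simple concatenation; the function and constant names below are illustrative, not from the slides.

```python
import numpy as np

# Minimal sketch of the pose representation: 20 joint-angle DOFs
# (5 joints x 2 DOFs + 10 joints x 1 DOF) plus 3 viewing parameters.
NUM_JOINT_DOFS = 20
NUM_VIEW_PARAMS = 3

def make_pose_vector(config, view):
    """Concatenate a hand configuration C and viewing parameters V into P."""
    config = np.asarray(config, dtype=float)
    view = np.asarray(view, dtype=float)
    assert config.shape == (NUM_JOINT_DOFS,)
    assert view.shape == (NUM_VIEW_PARAMS,)
    return np.concatenate([config, view])  # P = (c1..c20, v1, v2, v3)

P = make_pose_vector(np.zeros(20), [0.1, 0.2, 0.3])
print(P.shape)  # (23,)
```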
Proposed Framework
The generic framework that we propose for hand pose estimation is the following:
1. Create a database containing a uniform sampling of all possible views of all possible configurations.
2. For each novel image, find the database views that are most similar. Use the parameters of those views as estimates for the image.
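Step 2 can be sketched as a brute-force nearest-neighbor lookup. The feature vectors and names here are placeholders; the slides' actual similarity measure is the chamfer distance on edge images, not Euclidean distance on features.

```python
import numpy as np

# Illustrative retrieval step: return the k database views most similar
# to the query, together with their distances.
def retrieve_nearest(query_feat, db_feats, db_poses, k=3):
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    order = np.argsort(dists)[:k]
    return [(db_poses[i], dists[i]) for i in order]

# Toy database of three views with made-up pose labels.
db_feats = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
db_poses = ["pose_a", "pose_b", "pose_c"]
print(retrieve_nearest(np.array([0.9, 0.1]), db_feats, db_poses, k=1))
```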
Space Complexity
Depends on the number of database images.
In this paper, 86 viewpoints are used, and 48 images are generated for each viewpoint.
PCA is used to reduce the dimensionality of the hand shape configurations.
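The figures above imply 26 × 86 × 48 = 107,328 database views, consistent with the "more than 100,000 images" stated earlier. Below is a toy sketch of PCA via SVD on random data; the paper's exact compression pipeline is not reproduced here.

```python
import numpy as np

# Database size implied by the numbers in the slides.
num_shapes, num_viewpoints, images_per_viewpoint = 26, 86, 48
print(num_shapes * num_viewpoints * images_per_viewpoint)  # 107328

# Illustrative PCA-style dimensionality reduction via SVD (toy data).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 50))          # 100 samples, 50-dim features
centered = data - data.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:8]                        # keep 8 principal components
compressed = centered @ components.T       # (100, 8) low-dim representation
print(compressed.shape)  # (100, 8)
```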
Synthetic Versus Real Training Data
A big advantage of synthetic training sets is that the labeling of the data can be done automatically.
Problem with real data: ground-truth labels are hard to collect and would require a multi-camera setup.
Edge-Based View Matching
Image similarity is defined using the chamfer distance.
Given an input image, extract its edge pixels using an edge detector (Canny) and store their coordinates in a set X.
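A simplified stand-in for this step, assuming only NumPy: threshold the gradient magnitude and collect edge-pixel coordinates into a set X. A real Canny detector additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding.

```python
import numpy as np

# Gradient-magnitude edge extraction (simplified stand-in for Canny).
def edge_pixel_set(image, threshold=1.0):
    gy, gx = np.gradient(image.astype(float))   # gradients along rows, cols
    magnitude = np.hypot(gx, gy)
    ys, xs = np.nonzero(magnitude > threshold)
    return set(zip(ys.tolist(), xs.tolist()))

# Toy image with a vertical step edge between columns 2 and 3.
img = np.zeros((5, 5))
img[:, 3:] = 10.0
X = edge_pixel_set(img)
print(sorted(X))
```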
Experimental Results
The database has 26 different hand shapes; each shape is rendered from 86 viewing directions, with 48 images per direction.
The test set consists of 28 real hand pose images.
Experimental Results
Define the distance D between a point x and a set of points X to be the Euclidean distance between x and the point in X that is closest to x:

D(x, X) = min_{b in X} ||x - b||

The directed chamfer distance from a set X to a set Y averages this over all points of X:

D(X, Y) = (1/|X|) * sum_{x in X} D(x, Y)

The undirected chamfer distance is the sum of the two directed distances:

D_c(X, Y) = D(X, Y) + D(Y, X)
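The definitions above translate directly into a brute-force implementation; real systems typically precompute a distance transform for speed, which is not shown here.

```python
import numpy as np

# Directed chamfer distance: mean distance from each point of X
# to its closest point in Y.
def directed_chamfer(X, Y):
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2).min(axis=1)
    return d.mean()                # D(X, Y) = (1/|X|) sum_x D(x, Y)

# Undirected chamfer distance: sum of the two directed distances.
def chamfer(X, Y):
    return directed_chamfer(X, Y) + directed_chamfer(Y, X)

X = [(0, 0), (0, 2)]
Y = [(0, 0), (0, 3)]
print(directed_chamfer(X, Y), chamfer(X, Y))  # 0.5 1.0
```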