hand pose estimation rnd project - cse, iit bombaypratikm/projectpages/deeplearningforpo… ·...

Hand Pose Estimation

Author: Pratik Kalshet
Supervisor: Parag Chaudhuri

Department of Computer Science and Engineering
Indian Institute of Technology Bombay

RnD Project

Outline

Introduction
Problem Statement
Previous Work
Approach
Results

Introduction: Motivation

Applications – human-computer interaction, augmented and virtual reality, …
Hot research topic – ICCV, CVPR, SIGGRAPH. 2016

Robert Wang. Nimble VR 2014

Introduction: Challenges

Self-occlusion
Self-similarity
Noise
High degree of freedom

Aim – Accuracy and Efficiency

Problem Statement: Hand Pose Estimation

Input – Depth Image (of hand)
Output – Joint Locations in 3-D
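To make the input/output relation concrete: a depth pixel (u, v) with depth d corresponds to a 3-D point through the standard pinhole camera model. The sketch below is only an illustration; the intrinsics `FX`, `FY`, `CX`, `CY` are hypothetical placeholder values, not those of any particular sensor or of this project.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with depth along the optical axis to a 3-D camera-space point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics, for illustration only.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# A pixel at the principal point lies on the optical axis.
center_point = backproject(320, 240, 700.0, FX, FY, CX, CY)
# A pixel 100 columns to the right, at the same depth.
offset_point = backproject(420, 240, 700.0, FX, FY, CX, CY)
```

Joint labels given as 3-D positions live in this camera space, so the same mapping relates them to pixel locations in the depth image.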

Previous Work: Types of Techniques

Model-driven (Generative) [Mak+15(CVPR)]
Synthesize a hand model and optimize an energy (the discrepancy with the observation) to get the hand pose

Advantage – accurate, valid poses

Disadvantage – slow, prone to local minima (initialization problem)

Data-driven (Discriminative) [Sun+15(CVPR)]
Learn a direct regression function from the observed image to the hand pose

Advantage – fast (real-time)

Disadvantage – coarse results, may violate hand geometry

Generative Methods

Discriminative Methods

Theobalt. “Real-time Capture of Hands in Motion”. CVPR. 2015

Previous Work: Types of Techniques

Hybrid [Tay+16(SIGGRAPH)]
Initialization with a discriminative method, refinement with a generative method

Advantage – accurate, fast

Disadvantage – separate stages lead to sub-optimal results

Tompson et al. “Real-time continuous pose recovery of human hands using convolutional networks”. TOG. 2014

Hand prior

Non-linear regression

Previous Work: Issues in Existing Systems

Ge et al. “Robust 3D hand pose estimation in single depth images: from single-view CNN to multi-view CNNs”. CVPR. 2016

Approach: Overview

Input (Depth Image)

Deep Network (ConvNet, Kinematic Layer)

Output (3-D Joint Positions)

Zhou et al. “Model-based Deep Hand Pose Estimation”. IJCAI. 2016
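A kinematic layer in the sense of Zhou et al. maps joint angles to joint positions through forward kinematics, so the network regresses angles and the hand geometry is enforced by construction. The following is a minimal sketch of that idea, assuming a single planar finger with three flexion joints; the function name and the bone lengths are hypothetical, and the hand model used in the project works in 3-D with many more degrees of freedom.

```python
import math

def finger_forward_kinematics(angles, bone_lengths, base=(0.0, 0.0)):
    """Return the 2-D positions of the joints of one planar finger.

    angles       -- flexion angle of each joint in radians, relative to the parent bone
    bone_lengths -- length of the bone that follows each joint (hypothetical values)
    """
    x, y = base
    heading = 0.0              # rotation accumulated along the chain
    positions = [(x, y)]
    for theta, length in zip(angles, bone_lengths):
        heading += theta       # a child's rotation composes with its parent's
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

# Fully extended finger: all bones lie along the x-axis.
straight = finger_forward_kinematics([0.0, 0.0, 0.0], [4.0, 3.0, 2.0])
# Bending only the first joint by 90 degrees rotates the whole finger.
bent = finger_forward_kinematics([math.pi / 2, 0.0, 0.0], [4.0, 3.0, 2.0])
```

Because every step is a composition of sines, cosines, and sums, the mapping is differentiable with respect to the angles, which is what allows such a layer to be trained end to end with the ConvNet.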

Approach: Pre-processing

Zhang et al. “Accurate per-pixel hand detection from a single depth image”. Optical Engineering. 2017

1. Hand detection

2. Depth normalization

*This is assumed to be done.
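The slides do not say how depth is normalized. One common choice, assumed here purely for illustration, is to center the cropped hand patch on its mean depth and scale by a fixed half-range, clamping to [-1, 1]. The parameter `half_size_mm` and the convention that 0 marks background are hypothetical.

```python
def normalize_depth(patch, half_size_mm=150.0, background=0.0):
    """Map a cropped hand depth patch (in mm) to [-1, 1] around the hand's mean depth."""
    valid = [d for row in patch for d in row if d != background]
    if not valid:
        raise ValueError("patch contains no hand pixels")
    center = sum(valid) / len(valid)
    normalized_patch = []
    for row in patch:
        out_row = []
        for d in row:
            if d == background:
                out_row.append(1.0)   # push background to the far plane
            else:
                value = (d - center) / half_size_mm
                out_row.append(max(-1.0, min(1.0, value)))
        normalized_patch.append(out_row)
    return normalized_patch

# Tiny 2x2 patch: one background pixel and three hand pixels.
patch = [[0.0, 700.0], [750.0, 800.0]]
normalized = normalize_depth(patch)
```

Normalizing this way makes the network input independent of how far the hand is from the camera.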

Approach: Deep Network

Loss:
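The cited Zhou et al. paper trains with a joint-location term plus a penalty on joint angles that leave their valid range. The sketch below shows a loss of that form, on the assumption that this project follows it; the function names, the weight, and the angle ranges are hypothetical.

```python
def joint_location_loss(pred_joints, true_joints):
    """Half the squared Euclidean distance between flattened joint coordinates."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred_joints, true_joints))

def physical_constraint_loss(angles, lower, upper):
    """Penalty that grows linearly once an angle leaves its range [lower, upper]."""
    return sum(max(lo - a, 0.0) + max(a - hi, 0.0)
               for a, lo, hi in zip(angles, lower, upper))

def total_loss(pred_joints, true_joints, angles, lower, upper, weight=1.0):
    """Joint-location loss plus a weighted physical-constraint penalty."""
    return (joint_location_loss(pred_joints, true_joints)
            + weight * physical_constraint_loss(angles, lower, upper))

# One coordinate is off by 2.0, and one angle exceeds its upper bound by 0.5.
example_loss = total_loss(
    pred_joints=[1.0, 2.0, 3.0],
    true_joints=[1.0, 2.0, 5.0],
    angles=[0.5, 2.0],
    lower=[0.0, 0.0],
    upper=[1.5, 1.5],
)
```

The penalty term is zero for any pose whose angles stay in range, so it only steers the network away from physically invalid poses.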

Results: Data

NYU Hand Pose Dataset
Training samples: 10000
Test samples: 1200
Joints: 31
DoF: 26

Input – Depth Image
Label – Joint Positions in 3-D

Tompson et al. “Real-time continuous pose recovery of human hands using convolutional networks”. TOG. 2014

Results: Qualitative Results

[Figure: input, prediction, and ground truth]

Results: Comparative Study

[Figure: input, prediction, and ground truth, without the kinematic layer vs. with the kinematic layer]

Results: Comparison with state-of-the-art

Technique             Error
No prior              6395.45
Existing best prior   4699.16
Kinematic prior       3079.38
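The relative size of the improvement can be read off the reported errors. The short calculation below uses only the three values listed above; the error unit is not stated in the slides, so only ratios are computed, and the helper name is hypothetical.

```python
errors = {
    "No prior": 6395.45,
    "Existing best prior": 4699.16,
    "Kinematic prior": 3079.38,
}

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

vs_no_prior = relative_reduction(errors["No prior"], errors["Kinematic prior"])
vs_best_prior = relative_reduction(errors["Existing best prior"], errors["Kinematic prior"])

print(f"{vs_no_prior:.1f}% lower than no prior")                # 51.9%
print(f"{vs_best_prior:.1f}% lower than existing best prior")   # 34.5%
```

On these numbers the kinematic prior roughly halves the error of the network without a prior and cuts the error of the existing best prior by about a third.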

Conclusion: Summary

Hand kinematic prior in a deep network
Achieved competitive results

Future Work

Multi-view CNN
Temporal data for tracking
Physics-based constraint layer
