Post on 31-Mar-2015
Vogler and Metaxas
University of Toronto Computer Science
CSC 2528: Handshapes and Movements: Multiple-channel ASL recognition
Christian Vogler and Dimitris Metaxas (presented by Christopher Collins)
Overview: Part II
Introduction to ASL recognition
Challenges of ASL recognition
Related work
Modelling
  Phoneme-based modelling
  Independent Channels
  Handshape
Parallel Hidden Markov Models
Experiments
Conclusions and Future Work
ASL Recognition: Introduction
Computer interaction is still mainly keyboard/mouse
requires literacy in a written language or an agreed-upon standard written form of ASL (e.g. sign-writing)
difficult for many people who are deaf
ASL Recognition: Challenges
More difficult than speech recognition due to:
simultaneous events
inflections
phonology poorly understood, no agreed standard
Challenges of Simultaneity
Related Work
C. Vogler and D. Metaxas. Parallel Hidden Markov Models for ASL Recognition (1999).
G. Fang et al. Signer-independent continuous sign language recognition based on SRN/HMM (2001).
R.-H. Liang and M. Ouhyoung. A real-time continuous gesture recognition system for sign language (1998).
Overview
HMM-based approach to ASL recognition
parallel HMMs for different channels
channels are left and right handshape and movement
uses the movement-hold phonology
Movement-Hold Example
Handshape Modelling
Most previous work uses joint and abduction angles as features (low-level)
Also experiment with a measure of the openness of a finger (high-level):
  height and width of quadrilateral
  MPJ angle
  abduction angles
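A rough sketch of how the height and width of such a quadrilateral could be computed from finger joint angles. The slide does not define the geometry, so everything here is an assumption: the finger is treated as a planar three-segment chain starting at the MPJ, the quadrilateral is taken to have the MPJ, the two finger joints, and the fingertip as corners, width is the MPJ-to-tip distance, and height is the largest distance of a joint from the MPJ-to-tip line. The function name, segment lengths, and angles are hypothetical.

```python
import math

def finger_quadrilateral(seg_lengths, pip_flexion, dip_flexion):
    """Return (width, height) of an assumed finger quadrilateral.

    Assumption (not from the slides): the corners are the MPJ at the origin,
    the PIP joint, the DIP joint, and the fingertip, all in one plane.
    Angles are flexion angles in radians; 0 means a straight finger.
    """
    l1, l2, l3 = seg_lengths
    pip = (l1, 0.0)  # proximal segment laid along the x-axis
    dip = (pip[0] + l2 * math.cos(pip_flexion),
           pip[1] + l2 * math.sin(pip_flexion))
    bend = pip_flexion + dip_flexion  # cumulative bend of the distal segment
    tip = (dip[0] + l3 * math.cos(bend), dip[1] + l3 * math.sin(bend))

    width = math.hypot(tip[0], tip[1])  # MPJ-to-tip distance
    if width == 0.0:
        return 0.0, 0.0  # degenerate case: tip folded back onto the MPJ

    def dist_to_base_line(p):
        # Perpendicular distance of p from the line through MPJ and tip.
        return abs(tip[0] * p[1] - tip[1] * p[0]) / width

    height = max(dist_to_base_line(pip), dist_to_base_line(dip))
    return width, height

# Hypothetical segment lengths (arbitrary units) for a straight and a bent finger.
straight = finger_quadrilateral((4.0, 3.0, 2.0), 0.0, 0.0)
bent = finger_quadrilateral((4.0, 3.0, 2.0), math.pi / 2, math.pi / 2)
```

Under these assumptions an extended finger gives a wide, flat quadrilateral, while a curled finger gives a narrower, taller one, which is the sense in which the two numbers measure openness.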
Extensions to HMM
Regular HMMs model one process evolving over time
To model parallel, possibly interacting processes with a regular HMM, events must evolve in lockstep
Earlier work by Vogler and Metaxas explains the development of the parallel HMM model
Factorial HMM
Coupled HMM
Parallel HMM
Combination of Processes
Using the independence assumption, combine the path probabilities from each channel (for paths whose states represent the same sign sequence) by multiplying them, then choose the most probable state sequence.
Running time is polynomial in the number of states and linear in the number of parallel processes
More info: C. Vogler and D. Metaxas, Parallel Hidden Markov Models for ASL Recognition; Proc. Int. Conf. on Comp. Vis., Greece, 1999.
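A minimal sketch of the combination step only, assuming each channel has already produced a log probability for every candidate sign sequence. The function name, channel names, hypothesis labels, and probabilities are hypothetical and chosen for illustration; how the per-channel path probabilities are obtained is not shown.

```python
import math

def combine_channels(channel_log_probs):
    """Pick the hypothesis with the highest product of per-channel probabilities.

    channel_log_probs maps a channel name to {hypothesis: log probability}.
    Only hypotheses scored by every channel are compared.
    """
    channels = list(channel_log_probs.values())
    shared = set(channels[0])
    for scores in channels[1:]:
        shared &= set(scores)
    # Multiplying probabilities is the same as adding log probabilities;
    # the work grows linearly with the number of channels.
    totals = {h: sum(scores[h] for scores in channels) for h in shared}
    best = max(totals, key=totals.get)
    return best, totals[best]

# Made-up per-channel scores for two candidate sign sequences.
scores = {
    "right_movement": {"SIGN_A": math.log(0.02), "SIGN_B": math.log(0.03)},
    "right_handshape": {"SIGN_A": math.log(0.50), "SIGN_B": math.log(0.10)},
}
best, log_prob = combine_channels(scores)
```

In this made-up case the movement channel alone would prefer SIGN_B, but the handshape channel shifts the combined decision to SIGN_A. Working in log space avoids numerical underflow when many small probabilities are multiplied.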
Experiments
Compare handshape models (joint angles vs. quadrilateral) for handshape recognition task
Compare PaHMM model with various channel combinations against single hand movement channel (naïve baseline?)
Vocabulary of 22 signs, 400 training sentences of length 2-7 signs, and 99 test sentences
Omitted left-hand handshape?
Choice of Handshape Model
Measure correctly recognized handshape (recognizing signs with handshape alone not possible)
Quadrilateral feature vector results in better (and more consistent) recognition accuracy
Experimental Results
H = correct, D = deletion, S = substitution, I = insertion, N = number
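These counts are commonly combined into two summary scores. The formulas below are the usual speech-recognition-style definitions and are an assumption here, since the slide only defines the symbols. N is taken to be the number of signs in the reference transcription, and the counts in the example are made up.

```python
def recognition_scores(n, d, s, i):
    """Return (H, percent correct, percent accuracy) from error counts.

    Assumed definitions: H = N - D - S, correct = H / N, accuracy = (H - I) / N.
    """
    h = n - d - s
    return h, 100.0 * h / n, 100.0 * (h - i) / n

# Hypothetical counts, not taken from the experiments.
h, correct, accuracy = recognition_scores(n=100, d=5, s=10, i=3)
```

Under these definitions, accuracy is never higher than percent correct, because insertions are subtracted only from the accuracy score.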
Conclusions
Handshape information is important in ASL recognition
Parallel HMM a promising model for multi-channel data
Future Work
Training/Test data from native signers
Include facial expressions
Use of relative spatial information (classifiers)
Larger vocabulary
Incorporation of language modelling to improve recognition, such as n-gram or parsing