Project 9 Automatic Fingersign to Speech Translator

Final Presentation

Upload: erik

Post on 25-Feb-2016

DESCRIPTION

Project 9 Automatic Fingersign to Speech Translator. Final Presentation. The group: Lale Akarun, Oya Aran, Alp Kindiroglu, Alexey Karpov, Milos Zeleny, Marek Hruz, Hasim Sak, Pavel Campr, Erinc Dikici, Daniel Schorno, Zdenek Krnoul, Alexander Ronzhin. Objectives & System Flowchart.

TRANSCRIPT

Page 1: Project 9 Automatic Fingersign to Speech Translator

Final Presentation

Page 2: Project 9 Automatic Fingersign to Speech Translator

The group:
◦ Lale Akarun
◦ Oya Aran
◦ Alexey Karpov
◦ Milos Zeleny
◦ Hasim Sak
◦ Erinc Dikici
◦ Alp Kindiroglu
◦ Marek Hruz
◦ Pavel Campr
◦ Daniel Schorno
◦ Alexander Ronzhin
◦ Zdenek Krnoul

Page 3: Project 9 Automatic Fingersign to Speech Translator

Finger spelling <-> Speech (F2S & S2F)
◦ Translation between Russian, English, Czech, Turkish

Page 4: Project 9 Automatic Fingersign to Speech Translator

Multilingual fingersign alphabet database
◦ Turkish alphabet (5 subjects)
◦ Czech alphabet (4 subjects)
◦ Russian alphabet (2 subjects)
◦ Numbers and special stop signs

Page 5: Project 9 Automatic Fingersign to Speech Translator

Semi-automatic annotation module:
◦ 11 videos, each 15-30 minutes long
◦ Steps shown in the diagram: Filter Images, Select Keyframes, Crop Sign-Space, Segment Hand Locations

Page 6: Project 9 Automatic Fingersign to Speech Translator

Skin color based hand detection
◦ Initialization of model by movement of hands

[System flowchart stages: Video Input (Turkish or Czech), Skin Color Detection, Tracking and Segmentation of Hands, Keyframe Selection, Feature Extraction & Classification, Text Output (UTF-8)]
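One way to picture a skin color model that is initialized by hand movement: pixels flagged as moving are assumed to be skin, their colors are collected into a normalized histogram, and that histogram is later backprojected onto a frame to get a per-pixel skin likelihood. The sketch below is only an illustration under stated assumptions. The chroma pair `(cb, cr)`, the bin count, and the function names `build_skin_histogram` and `backproject` are hypothetical and not taken from the slides.

```python
from collections import Counter


def build_skin_histogram(pixels, motion_mask, bins=8):
    """Build a normalized 2D chroma histogram from moving pixels.

    pixels: 2D list of (cb, cr) pairs with values in 0..255 (assumed color space).
    motion_mask: 2D list of 0/1 flags, 1 where motion was detected.
    Returns a dict mapping (cb_bin, cr_bin) to a relative frequency.
    """
    step = 256 // bins
    counts = Counter()
    for row_px, row_mask in zip(pixels, motion_mask):
        for (cb, cr), moving in zip(row_px, row_mask):
            if moving:
                counts[(cb // step, cr // step)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bin_: n / total for bin_, n in counts.items()}


def backproject(pixels, histogram, bins=8):
    """Replace every pixel by the histogram value of its chroma bin."""
    step = 256 // bins
    return [
        [histogram.get((cb // step, cr // step), 0.0) for cb, cr in row]
        for row in pixels
    ]
```

Thresholding the backprojected map would give a binary skin mask. A real system would typically smooth the histogram and update it over time.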

Page 7: Project 9 Automatic Fingersign to Speech Translator

Tracking of the hands by Camshift
◦ Hierarchical hand and face redetection
◦ Hand segmentation: backprojection, double differencing
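Double differencing is usually described as combining two consecutive frame differences, so that a pixel counts as moving only if it changed both before and after the current frame. A minimal sketch on grayscale frames stored as nested lists follows. The function name `double_difference` and the threshold value are illustrative assumptions, not the project's implementation.

```python
def double_difference(prev, curr, nxt, threshold=20):
    """Return a binary motion mask for the frame `curr`.

    A pixel is set to 1 only if it differs from both the previous and the
    next frame by more than `threshold` (a hypothetical value).
    All frames are equally sized 2D lists of grayscale intensities.
    """
    mask = []
    for row_p, row_c, row_n in zip(prev, curr, nxt):
        mask_row = []
        for p, c, n in zip(row_p, row_c, row_n):
            changed_before = abs(c - p) > threshold
            changed_after = abs(n - c) > threshold
            mask_row.append(1 if changed_before and changed_after else 0)
        mask.append(mask_row)
    return mask
```

Compared with a single frame difference, this suppresses the "ghost" left at the position the hand occupied in the previous frame.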

Page 8: Project 9 Automatic Fingersign to Speech Translator

Two-tier classification:
◦ Keyframe Selection
◦ Gesture Recognition

Detection of keyframes:
◦ Motion of hands: displacement of tracked hand centers, changes in hand external contour
◦ Image blur: strength of gradient trace around hand contours
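The keyframe cues above can be sketched as a simple rule: a frame is a candidate when the tracked hand center has barely moved since the previous frame and the image is sharp enough, and the sharpest frame of each run of candidates becomes the keyframe. This is an assumption-laden illustration. The per-frame `sharpness` score stands in for the gradient strength around the hand contour, contour change is not modeled, and the thresholds and the name `select_keyframes` are hypothetical.

```python
import math


def select_keyframes(centers, sharpness, max_motion=2.0, min_sharpness=0.5):
    """Pick one keyframe index per run of still, sharp frames.

    centers: list of (x, y) hand centers, one per frame.
    sharpness: list of floats, one per frame (higher means less blur).
    """
    keyframes = []
    run = []  # indices of the current run of candidate frames
    for i in range(1, len(centers)):
        motion = math.dist(centers[i], centers[i - 1])
        if motion <= max_motion and sharpness[i] >= min_sharpness:
            run.append(i)
        elif run:
            # The run ended: keep its sharpest frame.
            keyframes.append(max(run, key=lambda k: sharpness[k]))
            run = []
    if run:
        keyframes.append(max(run, key=lambda k: sharpness[k]))
    return keyframes
```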

Page 9: Project 9 Automatic Fingersign to Speech Translator

Hand gesture descriptors:
◦ Radial Distance Functions
◦ Elliptic Fourier Descriptors
◦ Local Binary Patterns
◦ Hu Moments

Each feature is classified separately by k-nearest neighbors (KNN).
◦ The per-feature classification results are fused by voting.
◦ Optional word-level fusion with Levenshtein distance.
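The classification scheme above can be sketched in a few lines: one KNN decision per descriptor, a majority vote across descriptors, and an optional correction of the spelled word toward the closest vocabulary entry by Levenshtein distance. The feature vectors, the value of `k`, the vocabulary, and all function names here are hypothetical; the slides do not give these details.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fuse_by_voting(predictions):
    """Majority vote over the labels predicted from each descriptor."""
    return Counter(predictions).most_common(1)[0][0]


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_word(spelled, vocabulary):
    """Word-level fusion: snap a spelled letter sequence to the nearest word."""
    return min(vocabulary, key=lambda word: levenshtein(spelled, word))
```

For example, if one letter of a fingerspelled city name is misrecognized, `closest_word` can still recover the intended word as long as it is in the vocabulary.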

Page 10: Project 9 Automatic Fingersign to Speech Translator

Continuous speech recognition:
◦ A weighted finite-state transducer based speech decoder
◦ 3-gram language model
◦ 100K vocabulary size
    News portal based
    10843 tri-phone HMM states
◦ 11 Gaussians for acoustic model
◦ 188 hours of broadcast news speech data

Page 11: Project 9 Automatic Fingersign to Speech Translator

Voice Activity Detection (VAD)
◦ Preprocessing step on continuous ASR
◦ Identifies false voice triggers
◦ Employed methods:
    Rabiner’s method: energy level and zero-crossing rates of the acoustic waveform
    Supervised learning: energy level of the signal modeled using GMMs
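The two quantities named for the energy and zero-crossing approach can be computed per frame as below. This is a simplified per-frame decision, not the full endpoint search of the published method: a frame is marked as speech when its energy is high, or when its zero-crossing rate is high (a possible unvoiced sound). The frame length, both thresholds, and the function names are assumptions. In practice the thresholds would be estimated from a stretch of background noise.

```python
def frame_features(samples, frame_len):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features


def detect_voice(samples, frame_len=4, energy_thresh=0.01, zcr_thresh=0.5):
    """Mark a frame True if it looks like speech by energy or by ZCR."""
    return [
        energy > energy_thresh or zcr > zcr_thresh
        for energy, zcr in frame_features(samples, frame_len)
    ]
```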

Page 12: Project 9 Automatic Fingersign to Speech Translator

Isolated speech recognition:
◦ Phoneme based speech recognition
◦ Represented by HMMs using GMMs
◦ Used for out-of-vocabulary words
◦ Speech commands allow module control

Page 13: Project 9 Automatic Fingersign to Speech Translator

Python Based Web Service

◦ Handles Input/Output from multiple modules

◦ Users communicate using sessions

◦ All messages in utf-8 encoding or transcribed form

◦ Translation of sentences handled by Google Translate

◦ Message types: Letter, Word, Sentence
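The message handling described above can be pictured with a small in-memory hub: modules post typed messages into a session, and other modules poll the session for anything they have not yet seen. This is a sketch only. The class name `MessageHub`, its methods, and the module names are hypothetical, and the actual HTTP layer and the translation step are not shown.

```python
from collections import defaultdict

MESSAGE_TYPES = {"letter", "word", "sentence"}


class MessageHub:
    """Stores UTF-8 encoded messages per session and serves them on polling."""

    def __init__(self):
        self._messages = defaultdict(list)  # session id -> list of messages
        self._cursors = defaultdict(int)    # (session id, module) -> next index

    def post(self, session, sender, msg_type, text):
        if msg_type not in MESSAGE_TYPES:
            raise ValueError(f"unknown message type: {msg_type}")
        self._messages[session].append(
            {"sender": sender, "type": msg_type, "payload": text.encode("utf-8")}
        )

    def poll(self, session, module):
        """Return unseen (type, text) pairs, skipping the module's own posts."""
        key = (session, module)
        pending = self._messages[session][self._cursors[key]:]
        self._cursors[key] = len(self._messages[session])
        return [
            (m["type"], m["payload"].decode("utf-8"))
            for m in pending
            if m["sender"] != module
        ]
```

Keeping a separate read cursor per module lets several output modules consume the same session independently.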

Page 14: Project 9 Automatic Fingersign to Speech Translator

Computer speech synthesis given an arbitrary input text

Two TTS systems are applied:
◦ MARY TTS developed by DFKI (Germany)
◦ TTS engine developed by UIIP (Belarus) and SPIIRAS (Russia)

Web-based service
◦ Polls for messages from the web server

Page 15: Project 9 Automatic Fingersign to Speech Translator

Visual Fingersign output provided through a 3D avatar

Available for two languages:
◦ Czech Sign Alphabet
◦ American Sign Alphabet

Module composed of:
◦ 3D animation model
    38 joints and segments (16 for hand)
◦ Trajectory generator
    Rotations of body parts handled with Inverse Kinematics

Head and lip motion provided by talking head system

Input and output are at the word level.
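As a rough illustration of what inverse kinematics computes, the sketch below solves the textbook case of a planar two-segment arm: given a target point, it returns the two joint angles that place the end of the arm there. This is not the avatar's solver. The real model has 38 joints in 3D, and the function names and the two-segment simplification are assumptions made for the example.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Return (shoulder, elbow) angles in radians that reach (x, y).

    l1 and l2 are the segment lengths. Returns None if the target is
    out of reach. One of the two mirror solutions is returned.
    """
    d2 = x * x + y * y
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        return None
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, l1, l2):
    """Forward kinematics, used here to check the IK result."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y
```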

Page 16: Project 9 Automatic Fingersign to Speech Translator
Page 17: Project 9 Automatic Fingersign to Speech Translator

City names game
◦ Module design (diagram labels): inputs: Visual Input (Turkish), Audio Letter Input (Russian); recognition: Finger Spelling Recognition, Isolated Speech Recognition; Server (Translator); synthesis: Finger Spelling Synthesis, Speech Synthesis; outputs: Visual Output (Czech), Audio Output (English)
◦ Fingerspell -> Amsterdam, Speech -> Madrid
◦ Fingerspell -> Doha, Speech -> Alta
◦ Fingerspell -> Athens, Speech -> Sukre
◦ Fingerspell -> Eton, Speech -> Nairobi

Page 18: Project 9 Automatic Fingersign to Speech Translator

City names game
◦ Fingerspell -> Amsterdam, Speech -> Madrid
◦ Fingerspell -> Doha, Speech -> Alta
◦ Fingerspell -> Athens, Speech -> Sukre
◦ Fingerspell -> Eton, Speech -> Nairobi
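The example turns suggest the usual chaining rule: each city starts with the last letter of the previous one (Amsterdam, Madrid, Doha, ...). The slides do not state the rule, so it is inferred here from the example, and the function name `is_valid_chain` is hypothetical.

```python
def is_valid_chain(cities):
    """True if every city starts with the last letter of the one before it."""
    for prev, curr in zip(cities, cities[1:]):
        if prev.casefold()[-1] != curr.casefold()[0]:
            return False
    return True


# Turns from the slide, alternating fingerspelled and spoken input.
turns = ["Amsterdam", "Madrid", "Doha", "Alta",
         "Athens", "Sukre", "Eton", "Nairobi"]
chain_ok = is_valid_chain(turns)
```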

Page 19: Project 9 Automatic Fingersign to Speech Translator

Casual Continuous Conversation
◦ Module design (diagram labels): input: Audio Sentence Input (Turkish); recognition: Isolated Speech Recognition; Server (Translator); synthesis: Finger Spelling Synthesis, Speech Synthesis; outputs: Visual Output (Czech), Audio Output (English)

Page 20: Project 9 Automatic Fingersign to Speech Translator

Automated language detection for fingerspelling

Further testing

Increasing overall system speed

Addition of missing languages to underlying modules

Page 21: Project 9 Automatic Fingersign to Speech Translator
Page 22: Project 9 Automatic Fingersign to Speech Translator