Upload: grupo-golem-dcc-iimas-unam

Post on 04-Jul-2015

DESCRIPTION

Talk given in MICAI 2013: Practical Speech Recognition for Contextualized Service Robots

TRANSCRIPT

Page 1: Micai 13  contextualized practical speech

Practical Speech Recognition for Contextualized Service Robots

Departamento de Ciencias de la Computación

Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas

Universidad Nacional Autónoma de México

http://golem.iimas.unam.mx/

Ivan Meza, Caleb Rascón and Luis Pineda

GrupoGolem

Page 2: Micai 13  contextualized practical speech

Service robots

● Our future butlers

● They are task oriented

○ Clean up a room

○ Play a game

● Interaction with spoken language

● They work in noisy environments

● Microphone is not close to the speaker

● Poor speech recognition

Page 3: Micai 13  contextualized practical speech

Proposal

● Improve the system in four aspects

● Contextualized recogniser

● Prompting strategies

● Recovery strategies

● Audio calibration

Page 4: Micai 13  contextualized practical speech

I. Contextualized recognition

● Use specific language models for the given expectations

■ YES: yes, okay, all right

■ NO: no, don’t, do not

■ NAVIGATE: go to the kitchen, go to the living room, go to the bedroom
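The idea of restricting the recognizer to what the dialogue currently expects can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary `EXPECTATION_PHRASES` and the functions `active_phrases` and `interpret` are hypothetical names, and a real system would load a language model into the recognizer rather than match strings. Only the three expectations and their phrases come from the slide.

```python
# Hypothetical mapping from dialogue expectations to the phrases their
# language models cover; the three entries come from the slide above.
EXPECTATION_PHRASES = {
    "YES": ["yes", "okay", "all right"],
    "NO": ["no", "don't", "do not"],
    "NAVIGATE": [
        "go to the kitchen",
        "go to the living room",
        "go to the bedroom",
    ],
}


def active_phrases(expectations):
    """Collect the phrases the recognizer should accept in this context."""
    phrases = []
    for name in expectations:
        phrases.extend(EXPECTATION_PHRASES[name])
    return phrases


def interpret(hypothesis, expectations):
    """Return the expectation matched by a recognized string, or None."""
    text = hypothesis.strip().lower()
    for name in expectations:
        if text in EXPECTATION_PHRASES[name]:
            return name
    return None
```

For example, after a yes/no question only `["YES", "NO"]` would be active, so a navigation command is not accepted in that context and `interpret` returns `None` for it.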

Page 5: Micai 13  contextualized practical speech

ASR module

Page 6: Micai 13  contextualized practical speech

II. Prompting strategies

● Let the user know when to speak

■ Beep sound

● Speaker volume monitor

■ Could you speak louder or softer

Page 7: Micai 13  contextualized practical speech

III. Recovery strategy

● Let the user know when something went wrong

■ Could you repeat?

■ I can’t hear you well, could you repeat?

■ Sorry, I’m a little deaf

Page 8: Micai 13  contextualized practical speech

IV. Calibration of audio setting

● Hardware

■ 1 directional microphone

■ 1 USB interface with 4 channels

■ 2 speakers

● Calibration of SNR in situ

■ Background noise: -58 dB

■ SNR set to 20 dB
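The calibration figures can be related with simple decibel arithmetic. The sketch below assumes that both levels are expressed on the same decibel scale (for example relative to full scale, which the slide does not state) and that SNR is the difference between speech level and noise level. Under those assumptions, a -58 dB noise floor and a 20 dB SNR imply a speech level of about -38 dB. The function names are hypothetical.

```python
import math


def rms_db(samples):
    """Level of a block of samples in dB relative to an amplitude of 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def snr_db(speech_db, noise_db):
    """Signal-to-noise ratio as the difference of two dB levels."""
    return speech_db - noise_db


def target_speech_db(noise_db, snr):
    """Speech level needed to reach a given SNR over a noise floor."""
    return noise_db + snr


# Figures from the slide: noise floor of -58 dB, SNR set to 20 dB.
print(target_speech_db(-58, 20))
```

A calibration routine along these lines would measure `rms_db` on a stretch of background noise and then adjust the input gain until speech sits roughly 20 dB above it.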

Page 9: Micai 13  contextualized practical speech

Corpus evaluation

● Logs from the robot performing RoboCup tasks

■ 2 years of interactions in lab and competition

■ 1,439 utterances

■ 2,472 tokens

■ 120 types

■ 11 tasks

■ 9 of 11 tasks are contextualized

■ 14 language models

Page 10: Micai 13  contextualized practical speech

Contextualized recognition

We measure WER (the lower the better)

● With a unique LM for all tasks: 53.84%

● With task-based LM: 28.28%

● With contextualized: 23.42%

17.2% relative error reduction (contextualized vs. task-based LM)
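For reference, the WER figures above follow the usual definition: the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic implementation of that definition, not the authors' evaluation script, and `wer` and `relative_reduction` are hypothetical names. The second function reproduces the relative error reduction from the task-based and contextualized WER values.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Task-based LM (28.28%) vs. contextualized LM (23.42%), from the slide.
print(round(100 * relative_reduction(28.28, 23.42), 1))  # 17.2
```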

Page 11: Micai 13  contextualized practical speech

Beep sound

● 79 utterances were recorded without the beep sound

■ Without beeps 55.86%

■ With beeps 39.75%

■ With beeps full 53.72%

30% to 4% relative error reduction

Page 12: Micai 13  contextualized practical speech

Usage of SoundLoc System

● We measure usage

■ 174 times could have been triggered

■ 21 soft speech

■ 4 louder

14.36% of the times

Page 13: Micai 13  contextualized practical speech

Recovery strategy

● We measure usage

■ 504 times could have been triggered

■ 85 times activated

16.87% of the times

Page 14: Micai 13  contextualized practical speech

Conclusions

● Each of these strategies improves performance by a small amount

● Together they allow practical speech recognition on a service robot

Page 15: Micai 13  contextualized practical speech

Thank you

● Questions?