Post on 16-Dec-2015
Spoken Dialogue Technology: Achievements and Challenges
Michael McTear
University of Ulster
Overview
Introduction - What is a spoken dialogue system?
Examples of spoken dialogue systems
Technical issues and challenges
Future prospects
What is a spoken dialogue system?
A spoken dialogue system is an automated system that engages in a dialogue with a human user using spoken language as the medium of interaction.
Two main types of spoken dialogue system
Task-oriented: involves the use of dialogues to accomplish a task, e.g. making a hotel booking or planning a family holiday
Non-task-oriented: engages in conversational interaction without necessarily having a task to accomplish, e.g. a conversational companion for the elderly
Application Domains for SDS
Telephone-based services and transactions: call routing, directory assistance, travel enquiries, bank balance, bank transactions, flight / hotel / car rental reservations
In-car interactive and entertainment systems
Automated troubleshooting
Smart home applications
Health-care systems, e.g. patient monitoring
Educational, e.g. intelligent tutoring systems, foreign language learning
Computer games
Three generations of task-oriented spoken dialogue system
Informational: to retrieve information, e.g. flight times, football scores, …
Transactional: to assist the user to perform a transaction, e.g. book a flight, pay a bill
Problem-solving: to support the user in solving a problem, e.g. to troubleshoot a PC that is not working
Why is dialogue interesting?
Fundamental aspect of human behaviour
Model human conversational competence
Simulate human conversational behaviour
Provide tool for interacting with data, services, resources on computers
Research challenges
Applications in assistive and educational environments
Commercial opportunities
Commercial Systems
Focus on:
Business opportunities, return on investment (ROI)
Benefits for end users
Benefits for providers
Human factors: performance, usability
Tools and languages for design and maintainability
Application areas: call centre, enquiries, transactions, healthcare, …
Academic Systems
Focus on:
Technologies: speech recognition, spoken language understanding, dialogue management
AI inspired: planning, reasoning, machine learning
Statistical v symbolic approaches
Advanced dialogue control, error handling, adaptivity, context representation
Example 1: Voice Menu
System: Hello and welcome …. Main menu. For customer service, say ‘service’. To enquire about an existing order, say ‘order’ …
User: Service
System: Customer service. Would you like to report a fault or enquire about an extended warranty?
User: Fault
System: Do you have a PC or a laptop?
User: Laptop
System: And the name of the manufacturer?
User: Sony
System: Thank you. Please hold while I transfer you to the Sony …
http://www.speechstorm.com/
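A voice menu like the one in Example 1 can be read as a small finite-state machine: each state has a prompt and a set of keywords that lead to the next state. The sketch below is only an illustration of that idea. The state names, the `MENU` table and the `run_menu` helper are hypothetical and are not taken from the system shown in the example.

```python
# Hypothetical menu graph loosely following the Example 1 transcript.
# "*" marks a state that accepts any answer (here: the manufacturer name).
MENU = {
    "main": {"prompt": "For customer service, say 'service'. "
                       "To enquire about an existing order, say 'order'.",
             "next": {"service": "service", "order": "order"}},
    "service": {"prompt": "Would you like to report a fault or enquire "
                          "about an extended warranty?",
                "next": {"fault": "device", "warranty": "transfer"}},
    "device": {"prompt": "Do you have a PC or a laptop?",
               "next": {"pc": "maker", "laptop": "maker"}},
    "maker": {"prompt": "And the name of the manufacturer?",
              "next": {"*": "transfer"}},
    "order": {"prompt": "Order enquiries.", "next": {}},
    "transfer": {"prompt": "Please hold while I transfer you.", "next": {}},
}


def run_menu(menu, user_turns, start="main"):
    """Walk the menu graph and return the list of states visited."""
    state = start
    visited = [state]
    for utterance in user_turns:
        transitions = menu[state]["next"]
        if not transitions:  # terminal state: nothing more to ask
            break
        word = utterance.strip().lower()
        target = transitions.get(word, transitions.get("*"))
        if target is None:
            # Unrecognised keyword: stay in the same state (re-prompt).
            visited.append(state)
            continue
        state = target
        visited.append(state)
    return visited


print(run_menu(MENU, ["Service", "Fault", "Laptop", "Sony"]))
```

The design is deliberately rigid: the system holds the initiative throughout and the user can only choose among the keywords offered in each state.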
Example 2: Research System (Mercury: MIT)
Open-ended prompt
How may I help you?
Disfluencies in input
August twenty-first no August twelfth
I'd like to fly from Boston to Minneapolis on Tuesday no Wednesday November 21st
Inexact response
Prompt: Can you provide the approximate departure time or airline preference
User: Yeah I'd like to fly United and I'd like to leave in the afternoon
http://groups.csail.mit.edu/sls/research/mercury.shtml
Example 2: continued
Response generation
There are more than 3 flights.
The earliest departure leaves at 1.45 pm.
Mixed initiative: user asks question
Do you have something leaving around 4.45?
Relative date reference
I’d like to return the following Tuesday
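Two of the phenomena in Example 2 can be illustrated with a short sketch: a self-repair ("Tuesday no Wednesday") and a relative date ("the following Tuesday"). This is a naive illustration and not how Mercury handles them. The function names are hypothetical, the repair rule only covers weekday names, and "the following Tuesday" is assumed to mean the first Tuesday strictly after the outbound date.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

_DAY = "|".join(WEEKDAYS)
# "<weekday> no <weekday>": the speaker corrects the first weekday.
_REPAIR = re.compile(rf"\b(?:{_DAY})\s+no\s+({_DAY})\b", re.IGNORECASE)


def repair_weekday(utterance: str) -> str:
    """Keep only the corrected weekday in a '<day> no <day>' self-repair."""
    return _REPAIR.sub(r"\1", utterance)


def following_weekday(anchor: date, weekday: str) -> date:
    """Return the first date strictly after `anchor` that falls on `weekday`."""
    target = WEEKDAYS.index(weekday.lower())
    days_ahead = (target - anchor.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7
    return anchor + timedelta(days=days_ahead)


outbound = date(2001, 11, 21)  # assumed year in which 21 November is a Wednesday
print(repair_weekday("on Tuesday no Wednesday November 21st"))
print(following_weekday(outbound, "Tuesday"))
```

The relative date can only be resolved because the dialogue context holds the outbound date, which is one reason a dialogue system needs a context model.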
Example 3: Voice Search GOOG411
GOOG-411 (or Google Voice Local Search) is Google's new 411 service. With GOOG-411, you can find local business information completely free, directly from your phone. You can access 1-800-GOOG-411 from any phone, anywhere, at anytime.
http://www.google.com/goog411/
GOOG411: Prompts
What city and state?
What business name or category?
(Lists services) Number one, …..
Connects to requested service
GOOG411: What can you say?
At any point in the call:
To go back say "go back"
To start over say "start over" or press * (all phones)
When asked for a city and state:
Say the full names, for example "Palo Alto California"
To enter a zip code say it or enter with keypad
When asked for business name or category:
Say the full names, for example "Joe's Pizzaria" or "Pizza"
When given results:
To navigate between results say or press the listing number
To receive an SMS say "text message"
To receive a map say "map it"
To get more details say "details"
Architecture of a spoken dialogue system
[Figure: pipeline architecture. Audio -> Speech Recognition (ASR; HMM acoustic model, N-gram language model) -> Words -> Spoken Language Understanding (SLU) -> Concepts -> Dialogue Manager (DM; dialogue control, dialogue context model), connected to the Backend -> Response Generation -> Text to Speech Synthesis (TTS) -> Audio]
Notation:
a: user dialogue act (intended)
ã: user dialogue act (interpreted)
c: confidence
x_u: user acoustic signal
y_u: speech recognition hypothesis (words)
The user's intended act a is realised as the signal x_u; ASR outputs (y_u, c); SLU outputs (ã, c).
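The architecture above can be sketched as a chain of functions, one per component, with the confidence c carried along from ASR to SLU. This is a toy under stated assumptions: every class and function name is hypothetical, and the `toy_*` stubs only stand in for real components (the "audio" is simply UTF-8 text).

```python
from dataclasses import dataclass, field


@dataclass
class AsrHypothesis:
    """ASR output (y_u, c): recognised words plus a confidence score."""
    words: str
    confidence: float


@dataclass
class DialogueAct:
    """SLU output (ã, c): interpreted user act plus a confidence score."""
    act: str
    confidence: float
    slots: dict = field(default_factory=dict)


def run_turn(audio, asr, slu, dm, rg, tts):
    """One system turn: audio in, audio out, through the five components."""
    hypothesis = asr(audio)        # x_u -> (y_u, c)
    user_act = slu(hypothesis)     # (y_u, c) -> (ã, c)
    system_act = dm(user_act)      # choose the next system action
    text = rg(system_act)          # system action -> words
    return tts(text)               # words -> audio


# Toy stand-ins for the real components.
def toy_asr(audio: bytes) -> AsrHypothesis:
    return AsrHypothesis(audio.decode("utf-8").lower(), confidence=0.9)


def toy_slu(hyp: AsrHypothesis) -> DialogueAct:
    act = "request_flight" if "fly" in hyp.words else "unknown"
    return DialogueAct(act, hyp.confidence)


def toy_dm(user_act: DialogueAct) -> str:
    return "ask_destination" if user_act.act == "request_flight" else "ask_repeat"


def toy_rg(system_act: str) -> str:
    prompts = {"ask_destination": "Where would you like to fly to?",
               "ask_repeat": "Sorry, could you repeat that?"}
    return prompts[system_act]


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


print(run_turn(b"I want to fly", toy_asr, toy_slu, toy_dm, toy_rg, toy_tts))
```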
Component Technologies
Automatic Speech Recognition (ASR)
Spoken Language Understanding (SLU)
Response Generation (RG)
Text to speech synthesis (TTS)
Dialogue Management (DM)
Issues in ASR for Dialogue
recognising spontaneous speech in noisy environments
word accuracy does not have to be 100%
use of confidence scores in combination with other information to determine DM actions
use of additional information (ASR and parse probabilities, semantic and contextual features) to re-score recognition hypotheses
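The last two points can be illustrated with a short sketch: a confidence score mapped to a dialogue manager action, and an N-best list re-scored with a contextual feature. The thresholds, the linear weighting and the function names are illustrative assumptions, not values from any particular system.

```python
def choose_action(confidence, accept=0.8, reject=0.3):
    """Map an ASR confidence score to a DM action (thresholds are assumed)."""
    if confidence >= accept:
        return "accept"    # use the hypothesis without confirmation
    if confidence < reject:
        return "reject"    # ask the user to repeat
    return "confirm"       # ask an explicit confirmation question


def rescore(nbest, context_score, weight=0.5):
    """Pick the best hypothesis from (words, asr_score) pairs.

    `context_score` is any function mapping words to a number, e.g. how well
    the hypothesis fits what the dialogue context expects at this point.
    """
    return max(nbest, key=lambda hyp: hyp[1] + weight * context_score(hyp[0]))


# Hypothetical context: the backend only serves these cities.
served = {"boston", "minneapolis"}
nbest = [("austin", 0.60), ("boston", 0.55)]
best = rescore(nbest, lambda words: 1.0 if words in served else 0.0)
print(best, choose_action(best[1]))
```

Here the acoustically weaker hypothesis wins once context is taken into account, and its modest confidence leads the dialogue manager to confirm rather than accept outright.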
Issues in SLU for Dialogue
grammars and parsers for spontaneous speech (disfluencies, errors)
robust understanding
problems with hand-crafted approaches
use of statistical / data-driven methods
combined approaches, e.g. TINA (MIT): hand-crafted rules with trained probabilities
robust strategy: if full sentence cannot be parsed, parse and combine fragments, else use word spotting
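The robust strategy in the last point can be sketched as a three-level fallback. This is a toy built on regular expressions, not TINA: the patterns, the city list and the `understand` function are hypothetical and only show the order in which the levels are tried.

```python
import re

CITIES = {"boston", "minneapolis", "denver"}  # assumed domain vocabulary

# Level 1: a "full sentence" pattern (stand-in for a complete parse).
FULL = re.compile(r"^i'd like to fly from (\w+) to (\w+)$")
# Level 2: fragment patterns that can be parsed and combined independently.
FRAGMENTS = {"origin": re.compile(r"\bfrom (\w+)"),
             "destination": re.compile(r"\bto (\w+)")}


def understand(utterance):
    """Return (level_used, slots), trying the most precise level first."""
    text = utterance.lower().strip()

    match = FULL.match(text)
    if match:
        return "full", {"origin": match.group(1),
                        "destination": match.group(2)}

    slots = {}
    for name, pattern in FRAGMENTS.items():
        for fragment in pattern.finditer(text):
            if fragment.group(1) in CITIES:
                slots[name] = fragment.group(1)
                break
    if slots:
        return "fragments", slots

    # Level 3: word spotting, i.e. any known city anywhere in the input.
    spotted = [w for w in re.findall(r"\w+", text) if w in CITIES]
    if spotted:
        return "word_spotting", {"cities": spotted}
    return "none", {}


print(understand("I'd like to fly from Boston to Minneapolis"))
print(understand("uh from Boston er to Denver please"))
print(understand("Minneapolis"))
```

Each level yields less structure than the one before: word spotting finds a city but cannot say whether it is the origin or the destination, so the dialogue manager would have to ask.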
Issues in Response Generation for Dialogue
Content selection: determining what to say, selecting and ranking options
Discourse planning: discourse relations (e.g. comparison, contrast), user-adapted information, presentation ordering
Referring expression generation
Aggregation: grouping propositions into clauses and sentences
Use of discourse cues (e.g. firstly, finally, however, moreover, …)
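Aggregation and discourse cues can be illustrated with a very small template-based sketch. It groups flight propositions by airline into one clause each and then prefixes the clauses with cue words. The data, function names and templates are hypothetical; real response generators are considerably richer.

```python
def aggregate(flights):
    """Group single-flight propositions into one clause per airline."""
    by_airline = {}
    for flight in flights:
        by_airline.setdefault(flight["airline"], []).append(flight["departs"])
    clauses = []
    for airline, times in by_airline.items():
        noun = "flight" if len(times) == 1 else "flights"
        joined = " and ".join(times)
        clauses.append(f"{airline} has {len(times)} {noun} leaving at {joined}")
    return clauses


def add_cues(clauses):
    """Prefix clauses with discourse cues to signal their ordering."""
    if len(clauses) < 2:
        return [clause + "." for clause in clauses]
    sentences = []
    for i, clause in enumerate(clauses):
        if i == 0:
            cue = "Firstly"
        elif i == len(clauses) - 1:
            cue = "Finally"
        else:
            cue = "Moreover"
        sentences.append(f"{cue}, {clause}.")
    return sentences


# Made-up flight options for illustration only.
options = [{"airline": "United", "departs": "1.45 pm"},
           {"airline": "Delta", "departs": "3.10 pm"},
           {"airline": "United", "departs": "4.45 pm"}]
for sentence in add_cues(aggregate(options)):
    print(sentence)
```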
Issues in Dialogue Management
Dialogue control: scripts, frames, intelligent agents
Representations: Information State Theory
Error handling
Dialogue design: traditional approaches; statistical approaches (reinforcement learning; corpus / example-based approaches)
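Of the dialogue control styles listed above, frame-based control is the easiest to sketch: the system keeps a frame of slots, merges in whatever the user supplied, and asks for the first slot that is still empty. The slot names and functions below are hypothetical and stand for a hand-crafted ("traditional") design rather than a learned policy.

```python
FRAME_SLOTS = ["origin", "destination", "date"]  # assumed flight-booking frame


def update(frame, slu_slots):
    """Return a new frame with any recognised slot values merged in."""
    new_frame = dict(frame)
    for name, value in slu_slots.items():
        if name in FRAME_SLOTS:
            new_frame[name] = value
    return new_frame


def next_action(frame):
    """Ask for the first missing slot; query the backend once all are filled."""
    for slot in FRAME_SLOTS:
        if frame.get(slot) is None:
            return ("ask", slot)
    return ("query_backend", None)


frame = {}
print(next_action(frame))                      # nothing known yet
frame = update(frame, {"origin": "boston", "date": "tuesday"})
print(next_action(frame))                      # user over-answered; skip 'date'
frame = update(frame, {"destination": "minneapolis"})
print(next_action(frame))
```

Unlike a fixed script, the frame lets the user supply several values in one utterance or in any order, since the next question depends only on what is still missing.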
A vision for the future
Develop systems that can interact intelligently and co-operatively across a range of environments using a range of appropriate modalities to support people in the activities of their daily lives.
Fundamental research topics
Modelling human conversational competence
Dialogue-related issues for ASR, SLU, NLG, TTS
Comparison of methods for dialogue management: rule-based v stochastic
Representation and use of contextual information
Integration and usage of modalities to complement and supplement speech
Incremental processing in dialogue
Areas of application
Voice search
Dialogue in vehicles
Mobile speech applications
Multimodal embodied and situated systems
Troubleshooting applications
Dialogue systems for ambient intelligence and as assistive technologies
Concluding remarks
Spoken Dialogue Technology
embraces a range of speech and language technologies
poses lots of theoretical as well as practical challenges
is interesting for commercial developers as well as academic researchers
has a wide range of potential applications
Recommended reading
McTear, M. (2004) Spoken Dialogue Technology. Springer.
Lopez Cozar, R. & Araki, M. (2005) Spoken, multilingual and multimodal dialogue systems. John Wiley & Sons.
Aghajan, H., Augusto, J.C., Lopez Cozar, R. (2009) Human-Centric Interfaces for Ambient Intelligence. Elsevier.
Jokinen, K. & McTear, M. (2010) Spoken Dialogue Systems. Morgan & Claypool Publishers.
Wilks, Y. (ed.) (2010) Close Engagements with Artificial Companions: Key social, psychological, ethical and design issues. John Benjamins Publishing Company.
Thank you
Questions?