CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 10: Dialogue System Introduction and Frame-Based Dialogue Original slides by Dan Jurafsky



Dialog section

• May 3: Dialog introduction. Frame-based systems
• May 8: Human conversation. Reinforcement learning for dialog
• May 10: Deep learning for dialog (Jiwei)
• May 31: Dialog in industry (Alex Lebrun, founder of Wit.ai and Facebook M)

Outline

• Basic Conversational Agents
  • ASR
  • NLU
  • Generation
  • Dialogue Manager
• Dialogue Manager Design
  • Finite State
  • Frame-based
• Dialogue Design Considerations

Conversational Agents

• AKA:
  • Spoken Language Systems
  • Dialogue Systems
  • Speech Dialogue Systems
• Applications:
  • Travel arrangements (Amtrak, United Airlines)
  • Telephone call routing
  • Tutoring
  • Communicating with robots
  • Anything with limited screen/keyboard

Conversational systems

• Amazon Echo (2015)
• Google Home (2016)
• Facebook M (2015)
• Apple Siri (2011)
• Google Assistant (2016)
• Microsoft Cortana (2014)
• SlackBot (2015)

A travel dialog: Communicator, Xu and Rudnicky (2000)

Call routing: AT&T HMIHY, Gorin et al. (1997)

A tutorial dialogue: ITSPOKE, Litman and Silliman (2004)

Conversational Agent Design Issues

• Time to response (synchronous?)
• Task complexity
  • What time is it?
  • Book me a flight and hotel for vacation in Greece
• Interaction complexity / number of turns
  • Single command/response
  • “I want new shoes” What kind? What color? What size?
• Initiative
  • User, System, Mixed
• Interaction modality
  • Purely spoken, purely text, mixing speech/text/media

Spoken Synchronous Personal Assistants

• Siri
• Google Now
• Microsoft Cortana
• Amazon Alexa

Dialogue System Architecture

Dialog architecture for Personal Assistants (Bellegarda)

Dialog architecture for Personal Assistants

Dialogue Manager

• Controls the architecture and structure of dialogue
  • Takes input from ASR/NLU components
  • Maintains some sort of state
  • Interfaces with Task Manager
  • Passes output to NLG/TTS modules

Possible architectures for dialog management

• Finite State
• Frame-based
• Information State (Markov Decision Process)
• Classic AI Planning
• Distributional / neural network

Finite-State Dialog Management

Consider a trivial airline travel system:
• Ask the user for a departure city
• Ask for a destination city
• Ask for a time
• Ask whether the trip is round-trip or not

Finite State Dialog Manager

Finite-state dialog managers

• System completely controls the conversation with the user.
• It asks the user a series of questions
• Ignoring (or misinterpreting) anything the user says that is not a direct answer to the system’s questions
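The behaviour described above can be sketched in a few lines of Python. This is only an illustration: the state names and prompt wordings are assumptions, since the slides give just the order of the four questions.

```python
# Hypothetical state names and prompts; the slides only give the question order.
STATES = [
    ("origin", "What city are you leaving from?"),
    ("destination", "Where are you going?"),
    ("time", "What time would you like to leave?"),
    ("round_trip", "Is this a round trip?"),
]

def run_finite_state_dialog(user_turns):
    """Walk the states in fixed order, consuming one user turn per state."""
    answers = {}
    turns = iter(user_turns)
    for slot, prompt in STATES:
        # The system speaks first in every state (system initiative).
        print("System:", prompt)
        reply = next(turns)
        # Whatever the user says is stored as the answer to *this* question,
        # even if it really answers a different one.
        answers[slot] = reply
    return answers

result = run_finite_state_dialog(["Boston", "San Francisco", "morning", "no"])
```

Because each state accepts exactly one answer, a reply such as "from Boston to San Francisco" would be stored whole under `origin` and the destination question would still be asked.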

Dialogue Initiative

• Systems that control conversation like this are system initiative or single initiative.
• Initiative: who has control of conversation
• In normal human-human dialogue, initiative shifts back and forth between participants.

System Initiative

System completely controls the conversation

Advantages (+):
• Simple to build
• User always knows what they can say next
• System always knows what user can say next
  • Known words: better performance from ASR
  • Known topic: better performance from NLU
• OK for VERY simple tasks (entering a credit card, or login name and password)

Disadvantages (-):
• Too limited

Problems with System Initiative

• Real dialogue involves give and take!
• In travel planning, users might want to say something that is not the direct answer to the question.
• For example, answering more than one question in a sentence:
    Hi, I’d like to fly from Seattle Tuesday morning
    I want a flight from Milwaukee to Orlando one way leaving after 5 p.m. on Wednesday.

Single initiative + universals

• We can give users a little more flexibility by adding universals: commands you can say anywhere
• As if we augmented every state of the FSA with these:
  • Help
  • Start over
  • Correct
• This describes many implemented systems
• But still doesn’t allow user much flexibility
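One way to picture universals is a check that runs before the normal per-state handling. The sketch below is illustrative only: the state names and the returned action strings are hypothetical, and treating "Correct" as "go back one question" is an assumption, since the slides do not define its behaviour.

```python
QUESTIONS = ["origin", "destination", "time"]  # hypothetical state names

def step(state_index, answers, utterance):
    """Return (next_state_index, answers, system_action) for one user turn."""
    text = utterance.strip().lower()
    # Universals are checked first, in every state.
    if text == "help":
        return state_index, answers, "explain " + QUESTIONS[state_index]
    if text == "start over":
        return 0, {}, "restart"
    if text == "correct":
        # Assumed meaning: discard the previous answer and re-ask that question.
        prev = max(state_index - 1, 0)
        kept = {k: v for k, v in answers.items() if k != QUESTIONS[prev]}
        return prev, kept, "re-ask " + QUESTIONS[prev]
    # Otherwise behave like the plain finite-state manager.
    updated = dict(answers, **{QUESTIONS[state_index]: utterance})
    return state_index + 1, updated, "next"
```

The underlying automaton is unchanged, which is why this adds only a little flexibility: the user can interrupt, but still cannot answer questions out of order.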

User Initiative

• User directs the system
  • Asks a single question, system answers
• Examples: voice web search
• But system can’t:
  • ask questions back,
  • engage in clarification dialogue,
  • engage in confirmation dialogue

Mixed Initiative

• Conversational initiative can shift between system and user
• Simplest kind of mixed initiative: use the structure of the frame to guide dialogue

An example of a frame

FLIGHT FRAME:
  ORIGIN:
    CITY: Boston
    DATE: Tuesday
    TIME: morning
  DEST:
    CITY: San Francisco
  AIRLINE:
    …

Mixed Initiative

Slot        Question
ORIGIN      What city are you leaving from?
DEST        Where are you going?
DEPT DATE   What day would you like to leave?
DEPT TIME   What time would you like to leave?
AIRLINE     What is your preferred airline?

Frames are mixed-initiative

• User can answer multiple questions at once.
• System asks questions of user, filling any slots that user specifies
  • When frame is filled, do database query
• If user answers 3 questions at once, system has to fill slots and not ask these questions again!
  • Avoids strict constraints on order of the finite-state architecture.
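A minimal sketch of this slot-filling loop follows. The slot names and questions come from the slot/question table earlier in the lecture; the regular-expression patterns, the city list, and the airline list are toy assumptions standing in for a real NLU component.

```python
import re

# Slot -> question, from the slot/question table.
QUESTIONS = {
    "ORIGIN": "What city are you leaving from?",
    "DEST": "Where are you going?",
    "DEPT DATE": "What day would you like to leave?",
    "DEPT TIME": "What time would you like to leave?",
    "AIRLINE": "What is your preferred airline?",
}

# Hypothetical toy patterns; a real system would use a trained or hand-built NLU.
CITY = r"(Boston|San Francisco|Seattle|Milwaukee|Orlando)"
PATTERNS = {
    "ORIGIN": re.compile(r"\bfrom " + CITY),
    "DEST": re.compile(r"\bto " + CITY),
    "DEPT DATE": re.compile(r"\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"),
    "DEPT TIME": re.compile(r"\b(morning|afternoon|evening)\b"),
    "AIRLINE": re.compile(r"\b(United|Delta|American)\b"),
}

def fill_slots(frame, utterance):
    """Fill every still-empty slot that the utterance mentions, in any order."""
    for slot, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match and frame.get(slot) is None:
            frame[slot] = match.group(1)
    return frame

def next_question(frame):
    """Ask about the first unfilled slot; None means the frame is complete."""
    for slot, question in QUESTIONS.items():
        if frame.get(slot) is None:
            return question
    return None

frame = {slot: None for slot in QUESTIONS}
fill_slots(frame, "I want a flight from Boston to San Francisco on Tuesday")
question = next_question(frame)
```

After the single utterance above, three slots are filled, so the next question skips straight to the departure time instead of re-asking for origin, destination, or date.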

Multiple frames

• flights, hotels, rental cars
• Flight legs: each flight can have multiple legs, which might need to be discussed separately
• Presenting the flights (if there are multiple flights meeting user’s constraints)
  • It has slots like 1ST_FLIGHT or 2ND_FLIGHT so user can ask “how much is the second one”
• General route information:
  • Which airlines fly from Boston to San Francisco
• Airfare practices:
  • Do I have to stay over Saturday to get a decent airfare?

Natural Language Understanding

• There are many ways to represent the meaning of sentences
• For speech dialogue systems, most common is “Frame and slot semantics”.

An example of a frame

Show me morning flights from Boston to SF on Tuesday.

SHOW:
  FLIGHTS:
    ORIGIN:
      CITY: Boston
      DATE: Tuesday
      TIME: morning
    DEST:
      CITY: San Francisco

Semantics for a sentence

LIST      FLIGHTS   ORIGIN        DESTINATION        DEPARTDATE   DEPARTTIME
Show me   flights   from Boston   to San Francisco   on Tuesday   morning

Idea: HMMs for semantics

• Hidden units are slot names
  • ORIGIN
  • DESTCITY
  • DEPARTTIME
• Observations are word sequences
  • on Tuesday

HMM model of semantics, Pieraccini et al. (1991)

Semantic HMM

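• Goal of HMM model: to compute the labeling of semantic roles C = c_1, c_2, …, c_n (C for ‘cases’ or ‘concepts’) that is most probable given words W

\[
\begin{aligned}
\arg\max_{C} P(C \mid W)
  &= \arg\max_{C} \frac{P(W \mid C)\,P(C)}{P(W)} \\
  &= \arg\max_{C} P(W \mid C)\,P(C) \\
  &= \arg\max_{C} \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_1, C)\; P(w_1 \mid C) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_1)
\end{aligned}
\]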

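Semantic HMM

• From previous slide:

\[
\arg\max_{C} \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_1, C)\; P(w_1 \mid C) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_1)
\]

• Assume simplification:

\[
P(w_i \mid w_{i-1} \ldots w_1, C) = P(w_i \mid w_{i-1}, \ldots, w_{i-N+1}, c_i)
\]

\[
P(c_i \mid c_{i-1} \ldots c_1) = P(c_i \mid c_{i-1}, \ldots, c_{i-M+1})
\]

• Final form:

\[
\arg\max_{C} \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_{i-N+1}, c_i)\; P(w_1 \mid C) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_{i-M+1})
\]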

semi-HMM model of semantics, Pieraccini et al. (1991)

P(W|C) = P(me | show, SHOW) P(show | SHOW) P(flights | FLIGHTS) … P(FLIGHTS | SHOW) P(DUMMY | FLIGHTS) …

Semi-HMMs

• Each hidden state
  • Can generate multiple observations
• By contrast, a traditional HMM
  • One observation per hidden state
  • Need to loop to have multiple observations with the same state label

How to train

• Supervised training
  • Label and segment each sentence with frame fillers
  • Essentially learning an N-gram grammar for each slot

LIST      FLIGHTS   DUMMY     ORIGIN        DEST
Show me   flights   that go   from Boston   to SF
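As a rough illustration of this supervised training, the sketch below counts concept-to-concept transitions and, within each concept, word bigrams from the one labeled sentence above, then turns the counts into maximum-likelihood estimates. The bigram order, the `<s>` start symbol, the lowercasing of "Show", and the absence of smoothing are all simplifying assumptions, not details taken from Pieraccini et al.

```python
from collections import Counter, defaultdict

# The labeled, segmented sentence from the slide as (concept, words) pairs.
labeled = [
    ("LIST", ["show", "me"]),
    ("FLIGHTS", ["flights"]),
    ("DUMMY", ["that", "go"]),
    ("ORIGIN", ["from", "Boston"]),
    ("DEST", ["to", "SF"]),
]

def train(sentences):
    """Count concept bigrams and, per concept, word bigrams."""
    concept_bigrams = defaultdict(Counter)  # previous concept -> next-concept counts
    word_bigrams = defaultdict(Counter)     # (concept, previous word) -> word counts
    for segments in sentences:
        prev_concept = "<s>"
        for concept, words in segments:
            concept_bigrams[prev_concept][concept] += 1
            prev_concept = concept
            prev_word = "<s>"  # each segment restarts its own word history
            for word in words:
                word_bigrams[(concept, prev_word)][word] += 1
                prev_word = word
    return concept_bigrams, word_bigrams

def prob(counter, item):
    """Maximum-likelihood estimate; 0.0 if the context was never observed."""
    total = sum(counter.values())
    return counter[item] / total if total else 0.0

concept_bigrams, word_bigrams = train([labeled])
p_transition = prob(concept_bigrams["LIST"], "FLIGHTS")  # P(FLIGHTS | LIST)
p_word = prob(word_bigrams[("LIST", "show")], "me")      # P(me | show, LIST)
```

With a single training sentence every observed probability is 1.0; a realistic corpus and smoothing would be needed before these estimates could be used for decoding.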

Another way to do NLU: Semantic Grammars

• CFG in which the LHS of rules is a semantic category:

LIST        -> show me | I want | can I see | …
DEPARTTIME  -> (after | around | before) HOUR | morning | afternoon | evening
HOUR        -> one | two | three … | twelve (am | pm)
FLIGHTS     -> (a) flight | flights
ORIGIN      -> from CITY
DESTINATION -> to CITY
CITY        -> Boston | San Francisco | Denver | Washington
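To show how rules like these attach semantic categories to spans of text, here is a flat regular-expression approximation in Python. It is not a CFG parser: each rule is compiled to a pattern and matched independently. Only alternatives actually listed on the slide are included, so the elided parts of LIST and HOUR are left out rather than guessed.

```python
import re

# Sub-patterns for the non-terminals CITY and HOUR (listed alternatives only).
CITY = r"(?:Boston|San Francisco|Denver|Washington)"
HOUR = r"(?:one|two|three|twelve)(?: (?:am|pm))?"

# (semantic category, pattern) pairs mirroring the grammar rules.
RULES = [
    ("LIST", r"show me|I want|can I see"),
    ("FLIGHTS", r"(?:a )?flight|flights"),
    ("ORIGIN", r"from " + CITY),
    ("DESTINATION", r"to " + CITY),
    ("DEPARTTIME", r"(?:after|around|before) " + HOUR + r"|morning|afternoon|evening"),
]

def parse(utterance):
    """Return (category, matched text) pairs in left-to-right order."""
    found = []
    for category, pattern in RULES:
        # Word boundaries keep "flight" from matching inside "flights".
        for m in re.finditer(r"\b(?:" + pattern + r")\b", utterance):
            found.append((m.start(), category, m.group(0)))
    return [(category, text) for _, category, text in sorted(found)]

labels = parse("show me flights from Boston to San Francisco")
```

A real semantic grammar would build a parse tree so that CITY appears as a child of ORIGIN or DESTINATION; the flat version only recovers the top-level categories.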

TINA parse tree with semantic rules, Seneff (1992)

Phoenix SLU system: Recursive Transition Network

Ward (1991), figure from Wang, Deng, Acero

Modern Approach: Semantic Parsing

• System translates natural language into logical forms
• System can act on structured logical forms
• Modern approaches mix hand-engineered grammar generation with machine learning to map input text to output structured form

Semantic Parsing Output: Database Query

• Directly map natural language to database queries
• Potentially time-consuming to build/train for a new schema, but a clean, clear formalism

Slide from Bill MacCartney, CS224U

Semantic Parsing Output: Procedural Languages

• Express concepts, nested states, or action sequences
• Designing the set of possible actions and composition rules can get very complex
• How much can a user reasonably specify in one utterance?

Slide from Bill MacCartney, CS224U

Semantic Parsing Output: Intents and Arguments

• Personal assistant voice commands are simple and need to scale to many domains
• Simplicity helps with robustness and scale: just recognize what action is requested and what arguments that action requires

Slide from Bill MacCartney, CS224U

Semantic Parsing Approach Outline

• Very active area of research
• Define possible syntactic structures using a context-free grammar
• Construct semantics bottom-up, following syntactic structure
• Score parses with a (log-linear) model that was fit on training (input, action/output) pairs
• Use external annotators to recognize names, dates, places, etc.
• Grammar induction if possible, or lots of grammar engineering

Slide from Bill MacCartney, CS224U

A final way to do NLU: Condition-Action Rules

• Active Ontology: relational network of concepts
  • data structures: a meeting has
    • a date and time,
    • a location,
    • a topic
    • a list of attendees
  • rule sets that perform actions for concepts
    • the date concept turns the string
      • “Monday at 2pm” into
      • date object date(DAY, MONTH, YEAR, HOURS, MINUTES)

Rule sets

• Collections of rules consisting of:
  • condition
  • action
• When user input is processed, facts are added to the store and
  • rule conditions are evaluated
  • relevant actions executed

Part of ontology for meeting task

[Figure: ontology graph for the meeting task; edge types: has-a, may-have-a]

meeting concept: if you don’t yet have a location, ask for a location
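The rule-set mechanism can be sketched as a list of (condition, action) pairs evaluated against a fact store. The location rule is the one given for the meeting concept; the date rule, the fact keys, and the action strings are hypothetical additions for illustration.

```python
def needs_location(facts):
    """Condition from the slide: a meeting without a location yet."""
    return facts.get("concept") == "meeting" and "location" not in facts

def needs_date(facts):
    """Hypothetical second rule, added only to show several rules coexisting."""
    return facts.get("concept") == "meeting" and "date" not in facts

# Each rule is a (condition, action) pair; actions are plain strings here.
RULES = [
    (needs_location, "ask for a location"),
    (needs_date, "ask for a date and time"),
]

def process(facts, new_facts):
    """Add facts to the store, then return the actions of every rule that fires."""
    facts.update(new_facts)
    return [action for condition, action in RULES if condition(facts)]

store = {}
actions = process(store, {"concept": "meeting", "date": "Monday at 2pm"})
```

Once a later turn adds a `location` fact, no condition holds any more and `process` returns an empty list.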

Other components

ASR: Language Models for Dialogue

• Often based on hand-written context-free or finite-state grammars rather than N-grams
• Why?
  • Need for understanding; we need to constrain user to say things that we know what to do with.

ASR: Language Models for Dialogue

• We can have an LM specific to a dialogue state
• If system just asked “What city are you departing from?”
• LM can be
  • City names only
  • FSA: (I want to (leave|depart)) (from) [CITYNAME]
  • N-grams trained on answers to “City name” questions from labeled data
• An LM that is constrained in this way is technically called a “restricted grammar” or “restricted LM”
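The FSA option can be imitated with a regular expression selected by dialogue state. In this sketch the parenthesised parts of the FSA are treated as optional, and the state name, the city list, and the `recognize` helper are assumptions made for illustration; a real recognizer would apply the grammar during decoding rather than to a finished transcript.

```python
import re

CITY_NAMES = ["Boston", "San Francisco", "Denver"]  # hypothetical city list

# (I want to (leave|depart)) (from) [CITYNAME], with parenthesised parts optional.
city_alternatives = "|".join(re.escape(city) for city in CITY_NAMES)
DEPARTURE_GRAMMAR = re.compile(
    r"(?:I want to (?:leave|depart) )?(?:from )?(" + city_alternatives + r")",
    re.IGNORECASE,
)

# One restricted grammar per dialogue state (only one state shown).
GRAMMARS = {"ask_departure_city": DEPARTURE_GRAMMAR}

def recognize(dialogue_state, hypothesis):
    """Return the city if the whole utterance fits the state's grammar, else None."""
    match = GRAMMARS[dialogue_state].fullmatch(hypothesis.strip())
    return match.group(1) if match else None
```

For example, `recognize("ask_departure_city", "I want to leave from Boston")` yields the city, while an off-topic utterance is rejected because it falls outside the grammar for that state.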

Generation Component

• Content Planner
  • Decides what content to express to user (ask a question, present an answer, etc.)
  • Often merged with dialogue manager
• Language Generation
  • Chooses syntax and words
• TTS
• In practice: template-based, with most words prespecified
    What time do you want to leave CITY-ORIG?
    Will you return to CITY-ORIG from CITY-DEST?
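Template-based generation of this kind amounts to string substitution from the frame. A small sketch follows; the dialogue-act names are hypothetical, and the slot placeholders are written with underscores (`CITY_ORIG`) instead of the slide's hyphens for readability in Python format strings.

```python
# Templates from the slide, keyed by hypothetical dialogue-act names.
TEMPLATES = {
    "ask_depart_time": "What time do you want to leave {CITY_ORIG}?",
    "ask_return": "Will you return to {CITY_ORIG} from {CITY_DEST}?",
}

def generate(act, frame):
    """Fill the template for a dialogue act with values from the frame."""
    return TEMPLATES[act].format(**frame)

frame = {"CITY_ORIG": "Boston", "CITY_DEST": "San Francisco"}
sentence = generate("ask_return", frame)
# sentence == "Will you return to Boston from San Francisco?"
```

Everything except the slot values is fixed in advance, which is what makes the approach predictable but inflexible.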

More sophisticated language generation component

• Natural Language Generation
• Approach:
  • Dialogue manager builds representation of meaning of utterance to be expressed
  • Passes this to a “generator”
  • Generators have three components
    • Sentence planner
    • Surface realizer
    • Prosody assigner

Architecture of a generator for a dialogue system

(Walker and Rambow 2002)

HCI constraints on generation for dialogue: “Coherence”

Discourse markers and pronouns (“Coherence”):

Without discourse markers:
  Please say the date. …
  Please say the start time. …
  Please say the duration… …
  Please say the subject…

With discourse markers:
  First, tell me the date. …
  Next, I’ll need the time it starts. …
  Thanks. <pause> Now, how long is it supposed to last? …
  Last of all, I just need a brief description

HCI constraints on generation for dialogue: coherence (II): tapered prompts

Prompts which get incrementally shorter:

System: Now, what’s the first company to add to your watch list?
Caller: Cisco
System: What’s the next company name? (Or, you can say, “Finished”)
Caller: IBM
System: Tell me the next company name, or say, “Finished.”
Caller: Intel
System: Next one?
Caller: America Online.
System: Next?
Caller: …
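One simple way to implement tapering is to index a prompt list by turn number and stay on the last, shortest prompt once the list runs out. The sketch below reuses the system prompts from the dialogue above; the function name and the 0-based turn counter are assumptions.

```python
# System prompts from the watch-list dialogue, in the order they are used.
TAPERED_PROMPTS = [
    "Now, what's the first company to add to your watch list?",
    "What's the next company name? (Or, you can say, \"Finished\")",
    "Tell me the next company name, or say, \"Finished.\"",
    "Next one?",
    "Next?",
]

def prompt_for_turn(turn_index):
    """Return the prompt for a 0-based turn; keep the shortest prompt afterwards."""
    return TAPERED_PROMPTS[min(turn_index, len(TAPERED_PROMPTS) - 1)]

first = prompt_for_turn(0)
later = prompt_for_turn(10)
```

Here `first` is the full opening question and `later` is the terse "Next?", so repeat callers are not made to sit through the long instructions on every turn.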

How mixed initiative is usually defined

• First we need to define two other factors
  • Open prompts vs. directive prompts
  • Restrictive versus non-restrictive grammar

Open vs. Directive Prompts

• Open prompt
  • System gives user very few constraints
  • User can respond how they please: “How may I help you?” “How may I direct your call?”
• Directive prompt
  • Explicitly instructs user how to respond: “Say yes if you accept the call; otherwise, say no”

Restrictive vs. Non-restrictive grammars

• Restrictive grammar
  • Language model which strongly constrains the ASR system, based on dialogue state
• Non-restrictive grammar
  • Open language model which is not restricted to a particular dialogue state

Definition of Mixed Initiative

Grammar           Open Prompt          Directive Prompt
Restrictive       Doesn’t make sense   System Initiative
Non-restrictive   User Initiative      Mixed Initiative

Evaluation

1. Slot Error Rate for a Sentence

   (# of inserted/deleted/substituted slots) / (# of total reference slots for sentence)

2. End-to-end evaluation (Task Success)

Evaluation Metrics

“Make an appointment with Chris at 10:30 in Gates 104”

Slot     Filler
PERSON   Chris
TIME     11:30 a.m.
ROOM     Gates 104

Slot error rate: 1/3
Task success: At end, was the correct meeting added to the calendar?
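The 1/3 figure can be reproduced with a short function. This is a sketch under the assumption that slots are compared as exact strings, slot by slot; the reference fillers are read off the spoken utterance, and the hypothesis is the slot table the system produced.

```python
def slot_error_rate(reference, hypothesis):
    """(# inserted + deleted + substituted slots) / (# reference slots)."""
    substituted = sum(
        1 for slot in reference
        if slot in hypothesis and hypothesis[slot] != reference[slot]
    )
    deleted = sum(1 for slot in reference if slot not in hypothesis)
    inserted = sum(1 for slot in hypothesis if slot not in reference)
    return (substituted + deleted + inserted) / len(reference)

# Reference fillers taken from the utterance (assumed string forms).
reference = {"PERSON": "Chris", "TIME": "10:30", "ROOM": "Gates 104"}
# Fillers the system extracted, as shown in the slot table.
hypothesis = {"PERSON": "Chris", "TIME": "11:30 a.m.", "ROOM": "Gates 104"}

ser = slot_error_rate(reference, hypothesis)
```

Only the TIME slot is wrong (one substitution out of three reference slots), which gives the slot error rate of 1/3, and the task would also fail because the meeting lands at the wrong time.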