andrew maas stanford university spring 2017 lecture 10...
TRANSCRIPT
CS 224S / LINGUIST 285 Spoken Language Processing
Andrew Maas, Stanford University
Spring 2017
Lecture 10: Dialogue System Introduction and Frame-Based Dialogue
Original slides by Dan Jurafsky
Dialog section
• May 3: Dialog introduction. Frame-based systems
• May 8: Human conversation. Reinforcement learning for dialog
• May 10: Deep learning for dialog (Jiwei)
• May 31: Dialog in industry (Alex Lebrun, Founder of Wit.ai and Facebook M)
Outline
• Basic Conversational Agents
  • ASR
  • NLU
  • Generation
  • Dialogue Manager
• Dialogue Manager Design
  • Finite State
  • Frame-based
• Dialogue Design Considerations
Conversational Agents
• AKA:
  • Spoken Language Systems
  • Dialogue Systems
  • Speech Dialogue Systems
• Applications:
  • Travel arrangements (Amtrak, United Airlines)
  • Telephone call routing
  • Tutoring
  • Communicating with robots
  • Anything with limited screen/keyboard
Conversational systems
• Amazon Echo (2015)
• Google Home (2016)
• Facebook M (2015)
• Apple Siri (2011)
• Google Assistant (2016)
• Microsoft Cortana (2014)
• SlackBot (2015)
Conversational Agent Design Issues
• Time to response (Synchronous?)
• Task complexity
  • What time is it?
  • Book me a flight and hotel for vacation in Greece
• Interaction complexity / number of turns
  • Single command/response
  • “I want new shoes” What kind? What color? What size?
• Initiative
  • User, System, Mixed
• Interaction modality
  • Purely spoken, Purely text, Mixing speech/text/media
Dialogue Manager
• Controls the architecture and structure of dialogue
  • Takes input from ASR/NLU components
  • Maintains some sort of state
  • Interfaces with Task Manager
  • Passes output to NLG/TTS modules
Possible architectures for dialog management
• Finite State
• Frame-based
• Information State (Markov Decision Process)
• Classic AI Planning
• Distributional / neural network
Finite-State Dialog Management
Consider a trivial airline travel system:
• Ask the user for a departure city
• Ask for a destination city
• Ask for a time
• Ask whether the trip is round-trip or not
Finite-state dialog managers
• System completely controls the conversation with the user.
• It asks the user a series of questions
• Ignoring (or misinterpreting) anything the user says that is not a direct answer to the system’s questions
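A finite-state manager of this kind can be sketched in a few lines. The state names, prompts, and slot names below are hypothetical, chosen only to mirror the trivial airline system; this is an illustration under those assumptions, not the lecture's implementation.

```python
# Hypothetical state table for the trivial airline system.
# Each state maps to (prompt, slot that stores the answer, next state).
STATES = {
    "ask_origin": ("What city are you leaving from?", "origin", "ask_dest"),
    "ask_dest": ("Where are you going?", "dest", "ask_time"),
    "ask_time": ("What time would you like to leave?", "time", "ask_round_trip"),
    "ask_round_trip": ("Is this a round trip?", "round_trip", "done"),
}


def run_dialog(answers):
    """Walk the states in a fixed order, consuming one user answer per question."""
    state, filled, transcript = "ask_origin", {}, []
    answers = iter(answers)
    while state != "done":
        prompt, slot, next_state = STATES[state]
        transcript.append(prompt)
        # The whole reply is stored as the answer to the current question;
        # anything else the user said is never interpreted.
        filled[slot] = next(answers)
        state = next_state
    return filled, transcript


filled, transcript = run_dialog(["Boston", "San Francisco", "morning", "no"])
```

Because the order of questions is hard-wired into the transitions, a user who volunteers the destination while answering the first question gains nothing: the system still asks for it.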
Dialogue Initiative
• Systems that control conversation like this are system initiative or single initiative.
• Initiative: who has control of conversation
• In normal human-human dialogue, initiative shifts back and forth between participants.
System Initiative
System completely controls the conversation
+ Simple to build
+ User always knows what they can say next
+ System always knows what user can say next
  • Known words: Better performance from ASR
  • Known topic: Better performance from NLU
+ OK for VERY simple tasks (entering a credit card, or login name and password)
- Too limited
Problems with System Initiative
• Real dialogue involves give and take!
• In travel planning, users might want to say something that is not the direct answer to the question.
• For example, answering more than one question in a sentence:
  Hi, I’d like to fly from Seattle Tuesday morning
  I want a flight from Milwaukee to Orlando one way leaving after 5 p.m. on Wednesday.
Single initiative + universals
• We can give users a little more flexibility by adding universals: commands you can say anywhere
• As if we augmented every state of the FSA with these:
  Help
  Start over
  Correct
• This describes many implemented systems
• But still doesn’t allow user much flexibility
User Initiative
• User directs the system
  • Asks a single question, system answers
• Examples: Voice web search
• But system can’t:
  • ask questions back,
  • engage in clarification dialogue,
  • engage in confirmation dialogue
Mixed Initiative
• Conversational initiative can shift between system and user
• Simplest kind of mixed initiative: use the structure of the frame to guide dialogue
An example of a frame
FLIGHT FRAME:
  ORIGIN:
    CITY: Boston
    DATE: Tuesday
    TIME: morning
  DEST:
    CITY: San Francisco
  AIRLINE:
    …
Slot      Question
ORIGIN    What city are you leaving from?
DEST      Where are you going?
DEPTDATE  What day would you like to leave?
DEPTTIME  What time would you like to leave?
AIRLINE   What is your preferred airline?
Frames are mixed-initiative
• User can answer multiple questions at once.
• System asks questions of user, filling any slots that user specifies
  • When frame is filled, do database query
• If user answers 3 questions at once, system has to fill slots and not ask these questions again!
  • Avoids strict constraints on order of the finite-state architecture.
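The slot-filling loop described above might look roughly like the sketch below. The questions come from the slot/question table; the regular-expression extractors and the city list are toy assumptions standing in for a real NLU component.

```python
import re

# Slot -> question, as in the slot/question table.
QUESTIONS = {
    "ORIGIN": "What city are you leaving from?",
    "DEST": "Where are you going?",
    "DEPTDATE": "What day would you like to leave?",
    "DEPTTIME": "What time would you like to leave?",
    "AIRLINE": "What is your preferred airline?",
}

# Toy extraction patterns (assumptions for illustration only).
CITY = r"(Boston|San Francisco|Seattle|Milwaukee|Orlando)"
PATTERNS = {
    "ORIGIN": re.compile(r"\bfrom " + CITY),
    "DEST": re.compile(r"\bto " + CITY),
    "DEPTDATE": re.compile(
        r"\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"
    ),
    "DEPTTIME": re.compile(r"\b(morning|afternoon|evening)\b"),
}


def update_frame(frame, utterance):
    """Fill every still-empty slot that the utterance mentions."""
    for slot, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match and slot not in frame:
            frame[slot] = match.group(1)
    return frame


def next_question(frame):
    """Ask about the first unfilled slot; None means the frame is full."""
    for slot, question in QUESTIONS.items():
        if slot not in frame:
            return question
    return None  # frame is filled: time to run the database query


frame = update_frame({}, "Hi, I'd like to fly from Seattle Tuesday morning")
question = next_question(frame)
```

Here one utterance fills ORIGIN, DEPTDATE, and DEPTTIME, so the next prompt skips straight to the destination rather than re-asking for the origin.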
Multiple frames
• flights, hotels, rental cars
• Flight legs: Each flight can have multiple legs, which might need to be discussed separately
• Presenting the flights (if there are multiple flights meeting the user’s constraints)
  • It has slots like 1ST_FLIGHT or 2ND_FLIGHT so user can ask “how much is the second one”
• General route information:
  • Which airlines fly from Boston to San Francisco
• Airfare practices:
  • Do I have to stay over Saturday to get a decent airfare?
Natural Language Understanding
• There are many ways to represent the meaning of sentences
• For speech dialogue systems, most common is “Frame and slot semantics”.
An example of a frame
Show me morning flights from Boston to SF on Tuesday.
SHOW:
  FLIGHTS:
    ORIGIN:
      CITY: Boston
      DATE: Tuesday
      TIME: morning
    DEST:
      CITY: San Francisco
Semantics for a sentence
LIST         Show me
FLIGHTS      flights
ORIGIN       from Boston
DESTINATION  to San Francisco
DEPARTDATE   on Tuesday
DEPARTTIME   morning
Idea: HMMs for semantics
• Hidden units are slot names
  • ORIGIN
  • DESTCITY
  • DEPARTTIME
• Observations are word sequences
  • on Tuesday
Semantic HMM
• Goal of HMM model: to compute the labeling of semantic roles C = c_1, c_2, …, c_n (C for ‘cases’ or ‘concepts’) that is most probable given words W

$$\arg\max_C P(C \mid W) = \arg\max_C \frac{P(W \mid C)\,P(C)}{P(W)}$$

$$= \arg\max_C P(W \mid C)\,P(C)$$

$$= \arg\max_C \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_1, C)\,P(w_1 \mid C) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_1)$$
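Semantic HMM
• From previous slide:

$$= \arg\max_C \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_1, C)\,P(w_1 \mid C) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_1)$$

• Assume simplification:

$$P(w_i \mid w_{i-1} \ldots w_1, C) = P(w_i \mid w_{i-1}, \ldots, w_{i-N+1}, c_i)$$

$$P(c_i \mid c_{i-1} \ldots c_1) = P(c_i \mid c_{i-1}, \ldots, c_{i-M+1})$$

• Final form:

$$= \arg\max_C \prod_{i=2}^{N} P(w_i \mid w_{i-1} \ldots w_{i-N+1}, c_i) \prod_{i=2}^{M} P(c_i \mid c_{i-1} \ldots c_{i-M+1})$$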
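To make the argmax concrete, here is a small Viterbi decoder for one simple instance of this model. It assumes bigram word probabilities within a slot, P(w_i | w_{i-1}, c_i), a separate slot-initial word probability P(w_i | c_i) when the label changes, and bigram label transitions P(c_i | c_{i-1}). All probability tables are invented toy values, and the probability floor for unseen events is an assumption, not part of the lecture.

```python
import math

FLOOR = 1e-6  # assumed probability for unseen events

# Toy tables (invented for illustration).
INITIAL = {  # P(word | label) for the first word of a slot segment
    "FLIGHTS": {"flights": 0.9},
    "ORIGIN": {"from": 0.9},
    "DEST": {"to": 0.9},
}
BIGRAM = {  # P(word | previous word, label) inside a slot segment
    "ORIGIN": {("from", "Boston"): 0.5, ("from", "Denver"): 0.5},
    "DEST": {("to", "Boston"): 0.5, ("to", "Denver"): 0.5},
}
TRANS = {  # P(label | previous label); "<s>" is the start symbol
    "<s>": {"FLIGHTS": 0.8, "ORIGIN": 0.1, "DEST": 0.1},
    "FLIGHTS": {"FLIGHTS": 0.2, "ORIGIN": 0.4, "DEST": 0.4},
    "ORIGIN": {"FLIGHTS": 0.1, "ORIGIN": 0.5, "DEST": 0.4},
    "DEST": {"FLIGHTS": 0.1, "ORIGIN": 0.4, "DEST": 0.5},
}


def emit(word, label, prev_word, prev_label):
    """Log P(word | context): within-slot bigram, or slot-initial unigram."""
    if prev_label == label:
        p = BIGRAM.get(label, {}).get((prev_word, word), FLOOR)
    else:
        p = INITIAL.get(label, {}).get(word, FLOOR)
    return math.log(p)


def viterbi(words):
    """Return the most probable label sequence for the given words."""
    labels = list(INITIAL)
    # best[c] = (log score of the best path ending in label c, that path)
    best = {
        c: (math.log(TRANS["<s>"][c]) + emit(words[0], c, None, "<s>"), [c])
        for c in labels
    }
    for i in range(1, len(words)):
        new_best = {}
        for c in labels:
            candidates = [
                (
                    score + math.log(TRANS[p][c]) + emit(words[i], c, words[i - 1], p),
                    path + [c],
                )
                for p, (score, path) in best.items()
            ]
            new_best[c] = max(candidates, key=lambda t: t[0])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


labels = viterbi("flights from Boston to Denver".split())
```

With these tables, "Boston" is only likely as a continuation of an ORIGIN segment that began with "from", which is what lets the decoder tell origin cities from destination cities.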
Semi-HMM model of semantics
Pieraccini et al. (1991)
P(W|C) = P(me|show,SHOW) P(show|SHOW) P(flights|FLIGHTS) …
P(C) = P(FLIGHTS|SHOW) P(DUMMY|FLIGHTS) …
Semi-HMMs
• Each hidden state
  • Can generate multiple observations
• By contrast, a traditional HMM
  • One observation per hidden state
  • Need to loop to have multiple observations with the same state label
How to train
• Supervised training
  • Label and segment each sentence with frame fillers
  • Essentially learning an N-gram grammar for each slot

LIST     FLIGHTS  DUMMY    ORIGIN       DEST
Show me  flights  that go  from Boston  to SF
Another way to do NLU: Semantic Grammars
• CFG in which the LHS of rules is a semantic category:

LIST -> show me | I want | can I see | …
DEPARTTIME -> (after | around | before) HOUR | morning | afternoon | evening
HOUR -> one | two | three … | twelve (am | pm)
FLIGHTS -> (a) flight | flights
ORIGIN -> from CITY
DESTINATION -> to CITY
CITY -> Boston | San Francisco | Denver | Washington
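Because this grammar fragment has no recursion, one hedged way to try it out is to compile each rule into a regular expression and tag the matching spans. The sketch below assumes that HOUR means a number word followed by am or pm, and it includes only the alternatives the slide spells out (the elided ones are left out).

```python
import re

# Right-hand sides transcribed from the rules above; elided alternatives omitted.
CITY = r"(?:Boston|San Francisco|Denver|Washington)"
HOUR = r"(?:one|two|three|twelve) (?:am|pm)"  # assumed reading of the HOUR rule
RULES = {
    "LIST": r"show me|I want|can I see",
    "DEPARTTIME": rf"(?:after|around|before) {HOUR}|morning|afternoon|evening",
    "FLIGHTS": r"flights|(?:a )?flight",
    "ORIGIN": rf"from {CITY}",
    "DESTINATION": rf"to {CITY}",
}


def parse(sentence):
    """Return (semantic category, matched text) pairs in sentence order."""
    found = []
    for category, pattern in RULES.items():
        for match in re.finditer(rf"\b(?:{pattern})\b", sentence):
            found.append((match.start(), category, match.group(0)))
    return [(category, text) for _, category, text in sorted(found)]


slots = parse("show me flights from Boston to San Francisco after two pm")
```

A recursive grammar would need a real CFG parser; flat pattern matching is only adequate for rule sets like this one, where every category expands to a finite set of word strings.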
Modern Approach: Semantic Parsing
• System translates natural language into logical forms
• System can act on structured logical forms
• Modern approaches mix hand-engineered grammar generation with machine learning to map input text to output structured form
Semantic Parsing Output: Database Query
• Directly map natural language to database queries
• Potentially time-consuming to build/train for a new schema, but a clean, clear formalism
Slide from Bill MacCartney CS224U
Semantic Parsing Output: Procedural Languages
• Express concept, nested states or action sequences
• Designing set of possible actions and composition rules can get very complex
• How much can a user reasonably specify in one utterance?
Slide from Bill MacCartney CS224U
Semantic Parsing Output: Intents and Arguments
• Personal assistant voice commands are simple and need to scale to many domains
• Simplicity helps with robustness and scale: just recognize what action and what required arguments for that action
Slide from Bill MacCartney CS224U
Semantic Parsing Approach Outline
• Very active area of research
• Define possible syntactic structures using a context-free grammar
• Construct semantics bottom-up, following syntactic structure
• Score parses with a (log-linear) model that was fit on training input, action/output pairs
• Use external annotators to recognize names, dates, places, etc.
• Grammar induction if possible, or lots of grammar engineering
Slide from Bill MacCartney CS224U
A final way to do NLU: Condition-Action Rules
• Active Ontology: relational network of concepts
  • data structures: a meeting has
    • a date and time,
    • a location,
    • a topic
    • a list of attendees
  • rule sets that perform actions for concepts
    • the date concept turns the string “Monday at 2pm” into the date object date(DAY, MONTH, YEAR, HOURS, MINUTES)
Rule sets
• Collections of rules consisting of:
  • condition
  • action
• When user input is processed, facts are added to the store and
  • rule conditions are evaluated
  • relevant actions executed
Part of ontology for meeting task
[Figure: ontology graph for the meeting task, with has-a and may-have-a edges]
meeting concept: if you don’t yet have a location, ask for a location
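A rule set of this shape can be sketched as a list of (condition, action) pairs evaluated against a fact store. The fact names and prompt strings below are hypothetical; only the "no location yet, so ask for one" rule comes from the slide.

```python
# Hypothetical rule set for the meeting concept: each rule is (condition, action).
RULES = [
    # From the slide: if there is no location yet, ask for one.
    (lambda facts: "location" not in facts, lambda facts: "Where is the meeting?"),
    # An analogous rule, assumed for illustration.
    (lambda facts: "date" not in facts, lambda facts: "When is the meeting?"),
]


def process_input(store, new_facts):
    """Add facts to the store, evaluate every condition, run the relevant actions."""
    store.update(new_facts)
    return [action(store) for condition, action in RULES if condition(store)]


store = {}
prompts = process_input(store, {"topic": "budget", "date": "Monday at 2pm"})
```

After this input the date rule no longer fires, so the only pending action is the location question.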
ASR: Language Models for dialogue
• Often based on hand-written context-free or finite-state grammars rather than N-grams
• Why?
  • Need for understanding; we need to constrain user to say things that we know what to do with.
ASR: Language Models for Dialogue
• We can have an LM specific to a dialogue state
• If system just asked “What city are you departing from?”
• LM can be
  • City names only
  • FSA: (I want to (leave|depart)) (from) [CITYNAME]
  • N-grams trained on answers to “City name” questions from labeled data
• An LM that is constrained in this way is technically called a “restricted grammar” or “restricted LM”
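As a rough illustration of the FSA option, the pattern can be written as a regular expression that accepts or rejects a recognizer hypothesis string. This assumes the parentheses in the slide mark optional material, and the city list is a placeholder. A real restricted LM would constrain the ASR decoder itself rather than filter its text output.

```python
import re

# Placeholder city vocabulary (assumption).
CITY_NAMES = ["Boston", "San Francisco", "Denver", "Washington"]
city = "|".join(map(re.escape, CITY_NAMES))

# (I want to (leave|depart)) (from) [CITYNAME], with parentheses read as optional.
ORIGIN_GRAMMAR = re.compile(
    rf"(?:I want to (?:leave|depart) )?(?:from )?({city})", re.IGNORECASE
)


def recognize_origin(hypothesis):
    """Return the city if the hypothesis fits the state's grammar, else None."""
    match = ORIGIN_GRAMMAR.fullmatch(hypothesis.strip())
    return match.group(1) if match else None


origin = recognize_origin("I want to leave from Boston")
```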
Generation Component
• Content Planner
  • Decides what content to express to user (ask a question, present an answer, etc.)
  • Often merged with dialogue manager
• Language Generation
  • Chooses syntax and words
• TTS
• In practice: Template-based w/ most words prespecified
  What time do you want to leave CITY-ORIG?
  Will you return to CITY-ORIG from CITY-DEST?
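Template-based generation of this kind amounts to string substitution. In the sketch below the two templates are the ones from the slide; the dialogue-act names and the underscore spelling of the slot variables (CITY_ORIG for CITY-ORIG) are assumptions made so the names work as Python format fields.

```python
# Templates from the slide, keyed by hypothetical dialogue-act names.
TEMPLATES = {
    "ask_depart_time": "What time do you want to leave {CITY_ORIG}?",
    "confirm_return": "Will you return to {CITY_ORIG} from {CITY_DEST}?",
}


def generate(act, **slots):
    """Fill the template for a dialogue act with the current slot values."""
    return TEMPLATES[act].format(**slots)


prompt = generate("confirm_return", CITY_ORIG="Boston", CITY_DEST="San Francisco")
```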
More sophisticated language generation component
• Natural Language Generation
• Approach:
  • Dialogue manager builds representation of meaning of utterance to be expressed
  • Passes this to a “generator”
  • Generators have three components
    • Sentence planner
    • Surface realizer
    • Prosody assigner
HCI constraints on generation for dialogue: “Coherence”
Discourse markers and pronouns (“Coherence”):

Please say the date. …
Please say the start time. …
Please say the duration …
Please say the subject …

First, tell me the date. …
Next, I’ll need the time it starts. …
Thanks. <pause> Now, how long is it supposed to last? …
Last of all, I just need a brief description
HCI constraints on generation for dialogue: coherence (II): tapered prompts
Prompts which get incrementally shorter:

System: Now, what’s the first company to add to your watch list?
Caller: Cisco
System: What’s the next company name? (Or, you can say, “Finished”)
Caller: IBM
System: Tell me the next company name, or say, “Finished.”
Caller: Intel
System: Next one?
Caller: America Online.
System: Next?
Caller: …
How mixed initiative is usually defined
• First we need to define two other factors
  • Open prompts vs. directive prompts
  • Restrictive versus non-restrictive grammar
Open vs. Directive Prompts
• Open prompt
  • System gives user very few constraints
  • User can respond how they please:
    “How may I help you?” “How may I direct your call?”
• Directive prompt
  • Explicitly instructs user how to respond
    “Say yes if you accept the call; otherwise, say no”
Restrictive vs. Non-restrictive grammars
• Restrictive grammar
  • Language model which strongly constrains the ASR system, based on dialogue state
• Non-restrictive grammar
  • Open language model which is not restricted to a particular dialogue state
Definition of Mixed Initiative
Grammar          Open Prompt          Directive Prompt
Restrictive      Doesn’t make sense   System Initiative
Non-restrictive  User Initiative      Mixed Initiative
Evaluation
1. Slot Error Rate for a Sentence:

$$\text{Slot Error Rate} = \frac{\#\text{ of inserted/deleted/substituted slots}}{\#\text{ of total reference slots for sentence}}$$

2. End-to-end evaluation (Task Success)
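The slot error rate can be computed directly once the reference and hypothesized frames are available as slot-to-value mappings. The sketch below assumes that representation; the example values are made up.

```python
def slot_error_rate(reference, hypothesis):
    """Both arguments map slot name -> value for a single sentence."""
    inserted = sum(1 for slot in hypothesis if slot not in reference)
    deleted = sum(1 for slot in reference if slot not in hypothesis)
    substituted = sum(
        1
        for slot in reference
        if slot in hypothesis and hypothesis[slot] != reference[slot]
    )
    return (inserted + deleted + substituted) / len(reference)


# Made-up example: one wrong value, one missing slot, one spurious slot.
reference = {
    "ORIGIN": "Boston",
    "DEST": "San Francisco",
    "DEPTDATE": "Tuesday",
    "DEPTTIME": "morning",
}
hypothesis = {
    "ORIGIN": "Boston",
    "DEST": "San Diego",
    "DEPTDATE": "Tuesday",
    "AIRLINE": "United",
}
rate = slot_error_rate(reference, hypothesis)
```

Here the three errors (DEST substituted, DEPTTIME deleted, AIRLINE inserted) over four reference slots give a rate of 0.75. Because insertions count, the rate can exceed 1.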