Learning to Talk Through Listening Alexander I. Rudnicky with Ananlada Chotimongkol and Dan Bohus Carnegie Mellon University CATALOG 2004 – Barcelona July 21, 2004

Page 1: Learning to Talk Through Listening

Learning to Talk Through Listening

Alexander I. Rudnicky
with Ananlada Chotimongkol and Dan Bohus
Carnegie Mellon University

CATALOG 2004 – Barcelona
July 21, 2004

Page 2: Learning to Talk Through Listening

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Fundamental representations and observable events

• Learning through observation

Page 4: Learning to Talk Through Listening

Why Build Dialogue Systems?

• The devil is in the details

• Better understand the actual complexities of human-computer interaction

• Create specific artifacts that embody theories of dialogue and interaction (and thereby allow us to test them directly)

Page 5: Learning to Talk Through Listening

Domains, Tasks and Applications

[Figure: diagram relating a Domain, three Tasks, and an Application]

Page 6: Learning to Talk Through Listening

Task representation/specification alternatives

• Code (unspecialized representations, procedural)
  – Difficult to manage
• Forms (properly, F→A sets)
  – Works for the simplest tasks, which can be easily cast as such
  – Many examples
• Forms + graph-based dialogue structure
  – Graph-based part essentially = code, same problems
  – Examples: VXML, SALT
• Hierarchical, plan-based
  – Task specified as a hierarchical plan (recipe) for the domain
  – Examples: RavenClaw, Collagen

Page 7: Learning to Talk Through Listening

CMU dialogue approaches and systems

• Procedural
  – Command and control [OM, etc.]
  – Information access [MovieLine, etc.]
• Script-based and graph-based
  – Travel planning; maintenance [SpeechWear]
• AGENDA-based
  – Communicator: travel planning
  – LARRI: task guidance [m-modal]
  – Roomline, etc.: information access and transactions
  – Madeleine: medical diagnosis
  – TeamTalk: multi-participant dialogue
  – Valerie: interviews

Page 8: Learning to Talk Through Listening

Graph-based systems

Welcome to Bank ABC! Please say one of the following: Balance, Hours, Loan, ...

[transition on "Loan"]

What type of loan are you interested in? Please say one of the following: Mortgage, Car, Personal, ...

[transition on "Car"]

. . . .
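A graph-based dialogue of this kind can be sketched as a small state machine. The sketch below is only an illustration: the state names, the `GRAPH` table and the `step` and `run` helpers are hypothetical and do not come from any system named in the talk.

```python
# Hypothetical sketch of a graph-based dialogue: each node carries a prompt,
# and each outgoing edge is labelled with the keyword the caller must say.
GRAPH = {
    "main": {
        "prompt": "Welcome to Bank ABC! Please say one of the following: "
                  "Balance, Hours, Loan",
        "next": {"balance": "balance", "hours": "hours", "loan": "loan"},
    },
    "loan": {
        "prompt": "What type of loan are you interested in? Please say one "
                  "of the following: Mortgage, Car, Personal",
        "next": {"mortgage": "mortgage", "car": "car_loan",
                 "personal": "personal"},
    },
}


def step(state, utterance):
    """Follow the edge labelled by the utterance; stay put if unrecognized."""
    edges = GRAPH.get(state, {}).get("next", {})
    return edges.get(utterance.strip().lower(), state)


def run(utterances, state="main"):
    """Return the list of states visited for a sequence of caller utterances."""
    path = [state]
    for utterance in utterances:
        state = step(state, utterance)
        path.append(state)
    return path
```

Every reachable conversation has to be drawn into the graph by hand, which is the "graph-based part essentially = code" problem noted on the earlier slide.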

Page 9: Learning to Talk Through Listening

Frame-based systems

• I would like to fly to Boston
• When would you like to fly?
• Friday

Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...

Page 10: Learning to Talk Through Listening

Frame-based systems

• I’d like to go to Boston on Friday, …
• What time would you like to leave?

Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...
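The two frame-based exchanges can be reproduced with a minimal slot-filling loop: fill whatever slots the utterance mentions, then prompt for the first empty one. This is a sketch under stated assumptions. The keyword-spotting `understand` function, the `CITIES` and `DAYS` tables, and the prompts for the first and last slots are hypothetical; only the slot names, the two system questions and the date value 20030822 are taken from the slides.

```python
import re

SLOTS = ["Destination_City", "Departure_Date", "Departure_Time",
         "Preferred_Airline"]

PROMPTS = {
    "Destination_City": "Where would you like to fly?",      # assumed wording
    "Departure_Date": "When would you like to fly?",
    "Departure_Time": "What time would you like to leave?",
    "Preferred_Airline": "Which airline do you prefer?",     # assumed wording
}

# Toy "understanding" tables; a real system would use a parser or grammar.
CITIES = {"boston": "Boston"}
DAYS = {"friday": "20030822"}


def understand(utterance):
    """Return the slot values mentioned in the utterance (keyword spotting)."""
    found = {}
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in CITIES:
            found["Destination_City"] = CITIES[word]
        if word in DAYS:
            found["Departure_Date"] = DAYS[word]
    return found


def next_prompt(frame):
    """Ask about the first slot that is still empty, or None when done."""
    for slot in SLOTS:
        if frame.get(slot) is None:
            return PROMPTS[slot]
    return None
```

Because the user may fill several slots in one utterance, the system's next question depends on the state of the frame rather than on a fixed position in a graph.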

Page 11: Learning to Talk Through Listening

Frame-based systems

[Figure: a collection of forms (slot names shown as unreadable placeholders), with transitions between forms triggered by a keyword or phrase]

Page 12: Learning to Talk Through Listening

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Page 13: Learning to Talk Through Listening

Task-oriented Interaction

• Implicit system goal is to create products
  – Data structures that specify information for action
• Sessions can generate multiple products
  – Immediate products, e.g., information requests
  – Products that are built up incrementally over the course of a session, e.g., a plan such as an itinerary
• An Agenda to order (and re-order) topics for discussion

Page 14: Learning to Talk Through Listening

Products and Actions

• Products and Actions are domain-specific
  – e.g., itineraries bookings, queries information display
• Products are represented as an ordered tree
  – nodes in the trees correspond to schemas (handlers, agents, etc.) and are slots or forms
• Slot-specific computation is encapsulated in schema (handler objects)
• Agenda is generated from the current product tree
  – defines the sequence of topics to take up with the user

Page 15: Learning to Talk Through Listening

Agenda Structure

• Ordered list of conversational topics
  – current goal: focussed topic
  – pending goals: schema yet to be filled
  – persistent goals: handlers that are always active
    • constructors
    • generic help
    • garble

[Figure: the agenda list divided into Current focus, Pending goals and Persistent goals]

Page 16: Learning to Talk Through Listening

Simple and Compound Schema

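[Figure: a simple schema and a compound schema. Labels on the simple schema: receptors, value, transform, focus hook, prompt, Domain Agent; actions listed: invalidate value, self-promote, reorder tree. Labels on the compound schema: receptor, Value_1, Value_2, Value_3, value, transform, report, Domain Agent (e.g. SQL query)]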

Page 17: Learning to Talk Through Listening

Agendas from product tree traversal

• Default traversal of current product tree

– left-to-right, depth-first

– all nodes in the current product tree are always on the agenda

• Persistent goals sort to the bottom of the list

[Figure: product tree containing the nodes root, profile, Leg_1, Flight_1, Dest_1, Date_1, Time_1, Hotel_1, Car_1 and Leg_2, numbered 1 to 10 in agenda order]
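The default agenda can be sketched as a left-to-right, depth-first walk over the product tree, with persistent goals appended at the end. The tree shape below is an assumption reconstructed from the node names on the slide, and the `Node` class and `agenda` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def agenda(root, persistent=()):
    """Left-to-right, depth-first listing of every node in the product tree,
    followed by the persistent goals, which sort to the bottom."""
    order = []

    def visit(node):
        order.append(node.name)
        for child in node.children:
            visit(child)

    visit(root)
    return order + list(persistent)


# Assumed shape of the example product tree.
tree = Node("root", [
    Node("profile"),
    Node("Leg_1", [
        Node("Flight_1", [Node("Dest_1"), Node("Date_1"), Node("Time_1")]),
        Node("Hotel_1"),
        Node("Car_1"),
    ]),
    Node("Leg_2"),
])
```

Calling `agenda(tree, persistent=["help"])` lists the ten tree nodes in traversal order and places the hypothetical `help` handler last.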

Page 18: Learning to Talk Through Listening

Shifting focus

• Agenda has linear structure

– Derived from product tree

• Focus capture implies reordering sibling nodes

– Reordering propagates to root

• enclosing topic contexts get promoted

– Focus node is promoted to top of the agenda

[Figure: three views of a tree with nodes a to i as node i gets focus; sibling nodes are reordered at each level up to the root, and the resulting agenda positions are numbered]
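One way to read the focus-shifting rule is: at every level on the path from the root to the focused node, move the child on that path to the front of its siblings, then put the focused node itself at the top of the agenda. The sketch below follows that reading. The example tree is hypothetical (the shape of the tree on the slide is not recoverable from the transcript), and `promote`, `preorder` and `agenda_with_focus` are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def promote(node, target):
    """Reorder siblings so the path to `target` comes first at every level.
    Returns True if `target` lies in the subtree rooted at `node`."""
    if node.name == target:
        return True
    for i, child in enumerate(node.children):
        if promote(child, target):
            # Enclosing topic contexts get promoted ahead of their siblings.
            node.children.insert(0, node.children.pop(i))
            return True
    return False


def preorder(node):
    names = [node.name]
    for child in node.children:
        names.extend(preorder(child))
    return names


def agenda_with_focus(root, target):
    """Agenda after `target` captures focus: focus node first, then the
    reordered tree in depth-first order."""
    if not promote(root, target):
        raise KeyError(target)
    names = preorder(root)
    names.remove(target)
    return [target] + names


# Hypothetical tree: a -> (b -> (c, d), e -> (f, g -> (h, i)))
example = Node("a", [
    Node("b", [Node("c"), Node("d")]),
    Node("e", [Node("f"), Node("g", [Node("h"), Node("i")])]),
])
```

After `agenda_with_focus(example, "i")`, the topics that enclose `i` (`g`, then `e`) precede their former left siblings, so the conversation stays in the neighbourhood of the new focus.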

Page 19: Learning to Talk Through Listening

Constructors

• Products are not fixed data structures but may expand through the course of a session

• Users can modify the product

– “I’d like to go on to Syracuse”

– [system adds a new leg sub-tree to the product]

[Figure: the product tree before and after a new leg sub-tree is added]
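A constructor can be sketched as a function that grows the product tree when the user asks for something the current product does not yet contain. The shape of a leg sub-tree and the names `new_leg` and `add_leg` are assumptions made for this illustration.

```python
def new_leg(index):
    """Build an empty leg sub-tree (assumed shape: flight, hotel, car)."""
    return {
        "name": f"Leg_{index}",
        "children": [
            {"name": f"Flight_{index}", "children": []},
            {"name": f"Hotel_{index}", "children": []},
            {"name": f"Car_{index}", "children": []},
        ],
    }


def add_leg(product):
    """Constructor: append a fresh leg after the existing ones and return it."""
    legs = [c for c in product["children"] if c["name"].startswith("Leg_")]
    leg = new_leg(len(legs) + 1)
    product["children"].append(leg)
    return leg


# A product with one leg; "I'd like to go on to Syracuse" would trigger add_leg.
product = {
    "name": "root",
    "children": [{"name": "profile", "children": []}, new_leg(1)],
}
```

Since the agenda is derived from the product tree, the new leg's nodes would appear on the agenda the next time it is regenerated.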

Page 20: Learning to Talk Through Listening

Hierarchical Plan-based Representation

[Figure: the Login task decomposed into AskRegistered, AskName, GreetUser, GetProfile and GreetGuest, annotated with an execution policy and the following conditions]

PRE: registered=false
PRE: AVAILABLE(name)
PRE: AVAILABLE(name)
GOAL: (registered = false) || AVAILABLE(profile)

• Dialog control:
  – Task constraints (Declarative): define the boundaries of the space of possible dialogs
  – Execution policy (Procedural/Workflow): actively defines dialogue control
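The split between declarative task constraints and a procedural execution policy can be sketched as follows. The transcript does not say which precondition belongs to which step, so the assignment below is an assumption, as is the extra precondition on AskName; the execution policy shown (a single left-to-right pass that skips steps whose precondition fails) is also only one simple choice and is not claimed to be the policy of any system named in the talk.

```python
def available(state, key):
    """AVAILABLE(x): the concept has been given a value."""
    return state.get(key) is not None


def run_login(user):
    """Execute the Login plan for a simulated user; return (trace, goal_met)."""
    state = {"registered": None, "name": None, "profile": None}

    # Task constraints (declarative): (step, precondition, effect).
    # The mapping of preconditions to steps is assumed, not from the slide.
    steps = [
        ("AskRegistered", lambda s: True,
         lambda s: s.update(registered=user["registered"])),
        ("AskName", lambda s: s["registered"] is True,        # assumed PRE
         lambda s: s.update(name=user.get("name"))),
        ("GreetUser", lambda s: available(s, "name"),
         lambda s: None),
        ("GetProfile", lambda s: available(s, "name"),
         lambda s: s.update(profile={"owner": s["name"]})),
        ("GreetGuest", lambda s: s["registered"] is False,
         lambda s: None),
    ]

    # Execution policy (procedural): one left-to-right pass over the steps.
    trace = []
    for name, precondition, effect in steps:
        if precondition(state):
            effect(state)
            trace.append(name)

    # GOAL: (registered = false) || AVAILABLE(profile)
    goal_met = state["registered"] is False or available(state, "profile")
    return trace, goal_met
```

The same constraints admit both a registered-user dialogue and a guest dialogue; the policy only decides the order in which the permitted steps are taken.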

Page 21: Learning to Talk Through Listening

Hierarchical Plan-based Representation

[Figure: Communicator task tree. Communicator has children Welcome, Login, Travel, Locals and Bye; the next level shows AskRegistered, AskName, GreetUser, GetProfile and Leg1; below Leg1 are GetQuery, ExecuteQuery and DiscussLeg1. A stack labelled FOCUS and MAIN TOPIC lists AskRegistered, Login, Communicator, alongside three lists of concepts:]

Registered: [yes]
Registered: [yes], Name: [user_name]
Registered: [yes], Name: [user_name], Departure: [City], Arrival: [City], …

S: Are you a registered user?
U: Yes, this is Alex [yes] [user_name]

Page 22: Learning to Talk Through Listening

Hierarchical Plan-based Representation

[Figure: Leg1 decomposed into GetQuery, ExecuteQuery and DiscussLeg1, labelled "Common task skills"]

GetQuery: FORM
DepartureLocation: TCity
ArrivalLocation: TCity
DepartureDate: TDate
DepartureTime: TTime

Page 23: Learning to Talk Through Listening

Dialog Engine

• Controls the dialog by executing the hierarchical plan-based task specification

• In the process, automatically exhibits appropriate generic (task- and domain-independent) conversational skills:
  – Global dialogue mechanisms
    • repeat, suspend, start-over, help, where are we?
  – Grounding
    • Implicit and explicit confirmations, disambiguations, various non-understanding handling strategies
  – Timing and turn-taking

Page 24: Learning to Talk Through Listening

Issues that remain

• Parallel activities and asynchronous events
  – Understanding the scope of “dialogue”
• Knowledge engineering dialogue systems
  – Building the interface between the dialogue engine and the world (“pragmatics”)
  – Capturing human speech and language behavior within tasks and domains
  – Reasoning about the world within applications
  – Communicating meaningfully and efficiently with the user about the state of the world

Page 25: Learning to Talk Through Listening

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Page 26: Learning to Talk Through Listening

Learning by observation

• Many automatic systems are meant to substitute for current human-based operations (e.g., a travel agency or a call center)

• Can we use such existing working human systems to infer the structure of a corresponding automatic system?

• If so, what might be the requisite representations and learning heuristics?

Page 27: Learning to Talk Through Listening

Learning to dialogue

• Goal-directed conversation is regular
  – Both participants can agree on the same goal and both participants want to achieve this goal
  – Correct transmission of information is at a premium
• Can we exploit the regularity to extract the (currently human engineered) structure of the dialogue?

Page 28: Learning to Talk Through Listening

Learning structure from dialogue

• Concept identification

• Form (topic) segmentation

• Task graphs

• Multiple data streams

• Lightly supervised learning

Page 29: Learning to Talk Through Listening

Travel agent and client

[Figure: a travel agent and client conversation divided into segments labelled greeting, hotel, confirm, return, out leg, car, and payment / close]

Page 30: Learning to Talk Through Listening

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Page 31: Learning to Talk Through Listening

Properties of a dialogue representation

1. Sufficiency
  – Captures sufficient information for the creation of a dialogue system
  – Describes the important (i.e., operative) phenomena in conversations
2. Generality
  – Covers conversations in dissimilar domains
3. Learnability
  – Can be populated through observation (e.g., from a corpus of human-human conversations)

Page 32: Learning to Talk Through Listening

Task-centric dialogue representation

• Components of task structure
  – Procedures for completing task goal(s)
    • Steps in the task and their dependencies (i.e., the workflow)
  – Domain language
    • Words, constructs and idioms that humans use to communicate about the task
  – Domain reasoning
    • The relationships between language and task, and the domain of the application

Page 33: Learning to Talk Through Listening

Dialogue primitives

Levels of representation
1. Task: a subset of conversational sequences that achieves a particular (human/system) goal
2. Sub-task: a step in a task that contributes toward the fulfillment of the task goal
  – The smallest unit of a dialogue that contains information sufficient to execute a specific domain action
3. Concept: key domain entities (perhaps organized into a type-hierarchy or ontology)

Mechanisms
1. Task oriented: form-filling and result negotiation
2. Discourse oriented: grounding, etc.

Page 34: Learning to Talk Through Listening

Task Structure Representation

• Task = collection of forms

• Sub-task = a form

• Concept = a slot in a form

F: Query_Departure_Time

Depart_Location: carnegie_mellon

Arrive_Location: the airport

Arrive_Time: Hour: four Minute: thirty

Bus_Number: 28X
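The task, sub-task and concept levels map naturally onto nested data structures. The sketch below encodes the example form shown above; the `Task` and `Form` classes, the `missing` helper and the task name `bus_schedule_enquiry` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """A sub-task: a named form whose slots hold concepts."""
    name: str
    slots: dict = field(default_factory=dict)

    def missing(self):
        """Names of slots that have not been filled yet."""
        return [slot for slot, value in self.slots.items() if value is None]


@dataclass
class Task:
    """A task: a collection of forms."""
    name: str
    forms: list = field(default_factory=list)


query = Form("Query_Departure_Time", {
    "Depart_Location": "carnegie_mellon",
    "Arrive_Location": "the airport",
    "Arrive_Time": {"Hour": "four", "Minute": "thirty"},
    "Bus_Number": "28X",
})

task = Task("bus_schedule_enquiry", [query])
```

A form with no missing slots contains enough information to execute its domain action, which matches the definition of a sub-task given earlier.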

Page 35: Learning to Talk Through Listening

Example: Air travel planning

1. Task: create itinerary
2. Sub-tasks:
  – Flight reservation
  – Hotel reservation
  – Car rental reservation
3. Concepts:
  – Airline = { Continental, Iberia, … }
  – Hotel = { Novotel, Hilton, … }

Page 36: Learning to Talk Through Listening

Example: Bus schedule enquiry

1. Task (multiple tasks):
  – Find bus numbers that run between two locations
  – Find a departure time given a bus number and stop location
2. Sub-tasks:
  – No further decomposition needed
3. Concepts:
  – Bus Number = { 61C, 28X, … }
  – Location = { CMU, airport, … }

Page 37: Learning to Talk Through Listening

Dialogue mechanisms

• Operations invoked by participants:
  – Correspond to an utterance or a part of an utterance
  – Have a unique consequence on the state of the conversation
  – init_form causes a system to create a new form
  – The behavior of an operation is the same regardless of the domain (only the parameters differ)

Page 38: Learning to Talk Through Listening

Dialogue mechanisms (2)

• Dialogue procedure
  – Requires more than one utterance to complete
  – A confirmation mechanism = 2 operations (confirmation_request + respond)
• Non-verbal operation
  – Activated by a state of the representation rather than a verbal expression
  – access_database is activated by the completion of the query form
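The three kinds of mechanism can be sketched in one small class: single operations (`init_form`, `fill_form_info`), a two-operation procedure (`confirmation_request` followed by `respond`), and a non-verbal operation that fires when a form becomes complete. The operation names come from the slides; the `Conversation` class, its attributes and the `database` callable are assumptions made for this illustration.

```python
class Conversation:
    def __init__(self, schemas, database):
        self.schemas = schemas      # form name -> list of slot names
        self.database = database    # hypothetical lookup: form dict -> result
        self.forms = {}
        self.results = {}
        self.pending = None         # (form, slot) awaiting confirmation

    # --- operations: one utterance (or part of one), one state change ---
    def init_form(self, name):
        self.forms[name] = {slot: None for slot in self.schemas[name]}

    def fill_form_info(self, name, slot, value):
        self.forms[name][slot] = value
        self._check_complete(name)

    # --- dialogue procedure: confirmation = two operations ---
    def confirmation_request(self, name, slot):
        self.pending = (name, slot)
        return f"Did you say {self.forms[name][slot]}?"

    def respond(self, yes):
        name, slot = self.pending
        self.pending = None
        if not yes:
            # A rejected value is cleared, and any result based on it dropped.
            self.forms[name][slot] = None
            self.results.pop(name, None)

    # --- non-verbal operation: triggered by state, not by an utterance ---
    def _check_complete(self, name):
        form = self.forms[name]
        if all(value is not None for value in form.values()):
            self.results[name] = self.database(form)  # access_database
```

The operations are domain-independent: moving to another domain changes only the schemas and the database callable passed in as parameters.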

Page 39: Learning to Talk Through Listening

An example from the Map Task

• Forms
  – Action forms ( →draw_line )
  – Entity forms ( landmark )
• Operations ( various )
• Resolving a misunderstanding through grounding [session q8nc7]

Page 40: Learning to Talk Through Listening

[Figure: Giver’s Map and Follower’s Map]

Page 41: Learning to Talk Through Listening

Episode 11-1
Operation:
GIVER87:
    ask_landmark: have you got a TarLM:[golden beach((left))]?
FOLLOWER88:
    respond: yes uh-huh.
    add_landmark: (golden beach (right))
    (Misunderstanding: the follower grounds the right one while the giver asks about the left one)

Giver’s

Landmark: golden beach (left)

Giver Map: yes

Follower Map:

Location:

Follower’s

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Giver’s

Landmark: golden beach (left)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Page 42: Learning to Talk Through Listening

Episode 11-1 (2)
Operation:
GIVER87:
    ask_landmark: have you got a TarLM:[golden beach((left))]?
FOLLOWER88:
    respond: yes uh-huh.
    add_landmark: (golden beach (right))
    (Misunderstanding: the follower grounds the right one while the giver asks about the left one)

Grounding Form

Landmark: golden beach (left)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Page 43: Learning to Talk Through Listening

Origin:

Orientation:

Distance:

Path:

Destination:

Episode 11-2
Operation:
GIVER89:
    fill_form_info: well go Dir:[straight up] ... ... from Ori:[Loc:[the top of the white mountain]] 'til you're just Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]
FOLLOWER90:
    acknowledge: right,

Grounding Form

Landmark: golden beach (left)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Origin: Ori:[Loc:[the top of the white mountain]]

Orientation: Dir:[straight up ]

Distance:

Path:

Destination: Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]

Page 44: Learning to Talk Through Listening

Origin: Ori:[Loc:[the top of the white mountain]]

Orientation: Dir:[straight up ]

Distance:

Path:

Destination: Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]

Episode 11-3
Operation:
    ask_fill_form_info: you want me to go dilect-- ... Dir:[directly right]?
GIVER91:
    respond: no,
    fill_form_info: Dir:[directly up].

Grounding Form

Landmark: golden beach (left)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Page 45: Learning to Talk Through Listening

Episode 11-4
Operation:
FOLLOWER92:
    fill_form_info: but golden beach((right)) is away in Loc:[the far right].
    (The follower explicitly fills in the location of the golden beach (right).)
GIVER93:
    acknowledge: ah right.
    (Agrees with the location of the golden beach (right).)

Giver’s

Landmark: golden beach (left)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Follower’s

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: implicitly grounded

Follower’s

Landmark: golden beach (right)

Giver Map:

Follower Map: yes

Location: the far right

Giver’s

Landmark: golden beach (left)

Giver Map: yes

Follower Map:

Location:

Giver’s

Landmark: golden beach (right)

Giver Map: yes

Follower Map:

Location:

Page 46: Learning to Talk Through Listening

Episode 11-5
Operation:
FOLLOWER94:
    ask_landmark: have you got TarLM:[your (golden beach (right))]?
GIVER95:
    inform_other_info: i've got two golden beaches.
FOLLOWER96:
    acknowledge: ah.
    add_landmark: (golden beach (right))

Landmark: golden beach (right)

Giver Map:

Follower Map: yes

Location: the far right

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: the far right

Page 47: Learning to Talk Through Listening

Episode 11-5 (2)
Operation:
FOLLOWER94:
    ask_landmark: have you got TarLM:[your (golden beach (right))]?
GIVER95:
    inform_other_info: i've got two golden beaches.
FOLLOWER96:
    acknowledge: ah.
    add_landmark: (golden beach (right))

Grounding Form

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: the far right

Page 48: Learning to Talk Through Listening

Episode 11-6
Operation:
GIVER97:
    fill_form_info: sorry ... so there's TarLM:[the one(golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[the left of it (white mountain)] for me.
FOLLOWER98:
    fill_form_info: is there, yeah there's nothing nothing there.
    add_landmark: golden beach (left)
GIVER99:
    acknowledge: right okay,

Landmark: golden beach (left)

Giver Map: yes

Follower Map:

Location:

Landmark: golden beach (left)

Giver Map: yes

Follower Map: no

Location: above the ... white mountain, the left of it (white mountain)

Landmark: golden beach (left)

Giver Map: yes

Follower Map:

Location: above the ... white mountain, the left of it (white mountain)

Grounding Form

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: the far right

Page 49: Learning to Talk Through Listening

Episode 11-6 (2)
Operation:
GIVER97:
    fill_form_info: sorry ... so there's TarLM:[the one(golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[the left of it (white mountain)] for me.
FOLLOWER98:
    fill_form_info: is there, yeah there's nothing nothing there.
    add_landmark: golden beach (left)
GIVER99:
    acknowledge: right okay,

Grounding Form

Landmark: golden beach (left)

Giver Map: yes

Follower Map:

Location: above the ... white mountain, the left of it (white mountain)

Grounding Form

Landmark: golden beach (right)

Giver Map: yes

Follower Map: yes

Location: the far right

Page 50: Learning to Talk Through Listening

Applying the representation

• Four different task-oriented domains
  – Air travel planning
    • Professional travel agent and volunteer clients (re)booking former trips
  – HCRC map-reading task
    • Hired subjects communicating path information
  – Bus schedule information
    • Professional agents helping customers
  – UAV operation
    • Trainees flying an unmanned aircraft in a simulation

Page 51: Learning to Talk Through Listening

Evaluation Corpora

• Annotated conversations from the four task-oriented domains

Domain          Available #Dialogs   Analyzed #Dialogs   Analyzed #Utterances
Bus schedule    12                   5                   90
Air travel      43                   4                   273
Map reading     128                  4                   498
UAV operation   2                    1                   224

Page 52: Learning to Talk Through Listening

Rejected utterances

• Utterances that could not be described by the proposed structure
  – Out Of Domain (OOD)
  – Out Of Scope (OOS): in-domain but out of the conversation goal
  – Indirect: requires substantial reasoning or world-knowledge to interpret
  – Task Management (TM): manages the overall state of the dialogue, rather than a particular form

Page 53: Learning to Talk Through Listening

Rejected utterance percentage

Rejected utterances (%)

Domain           OOD    OOS    Indirect   TM     Total
Bus schedule     4.4    4.4    6.7        0.0    15.6
Air travel       1.8    4.4    0.4        2.6    9.2
Map reading      0.0    0.0    2.2        0.0    2.2
UAV simulation   1.0    0.0    1.0        4.0    5.9
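The totals can be checked against the per-category figures with a few lines of arithmetic. The sketch below uses only the numbers in the table; `ROWS` and `total_minus_parts` are illustrative names. Where a total differs from the sum of its parts by 0.1, independent rounding of each figure is a plausible explanation, but that is an assumption, since the slides do not give the underlying utterance counts.

```python
from decimal import Decimal

# (OOD, OOS, Indirect, TM, Total) as printed in the table, kept as strings
# so that Decimal arithmetic is exact.
ROWS = {
    "Bus schedule": ("4.4", "4.4", "6.7", "0.0", "15.6"),
    "Air travel": ("1.8", "4.4", "0.4", "2.6", "9.2"),
    "Map reading": ("0.0", "0.0", "2.2", "0.0", "2.2"),
    "UAV simulation": ("1.0", "0.0", "1.0", "4.0", "5.9"),
}


def total_minus_parts(rows):
    """For each domain, the reported total minus the sum of the categories."""
    differences = {}
    for domain, (*parts, total) in rows.items():
        differences[domain] = Decimal(total) - sum(Decimal(p) for p in parts)
    return differences
```

With the table's figures, the Air travel and Map reading rows add up exactly, while the Bus schedule and UAV simulation rows are each off by 0.1 in opposite directions.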

Page 54: Learning to Talk Through Listening

Summary

• Human-computer dialogue is organized around specific tasks within domains

• The key level of representation is in fact the task; applications are particular embodiments of these tasks

• All applications necessarily include a large amount of detail
  – Such detail is not knowable a priori (and much of it cannot be generated from principle)
  – Either extensive knowledge engineering or (better) systems that learn are necessary to produce systems that function robustly