planungsbasierte kommunikationsmodelle - uni …garoufi/teaching/planung/03...planungsbasierte...

Planungsbasierte Kommunikationsmodelle (Plan-based models of communication)

Introduction to speech act theory

A scalable model of planning perlocutionary acts (Koller et al., 2010)

Konstantina Garoufi, 9 November 2011

Today we talk about...


• Basic speech act theory

• A scalable model of planning perlocutionary acts

• Demo (if time)

Speech act theory

• Main idea: Communication is more than just transmitting information. It changes the world!

• What can communication change?

• Mental states of the agents involved in a conversation

• Overall state of the dialog

Communication = action

• “I sentence you to death.”

• “I promise to be at home for dinner.”

• “I now pronounce you husband and wife.”

• “I order you to attack the enemy.”

Levels of speech acts

• Locutionary: locution (= saying, expression; from the Latin “loqui” = to speak) + -ary

• Illocutionary: in- + locution + -ary

• Perlocutionary: per- (= through) + locution + -ary

Austin (1962)

http://dictionary.reference.com/browse/locutionary



Locutionary act

• The act of saying something

• Utterance + meaning

• Phonetic act: verbal aspect

• Phatic act: syntactic aspect

• Rhetic act: semantic aspect

Illocutionary act

• The act performed in saying something

• Intended meaning of an utterance

• Illocutionary force: type of action (e.g. request, inform, ...)

• Propositional content: details of action (e.g. what is the hearer requested to do?)

Perlocutionary act

• The act performed by saying something

• Intended effects of the utterance on the state of the dialog

• E.g. to persuade the hearer that something is true, or to get the hearer to perform a physical act

Example: “Mary, don’t go into the water”

• Locutionary act: phonetics, syntax, semantics of “Mary, don’t go into the water”

• Illocutionary act: warning (illocutionary force) that Mary should not go into the water (propositional content)

• Perlocutionary act: Mary stays away from the water
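The three levels in the example above can be written down as a small record. This is only an illustrative sketch: the class `SpeechAct`, its field names, and the helper `describe` are hypothetical and not part of any speech act formalism or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechAct:
    """One utterance described at the three levels of Austin (1962)."""
    locution: str               # what is said (phonetic, phatic, rhetic act)
    illocutionary_force: str    # type of action, e.g. "warning", "request"
    propositional_content: str  # details of the action
    perlocutionary_effect: str  # intended effect on the hearer or the dialog


# The example from the slide, filled in field by field.
warning = SpeechAct(
    locution="Mary, don't go into the water",
    illocutionary_force="warning",
    propositional_content="Mary should not go into the water",
    perlocutionary_effect="Mary stays away from the water",
)


def describe(act: SpeechAct) -> str:
    """Spell out the 'in saying' / 'by saying' reading of the three levels."""
    return (
        f"In saying {act.locution!r} the speaker issues a "
        f"{act.illocutionary_force} that {act.propositional_content}; "
        f"by saying it the speaker intends that {act.perlocutionary_effect}."
    )


print(describe(warning))
```

Keeping the perlocutionary effect as a separate field mirrors the point of the second half of the lecture: it is the part a planner can treat as the effect of an action.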

Taxonomy of illocutionary acts

• Assertives: commit the speaker to the truth of an expressed proposition (e.g. assert; the psychological state expressed is belief)

• Directives: get the hearer to do something (e.g. request)

• Commissives: commit the speaker to future actions (e.g. promise)

• Expressives: express speaker’s psychological state (e.g. thank)

• Declarations: change reality in accordance with the proposition of the declaration (e.g. baptize)

Searle (1976)

Note on performatives

• In his early work, Austin distinguished between two types of utterances

• Constatives: describe states of affairs (e.g. “Snow is white.”)

• Performatives: describe actions carried out by speakers

• Explicit: “I promise to come.”

• Implicit: “I intend to come.”

• How to distinguish statements from actions?

➔ Both the constative vs. performative and the explicit vs. implicit distinctions are problematic

• Austin eventually abandoned these dichotomies for more general notions


Performatives: more info

• Searle (1989). How performatives work.

• Bach & Harnish (1992). How performatives really work.

• Grewendorf (2002). How performatives don’t work.

http://garnet.berkeley.edu/~jsearle/133/howperfwork.pdf


http://www.jstor.org/stable/10.2307/25001463




http://www.buchhandel.de/WebApi1/GetMmo.asp?MmoId=1003507&mmoType=PDF




A scalable model of planning perlocutionary acts

Motivation: the SCARE corpus

• 15 spontaneous English dialogue sessions

• Each session records the joint problem-solving of a pair of human partners working through a treasure-hunt style task in a 3D virtual world

[Figure: DF view of the virtual world]

Stoia et al. (2008)

The SCARE corpus

• Instruction giver (IG) guides instruction follower (IF) through completing tasks

[Figures: IF’s view of the world, as displayed on IG’s monitor; IG’s map of the world]

Stoia et al. (2008)

Example (transcribed)

walk forward and go through the first door you see [pause]

and then go through the next one right in front of it [pause]

yeah that one [pause] ok [disfluency - w] and then turn to your right [pause]

and then hit the button in the middle [pause]

Example step-by-step

• walk forward and go through the first door you see

• and then go through the next one right in front of it

• and then turn to your right

• and then hit the button in the middle

Annotations:

• Precondition: particular spatio-visual configuration. Effect: spatio-visual configuration changes, history of interaction is enriched

• Precondition: particular spatio-visual configuration, particular history of interaction. Effect: spatio-visual configuration changes, history of interaction is enriched

• Navigation. Epistemic goal: improve cognition by reducing time/space complexity or unreliability of mental computation during the next turns of the discourse

• Referring expression generation (“and then hit the button in the middle”). Pragmatic goal: complete task at hand

The problems

• Which linguistic and extra-linguistic factors are important for situated communication?

• How does the interplay of language, pragmatic and epistemic action, and goal-based problem-solving work?

• In what ways can we model this interplay?

Solution: Sentence generation as planning

goal: express “sleep′(e,a)”

[Figure with labels a, b, c]

Koller & Stone (2007)

Lexicalized tree adjoining grammar

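Figure 1: The example grammar. Elementary trees:

• “sleeps”: S:self → NP:subj↓ VP:self; VP:self → V:self → sleeps; semantics {sleep(self,subj)}

• “the rabbit”: NP:self → the N:self; N:self → rabbit; semantics {rabbit(self)}

• “white”: N:self → white N:self*; semantics {white(self)}

(↓ marks a substitution site, * the foot node of an auxiliary tree.)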

Figure 2: Derivation of “The white rabbit sleeps.” [Left: the elementary trees instantiated with e and a (S:e, NP:a↓, VP:e, V:e, NP:a, N:a, N:a*); right: the derived tree.]

In a first step, called sentence planning, the semantic representation is first enriched with more information; for instance, referring expressions, which refer to individuals that we want to talk about, are determined at this point. In a second step, called surface realization, this enriched representation is then translated into a natural-language sentence, using the grammar.

In practice, the process of determining referring expressions typically interacts with the realization step and so it turns out to be beneficial to perform both of these steps together. This was the goal of the SPUD system (Stone et al. 2003), which performed this task using a top-down generation algorithm based on tree-adjoining grammars (Joshi and Schabes 1997) whose lexical entries were equipped with semantic and pragmatic information. Unfortunately, SPUD suffered from having to explore a huge search space, and had to resort to a non-optimal greedy search strategy to retain reasonable efficiency. To improve on the efficiency of SPUD, Koller and Stone (2007) translate the sentence generation problem into a planning problem and use a planning algorithm for generation.

We illustrate this process on a simplified example. Consider a knowledge base containing the individuals e, r1 and r2, and a set of attributes encoding the fact that r1 and r2 are rabbits, r1 is white and r2 is brown, and e is an event in which r1 sleeps. Say that we want to express the information {sleep(e, r1)} using the tree-adjoining grammar shown in Figure 1. This grammar consists of elementary trees (i.e., the disjoint trees in the figure), each of which contributes certain semantic content. We can instantiate these trees by substituting individuals for semantic roles, such as self and subj, and then combine the tree instances as shown in Figure 2 to obtain the sentence “The white rabbit sleeps”.

We compute this grammatical derivation in a top-down manner, starting with the elementary tree for “sleeps”. This tree satisfies the need to convey the semantic information, but introduces a need to generate a noun phrase (NP) for the subject; this NP must refer uniquely to the target referent r1. In a second step, we substitute the tree for “the rabbit” into the open NP leaf, which makes the derivation grammatically complete. Since there are two different individuals that could be described as “the rabbit” (technically, r2 is still a distractor: based on the description “the rabbit”, the hearer might erroneously think that we’re talking about r2 and not r1), we are still not finished. To complete the derivation, the tree for “white” is added to the existing structure by an adjunction operation, making the derivation syntactically and semantically complete.

    (:action add-sleeps
      :parameters (?u - node ?xself - individual ?xsubj - individual)
      :precondition (and (subst S ?u)
                         (referent ?u ?xself)
                         (sleep ?xself ?xsubj))
      :effect (and (not (subst S ?u))
                   (expressed sleep ?xself ?xsubj)
                   (subst NP (subj ?u))
                   (referent (subj ?u) ?xsubj)
                   ; every individual other than the subject referent starts out
                   ; as a distractor (the source prints ?xself here; ?xsubj matches
                   ; "y ≠ a" in the S-sleeps operator)
                   (forall (?y - individual)
                     (when (not (= ?y ?xsubj))
                       (distractor (subj ?u) ?y)))))

    (:action add-rabbit
      :parameters (?u - node ?xself - individual)
      :precondition (and (subst NP ?u)
                         (referent ?u ?xself)
                         (rabbit ?xself))
      :effect (and (not (subst NP ?u))
                   (canadjoin N ?u)
                   (forall (?y - individual)
                     (when (not (rabbit ?y))
                       (not (distractor ?u ?y))))))

    (:action add-white
      :parameters (?u - node ?xself - individual)
      ; the last two conditions are lost in the source; they are reconstructed
      ; here by analogy with add-rabbit
      :precondition (and (canadjoin N ?u)
                         (referent ?u ?xself)
                         (white ?xself))
      :effect (forall (?y - individual)
                (when (not (white ?y))
                  (not (distractor ?u ?y)))))

Figure 3: PDDL actions for generating the sentence “The white rabbit sleeps.”

The process described above has clear parallels to planning: we manipulate a state by applying actions in order to achieve a goal. We can make this connection even more precise by translating the SPUD problem into a planning problem. For instance, Figure 3 shows the corresponding PDDL actions for the above generation task, where each action corresponds to an operation that adds a single elementary tree to the derivation. In each case, the first parameter of the action is a node name in the derivation tree, and the remaining parameters stand for the individuals to which the semantic roles will be instantiated. The syntactic precon…

elementary trees

Koller & Stone (2007)
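The distractor bookkeeping in the rabbit example can be sketched in a few lines of Python. The knowledge base is the one from the excerpt (individuals e, r1, r2); the names `FACTS`, `filter_distractors` and `describe` are hypothetical helpers for illustration, not part of SPUD or of the planning encoding.

```python
# Knowledge base from the example: which individuals each property holds of.
FACTS = {
    "rabbit": {"r1", "r2"},
    "white": {"r1"},
    "brown": {"r2"},
}
INDIVIDUALS = {"e", "r1", "r2"}


def filter_distractors(distractors, prop):
    """Keep only those distractors that the property is also true of."""
    return {y for y in distractors if y in FACTS[prop]}


def describe(target, props):
    """Add properties in the given order until the target is identified uniquely.

    Returns the properties actually used and the distractors still left.
    """
    distractors = INDIVIDUALS - {target}   # everything else starts as a distractor
    used = []
    for prop in props:
        if not distractors:
            break                          # already unique, stop adding words
        if target not in FACTS[prop]:
            continue                       # property is false of the target
        distractors = filter_distractors(distractors, prop)
        used.append(prop)
    return used, distractors


print(describe("r1", ["rabbit"]))           # "the rabbit": r2 remains a distractor
print(describe("r1", ["rabbit", "white"]))  # "the white rabbit": no distractors left
```

With "rabbit" alone, r2 survives as a distractor, which is why the derivation is not finished until "white" is adjoined.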

Combining elementary trees

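[Figure: the instantiated elementary trees for “sleeps” (S:e with NP:a↓ and VP:e → V:e → sleeps), “the rabbit” (NP:a → the N:a; N:a → rabbit) and “white” (N:a → white N:a*) are combined by substitution and adjunction.]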


derived tree

Koller & Stone (2007)
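A minimal sketch of how the three elementary trees combine, assuming a very simplified tree representation. The `Node` class and the functions `substitute`, `adjoin` and `words` are hypothetical names chosen for this illustration; a real TAG implementation would track node addresses and feature structures as well.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    label: str                                   # category plus referent, e.g. "NP:a"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                   # set on lexical leaves only
    subst: bool = False                          # open substitution site (↓)
    foot: bool = False                           # foot node of an auxiliary tree (*)


def find(node: Node, pred: Callable[[Node], bool]) -> Optional[Node]:
    """Depth-first, left-to-right search for the first node satisfying pred."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None


def substitute(tree: Node, initial: Node) -> None:
    """Plug an initial tree into the first open site carrying its root label."""
    site = find(tree, lambda n: n.subst and n.label == initial.label)
    if site is None:
        raise ValueError(f"no open substitution site for {initial.label}")
    site.children, site.subst = initial.children, False


def adjoin(tree: Node, aux: Node) -> None:
    """Splice an auxiliary tree in at the first node carrying its root label."""
    target = find(tree, lambda n: n.label == aux.label and not n.subst)
    foot = find(aux, lambda n: n.foot)
    if target is None or foot is None:
        raise ValueError(f"cannot adjoin at {aux.label}")
    # The foot node takes over the old subtree; the target takes over aux's children.
    foot.children, foot.foot = target.children, False
    target.children = aux.children


def words(node: Node) -> List[str]:
    """Read the words off the leaves from left to right."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in words(child)]


def leaf(word: str) -> Node:
    return Node(label=word, word=word)


sleeps = Node("S:e", [Node("NP:a", subst=True),
                      Node("VP:e", [Node("V:e", [leaf("sleeps")])])])
the_rabbit = Node("NP:a", [leaf("the"), Node("N:a", [leaf("rabbit")])])
white = Node("N:a", [leaf("white"), Node("N:a", foot=True)])

substitute(sleeps, the_rabbit)   # fills the open NP:a leaf
adjoin(sleeps, white)            # inserts "white" above "rabbit"
sentence = " ".join(words(sleeps))
print(sentence)
```

Substitution only fills a leaf, whereas adjunction rewires an inner node, which is why the auxiliary tree needs a foot node to hold the displaced subtree.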

Elementary trees as actions

S-sleeps(0,e,a)

pre: open(0,S,e), sleep′(e,a)

effect: ¬open(0,S,e), open(1,NP,a), ∀y. y ≠ a → distractor(1,y)

NP-the-rabbit(1,a)

pre: open(1,NP,a), rabbit′(a)

effect: ¬open(1,NP,a), allowed(1, adj-N, a), ∀y. ¬rabbit′(y) → ¬distractor(1,y)

N-white(1,a)

pre: allowed(1,adj-N,a), white′(a)

effect: ∀y. ¬white′(y) → ¬distractor(1,y)

(The open/allowed atoms encode syntax; the remaining atoms encode semantics.)

Koller & Stone (2007)
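To see the three operators act as planning actions, here is a small forward-search sketch. It is an assumption-laden toy: the operators are hand-grounded for the individuals e, a and a hypothetical brown rabbit b, the `expressed` atom is borrowed from the PDDL version in Figure 3, and plain breadth-first search stands in for a real planner.

```python
from collections import deque

INDIVIDUALS = ("e", "a", "b")   # assumption: a is a white rabbit, b a brown one
KB = {("sleep", "e", "a"), ("rabbit", "a"), ("rabbit", "b"),
      ("white", "a"), ("brown", "b")}


def s_sleeps(state):
    if ("open", 0, "S", "e") not in state or ("sleep", "e", "a") not in KB:
        return None
    new = set(state) - {("open", 0, "S", "e")}
    new |= {("expressed", "sleep", "e", "a"), ("open", 1, "NP", "a")}
    new |= {("distractor", 1, y) for y in INDIVIDUALS if y != "a"}
    return frozenset(new)


def np_the_rabbit(state):
    if ("open", 1, "NP", "a") not in state or ("rabbit", "a") not in KB:
        return None
    new = set(state) - {("open", 1, "NP", "a")}
    new.add(("allowed", 1, "adj-N", "a"))
    new -= {("distractor", 1, y) for y in INDIVIDUALS if ("rabbit", y) not in KB}
    return frozenset(new)


def n_white(state):
    if ("allowed", 1, "adj-N", "a") not in state or ("white", "a") not in KB:
        return None
    removed = {("distractor", 1, y) for y in INDIVIDUALS if ("white", y) not in KB}
    return frozenset(set(state) - removed)


OPERATORS = [("S-sleeps(0,e,a)", s_sleeps),
             ("NP-the-rabbit(1,a)", np_the_rabbit),
             ("N-white(1,a)", n_white)]


def is_goal(state):
    """Content expressed, no open syntactic sites, no remaining distractors."""
    return (("expressed", "sleep", "e", "a") in state
            and not any(atom[0] in ("open", "distractor") for atom in state))


def plan(initial):
    """Breadth-first search over states; returns the shortest action sequence."""
    queue, seen = deque([(initial, [])]), {initial}
    while queue:
        state, steps = queue.popleft()
        if is_goal(state):
            return steps
        for name, op in OPERATORS:
            nxt = op(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [name]))
    return None


INITIAL = frozenset({("open", 0, "S", "e")})
print(plan(INITIAL))
```

The goal test is what forces "white" into the plan: after NP-the-rabbit the derivation is syntactically complete, but distractor(1,b) is still in the state.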

Sentence generation as planning

• goal: express “sleep′(e,a)”

• Planning operators: NP-the-rabbit(1,a), S-sleeps(0,e,a), N-white(1,a)

• An off-the-shelf automated planning system computes the plan S-sleeps(0,e,a), NP-the-rabbit(1,a), N-white(1,a)

• Plan decoding turns the plan into a derivation that combines the elementary trees for “sleeps”, “the rabbit” and “white”

• Result: The white rabbit sleeps.

Koller & Stone (2007)

Extensions for planning perlocutionary acts

Figure 4: An example SCRISP lexicon.

• “push”: S:self → V:self (push) NP:obj↓; semreq: visible(p, o, obj); nonlingcon: player–pos(p), player–ori(o); impeff: push(obj)

• “turn left”: S:self → V:self (turn) Adv (left); nonlingcon: player–ori(o1), next–ori–left(o1, o2); nonlingeff: ¬player–ori(o1), player–ori(o2); impeff: turnleft

• “and”: S:self → S:self* and S:other↓

… of you”. This lowers the cognitive load on the IF, and presumably improves the rate of correctly interpreted REs.

SCRISP is capable of deliberately generating such context-changing navigation instructions. The key idea of our approach is to extend the CRISP planning operators with preconditions and effects that describe the (simulated) physical environment: A “turn left” action, for example, modifies the IF’s orientation in space and changes the set of visible objects; a “push” operator can then pick up this changed set and restrict the distractors of the forthcoming RE it introduces (i.e. “the button”) to only objects that are visible in the changed context. We also extend CRISP to generate imperative rather than declarative sentences.

4.1 Situated CRISP

We define a lexicon for SCRISP to be a CRISP lexicon in which every lexicon entry may also describe non-linguistic conditions, non-linguistic effects and imperative effects. Each of these is a set of atoms over constants, semantic roles, and possibly some free variables. Non-linguistic conditions specify what must be true in the world so a particular instance of a lexicon entry can be uttered felicitously; non-linguistic effects specify what changes uttering the word brings about in the world; and imperative effects contribute to the IF’s “to-do list” (Portner, 2007) by adding the properties they denote.

A small lexicon for our example is shown in Fig. 4. This lexicon specifies that saying “push X” puts pushing X on the IF’s to-do list, and carries the presupposition that X must be visible from the location where “push X” is uttered; this reflects our simplifying assumption that the IG can only refer to objects that are currently visible. Similarly, “turn left” puts turning left on the IF’s agenda. In addition, the lexicon entry for “turn left” specifies that, under the assumption that the IF understands and follows the instruction, they will turn 90 degrees to the left after hearing it. The planning operators are written in a way that assumes that the intended (perlocutionary) effects of an utterance actually come true. This assumption is crucial in connecting the non-linguistic effects of one SCRISP action to the non-linguistic preconditions of another, and generalizes to a scalable model of planning perlocutionary acts. We discuss this in more detail in Koller et al. (2010a).

turnleft(u, x, o1, o2):
  Precond: subst(S, u), ref(u, x), player–ori(o1), next–ori–left(o1, o2), . . .
  Effect: ¬subst(S, u), ¬player–ori(o1), player–ori(o2), to–do(turnleft), . . .

push(u, u1, un, x, x1, p, o):
  Precond: subst(S, u), ref(u, x), player–pos(p), player–ori(o), visible(p, o, x1), . . .
  Effect: ¬subst(S, u), subst(NP, u1), ref(u1, x1), ∀y.(y ≠ x1 ∧ visible(p, o, y) → distractor(u1, y)), to–do(push(x1)), canadjoin(S, u), . . .

and(u, u1, un, e1, e2):
  Precond: canadjoin(S, u), ref(u, e1), . . .
  Effect: subst(S, u1), ref(u1, e2), . . .

Figure 5: SCRISP planning operators for the lexicon in Fig. 4.

We then translate a SCRISP generation problem into a planning problem. In addition to what CRISP does, we translate all non-linguistic conditions into preconditions and all non-linguistic effects into effects of the planning operator, adding any free variables to the operator’s parameters. An imperative effect P is translated into an effect to–do(P). The operators for the example lexicon of Fig. 4 are shown in Fig. 5. Finally, we add information about the situated environment to the initial state, and specify the planning goal by adding to–do(P) atoms for each atom P that is to be placed on the IF’s agenda.

4.2 An example

Now let’s look at how this generates the appropriate instructions for our example scene of Fig. 3. We encode the state of the world as depicted in the map in an initial state which contains, among others, the atoms player–pos(pos3,2), player–ori(north), next–ori–left(north,west), …

action: push(e,x,p,o)

precondition: visible(p,o,x), ...

effect: ∀y. (y≠x ∧ visible(p,o,y) → distractor(y)), to-do(push(x)), ...


action: turn-left(e,x,y)

precondition: player-ori(x), ...

effect: ¬player-ori(x), player-ori(y), to-do(turn-left), ...

• Preconditions: pragmatic conditions

• Effects: communicative acts have the intended perlocutionary effects

Koller et al. (2010)
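The way a non-linguistic effect of one operator enables the precondition of the next can be sketched as follows. Everything here is a simplified illustration: the `State` class, the `VISIBLE` table with the buttons b1 and b2, and the function names are hypothetical, and the linguistic (syntactic) atoms of the real operators are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# next-ori-left(o1, o2): turning left from o1 yields o2.
NEXT_ORI_LEFT = {"north": "west", "west": "south", "south": "east", "east": "north"}

# Hypothetical visibility table: objects visible from (position, orientation).
VISIBLE = {
    ("p1", "north"): frozenset({"b2"}),
    ("p1", "west"): frozenset({"b1"}),
}


@dataclass(frozen=True)
class State:
    pos: str                                  # player-pos
    ori: str                                  # player-ori
    todo: Tuple[str, ...] = ()                # the IF's to-do list
    distractors: FrozenSet[str] = frozenset() # distractors of the pending RE


def visible(state: State) -> FrozenSet[str]:
    return VISIBLE.get((state.pos, state.ori), frozenset())


def turn_left(state: State) -> State:
    """Effect: ¬player-ori(o1), player-ori(o2), to-do(turnleft).

    The new orientation is written into the state right away, i.e. the
    operator assumes the intended perlocutionary effect comes true.
    """
    return State(state.pos, NEXT_ORI_LEFT[state.ori],
                 state.todo + ("turnleft",), state.distractors)


def push(state: State, x: str) -> Optional[State]:
    """Precondition: visible(p, o, x). Distractors: the other visible objects."""
    if x not in visible(state):
        return None                           # cannot refer to an invisible object
    return State(state.pos, state.ori,
                 state.todo + (f"push({x})",), visible(state) - {x})


s0 = State("p1", "north")
print(push(s0, "b1"))              # None: b1 is not visible yet
s2 = push(turn_left(s0), "b1")     # "turn left and push the button"
print(s2)
```

Because `turn_left` updates the orientation optimistically, `push` can count on the changed set of visible objects and ends up with no distractors for "the button".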

Example

[Figure: map showing the target and the instruction follower’s location]

hit the yellow button on the left wall in the next room

1. walk forward and go through the door

2. (...) ok and then turn to your left

3. and then hit the button ✔

Problem

How to navigate the user so that the forthcoming referring expression is cognitively simple?

Push the button on the wall to your left.

Push the button on the wall to your left in the next room.

[Figure: map with the target marked]

State: player-pos(p1), player-ori(o1), ¬visible(p1,o1,b1)

Plan step: turn-left(e1,o1,o2)

Utterance: Turn left

action: turn-left(e1,o1,o2)

precondition: player-ori(o1), ...

effect: ¬player-ori(o1), player-ori(o2), ...

Solution

State: player-pos(p1), player-ori(o2), visible(p1,o2,b1)

action: and(e1,e2)

precondition: ...

effect: ...

Plan steps: and(e1,e2), push(e2,b1,p1,o2), the-button(b1)

Utterance: and push the button.

action: the-button(b1)

effect: ∀y. (¬button(y) → ¬distractor(y)), ...

action: push(e2,b1,p1,o2)

precondition: visible(p1,o2,b1), ...

effect: ∀y. (y≠b1 ∧ visible(p1,o2,y) → distractor(y)), to-do(push(b1)), ...

State: to-do(push(b1)) ✔

Solution: Push the button.

State: to-do(hit(b)) ✔

Notice that...

...by assuming that the intended perlocutionary effects become true, the system

➡ achieves deliberate manipulation of the extra-linguistic context

➡ but makes execution monitoring crucial

Solution

Monitoring perlocutionary effects

• Inactivity execution monitor: tracks whether the user has not moved or acted in a certain period of time, in which case it repeats the utterance

• Distance execution monitor: observes whether the user is moving away from the target, in which case it recomputes the utterance

• Danger execution monitor: monitors whether the user approaches a trap in the world, in which case it issues a warning
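The three monitors can be sketched as one check over a stream of observations. This is a hypothetical illustration: the class and field names, the thresholds `INACTIVITY_LIMIT` and `DANGER_RADIUS`, and the priority order (danger, then distance, then inactivity) are assumptions, not details taken from the system.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]

INACTIVITY_LIMIT = 10.0   # seconds without activity before repeating (assumed)
DANGER_RADIUS = 1.0       # distance to a trap that triggers a warning (assumed)


@dataclass
class Observation:
    time: float      # seconds since the start of the game
    position: Point  # where the user currently is
    acted: bool      # whether the user moved or acted since the last observation


class ExecutionMonitors:
    def __init__(self, target: Point, traps: List[Point], start_time: float = 0.0):
        self.target = target
        self.traps = traps
        self.last_activity = start_time
        self.last_distance: Optional[float] = None

    def check(self, obs: Observation) -> Optional[str]:
        """Return 'warn', 'replan', 'repeat', or None if all is well."""
        # Danger monitor: the user approaches a trap.
        if any(math.dist(obs.position, trap) <= DANGER_RADIUS for trap in self.traps):
            return "warn"
        # Distance monitor: the user is moving away from the target.
        distance = math.dist(obs.position, self.target)
        moving_away = self.last_distance is not None and distance > self.last_distance
        self.last_distance = distance
        if obs.acted:
            self.last_activity = obs.time
        if moving_away:
            return "replan"
        # Inactivity monitor: nothing has happened for too long.
        if obs.time - self.last_activity >= INACTIVITY_LIMIT:
            return "repeat"
        return None


monitors = ExecutionMonitors(target=(0.0, 0.0), traps=[(5.0, 5.0)])
print(monitors.check(Observation(1.0, (3.0, 0.0), True)))    # fine so far
print(monitors.check(Observation(2.0, (4.0, 0.0), True)))    # moving away
print(monitors.check(Observation(13.0, (4.0, 0.0), False)))  # idle too long
print(monitors.check(Observation(14.0, (5.0, 4.5), True)))   # near a trap
```

Each return value corresponds to one reaction from the list above: issuing a warning, recomputing the utterance, or repeating it.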

Evaluation framework: GIVE Challenge

• User plays 3D game in virtual world.

• Natural language generation system generates instructions in real time and sends them over the internet, e.g. “move forward 2 steps!”, “press the blue button!”

Byron et al. (2009)

Website

Try GIVE out!

http://www.give-challenge.org


Implementation as a GIVE system

• Solves planning problems with off-the-shelf FF planner

• Max planning time 1.03 sec (3 GHz CPU) for a knowledge base of about 1500 facts and a grammar of about 30 lexicon entries

Real-time performance!

Evaluation

• System achieves competitive performance in the GIVE task

• With respect to task success, it outperforms an informed baseline and performs comparably to an improved version of the best performer in GIVE-1

• It can refer to objects from significantly further away than the baselines

Conclusions

• It is possible to model perlocutionary effects of speech acts as direct effects of corresponding planning operators

• This keeps planning complexity low...

• ...and employs execution monitoring to make sure the intended perlocutionary effects are actually achieved

Discussion points

• Which kinds of speech acts (according to Searle) has this approach covered?

• For which kinds of speech acts is it most appropriate?

• Is it possible to model any speech act in this manner?

References

• Austin (1962). How to do things with words. http://www.dwrl.utexas.edu/~davis/crs/rhe321/Austin-How-To-Do-Things.pdf

• Byron et al. (2009). Report on the First NLG Challenge on Generating Instructions in Virtual Environments (GIVE). http://www.coli.uni-saarland.de/%7Ekoller/papers/give-report-09.pdf

• Koller & Stone (2007). Sentence generation as a planning problem. http://aclweb.org/anthology-new/P/P07/P07-1043.pdf

• Searle (1976). A classification of illocutionary acts.

• Stoia et al. (2008). SCARE: A Situated Corpus with Annotated Referring Expressions. http://slate.cse.ohio-state.edu/quake-corpora/scare/papers/lrec2008.pdf



