
Generation of Referring Expressions: the State of the Art SELLC Winter School, Guangzhou 2010

Kees van Deemter

Computing Science

University of Aberdeen

Guangzhou, Dec 2010


Open Questions in GRE


Your input is welcome:

- suggestions about other open questions
- ideas about answering them


OQ1: Reference in context

How can existing GRE algorithms be adapted to produce appropriate references in a discourse or dialogue context?

- Much work exists on the choice between broad categories, e.g., pronoun vs. full NP vs. demonstrative (Poesio et al.; Piwek). This does not help to decide what NP to choose. Integration with GRE is needed.
- Pioneering accounts are available (Krahmer & Theune 2002; Siddharthan & Copestake 2004; Stoia et al. 2007), but these are tentative and largely untested.
- Dialogue requires modelling of the interaction between speaker and hearer (e.g., alignment and collaboration).


OQ2: Issues regarding knowledge and belief

How should mismatches in knowledge between Speaker and Hearer be modelled?

GRE so far has kept epistemic operators implicit: all the information in the crucial part of the KB was “shared”.

What if S and H differ?


OQ2: Issues regarding mutual knowledge

Two instances of this problem:

1. Litman’s airport scenario: What do you say to someone who needs to pick up a person from an airport? (Speaker does not know who the distractors are.)

2. Roman Kutlak’s “reference to famous people” scenario: Someone asks “Who is Chu Enlai?”, or “Who is Nelson Mandela?” What should you say? (Who are the distractors? What is their salience?)


Relevant to the simplifications made by current GRE algorithms. E.g.:

- “The king of France” (Frege/Russell/Strawson)
- “Whoever it will be, the winner of this year’s Tour de France will be less proud than last year’s winner”
- “The winner of the lottery may win 20 million”
- “X believes that a witch ...; Y believes that she ...” (Geach? Groenendijk, Stokhof?)
- “The man with the martini is the murderer”, when it’s actually a soft drink (Donnellan)
- “The ham sandwich is getting restless”, by a waitress who doesn’t know the customer’s name (Nunberg)


More radical new departures needed?

Consider texts about genuinely complex domains:

- “We examine the problem of generating definite noun phrases that are appropriate referring expressions” (opening sentence of the abstract of Dale & Reiter 1995)
- “Bush’s Middle-East policies are a disaster. Even his closest aides have started to withdraw their support”

What do these NPs refer to? Is it realistic to want to generate them from a shared KB?


OQ3: Incrementality

Studies of the TUNA corpus suggest that incremental GRE can work very well ...

... but only if you have a good preference order.

How can good preference orders be found?

- Does every new domain require new empirical studies?
- Or are there general principles that underlie preference orders? (E.g., frequency or complexity of a property)
- Sometimes the extremity/unusualness of the values is more important than the attribute itself (cf. Herrmann & Deutsch; Aberdeen Cameras study).

Psycholinguistic issue: relation with realisation order? (Sedivy et al. 1999)
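To make the role of the preference order concrete, here is a minimal sketch in the spirit of the Incremental Algorithm of Dale & Reiter 1995. It is a simplification, not their full algorithm (for example, it does not force the type attribute into the description), and the toy domain, attribute names and function name are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

Entity = Dict[str, str]


def incremental_description(
    target: str,
    domain: Dict[str, Entity],
    preference_order: List[str],
) -> Tuple[List[Tuple[str, str]], bool]:
    """Pick properties of `target` in preference order until no distractor is left.

    Returns the chosen (attribute, value) pairs and a flag saying whether
    the description singles out the target.
    """
    distractors = {name for name in domain if name != target}
    description: List[Tuple[str, str]] = []
    for attribute in preference_order:
        if not distractors:
            break
        value = domain[target].get(attribute)
        if value is None:
            continue
        ruled_out = {d for d in distractors if domain[d].get(attribute) != value}
        if ruled_out:  # keep a property only if it removes at least one distractor
            description.append((attribute, value))
            distractors -= ruled_out
    return description, not distractors


# Hypothetical toy domain, not taken from the TUNA corpus.
DOMAIN = {
    "d1": {"type": "chair", "colour": "red", "size": "large"},
    "d2": {"type": "chair", "colour": "blue", "size": "large"},
    "d3": {"type": "table", "colour": "blue", "size": "small"},
}

shorter, ok_shorter = incremental_description("d1", DOMAIN, ["colour", "type", "size"])
longer, ok_longer = incremental_description("d1", DOMAIN, ["type", "size", "colour"])
print(shorter)  # colour alone identifies d1
print(longer)   # type is selected first, so colour is added on top of it
```

Both orders identify the target, but only the first yields the one-property description. The algorithm never retracts a property once it is selected, which is why the quality of the preference order matters so much.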


OQ4: Hearer-oriented GRE

Most work on reference in GRE has focussed on production.

- Exceptions: Paraboni et al. (2007); a preliminary study in the first STEC (Belz & Gatt 2007); Khan et al. (2008).
- How might one build generators that optimise for the hearer? (High processing speed, low likelihood of errors)

And what if it turns out that speakers are bad at this?

- The egocentricity debate
- If it’s practical GRE you’re interested in, then this allows GRE programs to do better than human speakers.


OQ5: Multimodality

How does textual GRE interact with non-linguistic issues, such as

- speech (e.g. pitch accent on “given” information; other prosodic issues; cf. Theune’s thesis)
- pointing (e.g. Van der Sluis & Krahmer [to appear])
- salience as determined by physical proximity (as well as textual recency, intrinsic importance of objects, etc.)
- facial expressions such as gaze and eyebrow movements?

These and other issues are to be explored in Krahmer’s VICI project on GRE (Tilburg, 2008-2012).


OQ6: Realisation & Lexical Choice

Much of what we discussed focusses on Content Determination.

But referring expressions require words and syntactic constructions as well!

Surface phenomena can be difficult and interesting too:

- Gatt’s exploration of lexical coherence
- Siddharthan’s work on lexical ambiguity
- Imtiaz Khan’s work on syntactic ambiguity


OQ6: Realisation & Lexical Choice

Siddharthan & Copestake (2004) observed that words can introduce ambiguities. E.g., “the old president” = the previous president, or the president who is old (i.e., aged).

Khan: syntax can be ambiguous as well:

- “the man on the hill with the telescope”
- “the old men and women”


OQ6: Realisation & Lexical Choice

One possible position: “avoid all ambiguities”.

Khan: ambiguous strings are not only often generated, but sometimes also preferred by hearers.

- “the old men and women” preferred over “the old men and the old women”

Finding: surface ambiguities are balanced against other issues (e.g. brevity).


OQ7: Reference in spatial domains

There is preliminary work (e.g. by Gatt), based on simple domains.

What happens when you want to refer to an area of a country?

Ross Turner’s PhD project (Aberdeen):

- Input: a set of points in Scotland where ice is predicted to hamper road traffic
- Example output: “Icy patches are expected in the North East and on high grounds”


OQ7: Reference in spatial domains

Ross Turner’s PhD project (Aberdeen):

- Input: a set of points in Scotland where ice is predicted to hamper road traffic. (Each point is on a road.)
- Example output: “Icy patches are expected in the North East and on high grounds”

This is GRE ... but with a twist:

- it may not be necessary to include all target points
- it may not be necessary to exclude all other points

Referential success becomes a graded affair!
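One way to picture "graded" success is to score a candidate description by how many target points it covers and how many non-target points it lets in. The sketch below does this with recall and precision. The measures, the point data, the 300 m threshold for "high ground" and all names are assumptions for illustration; they are not taken from Turner's system.

```python
from typing import Callable, Dict, List, NamedTuple, Set, Tuple


class Point(NamedTuple):
    region: str
    altitude_m: int


Predicate = Callable[[Point], bool]


def graded_success(
    points: Dict[str, Point],
    targets: Set[str],
    description: List[Predicate],
) -> Tuple[float, float]:
    """Score a disjunctive description ("in A or in B") against the target set.

    recall    = share of target points the description covers
    precision = share of covered points that really are targets
    """
    covered = {name for name, p in points.items() if any(pred(p) for pred in description)}
    hits = covered & targets
    recall = len(hits) / len(targets) if targets else 0.0
    precision = len(hits) / len(covered) if covered else 0.0
    return recall, precision


def in_north_east(p: Point) -> bool:
    return p.region == "North East"


def on_high_ground(p: Point) -> bool:
    return p.altitude_m >= 300  # hypothetical threshold


# Hypothetical road points; p1-p3 are the ones where ice is predicted.
POINTS = {
    "p1": Point("North East", 50),
    "p2": Point("North East", 400),
    "p3": Point("South West", 450),
    "p4": Point("South West", 20),
    "p5": Point("North East", 30),
}
TARGETS = {"p1", "p2", "p3"}

both = graded_success(POINTS, TARGETS, [in_north_east, on_high_ground])
ne_only = graded_success(POINTS, TARGETS, [in_north_east])
print(both)     # covers every target but also one non-target point
print(ne_only)  # misses a target and still includes a non-target point
```

Neither description is a perfect distinguishing description in the classical sense, yet the first is clearly better than the second. How such scores should be traded off against brevity is left open here.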


OQ8: Integration with the rest of NLG

GRE is arguably the most mature area of NLG:

- Linguistic Realisation is the main other contender
- most GRE practitioners use the same assumptions
- the fact that the first NLG STEC focused on GRE confirms this

Ultimately, the GRE problem is “linguistically complete”: if we had a flawless GRE algorithm, then this algorithm could easily be transformed into an equally flawless algorithm for all of NLG ...


OQ8: Integration with the rest of NLG

For example:

John walks [S]
(The person who) walks [ref NP]

Or:

A man saw a girl with earrings [S]
(The man who) saw a girl with earrings [ref NP]

Or:

Someone saw a beautiful girl with incredibly elaborate jade earrings bought in Paris (...) [S]
(The person who) saw a beautiful girl with incredibly elaborate jade earrings bought in Paris (...) [ref NP]


OQ9: Integration between GRE and other areas of linguistics

Integration with psycholinguistics:

- NLG more generally: G. Kempen et al., A. Roelofs et al.; recent book by M. Guhe
- GRE: modest beginnings in Dale & Reiter 1995 (inspiration from Levelt’s book)
- CogSci 2009 workshop “Bridging the gap between computational and psycholinguistic approaches to references”
- Special Issue of the journal TopiCS


OQ10: Integration between GRE and other areas of linguistics

Integration with syntax has so far been meagre:

- interleaving of Linguistic Realisation and Content Determination (Stone & Webber 1998; Krahmer & Theune 2002)


OQ10: Integration between GRE and other areas of linguistics

Integration with formal semantics and pragmatics has been limited:

- DeVault & Stone 2004 on vagueness (based on Kyburg & Morreau 2000)
- Use of salience (mainly for category choice; see also Krahmer & Theune 2002)

Formal semantics focusses on intensionality and quantification.

- Generating appropriate REs in belief contexts: “John knows that the nuclear button / the leftmost button / the red button is dangerous”


OQ10: Integration between GRE and other areas of linguistics

It would be interesting to let GRE explore core areas of formal semantics, e.g.

- Use a flat KB as input (just like in GRE), to generate quantified NPs like “Five rats died”, “A few rats died”, “Not all rats died”.
- Find principles for choosing the quantifier pattern that’s most appropriate in the utterance situation.

Early attempts by N. Creaney (2002), but limited progress so far.
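The first half of this task, listing the quantifier patterns that are true of a flat KB, is easy to sketch; the open question is the second half, choosing among them. The fragment below only does the listing. The KB contents, the small set of crisp quantifiers and the "more than half" reading of "most" are all assumptions for illustration, and vague quantifiers such as "a few" are left out because they would need a threshold the source does not give.

```python
from typing import List, Set


def true_quantified_sentences(
    noun_plural: str,
    verb_past: str,
    restrictor: Set[str],
    scope: Set[str],
) -> List[str]:
    """List the quantified sentences that hold for the given sets.

    `restrictor` is the set denoted by the noun (e.g. the rats) and
    `scope` the set denoted by the verb (e.g. the things that died).
    """
    n = len(restrictor)
    k = len(restrictor & scope)
    candidates = [
        (f"All {noun_plural} {verb_past}", n > 0 and k == n),
        (f"Not all {noun_plural} {verb_past}", k < n),
        (f"Some {noun_plural} {verb_past}", k > 0),
        (f"No {noun_plural} {verb_past}", k == 0),
        (f"Most {noun_plural} {verb_past}", 2 * k > n),  # assumed reading: more than half
        (f"{k} {noun_plural} {verb_past}", k > 0),
    ]
    return [sentence for sentence, holds in candidates if holds]


# Hypothetical flat KB: eight rats, five of which died, plus one mouse that died.
RATS = {"r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"}
DIED = {"r1", "r2", "r3", "r4", "r5", "m1"}

options = true_quantified_sentences("rats", "died", RATS, DIED)
for sentence in options:
    print(sentence)
```

Several sentences come out true at once, which is exactly the situation in which a principle for picking the most appropriate pattern would be needed.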


OQ11: Problematic referents

- “the water in this pond”
- “water”
- “5”, “2+3”
- “virtue”, “power”


Plenty of challenges for enthusiastic young researchers!
