14th July 2010 Uppsala, Sweden

Incorporating Extra-linguistic Information into Reference Resolution in Collaborative Task Dialogue

Ryu Iida, Shumpei Kobayashi, Takenobu Tokunaga

Tokyo Institute of Technology

{ryu-i,skobayashi,take}@cl.cs.titech.ac.jp

ACL 2010


Research background

The task of identifying reference relations, including anaphora and coreference, within texts has received a great deal of attention in NLP
Research trends in reference resolution have shifted drastically from hand-crafted rule-based approaches to corpus-based approaches
  Many researchers have examined ways of introducing various linguistic clues (Ge et al. 1998, Soon et al. 2001, Ng and Cardie 2002, Yang et al. 2003, 2005, Poon and Domingos 2008, etc.)


Typical problem setting of reference resolution

Annotated data sets provided by the Message Understanding Conference (MUC) and Automatic Content Extraction (ACE)
  Limited version of coreference: relations where expressions refer to named entities
  More information extraction-oriented
The coreference task as defined by MUC and ACE is geared toward only identifying coreference relations anchored to an entity within the text


Treatment of referential behaviour in language generation community

Investigations of referential behaviour in real-world situations (Di Eugenio et al. 2000, Byron 2005, van Deemter 2007, Foster 2008, Spanger et al. 2009)
  Applications: e.g. human-robot interaction
Spanger et al. (2009): dialogues of two participants collaboratively solving a Tangram puzzle
  The corpus includes extra-linguistic information synchronised with utterances (e.g. operations on the puzzle pieces)
  They revealed that a multi-modal perspective of reference is needed for more practical reference understanding


Challenging issue

Create a model bridging a referring expression in text and its object in the real world
Focus on incorporating extra-linguistic information into an existing corpus-based approach
  Target corpus: the REX-J corpus of Spanger et al. (2009)


Table of contents

Research background
Collaborative work dialogue corpus: REX-J corpus
Reference resolution model and use of extra-linguistic information
Empirical evaluation
Summary and future work


REX-J corpus (Spanger et al. 2009)

Collaborative work dialogues in Japanese for solving the Tangram puzzle
Operations to solve the puzzle, and the situations updated by a series of operations, are recorded by a puzzle simulator on a computer
The relationship between referring expressions and their referents on the computer display is manually annotated


Screenshot of Tangram simulator

[Figure: simulator screen showing the goal shape area and the working area]

3 operations on puzzle pieces: move, rotate, flip
Positions of every piece and every action are recorded at intervals of 10 msec


Experimental environment

Share only the working area and linguistic information in dialogue
Two different roles: "solver" and "operator"
  solver: can see a certain goal shape; cannot manipulate pieces
  operator: cannot see the goal shape; can manipulate pieces

REX-J corpus: statistics

Recruited 12 Japanese graduate students
6 pairs × 4 different goal shapes = 24 dialogues


Task definition

[Figure: Tangram pieces numbered 1 to 7]

Utterances
  …
  A: move it more to the right.
  B: which triangle? Is this?

no antecedent in preceding utterances

Operation history
  Time      Piece  Operation
  ...
  12:01:03  1      rotate
  12:01:05  3      move
  12:01:10  6      move
  12:01:12  6      rotate

referent of 'it': piece 6

Task: select a piece out of a fixed set of pieces given a referring expression, by referring to both the preceding utterances and the series of recent operations
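To make the task input concrete, the operation history on this slide can be held in a small data structure and queried at the time a referring expression is uttered. The following is only a sketch: the names `Operation` and `most_recent_piece` and the utterance time 12:01:15 are assumptions for illustration, and "pick the most recently operated piece" is a naive heuristic, not the ranking model described in the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operation:
    time: str    # "HH:MM:SS" timestamp, zero-padded as on the slide
    piece: int   # piece id, 1 to 7
    action: str  # "move", "rotate" or "flip"


# Operation history copied from the slide (earlier entries are elided there).
HISTORY: List[Operation] = [
    Operation("12:01:03", 1, "rotate"),
    Operation("12:01:05", 3, "move"),
    Operation("12:01:10", 6, "move"),
    Operation("12:01:12", 6, "rotate"),
]


def most_recent_piece(history: List[Operation], utterance_time: str) -> Optional[int]:
    """Return the piece operated on most recently at or before utterance_time.

    Zero-padded same-day "HH:MM:SS" strings compare correctly as plain strings.
    """
    earlier = [op for op in history if op.time <= utterance_time]
    if not earlier:
        return None
    return max(earlier, key=lambda op: op.time).piece


# Hypothetical utterance time for "move it more to the right."
referent = most_recent_piece(HISTORY, "12:01:15")
```

With the slide's history this heuristic happens to select piece 6, the annotated referent of 'it', which is the kind of evidence the extra-linguistic features are meant to capture.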

Ranking model to identify referents

Machine learning-based approaches (Soon et al. 2001, Ng and Cardie 2002, etc.)
  Take into account linguistic factors: relative salience
Ranking candidate antecedents in preceding discourse (Iida et al. 2003, Yang et al. 2003, Denis and Baldridge 2008)
  Denis and Baldridge (2008) reported that an appropriately constructed model for ranking all candidates achieved better performance than pairwise ranking
Adopt a ranking-based model in which all candidates compete with one another
  Use ranking SVM instead of Maximum Entropy
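The idea of letting all candidates compete can be sketched with a small linear ranker. This is not the ranking SVM tool used in the work: it is a stand-in that minimises a pairwise hinge loss by plain subgradient steps, which is the same family of objective. The function names, the three toy features and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Query = Tuple[List[Vector], int]  # (candidate feature vectors, index of the gold referent)


def dot(w: Vector, x: Vector) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(queries: List[Query], epochs: int = 50,
                          lr: float = 0.1, reg: float = 0.01) -> List[float]:
    """Learn w so that the gold candidate outscores every competitor by a margin."""
    dim = len(queries[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for candidates, gold in queries:
            for j, cand in enumerate(candidates):
                if j == gold:
                    continue
                diff = [g - c for g, c in zip(candidates[gold], cand)]
                w = [wi * (1.0 - lr * reg) for wi in w]  # L2 shrinkage
                if dot(w, diff) < 1.0:                   # margin violated: hinge subgradient step
                    w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def rank(w: Vector, candidates: List[Vector]) -> int:
    """Return the index of the highest-scoring candidate."""
    return max(range(len(candidates)), key=lambda i: dot(w, candidates[i]))


# Toy features (hypothetical): [cursor over piece at RE onset,
#                               most recently manipulated, referred to by the last RE]
TRAIN: List[Query] = [
    ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], 0),
    ([[0, 1, 1], [1, 0, 0], [0, 0, 0]], 1),
    ([[1, 1, 0], [0, 0, 1], [0, 0, 0]], 0),
]
weights = train_pairwise_ranker(TRAIN)
predictions = [rank(weights, cands) for cands, _ in TRAIN]
```

At prediction time every candidate piece is scored by the same weight vector and the top-scoring one is returned, so no separate pairwise tournament is needed.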


Extra-linguistic information (1/2): history of mouse movement

Current position of the mouse cursor and history of mouse movements
  Represent the temporal salience of the participant's focus of attention and its transition

[Figure: Tangram pieces numbered 1 to 7 with the mouse cursor]


Extra-linguistic information (1/2): Action history features

the mouse cursor was over a piece (i.e. a candidate referent) at the beginning of uttering a RE
a piece is the last piece that the mouse cursor was over
time distance after the mouse cursor was over a piece: x < 10 sec / 10 sec ≤ x < 20 sec / 20 sec ≤ x
the mouse cursor is never over a piece in the preceding utterances
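The action history features listed above could be computed from a log of mouse-over intervals roughly as follows. This is a sketch under assumptions: the log format `(start, end, piece)` in seconds, the function and feature names, and the simplification of "in the preceding utterances" to "at any time before the RE onset" are not taken from the original work.

```python
from typing import Dict, List, Tuple

Hover = Tuple[float, float, int]  # (start sec, end sec, piece id): cursor was over the piece


def time_bin(x: float) -> str:
    """Map an elapsed time in seconds to one of the three bins on the slide."""
    if x < 10:
        return "lt10"
    if x < 20:
        return "10to20"
    return "ge20"


def action_history_features(piece: int, hover_log: List[Hover], onset: float) -> Dict[str, bool]:
    """Binary action history features for one candidate piece at the RE onset time."""
    past = [h for h in hover_log if h[0] <= onset]  # hovers that began before the RE
    mine = [h for h in past if h[2] == piece]
    feats = {
        "over_at_onset": any(s <= onset <= e for s, e, _ in mine),
        "last_hovered": bool(past) and max(past, key=lambda h: h[0])[2] == piece,
        "never_hovered": not mine,
        "elapsed_lt10": False,
        "elapsed_10to20": False,
        "elapsed_ge20": False,
    }
    if mine:
        last_end = min(onset, max(e for _, e, _ in mine))  # a hover may still be ongoing
        feats["elapsed_" + time_bin(onset - last_end)] = True
    return feats


# Hypothetical log: the cursor visits pieces 1, 3 and 6; the RE starts at t = 31.0 sec.
LOG: List[Hover] = [(0.0, 2.0, 1), (5.0, 6.0, 3), (30.0, 31.5, 6)]
f6 = action_history_features(6, LOG, onset=31.0)
f3 = action_history_features(3, LOG, onset=31.0)
f2 = action_history_features(2, LOG, onset=31.0)
```

Each candidate piece gets its own feature dictionary, so the ranker can compare all seven pieces for the same referring expression.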


Extra-linguistic information (2/2): history of series of operations

Recently manipulated pieces tend to be paid more attention than the other pieces

[Figure: Tangram pieces numbered 1 to 7]

Operation history
  Time      Piece  Operation
  ...
  12:01:03  1      rotate
  12:01:05  3      move
  12:01:10  2      move
  12:01:12  2      rotate


Extra-linguistic information (2/2): Current operation features

a piece is being manipulated at the beginning of uttering a RE
a piece is the most recently manipulated piece
time distance after a piece was most recently manipulated: x < 10 sec / 10 sec ≤ x < 20 sec / 20 sec ≤ x
a piece has never been manipulated
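As an illustration, the current operation features can be derived from a timestamped operation log such as the one shown on the preceding slide. The sketch below makes several assumptions that are not in the original: the function names, the returned dictionary layout, and in particular the `active_window` parameter, which approximates "being manipulated" as "operated on within one second before the RE onset" because the log only records points in time.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Op = Tuple[str, int, str]  # ("HH:MM:SS", piece id, operation)

# Operation history copied from the slide (earlier entries are elided there).
OPS: List[Op] = [
    ("12:01:03", 1, "rotate"),
    ("12:01:05", 3, "move"),
    ("12:01:10", 2, "move"),
    ("12:01:12", 2, "rotate"),
]


def seconds(ts: str) -> int:
    """Convert an "HH:MM:SS" timestamp to seconds since midnight."""
    t = datetime.strptime(ts, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def current_operation_features(piece: int, ops: List[Op], onset_ts: str,
                               active_window: int = 1) -> Dict[str, object]:
    """Current operation features for one candidate piece at the RE onset time."""
    onset = seconds(onset_ts)
    past = [(seconds(ts), p) for ts, p, _ in ops if seconds(ts) <= onset]
    mine = [t for t, p in past if p == piece]
    if not mine:
        return {"never_manipulated": True}
    elapsed = onset - max(mine)
    return {
        "never_manipulated": False,
        "being_manipulated": elapsed <= active_window,  # assumption, see lead-in
        "most_recent": max(past)[1] == piece,
        "elapsed_bin": "x<10" if elapsed < 10 else "10<=x<20" if elapsed < 20 else "20<=x",
    }


now = current_operation_features(2, OPS, "12:01:12")
later = current_operation_features(2, OPS, "12:01:30")
untouched = current_operation_features(7, OPS, "12:01:12")
```

A log that also recorded when each operation ended would allow "being manipulated" to be tested directly instead of through a window.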


Empirical evaluation

Investigate the impact of the extra-linguistic information
Data set: referring expressions in the REX-J corpus (2,048 referring expressions in 40 dialogues)
  13 expressions are excluded
    Expressions referring to more than one object
    Vague expressions, e.g. "biggest triangle" in a situation where there are two biggest triangles on the display
  2,035 expressions are used in 10-fold cross-validation
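For reference, splitting the 2,035 remaining expressions into ten folds can be sketched as below. The slide does not say how the folds were drawn (for example shuffled, or grouped by dialogue), so the contiguous split and the helper name `k_fold_indices` are assumptions made only to show the arithmetic.

```python
from typing import List


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds


n_expressions = 2048 - 13  # expressions left after the exclusions
folds = k_fold_indices(n_expressions)
fold_sizes = [len(f) for f in folds]
```

In each round one fold is held out for testing and the other nine are used for training, so every expression is tested exactly once.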


Two models

Pronouns are likely to be more directly associated with actions pointing to a piece
Denis and Baldridge (2008): when the size of training instances is relatively small, the models induced by learning algorithms should be created separately with regard to distinct features
Separated model
  Create two rankers; learn pronouns and non-pronouns independently
  Pronoun model: use the training instances whose REs are pronouns
  Non-pronoun model: use all other training instances
Combined model
  Create one ranker, induced from all training instances
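The difference between the two set-ups is only in how training instances are routed to rankers, which the following sketch illustrates. The instance format with an `is_pronoun` flag, the function names, the romanised example expressions and the `count_trainer` stand-in (which merely records how many instances it received) are assumptions for illustration; any real ranker trainer could be passed in its place.

```python
from typing import Callable, Dict, List

Instance = Dict[str, object]
Trainer = Callable[[List[Instance]], object]


def train_separated(instances: List[Instance], train: Trainer) -> Dict[str, object]:
    """Two rankers: one learned from pronoun REs, one from all other REs."""
    pronouns = [x for x in instances if x["is_pronoun"]]
    others = [x for x in instances if not x["is_pronoun"]]
    return {"pronoun": train(pronouns), "non_pronoun": train(others)}


def train_combined(instances: List[Instance], train: Trainer) -> Dict[str, object]:
    """One ranker induced from all training instances, used for every RE."""
    model = train(instances)
    return {"pronoun": model, "non_pronoun": model}


def select_ranker(models: Dict[str, object], instance: Instance) -> object:
    """Pick the ranker that should resolve this referring expression."""
    return models["pronoun" if instance["is_pronoun"] else "non_pronoun"]


def count_trainer(instances: List[Instance]) -> Dict[str, int]:
    """Stand-in trainer: remembers only the number of training instances."""
    return {"n_train": len(instances)}


# Hypothetical instances; the expressions are invented examples.
DATA: List[Instance] = [
    {"re": "sore", "is_pronoun": True},
    {"re": "kore", "is_pronoun": True},
    {"re": "ookii sankaku", "is_pronoun": False},
]
separated = train_separated(DATA, count_trainer)
combined = train_combined(DATA, count_trainer)
```

Under the separated set-up each ranker sees less data but can weight the features differently for pronouns and non-pronouns.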

Features

3 types of features
  Action history features
  Current operation features
  Discourse history features
    Acquired from the expressions of a given referring expression and its candidate antecedent in the preceding utterances
    e.g. a piece is referred to by the most recent RE
    e.g. case markers (o (accusative) or ni (dative)) follow the RE
Baseline model: use only discourse history features

Results

model                   discourse history   +action history     +current operation  +action history,
                        (baseline)                                                  +current operation
separated model         0.664 (1352/2035)   0.790 (1608/2035)   0.685 (1394/2035)   0.780 (1587/2035)
  a) pronoun model      0.648 (660/1018)    0.886 (902/1018)    0.692 (704/1018)    0.875 (891/1018)
  b) non-pronoun model  0.680 (692/1017)    0.694 (706/1017)    0.678 (690/1017)    0.684 (696/1017)
combined model          0.664 (1352/2035)   0.749 (1524/2035)   0.650 (1322/2035)   0.743 (1513/2035)
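The figures in the results table can be cross-checked with a few lines of arithmetic: each accuracy is correct/total, and the separated model's counts are the sums of the pronoun and non-pronoun counts. The snippet below is a reading aid only; the dictionary keys and helper name are invented, while the counts are copied from the table.

```python
# Counts of correctly resolved expressions, copied from the results table.
SETTINGS = ["baseline", "+AH", "+CO", "+AH+CO"]
CORRECT = {
    "pronoun":     dict(zip(SETTINGS, [660, 902, 704, 891])),
    "non_pronoun": dict(zip(SETTINGS, [692, 706, 690, 696])),
    "combined":    dict(zip(SETTINGS, [1352, 1524, 1322, 1513])),
}
TOTAL = {"pronoun": 1018, "non_pronoun": 1017, "combined": 2035}


def accuracy(correct: int, total: int) -> float:
    """Accuracy rounded to three decimals, as reported in the table."""
    return round(correct / total, 3)


# The separated model is the pronoun and non-pronoun rankers evaluated together.
separated_correct = {s: CORRECT["pronoun"][s] + CORRECT["non_pronoun"][s] for s in SETTINGS}
separated_accuracy = {s: accuracy(c, TOTAL["pronoun"] + TOTAL["non_pronoun"])
                      for s, c in separated_correct.items()}

# Gain of using both extra-linguistic feature types over the baseline.
pronoun_gain = round(accuracy(891, 1018) - accuracy(660, 1018), 3)
non_pronoun_gain = round(accuracy(696, 1017) - accuracy(692, 1017), 3)
```

The gains show that almost all of the improvement of the separated model comes from the pronoun ranker.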


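Gain over the baseline when both action history and current operation features are added
  pronoun model: 0.227
  non-pronoun model: 0.004

Pronouns are more sensitive to the usage of the action history features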



feature name  feature type       description
AH1           action history     mouse cursor was over a piece at the beginning of uttering a RE
CO1           current operation  a piece is being manipulated at the beginning of uttering a RE

AH1 and CO1 partially overlap

Other current operation features may have bad effects on ranking referents due to their ill-formed definitions


Summary and future directions

[Summary]
We demonstrated our first result of incorporating extra-linguistic clues into a corpus-based approach to reference resolution
  The performance increased by at most 12 points in comparison to the baseline model
  Extra-linguistic information in this domain is useful

[Future work]
Explore the effect of other extra-linguistic information, e.g. eye-gaze information
Investigate general aspects between REs and their objects
  Further evaluation based on different multimodal tasks
