
Page 1: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Olga Uryupina (uryupina@gmail.com)

Institute of Linguistics, RAS 13.11.08

Page 2: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Overview

• Anaphoricity and Antecedenthood
• Experiments
• Incorporating A&A detectors into a CR system
• Conclusion

Page 3: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

A&A: example

Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.

Page 4: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

A&A: example

Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.

Page 5: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Anaphoricity

Likely anaphors: pronouns, definite descriptions

Unlikely anaphors: indefinites

Unknown: proper names

Poesio & Vieira: more than 50% of definite descriptions in newswire text are not anaphoric!
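To make these cues concrete, here is a minimal rule-based sketch that guesses anaphoricity from the NP type alone; the word lists and the guess_anaphoric helper are illustrative assumptions, not the classifier trained in these experiments.

```python
# Heuristic anaphoricity guess from NP type alone (a sketch, not the
# learned classifier described in these slides).

def guess_anaphoric(np_text: str) -> str:
    """Return 'likely', 'unlikely', or 'unknown' for a noun phrase."""
    tokens = np_text.lower().split()
    pronouns = {"he", "she", "it", "they", "him", "her", "them",
                "his", "its", "their"}
    if len(tokens) == 1 and tokens[0] in pronouns:
        return "likely"      # pronouns are almost always anaphoric
    if tokens[0] in {"the", "this", "that", "these", "those"}:
        return "likely"      # definite descriptions: often anaphoric,
                             # but Poesio & Vieira show >50% are not
    if tokens[0] in {"a", "an", "some"}:
        return "unlikely"    # indefinites rarely pick up an antecedent
    if np_text[0].isupper():
        return "unknown"     # proper names: no reliable surface cue
    return "unknown"


if __name__ == "__main__":
    for np in ["the company", "a new company", "it", "Globalstar"]:
        print(np, "->", guess_anaphoric(np))
```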

Page 6: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

A&A: example

Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.

Page 7: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

A&A: example

Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.

Page 8: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Antecedenthood

Related to referentiality (Karttunen, 1976):

"no debt", etc.

Antecedenthood vs. Referentiality: corpus-based decision
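As a rough illustration of the Karttunen-style cue, the sketch below flags NPs introduced by determiners such as "no" as unlikely antecedents; the determiner list and the helper name are assumptions for illustration, not the corpus-based decision procedure used here.

```python
# Sketch: surface cue for (non-)referentiality in the spirit of
# Karttunen (1976). Illustrative only; the actual antecedenthood
# decision in this work is corpus-based, not rule-based.

NON_REFERENTIAL_DETERMINERS = {"no", "any", "neither"}

def unlikely_antecedent(np_text: str) -> bool:
    """True if the NP's determiner suggests it does not introduce a
    discourse referent, and so is a poor antecedent candidate."""
    first = np_text.lower().split()[0]
    return first in NON_REFERENTIAL_DETERMINERS

print(unlikely_antecedent("no debt"))          # True
print(unlikely_antecedent("the new company"))  # False
```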

Page 9: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Experiments

• Can we learn anaphoricity/antecedenthood classifiers?

• Do they help for coreference resolution?

Page 10: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Methodology

• MUC-7 dataset
• Anaphoricity/antecedenthood induced from the MUC annotations
• Ripper, SVM
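One plausible way to induce the two labels from coreference chains, sketched under an assumed mention representation (token offset plus chain id); the slide does not spell out the exact procedure, so this is only an approximation: a mention counts as anaphoric if an earlier mention shares its chain, and as an antecedent if a later one does.

```python
# Sketch: induce anaphoricity/antecedenthood labels from coreference
# chains. The Mention representation is assumed, not taken from the
# MUC-7 setup described in the talk.

from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Mention:
    start: int                # token offset in the document
    chain_id: Optional[int]   # coreference chain id; None if unannotated

def induce_labels(mentions: List[Mention]) -> Tuple[List[bool], List[bool]]:
    """mentions are assumed to be in document order."""
    anaphoric, antecedent = [], []
    for i, m in enumerate(mentions):
        if m.chain_id is None:                 # not in any chain
            anaphoric.append(False)
            antecedent.append(False)
            continue
        earlier = any(o.chain_id == m.chain_id for o in mentions[:i])
        later = any(o.chain_id == m.chain_id for o in mentions[i + 1:])
        anaphoric.append(earlier)    # some earlier mention corefers with it
        antecedent.append(later)     # some later mention refers back to it
    return anaphoric, antecedent
```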

Page 11: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Features

• Surface form (12)
• Syntax (20)
• Semantics (3)
• Salience (10)
• "Same-head" (2)
• From Karttunen, 1976 (7)

49 features, expanded to 123 boolean/continuous values
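The growth from 49 features to 123 boolean/continuous values is consistent with categorical features being expanded into indicator columns for the learners; below is a generic sketch of that expansion with invented feature names.

```python
# Sketch: expanding categorical features into boolean indicators, the
# usual reason a feature set grows from "logical" features to a larger
# boolean/continuous vector. Feature names here are invented examples.

def expand(feature_dict, vocab):
    """Turn a dict of raw features into a flat boolean/continuous vector."""
    vector = []
    for name, values in vocab.items():
        raw = feature_dict.get(name)
        if values is None:                     # continuous feature: pass through
            vector.append(float(raw or 0.0))
        else:                                  # categorical: one indicator per value
            vector.extend(1.0 if raw == v else 0.0 for v in values)
    return vector

vocab = {
    "np_type": ["pronoun", "definite", "indefinite", "proper_name"],  # categorical
    "sentence_distance": None,                                        # continuous
}
print(expand({"np_type": "definite", "sentence_distance": 2}, vocab))
# [0.0, 1.0, 0.0, 0.0, 2.0]
```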

Page 12: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Results: anaphoricity

Feature groups     R      P      F
Baseline           100    66.5   79.9
All                93.5   82.3   87.6
Surface            100    66.5   79.9
Syntax             97.4   72.0   82.8
Semantics          98.5   68.9   81.1
Salience           91.2   69.3   78.7
Same-head          84.5   81.1   82.8
Karttunen's        91.6   71.1   80.1
Syntax+Same-head   90.0   83.5   86.6

Page 13: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Results: antecedenthood

Feature groups     R      P      F
Baseline           100    66.5   79.9
All                95.7   69.2   80.4
Surface            94.6   68.5   79.5
Syntax             95.7   69.2   80.3
Semantics          94.9   69.4   80.2
Salience           98.9   67.0   79.9
Same-head          100    66.5   79.9
Karttunen's        99.3   67.3   80.2

Page 14: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Integrating A&A into a CR system

Apply A&A prefiltering before CR starts:
- Saves time
- Improves precision

Problem: we may filter out good candidates:
- Will lose some recall
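A sketch of where the prefilters would sit in a Soon-et-al.-style mention-pair resolver: rejected anaphor candidates and rejected antecedent candidates never reach the pairwise classifier. The function arguments are placeholders, not the actual system's API.

```python
# Sketch of A&A prefiltering in front of a mention-pair coreference
# resolver. `is_anaphoric`, `is_antecedent`, and `corefer` stand in for
# the trained classifiers; they are placeholders, not a real API.

def resolve(mentions, is_anaphoric, is_antecedent, corefer):
    links = []
    for j, anaphor in enumerate(mentions):
        if not is_anaphoric(anaphor):          # prefilter: skip non-anaphors
            continue
        # consider earlier mentions, closest first (Soon et al. style)
        for antecedent in reversed(mentions[:j]):
            if not is_antecedent(antecedent):  # prefilter: skip bad antecedents
                continue
            if corefer(antecedent, anaphor):
                links.append((antecedent, anaphor))
                break                          # stop at the first positive pair
    return links
```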

Page 15: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Oracle-based A&A prefiltering

Take the MUC-based A&A classifier ("gold standard")

CR system: Soon et al. (2001) with SVMs

MUC-7 validation set (3 "training" documents)

Page 16: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Oracle-based A&A prefiltering

Prefiltering      R      P      F
No prefiltering   54.5   56.9   55.7
±ana              49.6   73.6   59.3
±ante             54.2   69.4   60.9
±ana & ±ante      52.9   81.9   64.3

Page 17: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Automatically induced classifiers

Precision is more crucial than recall.

Learn Ripper classifiers with different loss ratios (L).
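Ripper's loss ratio weighs the two error types against each other, which is how the detector can be pushed toward higher precision at the cost of recall. The sketch below shows the analogous knob in a different toolkit (scikit-learn class weights); this is an assumption for illustration, not the software used in the talk.

```python
# Sketch: trading recall for precision by sweeping a class-weight
# parameter, analogous to Ripper's loss ratio. scikit-learn is used
# purely for illustration here.

from sklearn.svm import LinearSVC
from sklearn.metrics import precision_score, recall_score

def sweep_weights(X_train, y_train, X_dev, y_dev,
                  weights=(0.5, 1.0, 2.0, 4.0)):
    """Fit one model per positive-class weight (labels assumed 0/1)
    and report dev-set precision/recall for each setting."""
    results = []
    for w in weights:
        clf = LinearSVC(class_weight={1: w}).fit(X_train, y_train)
        pred = clf.predict(X_dev)
        results.append((w,
                        precision_score(y_dev, pred),
                        recall_score(y_dev, pred)))
    return results   # pick the weight whose precision suits the prefilter
```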

Page 18: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Anaphoricity prefiltering

Page 19: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Antecedenthood prefiltering

Page 20: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Conclusion

Automatically induced detectors:
• Reliable for anaphoricity
• Much less reliable for antecedenthood (a corpus explicitly annotated for referentiality could help)

A&A prefiltering:
• Ideally, should help
• In practice, substantial optimization required

Page 21: Detecting Anaphoricity and Antecedenthood for Coreference Resolution

Thank You!