![Page 1: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/1.jpg)
TEXT PROCESSING 1
Anaphora resolution
Introduction to Anaphora Resolution
![Page 2: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/2.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Early ML work Definite descriptions: Vieira & Poesio The MUC initiative – also: coreference evaluation
methods Soon et al
![Page 3: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/3.jpg)
Anaphora resolution: a specification of the problem
![Page 4: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/4.jpg)
Anaphora resolution:coreference chains
![Page 5: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/5.jpg)
Reminder: Factors that affect the interpretation of anaphoric expressions
Factors: Morphological features (agreement) Syntactic information Salience Lexical and commonsense knowledge
Distinction often made between CONSTRAINTS and PREFERENCES
![Page 6: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/6.jpg)
Agreement
GENDER strong CONSTRAINT for pronouns (in other languages: for other anaphors as well) [Jane] blamed [Bill] because HE spilt the
coffee (Ehrlich, Garnham e.a, Arnold e.a) NUMBER also strong constraint
[[Union] representatives] told [the CEO] that THEY couldn’t be reached
![Page 7: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/7.jpg)
Some complexities
Gender: [India] withdrew HER ambassador from the
Commonwealth “…to get a customer’s 1100 parcel-a-week load
to its doorstep” [actual error from LRC algorithm]
Number: The Union said that THEY would withdraw from
negotations until further notice.
![Page 8: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/8.jpg)
Syntactic information
BINDING constraints [John] likes HIM
EMBEDDING constraints [[his] friend]
PARALLELISM preferences Around 60% of pronouns occur in subject
position; around 70% of those refer to antecedents in subject position
[John] gave [Bill] a book, and [Fred] gave HIM a pencil
Effect of syntax on SALIENCE (Next)
![Page 9: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/9.jpg)
Salience
In every discourse, certain entities are more PROMINENT
![Page 10: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/10.jpg)
Factors that affect prominence
Distance Order of mention in the sentence
Entities mentioned earlier in the sentence more prominent Type of NP (proper names > other types of NPs) Number of mentions Syntactic position (subj > other GF, matrix >
embedded) Semantic role (‘implicit causality’ theories) Discourse structure
![Page 11: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/11.jpg)
Focusing theories
Hypothesis: One or more entities in the discourse are the FOCUS OF (LINGUISTIC) ATTENTION just like some entities in the visual space are the focus of VISUAL attention Grosz 1977, Reichman 1985: ‘focus
spaces’ Sidner 1979, Sanford & Garrod 1981:
‘focused entities’ Grosz et al 1981, 1983, 1995: Centering
![Page 12: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/12.jpg)
Lexical and commonsense knowledge
[The city council] refused [the women] a permit because they feared violence.[The city council] refused [the women] a permit because they advocated violence.
Winograd (1974), Sidner (1979)
BRISBANE – a terrific right rip from [Hector Thompson] dropped [Ross Eadie] at Sandgate on Friday night and won him the Australian welterweight boxing title. (Hirst, 1981)
![Page 13: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/13.jpg)
Problems to be resolved by an AR system: mention identification
Effect: recall Typical problems:
Nested NPs (possessives) [a city] 's [computer system] [[a city]’s computer system]
Appositions: [Madras], [India] [Madras, [India]]
Attachments
![Page 14: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/14.jpg)
Problems for AR:Complex attachments
[The quality that’s coming out of [software from [India]] The quality that’s coming out of software
from India is now exceeding the quality of software that’s coming out from the United States
scanning through millions of lines of computer code
ACE/bnews/devel/ABC19981001.1830.1257
![Page 15: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/15.jpg)
Problems for AR: agreement extraction
The committee are meeting / is meeting
The Union sent a representative. They ….
The doctor came to visit my father. SHE told him …
![Page 16: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/16.jpg)
Problems to be solved: anaphoricity determination
Expletives: IT’s not easy to find a solution Is THERE any reason to be optimistic at
all? Non-anaphoric definites
![Page 17: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/17.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Early ML work Definite descriptions: Vieira & Poesio The MUC initiative, coreference evaluation methods Soon et al 2001
![Page 18: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/18.jpg)
A brief history of early work on anaphora resolution
Algorithms heavily based on (hand-coded) commonsense knowledge: Charniak 72, Winograd 74, Hobbs 1974, Wilks 1975
Syntax-based algorithms: Hobbs 1974 Salience-based algorithms: Brennan et
al, Strube & Hahn, Tetreault
![Page 19: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/19.jpg)
First approaches to anaphora resolution: Charniak, Winograd, Wilks
Heavy emphasis on inference Hand-coded No formal evaluation
![Page 20: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/20.jpg)
1970s Example: SHRDLUhttp://hci.stanford.edu/~winograd/shrdlu/
Person: PICK UP A BIG RED BLOCK.
Computer: OK. (does it)
Person: GRASP THE PYRAMID.
Computer: I DON'T UNDERSTAND WHICH
PYRAMID YOU MEAN.
Person: FIND A BLOCK WHICH IS TALLER
THAN THE ONE YOU ARE HOLDING
AND PUT IT INTO THE BOX.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Computer: OK. (does it)
Person: WHAT DOES THE BOX CONTAIN?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: WHAT IS THE PYRAMID SUPPORTED BY?
Computer: THE BOX.
Person: HOW MANY BLOCKS ARE NOT
IN THE BOX?
Computer: FOUR OF THEM.
Person: IS AT LEAST ONE OF THEM
NARROWER THAN THE ONE WHICH
I TOLD YOU TO PICK UP?
Computer: YES, THE RED CUBE.
• This is one of the only dialogs it knows.• SHRDLU is too stupid to make mistakes.• Beautiful “Demo-ware”
Terry Winograd. 1971.MIT Ph.D. Thesis. Terry Winograd
![Page 21: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/21.jpg)
Anaphora in SHRDLU
First example of HISTORY LIST algorithm Uses a combination of agreement features and
semantic constraints Check all possibilities and assign a global score
rather than simply find the first match Score incorporates syn component: entities in subj
position higher score than entities in object position, in turn ranked more highly than entities in adjunct position
Performance made more impressive by including solutions to a number of complex cases, such as reference to events (Why did you do it?) – often ad hoc
![Page 22: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/22.jpg)
Hobbs’ `Naïve Algorithm’ (Hobbs, 1974)
The reference algorithm for PRONOUN resolution (until Soon et al it was the standard baseline) Interesting since Hobbs himself in the
1974 paper suggests that this algorithm is very limited (and proposes one based on semantics)
The first anaphora resolution algorithm to have an (informal) evaluation
Purely syntax based
![Page 23: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/23.jpg)
Hobbs’ `Naïve Algorithm’ (Hobbs, 1974)
Works off ‘surface parse tree’ Starting from the position of the pronoun
in the surface tree, first go up the tree looking for an antecedent
in the current sentence (left-to-right, breadth-first);
then go to the previous sentence, again traversing left-to-right, breadth-first.
And keep going back
![Page 24: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/24.jpg)
Hobbs’ algorithm: Intrasentential anaphora
Steps 2 and 3 deal with intrasentential anaphora and incorporate basic syntactic constraints:
Also: John’s portrait of him
S
NPJohn V
likesNPhim
Xp
![Page 25: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/25.jpg)
Hobbs’ Algorithm: intersentential anaphora
S
NPJohn V
likesNPhim
S
NPBill V
isNPa good friend
X
candidate
![Page 26: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/26.jpg)
Evaluation The first anaphora resolution algorithm to be evaluated in a
systematic manner, and still often used as baseline (hard to beat!) Hobbs, 1974:
300 pronouns from texts in three different styles (a fiction book, a non-fiction book, a magazine)
Results: 88.3% correct without selectional constraints, 91.7% with SR 132 ambiguous pronouns; 98 correctly resolved.
Tetreault 2001 (no selectional restrictions; all pronouns) 1298 out of 1500 pronouns from 195 NYT articles (76.8% correct) 74.2% correct intra, 82% inter
Main limitations Reference to propositions excluded Plurals Reference to events
![Page 27: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/27.jpg)
Salience-based algorithms
Common hypotheses: Entities in discourse model are RANKED by
salience Salience gets continuously updated Most highly ranked entities are preferred
antecedents Variants:
DISCRETE theories (Sidner, Brennan et al, Strube & Hahn): 1-2 entities singled out
CONTINUOUS theories (Alshawi, Lappin & Leass, Strube 1998, LRC): only ranking
![Page 28: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/28.jpg)
Salience-based algorithms
Sidner 1979: Most extensive theory of the influence of salience on
several types of anaphors Two FOCI: discourse focus, agent focus never properly evaluated
Brennan et al 1987 (see Walker 1989) Ranking based on grammatical function One focus (CB)
Strube & Hahn 1999 Ranking based on information status (NP type)
S-List (Strube 1998): drop CB LRC (Tetreault): incremental
![Page 29: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/29.jpg)
LRC
An update on Strube’s S-LIST algorithm (= Centering without centers)
Initial version augmented with various syntactic and discourse constraints
![Page 30: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/30.jpg)
LRC Algorithm (LRC)
Maintain a stack of entities ranked by grammatical function and sentence order (Subj > DO > IO)
Each sentence is represented by a Cf-list: list of salient entities ordered by grammatical function
While processing utterance’s entities (left to right) do: Push entity onto temporary list (Cf-list-new), if pronoun, attempt
to resolve first: Search through Cf-list-new (l-to-r) taking the first candidate
that meets gender, agreement constraints, etc. If none found, search past utterance’s Cf-lists starting from
previous utterance to beginning of discourse
![Page 31: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/31.jpg)
Results
Algorithm PTB-News (1694)
PTB-Fic (511)
LRC 74.9% 72.1%
S-List 71.7% 66.1%
BFP 59.4% 46.4%
![Page 32: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/32.jpg)
Comparison with ML techniques of the time
Algorithm All 3rd
LRC 76.7%
Ge et al. (1998) 87.5% (*)
Morton (2000) 79.1%
![Page 33: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/33.jpg)
Carter’s Algorithm (1985)
The most systematic attempt to produce a system using both salience (Sidner’s theory) & commonsense knowledge (Wilks’s preference semantics)
Small-scale evaluation (around 100 hand-constructed examples)
Many ideas found their way in the SRI’s Core Language Engine (Alshawi, 1992)
![Page 34: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/34.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Modern work in anaphora resolution Early ML work The MUC initiative – also: coreference evaluation
methods Soon et al
![Page 35: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/35.jpg)
MODERN WORK IN ANAPHORA RESOLUTION
Availability of the first anaphorically annotated corpora circa 1993 (MUC6) made statistical methods possible
![Page 36: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/36.jpg)
STATISTICAL APPROACHES TO ANAPHORA RESOLUTION
UNSUPERVISED approaches Eg Cardie & Wagstaff 1999, Ng 2008
SUPERVISED approaches Early (NP type specific) Soon et al: general classifier + modern
architecture
![Page 37: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/37.jpg)
ANAPHORA RESOLUTION AS A CLASSIFICATION PROBLEM
1. Classify NP1 and NP2 as coreferential or not
2. Build a complete coreferential chain
![Page 38: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/38.jpg)
SOME KEY DECISIONS
ENCODING I.e., what positive and negative instances to
generate from the annotated corpus Eg treat all elements of the coref chain as
positive instances, everything else as negative: DECODING
How to use the classifier to choose an antecedent
Some options: ‘sequential’ (stop at the first positive), ‘parallel’ (compare several options)
![Page 39: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/39.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Modern work in anaphora resolution Early ML work The MUC initiative – also: coreference evaluation
methods Soon et al
![Page 40: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/40.jpg)
Early machine-learning approaches
Main distinguishing feature: concentrate on a single NP type
Both hand-coded and ML: Aone & Bennett (pronouns) Vieira & Poesio (definite descriptions)
Ge and Charniak (pronouns)
![Page 41: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/41.jpg)
Definite descriptions:Vieira & Poesio
A first attempt at going beyond pronouns while still doing a large-scale evaluation
Definite descriptions chosen because they require lexical and commonsense knowledge
Developing both a hand-coded and a ML decision tree (as in Aone and Bennett)
Vieira & Poesio 1996, Vieira 1998, Vieira & Poesio 2000
![Page 42: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/42.jpg)
Preliminary corpus study (Poesio and Vieira, 1998)
Annotators asked to classify about 1,000 definite descriptions from the ACL/DCI corpus (Wall Street Journal texts) into three classes:
DIRECT ANAPHORA: a house … the house
DISCOURSE-NEW: the belief that ginseng tastes like spinach is more widespread than one would expect
BRIDGING DESCRIPTIONS:the flat … the living room; the car … the vehicle
![Page 43: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/43.jpg)
Poesio and Vieira, 1998
Results:
More than half of the def descriptions are first-mention
Subjects didn’t always agree on the classification of an antecedent (bridging descriptions: ~8%)
![Page 44: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/44.jpg)
The Vieira / Poesio system for robust definite description resolution
Follows a SHALLOW PROCESSING approach (Carter, 1987; Mitkov, 1998): it only uses
Structural information (extracted from Penn Treebank)
Existing lexical sources (WordNet)
(Very little) hand-coded information
![Page 45: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/45.jpg)
Methods for resolving direct anaphors
DIRECT ANAPHORA:
the red car, the car, the blue car: premodification heuristics
segmentation: approximated with ‘loose’ windows
![Page 46: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/46.jpg)
Methods for resolving discourse-new definite descriptions
DISCOURSE-NEW DEFINITES
the first man on the Moon, the fact that Ginseng tastes of spinach: a list of the most common functional predicates (fact, result, belief) and modifiers (first, last, only… )
heuristics based on structural information (e.g., establishing relative clauses)
![Page 47: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/47.jpg)
The (hand-coded) decision tree
1. Apply ‘safe’ discourse-new recognition heuristics
2. Attempt to resolve as same-head anaphora
3. Attempt to classify as discourse new
4. Attempt to resolve as bridging description. Search backward 1 sentence at a time and apply heuristics in the following order:1. Named entity recognition heuristics – R=.66, P=.95
2. Heuristics for identifying compound nouns acting as anchors – R=.36
3. Access WordNet – R, P about .28
![Page 48: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/48.jpg)
The decision tree obtained via ML
Same features as for the hand-coded decision tree
Using ID3 classifier (non probabilistic decision tree)
Training instances: Positive: closest annotated antecedent Negative: all mentions in previous four
sentences Decoding: consider all possible
antecedents in previous four sentences
![Page 49: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/49.jpg)
Automatically learned decision tree
![Page 50: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/50.jpg)
Overall Results
Evaluated on a test corpus of 464 definite descriptions
Overall results:
R P F
Version 1 53% 76% 62%
Version 2 57% 70% 62%
D-N def 77% 77% 77%
ID3 75% 75% 75%
![Page 51: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/51.jpg)
Overall Results
Results for each type of definite description:
R P F
Direct anaphora
62% 83% 71%
Disc new 69% 72% 70%
Bridging 29% 38% 32.9%
![Page 52: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/52.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Early ML work Definite descriptions: Vieira & Poesio The MUC initiative and coreference evaluation
methods Soon et al
![Page 53: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/53.jpg)
MUC
First big initiative in Information Extraction
Produced first sizeable annotated data for coreference
Developed first methods for evaluating systems
![Page 54: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/54.jpg)
MUC terminology:
MENTION: any markable COREFERENCE CHAIN: a set of
mentions referring to an entity KEY: the (annotated) solution (a
partition of the mentions into coreference chains)
RESPONSE: the coreference chains produced by a system
![Page 55: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/55.jpg)
Evaluation of coreference resolution systems
Lots of different measures proposed ACCURACY:
Consider a mention correctly resolved if Correctly classified as anaphoric or not anaphoric ‘Right’ antecedent picked up
Measures developed for the competitions: Automatic way of doing the evaluation
More realistic measures (Byron, Mitkov) Accuracy on ‘hard’ cases (e.g., ambiguous
pronouns)
![Page 56: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/56.jpg)
Vilain et al 1995
The official MUC scorer Based on precision and recall of links
![Page 57: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/57.jpg)
Vilain et al: the goalThe problem: given that A,B,C and D are part of a coreference chain in the KEY, treat as equivalent the two responses:
And as superior to:
![Page 58: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/58.jpg)
Vilain et al: RECALL
To measure RECALL, look at how each coreference chain Si in the KEY is partitioned in the RESPONSE, and count how many links would be required to recreate the original, then average across all coreference chains.
![Page 59: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/59.jpg)
Vilain et al: Example recall
In the example above, we have one coreference chain of size 4 (|S| = 4)
The incorrect response partitions it in two sets (|p(S)| = 2)
R = 4-2 / 4-1 = 2/3
![Page 60: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/60.jpg)
Vilain et al: precision
Count links that would have to be (incorrectly) added to the key to produce the response
I.e., ‘switch around’ key and response in the equation before
![Page 61: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/61.jpg)
Beyond Vilain et al
Problems: Only gain points for links. No points
gained for correctly recognizing that a particular mention is not anaphoric
All errors are equal Proposals:
Bagga & Baldwin’s B-CUBED algorithm Luo recent proposal
![Page 62: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/62.jpg)
Outline
A reminder: anaphora resolution, factors affecting the interpretation of anaphoric expressions
A brief history of anaphora resolution First algorithms: Charniak, Winograd, Wilks Pronouns: Hobbs Salience: S-List, LRC
Early ML work Definite descriptions: Vieira & Poesio The MUC initiative – also: coreference evaluation
methods Soon et al
![Page 63: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/63.jpg)
Soon et al 2001
First ‘modern’ ML approach to anaphora resolution Resolves ALL anaphors Fully automatic mention identification
Developed instance generation & decoding methods used in a lot of work since
![Page 64: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/64.jpg)
Soon et al: preprocessing
POS tagger: HMM-based 96% accuracy
Noun phrase identification module HMM-based Can identify correctly around 85% of mentions (??
90% ??)
NER: reimplementation of Bikel Schwartz and Weischedel 1999 HMM based 88.9% accuracy
![Page 65: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/65.jpg)
Soon et al: training instances
<ANAPHOR (j), ANTECEDENT (i)>
![Page 66: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/66.jpg)
Soon et al 2001: Features
NP type Distance Agreement Semantic class
![Page 67: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/67.jpg)
Soon et al: NP type and distance
NP type of antecedent i i-pronoun (bool)
NP type of anaphor j (3) j-pronoun, def-np, dem-np (bool)
DIST 0, 1, ….
Types of both both-proper-name (bool)
![Page 68: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/68.jpg)
Soon et al features: string match, agreement, syntactic position
STR_MATCHALIAS dates (1/8 – January 8) person (Bent Simpson / Mr. Simpson) organizations: acronym match (Hewlett Packard / HP)
AGREEMENT FEATURES number agreement gender agreement
SYNTACTIC PROPERTIES OF ANAPHOR occurs in appositive contruction
![Page 69: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/69.jpg)
Soon et al: semantic class agreement
PERSON
FEMALE MALE
OBJECT
DATEORGANIZATION
TIME MONEY PERCENT
LOCATION
SEMCLASS = true iff semclass(i) <= semclass(j) or viceversa
![Page 70: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/70.jpg)
Soon et al: generating training instances
Marked antecedent used to create positive instance
All mentions between anaphor and marked antecedent used to create negative instances
![Page 71: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/71.jpg)
Generating training instances
((Eastern Airlines) executives) notified ((union) leaders) that (the carrier) wishes to discuss (selective (wage) reductions) on (Feb 3)
POSITIVE
NEGATIVE
NEGATIVE
NEGATIVE
![Page 72: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/72.jpg)
Soon et al: decoding
Right to left, consider each antecedent until classifier returns true
![Page 73: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/73.jpg)
Soon et al: evaluation
MUC-6: P=67.3, R=58.6, F=62.6
MUC-7: P=65.5, R=56.1, F=60.4
![Page 74: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/74.jpg)
Soon et al: evaluation
![Page 75: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/75.jpg)
Subsequent developments
Different models of the task Different preprocessing techniques Using lexical / commonsense knowledge
(particularly semantic role labelling) Next lecture
Salience Anaphoricity detection Development of AR toolkits (GATE,
LingPipe, GUITAR)
![Page 76: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/76.jpg)
Other models
Cardie & Wagstaff: coreference as (unsupervised) clustering Much lower performance
Ng and Cardie Yang ‘twin-candidate’ model
![Page 77: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/77.jpg)
Ng and Cardie
2002: Changes to the model:
Positive: first NON PRONOMINAL Decoding: choose MOST HIGH PROBABILITY
Many more features: Many more string features Linguistic features (binding, etc)
Subsequently: Discourse new detection (see below)
![Page 78: TEXT PROCESSING 1 Anaphora resolution Introduction to Anaphora Resolution](https://reader030.vdocuments.us/reader030/viewer/2022032612/56649ea25503460f94ba644e/html5/thumbnails/78.jpg)
Readings
Kehler’s chapter in Jurafsky & Martin Alternatively: Elango’s survey
http://pages.cs.wisc.edu/~apirak/cs/cs838/pradheep-survey.pdf
Hobbs J.R. 1978, “Resolving Pronoun References,” Lingua, Vol. 44, pp. 311-. 338. Also in Readings in Natural Language Processing,
Renata Vieira, Massimo Poesio, 2000. An Empirically-based System for Processing Definite Descriptions. Computational Linguistics 26(4): 539-593
W. M. Soon, H. T. Ng, and D. C. Y. Lim, 2001. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521--544,