Error analysis of Word Sense Disambiguation
Ruben Izquierdo, Marten Postma, Piek Vossen
VU Amsterdam
Motivation
Word Sense Disambiguation is still an unsolved problem
Error Analysis
Perform error analysis on previous WSD evaluations to prove our hypothesis
Senseval-2: all-words task
Senseval-3: all-words task
Semeval2007: all-words task (#17)
Semeval2010: all-words on specific domain (#17)
Semeval2013: multilingual all-words WSD and entity linking (#12)
Motivation
Some “propagated” errors
Errors on monosemous
Errors caused by PoS tags
Multiwords and phrasal verbs
Little attention has been paid to the real problem
WSD is not 1 problem but N problems
Our hypothesis
Context is not modeled properly in general
Systems rely too much on the most frequent sense
Monosemous errors
Competition            Monosemous    Wrong   Examples
Senseval2              499 (20.9%)   37.5%   gene.n (suppressor_gene.n), chance.a (chance.n), next.r (next.a)
Senseval3              334 (16.6%)   44.1%   Datum.n (data.n), making.n (make.v), out_of_sight (sight)
Semeval2007            25 (5.5%)     11.1%   get_stuck.v, lack.v, write_about.v
Semeval2010            31 (2.2%)     97.9%   Tidal_zone.n, pine_marten.n, roe_deer.n, cordgrass.n
Semeval2013 (lemmas)   348 (21.1%)   1.9%    Private_enterprise, developing_country, narrow_margin
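A minimal sketch of how the monosemous counts could be computed for a single system. The data layout (dictionaries keyed by instance id), the function name and the toy sense labels are all assumptions made for illustration; the slides do not say how the "Wrong" column was aggregated across systems.

```python
def monosemous_error_stats(gold, predicted, inventory):
    """Count monosemous instances and how often one system got them wrong.

    gold:      instance id -> (lemma, correct sense)      (assumed layout)
    predicted: instance id -> sense chosen by the system  (assumed layout)
    inventory: lemma -> list of senses in the sense inventory
    """
    # An instance is monosemous when its lemma has exactly one sense.
    mono_ids = [i for i, (lemma, _) in gold.items()
                if len(inventory.get(lemma, ())) == 1]
    wrong = [i for i in mono_ids if predicted.get(i) != gold[i][1]]
    share_mono = len(mono_ids) / len(gold) if gold else 0.0
    wrong_rate = len(wrong) / len(mono_ids) if mono_ids else 0.0
    return len(mono_ids), share_mono, wrong_rate


# Toy data (hypothetical sense labels, not taken from any competition).
inventory = {"gene.n": ["gene%1"], "lack.v": ["lack%1"],
             "bank.n": ["bank%1", "bank%2"]}
gold = {"t1": ("gene.n", "gene%1"), "t2": ("bank.n", "bank%2"),
        "t3": ("lack.v", "lack%1"), "t4": ("bank.n", "bank%1")}
predicted = {"t1": "suppressor_gene%1", "t2": "bank%1",
             "t3": "lack%1", "t4": "bank%1"}

stats = monosemous_error_stats(gold, predicted, inventory)
```

Here the wrong answer on `gene.n` mimics the multiword case in the table, where a system tags `gene.n` although the gold lemma is `suppressor_gene.n`.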
Most Frequent Sense
When the correct sense is NOT the most frequent sense
Systems still assign mostly the MFS
Senseval2
799 tokens are not MFS
84% systems still assign the MFS
Most “failed” words due to MFS bias
Senseval2, Senseval3
Say.v, find.v, take.v, have.v, cell.n, church.n
Semeval2010
Area.n, nature.n, connection.n, water.n, population.n
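The MFS bias described above can be measured roughly as follows. This is a sketch under assumptions: the function name, the dictionary layout and the toy sense labels are hypothetical, and the MFS of each lemma is taken as given (for example, the first sense in the inventory).

```python
def mfs_bias(gold, predicted, mfs):
    """Measure how often a system falls back on the MFS when it is wrong.

    gold:      instance id -> (lemma, correct sense)      (assumed layout)
    predicted: instance id -> sense chosen by the system  (assumed layout)
    mfs:       lemma -> most frequent sense of that lemma
    Returns (number of non-MFS instances, share of them tagged with the MFS).
    """
    # Keep only instances whose correct sense is NOT the most frequent one.
    non_mfs = [i for i, (lemma, sense) in gold.items()
               if lemma in mfs and sense != mfs[lemma]]
    assigned_mfs = sum(1 for i in non_mfs
                       if predicted.get(i) == mfs[gold[i][0]])
    rate = assigned_mfs / len(non_mfs) if non_mfs else 0.0
    return len(non_mfs), rate


# Toy data (hypothetical sense labels).
mfs = {"say.v": "say%1", "cell.n": "cell%1"}
gold = {"t1": ("say.v", "say%2"), "t2": ("say.v", "say%1"),
        "t3": ("cell.n", "cell%3"), "t4": ("cell.n", "cell%2")}
predicted = {"t1": "say%1", "t2": "say%1", "t3": "cell%1", "t4": "cell%2"}

n_non_mfs, mfs_rate = mfs_bias(gold, predicted, mfs)
```

In the toy data three instances have a non-MFS gold sense, and the system assigns the MFS to two of them.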
Analysis per PoS-tag
Analysis per polysemy class
[Figure: polysemy classes (Low, Medium, High) along a number-of-senses axis marked at 2, 6 and 15]
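A per-class breakdown of this kind could be sketched as below. The class boundaries (low: 2 to 5 senses, medium: 6 to 14, high: 15 or more) are an assumption read off the axis marks 2, 6 and 15; the function names and data layout are hypothetical as well.

```python
from collections import defaultdict


def polysemy_class(n_senses, medium_from=6, high_from=15):
    """Map a number of senses to a class; the thresholds are assumptions."""
    if n_senses <= 0:
        return "unknown"       # lemma missing from the inventory
    if n_senses == 1:
        return "monosemous"
    if n_senses < medium_from:
        return "low"
    if n_senses < high_from:
        return "medium"
    return "high"


def accuracy_per_class(gold, predicted, n_senses):
    """gold: id -> (lemma, sense); predicted: id -> sense;
    n_senses: lemma -> number of senses (all assumed layouts)."""
    totals, correct = defaultdict(int), defaultdict(int)
    for i, (lemma, sense) in gold.items():
        cls = polysemy_class(n_senses.get(lemma, 0))
        totals[cls] += 1
        correct[cls] += int(predicted.get(i) == sense)
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Toy data (hypothetical sense counts and labels).
n_senses = {"bank.n": 2, "take.v": 20}
gold = {"t1": ("bank.n", "bank%2"), "t2": ("take.v", "take%7"),
        "t3": ("take.v", "take%1")}
predicted = {"t1": "bank%2", "t2": "take%1", "t3": "take%1"}

per_class = accuracy_per_class(gold, predicted, n_senses)
```

The same grouping works for PoS tags or frequency classes by swapping `polysemy_class` for another key function.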
Analysis per frequency class
Most difficult words
Expected vs. Observed difficulties
Calculate per sentence
The “expected” difficulty
Average polysemy, sentence length, average word length
The “observed” difficulty
From the real participant outputs, average error rate
We should expect:
harder sentences: higher error rate
easier sentences: lower error rate
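The comparison can be sketched as follows. The slides name the three features but not how they are combined, so averaging their z-scores is an assumption of this example, as are the function names and the use of Pearson correlation to compare the two difficulty scores.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise a list of numbers; constant lists map to zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def expected_difficulty(sentences, polysemy):
    """sentences: non-empty lists of words; polysemy: word -> number of senses.
    Combines the three features by averaging z-scores (an assumption)."""
    avg_polysemy = [mean(polysemy.get(w, 1) for w in s) for s in sentences]
    lengths = [len(s) for s in sentences]
    avg_word_len = [mean(len(w) for w in s) for s in sentences]
    features = [zscores(avg_polysemy), zscores(lengths), zscores(avg_word_len)]
    return [mean(column) for column in zip(*features)]


def observed_difficulty(error_rates):
    """error_rates: per sentence, the error rates of each participant system."""
    return [mean(rates) for rates in error_rates]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Toy data (hypothetical polysemy counts and error rates).
polysemy = {"the": 1, "cat": 2, "sat": 3, "bank": 10, "run": 40}
sentences = [["the", "cat", "sat"], ["the", "bank", "run"], ["cat"]]
expected = expected_difficulty(sentences, polysemy)
observed = observed_difficulty([[0.2, 0.4], [0.5, 0.7], [0.3, 0.3]])
correlation = pearson(expected, observed)
```

A correlation near zero between `expected` and `observed` would match the claim that sentences expected to be easy do not reliably show lower error rates.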
• The context is (probably) not exploited properly
• Expected “easy” sentences SHOULD show low error rates
• Occurrences of the same word in different contexts have similar error rates
• The difficulty of a word depends more on its polysemy than on the context where it appears
WSD Corpora
http://github.com/rubenIzquierdo/wsd_corpora
System Outputs
https://github.com/rubenIzquierdo/sval_systems
Error analysis of Word Sense Disambiguation
Ruben Izquierdo
Marten Postma
Piek Vossen
http://github.com/rubenIzquierdo/wsd_corpora
http://github.com/rubenIzquierdo/sval_systems