Using Semantic Relatedness for Word Sense Disambiguation Siddharth Patwardhan patw0006@d.umn.edu 10/24/2002


Page 1: Using Semantic Relatedness for Word Sense Disambiguation Siddharth Patwardhan patw0006@d.umn.edu 10/24/2002

Using Semantic Relatedness for Word Sense Disambiguation

Siddharth Patwardhan

patw0006@d.umn.edu

10/24/2002


The Lesk Algorithm [Lesk 1986]

• Two hypotheses:

– The intended sense of the target word in a given context is semantically related to other word senses in the context.

– Semantically related words have a greater number of overlaps between their dictionary definitions.


An Example

The rate of interest at this bank is high.

• rate: (charge per unit), (change with time), (pace)

• interest: (involvement), (interestingness), (sake), (charge for loan), (pastime), (stake)

• bank: (financial institution), (river), (stock), (building), (arrangement), (container)

rate: amount of a charge or payment relative to some basis; "a 10-minute phone call at that rate would cost $5"

interest: a fixed charge for borrowing money; usually a percentage of the amount borrowed; "how much interest do you pay on your mortgage?"

bank: a financial institution that accepts deposits and channels the money into lending activities; "he cashed a check at the bank"; "that bank holds the mortgage on my home"
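The overlap counting behind the two Lesk hypotheses can be sketched on this example. The following is a minimal illustration, not the original implementation: the stopword list, the function names and the "pastime" gloss are assumptions made for the sketch, while the other glosses are the ones quoted above (without their usage examples).

```python
import re

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"a", "an", "and", "at", "for", "into", "of", "or", "that", "the", "to"}

def content_words(gloss):
    """Lowercase a gloss and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", gloss.lower()) if w not in STOPWORDS}

def overlap(gloss_a, gloss_b):
    """Number of distinct content words shared by two glosses."""
    return len(content_words(gloss_a) & content_words(gloss_b))

def lesk(target_glosses, context_glosses):
    """Score each target sense by its total overlap with the context glosses."""
    scores = {
        sense: sum(overlap(gloss, ctx) for ctx in context_glosses)
        for sense, gloss in target_glosses.items()
    }
    return max(scores, key=scores.get), scores

# Two candidate senses of "interest"; the "pastime" gloss is a hypothetical stand-in.
interest_glosses = {
    "charge for loan": "a fixed charge for borrowing money; usually a percentage of the amount borrowed",
    "pastime": "a diversion that occupies one's time and thoughts",
}
# Glosses of the intended senses of "rate" and "bank" from the example.
context_glosses = [
    "amount of a charge or payment relative to some basis",
    "a financial institution that accepts deposits and channels the money into lending activities",
]

best_sense, sense_scores = lesk(interest_glosses, context_glosses)
print(best_sense, sense_scores)
```

In this toy setting the "charge for loan" sense shares "charge" and "amount" with the gloss of rate and "money" with the gloss of bank, so it wins over the unrelated sense.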


Adapting Lesk to WordNet [Fellbaum 1998]

Banerjee and Pedersen [2002] adapt the Lesk algorithm to use the rich source of knowledge in WordNet.

rate: gloss        interest: gloss        bank: gloss

hypernym: gloss    hypernym: gloss        hypernym: gloss

hyponym: gloss     hyponym: gloss         hyponym: gloss
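One way to read the grid above is that each word sense contributes not only its own gloss but also the glosses of its hypernyms and hyponyms, and overlaps are counted across all pairs. The sketch below illustrates only that idea with plain word overlap; the dictionary layout, the function names and the toy glosses are hypothetical, and it does not claim to reproduce the scoring used by Banerjee and Pedersen.

```python
from itertools import product

def words(text):
    """Split a gloss into a set of lowercase tokens."""
    return set(text.lower().split())

def extended_overlap(sense_a, sense_b):
    """Sum word overlaps over every pair of related glosses of two senses.

    Each sense is a dict mapping a relation name ("gloss", "hypernym", ...)
    to the gloss text reached through that relation.
    """
    return sum(
        len(words(sense_a[rel_a]) & words(sense_b[rel_b]))
        for rel_a, rel_b in product(sense_a, sense_b)
    )

# Toy, made-up glosses purely for illustration (not WordNet text).
interest = {
    "gloss": "fixed charge for borrowing money",
    "hypernym": "payment of money owed",
}
bank = {
    "gloss": "institution lending money",
    "hypernym": "organization handling money",
}

plain = len(words(interest["gloss"]) & words(bank["gloss"]))
extended = extended_overlap(interest, bank)
print(plain, extended)
```

With the related glosses included, the two senses have four gloss pairs to draw overlaps from instead of one.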


Semantic Relatedness by Counting Edges

• Rada et al. [1989] introduce a notion of relatedness between words by counting the number of edges between them in a "broader-than" hierarchy (MeSH: a hierarchy of medical terms).

• Leacock and Chodorow [1998] use a similar approach to measure semantic relatedness between concepts: they find the length of the shortest path between the two concepts in the is-a hierarchy of WordNet, scale this value by the maximum depth of the taxonomy, and obtain a formula for relatedness:

relatedness = -log(pathLength/(2·maxDepth))
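As a small worked illustration of the formula above, the sketch below evaluates it for a few path lengths. The natural logarithm, the function name and the example depth of 16 are assumptions for the sketch; the slide does not fix the base of the logarithm or give a depth.

```python
import math

def lch_relatedness(path_length, max_depth):
    """Leacock-Chodorow style relatedness: -log(pathLength / (2 * maxDepth))."""
    if path_length <= 0 or max_depth <= 0:
        raise ValueError("path_length and max_depth must be positive")
    return -math.log(path_length / (2 * max_depth))

MAX_DEPTH = 16  # hypothetical taxonomy depth, for illustration only

for length in (1, 4, 8, 32):
    print(length, round(lch_relatedness(length, MAX_DEPTH), 3))
```

Shorter paths give higher relatedness, and a path as long as twice the maximum depth gives zero.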


Information Content

• Introduced by Resnik [1995].

• Indicates the specificity or generality of a concept.

• More specific concepts have higher information content, while more general concepts have less information content.

• For example, concepts like dime, clinker and hayfork are rather specific or topical, would be localized in a discourse and would greatly restrict the choice of concepts that can be used around them (in the context).

• Computed from large (ideally sense-tagged) corpora.

• IC(concept) = -log(Probability of occurrence of the concept in a large corpus)


Information Content

[Figure: the is-a chain *Root* → Motor vehicle → car → cab → minicab, drawn twice, with "+1" count increments marked on its nodes (three in the first copy, five in the second).]
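The information content definition, together with the count increments along the is-a chain, can be sketched as follows. This is a minimal illustration under stated assumptions: each observed concept adds one to its own count and to the count of every ancestor up to the root (the usual way such counts are accumulated), the natural logarithm is used, and the observation list and function names are hypothetical.

```python
import math
from collections import Counter

# The is-a chain from the figure, child -> parent.
PARENT = {
    "minicab": "cab",
    "cab": "car",
    "car": "motor vehicle",
    "motor vehicle": "*Root*",
}

def propagate_counts(observations):
    """Add +1 to each observed concept and to all of its ancestors."""
    counts = Counter()
    for concept in observations:
        node = concept
        while node is not None:
            counts[node] += 1
            node = PARENT.get(node)  # None once the root has been counted
    return counts

def information_content(counts, root="*Root*"):
    """IC(c) = -log(count(c) / count(root)) for every counted concept."""
    total = counts[root]
    return {c: -math.log(n / total) for c, n in counts.items()}

# Hypothetical corpus observations, for illustration only.
counts = propagate_counts(["car", "car", "minicab", "motor vehicle"])
ic = information_content(counts)
for concept in ("*Root*", "motor vehicle", "car", "cab", "minicab"):
    print(concept, counts[concept], round(ic[concept], 3))
```

Because counts only grow on the way up, the root ends up with probability 1 (information content 0), and more specific concepts receive higher values.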


Measures of Semantic Relatedness based on a concept hierarchy

[Figure: CONCEPT c1 and CONCEPT c2 in the hierarchy, with their LOWEST COMMON SUBSUMER above them.]


Measures of Semantic Relatedness

Resnik [1995]

relatedness = IC(lcs)

Jiang and Conrath [1997]

distance = IC(c1) + IC(c2) - 2·IC(lcs)

Lin [1998]

relatedness = 2·IC(lcs) / (IC(c1) + IC(c2))
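The three information-content measures can be compared side by side on toy numbers. In the sketch below the Jiang-Conrath distance is taken as IC(c1) + IC(c2) - 2·IC(lcs), the form in which it is non-negative, and Lin's measure as the ratio 2·IC(lcs) / (IC(c1) + IC(c2)). The IC values, the concept names and the function names are hypothetical.

```python
def resnik(ic, lcs):
    """Resnik relatedness: information content of the lowest common subsumer."""
    return ic[lcs]

def jiang_conrath_distance(ic, c1, c2, lcs):
    """Jiang-Conrath distance: IC(c1) + IC(c2) - 2 * IC(lcs)."""
    return ic[c1] + ic[c2] - 2 * ic[lcs]

def lin(ic, c1, c2, lcs):
    """Lin relatedness: 2 * IC(lcs) / (IC(c1) + IC(c2))."""
    denominator = ic[c1] + ic[c2]
    if denominator == 0:
        return 0.0  # both concepts carry no information; treated as unrelated here
    return 2 * ic[lcs] / denominator

# Hypothetical information content values, for illustration only.
ic = {"vehicle": 2.0, "car": 5.0, "bicycle": 7.0}

res = resnik(ic, "vehicle")
jcn = jiang_conrath_distance(ic, "car", "bicycle", "vehicle")
lin_score = lin(ic, "car", "bicycle", "vehicle")
print(res, jcn, round(lin_score, 3))
```

Note that the Jiang-Conrath value is a distance (smaller means more related), so it has to be inverted or negated before it is used alongside the other two as a relatedness score.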


Measures of Semantic Relatedness

The Hirst-St-Onge [1998] measure

• Is a kind of (UPWARD)

• Has part (DOWNWARD)

• Opposite (HORIZONTAL)

(1) Extra Strong Relation – between two occurrences of the same word.

(2) Strong Relation – three rules:

• Synonyms

• Horizontal Relation

• Compound – Word Relation

(3) Medium Strong Relation – if there exists an allowable path between the two concepts.

[Figure: diagram with concepts c1, c2, c3 and c4.]


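Word Sense Disambiguation using Measures of Semantic Relatedness

LOCAL APPROACH

CONTEXT: Word1 Target Word2

SENSES: Word1 has senses W11 and W12; Target has senses T1 and T2; Word2 has senses W21 and W22.

[Figure: each target sense is paired with every sense of the context words. T1 paired with W11, W12, W21 and W22 gives the scores S11, S12, S13 and S14; T2 paired with W11, W12, W21 and W22 gives the scores S21, S22, S23 and S24.]

Score(T1) = S11 + S12 + S13 + S14

Score(T2) = S21 + S22 + S23 + S24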
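The local scoring scheme can be sketched as below: each target sense is scored by summing its relatedness to every sense of every context word. The sketch assumes the highest-scoring target sense is then chosen; the relatedness table holds made-up integers standing in for S11 to S24, and the function names are hypothetical.

```python
def local_scores(target_senses, context_words, relatedness):
    """Score(T) = sum of relatedness(T, W) over all senses W of all context words."""
    return {
        t: sum(relatedness(t, s) for senses in context_words for s in senses)
        for t in target_senses
    }

def pick_sense(scores):
    """Return the target sense with the highest score."""
    return max(scores, key=scores.get)

# Made-up pairwise relatedness values, for illustration only.
TOY_RELATEDNESS = {
    ("T1", "W11"): 1, ("T1", "W12"): 4, ("T1", "W21"): 3, ("T1", "W22"): 0,
    ("T2", "W11"): 2, ("T2", "W12"): 1, ("T2", "W21"): 1, ("T2", "W22"): 1,
}

scores = local_scores(
    ["T1", "T2"],
    [["W11", "W12"], ["W21", "W22"]],
    lambda t, s: TOY_RELATEDNESS[(t, s)],
)
chosen = pick_sense(scores)
print(scores, chosen)
```

Any of the relatedness measures from the earlier slides could be passed in as the `relatedness` function; the cost grows with the number of target senses times the number of context senses.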


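Word Sense Disambiguation using Measures of Semantic Relatedness

GLOBAL APPROACH

Sense combinations (one sense per word):

(W11, T1, W21)  (W12, T1, W22)  (W11, T2, W21)  (W11, T1, W22)

(W12, T1, W21)  (W12, T2, W22)  (W11, T2, W22)  (W12, T2, W21)

[Figure: within a combination, the three senses are linked pairwise by relatedness values a, b and c, and the combination's score is their sum, S1 = a + b + c. The eight combinations receive scores S1 to S8.]

Combination with the highest score is selected.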
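The global scheme can be sketched by enumerating every combination of one sense per word, summing the pairwise relatedness values inside each combination, and keeping the best one. The relatedness table below is made up for illustration (unlisted pairs count as 0), and the function names are hypothetical.

```python
from itertools import combinations, product

def global_disambiguate(sense_lists, relatedness):
    """Return the best sense combination and its score.

    sense_lists holds one list of candidate senses per word in the window.
    A combination's score is the sum of relatedness over all sense pairs in it.
    """
    best_combo, best_score = None, float("-inf")
    for combo in product(*sense_lists):
        score = sum(relatedness(a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score

# Made-up symmetric relatedness values; pairs not listed are treated as 0.
TOY = {
    frozenset({"W11", "T1"}): 3,
    frozenset({"T1", "W21"}): 2,
    frozenset({"W11", "W21"}): 1,
    frozenset({"W12", "T2"}): 4,
}

def toy_relatedness(a, b):
    return TOY.get(frozenset({a, b}), 0)

sense_lists = [["W11", "W12"], ["T1", "T2"], ["W21", "W22"]]
n_combinations = len(list(product(*sense_lists)))
best_combo, best_score = global_disambiguate(sense_lists, toy_relatedness)
print(n_combinations, best_combo, best_score)
```

With two senses per word and three words there are eight combinations to score; the number grows multiplicatively with window size, which is the main cost of the global approach compared with the local one.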


Some Results

Measure              SemCor    Brown
Resnik               0.295     0.290
Jiang-Conrath        0.330     0.331
Lin                  0.328     0.363
Leacock-Chodorow          0.305
Hirst-St-Onge             0.316
Adapted Lesk              0.391

• The experiments were performed on the noun instances of the SENSEVAL-2 data (1723 instances).

• A context window size of 3 with the local scoring approach was used for the experiments.

• Information content was calculated from the SemCor semantically tagged corpus and from the Brown corpus (the SemCor and Brown columns). The last three measures list a single value each.


References

[Lesk 1986] M. Lesk. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the 5th Annual International Conference on Systems Documentation (SIGDOC '86), 1986.

[Budanitsky Hirst 2001] A. Budanitsky and G. Hirst. Semantic Distance in WordNet: An experimental application-oriented evaluation of five measures. In Workshop on WordNet and Other Lexical Resources, Second meeting of the North American Chapter of the Association for Computational Linguistics, Pittsburgh, June 2001.

[Banerjee Pedersen 2002] S. Banerjee and T. Pedersen. An adapted Lesk algorithm for word sense disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Feb 2002.

[Resnik 1995] P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, August 1995.

[Jiang Conrath 1997] J. Jiang and D. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computational Linguistics, Taiwan, 1997.

[Lin 1998] D. Lin. An information-theoretic definition of similarity. In Proceedings of International Conference on Machine Learning, Madison, Wisconsin, August 1998.

[Leacock Chodorow 1998] C. Leacock and M. Chodorow. Combining local context and WordNet similarity for word sense identification. In Fellbaum, pp. 265 – 283, 1998.

[Hirst St-Onge 1998] G. Hirst and D. St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Fellbaum, pp. 305 – 332, 1998.

[Fellbaum 1998] C. Fellbaum, editor. WordNet: An electronic lexical database. MIT Press, 1998.

[Rada et al, 1989] R. Rada, H. Mili, E. Bicknell and M. Blettner. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man and Cybernetics, 19(1):17-30, February 1989.