monitoring the penalization/advantage of lexical ambiguity in vector model representations
DESCRIPTION
So what can a model such as LSA tell us about some of the empirical effects related to lexical ambiguity?TRANSCRIPT
![Page 1: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/1.jpg)
1st Joint Conference of the EPS and SEPEX, Granada 2010
Monitoring the penalization/advantage of lexical ambiguity in vector model representations
Guillermo Jorge-Botana, Ricardo Olmos & José A. LeónUniversidad Autónoma de Madrid
![Page 2: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/2.jpg)
Extensional definition of lexical ambiguity
Two criteria for categorizing the phenomenon
Representation of meanings Separate vs. single representation of senses(Homonymy, Polysemy, Metonymy, Metaphors…)
Contextual distribution of word occurrencesOccurrences across few or many contexts
www.elsemantico.com
![Page 3: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/3.jpg)
Representation of senses
Separate senses (Klein & Murphy, 2001).
A common core with specific parts(Rodd, Gaskell & Marslen-Wilson, 2002; Klepousniotou & Baum, 2006).
Unique representation as in LSA models (Kintsch, 2001; Kintsch & Mangalath, in press).
www.elsemantico.com
![Page 4: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/4.jpg)
Contextual distribution of word occurrences
Measured by Contextual Diversity (Adelman, Gordon, Brown & Quesada, 2006, McDonald & Shillcock, 2001)
Polysemy: Polysemic words have greater Contextual Diversity than monosemic.
Since Abstract words have greater Contextual Diversity than Concrete words …
Can abstractness be a kind of lexical ambiguity?
www.elsemantico.com
![Page 5: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/5.jpg)
Concept of Context Diversity in Vector-Models
The vectors that represent the words inside a Vector-Space (as in LSA) can be a measure of indices such Context Diversity
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11
Measuring the thematic focus, for example with entropy formula
www.elsemantico.com
![Page 6: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/6.jpg)
Current Study
So what can a model such as LSA tell us about some of the empirical effects related to lexical ambiguity?
A model of a unique lexical representationAn exclusively linguistic model.
www.elsemantico.com
![Page 7: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/7.jpg)
Current Study
Effects related to lexical ambiguity
1. Associative difficulty2. Lexical Decision Task
www.elsemantico.com
![Page 8: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/8.jpg)
1. Associative difficulty
For polysemic words: difficulty generating meaning without a context (Duffy, Morris, & Rayner, 1988)
Abstract words rarely have semantic relations (they predominantly have associative relations) (e.g. Crutch, 2006; Crutch, Ridha, & Warrington, 2006; Crutch & Warrington, 2005; Warrington & Crutch, 2007; Crutch & Warrington, 2005; Duñabeitia et al., 2009 )
www.elsemantico.com
![Page 9: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/9.jpg)
Associative difficulty for LSA
www.elsemantico.com
A
B
C
D
E
F
G
Main sense The other sense
Penalization
The similaritydecreases Polysemy
Monosemy
Monosemy
![Page 10: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/10.jpg)
2. Data with LDT
Polysemy advantage (Hino & Lupker, 1996; Pexman & Lupker, 1999)
Concrete advantageThe response to LDT is task dependentIt is favored by unspecific activation mediated by semantic representations (Piercey & Joordens, 2000)
www.elsemantico.com
![Page 11: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/11.jpg)
LDT with LSA
Ambiguous vectors have difficulty with main relations but an advantage with other relations.This is due to the distributional properties of the vectors.This advantage with other relations can produce non-specific activation, improving LDT response times
www.elsemantico.com
![Page 12: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/12.jpg)
LDT with LSA
www.elsemantico.com
A
B
C
D
E
F
Advantage with other relationsgenerates non-specific activation
Penalization with other wordswith different senses
Polysemy
Polysemy
Monosemy
![Page 13: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/13.jpg)
HypothesesA vector model like LSA can simulate the associative difficulty
In polysemic wordsIn abstract words
A vector model can simulate the response to LDT in polysemicwordsSince responses to LDT are the opposite in Abstract words (to polysemic). A linguistic model like LSA cannot simulate the response of abstract words to LDT without introducing another source of activation (for example mental imagery).
www.elsemantico.com
![Page 14: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/14.jpg)
Procedure
Three simulations1. Introducing an invented word into semantic space at
different points along the polysemy-monosemycontinuum.
2. With real polysemic-monosemic words
3. With real abstract-concrete words
www.elsemantico.com
![Page 15: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/15.jpg)
The analysis
Extract semantic neighbors and analyze the similarities (cosines) between them and the reference word.Analyze Context Diversity within the vector with ranges and entropy measures.
www.elsemantico.com
![Page 16: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/16.jpg)
First simulationLSA is trained with LEXESP corpus (Sebastián, Cuetos, Carreiras & Martí, 2000)
An artificial nonsense word “noray” is introduced into the space in documents with two possible senses represented in different proportions: the sense of “traditional sport” and sense of “old neighborhood”The proportions of the two senses were
30-0 (monosemy), 25-5 (dominant polysemy)15-15 (equally probable polysemy)5-25 (dominant polysemy)0-30 (monosemy)
www.elsemantico.com
![Page 17: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/17.jpg)
Results(I)
0.3
0.35
0.4
0.45
0.5
0.55
0.6
Series1 0.5918 0.52358 0.44215 0.4206 0.42179 0.47611 0.47638
0-30 5-25 10-20 15-15 20-10 25-5 30-0
Mean of the 30 first semantic neighbors
The penalization exists for polysemic conditions
www.elsemantico.com
![Page 18: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/18.jpg)
Results(I)
0
0,05
0,1
0,15
0,2
0,25
0,3
0,35
0,41
180
359
538
717
896
1075
1254
1433
1612
1791
1970
2149
2328
2507
2686
2865
3044
3223
3402
3581
3760
3939
0-305-2510-2015-1520-1025-530-0
Ranges of the 4000 first semantic neighbors
The pattern changed around neighbor 350 (cosine 0.2).For the polysemic conditions, the penalization became advantageous from this point.
www.elsemantico.com
![Page 19: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/19.jpg)
Results(I)
-0,1
-0,08
-0,06
-0,04
-0,02
0
0,02
0,04
0,06
0,08
0,1
1 15 29 43 57 71 85 99 113 127 141 155 169 183 197 211 225 239
0-3015-1530-0
Scores on the dimension of the vectors in the three main conditions
The effect of the penalization and the change of pattern may be due to the distributional properties of the vectors.The 15-15 condition was spread along all dimensions.
www.elsemantico.com
![Page 20: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/20.jpg)
Results(I)
1st, 2nd and 3rd position in the saturations of dimensions
The position of polysemic condition (15-15) were often medium (scoring low on all the dimensions).
www.elsemantico.com
![Page 21: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/21.jpg)
Second simulation
Same as first simulation but with real polysemicand monosemic words extracted from other studies (Estévez, 1991; Jorge-Botana, León & Olmos, 2010).
Controlled for frequency, concreteness and imaginability.
www.elsemantico.com
![Page 22: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/22.jpg)
Results (II)
0
0.1
0.2
0.3
0.4
0.5
0.6
1
153
305
457
609
761
913
1065
1217
1369
1521
1673
1825
1977
2129
2281
2433
2585
2737
2889
3041
3193
3345
3497
3649
3801
3953
4105
4257
4409
4561
4713
4865
MonosemyPolysemy
www.elsemantico.com
Ranges of the first 5000 semantic neighbors
Monosemy had an advantage for the first relations.The pattern changes around neighbor 550 (cosine 0.2) in favour of polysemic.
![Page 23: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/23.jpg)
Results(II)
GrupomonosemyPolisemy
Entro
pía
0,80
0,70
0,60
0,50
0,40
0,30
87
Global weight
Polysemy Monosemy
The polysemy condition have less global weight thus more entropyThe spread among different contexts for polysemic words is a substantive fact.
www.elsemantico.com
![Page 24: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/24.jpg)
Third simulation
Same as second simulation but with real abstract and concrete words extracted from other studies (Duñabeitia et al., 2009).
www.elsemantico.com
![Page 25: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/25.jpg)
Results(III)
0
0.1
0.2
0.3
0.4
0.5
0.6
1
242
483
724
965
1206
1447
1688
1929
2170
2411
2652
2893
3134
3375
3616
3857
4098
4339
4580
4821
AbstractConcrete
Ranges of the first 5000 semantic neighbors
Concrete words obtained an advantage for the first 400 relations.The pattern changed around neighbor 400 (cosine 0.2) in favor ofabstract words.
www.elsemantico.com
![Page 26: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/26.jpg)
Results(III)
GrupoConcreteAbstract
Entro
pía
0,80
0,70
0,60
0,50
0,40
0,30
Global weight
Abtracts Concretes
The abstract condition had less global weight thus more entropy.The spread among different contexts for abstract words is a substantive fact.
www.elsemantico.com
![Page 27: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/27.jpg)
Preliminary conclusionsSo, what can a model like LSA tell us about the empirical effects in question?
A model of a unique representation
An exclusively linguistic model.
www.elsemantico.com
![Page 28: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/28.jpg)
Preliminary conclusions (I)
In polysemy and in abstractness, associative difficulty could be explained by the linguistic distributional properties of a unique representation.
www.elsemantico.com
![Page 29: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/29.jpg)
Preliminary conclusions (II)
In polysemy, LDT responses could be explained by the unspecific activation produced by the linguistic distributional properties of a unique representation.
www.elsemantico.com
![Page 30: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/30.jpg)
Preliminary conclusions (III)
In the Abstract word condition, LDT responses couldn’t be explained by the linguistic distributional properties. Another source of potential activation is required.
(perceptual representations?)
www.elsemantico.com
![Page 31: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/31.jpg)
Preliminary conclusions (IV)
In abstract word condition:No additional source other than linguistic is needed as an explanation to account for associative difficulty.No differences in cerebral activation between concrete and abstract in semantic tasks (Pexman et al., 2007).
Another source of activation is needed for LDT responses.Differences in cerebral activation between concrete and abstract in LDT (Binder, Westbury, et al, 2005).
www.elsemantico.com
![Page 32: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/32.jpg)
Thank you
www.elsemantico.com
![Page 33: Monitoring the penalization/advantage of lexical ambiguity in vector model representations](https://reader038.vdocuments.us/reader038/viewer/2022110302/547e2079b37959932b8b552b/html5/thumbnails/33.jpg)
Monitoring the penalization/advantage of lexical ambiguity in vector model representations
Guillermo Jorge-Botana, Ricardo Olmos & José A. León
Universidad Autónoma de Madrid