Qualitative Differences Between Human Behavioral Data and Co-occurrence Models of Semantic Similarity
Gabriel Recchia
Background
• Corpus-based methods
  • Pointwise Mutual Information (Church & Hanks, 1989)
  • Spreading Activation (Anderson & Pirolli, 1984; Farahat, Pirolli, & Markova, 2004)
  • Latent Semantic Analysis (Landauer & Dumais, 1997)
  • Generalized Latent Semantic Analysis (Matveeva et al., 2005)
• WordNet-based methods
  • Resnik (1995), Jiang & Conrath (1997), Hirst & St-Onge (1998), Leacock & Chodorow (1998), Lin (1998), Pedersen et al. (2004)
1. Pointwise Mutual Information (PMI)
Compares the probability of observing word x and word y together (the joint probability) with the probabilities of observing x and y independently (chance):

I(x, y) = log2 [ P(x, y) / (P(x) P(y)) ]
1. Pointwise Mutual Information (PMI)
Estimated from document counts over a collection of N documents, the same quantity becomes:

I(x, y) = log2 [ N · (# of docs containing x and y) / ((# of docs containing x) · (# of docs containing y)) ]

Compares the probability of observing word x and word y together (the joint probability) with the probabilities of observing x and y independently (chance).
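The document-count estimate above can be sketched in a few lines of Python (a minimal sketch; `doc_pmi` and its arguments are illustrative names, not the talk's implementation):

```python
import math

def doc_pmi(docs, x, y):
    """PMI from document counts (illustrative sketch).

    docs: iterable of token collections; x, y: the two words.
    Returns log2(P(x, y) / (P(x) P(y))), or 0.0 if the pair never co-occurs.
    """
    n = len(docs)
    nx = sum(1 for d in docs if x in d)
    ny = sum(1 for d in docs if y in d)
    nxy = sum(1 for d in docs if x in d and y in d)
    if nxy == 0:
        return 0.0
    # P(x,y) / (P(x) P(y)) simplifies to N * nxy / (nx * ny)
    return math.log2(n * nxy / (nx * ny))
```

Note that pairs which never co-occur get a flat score of 0 here; this matters for the distributional results later in the talk.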
• PMI is a metric, not a process model
• Like PMI, simple vector space models:
  – reward co-occurrences
  – penalize highly frequent words
• Unlike PMI, they also store latent/indirect similarity information
![Page 6: Qualitative differences between human behvaioral data and co-occurrence models of semantic similarity](https://reader035.vdocuments.us/reader035/viewer/2022062514/557bf494d8b42a302d8b5007/html5/thumbnails/6.jpg)
2. Vector Addition Model
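The slides do not spell out the vector addition model here; one common instantiation sums a sparse context vector per document into each word's vector and compares words by cosine. A minimal sketch under that assumption (function names are illustrative):

```python
import math
from collections import defaultdict

def build_vectors(docs):
    """Each word's vector accumulates counts of the words it co-occurs with,
    i.e. the sum of sparse context vectors over documents. Illustrative only;
    the weighting actually used in the talk may differ."""
    vecs = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        for w in doc:
            for c in doc:
                if c != w:
                    vecs[w][c] += 1.0
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * v.get(key, 0.0) for key, val in u.items())
    nu = math.sqrt(sum(val * val for val in u.values()))
    nv = math.sqrt(sum(val * val for val in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```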
3. Latent Semantic Analysis (LSA) (Landauer & Dumais, 1997)
• Build a term-document matrix where element (i, j) is the frequency of term i in document j
• Apply a log-entropy weighting scheme to decrease the weight of high-frequency words
• Use singular value decomposition to find a rank-k approximation to the term-document matrix
• Optimize k for the task at hand
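The steps above can be sketched with NumPy (a simplified reading of the log-entropy weighting; an illustrative sketch, not the exact implementation):

```python
import numpy as np

def lsa_vectors(td, k):
    """LSA sketch: log-entropy weight a term-by-document count matrix,
    then keep a rank-k SVD truncation. Rows of the result are term vectors.
    td: (terms x docs) count matrix; k: number of retained dimensions."""
    td = np.asarray(td, dtype=float)
    n_docs = td.shape[1]
    # Global entropy weight per term: 1 + sum_j p_ij log p_ij / log(n_docs)
    p = td / np.maximum(td.sum(axis=1, keepdims=True), 1e-12)
    ent = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0).sum(axis=1)
    g = 1.0 + ent / np.log(n_docs)
    w = np.log(td + 1.0) * g[:, None]   # local log weight x global weight
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :k] * s[:k]             # term vectors in the latent space
```

High-frequency words spread across many documents get entropy weights near zero, so the SVD is dominated by more informative terms.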
• Forced-choice synonymy tests
• TOEFL synonymy test (Landauer & Dumais, 1997) (TOEFL)
• ESL synonymy test (Turney, 2001) (ESL)
• Semantic similarity judgments
• Rubenstein & Goodenough, 1965 (RG)
• Miller & Charles, 1991 (MC)
• Resnik, 1995 (R)
• Finkelstein et al.’s “WordSimilarity-353” (2002) (WS353)
Trained on a small Wikipedia subset:

Task    PMI     Vectors
ESL     0.35    0.28
TOEFL   0.41    0.33
RG      0.47    0.14
R       0.46    0.17
MC      0.46    0.12
WS353   0.54    0.57
PMI vs. LSA
Task PMI (Wiki subset) LSA (Wiki subset)
ESL .35 .36
TOEFL .41 .44
RG .47 .62
R .46 .60
MC .46 .46
WS353 .54 .57
PMI using full Wikipedia

Task    PMI     WN      NSS.F   NSS.T   SA.N    SA.W    LSA.T
ESL     .62     .70     .44     .56     .39     .51     .44
TOEFL   .64     .87     .59     .50     .61     .59     .55
RG      .78     .88     .62     .53     .49     .39     .69
R       .86     .90     .56     .54     .49     .52     .74
MC      .76     .77     .61     .56     .45     .45     .61
WS353   .73     .46     .60     .59     .40     .38     .60
• WN: WordNet::Similarity vector measure
• NSS.F: Normalized Search Similarity, using Factiva business news corpus
• NSS.T: Normalized Search Similarity, using TASA corpus
• SA.N: Spreading Activation, using Google counts restricted to nytimes.com
• SA.W: Spreading Activation, using Google counts restricted to wikipedia.org
• LSA.T: LSA, using TASA corpus
PMI vs. Vector Addition vs. LSA

Task    PMI (Wikipedia)   Vectors (Wikipedia)   LSA (Wiki subset)
ESL     .62               .44                   .36
TOEFL   .64               .54                   .44
RG      .78               .62                   .62
R       .86               .73                   .60
MC      .76               .68                   .46
WS353   .73               .45                   .57
Task    cor(pmi, hum)   cor(vec, hum)   cor(pmi, vec)
ESL     --              --              --
TOEFL   --              --              --
RG      0.78            0.62            0.52
R       0.86            0.73            0.64
MC      0.76            0.68            0.57
WS353   0.73            0.45            0.54
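The rank correlations reported above (assuming Spearman's rho is the statistic intended by "rank correlation") can be computed as the Pearson correlation of the ranks; an illustrative sketch:

```python
def rankdata(xs):
    """1-based ranks, with tied values given their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    rx, ry = rankdata(xs), rankdata(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```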
Rubenstein & Goodenough Word Pairs
[Scatterplots: Rank (PMI) and Rank (Vectors) vs. Rank (Humans), ordered from most to least similar]
Resnik Word Pairs
[Scatterplots: Rank (PMI) and Rank (Vectors) vs. Rank (Humans), ordered from most to least similar]
Miller & Charles Word Pairs
[Scatterplots: Rank (PMI) and Rank (Vectors) vs. Rank (Humans), ordered from most to least similar]
WordSim353 Word Pairs
[Scatterplots: Rank (PMI) and Rank (Vectors) vs. Rank (Humans), ordered from most to least similar]
Proportion of word pairs to which PMI assigns a similarity of zero, by task
ESL TOEFL RG R MC WS353
.26 .34 .25 .21 .20 .04
Rubenstein & Goodenough Word Pairs
[Figure: similarity distributions for Vectors, PMI, LSA, and Humans]
Resnik Word Pairs
[Figure: similarity distributions for Vectors, PMI, LSA, and Humans]
Miller & Charles Word Pairs
[Figure: similarity distributions for Vectors, PMI, LSA, and Humans]
WordSim353 Word Pairs
[Figure: similarity distributions for Vectors, PMI, LSA, and Humans]
USF Free Association Norms
[Figure: similarity distributions for Vectors, PMI, and LSA vs. Humans (forward strengths)]
• Looking at rank correlations alone obscures important distributional properties that should not be ignored if the goal is to emulate human semantic representations
• Closer attention to qualitative trends should guide model design