Vector-Space (Distributional) Lexical Semantics
Vector-Space (Distributional) Lexical Semantics
Information Retrieval System

[Figure: an IR system takes a query string and a document corpus as input and produces ranked documents: 1. Doc1, 2. Doc2, 3. Doc3, …]
The Vector-Space Model
Graphic Representation
Term Weights: Term Frequency
Term Weights: Inverse Document Frequency
TF-IDF Weighting
Similarity Measure
Cosine Similarity Measure
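The TF-IDF weighting and cosine-similarity slides can be sketched together in a few lines of Python. This is a minimal illustration, not the presentation's implementation: the toy corpus, whitespace tokenization, and natural-log IDF are all assumptions.

```python
import math
from collections import Counter

# Toy corpus (assumed for illustration)
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)

# Document frequency: number of documents containing each term
df = Counter()
for toks in tokenized:
    df.update(set(toks))

def tfidf(toks):
    """TF-IDF vector for one document: tf(t) * log(N / df(t))."""
    tf = Counter(toks)
    return {t: tf[t] * math.log(N / df[t]) for t in tf}

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = [tfidf(toks) for toks in tokenized]
```

As expected, a document is maximally similar to itself, and documents sharing terms score higher than documents with no overlap.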
Vector-Space (Distributional) Lexical Semantics
• Represent word meanings as points (vectors) in a high-dimensional Euclidean space.
• Dimensions encode aspects of the context in which the word appears (e.g., how often it co-occurs with another specific word).
– “You shall know a word by the company it keeps.” (J. R. Firth, 1957)
• Semantic similarity is defined as distance between points in this semantic space.
Sample Lexical Vector Space
[Figure: a 2-D projection of a lexical vector space in which related words cluster together: dog/cat, man/woman, bottle/cup, water, rock, computer/robot]
Simple Word Vectors
• For a given target word, w, create a bag-of-words “document” of all of the words that co-occur with the target word in a large corpus.
– Window of k words on either side.
– All words in the sentence, paragraph, or document.
• For each word, create a (TF-IDF weighted) vector from the “document” for that word.
• Compute the semantic relatedness of words as the cosine similarity of their vectors.
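The first step above, collecting co-occurring words within a window of k words on either side, might be sketched like this (the toy corpus and the choice k=2 are assumptions):

```python
from collections import Counter, defaultdict

def context_vectors(sentences, k=2):
    """Bag-of-words context vector for each word: counts of words
    co-occurring within a window of k words on either side."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        toks = sent.split()
        for i, w in enumerate(toks):
            window = toks[max(0, i - k):i] + toks[i + 1:i + 1 + k]
            vecs[w].update(window)
    return vecs

# Toy corpus (assumed)
corpus = [
    "the dog chased the cat",
    "the cat chased the mouse",
]
vecs = context_vectors(corpus, k=2)
```

The resulting count vectors would then be TF-IDF weighted (or PMI weighted, as on a later slide) before computing cosine similarities.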
Other Contextual Features
• Use syntax to move beyond simple bag-of-words features.
• Produce typed (edge-labeled) dependency parses for each sentence in a large corpus.
• For each target word, produce features for it having specific dependency links to specific other words (e.g., subj=dog, obj=food, mod=red).
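Given (head, relation, dependent) triples from a dependency parser, the typed features could be accumulated as below. The triples here are hand-written stand-ins for real parser output, and the `rel^-1` naming for inverse links is an assumption, not notation from the slides.

```python
from collections import Counter, defaultdict

# Hypothetical dependency triples, as a parser might emit
triples = [
    ("chased", "nsubj", "dog"),
    ("chased", "dobj", "cat"),
    ("dog", "mod", "red"),
]

def dependency_features(triples):
    """For each word, count typed dependency features like 'nsubj=dog'."""
    feats = defaultdict(Counter)
    for head, rel, dep in triples:
        feats[head][f"{rel}={dep}"] += 1      # head sees its dependent
        feats[dep][f"{rel}^-1={head}"] += 1   # inverse link for the dependent
    return feats

feats = dependency_features(triples)
```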
Other Feature Weights
• Replace TF-IDF with other feature weights.
• Pointwise mutual information (PMI) between the target word, w, and a given feature, f:

PMI(f, w) = log [ P(f, w) / (P(f) P(w)) ]
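Estimating PMI from co-occurrence counts might look like this (the (word, feature) observations are assumed toy data):

```python
import math
from collections import Counter

# Toy (word, feature) observations (assumed)
pairs = [("dog", "barks"), ("dog", "barks"), ("dog", "runs"),
         ("cat", "runs"), ("cat", "purrs")]

n = len(pairs)
pair_c = Counter(pairs)
word_c = Counter(w for w, _ in pairs)
feat_c = Counter(f for _, f in pairs)

def pmi(w, f):
    """PMI(f, w) = log P(f, w) / (P(f) P(w)), estimated from counts."""
    p_fw = pair_c[(w, f)] / n
    p_w = word_c[w] / n
    p_f = feat_c[f] / n
    return math.log(p_fw / (p_f * p_w)) if p_fw > 0 else float("-inf")
```

Positive PMI means the feature co-occurs with the word more often than chance; negative PMI means less often.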
Dimensionality Reduction
• Word-based features result in extremely high-dimensional spaces that can easily lead to over-fitting.
• Reduce the dimensionality of the space by using mathematical techniques to create a smaller set of k new dimensions that account for most of the variance in the data.
– Singular Value Decomposition (SVD), used in Latent Semantic Analysis (LSA)
– Principal Component Analysis (PCA)
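A truncated SVD, as used in LSA, can be sketched with NumPy (the word-by-feature matrix and k=2 are assumptions):

```python
import numpy as np

# Toy word-by-feature count matrix (rows = words, columns = context features)
X = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
])

# Truncated SVD: keep only the k largest singular values, as in LSA
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
reduced = U[:, :k] * s[:k]   # each row is a word in the k-dim latent space
```

Rows with similar context profiles (the first two) stay close together in the reduced space, while the dissimilar third row remains far away.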
Sample Dimensionality Reduction
Evaluation of Vector-Space Lexical Semantics
• Have humans rate the semantic similarity of a large set of word pairs.
– (dog, canine): 10; (dog, cat): 7; (dog, carrot): 3; (dog, knife): 1
• Compute the vector-space similarity of each pair.
• Compute the correlation coefficient (Pearson or Spearman) between the human and machine ratings.
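The Spearman correlation mentioned above is just the Pearson correlation of the ranks, and can be computed directly (the model scores paired with the slide's human ratings are assumed):

```python
def ranks(xs):
    """Rank values from 1..n (average rank for ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for m in range(i, j + 1):
            r[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Human ratings from the slide; model scores are assumed
human = [10, 7, 3, 1]        # (dog,canine), (dog,cat), (dog,carrot), (dog,knife)
model = [0.9, 0.7, 0.2, 0.1]
```

Because the model preserves the human ordering exactly, this toy example gives a perfect Spearman correlation of 1.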
TOEFL Synonymy Test
• LSA has been shown to be able to pass the TOEFL synonymy test.
Vector-Space Word Sense Induction
• Create a context vector for each individual occurrence of the target word, w.
• Cluster these vectors into k groups.
• Assume each group represents a “sense” of the word, and compute a vector for each sense by taking the mean of its cluster.
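The clustering step above can be sketched with a plain k-means loop. This is a bare-bones sketch, not the presentation's algorithm; the toy occurrence vectors for "bat" (over two context features) are assumed.

```python
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means on dense vectors (lists of floats)."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    for _ in range(iters):
        # Assign each vector to its nearest center (squared Euclidean distance)
        clusters = [[] for _ in range(k)]
        for v in vectors:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            clusters[j].append(v)
        # Recompute each center as the mean of its cluster -- the "sense" vector
        for c in range(k):
            if clusters[c]:
                n = len(clusters[c])
                centers[c] = [sum(vs) / n for vs in zip(*clusters[c])]
    return centers, clusters

# Toy occurrence vectors for "bat" over features (hit, flew) -- assumed
occurrences = [[3.0, 0.1], [2.5, 0.0], [0.2, 2.8], [0.0, 3.1]]
centers, clusters = kmeans(occurrences, k=2)
```

On this toy data the two well-separated groups (the sports sense and the animal sense) are recovered, and the returned centers are the induced sense vectors.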
Sample Word Sense Induction
[Figure: occurrence vectors for “bat” plotted in context space, near context words such as hit, player, wooden, baseball (the sports sense) and flew, cave, vampire, ate (the animal sense)]
Sample Word Sense Induction
[Figure: the same occurrence vectors for “bat”, now grouped into two clusters, with a “+” marking the mean (sense vector) of each cluster]
Vector-Space Word Meaning in Context
• Compute a semantic vector for an individual occurrence of a word based on its context.
• Combine a standard vector for a word with vectors representing the immediate context.
Example Using Dependency Context
[Figure: dependency parses fired(nsubj: hunter, dobj: gun) and fired(nsubj: boss, dobj: secretary)]
• Compute a vector for nsubj-boss by summing contextual vectors for all word occurrences that have “boss” as a subject.
• Compute a vector for dobj-secretary by summing contextual vectors for all word occurrences that have “secretary” as a direct object.
• Compute an “in context” vector for “fired” in “boss fired secretary” by adding the nsubj-boss and dobj-secretary vectors to the general vector for “fire”.
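With sparse vectors represented as Counters, the combination step reduces to vector addition. The particular counts below are assumed toy values, not corpus statistics.

```python
from collections import Counter

# Assumed contextual vectors (sparse dicts) summed over a corpus
fire_general = Counter({"dismiss": 3, "shoot": 3, "burn": 2})
nsubj_boss = Counter({"dismiss": 4, "hire": 2})       # verbs with "boss" as subject
dobj_secretary = Counter({"dismiss": 3, "hire": 3})   # verbs with "secretary" as object

# In-context vector for "fired" in "boss fired secretary":
# add the contextual vectors to the general vector
fire_in_context = fire_general + nsubj_boss + dobj_secretary
```

The context shifts the vector toward the "dismiss" sense of "fire" and away from "shoot", which is the intended disambiguation effect.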
Conclusions
• A word’s meaning can be represented as a vector that encodes distributional information about the contexts in which the word tends to occur.
• Lexical semantic similarity can be judged by comparing vectors (e.g. cosine similarity).
• Vector-based word senses can be automatically induced by clustering contexts.
• Contextualized vectors for word meaning can be constructed by combining lexical and contextual vectors.