distributional semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfdistributional...
TRANSCRIPT
![Page 1: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/1.jpg)
Distributional Semantics
Magnus Sahlgren
Pavia, 11 September 2012
![Page 2: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/2.jpg)
Distributional Semantics
Course web page:
http://www.gavagai.se/distributional_semantics.php
![Page 3: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/3.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 4: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/4.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 5: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/5.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 6: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/6.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 7: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/7.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 8: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/8.jpg)
Distributional Semantics
This course:
Theoretical foundations (the distributional hypothesis)
Basic concepts (co-occurrence matrix, distributional vectors)
Distributional semantic models (LSA, HAL, Random Indexing)
Applications (lexical semantics, cognitive modelling, data
mining)
Research questions (semantic relations, compositionality)
![Page 9: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/9.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 10: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/10.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 11: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/11.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 12: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/12.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 13: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/13.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 14: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/14.jpg)
Distributional Semantics
What are the underlying assumptions and theories?
What are the main models, and how do they di�er?
How to build and use a model
How to evaluate a model
What can we use the models for?
What are the main research questions at the moment?
![Page 15: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/15.jpg)
Distributional Semantics
Theoretical basisThe Distributional Hypothesis
Representational frameworkVector Space
![Page 16: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/16.jpg)
Distributional Semantics
Theoretical basisThe Distributional Hypothesis
Representational frameworkVector Space
![Page 17: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/17.jpg)
Distributional Semantics
Theoretical basisThe Distributional Hypothesis
Representational frameworkVector Space
![Page 18: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/18.jpg)
The Distributional Hypothesis
Extracting semantics from the distributions of terms
![Page 19: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/19.jpg)
Distributional Semantics
Firth:�you shall know a word by the company it keeps�
Wittgenstein:�the meaning of a word is its use in language�
?:�words with similar distributions have similar meanings�
=The Distributional Hypothesis
![Page 20: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/20.jpg)
Distributional Semantics
Firth:�you shall know a word by the company it keeps�
Wittgenstein:�the meaning of a word is its use in language�
?:�words with similar distributions have similar meanings�
=The Distributional Hypothesis
![Page 21: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/21.jpg)
Distributional Semantics
Firth:�you shall know a word by the company it keeps�
Wittgenstein:�the meaning of a word is its use in language�
?:�words with similar distributions have similar meanings�
=The Distributional Hypothesis
![Page 22: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/22.jpg)
Distributional Semantics
Firth:�you shall know a word by the company it keeps�
Wittgenstein:�the meaning of a word is its use in language�
?:�words with similar distributions have similar meanings�
=The Distributional Hypothesis
![Page 23: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/23.jpg)
The Distributional Hypothesis
Usually motivated by Zellig Harris' distributional methodology:
Linguistics can only deal with what is internal to language
Linguistic explanans is distributional facts
![Page 24: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/24.jpg)
The Distributional Hypothesis
Usually motivated by Zellig Harris' distributional methodology:
Linguistics can only deal with what is internal to language
Linguistic explanans is distributional facts
![Page 25: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/25.jpg)
The Distributional Hypothesis
Usually motivated by Zellig Harris' distributional methodology:
Linguistics can only deal with what is internal to language
Linguistic explanans is distributional facts
![Page 26: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/26.jpg)
The Distributional Hypothesis
Harris' distributional methodology is the discovery procedure bywhich we can quantify distributional similarities between linguisticentities
But why would such distributional similarities indicate semantic
similarity? (And what does �meaning� mean anyway?)
![Page 27: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/27.jpg)
The Distributional Hypothesis
Harris' distributional methodology is the discovery procedure bywhich we can quantify distributional similarities between linguisticentities
But why would such distributional similarities indicate semantic
similarity? (And what does �meaning� mean anyway?)
![Page 28: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/28.jpg)
The Distributional Hypothesis
If linguistics is to deal with meaning, it can only do so throughdistributional analysis
Di�erential, not referential
Structuralist legacy � Bloom�eld, Saussure
![Page 29: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/29.jpg)
The Distributional Hypothesis
If linguistics is to deal with meaning, it can only do so throughdistributional analysis
Di�erential, not referential
Structuralist legacy � Bloom�eld, Saussure
![Page 30: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/30.jpg)
The Distributional Hypothesis
If linguistics is to deal with meaning, it can only do so throughdistributional analysis
Di�erential, not referential
Structuralist legacy � Bloom�eld, Saussure
![Page 31: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/31.jpg)
The Distributional Hypothesis
Structuralism:
Language as a system
Signs (e.g. words) are identi�ed by their functional di�erenceswithin this system
Two kinds: syntagmatic and paradigmatic
![Page 32: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/32.jpg)
The Distributional Hypothesis
Structuralism:
Language as a system
Signs (e.g. words) are identi�ed by their functional di�erenceswithin this system
Two kinds: syntagmatic and paradigmatic
![Page 33: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/33.jpg)
The Distributional Hypothesis
Structuralism:
Language as a system
Signs (e.g. words) are identi�ed by their functional di�erenceswithin this system
Two kinds: syntagmatic and paradigmatic
![Page 34: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/34.jpg)
The Distributional Hypothesis
Structuralism:
Language as a system
Signs (e.g. words) are identi�ed by their functional di�erenceswithin this system
Two kinds: syntagmatic and paradigmatic
![Page 35: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/35.jpg)
The Distributional Hypothesis
Syntagmatic relations:
Words that co-occur
Combinatorial relations (phrases, sentences, etc.)
![Page 36: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/36.jpg)
The Distributional Hypothesis
Syntagmatic relations:
Words that co-occur
Combinatorial relations (phrases, sentences, etc.)
![Page 37: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/37.jpg)
The Distributional Hypothesis
Syntagmatic relations:
Words that co-occur
Combinatorial relations (phrases, sentences, etc.)
![Page 38: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/38.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Words that do not co-occur
Substitutional relations (co-occur with same other words)
![Page 39: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/39.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Words that do not co-occur
Substitutional relations (co-occur with same other words)
![Page 40: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/40.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Words that do not co-occur
Substitutional relations (co-occur with same other words)
![Page 41: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/41.jpg)
The Distributional Hypothesis
Example:
I drink co�eeyou sip teathey gulp cocoa
![Page 42: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/42.jpg)
The Distributional Hypothesis
Syntagmatic relations:
I drink co�ee
you sip tea
they gulp cocoa
![Page 43: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/43.jpg)
The Distributional Hypothesis
Paradigmatic relations:
I drink co�eeyou sip teathey gulp cocoa
![Page 44: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/44.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 45: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/45.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 46: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/46.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 47: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/47.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 48: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/48.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 49: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/49.jpg)
The Distributional Hypothesis
Paradigmatic relations:
Synonyms (e.g. �good� / �great�)
Antonyms (e.g. �good� / �bad�)
Hypernyms (e.g. �dog� / �animal�)
Hyponyms (e.g. �dog� / �poodle�)
Meronymy (e.g. ��nger� / �hand�)
![Page 50: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/50.jpg)
The Distributional Hypothesis
�drink� and �co�ee� have a syntagmatic relation if they co-occur
�co�ee� and �tea� have a paradigmatic relation if they co-occurwith the same other words (e.g. �drink�)
![Page 51: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/51.jpg)
The Distributional Hypothesis
�drink� and �co�ee� have a syntagmatic relation if they co-occur
�co�ee� and �tea� have a paradigmatic relation if they co-occurwith the same other words (e.g. �drink�)
![Page 52: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/52.jpg)
The Distributional Hypothesis
To summarize:
Harris: If linguistics is to deal with meaning, it can only do sothrough distributional analysis
Structuralism: Meaning = relations between words
Saussure: Two types of relations: syntagmatic and paradigmatic
![Page 53: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/53.jpg)
The Distributional Hypothesis
To summarize:
Harris: If linguistics is to deal with meaning, it can only do sothrough distributional analysis
Structuralism: Meaning = relations between words
Saussure: Two types of relations: syntagmatic and paradigmatic
![Page 54: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/54.jpg)
The Distributional Hypothesis
To summarize:
Harris: If linguistics is to deal with meaning, it can only do sothrough distributional analysis
Structuralism: Meaning = relations between words
Saussure: Two types of relations: syntagmatic and paradigmatic
![Page 55: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/55.jpg)
The Distributional Hypothesis
To summarize:
Harris: If linguistics is to deal with meaning, it can only do sothrough distributional analysis
Structuralism: Meaning = relations between words
Saussure: Two types of relations: syntagmatic and paradigmatic
![Page 56: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/56.jpg)
Distributional Semantics
Extracting semantics from the distributions of terms
=Extracting syntagmatic or paradigmatic relations between wordsfrom co-occurrences or second-order co-occurrences
![Page 57: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/57.jpg)
Distributional Semantics
Extracting semantics from the distributions of terms
=Extracting syntagmatic or paradigmatic relations between wordsfrom co-occurrences or second-order co-occurrences
![Page 58: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/58.jpg)
Distributional Semantics
The Distributional Hypothesis makes sense from a linguisticperspective
Does it also make sense from a cognitive perspective?
![Page 59: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/59.jpg)
Distributional Semantics
The Distributional Hypothesis makes sense from a linguisticperspective
Does it also make sense from a cognitive perspective?
![Page 60: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/60.jpg)
Distributional Semantics
We (i.e. humans) obviously can learn new words based oncontextual cues
He filled the wapimuk with the substance, passed it
around and we all drunk some
We found a little, hairy wapimuk sleeping behind the
tree
![Page 61: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/61.jpg)
Distributional Semantics
We (i.e. humans) obviously can learn new words based oncontextual cues
He filled the wapimuk with the substance, passed it
around and we all drunk some
We found a little, hairy wapimuk sleeping behind the
tree
![Page 62: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/62.jpg)
Distributional Semantics
We (i.e. humans) obviously can learn new words based oncontextual cues
He filled the wapimuk with the substance, passed it
around and we all drunk some
We found a little, hairy wapimuk sleeping behind the
tree
![Page 63: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/63.jpg)
Distributional Semantics
We encounter new words everyday...
but rarely ask for de�nitions
The absence of de�nitions in normal discourse is a sign ofcommunicative prosperity
![Page 64: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/64.jpg)
Distributional Semantics
We encounter new words everyday...
but rarely ask for de�nitions
The absence of de�nitions in normal discourse is a sign ofcommunicative prosperity
![Page 65: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/65.jpg)
Distributional Semantics
We encounter new words everyday...
but rarely ask for de�nitions
The absence of de�nitions in normal discourse is a sign ofcommunicative prosperity
![Page 66: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/66.jpg)
Distributional Semantics
Abstract terms (Vigliocco et al. 2009)justice, tax, etc.
Concrete terms for which we have no direct experience of thereferentsaardvark, cyclotron, etc.
![Page 67: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/67.jpg)
Distributional Semantics
Abstract terms (Vigliocco et al. 2009)justice, tax, etc.
Concrete terms for which we have no direct experience of thereferentsaardvark, cyclotron, etc.
![Page 68: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/68.jpg)
Distributional Semantics
Congenitally blind individuals show normal competence of colorterms and visual perception verbs (Landau & Gleitman 1985)
![Page 69: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/69.jpg)
Distributional Semantics
The cognitive representation of a word is some abstraction orgeneralization derived from the contexts in which the word hasbeen encountered (Miller & Charles 1991)
![Page 70: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/70.jpg)
Distributional Semantics
A contextual representation may include extra-linguistic contexts
De facto, context is equated with linguistic context
Practical reason � it is easy to collect linguistic contexts(from corpora) and to process them
Theoretical reason � it is possible to investigate the role oflinguistic distributions in shaping word meaning
![Page 71: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/71.jpg)
Distributional Semantics
A contextual representation may include extra-linguistic contexts
De facto, context is equated with linguistic context
Practical reason � it is easy to collect linguistic contexts(from corpora) and to process them
Theoretical reason � it is possible to investigate the role oflinguistic distributions in shaping word meaning
![Page 72: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/72.jpg)
Distributional Semantics
A contextual representation may include extra-linguistic contexts
De facto, context is equated with linguistic context
Practical reason � it is easy to collect linguistic contexts(from corpora) and to process them
Theoretical reason � it is possible to investigate the role oflinguistic distributions in shaping word meaning
![Page 73: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/73.jpg)
Distributional Semantics
A contextual representation may include extra-linguistic contexts
De facto, context is equated with linguistic context
Practical reason � it is easy to collect linguistic contexts(from corpora) and to process them
Theoretical reason � it is possible to investigate the role oflinguistic distributions in shaping word meaning
![Page 74: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/74.jpg)
Distributional Semantics
Weak Distributional Hypothesis
A quantitative method for semantic analysis and lexical resourceinduction
Meaning (whatever this might be) is re�ected in distribution
![Page 75: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/75.jpg)
Distributional Semantics
Weak Distributional Hypothesis
A quantitative method for semantic analysis and lexical resourceinduction
Meaning (whatever this might be) is re�ected in distribution
![Page 76: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/76.jpg)
Distributional Semantics
Weak Distributional Hypothesis
A quantitative method for semantic analysis and lexical resourceinduction
Meaning (whatever this might be) is re�ected in distribution
![Page 77: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/77.jpg)
Distributional Semantics
Strong Distributional Hypothesis
A cognitive hypothesis about the form and origin of semanticrepresentations
Word distributions in context have a speci�c causal role in theformation of the semantic representation for that word
The distributional properties of words in linguistic contexts explainshuman semantic behavior (e.g. judgment of semantic similarity)
![Page 78: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/78.jpg)
Distributional Semantics
Strong Distributional Hypothesis
A cognitive hypothesis about the form and origin of semanticrepresentations
Word distributions in context have a speci�c causal role in theformation of the semantic representation for that word
The distributional properties of words in linguistic contexts explainshuman semantic behavior (e.g. judgment of semantic similarity)
![Page 79: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/79.jpg)
Distributional Semantics
Strong Distributional Hypothesis
A cognitive hypothesis about the form and origin of semanticrepresentations
Word distributions in context have a speci�c causal role in theformation of the semantic representation for that word
The distributional properties of words in linguistic contexts explainshuman semantic behavior (e.g. judgment of semantic similarity)
![Page 80: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/80.jpg)
Distributional Semantics
Strong Distributional Hypothesis
A cognitive hypothesis about the form and origin of semanticrepresentations
Word distributions in context have a speci�c causal role in theformation of the semantic representation for that word
The distributional properties of words in linguistic contexts explainshuman semantic behavior (e.g. judgment of semantic similarity)
![Page 81: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/81.jpg)
Distributional Semantics
Theoretical basisThe Distributional Hypothesis
Representational frameworkVector Space
![Page 82: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/82.jpg)
Vector space
x
y
1 2 3 4
1
2
3
4
5
5 6 7
6
a = (4, 3)
b = (5, 6)
![Page 83: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/83.jpg)
Vector space
x y
A2,2 =a
b
[(4 3)(5 6)
]
![Page 84: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/84.jpg)
Vector space
x y
A2,2 =a
b
[(4 3)(5 6)
]
![Page 85: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/85.jpg)
Vector space
x y
A2,2 =a
b
[(4 3)(5 6)
]
![Page 86: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/86.jpg)
Vector space
Am,n =
a1,1 a1,2 a1,3 . . . a1,na2,1 a2,2 a2,3 . . . a2,na3,1 a3,2 a3,3 . . . a3,n...
......
. . ....
am,1 am,2 am,3 . . . am,n
![Page 87: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/87.jpg)
Vector space
High-dimensional vector spaces (i.e. dimensionalities on the order ofthousands)
Mathematically straightforward
Warning: high-dimensional spaces can be counterintuitive!
![Page 88: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/88.jpg)
Vector space
High-dimensional vector spaces (i.e. dimensionalities on the order ofthousands)
Mathematically straightforward
Warning: high-dimensional spaces can be counterintuitive!
![Page 89: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/89.jpg)
Vector space
High-dimensional vector spaces (i.e. dimensionalities on the order ofthousands)
Mathematically straightforward
Warning: high-dimensional spaces can be counterintuitive!
![Page 90: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/90.jpg)
Vector space
Imagine two nested cubes:
How big must the inner cube be in order to cover 1% of the area ofthe outer cube?
Answer: 10% of the side length (0.1× 0.1 = 0.01)
![Page 91: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/91.jpg)
Vector space
Imagine two nested cubes:
How big must the inner cube be in order to cover 1% of the area ofthe outer cube?
Answer: 10% of the side length (0.1× 0.1 = 0.01)
![Page 92: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/92.jpg)
Vector space
Imagine two nested cubes:
How big must the inner cube be in order to cover 1% of the area ofthe outer cube?
Answer: 10% of the side length (0.1× 0.1 = 0.01)
![Page 93: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/93.jpg)
Vector space
For 3-dimensional cubes, the inner cube must have ≈ 21% of theside length of the outer cube (0.21× 0.21× 0.21 ≈ 0.01)
For n-dimensional cubes, the inner cube must have 0.011n of the
side length
![Page 94: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/94.jpg)
Vector space
For 3-dimensional cubes, the inner cube must have ≈ 21% of theside length of the outer cube (0.21× 0.21× 0.21 ≈ 0.01)
For n-dimensional cubes, the inner cube must have 0.011n of the
side length
![Page 95: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/95.jpg)
Vector space
For n = 1 000 that is 99.5%
That is, if the outer 1 000-dimensional cube has sides 2 units long,and the inner 1 000-dimensional cube has sides 1.99 units long, theouter cube will still contain 100 times more volume
![Page 96: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/96.jpg)
Vector space
For n = 1 000 that is 99.5%
That is, if the outer 1 000-dimensional cube has sides 2 units long,and the inner 1 000-dimensional cube has sides 1.99 units long, theouter cube will still contain 100 times more volume
![Page 97: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/97.jpg)
Distributional Semantics
Distributional semantics + vector space
=Word Space Models
![Page 98: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/98.jpg)
Distributional Semantics
Distributional semantics + vector space
=Word Space Models
![Page 99: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/99.jpg)
Word Space
�Vector similarity is the only information present in Word Space:semantically related words are close, unrelated words are distant.�
(Hinrich Schütze: Word space, 1993)
![Page 100: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/100.jpg)
Word Space
(Schütze: Dimensions of Meaning, 1992)
![Page 101: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/101.jpg)
Word Space (A Brief History)
Manually de�ned features
Psychology: Osgood (1950s)
Human ratings over contrastive adjective-pairs
small � large bald � furry docile � dangerous
mouse 2 6 1rat 2 6 4
![Page 102: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/102.jpg)
Word Space (A Brief History)
Manually de�ned features
Psychology: Osgood (1950s)
Human ratings over contrastive adjective-pairs
small � large bald � furry docile � dangerous
mouse 2 6 1rat 2 6 4
![Page 103: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/103.jpg)
Word Space (A Brief History)
Manually de�ned features
Psychology: Osgood (1950s)
Human ratings over contrastive adjective-pairs
small � large bald � furry docile � dangerous
mouse 2 6 1rat 2 6 4
![Page 104: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/104.jpg)
Word Space (A Brief History)
Manually de�ned features
Connectionism, AI: Waltz & Pollack (1980s), Gallant (1990s),Gärdenfors (2000s)
human machine politics ...
astronomer +2 -1 -1 ...
![Page 105: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/105.jpg)
Word Space (A Brief History)
Manually de�ned features
Connectionism, AI: Waltz & Pollack (1980s), Gallant (1990s),Gärdenfors (2000s)
human machine politics ...
astronomer +2 -1 -1 ...
![Page 106: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/106.jpg)
Word Space (A Brief History)
Manually de�ned features
Pros: Expert knowledge
Cons: Bias; how many features are enough?
![Page 107: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/107.jpg)
Word Space (A Brief History)
Manually de�ned features
Pros: Expert knowledge
Cons: Bias; how many features are enough?
![Page 108: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/108.jpg)
Word Space (A Brief History)
Manually de�ned features
Pros: Expert knowledge
Cons: Bias; how many features are enough?
![Page 109: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/109.jpg)
Word Space
Automatic methods: count co-occurrences (c.f. the distributionalhypothesis)
Co-occurrence matrix
Distributional vector
![Page 110: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/110.jpg)
Word Space
Automatic methods: count co-occurrences (c.f. the distributionalhypothesis)
Co-occurrence matrix
Distributional vector
![Page 111: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/111.jpg)
Word Space
Automatic methods: count co-occurrences (c.f. the distributionalhypothesis)
Co-occurrence matrix
Distributional vector
![Page 112: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/112.jpg)
The Co-occurrence Matrix
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
a1,1 a1,2 a1,3 . . . a1,na2,1 a2,2 a2,3 . . . a2,na3,1 a3,2 a3,3 . . . a3,n...
......
. . ....
am,1 am,2 am,3 . . . am,n
w are the m (target) word types
c are the n contexts in the data
and the cells are word-context co-occurrence frequencies
![Page 113: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/113.jpg)
The Co-occurrence Matrix
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
a1,1 a1,2 a1,3 . . . a1,na2,1 a2,2 a2,3 . . . a2,na3,1 a3,2 a3,3 . . . a3,n...
......
. . ....
am,1 am,2 am,3 . . . am,n
w are the m (target) word types
c are the n contexts in the data
and the cells are word-context co-occurrence frequencies
![Page 114: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/114.jpg)
The Co-occurrence Matrix
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
a1,1 a1,2 a1,3 . . . a1,na2,1 a2,2 a2,3 . . . a2,na3,1 a3,2 a3,3 . . . a3,n...
......
. . ....
am,1 am,2 am,3 . . . am,n
w are the m (target) word types
c are the n contexts in the data
and the cells are word-context co-occurrence frequencies
![Page 115: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/115.jpg)
The Co-occurrence Matrix
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
a1,1 a1,2 a1,3 . . . a1,na2,1 a2,2 a2,3 . . . a2,na3,1 a3,2 a3,3 . . . a3,n...
......
. . ....
am,1 am,2 am,3 . . . am,n
w are the m (target) word types
c are the n contexts in the data
and the cells are word-context co-occurrence frequencies
![Page 116: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/116.jpg)
The Co-occurrence Matrix
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
0 1 0 . . . 11 0 2 . . . 00 2 1 . . . 3...
......
. . ....
0 0 1 . . . 0
w are the m (target) word types
c are the n contexts in the data
and the cells are word-context co-occurrence frequencies
![Page 117: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/117.jpg)
Distributional Vectors
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
0 1 0 . . . 11 0 2 . . . 00 2 1 . . . 3...
......
. . ....
0 0 1 . . . 0
Rows are n-dimensional distributional vectors
Words that occur in similar contexts get similar distributionalvectors
So how should we de�ne �context�?
![Page 118: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/118.jpg)
Distributional Vectors
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
0 1 0 . . . 11 0 2 . . . 00 2 1 . . . 3...
......
. . ....
0 0 1 . . . 0
Rows are n-dimensional distributional vectors
Words that occur in similar contexts get similar distributionalvectors
So how should we de�ne �context�?
![Page 119: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/119.jpg)
Distributional Vectors
c1 c2 c3 . . . cn
Am,n =
w1
w2
w3
...wm
0 1 0 . . . 11 0 2 . . . 00 2 1 . . . 3...
......
. . ....
0 0 1 . . . 0
Rows are n-dimensional distributional vectors
Words that occur in similar contexts get similar distributionalvectors
So how should we de�ne �context�?
![Page 120: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/120.jpg)
Words-by-region Matrices
r1 r2 r3 . . . rn
Am,n =
drink
w2
co�ee...wm
0 1 0 . . . 11 0 1 . . . 00 1 1 . . . 1...
......
. . ....
0 0 1 . . . 0
A region is a text unit (document, paragraph, sentence, etc.)r2 = �I drink co�ee�
Syntagmatic similarity: words that co-occur in the same region
![Page 121: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/121.jpg)
Words-by-region Matrices
r1 r2 r3 . . . rn
Am,n =
drink
w2
co�ee...wm
0 1 0 . . . 11 0 1 . . . 00 1 1 . . . 1...
......
. . ....
0 0 1 . . . 0
A region is a text unit (document, paragraph, sentence, etc.)r2 = �I drink co�ee�
Syntagmatic similarity: words that co-occur in the same region
![Page 122: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/122.jpg)
Words-by-region Matrices
r1 r2 r3 . . . rn
Am,n =
drink
w2
co�ee...wm
0 1 0 . . . 11 0 1 . . . 00 1 1 . . . 1...
......
. . ....
0 0 1 . . . 0
A region is a text unit (document, paragraph, sentence, etc.)r2 = �I drink co�ee�
Syntagmatic similarity: words that co-occur in the same region
![Page 123: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/123.jpg)
Words-by-words Matrices
w1 drink w3 . . . wm
Am,n =
co�ee
w2
tea...wm
0 1 0 . . . 11 0 2 . . . 00 2 0 . . . 3...
......
. . ....
0 0 1 . . . 0
Paradigmatic similarity: words that co-occur with the same other
words
Syntagmatic similarity by looking at the individual dimensions
![Page 124: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/124.jpg)
Words-by-words Matrices
w1 drink w3 . . . wm
Am,n =
co�ee
w2
tea...wm
0 1 0 . . . 11 0 2 . . . 00 2 0 . . . 3...
......
. . ....
0 0 1 . . . 0
Paradigmatic similarity: words that co-occur with the same other
words
Syntagmatic similarity by looking at the individual dimensions
![Page 125: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/125.jpg)
Words-by-words Matrices
w1 drink w3 . . . wm
Am,n =
co�ee
w2
tea...wm
0 1 0 . . . 11 0 2 . . . 00 2 0 . . . 3...
......
. . ....
0 0 1 . . . 0
Paradigmatic similarity: words that co-occur with the same other
words
Syntagmatic similarity by looking at the individual dimensions
![Page 126: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/126.jpg)
Context window
The region within which co-occurrences are collected
Parameters:
Size
Direction
Weighting
![Page 127: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/127.jpg)
Context window
The region within which co-occurrences are collected
Parameters:
Size
Direction
Weighting
![Page 128: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/128.jpg)
Context window
The region within which co-occurrences are collected
Parameters:
Size
Direction
Weighting
![Page 129: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/129.jpg)
Context window
The region within which co-occurrences are collected
Parameters:
Size
Direction
Weighting
![Page 130: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/130.jpg)
Context window
The region within which co-occurrences are collected
Parameters:
Size
Direction
Weighting
![Page 131: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/131.jpg)
Context window
[I drink very strong] coffee [at the cafe down the street ]1 1 1 1 1 1 1 1 1 1
An entire sentence as a �at context window
![Page 132: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/132.jpg)
Context window
[I drink very strong] coffee [at the cafe down the street ]1 1 1 1 1 1 1 1 1 1
An entire sentence as a �at context window
![Page 133: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/133.jpg)
Context window
I drink [very strong] coffee [at the ] cafe down the street
1 1 1 1
A 2+2-sized �at context window
![Page 134: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/134.jpg)
Context window
[I drink very strong] coffee [at the cafe down ] the street
1 2 3 4 4 3 2 1
A 4+4-sized distance weighted context window
![Page 135: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/135.jpg)
Context window
I drink very strong coffee [at the cafe down the street]1 0.9 0.8 0.7 0.6 0.5
A 0+6-sized distance weighted context window
![Page 136: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/136.jpg)
Context window
I [drink very strong] coffee [at the cafe] down the street
1 0 3 0 0 1
A 3+3-sized distance weighted context window that ignores stopwords
![Page 137: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/137.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
[12,0,234,92,1,0,87,525,0,0,1,2,0,8129,1,0,51,0,235...]
And then what?
![Page 138: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/138.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
[12,0,234,92,1,0,87,525,0,0,1,2,0,8129,1,0,51,0,235...]
And then what?
![Page 139: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/139.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
[12,0,234,92,1,0,87,525,0,0,1,2,0,8129,1,0,51,0,235...]
And then what?
![Page 140: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/140.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
[12,0,234,92,1,0,87,525,0,0,1,2,0,8129,1,0,51,0,235...]
And then what?
![Page 141: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/141.jpg)
The Meaning of Life
![Page 142: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/142.jpg)
Vector similarity
abuse
drink
tea
co�ee
drugs
![Page 143: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/143.jpg)
Vector similarity
abuse
drink
tea
co�ee
drugs
![Page 144: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/144.jpg)
Vector similarity
abuse
drink
tea
co�ee
drugs
![Page 145: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/145.jpg)
Vector similarity
abuse
drink
tea
co�ee
drugs
![Page 146: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/146.jpg)
Vector similarity
abuse
drink
tea
co�ee
drugs
![Page 147: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/147.jpg)
Vector similarity
Minkowski distance:(∑n
i=1|xi − yi |N
) 1N
City-Block (or Manhattan) distance: N = 1
Euclidean distance: N = 2
Chebyshev distance: N →∞
Scalar product: x · y = x1y1 + x2y2 + ...+ xnyn
Cosine: x ·y|x ||y | =
∑n
i=1 xiyi√∑n
i=1 x2i
√∑n
i=1 y2i
![Page 148: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/148.jpg)
Vector similarity
Minkowski distance:(∑n
i=1|xi − yi |N
) 1N
City-Block (or Manhattan) distance: N = 1
Euclidean distance: N = 2
Chebyshev distance: N →∞
Scalar product: x · y = x1y1 + x2y2 + ...+ xnyn
Cosine: x ·y|x ||y | =
∑n
i=1 xiyi√∑n
i=1 x2i
√∑n
i=1 y2i
![Page 149: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/149.jpg)
Vector similarity
Minkowski distance:(∑n
i=1|xi − yi |N
) 1N
City-Block (or Manhattan) distance: N = 1
Euclidean distance: N = 2
Chebyshev distance: N →∞
Scalar product: x · y = x1y1 + x2y2 + ...+ xnyn
Cosine: x ·y|x ||y | =
∑n
i=1 xiyi√∑n
i=1 x2i
√∑n
i=1 y2i
![Page 150: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/150.jpg)
Vector similarity
Minkowski distance:(∑n
i=1|xi − yi |N
) 1N
City-Block (or Manhattan) distance: N = 1
Euclidean distance: N = 2
Chebyshev distance: N →∞
Scalar product: x · y = x1y1 + x2y2 + ...+ xnyn
Cosine: x ·y|x ||y | =
∑n
i=1 xiyi√∑n
i=1 x2i
√∑n
i=1 y2i
![Page 151: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/151.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
And then what?
Compute similarity between such vectors
![Page 152: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/152.jpg)
Distributional vectors
Count how many times each target word occurs in a certain context
Collect (a function of) these frequency counts in vectors
And then what?
Compute similarity between such vectors
![Page 153: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/153.jpg)
Nearest Neighbors
Nearest neighbor search: extracting the k nearest neighbors to atarget word
1 compute the cosine similarity between the context vector ofthe target word and the context vectors of all other words inthe word space
2 sort the resulting similarities
3 return only the top-k words
![Page 154: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/154.jpg)
Nearest Neighbors
Nearest neighbor search: extracting the k nearest neighbors to atarget word
1 compute the cosine similarity between the context vector ofthe target word and the context vectors of all other words inthe word space
2 sort the resulting similarities
3 return only the top-k words
![Page 155: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/155.jpg)
Nearest Neighbors
Nearest neighbor search: extracting the k nearest neighbors to atarget word
1 compute the cosine similarity between the context vector ofthe target word and the context vectors of all other words inthe word space
2 sort the resulting similarities
3 return only the top-k words
![Page 156: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/156.jpg)
Nearest Neighbors
Nearest neighbor search: extracting the k nearest neighbors to atarget word
1 compute the cosine similarity between the context vector ofthe target word and the context vectors of all other words inthe word space
2 sort the resulting similarities
3 return only the top-k words
![Page 157: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/157.jpg)
Nearest Neighbors
dog
dog 1.000000cat 0.774165horse 0.668386fox 0.648134pet 0.626650
rabbit 0.615840pig 0.570504
animal 0.566003mongrel 0.560158sheep 0.551973pigeon 0.547442deer 0.534663rat 0.531442bird 0.527370
red
red 1.000000yellow 0.824409white 0.789056brown 0.723576grey 0.720103blue 0.700047pink 0.672573black 0.671302shiny 0.661379purple 0.633858striped 0.619801dark 0.610804gleam 0.603501palea 0.595221
![Page 158: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/158.jpg)
Building a word space: step-by-stepThe �linguistic� steps
I drank very strong �Arabica� coffees, at the cafe.
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 159: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/159.jpg)
Building a word space: step-by-stepThe �linguistic� steps
I drank very strong �Arabica� coffees, at the cafe.
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 160: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/160.jpg)
Building a word space: step-by-stepThe �linguistic� steps
I drank very strong Arabica coffees at the cafe
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 161: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/161.jpg)
Building a word space: step-by-stepThe �linguistic� steps
i drank very strong arabica coffees at the cafe
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 162: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/162.jpg)
Building a word space: step-by-stepThe �linguistic� steps
i drink very strong arabica coffee at the cafe
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 163: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/163.jpg)
Building a word space: step-by-stepThe �linguistic� steps
i drink very strong arabica coffee at the cafe
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 164: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/164.jpg)
Building a word space: step-by-stepThe �linguistic� steps
i/PN drink/VB very/AV strong/AJ arabica/NN coffee/NN at/PR the/DT
cafe/NN
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 165: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/165.jpg)
Building a word space: step-by-stepThe �linguistic� steps
(S (NP i/PN) (VP drink/VB (NP (AP very/AV strong/AJ) arabica/NN
coffee/NN) (PP at/PR (NP the/DT cafe/NN))))
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 166: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/166.jpg)
Building a word space: step-by-stepThe �linguistic� steps
(S (NP i/PN) (VP drink/VB (NP (AP very/AV strong/AJ) arabica/NN
coffee/NN) (PP at/PR (NP the/DT cafe/NN))))
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 167: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/167.jpg)
Building a word space: step-by-stepThe �linguistic� steps
(S (NP i/PN) (VP drink/VB (NP (AP very/AV strong/AJ) arabica/NN
coffee/NN) (PP at/PR (NP the/DT cafe/NN))))
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 168: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/168.jpg)
Building a word space: step-by-stepThe �linguistic� steps
(S (NP i/PN) (VP drink/VB (NP (AP very/AV strong/AJ) arabica/NN
coffee/NN) (PP at/PR (NP the/DT cafe/NN))))
Pre-process a corpus (optional, but recommended)(e.g. tokenization, downcasing, stemming/lemmatization...)
Linguistic mark-up(PoS tagging, syntactic parsing, named entity recognition...)
Select (the target words and) the linguistic contexts(e.g. text region, words...)
![Page 169: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/169.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 170: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/170.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 171: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/171.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 172: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/172.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 173: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/173.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 174: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/174.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 175: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/175.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 176: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/176.jpg)
Building a word space: step-by-stepThe �mathematical� steps
Count the target-context co-occurrences
Weight the contexts (optional, but recommended)(e.g. raw co-occurrence frequency, entropy, association
measures...)
Reduce the matrix dimensions (optional)(dimension reduction method: context selection, variance,
SVD, RI...)
(dimensionality)
Compute vector similarities (do nearest neighbor search)between distributional vectors(e.g. cosine, euclidean distance, etc.)
![Page 177: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/177.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 178: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/178.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 179: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/179.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 180: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/180.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 181: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/181.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 182: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/182.jpg)
Building a word space
Tools:
GSDM Guile Sparse Distributed Memory (Guile)
S-Space https://github.com/fozziethebeat/S-Space (Java)
SemanticVectors http://code.google.com/p/semanticvectors/(Java)
GenSim http://radimrehurek.com/gensim/ (Python)
Word2word http://www.indiana.edu/ semantic/word2word/(Java)
![Page 183: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/183.jpg)
Lab
Use GSDM to:
Build a words-by-documents model
Build a words-by-words model
Extract similar words
![Page 184: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/184.jpg)
Lab
Use GSDM to:
Build a words-by-documents model
Build a words-by-words model
Extract similar words
![Page 185: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/185.jpg)
Lab
Use GSDM to:
Build a words-by-documents model
Build a words-by-words model
Extract similar words
![Page 186: Distributional Semanticsattach.matita.net/caterinamauri/sitovecchio/692485562_1.pdfDistributional Semantics This course: Theoretical foundations ( the distributional hypothesis ) Basic](https://reader031.vdocuments.us/reader031/viewer/2022020214/5acc8ca17f8b9a27628c9521/html5/thumbnails/186.jpg)
Lab
Use GSDM to:
Build a words-by-documents model
Build a words-by-words model
Extract similar words