Unsupervised Knowledge-Free Word Sense Disambiguation
TRANSCRIPT
Unsupervised Knowledge-Free Word Sense Disambiguation
Dr. Alexander Panchenko
University of Hamburg, Language Technology Group
23 February, 2017
Overview
Introduction
Dense Sense Representations
Sparse Sense Representations
Future Work
About me
- 2008: Engineering degree (M.S.) in Computer Science, Moscow State Technical University
- 2009: Research intern, Xerox Research Centre Europe
- 2013: PhD in Natural Language Processing, University of Louvain
- 2013: Research engineer at a start-up related to social network analysis (Digsolab)
- 2015: Postdoc at Technical University of Darmstadt
- 2017: Postdoc at University of Hamburg

Topics: computational lexical semantics (semantic similarity/relatedness, semantic relations, sense induction, sense disambiguation), NLP for social network analysis, text categorization.
Papers, presentations, datasets: http://panchenko.me
Publications Related to the Talk
- Pelevina M., Arefiev N., Biemann C., Panchenko A. (2016) Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP, ACL 2016, Berlin, Germany. Best Paper Award
- Panchenko A., Simon J., Riedl M., Biemann C. (2016) Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics. In Proceedings of KONVENS 2016, Bochum, Germany
- Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., and Biemann C. (2017) Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017), Valencia, Spain
Motivation for Unsupervised Knowledge-Free WSD
- A word sense disambiguation (WSD) system:
  - Input: a word and its context.
  - Output: a sense of this word.
- Existing approaches (Navigli, 2009):
  - Knowledge-based approaches rely on hand-crafted resources, such as WordNet.
  - Supervised approaches learn from hand-labeled training data, such as SemCor.
- Problem 1: hand-crafted lexical resources and training data are expensive to create, often inconsistent, and domain-dependent.
- Problem 2: these methods assume a fixed sense inventory:
  - senses emerge and disappear over time;
  - different applications require different granularities.
Motivation for Unsupervised Knowledge-Free WSD (cont.)
- An alternative route is the unsupervised knowledge-free approach:
  - learn an interpretable sense inventory;
  - learn a disambiguation model.
Dense Sense Representations for WSD
- Pelevina M., Arefiev N., Biemann C., Panchenko A. Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP, ACL 2016, Berlin, Germany.
- An approach to learn word sense embeddings.
Overview of the contribution
Prior methods:
- Induce an inventory by clustering of word instances (Li and Jurafsky, 2015)
- Use existing inventories (Rothe and Schütze, 2015)

Our method:
- Input: word embeddings
- Output: word sense embeddings
- Word sense induction by clustering of word ego-networks
- Word sense disambiguation based on the induced sense representations
Learning Word Sense Embeddings
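The figure for this slide does not survive the transcript. As a rough sketch of the learning step (the unweighted pooling scheme below is a simplifying assumption of mine, not necessarily the paper's exact procedure), a sense embedding can be obtained by averaging the word vectors of the neighbours assigned to each induced sense cluster:

```python
# Sketch: derive sense vectors by averaging the word vectors of each
# induced sense cluster. Toy 2-d vectors and cluster assignments are
# invented for illustration; assignments would come from ego-network
# clustering in the actual method.
word_vectors = {
    "tray":   [0.9, 0.1],
    "bowl":   [0.8, 0.2],
    "column": [0.1, 0.9],
    "grid":   [0.2, 0.8],
}

# Hypothetical induced clusters for the ego-network of "table"
clusters = {
    "table#0": ["column", "grid"],   # data-table sense
    "table#1": ["tray", "bowl"],     # furniture sense
}

def mean_vector(words, vectors):
    """Unweighted average of the given word vectors."""
    dim = len(next(iter(vectors.values())))
    out = [0.0] * dim
    for w in words:
        for i, x in enumerate(vectors[w]):
            out[i] += x
    return [x / len(words) for x in out]

sense_vectors = {s: mean_vector(ws, word_vectors) for s, ws in clusters.items()}
print(sense_vectors["table#1"])  # average of the "tray" and "bowl" vectors
```

Because each sense vector lives in the same space as the word vectors, it can be compared directly with context word vectors at disambiguation time.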
Word Sense Induction: Ego-Network Clustering
- Graph clustering using the Chinese Whispers algorithm (Biemann, 2006).
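Chinese Whispers itself fits in a few lines. This is a simplified, deterministic variant for illustration (the original algorithm visits nodes in random order each iteration): every node repeatedly adopts the label with the highest total edge weight among its neighbours.

```python
# Sketch of Chinese Whispers graph clustering (Biemann, 2006), simplified.
def chinese_whispers(edges, iterations=10):
    """edges: {node: {neighbour: weight}}; returns {node: cluster_label}."""
    labels = {node: i for i, node in enumerate(sorted(edges))}
    for _ in range(iterations):
        for node in sorted(edges):  # fixed order for reproducibility
            scores = {}
            for nb, w in edges[node].items():
                scores[labels[nb]] = scores.get(labels[nb], 0.0) + w
            if scores:
                # adopt the strongest neighbour label; break ties by label id
                labels[node] = max(scores, key=lambda l: (scores[l], l))
    return labels

# Toy ego-network with two clearly separated neighbourhoods of "table"
graph = {
    "tray":   {"bowl": 1.0, "basket": 1.0},
    "bowl":   {"tray": 1.0, "basket": 1.0},
    "basket": {"tray": 1.0, "bowl": 1.0},
    "column": {"grid": 1.0, "cursor": 1.0},
    "grid":   {"column": 1.0, "cursor": 1.0},
    "cursor": {"column": 1.0, "grid": 1.0},
}
labels = chinese_whispers(graph)
print(labels)
```

On this toy graph the two neighbourhoods end up with distinct labels; each resulting cluster is then taken as one induced sense of the ego word.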
Neighbours of Word and Sense Vectors
Vector   Nearest Neighbours
table    tray, bottom, diagram, bucket, brackets, stack, basket, list, parenthesis, cup, trays, pile, playfield, bracket, pot, drop-down, cue, plate
table#0  leftmost#0, column#1, randomly#0, tableau#1, top-left#0, indent#1, bracket#3, pointer#0, footer#1, cursor#1, diagram#0, grid#0
table#1  pile#1, stool#1, tray#0, basket#0, bowl#1, bucket#0, box#0, cage#0, saucer#3, mirror#1, birdcage#0, hole#0, pan#1, lid#0
- Neighbours of the word "table" and its senses produced by our method.
- The neighbours of the initial vector belong to both senses.
- The neighbours of the sense vectors are sense-specific.
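The contrast in the table above can be reproduced in principle with a plain nearest-neighbour query over the vector space (the toy vectors below are invented; the real lists come from the trained embeddings):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def nearest_neighbours(query, vectors, k=3):
    """Rank all entries (except the query itself) by cosine similarity."""
    ranked = sorted((v for v in vectors if v != query),
                    key=lambda v: cosine(vectors[query], vectors[v]),
                    reverse=True)
    return ranked[:k]

# Hypothetical toy space: one word vector and two sense vectors for "table"
vectors = {
    "table":   [0.5, 0.5],   # mixes both senses
    "table#0": [0.1, 0.9],   # data-table sense
    "table#1": [0.9, 0.1],   # furniture sense
    "grid":    [0.2, 0.8],
    "tray":    [0.8, 0.2],
}
print(nearest_neighbours("table#0", vectors, k=1))
```

The ambiguous "table" vector sits between the two regions, while each sense vector retrieves only neighbours from its own region, which is exactly the behaviour the slide illustrates.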
Word Sense Disambiguation
1. Context Extraction
   - use context words around the target word
2. Context Filtering
   - based on each context word's relevance for disambiguation
3. Sense Choice
   - maximize similarity between the context vector and sense vectors
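A minimal sketch of the three steps above (all names and vectors are hypothetical, and the filtering here is reduced to discarding out-of-vocabulary words, whereas the actual method ranks context words by their relevance for disambiguation): average the retained context word vectors and pick the sense vector with the highest cosine similarity.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context_words, word_vectors, sense_vectors):
    """Pick the sense whose vector is most similar to the mean context vector."""
    known = [w for w in context_words if w in word_vectors]  # crude filtering
    if not known:
        return None
    dim = len(next(iter(word_vectors.values())))
    ctx = [sum(word_vectors[w][i] for w in known) / len(known) for i in range(dim)]
    return max(sense_vectors, key=lambda s: cosine(ctx, sense_vectors[s]))

# Hypothetical toy vectors
word_vectors = {"chair": [0.9, 0.1], "dinner": [0.8, 0.3], "sql": [0.1, 0.9]}
sense_vectors = {"table#0": [0.1, 0.9],   # database sense
                 "table#1": [0.9, 0.1]}   # furniture sense
print(disambiguate(["chair", "dinner"], word_vectors, sense_vectors))
```

With the toy vectors above, a context about chairs and dinner selects the furniture sense, while a context about SQL selects the database sense.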
Word Sense Disambiguation: Example
Evaluation on SemEval 2013 Task 13 Dataset: Comparison to the State-of-the-art

Model                        Jacc.  Tau    WNDCG  F.NMI  F.B-Cubed
AI-KU (add1000)              0.176  0.609  0.205  0.033  0.317
AI-KU                        0.176  0.619  0.393  0.066  0.382
AI-KU (remove5-add1000)      0.228  0.654  0.330  0.040  0.463
Unimelb (5p)                 0.198  0.623  0.374  0.056  0.475
Unimelb (50k)                0.198  0.633  0.384  0.060  0.494
UoS (#WN senses)             0.171  0.600  0.298  0.046  0.186
UoS (top-3)                  0.220  0.637  0.370  0.044  0.451
La Sapienza (1)              0.131  0.544  0.332  –      –
La Sapienza (2)              0.131  0.535  0.394  –      –
AdaGram, α = 0.05, 100 dim   0.274  0.644  0.318  0.058  0.470
w2v                          0.197  0.615  0.291  0.011  0.615
w2v (nouns)                  0.179  0.626  0.304  0.011  0.623
JBT                          0.205  0.624  0.291  0.017  0.598
JBT (nouns)                  0.198  0.643  0.310  0.031  0.595
TWSI (nouns)                 0.215  0.651  0.318  0.030  0.573
Conclusion
- Novel approach for learning word sense embeddings.
- Can use existing word embeddings as input.
- WSD performance comparable to state-of-the-art systems.
- Source code and pre-trained models: https://github.com/tudarmstadt-lt/SenseGram
Sparse Sense Representations for WSD
- Panchenko A., Simon J., Riedl M., Biemann C. (2016) Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics. In Proceedings of KONVENS 2016, Bochum, Germany
- Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., and Biemann C. (2017) Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL'2017), Valencia, Spain
Contributions
- A framework that relies on induced inventories as a pivot for learning contextual feature representations and disambiguation.
- The method can integrate several types of context features in an unsupervised way.
- The method is interpretable at several levels.
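A toy sketch of the sparse idea (all feature names and weights below are invented for illustration): each induced sense is represented as a sparse bag of weighted context features, and a context is scored against each sense by summing the weights of the features it matches.

```python
# Sketch: senses as sparse feature vectors (dicts), disambiguation by
# scoring the overlap between context features and each sense's features.
# Features and weights are hypothetical, not taken from the actual model.
sense_features = {
    "python#0": {"snake": 2.0, "bite": 1.5, "venom": 1.2},      # animal sense
    "python#1": {"code": 2.0, "library": 1.4, "script": 1.3},   # language sense
}

def score(context_features, features):
    """Sum the weights of sense features that occur in the context."""
    return sum(w for f, w in features.items() if f in context_features)

def disambiguate(context_features):
    return max(sense_features,
               key=lambda s: score(context_features, sense_features[s]))

print(disambiguate({"import", "code", "script"}))
```

A side benefit of this representation, in line with the interpretability claim above: the matched features directly explain why a sense was chosen.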
Outline of the Method
[Figure: pipeline diagram — a training corpus yields contexts via three feature types (dependencies, language model, co-occurrences); word and feature similarities are computed from word–feature counts; word sense induction over the word similarities produces a word sense inventory; feature extraction aggregates word–feature counts from contexts and corpus; disambiguation combines the feature types via meta-combination and outputs disambiguated contexts.]

Figure: Outline of our unsupervised interpretable method for word sense induction and disambiguation
Interpretable Unsupervised Knowledge-Free WSD
Interpretability levels of our model
1. word sense inventory;
2. sense feature representation;
3. results of disambiguation in context.
WSD based on an Induced Word Sense Inventory
Results on the TWSI Dataset
Table: WSD performance of different configurations of our method on the full and the sense-balanced TWSI datasets, based on the coarse inventory with 1.96 senses/word
Impact of Word Sense Inventory Granularity on WSD Performance: the TWSI Dataset
Results on the SemEval 2013 Task 13: Word Sense Induction and Disambiguation

Table: WSD performance of the best configuration of our method, identified on the TWSI dataset, as compared to participants of the SemEval 2013 Task 13 and two systems based on word sense embeddings (AdaGram and SenseGram).
Demonstrating Unsupervised Knowledge-Free WSD
WSD without Sense Inventory: Co-Sets
[Figure: example co-set — a hypernym layer {fruit#1, food#0} above a co-hyponym layer {apple#2, mango#0, pear#0}, connected by hypernymy and co-hyponymy edges.]
WSD without Sense Inventory: Co-Sets
Co-set 1
  Hypernym layer, H(c) ⊂ S: vegetable#0, fruit#0, crop#0, ingredient#0, food#0
  Co-hyponym layer, c ⊂ S: peach#0, banana#0, pineapple#0, berry#0, blackberry#0, grapefruit#0, strawberry#0, blueberry#0, fruit#0, grape#0, melon#0, orange#0, pear#0, plum#0, raspberry#0, watermelon#0, apple#0, apricot#0, cherry#0

Co-set 2
  Hypernym layer, H(c) ⊂ S: programming language#3, technology#0, language#0, format#2, app#0
  Co-hyponym layer, c ⊂ S: C#4, Basic#2, Haskell#5, Flash#1, Java#1, Pascal#0, Ruby#6, PHP#0, Ada#1, Oracle#3, Python#3, Apache#3, Visual Basic#1, ASP#2, Delphi#2, SQL Server#0, CSS#0, AJAX#0, the Java#0
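One way such co-sets could drive disambiguation without a fixed inventory (the overlap scoring below is an assumption for illustration, not the paper's exact procedure): choose the co-set whose hypernym and co-hyponym members overlap most with the context.

```python
# Sketch: pick the co-set whose members (hypernyms + co-hyponyms, sense ids
# stripped) overlap most with the context words. The sets below are trimmed
# versions of the examples in the table; the scoring is an illustrative
# assumption.
co_sets = {
    1: {"hypernyms": ["vegetable", "fruit", "crop", "ingredient", "food"],
        "cohyponyms": ["peach", "banana", "pineapple", "apple", "cherry"]},
    2: {"hypernyms": ["programming language", "technology", "language"],
        "cohyponyms": ["Haskell", "Java", "Pascal", "Python", "PHP"]},
}

def score(context, co_set):
    members = co_set["hypernyms"] + co_set["cohyponyms"]
    return sum(1 for m in members if m.lower() in context)

def disambiguate(context_words):
    context = {w.lower() for w in context_words}
    return max(co_sets, key=lambda cid: score(context, co_sets[cid]))

print(disambiguate(["fresh", "apple", "and", "banana", "juice"]))
```

A fruit-flavoured context lands in co-set 1, a programming context in co-set 2; the matching members again double as an explanation of the decision.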
Thank you!