Deep Learning for NLP · nlp.skku.edu/talks/DL-WordEmbedding(Youngjoong Ko).pdf · 2019-09-10
TRANSCRIPT
![Page 1: Deep Learning for NLPnlp.skku.edu/talks/DL-WordEmbedding(Youngjoong Ko).pdf · 2019-09-10 · 45 Tools for Word Embedding Word2Vec 4 F½ 46 Tools for Word Embedding Word2Vec parameters](https://reader033.vdocuments.us/reader033/viewer/2022060415/5f1310ad11c6c13574567718/html5/thumbnails/1.jpg)
Ko, Youngjoong
September 9, 2015
Dept. of Computer Engineering, Dong-A University
Deep Learning for NLP - Word Embedding -
Contents
1. Basic Concepts of Neural Network (NN)
2. Why do we need Deep Learning?
3. Learning Representation for NLP
4. Tools for Word Embedding
   - Word2Vec
   - Ranking-based
Basic Concepts of NN
• Perceptron
• Illustration Example (Apple Tree)
• Multilayer Neural Network
• Training (Weight Optimization)
• Training (Activation Functions)
• Scoring Functions (Softmax)
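The formula on the softmax slide did not survive extraction, so the following is a minimal sketch that assumes the standard definition: each raw score is exponentiated and divided by the sum of all exponentials. The function name `softmax` and the sample scores are illustrative only.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score receives the largest probability, and equal scores receive equal probabilities.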
• Learning: Backpropagation
  - Calculate error at the output
  - Back-propagation = gradient descent + chain rule
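A minimal sketch of "gradient descent + chain rule" for a single sigmoid neuron. The squared-error loss, the learning rate, and the names `train_step`, `w`, and `b` are assumptions made for illustration; the slides do not specify them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr):
    """One backpropagation step for a single sigmoid neuron."""
    # Forward pass
    y = sigmoid(w * x + b)
    loss = 0.5 * (y - target) ** 2
    # Error at the output, then the chain rule back to the parameters
    dloss_dy = y - target
    dy_dz = y * (1.0 - y)
    delta = dloss_dy * dy_dz
    grad_w = delta * x
    grad_b = delta
    # Gradient descent update
    return w - lr * grad_w, b - lr * grad_b, loss

w, b = 0.5, 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, x=1.0, target=1.0, lr=0.5)
    losses.append(loss)
print(losses[0], losses[-1])
```

In a multilayer network the same `delta` term is passed backwards layer by layer, which is where the chain rule is applied repeatedly.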
• Neural Network: Core Components
• Neural Network: Process
Why? Deep Learning
• Why was the old NN not successful?
• Neural Network: Process
  - parameter
  - Parameter Local Minima
• Initialization Problem
• Initialization Tip
• Deeper Network, Harder Learning
  - Network, Error Propagation, ReLU
• Pre-Training
  - Pre-training NN
  - AutoEncoder, Restricted Boltzmann Machine
• Pre-Training: Performance
• Auto Encoder
Learning Representation for NLP
• One-hot representation (or symbolic)
  - Ex) [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
  - Dimensionality: 20K (speech) – 50K (PTB) – 500K (big vocab) – 3M (Google 1T)
• Continuous representation
  - Latent Semantic Analysis, Random projection
  - Latent Dirichlet Allocation, HMM clustering
  - Distributed Representation (Neural word embedding)
    - Dense vector
    - By adding supervision from other tasks -> improve the representation
• Distributed Representation
  - DNN, AI, Object, Symbol
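To make the contrast between the two representations concrete, here is a small sketch. The three-word vocabulary and the dense vector values are made up for illustration; real embeddings are learned and have far more dimensions.

```python
import math

vocab = ["apple", "banana", "car"]

def one_hot(word):
    """Sparse symbolic vector: a single 1 at the word's vocabulary index."""
    return [1.0 if w == word else 0.0 for w in vocab]

# Hypothetical dense embeddings (values invented for this example).
dense = {
    "apple": [0.9, 0.1],
    "banana": [0.8, 0.2],
    "car": [0.1, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# One-hot vectors of different words share no dimension, so similarity is 0.
print(cosine(one_hot("apple"), one_hot("banana")))
# Dense vectors can place related words close together.
print(cosine(dense["apple"], dense["banana"]))
print(cosine(dense["apple"], dense["car"]))
```

With one-hot vectors every pair of distinct words is equally dissimilar, and the vector length grows with the vocabulary; a dense vector keeps a fixed, small dimensionality.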
• Distributed Representation
  - Curse of Dimensionality
• Good One – Word Representation
• Neural Network Language Model
• Back-Propagation Algorithm
• Ranking-based (Collobert)
• Recurrent Neural Network
• Word2Vec: CBOW, Skip-Gram
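The difference between CBOW and Skip-Gram can be sketched by the training examples each one builds from a sentence. This is only an illustration of the input/output pairing, not the word2vec implementation itself; the function names and the fixed `window` are assumptions (the real tool also samples the window size and trains a network on these pairs).

```python
def cbow_examples(tokens, window):
    """CBOW: predict the center word from its surrounding context words."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, center))
    return examples

def skipgram_examples(tokens, window):
    """Skip-gram: predict each context word from the center word."""
    return [(center, ctx)
            for context, center in cbow_examples(tokens, window)
            for ctx in context]

tokens = "the cat sat on the mat".split()
print(cbow_examples(tokens, window=1)[:2])
print(skipgram_examples(tokens, window=1)[:3])
```

CBOW yields one example per position (many inputs, one output), while Skip-Gram yields one example per (center, context) pair.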
Tools for Word Embedding
• Word2Vec
  - https://code.google.com/p/word2vec/
  - http://deeplearning4j.org/word2vec.html#just
  - Ubuntu (JAVA)
• Googlecode
  - svn checkout http://word2vec.googlecode.com/svn/trunk
  - trunk
• Word2Vec
  - Make
  - Warning
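• Word2Vec parameters
  - -output: file in which the resulting word vectors are saved
  - -size: size of the word vectors (default value: 100)
  - -window: max skip length between words (default value: 5)
  - -cbow: 1: continuous bag of words model, 0: skip-gram model
  - -iter: number of training iterations (default value: 5)
  - -min-count: words appearing fewer times than this are discarded (default value: 5)
  - -save-vocab: file in which the vocabulary is saved
  - -read-vocab: file from which the vocabulary is read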
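As a sketch of how these flags fit together, the helper below assembles a word2vec command line in Python. The binary path `./word2vec`, the file names, and the helper name `word2vec_command` are assumptions; `-train` (the training text file) comes from the following slide, and `-window` is the spelling the tool accepts.

```python
def word2vec_command(train_file, output_file, size=100, window=5,
                     cbow=1, iterations=5, min_count=5):
    """Build the argument list for the word2vec command-line tool."""
    return [
        "./word2vec",              # assumed location of the compiled binary
        "-train", train_file,
        "-output", output_file,
        "-size", str(size),
        "-window", str(window),
        "-cbow", str(cbow),        # 1: CBOW, 0: skip-gram
        "-iter", str(iterations),
        "-min-count", str(min_count),
    ]

cmd = word2vec_command("corpus.txt", "vectors.txt", cbow=0)
print(" ".join(cmd))
# To actually train, pass the list to subprocess.run(cmd, check=True).
```

Building the arguments as a list avoids shell quoting problems when file names contain spaces.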
• Word2Vec
  - -train: text file used to train the model
• Tutorial
  - http://alexminnaar.com/word2vec-tutorial-part-ii-the-continuous-bag-of-words-model.html
• Ranking-based Model
  - https://bitbucket.org/aboSamoor/word2embeddings
  - Python, Theano
  - Result file (pickle)
    - a = np.load(sys.argv[2])
References
• Collobert, R., et al. “Natural Language Processing (Almost) from Scratch,” Journal of Machine Learning Research, 2011.
• Mikolov, T., et al. “Recurrent Neural Network based Language Model,” 2010.
• Mikolov, T., et al. “Distributed Representations of Words and Phrases and their Compositionality,” NIPS, 2013.
• Le, Q., Mikolov, T. “Distributed Representations of Sentences and Documents,” ICML, 2014.
• Kalchbrenner, N., Grefenstette, E., Blunsom, P. “A Convolutional Neural Network for Modelling Sentences,” ACL, 2014.
• Kim, Y. “Convolutional Neural Networks for Sentence Classification,” EMNLP, 2014.
• Al-Rfou, R., Perozzi, B., Skiena, S. “Polyglot: Distributed Word Representations for Multilingual NLP,” ACL, 2013.
• “Introduction to Deep Learning,” 2015.
• “Word and Phrase Embedding,” 2015.
Thank you for your attention! http://web.donga.ac.kr/yjko/