Document Modeling with Gated Recurrent Neural Network for
Sentiment Classification
Duyu Tang, Bing Qin, Ting Liu
Harbin Institute of Technology
1
Sentiment Classification
• Given a piece of text, sentiment classification focuses on inferring the sentiment polarity of the text, e.g. Positive / Negative or 1-5 stars
• The task can be at the word/phrase level, sentence level, or document level
• We target document-level sentiment classification in this work
2
Standard Supervised Learning Pipeline
[Diagram boxes: Training Data, Feature Representation, Learning Algorithm, Sentiment Classifier]
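As a rough illustration of the feature-representation step in this pipeline, the sketch below builds unigram count features of the kind a baseline classifier could consume. The whitespace tokenizer, the toy reviews, and all function names are assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_vocab(docs):
    # Map each distinct token to a column index, in sorted order.
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def unigram_features(doc, vocab):
    # Count-based feature vector; tokens outside the vocabulary are ignored.
    counts = Counter(tokenize(doc))
    return [counts.get(tok, 0) for tok in vocab]

# Toy training documents (made up for this example).
docs = ["great food great service", "terrible service"]
vocab = build_vocab(docs)
features = [unigram_features(d, vocab) for d in docs]
```

Each row of `features` would then be passed, together with its label, to the learning algorithm.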
3
Feature Learning Pipeline
[Diagram boxes: Training Data, Feature Representation, Learning Algorithm, Sentiment Classifier]
Learn text representation/feature from data!
4
Deep Learning Pipeline
[Diagram boxes: Training Data, Feature Representation, Learning Algorithm, Sentiment Classifier. The feature representation is built from Words (w1 w2 … wn−1 wn), Word Representation, and Semantic Composition]
• Represent each word as a low-dimensional, real-valued vector
• Solutions: Word2Vec, GloVe, SSWE
6
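The word-representation step can be sketched as a table lookup followed by some way of combining the vectors. The example below uses plain averaging, the simplest option. The 3-dimensional embedding values, the zero-vector fallback, and the function names are made up for illustration; real vectors would come from a trained model such as Word2Vec.

```python
# Toy 3-dimensional embeddings (hypothetical values, not from a trained model).
EMBEDDINGS = {
    "good": [0.5, 0.1, -0.2],
    "movie": [0.0, 0.3, 0.4],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed fallback for out-of-vocabulary words

def embed(words):
    # Look up one vector per word.
    return [EMBEDDINGS.get(w, UNKNOWN) for w in words]

def average(vectors):
    # Element-wise mean over a non-empty list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

doc_vector = average(embed(["good", "movie"]))
```

Averaging ignores word order, which is one motivation for the composition models on the following slides.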
• Compositionality: the meaning of a longer expression depends on the meaning of its constituents
• Solutions at sentence level: Recurrent NN, Recursive NN, Convolutional NN, Tree-Structured LSTM
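To illustrate what a composition function does, here is a minimal sketch of one of the listed options, a plain recurrent network that reads word vectors left to right and uses the last hidden state as the sentence vector. The weights and inputs are toy values chosen by hand, and the names `matvec` and `rnn_compose` are hypothetical; nothing here is trained or taken from the authors' code.

```python
import math

def matvec(matrix, vector):
    # Matrix-vector product on nested lists.
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_compose(word_vectors, W, U, b):
    # h_t = tanh(W x_t + U h_{t-1} + b); the last hidden state is the sentence vector.
    h = [0.0] * len(b)
    for x in word_vectors:
        wx, uh = matvec(W, x), matvec(U, h)
        h = [math.tanh(a + c + d) for a, c, d in zip(wx, uh, b)]
    return h

# Toy parameters and a two-word "sentence" of 2-dimensional word vectors.
W = [[0.5, 0.0], [0.0, 0.5]]
U = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]
sentence = [[1.0, 0.0], [0.0, 1.0]]
sentence_vector = rnn_compose(sentence, W, U, b)
```

Unlike averaging, this composition is order-sensitive: feeding the same word vectors in reverse order gives a different sentence vector.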
The idea of this work
• We want to build an end-to-end neural network approach for document-level sentiment classification
• Human beings solve this problem in a hierarchical way: represent sentences from words, and then represent the document from sentences
• We want to use the semantic/discourse relatedness between sentences to obtain the document representation
• We do not want to use an external discourse parser
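The hierarchical idea can be sketched as two nested composition steps: words to sentence vectors, then sentence vectors to a document vector. In the sketch below both steps default to simple averaging purely as a placeholder; the slides use a CNN/LSTM for sentences and a gated recurrent network for the document. The function names and toy vectors are assumptions.

```python
def average(vectors):
    # Element-wise mean over a non-empty list of equal-length vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def encode_document(document, compose_sentence=average, compose_document=average):
    # document: list of sentences; each sentence: list of word vectors.
    sentence_vectors = [compose_sentence(sentence) for sentence in document]
    return compose_document(sentence_vectors)

document = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence 1: two word vectors
    [[1.0, 1.0]],              # sentence 2: one word vector
]
doc_vector = encode_document(document)
```

Passing different functions for `compose_sentence` and `compose_document` swaps in other composition models without changing the overall structure.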
7
[Figure: model architecture, built up layer by layer from bottom to top]
• Word Representation: each sentence k (k = 1 … n) is a sequence of words w_1^k, w_2^k, w_3^k, …, w_{l_k−1}^k, w_{l_k}^k
• Sentence Composition: a CNN/LSTM over the words of each sentence yields its Sentence Representation
• Document Composition: a Forward Gated Neural Network and a Backward Gated Neural Network run over the sentence representations
• Document Representation: the output of document composition, fed to a Softmax layer
13
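The document-composition layer can be sketched roughly as follows. The slides do not give the gate equations, so this sketch assumes a simplified GRU-style update (one update gate, no reset gate, no biases). It also assumes that forward and backward states are concatenated and then averaged over sentences. All names, dimensions, and the random untrained weights are hypothetical.

```python
import math
import random

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_params(input_dim, hidden_dim, rng):
    # Assumed parameter layout: update gate (z) and candidate state (c).
    return {
        "Wz": random_matrix(hidden_dim, input_dim, rng),
        "Uz": random_matrix(hidden_dim, hidden_dim, rng),
        "Wc": random_matrix(hidden_dim, input_dim, rng),
        "Uc": random_matrix(hidden_dim, hidden_dim, rng),
    }

def gated_pass(sentence_vectors, p, hidden_dim):
    # Run the gated recurrence over sentence vectors; return every hidden state.
    h, states = [0.0] * hidden_dim, []
    for s in sentence_vectors:
        z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], s), matvec(p["Uz"], h))]
        c = [math.tanh(a + b) for a, b in zip(matvec(p["Wc"], s), matvec(p["Uc"], h))]
        # The gate z decides how much of the old state is replaced by the candidate.
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]
        states.append(h)
    return states

def softmax(scores):
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(sentence_vectors, fwd, bwd, W_out, hidden_dim):
    forward = gated_pass(sentence_vectors, fwd, hidden_dim)
    backward = gated_pass(sentence_vectors[::-1], bwd, hidden_dim)[::-1]
    # Concatenate both directions per sentence, then average over sentences.
    joined = [f + b for f, b in zip(forward, backward)]
    doc_vector = [sum(col) / len(joined) for col in zip(*joined)]
    return softmax(matvec(W_out, doc_vector))

rng = random.Random(0)
input_dim, hidden_dim, num_classes = 4, 3, 5
fwd = init_params(input_dim, hidden_dim, rng)
bwd = init_params(input_dim, hidden_dim, rng)
W_out = random_matrix(num_classes, 2 * hidden_dim, rng)
# Three random "sentence vectors" stand in for CNN/LSTM outputs.
sentences = [[rng.uniform(-1, 1) for _ in range(input_dim)] for _ in range(3)]
probs = classify(sentences, fwd, bwd, W_out, hidden_dim)
```

With untrained weights the output is close to uniform; the point is only to show how data flows from sentence vectors through both directions to a class distribution.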
14
Sentence Modeling
15
Sentence Modeling
Document Modeling
Method | Yelp 2015 (5-class) | IMDB (10-class)
Majority | 0.369 | 0.179
SVM + Unigrams | 0.611 | 0.399
SVM + Bigrams | 0.624 | 0.409
SVM + TextFeatures | 0.624 | 0.405
SVM + AverageWordVec | 0.568 | 0.319
Conv-Gated NN (BiDirectional Gated Avg) | 0.660 | 0.425
LSTM-Gated NN | 0.676 | 0.453

Document Modeling | Yelp 2015 (5-class) | IMDB (10-class)
Average | 0.614 | 0.366
Recurrent | 0.383 | 0.176
Recurrent Avg | 0.597 | 0.344
Gated NN | 0.651 | 0.430
Gated NN Avg | 0.657 | 0.416
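To make the effect of gating concrete, the short script below recomputes the absolute differences between the gated and ungated rows of the Document Modeling comparison. The numbers are copied from the slides; the dictionary layout and the `gain` helper are illustrative.

```python
# Scores copied from the Document Modeling rows: (Yelp 2015, IMDB).
results = {
    "Recurrent": (0.383, 0.176),
    "Recurrent Avg": (0.597, 0.344),
    "Gated NN": (0.651, 0.430),
    "Gated NN Avg": (0.657, 0.416),
}

def gain(gated, plain):
    # Absolute difference per dataset, rounded to three decimals.
    return tuple(round(g - p, 3) for g, p in zip(results[gated], results[plain]))

gate_gain = gain("Gated NN", "Recurrent")
gate_avg_gain = gain("Gated NN Avg", "Recurrent Avg")
print(gate_gain, gate_avg_gain)
```

The gap is largest for the plain recurrent model, which scores close to the majority baseline on both datasets.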
25
In Summary
• We develop a neural network approach for document-level sentiment classification.
• We model documents with a gated recurrent neural network, and we show that adding neural gates significantly boosts classification accuracy.
• The code and datasets are available at: http://ir.hit.edu.cn/~dytang
26
Thanks
27