Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification (ACM KDD '09)
Prem Melville, Wojciech Gryc, Richard D. Lawrence
Date: 07/07/09
Speaker: Hsu, Yu-Wen
Advisor: Dr. Koh, Jia-Ling




Outline

Introduction
Baseline Approaches
Pooling Multinomials
Empirical Evaluation
Conclusion & Future Works


Introduction

Most prior work in sentiment analysis uses knowledge-based approaches that classify the sentiment of texts based on dictionaries defining the sentiment polarity of words, together with simple linguistic patterns.

Recently, there have been some studies that take a machine learning approach and build text classifiers trained on documents that have been human-labeled as positive or negative. Drawbacks: such approaches do not adapt well to different domains, and they require much effort in human annotation of documents.


We present a new machine learning approach that overcomes these drawbacks by effectively combining background lexical knowledge with supervised learning.

We construct a generative model based on a lexicon of sentiment-laden words, and a second model trained on labeled documents. The distributions from these two models are then adaptively pooled to create a composite multinomial Naïve Bayes classifier that captures both sources of information.


Baseline Approaches

Lexical classification: Given a lexicon of positive and negative terms, one straightforward approach to using this information is to measure the frequency of occurrence of these terms in each document.

Feature supervision: Use a lexicon along with unlabeled data in a semi-supervised learning setting. There have been few approaches to incorporating such background knowledge into learning.
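The lexical classification baseline can be sketched in a few lines of Python. This is only an illustration: the tiny lexicon, the tokenizer, and the rule that a document takes the label of whichever term set occurs more often are assumptions made here, not details given on the slide.

```python
import re

# Hypothetical toy lexicon; a real sentiment lexicon would hold many more terms.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def lexical_classify(text):
    """Label a document by comparing counts of positive and negative lexicon terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "tie"  # assumption: ties are reported rather than broken arbitrarily


print(lexical_classify("I love this great product"))
```

No labeled documents are needed for this baseline, which is its appeal; its weakness is that it ignores every word outside the lexicon.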


Pooling Multinomials

The multinomial Naïve Bayes classifier commonly used for text categorization relies on three assumptions:
(1) documents are produced by a mixture model;
(2) there is a one-to-one correspondence between each mixture component and class;
(3) each mixture component is a multinomial distribution of words, i.e., given a class, the words in a document are produced independently of each other.


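The likelihood of a document D is a sum over all mixture components:

$$P(D) = \sum_{j} P(c_j)\, P(D \mid c_j)$$

$P(c_j)$: the probability of the class
$P(D \mid c_j)$: the probability of the document given the class

Since the words of a document are generated independently given the class:

$$P(D \mid c_j) = \prod_{i} P(w_i \mid c_j)$$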


Compute the class membership probabilities of each class:

$$P(c_j \mid D) = \frac{P(c_j) \prod_{i} P(w_i \mid c_j)}{P(D)}$$

The class with the highest posterior probability $P(c_j \mid D)$ is predicted.
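The classification rule above can be sketched as a small multinomial Naïve Bayes implementation. This is a minimal sketch: the function names, the toy training data, and the use of Laplace (add-one) smoothing are assumptions chosen for illustration and are not taken from the slides.

```python
import math
from collections import Counter


def train_nb(docs, labels, vocab):
    """Estimate class priors and Laplace-smoothed word probabilities per class."""
    priors, cond = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        priors[c] = len(class_docs) / len(docs)
        counts = Counter(w for d in class_docs for w in d if w in vocab)
        total = sum(counts.values())
        cond[c] = {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}
    return priors, cond


def posterior(doc, priors, cond):
    """Return the class membership probabilities, computed in log space."""
    log_joint = {
        c: math.log(priors[c])
        + sum(math.log(cond[c][w]) for w in doc if w in cond[c])
        for c in priors
    }
    top = max(log_joint.values())
    unnorm = {c: math.exp(v - top) for c, v in log_joint.items()}
    z = sum(unnorm.values())  # proportional to the document likelihood P(D)
    return {c: v / z for c, v in unnorm.items()}


def predict(doc, priors, cond):
    """Pick the class with the highest membership probability."""
    post = posterior(doc, priors, cond)
    return max(post, key=post.get)


# Toy, hand-made training set (hypothetical).
docs = [["good", "great"], ["good", "love"], ["bad", "poor"], ["bad", "hate"]]
labels = ["+", "+", "-", "-"]
vocab = {"good", "great", "love", "bad", "poor", "hate"}
priors, cond = train_nb(docs, labels, vocab)
print(predict(["good", "good"], priors, cond))
```

Working in log space avoids numerical underflow when a document contains many words, since the product of many small probabilities quickly rounds to zero.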


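*Combining probability distributions

Linear opinion pool

$$P(w_i \mid c_j) = \sum_{k=1}^{K} \alpha_k P_k(w_i \mid c_j)$$

K: the number of experts
$P_k(w_i \mid c_j)$: the probability assigned by expert k to word $w_i$ occurring in a document of class $c_j$
$\alpha_k$: the weights, which sum to one ($\sum_k \alpha_k = 1$)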


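Logarithmic opinion pool

$$P(w_i \mid c_j) = \frac{1}{Z} \prod_{k=1}^{K} P_k(w_i \mid c_j)^{\alpha_k}$$

Z: a normalizing constant
$\alpha_k$: the weights, which satisfy restrictions that assure that $P(w_i \mid c_j)$ is a probability distribution

Weights of individual experts

$$\alpha_k = \log \frac{1 - err_k}{err_k}$$

$err_k$: the error of expert k on the training set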
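Both pooling schemes can be sketched for a single class as follows. The function names and the two toy distributions are hypothetical. The weight rule used here, the log-odds of each expert's training accuracy rescaled to sum to one, is an assumption made for this sketch, and it only yields positive weights when every error is below 0.5.

```python
import math


def linear_pool(experts, weights):
    """Weighted average of the experts' word distributions for one class."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    vocab = experts[0].keys()
    return {w: sum(a * e[w] for a, e in zip(weights, experts)) for w in vocab}


def log_pool(experts, weights):
    """Weighted geometric mean of the experts' distributions, renormalized by Z."""
    vocab = experts[0].keys()
    unnorm = {
        w: math.prod(e[w] ** a for a, e in zip(weights, experts)) for w in vocab
    }
    z = sum(unnorm.values())  # normalizing constant Z
    return {w: v / z for w, v in unnorm.items()}


def expert_weights(errors):
    """Log-odds weights from training errors, rescaled to sum to one (assumption)."""
    raw = [math.log((1.0 - e) / e) for e in errors]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical word distributions for the positive class from two experts:
# a lexicon-based model and a model trained on labeled documents.
lexicon_expert = {"good": 0.6, "bad": 0.1, "the": 0.3}
data_expert = {"good": 0.4, "bad": 0.2, "the": 0.4}

weights = expert_weights([0.4, 0.2])  # the second expert makes fewer errors
pooled_linear = linear_pool([lexicon_expert, data_expert], weights)
pooled_log = log_pool([lexicon_expert, data_expert], weights)
print(pooled_linear["good"], pooled_log["good"])
```

The linear pool always stays between the individual experts' probabilities, while the logarithmic pool is pulled strongly toward zero whenever any expert assigns a word a very small probability.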


*A generative background-knowledge model

Definitions:
– V: the vocabulary, i.e., the set of words in our domain
– P: the set of positive terms from the lexicon that exist in V
– N: the set of negative terms from the lexicon that exist in V
– U: the set of unknown terms, i.e. V − (N ∪ P)
– m: the size of the vocabulary, i.e. |V|
– p: the number of positive terms, i.e. |P|
– n: the number of negative terms, i.e. |N|
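As a concrete illustration of these definitions, the sketch below splits a vocabulary into positive, negative, and unknown terms using Python sets. The function name, the dictionary keys (P, N, U, m, p, n), and the toy word lists are assumptions made for this example.

```python
def partition_vocabulary(vocabulary, positive_lexicon, negative_lexicon):
    """Split the vocabulary into positive, negative, and unknown terms."""
    V = set(vocabulary)
    P = V & set(positive_lexicon)  # lexicon terms count only if they occur in V
    N = V & set(negative_lexicon)
    U = V - (P | N)                # everything the lexicon says nothing about
    return {"P": P, "N": N, "U": U, "m": len(V), "p": len(P), "n": len(N)}


parts = partition_vocabulary(
    ["good", "bad", "the", "movie"],  # hypothetical domain vocabulary
    ["good", "great"],                # hypothetical positive lexicon
    ["bad", "awful"],                 # hypothetical negative lexicon
)
print(parts)
```

Lexicon terms that never occur in the domain vocabulary ("great" and "awful" here) are dropped, so the counts p and n reflect only terms the classifier can actually observe.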


Property 1:


Property 2:

For this to be true for all values of α and β


Property 3: Since a positive document is more likely to contain a positive term than a negative term, and vice versa, we would like (with r as the polarity level):

$$\frac{P(w_+ \mid +)}{P(w_- \mid +)} = r, \qquad \frac{P(w_- \mid -)}{P(w_+ \mid -)} = r$$

Property 4: Since each component of our mixture model is a probability distribution, we have the following constraint on the conditional probabilities for each class $c_j$:

$$\sum_{w_i \in V} P(w_i \mid c_j) = 1$$
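To make the polarity level and the normalization constraint concrete, here is one hedged way to build a lexicon-based word distribution that satisfies both. It is a sketch under assumptions made here, not the derivation from the talk: all terms in the same group share one probability, the two classes mirror each other, and a term whose polarity matches the class gets probability `base` (defaulting to 1/(p + n)). The function name and parameters are hypothetical.

```python
def lexicon_model(m, p, n, r, base=None):
    """Per-term conditional probabilities for the '+' and '-' classes.

    m, p, n: vocabulary size and numbers of positive and negative terms.
    r: polarity level, the ratio between matching and opposing term probabilities.
    base: probability of a term whose polarity matches the class (assumption).
    """
    if base is None:
        base = 1.0 / (p + n)
    u = m - p - n  # number of unknown terms
    if u <= 0 or r <= 0:
        raise ValueError("need at least one unknown term and a positive r")
    matching, opposing = base, base / r  # polarity ratio equals r
    model = {}
    for cls, n_match, n_opp in (("+", p, n), ("-", n, p)):
        # Whatever mass the lexicon terms leave over is shared by unknown terms,
        # so each class distribution sums to one.
        leftover = 1.0 - n_match * matching - n_opp * opposing
        if leftover < -1e-12:
            raise ValueError("base is too large to leave a valid distribution")
        model[cls] = {
            "matching": matching,
            "opposing": opposing,
            "unknown": max(leftover, 0.0) / u,
        }
    return model


model = lexicon_model(m=100, p=10, n=20, r=10)
print(model["+"])
```

With these toy numbers, a positive term is ten times as likely as a negative term in a positive document, and the probabilities over all 100 vocabulary terms sum to one for each class.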


Empirical Evaluation

Data sets

Lotus blogs: IBM Lotus collaborative software, 34 positive and 111 negative.

Political candidate blogs: Posts focusing on politics about Obama and Clinton, 49 positive and 58 negative.

Movie reviews: Apart from the blog data that we generated, we also used the publicly available data of movie reviews, 1000 positive and 1000 negative reviews.
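The class counts above differ a lot in balance, which matters when reading accuracy figures. The short sketch below computes the accuracy of always predicting the larger class for each data set; the majority-class baseline is an addition made here for context and is not part of the talk.

```python
# (positive, negative) document counts as listed for each data set.
datasets = {
    "Lotus blogs": (34, 111),
    "Political candidate blogs": (49, 58),
    "Movie reviews": (1000, 1000),
}


def majority_baseline(pos, neg):
    """Accuracy obtained by always predicting the more frequent class."""
    return max(pos, neg) / (pos + neg)


for name, (pos, neg) in datasets.items():
    print(f"{name}: {pos + neg} documents, "
          f"majority baseline {majority_baseline(pos, neg):.1%}")
```

This gives about 76.6% for the Lotus blogs, 54.2% for the political candidate blogs, and 50.0% for the movie reviews, so a given accuracy is least informative on the Lotus data.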


Result


Evaluating sensitivity of Pooling Multinomials to the polarity level parameter.


Conclusion

We develop an effective framework for incorporating lexical knowledge in supervised learning for text categorization.

We apply the developed approach to the task of sentiment classification, extending the state of the art in the field, which has focused primarily on using either background knowledge or supervised learning in isolation.


Future Works

There has been a flurry of recent work in the area of transfer learning that could be applied to extend a background knowledge-based model to incorporate data from different domains.

The fundamental challenge in such transfer learning is accounting for the training and test sets being from different distributions.