topic sentiment mixture: modeling facets and opinions in

28
Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, ChengXiang Zhai Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme http://www.dbs.ifi.lmu.de/cms/Hauptseminar_"Mining_Volatile_Data_WS1213 Presentor: Liangchen Fan Hauptseminar:Mining Volatile Data Dr. Eirini Ntoutsi Wintersemester 2012/13 DATABASE SYSTEMS GROUP

Upload: others

Post on 29-Jan-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Topic Sentiment Mixture: Modeling Facets and Opinions in

Topic Sentiment Mixture:Modeling Facets and Opinions in Weblogs

Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, ChengXiang Zhai

Ludwig-Maximilians-Universität MünchenInstitut für InformatikLehr- und Forschungseinheit für Datenbanksysteme

http://www.dbs.ifi.lmu.de/cms/Hauptseminar_"Mining_Volatile_Data”_WS1213

Presentor: Liangchen FanHauptseminar:Mining Volatile Data

Dr. Eirini NtoutsiWintersemester 2012/13

DATABASESYSTEMSGROUP

Page 2: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

Topic Sentiment MixtureLiangchen Fan

2 of 28

● Outline- I Introduction

- II Definitions

- III Topic-Sentiment Mixture Model

- IV Experiments and Results

- V Conclusion

- VI Discussion

Page 3: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ I Introduction

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● What is „sentiment“ and „sentiment analysis“?

Topic Sentiment MixtureLiangchen Fan

3 of 28

Page 4: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ I Introduction

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Related Work- Sentiment Classification

- Topic Modelling

● This Work- Mixture of Topic and Sentiment Research

- Subtopics

- a set of documents:

- major topics(subtopics) with topic model :

- Sentiment Polarities: positive, negative

- Problem: Topic-Sentiment Analysis(TSA)

- Model: Topic-Sentiment Mixture Model(TSM)

C={d 1 , d 2 , ... , dm}

{θ1 ,θ2 , ... ,θk }k θ

Topic Sentiment MixtureLiangchen Fan

4 of 28

Page 5: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ I Introduction

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● What is „topic-sentiment analysis“?

Topic Sentiment MixtureLiangchen Fan

5 of 28

Page 6: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ II Definitions

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Definitions● - Topic Model:

- probabilistic distribution of word in a set of words :

- words coherence:

● - Sentiment Model

- positive words:

- negative words:

● - Sentiment Coverrage of topic in document :

- coverrage of positive opinons:

- coverrage of neutral opinons:

- coverrage of negative opinions:

Topic Sentiment MixtureLiangchen Fan

6 of 28

w {p (w∣θ)}w∈VV

∑w∈V

p (w∣θ)=1

({p (w∣θP)}w∈V )

C i , d={δi , d , F ,δi , d , P ,δ i , d , N}

∑w∈V

p (w∣θ p)=1

∑w∈V

p (w∣θN )=1({p (w∣θN )}w∈V )

δi , d , F+δi , d , P+δi , d , N=1

δi , d , N

δi , d , F

δi , d , P

θi d

Page 7: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ II Definitions

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Definitions- Topic Life Cycle:

the amount of document

content about a topic

over time

- Sentiment Dynamics:

the distribution of the

strength(the amount) of

a sentiment(positive,

negative, neutral) about a

topic over time

Topic Sentiment MixtureLiangchen Fan

7 of 28

Page 8: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic-Sentiment Analysis(TSA)- Learning General Sentiment Models

- Extracting Topic Models and Sentiment Coverages

- Modeling Topic Life Cycle and Sentiment Dynamics

Topic Sentiment MixtureLiangchen Fan

8 of 28

Page 9: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● The Generation Process- Categories of Words

- common english words: „the“, „a“, „of“

- words related to a topical theme: „iPad“, „price“, „screen“

- the two categories above are captured with a background component model

- for words related to a topic: three subcategories

- neutral opinions: „price“, „screen“

- positive opinions: „awesome“, „love“

- negative opinions: „bad“, „hate“

Topic Sentiment MixtureLiangchen Fan

9 of 28

Page 10: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● The Generation Process- Multinominal Models

- Background Model

- a set of topic models:

- Positive sentiment model:

- Negative sentiment model:

- How does the author choose a

word while expressing an opinion?

(Example: Opinions about iPadmini)

θB

{θ1 ,θ2 , ... ,θk}

θP

θN

Topic Sentiment MixtureLiangchen Fan

10 of 28

Page 11: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic-Sentiment

Mixture Model- Formular

- C: a collection of weblog articles

Topic Sentiment MixtureLiangchen Fan

11 of 28

Page 12: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic-Sentiment Mixture Model- Two-Step Estimation Framework

- The first step: defining model priors:

- to tell the TSM what the sentiment models should look like

- Opinmind: retrieves online sentiments

- When given a query(topic), it can retrieve positive and negative

sentences(with sentiment labels)

- mixture of results retrieved with various queries

Topic Sentiment MixtureLiangchen Fan

12 of 28

Page 13: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic-Sentiment Mixture Model- Two-Step Estimation Framework

- The second step: maximizing a posterior estimation:

- Expectation-Maximization Algorithm(EM)

- based on the prior

- M-Step Updates with the MAP estimator:

in positive, negative and neural sentiment models

Topic Sentiment MixtureLiangchen Fan

13 of 28

p(Λ)

Λ̂=arg maxΛ p (C∣Λ) p (Λ)

Page 14: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Further Tasks- based on the EM Algorithm and the M-step Updates

- Rank Sentences for Topics

- Categorize Sentences by Sentiments

- Reveal the Overall Opinions for documents/Topics

Topic Sentiment MixtureLiangchen Fan

14 of 28

Page 15: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ III Topic-Sentiment Mixture Model

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Sentiment Dynamics Analysis- the change of the positive and negative

opinions about a topic over time

- understand the public opinions better

- predict user behavior more accurate

- Hidden Markov Model(HMM)

- Tag every word in the collection with

a topic and sentiment polarity, but

which sentiment word is about which

topic can't be tagged

- an alternative HMM:

- Count the Words with Corresponding

Labels Over Time (based the first

presentation)

Topic Sentiment MixtureLiangchen Fan

15 of 28

Page 16: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Data Sets- One Set for Learning: a dataset retrived with Opinmind

- The Other Set for Testing: a „raw“ dataset with timestamp about topics

„iPod“and „Da Vinci Code“

Topic Sentiment MixtureLiangchen Fan

16 of 28

Page 17: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Sentiment Model Extraction- Differences bt. General and Biased Sentiment Models

- Unbiased sentiment models contain neutral contents.

Topic Sentiment MixtureLiangchen Fan

17 of 28

Page 18: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Model Extraction

- Without Prior: more confused

- With Prior: more informative and coherent

Topic Sentiment MixtureLiangchen Fan

18 of 28

Page 19: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

Topic Sentiment MixtureLiangchen Fan

19 of 28

● Topic Model Extraction

- Without Prior: more confused

- With Prior: more informative and coherent

Page 20: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Model Extraction

- Result of query „da vinci code“ by Opinmind

Topic Sentiment MixtureLiangchen Fan

20 of 28

Page 21: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Model Extraction

- iPod

Topic Sentiment MixtureLiangchen Fan

21 of 28

Page 22: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Life Cycle and Sentiment Dynamics- Time Axis: 2005-01-13 to 2006-10-05, interval 30 days

- Single Burst vs Continuous Burst

Topic Sentiment MixtureLiangchen Fan

22 of 28

Page 23: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Life Cycle and Sentiment Dynamics- Burst in 2006-05 in „Book“ and „Religion“ because of the movie,

in „Book “ from 2006-04

- In „Book“: more positive opinions

- In „Religion“: more negative opinions

Topic Sentiment MixtureLiangchen Fan

23 of 28

Page 24: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Life Cycle and Sentiment Dynamics- In 2005-09 the burst of positive and neutral, later the burst of negative

Topic Sentiment MixtureLiangchen Fan

24 of 28

Page 25: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ IV Experiments and Results

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic Life Cycle and Sentiment Dynamics- Relative: Pos/Neg and Neg/Pos with different scale

- the sentiment domination and its trends

- Blackline(Pos/Neg of „iPod“) is getting weaker

Topic Sentiment MixtureLiangchen Fan

25 of 28

Page 26: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ V Conclusion

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Topic-Sentiment Mixture Model(TSM)

● Further Development- Customize the sentiment models according to each topic and obtain

different contextual views

- user behavior prediction

● My Comments- The mixture model release more accurate sentiment mining results

associated to many different topics in a topic-mixed documents. It is

applicable to mega media datas, a very strong tool for „content analysis“.

- It can be more accuracy on „subtopics“ of „subtopics“.

Topic Sentiment MixtureLiangchen Fan

26 of 28

Page 27: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture➣ VI Discussion

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Discussion

Topic Sentiment MixtureLiangchen Fan

27 of 28

Page 28: Topic Sentiment Mixture: Modeling Facets and Opinions in

DATABASESYSTEMSGROUP

Topic Sentiment Mixture

Hauptseminar "Mining Volatile Data"Dr. Eirini Ntoutsi

● Thank you!

Topic Sentiment MixtureLiangchen Fan

28 of 28