EMOTION CLASSIFICATION OF ONLINE NEWS ARTICLES FROM THE READER’S PERSPECTIVE Kevin Hsin-Yih Lin, Changhua Yang, Hsin- Hsi Chen Department of Computer Science and Information Engineering National Taiwan University IEEE 2008


Page 1:

EMOTION CLASSIFICATION OF ONLINE NEWS ARTICLES FROM THE READER’S PERSPECTIVE

Kevin Hsin-Yih Lin, Changhua Yang, Hsin-Hsi Chen

Department of Computer Science and Information Engineering

National Taiwan University

IEEE 2008

Page 2:

1. Introduction

Past studies on the emotion classification of documents focus on the writer’s emotional state.

This paper addresses the problem from the reader's perspective. There are distinctions between reader and writer emotions, because they do not always agree. For example, a report of an infamous politician's miserable day may please many readers even though the writer expresses no joy.

Page 3:

1. Introduction

Reader-emotion classification has several applications. One is integrating reader-emotion classification into a web search engine, e.g., helping a dog-loving girl find heartwarming puppy stories.

Another application is to classify a website's contents into emotion classes, as Yahoo! Kimo News does. That site relies on feedback from users; an automatic method can relieve this problem.

Page 4:

1. Introduction

An essential prerequisite to realizing the above applications is the ability to classify documents into reader-emotion categories. Research in this area was difficult in the past due to the scarcity of large manually-annotated corpora.

Now there are many websites, such as Yahoo! Kimo News and United Daily News, that provide such data.

Page 5:

1. Introduction

They classify online news articles into reader-emotion categories: given a set N of news articles and a set E of emotions, the goal is to find a function f: N → E.

Their experiments adopted machine learning methods with several different feature types.

Page 6:

2. Related Work

Past research on emotion classification focuses on writers. Only a few studies (e.g., K. H. Lin, 2007) address the reader aspect. Some studies relating to writer emotion:

Pang et al. (2002) design a classifier to decide whether a movie review contains a positive or negative sentiment.

Their results reveal that using SVM with word unigram features outperforms other combinations of unigram, bigram, part-of-speech, word position, and adjective features.

Page 7:

2. Related Work

More work has been done to search for features better than unigrams. Mullen and Collier (2004), and Hu et al. (2005)

exploit word sentiments to achieve better classification accuracy.

Cui et al. (2006) show that high-order n-grams are beneficial if the corpus size is large enough.

Sentiment classification of texts is not restricted to the document level. Wiebe (2000) conducts experiments to learn the

subjectivity of adjectives. Kim and Hovy (2004) study sentence sentiments.

Page 8:

3. Constructing the Corpus

They obtain Chinese news articles from Yahoo! Kimo News, which allows a user to cast a vote for one of eight emotions. They collect news articles along with their voting statistics a week after their publication dates to ensure that the vote counts have stabilized.

They use Yahoo!’s eight emotions: happy, sad, angry, surprising, boring, heartwarming, awesome, and useful.

They treat the most dominant emotion of a news article as the article’s emotion class.

The corpus consists of news articles dating from January 24, 2007 to August 7, 2007.
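The labeling rule above, taking the most dominant emotion as the article's class, can be sketched as follows; the vote counts are invented illustration values, not data from the paper.

```python
def label_article(votes):
    # votes: mapping from each of the eight emotions to its reader vote count.
    # The article's class is the most dominant emotion, i.e. the one with
    # the largest vote count.
    return max(votes, key=votes.get)

# Hypothetical vote tally for one article (illustration only).
votes = {"happy": 40, "sad": 3, "angry": 1, "surprising": 5,
         "boring": 2, "heartwarming": 9, "awesome": 4, "useful": 0}
print(label_article(votes))  # happy
```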

Page 9:

4. Extracting Features

After obtaining news articles, the next step is to convert them into features.

Five different types of features are used: Chinese character bigrams Chinese words News metadata Affix similarities Word emotions

Page 10:

4. Extracting Features

4.1 Basic Features

Chinese character bigrams: taken from the headline and content of each news article. A binary value indicates the presence of a bigram: if a bigram appears at least once in a news article, its feature value is 1.

Chinese words: extracted with the Stanford NLP Group's Chinese word segmenter. As with bigrams, binary feature values are used.

News metadata: news category, agency, hour of publication, reporter, and event location. Again, binary feature values are used.
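A minimal sketch of the binary character-bigram features, assuming a fixed vocabulary built from the training articles (the strings here are invented ASCII stand-ins for Chinese text; the word and metadata features are omitted):

```python
def char_bigrams(text):
    # all overlapping character bigrams of a string
    return {text[i:i + 2] for i in range(len(text) - 1)}

def binary_features(article, vocabulary):
    # 1 if the bigram appears at least once in the article, else 0
    grams = char_bigrams(article)
    return [1 if g in grams else 0 for g in vocabulary]

# Vocabulary from two tiny "training articles".
vocab = sorted(char_bigrams("abcd") | char_bigrams("cdef"))
print(binary_features("abce", vocab))  # [1, 1, 0, 0, 0]
```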

Page 11:

4. Extracting Features

4.2 Affix Similarity Features

Affix similarity is computed by first identifying all the common substrings between a news article and the training data of an emotion class. The similarity is then quantified based on the number and lengths of the common substrings.

Affix similarity can be divided into two parts: prefix similarity and suffix similarity.

Page 12:

4. Extracting Features

E: the set of emotions
S: the set containing all suffixes of all news articles in the training corpus
T: the set containing all suffixes we wish to obtain features from
Ve: a value representing the degree of similarity between emotion e and the test news article
LCP(t, s): a function which returns the length of the longest common prefix of t and s
EMOTION(E, s*): a function which returns the emotion associated with s*

Computing the emotion scores of an article:

1. Initialize every emotion value to zero.
2. For each suffix t of the article:
3. Find the string s* in S sharing the longest common prefix with t.
4. Let e* be the emotion represented by s*.
5. Add the length of that common prefix to the article's score for e*.
Finally, normalize and return the set of emotion values.

Page 13:

4. Extracting Features

Training corpus: S = {"The team won", "team won", "won", "This team lost", "team lost", "lost"}

Article suffixes: T = {"This team won", "team won", "won"}

Emotions: E = {"happy", "sad"}

Assume we are now at line 2 of Algorithm 1 and t = "This team won". Comparing t against S, "This team lost" shares the longest common prefix with "This team won". Then s* is "This team lost" and e* is "sad". Since the longest common prefix of t and s* has length 2, Ve*, i.e. Vsad, is increased by 2.

Repeat for t = "team won" and t = "won"; both match happy suffixes, contributing 2 and 1 to Vhappy. After the algorithm finishes, the article's emotion feature values are Vsad = 2/3 and Vhappy = 3/3.

(Diagram: t = "This team won" contributes 2 to sad; t = "team won" (2) and t = "won" (1) contribute 3 to happy.)
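The worked example above can be reproduced with a small sketch. Word-level suffixes are assumed, and scores are normalized by the largest score (a denominator of 3 matches the slide's final values, though the paper's exact normalizer is not shown here):

```python
def suffixes(words):
    # all word-level suffixes of a token list
    return [words[i:] for i in range(len(words))]

def lcp_len(a, b):
    # length (in words) of the longest common prefix of two token lists
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def affix_similarity(article_words, training):
    # training: list of (token_list, emotion) pairs from the training corpus
    S = [(suf, emo) for words, emo in training for suf in suffixes(words)]
    scores = {emo: 0 for _, emo in training}
    for t in suffixes(article_words):
        s_star, e_star = max(S, key=lambda se: lcp_len(t, se[0]))
        length = lcp_len(t, s_star)
        if length > 0:
            scores[e_star] += length  # add the matched prefix length
    top = max(scores.values()) or 1   # normalize by the largest score
    return {e: v / top for e, v in scores.items()}

training = [("This team lost".split(), "sad"),
            ("The team won".split(), "happy")]
print(affix_similarity("This team won".split(), training))  # sad: 2/3, happy: 1.0
```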

Page 14:

4. Extracting Features

The algorithm for computing prefix similarity is the same as the suffix similarity algorithm, except that all substrings in S and T are reversed. In the last example, T would become {"won team This", "team This", "This"}.

Page 15:

4. Extracting Features

4.3 Word Emotion Features

Many words have implied emotional meanings; for example, "wonderful" implies happiness. We first generate an emotion lexicon using the method of C. Yang and K. H. Lin (2007). The lexicon contains entries describing collocation information between words and 40 emotions. Each entry in the lexicon is a 3-tuple (w, b, m), where w is a Chinese word, b is a blog emotion, and m is the point-wise mutual information of w and b.

Page 16:

4. Extracting Features

Suppose we have a test news article string "an excellent and tearful story". Then W = {"an", "excellent", "and", "tearful", "story"}. Suppose only the words "excellent" and "tearful" appear in L (the emotion lexicon), and the associated entries are ("excellent", happy, 9), ("excellent", surprising, 5), ("tearful", sad, 7), ("tearful", surprising, 3).

Then the feature values are Vhappy = 9/9, Vsurprising = 8/9, and Vsad = 7/9. The values of the other 37 emotion features are 0.
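The lexicon lookup above can be sketched as follows. The toy entries mirror the slide's example; per-emotion PMI values are summed over matching words and normalized by the largest sum (that choice reproduces the 9/9, 8/9, 7/9 values shown):

```python
def word_emotion_features(words, lexicon, emotions):
    # lexicon: iterable of (word, emotion, pmi) 3-tuples
    scores = {e: 0.0 for e in emotions}
    present = set(words)
    for w, b, m in lexicon:
        if w in present:
            scores[b] += m            # accumulate PMI for each matching entry
    top = max(scores.values()) or 1.0  # normalize by the largest score
    return {e: v / top for e, v in scores.items()}

lexicon = [("excellent", "happy", 9), ("excellent", "surprising", 5),
           ("tearful", "sad", 7), ("tearful", "surprising", 3)]
W = "an excellent and tearful story".split()
print(word_emotion_features(W, lexicon, ["happy", "surprising", "sad"]))
# happy = 9/9, surprising = 8/9, sad = 7/9
```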

Page 17:

5. Experiment and Results

5.1 Experiment Setup

Given the strong performance of support vector machines (SVM) in many classification tasks, they choose SVM as the classification algorithm. The implementation they use is LIBSVM. To estimate the optimal C cost parameter value, they perform four-fold cross-validation on the training data. As for the kernel, a linear kernel is used.
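The model-selection step can be sketched with scikit-learn standing in for LIBSVM; the synthetic data and the C grid are invented for illustration, not taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic multi-class data standing in for the news-article features.
X, y = make_classification(n_samples=200, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)

# Linear-kernel SVM; pick C by four-fold cross-validation, as in the paper.
grid = GridSearchCV(LinearSVC(max_iter=10000),
                    param_grid={"C": [0.01, 0.1, 1, 10]},
                    cv=4)
grid.fit(X, y)
print(grid.best_params_["C"])
```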

Page 18:

5. Experiment and Results

Other methods are implemented for performance comparison:

Baseline: naïve Bayes (NB) on Chinese character bigrams and Chinese words.

Writer-emotion classification: Pang (2002) and Cui (2006), with their methods extended to handle multi-class classification.

Page 19:

5. Experiment and Results

5.3. Results and Discussions

SVM – support vector machine
PA – passive-aggressive classifier
NB – naïve Bayes classifier
BI – bigram
WD – word
MT – metadata
AS – affix similarity
WE – word emotion
CN – Cui's combined word n-gram (the number following CN is the number of features kept after χ2-test filtering)

The best model's accuracy is significantly higher than every other model's (p-value ≤ 0.01), with a few exceptions.

Page 20:

5. Experiment and Results

Analyzing each feature type individually, SVM+BI has the best accuracy of 0.7441.

It is also worth noting that SVM+AS obtains a relatively high accuracy of 0.7131 with only 16 distinct features in total. In contrast, BI consists of 865,451 distinct features.

Page 21:

5. Experiment and Results

Let us investigate the effect of adding a feature type to an existing feature combination.

Classification accuracy increases when AS is added to any combination of BI, WD, MT and WE; every such improvement is statistically significant with p-value ≤ 0.001. The increase indicates that AS is able to capture some important emotion-related characteristics.

Adding BI to any combination likewise improves accuracy with p-value ≤ 0.01, so BI is also an important feature type.

In contrast, adding WD, MT or WE to an existing feature combination neither consistently increases nor consistently decreases accuracy.

However, adding WE to SVM+BI+WD+MT+AS produces the model with the highest accuracy.

Page 22:

5. Experiment and Results

Both SVM+BI and SVM+WD perform better than their NB+BI and NB+WD baseline counterparts.

Pang's word unigram classifier is equivalent to the SVM+WD model, which achieves a relatively high accuracy of 0.7325. Unlike the observation made in Pang's work, however, combining WD with other feature types can improve accuracy in this experiment.

The PA classifier does not perform as well as SVM when used with Cui's n-gram features. Its accuracy is beaten by the simpler SVM+BI model, which has a slightly higher feature count of 865,451. So, contrary to Cui's results, using high-order n-grams does not improve accuracy in this experiment.

Page 23:

6. Reader Behavior Versus Classifier Behavior

Examine how closely the best classifier, SVM+BI+WD+MT+AS+WE, models reader behavior by observing the similarities and differences between the classifier's confusion matrix and the news articles' emotional distributions.

(Figure: average reader votes in the happy class alongside the classifier's result.)

Page 24:

6. Reader Behavior Versus Classifier Behavior

Figure 1(a) shows that if most people feel heartwarming after reading a news article, then many other people are going to feel happy.

Figure 2(b) reveals that if the SVM classifier wrongly classifies a happy article, then the incorrect category is most likely to be the boring class.

Figures 1(a) to 1(c) are placed directly above 2(a) to 2(c) so that we can observe the similarities and differences between the readers’ and the classifier’s behavior.

Only histograms for heartwarming, happy and useful classes are shown, because the patterns they exhibit are representative of the characteristics found in other histograms.

Page 25:

6. Reader Behavior Versus Classifier Behavior

In Figure 1(a), the happy class receives 20% of the votes on average when the most dominant class is heartwarming. However, the percentage of instances wrongly assigned to the happy category is only 6% in Figure 2(a).

In fact, the happy class is not even the category that the SVM classifier is most likely to wrongly classify a heartwarming instance into.

Although Figure 1(a) indicates that many readers are likely to feel happy after reading a heartwarming article, Figure 2(a) shows that the classifier does not exhibit this tendency.

It implies that heartwarming articles have certain discriminative features that differentiate them from happy articles.

Page 26:

6. Reader Behavior Versus Classifier Behavior

They use χ2 test to measure how discriminative a feature is, and inspect the features that appear in heartwarming instances.

The Chinese translations of the words such as affect, caring, story and mother, are among the most discriminative features according to the χ2 test.

The news category, charity, is also prevalent and discriminative.

Page 27:

6. Reader Behavior Versus Classifier Behavior

Figure 1(b) shows that if most readers feel happy after reading an article, then the vote counts for other emotions will be quite low. The SVM classifier's high accuracy for the happy class mirrors this pattern.

It is discovered that many happy news articles have features related to sports, especially baseball. In fact, the sports and baseball news categories are the two most discriminative features for the happy class according to the χ2 test:

27.9% of all the happy articles in the training corpus are in the baseball category, and the probability of an article in this news category belonging to the happy class is 0.779.

50.3% of all the happy instances in the training corpus are in the sports category, and the probability of a sports news article belonging to the happy class is 0.739.

The highly-skewed emotional distribution of the happy class is likely an effect of the readers' great interest in sports.

Page 28:

6. Reader Behavior Versus Classifier Behavior

Figure 1(c) and Figure 2(c) display different patterns. The classifier has an outstanding accuracy of 0.90 for the useful class, but the average fraction of reader votes is only 0.65.

It is discovered that certain news categories have very large χ2 values with respect to the useful class. For example, 92% of all the news articles in the weather category are in the useful class in the training corpus; the corresponding percentage for the test corpus is 86%. Other news categories associated with the useful class include cosmetics, financial management, and health.

These observations are intuitive, because the news categories listed above should contain news articles with practical information. The emotional ambiguities indicated by the readers' voting patterns do not necessarily translate into classifier performance.

Page 29:

7. Ranking Emotions

Sometimes more than one emotion may be prevalent in a news article.

In such cases, it would be useful to provide a ranking of emotions.

To rank emotions, regression is used on each emotion to predict its percentage of votes in a news article.

To perform regression, we adopt support vector regression (SVR).

Page 30:

7. Ranking Emotions

The evaluation metric is ACC@n, or accuracy at n, which considers a proposed emotion list to be correct if its first n emotions are both the same as and in the same order as the true emotion list’s first n emotions.
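ACC@n can be implemented directly; the emotion rankings below are invented for illustration:

```python
def acc_at_n(proposed, truth, n):
    # correct iff the first n emotions match in both content and order
    return proposed[:n] == truth[:n]

def mean_acc_at_n(pairs, n):
    # fraction of (proposed, truth) ranking pairs that are correct at n
    return sum(acc_at_n(p, t, n) for p, t in pairs) / len(pairs)

pairs = [(["happy", "awesome", "useful"], ["happy", "useful", "awesome"]),
         (["sad", "angry", "boring"], ["sad", "angry", "boring"])]
print(mean_acc_at_n(pairs, 1))  # 1.0: both top-1 emotions match
print(mean_acc_at_n(pairs, 2))  # 0.5: only the second ranking matches at n=2
```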

Page 31:

7. Ranking Emotions

The accuracy for predicting the most dominant emotion (i.e., ACC@1) is 0.7541, slightly lower than the best accuracy in the classification experiment.

The sharp decrease in accuracy as n increases reflects the hardness of the ranking task: each unique emotion sequence of length n is regarded as a class, so when n = 8 we are essentially classifying news articles into 8! = 40,320 classes. Generating a completely correct ranked list is a difficult task.