

Research Article
A Multilayer Naïve Bayes Model for Analyzing User's Retweeting Sentiment Tendency

Mengmeng Wang,1,2 Wanli Zuo,1,2 and Ying Wang1,2,3

1 College of Computer Science and Technology, Jilin University, Changchun 130012, China
2 Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University, Ministry of Education, Changchun 130012, China
3 College of Mathematics, Jilin University, Changchun 130012, China

Correspondence should be addressed to Ying Wang; 726854768@qq.com

Received 28 May 2015; Accepted 17 August 2015

Academic Editor: Stefan Haufe

Copyright © 2015 Mengmeng Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Today microblogging has increasingly become a means of information diffusion via users' retweeting behavior. Since retweeting content, as context information of a microblog, reflects how users understand that microblog, the analysis of users' retweeting sentiment tendency has gradually become an active research topic. Targeting online microblogging, a dynamic social network, we investigate how to exploit dynamic retweeting sentiment features in retweeting sentiment tendency analysis. On the basis of time series of a user's network structure information and published text information, we first model dynamic retweeting sentiment features. Then we build Naïve Bayes models from profile-, relationship-, and emotion-based dimensions, respectively. Finally, we build a multilayer Naïve Bayes model based on the multidimensional Naïve Bayes models to analyze a user's retweeting sentiment tendency towards a microblog. Experiments on a real-world dataset demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of dynamic retweeting sentiment features and temporal information in retweeting sentiment tendency analysis. Moreover, we provide a new line of thought for retweeting sentiment tendency analysis in dynamic social networks.

1. Introduction

With the rapid growth of user-generated data on the Web, people commonly use microblogging, which has become a novel social medium [1], to express their opinions. Therefore, a need to analyze and understand these online generated data and opinions has arisen [2]. Mining emotional information in users' contents may contribute to analyzing the relationship between social economy change and emotion change expressed by the public [3], measuring the strength of public happiness [4], detecting the current trend of the stock market [5], predicting the results of presidential elections [6], and modeling for opinion mining [7]. Hence, analyzing the sentiment tendency of microblogging has gradually become an active research topic.

Furthermore, the emergence of microblogging has broken the traditional mode of transmission: once a user posts a microblog, other users can retweet it and add more content via a 140-word text box, which further develops and enriches the information during the forwarding process. Thanks to the resonance of information, retweeting content, as context information of a microblog, contains users' views and emotions expressing approval of or opposition to that microblog. Besides, exploring retweeting sentiment tendency can help enterprises and governments better understand users' opinions on products, stocks, current hot issues, hit movies, hatred toward someone, and so forth. As stated, retweeting sentiment tendency analysis is of great significance for public opinion monitoring. Our work on predicting users' retweeting sentiment tendency is motivated by its broad application prospects.

Hindawi Publishing Corporation
Computational Intelligence and Neuroscience
Volume 2015, Article ID 510281, 11 pages
http://dx.doi.org/10.1155/2015/510281

However, previous sentiment tendency analysis methods [8–11], which focused only on the emotion of contents rather than on users' individual emotion, attributes, social correlations, and the dynamic nature of the network, cannot make a comprehensive analysis of a user's retweeting sentiment tendency. Hence, in this paper, we propose a multilayer Naïve Bayes model for analyzing a user's retweeting sentiment tendency towards a microblog (denoted as MLNBRST). Our main contributions are summarized as follows.

(1) Take the user's recent individual emotion, as well as the emotion difference between user and microblog, as metrics to further analyze the user's retweeting sentiment tendency.

(2) Improve the traditional Salton metrics according to the directivity of links so that they can be better applied to directed networks.

(3) Blend temporal information into the user's retweeting sentiment features, on the basis of time series of the user's contents and network topological information, so as to capture the dynamic evolution process of information and network structure.

(4) Build a multilayer Naïve Bayes model on top of Naïve Bayes models from different dimensions to complete the user's retweeting sentiment tendency analysis from a more fine-grained perspective.

(5) Evaluate MLNBRST on a real-world Sina microblogging dataset and elaborate the importance of different retweeting sentiment features and temporal information for the user's retweeting sentiment tendency analysis.

The rest of the paper is organized as follows. Section 2 describes the related work; dynamic retweeting sentiment features are depicted in Section 3; Section 4 defines the method we propose; details of the experimental results and the dataset used in this study are given in Section 5. Finally, the conclusion appears in Section 6.

2. Related Work

In recent years, with the popularization of microblogging, sentiment analysis of microblogging has become one of the hot research topics [12]. Existing microblogging sentiment analysis algorithms can be roughly categorized into two groups: emotional dictionary-based methods and machine learning methods.

In emotional dictionary-based methods, a microblog's sentiment polarity is calculated by summing the sentiment polarity of all its emotion words. Golder and Macy [8] adopted a prominent lexicon, Linguistic Inquiry and Word Count (LIWC), to automatically analyze sentiment in tweets published by millions of microbloggers from different regions and cultural backgrounds, and the results revealed that the proposed method can clearly identify a user's emotion pattern over a period of time. Ghiassi et al. [13] introduced an approach to supervised feature reduction using n-grams and statistical analysis to develop a Twitter-specific lexicon for sentiment analysis, which improved sentiment classification accuracy. Montejo-Raez et al. [14, 15] explored an unsupervised, domain-independent method for polarity classification in Twitter by combining SentiWordNet scores with WordNet. The polarity classification problem was solved by weighting SentiWordNet values with the scores of a random walk analysis (PageRank) over the WordNet graph on the concepts found in the texts. The results showed that both disambiguation and expansion were good strategies for improving performance. Stieglitz and Dang-Xuan [9] leveraged the tool "SentiStrength," which uses a human-designed lexicon of emotional terms with a set of additional linguistic rules for negations, booster words, amplifications, emoticons, spelling corrections, and other factors such as word weighting, to analyze the level of sentiment in politically relevant tweets.

Although emotional dictionary-based methods can better represent the unstructured characteristics of text, they depend too heavily on the emotion dictionary and ignore the effects of new words. Machine learning methods can overcome the influence that unknown words have on the performance of sentiment analysis. Such a method first selects words and phrases as features to form a vector space model and then converts the sentiment analysis problem into a classification problem by treating different emotional tendencies as different categories. Davidov et al. [10] treated related tags and emoticons in tweets as labels and designed a supervised k-nearest neighbor classifier to complete emotion classification on Twitter, which did not need much manual annotation. Based on the 95 most frequently used emoticons, which were mapped to four emotional categories (happy, hate, low, and anger), Zhao et al. [11] employed the emoticons to generate sentiment labels for tweets and built an incremental learning Naïve Bayes classifier for the categorization of four types of sentiments, with an empirical precision of 64.3%. Wang et al. [16] analyzed the emotion tendentiousness of related microblogs in the 2012 US presidential election. They found that, for short texts, Naïve Bayes performed better than SVM on emotion tendentiousness classification. So they presented an improved SVM classifier, NBSVM, which was better and more stable than both SVM and Naïve Bayes. Korenek and Simko [17] first created an appraisal dictionary by utilizing a psychological theory called appraisal theory, which allowed a deeper and more fine-grained analysis of microbloggings, and then classified posts using an SVM classifier, which revealed that the proposed method was feasible even for the specific content presented on microbloggings. On the basis of common social network characteristics and other carefully generalized linguistic patterns, Li and Xu [18] proposed and implemented a novel method for identifying sentiment in microblogging posts. First, after thorough analysis of sample data, an automatic rule-based system was constructed to detect and extract the cause event of each emotional post. Then an emotional corpus was built with Chinese microblogging posts labeled by human annotators. Finally, a classifier, namely, SVR (support vector regression), was trained to classify emotions in microblogging posts based on the extracted cause events. The overall performance of the proposed system was very promising. Focusing on emotion dynamics in OSNs, Xiong et al. [19] proposed an emotion classifier based on Bayes theory, and some effective strategies (Shannon entropy and the salience degree of each word) were introduced to improve the performance of the classifier, with which the proposed method can classify any Chinese tweet into a particular emotion with satisfactory accuracy. In order to achieve target-dependent Twitter sentiment classification, Dong et al. [20] proposed the Adaptive Recursive Neural Network (AdaRNN), which employs more than one composition function. For a given tweet, its dependency tree was first converted for the target of interest. Next, AdaRNN learned how to adaptively propagate the sentiments of words to the target node based on context and linguistic tags. The experimental studies illustrated that AdaRNN improved on the baseline methods.

To sum up, retweeting sentiment tendency analysis in dynamic social networks is still at a developing stage: how to depict the directivity of relationships, how to fuse multidimensional features reasonably, and how to build a model that can adapt to the dynamic evolution process of the network are all challenging problems. To this end, we present a multilayer Naïve Bayes model for analyzing a user's retweeting sentiment tendency, which is appropriate for dynamic and directed social networks.

3. Modeling Retweeting Sentiment Tendency Features

Since a comprehensive analysis of a user's retweeting sentiment tendency cannot be made from a single type of feature, in this paper we combine profile-, relationship-, and emotion-based features so as to achieve higher accuracy in retweeting sentiment tendency analysis.

3.1. Profile-Based Features. In this paper, following the dataset in [21], we take the number of bifollowers, the number of followers, the number of followees, the number of contents that the user posts, the user's province, the user's city, the user's gender, the created time of the user's account, and the verified type of the user's account, all of which are provided in the original dataset, as profile-based features.

3.2. Relationship-Based Features. Tan et al. [22] pointed out that users who have a close relationship may hold similar emotional points of view, so the relationship between users can be used to extract users' emotions. In this paper, we infer relationship-based features from a user's network topological structure and interaction frequency.

3.2.1. Dynamic Salton Metrics. Common Neighbors metrics assume that the similarity between users is proportional to the number of their common neighbors [23]. Compared with Common Neighbors metrics, Salton metrics additionally introduce users' degrees to measure a user's network topological structure. Since nonreciprocal friendships, which may reflect moderately valued friendship ties [24], are more important than reciprocal ones, we improve the traditional Salton metrics according to the directivity of links. Besides, understanding the dynamic structure of online social networks plays an important role in the development of retweeting sentiment tendency analysis algorithms. Thus, given the dynamic nature of social networks, we consider the network as a dynamic flow of time slices (one time slice stands for one day; the older a time slice is, the lower its importance and hence its weight), and the dynamic Salton metrics between user $u$ and user $v$ on the $i$th time slice $t_i$ are defined as follows:

$$
\mathrm{Sa}(u, v, t_i) = \frac{\left|\Gamma_{\mathrm{in}}(u, t_i) \cap \Gamma_{\mathrm{in}}(v, t_i)\right| \big/ \sqrt{\left|\Gamma_{\mathrm{in}}(u, t_i)\right| \left|\Gamma_{\mathrm{in}}(v, t_i)\right|}}{\left|\Gamma_{\mathrm{out}}(u, t_i) \cap \Gamma_{\mathrm{out}}(v, t_i)\right| \big/ \sqrt{\left|\Gamma_{\mathrm{out}}(u, t_i)\right| \left|\Gamma_{\mathrm{out}}(v, t_i)\right|}} \tag{1}
$$

where $\Gamma_{\mathrm{in}}(u, t_i)$ and $\Gamma_{\mathrm{in}}(v, t_i)$ stand for the in-link user sets of user $u$ and user $v$ on $t_i$, respectively, and $\Gamma_{\mathrm{out}}(u, t_i)$ and $\Gamma_{\mathrm{out}}(v, t_i)$ stand for the out-link user sets of user $u$ and user $v$ on $t_i$, respectively. In-links and out-links are defined by the follower relationship, and $|\Gamma(x)|$ stands for the number of elements in set $\Gamma(x)$. Thus the dynamic Salton metrics between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ are calculated as

$$
\mathrm{Sa}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Sa}(u, v, t_i) \tag{2}
$$

where $\alpha \in [0, 1]$, $\alpha^{n-i}$ represents the weight of $t_i$, and $n$ represents the number of time slices.
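As a rough illustration of the two ingredients above, the following Python sketch computes the traditional (undirected) Salton index on a single pair of neighbor sets and the time-decayed sum of (2). It is a minimal sketch under stated assumptions: the function names `salton` and `time_decayed_sum` are hypothetical, the directed combination of in-link and out-link terms in (1) is deliberately not reproduced, and returning 0 for an empty neighbor set is a convention chosen here, not one stated in the paper.

```python
from math import sqrt


def salton(neighbors_u, neighbors_v):
    """Traditional Salton (cosine) index between two neighbor sets.

    Assumption: an empty neighbor set yields 0.0 to avoid division by zero.
    """
    if not neighbors_u or not neighbors_v:
        return 0.0
    common = len(neighbors_u & neighbors_v)
    return common / sqrt(len(neighbors_u) * len(neighbors_v))


def time_decayed_sum(values, alpha):
    """Sum of alpha**(n - i) * values[i] for i = 0..n, as in (2).

    values[i] is the per-slice score on time slice t_i; the newest slice
    (i = n) gets weight 1 and older slices get geometrically smaller weights.
    """
    n = len(values) - 1
    return sum(alpha ** (n - i) * x for i, x in enumerate(values))


# Toy example: in-link sets of two users on three consecutive days.
daily_in_links = [
    ({1, 2, 3}, {2, 3, 4}),
    ({1, 2}, {2, 5}),
    ({1, 2, 3, 4}, {1, 2, 3, 4}),
]
per_day = [salton(a, b) for a, b in daily_in_links]
dynamic_score = time_decayed_sum(per_day, alpha=0.5)
```

With `alpha` close to 1 the history is weighted almost uniformly, while a small `alpha` makes the score depend mostly on the latest time slice.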

3.2.2. Dynamic Interaction Frequency. Similar to the dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, to stand for the weight of the $i$th time slice $t_i$, and the dynamic interaction frequency between user $u$ and user $v$ on $t_i$ is defined as follows:

$$
\mathrm{Fre}(u, v, t_i) = \frac{c(u, v, t_i)}{p(u, t_i) + p(v, t_i)} \tag{3}
$$

where $p(u, t_i)$ and $p(v, t_i)$ stand for the number of posts of user $u$ and user $v$ on $t_i$, respectively, and $c(u, v, t_i)$ stands for the number of interactions between user $u$ and user $v$ on $t_i$, where an interaction is defined as user $u$ retweeting a microblog of user $v$ or user $v$ retweeting a microblog of user $u$. Thus the dynamic interaction frequency between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ is calculated as

$$
\mathrm{Fre}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Fre}(u, v, t_i) \tag{4}
$$

3.3. Emotion-Based Features. Since people tend to post microblogs that express their experiences or views in order to satisfy their desire for self-expression, their contents contain quite a lot of emotional words or expressions. Thus, along with the number of emotional words, we employ recent mood statistics and emotion divergence as emotion-based features.

3.3.1. The Number of Emotional Words. In this paper, we count the number of positive and negative emotional words in a user's retweeting contents with the corpus of HowNet Knowledge (http://www.keenage.com/download/sentiment.rar). HowNet Knowledge, which includes 8945 words and phrases, consists of six files: the positive emotional words list file, the negative emotional words list file, the positive review words list file, the negative review words list file, the degree words list file, and the propositional words list file.

3.3.2. Recent Mood Statistics. To enhance the performance of retweeting sentiment analysis, we further analyze the user's contents according to the number of positive and negative emotional words. Similar to the dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, to stand for the weight of the $i$th time slice $t_i$, and user $u$'s mood statistics on $t_i$ are calculated as follows:

$$
\mathrm{Ue}(u, t_i) = \frac{\mathrm{Upn}(u, t_i)}{\mathrm{Upn}(u, t_i) + \mathrm{Unn}(u, t_i)} \tag{5}
$$

where $\mathrm{Upn}(u, t_i)$ and $\mathrm{Unn}(u, t_i)$ represent the number of positive emotional words and the number of negative emotional words, respectively, that user $u$ used on $t_i$ and that are included in HowNet Knowledge. Thus user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ are calculated with

$$
\mathrm{Ue}_{[0, t_n]}(u) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Ue}(u, t_i) \tag{6}
$$

3.3.3. Emotion Divergence. In this paper, we quantitatively define the emotion divergence between user $u$ and microblog $b$ as follows:

$$
\mathrm{Em}(u, b, t_n) = \mathrm{Ue}_{[0, t_n]}(u) - \mathrm{Ie}(b) \tag{7}
$$

where $\mathrm{Ue}_{[0, t_n]}(u)$ represents user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ and $\mathrm{Ie}(b)$ represents the emotion statistics expressed in microblog $b$, which can be calculated as

$$
\mathrm{Ie}(b) = \frac{\mathrm{Ipn}(b)}{\mathrm{Ipn}(b) + \mathrm{Inn}(b)} \tag{8}
$$

where $\mathrm{Ipn}(b)$ and $\mathrm{Inn}(b)$ denote the number of positive emotional words and the number of negative emotional words, respectively, that are used in microblog $b$ and included in HowNet Knowledge.
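The emotion-based quantities in (5) through (8) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the tiny `POSITIVE` and `NEGATIVE` sets are toy stand-ins for the HowNet Knowledge word lists, all function names are hypothetical, texts are assumed to be already tokenized, and a text with no emotional words is assigned a ratio of 0 by convention.

```python
# Toy stand-ins for the positive and negative emotional word lists (assumption).
POSITIVE = {"good", "happy", "love"}
NEGATIVE = {"bad", "sad", "hate"}


def polarity_counts(tokens):
    """Count positive and negative emotional words in a token list."""
    positive = sum(1 for token in tokens if token in POSITIVE)
    negative = sum(1 for token in tokens if token in NEGATIVE)
    return positive, negative


def positive_ratio(positive, negative):
    """Share of positive words among emotional words, as in (5) and (8).

    Assumption: returns 0.0 when there are no emotional words at all.
    """
    total = positive + negative
    return positive / total if total else 0.0


def recent_mood(daily_tokens, alpha):
    """Time-decayed mood statistics over slices t_0..t_n, as in (6)."""
    n = len(daily_tokens) - 1
    return sum(
        alpha ** (n - i) * positive_ratio(*polarity_counts(tokens))
        for i, tokens in enumerate(daily_tokens)
    )


def emotion_divergence(daily_tokens, microblog_tokens, alpha):
    """User mood minus microblog emotion, as in (7)."""
    microblog_emotion = positive_ratio(*polarity_counts(microblog_tokens))
    return recent_mood(daily_tokens, alpha) - microblog_emotion


# Toy example: two days of a user's words and one target microblog.
user_days = [["good", "bad"], ["happy", "love", "sad"]]
microblog = ["hate", "good", "bad"]
mood = recent_mood(user_days, alpha=0.5)
divergence = emotion_divergence(user_days, microblog, alpha=0.5)
```

Note that, because (6) is a weighted sum rather than a weighted average, the recent mood value can exceed 1 when several slices are aggregated, so the divergence in (7) is not confined to the interval from -1 to 1.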

4. Multilayer Naïve Bayes Model for Analyzing User's Retweeting Sentiment Tendency

Some researchers found that Naïve Bayes is more suitable for sentiment classification on microblogging [25]. On the basis of Bayes' theorem, the Naïve Bayes model represents uncertainty with probability and realizes the process of learning and reasoning via probability. Hence, in this paper, we put forward a multilayer Naïve Bayes model, a variant of the Naïve Bayes model, to analyze a user's retweeting sentiment tendency at a fine granularity.

We formally define retweeting sentiment tendency analysis as follows: given a group of retweets with related discrete feature vectors and corresponding retweeting sentiment tendency labels, we aim to leverage prior knowledge to automatically assign retweeting sentiment tendency labels to unknown retweets. Our proposed method consists of three modules: (1) Naïve Bayes models in the bottom layer, (2) a Naïve Bayes model in the middle layer, and (3) a Naïve Bayes model in the top layer. The process is outlined in Figure 1, where the bottom layer consists of Naïve Bayes models for predicting the user's profile and emotion, the middle layer is a Naïve Bayes model for predicting the user's relationship, and finally, in the top layer, the user's retweeting sentiment tendency is predicted through the multilayer nested Naïve Bayes model. The detailed descriptions are as follows.

4.1. Naïve Bayes Models in Bottom Layer

4.1.1. Profile-Based Naïve Bayes Model. The Naïve Bayes model, which is based on Bayes' theorem, reduces computational overhead via the conditional independence assumption and classifies unknown samples according to their maximum a posteriori probability. The calculation of the user's profile (denoted as UP) conforms to the Naïve Bayes model: UP is determined by the number of bifollowers (denoted as BI), the number of followers (denoted as FO), the number of followees (denoted as FE), the number of contents that the user posts (denoted as CN), the user's province (denoted as PR), the user's city (denoted as CI), the user's gender (denoted as GE), the created time of the user's account (denoted as CT), and the verified type of the user's account (denoted as VT). Hence UP can be seen as the root node of the Naïve Bayes model, and BI, FO, FE, CN, PR, CI, GE, CT, and VT can be seen as its leaf nodes.

Given that BI, FO, FE, CN, and CT are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals according to their values before modeling UP: BI is mapped to three levels (namely, few, medium, and many); FO is mapped to three levels (namely, few, medium, and many); FE is mapped to three levels (namely, few, medium, and many); CN is mapped to three levels (namely, few, medium, and many); and CT is mapped to three levels (namely, short, medium, and long). Consequently, the probability that the user's profile equals a certain discrete value is calculated as follows:

$$
\begin{aligned}
&P(\mathrm{UP} = p \mid \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT}) \\
&\quad = \frac{P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})}{P(\mathrm{BI})\, P(\mathrm{FO})\, P(\mathrm{FE})\, P(\mathrm{CN})\, P(\mathrm{PR})\, P(\mathrm{CI})\, P(\mathrm{GE})\, P(\mathrm{CT})\, P(\mathrm{VT})} \\
&\quad = \frac{P(\mathrm{UP} = p)\, P(\mathrm{BI} \mid \mathrm{UP} = p)\, P(\mathrm{FO} \mid \mathrm{UP} = p)\, P(\mathrm{FE} \mid \mathrm{UP} = p)\, P(\mathrm{CN} \mid \mathrm{UP} = p)\, P(\mathrm{PR} \mid \mathrm{UP} = p)\, P(\mathrm{CI} \mid \mathrm{UP} = p)\, P(\mathrm{GE} \mid \mathrm{UP} = p)\, P(\mathrm{CT} \mid \mathrm{UP} = p)\, P(\mathrm{VT} \mid \mathrm{UP} = p)}{P(\mathrm{BI})\, P(\mathrm{FO})\, P(\mathrm{FE})\, P(\mathrm{CN})\, P(\mathrm{PR})\, P(\mathrm{CI})\, P(\mathrm{GE})\, P(\mathrm{CT})\, P(\mathrm{VT})}
\end{aligned} \tag{9}
$$

where $p \in \{\text{bad}, \text{medium}, \text{good}\}$ denotes a certain discrete value of UP; BI, FO, FE, CN, PR, CI, GE, CT, and VT denote their corresponding discrete values, respectively; $P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})$ denotes the probability that $\mathrm{UP} = p$ and BI, FO, FE, CN, PR, CI, GE, CT, and VT are equal to their corresponding discrete values; $P(\cdot)$ denotes the probability that a feature equals a certain discrete value; and $P(\cdot \mid \mathrm{UP} = p)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UP} = p$. Then the user's profile can be classified into three levels (namely, bad, medium, and good) according to its maximum a posteriori probability, which is calculated with (9).

Figure 1: The framework of MLNBRST (bottom layer: profile-based and emotion-based Naïve Bayes models; middle layer: relationship-based Naïve Bayes model; top layer: retweeting sentiment tendency Naïve Bayes model).
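The discretization step and the maximum a posteriori rule behind (9) can be sketched as follows. This is a minimal illustration under several assumptions that are not in the paper: the cut points passed to `to_level` are arbitrary, the class name `CategoricalNB` and its methods are hypothetical, Laplace (add-one) smoothing is added so that unseen feature values do not zero out a class, and the posterior is normalized over the classes rather than by the product of feature marginals (the argmax is the same, because the denominator does not depend on the class).

```python
from collections import Counter, defaultdict


def to_level(value, low_cut, high_cut, labels=("few", "medium", "many")):
    """Map a continuous value to one of three levels (cut points are assumptions)."""
    if value < low_cut:
        return labels[0]
    if value < high_cut:
        return labels[1]
    return labels[2]


class CategoricalNB:
    """Naive Bayes over discrete features with add-one smoothing (assumption)."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        # (feature index, class) -> counts of each observed feature value
        self.value_counts = defaultdict(Counter)
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(j, label)][value] += 1
                self.values[j].add(value)
        return self

    def posterior(self, row):
        """Return P(class | row), normalized over the classes."""
        total = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            score = self.class_counts[c] / total  # prior P(class)
            for j, value in enumerate(row):
                count = self.value_counts[(j, c)][value]
                score *= (count + self.smoothing) / (
                    self.class_counts[c] + self.smoothing * len(self.values[j])
                )
            scores[c] = score
        norm = sum(scores.values())
        return {c: s / norm for c, s in scores.items()}

    def predict(self, row):
        """Maximum a posteriori class."""
        post = self.posterior(row)
        return max(post, key=post.get)


# Toy example with two discretized profile features (e.g., FO and CN levels).
train_rows = [("many", "many"), ("many", "medium"), ("few", "few"), ("few", "medium")]
train_labels = ["good", "good", "bad", "bad"]
model = CategoricalNB().fit(train_rows, train_labels)
predicted = model.predict((to_level(5000, 100, 1000), "many"))
```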

4.1.2. Emotion-Based Naïve Bayes Model. In the same way, the calculation of the user's emotion (denoted as UE) also tallies with the Naïve Bayes model: UE is determined by the number of positive emotional words (denoted as PEN), the number of negative emotional words (denoted as NEN), recent mood statistics (denoted as RMS), and emotion divergence (denoted as ED). As a consequence, UE can be viewed as the root node of the Naïve Bayes model, and PEN, NEN, RMS, and ED can be viewed as its leaf nodes.

Before modeling UE, given that PEN, NEN, RMS, and ED are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals: PEN is mapped to three levels (namely, few, medium, and many); NEN is mapped to three levels (namely, few, medium, and many); RMS is mapped to three levels (namely, low, medium, and high); and ED is mapped to three levels (namely, small, medium, and large). The probability that the user's emotion is equal to a certain discrete value is calculated as below:

$$
\begin{aligned}
P(\mathrm{UE} = e \mid \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})
&= \frac{P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})} \\
&= \frac{P(\mathrm{UE} = e)\, P(\mathrm{PEN} \mid \mathrm{UE} = e)\, P(\mathrm{NEN} \mid \mathrm{UE} = e)\, P(\mathrm{RMS} \mid \mathrm{UE} = e)\, P(\mathrm{ED} \mid \mathrm{UE} = e)}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})}
\end{aligned} \tag{10}
$$

where $e \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UE; PEN, NEN, RMS, and ED denote their corresponding discrete values, respectively; $P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})$ denotes the probability that $\mathrm{UE} = e$ and PEN, NEN, RMS, and ED are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UE} = e)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UE} = e$. Then we classify the user's emotion into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (10).


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of the user's relationship (denoted as UR) is in accordance with the Naïve Bayes model as well: UR is determined by the user's profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF). Hence UR can be treated as the root node of the Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UR: DSM is mapped to three levels (namely, low, medium, and high), and DIF is mapped to three levels (namely, low, medium, and high). So the probability that the user's relationship is equal to a certain discrete value is calculated with

$$
\begin{aligned}
P(\mathrm{UR} = r \mid \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})
&= \frac{P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})} \\
&= \frac{P(\mathrm{UR} = r)\, P(\mathrm{UP} \mid \mathrm{UR} = r)\, P(\mathrm{DSM} \mid \mathrm{UR} = r)\, P(\mathrm{DIF} \mid \mathrm{UR} = r)}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})}
\end{aligned} \tag{11}
$$

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})$ denotes the probability that $\mathrm{UR} = r$ and UP, DSM, and DIF are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UR} = r)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UR} = r$. Then UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of the user's retweeting sentiment tendency (denoted as ST), profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze the user's retweeting sentiment tendency. In this paper, the user's retweeting sentiment tendency, being determined by the user's relationship and emotion, can be regarded as the root node of the Naïve Bayes model, and the user's relationship and emotion can be regarded as its leaf nodes. Thus the user's retweeting sentiment tendency is calculated as follows:

$$
\begin{aligned}
P(\mathrm{ST} = st \mid \mathrm{UP}, \mathrm{UR}, \mathrm{UE})
&= \frac{P(\mathrm{ST} = st, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})} \\
&= \frac{P(\mathrm{ST} = st)\, P(\mathrm{UP} \mid \mathrm{ST} = st)\, P(\mathrm{UR} \mid \mathrm{ST} = st)\, P(\mathrm{UE} \mid \mathrm{ST} = st)}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})}
\end{aligned} \tag{12}
$$

where $st \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST} = st, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})$ denotes the probability that $\mathrm{ST} = st$ and UP, UR, and UE are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{ST} = st)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{ST} = st$. Thus ST can be classified into three particular emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).
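The way the layers feed into one another can be sketched as below. This is a speculative, minimal sketch rather than the authors' implementation: it assumes every training sample already carries discrete labels for UP, UE, UR, and ST (the text above does not say how the intermediate labels are obtained), it uses add-one smoothing, and all names (`fit_nb`, `predict_nb`, `fit_multilayer`, `predict_multilayer`, and the dictionary keys) are hypothetical. The top layer takes UP, UR, and UE as inputs, following (12).

```python
from collections import Counter


def fit_nb(rows, labels, smoothing=1.0):
    """Fit a categorical Naive Bayes model and return it as a dict of counts."""
    model = {"classes": Counter(labels), "counts": Counter(),
             "values": {}, "smoothing": smoothing}
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            model["counts"][(j, value, label)] += 1
            model["values"].setdefault(j, set()).add(value)
    return model


def predict_nb(model, row):
    """Return the maximum a posteriori class for one discrete feature row."""
    total = sum(model["classes"].values())
    s = model["smoothing"]
    best_label, best_score = None, -1.0
    for label, n_label in sorted(model["classes"].items()):
        score = n_label / total  # prior
        for j, value in enumerate(row):
            k = len(model["values"].get(j, ())) or 1
            score *= (model["counts"][(j, value, label)] + s) / (n_label + s * k)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def fit_multilayer(samples):
    """Fit the bottom (UP, UE), middle (UR), and top (ST) models.

    Assumption: each sample is a dict with discrete 'profile' and 'emotion'
    feature tuples, 'dsm' and 'dif' levels, and labels 'UP', 'UE', 'UR', 'ST'.
    """
    return {
        "UP": fit_nb([x["profile"] for x in samples], [x["UP"] for x in samples]),
        "UE": fit_nb([x["emotion"] for x in samples], [x["UE"] for x in samples]),
        "UR": fit_nb([(x["UP"], x["dsm"], x["dif"]) for x in samples],
                     [x["UR"] for x in samples]),
        "ST": fit_nb([(x["UP"], x["UR"], x["UE"]) for x in samples],
                     [x["ST"] for x in samples]),
    }


def predict_multilayer(models, profile, emotion, dsm, dif):
    """Predict bottom-layer nodes first, then pass them up the layers."""
    up = predict_nb(models["UP"], profile)
    ue = predict_nb(models["UE"], emotion)
    ur = predict_nb(models["UR"], (up, dsm, dif))
    return predict_nb(models["ST"], (up, ur, ue))


# Toy example with two fully labeled training samples.
samples = [
    {"profile": ("many", "many"), "emotion": ("many", "few"), "dsm": "high",
     "dif": "high", "UP": "good", "UE": "high", "UR": "high", "ST": "positive"},
    {"profile": ("few", "few"), "emotion": ("few", "many"), "dsm": "low",
     "dif": "low", "UP": "bad", "UE": "low", "UR": "low", "ST": "negative"},
]
models = fit_multilayer(samples)
tendency = predict_multilayer(models, ("many", "many"), ("many", "few"), "high", "high")
```

Passing only the predicted label of each lower-layer node upward, as done here, is one simple design choice; propagating the full posterior distribution instead would be another option that this sketch does not explore.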

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users’ tweets, retweets, and followings’ number from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of Sina microblogging which contains retweeting contents with sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are described as follows: we randomly divide the dataset into two parts, $A$ and $B$. $A$ contains 90% of the retweets and is used for training; the remaining 10% of the retweets, denoted as $B$, is designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance via precision, recall, and $F_1$-measure.
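The evaluation protocol described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: `k_fold_indices`, `precision_recall_f1`, and `cross_validate` are hypothetical helper names, and `train_fn` and `predict_fn` stand in for whatever classifier is being evaluated. In each of the 10 folds, roughly 90% of the samples are used for training and 10% for testing, and the per-class scores are averaged over the folds.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall, and F1 for one class (one-vs-rest counts)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def cross_validate(samples, labels, train_fn, predict_fn, label, k=10, seed=0):
    """Mean (precision, recall, F1) of one class over k folds."""
    scores = []
    for held_out in k_fold_indices(len(samples), k, seed):
        test_set = set(held_out)
        train = [(samples[i], labels[i])
                 for i in range(len(samples)) if i not in test_set]
        model = train_fn(train)
        y_true = [labels[i] for i in held_out]
        y_pred = [predict_fn(model, samples[i]) for i in held_out]
        scores.append(precision_recall_f1(y_true, y_pred, label))
    return tuple(sum(s[j] for s in scores) / k for j in range(3))
```

Calling `cross_validate` once per class (positive, negative, neutral) gives per-class mean scores of the kind reported in the figures.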

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user’s profile-based features only,

(ii) rMLNBRST: considering user’s relationship-based features only,

(iii) eMLNBRST: considering user’s emotion-based features only,

(iv) aMLNBRST: considering all features.

[Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Panels: (a) pMLNBRST, (b) rMLNBRST, (c) eMLNBRST, (d) aMLNBRST. Each panel plots precision, recall, and F1-measure (vertical axis from 0 to 1) for the positive, negative, and neutral classes. Data labels as extracted, one (precision, recall, F1-measure) triple per series: (a) 0.336, 0.401, 0.366; 0.427, 0.414, 0.420; 0.358, 0.390, 0.373. (b) 0.645, 0.643, 0.643; 0.664, 0.604, 0.632; 0.622, 0.642, 0.631. (c) 0.734, 0.733, 0.733; 0.742, 0.754, 0.747; 0.676, 0.689, 0.682. (d) 0.751, 0.763, 0.756; 0.787, 0.775, 0.780; 0.712, 0.731, 0.721.]

Table 1: Statistics of the dataset.

Features      Statistics
users         296,134
followings    51,415,017
tweets        176,659
retweets      264,743

The precision, recall, and $F_1$-measure of the different feature-based methods are shown in Figure 2.

We draw the following observations: removing either user’s relationship-based or emotion-based features may noticeably lower the model’s prediction ability. Additionally, emotion-based features contribute more to analyzing user’s retweeting sentiment polarity than profile- and relationship-based features, since the many positive or negative emotional words in retweeting contents provide a sound basis for accurate emotion prediction. Besides, the system improves its performance when profile-, relationship-, and content-based features are merged, from which we find that it is difficult to predict user’s retweeting emotion tendency with only one specific type of features and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy models (denoted as ME) are the machine learning techniques most commonly used in sentiment classification [26], we compare, on the basis of our proposed features, the proposed framework MLNBRST with SVM, NB, and ME, as well as with NBSVM used in [16], which is an improved SVM classifier, and Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and our best one are depicted in Figure 3.

From Figure 3, it can be found that the proposed approach shows the best results in terms of precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it strongly relies on the corpus. Since support vector machine is only applicable to a small-size training dataset, Naïve Bayes is more suitable than support vector machine for sentiment classification of microblogging, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user’s sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

[Figure 3: Precisions, recalls, and F1-measures of different methods. Horizontal axis: SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST; vertical axis from 0.5 to 1; series: precision, recall, and F1-measure.]

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that all improvement is significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over the representative feature-based methods and baseline methods, which answers the first question.

5.3. Analysis of Different Factors’ Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact of the proposed retweeting sentiment features and accordingly answer the second question. Because of space limitations, Figures 4(a), 4(b), 4(c), and 4(d) merely illustrate the probability distributions of the values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), if the dynamic Salton metrics between two users are larger, it is easier for them to retweet casually, which may lead to either positive or negative emotion being included in retweeting contents. In Figure 4(b), users who interact frequently are less likely to retweet each other with neutral emotion tendency; instead, they are more likely to express support or opposition to each other. In Figure 4(c), users have a higher probability of retweeting with positive emotion tendency if they are latently in high spirits, and vice versa. Figure 4(d) shows strong evidence for the impact that emotion divergence has on user’s retweeting emotion tendency. If the absolute values of emotion divergence are small, the majority of users may retweet with neutral emotion. The ratio of retweeting with negative emotion is higher if there are negative values of emotion divergence between the emotion of the microblog and the emotion expressed in user’s recent states. If the values of emotion divergence are positive, users are more likely to retweet with positive emotion, which may be determined by users’ recent mood statistics.

Table 2: Information gains of features.

Types of features   Features                          Information gains
Profile-based       bifollowers                       0.254
                    followers                         0.231
                    followees                         0.257
                    posts                             0.201
                    Province                          0.105
                    City                              0.110
                    Gender                            0.148
                    Created time of user’s account    0.092
                    Verified type of user’s account   0.102
Relation-based      Dynamic Salton metrics            0.485
                    Dynamic interaction frequency     0.503
Emotion-based       positive emotional words          0.647
                    negative emotional words          0.622
                    Recent mood statistics            0.694
                    Emotion divergence                0.723

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to user’s retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v), \tag{13}$$

where $e$ denotes a certain emotion state which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$, $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$, $p(e)$ stands for the probability that emotion state $e$ appears in the dataset, $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset, and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of features are shown in Table 2.

[Figure 4: The probability distribution of different retweeting sentiment tendency features. Panels: (a) dynamic Salton metrics, (b) dynamic interaction frequency, (c) recent mood statistics, (d) emotion divergence. Vertical axis: probability distribution; series: positive, negative, and neutral.]

As described in Table 2, the information gain of user’s profile-based features is lower than that of relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, being computed from the counts of positive and negative emotional words, recent mood statistics and emotion divergence have larger information gains than the positive or negative emotional word counts.
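For readers who want to reproduce the quantity in (13), the following sketch computes the information gain of one discretized feature as the entropy of the emotion labels minus their conditional entropy given the feature value. The function names are hypothetical, and the base-2 logarithm is an assumption, since the text does not state the base.

```python
import math
from collections import Counter

def entropy(labels):
    """-sum_e p(e) log2 p(e) over the labels (log base 2 is an assumption)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, emotions):
    """IG of one discretized feature, following the form of (13).

    values:   discretized feature value of each sample
    emotions: emotion state of each sample (same order as values)
    """
    n = len(emotions)
    conditional = 0.0
    for v, count in Counter(values).items():
        subset = [e for x, e in zip(values, emotions) if x == v]
        conditional += (count / n) * entropy(subset)   # p(v) * H(E | v)
    return entropy(emotions) - conditional

# Hypothetical example: a feature that separates the two classes perfectly.
print(information_gain(["a", "a", "b", "b"],
                       ["positive", "positive", "negative", "negative"]))
```

A feature whose value is the same for every sample yields a gain of zero under this definition, which is the sense in which a low gain marks a weak indicator.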

From the above, it can be found that the proposed factors can be used as good indicators of user’s retweeting sentiment tendency. However, although social relationship plays an important role in individual emotion, which is in keeping with Fowler and Christakis’s work [27], it can only distinguish between neutral and other sentiment tendencies, while emotion-based features discriminate well only between positive and negative sentiment polarity. Hence, comprehensive consideration of the context information of retweeting content is necessary.

5.3.2. Impact of Temporal Information in MLNBRST. To answer the second question, we also investigate how temporal information affects the performance of our method in terms of $F_1$-measure by changing the time slice weight factor $\alpha$. In this paper, $\alpha$ is varied as {0.01, 0.1, 0.5, 0.7, 1}, and we carry out 10-fold cross-validation with 50%, 60%, 80%, and 100% of $A$ for training, so as to avoid bias brought by the size of the training data. The results are shown in Figure 5, where “50,” “60,” “80,” and “100” denote that we leverage 50%, 60%, 80%, and 100% of $A$ for training.

[Figure 5: The impact of temporal information in the proposed framework MLNBRST. Axes: $\alpha$ (0.01, 0.1, 0.5, 0.7, 1), training data in % (50, 60, 80, 100), and F1-measure (0.4 to 0.8).]

It can be observed from Figure 5 that, when setting $\alpha$ as 1, namely, without considering temporal information, the $F_1$-measure is much lower than the peak performance, and that the $F_1$-measure first increases greatly and then degrades rapidly after reaching a peak value as $\alpha$ increases.

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.


6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user’s retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users’ dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the numbers of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users’ posts. Then, on the basis of Naïve Bayes theory, we built models in the lower layers from the profile-, relationship-, and emotion-based dimensions, respectively, followed by designing a multilayer Naïve Bayes model on the constructed models of different dimensions to analyze user’s retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and $F_1$-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method, so as to improve its performance as well as broaden its online application scope. Furthermore, we will investigate which directions can improve its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, “A comparison of online social networks and real-life social networks: a study of Sina Microblogging,” Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, “Sentiment analysis using common-sense and context information,” Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, “Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, “Happiness is assortative in online social networks,” Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, “Predicting elections with Twitter: What 140 characters reveal about political sentiment,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, “Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope,” Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, “Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures,” Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, “Emotions and information diffusion in social media: sentiment of microblogs and sharing behavior,” Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, “Enhanced sentiment learning using twitter hash-tags and smileys,” in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, “MoodLens: an emoticon-based sentiment analysis system for Chinese tweets,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, “Characterizing microblogs with topic models,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, “Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network,” Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “A knowledge-based approach for polarity classification in Twitter,” Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “Ranked word net graph for sentiment polarity classification in Twitter,” Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, “A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Šimko, “Sentiment analysis on microblog utilizing appraisal theory,” World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, “Text-based emotion classification using emotion cause extraction,” Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, “Dynamic evolution of collective emotions in social networks: a case study of Sina weibo,” Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, “Adaptive recursive neural network for target-dependent Twitter sentiment classification,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, “Social influence locality for modeling retweeting behaviors,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1397–1405, August 2011.

[23] M. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, “Understanding social relationship evolution by using real-world sensing data,” World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, “Classifying sentiment in microblogs: is brevity an advantage?” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study,” British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.


dynamic nature of network cannot make a comprehensive analysis on user’s retweeting sentiment tendency. Hence, in this paper, we propose a multilayer Naïve Bayes model for analyzing user’s retweeting sentiment tendency towards a microblog (denoted as MLNBRST), and our main contributions are summarized next.

(1) Take user’s recent individual emotion as well as the emotion difference between user and microblog as metrics to further analyze user’s retweeting sentiment tendency.

(2) Improve traditional Salton metrics according to the directivity of links so that they can be better applied to directed networks.

(3) Blend temporal information into user’s retweeting sentiment features on the basis of time series of user’s contents and network topological information, so as to capture the dynamic evolution process of information and network structure.

(4) Build a multilayer Naïve Bayes model on the basis of Naïve Bayes models from different dimensions to complete user’s retweeting sentiment tendency analysis from a more fine-grained perspective.

(5) Evaluate MLNBRST on a real-world Sina microblogging dataset and elaborate the importance of different retweeting sentiment features and temporal information in user’s retweeting sentiment tendency analysis.

The rest of the paper is organized as follows. Section 2 describes the related work; dynamic retweeting sentiment features are depicted in Section 3; Section 4 defines the method we propose; details of the experimental results and the dataset used in this study are given in Section 5. Finally, the conclusion appears in Section 6.

2. Related Work

In recent years, with the popularization of microblogging, sentiment analysis of microblogging has become one of the hot research topics [12]. Existing microblogging sentiment analysis algorithms can be roughly categorized into two groups: emotional dictionary-based methods and machine learning methods.

In emotional dictionary-based methods, a microblog’s sentiment polarity is calculated by summing up the sentiment polarity of all its emotion words. Golder and Macy [8] adopted a prominent lexicon, Linguistic Inquiry and Word Count (LIWC), to automatically analyze sentiment in tweets published by millions of microbloggers from different regions and cultural backgrounds, and the results revealed that the proposed method can clearly identify user’s emotion pattern over a period of time. Ghiassi et al. [13] introduced an approach to supervised feature reduction using n-grams and statistical analysis to develop a Twitter-specific lexicon for sentiment analysis, which yielded improvement of sentiment classification accuracy. Montejo-Ráez et al. [14, 15] explored an unsupervised, domain-independent method for polarity classification in Twitter via combining SentiWordNet scores with WordNet. Through weighting SentiWordNet values with the score of a random walk analysis (PageRank) on the concepts found in texts over the WordNet graph, the polarity classification problem was solved. The results obtained showed that both disambiguation and expansion were good strategies for improving performance. Stieglitz and Dang-Xuan [9] leveraged the tool “SentiStrength,” which used a human-designed lexicon of emotional terms with a set of additional linguistic rules for negations, booster words, amplifications, emoticons, spelling corrections, and other factors such as word weighting, to analyze the level of sentiments in politically relevant tweets.

Although emotional dictionary-based methods can better represent the unstructured characteristics of text, they are too dependent on the emotion dictionary and ignore the effects of new words. Machine learning methods can overcome the influence that unknown words have on the performance of sentiment analysis. Such a method first selects words and phrases as features to form a vector space model and then converts the sentiment analysis problem into a classification problem by treating different emotional tendencies as different categories. Davidov et al. [10] treated related tags and emoticons in tweets as labels and designed a supervised $k$-nearest neighbor classifier to complete emotion classification on Twitter, which did not need much manual annotation. Based on the 95 most frequently used emoticons, which were mapped to four emotional categories (happy, hate, low, and anger), Zhao et al. [11] employed the emoticons for the generation of sentiment labels for tweets and built an incremental learning Naïve Bayes classifier for the categorization of four types of sentiments with an empirical precision of 64.3%. Wang et al. [16] analyzed the emotion tendentiousness of related microblogs in the 2012 US presidential election. They found that, for short text, Naïve Bayes performed better than SVM on emotion tendentiousness classification. So they presented an improved SVM classifier, NBSVM, which was better and more stable than SVM and Naïve Bayes. Korenek and Šimko [17] first created an appraisal dictionary by utilizing a psychological theory called appraisal theory, which allowed a deeper and more fine-grained analysis of microbloggings, followed by classifying posts using an SVM classifier, which revealed that the proposed method was feasible even for specific content presented on microbloggings. On the basis of common social network characteristics and other carefully generalized linguistic patterns, Li and Xu [18] proposed and implemented a novel method for identifying sentiment in microblogging posts. First, after thorough analysis of sample data, an automatic rule-based system was constructed to detect and extract the cause event of each emotional post. Then an emotional corpus was built with Chinese microblogging posts labeled by human annotators. Finally, a classifier, namely, SVR (support vector regression), was trained to classify emotions in microblogging posts based on the extracted cause events. The overall performance of the proposed system was very promising. Focusing on emotion dynamics in OSNs, Xiong et al. [19] proposed an emotion classifier based on Bayes theory, and some effective strategies (Shannon entropy and the salience degree of each word) were introduced to improve the performance of the classifier, with which the proposed method can classify any Chinese tweet into a particular emotion with a satisfactory accuracy. In order to achieve target-dependent Twitter sentiment classification, Dong et al. [20] proposed the Adaptive Recursive Neural Network (AdaRNN), which employs more than one composition function. For a given tweet, its dependency tree for the target of interest was first converted. Next, AdaRNN learned how to adaptively propagate the sentiments of words to the target node based on context and linguistic tags. The experimental studies illustrated that AdaRNN improved on the baseline methods.

To sum up, retweeting sentiment tendency analysis in dynamic social networks is still at the stage of development; how to depict the directivity of relationships, how to fuse multidimensional features reasonably, and how to build a model that can adapt to the dynamic evolution process of the network remain very challenging tasks. To this end, we present a multilayer Naïve Bayes model for analyzing user’s retweeting sentiment tendency, which is appropriate for dynamic and directed social networks.

3. Modeling Retweeting Sentiment Tendency Features

Since a comprehensive analysis of user’s retweeting sentiment tendency cannot be made based on one specific type of features only, in this paper we synthesize profile-, relationship-, and emotion-based features so as to achieve higher accuracy in retweeting sentiment tendency analysis.

3.1. Profile-Based Features. In this paper, according to the dataset in [21], we take the number of bifollowers, the number of followers, the number of followees, the number of contents that the user posts, the user’s province, the user’s city, the user’s gender, the created time of the user’s account, and the verified type of the user’s account, which are provided in the original dataset, as profile-based features.

3.2. Relationship-Based Features. Tan et al. [22] pointed out that users who have a close relationship may have similar emotional points of view, so the relationship between users can be used to extract users’ emotions. In this paper, we infer relationship-based features via user’s network topological structure and interaction frequency.

3.2.1. Dynamic Salton Metrics. Common Neighbors metrics assume that the similarity between users is proportional to the number of their common neighbors [23]. Compared with Common Neighbors metrics, Salton metrics additionally introduce users’ degrees to measure user’s network topological structure. Since nonreciprocal friendships, which may reflect moderately valued friendship ties [24], are more important than reciprocal friendships, we make an improvement on traditional Salton metrics according to the directivity of links. Besides, understanding the dynamic structure of online social networks plays an important role in the development of retweeting sentiment tendency analyzing algorithms. Thus, given the dynamic nature of social networks, we consider the network as a dynamic flow of time slices (one time slice stands for one day; the older a time slice is, the lower its importance and hence its weight), and dynamic Salton metrics between user $u$ and user $v$ on the $i$th time slice $t_i$ are defined as follows:

$$\mathrm{Sa}(u, v, t_i) = \frac{\left|\Gamma_{\mathrm{in}}(u, t_i) \cap \Gamma_{\mathrm{in}}(v, t_i)\right|}{\sqrt{\left|\Gamma_{\mathrm{in}}(u, t_i)\right| \left|\Gamma_{\mathrm{in}}(v, t_i)\right|}}\, \frac{\left|\Gamma_{\mathrm{out}}(u, t_i) \cap \Gamma_{\mathrm{out}}(v, t_i)\right|}{\sqrt{\left|\Gamma_{\mathrm{out}}(u, t_i)\right| \left|\Gamma_{\mathrm{out}}(v, t_i)\right|}}, \tag{1}$$

where $\Gamma_{\mathrm{in}}(u, t_i)$ and $\Gamma_{\mathrm{in}}(v, t_i)$ stand for the in-link user sets of user $u$ and user $v$ on $t_i$, respectively; $\Gamma_{\mathrm{out}}(u, t_i)$ and $\Gamma_{\mathrm{out}}(v, t_i)$ stand for the out-link user sets of user $u$ and user $v$ on $t_i$, respectively, where in-links and out-links are defined by the follower relationship; and $|\Gamma(x)|$ stands for the number of elements in set $\Gamma(x)$. Thus, the dynamic Salton metrics between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ are calculated as

$$
\mathrm{Sa}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Sa}(u, v, t_i) \tag{2}
$$

where $\alpha \in [0, 1]$, $\alpha^{n-i}$ represents the weight of $t_i$, and $n$ represents the number of time slices.
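As a rough illustration of how one directional term of (1) and the time-decayed sum of (2) might be computed, consider the following Python sketch. The function names `salton` and `time_decayed_sum`, the use of Python sets for the in-link or out-link user sets, and the choice to return 0 when a set is empty are assumptions made here for illustration; they are not taken from the original implementation.

```python
import math


def salton(a: set, b: set) -> float:
    """Salton index |a & b| / sqrt(|a| * |b|) for two neighbor sets.

    Returning 0.0 when either set is empty is an assumption of this sketch.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def time_decayed_sum(values, alpha):
    """Sum of alpha**(n - i) * values[i] for i = 0..n, with n = len(values) - 1.

    values[0] is the oldest time slice and values[-1] the most recent one.
    """
    n = len(values) - 1
    return sum(alpha ** (n - i) * x for i, x in enumerate(values))


# Hypothetical in-link sets of two users on a single day.
in_u = {"a", "b"}
in_v = {"b", "c"}
print(salton(in_u, in_v))  # one shared neighbor out of sqrt(2 * 2)

# Hypothetical per-day values aggregated with alpha = 0.5.
print(time_decayed_sum([1.0, 1.0, 1.0], 0.5))
```

With per-day values of 1.0 and $\alpha = 0.5$, the three weights are 0.25, 0.5, and 1, so the most recent day contributes most to the sum.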

3.2.2. Dynamic Interaction Frequency. Similar to the dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, as the weight of the $i$th time slice $t_i$, and the interaction frequency between user $u$ and user $v$ on $t_i$ is defined as follows:

$$
\mathrm{Fre}(u, v, t_i) = \frac{c(u, v, t_i)}{p(u, t_i) + p(v, t_i)} \tag{3}
$$

where $p(u, t_i)$ and $p(v, t_i)$ stand for the number of posts of user $u$ and user $v$ on $t_i$, respectively, and $c(u, v, t_i)$ stands for the number of interactions between user $u$ and user $v$ on $t_i$, where an interaction is defined as user $u$ retweeting a microblog of user $v$ or user $v$ retweeting a microblog of user $u$. Thus, the dynamic interaction frequency between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ is calculated as

$$
\mathrm{Fre}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Fre}(u, v, t_i) \tag{4}
$$
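Equations (3) and (4) can be sketched in Python as follows. The function names, the per-day tuple layout `(interactions, posts_u, posts_v)`, and the convention of returning 0 for a day on which neither user posts are assumptions of this sketch, since the paper does not state how an empty day is handled.

```python
def interaction_frequency(c_uv, p_u, p_v):
    """Per-slice frequency c / (p_u + p_v), as in (3).

    A day with no posts from either user yields 0.0 (assumption).
    """
    total = p_u + p_v
    return c_uv / total if total else 0.0


def dynamic_interaction_frequency(slices, alpha):
    """Time-decayed sum of per-slice frequencies, as in (4).

    slices[i] = (interactions, posts_u, posts_v) for time slice t_i,
    ordered from oldest to most recent.
    """
    n = len(slices) - 1
    return sum(
        alpha ** (n - i) * interaction_frequency(c, pu, pv)
        for i, (c, pu, pv) in enumerate(slices)
    )


# Hypothetical three-day history for one pair of users.
history = [(1, 2, 2), (0, 0, 0), (2, 1, 3)]
print(dynamic_interaction_frequency(history, 0.5))
```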

3.3. Emotion-Based Features. Since people tend to post microblogs that express their experiences or views in order to satisfy their desire for self-expression, their contents contain quite a lot of emotional words or expressions. Thus, along with the number of emotional words, we employ recent mood statistics and emotion divergence as emotion-based features.


3.3.1. The Number of Emotional Words. In this paper, we count the number of positive and negative emotional words in the user's retweeting contents with the corpus of HowNet Knowledge (http://www.keenage.com/download/sentiment.rar). HowNet Knowledge, which includes 8945 words and phrases, consists of six files: a positive emotional words list, a negative emotional words list, a positive review words list, a negative review words list, a degree words list, and a propositional words list.
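The counting step described above can be sketched as a simple lexicon lookup. The sketch assumes the text has already been segmented into tokens and that the positive and negative word lists have been loaded into Python sets; the function name and the tiny English lexicons below are hypothetical stand-ins, not the HowNet lists themselves.

```python
def count_emotional_words(tokens, positive_lexicon, negative_lexicon):
    """Return (number of positive tokens, number of negative tokens)."""
    positive = sum(1 for token in tokens if token in positive_lexicon)
    negative = sum(1 for token in tokens if token in negative_lexicon)
    return positive, negative


# Toy lexicons for illustration only.
POSITIVE = {"great", "good"}
NEGATIVE = {"terrible"}

tokens = ["great", "movie", "but", "terrible", "ending", "great"]
print(count_emotional_words(tokens, POSITIVE, NEGATIVE))
```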

3.3.2. Recent Mood Statistics. To enhance the performance of retweeting sentiment analysis, we further analyze the user's contents according to the number of positive and negative emotional words. Similar to the dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, as the weight of the $i$th time slice $t_i$, and user $u$'s mood statistics on $t_i$ are calculated as follows:

$$
\mathrm{Ue}(u, t_i) = \frac{\mathrm{Upn}(u, t_i)}{\mathrm{Upn}(u, t_i) + \mathrm{Unn}(u, t_i)} \tag{5}
$$

where $\mathrm{Upn}(u, t_i)$ and $\mathrm{Unn}(u, t_i)$ represent the number of positive emotional words and the number of negative emotional words, respectively, that user $u$ used on $t_i$ and that are included in HowNet Knowledge. Thus, user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ are calculated with

$$
\mathrm{Ue}_{[0, t_n]}(u) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Ue}(u, t_i) \tag{6}
$$

3.3.3. Emotion Divergence. In this paper, we quantitatively define the emotion divergence between user $u$ and microblog $b$ as follows:

$$
\mathrm{Em}(u, b, t_n) = \mathrm{Ue}_{[0, t_n]}(u) - \mathrm{Ie}(b) \tag{7}
$$

where $\mathrm{Ue}_{[0, t_n]}(u)$ represents user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ and $\mathrm{Ie}(b)$ represents the emotion statistics expressed in microblog $b$, which can be calculated as

$$
\mathrm{Ie}(b) = \frac{\mathrm{Ipn}(b)}{\mathrm{Ipn}(b) + \mathrm{Inn}(b)} \tag{8}
$$

where $\mathrm{Ipn}(b)$ and $\mathrm{Inn}(b)$ denote the number of positive emotional words and the number of negative emotional words, respectively, that are used in microblog $b$ and included in HowNet Knowledge.
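Equations (5) to (8) chain together naturally, as the following Python sketch suggests. The function names and the convention of returning 0 when a text contains no emotional words at all are assumptions made for illustration; the paper does not specify that case.

```python
def positive_ratio(positive, negative):
    """Share of positive emotional words, as in (5) and (8).

    Returns 0.0 when there are no emotional words (assumption).
    """
    total = positive + negative
    return positive / total if total else 0.0


def recent_mood(daily_counts, alpha):
    """Time-decayed mood statistics, as in (6).

    daily_counts[i] = (positive, negative) word counts on time slice t_i,
    ordered from oldest to most recent.
    """
    n = len(daily_counts) - 1
    return sum(
        alpha ** (n - i) * positive_ratio(pos, neg)
        for i, (pos, neg) in enumerate(daily_counts)
    )


def emotion_divergence(daily_counts, alpha, blog_positive, blog_negative):
    """User's recent mood minus the microblog's emotion statistics, as in (7)."""
    return recent_mood(daily_counts, alpha) - positive_ratio(blog_positive, blog_negative)


# Hypothetical user history (two days) and a mostly negative microblog.
user_days = [(3, 1), (1, 1)]
print(recent_mood(user_days, 0.5))
print(emotion_divergence(user_days, 0.5, 1, 3))
```

A positive divergence here means that the user's recent mood is more positive than the microblog being retweeted.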

4. Multilayer Naïve Bayes Model for Analyzing User's Retweeting Sentiment Tendency

Some researchers found that Naïve Bayes was more suitable for sentiment classification on microblogging [25]. On the basis of Bayes' theorem, the Naïve Bayes model represents uncertainty with probability and realizes the process of learning and reasoning via probability. Hence, in this paper, we put forward a multilayer Naïve Bayes model, which is a variant of the Naïve Bayes model, to analyze the user's retweeting sentiment tendency at a fine granularity.

We formally define retweeting sentiment tendency analysis as follows: given a group of retweets with their discrete feature vectors and corresponding retweeting sentiment tendency labels, we aim to leverage this prior knowledge to automatically assign retweeting sentiment tendency labels to unknown retweets. Our proposed method consists of three modules: (1) Naïve Bayes models in the bottom layer; (2) a Naïve Bayes model in the middle layer; (3) a Naïve Bayes model in the top layer. The process is outlined in Figure 1, where the bottom layer consists of the Naïve Bayes models for predicting the user's profile and emotion, the middle layer is the Naïve Bayes model for predicting the user's relationship, and, finally, in the top layer, the user's retweeting sentiment tendency is predicted through the multilayer nested Naïve Bayes model. The detailed descriptions are as follows.

4.1. Naïve Bayes Models in Bottom Layer

4.1.1. Profile-Based Naïve Bayes Model. The Naïve Bayes model, which is based on Bayes' theorem, reduces computational overhead via the conditional independence assumption and classifies unknown samples according to their maximum a posteriori probability. The user's profile (denoted as UP) is determined by the number of bifollowers (denoted as BI), the number of followers (denoted as FO), the number of followees (denoted as FE), the number of contents that the user posts (denoted as CN), the user's province (denoted as PR), the user's city (denoted as CI), the user's gender (denoted as GE), the created time of the user's account (denoted as CT), and the verified type of the user's account (denoted as VT), so its calculation conforms to the Naïve Bayes model: UP can be seen as the root node, and BI, FO, FE, CN, PR, CI, GE, CT, and VT can be seen as the leaf nodes.

Given that BI, FO, FE, CN, and CT are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UP. According to the features' values, BI, FO, FE, and CN are each mapped to three levels (namely, few, medium, and many), and CT is mapped to three levels (namely, short, medium, and long). Consequently, the probability that the user's profile equals a certain discrete value is calculated as follows:

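$$
\begin{aligned}
& P(\mathrm{UP} = p \mid \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, G, \mathrm{CT}, \mathrm{VT}) \\
&\quad = \frac{P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, G, \mathrm{CT}, \mathrm{VT})}{P(\mathrm{BI})\, P(\mathrm{FO})\, P(\mathrm{FE})\, P(\mathrm{CN})\, P(\mathrm{PR})\, P(\mathrm{CI})\, P(G)\, P(\mathrm{CT})\, P(\mathrm{VT})} \\
&\quad = \frac{P(\mathrm{UP} = p)\, P(\mathrm{BI} \mid \mathrm{UP} = p)\, P(\mathrm{FO} \mid \mathrm{UP} = p)\, P(\mathrm{FE} \mid \mathrm{UP} = p)\, P(\mathrm{CN} \mid \mathrm{UP} = p)\, P(\mathrm{PR} \mid \mathrm{UP} = p)\, P(\mathrm{CI} \mid \mathrm{UP} = p)\, P(G \mid \mathrm{UP} = p)\, P(\mathrm{CT} \mid \mathrm{UP} = p)\, P(\mathrm{VT} \mid \mathrm{UP} = p)}{P(\mathrm{BI})\, P(\mathrm{FO})\, P(\mathrm{FE})\, P(\mathrm{CN})\, P(\mathrm{PR})\, P(\mathrm{CI})\, P(G)\, P(\mathrm{CT})\, P(\mathrm{VT})}
\end{aligned}
\tag{9}
$$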


Figure 1: The framework of MLNBRST. Bottom layer: the profile-based Naïve Bayes model (root UP; leaves BI, FO, FE, CN, PR, CI, GE, CT, and VT) and the emotion-based Naïve Bayes model (root UE; leaves PEN, NEN, RMS, and ED). Middle layer: the relationship-based Naïve Bayes model (root UR; leaves UP, DSM, and DIF). Top layer: the retweeting sentiment tendency Naïve Bayes model (root ST; leaves UE and UR).

where $p \in \{\text{bad}, \text{medium}, \text{good}\}$ denotes a certain discrete value of UP; BI, FO, FE, CN, PR, CI, $G$ (the gender feature, GE), CT, and VT denote their corresponding discrete values, respectively; $P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, G, \mathrm{CT}, \mathrm{VT})$ denotes the probability that $\mathrm{UP} = p$ and BI, FO, FE, CN, PR, CI, $G$, CT, and VT are equal to their corresponding discrete values; $P(\cdot)$ denotes the probability that a feature equals a certain discrete value; and $P(\cdot \mid \mathrm{UP} = p)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UP} = p$. Then the user's profile can be classified into three levels (namely, bad, medium, and good) according to its maximum a posteriori probability, which is calculated with (9).
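The mapping of a continuous attribute to three levels, performed before (9) is evaluated, can be sketched as follows. The paper does not state how the interval boundaries are chosen, so the equal-frequency (tertile) cut points used here are purely an assumption, as are the function names.

```python
def tertile_cuts(values):
    """Return (low_cut, high_cut) splitting the values into three groups.

    Equal-frequency cut points are an assumption of this sketch;
    the source does not specify the thresholds.
    """
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def to_level(x, cuts, labels=("few", "medium", "many")):
    """Map a continuous value to one of three discrete levels."""
    low_cut, high_cut = cuts
    if x < low_cut:
        return labels[0]
    if x < high_cut:
        return labels[1]
    return labels[2]


# Hypothetical follower counts used to derive the cut points.
followers = [1, 2, 3, 10, 20, 30, 100, 200, 300]
cuts = tertile_cuts(followers)
print(cuts)
print([to_level(x, cuts) for x in (5, 10, 150)])
```

The same helper could be reused for CT with labels such as `("short", "medium", "long")`.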

4.1.2. Emotion-Based Naïve Bayes Model. In the same way, the calculation of the user's emotion (denoted as UE) also conforms to the Naïve Bayes model, since UE is determined by the number of positive emotional words (denoted as PEN), the number of negative emotional words (denoted as NEN), recent mood statistics (denoted as RMS), and emotion divergence (denoted as ED). As a consequence, UE can be viewed as the root node of the Naïve Bayes model, and PEN, NEN, RMS, and ED can be viewed as its leaf nodes.

Before modeling UE, given that PEN, NEN, RMS, and ED are continuous attributes, we discretize them into intervals in order to calculate their conditional probabilities. PEN and NEN are each mapped to three levels (namely, few, medium, and many), RMS is mapped to three levels (namely, low, medium, and high), and ED is mapped to three levels (namely, small, medium, and large). The probability that the user's emotion equals a certain discrete value is calculated as follows:

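$$
\begin{aligned}
P(\mathrm{UE} = e \mid \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED}) &= \frac{P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})} \\
&= \frac{P(\mathrm{UE} = e)\, P(\mathrm{PEN} \mid \mathrm{UE} = e)\, P(\mathrm{NEN} \mid \mathrm{UE} = e)\, P(\mathrm{RMS} \mid \mathrm{UE} = e)\, P(\mathrm{ED} \mid \mathrm{UE} = e)}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})}
\end{aligned}
\tag{10}
$$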

where $e \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UE; PEN, NEN, RMS, and ED denote their corresponding discrete values, respectively; $P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})$ denotes the probability that $\mathrm{UE} = e$ and PEN, NEN, RMS, and ED are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UE} = e)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UE} = e$. Then we classify the user's emotion into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (10).


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of the user's relationship (denoted as UR) conforms to the Naïve Bayes model as well, since UR is determined by the user's profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF). Hence, UR can be treated as the root node of the Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, we discretize them into intervals before modeling UR in order to calculate their conditional probabilities. DSM and DIF are each mapped to three levels (namely, low, medium, and high). So the probability that the user's relationship equals a certain discrete value is calculated with

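$$
\begin{aligned}
P(\mathrm{UR} = r \mid \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF}) &= \frac{P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})} \\
&= \frac{P(\mathrm{UR} = r)\, P(\mathrm{UP} \mid \mathrm{UR} = r)\, P(\mathrm{DSM} \mid \mathrm{UR} = r)\, P(\mathrm{DIF} \mid \mathrm{UR} = r)}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})}
\end{aligned}
\tag{11}
$$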

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})$ denotes the probability that $\mathrm{UR} = r$ and UP, DSM, and DIF are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UR} = r)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UR} = r$. Then UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of the user's retweeting sentiment tendency (denoted as ST), the user's profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze the user's retweeting sentiment tendency. In this paper, the user's retweeting sentiment tendency, which is determined by the user's relationship and emotion, can be regarded as the root node of the Naïve Bayes model, and the user's relationship and emotion can be regarded as its leaf nodes. Thus, the user's retweeting sentiment tendency is calculated as follows:

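$$
\begin{aligned}
P(\mathrm{ST} = \mathrm{st} \mid \mathrm{UP}, \mathrm{UR}, \mathrm{UE}) &= \frac{P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})} \\
&= \frac{P(\mathrm{ST} = \mathrm{st})\, P(\mathrm{UP} \mid \mathrm{ST} = \mathrm{st})\, P(\mathrm{UR} \mid \mathrm{ST} = \mathrm{st})\, P(\mathrm{UE} \mid \mathrm{ST} = \mathrm{st})}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})}
\end{aligned}
\tag{12}
$$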

where $\mathrm{st} \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})$ denotes the probability that $\mathrm{ST} = \mathrm{st}$ and UP, UR, and UE are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{ST} = \mathrm{st})$ denotes the probability that a feature equals a certain discrete value when $\mathrm{ST} = \mathrm{st}$. Thus, ST can be classified into three emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).
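The layered use of (9) to (12) can be sketched in Python as below. This is only an illustrative reading of the framework, not the authors' implementation: the class `CategoricalNB`, the function `predict_tendency`, the add-one smoothing, and the premise that training labels for UP, UE, and UR are available are all assumptions, and the toy training rows are invented. The top layer takes UP, UR, and UE as features, following (12). Because the denominators in (9) to (12) do not depend on the class, the sketch compares numerators only.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Naive Bayes over discrete features with add-one smoothing (assumption)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # counts[j][c][v]: training rows of class c whose j-th feature equals v
        self.counts = [defaultdict(Counter) for _ in range(n_features)]
        self.values = [set() for _ in range(n_features)]
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[j][c][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())

        def log_score(c):
            # log prior plus the sum of log conditional probabilities
            score = math.log(self.class_counts[c] / total)
            for j, v in enumerate(row):
                numerator = self.counts[j][c][v] + 1
                denominator = self.class_counts[c] + len(self.values[j])
                score += math.log(numerator / denominator)
            return score

        return max(self.classes, key=log_score)


def predict_tendency(models, profile_row, emotion_row, dsm, dif):
    """Chain the bottom, middle, and top layers for one retweet."""
    up = models["UP"].predict(profile_row)      # bottom layer
    ue = models["UE"].predict(emotion_row)      # bottom layer
    ur = models["UR"].predict((up, dsm, dif))   # middle layer
    return models["ST"].predict((up, ur, ue))   # top layer


# Invented toy training data (two features per bottom-layer model for brevity).
models = {
    "UP": CategoricalNB().fit([("many", "many"), ("few", "few")], ["good", "bad"]),
    "UE": CategoricalNB().fit([("many", "few"), ("few", "many")], ["high", "low"]),
    "UR": CategoricalNB().fit(
        [("good", "high", "high"), ("bad", "low", "low")], ["high", "low"]
    ),
    "ST": CategoricalNB().fit(
        [("good", "high", "high"), ("bad", "low", "low")], ["positive", "negative"]
    ),
}

print(predict_tendency(models, ("many", "many"), ("many", "few"), "high", "high"))
```

Feeding each layer's predicted label forward as a discrete feature is the simplest way to nest the models; passing the full posterior forward would be an alternative design that the source does not discuss.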

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users' tweets, retweets, and numbers of followings from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of the Sina microblogging data, namely, the retweeting contents that carry sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are as follows: we randomly divide the dataset into two parts, $A$ and $B$. $A$ contains 90% of the retweets and is used for training. The remaining 10% of the retweets, denoted as $B$, is designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance in terms of precision, recall, and $F_1$-measure.
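For reference, the per-class precision, recall, and $F_1$-measure used throughout the evaluation can be computed as in the following Python sketch. The function name and the convention of returning 0 for an undefined ratio are assumptions; the label strings are placeholders.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and predictions for four retweets.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
for label in ("positive", "negative", "neutral"):
    print(label, per_class_metrics(gold, pred, label))
```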

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user's profile-based features only;

(ii) rMLNBRST: considering user's relationship-based features only;

(iii) eMLNBRST: considering user's emotion-based features only;

(iv) aMLNBRST: considering all features.

Table 1: Statistics of the dataset.

Features      Statistics
Users         296134
Followings    51415017
Tweets        176659
Retweets      264743

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Each panel plots precision, recall, and $F_1$-measure for the positive, negative, and neutral classes. The data labels printed in each panel, one (precision, recall, $F_1$-measure) triple per class, are as follows. (a) pMLNBRST: (0.336, 0.401, 0.366), (0.427, 0.414, 0.42), (0.358, 0.39, 0.373). (b) rMLNBRST: (0.645, 0.643, 0.643), (0.664, 0.604, 0.632), (0.622, 0.642, 0.631). (c) eMLNBRST: (0.734, 0.733, 0.733), (0.742, 0.754, 0.747), (0.676, 0.689, 0.682). (d) aMLNBRST: (0.751, 0.763, 0.756), (0.787, 0.775, 0.78), (0.712, 0.731, 0.721).

The precision, recall, and $F_1$-measure of the different feature-based methods are shown in Figure 2.

We draw the following observations. Removing either the user's relationship-based or emotion-based features noticeably lowers the model's prediction ability. Additionally, emotion-based features contribute more to analyzing the user's retweeting sentiment polarity than profile- and relationship-based features, since the many positive or negative emotional words in retweeting contents provide a good basis for accurate emotion prediction. Besides, the system improves its performance when profile-, relationship-, and content-based features are merged, from which we conclude that it is difficult to predict the user's retweeting emotion tendency with only specific types of features and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy (denoted as ME) models are the machine learning techniques most commonly used in sentiment classification [26], on the basis of our proposed features, we compare the proposed framework MLNBRST with SVM, NB, and ME, as well as with NBSVM used in [16], which is an improved SVM classifier, and the Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and of our best one are depicted in Figure 3.

From Figure 3, it can be found that the proposed approach shows the best results in terms of precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it relies strongly on the corpus. Since the support vector machine is only applicable to a small-size training dataset, Naïve Bayes is more suitable than the support vector machine for sentiment classification on microblogging, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of the retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user's sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

Figure 3: Precisions, recalls, and $F_1$-measures of different methods (SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST); the vertical axis ranges from 0.5 to 1.

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that all improvement is significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over the representative feature-based methods and baseline methods, which answers the first question.

5.3. Analysis of Different Factors' Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact of the proposed retweeting sentiment features and accordingly answer the second question. Because of space limitations, Figures 4(a), 4(b), 4(c), and 4(d) illustrate only the probability distributions of the values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), the larger the dynamic Salton metrics between two users, the easier it is for them to retweet casually, which may lead to either positive or negative emotion being included in the retweeting contents. In Figure 4(b), users who interact frequently are less likely to retweet each other with a neutral emotion tendency; instead, they are more likely to express support for or opposition to each other. In Figure 4(c), users have a higher probability of retweeting with a positive emotion tendency if they have recently been in high spirits, and vice versa. Figure 4(d) shows strong evidence for the impact of emotion divergence on the user's retweeting emotion tendency. If the absolute values of emotion divergence are small, the majority of users retweet with neutral emotion. The ratio of retweeting with negative emotion is higher if the emotion divergence between the emotion of the microblog and the emotion expressed in the user's recent states is negative. If the values of emotion divergence are positive, users are more likely to retweet with positive emotion, which may be determined by the users' recent mood statistics.

Table 2: Information gains of features.

Types of features   Features                               Information gain
Profile-based       Number of bifollowers                  0.254
                    Number of followers                    0.231
                    Number of followees                    0.257
                    Number of posts                        0.201
                    Province                               0.105
                    City                                   0.110
                    Gender                                 0.148
                    Created time of user's account         0.092
                    Verified type of user's account        0.102
Relation-based      Dynamic Salton metrics                 0.485
                    Dynamic interaction frequency          0.503
Emotion-based       Number of positive emotional words     0.647
                    Number of negative emotional words     0.622
                    Recent mood statistics                 0.694
                    Emotion divergence                     0.723

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to the user's retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$
\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v) \tag{13}
$$

where $e$ denotes a certain emotion state, which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$; $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$; $p(e)$ stands for the probability that emotion state $e$ appears in the dataset; $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset; and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of the features are shown in Table 2.

As described in Table 2, the information gain of the user's profile-based features is lower than that of the relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, being derived from the counts of positive and negative emotional words, recent mood statistics and emotion divergence have larger information gains than the positive or negative emotional word counts.

Figure 4: The probability distribution of different retweeting sentiment tendency features, with one curve each for the positive, negative, and neutral tendencies: (a) dynamic Salton metrics; (b) dynamic interaction frequency; (c) recent mood statistics; (d) emotion divergence.
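Equation (13) is the entropy of the emotion labels minus their conditional entropy given the discretized feature. A minimal Python sketch follows; the function names and the base-2 logarithm are assumptions, since the paper does not state the base of the logarithm.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of labels (base 2 is an assumption)."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """IG of one discretized feature with respect to the emotion labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy: sum over v of p(v) * H(E | v)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Invented example: one informative and one uninformative feature.
emotions = ["positive", "positive", "negative", "negative"]
print(information_gain(["high", "high", "low", "low"], emotions))
print(information_gain(["high", "low", "high", "low"], emotions))
```

A feature that separates the classes perfectly attains the full label entropy, while a feature that is independent of the labels has zero gain.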

From the above, it can be found that the proposed factors can be used as good indicators of the user's retweeting sentiment tendency. However, although social relationships play an important role in individual emotion, which is in keeping with Fowler and Christakis's work [27], relationship-based features can only distinguish between neutral and other sentiment tendencies, while emotion-based features have good discriminability only on positive and negative sentiment polarity. Hence, comprehensive consideration of the context information of the retweeting content is necessary.

5.3.2. Impact of Temporal Information in MLNBRST. To answer the second question, we also investigate how temporal information affects the performance of our method in terms of $F_1$-measure by changing the time slice weight factor $\alpha$. In this paper, $\alpha$ is varied over $\{0.01, 0.1, 0.5, 0.7, 1\}$, and we carry out 10-fold cross-validation with 50%, 60%, 80%, and 100% of $A$ for training so as to avoid bias brought by the size of the training data. The results are shown in Figure 5, where "50," "60," "80," and "100" denote that we leverage 50%, 60%, 80%, and 100% of $A$ for training.

Figure 5: The impact of temporal information in the proposed framework MLNBRST. Axes: $\alpha$ (0.01, 0.1, 0.5, 0.7, and 1), training data in % (50, 60, 80, and 100), and $F_1$-measure (0.4 to 0.8).

It can be observed from Figure 5 that, when $\alpha$ is set to 1, namely, without considering temporal information, the $F_1$-measure is much lower than the peak performance; as $\alpha$ increases, the $F_1$-measure first increases greatly and then degrades rapidly after reaching a peak value.
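To see why $\alpha = 1$ amounts to ignoring temporal information, it helps to list the slice weights $\alpha^{n-i}$ directly. The short Python sketch below does so for the values of $\alpha$ used in the experiment; the function name `slice_weights` and the choice of four time slices are hypothetical.

```python
def slice_weights(n, alpha):
    """Weights alpha**(n - i) for time slices i = 0..n (oldest first)."""
    return [alpha ** (n - i) for i in range(n + 1)]


# Four time slices (n = 3) under each alpha from the experiment.
for alpha in (0.01, 0.1, 0.5, 0.7, 1):
    print(alpha, slice_weights(3, alpha))
```

With $\alpha = 1$ every slice receives weight 1, so old and recent days count equally, whereas a very small $\alpha$ leaves almost all of the weight on the most recent day.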

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.


6. Conclusion

In this paper, we explored the problem of finding the possible variations in, and analyzing, the user's retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users' dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on the time series of users' posts. Then, on the basis of Naïve Bayes theory, we built models in the lower layers from the profile-, relationship-, and emotion-based dimensions, respectively, and designed a multilayer Naïve Bayes model on top of the constructed models of different dimensions to analyze the user's retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and $F_1$-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method so as to improve its performance and broaden the scope of its online application. Furthermore, we will investigate which directions can be taken to improve its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, "A comparison of online social networks and real-life social networks: a study of Sina Microblogging," Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, "Sentiment analysis using common-sense and context information," Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, "Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, "Happiness is assortative in online social networks," Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, "Predicting elections with Twitter: what 140 characters reveal about political sentiment," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM '10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, "Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope," Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, "Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures," Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, "Emotions and information diffusion in social media - sentiment of microblogs and sharing behavior," Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, "Enhanced sentiment learning using twitter hash-tags and smileys," in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J C Zhao L Dong J J Wu and K Xu ldquoMoodLens anemoticon-based sentiment analysis system for Chinese tweetsrdquoin Proceedings of the 18th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo12) pp1528ndash1531 August 2012

[12] D Ramage S Dumais and D Liebling ldquoCharacterizingmicroblogs with topicmodelsrdquo in Proceedings of the 4th Interna-tional AAAI Conference on Weblogs and Social Media (ICWSMrsquo10) pp 130ndash137 May 2010

[13] M Ghiassi J Skinner andD Zimbra ldquoTwitter brand sentimentanalysis a hybrid system using n-gram analysis and dynamicartificial neural networkrdquo Expert Systems with Applications vol40 no 16 pp 6266ndash6282 2013

[14] A Montejo-Raez E Martınez-Camara M T Martın-Valdiviaand L A Urena-Lopez ldquoA knowledge-based approach forpolarity classification in Twitterrdquo Journal of the Association forInformation Science and Technology vol 65 no 2 pp 414ndash4252014

[15] A Montejo-Raez E Martınez-Camara M T Martın-Valdiviaand L A Urena-Lopez ldquoRanked word net graph for sentimentpolarity classification in TwitterrdquoComputer Speech amp Languagevol 28 no 1 pp 93ndash107 2014

[16] H Wang D Can A Kazemzadeh F Bar and S NarayananldquoA system for real-time Twitter sentiment analysis of 2012 USpresidential election cyclerdquo in Proceedings of the 50th AnnualMeeting of the Association for Computational Linguistics SystemDemonstrations pp 115ndash120 2012

[17] P Korenek and M Simko ldquoSentiment analysis on microblogutilizing appraisal theoryrdquo World Wide Web vol 17 no 4 pp847ndash867 2014

[18] W Li and H Xu ldquoText-based emotion classification usingemotion cause extractionrdquo Expert Systems with Applicationsvol 41 no 4 pp 1742ndash1749 2014

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, “Dynamic evolution of collective emotions in social networks: a case study of Sina weibo,” Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, “Adaptive recursive neural network for target-dependent Twitter sentiment classification,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, “Social influence locality for modeling retweeting behaviors,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1397–1405, August 2011.

[23] M. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, “Understanding social relationship evolution by using real-world sensing data,” World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, “Classifying sentiment in microblogs: is brevity an advantage?” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study,” British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.


(Shannon entropy and the salience degree of each word) were introduced to improve the performance of the classifier, with which the proposed method can classify any Chinese tweet into a particular emotion with satisfactory accuracy. In order to achieve target-dependent Twitter sentiment classification, Dong et al. [20] proposed the Adaptive Recursive Neural Network (AdaRNN), which employs more than one composition function. For a given tweet, its dependency tree was first converted for the target of interest. Next, AdaRNN learned how to adaptively propagate the sentiments of words to the target node based on context and linguistic tags. The experimental studies illustrated that AdaRNN improved on the baseline methods.

To sum up, retweeting sentiment tendency analysis in dynamic social networks is still at the development stage: how to depict the directivity of relationships, how to fuse multidimensional features reasonably, and how to build a model that can adapt to the dynamic evolution of the network remain challenging problems. To this end, we present a multilayer Naïve Bayes model for analyzing user’s retweeting sentiment tendency, which is appropriate for dynamic and directed social networks.

3. Modeling Retweeting Sentiment Tendency Features

Since a comprehensive analysis of user’s retweeting sentiment tendency cannot be made on the basis of one specific type of feature only, in this paper we synthesize profile-, relationship-, and emotion-based features so as to achieve higher accuracy in retweeting sentiment tendency analysis.

3.1. Profile-Based Features. In this paper, according to the dataset in [21], we take the number of bifollowers, the number of followers, the number of followees, the number of contents that the user posts, the user’s province, the user’s city, the user’s gender, the created time of the user’s account, and the verified type of the user’s account, all of which are provided in the original dataset, as profile-based features.

3.2. Relationship-Based Features. Tan et al. [22] pointed out that users who have a close relationship may have similar emotional points of view, so the relationship between users can be used to extract users’ emotions. In this paper, we infer relationship-based features via user’s network topological structure and interaction frequency.

3.2.1. Dynamic Salton Metrics. Common Neighbors metrics assume that the similarity between users is proportional to the number of their common neighbors [23]. Compared with Common Neighbors metrics, Salton metrics additionally introduce users’ degrees to measure user’s network topological structure. Since nonreciprocal friendships, which may reflect moderately valued friendship ties [24], are more important than reciprocal friendships, we improve the traditional Salton metrics according to the directivity of links. Besides, understanding the dynamic structure of online social networks plays an important role in the development of retweeting sentiment tendency analysis algorithms. Thus, given the dynamic nature of social networks, we consider the network as a dynamic flow of time slices (one time slice stands for one day; the older a time slice is, the lower its importance and hence its weight), and the dynamic Salton metrics between user $u$ and user $v$ on the $i$th time slice $t_i$ are defined as follows:

$$
\mathrm{Sa}(u, v, t_i) = \frac{\left|\Gamma^{\mathrm{in}}(u, t_i) \cap \Gamma^{\mathrm{in}}(v, t_i)\right| \big/ \sqrt{\left|\Gamma^{\mathrm{in}}(u, t_i)\right| \left|\Gamma^{\mathrm{in}}(v, t_i)\right|}}{\left|\Gamma^{\mathrm{out}}(u, t_i) \cap \Gamma^{\mathrm{out}}(v, t_i)\right| \big/ \sqrt{\left|\Gamma^{\mathrm{out}}(u, t_i)\right| \left|\Gamma^{\mathrm{out}}(v, t_i)\right|}}, \tag{1}
$$

where $\Gamma^{\mathrm{in}}(u, t_i)$ and $\Gamma^{\mathrm{in}}(v, t_i)$ stand for the in-link user sets of user $u$ and user $v$ on $t_i$, respectively; $\Gamma^{\mathrm{out}}(u, t_i)$ and $\Gamma^{\mathrm{out}}(v, t_i)$ stand for the out-link user sets of user $u$ and user $v$ on $t_i$, respectively, where in-link and out-link are defined by the follower relationship; and $|\Gamma(x)|$ stands for the number of elements in set $\Gamma(x)$. Thus, the dynamic Salton metrics between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ are calculated as

$$
\mathrm{Sa}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Sa}(u, v, t_i), \tag{2}
$$

where $\alpha \in [0, 1]$, $\alpha^{n-i}$ represents the weight of $t_i$, and $n$ represents the number of time slices.
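The time-decayed aggregation in (2) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy follower sets, and the convention that an empty neighbor set yields zero similarity are all hypothetical. Only the in-link Salton term is computed here; combining the in-link and out-link terms follows (1) and is not reproduced.

```python
from math import sqrt


def salton_index(neighbors_u, neighbors_v):
    """Salton (cosine) index of two neighbor sets: |A & B| / sqrt(|A| * |B|)."""
    if not neighbors_u or not neighbors_v:
        return 0.0  # assumption: an empty neighbor set contributes no similarity
    common = len(neighbors_u & neighbors_v)
    return common / sqrt(len(neighbors_u) * len(neighbors_v))


def time_decayed_sum(per_slice_values, alpha):
    """Weighted sum over slices t_0..t_n with weight alpha ** (n - i), as in (2)."""
    n = len(per_slice_values) - 1
    return sum(alpha ** (n - i) * value for i, value in enumerate(per_slice_values))


# Hypothetical in-link sets of users u and v on three consecutive days.
in_u = [{"a", "b"}, {"a", "b", "c"}, {"a", "b", "c", "d"}]
in_v = [{"b", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]

per_day = [salton_index(a, b) for a, b in zip(in_u, in_v)]
decayed = time_decayed_sum(per_day, alpha=0.5)
```

With $\alpha = 0.5$ the most recent day keeps weight 1, the day before 0.5, and the oldest day 0.25, so recent structure dominates the aggregate.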

3.2.2. Dynamic Interaction Frequency. Similar to dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, to stand for the weight of the $i$th time slice $t_i$, and the dynamic interaction frequency between user $u$ and user $v$ on $t_i$ is defined as follows:

$$
\mathrm{Fre}(u, v, t_i) = \frac{c(u, v, t_i)}{p(u, t_i) + p(v, t_i)}, \tag{3}
$$

where $p(u, t_i)$ and $p(v, t_i)$ stand for the number of posts of user $u$ and user $v$ on $t_i$, respectively, and $c(u, v, t_i)$ stands for the number of interactions between user $u$ and user $v$ on $t_i$, where an interaction is defined as user $u$ retweeting a microblog of user $v$ or user $v$ retweeting a microblog of user $u$. Thus, the dynamic interaction frequency between user $u$ and user $v$ on the flow of time slices $[0, t_n]$ is calculated as

$$
\mathrm{Fre}_{[0, t_n]}(u, v) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Fre}(u, v, t_i). \tag{4}
$$
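As a small worked example of (3) and (4), the following Python sketch computes the per-day interaction frequency and its decayed sum. The function names and the daily counts are hypothetical, and returning zero for a day on which neither user posts is an assumption, since the text does not say how that case is handled.

```python
def interaction_frequency(retweets_between, posts_u, posts_v):
    """Per-slice frequency from (3): c(u, v, t_i) / (p(u, t_i) + p(v, t_i))."""
    total_posts = posts_u + posts_v
    if total_posts == 0:
        return 0.0  # assumption: no posts on that day means no interaction
    return retweets_between / total_posts


def dynamic_interaction_frequency(daily_counts, alpha):
    """Sum from (4); daily_counts holds (c, p_u, p_v) for slices t_0..t_n."""
    n = len(daily_counts) - 1
    return sum(
        alpha ** (n - i) * interaction_frequency(c, p_u, p_v)
        for i, (c, p_u, p_v) in enumerate(daily_counts)
    )


# Hypothetical counts for three days: (mutual retweets, posts by u, posts by v).
counts = [(1, 2, 2), (0, 0, 0), (3, 4, 2)]
fre = dynamic_interaction_frequency(counts, alpha=0.5)
```

Here the three days contribute $0.25 \times 0.25$, $0.5 \times 0$, and $1 \times 0.5$, respectively.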

3.3. Emotion-Based Features. Since people tend to post microblogs that express their experiences or views in order to satisfy their desire for self-expression, there will be quite a lot of emotional words or expressions in their contents. Thus, along with the number of emotional words, we employ recent mood statistics and emotion divergence as emotion-based features.


3.3.1. The Number of Emotional Words. In this paper, we count the number of positive and negative emotional words in user’s retweeting contents with the corpus of HowNet Knowledge (http://www.keenage.com/download/sentiment.rar). HowNet Knowledge, which includes 8945 words and phrases, consists of six files: positive emotional words list file, negative emotional words list file, positive review words list file, negative review words list file, degree words list file, and propositional words list file.

3.3.2. Recent Mood Statistics. To enhance the performance of retweeting sentiment analysis, we further analyze user’s contents according to the number of positive and negative emotional words. Similar to dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, to stand for the weight of the $i$th time slice $t_i$, and user $u$’s mood statistics on $t_i$ are calculated as follows:

$$
\mathrm{Ue}(u, t_i) = \frac{\mathrm{Upn}(u, t_i)}{\mathrm{Upn}(u, t_i) + \mathrm{Unn}(u, t_i)}, \tag{5}
$$

where $\mathrm{Upn}(u, t_i)$ and $\mathrm{Unn}(u, t_i)$ represent the number of positive emotional words and the number of negative emotional words, respectively, that user $u$ used on $t_i$ and that are included in HowNet Knowledge. Thus, user $u$’s recent mood statistics on the flow of time slices $[0, t_n]$ are calculated with

$$
\mathrm{Ue}_{[0, t_n]}(u) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Ue}(u, t_i). \tag{6}
$$

3.3.3. Emotion Divergence. In this paper, we quantitatively define the emotion divergence between user $u$ and microblog $b$ as follows:

$$
\mathrm{Em}(u, b, t_n) = \mathrm{Ue}_{[0, t_n]}(u) - \mathrm{Ie}(b), \tag{7}
$$

where $\mathrm{Ue}_{[0, t_n]}(u)$ represents user $u$’s recent mood statistics on the flow of time slices $[0, t_n]$ and $\mathrm{Ie}(b)$ represents the emotion statistics expressed in microblog $b$, which can be calculated as

$$
\mathrm{Ie}(b) = \frac{\mathrm{Ipn}(b)}{\mathrm{Ipn}(b) + \mathrm{Inn}(b)}, \tag{8}
$$

where $\mathrm{Ipn}(b)$ and $\mathrm{Inn}(b)$ denote the number of positive emotional words and the number of negative emotional words, respectively, that are used in microblog $b$ and included in HowNet Knowledge.
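The emotion-based quantities in (5) to (8) can be sketched in a few lines of Python. This is a hedged illustration only: the function names and the word counts are hypothetical, and the `default` value returned when a text contains no emotional words at all is an assumption, because the definitions above leave that case open.

```python
def positive_ratio(positive_count, negative_count, default=0.5):
    """Share of positive emotional words, the form shared by (5) and (8)."""
    total = positive_count + negative_count
    if total == 0:
        return default  # assumption: treat text without emotional words as balanced
    return positive_count / total


def recent_mood(daily_word_counts, alpha):
    """Recent mood statistics from (6); counts are (positive, negative) per day."""
    n = len(daily_word_counts) - 1
    return sum(
        alpha ** (n - i) * positive_ratio(pos, neg)
        for i, (pos, neg) in enumerate(daily_word_counts)
    )


def emotion_divergence(daily_word_counts, microblog_counts, alpha):
    """Emotion divergence from (7): user's recent mood minus microblog emotion."""
    return recent_mood(daily_word_counts, alpha) - positive_ratio(*microblog_counts)


# Hypothetical word counts for a user over two days and for one microblog.
user_days = [(1, 3), (3, 1)]
microblog = (1, 1)
em = emotion_divergence(user_days, microblog, alpha=0.5)
```

In this toy case the user's recent mood is $0.5 \times 0.25 + 1 \times 0.75 = 0.875$ and the microblog's emotion statistic is 0.5, so the divergence is positive.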

4. Multilayer Naïve Bayes Model for Analyzing User’s Retweeting Sentiment Tendency

Some researchers found that Naïve Bayes was more suitable for sentiment classification on microblogging [25]. On the basis of Bayes’ theorem, the Naïve Bayes model represents uncertainty with probability and realizes the process of learning and reasoning via probability. Hence, in this paper, we put forward a multilayer Naïve Bayes model, which is a variant of the Naïve Bayes model, to analyze user’s retweeting sentiment tendency at a fine granularity.

We formally define retweeting sentiment tendency analysis as follows: given a group of retweets with related discrete feature vectors and corresponding retweeting sentiment tendency labels, we aim to leverage prior knowledge to automatically assign retweeting sentiment tendency labels to unknown retweets. Our proposed method consists of three modules: (1) Naïve Bayes models in the bottom layer; (2) a Naïve Bayes model in the middle layer; (3) a Naïve Bayes model in the top layer. The process is outlined in Figure 1, where the bottom layer consists of Naïve Bayes models for predicting user’s profile and emotion, the middle layer is a Naïve Bayes model for predicting user’s relationship, and, finally, in the top layer, user’s retweeting sentiment tendency is predicted through the multilayer nested Naïve Bayes model. The detailed descriptions are as follows.

4.1. Naïve Bayes Models in Bottom Layer

4.1.1. Profile-Based Naïve Bayes Model. The Naïve Bayes model, which is based on Bayes’ theorem, reduces computational overhead via the conditional independence assumption and classifies unknown samples according to their maximum a posteriori probability. User’s profile (denoted as UP) is determined by the number of bifollowers (denoted as BI), the number of followers (denoted as FO), the number of followees (denoted as FE), the number of contents that the user posts (denoted as CN), the user’s province (denoted as PR), the user’s city (denoted as CI), the user’s gender (denoted as GE), the created time of the user’s account (denoted as CT), and the verified type of the user’s account (denoted as VT). Its calculation therefore conforms to the Naïve Bayes model: UP can be seen as the root node of the Naïve Bayes model, and BI, FO, FE, CN, PR, CI, GE, CT, and VT can be seen as its leaf nodes.

Given that BI, FO, FE, CN, and CT are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UP. According to the values of the above features, BI is mapped to three levels (namely, few, medium, and many), FO is mapped to three levels (namely, few, medium, and many), FE is mapped to three levels (namely, few, medium, and many), CN is mapped to three levels (namely, few, medium, and many), and CT is mapped to three levels (namely, short, medium, and long). Consequently, the probability that user’s profile equals a certain discrete value is calculated as follows:

$$
\begin{aligned}
&P(\mathrm{UP} = p \mid \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT}) \\
&\quad = \frac{P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})}{P(\mathrm{BI}) P(\mathrm{FO}) P(\mathrm{FE}) P(\mathrm{CN}) P(\mathrm{PR}) P(\mathrm{CI}) P(\mathrm{GE}) P(\mathrm{CT}) P(\mathrm{VT})} \\
&\quad = \frac{P(\mathrm{UP} = p) P(\mathrm{BI} \mid \mathrm{UP} = p) P(\mathrm{FO} \mid \mathrm{UP} = p) P(\mathrm{FE} \mid \mathrm{UP} = p) P(\mathrm{CN} \mid \mathrm{UP} = p) P(\mathrm{PR} \mid \mathrm{UP} = p) P(\mathrm{CI} \mid \mathrm{UP} = p) P(\mathrm{GE} \mid \mathrm{UP} = p) P(\mathrm{CT} \mid \mathrm{UP} = p) P(\mathrm{VT} \mid \mathrm{UP} = p)}{P(\mathrm{BI}) P(\mathrm{FO}) P(\mathrm{FE}) P(\mathrm{CN}) P(\mathrm{PR}) P(\mathrm{CI}) P(\mathrm{GE}) P(\mathrm{CT}) P(\mathrm{VT})},
\end{aligned}
\tag{9}
$$


Figure 1: The framework of MLNBRST. Bottom layer: the profile-based Naïve Bayes model (root UP; leaves BI, FO, FE, CN, PR, CI, GE, CT, and VT) and the emotion-based Naïve Bayes model (root UE; leaves PEN, NEN, RMS, and ED). Middle layer: the relationship-based Naïve Bayes model (root UR; leaves UP, DSM, and DIF). Top layer: the retweeting sentiment tendency Naïve Bayes model (root ST; leaves UE and UR).

where $p \in \{\text{bad}, \text{medium}, \text{good}\}$ denotes a certain discrete value of UP; BI, FO, FE, CN, PR, CI, GE, CT, and VT denote their corresponding discrete values, respectively; $P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})$ denotes the probability that $\mathrm{UP} = p$ and BI, FO, FE, CN, PR, CI, GE, CT, and VT are equal to their corresponding discrete values; $P(\cdot)$ denotes the probability that a feature equals a certain discrete value; and $P(\cdot \mid \mathrm{UP} = p)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UP} = p$. Then, user’s profile can be classified into three levels (namely, bad, medium, and good) according to its maximum a posteriori probability, which is calculated with (9).
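To make the discretization step and the maximum a posteriori rule in (9) concrete, the sketch below implements a small categorical Naïve Bayes classifier in Python. It is an illustration under assumptions rather than the paper's code: the class and function names, the cut points, the toy training rows, and the add-one smoothing are all hypothetical. Because the denominator in (9) does not depend on $p$, the sketch normalizes the class scores instead of dividing by the feature marginals, which leaves the selected class unchanged.

```python
from collections import Counter, defaultdict


def to_level(value, low_cut, high_cut, levels=("few", "medium", "many")):
    """Map a continuous value to one of three levels (cut points are assumptions)."""
    if value < low_cut:
        return levels[0]
    if value < high_cut:
        return levels[1]
    return levels[2]


class DiscreteNaiveBayes:
    """Categorical Naive Bayes with add-one smoothing (smoothing is an assumption)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[feature_index][label][value] -> number of training rows
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[j][label][value] += 1
                self.values[j].add(value)
        return self

    def posterior(self, row):
        """Normalized P(class | row) under the conditional independence assumption."""
        scores = {}
        for c in self.classes:
            score = self.class_counts[c] / self.total
            for j, value in enumerate(row):
                count = self.value_counts[j][c][value]
                score *= (count + 1) / (self.class_counts[c] + len(self.values[j]))
            scores[c] = score
        norm = sum(scores.values())
        return {c: s / norm for c, s in scores.items()}

    def predict(self, row):
        post = self.posterior(row)
        return max(post, key=post.get)


# Hypothetical training rows: (BI level, FO level, gender) -> profile level.
rows = [("few", "few", "m"), ("few", "medium", "f"),
        ("many", "many", "m"), ("many", "medium", "f")]
labels = ["bad", "bad", "good", "good"]
model = DiscreteNaiveBayes().fit(rows, labels)
pred = model.predict(("many", "many", "f"))
```

The same class can be reused for the emotion-based and relationship-based models by changing the feature tuples and labels.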

4.1.2. Emotion-Based Naïve Bayes Model. In the same way, the calculation of user’s emotion (denoted as UE) also conforms to the Naïve Bayes model: UE is determined by the number of positive emotional words (denoted as PEN), the number of negative emotional words (denoted as NEN), recent mood statistics (denoted as RMS), and emotion divergence (denoted as ED). As a consequence, UE can be viewed as the root node of the Naïve Bayes model, and PEN, NEN, RMS, and ED can be viewed as its leaf nodes.

Before modeling UE, given that PEN, NEN, RMS, and ED are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals. PEN is mapped to three levels (namely, few, medium, and many), NEN is mapped to three levels (namely, few, medium, and many), RMS is mapped to three levels (namely, low, medium, and high), and ED is mapped to three levels (namely, small, medium, and large). The probability that user’s emotion is equal to a certain discrete value is calculated as follows:

$$
\begin{aligned}
P(\mathrm{UE} = e \mid \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED}) &= \frac{P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})}{P(\mathrm{PEN}) P(\mathrm{NEN}) P(\mathrm{RMS}) P(\mathrm{ED})} \\
&= \frac{P(\mathrm{UE} = e) P(\mathrm{PEN} \mid \mathrm{UE} = e) P(\mathrm{NEN} \mid \mathrm{UE} = e) P(\mathrm{RMS} \mid \mathrm{UE} = e) P(\mathrm{ED} \mid \mathrm{UE} = e)}{P(\mathrm{PEN}) P(\mathrm{NEN}) P(\mathrm{RMS}) P(\mathrm{ED})},
\end{aligned}
\tag{10}
$$

where $e \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UE; PEN, NEN, RMS, and ED denote their corresponding discrete values, respectively; $P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})$ denotes the probability that $\mathrm{UE} = e$ and PEN, NEN, RMS, and ED are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UE} = e)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UE} = e$. Then, we classify user’s emotion into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (10).


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of user’s relationship (denoted as UR) is in accordance with the Naïve Bayes model as well: UR is determined by user’s profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF). Hence, UR can be treated as the root node of the Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UR. DSM is mapped to three levels (namely, low, medium, and high), and DIF is mapped to three levels (namely, low, medium, and high). So the probability that user’s relationship is equal to a certain discrete value is calculated with

$$
\begin{aligned}
P(\mathrm{UR} = r \mid \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF}) &= \frac{P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})}{P(\mathrm{UP}) P(\mathrm{DSM}) P(\mathrm{DIF})} \\
&= \frac{P(\mathrm{UR} = r) P(\mathrm{UP} \mid \mathrm{UR} = r) P(\mathrm{DSM} \mid \mathrm{UR} = r) P(\mathrm{DIF} \mid \mathrm{UR} = r)}{P(\mathrm{UP}) P(\mathrm{DSM}) P(\mathrm{DIF})},
\end{aligned}
\tag{11}
$$

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})$ denotes the probability that $\mathrm{UR} = r$ and UP, DSM, and DIF are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UR} = r)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UR} = r$. Then, UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of user’s retweeting sentiment tendency (denoted as ST), user’s profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze user’s retweeting sentiment tendency. In this paper, user’s retweeting sentiment tendency is determined by user’s relationship and emotion, so it can be regarded as the root node of the Naïve Bayes model, and user’s relationship and emotion can be regarded as its leaf nodes. Thus, user’s retweeting sentiment tendency is calculated as follows:

$$
\begin{aligned}
P(\mathrm{ST} = st \mid \mathrm{UP}, \mathrm{UR}, \mathrm{UE}) &= \frac{P(\mathrm{ST} = st, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})}{P(\mathrm{UP}) P(\mathrm{UR}) P(\mathrm{UE})} \\
&= \frac{P(\mathrm{ST} = st) P(\mathrm{UP} \mid \mathrm{ST} = st) P(\mathrm{UR} \mid \mathrm{ST} = st) P(\mathrm{UE} \mid \mathrm{ST} = st)}{P(\mathrm{UP}) P(\mathrm{UR}) P(\mathrm{UE})},
\end{aligned}
\tag{12}
$$

where $st \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST} = st, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})$ denotes the probability that $\mathrm{ST} = st$ and UP, UR, and UE are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{ST} = st)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{ST} = st$. Thus, ST can be classified into three particular emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).
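The layered structure of (9) to (12) can be sketched as a chain of small Naïve Bayes predictors, where the predicted value of one layer becomes a leaf of the next. The Python sketch below is a hedged illustration, not the authors' implementation: all function names and the toy samples are hypothetical, add-one smoothing is an assumption, and, in particular, it assumes that labels for the intermediate nodes UP, UE, and UR are available at training time, which the text does not specify.

```python
from collections import Counter


def train_nb(rows, labels):
    """Return a predictor for a categorical Naive Bayes model (add-one smoothing)."""
    class_counts = Counter(labels)
    counts = Counter()  # (feature index, label, value) -> frequency
    values = {}  # feature index -> set of observed values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            counts[(j, label, value)] += 1
            values.setdefault(j, set()).add(value)

    def predict(row):
        def score(label):
            s = class_counts[label] / len(labels)
            for j, value in enumerate(row):
                s *= (counts[(j, label, value)] + 1) / (class_counts[label] + len(values[j]))
            return s
        return max(sorted(class_counts), key=score)

    return predict


def train_multilayer(samples):
    """samples: dicts with leaf features and (assumed available) UP, UE, UR, ST labels."""
    up = train_nb([s["profile"] for s in samples], [s["UP"] for s in samples])
    ue = train_nb([s["emotion"] for s in samples], [s["UE"] for s in samples])
    ur = train_nb([(s["UP"],) + s["relation"] for s in samples], [s["UR"] for s in samples])
    st = train_nb([(s["UP"], s["UR"], s["UE"]) for s in samples], [s["ST"] for s in samples])

    def predict(profile, relation, emotion):
        up_hat = up(profile)                 # bottom layer: user's profile
        ue_hat = ue(emotion)                 # bottom layer: user's emotion
        ur_hat = ur((up_hat,) + relation)    # middle layer: user's relationship
        return st((up_hat, ur_hat, ue_hat))  # top layer, with the leaves used in (12)

    return predict


# Hypothetical, already discretized training samples.
samples = [
    {"profile": ("many", "many"), "relation": ("high", "high"), "emotion": ("many", "few"),
     "UP": "good", "UR": "high", "UE": "high", "ST": "positive"},
    {"profile": ("few", "few"), "relation": ("low", "low"), "emotion": ("few", "many"),
     "UP": "bad", "UR": "low", "UE": "low", "ST": "negative"},
    {"profile": ("medium", "medium"), "relation": ("medium", "medium"),
     "emotion": ("medium", "medium"),
     "UP": "medium", "UR": "medium", "UE": "medium", "ST": "neutral"},
]
classify = train_multilayer(samples)
label = classify(("many", "many"), ("high", "high"), ("many", "few"))
```

Passing hard labels between layers is the simplest design choice; propagating the full posterior distributions instead would be an alternative that this sketch does not attempt.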

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users’ tweets, retweets, and number of followings from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of the Sina microblogging dataset that contains retweeting contents with sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are as follows: we randomly divide the dataset into two parts, $A$ and $B$. $A$ contains 90% of the retweets and is used for training. The remaining 10% of the retweets, denoted as $B$, are designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance in terms of precision, recall, and $F_1$-measure.
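For readers who wish to reproduce this kind of evaluation, the following Python sketch shows one common way to compute per-class precision, recall, and $F_1$-measure and to generate 10-fold splits. It is a generic illustration, not the evaluation code used in the paper: the function names and toy labels are hypothetical, and the fold assignment assumes the samples have already been shuffled.

```python
def per_class_metrics(true_labels, predicted_labels, target):
    """Precision, recall, and F1-measure of one sentiment class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def ten_fold_indices(n_samples, n_folds=10):
    """Yield (train, test) index lists; fold k tests on every n_folds-th sample."""
    for k in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == k]
        train = [i for i in range(n_samples) if i % n_folds != k]
        yield train, test


# Hypothetical gold and predicted labels for four retweets.
truth = ["positive", "positive", "negative", "neutral"]
guess = ["positive", "negative", "negative", "neutral"]
precision, recall, f1 = per_class_metrics(truth, guess, "positive")
folds = list(ten_fold_indices(20))
```

The mean performance is then obtained by averaging the per-fold metrics over the ten folds.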

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user’s profile-based features only,

(ii) rMLNBRST: considering user’s relationship-based features only,

(iii) eMLNBRST: considering user’s emotion-based features only,

(iv) aMLNBRST: considering all features.

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Panels: (a) pMLNBRST, (b) rMLNBRST, (c) eMLNBRST, and (d) aMLNBRST. Each panel plots precision, recall, and F1-measure (vertical axis from 0 to 1) for the positive, negative, and neutral classes. Bar labels, given as (precision, recall, F1-measure) triples for the three series: (a) (0.336, 0.401, 0.366), (0.427, 0.414, 0.420), (0.358, 0.390, 0.373); (b) (0.645, 0.643, 0.643), (0.664, 0.604, 0.632), (0.622, 0.642, 0.631); (c) (0.734, 0.733, 0.733), (0.742, 0.754, 0.747), (0.676, 0.689, 0.682); (d) (0.751, 0.763, 0.756), (0.787, 0.775, 0.780), (0.712, 0.731, 0.721).

Table 1: Statistics of the dataset.

Features                Statistics
Number of users         296134
Number of followings    51415017
Number of tweets        176659
Number of retweets      264743

The precision, recall, and $F_1$-measure of the different feature-based methods are shown in Figure 2.

We draw the following observations. Removing either user’s relationship-based or emotion-based features clearly lowers the model’s prediction ability. Additionally, emotion-based features contribute more to analyzing user’s retweeting sentiment polarity than profile- and relationship-based features, since plenty of positive or negative emotional words in retweeting contents provide a good guarantee for the accuracy of emotion prediction. Besides, the system improves its performance by merging profile-, relationship-, and emotion-based features, from which we find that it is difficult to predict user’s retweeting emotion tendency with only one specific type of feature and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy (denoted as ME) models are the machine learning techniques most commonly used in sentiment classification [26], on the basis of our proposed features we compare the proposed framework MLNBRST with SVM, NB, and ME, as well as NBSVM used in [16], which is an improved SVM classifier, and the Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Owing to space restrictions, only the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and of our best one are depicted in Figure 3.

From Figure 3, it can be found that the proposed approach shows the best results in terms of precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it relies strongly on the corpus. Since the support vector machine is only applicable to a small-size training dataset, Naïve Bayes is more suitable than the support vector machine for sentiment classification on microblogging, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user’s sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

Figure 3: Precisions, recalls, and F1-measures of different methods (SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST; vertical axis from 0.5 to 1).

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that all improvements are significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over the representative feature-based methods and the baseline methods, which answers the first question.

5.3. Analysis of Different Factors’ Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact of the proposed retweeting sentiment features and accordingly answer the second question. Owing to space limitations, Figures 4(a), 4(b), 4(c), and 4(d) illustrate only the probability distributions of the values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), the larger the dynamic Salton metrics between two users are, the more easily they retweet each other casually, which may lead to either positive or negative emotion being included in retweeting contents. In Figure 4(b), users who interact frequently are less likely to retweet each other with a neutral emotion tendency; instead, they are more likely to express support or opposition to each other. In Figure 4(c), users have a higher probability of retweeting with a positive emotion tendency if they are latently in high spirits, and vice versa. Figure 4(d) shows strong evidence for the impact that emotion divergence has on user’s retweeting emotion tendency. If the absolute values of emotion divergence are small, the majority of users may retweet with neutral emotion. The ratio of retweeting with negative emotion is higher if the emotion divergence between the emotion of the microblog and the emotion expressed in the user’s recent states is negative. If the values of emotion divergence are positive, users are more likely to retweet with positive emotion, which may be determined by users’ recent mood statistics.

Table 2: Information gains of features.

Types of features    Features                                   Information gain
Profile-based        Number of bifollowers                      0.254
Profile-based        Number of followers                        0.231
Profile-based        Number of followees                        0.257
Profile-based        Number of posts                            0.201
Profile-based        Province                                   0.105
Profile-based        City                                       0.110
Profile-based        Gender                                     0.148
Profile-based        Created time of user’s account             0.092
Profile-based        Verified type of user’s account            0.102
Relation-based       Dynamic Salton metrics                     0.485
Relation-based       Dynamic interaction frequency              0.503
Emotion-based        Number of positive emotional words         0.647
Emotion-based        Number of negative emotional words         0.622
Emotion-based        Recent mood statistics                     0.694
Emotion-based        Emotion divergence                         0.723

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to user's retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v), \tag{13}$$

where $e$ denotes a certain emotion state which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$, $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$, $p(e)$ stands for the probability that emotion state $e$ appears in the dataset, $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset, and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of features are shown in Table 2.

As described in Table 2, the information gains of user's profile-based features are lower than those of the relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, recent mood statistics and emotion divergence, which are derived from the counts of positive and negative emotional words, have larger information gains than the positive or negative emotional word counts themselves.

Figure 4: The probability distribution of different retweeting sentiment tendency features. Panels: (a) dynamic Salton metrics, (b) dynamic interaction frequency, (c) recent mood statistics, and (d) emotion divergence. Each panel plots the probability distribution of the feature values for positive, negative, and neutral retweets.
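As a rough illustration of (13), the following Python sketch computes the information gain of one discretized feature from paired lists of feature values and emotion labels. The function names `entropy` and `information_gain`, the base-2 logarithm, and the toy data are assumptions made for illustration only; the paper does not state the logarithm base or give an implementation.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy -sum_e p(e) log2 p(e) of a list of emotion labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """IG(f_i) = H(E) - sum_v p(v) H(E | f_i = v), as in (13).

    `values` holds the discretized values of one feature and `labels`
    the emotion state of the same samples (same length, same order).
    """
    n = len(labels)
    groups = defaultdict(list)
    for v, e in zip(values, labels):
        groups[v].append(e)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Toy data (hypothetical): one feature with two discretized levels.
    feature = ["low", "low", "high", "high"]
    emotion = ["neutral", "neutral", "positive", "negative"]
    print(information_gain(feature, emotion))
```

A feature whose value is the same for every sample yields an information gain of zero under this sketch, which matches the intuition that it carries no information about the emotion state.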

From the above, it can be found that the proposed factors can be used as good indicators of user's retweeting sentiment tendency. However, although social relationship plays an important role in individual emotion, which is in keeping with Fowler and Christakis's work [27], it can only distinguish between neutral and other sentiment tendencies, while emotion-based features only discriminate well between positive and negative sentiment polarity. Hence, comprehensive consideration of the context information of retweeting content is necessary.

5.3.2. Impact of Temporal Information in MLNBRST. To answer the second question, we also investigate how temporal information affects the performance of our method in terms of $F_1$-measure by changing the time slice weight factor $\alpha$. In this paper, $\alpha$ is varied over $\{0.01, 0.1, 0.5, 0.7, 1\}$, and we carry out 10-fold cross-validation with 50%, 60%, 80%, and 100% of $A$ for training, so as to avoid bias introduced by the size of the training data. The results are shown in Figure 5, where "50%," "60%," "80%," and "100%" denote that we leverage 50%, 60%, 80%, and 100% of $A$ for training.

Figure 5: The impact of temporal information in the proposed framework MLNBRST. Axes: time slice weight factor $\alpha$ (0.01, 0.1, 0.5, 0.7, 1), training data (50%, 60%, 80%, 100%), and $F_1$-measure.

It can be observed from Figure 5 that when $\alpha$ is set to 1, that is, when temporal information is not considered, the $F_1$-measure is much lower than the peak performance. As $\alpha$ increases, the $F_1$-measure first increases greatly and then degrades rapidly after reaching a peak value.
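To make the role of $\alpha$ concrete, the short sketch below lists the weights $\alpha^{n-i}$ that the time slices $t_0, \ldots, t_n$ receive. The helper name `time_slice_weights` is hypothetical and introduced here only for illustration; it shows that $\alpha = 1$ weights all slices equally, whereas smaller values concentrate the weight on the most recent slices.

```python
from typing import List


def time_slice_weights(n: int, alpha: float) -> List[float]:
    """Weights alpha**(n - i) for time slices t_0..t_n, oldest slice first."""
    return [alpha ** (n - i) for i in range(n + 1)]


if __name__ == "__main__":
    # The values of alpha examined in Section 5.3.2.
    for alpha in (0.01, 0.1, 0.5, 0.7, 1.0):
        print(alpha, time_slice_weights(3, alpha))
```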

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.


6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user's retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users' dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users' posts. Then, on the basis of Naïve Bayes theory, we represented models in the lower layers from the profile-, relationship-, and emotion-based dimensions, respectively, followed by designing a multilayer Naïve Bayes model on the constructed models of different dimensions to analyze user's retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and $F_1$-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method, so as to improve its performance and broaden its online application scope. Furthermore, we will investigate how to improve its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, "A comparison of online social networks and real-life social networks: a study of Sina Microblogging," Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, "Sentiment analysis using common-sense and context information," Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, "Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, "Happiness is assortative in online social networks," Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, "Predicting elections with Twitter: what 140 characters reveal about political sentiment," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM '10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, "Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope," Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, "Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures," Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, "Emotions and information diffusion in social media - sentiment of microblogs and sharing behavior," Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, "Enhanced sentiment learning using twitter hash-tags and smileys," in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, "MoodLens: an emoticon-based sentiment analysis system for Chinese tweets," in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, "Characterizing microblogs with topic models," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM '10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, "Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network," Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, "A knowledge-based approach for polarity classification in Twitter," Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, "Ranked word net graph for sentiment polarity classification in Twitter," Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, "A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle," in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Šimko, "Sentiment analysis on microblog utilizing appraisal theory," World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, "Text-based emotion classification using emotion cause extraction," Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, "Dynamic evolution of collective emotions in social networks: a case study of Sina weibo," Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, "Adaptive recursive neural network for target-dependent Twitter sentiment classification," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL '14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, "Social influence locality for modeling retweeting behaviors," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, "User-level sentiment analysis incorporating social networks," in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), pp. 1397–1405, August 2011.

[23] M. Newman, "Clustering and preferential attachment in growing networks," Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, "Understanding social relationship evolution by using real-world sensing data," World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, "Classifying sentiment in microblogs: is brevity an advantage?" in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, "Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study," British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.


3.3.1. The Number of Emotional Words. In this paper, we count the number of positive and negative emotional words in user's retweeting contents with the corpus of HowNet Knowledge (http://www.keenage.com/download/sentiment.rar). HowNet Knowledge, which includes 8945 words and phrases, consists of six files: a positive emotional words list, a negative emotional words list, a positive review words list, a negative review words list, a degree words list, and a propositional words list.

3.3.2. Recent Mood Statistics. To enhance the performance of retweeting sentiment analysis, we further analyze user's contents according to the number of positive and negative emotional words. Similar to dynamic Salton metrics, we employ $\alpha^{n-i}$, which is defined in Section 3.2.1, to stand for the weight of the $i$th time slice $t_i$, and user $u$'s mood statistics on $t_i$ is calculated as follows:

$$\mathrm{Ue}(u, t_i) = \frac{\mathrm{Upn}(u, t_i)}{\mathrm{Upn}(u, t_i) + \mathrm{Unn}(u, t_i)}, \tag{5}$$

where $\mathrm{Upn}(u, t_i)$ and $\mathrm{Unn}(u, t_i)$ represent the number of positive emotional words and the number of negative emotional words, respectively, that user $u$ used on $t_i$ and that are included in HowNet Knowledge. Thus, user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ is calculated with

$$\mathrm{Ue}_{[0, t_n]}(u) = \sum_{i=0}^{n} \alpha^{n-i} \times \mathrm{Ue}(u, t_i). \tag{6}$$

3.3.3. Emotion Divergence. In this paper, we quantitatively define emotion divergence between user $u$ and microblog $b$ as follows:

$$\mathrm{Em}(u, b, t_n) = \mathrm{Ue}_{[0, t_n]}(u) - \mathrm{Ie}(b), \tag{7}$$

where $\mathrm{Ue}_{[0, t_n]}(u)$ represents user $u$'s recent mood statistics on the flow of time slices $[0, t_n]$ and $\mathrm{Ie}(b)$ represents the emotion statistics expressed in microblog $b$, which can be calculated as

$$\mathrm{Ie}(b) = \frac{\mathrm{Ipn}(b)}{\mathrm{Ipn}(b) + \mathrm{Inn}(b)}, \tag{8}$$

where $\mathrm{Ipn}(b)$ and $\mathrm{Inn}(b)$ denote the number of positive emotional words and the number of negative emotional words, respectively, that are used in microblog $b$ and included in HowNet Knowledge.
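A minimal sketch of (7) and (8) follows, assuming the microblog has already been segmented into tokens. The two small word sets are hypothetical stand-ins for the HowNet positive and negative emotional word lists, and the names `count_emotional_words`, `emotion_statistic`, and `emotion_divergence` are introduced here for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical stand-ins for the HowNet positive/negative emotional word lists.
POSITIVE_WORDS: Set[str] = {"good", "great", "happy"}
NEGATIVE_WORDS: Set[str] = {"bad", "awful", "sad"}


def count_emotional_words(tokens: Iterable[str]) -> Tuple[int, int]:
    """Return (Ipn, Inn): counts of positive and negative lexicon words."""
    pos = neg = 0
    for token in tokens:
        if token in POSITIVE_WORDS:
            pos += 1
        elif token in NEGATIVE_WORDS:
            neg += 1
    return pos, neg


def emotion_statistic(pos: int, neg: int) -> float:
    """Eq. (8): Ie(b) = Ipn(b) / (Ipn(b) + Inn(b)).

    Assumption: 0.0 is returned when no emotional word occurs.
    """
    total = pos + neg
    return pos / total if total else 0.0


def emotion_divergence(recent_mood: float, blog_tokens: Iterable[str]) -> float:
    """Eq. (7): Em(u, b, t_n) = Ue_[0,t_n](u) - Ie(b)."""
    pos, neg = count_emotional_words(blog_tokens)
    return recent_mood - emotion_statistic(pos, neg)


if __name__ == "__main__":
    tokens = ["a", "good", "bad", "sad", "day"]
    print(emotion_divergence(0.8, tokens))
```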

4. Multilayer Naïve Bayes Model for Analyzing User's Retweeting Sentiment Tendency

Some researchers found that Naïve Bayes was more suitable for sentiment classification on microblogging [25]. On the basis of Bayes' theorem, the Naïve Bayes model represents uncertainty with probability and realizes the process of learning and reasoning via probability. Hence, in this paper, we put forward a multilayer Naïve Bayes model, which is a variant of the Naïve Bayes model, to analyze user's retweeting sentiment tendency at a fine granularity.

We formally define retweeting sentiment tendency analysis as follows: given a group of retweets with related discrete feature vectors and corresponding retweeting sentiment tendency labels, we aim to leverage prior knowledge to automatically assign retweeting sentiment tendency labels to unknown retweets. Our proposed method consists of three modules: (1) Naïve Bayes models in the bottom layer, (2) a Naïve Bayes model in the middle layer, and (3) a Naïve Bayes model in the top layer. The process is outlined in Figure 1, where the bottom layer contains the Naïve Bayes models for predicting user's profile and emotion, the middle layer is the Naïve Bayes model for predicting user's relationship, and finally, in the top layer, user's retweeting sentiment tendency is predicted through the multilayer nested Naïve Bayes model. The detailed descriptions are as follows.

4.1. Naïve Bayes Models in Bottom Layer

4.1.1. Profile-Based Naïve Bayes Model. The Naïve Bayes model, which is based on Bayes' theorem, reduces computational overhead via the conditional independence assumption and classifies unknown samples according to their maximum a posteriori probability. The calculation of user's profile (denoted as UP) conforms to the Naïve Bayes model: UP is determined by the number of bifollowers (denoted as BI), the number of followers (denoted as FO), the number of followees (denoted as FE), the number of contents that the user posts (denoted as CN), the user's province (denoted as PR), the user's city (denoted as CI), the user's gender (denoted as GE), the created time of the user's account (denoted as CT), and the verified type of the user's account (denoted as VT). Hence, UP can be seen as the root node of the Naïve Bayes model, and BI, FO, FE, CN, PR, CI, GE, CT, and VT can be seen as its leaf nodes.

Given that BI, FO, FE, CN, and CT are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UP. According to the above features' values, BI, FO, FE, and CN are each mapped to three levels (namely, few, medium, and many), and CT is mapped to three levels (namely, short, medium, and long). Consequently, the probability that user's profile equals a certain discrete value is calculated as follows:

$$\begin{aligned} &P(\mathrm{UP} = p \mid \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT}) \\ &\quad = \frac{P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})}{P(\mathrm{BI}) P(\mathrm{FO}) P(\mathrm{FE}) P(\mathrm{CN}) P(\mathrm{PR}) P(\mathrm{CI}) P(\mathrm{GE}) P(\mathrm{CT}) P(\mathrm{VT})} \\ &\quad = \frac{P(\mathrm{UP} = p) P(\mathrm{BI} \mid \mathrm{UP} = p) P(\mathrm{FO} \mid \mathrm{UP} = p) P(\mathrm{FE} \mid \mathrm{UP} = p) P(\mathrm{CN} \mid \mathrm{UP} = p) P(\mathrm{PR} \mid \mathrm{UP} = p) P(\mathrm{CI} \mid \mathrm{UP} = p) P(\mathrm{GE} \mid \mathrm{UP} = p) P(\mathrm{CT} \mid \mathrm{UP} = p) P(\mathrm{VT} \mid \mathrm{UP} = p)}{P(\mathrm{BI}) P(\mathrm{FO}) P(\mathrm{FE}) P(\mathrm{CN}) P(\mathrm{PR}) P(\mathrm{CI}) P(\mathrm{GE}) P(\mathrm{CT}) P(\mathrm{VT})}, \end{aligned} \tag{9}$$

Figure 1: The framework of MLNBRST. Bottom layer: the profile-based Naïve Bayes model (root UP; leaves BI, FO, FE, CN, PR, CI, GE, CT, and VT) and the emotion-based Naïve Bayes model (root UE; leaves PEN, NEN, RMS, and ED). Middle layer: the relationship-based Naïve Bayes model (root UR; leaves UP, DSM, and DIF). Top layer: the retweeting sentiment tendency Naïve Bayes model (root ST; leaves UR and UE).

where $p \in \{\text{bad}, \text{medium}, \text{good}\}$ denotes a certain discrete value of UP; BI, FO, FE, CN, PR, CI, GE, CT, and VT denote their corresponding discrete values, respectively; $P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, \mathrm{GE}, \mathrm{CT}, \mathrm{VT})$ denotes the probability that $\mathrm{UP} = p$ and BI, FO, FE, CN, PR, CI, GE, CT, and VT are equal to the corresponding discrete values; $P(\cdot)$ denotes the probability that a feature equals a certain discrete value; and $P(\cdot \mid \mathrm{UP} = p)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UP} = p$. Then user's profile can be classified into three levels (namely, bad, medium, and good) according to its maximum a posteriori probability, which is calculated with (9).
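The discretization step and the maximum a posteriori decision in (9) can be sketched as below. This is an illustrative implementation under stated assumptions, not the authors' code: the cut points passed to `discretize` are hypothetical (the paper does not report its interval boundaries), add-one smoothing is added to avoid zero probabilities, and the denominator of (9) is omitted because it is identical for every candidate value of UP and therefore does not change the arg max.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def discretize(value: float, low: float, high: float,
               levels: Tuple[str, str, str] = ("few", "medium", "many")) -> str:
    """Map a continuous attribute to one of three levels.

    The cut points `low` and `high` are hypothetical.
    """
    if value < low:
        return levels[0]
    if value < high:
        return levels[1]
    return levels[2]


class CategoricalNaiveBayes:
    """Naive Bayes over discrete features with add-one smoothing (an assumption)."""

    def __init__(self, smoothing: float = 1.0) -> None:
        self.smoothing = smoothing

    def fit(self, rows: Sequence[Sequence[str]],
            labels: Sequence[str]) -> "CategoricalNaiveBayes":
        self.classes: List[str] = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # counts[(feature index, value, class)] = number of training rows
        self.counts: Counter = Counter()
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts[(j, value, label)] += 1
                self.values[j].add(value)
        return self

    def log_scores(self, row: Sequence[str]) -> Dict[str, float]:
        """log P(class) + sum_j log P(x_j | class) for every class."""
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.total)
            for j, value in enumerate(row):
                numerator = self.counts[(j, value, c)] + self.smoothing
                denominator = (self.class_counts[c]
                               + self.smoothing * len(self.values[j]))
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, row: Sequence[str]) -> str:
        """Return the class with the maximum a posteriori score."""
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy training set with two discretized features (e.g., BI and FO).
    rows = [("few", "few"), ("few", "medium"), ("many", "many"), ("many", "medium")]
    labels = ["bad", "bad", "good", "good"]
    model = CategoricalNaiveBayes().fit(rows, labels)
    print(model.predict((discretize(5, 10, 100), "few")))
```

The same class can serve for the emotion-based and relationship-based models, since they differ only in their root and leaf variables. How the training labels for UP are obtained is not described in the paper; the toy labels above are purely illustrative.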

4.1.2. Emotion-Based Naïve Bayes Model. In the same way, the calculation of user's emotion (denoted as UE) also conforms to the Naïve Bayes model: UE is determined by the number of positive emotional words (denoted as PEN), the number of negative emotional words (denoted as NEN), recent mood statistics (denoted as RMS), and emotion divergence (denoted as ED). As a consequence, UE can be viewed as the root node of the Naïve Bayes model, and PEN, NEN, RMS, and ED can be viewed as its leaf nodes.

Before modeling UE, given that PEN, NEN, RMS, and ED are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals. PEN is mapped to three levels (namely, few, medium, and many), NEN is mapped to three levels (namely, few, medium, and many), RMS is mapped to three levels (namely, low, medium, and high), and ED is mapped to three levels (namely, small, medium, and large). The probability that user's emotion is equal to a certain discrete value is calculated as below:

$$\begin{aligned} P(\mathrm{UE} = e \mid \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED}) &= \frac{P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})}{P(\mathrm{PEN}) P(\mathrm{NEN}) P(\mathrm{RMS}) P(\mathrm{ED})} \\ &= \frac{P(\mathrm{UE} = e) P(\mathrm{PEN} \mid \mathrm{UE} = e) P(\mathrm{NEN} \mid \mathrm{UE} = e) P(\mathrm{RMS} \mid \mathrm{UE} = e) P(\mathrm{ED} \mid \mathrm{UE} = e)}{P(\mathrm{PEN}) P(\mathrm{NEN}) P(\mathrm{RMS}) P(\mathrm{ED})}, \end{aligned} \tag{10}$$

where $e \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UE; PEN, NEN, RMS, and ED denote their corresponding discrete values, respectively; $P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})$ denotes the probability that $\mathrm{UE} = e$ and PEN, NEN, RMS, and ED are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{UE} = e)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UE} = e$. Then we classify user's emotion into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (10).


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of user's relationship (denoted as UR) is in accordance with the Naïve Bayes model as well: UR is determined by user's profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF). Hence, UR can be treated as the root node of the Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UR. DSM is mapped to three levels (namely, low, medium, and high), and DIF is mapped to three levels (namely, low, medium, and high). So the probability that user's relationship is equal to a certain discrete value is calculated with

$$\begin{aligned} P(\mathrm{UR} = r \mid \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF}) &= \frac{P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})}{P(\mathrm{UP}) P(\mathrm{DSM}) P(\mathrm{DIF})} \\ &= \frac{P(\mathrm{UR} = r) P(\mathrm{UP} \mid \mathrm{UR} = r) P(\mathrm{DSM} \mid \mathrm{UR} = r) P(\mathrm{DIF} \mid \mathrm{UR} = r)}{P(\mathrm{UP}) P(\mathrm{DSM}) P(\mathrm{DIF})}, \end{aligned} \tag{11}$$

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})$ denotes the probability that $\mathrm{UR} = r$ and UP, DSM, and DIF are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{UR} = r)$ denotes the probability that a feature equals a certain discrete value when $\mathrm{UR} = r$. Then UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of user's retweeting sentiment tendency (denoted as ST), user's profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze user's retweeting sentiment tendency. In this paper, user's retweeting sentiment tendency is determined by user's relationship and emotion; it can therefore be regarded as the root node of the Naïve Bayes model, and user's relationship and emotion can be regarded as its leaf nodes. Thus, user's retweeting sentiment tendency is calculated as follows:

$$\begin{aligned} P(\mathrm{ST} = \mathrm{st} \mid \mathrm{UP}, \mathrm{UR}, \mathrm{UE}) &= \frac{P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})}{P(\mathrm{UP}) P(\mathrm{UR}) P(\mathrm{UE})} \\ &= \frac{P(\mathrm{ST} = \mathrm{st}) P(\mathrm{UP} \mid \mathrm{ST} = \mathrm{st}) P(\mathrm{UR} \mid \mathrm{ST} = \mathrm{st}) P(\mathrm{UE} \mid \mathrm{ST} = \mathrm{st})}{P(\mathrm{UP}) P(\mathrm{UR}) P(\mathrm{UE})}, \end{aligned} \tag{12}$$

where $\mathrm{st} \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})$ denotes the probability that $\mathrm{ST} = \mathrm{st}$ and UP, UR, and UE are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{ST} = \mathrm{st})$ denotes the probability that a feature equals a certain discrete value when $\mathrm{ST} = \mathrm{st}$. Thus, ST can be classified into three particular emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).
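The way the three layers feed into one another can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes that training samples carry labels for the intermediate variables UP, UE, and UR (the paper does not say how these are obtained), it follows (12) in passing UP, UR, and UE to the top layer, and it uses add-one smoothing. The names `train_nb`, `train_multilayer`, and the dictionary keys are hypothetical.

```python
import math
from collections import Counter


def train_nb(rows, labels, smoothing=1.0):
    """Train a categorical Naive Bayes model and return its predict function."""
    classes = sorted(set(labels))
    prior = Counter(labels)
    counts = Counter()                 # (feature index, value, class) -> count
    values = [set() for _ in rows[0]]  # observed values per feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            counts[(j, value, label)] += 1
            values[j].add(value)

    def predict(row):
        def log_score(c):
            score = math.log(prior[c] / len(labels))
            for j, value in enumerate(row):
                score += math.log((counts[(j, value, c)] + smoothing)
                                  / (prior[c] + smoothing * len(values[j])))
            return score
        return max(classes, key=log_score)  # maximum a posteriori class

    return predict


def train_multilayer(samples):
    """Stack the bottom, middle, and top layers.

    Each sample is a dict with discretized inputs `profile` (tuple),
    `emotion` (tuple), `dsm`, `dif`, and labels `UP`, `UE`, `UR`, `ST`.
    """
    up = train_nb([s["profile"] for s in samples], [s["UP"] for s in samples])
    ue = train_nb([s["emotion"] for s in samples], [s["UE"] for s in samples])
    ur = train_nb([(s["UP"], s["dsm"], s["dif"]) for s in samples],
                  [s["UR"] for s in samples])
    st = train_nb([(s["UP"], s["UR"], s["UE"]) for s in samples],
                  [s["ST"] for s in samples])

    def predict(profile, emotion, dsm, dif):
        p = up(profile)        # bottom layer: profile
        e = ue(emotion)        # bottom layer: emotion
        r = ur((p, dsm, dif))  # middle layer: relationship
        return st((p, r, e))   # top layer: retweeting sentiment tendency

    return predict


if __name__ == "__main__":
    # Toy samples (hypothetical values and labels).
    samples = [
        {"profile": ("many", "many"), "emotion": ("many", "few"),
         "dsm": "high", "dif": "high",
         "UP": "good", "UE": "high", "UR": "high", "ST": "positive"},
        {"profile": ("few", "few"), "emotion": ("few", "many"),
         "dsm": "low", "dif": "low",
         "UP": "bad", "UE": "low", "UR": "low", "ST": "negative"},
        {"profile": ("medium", "medium"), "emotion": ("few", "few"),
         "dsm": "low", "dif": "low",
         "UP": "medium", "UE": "medium", "UR": "low", "ST": "neutral"},
    ]
    model = train_multilayer(samples)
    print(model(("many", "many"), ("many", "few"), "high", "high"))
```

At prediction time only the observable, discretized features are supplied; the intermediate values of UP, UE, and UR are the maximum a posteriori outputs of the lower layers.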

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users' tweets, retweets, and followings' number from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of Sina microblogging which contains retweeting contents with sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are described as follows: we randomly divide the dataset into two parts, $A$ and $B$. $A$ possesses 90% of the retweets and is used for training. The remaining 10% of the retweets, denoted as $B$, is designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance via precision, recall, and $F_1$-measure.
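The evaluation protocol can be sketched as follows, assuming the standard definitions of per-class precision, recall, and $F_1$-measure. The function names, the random seed, and the way folds are formed are illustrative assumptions; the paper does not describe its splitting procedure in more detail than above.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(y_true: Sequence[str], y_pred: Sequence[str],
                        label: str) -> Tuple[float, float, float]:
    """Per-class precision, recall, and F1-measure for `label`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_measure(precision, recall)


if __name__ == "__main__":
    truth = ["positive", "positive", "negative", "neutral", "negative"]
    guess = ["positive", "negative", "negative", "neutral", "positive"]
    for label in ("positive", "negative", "neutral"):
        print(label, precision_recall_f1(truth, guess, label))
```

As a sanity check on the harmonic-mean definition, a precision of 0.336 and a recall of 0.401, the first triple reported for pMLNBRST in Figure 2, give an $F_1$-measure of about 0.366, which agrees with the third value of that triple.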

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user's profile-based features only.

(ii) rMLNBRST: considering user's relationship-based features only.

(iii) eMLNBRST: considering user's emotion-based features only.

(iv) aMLNBRST: considering all features.

Table 1: Statistics of the dataset.

| Features | Statistics |
| --- | --- |
| Number of users | 296134 |
| Number of followings | 51415017 |
| Number of tweets | 176659 |
| Number of retweets | 264743 |

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Each panel reports precision, recall, and $F_1$-measure for the positive, negative, and neutral classes; the rows below follow the order of the figure legend.

| Panel | Class | Precision | Recall | F1-measure |
| --- | --- | --- | --- | --- |
| (a) pMLNBRST | Positive | 0.336 | 0.401 | 0.366 |
|  | Negative | 0.427 | 0.414 | 0.42 |
|  | Neutral | 0.358 | 0.39 | 0.373 |
| (b) rMLNBRST | Positive | 0.645 | 0.643 | 0.643 |
|  | Negative | 0.664 | 0.604 | 0.632 |
|  | Neutral | 0.622 | 0.642 | 0.631 |
| (c) eMLNBRST | Positive | 0.734 | 0.733 | 0.733 |
|  | Negative | 0.742 | 0.754 | 0.747 |
|  | Neutral | 0.676 | 0.689 | 0.682 |
| (d) aMLNBRST | Positive | 0.751 | 0.763 | 0.756 |
|  | Negative | 0.787 | 0.775 | 0.78 |
|  | Neutral | 0.712 | 0.731 | 0.721 |

The precision, recall, and $F_1$-measure of the different feature-based methods are shown in Figure 2.

We draw the following observations. Removing either user's relationship-based or emotion-based features may noticeably lower the model's prediction ability. Additionally, emotion-based features contribute more to analyzing user's retweeting sentiment polarity than profile- and relationship-based features, since plenty of positive or negative emotional words in retweeting contents provide a good guarantee for the accuracy of emotion prediction. Besides, the system improves its performance by merging profile-, relationship-, and content-based features, from which we find that it is difficult to predict user's retweeting emotion tendency with only one specific type of features and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy models (denoted as ME) are most commonly used in sentiment classification among various machine learning techniques [26], we compare, on the basis of our proposed features, the proposed framework MLNBRST with SVM, NB, and ME, as well as with NBSVM used in [16], which is an improved SVM classifier, and Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, only the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and our best one are depicted in Figure 3.

Figure 3: Precisions, recalls, and $F_1$-measures of different methods (SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST).

From Figure 3, it can be found that the proposed approach shows the best results in terms of precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it strongly relies on the corpus. Since support vector machine is only applicable to a small-size training dataset, Naïve Bayes is more suitable for sentiment classification than support vector machine in terms of microblogging, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user's sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that the improvement is significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over the representative feature-based methods and the baseline methods, which answers the first question.

5.3. Analysis of Different Factors' Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact that the proposed retweeting sentiment features have and accordingly answer the second question. Because of limited space, Figures 4(a), 4(b), 4(c), and 4(d) merely illustrate the probability distributions of the values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), if there is a larger dynamic Salton metric between two users, it is easier for them to retweet casually, which may lead to either positive emotion or negative emotion being included in retweeting contents. In Figure 4(b), users who interact frequently are less likely to retweet each other with a neutral emotion tendency; instead, they are more likely to express support or opposition to each other. In Figure 4(c), users have a higher probability of retweeting with a positive emotion tendency if they have recently been in high spirits, and vice versa. Figure 4(d) shows strong evidence for the impact that emotion divergence has on user's retweeting emotion tendency. If absolute values of emotion divergence are small, the majority of users may retweet with neutral emotion. And the ratio of retweeting with negative emotion is higher if the emotion divergence between the emotion of the microblog and the emotion expressed in the user's recent states is negative. If the emotion divergence is positive, users are more likely to retweet with positive emotion, which may be determined by users' recent mood statistics.

Table 2: Information gains of features.

| Types of features | Features | Information gains |
| --- | --- | --- |
| Profile-based | Number of bifollowers | 0.254 |
|  | Number of followers | 0.231 |
|  | Number of followees | 0.257 |
|  | Number of posts | 0.201 |
|  | Province | 0.105 |
|  | City | 0.110 |
|  | Gender | 0.148 |
|  | Created time of user's account | 0.092 |
|  | Verified type of user's account | 0.102 |
| Relation-based | Dynamic Salton metrics | 0.485 |
|  | Dynamic interaction frequency | 0.503 |
| Emotion-based | Number of positive emotional words | 0.647 |
|  | Number of negative emotional words | 0.622 |
|  | Recent mood statistics | 0.694 |
|  | Emotion divergence | 0.723 |

Figure 4: The probability distribution of different retweeting sentiment tendency features. Panels: (a) dynamic Salton metrics, (b) dynamic interaction frequency, (c) recent mood statistics, and (d) emotion divergence. Each panel plots the probability distribution for positive, negative, and neutral retweets.

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to user’s retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$
\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v), \tag{13}
$$

where $e$ denotes a certain emotion state which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$, $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$, $p(e)$ stands for the probability that emotion state $e$ appears in the dataset, $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset, and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of features are shown in Table 2.

As described in Table 2, the information gains of user’s profile-based features are lower than those of the relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, being derived from the counts of positive and negative emotional words, recent mood statistics and emotion divergence have larger information gains than the positive or negative emotional word counts themselves.
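The information gain in (13) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ implementation: the function name `information_gain` is hypothetical, the probabilities are estimated as relative frequencies, and the logarithm is assumed to be base 2, since the text does not state the base.

```python
import math
from collections import Counter


def information_gain(values, emotions):
    """Information gain of one discretized feature, following Eq. (13).

    values   -- discretized feature value of each sample
    emotions -- emotion state (e.g. positive/negative/neutral) of each sample
    """
    n = len(emotions)
    # First term: entropy of the emotion states, -sum_e p(e) log p(e).
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(emotions).values())
    # Second term: sum_v p(v) sum_e p(e|v) log p(e|v), which is <= 0.
    value_counts = Counter(values)
    pair_counts = Counter(zip(values, emotions))
    conditional = 0.0
    for (v, _e), c in pair_counts.items():
        p_v = value_counts[v] / n
        p_e_given_v = c / value_counts[v]
        conditional += p_v * p_e_given_v * math.log2(p_e_given_v)
    return entropy + conditional
```

Under these assumptions, a feature whose value fully determines the emotion state receives the full label entropy as its gain, while a feature that takes a single constant value receives a gain of zero.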

From the above, it can be seen that the proposed factors are good indicators of user’s retweeting sentiment tendency. However, although social relationship plays an important role in individual emotion, which is in keeping with Fowler and Christakis’s work [27], it can only distinguish between neutral and other sentiment tendencies, while emotion-based features discriminate well only between positive and negative sentiment polarity. Hence, comprehensive consideration of the context information of retweeting content is necessary.

5.3.2. Impact of Temporal Information in MLNBRST. To answer the second question, we also investigate how temporal information affects the performance of our method in terms of F1-measure by changing the time slice weight factor $\alpha$. In this paper, $\alpha$ is varied over {0.01, 0.1, 0.5, 0.7, 1}, and we carry out 10-fold cross-validation with 50%, 60%, 80%, and 100% of $A$ for training, so as to avoid bias introduced by the size of the training data. The results are shown in Figure 5, where “50%,” “60%,” “80%,” and “100%” denote that we leverage 50%, 60%, 80%, and 100% of $A$ for training.

Figure 5: The impact of temporal information in the proposed framework MLNBRST. Axes: time slice weight factor $\alpha$ (0.01, 0.1, 0.5, 0.7, 1), training data (50%, 60%, 80%, 100%), and F1-measure (0.4 to 0.8).

It can be observed from Figure 5 that when $\alpha$ is set to 1, namely, without considering temporal information, the F1-measure is much lower than the peak performance. As $\alpha$ increases, the F1-measure first increases greatly and then degrades rapidly after reaching a peak value.

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.


6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user’s retweeting sentiment tendency in dynamic social networks. First, relationship-based features were inferred from users’ dynamic Salton metrics and dynamic interaction frequency. Second, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users’ posts. Then, on the basis of Naïve Bayes theory, we built models in the lower layers from the profile-, relationship-, and emotion-based dimensions, respectively, followed by a multilayer Naïve Bayes model over the constructed models of different dimensions to analyze user’s retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and F1-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method, in order to improve its performance and broaden its online application scope. Furthermore, we will explore directions for improving its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, “A comparison of online social networks and real-life social networks: a study of Sina Microblogging,” Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, “Sentiment analysis using common-sense and context information,” Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, “Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, “Happiness is assortative in online social networks,” Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, “Predicting elections with Twitter: What 140 characters reveal about political sentiment,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, “Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope,” Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, “Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures,” Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, “Emotions and information diffusion in social media – sentiment of microblogs and sharing behavior,” Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, “Enhanced sentiment learning using twitter hashtags and smileys,” in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, “MoodLens: an emoticon-based sentiment analysis system for Chinese tweets,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, “Characterizing microblogs with topic models,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, “Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network,” Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “A knowledge-based approach for polarity classification in Twitter,” Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “Ranked WordNet graph for sentiment polarity classification in Twitter,” Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, “A system for real-time Twitter sentiment analysis of 2012 U.S. presidential election cycle,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Šimko, “Sentiment analysis on microblog utilizing appraisal theory,” World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, “Text-based emotion classification using emotion cause extraction,” Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, “Dynamic evolution of collective emotions in social networks: a case study of Sina weibo,” Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, “Adaptive recursive neural network for target-dependent Twitter sentiment classification,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, “Social influence locality for modeling retweeting behaviors,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1397–1405, August 2011.

[23] M. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, “Understanding social relationship evolution by using real-world sensing data,” World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, “Classifying sentiment in microblogs: is brevity an advantage?” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study,” British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.


Figure 1: The framework of MLNBRST. Bottom layer: the profile-based Naïve Bayes model (root UP; leaves BI, FO, FE, CN, PR, CI, GE, CT, and VT) and the emotion-based Naïve Bayes model (root UE; leaves PEN, NEN, RMS, and ED). Middle layer: the relationship-based Naïve Bayes model (root UR; leaves UP, DSM, and DIF). Top layer: the retweeting sentiment tendency Naïve Bayes model (root ST; leaves UR and UE).

where $p \in \{\text{bad}, \text{medium}, \text{good}\}$ denotes a certain discrete value of UP; BI, FO, FE, CN, PR, CI, G, CT, and VT denote their corresponding discrete values, respectively; $P(\mathrm{UP} = p, \mathrm{BI}, \mathrm{FO}, \mathrm{FE}, \mathrm{CN}, \mathrm{PR}, \mathrm{CI}, G, \mathrm{CT}, \mathrm{VT})$ denotes the probability that UP = $p$ and BI, FO, FE, CN, PR, CI, G, CT, and VT are equal to the corresponding discrete values; $P(\cdot)$ denotes the probability that a feature equals a certain discrete value; and $P(\cdot \mid \mathrm{UP} = p)$ denotes the probability that a feature equals a certain discrete value when UP = $p$. Then user’s profile can be classified into three levels (namely, bad, medium, and good) according to its maximum a posteriori probability, which is calculated with (9).

4.1.2. Emotion-Based Naïve Bayes Model. In the same way, the calculation of user’s emotion (denoted as UE) also fits the Naïve Bayes model. UE is determined by the number of positive emotional words (denoted as PEN), the number of negative emotional words (denoted as NEN), recent mood statistics (denoted as RMS), and emotion divergence (denoted as ED); as a consequence, UE can be viewed as the root node of a Naïve Bayes model, and PEN, NEN, RMS, and ED can be viewed as its leaf nodes.

Before modeling UE, given that PEN, NEN, RMS, and ED are continuous attributes, we discretize them into intervals in order to calculate their conditional probabilities. PEN is mapped to three levels (namely, few, medium, and many), NEN is mapped to three levels (namely, few, medium, and many), RMS is mapped to three levels (namely, low, medium, and high), and ED is mapped to three levels (namely, small, medium, and large). The probability that user’s emotion is equal to a certain discrete value is calculated as below:

$$
\begin{aligned}
P(\mathrm{UE} = e \mid \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})
&= \frac{P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})} \\
&= \frac{P(\mathrm{UE} = e)\, P(\mathrm{PEN} \mid \mathrm{UE} = e)\, P(\mathrm{NEN} \mid \mathrm{UE} = e)\, P(\mathrm{RMS} \mid \mathrm{UE} = e)\, P(\mathrm{ED} \mid \mathrm{UE} = e)}{P(\mathrm{PEN})\, P(\mathrm{NEN})\, P(\mathrm{RMS})\, P(\mathrm{ED})},
\end{aligned}
\tag{10}
$$

where $e \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UE; PEN, NEN, RMS, and ED denote their corresponding discrete values, respectively; $P(\mathrm{UE} = e, \mathrm{PEN}, \mathrm{NEN}, \mathrm{RMS}, \mathrm{ED})$ denotes the probability that UE = $e$ and PEN, NEN, RMS, and ED are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{UE} = e)$ denotes the probability that a feature equals a certain discrete value when UE = $e$. Then we classify user’s emotion into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (10).
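As a minimal sketch of how the maximum a posteriori decision in (10) could be computed, the Python snippet below estimates the prior and the conditional probabilities from counts over discretized samples and picks the level with the largest score. The function names, the add-one smoothing, and the data layout (one tuple of discrete values per sample) are assumptions made for illustration and are not taken from the paper. Because the denominator of (10) does not depend on $e$, the sketch normalizes the scores over the levels instead of dividing by the marginals; the selected level is the same.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, smoothing=1.0):
    """Count-based estimates of P(label) and P(feature_j = value | label)."""
    label_counts = Counter(labels)
    cond_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)    # feature index -> observed values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            cond_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return label_counts, cond_counts, feature_values, smoothing


def posterior(model, row):
    """Numerator of Eq. (10) for every label, normalized to sum to one."""
    label_counts, cond_counts, feature_values, smoothing = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = count / total  # prior, e.g. P(UE = e)
        for j, value in enumerate(row):
            # Smoothed conditional, e.g. P(PEN | UE = e); smoothing is an assumption.
            numerator = cond_counts[(j, label)][value] + smoothing
            denominator = count + smoothing * len(feature_values[j])
            score *= numerator / denominator
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


def classify(model, row):
    """Return the label with the maximum a posteriori probability."""
    scores = posterior(model, row)
    return max(scores, key=scores.get)
```

With rows holding discretized (PEN, NEN, RMS, ED) values and labels holding UE levels, `classify` returns the level chosen by the rule described above; the same functions could serve the profile, relationship, and top-layer models, since they differ only in their inputs.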


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of user’s relationship (denoted as UR) also fits the Naïve Bayes model. UR is determined by user’s profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF); hence, UR can be treated as the root node of a Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, we discretize them into intervals before modeling UR in order to calculate their conditional probabilities. DSM is mapped to three levels (namely, low, medium, and high), and DIF is mapped to three levels (namely, low, medium, and high). So the probability that user’s relationship is equal to a certain discrete value is calculated with

$$
\begin{aligned}
P(\mathrm{UR} = r \mid \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})
&= \frac{P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})} \\
&= \frac{P(\mathrm{UR} = r)\, P(\mathrm{UP} \mid \mathrm{UR} = r)\, P(\mathrm{DSM} \mid \mathrm{UR} = r)\, P(\mathrm{DIF} \mid \mathrm{UR} = r)}{P(\mathrm{UP})\, P(\mathrm{DSM})\, P(\mathrm{DIF})},
\end{aligned}
\tag{11}
$$

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR} = r, \mathrm{UP}, \mathrm{DSM}, \mathrm{DIF})$ denotes the probability that UR = $r$ and UP, DSM, and DIF are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{UR} = r)$ denotes the probability that a feature equals a certain discrete value when UR = $r$. Then UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).
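The text maps continuous attributes such as DSM and DIF to three levels but does not state how the interval boundaries are chosen. The sketch below assumes equal-frequency intervals (cut points at the terciles of the training values); this choice, the function names, and the level labels passed as defaults are illustrative assumptions only, and other binning schemes would fit the description equally well.

```python
def tercile_cutpoints(samples):
    """Two cut points that split the sorted samples into three equal-sized groups."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def discretize(value, cutpoints, levels=("low", "medium", "high")):
    """Map a continuous value to one of three levels using the given cut points."""
    lower, upper = cutpoints
    if value < lower:
        return levels[0]
    if value < upper:
        return levels[1]
    return levels[2]
```

For attributes with differently named levels, such as PEN and NEN (few, medium, many) or ED (small, medium, large), the `levels` argument can be replaced accordingly.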

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of user’s retweeting sentiment tendency (denoted as ST), user’s profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze user’s retweeting sentiment tendency. In this paper, user’s retweeting sentiment tendency, being determined by user’s relationship and emotion, can be regarded as the root node of a Naïve Bayes model, and user’s relationship and emotion can be regarded as its leaf nodes. Thus, user’s retweeting sentiment tendency is calculated as follows:

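$$
\begin{aligned}
P(\mathrm{ST} = \mathrm{st} \mid \mathrm{UP}, \mathrm{UR}, \mathrm{UE})
&= \frac{P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})} \\
&= \frac{P(\mathrm{ST} = \mathrm{st})\, P(\mathrm{UP} \mid \mathrm{ST} = \mathrm{st})\, P(\mathrm{UR} \mid \mathrm{ST} = \mathrm{st})\, P(\mathrm{UE} \mid \mathrm{ST} = \mathrm{st})}{P(\mathrm{UP})\, P(\mathrm{UR})\, P(\mathrm{UE})},
\end{aligned}
\tag{12}
$$

where $\mathrm{st} \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST} = \mathrm{st}, \mathrm{UP}, \mathrm{UR}, \mathrm{UE})$ denotes the probability that ST = st and UP, UR, and UE are equal to the corresponding discrete values; and $P(\cdot \mid \mathrm{ST} = \mathrm{st})$ denotes the probability that a feature equals a certain discrete value when ST = st. Thus, ST can be classified into three emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).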
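The way the layers feed into one another can be sketched as a simple chain of classifiers. The snippet below is an assumption-laden illustration of the data flow only: each model is passed in as a plain callable that maps a tuple of discrete values to a level, the function name `predict_retweet_sentiment` and the dictionary keys are hypothetical, and the top layer takes UP, UR, and UE as inputs, following (12).

```python
from typing import Callable, Dict, Sequence, Tuple

# Any callable mapping a tuple of discrete feature values to a discrete level.
Classifier = Callable[[Tuple[str, ...]], str]


def predict_retweet_sentiment(
    features: Dict[str, Sequence[str]],
    profile_model: Classifier,
    emotion_model: Classifier,
    relationship_model: Classifier,
    top_model: Classifier,
) -> str:
    """Chain the layers: bottom (UP, UE), middle (UR), top (ST)."""
    # Bottom layer: profile and emotion levels from their own discretized features.
    up = profile_model(tuple(features["profile"]))
    ue = emotion_model(tuple(features["emotion"]))
    # Middle layer: relationship level from UP together with DSM and DIF.
    ur = relationship_model((up, *features["relation"]))
    # Top layer: retweeting sentiment tendency from UP, UR, and UE.
    return top_model((up, ur, ue))
```

In this arrangement each lower-layer output is itself a discrete level, so every layer can reuse the same maximum a posteriori rule on a short tuple of inputs.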

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users’ tweets, retweets, and numbers of followings from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of the Sina microblogging data that contains retweeting contents with sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are as follows. We randomly divide the dataset into two parts, $A$ and $B$: $A$ holds 90% of the retweets and is used for training, and the remaining 10% of the retweets, denoted as $B$, is designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance via precision, recall, and F1-measure.
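The random 90/10 split and the per-class metrics described above could be computed along the following lines. This is a sketch under stated assumptions: the helper names `split_train_test` and `precision_recall_f1` are hypothetical, the fixed random seed is only there to make the example repeatable, and the metrics use the standard definitions of precision, recall, and F1-measure for one target class.

```python
import random


def split_train_test(items, train_fraction=0.9, seed=0):
    """Randomly divide items into a training part A and a testing part B."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f1(true_labels, predicted_labels, target):
    """Precision, recall, and F1-measure for one sentiment class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positive = sum(1 for t, p in pairs if t == target and p == target)
    predicted = sum(1 for _t, p in pairs if p == target)
    actual = sum(1 for t, _p in pairs if t == target)
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Averaging these three values over the folds of a cross-validation run would give the mean performance figures of the kind reported in this section.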

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user’s profile-based features only,

(ii) rMLNBRST: considering user’s relationship-based features only,

(iii) eMLNBRST: considering user’s emotion-based features only,

(iv) aMLNBRST: considering all features.

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Each panel reports precision, recall, and F1-measure for the positive, negative, and neutral classes. The data labels, one precision/recall/F1-measure triple per class, read as follows. (a) pMLNBRST: 0.336/0.401/0.366, 0.427/0.414/0.42, 0.358/0.39/0.373. (b) rMLNBRST: 0.645/0.643/0.643, 0.664/0.604/0.632, 0.622/0.642/0.631. (c) eMLNBRST: 0.734/0.733/0.733, 0.742/0.754/0.747, 0.676/0.689/0.682. (d) aMLNBRST: 0.751/0.763/0.756, 0.787/0.775/0.78, 0.712/0.731/0.721.

Table 1: Statistics of the dataset.

| Features | Statistics |
|---|---|
| users | 296134 |
| followings | 51415017 |
| tweets | 176659 |
| retweets | 264743 |

The precision, recall, and F1-measure of the different feature-based methods are shown in Figure 2.

We draw the following observations. Removing either user’s relationship-based or emotion-based features clearly lowers the model’s predictive ability. Additionally, emotion-based features contribute more to analyzing user’s retweeting sentiment polarity than profile- and relationship-based features, since plenty of positive or negative emotional words in retweeting contents provide a good guarantee for the accuracy of emotion prediction. Besides, the system improves its performance when profile-, relationship-, and content-based features are merged, from which we conclude that it is difficult to predict user’s retweeting emotion tendency with only one specific type of feature and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy models (denoted as ME) are the most commonly used machine learning techniques in sentiment classification [26], we compare, on the basis of our proposed features, the proposed framework MLNBRST with SVM, NB, and ME, as well as with NBSVM used in [16], which is an improved SVM classifier, and Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, the mean precisions, recalls, and F1-measures of the aforementioned methods and our best one are depicted in Figure 3.

From Figure 3, it can be seen that the proposed approach shows the best results in terms of precision, recall, and F1-measure. The maximum entropy method achieves the worst results because it relies strongly on the corpus. Since support vector machine is only applicable to a small-size training dataset, Naïve Bayes is more suitable than support vector machine for sentiment classification of microblogs, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of F1-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user’s sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of F1-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

Figure 3: Precisions, recalls, and F1-measures of different methods (SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST).

In summary the results in Sections 521 and 522 sug-gest that all improvement is significant With the help ofmultilayer Naıve Bayes model based on integration of mul-tidimensional features the proposed framework MLNBRSTgains significant improvement over representative differentfeature-basedmethods and baseline methods which answersthe first question

53 Analysis of Different Factorsrsquo Impacts in MLNBRST

531 Impacts of Different Retweeting Sentiment Features inMLNBRST In this context we first investigate the impactthat the proposed retweeting sentiment features have andaccordingly answer the second question Because of the spacecrunch Figures 4(a) 4(b) 4(c) and 4(d) merely illustrate theprobability distribution of values of dynamic Salton metricsdynamic interaction frequency recent mood statistics andemotion divergence under different retweeting sentimenttendencies (positive negative and neutral)

As depicted in Figure 4(a) if there is a bigger dynamicSalton metrics between two users it is more easier toretweet casually which may lead to either positive emotionor negative emotion being included in retweeting contentsIn Figure 4(b) users who interact frequently may be lesspossible to retweet in neutral emotion tendency with eachother instead there may be more likely to express supportor opposition to each other And in Figure 4(c) users willhave higher probability to retweet with positive emotiontendency if they are in high spirits latently and vice versaFigure 4(d) shows strong evidence for the impact that emo-tion divergence has on userrsquos retweeting emotion tendency Ifabsolute values of emotion divergence are small the majorityof users may retweet with neutral emotion And the ratio

Table 2 Information gains of features

Types of features Features Informationgains

Profile-based

bifollowers 0254followers 0231followees 0257posts 0201Province 0105City 0110

Gender 0148Created time of userrsquos

account 0092

Verified type of userrsquosaccount 0102

Relation-based Dynamic Salton metrics 0485Dynamic interaction

frequency 0503

Emotion-based

positive emotionalwords 0647

negative emotionalwords 0622

Recent mood statistics 0694Emotion divergence 0723

of retweeting with negative emotion is higher if there arenegative values of emotion divergence between emotion ofmicroblogging and emotion that is expressed in userrsquos recentstates If values of emotion divergence are positive users aremore likely to retweet with positive emotion which may bedetermined by usersrsquo recent mood statistics

Furthermore we employ information entropy theory tofurther explore the contributions of the proposed features touserrsquos retweeting sentiment tendency The information gainof the 119894th feature 119891

119894is calculated as

IG (119891119894) = minussum

119890isin119864

119901 (119890) log119901 (119890)

+ sum

Visin119881119891119894

119901 (V) sum119890isin119864

119901 (119890 | V) log119901 (119890 | V) (13)

where 119890 denotes a certain emotion state which belongs toemotion set 119864 = positive negative neutral V denotes avalue in discretized value set 119881

119891119894

of the 119894th feature 119891119894 119901(119890)

stands for the probability that emotion state 119890 appears indataset 119901(V) stands for the probability that discretized valueof 119891119894is equal to V in dataset and 119901(119890 | V) stands for the

probability that emotion state 119890 appears in dataset whendiscretized value of 119891

119894is equal to V Information gains of

features are shown in Table 2As described in Table 2 information gain of userrsquos

profile-based features is lower than other relationship- andemotion-based features Moreover emotion-based featureshave higher information gains than relationship-based fea-tures In addition being processed based on the count ofpositive and negative emotional words recentmood statistics

Computational Intelligence and Neuroscience 9

0

005

01

015

02

025

0 2 4 6 8 10 12 14 16 18Dynamic Salton metrics

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(a)

0 02 04 06 08 10

002004006008

01012014016018

02

Dynamic interaction frequency

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(b)

0 02 04 06 08 10

005

01

015

02

025

Recent mood statistics

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(c)

minus06 minus04 minus02 0 02 04 06 08 10

002004006008

01012014016018

Emotion divergence

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(d)

Figure 4 The probability distribution of different retweeting sentiment tendency features

and emotion divergence have larger information gains thanpositive or negative emotional words count

From the above it can be found that the proposedfactors can be used as a good indicator of userrsquos retweetingsentiment tendency However although social relationshipplayed an important role in individual emotion which isin keeping with Fowler and Christakisrsquos work [27] it canonly distinguish between neutral and other sentiment ten-dencies while emotion-based features are merely with gooddiscriminability on positive and negative sentiment polarityHence comprehensive considering on context information ofretweeting content is necessary

532 Impact of Temporal Information in MLNBRST Toanswer the second question we also investigate how temporalinformation affects the performance of our method in termsof 1198651-measure by changing the time slice weight factor 120572 Inthis paper 120572 is varied as 001 01 05 07 1 and we carryon 10-fold cross-validations with 50 60 80 and 100of 119860 for training so as to avoid bias brought by the sizes ofthe training data and the results are shown in Figure 5 whereldquo50rdquo ldquo60rdquo ldquo80rdquo and ldquo100rdquo denote that we leverage50 60 80 and 100 of 119860 for training

Figure 5: The impact of temporal information in the proposed framework MLNBRST. Axes: $\alpha$ (0.01, 0.1, 0.5, 0.7, 1), training data (50%, 60%, 80%, 100%), and $F_1$-measure (0.4 to 0.8).

It can be observed from Figure 5 that, when $\alpha$ is set to 1, namely, without considering temporal information, the $F_1$-measure is much lower than the peak performance. As $\alpha$ increases, the $F_1$-measure first increases greatly and then degrades rapidly after reaching a peak value.

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.


6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user’s retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users’ dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users’ posts. Then, on the basis of Naïve Bayes theory, we built models in the lower layers from the profile-, relationship-, and emotion-based dimensions, respectively, followed by designing a multilayer Naïve Bayes model on the constructed models of different dimensions to analyze user’s retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and $F_1$-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method, to improve its performance as well as broaden its online application scope. Furthermore, we will consider what directions can be undertaken to improve its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, “A comparison of online social networks and real-life social networks: a study of Sina Microblogging,” Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, “Sentiment analysis using common-sense and context information,” Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, “Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, “Happiness is assortative in online social networks,” Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, “Predicting elections with Twitter: what 140 characters reveal about political sentiment,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, “Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope,” Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, “Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures,” Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, “Emotions and information diffusion in social media: sentiment of microblogs and sharing behavior,” Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, “Enhanced sentiment learning using twitter hash-tags and smileys,” in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, “MoodLens: an emoticon-based sentiment analysis system for Chinese tweets,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, “Characterizing microblogs with topic models,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, “Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network,” Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “A knowledge-based approach for polarity classification in Twitter,” Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “Ranked word net graph for sentiment polarity classification in Twitter,” Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, “A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Simko, “Sentiment analysis on microblog utilizing appraisal theory,” World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, “Text-based emotion classification using emotion cause extraction,” Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, “Dynamic evolution of collective emotions in social networks: a case study of Sina weibo,” Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, “Adaptive recursive neural network for target-dependent Twitter sentiment classification,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, “Social influence locality for modeling retweeting behaviors,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1397–1405, August 2011.

[23] M. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, “Understanding social relationship evolution by using real-world sensing data,” World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, “Classifying sentiment in microblogs: is brevity an advantage?” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study,” British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.


4.2. Naïve Bayes Model in Middle Layer. Similarly, the calculation of user’s relationship (denoted as UR) is in accordance with the Naïve Bayes model as well; it is determined by user’s profile (UP), dynamic Salton metrics (denoted as DSM), and dynamic interaction frequency (denoted as DIF). Hence, UR can be treated as the root node of the Naïve Bayes model, and UP, DSM, and DIF can be treated as its leaf nodes.

Since DSM and DIF are continuous attributes, in order to calculate their conditional probabilities, we discretize them into intervals before modeling UR. DSM and DIF are each mapped to three levels (namely, low, medium, and high). So the probability that user’s relationship is equal to a certain discrete value is calculated with

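$$
\begin{aligned}
P(\mathrm{UR}=r \mid \mathrm{UP},\mathrm{DSM},\mathrm{DIF}) &= \frac{P(\mathrm{UR}=r,\mathrm{UP},\mathrm{DSM},\mathrm{DIF})}{P(\mathrm{UP})\,P(\mathrm{DSM})\,P(\mathrm{DIF})} \\
&= \frac{P(\mathrm{UR}=r)\,P(\mathrm{UP}\mid\mathrm{UR}=r)\,P(\mathrm{DSM}\mid\mathrm{UR}=r)\,P(\mathrm{DIF}\mid\mathrm{UR}=r)}{P(\mathrm{UP})\,P(\mathrm{DSM})\,P(\mathrm{DIF})},
\end{aligned}
\tag{11}
$$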

where $r \in \{\text{low}, \text{medium}, \text{high}\}$ denotes a certain discrete value of UR; UP, DSM, and DIF denote their corresponding discrete values, respectively; $P(\mathrm{UR}=r,\mathrm{UP},\mathrm{DSM},\mathrm{DIF})$ denotes the probability that UR = $r$ and UP, DSM, and DIF are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{UR}=r)$ denotes the probability that a feature equals a certain discrete value when UR = $r$. Then UR can be classified into three levels (namely, low, medium, and high) according to its maximum a posteriori probability, which is calculated with (11).
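As an illustration, the middle-layer computation in (11) can be sketched in Python. This is a minimal sketch under stated assumptions, not necessarily the implementation used in the experiments: the cut points passed to `discretize`, the add-one smoothing, and all class and function names are hypothetical. Because the denominator of (11) does not depend on $r$, the sketch normalizes the numerators over the levels of UR, which leaves the maximum a posteriori level unchanged.

```python
from collections import Counter, defaultdict


def discretize(value, low_cut, high_cut):
    """Map a continuous DSM or DIF value to a level (cut points are assumed)."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


class CategoricalNaiveBayes:
    """Naive Bayes over discrete features with add-one smoothing (an assumption)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.total = len(labels)
        n_features = len(rows[0])
        # feature_counts[j][c][v]: rows with label c whose j-th feature equals v
        self.feature_counts = [defaultdict(Counter) for _ in range(n_features)]
        self.feature_values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for j, v in enumerate(row):
                self.feature_counts[j][label][v] += 1
                self.feature_values[j].add(v)
        return self

    def posterior(self, row):
        """Return P(class | row), normalised over the classes."""
        scores = {}
        for c in self.classes:
            score = self.class_counts[c] / self.total  # prior P(UR = r)
            for j, v in enumerate(row):
                num = self.feature_counts[j][c][v] + 1
                den = self.class_counts[c] + len(self.feature_values[j])
                score *= num / den  # smoothed P(feature_j = v | UR = r)
            scores[c] = score
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

    def predict(self, row):
        """Return the maximum a posteriori class."""
        post = self.posterior(row)
        return max(post, key=post.get)
```

Each training row here would be a tuple (UP, DSM level, DIF level) and each label a UR level; how UR labels are obtained for training is outside the scope of this sketch.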

4.3. Naïve Bayes Model in Top Layer. Finally, since the calculations of user’s retweeting sentiment tendency (denoted as ST), user’s profile (UP), relationship (UR), and emotion (UE) all conform to the Naïve Bayes model, we adopt a multilayer Naïve Bayes model to analyze user’s retweeting sentiment tendency. In this paper, user’s retweeting sentiment tendency, being determined by user’s relationship and emotion, can be regarded as the root node of the Naïve Bayes model, and user’s relationship and emotion can be regarded as its leaf nodes. Thus, user’s retweeting sentiment tendency is calculated as follows:

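$$
\begin{aligned}
P(\mathrm{ST}=\mathrm{st} \mid \mathrm{UP},\mathrm{UR},\mathrm{UE}) &= \frac{P(\mathrm{ST}=\mathrm{st},\mathrm{UP},\mathrm{UR},\mathrm{UE})}{P(\mathrm{UP})\,P(\mathrm{UR})\,P(\mathrm{UE})} \\
&= \frac{P(\mathrm{ST}=\mathrm{st})\,P(\mathrm{UP}\mid\mathrm{ST}=\mathrm{st})\,P(\mathrm{UR}\mid\mathrm{ST}=\mathrm{st})\,P(\mathrm{UE}\mid\mathrm{ST}=\mathrm{st})}{P(\mathrm{UP})\,P(\mathrm{UR})\,P(\mathrm{UE})},
\end{aligned}
\tag{12}
$$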

where $\mathrm{st} \in \{\text{positive}, \text{negative}, \text{neutral}\}$ denotes a certain discrete value of ST; UP, UR, and UE denote their corresponding discrete values, respectively; $P(\mathrm{ST}=\mathrm{st},\mathrm{UP},\mathrm{UR},\mathrm{UE})$ denotes the probability that ST = st and UP, UR, and UE are equal to their corresponding discrete values; and $P(\cdot \mid \mathrm{ST}=\mathrm{st})$ denotes the probability that a feature equals a certain discrete value when ST = st. Thus, ST can be classified into three particular emotion statuses (namely, positive, negative, and neutral) according to its maximum a posteriori probability, which is calculated with (12).
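To make the layering concrete, the top-layer rule in (12) can be sketched as a function that takes the discrete outputs of the lower layers as evidence. This is only a sketch: the function name, the dictionary layout, and every probability value below are hypothetical placeholders, not estimates from the dataset. Only UR and UE are used as evidence in the example; UP could be added as a further entry in the same way. As the denominator of (12) is the same for every st, the scores are normalized over the three sentiment classes.

```python
def top_layer_posterior(prior, cpts, evidence):
    """Return P(ST = st | evidence) for every st, normalised over the classes.

    prior:    {st: P(ST = st)}
    cpts:     {feature: {st: {value: P(feature = value | ST = st)}}}
    evidence: {feature: observed discrete value}, for example the UR level
              predicted by the middle layer.
    """
    scores = {}
    for st, p_st in prior.items():
        score = p_st
        for feature, value in evidence.items():
            score *= cpts[feature][st][value]
        scores[st] = score
    z = sum(scores.values())
    return {st: s / z for st, s in scores.items()}


# Illustrative (made-up) probabilities, for demonstration only.
prior = {"positive": 0.4, "negative": 0.3, "neutral": 0.3}
cpts = {
    "UR": {
        "positive": {"low": 0.2, "medium": 0.3, "high": 0.5},
        "negative": {"low": 0.2, "medium": 0.3, "high": 0.5},
        "neutral": {"low": 0.6, "medium": 0.3, "high": 0.1},
    },
    "UE": {
        "positive": {"positive": 0.7, "negative": 0.1, "neutral": 0.2},
        "negative": {"positive": 0.1, "negative": 0.7, "neutral": 0.2},
        "neutral": {"positive": 0.2, "negative": 0.2, "neutral": 0.6},
    },
}
posterior = top_layer_posterior(prior, cpts, {"UR": "high", "UE": "positive"})
predicted = max(posterior, key=posterior.get)
```

In this toy setting the relationship level cannot separate positive from negative, while the emotion evidence can, which mirrors the motivation for combining both dimensions in one model.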

5. Experimental Evaluation

In this section, we conduct experiments to assess the effectiveness of the proposed framework MLNBRST. Through the experiments, we aim to answer the following two questions:

(i) How effective is the proposed framework MLNBRST compared with other methods of retweeting sentiment tendency analysis?

(ii) What are the effects of different features and temporal information on the performance of retweeting sentiment tendency analysis?

5.1. Dataset and Experimental Settings. To study the problem of retweeting sentiment tendency analysis, we leverage a Sina microblogging dataset [21], which contains time series of users’ tweets, retweets, and followings’ number from September 28, 2012, to October 29, 2012, to evaluate the validity of the proposed method. Moreover, since not all users post opinions when retweeting, we manually label a subset of Sina microblogging which contains retweeting contents with sentiment polarity. Statistics of the dataset are shown in Table 1.

The experimental settings of retweeting sentiment tendency analysis are described as follows: we randomly divide the dataset into two parts, $A$ and $B$. $A$ possesses 90% of the retweets and is used for training. The remaining 10% of the retweets, denoted as $B$, is designated for testing. We use 10-fold cross-validation to ensure that our results are reliable and report the mean performance via precision, recall, and $F_1$-measure.
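The evaluation protocol can be sketched as follows. This is a minimal sketch with hypothetical helper names; it assumes the retweets have already been shuffled, and it assigns item i to fold i mod k, which is only one of several reasonable ways to form the folds.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_idx, test_idx) pairs; item i is tested in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test


def per_class_metrics(y_true, y_pred, labels=("positive", "negative", "neutral")):
    """Return {label: (precision, recall, F1)} computed one class against the rest."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics
```

The mean performance is then the average of these per-fold values over the 10 folds.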

5.2. Performance Comparisons with Different Retweeting Sentiment Tendency Analyzing Methods

5.2.1. Experiments with Different Feature-Based Methods. To answer the first question, we first compare the proposed framework MLNBRST with four different feature-based methods:

(i) pMLNBRST: considering user’s profile-based features only;

(ii) rMLNBRST: considering user’s relationship-based features only;

(iii) eMLNBRST: considering user’s emotion-based features only;

(iv) aMLNBRST: considering all features.

The precision, recall, and $F_1$-measure of different feature-based methods are shown in Figure 2.

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Panels: (a) pMLNBRST, (b) rMLNBRST, (c) eMLNBRST, (d) aMLNBRST. Each panel plots precision, recall, and $F_1$-measure (vertical axis from 0 to 1) for the positive, negative, and neutral classes. Data labels, one (precision, recall, $F_1$-measure) triple per class: (a) (0.336, 0.401, 0.366), (0.427, 0.414, 0.42), (0.358, 0.39, 0.373); (b) (0.645, 0.643, 0.643), (0.664, 0.604, 0.632), (0.622, 0.642, 0.631); (c) (0.734, 0.733, 0.733), (0.742, 0.754, 0.747), (0.676, 0.689, 0.682); (d) (0.751, 0.763, 0.756), (0.787, 0.775, 0.78), (0.712, 0.731, 0.721).

Table 1: Statistics of the dataset.

Features      Statistics
users         296134
followings    51415017
tweets        176659
retweets      264743

We draw the following observations. Removing either user’s relationship-based or emotion-based features clearly lowers the model’s prediction ability. Additionally, emotion-based features contribute more to analyzing user’s retweeting sentiment polarity than profile- and relationship-based features, since plenty of positive or negative emotional words in retweeting contents provide a good guarantee for the accuracy of emotion prediction. Besides, the system improves its performance by merging profile-, relationship-, and content-based features, from which we find that it is difficult to predict user’s retweeting emotion tendency with only one specific type of features and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Since support vector machine (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy models (denoted as ME) are the most commonly used machine learning techniques in sentiment classification [26], on the basis of our proposed features, we compare the proposed framework MLNBRST with SVM, NB, and ME, as well as NBSVM used in [16], which is an improved SVM classifier, and Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and our best one are depicted in Figure 3.

From Figure 3, it can be found that the proposed approach shows the best results in terms of precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it relies strongly on the corpus. Since support vector machine is only applicable to a small training dataset, Naïve Bayes is more suitable than support vector machine for sentiment classification of microblogging, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user’s sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

Figure 3: Precisions, recalls, and $F_1$-measures of different methods. Methods: SVM, NB, ME, NBSVM, AdaRNN, MLNBRST; vertical axis from 0.5 to 1; series: precision, recall, $F_1$-measure.

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that all improvement is significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over representative feature-based methods and baseline methods, which answers the first question.

5.3. Analysis of Different Factors’ Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact that the proposed retweeting sentiment features have and accordingly answer the second question. Because of space limitations, Figures 4(a), 4(b), 4(c), and 4(d) merely illustrate the probability distributions of values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), if there is a larger dynamic Salton metric between two users, it is easier for them to retweet casually, which may lead to either positive or negative emotion being included in retweeting contents. In Figure 4(b), users who interact frequently may be less likely to retweet each other with neutral emotion tendency; instead, they may be more likely to express support or opposition to each other. In Figure 4(c), users have a higher probability of retweeting with positive emotion tendency if they have recently been in high spirits, and vice versa. Figure 4(d) shows strong evidence for the impact that emotion divergence has on user’s retweeting emotion tendency. If the absolute values of emotion divergence are small, the majority of users may retweet with neutral emotion. The ratio of retweeting with negative emotion is higher if there are negative values of emotion divergence between the emotion of the microblog and the emotion expressed in user’s recent states. If values of emotion divergence are positive, users are more likely to retweet with positive emotion, which may be determined by users’ recent mood statistics.

Table 2: Information gains of features.

Types of features   Features                          Information gain
Profile-based       bifollowers                       0.254
                    followers                         0.231
                    followees                         0.257
                    posts                             0.201
                    Province                          0.105
                    City                              0.110
                    Gender                            0.148
                    Created time of user’s account    0.092
                    Verified type of user’s account   0.102
Relation-based      Dynamic Salton metrics            0.485
                    Dynamic interaction frequency     0.503
Emotion-based       positive emotional words          0.647
                    negative emotional words          0.622
                    Recent mood statistics            0.694
                    Emotion divergence                0.723

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to user’s retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$
\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v),
\tag{13}
$$

where $e$ denotes a certain emotion state, which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$; $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$; $p(e)$ stands for the probability that emotion state $e$ appears in the dataset; $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset; and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of features are shown in Table 2.

As described in Table 2, the information gain of user’s profile-based features is lower than that of the relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, being derived from the counts of positive and negative emotional words, recent mood statistics and emotion divergence have larger information gains than the positive or negative emotional word counts.
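The information gain in (13) is the entropy of the emotion states minus their conditional entropy given the discretized feature. A minimal Python sketch follows; the function names are hypothetical, and the base-2 logarithm is an assumption, since (13) does not fix the base.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of the emotion states: -sum_e p(e) log2 p(e)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one discretized feature: H(E) - sum_v p(v) H(E | v)."""
    total = len(labels)
    by_value = {}
    for v, e in zip(values, labels):
        by_value.setdefault(v, []).append(e)
    conditional = sum(len(subset) / total * entropy(subset) for subset in by_value.values())
    return entropy(labels) - conditional
```

A feature whose discretized value determines the emotion state completely attains the full entropy of the labels as its gain, while a feature that is independent of the emotion state has a gain of zero.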

From the above it can be found that the proposedfactors can be used as a good indicator of userrsquos retweetingsentiment tendency However although social relationshipplayed an important role in individual emotion which isin keeping with Fowler and Christakisrsquos work [27] it canonly distinguish between neutral and other sentiment ten-dencies while emotion-based features are merely with gooddiscriminability on positive and negative sentiment polarityHence comprehensive considering on context information ofretweeting content is necessary

532 Impact of Temporal Information in MLNBRST Toanswer the second question we also investigate how temporalinformation affects the performance of our method in termsof 1198651-measure by changing the time slice weight factor 120572 Inthis paper 120572 is varied as 001 01 05 07 1 and we carryon 10-fold cross-validations with 50 60 80 and 100of 119860 for training so as to avoid bias brought by the sizes ofthe training data and the results are shown in Figure 5 whereldquo50rdquo ldquo60rdquo ldquo80rdquo and ldquo100rdquo denote that we leverage50 60 80 and 100 of 119860 for training

001 0105 07

1

506080

1000405060708

Training data ()

F1-

mea

sure

120572

Figure 5 The impact of temporal information in the proposedframework MLNBRST

It can be observed from Figure 5 when setting 120572 as 1namely without considering temporal information the 1198651-measure is much lower than the peak performance and the1198651-measure first increases greatly and then degrades rapidlyafter reaching a peak value with the increase 120572

The results in Sections 531 and 532 further demonstratethe importance of proposed features and temporal infor-mation in retweeting sentiment tendency analysis whichcorrespondingly answers the second question

10 Computational Intelligence and Neuroscience

6 Conclusion

In this paper we explored the problem of finding the possiblevariations and analyzing userrsquos retweeting sentiment ten-dency in dynamic social networks Firstly relationship-basedfeatures were inferred from usersrsquo dynamic Salton metricsand dynamic interaction frequency Secondly along with thenumber of positive and negative emotional words we builtrecent mood statistics and emotion divergence based on timeseries of usersrsquo posts And then on the basis of Naıve Bayestheory we represented models in lower layers from profile-relationship- and emotion-based dimension respectivelyfollowed by designing a multilayer Naıve Bayes model onconstructed models of different dimensions to analyze userrsquosretweeting sentiment tendency Finally we ran a set ofexperiments on a real-world dataset to investigate the per-formance of our model and reported system performancesin terms of precision recall and 1198651-measure In generalthe experimental results demonstrate the effectiveness of ourproposed framework

In future work we will employ crowd sourcing tech-nology to add more context information to our methodto ameliorate its performance as well as increase its onlineapplication scope Furthermore we will speculate on whatdirections can be undertaken to ameliorate its performancewith respect to time complexity

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by the National Natural ScienceFoundation of China under Grant no 61300148 the Scientificand Technological Break-Through Program of Jilin ProvinceunderGrant no 20130206051GX the Science andTechnologyDevelopment Program of Jilin Province under Grant no20130522112JH the Science Foundation for China Postdoctorunder Grant no 2012M510879 the Basic Scientific ResearchFoundation for the Interdisciplinary Research and Innova-tion Project of Jilin University under Grant no 201103129

References

[1] D Y Zhang and G Guo ldquoA comparison of online social net-works and real-life social networks a study of Sina Microblog-gingrdquo Mathematical Problems in Engineering vol 2014 ArticleID 578713 6 pages 2014

[2] B Agarwal N Mittal P Bansal and S Garg ldquoSentimentanalysis using common-sense and context informationrdquo Com-putational Intelligence and Neuroscience vol 2015 Article ID715730 9 pages 2015

[3] J Bollen H N Mao and A Pepe ldquoModeling public mood andemotion Twitter sentiment and socio-economic phenomenardquoin Proceedings of the 5th International AAAI Conference onWeblogs and Social Media pp 450ndash453 2011

[4] J Bollen B Goncalves G C Ruan and H N Mao ldquoHappinessis assortative in online social networksrdquo Artificial Life vol 17no 3 pp 237ndash251 2011

[5] J BollenHMao andX Zeng ldquoTwittermoodpredicts the stockmarketrdquo Journal of Computational Science vol 2 no 1 pp 1ndash82011

[6] A Tumasjan T O Sprenger P G Sandner and I M WelpeldquoPredicting elections with Twitter What 140 characters revealabout political sentimentrdquo inProceedings of the 4th InternationalAAAIConference onWeblogs and SocialMedia (ICWSM rsquo10) pp178ndash185 May 2010

[7] D N Trung and J J Jung ldquoSentiment analysis based onfuzzy propagation in online social networks a case study onTweetScoperdquoComputer Science and Information Systems vol 11no 1 pp 215ndash228 2014

[8] S A Golder andMWMacy ldquoDiurnal and seasonal mood varywithwork sleep and daylength across diverse culturesrdquo Sciencevol 333 no 6051 pp 1878ndash1881 2011

[9] S Stieglitz and L Dang-Xuan ldquoEmotions and informationdiffusion in social mediamdashsentiment ofmicroblogs and sharingbehaviorrdquo Journal of Management Information Systems vol 29no 4 pp 217ndash247 2013



Page 7: Research Article: A Multilayer Naïve Bayes Model for ... (downloads.hindawi.com/journals/cin/2015/510281.pdf)

Computational Intelligence and Neuroscience 7

Figure 2: The impact of different feature sets in the proposed framework MLNBRST. Each panel plots precision, recall, and $F_1$-measure (vertical axis from 0 to 1) for the positive, negative, and neutral classes. Bar labels are listed as (precision, recall, $F_1$-measure) triples, one per class.
(a) pMLNBRST: (0.336, 0.401, 0.366), (0.427, 0.414, 0.42), (0.358, 0.39, 0.373)
(b) rMLNBRST: (0.645, 0.643, 0.643), (0.664, 0.604, 0.632), (0.622, 0.642, 0.631)
(c) eMLNBRST: (0.734, 0.733, 0.733), (0.742, 0.754, 0.747), (0.676, 0.689, 0.682)
(d) aMLNBRST: (0.751, 0.763, 0.756), (0.787, 0.775, 0.78), (0.712, 0.731, 0.721)

Table 1: Statistics of the dataset.

Features      Statistics
users         296134
followings    51415017
tweets        176659
retweets      264743

(ii) rMLNBRST: considering user's relationship-based features only.

(iii) eMLNBRST: considering user's emotion-based features only.

(iv) aMLNBRST: considering all features.

The precision, recall, and $F_1$-measure of the different feature-based methods are shown in Figure 2.
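The three scores reported in Figure 2 can be computed per class as in the following minimal sketch. It is not the authors' evaluation code; the function name `per_class_scores` and the toy labels in the usage note are illustrative assumptions, and only the standard definitions of precision, recall, and $F_1$-measure are used.

```python
from collections import Counter

# Class names follow the paper's three retweeting sentiment tendencies.
LABELS = ("positive", "negative", "neutral")


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} for a multi-class prediction."""
    true_positive = Counter()          # correct predictions per class
    predicted = Counter(y_pred)        # how often each class was predicted
    actual = Counter(y_true)           # how often each class really occurs
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_positive[t] += 1

    scores = {}
    for label in labels:
        precision = true_positive[label] / predicted[label] if predicted[label] else 0.0
        recall = true_positive[label] / actual[label] if actual[label] else 0.0
        total = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores
```

For example, with true labels `["positive", "positive", "negative", "neutral", "neutral", "neutral"]` and predictions `["positive", "negative", "negative", "neutral", "neutral", "positive"]`, the positive class obtains precision 0.5 and recall 0.5, while the neutral class obtains precision 1.0 and recall 2/3.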

We draw the following observations. Removing either the user's relationship-based or emotion-based features noticeably lowers the model's prediction ability. Additionally, emotion-based features contribute more to analyzing user's retweeting sentiment polarity than profile- and relationship-based features, since the many positive or negative emotional words in retweeting contents provide a good basis for accurate emotion prediction. Besides, the system improves its performance when the emphasis shifts towards merging profile-, relationship-, and content-based features, from which we conclude that it is difficult to predict user's retweeting emotion tendency with only one specific type of feature and that it is important to fuse multidimensional features reasonably.

5.2.2. Experiments with Other Sentiment Tendency Analyzing Methods. Support vector machines (denoted as SVM), Naïve Bayes (denoted as NB), and maximum entropy models (denoted as ME) are the machine learning techniques most commonly used in sentiment classification [26]. Therefore, on the basis of our proposed features, we compare the proposed framework MLNBRST with SVM, NB, and ME, as well as with NBSVM, an improved SVM classifier used in [16], and the Adaptive Recursive Neural Network (denoted as AdaRNN) used in [20], to answer the first question. Due to space restrictions, only the mean precisions, recalls, and $F_1$-measures of the aforementioned methods and of our best one are depicted in Figure 3.

From Figure 3, it can be seen that the proposed approach achieves the best precision, recall, and $F_1$-measure. The maximum entropy method achieves the worst results because it relies strongly on the corpus. Since the support vector machine is only applicable to a small training dataset, Naïve Bayes is more suitable than the support vector machine for sentiment classification of microblogs, which is in line with [25]. The Adaptive Recursive Neural Network method can achieve better precision only with a complete dataset. Additionally, given the context of the retweeting content, our proposed method stratifies different factors according to the correlations between them via a multilayer Naïve Bayes model; consequently, it performs better than the Naïve Bayes and NBSVM methods. Moreover, we obtain a significant improvement in performance (+10.7% in terms of precision, +22.3% in terms of recall, and +16.9% in terms of $F_1$-measure) compared with [11], which leveraged a Naïve Bayes classifier to analyze user's sentiment via the 95 most frequently used emoticons in Chinese tweets. Besides, we achieve better results (+5.8% in terms of precision, +12.1% in terms of recall, and +9.9% in terms of $F_1$-measure) compared with [18], which leveraged SVR (support vector regression) to classify emotions in Chinese tweets on the basis of common social network characteristics and other carefully generalized linguistic patterns.

Figure 3: Precisions, recalls, and $F_1$-measures of different methods (SVM, NB, ME, NBSVM, AdaRNN, and MLNBRST; vertical axis from 0.5 to 1).

In summary, the results in Sections 5.2.1 and 5.2.2 suggest that all improvement is significant. With the help of the multilayer Naïve Bayes model based on the integration of multidimensional features, the proposed framework MLNBRST gains significant improvement over the representative feature-based methods and the baseline methods, which answers the first question.

5.3. Analysis of Different Factors' Impacts in MLNBRST

5.3.1. Impacts of Different Retweeting Sentiment Features in MLNBRST. In this context, we first investigate the impact of the proposed retweeting sentiment features and accordingly answer the second question. Because of space constraints, Figures 4(a), 4(b), 4(c), and 4(d) illustrate only the probability distributions of the values of dynamic Salton metrics, dynamic interaction frequency, recent mood statistics, and emotion divergence under the different retweeting sentiment tendencies (positive, negative, and neutral).

As depicted in Figure 4(a), the larger the dynamic Salton metric between two users, the more casually they tend to retweet each other, which may lead to either positive or negative emotion being included in the retweeting content. In Figure 4(b), users who interact frequently are less likely to retweet each other with a neutral emotion tendency; instead, they are more likely to express support or opposition. In Figure 4(c), users have a higher probability of retweeting with a positive emotion tendency if they have been in high spirits lately, and vice versa. Figure 4(d) shows strong evidence for the impact of emotion divergence on user's retweeting emotion tendency. If the absolute value of emotion divergence is small, the majority of users may retweet with neutral emotion. The ratio of retweeting with negative emotion is higher if the emotion divergence between the emotion of the microblog and the emotion expressed in the user's recent states is negative. If the emotion divergence is positive, users are more likely to retweet with positive emotion, which may be determined by users' recent mood statistics.

Table 2: Information gains of features.

Types of features   Features                            Information gains
Profile-based       bifollowers                         0.254
                    followers                           0.231
                    followees                           0.257
                    posts                               0.201
                    Province                            0.105
                    City                                0.110
                    Gender                              0.148
                    Created time of user's account      0.092
                    Verified type of user's account     0.102
Relation-based      Dynamic Salton metrics              0.485
                    Dynamic interaction frequency       0.503
Emotion-based       positive emotional words            0.647
                    negative emotional words            0.622
                    Recent mood statistics              0.694
                    Emotion divergence                  0.723

Furthermore, we employ information entropy theory to further explore the contributions of the proposed features to user's retweeting sentiment tendency. The information gain of the $i$th feature $f_i$ is calculated as

$$\mathrm{IG}(f_i) = -\sum_{e \in E} p(e) \log p(e) + \sum_{v \in V_{f_i}} p(v) \sum_{e \in E} p(e \mid v) \log p(e \mid v), \tag{13}$$

where $e$ denotes a certain emotion state which belongs to the emotion set $E = \{\text{positive}, \text{negative}, \text{neutral}\}$; $v$ denotes a value in the discretized value set $V_{f_i}$ of the $i$th feature $f_i$; $p(e)$ stands for the probability that emotion state $e$ appears in the dataset; $p(v)$ stands for the probability that the discretized value of $f_i$ is equal to $v$ in the dataset; and $p(e \mid v)$ stands for the probability that emotion state $e$ appears in the dataset when the discretized value of $f_i$ is equal to $v$. Information gains of features are shown in Table 2.

Figure 4: The probability distribution of different retweeting sentiment tendency features, plotted for positive, negative, and neutral retweets: (a) dynamic Salton metrics, (b) dynamic interaction frequency, (c) recent mood statistics, (d) emotion divergence.

As described in Table 2, the information gain of user's profile-based features is lower than that of the relationship- and emotion-based features. Moreover, emotion-based features have higher information gains than relationship-based features. In addition, recent mood statistics and emotion divergence, which are derived from the counts of positive and negative emotional words, have larger information gains than the counts of positive or negative emotional words themselves.
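The information gain in (13) is the entropy of the emotion labels minus their conditional entropy given the discretized feature value. The sketch below illustrates that computation under stated assumptions: the logarithm base (2 here) and the equal-width binning in `discretize` are not specified in the text, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a list of emotion labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, emotions):
    """IG(f_i) = H(E) - sum_v p(v) * H(E | v) for one discretized feature."""
    groups = defaultdict(list)
    for v, e in zip(values, emotions):
        groups[v].append(e)            # emotions observed for each feature value
    n = len(emotions)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(emotions) - conditional


def discretize(xs, bins=10):
    """Equal-width binning; the paper's discretization rule is not given."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins or 1.0    # avoid division by zero for constant features
    return [min(int((x - lo) / width), bins - 1) for x in xs]
```

A feature whose value determines the emotion exactly yields an information gain equal to the label entropy, whereas a feature that is independent of the emotion yields a gain of zero; this is the sense in which Table 2 ranks the features.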

From the above, it can be seen that the proposed factors are good indicators of user's retweeting sentiment tendency. However, although social relationships play an important role in individual emotion, which is in keeping with Fowler and Christakis's work [27], relationship-based features can only distinguish between neutral and other sentiment tendencies, while emotion-based features discriminate well only between positive and negative sentiment polarity. Hence, comprehensive consideration of the context information of retweeting content is necessary.

5.3.2. Impact of Temporal Information in MLNBRST. To answer the second question, we also investigate how temporal information affects the performance of our method in terms of $F_1$-measure by changing the time slice weight factor $\alpha$. In this paper, $\alpha$ is varied over {0.01, 0.1, 0.5, 0.7, 1}, and we carry out 10-fold cross-validation with 50%, 60%, 80%, and 100% of $A$ for training, so as to avoid bias brought by the size of the training data. The results are shown in Figure 5, where "50," "60," "80," and "100" denote that we leverage 50%, 60%, 80%, and 100% of $A$ for training.

Figure 5: The impact of temporal information in the proposed framework MLNBRST ($F_1$-measure, from 0.4 to 0.8, plotted against $\alpha$ in {0.01, 0.1, 0.5, 0.7, 1} and the percentage of training data, from 50 to 100).

It can be observed from Figure 5 that when $\alpha$ is set to 1, namely, without considering temporal information, the $F_1$-measure is much lower than the peak performance. As $\alpha$ increases, the $F_1$-measure first increases greatly and then degrades rapidly after reaching a peak value.

The results in Sections 5.3.1 and 5.3.2 further demonstrate the importance of the proposed features and temporal information in retweeting sentiment tendency analysis, which correspondingly answers the second question.
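The sweep over the time slice weight factor described in Section 5.3.2 could be organized as in the following sketch. It is only an illustration: `train_and_score` is a hypothetical callback standing in for fitting MLNBRST and measuring its $F_1$-measure, and subsampling the training folds is one assumed reading of "50%, 60%, 80%, and 100% of $A$".

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and split it into k nearly equal validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sweep_alpha(n, train_and_score,
                alphas=(0.01, 0.1, 0.5, 0.7, 1.0),
                fractions=(0.5, 0.6, 0.8, 1.0),
                k=10, seed=0):
    """Return {(alpha, fraction): mean F1 over k folds}.

    train_and_score(train_idx, test_idx, alpha) is a hypothetical callback
    that fits a model on train_idx and returns its F1 on test_idx.
    """
    folds = k_fold_indices(n, k, seed)
    results = {}
    for alpha in alphas:
        for frac in fractions:
            scores = []
            for i, test_idx in enumerate(folds):
                # All samples outside the held-out fold form the training pool.
                train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
                keep = max(1, int(len(train_idx) * frac))   # use only part of the pool
                scores.append(train_and_score(train_idx[:keep], test_idx, alpha))
            results[(alpha, frac)] = sum(scores) / len(scores)
    return results
```

Selecting `max(results, key=results.get)` then gives the (alpha, fraction) pair with the highest mean score, which corresponds to reading the peak off Figure 5.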


6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user's retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users' dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users' posts. Then, on the basis of Naïve Bayes theory, we built lower-layer models from the profile-, relationship-, and emotion-based dimensions, respectively, and designed a multilayer Naïve Bayes model on top of these models to analyze user's retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and $F_1$-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.
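The layered design summarized above can be illustrated with a small sketch: one Naïve Bayes model per feature dimension in the lower layer, and an upper Naïve Bayes model over the lower-layer predictions. How the layers are actually coupled in MLNBRST is not reproduced here; the stacking rule, the class names `CategoricalNB` and `MultilayerNB`, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Naive Bayes over discrete features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.total = len(labels)
        self.counts = defaultdict(Counter)  # (class, feature index) -> value counts
        self.values = defaultdict(set)      # feature index -> observed values
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(y, j)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        scores = {}
        for c in self.classes:
            s = math.log(self.prior[c] / self.total)
            for j, v in enumerate(row):
                num = self.counts[(c, j)][v] + 1
                den = self.prior[c] + len(self.values[j]) + 1  # extra slot for unseen values
                s += math.log(num / den)
            scores[c] = s
        return max(scores, key=scores.get)


class MultilayerNB:
    """Lower layer: one NB per feature group. Upper layer: NB over their votes."""

    def __init__(self, groups):
        self.groups = groups  # e.g. {"profile": [0], "relationship": [1], "emotion": [2]}

    def _votes(self, row):
        return [self.lower[g].predict([row[j] for j in cols])
                for g, cols in self.groups.items()]

    def fit(self, rows, labels):
        self.lower = {
            g: CategoricalNB().fit([[r[j] for j in cols] for r in rows], labels)
            for g, cols in self.groups.items()
        }
        # The upper model treats each lower-layer prediction as one discrete feature.
        self.upper = CategoricalNB().fit([self._votes(r) for r in rows], labels)
        return self

    def predict(self, row):
        return self.upper.predict(self._votes(row))
```

Grouping the features this way keeps each lower model small and lets the upper model learn how much to trust each dimension, which matches the observation in Section 5.3.1 that no single feature type separates all three tendencies.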

In future work, we will employ crowdsourcing technology to add more context information to our method, both to improve its performance and to broaden its scope of online application. Furthermore, we will explore directions for improving its performance with respect to time complexity.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, "A comparison of online social networks and real-life social networks: a study of Sina Microblogging," Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, "Sentiment analysis using common-sense and context information," Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, "Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena," in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, "Happiness is assortative in online social networks," Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, "Predicting elections with Twitter: what 140 characters reveal about political sentiment," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM '10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, "Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope," Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, "Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures," Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, "Emotions and information diffusion in social media - sentiment of microblogs and sharing behavior," Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, "Enhanced sentiment learning using twitter hash-tags and smileys," in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, "MoodLens: an emoticon-based sentiment analysis system for Chinese tweets," in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, "Characterizing microblogs with topic models," in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM '10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, "Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network," Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, "A knowledge-based approach for polarity classification in Twitter," Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, "Ranked WordNet graph for sentiment polarity classification in Twitter," Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, "A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle," in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Šimko, "Sentiment analysis on microblog utilizing appraisal theory," World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, "Text-based emotion classification using emotion cause extraction," Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, "Dynamic evolution of collective emotions in social networks: a case study of Sina weibo," Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, "Adaptive recursive neural network for target-dependent Twitter sentiment classification," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL '14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, "Social influence locality for modeling retweeting behaviors," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, "User-level sentiment analysis incorporating social networks," in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), pp. 1397–1405, August 2011.

[23] M. Newman, "Clustering and preferential attachment in growing networks," Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, "Understanding social relationship evolution by using real-world sensing data," World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, "Classifying sentiment in microblogs: is brevity an advantage?" in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, "Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study," British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article A Multilayer Na ve Bayes Model for ...downloads.hindawi.com/journals/cin/2015/510281.pdfmarket [], predicting results of presidential election [] , and modeling for

8 Computational Intelligence and Neuroscience

SVM NB

ME

NBS

VM

AdaR

NN

MLN

BRST

05055

06065

07075

08085

09095

1

PrecisionRecallF1-measure

Figure 3 Precisions recalls and 1198651-measures of different methods

sentiment via 95 most frequently used emoticons in Chinesetweets Besides we achieve better results (+58 in terms ofprecision +121 in terms of recall and +99 in terms of1198651-measure) compared with [18] which leveraged SVR (supportvector regression) to classify emotions in Chinese tweets onthe basis of common social network characteristics and othercarefully generalized linguistic patterns

In summary the results in Sections 521 and 522 sug-gest that all improvement is significant With the help ofmultilayer Naıve Bayes model based on integration of mul-tidimensional features the proposed framework MLNBRSTgains significant improvement over representative differentfeature-basedmethods and baseline methods which answersthe first question

53 Analysis of Different Factorsrsquo Impacts in MLNBRST

531 Impacts of Different Retweeting Sentiment Features inMLNBRST In this context we first investigate the impactthat the proposed retweeting sentiment features have andaccordingly answer the second question Because of the spacecrunch Figures 4(a) 4(b) 4(c) and 4(d) merely illustrate theprobability distribution of values of dynamic Salton metricsdynamic interaction frequency recent mood statistics andemotion divergence under different retweeting sentimenttendencies (positive negative and neutral)

As depicted in Figure 4(a) if there is a bigger dynamicSalton metrics between two users it is more easier toretweet casually which may lead to either positive emotionor negative emotion being included in retweeting contentsIn Figure 4(b) users who interact frequently may be lesspossible to retweet in neutral emotion tendency with eachother instead there may be more likely to express supportor opposition to each other And in Figure 4(c) users willhave higher probability to retweet with positive emotiontendency if they are in high spirits latently and vice versaFigure 4(d) shows strong evidence for the impact that emo-tion divergence has on userrsquos retweeting emotion tendency Ifabsolute values of emotion divergence are small the majorityof users may retweet with neutral emotion And the ratio

Table 2 Information gains of features

Types of features Features Informationgains

Profile-based

bifollowers 0254followers 0231followees 0257posts 0201Province 0105City 0110

Gender 0148Created time of userrsquos

account 0092

Verified type of userrsquosaccount 0102

Relation-based Dynamic Salton metrics 0485Dynamic interaction

frequency 0503

Emotion-based

positive emotionalwords 0647

negative emotionalwords 0622

Recent mood statistics 0694Emotion divergence 0723

of retweeting with negative emotion is higher if there arenegative values of emotion divergence between emotion ofmicroblogging and emotion that is expressed in userrsquos recentstates If values of emotion divergence are positive users aremore likely to retweet with positive emotion which may bedetermined by usersrsquo recent mood statistics

Furthermore we employ information entropy theory tofurther explore the contributions of the proposed features touserrsquos retweeting sentiment tendency The information gainof the 119894th feature 119891

119894is calculated as

IG (119891119894) = minussum

119890isin119864

119901 (119890) log119901 (119890)

+ sum

Visin119881119891119894

119901 (V) sum119890isin119864

119901 (119890 | V) log119901 (119890 | V) (13)

where 119890 denotes a certain emotion state which belongs toemotion set 119864 = positive negative neutral V denotes avalue in discretized value set 119881

119891119894

of the 119894th feature 119891119894 119901(119890)

stands for the probability that emotion state 119890 appears indataset 119901(V) stands for the probability that discretized valueof 119891119894is equal to V in dataset and 119901(119890 | V) stands for the

probability that emotion state 119890 appears in dataset whendiscretized value of 119891

119894is equal to V Information gains of

features are shown in Table 2As described in Table 2 information gain of userrsquos

profile-based features is lower than other relationship- andemotion-based features Moreover emotion-based featureshave higher information gains than relationship-based fea-tures In addition being processed based on the count ofpositive and negative emotional words recentmood statistics

Computational Intelligence and Neuroscience 9

0

005

01

015

02

025

0 2 4 6 8 10 12 14 16 18Dynamic Salton metrics

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(a)

0 02 04 06 08 10

002004006008

01012014016018

02

Dynamic interaction frequency

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(b)

0 02 04 06 08 10

005

01

015

02

025

Recent mood statistics

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(c)

minus06 minus04 minus02 0 02 04 06 08 10

002004006008

01012014016018

Emotion divergence

Prob

abili

ty d

istrib

utio

n

PositiveNegativeNeutral

(d)

Figure 4 The probability distribution of different retweeting sentiment tendency features

and emotion divergence have larger information gains thanpositive or negative emotional words count

From the above it can be found that the proposedfactors can be used as a good indicator of userrsquos retweetingsentiment tendency However although social relationshipplayed an important role in individual emotion which isin keeping with Fowler and Christakisrsquos work [27] it canonly distinguish between neutral and other sentiment ten-dencies while emotion-based features are merely with gooddiscriminability on positive and negative sentiment polarityHence comprehensive considering on context information ofretweeting content is necessary

532 Impact of Temporal Information in MLNBRST Toanswer the second question we also investigate how temporalinformation affects the performance of our method in termsof 1198651-measure by changing the time slice weight factor 120572 Inthis paper 120572 is varied as 001 01 05 07 1 and we carryon 10-fold cross-validations with 50 60 80 and 100of 119860 for training so as to avoid bias brought by the sizes ofthe training data and the results are shown in Figure 5 whereldquo50rdquo ldquo60rdquo ldquo80rdquo and ldquo100rdquo denote that we leverage50 60 80 and 100 of 119860 for training

001 0105 07

1

506080

1000405060708

Training data ()

F1-

mea

sure

120572

Figure 5 The impact of temporal information in the proposedframework MLNBRST

It can be observed from Figure 5 when setting 120572 as 1namely without considering temporal information the 1198651-measure is much lower than the peak performance and the1198651-measure first increases greatly and then degrades rapidlyafter reaching a peak value with the increase 120572

The results in Sections 531 and 532 further demonstratethe importance of proposed features and temporal infor-mation in retweeting sentiment tendency analysis whichcorrespondingly answers the second question

10 Computational Intelligence and Neuroscience

6. Conclusion

In this paper, we explored the problem of finding the possible variations and analyzing user’s retweeting sentiment tendency in dynamic social networks. Firstly, relationship-based features were inferred from users’ dynamic Salton metrics and dynamic interaction frequency. Secondly, along with the number of positive and negative emotional words, we built recent mood statistics and emotion divergence based on time series of users’ posts. Then, on the basis of Naïve Bayes theory, we built lower-layer models from the profile-, relationship-, and emotion-based dimensions, respectively, and designed a multilayer Naïve Bayes model on top of these models to analyze user’s retweeting sentiment tendency. Finally, we ran a set of experiments on a real-world dataset to investigate the performance of our model and reported system performance in terms of precision, recall, and F1-measure. In general, the experimental results demonstrate the effectiveness of our proposed framework.

In future work, we will employ crowdsourcing technology to add more context information to our method, in order to improve its performance and broaden its online application scope. Furthermore, we will investigate which directions can be pursued to improve its performance with respect to time complexity.
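The layered construction summarized in this section can be sketched in a few lines. This is a simplified illustration under explicit assumptions, not the model as specified in the paper: features are assumed to be discretized, the upper layer is assumed to be a Naïve Bayes classifier over the labels predicted by the three lower-layer models, and the class names `CategoricalNB` and `MultilayerNB`, the column grouping, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Naive Bayes over discrete features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.counts = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen in training
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        def log_posterior(c):
            score = math.log(self.class_counts[c] / self.n)
            for j, v in enumerate(row):
                num = self.counts[(j, c)][v] + 1
                den = self.class_counts[c] + len(self.values[j])
                score += math.log(num / den)
            return score
        return max(self.classes, key=log_posterior)


class MultilayerNB:
    """Lower layer: one NB per dimension. Upper layer: NB over their outputs."""

    def __init__(self, dimensions):
        self.dimensions = dimensions  # name -> column indices (assumed grouping)

    def _lower_outputs(self, row):
        return [self.lower[name].predict([row[j] for j in cols])
                for name, cols in self.dimensions.items()]

    def fit(self, rows, labels):
        self.lower = {
            name: CategoricalNB().fit([[r[j] for j in cols] for r in rows], labels)
            for name, cols in self.dimensions.items()
        }
        # Simplification: the upper layer is trained on lower-layer
        # predictions for the same training rows.
        self.upper = CategoricalNB().fit(
            [self._lower_outputs(r) for r in rows], labels)
        return self

    def predict(self, row):
        return self.upper.predict(self._lower_outputs(row))


# Toy rows (assumption): (profile bin, relationship bin, emotion bin).
rows = [("v", "close", "happy"), ("n", "close", "happy"),
        ("v", "close", "sad"), ("n", "close", "sad"),
        ("v", "far", "happy"), ("n", "far", "sad")]
labels = ["pos", "pos", "neg", "neg", "neu", "neu"]
model = MultilayerNB({"profile": [0], "relationship": [1], "emotion": [2]})
model.fit(rows, labels)
print([model.predict(r) for r in rows])
```

In this toy setting the relationship model only separates neutral from the rest and the emotion model only separates positive from negative, so the upper layer is what resolves all three tendencies.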

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Scientific and Technological Break-Through Program of Jilin Province under Grant no. 20130206051GX, the Science and Technology Development Program of Jilin Province under Grant no. 20130522112JH, the Science Foundation for China Postdoctor under Grant no. 2012M510879, and the Basic Scientific Research Foundation for the Interdisciplinary Research and Innovation Project of Jilin University under Grant no. 201103129.

References

[1] D. Y. Zhang and G. Guo, “A comparison of online social networks and real-life social networks: a study of Sina Microblogging,” Mathematical Problems in Engineering, vol. 2014, Article ID 578713, 6 pages, 2014.

[2] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, “Sentiment analysis using common-sense and context information,” Computational Intelligence and Neuroscience, vol. 2015, Article ID 715730, 9 pages, 2015.

[3] J. Bollen, H. N. Mao, and A. Pepe, “Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena,” in Proceedings of the 5th International AAAI Conference on Weblogs and Social Media, pp. 450–453, 2011.

[4] J. Bollen, B. Gonçalves, G. C. Ruan, and H. N. Mao, “Happiness is assortative in online social networks,” Artificial Life, vol. 17, no. 3, pp. 237–251, 2011.

[5] J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.

[6] A. Tumasjan, T. O. Sprenger, P. G. Sandner, and I. M. Welpe, “Predicting elections with Twitter: What 140 characters reveal about political sentiment,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 178–185, May 2010.

[7] D. N. Trung and J. J. Jung, “Sentiment analysis based on fuzzy propagation in online social networks: a case study on TweetScope,” Computer Science and Information Systems, vol. 11, no. 1, pp. 215–228, 2014.

[8] S. A. Golder and M. W. Macy, “Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures,” Science, vol. 333, no. 6051, pp. 1878–1881, 2011.

[9] S. Stieglitz and L. Dang-Xuan, “Emotions and information diffusion in social media—sentiment of microblogs and sharing behavior,” Journal of Management Information Systems, vol. 29, no. 4, pp. 217–247, 2013.

[10] D. Davidov, O. Tsur, and A. Rappoport, “Enhanced sentiment learning using twitter hash-tags and smileys,” in Proceedings of the 23rd International Conference on Computational Linguistics, pp. 241–249, August 2010.

[11] J. C. Zhao, L. Dong, J. J. Wu, and K. Xu, “MoodLens: an emoticon-based sentiment analysis system for Chinese tweets,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’12), pp. 1528–1531, August 2012.

[12] D. Ramage, S. Dumais, and D. Liebling, “Characterizing microblogs with topic models,” in Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM ’10), pp. 130–137, May 2010.

[13] M. Ghiassi, J. Skinner, and D. Zimbra, “Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network,” Expert Systems with Applications, vol. 40, no. 16, pp. 6266–6282, 2013.

[14] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “A knowledge-based approach for polarity classification in Twitter,” Journal of the Association for Information Science and Technology, vol. 65, no. 2, pp. 414–425, 2014.

[15] A. Montejo-Ráez, E. Martínez-Cámara, M. T. Martín-Valdivia, and L. A. Ureña-López, “Ranked WordNet graph for sentiment polarity classification in Twitter,” Computer Speech & Language, vol. 28, no. 1, pp. 93–107, 2014.

[16] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, “A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle,” in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics System Demonstrations, pp. 115–120, 2012.

[17] P. Korenek and M. Šimko, “Sentiment analysis on microblog utilizing appraisal theory,” World Wide Web, vol. 17, no. 4, pp. 847–867, 2014.

[18] W. Li and H. Xu, “Text-based emotion classification using emotion cause extraction,” Expert Systems with Applications, vol. 41, no. 4, pp. 1742–1749, 2014.

[19] X. Xiong, G. Zhou, Y. Huang, H. Chen, and K. Xu, “Dynamic evolution of collective emotions in social networks: a case study of Sina weibo,” Science China Information Sciences, vol. 56, no. 7, pp. 1–18, 2013.

[20] L. Dong, F. R. Wei, C. Q. Tan, D. Y. Tang, M. Zhou, and K. Xu, “Adaptive recursive neural network for target-dependent Twitter sentiment classification,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14), pp. 49–54, June 2014.

[21] J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, “Social influence locality for modeling retweeting behaviors,” in Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2761–2767, August 2013.

[22] C. H. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1397–1405, August 2011.

[23] M. Newman, “Clustering and preferential attachment in growing networks,” Physical Review E, vol. 64, no. 2, Article ID 025102, 4 pages, 2001.

[24] Z. W. Yu, X. S. Zhou, D. Q. Zhang, G. Schiele, and C. Becker, “Understanding social relationship evolution by using real-world sensing data,” World Wide Web, vol. 16, no. 5-6, pp. 749–762, 2013.

[25] A. Bermingham and A. F. Smeaton, “Classifying sentiment in microblogs: is brevity an advantage?” in Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1833–1836, October 2010.

[26] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79–86, 2002.

[27] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study,” British Medical Journal, vol. 337, Article ID a2338, 9 pages, 2008.



