
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pages 314–324, Baltimore, Maryland, USA, June 23-25 2014. © 2014 Association for Computational Linguistics

Extracting Opinion Targets and Opinion Words from Online Reviews with Graph Co-ranking

Kang Liu, Liheng Xu and Jun Zhao
National Laboratory of Pattern Recognition

Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
{kliu, lhxu, jzhao}@nlpr.ia.ac.cn

Abstract

Extracting opinion targets and opinion words from online reviews are two fundamental tasks in opinion mining. This paper proposes a novel approach to collectively extract them with graph co-ranking. First, in contrast to previous methods, which solely employed opinion relations among words, our method constructs a heterogeneous graph to model two types of relations: semantic relations and opinion relations. Next, a co-ranking algorithm is proposed to estimate the confidence of each candidate, and the candidates with higher confidence will be extracted as opinion targets/words. In this way, the different relations have cooperative effects on the candidates' confidence estimation. Moreover, word preference is captured and incorporated into our co-ranking algorithm, so the co-ranking is personalized and each candidate's confidence is determined mainly by its preferred collocations, which helps to improve extraction precision. Experimental results on three datasets with different sizes and languages show that our approach achieves better performance than state-of-the-art methods.

1 Introduction

In opinion mining, extracting opinion targets and opinion words are two fundamental subtasks. Opinion targets are the objects about which users' opinions are expressed, and opinion words are the words which indicate the polarities of opinions. Extracting them provides essential information for fine-grained analysis of customers' opinions. Thus, the task has attracted a lot of attention (Hu and Liu, 2004b; Liu et al., 2012; Moghaddam and Ester, 2011; Mukherjee and Liu, 2012).

To this end, previous work usually employed a collective extraction strategy (Qiu et al., 2009; Hu and Liu, 2004b; Liu et al., 2013b). The intuition is that opinion words usually co-occur with opinion targets in sentences, and there is a strong modification relationship between them (called an opinion relation in (Liu et al., 2012)). If a word is an opinion word, the words with which it has opinion relations are highly likely to be opinion targets, and vice versa. In this way, extraction is performed alternately, with mutual reinforcement between opinion targets and opinion words. Although this strategy has been widely employed by previous approaches, it still has several limitations.

1) Only considering opinion relations is insufficient. Previous methods mainly focused on employing opinion relations among words for opinion target/word co-extraction. They investigated a series of techniques to improve the performance of opinion relation identification, such as nearest-neighbor rules (Liu et al., 2005), syntactic patterns (Zhang et al., 2010; Popescu and Etzioni, 2005), word alignment models (Liu et al., 2012; Liu et al., 2013b; Liu et al., 2013a), etc. However, is merely employing opinion relations among words enough for opinion target/word extraction? We note that there are additional types of relations among words. For example, “LCD” and “LED” both denote the same aspect “screen” in the TV set domain, and they are topically related. We call such relations between homogeneous words semantic relations. If we know “LCD” to be an opinion target, “LED” is naturally likely to be an opinion target as well. Intuitively, besides opinion relations, semantic relations may provide additional rich clues for indicating opinion targets/words. Which kind of relation is more effective for opinion target/word extraction? Is it beneficial to consider these two types of relations together for the extraction? To the best of our knowledge, these problems have seldom been studied before (see Section 2).

2) Ignoring word preference. When employing opinion relations to perform mutually reinforcing extraction between opinion targets and opinion words, previous methods relied on opinion associations among words but seldom considered word preference. Word preference denotes a word's preferred collocations. Intuitively, the confidence of a candidate being an opinion target (opinion word) should mostly be determined by its preferred collocations rather than by all words having opinion relations with it. For example:

“This camera’s price is expensive for me.”
“It’s price is good.”
“Canon 40D has a good price.”

In these three sentences, “price” is modified by “good” more times than by “expensive”. In the traditional extraction strategy, opinion associations are usually computed based on co-occurrence frequency. Thus, “good” has a stronger opinion association with “price” than “expensive” does, and it would contribute more to determining whether “price” is an opinion target. This is unreasonable: “expensive” is actually more related to “price” than “good” is, and “expensive” is likely to be a word preference for “price”. The confidence of “price” being an opinion target should therefore be influenced by “expensive” to a greater extent than by “good”. In this way, we argue that the extraction will be more precise.

[Figure 1 omitted: a heterogeneous graph connecting opinion word candidate nodes (OC1–OC6) and opinion target candidate nodes (TC1–TC6) through the subgraphs $G_{tt}$, $G_{oo}$ and $G_{to}$.]

Figure 1: Heterogeneous Graph: OC means opinion word candidates. TC means opinion target candidates. Solid curves and dotted lines respectively denote semantic relations and opinion relations between two candidates.

Thus, to resolve these two problems, we present a novel approach based on graph co-ranking. The collective extraction of opinion targets/words is performed in a co-ranking process. First, we construct a heterogeneous graph to model semantic relations and opinion relations in a unified model. Specifically, our heterogeneous graph is composed of three subgraphs which model the different relation types and candidates, as shown in Figure 1. The first subgraph $G_{tt}$ represents semantic relations among opinion target candidates, and the second subgraph $G_{oo}$ models semantic relations among opinion word candidates. The third part is a bipartite subgraph $G_{to}$, which models opinion relations between the two candidate types and connects the above two subgraphs together. Then we perform a random walk algorithm on $G_{tt}$, $G_{oo}$ and $G_{to}$ separately to estimate all candidates' confidences, and the candidates with confidence higher than a threshold are extracted as opinion targets/words. The results can reflect which type of relation is more useful for the extraction.

Second, a co-ranking algorithm, which incorporates the three separate random walks on $G_{tt}$, $G_{oo}$ and $G_{to}$ into a unified process, is proposed to perform candidate confidence estimation. Different relations may cooperatively affect candidate confidence estimation and generate more global ranking results. Moreover, we discover each candidate's preferences through topics. Such word preferences differ from candidate to candidate. We add word preference information into our algorithm, making our co-ranking algorithm personalized. A candidate's confidence then mainly absorbs contributions from its word preferences rather than from all of its neighbors with opinion relations, which is beneficial for improving extraction precision.

We perform experiments on real-world datasets from different languages and different domains. Results show that our approach effectively improves extraction performance compared to state-of-the-art approaches.

2 Related Work

There have been many significant research efforts on opinion target/word extraction, at both the sentence level and the corpus level. In sentence-level extraction, previous methods (Wu et al., 2009; Ma and Wan, 2010; Li et al., 2010; Yang and Cardie, 2013) mainly aimed to identify all opinion target/word mentions in sentences. They regarded it as a sequence labeling task, where several classical models were used, such as CRFs (Li et al., 2010) and SVMs (Wu et al., 2009).

This paper belongs to corpus-level extraction, and aims to generate a sentiment lexicon and a target list rather than to identify mentions in sentences. Most previous corpus-level methods adopted a co-extraction framework, where opinion targets and opinion words reinforce each other according to their opinion relations. Thus, improving the performance of opinion relation identification was their main focus. (Hu and Liu, 2004a) exploited nearest-neighbor rules to mine opinion relations among words. (Popescu and Etzioni, 2005) and (Qiu et al., 2011) designed syntactic patterns to perform this task. (Zhang et al., 2010) extended Qiu's method, adopting some specially designed patterns to increase recall. (Liu et al., 2012; Liu et al., 2013a; Liu et al., 2013b) employed a word alignment model, rather than syntactic parsing, to capture opinion relations. The experimental results showed that these alignment-based methods are more effective than syntax-based approaches on informal online texts. However, all the aforementioned methods employed only opinion relations for the extraction and ignored semantic relations among homogeneous candidates. Moreover, they all ignored word preference in the extraction process.

In terms of considering semantic relations among words, our method is related to several approaches based on topic models (Zhao et al., 2010; Moghaddam and Ester, 2011; Moghaddam and Ester, 2012a; Moghaddam and Ester, 2012b; Mukherjee and Liu, 2012). The main goal of these methods was not to extract opinion targets/words, but to categorize given aspect terms and sentiment words. Although these models could be used for our task through the associations between candidates and topics, solely employing semantic relations is still one-sided and insufficient to obtain the expected performance.

Furthermore, there is little work that considers these two types of relations jointly (Su et al., 2008; Hai et al., 2012; Bross and Ehrig, 2013). These methods usually captured the different relations using co-occurrence information, which is too coarse to obtain the expected results (Liu et al., 2012). In addition, (Hai et al., 2012) extracted opinion targets/words in a bootstrapping process, which suffers from error propagation. In contrast, we perform extraction with a global graph co-ranking process, where error propagation can be effectively alleviated. (Su et al., 2008) used heterogeneous relations to find implicit sentiment associations among words. Their aim was only to categorize aspect terms, not to extract opinion targets/words: they extracted opinion targets/words in advance through simple phrase detection, so the extraction performance falls far short of expectations.

3 The Proposed Method

In this section, we present our method in detail. We formulate opinion target/word extraction as a co-ranking task. All nouns/noun phrases are regarded as opinion target candidates, and all adjectives/verbs are regarded as opinion word candidates, a convention widely adopted by previous methods (Hu and Liu, 2004a; Qiu et al., 2011; Wang and Wang, 2008; Liu et al., 2012). Each candidate is then assigned a confidence and ranked, and the candidates with confidence higher than a threshold are extracted as the results.
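As a rough, hypothetical illustration of this candidate-generation step (the paper gives no code), the sketch below filters POS-tagged tokens into the two candidate pools; the Penn Treebank tag set and the input format are assumptions of this illustration, and noun-phrase chunking is omitted.

```python
def generate_candidates(tagged_sentences):
    """Collect candidate sets from POS-tagged sentences (illustrative sketch).

    tagged_sentences: list of sentences, each a list of (token, POS) pairs,
    e.g. [("the", "DT"), ("price", "NN"), ("is", "VBZ"), ("good", "JJ")].
    Nouns become opinion target candidates; adjectives/verbs become
    opinion word candidates.
    """
    target_tags = {"NN", "NNS", "NNP", "NNPS"}
    opinion_tags = {"JJ", "JJR", "JJS", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
    targets, opinions = set(), set()
    for sentence in tagged_sentences:
        for token, tag in sentence:
            if tag in target_tags:
                targets.add(token.lower())
            elif tag in opinion_tags:
                opinions.add(token.lower())
    return targets, opinions
```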

Different from traditional methods, besides opinion relations among words we additionally capture semantic relations among homogeneous candidates. To this end, a heterogeneous undirected graph $G = (V, E)$ is constructed. $V = V^t \cup V^o$ denotes the vertex set, which includes opinion target candidates $v^t \in V^t$ and opinion word candidates $v^o \in V^o$. $E$ denotes the edge set, where $e_{ij} \in E$ means that there is a relation between two vertices. $E_{tt} \subset E$ represents the semantic relations between two opinion target candidates, $E_{oo} \subset E$ represents the semantic relations between two opinion word candidates, and $E_{to} \subset E$ represents the opinion relations between opinion target candidates and opinion word candidates. Based on the different relation types, we use three matrices $M_{tt} \in \mathbb{R}^{|V^t|\times|V^t|}$, $M_{oo} \in \mathbb{R}^{|V^o|\times|V^o|}$ and $M_{to} \in \mathbb{R}^{|V^t|\times|V^o|}$ to record the association weights between any two vertices. Section 3.4 describes how to construct them.

3.1 Only Considering Opinion Relations

To estimate the confidence of each candidate, we use a random walk algorithm on our graph to perform co-ranking. Most previous methods (Hu and Liu, 2004a; Qiu et al., 2011; Wang and Wang, 2008; Liu et al., 2012) only considered opinion relations among words. Their basic assumption is as follows.

Assumption 1: If a word is likely to be an opinion word, the words with which it has opinion relations will have higher confidence to be opinion targets, and vice versa.


In this way, candidates' confidences ($v^t$ or $v^o$) are collectively and iteratively determined by each other. This is equivalent to performing a random walk on the subgraph $G_{to} = (V, E_{to})$ of $G$. Thus we have

$$C^t = (1-\mu)\times M_{to}\times C^o + \mu\times I^t$$
$$C^o = (1-\mu)\times M_{to}^{T}\times C^t + \mu\times I^o \qquad (1)$$

where $C^t$ and $C^o$ respectively represent the confidences of opinion targets and opinion words, and $m^{to}_{i,j} \in M_{to}$ is the association weight between the $i$-th opinion target and the $j$-th opinion word according to their opinion relations.

It is worth noting that $I^t$ and $I^o$ respectively denote the prior confidences of opinion target candidates and opinion word candidates. We argue that opinion targets are usually domain-specific, and their distributions differ remarkably across domains (in-domain $D_{in}$ vs. out-of-domain $D_{out}$). If a candidate is salient in $D_{in}$ but common in $D_{out}$, it is likely to be an opinion target in $D_{in}$. Thus, we use a domain relevance measure (DR) (Hai et al., 2013) to compute $I^t$:

$$DR(t) = \frac{R(t, D_{in})}{R(t, D_{out})} \qquad (2)$$

where $R(t, D) = \frac{w_t}{s_t} \times \sum_{j=1}^{N}\left(w_{tj} - \frac{1}{W_j}\sum_{k=1}^{W_j} w_{kj}\right)$ represents the relevance of candidate $t$ to domain $D$. Here $w_{tj} = (1 + \log TF_{tj}) \times \log\frac{N}{DF_t}$ is a TF-IDF-like weight of candidate $t$ in document $j$, where $TF_{tj}$ is the frequency of candidate $t$ in the $j$-th document, $DF_t$ is its document frequency, and $N$ is the number of documents in domain $D$. $R(t, D)$ combines two measures of how salient a candidate is in $D$: 1) $w_{tj} - \frac{1}{W_j}\sum_{k=1}^{W_j} w_{kj}$ reflects how frequently a term is mentioned in a particular document, where $W_j$ denotes the number of words in document $j$; 2) $\frac{w_t}{s_t}$ quantifies how significantly a term is mentioned across all documents in $D$, where $w_t = \frac{1}{N}\sum_{k=1}^{N} w_{tk}$ denotes the average weight of $t$ across all documents and $s_t = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(w_{tj} - w_t\right)^2}$ denotes its standard deviation. We use the given reviews as the in-domain collection $D_{in}$ and the Google n-gram corpus1 as the out-of-domain collection $D_{out}$. Finally, each entry in $I^t$ is a normalized $DR(t)$ score. In contrast, opinion words are usually domain-independent: users may use the same words, like “good” and “bad”, to express their opinions across domains. There are still some domain-dependent opinion words, like “delicious” in the restaurant domain or “powerful” in the car domain, but it is difficult to discriminate them from other words using statistical information. So we simply set all entries in $I^o$ to 1. $\mu \in [0, 1]$ in Eq. 1 determines the impact of the prior confidence on the results.

1 http://books.google.com/ngrams/datasets
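A minimal sketch of how this DR prior could be computed from raw token lists is given below. The reading of the cross-document factor as $w_t/s_t$, the handling of zero denominators, and the use of plain tokenized document lists for $D_{out}$ are assumptions of this illustration, not details stated in the paper.

```python
import math
from collections import Counter

def domain_relevance(term, docs_in, docs_out):
    """Sketch of the prior DR(t) = R(t, D_in) / R(t, D_out) from Eq. 2."""
    r_out = relevance(term, docs_out)
    return relevance(term, docs_in) / r_out if r_out > 0 else 0.0

def relevance(term, docs):
    """R(t, D) = (w_t / s_t) * sum_j (w_tj - mean word weight of doc j).

    docs: non-empty list of tokenized documents (lists of words).
    """
    N = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    if doc_freq[term] == 0:
        return 0.0
    term_weights, doc_means = [], []
    for doc in docs:
        counts = Counter(doc)
        # TF-IDF-like weight w_vj for every word v occurring in document j
        w = {v: (1 + math.log(c)) * math.log(N / doc_freq[v]) for v, c in counts.items()}
        term_weights.append(w.get(term, 0.0))
        doc_means.append(sum(w[v] for v in doc) / len(doc))
    w_t = sum(term_weights) / N                                        # average weight of t
    s_t = math.sqrt(sum((wj - w_t) ** 2 for wj in term_weights) / N)   # std. deviation of t
    if s_t == 0:
        return 0.0
    return (w_t / s_t) * sum(wtj - m for wtj, m in zip(term_weights, doc_means))
```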

3.2 Only Considering Semantic Relations

To estimate candidates' confidences by only considering semantic relations among words, we make two separate random walks on the subgraphs $G_{tt} = (V, E_{tt})$ and $G_{oo} = (V, E_{oo})$ of $G$. The basic assumption is as follows:

Assumption 2: If a word is likely to be an opinion target (opinion word), the words with which it has strong semantic relations will have higher confidence to be opinion targets (opinion words).

In this way, the confidence of a candidate is determined only by its homogeneous neighbours. There is no mutual reinforcement between opinion targets and opinion words. Thus we have

$$C^t = (1-\nu)\times M_{tt}\times C^t + \nu\times I^t$$
$$C^o = (1-\nu)\times M_{oo}\times C^o + \nu\times I^o \qquad (3)$$

where $\nu$ plays the same role as $\mu$ in Eq. 1.

3.3 Considering Semantic Relations and Opinion Relations Together

To jointly model semantic relations and opinion relations for opinion target/word extraction, we couple the two random walk algorithms described above. Here, Assumption 1 and Assumption 2 are both satisfied. Thus, an opinion target/word candidate's confidence is collectively determined by its neighbours according to the different relation types, and each candidate may in turn influence its neighbours. It is an iterative reinforcement process. Thus, we have

$$C^t = (1-\lambda-\mu)\times M_{to}\times C^o + \lambda\times M_{tt}\times C^t + \mu\times I^t$$
$$C^o = (1-\lambda-\mu)\times M_{to}^{T}\times C^t + \lambda\times M_{oo}\times C^o + \mu\times I^o \qquad (4)$$

where $\lambda \in [0, 1]$ determines which type of relation dominates candidate confidence estimation. $\lambda = 0$ means that each candidate's confidence is estimated by considering only opinion relations among words, which is equivalent to Eq. 1; conversely, when $\lambda = 1$, candidate confidence estimation considers only semantic relations among words, which is equivalent to Eq. 3. $\mu$, $I^o$ and $I^t$ have the same meaning as in Eq. 1. Our algorithm runs iteratively until it converges or a fixed number of iterations $Iter$ is reached. In the experiments, we set $Iter = 200$.
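To make the update in Eq. 4 concrete, the following NumPy sketch runs the coupled iteration. The per-iteration normalization and the convergence tolerance are assumptions added for numerical stability; the paper itself only specifies the fixed iteration cap.

```python
import numpy as np

def co_rank(M_to, M_tt, M_oo, I_t, I_o, lam=0.4, mu=0.1, iters=200, tol=1e-6):
    """Sketch of the co-ranking update of Eq. 4.

    M_to: |V^t| x |V^o| opinion-relation weights.
    M_tt, M_oo: semantic-relation weights among targets / opinion words.
    I_t, I_o: prior confidence vectors (e.g. the DR-based prior and all ones).
    """
    I_t = np.asarray(I_t, dtype=float)
    I_o = np.asarray(I_o, dtype=float)
    C_t, C_o = I_t.copy(), I_o.copy()
    for _ in range(iters):
        C_t_new = (1 - lam - mu) * (M_to @ C_o) + lam * (M_tt @ C_t) + mu * I_t
        C_o_new = (1 - lam - mu) * (M_to.T @ C_t) + lam * (M_oo @ C_o) + mu * I_o
        # keep the two confidence vectors on a comparable scale (our assumption)
        C_t_new = C_t_new / (C_t_new.sum() or 1.0)
        C_o_new = C_o_new / (C_o_new.sum() or 1.0)
        delta = np.abs(C_t_new - C_t).sum() + np.abs(C_o_new - C_o).sum()
        C_t, C_o = C_t_new, C_o_new
        if delta < tol:
            break
    return C_t, C_o
```

Setting lam = 0 reduces this sketch to the opinion-relation-only walk of Eq. 1, while a large lam approaches the semantic-only walks of Eq. 3.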

Obtaining Word Preference. The co-ranking algorithm in Eq. 4 is based on a standard random walk, which either follows a link randomly according to the association matrices $M_{to}$, $M_{tt}$ and $M_{oo}$, or jumps to a random node according to the prior confidence. However, it generates a global ranking over all candidates without taking node preference (word preference) into account. As mentioned in the first section, each opinion target/word has its preferred collocations, so it is reasonable that the confidence of an opinion target (opinion word) candidate should be determined preferentially by its preferences rather than by all of its neighbors with opinion relations.

To obtain word preference, we resort to topics. We believe that if an opinion word $v^o_i$ is topically related to a target word $v^t_j$, then $v^o_i$ can be regarded as a word preference for $v^t_j$, and vice versa. For example, “price” and “expensive” are topically related in the phone domain, so each is a word preference for the other.

Specifically, we use a vector $P^{T_i} = [P^{T_i}_1, \dots, P^{T_i}_k, \dots, P^{T_i}_{|V^o|}]_{1\times|V^o|}$ to represent the word preference of the $i$-th opinion target candidate, where $P^{T_i}_k$ is the preference probability of the $i$-th potential opinion target for the $k$-th potential opinion word. To compute $P^{T_i}_k$, we first use the Kullback-Leibler divergence to measure the semantic distance between any two candidates on the bridge of topics:

$$D(v_i, v_j) = \frac{1}{2}\sum_{z}\left(KL_z(v_i\|v_j) + KL_z(v_j\|v_i)\right)$$

where $KL_z(v_i\|v_j) = p(z|v_i)\log\frac{p(z|v_i)}{p(z|v_j)}$ is the KL-divergence from candidate $v_i$ to $v_j$ on topic $z$, and $p(z|v) = \frac{p(v|z)p(z)}{p(v)}$, where $p(v|z)$ is the probability of candidate $v$ given topic $z$ (see Section 3.4), $p(z)$ is the probability of topic $z$ in the reviews, and $p(v)$ is the probability that candidate $v$ occurs in the reviews. Then, a logistic function is used to map $D(v_i, v_j)$ into $[0, 1]$:

$$SA(v_i, v_j) = \frac{1}{1 + e^{D(v_i, v_j)}} \qquad (5)$$

We then calculate $P^{T_i}_k$ by normalizing the $SA$ scores, i.e.
$$P^{T_i}_k = \frac{SA(v^t_i, v^o_k)}{\sum_{p=1}^{|V^o|} SA(v^t_i, v^o_p)}.$$
For demonstration, Table 1 gives some examples, where each entry is the $SA(v_i, v_j)$ score between two candidates. We can see that using topics successfully captures the preference information for each opinion target/word.

          expensive  good   long   colorful
price     0.265      0.043  0.003  0.000
LED       0.002      0.035  0.007  0.098
battery   0.000      0.015  0.159  0.001

Table 1: Examples of Calculated Word Preference

Similarly, we use a vector $P^{O_j} = [P^{O_j}_1, \dots, P^{O_j}_q, \dots, P^{O_j}_{|V^t|}]_{1\times|V^t|}$ to represent the preference information of the $j$-th opinion word candidate, where $P^{O_j}_q = \frac{SA(v^t_q, v^o_j)}{\sum_{k=1}^{|V^t|} SA(v^t_k, v^o_j)}$.
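The preference vectors can be computed in one pass from the topic distributions, as in the sketch below; the matrix shapes and the small smoothing constant are assumptions made for this illustration.

```python
import numpy as np

def preference_vectors(p_z_given_t, p_z_given_o, eps=1e-12):
    """Sketch of the word-preference computation (Eq. 5 plus normalization).

    p_z_given_t: |V^t| x K matrix with p(z | target candidate) per row.
    p_z_given_o: |V^o| x K matrix with p(z | opinion-word candidate) per row.
    Returns (P_T, P_O): row i of P_T is P^{T_i}, row j of P_O is P^{O_j}.
    """
    t = np.asarray(p_z_given_t, dtype=float) + eps
    o = np.asarray(p_z_given_o, dtype=float) + eps
    # symmetric KL distance D(v_i, v_j), summed over topics z
    kl_to = (t[:, None, :] * np.log(t[:, None, :] / o[None, :, :])).sum(axis=-1)
    kl_ot = (o[None, :, :] * np.log(o[None, :, :] / t[:, None, :])).sum(axis=-1)
    D = 0.5 * (kl_to + kl_ot)                       # |V^t| x |V^o|
    SA = 1.0 / (1.0 + np.exp(D))                    # logistic mapping, Eq. 5
    P_T = SA / SA.sum(axis=1, keepdims=True)        # normalize over opinion words
    P_O = (SA / SA.sum(axis=0, keepdims=True)).T    # normalize over targets
    return P_T, P_O
```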

Incorporating Word Preference into Co-ranking. To consider such word preferences in our co-ranking algorithm, we incorporate them into the random walk on $G_{to}$. Intuitively, the preference vectors differ from candidate to candidate, so the co-ranking algorithm becomes personalized: a candidate's confidence propagates to other candidates only within its preference cluster. Specifically, we modify the original transition matrix $M_{to} = (M^{to}_1, M^{to}_2, \dots, M^{to}_{|V^t|})$ and add each candidate's preference to it. Let $\tilde{M}_{to} = (\tilde{M}^{to}_1, \tilde{M}^{to}_2, \dots, \tilde{M}^{to}_{|V^t|})$ be the modified transition matrix, which records the associations between opinion target candidates and opinion word candidates. Here $M^{to}_k \in \mathbb{R}^{1\times|V^o|}$ and $\tilde{M}^{to}_k \in \mathbb{R}^{1\times|V^o|}$ denote the $k$-th column vectors of $M_{to}$ and $\tilde{M}_{to}$, respectively. Letting $Diag(P^{T_k})$ denote the diagonal matrix whose diagonal entries are the elements of $P^{T_k}$, we have

$$\tilde{M}^{to}_k = M^{to}_k \, Diag(P^{T_k})$$

Similarly, let $U^{to}_k \in \mathbb{R}^{1\times|V^t|}$ and $\tilde{U}^{to}_k \in \mathbb{R}^{1\times|V^t|}$ denote the $k$-th row vectors of $M_{to}^{T}$ and $\tilde{M}_{to}^{T}$, respectively, and let $Diag(P^{O_k})$ denote the diagonal matrix whose diagonal entries are the elements of $P^{O_k}$. Then we have

$$\tilde{U}^{to}_k = U^{to}_k \, Diag(P^{O_k})$$

In this way, each candidate's preference is incorporated into the original opinion-relation associations $M_{to}$ through $Diag(P^{O_k})$ and $Diag(P^{T_k})$, and a candidate's confidence comes mainly from the contributions of its preferences. Thus, $C^t$ and $C^o$ in Eq. 4 become:


$$C^t = (1-\lambda-\mu)\times \tilde{M}_{to}\times C^o + \lambda\times M_{tt}\times C^t + \mu\times I^t$$
$$C^o = (1-\lambda-\mu)\times \tilde{M}_{to}^{T}\times C^t + \lambda\times M_{oo}\times C^o + \mu\times I^o \qquad (6)$$
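A minimal sketch of the preference-weighted transition matrices used in Eq. 6, assuming P_T and P_O have the shapes produced by the preference sketch above:

```python
def personalized_transitions(M_to, P_T, P_O):
    """Weight the opinion-relation matrix by word preference (sketch).

    M_to: |V^t| x |V^o| NumPy array of opinion associations.
    P_T:  |V^t| x |V^o| array, row i = preference vector P^{T_i}.
    P_O:  |V^o| x |V^t| array, row j = preference vector P^{O_j}.
    Multiplying row k of M_to element-wise by P^{T_k} is exactly
    M^{to}_k * Diag(P^{T_k}); the transposed matrix is treated analogously.
    """
    M_to_tilde = M_to * P_T      # used in the C^t update of Eq. 6
    M_ot_tilde = M_to.T * P_O    # used in the C^o update of Eq. 6
    return M_to_tilde, M_ot_tilde
```

Plugging M_to_tilde into the C^t update and M_ot_tilde into the C^o update of the earlier co_rank() sketch, in place of M_to and M_to.T, gives the personalized iteration of Eq. 6.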

3.4 Capturing Semantic and Opinion Relations

In this section, we explain how to capture semantic relations and opinion relations for constructing the transition matrices $M_{tt}$, $M_{oo}$ and $M_{to}$.

Capturing Semantic Relations: To capture semantic relations among homogeneous candidates, we employ topics. We believe that if two candidates share similar topics in the corpus, there is a strong semantic relation between them. Thus, we employ an LDA variation (Mukherjee and Liu, 2012), an extension of (Zhao et al., 2010), to discover the topic distribution over words; it samples all words into two separate observation types, opinion targets and opinion words. This is because we are only interested in the topic distribution of opinion targets/words, not in other words such as conjunctions and prepositions. This model has been proven to be better than the standard LDA model and other LDA variations for opinion mining (Mukherjee and Liu, 2012).

After topic modeling, we obtain the probabilities $p(z|v^t)$ and $p(z|v^o)$ for the candidates ($v^t$ and $v^o$) with respect to each topic $z$, along with the topic distribution $p(z)$. Then, the same symmetric Kullback-Leibler divergence as in Eq. 5 is used to calculate the semantic associations between any two homogeneous candidates. We thus obtain $SA(v^t, v^t)$ and $SA(v^o, v^o)$, which correspond to the entries in $M_{tt}$ and $M_{oo}$, respectively.

Capturing Opinion Relations: To capture opinion relations among words and construct the transition matrix $M_{to}$, we use the alignment-based method proposed in (Liu et al., 2013b). This approach models the capture of opinion relations as a monolingual word alignment process: each opinion target finds its corresponding modifiers in sentences through alignment, in which multiple factors are considered globally, such as co-occurrence information and word position in the sentence. Moreover, this model adopts a partially supervised framework to combine syntactic information with the alignment results, which has been proven to be more precise than state-of-the-art approaches for opinion relation identification (Liu et al., 2013b).

After performing word alignment, we obtain a set of word pairs, each composed of a noun (noun phrase) and its corresponding modifying word. Then, we simply employ Pointwise Mutual Information (PMI) to calculate the opinion associations among words as the entries in $M_{to}$: $OA(v^t, v^o) = \log\frac{p(v^t, v^o)}{p(v^t)p(v^o)}$, where $v^t$ and $v^o$ denote an opinion target candidate and an opinion word candidate, respectively, $p(v^t, v^o)$ is the co-occurrence probability of $v^t$ and $v^o$ based on the opinion relation identification results, and $p(v^t)$ and $p(v^o)$ are the independent occurrence probabilities of $v^t$ and $v^o$, respectively.
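A small sketch of this PMI computation over the aligned pairs; the pair-count based probability estimates are an assumption of this illustration.

```python
import math
from collections import Counter

def opinion_associations(aligned_pairs):
    """Sketch of the PMI-based opinion associations OA(v^t, v^o).

    aligned_pairs: list of (target_candidate, opinion_word_candidate) tuples
    produced by the word-alignment step.
    """
    pair_counts = Counter(aligned_pairs)
    t_counts = Counter(t for t, _ in aligned_pairs)
    o_counts = Counter(o for _, o in aligned_pairs)
    total = len(aligned_pairs)
    oa = {}
    for (t, o), c in pair_counts.items():
        p_to = c / total
        p_t = t_counts[t] / total
        p_o = o_counts[o] / total
        # OA(v^t, v^o) = log p(v^t, v^o) / (p(v^t) p(v^o)); fills the entries of M_to
        oa[(t, o)] = math.log(p_to / (p_t * p_o))
    return oa
```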

4 Experiments

4.1 Datasets and Evaluation Metrics

Datasets: To evaluate the proposed method, we use three datasets. The first one is the Customer Review Datasets (CRD) used in (Hu and Liu, 2004a), which contains reviews about five products. The second one is the COAE2008 dataset2, which contains Chinese reviews about four products. The third one is Large, also used in (Wang et al., 2011; Liu et al., 2012; Liu et al., 2013a), from which two domains are selected (MP3 and Hotel). As mentioned in (Liu et al., 2012), Large contains 6,000 sentences for each domain. Opinion targets/words were manually annotated by three annotators: two annotators were required to annotate opinion words/targets in the reviews, and when conflicts occurred, the third annotator made the final judgement. In total, we obtain 1,112 and 1,241 opinion targets and 334 and 407 opinion words in Hotel and MP3, respectively.

Pre-processing: All sentences are tagged with parts of speech using the Stanford NLP tool3, and noun phrases are identified using the method in (Zhu et al., 2009) before extraction.

Evaluation Metrics: We use precision (P), recall (R) and F-measure (F) as metrics, and perform a significance test, i.e., a t-test with a significance level of 0.05.
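For reference, a minimal set-based sketch of these metrics; the exact matching criterion between extracted and gold terms is not spelled out in the paper and is assumed here to be exact string match.

```python
def precision_recall_f1(extracted, gold):
    """Set-based precision, recall and F-measure for extracted terms."""
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)
    p = tp / len(extracted) if extracted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```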

4.2 Our Method vs. The State-of-the-art Methods

To prove the effectiveness of the proposed method, we compare it against the following state-of-the-art methods:

2 http://ir-china.org.cn/coae2008.html
3 http://nlp.stanford.edu/software/tagger.shtml


Methods |        D1        |        D2        |        D3        |        D4        |        D5        | Avg.
        |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  F
Hu      | 0.75  0.82  0.78 | 0.71  0.79  0.75 | 0.72  0.76  0.74 | 0.69  0.82  0.75 | 0.74  0.80  0.77 | 0.758
DP      | 0.87  0.81  0.84 | 0.90  0.81  0.85 | 0.90  0.86  0.88 | 0.81  0.84  0.82 | 0.92  0.86  0.89 | 0.856
Zhang   | 0.83  0.84  0.83 | 0.86  0.85  0.85 | 0.86  0.88  0.87 | 0.80  0.85  0.82 | 0.86  0.86  0.86 | 0.846
SAS     | 0.80  0.79  0.79 | 0.82  0.76  0.79 | 0.79  0.74  0.76 | 0.77  0.78  0.77 | 0.80  0.76  0.78 | 0.778
Liu     | 0.84  0.85  0.84 | 0.87  0.85  0.86 | 0.88  0.89  0.88 | 0.81  0.85  0.83 | 0.89  0.87  0.88 | 0.858
Hai     | 0.77  0.87  0.83 | 0.79  0.86  0.82 | 0.79  0.89  0.84 | 0.72  0.88  0.79 | 0.74  0.88  0.81 | 0.818
CR      | 0.84  0.86  0.85 | 0.87  0.85  0.86 | 0.87  0.90  0.88 | 0.81  0.87  0.83 | 0.89  0.88  0.89 | 0.862
CR_WP   | 0.86  0.86  0.86 | 0.88  0.86  0.87 | 0.89  0.90  0.89 | 0.81  0.87  0.83 | 0.91  0.89  0.90 | 0.870

Table 2: Results of Opinion Targets Extraction on Customer Review Dataset

Methods |      Camera      |       Car        |      Laptop      |      Phone       |       MP3        |      Hotel       | Avg.
        |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  F
Hu      | 0.63  0.65  0.64 | 0.62  0.58  0.60 | 0.51  0.67  0.58 | 0.69  0.60  0.64 | 0.61  0.68  0.64 | 0.60  0.65  0.62 | 0.587
DP      | 0.71  0.70  0.70 | 0.72  0.65  0.68 | 0.58  0.69  0.63 | 0.78  0.66  0.72 | 0.69  0.70  0.69 | 0.67  0.69  0.68 | 0.683
Zhang   | 0.71  0.78  0.74 | 0.69  0.68  0.68 | 0.57  0.80  0.67 | 0.80  0.71  0.75 | 0.67  0.77  0.72 | 0.67  0.76  0.71 | 0.712
SAS     | 0.72  0.72  0.72 | 0.71  0.64  0.67 | 0.59  0.72  0.65 | 0.78  0.69  0.73 | 0.69  0.75  0.72 | 0.69  0.74  0.71 | 0.700
Liu     | 0.75  0.81  0.78 | 0.71  0.71  0.71 | 0.61  0.85  0.71 | 0.83  0.74  0.78 | 0.70  0.82  0.76 | 0.71  0.80  0.75 | 0.749
Hai     | 0.68  0.84  0.76 | 0.69  0.75  0.72 | 0.58  0.86  0.72 | 0.75  0.76  0.76 | 0.65  0.83  0.74 | 0.62  0.82  0.75 | 0.742
CR      | 0.75  0.83  0.79 | 0.72  0.74  0.73 | 0.60  0.85  0.70 | 0.83  0.77  0.80 | 0.70  0.84  0.76 | 0.71  0.83  0.77 | 0.758
CR_WP   | 0.78  0.84  0.81 | 0.74  0.75  0.74 | 0.64  0.85  0.73 | 0.84  0.76  0.80 | 0.74  0.84  0.79 | 0.74  0.82  0.78 | 0.773

Table 3: Results of Opinion Targets Extraction on COAE 2008 and Large

Hu extracted opinion targets/words using association mining rules (Hu and Liu, 2004a).

DP used syntax-based patterns to capture opinion relations in sentences, and then used a bootstrapping process to extract opinion targets/words (Qiu et al., 2011).

Zhang is the method proposed by (Zhang et al., 2010). It also uses syntactic patterns to capture opinion relations between words, and then employs the HITS algorithm (Kleinberg, 1999) to extract opinion targets.

Liu is the method proposed by (Liu et al., 2013a), an extension of (Liu et al., 2012). It employs a word alignment model to capture opinion relations among words, and then uses a random walk algorithm to extract opinion targets.

Hai is the method proposed by (Hai et al., 2012), which is similar to our method. It employs both semantic relations and opinion relations to extract opinion words/targets in a bootstrapping framework, but it captures the relations only through co-occurrence statistics, and word preference is not considered.

SAS is the method proposed by (Mukherjee and Liu, 2012), an extended LDA-based model of (Zhao et al., 2010). The top K items for each aspect are extracted as opinion targets/words, which means that only semantic relations among words are considered in SAS. We set the number of aspects to 9, the same as in (Mukherjee and Liu, 2012).

CR is the method proposed in this paper using co-ranking (Eq. 4), without considering word preference.

CR_WP is the full implementation of our method (Eq. 6).

Hu, DP, Zhang and Liu are methods which consider only opinion relations among words. SAS is a method which considers only semantic relations among words. Hai, CR and CR_WP consider both types of relations together. The parameter settings of the state-of-the-art methods are the same as in their original papers. In CR and CR_WP, we set $\lambda = 0.4$ and $\mu = 0.1$. The experimental results are shown in Tables 2, 3, 4 and 5, where the last column gives the average F-measure over the domains. Since Liu and Zhang are not designed for opinion word extraction, we do not report their results in Tables 4 and 5. From the experimental results, we make the following observations.

1) Our methods (CR and CR_WP) outperform the other methods not only on opinion target extraction but also on opinion word extraction in most domains, which proves the effectiveness of the proposed method.

2) CR and CR_WP perform much better than Liu and Zhang, especially on recall. Liu and Zhang also use a ranking framework like ours, but they employ only opinion relations for extraction. In contrast, besides opinion relations, CR and CR_WP further take semantic relations into account, so more opinion targets/words can be extracted. Furthermore, we observe that CR and CR_WP outperform SAS. SAS exploits only semantic relations and ignores opinion relations among words; its extraction is performed separately and neglects the reinforcement between opinion targets and opinion words. Thus, SAS performs worse than our methods. This demonstrates the usefulness of considering multiple relation types.


Methods |        D1        |        D2        |        D3        |        D4        |        D5        | Avg.
        |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  F
Hu      | 0.57  0.75  0.65 | 0.51  0.76  0.61 | 0.57  0.73  0.64 | 0.54  0.62  0.58 | 0.62  0.67  0.64 | 0.624
DP      | 0.64  0.73  0.68 | 0.57  0.79  0.66 | 0.65  0.70  0.67 | 0.61  0.65  0.63 | 0.70  0.68  0.69 | 0.666
SAS     | 0.64  0.68  0.66 | 0.55  0.70  0.62 | 0.62  0.65  0.63 | 0.60  0.61  0.60 | 0.68  0.63  0.65 | 0.632
Hai     | 0.62  0.77  0.69 | 0.52  0.80  0.64 | 0.60  0.74  0.67 | 0.56  0.69  0.62 | 0.66  0.70  0.68 | 0.660
CR      | 0.62  0.75  0.68 | 0.57  0.79  0.67 | 0.64  0.75  0.69 | 0.63  0.69  0.66 | 0.68  0.69  0.69 | 0.678
CR_WP   | 0.65  0.75  0.70 | 0.59  0.80  0.68 | 0.65  0.74  0.70 | 0.66  0.68  0.67 | 0.71  0.70  0.70 | 0.690

Table 4: Results of Opinion Words Extraction on Customer Review Dataset

Methods |      Camera      |       Car        |      Laptop      |      Phone       |       MP3        |      Hotel       | Avg.
        |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  P     R     F   |  F
Hu      | 0.72  0.74  0.73 | 0.70  0.71  0.70 | 0.66  0.70  0.68 | 0.70  0.70  0.70 | 0.48  0.67  0.56 | 0.52  0.69  0.59 | 0.660
DP      | 0.80  0.73  0.76 | 0.79  0.71  0.75 | 0.75  0.69  0.72 | 0.78  0.68  0.73 | 0.60  0.65  0.62 | 0.61  0.66  0.63 | 0.702
SAS     | 0.73  0.70  0.71 | 0.75  0.68  0.71 | 0.72  0.68  0.69 | 0.71  0.66  0.68 | 0.64  0.62  0.63 | 0.66  0.61  0.63 | 0.675
Hai     | 0.76  0.74  0.75 | 0.72  0.74  0.73 | 0.69  0.72  0.70 | 0.72  0.70  0.71 | 0.61  0.69  0.64 | 0.59  0.68  0.64 | 0.690
CR      | 0.80  0.75  0.77 | 0.77  0.74  0.75 | 0.73  0.71  0.72 | 0.75  0.71  0.73 | 0.63  0.69  0.64 | 0.63  0.68  0.66 | 0.710
CR_WP   | 0.80  0.75  0.77 | 0.80  0.74  0.77 | 0.77  0.71  0.74 | 0.78  0.72  0.75 | 0.66  0.68  0.67 | 0.67  0.69  0.68 | 0.730

Table 5: Results of Opinion Words Extraction on COAE 2008 and Large

3) CR and CR_WP both outperform Hai. We believe the reasons are as follows. First, CR and CR_WP consider multiple relations in a unified process via graph co-ranking, whereas Hai adopts a bootstrapping framework which performs extraction step by step and may suffer from error propagation. This indicates that our graph co-ranking is more suitable for this task than a bootstrapping-based strategy. Second, our method captures semantic relations using topic modeling and captures opinion relations through word alignment, which are more precise than Hai's use of mere co-occurrence information to indicate such relations. In addition, word preference is not handled in Hai but is handled in CR_WP. The results show the usefulness of word preference for opinion target/word extraction.

4) CR_WP outperforms CR, especially on precision. The only difference between them is that CR_WP considers word preference when performing graph ranking for candidate confidence estimation, whereas CR does not. In CR_WP, each candidate's confidence estimation gives more weight to the candidate's preferred words than in CR. Thus, the precision can be improved.

4.3 Semantic Relation vs. Opinion Relation

In this section, we discuss which relation type is more effective for this task. For comparison, we design two baselines, called OnlySA and OnlyOA. OnlyOA employs only opinion relations among words, which is equivalent to Eq. 1. OnlySA employs only semantic relations among words, which is equivalent to Eq. 3. Combine is our method considering both opinion relations and semantic relations together, i.e. Eq. 4 with $\lambda = 0.5$. Figure 2 presents the experimental results: the left graph presents opinion target extraction results and the right graph presents opinion word extraction results. Because of space limitations, we only show the results for four domains (MP3, Hotel, Laptop and Phone).

[Figure 2 omitted: recall of OnlySA, OnlyOA and Combine on MP3, Hotel, Laptop and Phone; panel (a) opinion target extraction results, panel (b) opinion word extraction results.]

Figure 2: Semantic Relations vs. Opinion Relations

From the results, we observe that OnlyOA outperforms OnlySA in all domains. This demonstrates that opinion relations are more useful than semantic relations for co-extracting opinion targets/words, and that it is necessary to utilize the mutual reinforcement between opinion words and opinion targets. Moreover, Combine outperforms both OnlySA and OnlyOA in all domains, which indicates that combining the different relations among words is effective.

4.4 The Effectiveness of Considering Word Preference

In this section, we examine the necessity of considering word preference in Eq. 6. Besides the comparison between CR and CR_WP in the main experiment of Section 4.2, we further incorporate word preference into the aforementioned OnlyOA, yielding OnlyOA_WP, which employs only opinion relations among words and is equivalent to Eq. 6 with $\lambda = 0$. Experimental results are shown in Figure 3. Because of space limitations, we only show the results for the same domains as in Section 4.3.

From the results, we observe that CR_WP outperforms CR, and OnlyOA_WP outperforms OnlyOA, in all domains, especially on precision. These observations demonstrate that considering word preference is very important for opinion target/word extraction. We believe the reason is that exploiting word preference provides finer-grained information for estimating the confidence of opinion target/word candidates, and thus the performance can be improved.

[Figure 3 omitted: precision and recall of OnlyOA, OnlyOA_WP, CR and CR_WP on MP3, Hotel, Laptop and Phone; panel (a) opinion target extraction results, panel (b) opinion word extraction results.]

Figure 3: Experimental results when considering word preference

4.5 Parameter Sensitivity

In this subsection, we discuss how the extraction performance varies when changing $\lambda$ and $\mu$ in Eq. 6. Due to space limitations, we only show the F-measure of CR_WP on four domains. The results are shown in Figures 4 and 5. The left graphs in Figures 4 and 5 present the performance of CR_WP when varying $\lambda$ from 0 to 0.9 with $\mu = 0.1$ fixed; the right graphs present the performance when varying $\mu$ from 0 to 0.6 with $\lambda = 0.4$ fixed.

In the left graphs of Figures 4 and 5, we observe that the best performance is obtained when $\lambda = 0.4$. This indicates that opinion relations and semantic relations are both useful for extracting opinion targets/words, and that the extraction performance benefits from their combination. In the right graphs of Figures 4 and 5, the best performance is obtained when $\mu = 0.1$, which indicates that prior knowledge is useful for extraction. However, when $\mu$ increases further, performance decreases. This demonstrates that incorporating too much prior knowledge into our algorithm suppresses other useful clues for estimating candidate confidence and hurts performance.

[Figure 4 omitted: F-measure of CR_WP for opinion target extraction on MP3, Hotel, Laptop and Phone, varying λ (left) and µ (right).]

Figure 4: Opinion targets extraction results

[Figure 5 omitted: F-measure of CR_WP for opinion word extraction on MP3, Hotel, Laptop and Phone, varying λ (left) and µ (right).]

Figure 5: Opinion words extraction results

5 Conclusions

This paper presents a novel method for co-extracting opinion targets and opinion words with graph co-ranking. We model the extraction of opinion targets/words as a co-ranking process, in which multiple heterogeneous relations are combined in a unified model so that they have cooperative effects on the extraction. In addition, we incorporate word preference into the co-ranking process to perform more precise extraction. Experimental results show that our method is effective and outperforms state-of-the-art methods.

Acknowledgement

This work was sponsored by the National Basic Research Program of China (No. 2014CB340500), the National Natural Science Foundation of China (No. 61272332 and No. 61202329), the National High Technology Development 863 Program of China (No. 2012AA011102), and the CCF-Tencent Open Research Fund.

References

Juergen Bross and Heiko Ehrig. 2013. Automatic construction of domain and aspect specific sentiment lexicons for customer review mining. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 1077–1086, New York, NY, USA. ACM.

Zhen Hai, Kuiyu Chang, and Gao Cong. 2012. One seed to find them all: mining opinion features via association. In CIKM, pages 255–264.

Zhen Hai, Kuiyu Chang, Jung-Jae Kim, and Christopher C. Yang. 2013. Identifying features in opinion mining via intrinsic and extrinsic domain relevance. IEEE Transactions on Knowledge and Data Engineering, 99(PrePrints):1.

Minqing Hu and Bing Liu. 2004a. Mining opinion features in customer reviews. In Proceedings of the Conference on Artificial Intelligence (AAAI).

Minqing Hu and Bing Liu. 2004b. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '04, pages 168–177, New York, NY, USA. ACM.

Jon M. Kleinberg. 1999. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, September.

Fangtao Li, Chao Han, Minlie Huang, Xiaoyan Zhu, Yingju Xia, Shu Zhang, and Hao Yu. 2010. Structure-aware review mining and summarization. In Chu-Ren Huang and Dan Jurafsky, editors, COLING, pages 653–661. Tsinghua University Press.

Bing Liu, Minqing Hu, and Junsheng Cheng. 2005. Opinion observer: analyzing and comparing opinions on the web. In Allan Ellis and Tatsuya Hagino, editors, WWW, pages 342–351. ACM.

Kang Liu, Liheng Xu, and Jun Zhao. 2012. Opinion target extraction using word-based translation model. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1346–1356, Jeju Island, Korea, July. Association for Computational Linguistics.

Kang Liu, Liheng Xu, Yang Liu, and Jun Zhao. 2013a. Opinion target extraction using partially supervised word alignment model.

Kang Liu, Liheng Xu, and Jun Zhao. 2013b. Syntactic patterns versus word alignment: Extracting opinion targets from online reviews.

Tengfei Ma and Xiaojun Wan. 2010. Opinion target extraction in Chinese news comments. In Chu-Ren Huang and Dan Jurafsky, editors, COLING (Posters), pages 782–790. Chinese Information Processing Society of China.

Samaneh Moghaddam and Martin Ester. 2011. ILDA: Interdependent LDA model for learning latent aspects and their ratings from online product reviews. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages 665–674, New York, NY, USA. ACM.

Samaneh Moghaddam and Martin Ester. 2012a. Aspect-based opinion mining from product reviews. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '12, pages 1184–1184, New York, NY, USA. ACM.

Samaneh Moghaddam and Martin Ester. 2012b. On the design of LDA models for aspect-based opinion mining. In CIKM, pages 803–812.

Arjun Mukherjee and Bing Liu. 2012. Aspect extraction through semi-supervised modeling. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL '12, pages 339–348, Stroudsburg, PA, USA. Association for Computational Linguistics.

Ana-Maria Popescu and Oren Etzioni. 2005. Extracting product features and opinions from reviews. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pages 339–346, Stroudsburg, PA, USA. Association for Computational Linguistics.

Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. 2009. Expanding domain sentiment lexicon through double propagation.

Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. 2011. Opinion word expansion and target extraction through double propagation. Computational Linguistics, 37(1):9–27.

Qi Su, Xinying Xu, Honglei Guo, Zhili Guo, Xian Wu, Xiaoxun Zhang, Bin Swen, and Zhong Su. 2008. Hidden sentiment association in Chinese web opinion mining. In Jinpeng Huai, Robin Chen, Hsiao-Wuen Hon, Yunhao Liu, Wei-Ying Ma, Andrew Tomkins, and Xiaodong Zhang, editors, WWW, pages 959–968. ACM.

Bo Wang and Houfeng Wang. 2008. Bootstrapping both product features and opinion words from Chinese customer reviews with cross-inducing.

Hongning Wang, Yue Lu, and ChengXiang Zhai. 2011. Latent aspect rating analysis without aspect keyword supervision. In Chid Apte, Joydeep Ghosh, and Padhraic Smyth, editors, KDD, pages 618–626. ACM.

Yuanbin Wu, Qi Zhang, Xuanjing Huang, and Lide Wu. 2009. Phrase dependency parsing for opinion mining. In EMNLP, pages 1533–1541. ACL.

Bishan Yang and Claire Cardie. 2013. Joint inference for fine-grained opinion extraction. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1640–1649, Sofia, Bulgaria, August. Association for Computational Linguistics.

Lei Zhang, Bing Liu, Suk Hwan Lim, and Eamonn O'Brien-Strain. 2010. Extracting and ranking product features in opinion documents. In Chu-Ren Huang and Dan Jurafsky, editors, COLING (Posters), pages 1462–1470. Chinese Information Processing Society of China.

Wayne Xin Zhao, Jing Jiang, Hongfei Yan, and Xiaoming Li. 2010. Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP '10, pages 56–65, Stroudsburg, PA, USA. Association for Computational Linguistics.

Jingbo Zhu, Huizhen Wang, Benjamin K. Tsou, and Muhua Zhu. 2009. Multi-aspect opinion polling from textual reviews. In David Wai-Lok Cheung, Il-Yeol Song, Wesley W. Chu, Xiaohua Hu, and Jimmy J. Lin, editors, CIKM, pages 1799–1802. ACM.