a community-based approach for link prediction in signed social

11
Research Article A Community-Based Approach for Link Prediction in Signed Social Networks Saeed Reza Shahriary, 1 Mohsen Shahriari, 2 and Rafidah MD Noor 1 1 Department of Computer System & Technology, Faculty of Computer Science & Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia 2 Advanced Community and Information System, RWTH Aachen University, Ahornstraße 55, 52056 Aachen, Germany Correspondence should be addressed to Saeed Reza Shahriary; [email protected], Mohsen Shahriari; [email protected] and Rafidah MD Noor; fi[email protected] Received 28 February 2014; Accepted 8 October 2014 Academic Editor: Przemyslaw Kazienko Copyright © 2015 Saeed Reza Shahriary et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In signed social networks, relationships among nodes are of the types positive (friendship) and negative (hostility). One absorbing issue in signed social networks is predicting sign of edges among people who are members of these networks. Other than edge sign prediction, one can define importance of people or nodes in networks via ranking algorithms. ere exist few ranking algorithms for signed graphs; also few studies have shown role of ranking in link prediction problem. Hence, we were motivated to investigate ranking algorithms availed for signed graphs and their effect on sign prediction problem. is paper makes the contribution of using community detection approach for ranking algorithms in signed graphs. erefore, community detection which is another active area of research in social networks is also investigated in this paper. Community detection algorithms try to find groups of nodes in which they share common properties like similarity. We were able to devise three community-based ranking algorithms which are suitable for signed graphs, and also we evaluated these ranking algorithms via sign prediction problem. ese ranking algorithms were tested on three large-scale datasets: Epinions, Slashdot, and Wikipedia. We indicated that, in some cases, these ranking algorithms outperform previous works because their prediction accuracies are better. 1. Introduction Recently, social network analysis has attracted great deal of attentions. In social networks, nodes and edges, respectively, indicate people and relationships among them [1]. Social networks are dynamic and evolve over time via registering new members, deleting profiles, and adding/removing some edges or connections among entities [2]. Hence, plenty of studies have investigated this field in order to model these structures. One of the most important problems in social networks is link prediction which can be stated as follows: with how much determinism one can predict forming (lack- ing) of edge between two people based on available structure of the graph? e importance of this subject is originated from the natural sparsity of social networks [2]. In other words, social networks encompass highly dynamic structure; therefore, available links are just a subset of possible relations among people and some new links will form in future. Link predication is also widely used in retrieving lost data and it probably helps to construct the graph [3]. One could model social systems by using signed rela- tionships. Inherently in signed graphs, most of relations are positive or negative such as likes and dislikes or trusts and distrusts [4]. Negative edges play an important role in signed networks and these negative links impress greatly on importance of nodes in the system. Studying negative rela- tionships in signed graphs can help in analyzing and better understanding social ecosystems. Link prediction in signed graph appears in the form of predicting sign of edge between two people. erefore, one important question which comes to mind in signed networks is that how accurately sign of an edge can be predicted according to local and global behavioral patterns in the network. Not only sign prediction enables us to have better understanding of social relations but also it can be utilized in several applications such as recommender systems and online social networks, in which Hindawi Publishing Corporation Scientific Programming Volume 2015, Article ID 602690, 10 pages http://dx.doi.org/10.1155/2015/602690

Upload: duongnhi

Post on 14-Jan-2017

220 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: A Community-Based Approach for Link Prediction in Signed Social

Research ArticleA Community-Based Approach for Link Prediction inSigned Social Networks

Saeed Reza Shahriary1 Mohsen Shahriari2 and Rafidah MD Noor1

1Department of Computer System amp Technology Faculty of Computer Science amp Information Technology University of Malaya50603 Kuala Lumpur Malaysia2Advanced Community and Information System RWTH Aachen University Ahornstraszlige 55 52056 Aachen Germany

Correspondence should be addressed to Saeed Reza Shahriary shahriarysaeedrezagmailcomMohsen Shahriari shahriaridbisrwth-aachende and Rafidah MD Noor fidahsiswaumedumy

Received 28 February 2014 Accepted 8 October 2014

Academic Editor Przemyslaw Kazienko

Copyright copy 2015 Saeed Reza Shahriary et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

In signed social networks relationships among nodes are of the types positive (friendship) and negative (hostility) One absorbingissue in signed social networks is predicting sign of edges among people who are members of these networks Other than edge signprediction one can define importance of people or nodes in networks via ranking algorithms There exist few ranking algorithmsfor signed graphs also few studies have shown role of ranking in link prediction problem Hence we were motivated to investigateranking algorithms availed for signed graphs and their effect on sign prediction problem This paper makes the contribution ofusing community detection approach for ranking algorithms in signed graphs Therefore community detection which is anotheractive area of research in social networks is also investigated in this paper Community detection algorithms try to find groups ofnodes in which they share common properties like similarity We were able to devise three community-based ranking algorithmswhich are suitable for signed graphs and also we evaluated these ranking algorithms via sign prediction problem These rankingalgorithms were tested on three large-scale datasets Epinions Slashdot and Wikipedia We indicated that in some cases theseranking algorithms outperform previous works because their prediction accuracies are better

1 Introduction

Recently social network analysis has attracted great deal ofattentions In social networks nodes and edges respectivelyindicate people and relationships among them [1] Socialnetworks are dynamic and evolve over time via registeringnew members deleting profiles and addingremoving someedges or connections among entities [2] Hence plenty ofstudies have investigated this field in order to model thesestructures One of the most important problems in socialnetworks is link prediction which can be stated as followswith how much determinism one can predict forming (lack-ing) of edge between two people based on available structureof the graph The importance of this subject is originatedfrom the natural sparsity of social networks [2] In otherwords social networks encompass highly dynamic structuretherefore available links are just a subset of possible relationsamong people and some new links will form in future Link

predication is also widely used in retrieving lost data and itprobably helps to construct the graph [3]

One could model social systems by using signed rela-tionships Inherently in signed graphs most of relationsare positive or negative such as likes and dislikes or trustsand distrusts [4] Negative edges play an important role insigned networks and these negative links impress greatly onimportance of nodes in the system Studying negative rela-tionships in signed graphs can help in analyzing and betterunderstanding social ecosystems Link prediction in signedgraph appears in the form of predicting sign of edge betweentwo people Therefore one important question which comesto mind in signed networks is that how accurately signof an edge can be predicted according to local and globalbehavioral patterns in the network Not only sign predictionenables us to have better understanding of social relationsbut also it can be utilized in several applications such asrecommender systems and online social networks in which

Hindawi Publishing CorporationScientific ProgrammingVolume 2015 Article ID 602690 10 pageshttpdxdoiorg1011552015602690

2 Scientific Programming

they offer new friends for users In these networks users havecapability of expressing their views toward others via binaryminus1 and +1 values [5 6]

In this paper three community-based ranking algorithmsfor ranking of nodes have been proposed andwehave studiedtheir impacts on the edge sign prediction problem In order tostudy the impact of proposed ranking algorithms on signedprediction problem we extracted the features of the predictorbased on reputation and optimism introduced in [7] Rep-utation of a node shows how much reputable a node is inthe system and optimism denotes voting pattern of the nodetoward others We assess our method by utilizing logisticregression classifier and running algorithms on real socialnetwork datasets The structure of this paper is organized asfollows

In Section 2 related works are brought In Section 3we mainly introduce proposed algorithms moreover theproblem of sign prediction is defined and rank-based fea-tures are introduced as features for the prediction taskWe also separately go through community detection prob-lem community-based ranking algorithms and the logisticregression classifier In Section 4 datasets for experimentalpurposes are introduced and implementation results are alsodemonstrated In Section 5 the discussion ismade and finallyin Section 6 conclusion and future directions are mentioned

2 Related Works

There are two major categories of methods used in link pre-diction firstly those approaches that utilize local informationof the graph which focus on the local structure of nodesAmong local approaches [8] has the best performance in linkprediction between two specific nodes Common neighborindex is also known as friend of friend algorithm (FOAF)is used by many online social networks for recommendingfriends such as Facebook FOAF determines the similarityof two nodes that tend to communicate with each other onthe basis of counting number of joint neighbors [9] Othermetrics for computing similarity are based on preferentialattachments where these measures are calculated based onmultiplying or summing of nodes degree Second categoryconcentrates on global structure of the network and detectingoverall features in order to find how strongly two nodes aresimilar There are also diverse global approaches which usethe whole adjacency matrix in order to predict hidden linksfor instance shortest path algorithm PWR algorithm andSimiRank algorithm [1 10]

In sign prediction the most notable and remarkablemethods are divided into two categories Belief MatrixModel[5] and machine learning approaches [11] Belief MatrixModel was introduced by [5] andwas proposed for predictingtrust or distrust between two particular users in signednetworks It was the fundamental model in sign prediction ofedges Reference [11] employed the idea of signed triads andused logistic regressionmodel and some local feature in orderto predict sign of edges in social networks The features thatwere introduced by [11] are categorized in two classes firstone is on the base of the positivenegative ingoingoutgoingdegree of nodes which basically collect the local information

of nodes And the second group is based on the extractedprinciples from social psychology in which we are ableto determine the type of 119906 and V relation by utilizing theinformation of third party like 119908

Ranking of nodes has tight relationship with sign pre-diction problem so we also investigate ranking in signednetworks Ranking of nodes is the problemof computing howmuch important or trustable a node is in networks [12] Thecentrality measures like betweenness [13] closeness [14] andeigenvector centrality [15] were introduced to compute nodesrsquoimportance degree in the network Other algorithms likeHITS [16] and PageRank [17] were added in 1990 All of theseranking algorithms are designed for positive graphs and thereare merely several literatures for ranking of nodes in signednetworks The simplest ranking algorithm for signed graphsis prestige where number of positive and negative incominglinks determine ranking of each node [18] Another rankingalgorithm is PageTrust that was introduced by [19] Thismethod is extension of PageRank and the main differenceis that nodes with negative incoming links will be visitedless in random walk process Exponential ranking is anotherchiefmethodof ranking for signed graphs [12] In exponentialranking the value of ranking vector globally is obtainedfrom local trust values Another ranking algorithm for signednetworks that is greatly similar to HITS was proposed by[20] This method utilizes the concept of Bias and Deservewhich underestimates the vote of optimistic and pessimisticnodes Reference [3] also proposed new ranking algorithmsfor signed networks namely Modified HITS and ModifiedPageRank

Because we propose community-based ranking algo-rithms we should go through community detection prob-lem Community detection algorithms help to prepare moredominant recommendation systems and web page clusteringwhich have great effect on better searches [21] Commu-nity detection algorithms attempt to cluster edgesnodes inorder to have minimum number of edges between denselycommunities [22] One of the most widely used methodsfor community detection in unsigned graphs was proposedby [23] As for signed networks [24] proposed a two-stepspectral approach which was an extension to modularityThe main problem related to modularity is resolution limitin which very small communities might not be detected Inorder to address this problem [22] proposed newmethod fordetecting communities on signed graphs by extending pottsmodel Reference [25] also introduced useful approach thatworks on the base of blocking method

3 Method

In this paper authors intend to investigate the community-based problem of predicting sign of links in signed socialnetworks Hence in this section as well as proposed algo-rithms and methods the problems of sign prediction andcommunity detection will be discussed in detail

31 Edge Sign Prediction In order to define the problemformally it can be assumed that we have a signed directedgraph 119866(119881 119864) that 119881 represents set of vertices and 119864 shows

Scientific Programming 3

set of edges where customers and users can vote positivelyand negatively toward each other So the aforementionednotation 119881 represents users of site and 119864 indicates +1 andminus1 relations among them In all over the paper the personwho gives positive vote and receives it is named trustor andtrustee respectively [2 26] The sign prediction problem canbe defined as follows Suppose that signs of some links inthe network are hidden and the goal is to reliably predictvalues of these edges by current information in the graphThe sign prediction problem tries to find signs of hiddenedges with negligible error [1 27] In this work we proposestate-of-the-art community-based ranking algorithms andwe evaluate their effectiveness via sign prediction problem onthree datasets Epinoins Slashdot and Wikipedia

To this end [7] already introduced rank-based featuresnamed Optimism and Reputation to connect ranking prob-lem with sign prediction Rank-based reputation of node 119894indicates patterns of voting toward this node Meaningfullyrank-based reputation of node 119894 not only considers numberof positivenegative incoming links toward node 119894 but alsoit takes into account ranks of nodes who vote toward node119894 In other words when a person receives several positiveincoming links she might not be very reputable because oneshould consider rank of voters toward node 119894 If the userswho vote toward node 119894 are high rank then node 119894 can beconsidered reputable but if they are not high rank we cannotsay that node 119894 is reputable although the number of positiveincoming links toward node 119894 is relatively highThe followingequation can better describe rank-based reputation [7]

RBR119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

(1)

where RBR119894is the value of rank-based reputation of node 119894

119877

(+)

in (119894) indicates sum of rank values of nodes who positivelyvoted toward node 119894 and similarly 119877(minus)in (119894) is sum of rankvalues of nodes who negatively voted toward node 119894 In thesame vein one can define rank-based optimism of node 119894 asfollows

RBO119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

(2)

where RBO119894is the value of rank-based optimism of node 119894

119877

(+)

out(119894) refers to sum of rank values of nodes whom node 119894positively voted toward them and similarly 119877(minus)out(119894) is sum ofrank values of nodes in which node 119894 negatively voted towardthem As formula (2) shows node 119894 which generates severalpositive outgoing links might not be optimistic because thisset might contain nodes in which they are low rank [3]In order to compute these features we need algorithms torank nodes As for ranking algorithms we propose threecommunity-based ranking algorithms in the next sections

32 Community-Based Ranking Algorithms to Compute RBRand RBO In this section we propose three ranking algo-rithms in which all of them work based on communitydetection problem in signed graphs In otherwords firstly we

a a

bb

c c

+ +

+ +

minus minus

(a) Stable configurations

a a

bb

cc

+ +

minus minus

minus minus

(b) Unstable configurations

Figure 1 All possible states of balance theory

run a community detection algorithm on signed networksThe results will be disjoint communities of nodes As allcommunity detection algorithms work based on a densitybased approach in which they try to maximize density ofintracluster edges and minimize between cluster edges sointracluster nodes are more dense and close From socialperspective intracluster nodes might know each other better(this is the notion behind our community-based rankingalgorithm) Meaningfully nodes in the same communityare much more familiar than nodes that are in differentcommunities Via using this philosophy about intraclusternodes we change previous ranking algorithm like PrestigeHITS and PageRank [3] to have influence of intra- andextracluster nodes with parameters 120572 and (1minus120572) respectivelyThen we can use ranking-based features of [7] for the caseof sign prediction Because first phase of the algorithms iscommunity detection sowe investigate community detectionproblem and a sample community detection algorithm insigned graphs in Section 321 In this paper a communitydetection algorithmbased on social balance theory is utilizedIn Section 33 ranking algorithms based on communitydetection phase are introduced

321 CommunityDetection Thealgorithmused in this paperis based on structural balance theory [28] In balance theorythere are four possible states when nodes are in signedrelations in social networks [29] One can differentiate thesestates by number of positive and negative edges in each triad[30] On the base of strong social balance theory when allof nodes have positive relation or two nodes share the sameenemy these states are called stable Similarly cases withall nodes have negative edges or with two positive edgesare unstable states [31] Regarding this definition a networkwith more than three nodes is structurally balanced if all thepossible triads are stable [32] (Figure 1)

The basic structural theorem states that these triples canbe partitioned into two distinct sets in which all the positiverelations are inside sets and negative ones are among them[33 34] In other words negative edges connect positive setsOn the basis of this definition a network is called 119896-balancedif all positive edges are located in 119896 number of differentcategories and these sets are joined with negative relations[35] In reality rarely there are structurally balanced net-worksThere are always some edges that destabilize the graphand transform it into unstable configuration Therefore thenumber of positive edges between clusters and the number ofnegative edges inside clusters should be minimized [24] Infact the problem is like finding the best sets with minimumnumber of positive relations between partitions and alsominimum number of negative edges inside sets [36]

4 Scientific Programming

Reference [25] introduced one criterion function whichmakes decision based on counting number of elementshaving conflict with 119896-balanced theory It can be defined as ifone considers119873 as number of negative edges inside clustersand 119875 as number of positive edges between clusters thennumber of inconsistencies which is denoted by 119868(119888) can bementioned as

119868 (119888) = 120573 times 119873 + (1 minus 120573) times 119875 (3)

where 120573 is the importance factor that is assigned to positiveand negative inconsistencies

If 120573 = 05 then positive and negative relations contributeequally on amount of inconsistency And for the case that05 lt 120573 le 1 the negative relations have more impact onresult and when 0 lt 120573 le 05 positive edges have higherinfluence The ideal condition is created when 119875 and119873 havethe smallest amount so better result is achieved Because 119868(119888)show error the algorithm tries to find minimum value for119873and 119875 via using a hill-climbing optimization technique [25]Other community detection approaches suitable for signedgraphs can also be used in this phase

33 Community-Based Ranking Methods In this paper wepropose three new ranking algorithms for signed complexnetworks that are dependent on community detection Weintroduce amethod that ties the concept of ranking algorithmand community detection Suppose that we intend to com-pute the rank of node 119894 in the network First of all we clusternodes in the network in such a way that each node belongsto one community in the graph (first phase algorithmreferenced in Section 321) so for calculating rank of node119894 by taking influence of other nodes in the network we givepriority and high importance to the nodes that belong to thesame community which node 119894 belongs to These algorithmsare described in detail in the following subsections (secondphase of ranking) We will verify rationality of these rankingalgorithms in Results section

331 PBCD (Prestige RankingAlgorithmBased onCommunityDetection) Prestige is the simplest algorithm in signed com-plex network [18] In this method the most important factorfor determining ranking of each node in the system is thenumber of positive and negative incoming nodes that eachnode receives from others In other words if a node hasmanypositive incoming links therefore its prestige is high in thenetwork And it is also true for negative links if the numberof positive incoming links is less than negative ones the nodehas low prestige in comparison with the other nodes in thesystem [3]The idea of community-based ranking inspires usto incorporate our method with some well-known rankingalgorithms like prestige The proposed prestige can be statedas follows

Pr (119894) = [120572 times1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

+ [(1 minus 120572) times

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

(4)

where 120572 is the impact factor to determine degree of impor-tance of nodes that are in the same community that node 119894belongs to and 119894119889+in(119894) and 119894119889

minus

in(119894) respectively indicate setof nodes who positively and negatively voted toward 119894 andthese nodes are members of the same community that node119894 belongs to Similarly 119900119889+in(119894) and 119900119889

minus

in(119894) show set of nodeswho positively and negatively voted toward node 119894 and thesenodes are members of other communities which are differentfrom the community that node 119894 belongs to Moreover | |represents magnitude of the set of nodes and the in subscriptindicates that we only consider incoming links from 119894119889(119894) and119900119889(119894) sets toward node 119894 Finally 119894 isin 119888 119888 isin 119862 119888 is thecommunity which node 119894 belongs to and 119862 is the disjointsubgraph of all clusters detected via the algorithm In thisnotation all members of 119894119889(119894) are in 119888 and none of 119900119889(119894)members are in 119888 As intraclusters nodes of communities aremore close to each other ranking algorithm can utilize theinfluence of intra clusters nodes closeness

332 HBCD (HITS Ranking Algorithm Based on CommunityDetection) HITS algorithm was introduced by [16] and itwas mainly proposed for exploiting helpful information inorder to analyze structure of links and has been appliedin various applications This algorithm works based on huband authority vectors These vectors are initialized withsome predefined (random) values and converge after somerecursive iterations [16]

Reference [3] introduced modified version of HITS inwhich the graph is divided into two positive and negativeparts and then run the algorithm on each graph separatelyIn HITS algorithm there are two vectors authority and hubwhich finally converge after enough iteration We propose anew version of HITS in which there is distinction betweenimportance of local neighbor nodes and those members ofdifferent communities that node 119894 belongs to The HITSalgorithm based on community detection can be stated asfollows

119886

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)out (119894)

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)out (119894)

(+)

119895(119905))

119886

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)out (119894)

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)out (119894)

(minus)

119895(119905))

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)in (119894)

119886

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)in (119894)

119886

(+)

119895(119905))

Scientific Programming 5

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)in (119894)

119886

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)in (119894)

119886

(minus)

119895(119905))

(5)

where 120572 indicates importance factor that is given to thelocal neighbors and ℎ(119905) and 119886(119905) show hub and authorityvectors at time 119905 119894119889(+)inout(119894) and 119894119889

(minus)

inout(119894) respectively rep-resent set of nodes who have relations (positivenegative orincomingoutgoing) with node 119894 and they are members of thesame community that node 119894 belongs to In a similar manner119900119889

(+)

inout(119894) and 119900119889(minus)

inout(119894) indicate set of nodes in which theyhave relations (positivenegative incomingoutgoing) withnode 119894 and they are members of different communities thatnode 119894 belongs to Finally in and out subscripts respectivelyshow that 119894119889(119894) or 119900119889(119894) represents set of nodes that votedtoward node 119894 (incoming links toward node 119894) or being votedby node 119894 (outgoing links from node 119894) Hub and authoritiesare initialized with some random values and they convergeafter enough iteration

333 RBCD (PageRank Algorithm Based on CommunityDetection) PageRank is one of the most widely used meth-ods for ranking of nodes [37] It was extracted from GoogleLarry page PageRank uses the concept of random walkthat leads to probability distribution which computes thepossibility of randomly going from one node to anotherone and finally gets to one specific node This algorithminitiallywas introduced for graphswith positive andunsignededges especially for web pages on the internet Reference[3] introduced modified version of PageRank in which thegraph is divided into two parts and the algorithm is runon each graph separately and finally for calculating rankingvector negative ranking vector is subtracted from positiveone Our proposed ranking algorithm states that each node inthe network belongs to specific community or cluster and thegeneral idea of specifying ranking is on the basis of utilizingother nodesrsquo information In other words to compute rankof node 119894 we give priority to the nodes that belong tothe same community that node 119894 belongs to Based on thisopinion PageRank based on community detection can bedefined as

PR(+)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

PR(minus)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

(6)

where in the above equations 120572 denotes importance factoris assigned to the local neighbor nodes and 119905 shows therank at the time of 119905 120573 indicates the forgetting factor and119873 represents number of nodes in the graph 119894119889in(119894) showsset of incoming links to node 119894 that belong to the samecommunity that node 119894 belongs to Similarly 119900119889in(119894) indicatesset of incoming links to node 119894 that belong to differentcommunities that node 119894 belongs to 119889(+)out(119895) and 119889(minus)out(119895)involves number of positive and negative outgoing links fromnode 119895 respectively

34 Classifier Classification is the process of assigning datato one of predetermined classes Here our classes are themapped values of positive and negative signs to +1 and minus1values This process is done via using training set whichcontains some features to constitute a classifier and thenevaluation via using a test set A brief explanation of theclassifier used in our work is brought here

Logistic regression logistic regression or sigmoid func-tion is a monotonic continuous function which lies betweenminus1 and +1 values and is a method of learning function of theform 119891 119883 rarr 119884 or 119875(119884 | 119883) They can be mathematicallydefined as follows

119875 (119884 = 1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

119875 (119884 = minus1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

(7)

119884 is the discrete values of classes which here are +1 andminus1 119883 is the input vector of discrete or continuous valuesand 119908

119894are some learning coefficient learnt by the model

Classifiers are evaluated based on accuracy Accuracy metricis calculated based on TP FP TN and FN as follows [38 39]

accuracy = TP + TNTP + TN + FP + FN

(8)

We used 10-fold validation in order to evaluate general-ization of the models Cross validation is a classical methodin which the dataset are partitioned into 119896 foldsThe first oneis used as test set and the remaining folds are considered astraining sets In the next phase the second fold is selected asthe test set and the remaining are chosen as training sets [40]We utilized WEKA software for computing accuracy that isavailable through httpwwwcswaikatoacnzmlweka

4 Experiments

In order to verify proposed algorithms introduced inSection 33 we executed them on three large datasets

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: A Community-Based Approach for Link Prediction in Signed Social

2 Scientific Programming

they offer new friends for users In these networks users havecapability of expressing their views toward others via binaryminus1 and +1 values [5 6]

In this paper three community-based ranking algorithmsfor ranking of nodes have been proposed andwehave studiedtheir impacts on the edge sign prediction problem In order tostudy the impact of proposed ranking algorithms on signedprediction problem we extracted the features of the predictorbased on reputation and optimism introduced in [7] Rep-utation of a node shows how much reputable a node is inthe system and optimism denotes voting pattern of the nodetoward others We assess our method by utilizing logisticregression classifier and running algorithms on real socialnetwork datasets The structure of this paper is organized asfollows

In Section 2 related works are brought In Section 3we mainly introduce proposed algorithms moreover theproblem of sign prediction is defined and rank-based fea-tures are introduced as features for the prediction taskWe also separately go through community detection prob-lem community-based ranking algorithms and the logisticregression classifier In Section 4 datasets for experimentalpurposes are introduced and implementation results are alsodemonstrated In Section 5 the discussion ismade and finallyin Section 6 conclusion and future directions are mentioned

2 Related Works

There are two major categories of methods used in link pre-diction firstly those approaches that utilize local informationof the graph which focus on the local structure of nodesAmong local approaches [8] has the best performance in linkprediction between two specific nodes Common neighborindex is also known as friend of friend algorithm (FOAF)is used by many online social networks for recommendingfriends such as Facebook FOAF determines the similarityof two nodes that tend to communicate with each other onthe basis of counting number of joint neighbors [9] Othermetrics for computing similarity are based on preferentialattachments where these measures are calculated based onmultiplying or summing of nodes degree Second categoryconcentrates on global structure of the network and detectingoverall features in order to find how strongly two nodes aresimilar There are also diverse global approaches which usethe whole adjacency matrix in order to predict hidden linksfor instance shortest path algorithm PWR algorithm andSimiRank algorithm [1 10]

In sign prediction the most notable and remarkablemethods are divided into two categories Belief MatrixModel[5] and machine learning approaches [11] Belief MatrixModel was introduced by [5] andwas proposed for predictingtrust or distrust between two particular users in signednetworks It was the fundamental model in sign prediction ofedges Reference [11] employed the idea of signed triads andused logistic regressionmodel and some local feature in orderto predict sign of edges in social networks The features thatwere introduced by [11] are categorized in two classes firstone is on the base of the positivenegative ingoingoutgoingdegree of nodes which basically collect the local information

of nodes And the second group is based on the extractedprinciples from social psychology in which we are ableto determine the type of 119906 and V relation by utilizing theinformation of third party like 119908

Ranking of nodes has tight relationship with sign pre-diction problem so we also investigate ranking in signednetworks Ranking of nodes is the problemof computing howmuch important or trustable a node is in networks [12] Thecentrality measures like betweenness [13] closeness [14] andeigenvector centrality [15] were introduced to compute nodesrsquoimportance degree in the network Other algorithms likeHITS [16] and PageRank [17] were added in 1990 All of theseranking algorithms are designed for positive graphs and thereare merely several literatures for ranking of nodes in signednetworks The simplest ranking algorithm for signed graphsis prestige where number of positive and negative incominglinks determine ranking of each node [18] Another rankingalgorithm is PageTrust that was introduced by [19] Thismethod is extension of PageRank and the main differenceis that nodes with negative incoming links will be visitedless in random walk process Exponential ranking is anotherchiefmethodof ranking for signed graphs [12] In exponentialranking the value of ranking vector globally is obtainedfrom local trust values Another ranking algorithm for signednetworks that is greatly similar to HITS was proposed by[20] This method utilizes the concept of Bias and Deservewhich underestimates the vote of optimistic and pessimisticnodes Reference [3] also proposed new ranking algorithmsfor signed networks namely Modified HITS and ModifiedPageRank

Because we propose community-based ranking algo-rithms we should go through community detection prob-lem Community detection algorithms help to prepare moredominant recommendation systems and web page clusteringwhich have great effect on better searches [21] Commu-nity detection algorithms attempt to cluster edgesnodes inorder to have minimum number of edges between denselycommunities [22] One of the most widely used methodsfor community detection in unsigned graphs was proposedby [23] As for signed networks [24] proposed a two-stepspectral approach which was an extension to modularityThe main problem related to modularity is resolution limitin which very small communities might not be detected Inorder to address this problem [22] proposed newmethod fordetecting communities on signed graphs by extending pottsmodel Reference [25] also introduced useful approach thatworks on the base of blocking method

3 Method

In this paper authors intend to investigate the community-based problem of predicting sign of links in signed socialnetworks Hence in this section as well as proposed algo-rithms and methods the problems of sign prediction andcommunity detection will be discussed in detail

31 Edge Sign Prediction In order to define the problemformally it can be assumed that we have a signed directedgraph 119866(119881 119864) that 119881 represents set of vertices and 119864 shows

Scientific Programming 3

set of edges where customers and users can vote positivelyand negatively toward each other So the aforementionednotation 119881 represents users of site and 119864 indicates +1 andminus1 relations among them In all over the paper the personwho gives positive vote and receives it is named trustor andtrustee respectively [2 26] The sign prediction problem canbe defined as follows Suppose that signs of some links inthe network are hidden and the goal is to reliably predictvalues of these edges by current information in the graphThe sign prediction problem tries to find signs of hiddenedges with negligible error [1 27] In this work we proposestate-of-the-art community-based ranking algorithms andwe evaluate their effectiveness via sign prediction problem onthree datasets Epinoins Slashdot and Wikipedia

To this end [7] already introduced rank-based featuresnamed Optimism and Reputation to connect ranking prob-lem with sign prediction Rank-based reputation of node 119894indicates patterns of voting toward this node Meaningfullyrank-based reputation of node 119894 not only considers numberof positivenegative incoming links toward node 119894 but alsoit takes into account ranks of nodes who vote toward node119894 In other words when a person receives several positiveincoming links she might not be very reputable because oneshould consider rank of voters toward node 119894 If the userswho vote toward node 119894 are high rank then node 119894 can beconsidered reputable but if they are not high rank we cannotsay that node 119894 is reputable although the number of positiveincoming links toward node 119894 is relatively highThe followingequation can better describe rank-based reputation [7]

RBR119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

(1)

where RBR119894is the value of rank-based reputation of node 119894

119877

(+)

in (119894) indicates sum of rank values of nodes who positivelyvoted toward node 119894 and similarly 119877(minus)in (119894) is sum of rankvalues of nodes who negatively voted toward node 119894 In thesame vein one can define rank-based optimism of node 119894 asfollows

RBO119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

(2)

where RBO119894is the value of rank-based optimism of node 119894

119877

(+)

out(119894) refers to sum of rank values of nodes whom node 119894positively voted toward them and similarly 119877(minus)out(119894) is sum ofrank values of nodes in which node 119894 negatively voted towardthem As formula (2) shows node 119894 which generates severalpositive outgoing links might not be optimistic because thisset might contain nodes in which they are low rank [3]In order to compute these features we need algorithms torank nodes As for ranking algorithms we propose threecommunity-based ranking algorithms in the next sections

32 Community-Based Ranking Algorithms to Compute RBRand RBO In this section we propose three ranking algo-rithms in which all of them work based on communitydetection problem in signed graphs In otherwords firstly we

a a

bb

c c

+ +

+ +

minus minus

(a) Stable configurations

a a

bb

cc

+ +

minus minus

minus minus

(b) Unstable configurations

Figure 1 All possible states of balance theory

run a community detection algorithm on signed networksThe results will be disjoint communities of nodes As allcommunity detection algorithms work based on a densitybased approach in which they try to maximize density ofintracluster edges and minimize between cluster edges sointracluster nodes are more dense and close From socialperspective intracluster nodes might know each other better(this is the notion behind our community-based rankingalgorithm) Meaningfully nodes in the same communityare much more familiar than nodes that are in differentcommunities Via using this philosophy about intraclusternodes we change previous ranking algorithm like PrestigeHITS and PageRank [3] to have influence of intra- andextracluster nodes with parameters 120572 and (1minus120572) respectivelyThen we can use ranking-based features of [7] for the caseof sign prediction Because first phase of the algorithms iscommunity detection sowe investigate community detectionproblem and a sample community detection algorithm insigned graphs in Section 321 In this paper a communitydetection algorithmbased on social balance theory is utilizedIn Section 33 ranking algorithms based on communitydetection phase are introduced

321 CommunityDetection Thealgorithmused in this paperis based on structural balance theory [28] In balance theorythere are four possible states when nodes are in signedrelations in social networks [29] One can differentiate thesestates by number of positive and negative edges in each triad[30] On the base of strong social balance theory when allof nodes have positive relation or two nodes share the sameenemy these states are called stable Similarly cases withall nodes have negative edges or with two positive edgesare unstable states [31] Regarding this definition a networkwith more than three nodes is structurally balanced if all thepossible triads are stable [32] (Figure 1)

The basic structural theorem states that these triples canbe partitioned into two distinct sets in which all the positiverelations are inside sets and negative ones are among them[33 34] In other words negative edges connect positive setsOn the basis of this definition a network is called 119896-balancedif all positive edges are located in 119896 number of differentcategories and these sets are joined with negative relations[35] In reality rarely there are structurally balanced net-worksThere are always some edges that destabilize the graphand transform it into unstable configuration Therefore thenumber of positive edges between clusters and the number ofnegative edges inside clusters should be minimized [24] Infact the problem is like finding the best sets with minimumnumber of positive relations between partitions and alsominimum number of negative edges inside sets [36]

4 Scientific Programming

Reference [25] introduced one criterion function whichmakes decision based on counting number of elementshaving conflict with 119896-balanced theory It can be defined as ifone considers119873 as number of negative edges inside clustersand 119875 as number of positive edges between clusters thennumber of inconsistencies which is denoted by 119868(119888) can bementioned as

119868 (119888) = 120573 times 119873 + (1 minus 120573) times 119875 (3)

where 120573 is the importance factor that is assigned to positiveand negative inconsistencies

If 120573 = 05 then positive and negative relations contributeequally on amount of inconsistency And for the case that05 lt 120573 le 1 the negative relations have more impact onresult and when 0 lt 120573 le 05 positive edges have higherinfluence The ideal condition is created when 119875 and119873 havethe smallest amount so better result is achieved Because 119868(119888)show error the algorithm tries to find minimum value for119873and 119875 via using a hill-climbing optimization technique [25]Other community detection approaches suitable for signedgraphs can also be used in this phase

33 Community-Based Ranking Methods In this paper wepropose three new ranking algorithms for signed complexnetworks that are dependent on community detection Weintroduce amethod that ties the concept of ranking algorithmand community detection Suppose that we intend to com-pute the rank of node 119894 in the network First of all we clusternodes in the network in such a way that each node belongsto one community in the graph (first phase algorithmreferenced in Section 321) so for calculating rank of node119894 by taking influence of other nodes in the network we givepriority and high importance to the nodes that belong to thesame community which node 119894 belongs to These algorithmsare described in detail in the following subsections (secondphase of ranking) We will verify rationality of these rankingalgorithms in Results section

331 PBCD (Prestige RankingAlgorithmBased onCommunityDetection) Prestige is the simplest algorithm in signed com-plex network [18] In this method the most important factorfor determining ranking of each node in the system is thenumber of positive and negative incoming nodes that eachnode receives from others In other words if a node hasmanypositive incoming links therefore its prestige is high in thenetwork And it is also true for negative links if the numberof positive incoming links is less than negative ones the nodehas low prestige in comparison with the other nodes in thesystem [3]The idea of community-based ranking inspires usto incorporate our method with some well-known rankingalgorithms like prestige The proposed prestige can be statedas follows

Pr (119894) = [120572 times1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

+ [(1 minus 120572) times

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

(4)

where 120572 is the impact factor to determine degree of impor-tance of nodes that are in the same community that node 119894belongs to and 119894119889+in(119894) and 119894119889

minus

in(119894) respectively indicate setof nodes who positively and negatively voted toward 119894 andthese nodes are members of the same community that node119894 belongs to Similarly 119900119889+in(119894) and 119900119889

minus

in(119894) show set of nodeswho positively and negatively voted toward node 119894 and thesenodes are members of other communities which are differentfrom the community that node 119894 belongs to Moreover | |represents magnitude of the set of nodes and the in subscriptindicates that we only consider incoming links from 119894119889(119894) and119900119889(119894) sets toward node 119894 Finally 119894 isin 119888 119888 isin 119862 119888 is thecommunity which node 119894 belongs to and 119862 is the disjointsubgraph of all clusters detected via the algorithm In thisnotation all members of 119894119889(119894) are in 119888 and none of 119900119889(119894)members are in 119888 As intraclusters nodes of communities aremore close to each other ranking algorithm can utilize theinfluence of intra clusters nodes closeness

332 HBCD (HITS Ranking Algorithm Based on CommunityDetection) HITS algorithm was introduced by [16] and itwas mainly proposed for exploiting helpful information inorder to analyze structure of links and has been appliedin various applications This algorithm works based on huband authority vectors These vectors are initialized withsome predefined (random) values and converge after somerecursive iterations [16]

Reference [3] introduced modified version of HITS inwhich the graph is divided into two positive and negativeparts and then run the algorithm on each graph separatelyIn HITS algorithm there are two vectors authority and hubwhich finally converge after enough iteration We propose anew version of HITS in which there is distinction betweenimportance of local neighbor nodes and those members ofdifferent communities that node 119894 belongs to The HITSalgorithm based on community detection can be stated asfollows

119886

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)out (119894)

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)out (119894)

(+)

119895(119905))

119886

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)out (119894)

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)out (119894)

(minus)

119895(119905))

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)in (119894)

119886

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)in (119894)

119886

(+)

119895(119905))

Scientific Programming 5

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)in (119894)

119886

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)in (119894)

119886

(minus)

119895(119905))

(5)

where 120572 indicates importance factor that is given to thelocal neighbors and ℎ(119905) and 119886(119905) show hub and authorityvectors at time 119905 119894119889(+)inout(119894) and 119894119889

(minus)

inout(119894) respectively rep-resent set of nodes who have relations (positivenegative orincomingoutgoing) with node 119894 and they are members of thesame community that node 119894 belongs to In a similar manner119900119889

(+)

inout(119894) and 119900119889(minus)

inout(119894) indicate set of nodes in which theyhave relations (positivenegative incomingoutgoing) withnode 119894 and they are members of different communities thatnode 119894 belongs to Finally in and out subscripts respectivelyshow that 119894119889(119894) or 119900119889(119894) represents set of nodes that votedtoward node 119894 (incoming links toward node 119894) or being votedby node 119894 (outgoing links from node 119894) Hub and authoritiesare initialized with some random values and they convergeafter enough iteration

333 RBCD (PageRank Algorithm Based on CommunityDetection) PageRank is one of the most widely used meth-ods for ranking of nodes [37] It was extracted from GoogleLarry page PageRank uses the concept of random walkthat leads to probability distribution which computes thepossibility of randomly going from one node to anotherone and finally gets to one specific node This algorithminitiallywas introduced for graphswith positive andunsignededges especially for web pages on the internet Reference[3] introduced modified version of PageRank in which thegraph is divided into two parts and the algorithm is runon each graph separately and finally for calculating rankingvector negative ranking vector is subtracted from positiveone Our proposed ranking algorithm states that each node inthe network belongs to specific community or cluster and thegeneral idea of specifying ranking is on the basis of utilizingother nodesrsquo information In other words to compute rankof node 119894 we give priority to the nodes that belong tothe same community that node 119894 belongs to Based on thisopinion PageRank based on community detection can bedefined as

PR(+)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

PR(minus)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

(6)

where in the above equations 120572 denotes importance factoris assigned to the local neighbor nodes and 119905 shows therank at the time of 119905 120573 indicates the forgetting factor and119873 represents number of nodes in the graph 119894119889in(119894) showsset of incoming links to node 119894 that belong to the samecommunity that node 119894 belongs to Similarly 119900119889in(119894) indicatesset of incoming links to node 119894 that belong to differentcommunities that node 119894 belongs to 119889(+)out(119895) and 119889(minus)out(119895)involves number of positive and negative outgoing links fromnode 119895 respectively

34 Classifier Classification is the process of assigning datato one of predetermined classes Here our classes are themapped values of positive and negative signs to +1 and minus1values This process is done via using training set whichcontains some features to constitute a classifier and thenevaluation via using a test set A brief explanation of theclassifier used in our work is brought here

Logistic regression logistic regression or sigmoid func-tion is a monotonic continuous function which lies betweenminus1 and +1 values and is a method of learning function of theform 119891 119883 rarr 119884 or 119875(119884 | 119883) They can be mathematicallydefined as follows

119875 (119884 = 1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

119875 (119884 = minus1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

(7)

119884 is the discrete values of classes which here are +1 andminus1 119883 is the input vector of discrete or continuous valuesand 119908

119894are some learning coefficient learnt by the model

Classifiers are evaluated based on accuracy Accuracy metricis calculated based on TP FP TN and FN as follows [38 39]

accuracy = TP + TNTP + TN + FP + FN

(8)

We used 10-fold validation in order to evaluate general-ization of the models Cross validation is a classical methodin which the dataset are partitioned into 119896 foldsThe first oneis used as test set and the remaining folds are considered astraining sets In the next phase the second fold is selected asthe test set and the remaining are chosen as training sets [40]We utilized WEKA software for computing accuracy that isavailable through httpwwwcswaikatoacnzmlweka

4 Experiments

In order to verify proposed algorithms introduced inSection 33 we executed them on three large datasets

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: A Community-Based Approach for Link Prediction in Signed Social

Scientific Programming 3

set of edges where customers and users can vote positivelyand negatively toward each other So the aforementionednotation 119881 represents users of site and 119864 indicates +1 andminus1 relations among them In all over the paper the personwho gives positive vote and receives it is named trustor andtrustee respectively [2 26] The sign prediction problem canbe defined as follows Suppose that signs of some links inthe network are hidden and the goal is to reliably predictvalues of these edges by current information in the graphThe sign prediction problem tries to find signs of hiddenedges with negligible error [1 27] In this work we proposestate-of-the-art community-based ranking algorithms andwe evaluate their effectiveness via sign prediction problem onthree datasets Epinoins Slashdot and Wikipedia

To this end [7] already introduced rank-based featuresnamed Optimism and Reputation to connect ranking prob-lem with sign prediction Rank-based reputation of node 119894indicates patterns of voting toward this node Meaningfullyrank-based reputation of node 119894 not only considers numberof positivenegative incoming links toward node 119894 but alsoit takes into account ranks of nodes who vote toward node119894 In other words when a person receives several positiveincoming links she might not be very reputable because oneshould consider rank of voters toward node 119894 If the userswho vote toward node 119894 are high rank then node 119894 can beconsidered reputable but if they are not high rank we cannotsay that node 119894 is reputable although the number of positiveincoming links toward node 119894 is relatively highThe followingequation can better describe rank-based reputation [7]

RBR119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

(1)

where RBR119894is the value of rank-based reputation of node 119894

119877

(+)

in (119894) indicates sum of rank values of nodes who positivelyvoted toward node 119894 and similarly 119877(minus)in (119894) is sum of rankvalues of nodes who negatively voted toward node 119894 In thesame vein one can define rank-based optimism of node 119894 asfollows

RBO119894=

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119877

(+)

out (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119877

(minus)

out (119894)1003816

1003816

1003816

1003816

1003816

(2)

where RBO119894is the value of rank-based optimism of node 119894

119877

(+)

out(119894) refers to sum of rank values of nodes whom node 119894positively voted toward them and similarly 119877(minus)out(119894) is sum ofrank values of nodes in which node 119894 negatively voted towardthem As formula (2) shows node 119894 which generates severalpositive outgoing links might not be optimistic because thisset might contain nodes in which they are low rank [3]In order to compute these features we need algorithms torank nodes As for ranking algorithms we propose threecommunity-based ranking algorithms in the next sections

32 Community-Based Ranking Algorithms to Compute RBRand RBO In this section we propose three ranking algo-rithms in which all of them work based on communitydetection problem in signed graphs In otherwords firstly we

a a

bb

c c

+ +

+ +

minus minus

(a) Stable configurations

a a

bb

cc

+ +

minus minus

minus minus

(b) Unstable configurations

Figure 1 All possible states of balance theory

run a community detection algorithm on signed networksThe results will be disjoint communities of nodes As allcommunity detection algorithms work based on a densitybased approach in which they try to maximize density ofintracluster edges and minimize between cluster edges sointracluster nodes are more dense and close From socialperspective intracluster nodes might know each other better(this is the notion behind our community-based rankingalgorithm) Meaningfully nodes in the same communityare much more familiar than nodes that are in differentcommunities Via using this philosophy about intraclusternodes we change previous ranking algorithm like PrestigeHITS and PageRank [3] to have influence of intra- andextracluster nodes with parameters 120572 and (1minus120572) respectivelyThen we can use ranking-based features of [7] for the caseof sign prediction Because first phase of the algorithms iscommunity detection sowe investigate community detectionproblem and a sample community detection algorithm insigned graphs in Section 321 In this paper a communitydetection algorithmbased on social balance theory is utilizedIn Section 33 ranking algorithms based on communitydetection phase are introduced

321 CommunityDetection Thealgorithmused in this paperis based on structural balance theory [28] In balance theorythere are four possible states when nodes are in signedrelations in social networks [29] One can differentiate thesestates by number of positive and negative edges in each triad[30] On the base of strong social balance theory when allof nodes have positive relation or two nodes share the sameenemy these states are called stable Similarly cases withall nodes have negative edges or with two positive edgesare unstable states [31] Regarding this definition a networkwith more than three nodes is structurally balanced if all thepossible triads are stable [32] (Figure 1)

The basic structural theorem states that these triples canbe partitioned into two distinct sets in which all the positiverelations are inside sets and negative ones are among them[33 34] In other words negative edges connect positive setsOn the basis of this definition a network is called 119896-balancedif all positive edges are located in 119896 number of differentcategories and these sets are joined with negative relations[35] In reality rarely there are structurally balanced net-worksThere are always some edges that destabilize the graphand transform it into unstable configuration Therefore thenumber of positive edges between clusters and the number ofnegative edges inside clusters should be minimized [24] Infact the problem is like finding the best sets with minimumnumber of positive relations between partitions and alsominimum number of negative edges inside sets [36]

4 Scientific Programming

Reference [25] introduced one criterion function whichmakes decision based on counting number of elementshaving conflict with 119896-balanced theory It can be defined as ifone considers119873 as number of negative edges inside clustersand 119875 as number of positive edges between clusters thennumber of inconsistencies which is denoted by 119868(119888) can bementioned as

119868 (119888) = 120573 times 119873 + (1 minus 120573) times 119875 (3)

where 120573 is the importance factor that is assigned to positiveand negative inconsistencies

If 120573 = 05 then positive and negative relations contributeequally on amount of inconsistency And for the case that05 lt 120573 le 1 the negative relations have more impact onresult and when 0 lt 120573 le 05 positive edges have higherinfluence The ideal condition is created when 119875 and119873 havethe smallest amount so better result is achieved Because 119868(119888)show error the algorithm tries to find minimum value for119873and 119875 via using a hill-climbing optimization technique [25]Other community detection approaches suitable for signedgraphs can also be used in this phase

33 Community-Based Ranking Methods In this paper wepropose three new ranking algorithms for signed complexnetworks that are dependent on community detection Weintroduce amethod that ties the concept of ranking algorithmand community detection Suppose that we intend to com-pute the rank of node 119894 in the network First of all we clusternodes in the network in such a way that each node belongsto one community in the graph (first phase algorithmreferenced in Section 321) so for calculating rank of node119894 by taking influence of other nodes in the network we givepriority and high importance to the nodes that belong to thesame community which node 119894 belongs to These algorithmsare described in detail in the following subsections (secondphase of ranking) We will verify rationality of these rankingalgorithms in Results section

331 PBCD (Prestige RankingAlgorithmBased onCommunityDetection) Prestige is the simplest algorithm in signed com-plex network [18] In this method the most important factorfor determining ranking of each node in the system is thenumber of positive and negative incoming nodes that eachnode receives from others In other words if a node hasmanypositive incoming links therefore its prestige is high in thenetwork And it is also true for negative links if the numberof positive incoming links is less than negative ones the nodehas low prestige in comparison with the other nodes in thesystem [3]The idea of community-based ranking inspires usto incorporate our method with some well-known rankingalgorithms like prestige The proposed prestige can be statedas follows

Pr (119894) = [120572 times1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

+ [(1 minus 120572) times

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

(4)

where 120572 is the impact factor to determine degree of impor-tance of nodes that are in the same community that node 119894belongs to and 119894119889+in(119894) and 119894119889

minus

in(119894) respectively indicate setof nodes who positively and negatively voted toward 119894 andthese nodes are members of the same community that node119894 belongs to Similarly 119900119889+in(119894) and 119900119889

minus

in(119894) show set of nodeswho positively and negatively voted toward node 119894 and thesenodes are members of other communities which are differentfrom the community that node 119894 belongs to Moreover | |represents magnitude of the set of nodes and the in subscriptindicates that we only consider incoming links from 119894119889(119894) and119900119889(119894) sets toward node 119894 Finally 119894 isin 119888 119888 isin 119862 119888 is thecommunity which node 119894 belongs to and 119862 is the disjointsubgraph of all clusters detected via the algorithm In thisnotation all members of 119894119889(119894) are in 119888 and none of 119900119889(119894)members are in 119888 As intraclusters nodes of communities aremore close to each other ranking algorithm can utilize theinfluence of intra clusters nodes closeness

332 HBCD (HITS Ranking Algorithm Based on CommunityDetection) HITS algorithm was introduced by [16] and itwas mainly proposed for exploiting helpful information inorder to analyze structure of links and has been appliedin various applications This algorithm works based on huband authority vectors These vectors are initialized withsome predefined (random) values and converge after somerecursive iterations [16]

Reference [3] introduced modified version of HITS inwhich the graph is divided into two positive and negativeparts and then run the algorithm on each graph separatelyIn HITS algorithm there are two vectors authority and hubwhich finally converge after enough iteration We propose anew version of HITS in which there is distinction betweenimportance of local neighbor nodes and those members ofdifferent communities that node 119894 belongs to The HITSalgorithm based on community detection can be stated asfollows

119886

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)out (119894)

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)out (119894)

(+)

119895(119905))

119886

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)out (119894)

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)out (119894)

(minus)

119895(119905))

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)in (119894)

119886

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)in (119894)

119886

(+)

119895(119905))

Scientific Programming 5

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)in (119894)

119886

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)in (119894)

119886

(minus)

119895(119905))

(5)

where 120572 indicates importance factor that is given to thelocal neighbors and ℎ(119905) and 119886(119905) show hub and authorityvectors at time 119905 119894119889(+)inout(119894) and 119894119889

(minus)

inout(119894) respectively rep-resent set of nodes who have relations (positivenegative orincomingoutgoing) with node 119894 and they are members of thesame community that node 119894 belongs to In a similar manner119900119889

(+)

inout(119894) and 119900119889(minus)

inout(119894) indicate set of nodes in which theyhave relations (positivenegative incomingoutgoing) withnode 119894 and they are members of different communities thatnode 119894 belongs to Finally in and out subscripts respectivelyshow that 119894119889(119894) or 119900119889(119894) represents set of nodes that votedtoward node 119894 (incoming links toward node 119894) or being votedby node 119894 (outgoing links from node 119894) Hub and authoritiesare initialized with some random values and they convergeafter enough iteration

333 RBCD (PageRank Algorithm Based on CommunityDetection) PageRank is one of the most widely used meth-ods for ranking of nodes [37] It was extracted from GoogleLarry page PageRank uses the concept of random walkthat leads to probability distribution which computes thepossibility of randomly going from one node to anotherone and finally gets to one specific node This algorithminitiallywas introduced for graphswith positive andunsignededges especially for web pages on the internet Reference[3] introduced modified version of PageRank in which thegraph is divided into two parts and the algorithm is runon each graph separately and finally for calculating rankingvector negative ranking vector is subtracted from positiveone Our proposed ranking algorithm states that each node inthe network belongs to specific community or cluster and thegeneral idea of specifying ranking is on the basis of utilizingother nodesrsquo information In other words to compute rankof node 119894 we give priority to the nodes that belong tothe same community that node 119894 belongs to Based on thisopinion PageRank based on community detection can bedefined as

PR(+)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

PR(minus)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

(6)

where in the above equations 120572 denotes importance factoris assigned to the local neighbor nodes and 119905 shows therank at the time of 119905 120573 indicates the forgetting factor and119873 represents number of nodes in the graph 119894119889in(119894) showsset of incoming links to node 119894 that belong to the samecommunity that node 119894 belongs to Similarly 119900119889in(119894) indicatesset of incoming links to node 119894 that belong to differentcommunities that node 119894 belongs to 119889(+)out(119895) and 119889(minus)out(119895)involves number of positive and negative outgoing links fromnode 119895 respectively

34 Classifier Classification is the process of assigning datato one of predetermined classes Here our classes are themapped values of positive and negative signs to +1 and minus1values This process is done via using training set whichcontains some features to constitute a classifier and thenevaluation via using a test set A brief explanation of theclassifier used in our work is brought here

Logistic regression logistic regression or sigmoid func-tion is a monotonic continuous function which lies betweenminus1 and +1 values and is a method of learning function of theform 119891 119883 rarr 119884 or 119875(119884 | 119883) They can be mathematicallydefined as follows

119875 (119884 = 1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

119875 (119884 = minus1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

(7)

119884 is the discrete values of classes which here are +1 andminus1 119883 is the input vector of discrete or continuous valuesand 119908

119894are some learning coefficient learnt by the model

Classifiers are evaluated based on accuracy Accuracy metricis calculated based on TP FP TN and FN as follows [38 39]

accuracy = TP + TNTP + TN + FP + FN

(8)

We used 10-fold validation in order to evaluate general-ization of the models Cross validation is a classical methodin which the dataset are partitioned into 119896 foldsThe first oneis used as test set and the remaining folds are considered astraining sets In the next phase the second fold is selected asthe test set and the remaining are chosen as training sets [40]We utilized WEKA software for computing accuracy that isavailable through httpwwwcswaikatoacnzmlweka

4 Experiments

In order to verify proposed algorithms introduced inSection 33 we executed them on three large datasets

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: A Community-Based Approach for Link Prediction in Signed Social

4 Scientific Programming

Reference [25] introduced one criterion function whichmakes decision based on counting number of elementshaving conflict with 119896-balanced theory It can be defined as ifone considers119873 as number of negative edges inside clustersand 119875 as number of positive edges between clusters thennumber of inconsistencies which is denoted by 119868(119888) can bementioned as

119868 (119888) = 120573 times 119873 + (1 minus 120573) times 119875 (3)

where 120573 is the importance factor that is assigned to positiveand negative inconsistencies

If 120573 = 05 then positive and negative relations contributeequally on amount of inconsistency And for the case that05 lt 120573 le 1 the negative relations have more impact onresult and when 0 lt 120573 le 05 positive edges have higherinfluence The ideal condition is created when 119875 and119873 havethe smallest amount so better result is achieved Because 119868(119888)show error the algorithm tries to find minimum value for119873and 119875 via using a hill-climbing optimization technique [25]Other community detection approaches suitable for signedgraphs can also be used in this phase

33 Community-Based Ranking Methods In this paper wepropose three new ranking algorithms for signed complexnetworks that are dependent on community detection Weintroduce amethod that ties the concept of ranking algorithmand community detection Suppose that we intend to com-pute the rank of node 119894 in the network First of all we clusternodes in the network in such a way that each node belongsto one community in the graph (first phase algorithmreferenced in Section 321) so for calculating rank of node119894 by taking influence of other nodes in the network we givepriority and high importance to the nodes that belong to thesame community which node 119894 belongs to These algorithmsare described in detail in the following subsections (secondphase of ranking) We will verify rationality of these rankingalgorithms in Results section

331 PBCD (Prestige RankingAlgorithmBased onCommunityDetection) Prestige is the simplest algorithm in signed com-plex network [18] In this method the most important factorfor determining ranking of each node in the system is thenumber of positive and negative incoming nodes that eachnode receives from others In other words if a node hasmanypositive incoming links therefore its prestige is high in thenetwork And it is also true for negative links if the numberof positive incoming links is less than negative ones the nodehas low prestige in comparison with the other nodes in thesystem [3]The idea of community-based ranking inspires usto incorporate our method with some well-known rankingalgorithms like prestige The proposed prestige can be statedas follows

Pr (119894) = [120572 times1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119894119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119894119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

+ [(1 minus 120572) times

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

minus

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

1003816

119900119889

(+)

in (119894)1003816

1003816

1003816

1003816

1003816

+

1003816

1003816

1003816

1003816

1003816

119900119889

(minus)

in (119894)1003816

1003816

1003816

1003816

1003816

]

(4)

where 120572 is the impact factor to determine degree of impor-tance of nodes that are in the same community that node 119894belongs to and 119894119889+in(119894) and 119894119889

minus

in(119894) respectively indicate setof nodes who positively and negatively voted toward 119894 andthese nodes are members of the same community that node119894 belongs to Similarly 119900119889+in(119894) and 119900119889

minus

in(119894) show set of nodeswho positively and negatively voted toward node 119894 and thesenodes are members of other communities which are differentfrom the community that node 119894 belongs to Moreover | |represents magnitude of the set of nodes and the in subscriptindicates that we only consider incoming links from 119894119889(119894) and119900119889(119894) sets toward node 119894 Finally 119894 isin 119888 119888 isin 119862 119888 is thecommunity which node 119894 belongs to and 119862 is the disjointsubgraph of all clusters detected via the algorithm In thisnotation all members of 119894119889(119894) are in 119888 and none of 119900119889(119894)members are in 119888 As intraclusters nodes of communities aremore close to each other ranking algorithm can utilize theinfluence of intra clusters nodes closeness

332 HBCD (HITS Ranking Algorithm Based on CommunityDetection) HITS algorithm was introduced by [16] and itwas mainly proposed for exploiting helpful information inorder to analyze structure of links and has been appliedin various applications This algorithm works based on huband authority vectors These vectors are initialized withsome predefined (random) values and converge after somerecursive iterations [16]

Reference [3] introduced modified version of HITS inwhich the graph is divided into two positive and negativeparts and then run the algorithm on each graph separatelyIn HITS algorithm there are two vectors authority and hubwhich finally converge after enough iteration We propose anew version of HITS in which there is distinction betweenimportance of local neighbor nodes and those members ofdifferent communities that node 119894 belongs to The HITSalgorithm based on community detection can be stated asfollows

119886

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)out (119894)

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)out (119894)

(+)

119895(119905))

119886

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)out (119894)

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)out (119894)

(minus)

119895(119905))

(+)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(+)in (119894)

119886

(+)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(+)in (119894)

119886

(+)

119895(119905))

Scientific Programming 5

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)in (119894)

119886

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)in (119894)

119886

(minus)

119895(119905))

(5)

where 120572 indicates importance factor that is given to thelocal neighbors and ℎ(119905) and 119886(119905) show hub and authorityvectors at time 119905 119894119889(+)inout(119894) and 119894119889

(minus)

inout(119894) respectively rep-resent set of nodes who have relations (positivenegative orincomingoutgoing) with node 119894 and they are members of thesame community that node 119894 belongs to In a similar manner119900119889

(+)

inout(119894) and 119900119889(minus)

inout(119894) indicate set of nodes in which theyhave relations (positivenegative incomingoutgoing) withnode 119894 and they are members of different communities thatnode 119894 belongs to Finally in and out subscripts respectivelyshow that 119894119889(119894) or 119900119889(119894) represents set of nodes that votedtoward node 119894 (incoming links toward node 119894) or being votedby node 119894 (outgoing links from node 119894) Hub and authoritiesare initialized with some random values and they convergeafter enough iteration

333 RBCD (PageRank Algorithm Based on CommunityDetection) PageRank is one of the most widely used meth-ods for ranking of nodes [37] It was extracted from GoogleLarry page PageRank uses the concept of random walkthat leads to probability distribution which computes thepossibility of randomly going from one node to anotherone and finally gets to one specific node This algorithminitiallywas introduced for graphswith positive andunsignededges especially for web pages on the internet Reference[3] introduced modified version of PageRank in which thegraph is divided into two parts and the algorithm is runon each graph separately and finally for calculating rankingvector negative ranking vector is subtracted from positiveone Our proposed ranking algorithm states that each node inthe network belongs to specific community or cluster and thegeneral idea of specifying ranking is on the basis of utilizingother nodesrsquo information In other words to compute rankof node 119894 we give priority to the nodes that belong tothe same community that node 119894 belongs to Based on thisopinion PageRank based on community detection can bedefined as

PR(+)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

PR(minus)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

(6)

where in the above equations 120572 denotes importance factoris assigned to the local neighbor nodes and 119905 shows therank at the time of 119905 120573 indicates the forgetting factor and119873 represents number of nodes in the graph 119894119889in(119894) showsset of incoming links to node 119894 that belong to the samecommunity that node 119894 belongs to Similarly 119900119889in(119894) indicatesset of incoming links to node 119894 that belong to differentcommunities that node 119894 belongs to 119889(+)out(119895) and 119889(minus)out(119895)involves number of positive and negative outgoing links fromnode 119895 respectively

34 Classifier Classification is the process of assigning datato one of predetermined classes Here our classes are themapped values of positive and negative signs to +1 and minus1values This process is done via using training set whichcontains some features to constitute a classifier and thenevaluation via using a test set A brief explanation of theclassifier used in our work is brought here

Logistic regression logistic regression or sigmoid func-tion is a monotonic continuous function which lies betweenminus1 and +1 values and is a method of learning function of theform 119891 119883 rarr 119884 or 119875(119884 | 119883) They can be mathematicallydefined as follows

119875 (119884 = 1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

119875 (119884 = minus1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

(7)

119884 is the discrete values of classes which here are +1 andminus1 119883 is the input vector of discrete or continuous valuesand 119908

119894are some learning coefficient learnt by the model

Classifiers are evaluated based on accuracy Accuracy metricis calculated based on TP FP TN and FN as follows [38 39]

accuracy = TP + TNTP + TN + FP + FN

(8)

We used 10-fold validation in order to evaluate general-ization of the models Cross validation is a classical methodin which the dataset are partitioned into 119896 foldsThe first oneis used as test set and the remaining folds are considered astraining sets In the next phase the second fold is selected asthe test set and the remaining are chosen as training sets [40]We utilized WEKA software for computing accuracy that isavailable through httpwwwcswaikatoacnzmlweka

4 Experiments

In order to verify proposed algorithms introduced inSection 33 we executed them on three large datasets

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: A Community-Based Approach for Link Prediction in Signed Social

Scientific Programming 5

(minus)

119894(119905 + 1) = 120572 times ( sum

119895isin119894119889(minus)in (119894)

119886

(minus)

119895(119905))

+ (1 minus 120572) times ( sum

119895isin119900119889(minus)in (119894)

119886

(minus)

119895(119905))

(5)

where 120572 indicates importance factor that is given to thelocal neighbors and ℎ(119905) and 119886(119905) show hub and authorityvectors at time 119905 119894119889(+)inout(119894) and 119894119889

(minus)

inout(119894) respectively rep-resent set of nodes who have relations (positivenegative orincomingoutgoing) with node 119894 and they are members of thesame community that node 119894 belongs to In a similar manner119900119889

(+)

inout(119894) and 119900119889(minus)

inout(119894) indicate set of nodes in which theyhave relations (positivenegative incomingoutgoing) withnode 119894 and they are members of different communities thatnode 119894 belongs to Finally in and out subscripts respectivelyshow that 119894119889(119894) or 119900119889(119894) represents set of nodes that votedtoward node 119894 (incoming links toward node 119894) or being votedby node 119894 (outgoing links from node 119894) Hub and authoritiesare initialized with some random values and they convergeafter enough iteration

333 RBCD (PageRank Algorithm Based on CommunityDetection) PageRank is one of the most widely used meth-ods for ranking of nodes [37] It was extracted from GoogleLarry page PageRank uses the concept of random walkthat leads to probability distribution which computes thepossibility of randomly going from one node to anotherone and finally gets to one specific node This algorithminitiallywas introduced for graphswith positive andunsignededges especially for web pages on the internet Reference[3] introduced modified version of PageRank in which thegraph is divided into two parts and the algorithm is runon each graph separately and finally for calculating rankingvector negative ranking vector is subtracted from positiveone Our proposed ranking algorithm states that each node inthe network belongs to specific community or cluster and thegeneral idea of specifying ranking is on the basis of utilizingother nodesrsquo information In other words to compute rankof node 119894 we give priority to the nodes that belong tothe same community that node 119894 belongs to Based on thisopinion PageRank based on community detection can bedefined as

PR(+)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(+)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(+)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

PR(minus)119894(119905 + 1) = 120572 times (120573 times sum

119895isin119894119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

+ (1 minus 120572)

times (120573 times sum

119895isin119900119889in(119894)

PR(minus)119895(119905)

1003816

1003816

1003816

1003816

1003816

119889

(minus)

out (119895)1003816

1003816

1003816

1003816

1003816

+ (1 minus 120573) times

1

119873

)

(6)

where in the above equations 120572 denotes importance factoris assigned to the local neighbor nodes and 119905 shows therank at the time of 119905 120573 indicates the forgetting factor and119873 represents number of nodes in the graph 119894119889in(119894) showsset of incoming links to node 119894 that belong to the samecommunity that node 119894 belongs to Similarly 119900119889in(119894) indicatesset of incoming links to node 119894 that belong to differentcommunities that node 119894 belongs to 119889(+)out(119895) and 119889(minus)out(119895)involves number of positive and negative outgoing links fromnode 119895 respectively

34 Classifier Classification is the process of assigning datato one of predetermined classes Here our classes are themapped values of positive and negative signs to +1 and minus1values This process is done via using training set whichcontains some features to constitute a classifier and thenevaluation via using a test set A brief explanation of theclassifier used in our work is brought here

Logistic regression logistic regression or sigmoid func-tion is a monotonic continuous function which lies betweenminus1 and +1 values and is a method of learning function of theform 119891 119883 rarr 119884 or 119875(119884 | 119883) They can be mathematicallydefined as follows

119875 (119884 = 1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

119875 (119884 = minus1 | 119909) =

1

1 + exp (1199080+ sum

119899

119894=1119908

119894119909

119894)

(7)

119884 is the discrete values of classes which here are +1 andminus1 119883 is the input vector of discrete or continuous valuesand 119908

119894are some learning coefficient learnt by the model

Classifiers are evaluated based on accuracy Accuracy metricis calculated based on TP FP TN and FN as follows [38 39]

accuracy = TP + TNTP + TN + FP + FN

(8)

We used 10-fold validation in order to evaluate general-ization of the models Cross validation is a classical methodin which the dataset are partitioned into 119896 foldsThe first oneis used as test set and the remaining folds are considered astraining sets In the next phase the second fold is selected asthe test set and the remaining are chosen as training sets [40]We utilized WEKA software for computing accuracy that isavailable through httpwwwcswaikatoacnzmlweka

4 Experiments

In order to verify proposed algorithms introduced inSection 33 we executed them on three large datasets

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: A Community-Based Approach for Link Prediction in Signed Social

6 Scientific Programming

In the following sections these datasets are introduced andachieved results are depicted

41 Datasets Datasets which are used for experimentalpurposes areWikipedia Epinions and SlashdotThese onlinesocial networks are available through httpsnapstanfordedudata In the following we explain briefly these datasets

Wikipedia Volunteers from all around the world collaborateto write this free cyclopedia A user can take the roleof administrator with additional access to some technicalfeatures by getting vote of other users Administrators areresponsible for maintenance purposes A user is nominatedas administrator and then Wikipedia members elect usersas administrators via public dialogues and talks Totally7000 users took part in elections and 100000 votes werereceived from 2800 delegated elections 1200 elections lead topromotion and 1500 elections were not successful and did notproduce a winner Half of voters are current administratorsand the remaining are ordinary users [6]

Epinions Epinions is a review online social network thatusers can express their views about diverse items like musicTV shows and hardware The site members are able to votepositively and negatively toward each other As a matter offact this online social network contains information aboutwho-trusts-whom The data set is made of 131828 nodes andcontains 841372 edges in which 85 of them are positive [41]

Slashdot Slashdot is a website related to technology andscience It is well known due to its specific users and hasintroduced itself as user-submitted and science evaluatednews In other words links of news and summary of differentissues are submitted by users and each story becomes thetopic of series of talks Selected readers that take the roleof moderators send ratings to Slashdot The responsibility ofthese readers is appointing tags for each comment Slashdotdoes not show scores to the users but lets them arrangecomments on the base of assigned points Slashdot alsohas a service that users have the ability to tag each otheras friend or opponent Therefore links between users arefriendopponent relationship The data set is made of 82144nodes and contains 549202 edges in which 77 of them arepositive [42]

42 Results Via using introduced community detection ofSection 321 the communities in Epinions Slashdot andWikipedia are detected We examine our proposed methodon different size of communities where it can be specifiedthrough input parameter of community detection algorithmReference [43] introduced a function for determining num-ber of clusters in Epinions Slashdot and Wikipedia and itcan be observed that this method has high accuracies whenthe number of clusters is between four and ten so we alsochecked our approach for seven cases starting from four toten For ranking nodes Prestige Based onCommunityDetec-tion (PBCD) PageRank Based on Community Detection(RBCD) andHITSBased onCommunityDetection (HBCD)are run on these datasets In order to differentiate betweennodes which belong to various communities we defined

05 055 06 065 07 075 08 085 09 095 1

959

9585

958

9575

957

9565

956

Com = 4

Com = 5

Com = 6

Com = 7

Com = 8

Com = 9

Com = 10

Figure 2 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

impact factor 120572 for local nodes and (1 minus 120572) for foreign oneswhere 120572 can contains values of (05 06 09 1) When05 lt 120572 le 1 local nodes have more privilege than theothers and 120572 = 05 shows normal ranking algorithms of[3] In other words in the case (120572 = 05) no communitystructure is considered In order to define accuracy of edgesign prediction rank based features are extracted and in orderto evaluate generalization of themodels we used 10-fold crossvalidation Accuracy of introduced methods is evaluatedwith various size of communities and different values of 120572The results are shown in forms of figures In the followingsignificant findings related to each dataset are stated

421 PBCD on Datasets In Epinions and Slashdot for allsize of communities 120572 = 05 contains the maximum valuesIn fact outputs of PBCD on Epinions and Slashdot datasetsindicate that using community-based ranking algorithmsmight slightly degrade prediction accuracy However theprediction accuracy is still high for different values of 120572greater than 05 and for all community numbers This loweraccuracy might be because of special property of dataset orcommunity detection algorithm Results for Epinions andSlashdot are illustrated in Figures 2 and 3 respectively Resultsrelated to Wikipedia are shown in Figure 4 In the caseCom (number of communities) equals 4 6 and 7 predictionaccuracy is higher than the case 120572 = 05 Generally highprecisions extract properties of dataset and offers bettermodel for prediction In Figure 4 where Com = 6 120572 = 06contains maximum value which is equal to 887467

422 RBCD on Datasets Analogously to PBCD we imple-mented RBCD ranking algorithm on datasets Outputs forEpinions are illustrated in Figure 5 In Epinions all thenumber of communities has acceptable accuracies Com = 10with 120572 = 1 contains the maximum value of 95515 among

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: A Community-Based Approach for Link Prediction in Signed Social

Scientific Programming 7

05 055 06 065 07 075 08 085 09 095 1

898

897

896

895

894

893

892

891

89

Figure 3 119910 axis shows prediction accuracy of PBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8875

887

8865

886

8855

885

8845

Figure 4 119910 axis shows prediction accuracy of PBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

958

956

954

952

95

948

946

944

Figure 5 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

all communities RBCD have similar outputs in Slashdotwhen 120572 = 1 it has highest precision with maximum valueof 89335 for 10 communities Results related to Slashdot arerepresented in Figure 6 RBCD also produced satisfactoryresults in Wikipedia When 120572 = 05 it contains minimumvalue of 88350 among all communities In case of Com = 8

05 055 06 065 07 075 08 085 09 095 1

8935

893

8925

892

8915

891

8905

89

8895

889

8885

Figure 6 119910 axis shows prediction accuracy of RBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

8844

8842

884

8838

8836

8834

8832

883

8828

Figure 7 119910 axis shows prediction accuracy of RBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

maximum accuracy of 88405 is for 120572 = 09 When Com = 56 7 RBCD reach maximum accuracies at 120572 = 08 in whichtheir values are 88386 88372 and 88374 respectively

Similarly for communities nine and tenmaximum valuesare 884021 and 88402 respectively with120572 = 09 Outputs forWikipedia are shown in Figure 7

423 HBCD on Datasets We also implemented HBCD ondatasets Achieved results for Epinions are shown in Figure 8In Epinions 120572 = 1 has best accuracy for all communitiesMoreover community number four with value of 95620contains maximum among them As for Slashdot 120572 = 1 hasthe best accuracy and Com = 9 generates accuracy of 8937which has the highest value among all communities Outputsfor Slashdot are illustrated in Figure 9There is the same storyin Wikipedia 120572 = 1 has the best accuracy and proves ournew method Among all of them community number sevenhas the best accuracy with value of 88410 Related results forWikipedia are presented in Figure 10

5 Discussion

Random guessing of edge sign prediction on original datasetsresults in accuracy prediction of approximately 80 percent

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: A Community-Based Approach for Link Prediction in Signed Social

8 Scientific Programming

05 055 06 065 07 075 08 085 09 095 1

96

958

956

954

952

95

948

946

944

Figure 8 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Epinions dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

05 055 06 065 07 075 08 085 09 095 1

895

89

885

88

875

87

865

Figure 9 119910 axis shows prediction accuracy of HBCD rankingalgorithm on Slashdot dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

As Table 1 indicates accuracy of our work improves the rateof prediction about 10 to 15 percent

Reference [5] applied degree-based features on EpinionsSlashdot andWikipedia and performed sign prediction withprecisions of 90751 87117 and 83835 respectively It can beperceived in Table 1 that the prediction rate of our work hassignificant improvement in comparison to [5]

Reference [3] also introduced new ranking algorithmsnamely MPR MHITS and improved precision achievedby [5] In order to compare result of this paper with [3] wecan consider 120572 = 05 as normal Prestige MPR and MHITSIn other words in our proposed formulas 120572 = 05 indicatesthat nodes inside and outside community have the sameprivilege All in all except precision of PBCDonEpinions andSlashdot in other cases our method produces better resultsin comparison with [3] To put in a nutshell we find out thatour proposed approach outperforms previous works relatedto sign prediction problem and it is indicated that localnodes in communities have higher impact on reputation andimportance of other nodes in the networkMoreover number

05 055 06 065 07 075 08 085 09 095 1

89

88

87

86

85

84

83

82

Figure 10 119910 axis shows prediction accuracy of HBCD rankingalgorithm onWikipedia dataset and 119909 axis indicates different valuesof 120572 Different colors show number of communities detected by thecommunity detection algorithm

Table 1 Prediction accuracy using various methods on originaldatasets

Epinion SlashDot WikipediaPrediction accuracy of Leskovec [5]

Leskovec 90751 87117 83835Prediction accuracy of Shahriari [3]

Normal prestige 95872 89604 88747MPR 94550 88859 8834MHITS 94770 88185 88359

Prediction accuracy achieved from our proposed algorithmsPBCD 95853 89536 88747RBCD 95515 89355 88405HBCD 95629 89380 88405

of communities can be estimated in these real-world datasetsby analyzing output presented in the previous section Forexample inWikipedia PBCD algorithm produces best accu-racies for community number six and community numbereight yields the best result for RBCD Similarly HBCD detectseven communities inWikipedia It is very obvious that resultof each algorithm is different with another one but it can beeasily perceived that all of these numbers are close to eachother Therefore we can deduce that these community-basedranking algorithms are able to approximate the number ofcommunities in these datasets

6 Conclusion and Future Works

Complex networks have multidisciplinary roles in sciencecomprising artificial intelligence economics and chemistryin which their usages have been increasing Social networksas one branch of complex networks have got a lot of attentionrecently One principal topic in social networks is to inves-tigate evolution of graphs thus researchers are trying to takeprediction algorithms in order to find hidden relationships inthese networks A significant factor in predicting relations isranking of people in societies The aforementioned concepts

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: A Community-Based Approach for Link Prediction in Signed Social

Scientific Programming 9

are expressed in social networks as sign prediction andranking of nodes respectively

Nodes ranking algorithms which intend to determinehow much a node is reputable in a network are studied inour document Three community-based ranking algorithmsare proposed in this paper These ranking algorithms havetwo phases In the first phase nodes are assigned to differentcommunities by applying community detection algorithm(in this phase different community detection algorithmscan be applied) In the second phase rank of each nodeis computed based on its membership to its neighborsrsquocommunities and its incomingoutgoing positivenegativelinks So we investigated the effect of community detectionon the accuracy of sign prediction problem and comparedour work with [3 5] Eventually we deduced that our rank-ing algorithms outperform both methods and community-based ranking algorithms produce better accuracies in somecases

Our experiment was performed to check which com-munity number has the best accuracy In this case resultsmay be affected by properties of this community detectionalgorithm In future research we intend to compare impactof different community detection methods on accuracy ofedge sign prediction problem The problem of overlappingcommunity detection especially has gained much attentionHence this problem and its effect on signed predictioncan be investigated We are also interested in working onparameter-free community detection methods suitable forsigned graphs Finally to check the reliability of the methodit is good to test these approaches in person to personrecommenders

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 12th InternationalConference on Information and Knowledge Management (CIKMrsquo03) pp 556ndash559 November 2003

[2] Y Dong J Tang S Wu et al ldquoLink prediction and recommen-dation across heterogeneous social networksrdquo in Proceedings ofthe 12th IEEE International Conference on Data Mining (ICDMrsquo12) pp 181ndash190 December 2012

[3] M Shahriari and M Jalili ldquoRanking nodes in signed socialnetworksrdquo Social Network Analysis and Mining vol 4 article172 2014

[4] J Kunegis A Lommatzsch and C Bauckhage ldquoThe SlashdotZoo mining a social network with negative edgesrdquo in Pro-ceedings of the 18th International World Wide Web Conference(WWW rsquo09) pp 741ndash750 April 2009

[5] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 New York NY USA April 2010

[6] K-Y Chiang N Natarajan A Tewari and I S DhillonldquoExploiting longer cycles for link prediction in signed net-worksrdquo in Proceedings of the 20th ACM Conference on Infor-mation and Knowledge Management (CIKM rsquo11) pp 1157ndash1162October 2011

[7] M Shahriari O A Sichani J Gharibshah and M JalilildquoPredicting sign of edges in social networks based on usersreputation and optimismrdquoTransactions onKnowledgeDiscoveryfrom Data In press

[8] L Adamic and E Adar ldquoHow to search a social networkrdquo SocialNetworks vol 27 no 3 pp 187ndash203 2005

[9] J Chen W Geyer C Dugan M Muller I Guy and MCarmel ldquoMake new friends but keep the old recommendingpeople on social networking sitesrdquo in Proceedings of the SIGCHIConference on Human Factors in Computing Systems (CHI rsquo09)pp 201ndash210 2009

[10] P Symeonidis E Tiakas and Y Manolopoulos ldquoTransitivenode similarity for link prediction in social networks withpositive and negative linksrdquo in Proceedings of the 4th ACMRecommender Systems Conference (RecSys rsquo10) pp 183ndash190September 2010

[11] J Leskovec D Huttenlocher and J Kleinberg ldquoPredictingpositive and negative links in online social networksrdquo inProceedings of the 19th International Conference on World WideWeb pp 641ndash650 2010

[12] V A Traag Y Nesterov and P VanDooren ldquoExponential rank-ing taking into account negative linksrdquo in Social Informaticsvol 6430 of Lecture Notes in Computer Science pp 192ndash202Springer Berlin Germany 2010

[13] L C Freeman ldquoA set of measures of centrality based onbetweennessrdquo Sociometry vol 40 no 1 pp 35ndash41 1977

[14] L C Freeman ldquoCentrality in social networks conceptual clari-ficationrdquo Social Networks vol 1 no 3 pp 215ndash239 1978

[15] P Bonacich ldquoFactoring and weighting approaches to statusscores and clique identificationrdquo The Journal of MathematicalSociology vol 2 no 1 pp 113ndash120 1972

[16] J M Kleinberg ldquoAuthoritative sources in a hyperlinked envi-ronmentrdquo Journal of the ACM vol 46 no 5 pp 604ndash632 1999

[17] S Brin and L Page ldquoThe anatomy of a large-scale hypertextualweb search enginerdquo Computer Networks and ISDN Systems vol56 no 18 pp 3825ndash3833 2012

[18] K Zolfaghar and A Aghaie ldquoMining trust and distrust rela-tionships in social Web applicationsrdquo in Proceedings of the6th IEEE International Conference on Intelligent ComputerCommunication and Processing pp 73ndash80 2010

[19] C de Kerchove and P van Dooren ldquoThe PageTrust algorithmhow to rank web pages when negative links are allowedrdquo inProceedings of the 8th SIAM International Conference on DataMining pp 346ndash352 April 2008

[20] A Mishra and A Bhattacharya ldquoFinding the bias and prestigeof nodes in networks based on trust scoresrdquo in Proceedings ofthe 20th International Conference on World Wide Web (WWWrsquo11) pp 567ndash576 April 2011

[21] P Anchuri and M M Ismail ldquoCommunityDetection in SignedNetworksrdquo

[22] V A Traag and J Bruggeman ldquoCommunity detection innetworks with positive and negative linksrdquo Physical Review EStatistical Nonlinear and Soft Matter Physics vol 80 no 1Article ID 036115 7 pages 2009

[23] M E J Newman ldquoModularity and community structure innetworksrdquoProceedings of theNational Academy of Sciences of theUnited States of America vol 103 no 23 pp 8577ndash8582 2006

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: A Community-Based Approach for Link Prediction in Signed Social

10 Scientific Programming

[24] P Anchuri and M Magdon-Ismail ldquoCommunities and balancein signed networks a spectral approachrdquo in Proceedings ofthe IEEEACM International Conference on Advances in SocialNetworks Analysis and Mining pp 235ndash242 August 2012

[25] P Doreian and A Mrvar ldquoPartitioning signed social networksrdquoSocial Networks vol 31 no 1 pp 1ndash11 2009

[26] T Zhang H Jiang Z Bao and Y Zhang ldquoCharacterization andedge sign prediction in signed networksrdquo Journal of Industrialand Intelligent Information vol 1 no 1 pp 19ndash24 2013

[27] S H Yang A J Smola B Long H Zha and Y ChangldquoFriend or frenemy Predicting signed ties in social networksrdquoin Proceedings of the 35th Annual ACM SIGIR Conference onResearch and Development in Information Retrieval (SIGIR rsquo12)pp 555ndash564 August 2012

[28] D Cartwright and F Harary ldquoStructural balance a generaliza-tion of Heiderrsquos theoryrdquo Psychological Review vol 63 no 5 pp277ndash293 1956

[29] P Doreian ldquoEvolution of human signed networksrdquometodoloskiZvezki vol 1 no 2 pp 277ndash293 2004

[30] M Szell R Lambiotte and S Thurner ldquoMultirelational orga-nization of large-scale social networks in an online worldrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 107 no 31 pp 13636ndash13641 2010

[31] S Maniu B Cautis and T Abdessalem ldquoBuilding a signednetwork from interactions in Wikipediardquo Proceedings of the 1stACM SIGMOD Workshop on Databases and Social Networks(DBSocial rsquo11) pp 19ndash24 2011

[32] S R T Antal P L Krapivsky and S Redner ldquoSocial balance onnetworks the dynamics of friendship and enmityrdquo Physica DNonlinear Phenomena vol 224 no 1-2 pp 130ndash136 2006

[33] P Doreian ldquoA multiple indicator approach to blockmodelingsigned networksrdquo Social Networks vol 30 no 3 pp 247ndash2582008

[34] S A Marvel J Kleinberg R D Kleinberg and S H StrogatzldquoContinuous-time model of structural balancerdquo Proceedings ofthe National Academy of Sciences of the United States of Americavol 108 no 5 pp 1771ndash1776 2011

[35] N P Hummon and P Doreian ldquoSome dynamics of socialbalance processes bringing Heider back into balance theoryrdquoSocial Networks vol 25 no 1 pp 17ndash49 2003

[36] G Adejumo P R Duimering and Z Zhong ldquoA balance theoryapproach to group problem solvingrdquo Social Networks vol 30no 1 pp 83ndash99 2008

[37] L Page S Brin R Motawani and TWinograd ldquoThe page rankcitation rankingbringing order to the webrdquo 1999

[38] C M Bishop Pattern Recognition and Machine LearningSpringer New York NY USA 2006

[39] T M Mitchell Machine Learning McGraw Hill 1st edition1997

[40] J Ye H Cheng Z Zhu and M Chen ldquoPredicting positive andnegative links in signed social networks by transfer learningrdquo inProceedings of the 22nd International Conference onWorldWideWeb (WWW rsquo13) pp 1477ndash1488 2013

[41] J Kunegis ldquoWhat is the added value of negative links inonline social networksrdquo inProceedings of the 22nd InternationalConference on World Wide Web pp 727ndash736 2013

[42] K-Y Chiang C-J Hsieh N Natarajan I S Dhillon and ATewari ldquoPrediction and clustering in signed networks a local toglobal perspectiverdquo Journal of Machine Learning Research vol15 no 1 pp 1177ndash1213 2014

[43] A Javari and M Jalili ldquoCluster-based collaborative filtering forsign prediction in social networks with positive and negativelinksrdquo ACM Transactions on Intelligent Systems and Technologyvol 5 no 2 article 24 2014

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: A Community-Based Approach for Link Prediction in Signed Social

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014