social recommendation algorithm fusing user interest social network


July 2014, 21(Suppl. 1): 26–33 www.sciencedirect.com/science/journal/10058885 http://jcupt.xsw.bupt.cn

The Journal of China Universities of Posts and Telecommunications

Social recommendation algorithm fusing user interest social network

LI Yu-sheng ( ), SONG Mei-na, E Hai-hong, SONG Jun-de

School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

Using the social information among users in a recommender system can partly solve the data sparsity problem and significantly improve recommendation performance. However, recommender systems that use users’ social information face two main problems. First, explicit user social connection information is not always available in real-world recommender systems. Second, when explicit social information is available, it is usually used directly, even though social connections are not all based on shared interests, so they can introduce noise into the recommender system. This paper proposes a social recommendation model called interest social recommendation (ISoRec). Based on probabilistic matrix factorization (PMF), the model addresses both problems by combining the user-item rating matrix, explicit user social connection information and implicit user interest social connection information to make more accurate recommendations. In addition, the computational complexity of the algorithm is linear in the number of observed entries, so it can scale to very large datasets.

Keywords social recommendation, PMF, implicit user interest social connection network, explicit user social connection network

1 Introduction

Today, with the rapid development of the Internet, the amount of information generated online grows exponentially. Recommender systems are becoming increasingly important as tools for information filtering and recommendation. Traditional recommender systems have been studied by many researchers [1–6], and some of them have been successfully applied in industry. Those systems are based on the assumption that users are independent and identically distributed, ignoring the social relationships among users. However, this assumption does not hold in reality: people often turn to friends for suggestions about movies, books or other things, and the opinions of their friends have a great influence on their final decisions. Therefore, using social network information among users to improve recommender system performance has recently drawn a lot of attention, and several social recommendation algorithms have been proposed [7–11]. Experiments with these algorithms indicate that social relationships among users can significantly improve the recommendation accuracy of traditional recommender systems.

Received date: 27-06-2014 Corresponding author: LI Yu-sheng, E-mail: [email protected] DOI: 10.1016/S1005-8885(14)60516-1

Although social recommendation algorithms have shown great advantages over traditional recommendation, some problems and challenges remain:

1) Most existing recommender systems do not provide mechanisms for social relationships between users. Explicit user social relationship information is then unavailable, so social recommendation cannot be applied. This greatly limits the use of social recommendation.

2) When explicit user social connection information is available, the connections between users are not necessarily formed on the basis of similar interests or hobbies, so using this social information in a recommender system may introduce noise, because only social information based on the same or similar interests is useful for making recommendations.

To overcome these limitations, we propose a method based on probabilistic factor analysis which integrates implicit user interest social information, explicit user social information and the user-item rating matrix. When explicit user social information is unavailable, the method uses the implicit user interest social information alone. When explicit user social information is available, the algorithm uses the users’ explicit social information and implicit interest information simultaneously to generate recommendation results. In this way, the algorithm not only reflects the impact of users’ explicit social relationships but also takes into account users’ implicit interest information, so that users who have the same or similar interests and are also explicitly connected receive more weight in the recommendation results.

To achieve this goal, the algorithm integrates the explicit user social information matrix, the implicit user interest relationship matrix and the user-item rating matrix on the basis of probabilistic factor analysis. We connect these three data sources through a shared user latent feature space; that is, the user latent feature space of the explicit social relationship matrix and of the implicit interest relationship matrix is the same as that of the user-item rating matrix. By performing factor analysis based on probabilistic matrix factorization, the low-rank user latent feature space and item latent feature space are learned in order to make social recommendations. The experimental results on MovieLens and Epinions show that our method outperforms state-of-the-art collaborative filtering and social recommendation algorithms.

The remainder of this paper is organized as follows. In Sect. 2, we provide an overview of several major approaches to recommender systems and some related work. In Sect. 3, we present our work on social recommendation. The results of an empirical analysis are presented in Sect. 4, followed by the conclusions in Sect. 5.

2 Related work

2.1 Preliminaries

To facilitate the description of the problem, Table 1 defines basic terms and notations used throughout this paper.

Table 1 Basic notations used throughout this paper

| Notation | Description |
|---|---|
| $US = \{u_1, u_2, \ldots, u_m\}$ | $US$ is the set of users, $u_i$ is the $i$th user, $m$ is the total number of users |
| $IS = \{i_1, i_2, \ldots, i_n\}$ | $IS$ is the set of items, $i_j$ is the $j$th item, $n$ is the total number of items |
| $l$ | $l$ is the number of dimensions of the latent feature space |
| $U \in \mathbb{R}^{l \times m}$ | $U$ is the user latent feature matrix |
| $V \in \mathbb{R}^{l \times n}$ | $V$ is the item latent feature matrix |
| $R = \{r_{ij}\},\ R \in \mathbb{R}^{m \times n}$ | $R$ is the user-item rating matrix, $r_{ij}$ is the rating that user $u_i$ gave to item $i_j$ |
| $S = \{s_{ik}\},\ S \in \mathbb{R}^{m \times m}$ | $S$ is the users’ explicit social relationship matrix, $s_{ik}$ is the relationship between users $u_i$ and $u_k$ |
| $C = \{c_{it}\},\ C \in \mathbb{R}^{m \times m}$ | $C$ is the users’ implicit interest relationship matrix, $c_{it}$ is the relationship between users $u_i$ and $u_t$ |

As shown in Table 1, we have $m$ users and $n$ items in a recommender system. The user-item rating matrix is denoted by $R$, and the element $r_{ij}$ of $R$ is the rating given to item $i_j$ by user $u_i$, where the values of $r_{ij}$ lie in the range [0, 1]. In recommender systems, ratings reflect users’ judgments about the items, and most recommender systems use discrete integer rating values from 1 to $R_{\max}$ to represent these judgments. In this case, we use the function $f(x) = (x - 1)/(R_{\max} - 1)$ to map the original rating values to the interval [0, 1].
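The rating mapping above can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this example; the inverse mapping is an added convenience for reporting predictions on the original scale and is not described in the paper.

```python
def normalize_rating(x: float, r_max: int) -> float:
    """Map a rating in {1, ..., r_max} to [0, 1] via f(x) = (x - 1) / (r_max - 1)."""
    if r_max <= 1:
        raise ValueError("r_max must be greater than 1")
    return (x - 1) / (r_max - 1)


def denormalize_rating(y: float, r_max: int) -> float:
    """Inverse of normalize_rating (an assumption of this sketch, not from the paper)."""
    return y * (r_max - 1) + 1


# Example with a 1-to-5 rating scale.
ratings = [1, 2, 3, 4, 5]
scaled = [normalize_rating(r, 5) for r in ratings]
```

With a five-point scale, the lowest rating maps to 0, the highest to 1, and a rating of 3 maps to 0.5.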

In a social recommender system, there is an explicit user social relationship matrix in addition to the user-item rating matrix. The explicit user social relationship matrix is denoted by $S$, where the weight of element $s_{ij}$ in $S$ represents how much user $u_i$ trusts or knows user $u_j$ in the social network. In this paper, we also construct an implicit interest relationship matrix $C$, whose element $c_{ij}$ denotes the interest similarity between user $u_i$ and user $u_j$; a larger value in $C$ means that the two users are more similar.

2.2 Related work

Traditional recommendation algorithms [12–14] are based on the assumption that users are independent and identically distributed, and they ignore the social trust relationships between users. This is inconsistent with the fact that we normally ask our friends for recommendations. Therefore, to improve recommendation accuracy, modern recommender systems should take both the social network and the user-item rating matrix into consideration. Such systems are called social recommender systems [9–11,15–17].

In general, most social recommendation methods are based on the matrix factorization framework [9–10], which is both effective and efficient in generating recommendations. Typically, social information is used to shape the user and item latent spaces. Different intuitions about how to interpret social information result in different objective functions or learning models. Ma et al. [10] proposed a social recommender system named social recommendation (SoRec). By conducting latent factor analysis with probabilistic matrix factorization, SoRec learns the user latent feature space and the item latent feature space from a user social network and a user-item matrix simultaneously and seamlessly. The graphical model for this algorithm is shown in Fig. 1. The same authors [11] proposed another social recommender system called social trust ensemble (STE). STE is a matrix factorization method that fuses users’ social relations, so that the model reflects not only a user’s own interests but also the interests of his/her friends. The graphical model of this algorithm is shown in Fig. 2. The predicted rating that user $u_i$ gives to item $i_j$ is computed as follows:

$$
\hat{R}_{ij} = g\Bigl(\alpha\, U_i^{T} V_j + (1 - \alpha) \sum_{k \in N_i} T_{ik}\, U_k^{T} V_j\Bigr) \tag{1}
$$

Here $N_i$ is the set of friends of user $u_i$ and $T_{ik}$ is the weight of the connection from $u_i$ to $u_k$. The user’s own tastes and the trusted friends’ tastes are blended by the parameter $\alpha$, which naturally incorporates an appropriate amount of the real-world recommendation process into the recommender system.
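As a minimal sketch of the STE prediction rule, the following Python code blends a user's own predicted preference with a trust-weighted sum over friends before applying the logistic function. The function names, the dense trust matrix `T` (with zero meaning "no connection") and the column-per-user layout of `U` and `V` are assumptions made for this illustration, not details taken from Ref. [11].

```python
import numpy as np


def logistic(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def ste_predict(U, V, T, i, j, alpha):
    """Predicted rating of user i for item j.

    U: l x m user factors, V: l x n item factors,
    T: m x m trust weights (hypothetical dense layout, 0 = not connected).
    """
    own = U[:, i] @ V[:, j]                      # user's own taste
    friends = np.flatnonzero(T[i])               # indices k with T[i, k] != 0
    social = sum(T[i, k] * (U[:, k] @ V[:, j]) for k in friends)
    return logistic(alpha * own + (1.0 - alpha) * social)


rng = np.random.default_rng(0)
U = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 5))
T = np.zeros((4, 4))
T[0, 1], T[0, 2] = 0.6, 0.4                      # user 0 trusts users 1 and 2
prediction = ste_predict(U, V, T, i=0, j=2, alpha=0.5)
```

Setting `alpha=1.0` ignores the friends entirely and reduces the rule to a plain matrix factorization prediction.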

Fig. 1 Graphical model for SoRec

Fig. 2 Graphical model for STE

The social recommender systems above all use users’ explicit social connection information to improve recommendation results. However, explicit social connection information is not always available in real-world recommender systems. In Ref. [15], the authors proposed a social recommendation algorithm which uses the similarity between users and between items as social connection information when users’ explicit social connection information is unavailable. There are other methods for obtaining implicit connection information in this situation. In Ref. [16], the authors used tagging information to compute the similarity between users and between items, and used this similarity as social connection information to improve the recommendation results. The graphical model is shown in Fig. 3.

Fig. 3 Graphical model in social recommendation in Ref. [11]

Our model differs from the work above because we fuse users’ explicit social information and users’ implicit social information in one model, which gives the algorithm better adaptability and scalability. Furthermore, we use a novel matrix decomposition model, which makes it easy to fuse users’ explicit social information and implicit interest information to make more accurate recommendations.

3 Social recommendation fusing the users’ implicit interest information

3.1 Users’ implicit interest information matrix

In real-world recommender systems, only a few websites have implemented social or trust mechanisms, such as Epinions (http://www.epinions.com, a general consumer review site established in 1999, where users can also add other users to their trust lists) and Douban (http://www.douban.com, the largest Chinese Web 2.0 site devoted to movie, book and music reviews, launched in 2005). However, a user-item rating matrix can be found in almost every recommender system, so in this paper we use it to construct the users’ implicit interest social connection matrix.

Each element $r_{ij}$ of the user-item rating matrix $R$ reflects the degree of user $u_i$’s preference for item $i_j$. People who have the same interests or hobbies tend to exhibit similar preferences for the same items. So when explicit social information is missing, we can still compute an interest-based social connection: we compute the similarity between users from the user-item rating matrix. To reduce noise when computing the similarities, we require that user $u_i$ and user $u_j$ have co-rated at least 10 items; otherwise, we ignore user $u_j$ when computing the similar neighbors of user $u_i$.

Several methods from the literature can be used to measure the similarity between two users. In this paper, we adopt the most popular one, the Pearson correlation coefficient (PCC) [14], which is defined as:

$$
s_{ij} = \frac{\sum_{k \in I(i) \cap I(j)} (r_{ik} - \bar{r}_i)\,(r_{jk} - \bar{r}_j)}{\sqrt{\sum_{k \in I(i) \cap I(j)} (r_{ik} - \bar{r}_i)^2}\;\sqrt{\sum_{k \in I(i) \cap I(j)} (r_{jk} - \bar{r}_j)^2}} \tag{2}
$$

where $I(i)$ is the set of items rated by user $u_i$, and $\bar{r}_i$ represents the average rating of user $u_i$. By definition, the user similarity $s_{ij}$ lies in $[-1, 1]$, and a larger value means that users $u_i$ and $u_j$ are more similar. We employ the mapping function $f(x) = (x + 1)/2$ to bound the PCC similarities within [0, 1]. Once all similarities are calculated, we use them to construct the implicit user interest matrix $C$, whose elements are set to $c_{ij} = s_{ij}$, because the more similar two users are, the more interests and hobbies they share.

When users’ explicit social connection information is available, traditional social recommendation usually uses it directly. However, social connections are formed for many reasons, such as being schoolmates, colleagues or relatives. Those relationships are not necessarily based on the same or similar interests and hobbies, whereas social recommender systems rely on the intuition that socially connected users have the same or similar interests and hobbies. Therefore, the explicit connection information used in traditional social recommendation may introduce noise and reduce recommendation accuracy.

In this paper, we use users’ implicit interest connection information to reinforce the social connections between users who have the same or similar interests. The proposed ISoRec model can be used whether or not users’ social connection information is available. When it is unavailable, we use only the users’ implicit interest connections for social recommendation. When it is available, we use the explicit and implicit connection information simultaneously.
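The construction of the implicit interest matrix $C$ described in this subsection can be sketched in Python as follows. This is an illustrative implementation under stated assumptions: ratings are stored in a dense array `R` with a boolean `mask` marking observed entries, each user's mean is taken over all items that user rated, and pairs whose PCC is undefined (zero variance on the co-rated items) are simply skipped. The function name and the returned indicator array are hypothetical.

```python
import numpy as np


def implicit_interest_matrix(R, mask, min_common=10):
    """Build C from PCC similarities mapped to [0, 1] with f(x) = (x + 1) / 2.

    R: m x n ratings, mask: m x n boolean (True where a rating is observed).
    Returns (C, observed), where observed[i, j] marks pairs with a similarity.
    """
    m = R.shape[0]
    C = np.zeros((m, m))
    observed = np.zeros((m, m), dtype=bool)
    counts = mask.sum(axis=1)
    means = (R * mask).sum(axis=1) / np.maximum(counts, 1)  # mean rating per user
    for i in range(m):
        for j in range(i + 1, m):
            common = mask[i] & mask[j]            # items rated by both users
            if common.sum() < min_common:
                continue                          # too few co-rated items
            di = R[i, common] - means[i]
            dj = R[j, common] - means[j]
            denom = np.sqrt((di ** 2).sum()) * np.sqrt((dj ** 2).sum())
            if denom == 0:
                continue                          # PCC undefined for this pair
            s = (di * dj).sum() / denom           # similarity in [-1, 1]
            C[i, j] = C[j, i] = (s + 1) / 2       # mapped to [0, 1]
            observed[i, j] = observed[j, i] = True
    return C, observed


R = np.array([[1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
              [5, 4, 3, 2, 1, 5, 4, 3, 2, 1],
              [3, 3, 3, 0, 0, 0, 0, 0, 0, 0]], dtype=float)
mask = np.ones_like(R, dtype=bool)
mask[3, 3:] = False                               # user 3 rated only 3 items
C, observed = implicit_interest_matrix(R, mask)
```

In this toy example, users 0 and 1 have identical tastes, users 0 and 2 have opposite tastes, and user 3 is ignored because fewer than 10 items are co-rated with anyone. The double loop is quadratic in the number of users, so a production system would restrict the computation to candidate pairs.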

3.2 ISoRec recommendation model

In order to learn the characteristics of the users, we employ matrix factorization to factorize the user-item rating matrix, the users’ explicit social connection matrix and the users’ implicit interest matrix. The idea of matrix factorization is to derive a high-quality $l$-dimensional ($l \ll m$) feature representation $U_i$ of users and $V_j$ of items by analyzing the matrices $R$, $S$ and $C$, and then to use the user latent feature matrix $U$ and the item latent feature matrix $V$ to generate the recommendation results.

The conditional distribution over the observed ratings of user-item rating matrix R is defined as:

$$
p(R \mid U, V, \sigma_R^2) = \prod_{i=1}^{m} \prod_{j=1}^{n} \Bigl[\mathcal{N}\bigl(r_{ij} \mid g(U_i^{T} V_j), \sigma_R^2\bigr)\Bigr]^{I^{R}_{ij}} \tag{3}
$$

where $I^{R}_{ij}$ is an indicator variable that equals 1 if the entry in row $i$ and column $j$ of $R$ is observed (that is, user $u_i$ rated item $i_j$) and 0 otherwise. $\mathcal{N}(x \mid \mu, \sigma^2)$ is the probability density function of the Gaussian distribution with mean $\mu$ and variance $\sigma^2$, and $g(x)$ is the logistic function $g(x) = 1/(1 + \exp(-x))$, which bounds the range of $U_i^{T} V_j$ within [0, 1].

We also place zero-mean spherical Gaussian priors on the user and item feature vectors:

$$
p(U \mid \sigma_U^2) = \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 \mathbf{I}), \qquad
p(V \mid \sigma_V^2) = \prod_{j=1}^{n} \mathcal{N}(V_j \mid 0, \sigma_V^2 \mathbf{I}) \tag{4}
$$

Through a Bayesian inference, the posterior distributions of U and V based only on the observed ratings are derived in Eq. (5):

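$$
\begin{aligned}
p(U, V \mid R, \sigma_R^2, \sigma_U^2, \sigma_V^2)
&\propto p(R \mid U, V, \sigma_R^2)\, p(U \mid \sigma_U^2)\, p(V \mid \sigma_V^2) \\
&= \prod_{i=1}^{m} \prod_{j=1}^{n} \Bigl[\mathcal{N}\bigl(r_{ij} \mid g(U_i^{T} V_j), \sigma_R^2\bigr)\Bigr]^{I^{R}_{ij}}
\times \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 \mathbf{I})
\times \prod_{j=1}^{n} \mathcal{N}(V_j \mid 0, \sigma_V^2 \mathbf{I})
\end{aligned} \tag{5}
$$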

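Similarly, the conditional distributions over the observed elements of the explicit user social connection matrix $S$ and the implicit user interest social connection matrix $C$ are defined as:

$$
\begin{aligned}
p(S \mid U, W, \sigma_S^2) &= \prod_{i=1}^{m} \prod_{k=1}^{m} \Bigl[\mathcal{N}\bigl(s_{ik} \mid g(U_i^{T} W_k), \sigma_S^2\bigr)\Bigr]^{I^{S}_{ik}} \\
p(C \mid U, Z, \sigma_C^2) &= \prod_{i=1}^{m} \prod_{t=1}^{m} \Bigl[\mathcal{N}\bigl(c_{it} \mid g(U_i^{T} Z_t), \sigma_C^2\bigr)\Bigr]^{I^{C}_{it}}
\end{aligned} \tag{6}
$$

Here $W$ and $Z$ are the latent factor matrices that, together with the shared user matrix $U$, factorize $S$ and $C$ respectively, and $I^{S}_{ik}$ and $I^{C}_{it}$ are the indicator variables of the observed entries of $S$ and $C$.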

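We also place zero-mean spherical Gaussian priors on $W$ and $Z$. Through Bayesian inference, we can derive the posterior distribution of $U$, $W$ in Eq. (7) and of $U$, $Z$ in Eq. (8):

$$
\begin{aligned}
p(U, W \mid S, \sigma_S^2, \sigma_U^2, \sigma_W^2)
&\propto p(S \mid U, W, \sigma_S^2)\, p(U \mid \sigma_U^2)\, p(W \mid \sigma_W^2) \\
&= \prod_{i=1}^{m} \prod_{k=1}^{m} \Bigl[\mathcal{N}\bigl(s_{ik} \mid g(U_i^{T} W_k), \sigma_S^2\bigr)\Bigr]^{I^{S}_{ik}}
\times \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 \mathbf{I})
\times \prod_{k=1}^{m} \mathcal{N}(W_k \mid 0, \sigma_W^2 \mathbf{I})
\end{aligned} \tag{7}
$$

$$
\begin{aligned}
p(U, Z \mid C, \sigma_C^2, \sigma_U^2, \sigma_Z^2)
&\propto p(C \mid U, Z, \sigma_C^2)\, p(U \mid \sigma_U^2)\, p(Z \mid \sigma_Z^2) \\
&= \prod_{i=1}^{m} \prod_{t=1}^{m} \Bigl[\mathcal{N}\bigl(c_{it} \mid g(U_i^{T} Z_t), \sigma_C^2\bigr)\Bigr]^{I^{C}_{it}}
\times \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 \mathbf{I})
\times \prod_{t=1}^{m} \mathcal{N}(Z_t \mid 0, \sigma_Z^2 \mathbf{I})
\end{aligned} \tag{8}
$$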

In order to reflect the phenomenon that a user’s explicit social connections and implicit interest connections affect this user’s judgment of items, we model the problem of social recommendation using the graphical model described in Fig. 4, which fuses the users’ explicit social connection matrix, the users’ implicit interest connection matrix and the user-item rating matrix into a consistent and compact feature representation.

Fig. 4 Graphical model for ISoRec

Based on Fig. 4, the log of the posterior distribution is given by:

$$
\begin{aligned}
\ln p(U, V, W, Z \mid R, S, C;\ & \sigma_R^2, \sigma_S^2, \sigma_C^2, \sigma_U^2, \sigma_V^2, \sigma_W^2, \sigma_Z^2) = \\
& -\frac{1}{2\sigma_R^2} \sum_{i=1}^{m} \sum_{j=1}^{n} I^{R}_{ij} \bigl(r_{ij} - g(U_i^{T} V_j)\bigr)^2
 -\frac{1}{2\sigma_S^2} \sum_{i=1}^{m} \sum_{k=1}^{m} I^{S}_{ik} \bigl(s_{ik} - g(U_i^{T} W_k)\bigr)^2 \\
& -\frac{1}{2\sigma_C^2} \sum_{i=1}^{m} \sum_{t=1}^{m} I^{C}_{it} \bigl(c_{it} - g(U_i^{T} Z_t)\bigr)^2 \\
& -\frac{1}{2\sigma_U^2} \sum_{i=1}^{m} U_i^{T} U_i
 -\frac{1}{2\sigma_V^2} \sum_{j=1}^{n} V_j^{T} V_j
 -\frac{1}{2\sigma_W^2} \sum_{k=1}^{m} W_k^{T} W_k
 -\frac{1}{2\sigma_Z^2} \sum_{t=1}^{m} Z_t^{T} Z_t \\
& -\frac{1}{2} \Bigl(\sum_{i=1}^{m} \sum_{j=1}^{n} I^{R}_{ij}\Bigr) \ln \sigma_R^2
 -\frac{1}{2} \Bigl(\sum_{i=1}^{m} \sum_{k=1}^{m} I^{S}_{ik}\Bigr) \ln \sigma_S^2
 -\frac{1}{2} \Bigl(\sum_{i=1}^{m} \sum_{t=1}^{m} I^{C}_{it}\Bigr) \ln \sigma_C^2 \\
& -\frac{1}{2} \bigl(ml \ln \sigma_U^2 + nl \ln \sigma_V^2 + ml \ln \sigma_W^2 + ml \ln \sigma_Z^2\bigr) + \mathcal{C}
\end{aligned} \tag{9}
$$

where $\mathcal{C}$ is a constant that does not depend on the parameters. Maximizing the log-posterior over the latent features with the hyperparameters kept fixed is equivalent to minimizing the following sum-of-squared-errors objective function with quadratic regularization terms:

$$
\begin{aligned}
E(R, S, C, U, V, W, Z) ={}& \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I^{R}_{ij} \bigl(r_{ij} - g(U_i^{T} V_j)\bigr)^2
 + \frac{\lambda_S}{2} \sum_{i=1}^{m} \sum_{k=1}^{m} I^{S}_{ik} \bigl(s_{ik} - g(U_i^{T} W_k)\bigr)^2 \\
& + \frac{\lambda_C}{2} \sum_{i=1}^{m} \sum_{t=1}^{m} I^{C}_{it} \bigl(c_{it} - g(U_i^{T} Z_t)\bigr)^2 \\
& + \frac{\lambda_U}{2} \|U\|_F^2 + \frac{\lambda_V}{2} \|V\|_F^2 + \frac{\lambda_W}{2} \|W\|_F^2 + \frac{\lambda_Z}{2} \|Z\|_F^2
\end{aligned} \tag{10}
$$

where $\lambda_S = \sigma_R^2/\sigma_S^2$, $\lambda_C = \sigma_R^2/\sigma_C^2$, $\lambda_U = \sigma_R^2/\sigma_U^2$, $\lambda_V = \sigma_R^2/\sigma_V^2$, $\lambda_W = \sigma_R^2/\sigma_W^2$, $\lambda_Z = \sigma_R^2/\sigma_Z^2$, and $\|\cdot\|_F^2$ denotes the squared Frobenius norm. A local minimum of the objective function given by Eq. (10) can be found by performing gradient descent in $U$, $V$, $W$ and $Z$:

$$
\begin{aligned}
\frac{\partial E}{\partial U_i} ={}& \sum_{j=1}^{n} I^{R}_{ij}\, g'(U_i^{T} V_j) \bigl(g(U_i^{T} V_j) - r_{ij}\bigr) V_j
 + \lambda_S \sum_{k=1}^{m} I^{S}_{ik}\, g'(U_i^{T} W_k) \bigl(g(U_i^{T} W_k) - s_{ik}\bigr) W_k \\
& + \lambda_C \sum_{t=1}^{m} I^{C}_{it}\, g'(U_i^{T} Z_t) \bigl(g(U_i^{T} Z_t) - c_{it}\bigr) Z_t + \lambda_U U_i \\
\frac{\partial E}{\partial V_j} ={}& \sum_{i=1}^{m} I^{R}_{ij}\, g'(U_i^{T} V_j) \bigl(g(U_i^{T} V_j) - r_{ij}\bigr) U_i + \lambda_V V_j \\
\frac{\partial E}{\partial W_k} ={}& \lambda_S \sum_{i=1}^{m} I^{S}_{ik}\, g'(U_i^{T} W_k) \bigl(g(U_i^{T} W_k) - s_{ik}\bigr) U_i + \lambda_W W_k \\
\frac{\partial E}{\partial Z_t} ={}& \lambda_C \sum_{i=1}^{m} I^{C}_{it}\, g'(U_i^{T} Z_t) \bigl(g(U_i^{T} Z_t) - c_{it}\bigr) U_i + \lambda_Z Z_t
\end{aligned} \tag{11}
$$

where $g'(x)$ is the derivative of the logistic function, $g'(x) = \exp(x)/(1 + \exp(x))^2$.
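To make the optimization step concrete, the following NumPy sketch evaluates the objective of Eq. (10), computes the gradients obtained by differentiating it with respect to $U$, $V$, $W$ and $Z$, and runs plain batch gradient descent. It is a hedged illustration rather than the authors' implementation: the function names, the dense-matrix layout, the learning rate, the initialization scale and the default $\lambda_S = \lambda_C = 1$ are all assumptions of this example. Dense arrays are used for readability only; they do not achieve the linear complexity discussed in Sect. 3.3.

```python
import numpy as np


def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def g_prime(x):
    """Derivative of the logistic function, equal to exp(x) / (1 + exp(x))**2."""
    s = g(x)
    return s * (1.0 - s)


def objective_and_grads(R, S, C, IR, IS, IC, U, V, W, Z, lam):
    """Objective E and its gradients.

    R: m x n, S and C: m x m; IR, IS, IC: 0/1 indicator arrays of the same shapes.
    U, W, Z: l x m; V: l x n. lam: dict with keys "S", "C", "U", "V", "W", "Z".
    """
    XR, XS, XC = U.T @ V, U.T @ W, U.T @ Z
    ER = IR * (g(XR) - R)                 # masked residuals on the ratings
    ES = IS * (g(XS) - S)                 # masked residuals on explicit links
    EC = IC * (g(XC) - C)                 # masked residuals on implicit links
    E = 0.5 * (ER ** 2).sum()
    E += 0.5 * lam["S"] * (ES ** 2).sum() + 0.5 * lam["C"] * (EC ** 2).sum()
    E += 0.5 * sum(lam[k] * (M ** 2).sum() for k, M in zip("UVWZ", (U, V, W, Z)))
    GR = ER * g_prime(XR)
    GS = lam["S"] * ES * g_prime(XS)
    GC = lam["C"] * EC * g_prime(XC)
    dU = V @ GR.T + W @ GS.T + Z @ GC.T + lam["U"] * U
    dV = U @ GR + lam["V"] * V
    dW = U @ GS + lam["W"] * W
    dZ = U @ GC + lam["Z"] * Z
    return E, (dU, dV, dW, dZ)


def fit(R, S, C, IR, IS, IC, l=10, lam=None, lr=0.05, iters=200, seed=0):
    """Batch gradient descent; lr, iters and the 0.1 init scale are illustrative."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    lam = lam or dict(S=1.0, C=1.0, U=0.01, V=0.01, W=0.01, Z=0.01)
    U, W, Z = (0.1 * rng.standard_normal((l, m)) for _ in range(3))
    V = 0.1 * rng.standard_normal((l, n))
    history = []
    for _ in range(iters):
        E, (dU, dV, dW, dZ) = objective_and_grads(R, S, C, IR, IS, IC, U, V, W, Z, lam)
        history.append(E)
        U, V, W, Z = U - lr * dU, V - lr * dV, W - lr * dW, Z - lr * dZ
    return U, V, W, Z, history
```

After fitting, the predicted (normalized) rating of user $u_i$ for item $i_j$ is `g(U[:, i] @ V[:, j])`. When no explicit social network exists, passing an all-zero `IS` removes the explicit social term from both the objective and the gradients.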

3.3 Complexity analysis

The main computation of gradient methods is evaluating the objective function $E$ and its gradients with respect to the variables. Because of the sparsity of the matrices $R$, $S$ and $C$, the computational complexity of evaluating the objective function $E$ is $O(\rho_R l + \rho_S l + \rho_C l)$, where $\rho_R$, $\rho_S$ and $\rho_C$ are the numbers of nonzero entries in $R$, $S$ and $C$. The computational complexities of the gradients with respect to $U$, $V$, $W$ and $Z$ in Eq. (11) are $O(\rho_R l + \rho_S l + \rho_C l)$, $O(\rho_R l)$, $O(\rho_S l)$ and $O(\rho_C l)$, respectively. Therefore, the total computational complexity of one iteration is $O(\rho_R l + \rho_S l + \rho_C l)$, which indicates that the computational time of our method is linear in the number of observations in the three sparse matrices. This complexity analysis shows that the proposed approach is efficient and can scale to very large datasets.
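The linear cost comes from touching only the observed entries. As a small sketch, the rating term of the objective can be evaluated from a list of observed $(i, j, r_{ij})$ triples with one length-$l$ dot product per rating. The function name and the triple-based storage are assumptions of this example.

```python
import numpy as np


def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def rating_loss_sparse(rows, cols, vals, U, V):
    """Rating term of E computed from observed (i, j, r_ij) triples only.

    rows, cols: integer index arrays of length rho_R; vals: observed ratings.
    U: l x m, V: l x n.
    """
    # One dot product of length l per observed rating: O(rho_R * l) work.
    scores = np.einsum("li,li->i", U[:, rows], V[:, cols])
    return 0.5 * np.sum((g(scores) - vals) ** 2)


rng = np.random.default_rng(0)
U = rng.standard_normal((4, 6))
V = rng.standard_normal((4, 8))
rows = np.array([0, 0, 2, 5])
cols = np.array([1, 7, 3, 3])
vals = np.array([0.25, 1.0, 0.5, 0.75])
loss = rating_loss_sparse(rows, cols, vals, U, V)
```

The same pattern applies to the terms involving $S$ and $C$, which is why the total cost grows with $\rho_R + \rho_S + \rho_C$ rather than with the full matrix sizes.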

4 Performance analysis

In this section, we conduct several experiments to compare the recommendation quality of our social recommendation approach with other state-of-the-art collaborative filtering methods. Our experiments are intended to address the following questions:

1) How does our approach compare with published state-of-the-art collaborative filtering algorithms?

2) How does the users’ implicit interest connection information affect the recommendation results?

3) How does the parameter $\lambda$ affect the accuracy of prediction?

4.1 Description of the datasets and metrics

We use the MovieLens 10M/100K dataset to evaluate the algorithms when users’ explicit social connection information is not available, and the Epinions dataset to evaluate them when it is available, because the latter contains a trust relation between users in addition to the user-item rating matrix.

The MovieLens 10M/100K dataset is relatively small: it contains 1 000 000 user-item ratings (on a scale from 1 to 5) given by 943 users to 1 642 items. The Epinions dataset contains 49 290 users who rated a total of 139 738 different items at least once, together with 487 181 issued trust statements, which we use as explicit user social connection information.

The main purpose of recommender systems is to predict users’ likes and interests. Multiple metrics exist to measure various aspects of recommendation performance. Two notable metrics, the mean absolute error (MAE) and the root mean square error (RMSE), measure the closeness of predicted ratings to the true ratings. MAE is defined as:

$$
P_{\mathrm{MAE}} = \frac{\sum_{i,j} \lvert r_{ij} - \hat{r}_{ij} \rvert}{N}
$$

where $r_{ij}$ denotes the rating that user $u_i$ gives to item $i_j$, $\hat{r}_{ij}$ denotes the corresponding predicted rating, and $N$ denotes the number of tested ratings. RMSE is defined as:

$$
P_{\mathrm{RMSE}} = \sqrt{\frac{\sum_{i,j} (r_{ij} - \hat{r}_{ij})^2}{N}}
$$

Lower MAE and RMSE values correspond to higher prediction accuracy.
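The two metrics are straightforward to compute; the following sketch shows one way to do so in Python. The function names and the toy ratings are illustrative only.

```python
import numpy as np


def mae(r_true, r_pred):
    """Mean absolute error over the tested ratings."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    return np.abs(r_true - r_pred).mean()


def rmse(r_true, r_pred):
    """Root mean square error over the tested ratings."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    return np.sqrt(((r_true - r_pred) ** 2).mean())


true_ratings = [4, 3, 5, 1]
predicted = [3.5, 3, 4, 2]
print(mae(true_ratings, predicted))   # 0.625
print(rmse(true_ratings, predicted))  # 0.75
```

Because RMSE squares the errors before averaging, it penalizes large individual errors more heavily than MAE, so RMSE is never smaller than MAE on the same test set.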

4.2 Performance analysis

When explicit social information is not available, we evaluate all the algorithms on the MovieLens dataset. In this case the parameters of our model are set to $\lambda_S = \lambda_W = 0$, and only the users’ implicit interest connection information is used for recommendation. We compare our approach with top-performing recommendation algorithms, including PMF [1] and singular value decomposition (SVD) [2]. We use different amounts of training data: 99%, 80%, 50%, 20% and 10%. Training data of 99% means that we randomly select 99% of the ratings from the MovieLens 10M/100K dataset as training data and leave the remaining 1% for testing prediction performance. In the comparison, we set $\lambda_U = \lambda_V = \lambda_W = \lambda_Z = 0.01$ and the number of dimensions of the latent feature space to $l = 10$. The MAE and RMSE results are reported in Table 1. The results show that our approach consistently outperforms the PMF and SVD algorithms, which only use the user-item rating matrix. This observation agrees with our intuition that, in the absence of explicit user social connection information, employing implicit user interest connection information can help increase the recommendation quality.

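Table 1 RMSE and MAE comparison with other approaches on MovieLens

| Training data | Metric | SVD | PMF | ISoRec |
|---|---|---|---|---|
| 99% | $P_{\mathrm{RMSE}}$ | 0.9097 | 0.6807 | 0.6104 |
| | $P_{\mathrm{MAE}}$ | 0.7212 | 0.6206 | 0.5521 |
| 80% | $P_{\mathrm{RMSE}}$ | 0.9418 | 0.6872 | 0.6214 |
| | $P_{\mathrm{MAE}}$ | 0.7448 | 0.6274 | 0.5638 |
| 50% | $P_{\mathrm{RMSE}}$ | 0.9532 | 0.6892 | 0.6378 |
| | $P_{\mathrm{MAE}}$ | 0.7539 | 0.6298 | 0.5806 |
| 20% | $P_{\mathrm{RMSE}}$ | 0.9801 | 0.6902 | 0.6565 |
| | $P_{\mathrm{MAE}}$ | 0.7786 | 0.6307 | 0.5983 |
| 10% | $P_{\mathrm{RMSE}}$ | 1.0117 | 0.6996 | 0.6617 |
| | $P_{\mathrm{MAE}}$ | 0.8126 | 0.6400 | 0.6029 |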
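The random splitting protocol used in these experiments (for example, 80% of the ratings for training and the rest for testing) can be sketched as follows. The function name, the triple representation and the fixed seed are assumptions of this example; the paper does not state how the random selection was implemented.

```python
import numpy as np


def split_ratings(triples, train_fraction, seed=0):
    """Randomly split (user, item, rating) triples into training and test sets."""
    rng = np.random.default_rng(seed)
    triples = np.asarray(triples)
    perm = rng.permutation(len(triples))          # random order of the ratings
    n_train = int(round(train_fraction * len(triples)))
    return triples[perm[:n_train]], triples[perm[n_train:]]


# Toy data: 10 ratings, split 80% / 20%.
toy = [(u, i, 1 + (u + i) % 5) for u in range(2) for i in range(5)]
train, test = split_ratings(toy, 0.8)
```

Every rating ends up in exactly one of the two sets, and repeating the split with different seeds gives independent trials.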

When users’ explicit social connection information is available, we evaluate all algorithms on the Epinions dataset, since in addition to the user-item rating matrix it also contains a social connection network between users. We compare our approach with the SoRec algorithm [9] and the PMF algorithm. SoRec is a social recommendation algorithm which uses explicit social connection information and the user-item rating matrix to generate recommendation results. PMF only uses the user-item rating matrix for recommendation. We again use different amounts of training data (80%, 50%, 30%, 20% and 10%) to test all the algorithms. The experimental results are shown in Table 2. Our approach outperforms the PMF algorithm, which only uses the user-item rating matrix, and it also improves on the accuracy of social recommendation with SoRec. This confirms that implicit user interest connection information has a clear impact on recommendation results.

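Table 2 RMSE and MAE comparison with other approaches on Epinions

| Training data | Metric | PMF | SoRec | ISoRec |
|---|---|---|---|---|
| 80% | $P_{\mathrm{RMSE}}$ | 0.8032 | 0.7994 | 0.7841 |
| | $P_{\mathrm{MAE}}$ | 0.7446 | 0.7410 | 0.7252 |
| 50% | $P_{\mathrm{RMSE}}$ | 0.8038 | 0.8001 | 0.7873 |
| | $P_{\mathrm{MAE}}$ | 0.7454 | 0.7442 | 0.7293 |
| 30% | $P_{\mathrm{RMSE}}$ | 0.8039 | 0.8002 | 0.7890 |
| | $P_{\mathrm{MAE}}$ | 0.7456 | 0.7423 | 0.7306 |
| 20% | $P_{\mathrm{RMSE}}$ | 0.8039 | 0.8004 | 0.7901 |
| | $P_{\mathrm{MAE}}$ | 0.7459 | 0.7425 | 0.7318 |
| 10% | $P_{\mathrm{RMSE}}$ | 0.8040 | 0.8006 | 0.7914 |
| | $P_{\mathrm{MAE}}$ | 0.7461 | 0.7426 | 0.7331 |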

The main advantage of our approach is that it incorporates both the users’ explicit social connection information and the users’ implicit interest connection information, which helps predict users’ preferences. In our model, the parameters $\lambda_S$ and $\lambda_C$ balance the information from the user-item rating matrix against the users’ social information, and we set $\lambda_S = \lambda_C = \lambda$. If $\lambda = 0$, we use only the user-item rating matrix for matrix factorization; if $\lambda = \infty$, we extract information only from the social network to predict users’ preferences. In other cases, we fuse information from the user-item rating matrix and the user social information for probabilistic matrix factorization and predict ratings for active users.
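The role of the balancing weights can be illustrated with a small sketch of a weighted co-factorization loss. This is written in the general style of SoRec-type models and is not necessarily the exact ISoRec objective defined earlier in the paper: the factor names `U`, `V`, `Z`, `W`, the dictionaries `explicit_links` and `interest_links`, the scaling of all observed values to [0, 1], and the omission of the norm regularization terms are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_error(observed, left, right):
    """0.5 * sum of squared errors over observed {(row, col): value} entries."""
    return 0.5 * sum(
        (value - sigmoid(dot(left[r], right[c]))) ** 2
        for (r, c), value in observed.items()
    )


def fused_loss(ratings, explicit_links, interest_links,
               U, V, Z, W, lam_s, lam_c):
    """Rating loss plus social and interest losses weighted by lam_s, lam_c.

    Assumed form; regularization on U, V, Z, W is left out for brevity.
    """
    rating_term = squared_error(ratings, U, V)
    social_term = squared_error(explicit_links, U, Z)
    interest_term = squared_error(interest_links, U, W)
    return rating_term + lam_s * social_term + lam_c * interest_term


# Toy inputs (hypothetical), with two users, two items, two latent dimensions.
ratings = {(0, 0): 0.8, (1, 1): 0.2}
explicit_links = {(0, 1): 1.0}
interest_links = {(1, 0): 0.6}
U = [[0.1, 0.2], [0.3, -0.1]]
V = [[0.2, 0.1], [-0.2, 0.4]]
Z = [[0.0, 0.1], [0.2, 0.2]]
W = [[0.1, -0.3], [0.05, 0.1]]

for lam in (0.0, 0.1, 1.0, 10.0):
    loss = fused_loss(ratings, explicit_links, interest_links,
                      U, V, Z, W, lam, lam)
    print(lam, round(loss, 6))
```

With both weights at zero the loss reduces to the rating-only term, which corresponds to plain matrix factorization; as the weights grow, the social and interest terms dominate the objective, which is the limiting case the text describes as relying only on the social network.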

Fig. 5 and Fig. 6 show the impact of λ on RMSE and MAE, respectively. We observe that the value of λ affects the recommendation results significantly, which indicates that fusing the user-item rating matrix with the user social information greatly improves the recommendation accuracy. As λ increases, the prediction accuracy improves at first, but once λ surpasses a certain threshold, the RMSE increases as λ grows further. The results match the intuition that using only the users’ social information or only the user-item rating information cannot produce better predictions than fusing the two.

Fig. 5 Impact of parameter λ on RMSE

Fig. 6 Impact of parameter λ on MAE

5 Conclusions

Based on the intuition that the users’ implicit interest connection information can help a recommendation system judge the users’ interests and hobbies, we propose the ISoRec framework, which combines the users’ explicit social connection information, the users’ implicit interest connection information and the user-item rating matrix in a unified probabilistic matrix factorization. The experimental results show that our approach significantly improves prediction accuracy. In addition, the ISoRec algorithm can be used in most real-world recommendation systems, especially when the users’ explicit social connection information is unavailable.

Acknowledgements

This work was supported by the National Key project of Scientific and Technical Supporting Programs of China (2013BAH10F01, 2013BAH07F02, 2014BAH26F02), the Research Fund for the Doctoral Program of Higher Education (20110005120007), Beijing Higher Education Young Elite Teacher Project (YETP0445), the Co-construction Program with Beijing Municipal Commission of Education, Engineering Research Center of Information Networks, Ministry of Education.

References

1. Mnih A, Salakhutdinov R. Probabilistic matrix factorization. Advances in Neural Information Processing Systems. 2007: 1257−1264

2. Wang J, De Vries A P, Reinders M J T. Unifying user-based and item-based collaborative filtering approaches by similarity fusion. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2006: 501−508

3. Xue G R, Lin C, Yang Q, et al. Scalable collaborative filtering using cluster-based smoothing. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2005: 114−121

4. Agarwal D, Chen B C. fLDA: matrix factorization through latent Dirichlet allocation. Proceedings of the 3rd ACM International Conference on Web Search and Data Mining. ACM, 2010: 91−100

5. Bell R, Koren Y, Volinsky C. Modeling relationships at multiple scales to improve accuracy of large recommender systems. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2007: 95−104

6. Koren Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2008: 426−434

7. Liu F, Lee H J. Use of social network information to enhance collaborative filtering performance [J]. Expert Systems with Applications, 2010, 37(7): 4772−4778

8. Yang S H, Long B, Smola A, et al. Like like alike: joint friendship and interest propagation in social networks. Proceedings of the 20th International Conference on World Wide Web. ACM, 2011: 537−546

9. Ma H, Zhou D, Liu C, et al. Recommender systems with social regularization. Proceedings of the 4th ACM International Conference on Web Search and Data Mining. ACM, 2011: 287−296

10. Ma H, Yang H, Lyu M R, et al. SoRec: social recommendation using probabilistic matrix factorization. Proceedings of the 17th ACM Conference on Information and Knowledge Management. ACM, 2008: 931−940

11. Ma H, King I, Lyu M R. Learning to recommend with social trust ensemble. Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2009: 203−210

12. Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems [J]. Computer, 2009, 42(8): 30−37

13. Hofmann T. Latent semantic models for collaborative filtering [J]. ACM Transactions on Information Systems (TOIS), 2004, 22(1): 89−115

14. Hofmann T. Collaborative filtering via Gaussian probabilistic latent semantic analysis. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2003: 259−266

15. Ma H. An experimental study on implicit social recommendation. Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2013: 73−82

16. Wu L, Chen E, Liu Q, et al. Leveraging tagging for neighborhood-aware probabilistic matrix factorization. Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 2012: 1854−1858

17. Shardanand U, Maes P. Social information filtering: algorithms for automating word of mouth. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM Press/Addison-Wesley Publishing Co., 1995: 210−217