Post on 04-Jan-2016


Collaborative Filtering

Fei Wang

Adapted from http://www.cs.uic.edu/~liub/teach/cs583-spring-11/CS583-recommender-systems.ppt


Road Map

• Introduction
• Content-based recommendation
• Collaborative filtering based recommendation
  – K-nearest neighbor
  – Association rules
  – Matrix factorization


Introduction

Recommender systems are widely used on the Web for recommending products and services to users.

Most e-commerce sites have such systems. These systems serve two important functions.

• They help users deal with information overload by recommending products, services, etc.

• They help businesses increase profit by selling more products.


E.g., movie recommendation

The most common scenario is the following:

• A set of users has initially rated some subset of movies (e.g., on a scale of 1 to 5) that they have already seen.

• These ratings serve as the input. The recommendation system uses these known ratings to predict the rating that each user would give to the movies he or she has not yet rated.

Recommendations of movies are then made to each user based on the predicted ratings.


Different variations

• In some applications there is no rating information, while in others there are additional attributes about each user (e.g., age, gender, income, marital status, etc.) and/or about each movie (e.g., title, genre, director, leading actors or actresses, etc.).

• When there is no rating information, the system does not predict ratings; instead it predicts the likelihood that a user will enjoy watching a movie.

CS583, Bing Liu, UIC


The Recommendation Problem

• We have a set of users U and a set of items S to be recommended to the users.

• Let p be a utility function that measures the usefulness of item s (∈ S) to user u (∈ U), i.e.,
  – p: U × S → R, where R is a totally ordered set (e.g., non-negative integers or real numbers in a range)

• Objective
  – Learn p based on the past data
  – Use p to predict the utility value of each item s (∈ S) to each user u (∈ U)


As Prediction

• Rating prediction, i.e., predict the rating score that a user is likely to give to an item that he or she has not seen or used before, e.g., the rating of an unseen movie. In this case, the utility of item s to user u is the rating given to s by u.

• Item prediction, i.e., predict a ranked list of items that a user is likely to buy or use.
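The two tasks can be illustrated with a minimal Python sketch. Everything here is an illustrative assumption rather than part of the original slides: the toy ratings table, the function names `predict_rating` and `recommend_top_n`, and the item-mean baseline that stands in for a learned utility function p.

```python
from statistics import mean

# Hypothetical known ratings: ratings[user][item] = score on a 1-5 scale.
ratings = {
    "alice": {"m1": 5, "m2": 3},
    "bob": {"m1": 4, "m3": 2},
    "carol": {"m2": 4, "m3": 5, "m4": 1},
}

def predict_rating(user, item):
    """Rating prediction: estimate p(user, item).

    Stand-in baseline (an assumption): the mean rating the item received
    from the other users, or None if nobody else rated it.
    """
    scores = [r[item] for u, r in ratings.items() if u != user and item in r]
    return mean(scores) if scores else None

def recommend_top_n(user, n):
    """Item prediction: rank the items the user has not rated by predicted utility."""
    all_items = {i for r in ratings.values() for i in r}
    unseen = all_items - set(ratings[user])
    scored = [(predict_rating(user, i), i) for i in unseen]
    scored = [(s, i) for s, i in scored if s is not None]
    # Highest predicted rating first; ties broken by item id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:n]]
```

Any better estimate of p (kNN, matrix factorization, and so on) could replace the baseline inside `predict_rating` without changing the ranking step.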


Two basic approaches

• Content-based recommendations: The user will be recommended items similar to the ones the user preferred in the past.

• Collaborative filtering (or collaborative recommendations): The user will be recommended items that people with similar tastes and preferences liked in the past.

• Hybrids: Combine collaborative and content-based methods.


Content-Based Recommendation

• Perform item recommendations by predicting the utility of items for a particular user based on how “similar” the items are to those that he/she liked in the past. E.g.,
  – In a movie recommendation application, a movie may be represented by such features as specific actors, director, genre, subject matter, etc.
  – The user’s interest or preference is also represented by the same set of features, called the user profile.


Content-based recommendation (contd)

• Recommendations are made by comparing the user profile with candidate items expressed in the same set of features.

• The top-k best matched or most similar items are recommended to the user.

• The simplest approach to content-based recommendation is to compute the similarity of the user profile with each item.
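A minimal sketch of this profile-to-item matching is given below. The slides do not name a similarity measure, so cosine similarity is an assumption here, as are the function names, the three-feature encoding, and the toy data.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_items(user_profile, item_features, k):
    """Return the k item ids whose feature vectors best match the user profile."""
    ranked = sorted(
        item_features,
        key=lambda item: cosine(user_profile, item_features[item]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical features per movie: [action, comedy, drama].
items = {"m1": [1, 0, 0], "m2": [0, 1, 0], "m3": [1, 0, 1]}
# Hypothetical user profile expressed in the same three features.
profile = [0.9, 0.1, 0.4]
```

With this toy data, `top_k_items(profile, items, 2)` ranks the action-drama movie `m3` ahead of the pure action movie `m1`, because the profile has weight on both features.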


Collaborative filtering

Collaborative filtering (CF) is perhaps the most studied and also the most widely used recommendation approach in practice. Methods covered here: k-nearest neighbor, association-rule-based prediction, and matrix factorization.

Key characteristic of CF: it predicts the utility of items for a user based on the items previously rated by other like-minded users.


k-nearest neighbor

kNN (which is also called the memory-based approach) utilizes the entire user-item database to generate predictions directly, i.e., there is no model building.

This approach includes both
• User-based methods
• Item-based methods


User-based kNN CF

A user-based kNN collaborative filtering method consists of two primary phases: the neighborhood formation phase and the recommendation phase.

There are many specific methods for both. Here we only introduce one for each phase.


Neighborhood formation phase

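• Let the record (or profile) of the target user be u (represented as a vector), and the record of another user be v (v ∈ T).

• The similarity between the target user, u, and a neighbor, v, can be calculated using Pearson’s correlation coefficient:

$$
\mathrm{sim}(u, v) = \frac{\sum_{i \in C} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in C} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in C} (r_{v,i} - \bar{r}_v)^2}}
$$

where C is the set of items rated by both u and v, $r_{u,i}$ and $r_{v,i}$ are the ratings given to item i by u and v, and $\bar{r}_u$ and $\bar{r}_v$ are the average ratings of u and v.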
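The Pearson similarity between two users can be sketched in a few lines of Python. The dict-based rating layout and the name `pearson_sim` are assumptions made for illustration. So is the choice to compute each user's mean over all of that user's ratings; some variants use only the co-rated items.

```python
import math

def pearson_sim(ratings_u, ratings_v):
    """Pearson correlation between two users over their co-rated items C.

    ratings_u, ratings_v: dicts mapping item id -> rating.
    User means are taken over each user's full set of ratings (an assumption).
    Returns 0.0 when the correlation is undefined.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)
```

Two users whose ratings rise and fall together score close to 1, users with opposite tastes score close to -1, and users with no co-rated items score 0 under this sketch.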


Recommendation Phase

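• Use the following formula to compute the rating prediction of item i for target user u:

$$
p(u, i) = \bar{r}_u + \frac{\sum_{v \in V} \mathrm{sim}(u, v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in V} |\mathrm{sim}(u, v)|}
$$

where V is the set of k similar users, $r_{v,i}$ is the rating of user v given to item i, and $\bar{r}_u$ and $\bar{r}_v$ are the average ratings of users u and v.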
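A hedged sketch of the recommendation phase follows. It assumes the similarities to the target user have already been computed (for example with Pearson correlation), restricts the neighbourhood to users who actually rated the item, and normalizes by the sum of absolute similarities. The function name and data layout are illustrative.

```python
def predict_user_based(target, item, ratings, sims, k):
    """Predict p(target, item) from the k most similar users who rated the item.

    ratings: dict user -> {item: rating}
    sims: dict user -> sim(target, user), assumed precomputed.
    Falls back to the target's mean rating when no usable neighbour exists.
    """
    def mean(user):
        vals = ratings[user].values()
        return sum(vals) / len(vals)

    # Only users who rated the item can contribute a deviation from their mean.
    candidates = [v for v in sims if v != target and item in ratings[v]]
    neighbours = sorted(candidates, key=lambda v: sims[v], reverse=True)[:k]
    num = sum(sims[v] * (ratings[v][item] - mean(v)) for v in neighbours)
    den = sum(abs(sims[v]) for v in neighbours)
    if den == 0:
        return mean(target)
    return mean(target) + num / den
```

Working with deviations from each neighbour's mean compensates for users who rate systematically high or low.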


Issue with the user-based kNN CF

The problem with the user-based formulation of collaborative filtering is the lack of scalability: it requires the real-time comparison of the target user to all user records in order to generate predictions.

A variation of this approach that remedies this problem is called item-based CF.


Item-based CF

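• The item-based approach works by comparing items based on their pattern of ratings across users. The similarity of items i and j is computed as follows:

$$
\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}
$$

where U is the set of users, $r_{u,i}$ is the rating of user u on item i, and $\bar{r}_u$ is the average rating of user u.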
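The item-to-item similarity can be sketched as below. This is the measure often called adjusted cosine similarity. The sketch assumes that only users who rated both items contribute to the sums; the function name and data layout are illustrative.

```python
import math

def item_sim(i, j, ratings):
    """Adjusted cosine similarity of items i and j.

    ratings: dict user -> {item: rating}. Only users who rated both items
    contribute (an assumption); each rating is centred on that user's mean.
    Returns 0.0 when the similarity is undefined.
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[i] - mean_u
            dj = user_ratings[j] - mean_u
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (math.sqrt(den_i) * math.sqrt(den_j))
```

Because item similarities change slowly, they can be computed offline and stored, which is what lets the item-based variant avoid comparing the target user against every other user at prediction time.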


Recommendation phase

• After computing the similarity between items, we select a set of k most similar items to the target item and generate a predicted value of user u’s rating:

$$
p(u, i) = \frac{\sum_{j \in J} r_{u,j}\,\mathrm{sim}(i, j)}{\sum_{j \in J} \mathrm{sim}(i, j)}
$$

where J is the set of k similar items.
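The weighted average over similar items can be sketched as follows. The similarities are assumed to be precomputed, and the sketch keeps only items with positive similarity so that the denominator cannot cancel to zero; that filter, the function name, and the data layout are assumptions.

```python
def predict_item_based(user_ratings, target_item, sims, k):
    """Predict the user's rating of target_item from the k most similar rated items.

    user_ratings: {item: rating} for the target user u.
    sims: {item: sim(target_item, item)}, assumed precomputed.
    Only items with positive similarity are used (an assumption).
    Returns None if no such item exists.
    """
    candidates = [
        j for j in user_ratings
        if j != target_item and sims.get(j, 0.0) > 0
    ]
    neighbours = sorted(candidates, key=lambda j: sims[j], reverse=True)[:k]
    den = sum(sims[j] for j in neighbours)
    if den == 0:
        return None
    # Similarity-weighted average of the user's own ratings.
    return sum(user_ratings[j] * sims[j] for j in neighbours) / den
```

Note that the prediction uses only the target user's own ratings; other users enter solely through the item similarities.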


Association rule-based CF

• Association rules can be used for recommendation.

• Each transaction for association rule mining is the set of items bought by a particular user.

• We can find item association rules, e.g., buy_X, buy_Y -> buy_Z

• Rank items based on measures such as confidence, etc.
  – See Chapter 3 for details
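As a rough illustration of ranking by confidence, the sketch below scores a single rule per candidate item, with the user's current basket as the antecedent. This is a simplification: a real system would first mine rules with an algorithm such as Apriori under minimum support and confidence thresholds. The function names and the toy transactions are assumptions.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """conf(X -> Y) = count(X and Y together) / count(X); 0.0 if X never occurs."""
    base = support_count(antecedent, transactions)
    if base == 0:
        return 0.0
    return support_count(antecedent | consequent, transactions) / base

def recommend(basket, transactions, candidates):
    """Rank candidate items z by the confidence of the rule basket -> z."""
    scored = [
        (confidence(basket, {z}, transactions), z)
        for z in candidates
        if z not in basket
    ]
    # Highest confidence first; ties broken by item name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [z for c, z in scored if c > 0]

# Hypothetical purchase histories, one set of items per user.
transactions = [{"X", "Y", "Z"}, {"X", "Y", "Z"}, {"X", "Y", "W"}, {"X", "W"}]
```

For the toy data, the rule X, Y -> Z holds in two of the three transactions containing both X and Y, so Z is ranked ahead of W for a user who bought X and Y.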


Matrix factorization

• The idea of matrix factorization is to decompose a matrix M into the product of several factor matrices, i.e.,

$$
M = F_1 F_2 \cdots F_n
$$

where n can be any number, but it is usually 2 or 3.


CF using matrix factorization

Matrix factorization has gained popularity for CF in recent years due to its superior performance both in terms of recommendation quality and scalability.

Part of its success is due to the Netflix Prize contest for movie recommendation, which popularized a Singular Value Decomposition (SVD) based matrix factorization algorithm.

The prize-winning method of the Netflix Prize contest employed an adapted version of SVD.


The abstract idea

Matrix factorization is a latent factor model.

Latent variables (also called features, aspects, or factors) are introduced to account for the underlying reasons why a user purchases or uses a product.

Once the connections between the latent variables and the observed variables (user, product, rating, etc.) have been estimated during training, recommendations can be made to users by computing their possible interactions with each product through the latent variables.


Netflix Prize Contest


Netflix Prize Task

• Training data: Quadruples of the form (user, movie, rating, time)

• For our purpose here, we only use triplets, i.e., (user, movie, rating)
  – For example, (132456, 13546, 4) means that the user with ID 132456 gave the movie with ID 13546 a rating of 4 (out of 5).

• Testing: predict the rating of each triplet: (user, movie, ?)


SVD factorization

• The technique discussed here is based on the SVD method given by
  – Simon Funk at his blog site,
  – the derivation of Funk’s method described by Wagman in the Netflix forums,
  – the paper by Takacs et al.

• The method was later improved by Koren et al., Paterek and several other researchers.


Intuitive Idea


Simon Funk’s SVD method


where U = [u1, u2, …, uI] and M = [m1, m2, …, mJ]


SVD method (contd)

• Let us use K = 90 latent aspects (K needs to be set experimentally).

• Then, each movie will be described by only ninety aspect values indicating how much that movie exemplifies each aspect.

• Correspondingly, each user is also described by ninety aspect values indicating how much he/she prefers each aspect.


SVD method (contd)

• To combine these together into a rating, we multiply each user preference by the corresponding movie aspect, and then sum them up to give a rating to indicate how much that user likes that movie:

$$
r_{ij} = \mathbf{u}_i^T \mathbf{m}_j = \sum_{k=1}^{K} u_{ki}\, m_{kj}
$$

  – U = [u1, u2, …, uI] and M = [m1, m2, …, mJ]

• Using SVD, we can perform the task
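The multiply-and-sum step is an ordinary dot product, as this small sketch shows. The function name, the choice of K = 3, and the aspect labels are purely illustrative assumptions.

```python
def predict(u_i, m_j):
    """r_ij = sum over k of u_ki * m_kj for one user vector and one movie vector."""
    if len(u_i) != len(m_j):
        raise ValueError("user and movie vectors must have the same length K")
    return sum(u_k * m_k for u_k, m_k in zip(u_i, m_j))

# Hypothetical K = 3 aspects, e.g. [action, comedy, romance].
user = [1.0, 0.5, 0.0]   # how much the user prefers each aspect
movie = [4.0, 2.0, 3.0]  # how much the movie exemplifies each aspect
```

Here `predict(user, movie)` gives 1.0 * 4.0 + 0.5 * 2.0 + 0.0 * 3.0 = 5.0; the movie's romance aspect contributes nothing because this user's preference for it is zero.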


SVD method (contd)

• SVD is a mathematical way to find these two smaller matrices that minimize the resulting approximation error, the mean square error (MSE).

• We can use the resulting matrices U and M to predict the ratings in the test set.


SVD method (contd)

• To minimize the error, the gradient descent approach is used.

• For gradient descent, we take the partial derivative of the square error with respect to each parameter, i.e., with respect to each $u_{ki}$ and $m_{kj}$:

$$
\frac{\partial (e_{ij})^2}{\partial u_{ki}} = 2\, e_{ij}\, \frac{\partial e_{ij}}{\partial u_{ki}}
$$
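The chain-rule step can be carried one stage further under an assumption about the error term. The surviving text does not define $e_{ij}$; the derivation below assumes it is the observed rating minus the factor-model prediction, written here with an assumed symbol $p_{ij}$ for the prediction.

```latex
\begin{aligned}
p_{ij} &= \sum_{k=1}^{K} u_{ki}\, m_{kj}, \qquad e_{ij} = r_{ij} - p_{ij}, \\
\frac{\partial e_{ij}}{\partial u_{ki}} &= -\,m_{kj}, \qquad
\frac{\partial e_{ij}}{\partial m_{kj}} = -\,u_{ki}, \\
\frac{\partial (e_{ij})^2}{\partial u_{ki}} &= 2\, e_{ij}\, \frac{\partial e_{ij}}{\partial u_{ki}} = -2\, e_{ij}\, m_{kj}, \\
\frac{\partial (e_{ij})^2}{\partial m_{kj}} &= 2\, e_{ij}\, \frac{\partial e_{ij}}{\partial m_{kj}} = -2\, e_{ij}\, u_{ki}.
\end{aligned}
```

Under that assumption, moving each parameter against its gradient increases $u_{ki}$ in proportion to $e_{ij}\, m_{kj}$ and $m_{kj}$ in proportion to $e_{ij}\, u_{ki}$.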


The final update rules

• By the same reasoning, we can also compute the update rule for mkj.

• Finally, we have both rules

• The final prediction uses Eq. (11)
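The update rules themselves are not reproduced in the text above. A commonly used form, assuming the error $e_{ij}$ is the observed rating minus the dot-product prediction and γ is a learning rate, is $u_{ki} \leftarrow u_{ki} + 2\gamma\, e_{ij}\, m_{kj}$ and $m_{kj} \leftarrow m_{kj} + 2\gamma\, e_{ij}\, u_{ki}$. The Python sketch below applies these rules with stochastic gradient descent. It is not Funk's actual code: the function names, hyperparameters, random initialization, and the absence of regularization are all assumptions, and it updates all K factors together for brevity.

```python
import random

def train_mf(triples, n_users, n_items, k=2, lr=0.01, epochs=500, seed=0):
    """Fit user factors U and item factors M by SGD on squared error.

    triples: list of (user_index, item_index, rating).
    Update per observed rating (assumed form, no regularisation):
        u_ki += 2 * lr * e_ij * m_kj
        m_kj += 2 * lr * e_ij * u_ki   (using the old u_ki)
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    M = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    data = list(triples)
    for _ in range(epochs):
        rng.shuffle(data)  # stochastic: visit ratings one at a time, in random order
        for i, j, r in data:
            err = r - sum(U[i][f] * M[j][f] for f in range(k))
            for f in range(k):
                u_old, m_old = U[i][f], M[j][f]
                U[i][f] = u_old + 2 * lr * err * m_old
                M[j][f] = m_old + 2 * lr * err * u_old
    return U, M

def mse(triples, U, M):
    """Mean squared error of the factor model on the given triples."""
    errs = [(r - sum(a * b for a, b in zip(U[i], M[j]))) ** 2 for i, j, r in triples]
    return sum(errs) / len(errs)
```

Each update touches only the one user vector and one movie vector involved in the current rating, which is what makes the stochastic variant cheap on a large, sparse rating set.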


Further improvements

• The two basic rules need some improvements to make them work well.

• There is also some pre-processing.

• Time was also added later.

• Etc.

• Note:
  – Funk used stochastic gradient descent,
  – not batch (global) gradient descent.
