Modeling Social Data, Lecture 8: Recommendation Systems


Page 1: Modeling Social Data, Lecture 8: Recommendation Systems

Recommendation Systems
APAM E4990

Modeling Social Data

Jake Hofman

Columbia University

March 13, 2015


Page 2: Modeling Social Data, Lecture 8: Recommendation Systems

Personalized recommendations



Page 4: Modeling Social Data, Lecture 8: Recommendation Systems

http://netflixprize.com


Page 5: Modeling Social Data, Lecture 8: Recommendation Systems

http://netflixprize.com/rules


Page 6: Modeling Social Data, Lecture 8: Recommendation Systems

http://netflixprize.com/faq


Page 7: Modeling Social Data, Lecture 8: Recommendation Systems

Netflix prize: results

http://en.wikipedia.org/wiki/Netflix_Prize


Page 8: Modeling Social Data, Lecture 8: Recommendation Systems

Netflix prize: results

See [TJB09] and [Kor09] for more gory details.


Page 9: Modeling Social Data, Lecture 8: Recommendation Systems

http://bit.ly/beyond5stars


Page 10: Modeling Social Data, Lecture 8: Recommendation Systems

Recommendation systems

High-level approaches:

• Content-based methods (e.g., w_{genre: thrillers} = +2.3, w_{director: coen brothers} = +1.7)

• Collaborative methods (e.g., “Users who liked this also liked”)


Page 11: Modeling Social Data, Lecture 8: Recommendation Systems

Netflix prize: data

(userid, movieid, rating, date)


Page 12: Modeling Social Data, Lecture 8: Recommendation Systems

Netflix prize: data

(movieid, year, title)
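
As a rough illustration, data in this shape might be loaded and joined with pandas as below; the file names and exact column layout here are assumptions, not the actual contest format.

    import pandas as pd

    # Hypothetical file names; the real Netflix prize data ships in a different layout.
    ratings = pd.read_csv("ratings.csv", names=["userid", "movieid", "rating", "date"],
                          parse_dates=["date"])
    movies = pd.read_csv("movies.csv", names=["movieid", "year", "title"])

    # Attach titles to ratings for convenience.
    ratings = ratings.merge(movies, on="movieid")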



Page 14: Modeling Social Data, Lecture 8: Recommendation Systems

Collaborative filtering

Memory-based (e.g., k-nearest neighbors)

Model-based (e.g., matrix factorization)

http://research.yahoo.com/pub/2859


Page 15: Modeling Social Data, Lecture 8: Recommendation Systems

Problem statement

• Given a set of past ratings R_ui that user u gave item i
  • Users may explicitly assign ratings, e.g., R_ui ∈ [1, 5] is the number of stars for a movie rating
  • Or we may infer implicit ratings from user actions, e.g., R_ui = 1 if u purchased i; otherwise R_ui = ?

• Make recommendations of several forms
  • Predict unseen item ratings for a particular user
  • Suggest items for a particular user
  • Suggest items similar to a particular item
  • ...

• Compare to natural baselines (sketched below)
  • Guess the global average for item ratings
  • Suggest globally popular items
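
A minimal pandas sketch of the two baselines, assuming a ratings DataFrame with userid, movieid, and rating columns (as in the loading sketch earlier):

    # Baseline 1: predict the global average rating for every (user, item) pair.
    global_avg = ratings["rating"].mean()

    # Baseline 2: suggest globally popular items, here ranked by number of ratings.
    popular = (ratings.groupby("movieid")["rating"]
                      .agg(["count", "mean"])
                      .sort_values("count", ascending=False))
    print(global_avg)
    print(popular.head(10))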



Page 18: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors

Key intuition: Take a local popularity vote amongst “similar” users


Page 19: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: User similarity

Quantify similarity as a function of users’ past ratings, e.g.

• Fraction of items u and v have in common

S_{uv} = \frac{|r_u \cap r_v|}{|r_u \cup r_v|} = \frac{\sum_i R_{ui} R_{vi}}{\sum_i (R_{ui} + R_{vi} - R_{ui} R_{vi})}    (1)

Retain top-k most similar neighbors v for each user u
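
A small Python sketch of equation (1) and the top-k step, assuming items_by_user maps each user to the set of items they have rated (a hypothetical structure, not part of the slides):

    def jaccard(items_u, items_v):
        # |r_u ∩ r_v| / |r_u ∪ r_v|, treating each user's ratings as a set of item ids
        union = items_u | items_v
        return len(items_u & items_v) / len(union) if union else 0.0

    def top_k_neighbors(u, items_by_user, k=10):
        # Retain the k most similar neighbors v for user u
        scores = {v: jaccard(items_by_user[u], items_by_user[v])
                  for v in items_by_user if v != u}
        return sorted(scores, key=scores.get, reverse=True)[:k]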


Page 20: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: User similarity

Quantify similarity as a function of users’ past ratings, e.g.

• Angle between rating vectors

S_{uv} = \frac{r_u \cdot r_v}{|r_u|\,|r_v|} = \frac{\sum_i R_{ui} R_{vi}}{\sqrt{\sum_i R_{ui}^2}\,\sqrt{\sum_j R_{vj}^2}}    (1)

Retain top-k most similar neighbors v for each user u
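
The same idea for equation (1) on this slide, sketched with numpy on dense rating vectors where unrated items are stored as zeros (an assumption of this toy example):

    import numpy as np

    def cosine_similarity(r_u, r_v):
        # r_u · r_v / (|r_u| |r_v|); returns 0 if either user has no ratings
        denom = np.linalg.norm(r_u) * np.linalg.norm(r_v)
        return float(r_u @ r_v / denom) if denom > 0 else 0.0

    # Toy example with 5 items
    r_u = np.array([5, 0, 3, 0, 1], dtype=float)
    r_v = np.array([4, 0, 0, 2, 1], dtype=float)
    print(cosine_similarity(r_u, r_v))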


Page 21: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: Predicted ratings

Predict unseen ratings \hat{R}_{ui} as a weighted vote over u's neighbors' ratings for item i:

\hat{R}_{ui} = \frac{\sum_v R_{vi} S_{uv}}{\sum_v S_{uv}}    (2)
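
A sketch of the weighted vote in equation (2); the neighbors list, the similarity dict S, and the ratings dict R are hypothetical structures standing in for whatever the earlier steps produced:

    def predict_rating(u, i, neighbors, S, R):
        # Weighted vote over u's neighbors' ratings for item i, per equation (2)
        num = den = 0.0
        for v in neighbors:
            if (v, i) in R:                      # only neighbors who actually rated i
                num += R[(v, i)] * S[(u, v)]
                den += S[(u, v)]
        return num / den if den > 0 else None    # fall back to a baseline when no neighbor rated i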


Page 22: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: Practical notes

We expect most pairs of users to have nothing in common, so rather than comparing all pairs, calculate similarities by iterating over items:

for each item i:
    for all pairs of users u, v that have rated i:
        calculate S_uv (if not already calculated)
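
A runnable version of the loop above, sketched in Python; users_by_item, items_by_user, and similarity are assumed inputs (e.g., the Jaccard function from earlier):

    from itertools import combinations

    def pairwise_similarities(users_by_item, items_by_user, similarity):
        # Compute S_uv only for pairs of users who co-rated at least one item
        S = {}
        for i, raters in users_by_item.items():
            for u, v in combinations(sorted(raters), 2):
                if (u, v) not in S:              # skip pairs already calculated
                    S[(u, v)] = similarity(items_by_user[u], items_by_user[v])
        return S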


Page 23: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: Practical notes

Alternatively, we can make recommendations using an item-based approach [LSY03] (sketched below):

• Compute similarities S_ij between all pairs of items

• Predict ratings with a weighted vote: \hat{R}_{ui} = \sum_j R_{uj} S_{ij} / \sum_j S_{ij}
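
A sketch of the item-based vote, assuming S_items holds item-item similarities and R_user holds the target user's own ratings keyed by item (both hypothetical names):

    def predict_item_based(i, R_user, S_items):
        # Weighted vote over the items j the user has rated, weighted by similarity to i
        num = den = 0.0
        for j, r_uj in R_user.items():
            s_ij = S_items.get((i, j), S_items.get((j, i), 0.0))
            num += r_uj * s_ij
            den += s_ij
        return num / den if den > 0 else None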


Page 24: Modeling Social Data, Lecture 8: Recommendation Systems

k-nearest neighbors: Practical notes

Several (relatively) simple ways to scale:

• Sample a subset of ratings for each user (by, e.g., recency; sketched below)

• Use MinHash to cluster users [DDGR07]

• Distribute calculations with MapReduce
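
For the first point, a short pandas sketch that keeps only each user's 50 most recent ratings (the cutoff of 50 is arbitrary), assuming the ratings DataFrame from earlier with a date column:

    recent = (ratings.sort_values("date")
                     .groupby("userid")
                     .tail(50))     # keep each user's 50 most recent ratings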


Page 25: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization

Key intuition: Model item attributes as belonging to a set of unobserved “topics” and user preferences across these “topics”


Page 26: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Linear model

Start with a simple linear model:

\hat{R}_{ui} = \underbrace{b_0}_{\text{global average}} + \underbrace{b_u}_{\text{user bias}} + \underbrace{b_i}_{\text{item bias}}    (3)
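
A crude sketch of estimating the terms in equation (3) from the ratings DataFrame by simple averages (fitting them jointly by least squares is the more principled route; this is only an illustration):

    b0 = ratings["rating"].mean()                              # global average
    b_user = ratings.groupby("userid")["rating"].mean() - b0   # user bias
    b_item = ratings.groupby("movieid")["rating"].mean() - b0  # item bias

    def predict_baseline(u, i):
        # b0 + b_u + b_i, defaulting to 0 bias for unseen users or items
        return b0 + b_user.get(u, 0.0) + b_item.get(i, 0.0)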


Page 27: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Linear model

For example, we might predict that a harsh critic would score a popular movie as

\hat{R}_{ui} = \underbrace{3.6}_{\text{global average}} + \underbrace{(-0.5)}_{\text{user bias}} + \underbrace{0.8}_{\text{item bias}}    (3)

             = 3.9    (4)


Page 28: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Low-rank approximation

Add an interaction term:

\hat{R}_{ui} = \underbrace{b_0}_{\text{global average}} + \underbrace{b_u}_{\text{user bias}} + \underbrace{b_i}_{\text{item bias}} + \underbrace{W_{ui}}_{\text{user-item interaction}}    (5)

where W_{ui} = p_u \cdot q_i = \sum_k P_{uk} Q_{ik}

• P_{uk} is user u's preference for topic k

• Q_{ik} is item i's association with topic k
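
A numpy sketch of equation (5) on random toy factors, just to make the shapes concrete; all sizes and values here are made up:

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_items, n_topics = 100, 50, 10                 # toy sizes

    P = rng.normal(scale=0.1, size=(n_users, n_topics))      # P[u, k]: user u's preference for topic k
    Q = rng.normal(scale=0.1, size=(n_items, n_topics))      # Q[i, k]: item i's association with topic k
    b0 = 3.6
    b_u = rng.normal(scale=0.1, size=n_users)                # user biases
    b_i = rng.normal(scale=0.1, size=n_items)                # item biases

    # Predicted ratings for all (u, i) pairs: biases plus the low-rank term P Q^T
    R_hat = b0 + b_u[:, None] + b_i[None, :] + P @ Q.T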


Page 29: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Loss function

Measure quality of model fit with squared-loss:

L = \sum_{(u,i)} \left( \hat{R}_{ui} - R_{ui} \right)^2    (6)

  = \sum_{(u,i)} \left( [PQ^T]_{ui} - R_{ui} \right)^2    (7)
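
Equations (6)-(7) in a few lines of numpy, assuming dense user-by-item arrays and a boolean mask marking the observed ratings:

    import numpy as np

    def squared_loss(R, R_hat, observed):
        # Sum of squared errors over observed (u, i) pairs only
        return float(np.sum((R_hat[observed] - R[observed]) ** 2))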


Page 30: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Optimization

The loss is non-convex in (P, Q), so there is no closed-form solution and no guarantee of finding a global minimum

Instead we can optimize L iteratively, e.g.:

• Alternating least squares: update each row of P holding Q fixed, and vice-versa

• Stochastic gradient descent: update individual rows p_u and q_i for each observed R_ui


Page 31: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Alternating least squares

L is convex in the rows of P with Q fixed, and in the rows of Q with P fixed, so we alternate solutions to the normal equations:

p_u = \left[ Q^{(u)T} Q^{(u)} \right]^{-1} Q^{(u)T} r^{(u)}    (8)

q_i = \left[ P^{(i)T} P^{(i)} \right]^{-1} P^{(i)T} r^{(i)}    (9)

where:

• Q^{(u)} is the item association matrix restricted to items rated by user u

• P^{(i)} is the user preference matrix restricted to users that have rated item i

• r^{(u)} are the ratings by user u and r^{(i)} are the ratings on item i
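
One alternating-least-squares pass over users, sketched with numpy per equation (8); biases and regularization are omitted to keep it small, and the symmetric item update of equation (9) would swap the roles of P and Q:

    import numpy as np

    def als_update_users(R, observed, P, Q):
        # R: user-by-item ratings, observed: boolean mask of rated entries,
        # P: user factors (updated in place), Q: item factors (held fixed)
        for u in range(P.shape[0]):
            rated = observed[u]                  # items rated by user u
            if not rated.any():
                continue
            Qu, ru = Q[rated], R[u, rated]       # Q^(u) and r^(u) from the slide
            P[u] = np.linalg.solve(Qu.T @ Qu, Qu.T @ ru)
        return P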


Page 32: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Stochastic gradient descent

Alternatively, we can avoid inverting matrices by taking steps in the direction of the negative gradient for each observed rating:

p_u \leftarrow p_u - \eta \frac{\partial L}{\partial p_u} = p_u + \eta \left( R_{ui} - \hat{R}_{ui} \right) q_i    (10)

q_i \leftarrow q_i - \eta \frac{\partial L}{\partial q_i} = q_i + \eta \left( R_{ui} - \hat{R}_{ui} \right) p_u    (11)

for some step size \eta (absorbing constant factors into \eta)
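
A minimal sketch of one pass of these updates in Python, again omitting biases; triples is an assumed iterable of (u, i, R_ui) observations, and P, Q are numpy factor matrices as above:

    def sgd_epoch(triples, P, Q, eta=0.01):
        # One pass over the observed ratings, updating the corresponding rows of P and Q
        for u, i, r_ui in triples:
            err = r_ui - P[u] @ Q[i]             # R_ui - R̂_ui for this observation
            # update both rows from the old values (tuple assignment evaluates the RHS first)
            P[u], Q[i] = P[u] + eta * err * Q[i], Q[i] + eta * err * P[u]
        return P, Q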


Page 33: Modeling Social Data, Lecture 8: Recommendation Systems

Matrix factorization: Practical notes

Several ways to scale:

• Distribute matrix operations with MapReduce [GHNS11]

• Parallelize stochastic gradient descent [ZWSL10]

• Expectation-maximization for pLSI with MapReduce [DDGR07]


Page 34: Modeling Social Data, Lecture 8: Recommendation Systems

Datasets

• MovieLens: http://www.grouplens.org/node/12

• Reddit: http://bit.ly/redditdata

• CU “million songs”: http://labrosa.ee.columbia.edu/millionsong/

• Yahoo Music KDD Cup: http://kddcup.yahoo.com/

• AudioScrobbler: http://bit.ly/audioscrobblerdata

• Delicious: http://bit.ly/deliciousdata

• . . .


Page 35: Modeling Social Data, Lecture 8: Recommendation Systems

References I

[DDGR07] A. S. Das, M. Datar, A. Garg, and S. Rajaram. Google news personalization: scalable online collaborative filtering. Page 280, 2007.

[GHNS11] R. Gemulla, P. J. Haas, E. Nijkamp, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. 2011.

[Kor09] Y. Koren. The BellKor solution to the Netflix Grand Prize. Pages 1-10, Aug 2009.

[LSY03] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76-80, 2003.


Page 36: Modeling Social Data, Lecture 8: Recommendation Systems

References II

[TJB09] A. Töscher, M. Jahrer, and R. M. Bell. The BigChaos solution to the Netflix Grand Prize. 2009.

[ZWSL10] M. Zinkevich, M. Weimer, A. Smola, and L. Li. Parallelized stochastic gradient descent. In Neural Information Processing Systems (NIPS), 2010.
