
HyPER: A Flexible and Extensible Probabilistic Framework for Hybrid Recommender Systems

Pigi Kouki, Shobeir Fakhraei, James Foulds, Magdalini Eirinaki, Lise Getoor

University of California, Santa Cruz; University of Maryland, College Park; San Jose State University

Motivation

• Increasing amount of data useful for recommendations: content, social, demographic, ratings

Multiple Data Sources

• Content
– [Gunawardana and Meek, RecSys 2009]
– [Forbes and Zhu, RecSys 2011]
– [de Campos et al., IJAR 51(7) 2010]

• Social relationships
– [Ma et al., WSDM 2011]
– [Liu et al., DSS 55(3) 2013]

• Review text
– [McAuley & Leskovec, RecSys 2013]
– [Ling et al., RecSys 2014]

• Tags and labels (e.g. #cool, #neat, #ok, #sucks)
– [Guy et al., SIGIR 2010]

• Feedback
– [Sedhain et al., RecSys 2014]

Combining ratings with other data sources improves performance

Multiple Recommenders

Combining predictions of multiple recommenders also improves performance

“Predictive accuracy is substantially improved when blending multiple predictors” [Bell et al., The BellKor Solution to the Netflix Prize, 2007]

See also:
• [Jahrer et al., KDD 2010]
• [Burke, In The Adaptive Web, 2007]

Desiderata for Hybrid Systems

• To get the best performance, we should make use of all available data sources and algorithms

• We need a framework that is:
– General
  • Combines arbitrary data modalities
  • Combines multiple recommenders
  • Problem- and data-agnostic
– Extensible to new information sources/recommenders
– Scalable to large data sets

General Hybrid Recommenders in the Literature

• Existing hybrid systems, though powerful, typically fall short on either generality, extensibility, or scalability
– Often combine collaborative and/or content-based methods with each other or just one other data modality (cf. previous slides)
– Some systems can leverage heterogeneous data
  • [Gemmell et al. 2012, Burke et al. 2014, Yu et al. 2014]

• Probabilistic graphical modeling approaches are typically more general, less scalable
– Bayesian networks [de Campos et al., IJAR 51(7) 2010]
– Markov logic networks [Hoxha & Rettinger, ICMLA 2013]


Our Approach

• A general, extensible, scalable recommender framework

• Leverages advances in statistical relational learning
– Probabilistic soft logic [Bach et al., UAI 2013, ArXiv 2015]

• Inspired by recent work in drug-target interaction prediction [Fakhraei et al., Transactions on Computational Biology and Bioinformatics 11(5) 2014]

We propose HyPER: Hybrid Probabilistic Extensible Recommender

Hybrid Modeling with HyPER

[Figure, three builds: (1) Data Source → Recommender → Predicted Ratings; (2) Data Sources 1…N → Recommender → Predicted Ratings; (3) Data Sources 1…N → Recommenders 1…M → Predicted Ratings]

HyPER: High-Level Approach

• User-item ratings viewed as a weighted bipartite graph

• Build hybrid model by adding links to encode additional information
– multiple user and item similarities, social information, …

• Predict ratings by reasoning over the graph, via a graphical model

Extended Recommendation Graph

Modeling and Reasoning over the Graph

• Hinge-loss Markov random fields (HL-MRFs) [Bach et al., UAI 2013]
– Exact, efficient, and scalable inference
– Continuous random variables
– Models defined by PSL programs

• Probabilistic Soft Logic (PSL) [Bach et al., ArXiv 2015]
– Statistical relational learning system
– Logical probabilistic programming interface
– Templating language for HL-MRFs


Hinge-loss Markov Random Fields

Conditional random field over continuous random variables between 0 and 1

Feature functions are hinge-loss functions, each applied to a linear function of the variables

Hinge losses encode the distance to satisfaction for each instantiated rule
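For reference, the HL-MRF density in the cited papers by Bach et al. has the form P(Y | X) ∝ exp(−Σ_j w_j · max{ℓ_j(Y, X), 0}^p_j), where each ℓ_j is linear and p_j is 1 or 2. The following is a minimal Python sketch of that energy on a made-up two-variable model; the function names and the toy potentials are illustrative assumptions, not part of PSL.

```python
def hinge(value):
    """Hinge: zero when the linear function is non-positive."""
    return max(value, 0.0)


def energy(y, potentials, power=1):
    """Weighted sum of hinge-loss potentials (the negative log-density up to a constant).

    y          -- list of variable values, each in [0, 1]
    potentials -- list of (weight, linear_function) pairs (hypothetical layout)
    power      -- 1 for linear hinges, 2 for squared hinges
    """
    return sum(w * hinge(lin(y)) ** power for w, lin in potentials)


# Toy model over y = [y0, y1]; both potentials are invented for illustration.
potentials = [
    (2.0, lambda y: 0.8 - y[0]),   # penalised when y0 falls below 0.8
    (1.0, lambda y: y[0] - y[1]),  # penalised when y1 falls below y0
]

e_good = energy([0.8, 0.9], potentials)  # both potentials satisfied
e_bad = energy([0.2, 0.1], potentials)   # both potentials violated
```

Lower energy means higher probability, so the first assignment is preferred over the second.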

Efficient Inference in HL-MRFs

• Energy function is convex, can find a global MAP state

• The alternating direction method of multipliers (ADMM) is used for efficient and scalable inference
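Because the energy is convex, any method that reaches a stationary point inside the [0, 1] box finds a global MAP state. The sketch below is not ADMM: it substitutes plain projected gradient descent on squared hinge losses, purely to illustrate the convexity argument on a toy problem. The tuple layout `(weight, coeffs, const)` and all values are assumptions made for this example.

```python
def map_inference(potentials, n_vars, step=0.1, iters=200):
    """Minimise sum_j w_j * max(0, coeffs_j . y + const_j)^2 over y in [0, 1]^n.

    Projected gradient descent, used here as a simple stand-in for ADMM.
    """
    y = [0.5] * n_vars
    for _ in range(iters):
        grad = [0.0] * n_vars
        for weight, coeffs, const in potentials:
            violation = const + sum(c * v for c, v in zip(coeffs, y))
            if violation > 0.0:  # the hinge is active only when violated
                for i, c in enumerate(coeffs):
                    grad[i] += 2.0 * weight * violation * c
        # Gradient step followed by projection back onto [0, 1].
        y = [min(1.0, max(0.0, v - step * g)) for v, g in zip(y, grad)]
    return y


# One unknown rating y0, pulled up towards 0.9 by one potential
# and down towards 0.5 by another (equal weights, invented numbers).
toy = [
    (1.0, [-1.0], 0.9),   # max(0, 0.9 - y0)^2
    (1.0, [1.0], -0.5),   # max(0, y0 - 0.5)^2
]
y_map = map_inference(toy, n_vars=1)
```

With equal weights the two pulls balance midway, so `y_map[0]` settles near 0.7.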

Probabilistic Soft Logic

• Statistical relational learning language
• Uses first-order logical rules
• Templates HL-MRFs

w : LikesGenre(U, G) && IsGenre(M, G) → Rating(U, M)

(w is the rule weight; LikesGenre, IsGenre, and Rating are predicates; && and → are logical operators)

• Converts rules to hinge-loss potentials
• PSL program = rules + data
• Open source: http://psl.umiacs.umd.edu

Rule: LikesGenre(U, G) && IsGenre(M, G) → Rating(U, M)

Hinge-loss: max{LikesGenre(U, G) + IsGenre(M, G) - Rating(U, M) - 1, 0}
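The conversion above follows the Łukasiewicz relaxation used by PSL: a conjunction of n atoms has truth value max(0, sum − (n − 1)), and a rule's distance to satisfaction is how far the body's truth exceeds the head's. A small Python sketch, with hypothetical function names and made-up truth values:

```python
def distance_to_satisfaction(body, head):
    """Hinge-loss for the rule  body[0] && body[1] && ... -> head.

    body -- truth values in [0, 1] of the atoms in the rule body
    head -- truth value in [0, 1] of the head atom
    """
    return max(0.0, sum(body) - (len(body) - 1) - head)


# LikesGenre(U, G) = 0.9, IsGenre(M, G) = 1.0 (illustrative values).
violated = distance_to_satisfaction([0.9, 1.0], head=0.4)    # rating too low
satisfied = distance_to_satisfaction([0.9, 1.0], head=0.95)  # rating high enough
```

For a two-atom body this reduces to the slide's max{LikesGenre + IsGenre − Rating − 1, 0}.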

Recommendations with HyPER

• Similar items get similar ratings from a user
– similarity measures: e.g. cosine, adjusted cosine, Pearson, content

[Figure: SimilarItems(i1, i2); Rating(u, i1) = 5; Rating(u, i2) = ?]

SimilarItems_sim(i1, i2) && Rating(u, i1) → Rating(u, i2)
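To make the templating concrete, here is a hedged Python sketch of grounding the item-similarity rule over data. It assumes ratings have been rescaled to [0, 1], as HL-MRF variables require; the data layout, function name, and values are invented for illustration and do not reflect how PSL grounds rules internally.

```python
def ground_item_rule(similar_items, observed, targets):
    """Instantiate SimilarItems(i1, i2) && Rating(u, i1) -> Rating(u, i2).

    similar_items -- {(i1, i2): similarity in [0, 1]}
    observed      -- {(user, item): rating in [0, 1]}
    targets       -- set of (user, item) pairs whose rating is unknown

    Returns (target, c) pairs; each stands for the potential max(0, c - y),
    where y is the unknown rating and c = similarity + known rating - 1.
    """
    groundings = []
    for (i1, i2), sim in similar_items.items():
        for (user, item), rating in observed.items():
            if item == i1 and (user, i2) in targets:
                groundings.append(((user, i2), sim + rating - 1.0))
    return groundings


groundings = ground_item_rule(
    similar_items={("i1", "i2"): 0.8},
    observed={("u", "i1"): 1.0},   # a 5-star rating rescaled to 1.0
    targets={("u", "i2")},
)
```

Here the single ground rule penalises any predicted Rating(u, i2) below roughly 0.8.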

• Similar users give similar ratings to an item
– similarity measures: e.g. cosine, Pearson

[Figure: SimilarUsers(u1, u2); Rating(u1, i) = 4; Rating(u2, i) = ?]

SimilarUsers_sim(u1, u2) && Rating(u1, i) → Rating(u2, i)
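The SimilarUsers predicate has to be populated from some similarity measure. As one possibility, cosine similarity between two users can be computed as below; restricting the computation to co-rated items is an assumption of this sketch (other variants treat missing ratings as zeros), and the sample ratings are made up.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two users over the items both have rated.

    ratings_a, ratings_b -- {item: rating} dictionaries
    Returns 0.0 when there is no overlap or a vector has zero norm.
    """
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


u1 = {"item_a": 4.0, "item_b": 2.0, "item_c": 5.0}
u2 = {"item_a": 4.0, "item_b": 2.0}
u3 = {"item_z": 3.0}

sim_12 = cosine_similarity(u1, u2)  # identical on the co-rated items
sim_13 = cosine_similarity(u1, u3)  # no items in common
```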

• Mean-centering priors

AverageUserRating(u) → Rating(u, i)
AverageItemRating(i) → Rating(u, i)

• Additional data sources
– Social network links

Friends(u1, u2) && Rating(u1, i) → Rating(u2, i)

• Leveraging existing recommenders
– e.g. matrix factorization, item-based

RatingRecommender(u, i) → Rating(u, i)

Extensible to new data/algorithms – just add rules!

Balancing the Rules

• Balancing done through weights w_j

• Higher w_j indicates a more important rule

• Weight learning by approximating a gradient step in the conditional log-likelihood
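In the cited HL-MRF work, the gradient of the conditional log-likelihood with respect to w_j is E[Φ_j(Y, X)] − Φ_j(Y, X), where Φ_j sums the hinge losses of all groundings of rule j, and the expectation is commonly approximated by its value at the MAP state. The Python sketch below shows one such approximate step; the learning rate, function name, and numbers are assumptions for illustration, not the settings used in the talk.

```python
def weight_update(weights, phi_observed, phi_map, learning_rate=0.1):
    """One approximate gradient-ascent step on the conditional log-likelihood.

    weights      -- current rule weights w_j
    phi_observed -- total hinge loss of each rule on the training ratings
    phi_map      -- total hinge loss of each rule at the current MAP state,
                    standing in for the intractable expectation
    Weights are clipped at zero so they stay non-negative.
    """
    return [
        max(0.0, w + learning_rate * (p_map - p_obs))
        for w, p_obs, p_map in zip(weights, phi_observed, phi_map)
    ]


# Rule 0 is violated less by the data than by the model's prediction,
# so its weight goes up; rule 1 is the opposite, so its weight goes down.
new_weights = weight_update(
    weights=[1.0, 1.0],
    phi_observed=[0.2, 0.9],
    phi_map=[0.5, 0.4],
)
```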

Experimental Validation

• Yelp academic dataset (https://www.yelp.com/academic_dataset)
– ~34k users, ~3.6k items, ~99k ratings
– ~81k friendships
– 514 business categories

• Last.fm (http://grouplens.org/datasets/hetrec-2011/)
– ~1.8k users, ~17k items, ~92k ratings
– ~12k friendships
– ~9.7k artist tags

• Evaluation metrics: RMSE, MAE
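For readers unfamiliar with the two metrics, both compare predicted and true ratings pairwise: RMSE is the square root of the mean squared error, and MAE is the mean absolute error. A short Python sketch with made-up ratings:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error; penalises large errors more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error; every unit of error counts equally."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [3.0, 4.0]
predicted = [5.0, 4.0]
example_rmse = rmse(actual, predicted)  # sqrt((4 + 0) / 2)
example_mae = mae(actual, predicted)    # (2 + 0) / 2
```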

Baselines

• Collaborative filtering systems
– Item-based, cf. [Ning et al., In Recommender Systems Handbook, 2015]
– Matrix factorization (MF), cf. [Koren et al., IEEE Computer 42(8) 2009]
– Bayesian probabilistic matrix factorization (BPMF) [Salakhutdinov & Mnih, ICML 2008]

• Hybrid systems
– Naïve hybrid (averaged predictions)
– BPMF with social relations and content (BPMF-SRIC) [Liu et al., DSS 55(3) 2013]
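The naïve hybrid baseline is described only as "averaged predictions". One plausible reading, sketched in Python below, is an unweighted mean of each recommender's prediction for the same user-item pair; the dictionary layout and the restriction to pairs predicted by every recommender are assumptions of this sketch.

```python
def naive_hybrid(predictions):
    """Average the predictions of several recommenders.

    predictions -- list of {(user, item): predicted rating} dictionaries,
                   one per recommender
    Only pairs predicted by every recommender are kept.
    """
    shared = set.intersection(*(set(p) for p in predictions))
    return {
        pair: sum(p[pair] for p in predictions) / len(predictions)
        for pair in shared
    }


# Hypothetical outputs of two base recommenders.
item_based = {("u1", "i1"): 4.0, ("u1", "i2"): 2.0}
matrix_fact = {("u1", "i1"): 3.0, ("u2", "i1"): 5.0}

blended = naive_hybrid([item_based, matrix_fact])
```

Unlike the weighted rules in HyPER, this blend gives every recommender the same influence regardless of how reliable it is.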

HyPER vs Baselines

• HyPER outperforms all other models in both datasets
• Results are statistically significant

HyPER Submodels: Mean-centering

• HyPER combined model beats individual rules

HyPER Submodels: User-based

• HyPER combined model beats/matches best individual rules
• Similar story for item-based, content & social

Combining the Baselines

• HyPER can combine different recommenders effectively
• Results are statistically significantly better

HyPER (All Rules)

• Combining all rules achieves the best performance in both datasets

Scaling to Large Datasets

• Parallel implementation for inference and learning based on ADMM [Bach et al., UAI 2013]

• Scaling to big-data applications:
– perform inference in parallel on densely connected subgraphs of the original graph
– fully distributed implementation of ADMM

Conclusions

• HyPER is a general-purpose, extensible framework for hybrid recommender systems

• With HyPER, practitioners can define custom hybrid models for using all available data/algorithms, via logical rules in PSL

• HyPER outperforms existing techniques on two popular datasets


Thank you for your attention!

HyPER Submodels – Item-based, Content & Social

References

X. Ning, C. Desrosiers, and G. Karypis. A comprehensive survey of neighborhood-based recommendation methods. In Recommender Systems Handbook, 2nd edition. Springer, 2015.

S. Fakhraei, B. Huang, L. Raschid, and L. Getoor. Network-based drug-target interaction prediction with probabilistic soft logic. Transactions on Computational Biology and Bioinformatics, 11(5), 2014.

J. Liu, C. Wu, and W. Liu. Bayesian probabilistic matrix factorization with social relations and item contents for recommendation. Decision Support Systems, 55(3), 2013.

R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In ICML, 2008.

Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8), 2009.

A. Gunawardana and C. Meek. A unified approach to building hybrid recommender systems. In RecSys, 2009.

R. Burke. Hybrid web recommender systems. In The Adaptive Web. Springer, 2007.

L. de Campos, J. Fernandez-Luna, J. Huete, and M. Rueda-Morales. Combining content-based and collaborative recommendations: A hybrid approach based on Bayesian networks. International Journal of Approximate Reasoning, 51(7), 2010.

M. Jahrer, A. Toscher, and R. Legenstein. Combining predictions for accurate recommender systems. In KDD, 2010.

J. Hoxha and A. Rettinger. First-order probabilistic model for hybrid recommendations. In ICMLA, 2013.

S. H. Bach, B. Huang, B. London, and L. Getoor. Hinge-loss Markov random fields: Convex inference for structured prediction. In UAI, 2013.

S. H. Bach, M. Broecheler, B. Huang, and L. Getoor. Hinge-loss Markov random fields and probabilistic soft logic. ArXiv:1505.04406 [cs.LG], 2015.

A. P. Forbes and M. Zhu. Content-boosted matrix factorization for recommender systems: Experiments with recipe recommendation. In RecSys, 2011.

J. Chen, G. Chen, H. Zhang, J. Huang, and G. Zhao. Social recommendation based on multi-relational analysis. In WI-IAT, 2012.

R. Burke, F. Vahedian, and B. Mobasher. Hybrid recommendation in heterogeneous networks. In User Modeling, Adaptation, and Personalization. Springer, 2014.

J. Gemmell, T. S., B. Mobasher, and R. Burke. Resource recommendation in social annotation systems: A linear-weighted hybrid approach. Journal of Computer and System Sciences, 78(4), 2012.

X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han. Personalized entity recommendation: A heterogeneous information network approach. In WSDM, 2014.

H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King. Recommender systems with social regularization. In WSDM, 2011.

J. McAuley and J. Leskovec. Hidden factors and hidden topics: Understanding rating dimensions with review text. In RecSys, 2013.

G. Ling, M. R. Lyu, and I. King. Ratings meet reviews, a combined approach to recommend. In RecSys, 2014.

I. Guy, N. Zwerdling, I. Ronen, D. Carmel, and E. Uziel. Social media recommendation based on people and tags. In SIGIR, 2010.

S. Sedhain, S. Sanner, D. Braziunas, L. Xie, and J. Christensen. Social collaborative filtering for cold-start recommendations. In RecSys, 2014.
