recsys 2016 - accuracy and diversity in cross-domain recommendations for cold-start userswith...

Accuracy and Diversity in Cross-domain Recommendations for Cold-Start Users with Positive-only Feedback

Ignacio Fernández-Tobías¹, Paolo Tomeo², Iván Cantador¹, Tommaso Di Noia², Eugenio Di Sciascio²

¹ Autonomous University of Madrid, Spain, {ignacio.fernandezt, ivan.cantador}@uam.es

² Polytechnic University of Bari, Italy, {paolo.tomeo, tommaso.dinoia, eugenio.disciascio}@poliba.it


User Cold-Start Problem

Little or no information about some users (usually new users)

[Figure: user-item matrix (rows: Users, columns: Items) with regions labeled Cold-Start and Extreme Cold-Start]



Cross-domain recommendation

A simple way to combine different domains is to horizontally concatenate the user-item matrices

[Figure: user-item matrix with Users as rows and the Movies and Music items as adjacent column blocks]
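The horizontal concatenation described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `concat_domains`, the list-of-lists representation, and the toy matrices are hypothetical and do not come from the slides. Both matrices are assumed to have one row per user, in the same user order.

```python
def concat_domains(source, target):
    """Horizontally concatenate two binary user-item matrices.

    Assumes both matrices have one row per user, in the same order
    (an assumption of this sketch, not stated in the slides).
    """
    if len(source) != len(target):
        raise ValueError("both matrices need one row per user, in the same order")
    # Each user's combined profile is the source row followed by the target row.
    return [s_row + t_row for s_row, t_row in zip(source, target)]


# Hypothetical toy data: 3 users, 2 music items, 3 movie items.
music = [[1, 0], [0, 1], [1, 1]]
movies = [[0, 0, 0], [1, 0, 1], [0, 1, 0]]

combined = concat_domains(music, movies)
for row in combined:
    print(row)
```

In this toy example the first user has no movie likes at all, yet their row in the combined matrix is not empty because of the music likes. That is the situation a cross-domain method tries to exploit.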



Research Questions


RQ1 - How beneficial, in terms of accuracy, is it to exploit cross-domain information for cold-start users?

RQ2 - Is cross-domain information really useful for improving recommendation diversity?

RQ3 - What is the impact of the size and diversity of the source user profile on the target recommendation accuracy?



Positive-only Dataset

1 - Facebook likes extracted using the Graph API

2 - Items mapped to DBpedia entities using SPARQL



Dataset Statistics

Domain   Users   Items (Facebook pages)   Likes
Music    50K     5K                       5M
Movies   27K     4K                       800K

Metrics

Accuracy: MRR

Individual Diversity: ILD@10, BinomDiv@10

Profile Diversity: ILD
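As a rough illustration of two of the metrics named above, the sketch below computes MRR (mean reciprocal rank of the first relevant item) and ILD (average pairwise distance between the items of a list). The slides do not say which item distance is used for ILD, so the Jaccard distance over item feature sets is an assumption of this sketch, as are all function names and the toy data. BinomDiv is not covered here.

```python
from itertools import combinations


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0


def mrr(rankings, relevants):
    """Mean reciprocal rank over users (one ranking and one relevant set each)."""
    total = sum(reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants))
    return total / len(rankings)


def jaccard_distance(a, b):
    """Assumed item distance: 1 - |a & b| / |a | b| over feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def ild(items, features, k=None):
    """Intra-list diversity: mean pairwise distance among the first k items."""
    top = items[:k] if k is not None else items
    pairs = list(combinations(top, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(features[i], features[j]) for i, j in pairs) / len(pairs)


# Hypothetical toy data.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevants = [{"b"}, {"q"}]
features = {"a": {"rock"}, "b": {"rock"}, "c": {"jazz"}}

print(mrr(rankings, relevants))
print(ild(["a", "b", "c"], features))
```

Calling `ild(recommended, features, k=10)` would correspond to ILD@10 on a recommendation list, while calling it without `k` on a user's liked items would correspond to the profile diversity listed above, again under the assumed distance.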


Evaluation Setting

5-fold cross validation

Splitting:

• training → 10 likes
• validation → 5 likes
• test → the remaining likes, at least 1

Simulation of different user profile sizes (from 0 to 10 likes), evaluated with the same test set [Kluver and Konstan, RecSys '14]
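The per-user split and the profile-size simulation can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the random shuffling, and the choice to take the first `size` training likes as the simulated profile are assumptions. How the split interacts with the 5-fold cross validation is not stated in the slides and is left out.

```python
import random


def split_profile(likes, n_train=10, n_valid=5, seed=0):
    """Split one user's likes into training, validation, and test sets.

    Returns None when the user has too few likes to leave at least
    one like in the test set.
    """
    if len(likes) < n_train + n_valid + 1:
        return None
    shuffled = list(likes)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment of likes
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def simulated_profiles(train, max_size=10):
    """Profiles of size 0..max_size built from the same training likes."""
    return {size: train[:size] for size in range(max_size + 1)}


# Hypothetical user with 20 likes, identified by integers.
likes = list(range(20))
train, valid, test = split_profile(likes)
profiles = simulated_profiles(train)

for size, profile in profiles.items():
    # Every simulated profile would be evaluated against the same test set.
    print(size, profile, "-> test:", sorted(test))
```

Because the test set is fixed before the profiles are truncated, results for different profile sizes remain directly comparable.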



Recommendation algorithms


• Popularity-based (POP)
• User-based Nearest Neighbors (UNN)
• Item-based Nearest Neighbors (INN)
• Implicit Matrix Factorization (IMF) [Hu et al., 2008]
• HeteRec [Yu et al., 2014]
• PathRank [Lee et al., 2012]

Prefix “CD-” indicates cross-domain version (e.g. CD-UNN)
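To make the single-domain versus "CD-" distinction concrete, here is a small sketch of user-based nearest neighbors on positive-only data. It is an assumption-laden illustration rather than the configuration used in the study: cosine similarity on sets of liked items, the neighborhood size `k`, the scoring rule (sum of neighbor similarities), the `music:`/`movie:` item prefixes, and the toy profiles are all hypothetical. The cross-domain variant is modeled simply as computing similarities on profiles that contain likes from both domains while recommending only target-domain items.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sets of liked items (binary feedback)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def unn_scores(user, profiles, candidates, k=2):
    """Score candidate items for `user` from the k most similar users."""
    own = profiles[user]
    neighbours = sorted(
        ((cosine(own, items), other) for other, items in profiles.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item in profiles[other] & candidates:
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim
    return scores


# Hypothetical toy profiles; u1 has no movie likes (cold-start in the target domain).
profiles = {
    "u1": {"music:a", "music:b"},
    "u2": {"music:a", "music:b", "movie:x"},
    "u3": {"music:c", "movie:y"},
}
movie_items = {"movie:x", "movie:y"}

# Cross-domain: similarities use likes from both domains.
cd_scores = unn_scores("u1", profiles, movie_items)

# Single-domain: only movie likes are visible, so u1 has an empty profile.
movies_only = {u: p & movie_items for u, p in profiles.items()}
sd_scores = unn_scores("u1", movies_only, movie_items)

print(cd_scores, sd_scores)
```

In this toy case the single-domain variant cannot score anything for `u1`, while the cross-domain variant finds `u2` through shared music likes and can suggest `movie:x`.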



Single-domain vs Cross-domain



Which algorithm is more accurate?

…and which one provides more diversity?


Impact of source profile size


Impact of source profile diversity



Conclusions


Cross-domain recommendation may improve accuracy (RQ1), but does not always improve diversity (RQ2)

The choice of the recommendation algorithm depends on the domain and the amount of user information available

Recommendation accuracy increases with the size of the source profile, but may deteriorate as the diversity of the source profile grows (RQ3)

Future work

Investigating which characteristics of the datasets could explain the differences in the obtained results

Extending the analysis to more domains and sophisticated methods
