Modeling Difficulty in Recommender Systems
DESCRIPTION
Presentation given at the Workshop on Recommendation Utility Evaluation: Beyond RMSE, held in conjunction with the ACM Conference on Recommender Systems (RecSys), on September 9, 2012.
TRANSCRIPT
![Page 1: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/1.jpg)
Competence Center Information Retrieval & Machine Learning
Modeling Difficulty in Recommender Systems
Benjamin Kille (@bennykille)
September 9, 2012
Recommendation Utility Evaluation: Beyond RMSE (2012)
![Page 2: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/2.jpg)
Outline
► Recommender System Evaluation
► Problem definition
► Difficulty in Recommender Systems
► Future work
► Conclusions
![Page 3: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/3.jpg)
Recommender Systems Evaluation
► Definition of an evaluation measure:
  RMSE (rating prediction scenario)
  nDCG (ranking scenario)
  Precision@N (top-N scenario)
► Splitting data into training and test partition
► Reporting results as average over the full set of users
► Is recommending to all users equally difficult?
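The evaluation measures listed above can be sketched in a few lines; this is a minimal illustration with hypothetical ratings and recommendation lists, not the exact formulations used in any particular evaluation toolkit:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over paired rating predictions."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_n(recommended, relevant, n):
    """Fraction of the top-N recommended items that are relevant."""
    top_n = recommended[:n]
    return sum(1 for item in top_n if item in relevant) / n

print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]))               # → ~0.645
print(precision_at_n(["a", "b", "c", "d"], {"a", "c"}, 4))  # → 0.5
```

In both cases the per-user scores are then averaged over the full set of users, which is exactly the step the question below challenges.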
![Page 4: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/4.jpg)
Observed Differences
► Users differ with respect to:
  demographics (e.g., age, gender, and location)
  taste
  needs
  expectations
  consumption patterns
  …
► Recommendation algorithms do not perform equally well for every single user
► Users should not all be evaluated in the same way!
![Page 5: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/5.jpg)
Risks of disregarding users' differences
► A subset of users receives worse recommendations than possible
► When recommendation algorithm optimization targets all users equally:
  "easy" users: costs could be saved
  "difficult" users: insufficient optimization
Control optimization towards those users who really require it!
How to determine difficulty?
![Page 6: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/6.jpg)
Problem Formulation
► Measuring how difficult it will be to recommend items to a user
► Ideally: derive difficulty directly from user attributes
► Problem: unknown correlation between (combinations of) attributes and difficulty
► We need a method to estimate the correlation between user attributes and recommendation difficulty
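Once difficulty scores have been observed for a set of users, the attribute–difficulty correlation can be estimated with a standard measure such as Pearson's r. A minimal sketch, using entirely hypothetical data (profile size as the attribute is just an illustrative choice, not one proposed in the slides):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: number of ratings in each user's profile vs. observed difficulty
profile_sizes = [5, 20, 50, 120, 300]
difficulties  = [0.9, 0.7, 0.5, 0.4, 0.2]
print(pearson(profile_sizes, difficulties))  # negative: larger profiles, lower difficulty
```

For categorical attributes (gender, location) or attribute combinations, other association measures or a regression model would be needed; the correlation above only covers the simplest numeric case.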
![Page 7: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/7.jpg)
Difficulty in Information Retrieval
► Target object: query
► Method:
[Diagram: the same query is submitted to several IR systems; each returns a different ranked list of documents (Doc 1, Doc 2, Doc 3, …)]
Difficulty = diversity of the returned document lists
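The idea on this slide can be sketched with a simple overlap-based diversity measure: submit the query to several systems and score the disagreement of their top-n results. Average pairwise Jaccard overlap is one possible choice of measure, used here only for illustration (the cited work uses coherence-based measures):

```python
from itertools import combinations

def list_diversity(ranked_lists, n=5):
    """Difficulty proxy: 1 minus the average pairwise Jaccard overlap
    of the top-n documents returned by the different systems."""
    overlaps = []
    for a, b in combinations(ranked_lists, 2):
        sa, sb = set(a[:n]), set(b[:n])
        overlaps.append(len(sa & sb) / len(sa | sb))
    return 1.0 - sum(overlaps) / len(overlaps)

# Hypothetical result lists from three IR systems for the same query
results = [
    ["d1", "d2", "d3"],
    ["d1", "d2", "d4"],
    ["d5", "d6", "d7"],
]
print(list_diversity(results, n=3))  # higher value → more disagreement → harder query
```

A set-based overlap ignores rank positions; a rank-aware measure (e.g., rank correlation over the shared items) would refine this sketch.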
![Page 8: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/8.jpg)
Difficulty in Recommender Systems
► Select several state-of-the-art recommendation methods
► Measure the diversity of their output for a specific user
► Based on the methods' agreement with respect to predicted rating / ranking / top-N items, we conclude:
  high agreement → low difficulty
  low agreement → high difficulty
► Target correlation (user attributes ~ difficulty) can be estimated using the observed difficulties for a sufficiently large set of users
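A minimal sketch of this procedure in the rating-prediction scenario, using the per-item standard deviation of the methods' predictions as the disagreement measure. This is just one possible choice (the ensemble-diversity measures surveyed in [Kuncheva2003] are alternatives), and the data is hypothetical:

```python
import statistics

def user_difficulty(predictions_by_method):
    """Disagreement of several recommenders' rating predictions for one user:
    average per-item (population) standard deviation across methods."""
    per_item_sd = [statistics.pstdev(item_preds)
                   for item_preds in zip(*predictions_by_method)]
    return sum(per_item_sd) / len(per_item_sd)

# Hypothetical predictions of three methods for the same user's five test items
method_a = [4.0, 3.5, 2.0, 5.0, 1.0]
method_b = [4.1, 3.4, 2.2, 4.8, 1.1]
method_c = [2.0, 5.0, 4.0, 2.5, 3.0]
print(user_difficulty([method_a, method_b, method_c]))  # high → methods disagree → difficult user
```

Computing such a score for a sufficiently large set of users yields the observed difficulties from which the attribute–difficulty correlation can then be estimated.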
![Page 9: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/9.jpg)
Future Work
► Experimentally verify the feasibility of difficulty estimation
► Evaluate observed correlation (user attributes ~ difficulty) on
data sets
► Investigate business rationale (reduced costs through
controlled optimization efforts)
► How to deal with sparsity / cold-start issues
![Page 10: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/10.jpg)
Conclusions
► Users should not be treated equally when evaluating
recommender systems
► Difficulty of recommendation tasks varies between users
► Difficulty scores will allow optimization to be directed towards those users who require it
► Diversity metrics could be used to estimate difficulty scores
(analogously to information retrieval)
► Proposed method needs to be evaluated
![Page 11: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/11.jpg)
Thank you for your attention!
Questions?
![Page 12: Modeling Difficulty in Recommender Systems](https://reader033.vdocuments.us/reader033/viewer/2022061222/54be4f944a795955748b4599/html5/thumbnails/12.jpg)
References
[He2008] J. He, M. Larson, and M. de Rijke. Using coherence-based measures to predict query difficulty. ECIR 2008.
[Herlocker2004] J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. ACM TOIS 22(1), 2004.
[Kuncheva2003] L. Kuncheva and C. Whitaker. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning 51, 2003.
[Vargas2011] S. Vargas and P. Castells. Rank and relevance in novelty and diversity metrics for recommender systems. RecSys 2011.