Recommending From Experience
30 Oct 2015 Insight Centre for Data Analytics Slide 2
Recommender Systems
Collaborative Filtering
User reviews
Sentiment Analysis
Feature Extraction
Context Extraction
Context
Context in Recommender Systems
Context Aware Recommender Systems
• Most of them predefine context
• Small number of features
• Small number of values
• Example of predefined context: "I'm travelling for: Work / Leisure", "Companion: Solo / Couple / Family"
Open-ended context is very wide
• Context is richer, is open-ended
• Birthday, anniversary, parking, accessibility, eat-in vs. take away, pet friendly, …
Motivation
• Discover open-ended contextual information from user reviews. Context doesn’t have to be predefined.
• Incorporate the open-ended contextual information together with ratings into a context sensitive recommender system.
• Make recommendations that better satisfy user goals.
Assumptions
Specific Review
• “During the summer, we like to take a mini staycation. This year it was extra special as we also got engaged. Our stay at the Biltmore was just fantastic. The service exceptional, the food amazing.”
Generic Review
• “Nice hotel, all the amenities you need, great complex of pools.”
Specific reviews contain more contextual information than generic ones.
Rich Context (RC)
• RCMiner
• RCRecommender
RCMiner
[Figure: pipeline with the stages Reviews → Classifier → Specific Reviews / Generic Reviews → LDA Topic Model → Filter → Contextual Topics]
Inspired by [Bauman&Tuzhilin2014]
Classification
• 300 tagged reviews
• Logistic Regression
• Feature Subset Selection
• Number of words
• Number of verbs in past tense
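The classification step above could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The two features mirror the ones named on the slide, but the past-tense detector is a crude suffix heuristic standing in for a real part-of-speech tagger, the feature scaling constants are arbitrary, the model is trained by plain gradient descent, and two of the four training reviews are invented for the example.

```python
import math
import re

# Assumption: a tiny list of irregular past forms plus an "-ed" suffix rule
# stands in for a real part-of-speech tagger.
IRREGULAR_PAST = {"was", "were", "went", "got", "had", "did", "took"}

def features(review):
    """Return [word count, approximate count of past-tense verbs]."""
    words = re.findall(r"[a-z']+", review.lower())
    past = sum(1 for w in words
               if w in IRREGULAR_PAST or (len(w) > 4 and w.endswith("ed")))
    return [len(words), past]

def scale(x):
    # Hypothetical fixed scaling so both features have a similar range.
    return [x[0] / 50.0, x[1] / 5.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(reviews, labels, epochs=2000, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    xs = [scale(features(r)) for r in reviews]
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / len(xs), w[1] - lr * gw[1] / len(xs)]
        b -= lr * gb / len(xs)
    return w, b

def is_specific(review, model):
    """True if the model scores the review as specific (probability > 0.5)."""
    w, b = model
    x = scale(features(review))
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5

# The first and third reviews are the excerpts shown on the slides;
# the second and fourth are invented for illustration.
SPECIFIC = ("During the summer, we like to take a mini staycation. This year it "
            "was extra special as we also got engaged. Our stay at the Biltmore "
            "was just fantastic. The service exceptional, the food amazing.")
GENERIC = "Nice hotel, all the amenities you need, great complex of pools."
REVIEWS = [
    SPECIFIC,
    "We went there for our anniversary last May and stayed two nights. "
    "The staff upgraded us and surprised us with champagne.",
    GENERIC,
    "Great location, clean rooms, friendly staff.",
]
LABELS = [1, 1, 0, 0]  # 1 = specific, 0 = generic
MODEL = train(REVIEWS, LABELS)
```

With real data, the 300 tagged reviews would replace the toy list, and a library classifier with a proper tagger would be the more natural choice.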
LDA (Latent Dirichlet Allocation)
“During the summer, we like to take a mini staycation. This year it was extra special as we also got engaged. Our stay at the Biltmore was just fantastic. The service exceptional, the food amazing- it was great at the pool, Wrights and also at Frank and Alberts. The only reason I am not giving it a full 5 stars is the 'upgraded' room was just a nice basic room. Though it was certainly nice, it wasnt what I expected for being the Biltmore. However, everything else certainly lived up to that expectation”.
Example topics, with per-word weights:
• Summer (0.04), Weekend (0.02), Monday (0.01), …
• Holiday (0.05), Romantic (0.03), Staycation (0.01), …
• Room (0.05), Pool (0.04), Sauna (0.01), …
• Free (0.03), Cheap (0.02), Expensive (0.01), …
[Figure: Topic Distribution of the Review, axis 0 to 1.0]
• Each document is a mixture of topics
• Each topic is composed of words that co-occur across documents
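To make the "mixture of topics" idea concrete, the sketch below computes the probability of a word in a review as the sum, over topics, of the topic's share in the review times the word's weight in that topic. It does not fit an LDA model. The word weights are the ones shown on the slide, while the topic names and the review's topic mixture are hypothetical values chosen for illustration.

```python
# Word weights per topic, copied from the slide.
# Assumption: the topic names are invented labels; LDA topics are unnamed.
TOPICS = {
    "time":      {"summer": 0.04, "weekend": 0.02, "monday": 0.01},
    "occasion":  {"holiday": 0.05, "romantic": 0.03, "staycation": 0.01},
    "amenities": {"room": 0.05, "pool": 0.04, "sauna": 0.01},
    "price":     {"free": 0.03, "cheap": 0.02, "expensive": 0.01},
}

# Hypothetical topic mixture for one review (shares sum to 1).
REVIEW_MIXTURE = {"time": 0.2, "occasion": 0.4, "amenities": 0.3, "price": 0.1}

def word_probability(word, mixture, topics):
    """p(word | review) = sum over topics of p(topic | review) * p(word | topic)."""
    return sum(share * topics[name].get(word, 0.0)
               for name, share in mixture.items())

def dominant_topic(mixture):
    """Name of the topic with the largest share in the review."""
    return max(mixture, key=mixture.get)
```

For instance, under these made-up shares "pool" only receives weight from the "amenities" topic, so its probability is 0.3 × 0.04.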
Filtering
• Apply the topic model to both specific and generic reviews.
• Count how often each topic appears in the specific reviews and in the generic reviews.
• Topics that appear more frequently in specific reviews are labeled as contextual topics.
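The filtering steps above can be sketched as below. The slides do not say when a topic counts as "appearing" in a review, so this sketch assumes a topic appears when its weight in the review reaches a hypothetical threshold, and it compares relative frequencies rather than raw counts so that unequal numbers of specific and generic reviews do not bias the result. The topic distributions are toy values.

```python
def topic_frequency(reviews, n_topics, threshold=0.2):
    """Fraction of reviews in which each topic's weight reaches the threshold.

    reviews: list of per-review topic distributions (lists of n_topics floats).
    """
    counts = [0] * n_topics
    for distribution in reviews:
        for k, weight in enumerate(distribution):
            if weight >= threshold:
                counts[k] += 1
    return [c / len(reviews) for c in counts]

def contextual_topics(specific, generic, n_topics, threshold=0.2):
    """Indices of topics that appear more often in specific than in generic reviews."""
    f_specific = topic_frequency(specific, n_topics, threshold)
    f_generic = topic_frequency(generic, n_topics, threshold)
    return [k for k in range(n_topics) if f_specific[k] > f_generic[k]]

# Toy topic distributions for three specific and three generic reviews.
SPECIFIC_REVIEWS = [
    [0.50, 0.30, 0.15, 0.05],
    [0.40, 0.10, 0.40, 0.10],
    [0.10, 0.60, 0.30, 0.00],
]
GENERIC_REVIEWS = [
    [0.00, 0.10, 0.60, 0.30],
    [0.10, 0.00, 0.60, 0.30],
    [0.00, 0.00, 0.50, 0.50],
]
```

Here `contextual_topics(SPECIFIC_REVIEWS, GENERIC_REVIEWS, 4)` keeps topics 0 and 1, which show up in specific reviews but never reach the threshold in generic ones.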
[Figure: Topic Distribution Among Reviews (Specific vs. Generic), axis 0 to 1.0, with the contextual topics labelled]
RCRecommender
[Figure: RCRecommender diagram with the elements Reviews & Ratings, Context & Ratings, User Text, Context k-NN, and Recommendation]
Rating Prediction
Inspired by [Zheng,Burke&Mobasher2013]
| | Traditional k-NN | RC |
|---|---|---|
| Neighbourhood | Most similar users | Users who have rated the item in a similar context. |
| User Similarity | Based on rating similarity | Based on rating similarity, weighting each rating by its context similarity. |
| Aggregated rating | Average weighted by user similarity | Average weighted by user similarity which includes context similarity. |
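One way to read the RC column is sketched below. The slides do not give the formulas, so everything concrete here is an assumption: contexts are represented as topic-weight vectors compared by cosine similarity, rating agreement is one minus the normalised rating gap on a 1 to 5 scale, and the neighbourhood is restricted by a hypothetical `min_context_sim` cutoff. The data set is a toy example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two context vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def user_similarity(a, b, max_gap=4.0):
    """Rating agreement on co-rated items, each item weighted by context similarity."""
    num = den = 0.0
    for item in set(a) & set(b):
        (r_a, c_a), (r_b, c_b) = a[item], b[item]
        w = cosine(c_a, c_b)
        num += w * (1.0 - abs(r_a - r_b) / max_gap)
        den += w
    return num / den if den else 0.0

def predict(data, user, item, context, k=2, min_context_sim=0.5):
    """Predict a rating from users who rated the item in a similar context.

    Returns None when no suitable neighbour exists (the item is not covered).
    """
    candidates = []
    for other, rated in data.items():
        if other == user or item not in rated:
            continue
        rating, rated_context = rated[item]
        ctx_sim = cosine(context, rated_context)
        if ctx_sim < min_context_sim:
            continue  # rated in a context that is too different
        weight = user_similarity(data[user], rated) * ctx_sim
        if weight > 0:
            candidates.append((weight, rating))
    top = sorted(candidates, reverse=True)[:k]
    if not top:
        return None
    return sum(w * r for w, r in top) / sum(w for w, _ in top)

# Toy data: user -> item -> (rating, context vector over two contextual topics).
RATINGS = {
    "alice": {"h1": (5, [1.0, 0.0]), "h2": (2, [0.0, 1.0])},
    "bob":   {"h1": (5, [1.0, 0.0]), "h3": (4, [1.0, 0.0])},
    "carol": {"h1": (1, [0.0, 1.0]), "h3": (1, [0.0, 1.0])},
}
```

Restricting the neighbourhood to similar contexts is also why such a recommender can fail to produce a prediction at all, which is what a coverage measure captures.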
Evaluation
• Yelp Data Challenge dataset.
• 5034 hotel reviews.
• 5579 spa reviews.
• 5-fold cross validation.
• Compared against a k-NN user-based baseline recommender.

| Dataset | RMSE | Coverage | RMSE change over user baseline | Coverage change over user baseline | Top-N Recall | Top-N Recall change over user baseline |
|---|---|---|---|---|---|---|
| Hotels | 1.30 | 8% | +13.7% | +74.6% | 0.56 | +49.5% |
| Spas | 1.32 | 3% | +18.2% | +34.8% | 0.51 | +59.6% |
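For reference, RMSE and coverage as they are commonly defined can be computed as in the sketch below. It assumes that a recommender signals "no prediction possible" with `None`, that coverage is the share of test ratings for which a prediction was produced, and that RMSE is taken over the covered ratings only. The slides do not spell out these definitions, and the numbers are toy values.

```python
from math import sqrt

def rmse_and_coverage(predictions, actual):
    """Return (RMSE over covered ratings, coverage).

    predictions: predicted ratings, with None where no prediction was possible.
    actual: the true ratings, in the same order.
    """
    covered = [(p, a) for p, a in zip(predictions, actual) if p is not None]
    coverage = len(covered) / len(actual)
    if not covered:
        return None, coverage
    rmse = sqrt(sum((p - a) ** 2 for p, a in covered) / len(covered))
    return rmse, coverage

# Toy test fold: two of the four ratings could be predicted.
PREDICTED = [4.0, None, 3.0, None]
ACTUAL = [5, 3, 3, 1]
RMSE, COVERAGE = rmse_and_coverage(PREDICTED, ACTUAL)
```

In a 5-fold setup these two values would be computed on each held-out fold and then averaged.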
Final Thoughts
Conclusions
• RC is capable of extracting open-ended context from user reviews.
• It integrates open-ended contextual information into a k-NN based model.
Future Work
• Improve RCMiner by designing new features and using different classifiers.
• Improve the coverage of RCRecommender on sparse datasets.
References
[1] K. Bauman and A. Tuzhilin. Discovering contextual information from user reviews for recommendation purposes. In 1st Workshop on New Trends in Content-based Recommender Systems, pages 2–9, 2014.
[2] L. Chen, G. Chen, and F. Wang. Recommender systems based on user reviews: the state of the art. User Modeling and User-Adapted Interaction, 25(2):99–154, 2015.
[3] Y. Zheng, R. Burke, and B. Mobasher. Recommendation with differential context weighting. In User Modeling, Adaptation, and Personalization, LNCS 7899, pages 152–164, 2013.