![Page 1: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/1.jpg)
ECIR 2016, PADUA, ITALY
EFFICIENT PSEUDO-RELEVANCE FEEDBACK METHODS FOR COLLABORATIVE FILTERING RECOMMENDATION
Daniel Valcarce (@dvalcarce), Javier Parapar (@jparapar), Álvaro Barreiro (@AlvaroBarreiroG)
Information Retrieval Lab (@IRLab_UDC)
University of A Coruña, Spain
![Page 2: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/2.jpg)
Outline
1. Pseudo-Relevance Feedback (PRF)
2. Collaborative Filtering (CF)
3. PRF Methods for CF
4. Experiments
5. Conclusions and Future Work
![Page 3: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/3.jpg)
PSEUDO-RELEVANCE FEEDBACK (PRF)
![Page 4: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/4.jpg)
Pseudo-Relevance Feedback (I)
Pseudo-Relevance Feedback provides an automatic method for query expansion:
# It assumes that the top documents retrieved with the original query are relevant (the pseudo-relevant set).
# The query is expanded with the most representative terms from this set.
# The expanded query is expected to yield better results than the original one.
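The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `expand_query`, the toy documents and the use of raw term frequency to pick "representative" terms are assumptions, not one of the PRF models discussed in these slides.

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, top_k=2, n_terms=2):
    """Expand a query with the most frequent new terms of the top-k documents.

    Raw term frequency is a deliberately naive stand-in for a real PRF
    term-scoring function (hypothetical choice for this sketch).
    """
    # Step 1: treat the top-k retrieved documents as relevant (no user judgement).
    pseudo_relevant = ranked_docs[:top_k]
    # Step 2: score candidate terms that are not already in the query.
    counts = Counter(t for doc in pseudo_relevant for t in doc)
    candidates = [(t, c) for t, c in counts.items() if t not in query_terms]
    # Descending frequency, ties broken alphabetically for determinism.
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))
    # Step 3: the expanded query is the original query plus the selected terms.
    return list(query_terms) + [t for t, _ in candidates[:n_terms]]

# Toy ranking returned by some retrieval system for the original query.
ranked_docs = [
    ["california", "most", "populated", "state"],
    ["california", "population", "state"],
    ["texas", "state", "area"],
]
expanded = expand_query(["most", "populated", "state"], ranked_docs)
```

The expanded query would then be submitted to the retrieval system again in place of the original one.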
![Page 12: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/12.jpg)
Pseudo-Relevance Feedback (II)
Diagram labels, in order: Information need, query, Retrieval System, Query Expansion, expanded query.
![Page 13: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/13.jpg)
Pseudo-Relevance Feedback (III)
Some popular PRF approaches:
# Based on Rocchio’s model (Rocchio, 1971 & Carpineto et al., ACM TOIS 2001)
# Relevance-Based Language Models (Lavrenko & Croft, SIGIR 2001)
# Divergence Minimization Model (Zhai & Lafferty, SIGIR 2006)
# Mixture Models (Tao & Zhai, SIGIR 2006)
![Page 14: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/14.jpg)
COLLABORATIVE FILTERING (CF)
![Page 15: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/15.jpg)
Recommender Systems
Notation:
# The set of users is $U$
# The set of items is $I$
# The rating that user $u$ gave to item $i$ is $r_{u,i}$
# The set of items rated by user $u$ is denoted by $I_u$
# The set of users that rated item $i$ is denoted by $U_i$
# The neighbourhood of user $u$ is denoted by $V_u$
Top-N recommendation: create a ranked list containing relevant and unknown items for each user $u \in U$.
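As a rough illustration of this notation, the sets can be held in plain Python dictionaries. The user names, the ratings and the helper `top_n` are hypothetical; the scores passed to `top_n` stand in for whatever recommender produces them.

```python
from collections import defaultdict

# r_{u,i}: toy ratings keyed by (user, item); all values are made up.
ratings = {
    ("alice", "Titanic"): 2, ("alice", "Avatar"): 3, ("alice", "Matrix"): 5,
    ("bob", "Matrix"): 4, ("bob", "Alien"): 5,
    ("carol", "Titanic"): 1, ("carol", "Alien"): 3,
}

items_of = defaultdict(set)  # I_u: items rated by user u
users_of = defaultdict(set)  # U_i: users that rated item i
for (u, i) in ratings:
    items_of[u].add(i)
    users_of[i].add(u)

def top_n(user, scores, n):
    """Rank the items unknown to `user` by descending score and keep n of them."""
    unseen = [i for i in scores if i not in items_of[user]]
    return sorted(unseen, key=lambda i: (-scores[i], i))[:n]

recommended = top_n("alice", {"Titanic": 0.9, "Alien": 0.5, "Matrix": 0.8}, 2)
```

Items already rated by the user are filtered out before ranking, which is what "unknown" means in the top-N task above.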
![Page 16: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/16.jpg)
Collaborative Filtering (I)
Collaborative Filtering (CF) employs the past interaction between users and items to generate recommendations.
Idea: if a user who is similar to you likes an item, maybe you will also like it.
Different input data:
# Explicit feedback: ratings, reviews...
# Implicit feedback: clicks, purchases...
Perhaps the most popular approach to recommendation, given the increasing amount of information about users.
![Page 17: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/17.jpg)
Collaborative Filtering (II)
Collaborative Filtering (CF) techniques can be classified into:
# Model-based methods: learn a predictive model from the user-item ratings.
  ◦ Matrix factorisation (e.g., SVD)
# Neighbourhood-based (or memory-based) methods: compute recommendations directly from part of the ratings.
  ◦ k-NN approaches
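A k-NN neighbourhood can be sketched as follows. Cosine similarity over rating vectors is an assumption made for brevity (the experiments later in the deck mention Pearson's correlation), and the profiles and function names are hypothetical.

```python
import math

# Toy user profiles: item -> rating (made-up values).
profiles = {
    "alice": {"Titanic": 2, "Avatar": 3, "Matrix": 5},
    "bob": {"Matrix": 4, "Alien": 5},
    "carol": {"Titanic": 1, "Alien": 3},
}

def cosine(ru, rv):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(ru[i] * rv[i] for i in set(ru) & set(rv))
    norm_u = math.sqrt(sum(x * x for x in ru.values()))
    norm_v = math.sqrt(sum(x * x for x in rv.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def neighbourhood(user, k):
    """V_u: the k users most similar to `user`, excluding the user itself."""
    sims = [(cosine(profiles[user], p), v) for v, p in profiles.items() if v != user]
    sims.sort(key=lambda sv: (-sv[0], sv[1]))
    return [v for _, v in sims[:k]]

v_alice = neighbourhood("alice", k=1)
```

In this toy data, alice shares a highly rated item with bob and only a low-rated one with carol, so bob comes out as her nearest neighbour.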
![Page 18: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/18.jpg)
PRF METHODS FOR CF
![Page 19: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/19.jpg)
PRF for CF
| PRF | CF |
|---|---|
| User’s query | User’s profile |
| most^1, populated^2, state^2 | Titanic^2, Avatar^3, Matrix^5 |
| Documents | Neighbours |
| Terms | Items |
![Page 20: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/20.jpg)
Previous Work on Adapting PRF Methods to CF
Relevance-Based Language Models
# Originally devised for PRF (Lavrenko & Croft, SIGIR 2001).
# Adapted to CF (Parapar et al., Inf. Process. Manage. 2013).
# Two models: RM1 and RM2.
# High precision figures in recommendation.
# ... but high computational cost!
$$\text{RM1}: \quad p(i \mid R_u) \propto \sum_{v \in V_u} p(v)\, p(i \mid v) \prod_{j \in I_u} p(j \mid v)$$
$$\text{RM2}: \quad p(i \mid R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} \frac{p(i \mid v)\, p(v)}{p(i)}\, p(j \mid v)$$
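To make the two formulas concrete, here is a minimal sketch that evaluates them on toy data. Several choices are assumptions not given on the slide: a uniform prior $p(v)$, an additively smoothed estimate of $p(i \mid v)$ (needed so the products do not collapse to zero), and $p(i)$ obtained by marginalising over the neighbours. All names and ratings are hypothetical.

```python
import math

profiles = {
    "alice": {"Titanic": 2, "Matrix": 5},
    "bob": {"Matrix": 4, "Alien": 5},
    "carol": {"Titanic": 1, "Avatar": 3},
}
items = ["Titanic", "Avatar", "Matrix", "Alien"]

def p_item(i, v, alpha=0.1):
    """Smoothed p(i|v); the additive smoothing is an assumption of this sketch."""
    total = sum(profiles[v].values())
    return (profiles[v].get(i, 0) + alpha) / (total + alpha * len(items))

def rm1(i, u, neighbours):
    """RM1 score: sum over neighbours of p(v) p(i|v) prod_{j in I_u} p(j|v)."""
    p_v = 1.0 / len(neighbours)  # uniform prior over neighbours (assumption)
    return sum(
        p_v * p_item(i, v) * math.prod(p_item(j, v) for j in profiles[u])
        for v in neighbours
    )

def rm2(i, u, neighbours):
    """RM2 score: p(i) prod_{j in I_u} sum_v [p(i|v) p(v) / p(i)] p(j|v)."""
    p_v = 1.0 / len(neighbours)
    p_i = sum(p_v * p_item(i, v) for v in neighbours)  # marginal p(i) (assumption)
    return p_i * math.prod(
        sum(p_item(i, v) * p_v / p_i * p_item(j, v) for v in neighbours)
        for j in profiles[u]
    )

neighbours = ["bob", "carol"]
candidates = [i for i in items if i not in profiles["alice"]]
rm1_scores = {i: rm1(i, "alice", neighbours) for i in candidates}
rm2_scores = {i: rm2(i, "alice", neighbours) for i in candidates}
```

Each candidate item requires a pass over every neighbour and every item in the target user's profile, which gives a feel for where the computational cost mentioned above comes from.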
![Page 22: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/22.jpg)
Our Proposals based on Rocchio’s Framework
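Rocchio’s Weights
$$p_{\text{Rocchio}}(i \mid u) = \sum_{v \in V_u} \frac{r_{v,i}}{|V_u|}$$
Robertson Selection Value
$$p_{\text{RSV}}(i \mid u) = \sum_{v \in V_u} \frac{r_{v,i}}{|V_u|}\, p(i \mid V_u)$$
CHI-2
$$p_{\text{CHI-2}}(i \mid u) = \frac{\left(p(i \mid V_u) - p(i \mid C)\right)^2}{p(i \mid C)}$$
Kullback–Leibler Divergence
$$p_{\text{KLD}}(i \mid u) = p(i \mid V_u) \log \frac{p(i \mid V_u)}{p(i \mid C)}$$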
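The four scoring functions translate almost directly into code. The sketch below plugs in a maximum likelihood estimate for $p(i \mid V_u)$ and $p(i \mid C)$; the toy profiles, the function names and the convention that KLD is 0 when $p(i \mid V_u) = 0$ are assumptions made for illustration.

```python
import math

profiles = {
    "alice": {"Titanic": 2, "Matrix": 5},
    "bob": {"Matrix": 4, "Alien": 5},
    "carol": {"Titanic": 1, "Avatar": 3},
    "dave": {"Alien": 2, "Avatar": 4},
}

def mle(item, users):
    """Rating mass of `item` among `users`, divided by their total rating mass."""
    num = sum(profiles[v].get(item, 0) for v in users)
    den = sum(sum(profiles[v].values()) for v in users)
    return num / den

def rocchio(item, neigh):
    return sum(profiles[v].get(item, 0) for v in neigh) / len(neigh)

def rsv(item, neigh):
    # p(i|V_u) does not depend on v, so it factors out of the sum.
    return rocchio(item, neigh) * mle(item, neigh)

def chi2(item, neigh):
    p_n, p_c = mle(item, neigh), mle(item, profiles)
    return (p_n - p_c) ** 2 / p_c

def kld(item, neigh):
    p_n, p_c = mle(item, neigh), mle(item, profiles)
    return p_n * math.log(p_n / p_c) if p_n > 0 else 0.0  # 0 log 0 taken as 0

neigh = ["bob", "carol"]          # V_u for the target user alice
candidates = ["Alien", "Avatar"]  # items alice has not rated
rw_ranking = sorted(candidates, key=lambda i: -rocchio(i, neigh))
rsv_ranking = sorted(candidates, key=lambda i: -rsv(i, neigh))
```

With this estimate and non-negative ratings, RSV is Rocchio's weight multiplied by a quantity proportional to itself, so both produce the same ranking within a neighbourhood. None of the four needs the nested product over the target user's profile.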
![Page 25: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/25.jpg)
Probability Estimation
Maximum Likelihood Estimate under a Multinomial Distribution over the ratings:
$$p_{mle}(i \mid V_u) = \frac{\sum_{v \in V_u} r_{v,i}}{\sum_{v \in V_u,\, j \in I} r_{v,j}}$$
$$p_{mle}(i \mid C) = \frac{\sum_{u \in U} r_{u,i}}{\sum_{u \in U,\, j \in I} r_{u,j}}$$
![Page 26: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/26.jpg)
Neighbourhood Length Normalisation (I)
Neighbourhoods are computed using clustering algorithms:
# Hard clustering: every user is in only one cluster. Clusters may have different sizes. Example: k-means.
# Soft clustering: each user has its own neighbours. When we set k to a high value, we may find different numbers of neighbours. Example: k-NN.
Idea: consider the variability of the neighbourhood lengths:
# A big neighbourhood is equivalent to a query with a lot of results: the collection model is close to the target user.
# A small neighbourhood implies that the neighbours are highly specific: the collection is very different from the target user.
![Page 28: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/28.jpg)
Neighbourhood Length Normalisation (II)
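We bias the MLE to perform neighbourhood length normalisation:
$$p_{nmle}(i \mid V_u) \overset{rank}{=} \frac{1}{|V_u|} \, \frac{\sum_{v \in V_u} r_{v,i}}{\sum_{v \in V_u,\, j \in I} r_{v,j}}$$
$$p_{nmle}(i \mid C) \overset{rank}{=} \frac{1}{|U|} \, \frac{\sum_{u \in U} r_{u,i}}{\sum_{u \in U,\, j \in I} r_{u,j}}$$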
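A small sketch of what the normalisation does, assuming the formulas above are applied literally. The toy profiles and function names are hypothetical.

```python
profiles = {
    "alice": {"Titanic": 2, "Matrix": 5},
    "bob": {"Matrix": 4, "Alien": 5},
    "carol": {"Titanic": 1, "Avatar": 3},
    "dave": {"Alien": 2, "Avatar": 4},
}

def mle(item, users):
    """Plain maximum likelihood estimate over the ratings of `users`."""
    num = sum(profiles[v].get(item, 0) for v in users)
    den = sum(sum(profiles[v].values()) for v in users)
    return num / den

def nmle(item, users):
    """MLE divided by the number of users in the set (|V_u| or |U|)."""
    return mle(item, users) / len(users)

neigh = ["bob", "carol"]   # V_u, two neighbours
everyone = list(profiles)  # U, four users

ratio_mle = mle("Alien", neigh) / mle("Alien", everyone)
ratio_nmle = nmle("Alien", neigh) / nmle("Alien", everyone)
```

The values no longer sum to one over the items, which is why the slide states the estimate up to rank equivalence. The ratio between the neighbourhood and collection estimates is multiplied by $|U| / |V_u|$, so methods that compare the two (CHI-2 and KLD) are affected by the neighbourhood size, while a ranking that only uses $p(i \mid V_u)$ is unchanged.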
![Page 29: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/29.jpg)
EXPERIMENTS
![Page 30: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/30.jpg)
Experimental settings
Baselines:
# UB: traditional user-based neighbourhood approach.
# SVD: matrix factorisation.
# UIR-Item: probabilistic approach.
# RM1 and RM2: Relevance-Based Language Models.
Our algorithms:
# Rocchio’s Weights (RW)
# Robertson Selection Value (RSV)
# CHI-2
# Kullback-Leibler Divergence (KLD)
![Page 31: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/31.jpg)
Efficiency
Figure: recommendation time per user (s), with axis ticks at 0.01, 0.1, 1 and 10, on the ML 100k, ML 1M and ML 10M datasets. Legend: UIR, RM1, RM2, SVD++, RSV, UB, RW, CHI-2, KLD.
![Page 32: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/32.jpg)
Accuracy (nDCG@10)
| Algorithm | ML 100k | ML 1M | R3-Yahoo! | LibraryThing |
|---|---|---|---|---|
| UB | 0.0468 | 0.0313 | 0.0108 | 0.0055^b |
| SVD | 0.0936^a | 0.0608^a | 0.0101 | 0.0015 |
| UIR-Item | 0.2188^ab | 0.1795^abd | 0.0174^abd | 0.0673^abd |
| RM1 | 0.2473^abc | 0.1402^ab | 0.0146^ab | 0.0444^ab |
| RM2 | 0.3323^abcd | 0.1992^abd | 0.0207^abcd | 0.0957^abcd |
| Rocchio’s Weights | 0.2604^abcd | 0.1557^abd | 0.0194^abcd | 0.0892^abcd |
| RSV | 0.2604^abcd | 0.1557^abd | 0.0194^abcd | 0.0892^abcd |
| KLD (MLE) | 0.2693^abcd | 0.1264^ab | 0.0197^abcd | 0.1576^abcde |
| KLD (NMLE) | 0.3120^abcd | 0.1546^ab | 0.0201^abcd | 0.1101^abcde |
| CHI-2 (MLE) | 0.0777^a | 0.0709^ab | 0.0149^ab | 0.0939^abcd |
| CHI-2 (NMLE) | 0.3220^abcd | 0.1419^ab | 0.0204^abcd | 0.1459^abcde |

Table: Values of nDCG@10. Pink = best algorithm. Blue = not significantly different to the best (Wilcoxon two-sided p < 0.01). Letters after ^ are the superscript markers carried over from the slide.
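For readers unfamiliar with the metric, a common formulation of nDCG@k is sketched below. The slides do not state which gain and discount variant was used, so the linear gain and the log2 discount here are assumptions, as are the function names and the toy relevance values.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount, ranks from 1."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg_at_k(ranked_items, relevance, k=10):
    """DCG of the top-k recommendations divided by the DCG of the ideal top-k."""
    gains = [relevance.get(i, 0.0) for i in ranked_items[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Held-out ratings of one user (made up) and two candidate rankings.
relevance = {"Matrix": 3, "Alien": 1}
perfect = ndcg_at_k(["Matrix", "Alien", "Titanic"], relevance)
swapped = ndcg_at_k(["Alien", "Matrix", "Titanic"], relevance)
```

A per-user score like this would then be averaged over all test users to obtain a single figure per algorithm and dataset.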
![Page 33: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/33.jpg)
Diversity (Gini@10)
| Algorithm | ML 100k | ML 1M | R3-Yahoo! | LibraryThing |
|---|---|---|---|---|
| UIR-Item | 0.0124 | 0.0050 | 0.0137 | 0.0005 |
| RM2 | 0.0256 | 0.0069 | 0.0207 | 0.0019 |
| CHI-2 NMLE | 0.0450 | 0.0106 | 0.0506 | 0.0539 |

Table: Values of the complement of Gini index at 10. Pink = best algorithm.
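One common way to compute the complement of the Gini index over recommendation lists is sketched below. The slides do not give the exact formula, so this formulation, the function name and the toy lists are assumptions. It assumes every recommended item belongs to the catalogue.

```python
from collections import Counter

def gini_complement(rec_lists, catalogue):
    """1 - Gini of how recommendations are spread over the catalogue.

    1.0 means every item is recommended equally often; 0.0 means all
    recommendations go to a single item.
    """
    counts = Counter(i for recs in rec_lists for i in recs)
    total = sum(counts.values())
    # Share of all recommendations received by each item, in ascending order.
    shares = sorted(counts.get(i, 0) / total for i in catalogue)
    n = len(shares)
    gini = sum((2 * k - n - 1) * p for k, p in enumerate(shares, start=1)) / (n - 1)
    return 1.0 - gini

catalogue = ["Titanic", "Avatar", "Matrix", "Alien"]
even = gini_complement([["Titanic", "Avatar"], ["Matrix", "Alien"]], catalogue)
concentrated = gini_complement([["Matrix"], ["Matrix"], ["Matrix"]], catalogue)
```

Higher values indicate that recommendations are spread more evenly across the catalogue, which is the sense in which the table above reports diversity.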
![Page 34: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/34.jpg)
Novelty (MSI@10)
| Algorithm | ML 100k | ML 1M | R3-Yahoo! | LibraryThing |
|---|---|---|---|---|
| UIR-Item | 5.2337^e | 8.3713^e | 3.7186^e | 17.1229^e |
| RM2 | 6.8273^c | 8.9481^c | 4.9618^c | 19.27343^c |
| CHI-2 NMLE | 8.1711^ec | 10.0043^ec | 7.5555^ec | 8.8563 |

Table: Values of Mean Self-Information at 10. Pink = best algorithm.
![Page 35: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/35.jpg)
Trade-off Accuracy-Diversity
Figure: G-measure of nDCG@10 and Gini@10 on MovieLens 100k, varying the number of neighbours k (200 to 900) using Pearson’s correlation similarity. Series: RM2 and CHI-2 NMLE. The y-axis, G(Gini, nDCG), has ticks from 0.06 to 0.13.
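Reading "G-measure" as the geometric mean of the two metrics (an assumption, although it is consistent with the axis range of the plot), the trade-off can be reproduced from the ML 100k values reported in the accuracy and diversity tables. The neighbourhood size behind those table values is not stated here, so the numbers need not match any particular point of the curve.

```python
import math

def g_measure(a, b):
    """Geometric mean of two non-negative metric values."""
    return math.sqrt(a * b)

# nDCG@10 and Gini@10 on ML 100k, taken from the tables in these slides.
rm2 = g_measure(0.3323, 0.0256)        # roughly 0.092
chi2_nmle = g_measure(0.3220, 0.0450)  # roughly 0.120
```

Because the geometric mean penalises a low value in either metric, a small loss in nDCG can be outweighed by a larger gain in diversity.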
![Page 36: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/36.jpg)
Trade-off Accuracy-Novelty
Figure: G-measure of nDCG@10 and MSI@10 on MovieLens 100k, varying the number of neighbours k (200 to 900) using Pearson’s correlation similarity. Series: RM2 and CHI-2 NMLE. The y-axis, G(MSI, nDCG), has ticks from 0.9 to 2.0.
![Page 37: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/37.jpg)
CONCLUSIONS AND FUTURE WORK
![Page 38: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/38.jpg)
Conclusions
We proposed to use fast PRF methods (Rocchio’s Weights, RSV, KLD and CHI-2):
# They are orders of magnitude faster than the Relevance Models (up to 200x).
# They generate quite accurate recommendations.
# Good novelty and diversity figures with a better trade-off than RM2.
# They have no parameters of their own (only the clustering parameters).
![Page 39: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/39.jpg)
Future Work
Other approaches for computing neighbourhoods:
# Posterior Probability Clustering (a non-negative matrix factorisation).
# Normalised Cut (spectral clustering).
Explore other PRF methods:
# Divergence Minimization Models.
# Mixture Models.
![Page 41: Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation [ECIR '16 Slides]](https://reader030.vdocuments.us/reader030/viewer/2022020213/58a8398c1a28ab30658b4d11/html5/thumbnails/41.jpg)
THANK YOU!
@DVALCARCE
http://www.dc.fi.udc.es/~dvalcarce