Predicting Preference Tags to Improve Item Recommendation Tanwistha Saha, Huzefa Rangwala, Carlotta Domeniconi
Department of Computer Science
George Mason University, VA USA
• User preferences and item descriptors are represented by the tags (short text snippets) users choose while rating items
• Tags can be seen as “multi-labels” that are descriptive of both the user and the item
• Collaborative Filtering (CF) based Recommender Systems [1] improve their performance by integrating these tags as user preferences and item descriptors in the system
• Collective Classification [2] on the implicit user-user network and item-item network can be used to predict tags for users/items that don’t have any
• Two users tagging/rating at least one common item are connected in the user-user network
• Two items rated by at least one user are connected in the item-item network
• Active Learning [3] methods on relational networks are used to improve the collective classifier’s performance on tag prediction when many users/items lack tag information
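As a concrete illustration of the two bullets above, the implicit user-user and item-item networks can be derived directly from the rating matrix. The sketch below is ours (function and variable names are illustrative, not from the poster):

```python
from collections import defaultdict

def build_implicit_networks(ratings):
    """ratings: iterable of (user, item) pairs from the rating matrix.
    Returns the edge sets of the implicit user-user and item-item networks."""
    items_of_user = defaultdict(set)
    users_of_item = defaultdict(set)
    for u, i in ratings:
        items_of_user[u].add(i)
        users_of_item[i].add(u)

    # Two users are connected if they rated/tagged at least one common item.
    user_edges = set()
    for users in users_of_item.values():
        ordered = sorted(users)
        for a in range(len(ordered)):
            for b in range(a + 1, len(ordered)):
                user_edges.add((ordered[a], ordered[b]))

    # Two items are connected if at least one user rated both.
    item_edges = set()
    for items in items_of_user.values():
        ordered = sorted(items)
        for a in range(len(ordered)):
            for b in range(a + 1, len(ordered)):
                item_edges.add((ordered[a], ordered[b]))
    return user_edges, item_edges
```

Collective classification then propagates tag labels over these edges to users/items that have none.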
BACKGROUND
OBJECTIVE AND SETUP
Latent Factor Models with User and Item Tags
• Factorization Machines (FM) [4] with user and item tags (User-Item-Tag-FM)
• User-Tag-FM (UT-FM): uses predicted tags from collective classification for users only
• Item-Tag-FM (IT-FM): uses predicted tags from collective classification for items only
Neighborhood-based Models with User and Item Tags
• User-Item-Tag-KNN (UIT-KNN): the rating of an item by a user is computed from the tag-based similarity between the user and the item
• User-Tag-KNN (UT-KNN): the predicted rating of an item is a weighted average of the ratings given by the “nearest neighbors” of the user, computed using tag-based similarity
• Item-Tag-KNN (IT-KNN): the predicted rating of an item is a weighted average of the ratings of the “nearest neighbors” of the item, computed using tag-based similarity
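The poster does not spell out the tag-based similarity itself; a common choice, shown here purely as an assumption, is cosine similarity between binary tag sets:

```python
import math

def tag_similarity(tags_a, tags_b):
    """Cosine similarity between two binary tag sets (assumed representation:
    each user/item is the set of tags attached to it)."""
    a, b = set(tags_a), set(tags_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))
```

Any other set similarity (e.g. Jaccard) would slot into the neighborhood models the same way.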
Active Learning for Tag Prediction using Collective Classification
Active learning selectively queries informative users/items for their preference tags. We use the FLIP algorithm [5] to choose which users/items to query. Among the nodes z (in the implicit user-user or item-item network) for which we *do not* have tag information, those with the highest FLIP scores S[z] are queried.
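The FLIP score rewards nodes whose predicted tags keep changing across collective-classification iterations, i.e. nodes the classifier is unsure about. A minimal sketch (our own names; `tag_history` holds the node’s predicted tag-probability vector at each iteration):

```python
def flip_score(tag_history):
    """S[z]: sum over iterations of the absolute change in the node's
    predicted tag probabilities between consecutive iterations."""
    score = 0.0
    for prev, curr in zip(tag_history, tag_history[1:]):
        score += sum(abs(c - p) for p, c in zip(prev, curr))
    return score

def select_queries(histories, budget):
    """Pick the `budget` untagged nodes with the highest FLIP scores.
    histories: {node z: list of per-iteration tag-probability vectors}."""
    ranked = sorted(histories, key=lambda z: flip_score(histories[z]), reverse=True)
    return ranked[:budget]
```

Queried nodes are labeled by an oracle and added to the training set, after which the collective classifier is retrained.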
METHODOLOGY
1. X. Su & T. M. Khoshgoftaar. “A survey of collaborative filtering techniques”, Advances in Artificial Intelligence, 2009.
2. S. A. Macskassy & F. Provost. “Classification in networked data: A toolkit and a univariate case study”, Journal of Machine Learning Research, 2007.
3. Burr Settles. “Active Learning Literature Survey”, Technical Report, University of Wisconsin-Madison, 2010.
4. S. Rendle. “Factorization Machines with LIBFM”, ACM Transactions on Intelligent Systems and Technology, TIST 2012.
5. T. Saha, H. Rangwala & C. Domeniconi. “FLIP: Active Learning for Relational Network Classification”, European Conference on Machine Learning, ECML 2014.
We propose:
• Using collective classification to predict user preference tags and item descriptive tags as side information in state-of-the-art collaborative filtering recommender systems
• Using active learning to incrementally add informative samples to the training set of the collective classifier
Our results on several real world relational network datasets show that using collective classification to predict preference tags in tag-based recommender systems is more effective for datasets with high rating density.
The experiments were run on Argo Research Cluster http://orc.gmu.edu/
CONCLUSIONS
REFERENCES
Rating prediction with User-Item-Tag-FM:

$$f(x) = w_0 + w_u + w_i + \sum_{s=1}^{|P|}\hat{t}_{us}w_s + \sum_{s=1}^{|P|}\hat{t}_{us}\langle v_i, v_s\rangle + \sum_{s=1}^{|P|}\sum_{s'>s}^{|P|}\hat{t}_{us}\hat{t}_{us'}\langle v_s, v_{s'}\rangle + \sum_{q=1}^{|P|}\sum_{q'>q}^{|P|}\hat{t}_{iq}\hat{t}_{iq'}\langle v_q, v_{q'}\rangle + \sum_{q=1}^{|P|}\hat{t}_{iq}w_q + \sum_{q=1}^{|P|}\hat{t}_{iq}\langle v_u, v_q\rangle + \sum_{f=1}^{F}v_{uf}v_{if}$$

where $f(x)$ is the predicted rating and $\hat{t}$ is the predicted tag for a user/item.
Neighborhood-based rating prediction:

$$\hat{R}_{u,i} = \frac{\sum_{j \in N_i} S_{i,j}\, R_{u,j}}{\sum_{j \in N_i} S_{i,j}}$$
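This similarity-weighted average can be sketched directly (illustrative names; neighbors $N_i$ and tag-based similarities $S_{i,j}$ are assumed precomputed):

```python
def predict_rating(neighbor_ratings, sims):
    """neighbor_ratings: {neighbor j in N_i: the user's rating R_{u,j}}
    sims: {neighbor j: tag-based similarity S_{i,j} to the target item}.
    Returns the similarity-weighted average rating."""
    num = sum(sims[j] * r for j, r in neighbor_ratings.items())
    den = sum(sims[j] for j in neighbor_ratings)
    return num / den if den else 0.0
```

The same formula serves UT-KNN with user neighbors and user-user similarities in place of item ones.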
RESULTS WITH ACTIVE LEARNING
• Hamming loss for item tag prediction decreases faster with increasing training data under active learning (FLIP) than under random selection
• On the LibraryThing dataset, active learning (FLIP) achieves with only 65% labeled data the same RMSE that the traditional setting obtains with 80% labeled data
$$S[z] = \sum_{j=1}^{\mathrm{maxiter}} \sum_{p=1}^{|P|} \left|\hat{t}^{\,j}_{zp} - \hat{t}^{\,j-1}_{zp}\right|$$
Contact (any query/comments): tsaha@masonlive.gmu.edu rangwala@cs.gmu.edu carlotta@cs.gmu.edu
RESULTS WITH COLLABORATIVE FILTERING
Datasets for the experiments are publicly available at: http://www.cs.gmu.edu/~mlbio/TagRecSys/
Mean Absolute Error (MAE) for the tag-based latent factor models is lower than that of the baseline model (FM), which does not use tags. All scores are averages over 5 independent runs. Statistical significance is assessed by paired t-tests at significance level 0.10 (p-value < 0.1).
MAE for tag-based methods:

Datasets     | FM              | IT-FM           | IT-FM-G         | UIT-FM          | UIT-FM-G
TripAdvisor  | 0.8704 ± 0.0077 | 0.8480 ± 0.0043 | 0.8310 ± 0.0047 | 0.8429 ± 0.0044 | 0.8312 ± 0.0063
MovieLens    | 0.5741 ± 0.0013 | 0.5706 ± 0.0016 | 0.5674 ± 0.0015 | 0.5751 ± 0.0019 | 0.5724 ± 0.0011
LibraryThing | 0.6128 ± 0.0014 | 0.6045 ± 0.0015 | 0.5959 ± 0.0012 | 0.6085 ± 0.0019 | 0.5985 ± 0.0012

p-values for MAE of tag-based methods w.r.t. the baseline (FM):

Dataset      | UIT-FM vs FM | UIT-FM-G vs FM | IT-FM vs FM | IT-FM-G vs FM
TripAdvisor  | 0            | 0              | 0           | 0
MovieLens    | 0.7645       | 0.4202         | 0.1283      | 0
LibraryThing | 0.1059       | 0              | 0           | 0
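The p-values above come from paired t-tests over the per-run MAE of each tag-based method versus FM. A minimal sketch of the underlying statistic (in practice a library routine such as scipy.stats.ttest_rel would also supply the p-value):

```python
import math
from statistics import mean, stdev

def paired_t_statistic(errors_a, errors_b):
    """Paired t-statistic over per-run error differences:
    t = mean(d) / (stdev(d) / sqrt(n)), where d_k = errors_a[k] - errors_b[k].
    Requires at least 2 runs and non-constant differences."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))
```

With 5 runs per method (n = 5, so 4 degrees of freedom), the resulting t-statistic is compared against the t-distribution to obtain the reported p-values.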
[Figure: four example users tagging one item with preference tags, e.g. “Inspiring, feel-good”, “Feel-good, drama”, “Crime drama, 90s”, “Prison movie”]
[Figure: a user applies tags (e.g. Fiction, Comedy, Horror, Thriller) to items]
[Figure: system overview. All users in the system and the user-item rating matrix yield the implicit user-user and item-item networks; collective classification predicts tags for users/items; the predicted tags feed user-user / item-item CF or Factorization Machines, producing the top-N predicted items and the predicted rating of an item]