mymedialite
DESCRIPTION
Talk at the FOSDEM 2011 Data Analytics Devroom about MyMediaLite. http://fosdem.org/2011/schedule/event/mymedialite MyMediaLite is a lightweight, multi-purpose library of recommender system algorithms written in C#. The presentation gives a short overview of the library, how to use its features from the command line and from C#, Python, and Ruby programs, as well as how to extend the library with new recommender system algorithms.
TRANSCRIPT
MyMediaLite
a lightweight, multi-purpose library of recommender system algorithms
Zeno Gantner
University of Hildesheim
February 5, 2011
Zeno Gantner, University of Hildesheim: MyMediaLite Recommender System Library — http://ismll.de/mymedialite 1 / 16
Introduction
What are Recommender Systems?
MyMediaLite: Recommender System Algorithm Library
functionality
- rating prediction
- item recommendation from implicit feedback
- algorithm testbed
target groups
- recommender system researchers
- educators and students
- application developers
misc info
- written in C#, runs on Mono
- GNU General Public License (GPL)
- regular releases (1 or 2 per month)
why use it?
- simple
- free
- scalable
- well-documented
- choice
Using MyMediaLite
Data Flow
[Diagram with the labeled elements: Recommender, Model, interaction data, user/item attributes, hyperparameters, disk, predictions]
Methods Implemented in MyMediaLite
rating prediction
- averages: global, user, item
- linear baseline method by Koren and Bell
- frequency-weighted Slope One
- k-nearest neighbor (kNN):
  - user or item similarities, different similarity measures
  - collaborative or attribute-/content-based
- (biased) matrix factorization
item prediction from implicit feedback
- random
- most popular item
- linear content-based model optimized for BPR (BPR-Linear)
- support-vector machine using item attributes
- k-nearest neighbor (kNN)
- weighted regularized matrix factorization (WR-MF)
- matrix factorization optimized for BPR (BPR-MF)
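To illustrate the simplest entries in the two lists above, here is a minimal Python sketch of the global/user/item average predictors and of most-popular item ranking. It is not MyMediaLite code: the function names `fit_averages` and `most_popular` and the toy data are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def fit_averages(ratings):
    """Compute global, per-user, and per-item mean ratings.

    ratings: iterable of (user, item, rating) triples (hypothetical layout).
    """
    ratings = list(ratings)
    global_avg = sum(r for _, _, r in ratings) / len(ratings)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for user, item, rating in ratings:
        by_user[user].append(rating)
        by_item[item].append(rating)
    user_avg = {u: sum(v) / len(v) for u, v in by_user.items()}
    item_avg = {i: sum(v) / len(v) for i, v in by_item.items()}
    return global_avg, user_avg, item_avg


def most_popular(feedback, n):
    """Return the n items that occur most often in (user, item) pairs."""
    counts = Counter(item for _, item in feedback)
    return [item for item, _ in counts.most_common(n)]


# Toy data, invented for illustration.
ratings = [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0)]
global_avg, user_avg, item_avg = fit_averages(ratings)
top_items = most_popular([(u, i) for u, i, _ in ratings], n=1)
```

Both methods ignore who is asking, which is why they are mainly useful as baselines to compare the other listed methods against.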
Command-Line Tools
- one for each task: rating prediction, item recommendation
- simple text format: CSV
- pick method and parameters using command-line arguments
- evaluate, store/load models
http://ismll.de/mymedialite/documentation/command_line.html
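The slide only says that the tools read a simple CSV format. As a rough sketch of what consuming such a file could look like, the Python snippet below parses lines under the assumption that each one holds `user,item,rating`. That column layout and the helper name `read_ratings` are assumptions for illustration; the documentation linked above describes the actual format.

```python
import csv
import io


def read_ratings(handle):
    """Parse lines of the assumed form user,item,rating into triples."""
    ratings = []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        user, item, rating = row[0], row[1], float(row[2])
        ratings.append((user, item, rating))
    return ratings


# In-memory sample standing in for a file on disk (invented data).
sample = io.StringIO("1,10,4\n1,11,2\n2,10,5\n")
parsed = read_ratings(sample)
```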
Embedding MyMediaLite: C#

using System;
using MyMediaLite.Data;
using MyMediaLite.Eval;
using MyMediaLite.IO;
using MyMediaLite.ItemRecommendation;

public class Example
{
    public static void Main(string[] args)
    {
        // load the data
        var user_mapping = new EntityMapping();
        var item_mapping = new EntityMapping();
        var training_data = ItemRecommenderData.Read(args[0], user_mapping, item_mapping);
        var relevant_items = item_mapping.InternalIDs;
        var test_data = ItemRecommenderData.Read(args[1], user_mapping, item_mapping);

        // set up the recommender
        var recommender = new MostPopular();
        recommender.SetCollaborativeData(training_data);
        recommender.Train();

        // measure the accuracy on the test data set
        var results = ItemPredictionEval.Evaluate(recommender, test_data, training_data, relevant_items);
        Console.WriteLine("prec@5={0}", results["prec5"]);

        // make a prediction for a certain user and item
        Console.WriteLine(recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1)));
    }
}
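The example above reports prec@5, the precision of the top five recommendations. The Python sketch below shows the textbook definition of precision at k for a single user. How the library treats users without test items, or averages across users, is not shown on the slides, so the function name and the toy data here are assumptions.

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the first k ranked items that are relevant."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k


# Toy ranking for one user (invented data).
ranked = [10, 11, 12, 13, 14, 15]
relevant = {10, 13, 99}
prec5 = precision_at_k(ranked, relevant, 5)
```

Here two of the first five ranked items (10 and 13) are relevant, so the value is 2/5.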
Embedding MyMediaLite: Python
#!/usr/bin/env ipy

import clr
clr.AddReference("MyMediaLite.dll")
from MyMediaLite import *

# load the data
user_mapping = Data.EntityMapping()
item_mapping = Data.EntityMapping()
train_data = IO.ItemRecommenderData.Read("u1.base", user_mapping, item_mapping)
relevant_items = item_mapping.InternalIDs
test_data = IO.ItemRecommenderData.Read("u1.test", user_mapping, item_mapping)

# set up the recommender
recommender = ItemRecommendation.MostPopular()
recommender.SetCollaborativeData(train_data)
recommender.Train()

# measure the accuracy on the test data set
print Eval.ItemPredictionEval.Evaluate(recommender, test_data, train_data, relevant_items)

# make a prediction for a certain user and item
print recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1))
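The examples pass user and item IDs through an entity mapping before calling the recommender. The sketch below shows one plausible way such a mapping could work in plain Python: original IDs are assigned consecutive integers on first sight. Whether MyMediaLite's `EntityMapping` behaves exactly like this is an assumption, and the snake_case method names are hypothetical.

```python
class EntityMapping:
    """Maps arbitrary original IDs to consecutive internal integer IDs."""

    def __init__(self):
        self._to_internal = {}   # original ID -> internal ID
        self._to_original = []   # internal ID -> original ID

    def to_internal_id(self, original_id):
        # Assign the next free integer the first time an ID is seen.
        if original_id not in self._to_internal:
            self._to_internal[original_id] = len(self._to_original)
            self._to_original.append(original_id)
        return self._to_internal[original_id]

    def to_original_id(self, internal_id):
        return self._to_original[internal_id]

    @property
    def internal_ids(self):
        return list(range(len(self._to_original)))


mapping = EntityMapping()
first = mapping.to_internal_id("user-42")
second = mapping.to_internal_id("user-7")
again = mapping.to_internal_id("user-42")  # same ID as the first call
```

Dense integer IDs let a model store its parameters in arrays indexed by user or item, regardless of what the IDs look like in the input files.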
Embedding MyMediaLite: Ruby

#!/usr/bin/env ir

require 'MyMediaLite'

min_rating = 1
max_rating = 5

# load the data
user_mapping = MyMediaLite::Data::EntityMapping.new()
item_mapping = MyMediaLite::Data::EntityMapping.new()

train_data = MyMediaLite::IO::RatingPredictionData.Read("u1.base", min_rating, max_rating, user_mapping, item_mapping)
test_data = MyMediaLite::IO::RatingPredictionData.Read("u1.test", min_rating, max_rating, user_mapping, item_mapping)

# set up the recommender
recommender = MyMediaLite::RatingPrediction::UserItemBaseline.new()
recommender.MinRating = min_rating
recommender.MaxRating = max_rating
recommender.Ratings = train_data
recommender.Train()

# measure the accuracy on the test data set
eval_results = MyMediaLite::Eval::RatingEval::Evaluate(recommender, test_data)
eval_results.each do |entry|
  puts "#{entry}"
end

# make a prediction for a certain user and item
puts recommender.Predict(user_mapping.ToInternalID(1), item_mapping.ToInternalID(1))
Extending MyMediaLite
Roll Your Own Recommendation Method
It’s easy.
for basic functionality
- define model data structures
- write Train() method
- write Predict() method
That’s all!
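The three steps above can be sketched in plain Python with a toy rating predictor that returns each item's mean training rating. This is only an illustration of the define/train/predict pattern: the class `ItemAverage` and its method signatures are hypothetical and do not reproduce the MyMediaLite base classes or interfaces.

```python
class ItemAverage:
    """Toy rating predictor: predicts each item's mean training rating."""

    def __init__(self):
        # Step 1: model data structures.
        self.global_average = 0.0
        self.item_average = {}

    def train(self, ratings):
        # Step 2: fit the model from (user, item, rating) triples.
        sums, counts = {}, {}
        total, n = 0.0, 0
        for _, item, rating in ratings:
            sums[item] = sums.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1
            total += rating
            n += 1
        self.global_average = total / n if n else 0.0
        self.item_average = {i: sums[i] / counts[i] for i in sums}

    def predict(self, user, item):
        # Step 3: fall back to the global average for unseen items.
        return self.item_average.get(item, self.global_average)


# Toy training data (invented).
model = ItemAverage()
model.train([(1, 10, 4.0), (2, 10, 5.0), (1, 11, 2.0)])
known = model.predict(3, 10)
unknown = model.predict(3, 99)
```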
Roll Your Own: Define Model Data Structures
Roll Your Own: Write Train() Method
Roll Your Own: Write Predict() Method
Roll Your Own Recommendation Method
It’s easy.
You do not need to worry about adding the new method to the command-line tools; reflection takes care of that.
advanced functionality
- CanPredict() method
- load/store models
- on-line updates
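As a rough idea of what the advanced features listed above involve, the Python sketch below adds a can-predict check, model persistence, and an incremental update to a toy item-average model. The class `OnlineItemAverage`, its method names, and the JSON file format are all assumptions for illustration; MyMediaLite's own interfaces and model file format are not shown on the slides.

```python
import json
import os
import tempfile


class OnlineItemAverage:
    """Toy model that keeps running sums so it can be updated on-line."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def add_rating(self, user, item, rating):
        # On-line update: adjust the running sums without retraining.
        self.sums[item] = self.sums.get(item, 0.0) + rating
        self.counts[item] = self.counts.get(item, 0) + 1

    def can_predict(self, user, item):
        # A prediction is only possible for items seen at least once.
        return item in self.counts

    def predict(self, user, item):
        return self.sums[item] / self.counts[item]

    def save(self, path):
        # JSON object keys are strings, so this sketch assumes string item IDs.
        with open(path, "w") as f:
            json.dump({"sums": self.sums, "counts": self.counts}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            data = json.load(f)
        model = cls()
        model.sums = data["sums"]
        model.counts = data["counts"]
        return model


model = OnlineItemAverage()
model.add_rating("u1", "i10", 4.0)
model.add_rating("u2", "i10", 5.0)
path = os.path.join(tempfile.mkdtemp(), "model.json")
model.save(path)
restored = OnlineItemAverage.load(path)
```

Storing sums and counts rather than the averages themselves is what makes the incremental update a constant-time operation.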
Conclusion
MyMediaLite
future work
- more methods (contributions welcome ...)
- additional scenarios: context-aware recommendation, tags, ...
- distributed/parallel computing
Methods now shipped with MyMediaLite were used in the MyMedia field trials (>50,000 users).
acknowledgements
- authors: Zeno Gantner, Steffen Rendle, Christoph Freudenthaler
- funding by EC FP7 project “Dynamic Personalization of Multimedia” (MyMedia) under grant agreement no. 215006.
- feedback, patches, suggestions: Thorsten Angermann, Fu Changhong, Andreas Hoffmann, Artus Krohn-Grimberghe, Christina Lichtenthaler, Damir Logar, Thai-Nghe Nguyen
MyMediaLite
homepage: http://ismll.de/mymedialite
fork it: http://gitorious.org/mymedialite
follow us: http://twitter.com/mymedialite
send feedback/patches: [email protected]
MyMediaLite: simple — free — scalable — well-documented