Comparing Topic Models for a Movie Recommendation System (WEBIST 2014)
Presentation at the WEBIST Conference 2014, Barcelona, Spain
Sonia Bergamaschi, Laura Po and Serena Sorrentino
Department of Engineering “Enzo Ferrari”, University of
Modena and Reggio Emilia, Italy
Comparing Topic Models for a Movie Recommendation
System
Recommendation systems
Their performance suffers greatly when little information about the
users' preferences is given
movie plots
without knowing any user
preferences
Topic Models
Local database
movie selected
by the user
NO personal
information
NO user
preferences
Sources: Internet Movie Database, Open Movie Database
Entities: Movie, Person, Cast & Crew
Collections and their sizes:
IMDB Movie Collection: 1,861,736
IMDB Personality Collection: 3,165,235
TMDB Film Collection: 20,861
IMDB Cast Collection: 24,662,392
TMDB Person Collection: 234,986
TMDB Production Collection: 225,494
English DBpedia Movie Collection: 164,508
English DBpedia Crew Collection: 6,102
German DBpedia Movie Collection: 164,508
German DBpedia Crew Collection: 866
English DBpedia Actor Collection: 6,151
German DBpedia Actor Collection: 1,039
1. Plot Vectorization
2. Weights Computation
3. Matrix Reduction by using Topic Models
4. Movie Similarity Computation
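As a rough illustration of step 4, the sketch below ranks movies by the similarity of their reduced topic vectors. The slides do not name the similarity measure here, so cosine similarity is an assumption; the function names and the three toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(target, topic_vectors, top_n=6):
    """Rank every movie except `target` by similarity to it, best first."""
    scores = [
        (title, cosine(topic_vectors[target], vec))
        for title, vec in topic_vectors.items()
        if title != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]

# Hypothetical reduced vectors (one weight per topic), made up for illustration.
topic_vectors = {
    "movie_a": [0.9, 0.1, 0.0],
    "movie_b": [0.8, 0.2, 0.1],
    "movie_c": [0.0, 0.1, 0.9],
}
ranking = most_similar("movie_a", topic_vectors, top_n=2)
print(ranking)
```

Only the movie selected by the user is needed as input, which matches the setting of no personal information and no user preferences.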
Plot-by-keyword matrix: rows are plots (plot a, plot b, plot c), columns are keywords (keyword1, keyword2, …)
The entry $w_{b,2}$ is the weight of keyword 2 according to plot b
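One possible way to fill such a plot-by-keyword matrix is sketched below. The weighting scheme is not stated at this point in the slides, so plain tf-idf is used purely as an assumption; `tfidf_matrix` and the three toy plots are hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(plots):
    """Return (keywords, matrix); matrix[i][j] is the weight of keyword j in plot i."""
    tokenized = [plot.lower().split() for plot in plots]
    keywords = sorted({word for tokens in tokenized for word in tokens})
    n_plots = len(plots)
    # Document frequency: number of plots in which each keyword occurs.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for an empty plot
        row = [
            (counts[word] / total) * math.log(n_plots / doc_freq[word])
            for word in keywords
        ]
        matrix.append(row)
    return keywords, matrix

# Toy plots, made up for illustration.
plots = [
    "wizard school magic",
    "wizard magic war",
    "space war robot",
]
keywords, weights = tfidf_matrix(plots)
for plot, row in zip(plots, weights):
    print(plot, [round(w, 3) for w in row])
```

A keyword that appears in few plots (such as "school" here) receives a higher weight than one shared by several plots (such as "wizard").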
lower
add movies without re-computing
find similar movies
LSA
$T = \mathit{DT} \times S \times K$
T: Document by Keyword Matrix (d x k)
DT: Document by Topic Matrix (d x z)
S: Topic by Topic Matrix (z x z)
K: Topic by Keyword Matrix (z x k)
LDA
$P(k|d) = P(z|d) \times P(k|z)$
P(k|d): Document distribution over Keywords (d x k)
P(z|d): Document distribution over Topics (d x z)
P(k|z): Topic distribution over Keywords (z x k)
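The LSA factorization can be sketched with a truncated singular value decomposition. This is a minimal illustration using NumPy on a made-up 4 x 5 matrix, not the authors' implementation. The `fold_in` helper shows one standard way (folding-in) in which movies might be added without re-computing the decomposition; that this is the technique meant by the slides is an assumption.

```python
import numpy as np

def lsa(T, z):
    """Truncated SVD: T (d x k) is approximated by DT (d x z) @ S (z x z) @ K (z x k)."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    return U[:, :z], np.diag(s[:z]), Vt[:z, :]

def fold_in(new_row, S, K):
    """Project a new keyword vector (length k) into the existing topic space."""
    return new_row @ K.T @ np.linalg.inv(S)

# Toy plot-by-keyword weights, made up for illustration (d = 4 plots, k = 5 keywords).
T = np.array([
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.5, 1.0, 1.0],
])
DT, S, K = lsa(T, z=2)
print(DT.shape, S.shape, K.shape)  # (4, 2) (2, 2) (2, 5)

# Folding in a plot that was already in T reproduces its row of DT.
print(np.allclose(fold_in(T[0], S, K), DT[0]))
```

Each row of DT is the reduced representation of one plot; similarity between movies is then computed on these z-dimensional rows rather than on the full keyword vectors.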
204,000 plots x 220,000 keywords reduced to 204,000 plots x 500 topics
204,000 plots x 220,000 keywords reduced to 204,000 plots x 50 topics
A test on the IMDb database: of about 1.8 million multimedia entries,
only 204,000 have a plot available.
LSA allows selecting plots that are more closely related to the themes
of the target's plot
Off-line tests
• 20 users
• 18 movies
• the top 6 recommendations from both LSA and LDA
• 594 evaluations collected
LDA does not perform well on movie recommendations: it is not able to
suggest movies of the same saga, and it suggests erroneous entries for
movies that have short plots
LSA achieves good performance on movie recommendations: it is able to
suggest movies of the same saga and also unknown movies related to the
target one
• 30 users
• 18 movies
• the top 6 recommendations from both LDA and IMDb
• 146 evaluations collected