Movie2Books by Sumin Tang


Post on 05-Aug-2015

Recommending Books from your favorite Movie

Sumin Tang

Movie2Books.com

Data Sources and Processing Flow

20M Reviews Genres

2800 Movies & 1000 Books with 20+ reviews & no missing attributes

Similarity Scores

Book Recommendations for Each Movie

Book cover images
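The filtering step in the flow above (keep items with 20+ reviews and no missing attribute) could look roughly like the sketch below. The record layout (`item_id`, `title`, `genres`) and the function name `filter_items` are assumptions made for illustration; the slides do not show the actual schema or code.

```python
from collections import Counter

def filter_items(items, reviews, min_reviews=20):
    """Keep items with at least `min_reviews` reviews and no empty attribute.

    `items` is a list of dicts and `reviews` a list of dicts carrying an
    "item_id" key. Both layouts are hypothetical.
    """
    counts = Counter(r["item_id"] for r in reviews)
    kept = []
    for item in items:
        # An attribute counts as missing if it is None, an empty string, or an empty list.
        complete = all(v not in (None, "", []) for v in item.values())
        if counts[item["item_id"]] >= min_reviews and complete:
            kept.append(item)
    return kept

# Toy data: only "m1" has enough reviews and a full set of attributes.
items = [
    {"item_id": "m1", "title": "A", "genres": ["Drama"]},
    {"item_id": "m2", "title": "B", "genres": []},          # missing genres
    {"item_id": "m3", "title": "C", "genres": ["Comedy"]},  # too few reviews
]
reviews = (
    [{"item_id": "m1"}] * 20 + [{"item_id": "m2"}] * 25 + [{"item_id": "m3"}] * 3
)
kept = filter_items(items, reviews)
```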

Collaborative filtering using user rating scores?

Unfortunately the data is too sparse...

Poor performance even after SVD

80% of movie-book pairs have no users in common
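One way to measure this kind of sparsity is to count, over all movie-book pairs, how many share no reviewer. The sketch below assumes each item is represented by the set of user IDs that rated it; the function name and the toy data are illustrative, not taken from the slides.

```python
from itertools import product

def zero_overlap_fraction(movie_users, book_users):
    """Fraction of (movie, book) pairs whose user sets do not intersect."""
    pairs = list(product(movie_users, book_users))
    empty = sum(1 for m, b in pairs if not (movie_users[m] & book_users[b]))
    return empty / len(pairs)

# Toy data: only the pair (m1, b1) shares a user.
movie_users = {"m1": {"u1", "u2"}, "m2": {"u3"}}
book_users = {"b1": {"u2"}, "b2": {"u4"}}
fraction = zero_overlap_fraction(movie_users, book_users)
```

With 2800 movies and 1000 books this is 2.8 million pairs, so a full enumeration is still feasible in memory.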

Similarity Metrics for Movie-Book Pairs

Review Text: Cosine Similarity (C)

Genres: Jaccard Similarity (J)

Final Similarity Score
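The two component metrics can be sketched as follows. The transcript does not preserve how review text was vectorized or how C and J are combined into the final score, so this sketch uses plain word counts as an assumed stand-in for the text representation and stops at the two components.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between word-count vectors of two texts.

    Plain whitespace tokenization and raw counts are assumptions; the
    original may have used a different weighting.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def jaccard_similarity(set_a, set_b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

# Toy movie-book pair.
c = cosine_similarity("a wizard school adventure", "a young wizard adventure")
j = jaccard_similarity({"Fantasy", "Adventure"}, {"Fantasy", "Drama"})
```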

Validation

Users liked movie A

Users liked book B

Users liked both

• Based on rating scores from users who rated both movies and books

• For each movie, calculate the Jaccard index between the movie and:
  – Jrec: the recommended books
  – Jbase: all the books

• Median(Jrec/Jbase) = 26: people are 26x more likely to like a Movie2Books recommendation than the random baseline
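The validation above can be sketched as below. Each movie and book is represented by the set of users who liked it. How "liked" was derived from rating scores, and how the per-book Jaccard values were aggregated into Jrec and Jbase, are not stated in the slides; averaging over books is an assumption, and all names are hypothetical.

```python
from statistics import mean, median

def jaccard(a, b):
    """Users who liked both, divided by users who liked either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def lift_for_movie(movie_fans, book_fans, recommended):
    """Return Jrec / Jbase for one movie, or None if Jbase is zero."""
    j_rec = mean(jaccard(movie_fans, book_fans[b]) for b in recommended)
    j_base = mean(jaccard(movie_fans, fans) for fans in book_fans.values())
    return j_rec / j_base if j_base else None

# Toy data: sets of users who liked each book.
book_fans = {"b1": {"u1", "u2"}, "b2": {"u9"}, "b3": {"u8"}, "b4": {"u1"}}
# For each movie: (users who liked it, books recommended for it).
movies = {
    "m1": ({"u1", "u2", "u3"}, ["b1"]),
    "m2": ({"u9"}, ["b2"]),
}

lifts = [lift_for_movie(fans, book_fans, recs) for fans, recs in movies.values()]
median_lift = median(x for x in lifts if x is not None)
```

A median lift above 1 means the recommended books overlap with the movie's fans more than an average book does.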

Sumin Tang
https://www.linkedin.com/in/sumintang

Out of 20 million reviews from 3.7 million users, about half of the reviews were provided by 10% of the users.

Books: Top 10% users

Movies: Top 10% users
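A concentration figure such as "half of the reviews come from 10% of the users" can be computed from a flat list of reviewer IDs, one entry per review. This is a minimal sketch under that assumed input format; `top_user_share` is a hypothetical helper.

```python
from collections import Counter

def top_user_share(review_user_ids, top_fraction=0.10):
    """Share of all reviews written by the most active `top_fraction` of users."""
    counts = sorted(Counter(review_user_ids).values(), reverse=True)
    n_top = max(1, int(len(counts) * top_fraction))  # at least one user
    return sum(counts[:n_top]) / sum(counts)

# Toy data: 10 users, one of whom wrote 9 of the 18 reviews.
review_user_ids = ["u0"] * 9 + [f"u{i}" for i in range(1, 10)]
share = top_user_share(review_user_ids)
```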

Some fun stuff…

Is this a highly rated movie at Amazon?

[Figure: rating scale from "Don’t like it" to "Really like it"]

[Figure panels: Ratings of the Movie, Ratings of All Movies, Re-scaled scores]

Most vs Least Reviewed Items

• Both have very skewed rating distributions, with a mode of 5

• The most reviewed items have a higher fraction of 5s: popular products are indeed better liked.

Books

Movies

Most vs Least Active Users

The least active users give more bad ratings (score = 1): are they more likely to write a review when they really dislike the product?

Books

Movies