movie recommendation system jon c. hammer 04/30/2015


Movie Recommendation System

Jon C. Hammer

04/30/2015

Outline

• Introduction

• Background

• Architecture

• Implementation

• Lessons Learned

• Conclusion

Introduction

• Goal:

• Build a complete movie recommendation system.

• Requirements:

• User can view personalized recommendations

• User can input new ratings

• Interactive

ALS (Alternating Least Squares)

• Collaborative Filtering technique

• Alternative to user/item based recommendation

• Uses Matrix Factorization

• Split ratings matrix (u x m) into U (u x d) and M (d x m)

• u = Number of users

• m = Number of movies

• d = Number of intrinsic dimensions (our choice)

• Alternate between optimizing U and M

• Gradient Descent

• Least Squares method
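The alternating scheme above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Mahout implementation used in the project: it takes the least-squares route (not gradient descent), and the function name `als`, the regularization weight `reg`, and the toy ratings matrix are all assumptions made for the example. A zero in `R` stands for "not rated".

```python
import numpy as np

def als(R, mask, d=2, reg=0.1, iters=20, seed=0):
    """Factor R (u x m) into U (u x d) and M (d x m), fitting observed entries only."""
    rng = np.random.default_rng(seed)
    u, m = R.shape
    U = rng.normal(scale=0.1, size=(u, d))
    M = rng.normal(scale=0.1, size=(d, m))
    ridge = reg * np.eye(d)  # assumed L2 regularization, keeps each solve well-posed
    for _ in range(iters):
        # Hold M fixed: each user row is a small regularized least-squares problem.
        for i in range(u):
            cols = mask[i]                      # movies rated by user i
            if cols.any():
                Mi = M[:, cols]                 # d x k
                U[i] = np.linalg.solve(Mi @ Mi.T + ridge, Mi @ R[i, cols])
        # Hold U fixed: each movie column is solved the same way.
        for j in range(m):
            rows = mask[:, j]                   # users who rated movie j
            if rows.any():
                Uj = U[rows]                    # k x d
                M[:, j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[rows, j])
    return U, M

# Toy 4-user x 4-movie matrix; 0 marks a missing rating.
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)
mask = R > 0
U, M = als(R, mask, d=2, iters=50)
predicted = U @ M  # includes estimates for the unrated cells
```

The product `U @ M` fills in every cell of the matrix, which is what makes the factorization usable for recommendation: the unrated cells become predicted ratings.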

Architecture

• Key components:

• Recommender

• Database

• Web server

• Client application

[Figure: architecture diagram showing the Client Application, Web Server, Recommender, and Database]

Implementation

• Platforms

• Servers hosted on AWS

• t2.micro and m3.xlarge EC2 instances

• Ubuntu 14.04 LTS

• Additional Software:

• Hadoop

• Mahout

• MySQL

• Android application

• Languages

• Python

• Bash

• Java

Recommender

• Dataset

• MovieLens

• 20 million ratings

• 27,000 movies

• 138,000 users

• Mahout

• Given ratings matrix, produces N recommendations per user

• Uses ALS algorithm

• Recommendations are used by web server

• Recomputed when new ratings are provided by user
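To make "N recommendations per user" concrete, here is a small sketch of how a top-N list can be read off the factor matrices U and M from the ALS slide. It is not Mahout's code; the function name `top_n`, the `rated_mask` argument, and the toy factors are assumptions for illustration.

```python
import numpy as np

def top_n(U, M, rated_mask, user, n=3):
    """Return indices of the n best-scoring movies the user has not rated yet."""
    scores = U[user] @ M                                   # predicted rating per movie
    scores = np.where(rated_mask[user], -np.inf, scores)   # exclude already-rated movies
    order = np.argsort(-scores)                            # best first
    return [int(j) for j in order if np.isfinite(scores[j])][:n]

# Toy factors: 2 users, 4 movies, d = 2.
U = np.array([[1.0, 0.0],
              [0.0, 1.0]])
M = np.array([[5.0, 1.0, 3.0, 2.0],
              [1.0, 5.0, 2.0, 4.0]])
rated_mask = np.array([[True, False, False, False],
                       [False, True, False, False]])

recs_user0 = top_n(U, M, rated_mask, user=0, n=2)
```

Because the list depends only on U, M, and the set of rated movies, it has to be rebuilt whenever a user submits new ratings, which matches the recompute step noted above.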

Database

• User Table

• Login & customer information

• Movie Table

• Movie name, year, IMDb link, and poster

• Most information already provided in the dataset

• Posters scraped from IMDb in advance for use by client applications

• MySQL implementation
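A rough sketch of what the two tables might look like is given below. The slides do not show the actual schema, so every table and column name here is hypothetical, and the example uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a database server. The rows are placeholder data, not entries from MovieLens.

```python
import sqlite3

# In-memory SQLite database standing in for the project's MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    username      TEXT UNIQUE NOT NULL,   -- login information
    password_hash TEXT NOT NULL
);
CREATE TABLE movies (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    year        INTEGER,
    imdb_url    TEXT,                      -- link to the movie's IMDb page
    poster_path TEXT                       -- location of the pre-scraped poster
);
""")

# Placeholder rows for illustration only.
conn.executemany(
    "INSERT INTO movies (id, title, year, imdb_url, poster_path) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Example Movie", 2000, "https://example.org/title/1", "posters/1.jpg"),
        (2, "Another Film", 2005, "https://example.org/title/2", "posters/2.jpg"),
    ],
)

def search_movies(conn, text):
    """Return (id, title, year) for movies whose title contains the given text."""
    cursor = conn.execute(
        "SELECT id, title, year FROM movies WHERE title LIKE ? ORDER BY title",
        (f"%{text}%",),
    )
    return cursor.fetchall()

matches = search_movies(conn, "example")
```

The parameterized `LIKE` query is the kind of lookup a movie search feature would need; passing the text as a bound parameter avoids building SQL from user input.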

Web Server

• Interface between clients and recommendation engine / database

• Written in Python

• Twisted, Klein, MySQLdb modules

• Communication via HTTP GET, HTTP POST, and JSON

• Returns most recent recommendations

• Interactive database queries
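The request handling described above might look roughly like the following. This sketch uses only the standard library and leaves out the Klein/Twisted routing and the MySQLdb calls, so the handlers are plain functions that a route could call. The JSON field names (`user`, `movie`, `rating`, `movies`), the function names, and the in-memory dictionaries are assumptions, not the project's actual protocol.

```python
import json

# Hypothetical in-memory stand-ins for the recommender output and the database.
RECOMMENDATIONS = {1: [10, 20, 30]}   # user id -> most recent recommended movie ids
RATINGS = []                          # ratings received from clients

def handle_get_recommendations(user_id):
    """Body of a GET response: the user's most recent recommendations as JSON."""
    movies = RECOMMENDATIONS.get(user_id, [])
    return json.dumps({"user": user_id, "movies": movies})

def handle_post_rating(body):
    """Handle a POST body carrying one rating; return (status code, JSON body)."""
    try:
        data = json.loads(body)
        rating = {
            "user": int(data["user"]),
            "movie": int(data["movie"]),
            "rating": float(data["rating"]),
        }
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, or wrong types.
        return 400, json.dumps({"error": "bad request"})
    RATINGS.append(rating)
    return 200, json.dumps({"status": "ok"})

response = handle_get_recommendations(1)
status, reply = handle_post_rating('{"user": 1, "movie": 10, "rating": 4.5}')
```

Keeping the handlers independent of the web framework makes them easy to exercise without starting a server; in a real deployment a stored rating would also trigger the recommender to recompute.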

Client Application

• Features

• Login system

• Ability to create new accounts on the fly

• View personalized recommendations

• Search database for movies

• Enter new ratings

• Written in Java for Android

Client Application

Lessons Learned

• Working with AWS

• Configuring & launching instances

• Creating images

• VPC & Security

• Hadoop / Mahout

• Installation & configuration

• HDFS

• General

• Making / responding to web requests in both Java & Python

• Website scraping

References

• Zhou, Yunhong, et al. "Large-scale parallel collaborative filtering for the Netflix Prize." Algorithmic Aspects in Information and Management. Springer Berlin Heidelberg, 2008. 337-348.

• Mahout. https://mahout.apache.org/

• Hadoop. https://hadoop.apache.org/

• MovieLens dataset. http://grouplens.org/datasets/movielens/

Questions?