Post on 24-Feb-2016


Page 1: Brett  Boge CS  765 University of Nevada, Reno

A Hybrid Recommender System Using Link Analysis and Genetic Tuning in the Bipartite Network of BoardGameGeek.com

Brett Boge
CS 765
University of Nevada, Reno

Page 2

Recap

Data

General Approach

Step 1: Link-analysis

Step 2: Content-based Cascade

Step 3: Genetic tuning

Page 4

Data (Overview)

• Users: 400,000+
• Games: 55,000+
• Ratings: 0–3,000 each

Page 5

Data (Scope)

• Starting with the top 5,000 games

• User list: those who have rated at least one of the top 5,000 games

• Users with no ratings cannot be connected to any component of the graph, and can only be evaluated in the most general sense
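
The scoping rule above can be sketched in a few lines of Python. This is only an illustration: the `(user, game_id, rating)` tuple layout and the function name `in_scope_users` are assumptions, not something the slides specify.

```python
def in_scope_users(ratings, top_game_ids):
    """Return the users who rated at least one game in the top-games list.

    `ratings` is assumed to be an iterable of (user, game_id, rating) tuples;
    that layout is hypothetical and chosen only for this sketch.
    """
    top = set(top_game_ids)
    return {user for user, game_id, _rating in ratings if game_id in top}


sample_ratings = [
    ("alice", 40692, 8.0),   # rated a game inside the top list
    ("bob", 999999, 6.5),    # rated only a game outside the top list
    ("carol", 40692, 7.0),
    ("carol", 999999, 5.0),
]
users = in_scope_users(sample_ratings, top_game_ids=[40692])
```

Users who never appear in `ratings` are naturally absent from the result, which matches the point that they cannot be attached to the graph.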

Page 6

Data (Retrieval)

• Data will be obtained through the BGG XML API2

• Game | Small World, id 40692
  http://boardgamegeek.com/xmlapi2/thing?id=40692&ratingcomments=1

• User | Licinian
  http://boardgamegeek.com/xmlapi2/user?name=Licinian
  http://boardgamegeek.com/xmlapi2/collection?name=Licinian&own/played/trade/want/wishlist/etc
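
As a small sketch of how these request URLs might be assembled, the helpers below rebuild the three endpoints shown above with the standard library. The helper names are hypothetical, and passing a collection filter as `own=1` is an assumption about how the `own/played/trade/...` options are written in practice.

```python
from urllib.parse import urlencode

BASE_URL = "http://boardgamegeek.com/xmlapi2/"


def thing_url(game_id):
    """URL for one game, including its rating comments."""
    return BASE_URL + "thing?" + urlencode({"id": game_id, "ratingcomments": 1})


def user_url(name):
    """URL for one user's profile."""
    return BASE_URL + "user?" + urlencode({"name": name})


def collection_url(name, **filters):
    """URL for a user's collection; filters such as own=1 are assumed."""
    return BASE_URL + "collection?" + urlencode({"name": name, **filters})


small_world = thing_url(40692)
licinian = user_url("Licinian")
licinian_owned = collection_url("Licinian", own=1)
```

Only the URLs are built here; fetching and parsing the XML responses is left out.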

Page 7

Page 8

Data (Sets)

Ratings/Ownership Data

• Teaching Set: 70%
• Testing Set: 30% (hopefully most recent)
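
One way to get a 70/30 split in which the testing set holds the most recent data is to sort by time before cutting. The sketch below assumes each record carries a `timestamp` field, which the slides do not confirm is available; the function name is also hypothetical.

```python
def chronological_split(records, train_percent=70):
    """Split records into (teaching, testing) sets, oldest records first.

    Assumes every record is a dict with a sortable "timestamp" key.
    Integer arithmetic avoids floating-point surprises at the cut point.
    """
    ordered = sorted(records, key=lambda record: record["timestamp"])
    cut = len(ordered) * train_percent // 100
    return ordered[:cut], ordered[cut:]


records = [{"user": "u%d" % i, "timestamp": i} for i in range(10)]
teaching, testing = chronological_split(records)
```

If timestamps turn out to be unavailable, a random split would be the fallback, at the cost of no longer testing on the newest ratings.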

Page 10

General Approach

Content-Based
• User & item profiles
• Based on content specific to that object (properties)

Collaborative-Based
• Users & items similar to those liked/owned in the past
• More abstract, only links matter

Page 11

General Approach: Methods of Hybrid Filtering

• Weighted
• Switched
• Mixed
• Feature combination
• Cascade

R. Burke, "Hybrid recommender systems: Survey and experiments"

Page 12

General Approach: Our Method

Link-analysis
• As described by Huang et al. in "A Link analysis approach to recommendation under sparse data"
• A PageRank-style link analysis of hubs and authorities (as in HITS)

Content-based
• Refines the previous results
• Uses information about the items themselves to adjust ranking
• Will need tuning

Page 14

Link Analysis Step: Overview

From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data," 2004.

[Diagram: Link Analysis, relating the Consumer-Product Matrix, the Consumer Representativeness Matrix, and the Product Representativeness Matrix]

Page 15

Link Analysis Step: Matrix Definitions

From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data," 2004.

[Equations: definitions of the Product Representativeness Matrix and the Consumer Representativeness Matrix]

Page 16

Link Analysis Step: Initialization

From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data," 2004.

[Equations: initialization of the Consumer Representativeness Matrix and the Product Representativeness Matrix]

Page 17

Link Analysis Step: Update Phase

From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data," 2004.

[Equations: update rules for the Consumer Representativeness Matrix and the Product Representativeness Matrix]
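
The slides name an initialization and an update phase over the two representativeness matrices, but this transcript does not carry the formulas. The NumPy sketch below therefore shows only a generic hubs-and-authorities style of mutual reinforcement on a consumer-product matrix. The matrix shapes, the degree-damping exponent `gamma`, the identity initialization, and the per-iteration column normalization are all assumptions made for illustration and should be checked against Huang et al. before use.

```python
import numpy as np


def link_analysis(consumer_product, gamma=0.9, iterations=10):
    """Return (PR, CR) for a binary consumers x products matrix.

    PR has shape (products, consumers): column c scores every product
    for target consumer c. CR has shape (consumers, consumers).
    All modelling choices here are assumptions, not the paper's definitions.
    """
    A = np.asarray(consumer_product, dtype=float)
    num_consumers = A.shape[0]

    # Damp the influence of consumers with many links (assumed normalization).
    degree = A.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0
    B = A / degree ** gamma

    CR0 = np.eye(num_consumers)  # each consumer starts as its own representative
    CR = CR0.copy()
    for _ in range(iterations):
        PR = A.T @ CR            # products inherit scores from linked consumers
        CR = B @ PR + CR0        # consumers inherit scores from linked products
        norms = np.linalg.norm(CR, axis=0, keepdims=True)
        norms[norms == 0] = 1.0
        CR = CR / norms          # keep scores bounded between iterations
    PR = A.T @ CR
    return PR, CR


# Two consumers, three products; they share only the middle product.
A = [[1, 1, 0],
     [0, 1, 1]]
PR, CR = link_analysis(A)
```

In this toy case the first consumer never rated the third product, yet it receives a positive score through the shared second product, which is the behavior that makes link analysis attractive under sparse data.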

Page 19

Content-based Cascade: Product Representativeness Result

Product Representativeness Matrix

         Game1   Game2   Game3
UserA    x       x       x
UserB    PR21    PR22    PR23
UserC    x       x       x

PRi: row i of the matrix (here i = 2, UserB)

Page 20

Content-based Cascade: Additional Data

Property                                Description
Subdomain (S)                           General type of game (Strategy, Family, Party)
Category (C)                            Genre/specific type of game (Civilization, Territory Building)
Playing Time (P)                        Publisher provided, in minutes
Mechanic (M)                            Game mechanics used (Dice Rolling, Variable Powers)
Suggested best Number of players (N)    User-voted best number of players to play the game

Page 21

Content-based Cascade: Similarity Measures

Property                                Similarity
Subdomain (S)                           Cosine
Category (C)                            Cosine
Playing Time (P)                        Error
Mechanic (M)                            Cosine
Suggested best Number of players (N)    Error

These will need to be normalized on the same scale (0.00 - 1.00)
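
A minimal sketch of the two kinds of measure, both landing on a 0 to 1 scale. Treating the label-valued properties as binary indicator sets, and turning "Error" into `1 - |x - y| / max_difference`, are assumptions for illustration; the slides do not define the error measure or its normalization.

```python
import math


def cosine_similarity(labels_a, labels_b):
    """Cosine similarity of two label sets viewed as binary indicator vectors."""
    a, b = set(labels_a), set(labels_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def error_similarity(x, y, max_difference):
    """Similarity from absolute error, scaled by an assumed maximum difference.

    Returns 1.0 for identical values and is clamped at 0.0 for differences
    at or beyond max_difference.
    """
    if max_difference <= 0:
        raise ValueError("max_difference must be positive")
    return max(0.0, 1.0 - abs(x - y) / max_difference)


mechanic_sim = cosine_similarity({"Dice Rolling", "Variable Powers"}, {"Dice Rolling"})
time_sim = error_similarity(60, 90, max_difference=120)   # playing times in minutes
far_sim = error_similarity(30, 300, max_difference=120)
```

With binary indicator vectors the cosine is already in the 0 to 1 range, so only the error-based measures need an explicit scale such as `max_difference`.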

Page 22

Content-based Cascade: Product Similarity Matrix

          S     C     P     M     N
Game 1    .12   .2    .6    .1    .5

Page 23

Content-based Cascade: Refining the Product Ranking

• Create PRfinal by refining PR: [equation not captured in transcript]

• W is a vector of weights which determine how much a given property should affect the original score
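
The refinement formula itself is not in this transcript. Purely as an illustration of how a weight vector W could adjust a link-analysis score, the sketch below assumes a multiplicative form, PRfinal = PR × (1 + Σ W_k · sim_k), over the five properties S, C, P, M, N. The form, the function name, and the example weights are all hypothetical.

```python
PROPERTIES = ("S", "C", "P", "M", "N")


def refine_score(pr, similarities, weights):
    """Boost a link-analysis score by weighted content similarities.

    ASSUMED form: pr * (1 + sum of weight * similarity over properties).
    This is not taken from the slides, which omit the actual equation.
    """
    adjustment = sum(weights[p] * similarities[p] for p in PROPERTIES)
    return pr * (1.0 + adjustment)


# Similarity row for Game 1, with an arbitrary uniform weight vector.
game1_sims = {"S": 0.12, "C": 0.2, "P": 0.6, "M": 0.1, "N": 0.5}
uniform_w = {p: 0.1 for p in PROPERTIES}
refined = refine_score(2.0, game1_sims, uniform_w)
```

Under this assumed form a weight of zero leaves the original score untouched, so W directly controls how far content similarity can move a game in the ranking.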

Page 25

Genetic Tuning: Determining an Optimal W

• W must be tuned for this particular domain

• A genetic algorithm will be used to tune W

• Chromosome = sequential binary representation of W

• Fitness based on Rank Score (from Huang et al.)

• 8 bits per weight, ranging from 0 to 0.25 to start

• Rates of crossover/mutation TBD
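
The chromosome layout above can be sketched as follows: five weights at 8 bits each give a 40-bit chromosome, decoded linearly onto the range 0 to 0.25. Everything beyond that layout is an assumption for illustration, including the single-point crossover, truncation selection with one elite, the default rates (the slides leave these TBD), and the stand-in fitness function used in place of Rank Score.

```python
import random

BITS_PER_WEIGHT = 8
NUM_WEIGHTS = 5        # one weight per property: S, C, P, M, N
MAX_WEIGHT = 0.25


def decode(chromosome):
    """Map a list of 0/1 bits to NUM_WEIGHTS weights in [0, MAX_WEIGHT]."""
    weights = []
    for k in range(NUM_WEIGHTS):
        gene = chromosome[k * BITS_PER_WEIGHT:(k + 1) * BITS_PER_WEIGHT]
        value = int("".join(str(bit) for bit in gene), 2)
        weights.append(value / (2 ** BITS_PER_WEIGHT - 1) * MAX_WEIGHT)
    return weights


def crossover(parent_a, parent_b, rng):
    """Single-point crossover (an assumed operator choice)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def evolve(fitness, pop_size=20, generations=30,
           crossover_rate=0.7, mutation_rate=0.01, seed=0):
    """Return the best decoded weight vector found; all rates are placeholders."""
    rng = random.Random(seed)
    length = BITS_PER_WEIGHT * NUM_WEIGHTS
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: fitness(decode(c)), reverse=True)
        parents = ranked[:pop_size // 2]   # truncation selection
        children = [ranked[0]]             # keep the current best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng) if rng.random() < crossover_rate else list(a)
            children.append(mutate(child, mutation_rate, rng))
        population = children
    return decode(max(population, key=lambda c: fitness(decode(c))))


# Stand-in fitness for demonstration only; the real fitness would be Rank Score.
best_weights = evolve(fitness=sum)
```

In practice `fitness` would run the refined ranking against the testing set and return its Rank Score, which is far more expensive than this stand-in, so population size and generation count would need to be chosen with that cost in mind.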

Page 26

Conclusion / Questions