Neo4j - Graph database for recommendations
The trend nowadays is to represent the relationships between entities in a graph structure. Neo4j is a NoSQL graph database that allows fast and effective queries on connected data. Custom algorithms can also be implemented to extend the functionality of the built-in API. We use the graph database to model and recommend movies and other media content.
Jakub Kříž, Ondrej Proksa
30.5.2013
Summary
Graph databases
Working with Neo4j and Ruby (on Rails)
Plugins and algorithms - live demos
  Document similarity
  Movie recommendation
  Recommendation from subgraph
TeleVido.tv
Why Graphs?
Graphs are everywhere!
  A natural way to model almost everything
  "Whiteboard friendly"
  Even the internet is a graph
Why Graph Databases?
Relational databases are not so great for storing graph structures
  Unnatural m:n relations
  Expensive joins
  Expensive lookups during graph traversals
Graph databases fix this
  Efficient storage
  Direct pointers = no joins
Neo4j
The World's Leading Graph Database
www.neo4j.org
NoSQL database
Open source - github.com/neo4j
ACID
Brief history
  Official v1.0 - 2010
  Current version 1.9
  2.0 coming soon
Querying Neo4j
Querying languages
  Structurally similar to SQL
  Based on graph traversal
Most often used
  Gremlin - generic graph querying language
  Cypher - graph querying language for Neo4j
  SPARQL - generic querying language for data in RDF format
Cypher Example
CREATE (n {name: {value}})
CREATE (n)-[r:KNOWS]->(m)

START
[MATCH]
[WHERE]
RETURN [ORDER BY] [SKIP] [LIMIT]
Cypher Example (2)
Friend of a friend
START n=node(0)
MATCH (n)--()--(f)
RETURN f
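For readers new to Cypher, the two-hop pattern in the query above can be mirrored on a plain in-memory adjacency list. The Ruby sketch below is only an illustration: the friends hash and the friends_of_friends helper are hypothetical names, not part of Neo4j or Neography. It assumes a graph without parallel edges or self-loops, and, unlike Cypher (which returns one row per matching path), it returns each node only once.

```ruby
require "set"

# Returns the nodes reachable in exactly two hops from start,
# roughly what MATCH (n)--()--(f) finds on a simple undirected graph.
def friends_of_friends(graph, start)
  result = Set.new
  graph.fetch(start, []).each do |friend|
    graph.fetch(friend, []).each do |candidate|
      # Going straight back along the same edge is not a valid path.
      result << candidate unless candidate == start
    end
  end
  result.to_a.sort
end

# Hypothetical undirected friendship graph (every edge listed in both directions).
friends = {
  "ann"  => ["bob", "carl"],
  "bob"  => ["ann", "dana"],
  "carl" => ["ann", "dana", "eve"],
  "dana" => ["bob", "carl"],
  "eve"  => ["carl"]
}

puts friends_of_friends(friends, "ann").inspect
```

A node that is both a direct friend and a friend of a friend stays in the result, as it would in the Cypher query.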
Working with Neo4j
REST API => wrappers
  Neography for Ruby
  py2neo for Python
  …
  Your own wrapper
Java API
  Direct access in JVM based applications
  neo4j.rb
Neography – API wrapper example
# create nodes and properties
n1 = Neography::Node.create("age" => 31, "name" => "Max")
n2 = Neography::Node.create("age" => 33, "name" => "Roel")
n1.weight = 190
# create relationships
new_rel = Neography::Relationship.create(:coding_buddies, n1, n2)
n1.outgoing(:coding_buddies) << n2
# get nodes related by outgoing friends relationship
n1.outgoing(:friends)
# get n1 and nodes related by friends and friends of friends
n1.outgoing(:friends).depth(2).include_start_node
Neo4j.rb – JRuby gem example
# the relationship class is defined first so that Person can reference it
class Friend < Neo4j::Rails::Relationship
  property :as
end

class Person < Neo4j::Rails::Model
  property :name
  property :age, :index => :exact # :fulltext
  has_n(:friends).to(Person).relationship(Friend)
end

mike = Person.new(:name => 'Mike', :age => 24)
john = Person.new(:name => 'John', :age => 27)
mike.friends << john
mike.save
Our Approach
Relational databases are not so bad
  Good for basic data storage
  Widely used for web applications
  Well supported in Rails via ActiveRecord
  Performance issues with Neo4j
However, we need a graph database
  We model the domain as a graph
  Our recommendation is based on graph traversal
Our Approach (2)
Hybrid model using both MySQL and Neo4j
MySQL contains basic information about entities
Neo4j contains only relationships
Paired via identifiers (neo4j_id)
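The pairing via neo4j_id can be pictured with a small Ruby sketch. It is a simplified illustration under stated assumptions, not the authors' wrapper: the rows stand in for records of a hypothetical MySQL movies table, the node ids stand in for the output of a graph query, and both helper names are invented for this example.

```ruby
# Index relational rows by neo4j_id so graph results can be resolved quickly.
def index_by_neo4j_id(rows)
  rows.each_with_object({}) { |row, index| index[row[:neo4j_id]] = row }
end

# Turns node ids returned by a graph query into relational rows,
# keeping the order of the graph result and skipping unknown ids.
def resolve_nodes(node_ids, index)
  node_ids.map { |node_id| index[node_id] }.compact
end

# Hypothetical rows: basic entity data lives on the relational side.
rows = [
  { id: 1, title: "Movie A", neo4j_id: 101 },
  { id: 2, title: "Movie B", neo4j_id: 205 },
  { id: 3, title: "Movie C", neo4j_id: 317 }
]

# Hypothetical graph result: only node ids come back from the graph side.
recommended_node_ids = [317, 101, 999]

index = index_by_neo4j_id(rows)
puts resolve_nodes(recommended_node_ids, index).map { |row| row[:title] }.inspect
```

In a Rails application the same lookup would presumably be a single ActiveRecord query on the neo4j_id column, for example Movie.where(neo4j_id: ids), where Movie is a hypothetical model name.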
Our Approach (3)
Recommendation algorithms
  Made as plugins to Neo4j
  Written in Java
  Embedded into Neo4j API
Rails application uses custom made wrapper
  Creates and modifies nodes and relationships via API calls
  Handles recommendation requests
Graph Algorithms
Built-in algorithms
  Shortest path
  All shortest paths
  Dijkstra's algorithm
Custom algorithms
  Depth first search
  Breadth first search
  Spreading activation
  Flows, pairing, etc.
Document Similarity
Task: find similarities between documents
Documents data model:
  Each document is made of sentences
  Each sentence can be divided into n-grams
  N-grams are connected with relationships
Example: "Neo4j is graph database in Java"
  (Neo4j, graph) - (graph, database) - (database, Java)
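The n-gram model above can be sketched in a few lines of Ruby. This is a minimal illustration, not the authors' code: the example on the slide suggests that words such as "is" and "in" are dropped before the bigrams are built, so the sketch assumes a small stop-word list (the list itself is an assumption), and the helper names bigrams and ngram_links are hypothetical.

```ruby
# Splits a sentence into words, drops stop words and pairs up neighbours.
# The default stop-word list is an assumption made for this example.
def bigrams(sentence, stop_words = %w[a an the is in of])
  words = sentence.scan(/[[:alnum:]]+/)
  content_words = words.reject { |word| stop_words.include?(word.downcase) }
  content_words.each_cons(2).to_a
end

# Consecutive n-grams of a sentence are connected with a relationship.
def ngram_links(ngrams)
  ngrams.each_cons(2).to_a
end

ngrams = bigrams("Neo4J is graph database in Java")
puts ngrams.inspect
puts ngram_links(ngrams).inspect
```

Each bigram would become a node and each link a relationship, so two documents that share a bigram end up connected through the same node.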
Document Similarity (2)
Detecting similar documents in our graph model
  Shortest path between documents
  Number of paths shorter than some distance
  Weighting relationships
How about a custom plugin?
  Spreading activation
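The first idea above, the shortest path between two documents, can be illustrated with a plain breadth-first search. Neo4j ships its own shortest-path algorithm, so the Ruby sketch below is only meant to show the principle. The graph layout (documents linked to the n-grams they contain) and the helper names are assumptions made for this example.

```ruby
# Builds an undirected adjacency list from a list of [a, b] edges.
def undirected(edges)
  graph = Hash.new { |hash, key| hash[key] = [] }
  edges.each do |a, b|
    graph[a] << b
    graph[b] << a
  end
  graph
end

# Breadth-first search returning the number of hops on a shortest path,
# or nil when target cannot be reached from source.
def shortest_path_length(graph, source, target)
  return 0 if source == target
  distances = { source => 0 }
  queue = [source]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next if distances.key?(neighbour)
      distances[neighbour] = distances[node] + 1
      return distances[neighbour] if neighbour == target
      queue << neighbour
    end
  end
  nil
end

# Hypothetical data: documents connected to the n-grams they contain.
graph = undirected([
  ["doc1", "graph database"],
  ["doc2", "graph database"],
  ["doc2", "database java"],
  ["doc3", "database java"],
  ["doc4", "movie recommendation"]
])

puts shortest_path_length(graph, "doc1", "doc2").inspect
puts shortest_path_length(graph, "doc1", "doc4").inspect
```

A shorter path suggests more similar documents; documents with no path between them share no n-grams at all.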
Document Similarity (3)
Document Similarity (4)
Live Demo…
Movie Recommendation
Task: recommend movies based on what we like
We like some entities, let's call them initial
  Movies
  People (actors, directors etc.)
  Genres
We want recommended nodes from input
  Find nodes which are
    The closest to initial nodes
    The most relevant to initial nodes
Movie Recommendation (2)
165k nodes
  Movies
  People
  Genre
870k relationships
  Movies - People
  Movies - Genres
Easy to add more entities
  Tags, mood, period, etc.
Will it be fast?
  We need 1-2 seconds
Movie Recommendation (3)
Recommendation Algorithms
Breadth first search
  Union Colors
  Mixing Colors
Modified Dijkstra
  Weighted relationships between entities
Spreading activation (energy)
  Each initial node gets same starting energy
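The slides do not say how Dijkstra's algorithm was modified, so the Ruby sketch below shows only the textbook version on a graph with weighted relationships. It is an illustration under assumptions: the weights are treated as costs (a lower cost meaning a stronger relation), the node names are made up, and the simple linear search for the closest node replaces the priority queue a real plugin would likely use.

```ruby
# graph: { node => { neighbour => cost } } with non-negative costs.
# Returns a hash of the cheapest known cost from source to each reachable node;
# unreachable nodes report Float::INFINITY.
def dijkstra(graph, source)
  distances = Hash.new(Float::INFINITY)
  distances[source] = 0.0
  unvisited = (graph.keys + graph.values.flat_map(&:keys) + [source]).uniq
  until unvisited.empty?
    node = unvisited.min_by { |candidate| distances[candidate] }
    break if distances[node] == Float::INFINITY # the rest is unreachable
    unvisited.delete(node)
    graph.fetch(node, {}).each do |neighbour, cost|
      through_node = distances[node] + cost
      distances[neighbour] = through_node if through_node < distances[neighbour]
    end
  end
  distances
end

# Hypothetical weighted graph: m* are movies, p1 is a person, g1 is a genre.
graph = {
  "m1" => { "p1" => 1.0, "g1" => 3.0 },
  "p1" => { "m2" => 1.0 },
  "g1" => { "m2" => 0.5, "m3" => 1.0 }
}

dijkstra(graph, "m1").sort_by { |_node, cost| cost }.each do |node, cost|
  puts "#{node}: #{cost}"
end
```

Ranking candidate nodes by their cost from a liked node gives one simple notion of "closest to the initial nodes".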
Union Colors
Mixing Colors
Spreading Activation (Energy)
[Figure sequence: energy spreading step by step from four initial nodes, each starting with energy 100.0]
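A minimal Ruby sketch of spreading activation may help to follow the idea. It is not the authors' Java plugin. The slides only state that every initial node gets a starting energy, so the spreading rule used here is an assumption: in each step a node passes a fixed fraction of its energy (decay) on to its neighbours in equal shares. All names and the example graph are hypothetical.

```ruby
# graph: { node => [neighbours] }; initial: { node => starting energy }.
# Returns { node => energy received } after the given number of steps.
# decay and steps are assumed parameters, not values taken from the slides.
def spread_activation(graph, initial, decay: 0.5, steps: 2)
  received = Hash.new(0.0)
  frontier = initial.dup
  steps.times do
    next_frontier = Hash.new(0.0)
    frontier.each do |node, energy|
      neighbours = graph.fetch(node, [])
      next if neighbours.empty?
      share = energy * decay / neighbours.size
      neighbours.each { |neighbour| next_frontier[neighbour] += share }
    end
    next_frontier.each { |node, energy| received[node] += energy }
    frontier = next_frontier
  end
  received
end

# Ranks all non-initial nodes by received energy, highest first.
def recommend(graph, initial, limit: 3, **options)
  spread_activation(graph, initial, **options)
    .reject { |node, _energy| initial.key?(node) }
    .sort_by { |node, energy| [-energy, node] }
    .first(limit)
    .map(&:first)
end

# Hypothetical graph: m* are movies, p1 is a person, g1 is a genre.
graph = {
  "m1" => ["p1", "g1"],
  "m2" => ["p1"],
  "m3" => ["p1"],
  "m4" => ["g1"],
  "p1" => ["m1", "m2", "m3"],
  "g1" => ["m1", "m4"]
}

# Each initial (liked) node gets the same starting energy.
puts recommend(graph, { "m1" => 100.0, "m2" => 100.0 }).inspect
```

Because the starting energies are passed in as a hash, giving individual nodes a different starting energy only means changing the values.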
Recommendation - Evaluation
Experimental evaluation
  Which algorithm is the best (rating on a scale of 1-5)
  30 users / 168 scenarios
[Bar chart: rating per algorithm (Union Colors, Mixing Colors, Spreading Activation (Energy), Dijkstra), axis from 0 to 3.5]
Movie Recommendation (4)
Live Demo…
Movie Recommendation – User Model
Spreading energy
  Each initial node gets different starting energy
  Based on user's interests and feedback
Improves the recommendation!
Recommendation from subgraph
Recommend movies which are currently in cinemas
Recommend movies which are currently on TV
How?
  Algorithm will traverse normally
  Creates a subgraph from which it returns nodes
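One way to read the approach above: the traversal still runs over the whole graph, and only the final result is restricted to nodes that belong to the subgraph (for example movies currently in cinemas). The Ruby sketch below illustrates that reading with a breadth-first traversal. The helper names, the max_depth parameter and the example data are assumptions, and the real plugin may build its subgraph differently.

```ruby
require "set"

# Breadth-first traversal over the whole graph, up to max_depth hops.
# Returns { node => hop distance from the nearest start node }.
def hop_distances(graph, start_nodes, max_depth)
  distances = {}
  start_nodes.each { |node| distances[node] = 0 }
  queue = start_nodes.dup
  until queue.empty?
    node = queue.shift
    next if distances[node] >= max_depth
    graph.fetch(node, []).each do |neighbour|
      next if distances.key?(neighbour)
      distances[neighbour] = distances[node] + 1
      queue << neighbour
    end
  end
  distances
end

# Traverses normally, then returns only nodes from the subgraph, closest first.
def recommend_from_subgraph(graph, start_nodes, subgraph_nodes, max_depth: 4)
  hop_distances(graph, start_nodes, max_depth)
    .select { |node, _| subgraph_nodes.include?(node) && !start_nodes.include?(node) }
    .sort_by { |node, distance| [distance, node] }
    .map(&:first)
end

# Hypothetical graph: m* are movies, p1 is a person, g1 is a genre.
graph = {
  "m1" => ["p1", "g1"],
  "m2" => ["p1"],
  "m3" => ["p1"],
  "m4" => ["g1"],
  "p1" => ["m1", "m2", "m3"],
  "g1" => ["m1", "m4"]
}
in_cinemas = Set["m3", "m4"] # hypothetical set of movies currently in cinemas

puts recommend_from_subgraph(graph, ["m1"], in_cinemas).inspect
```

Nodes outside the subgraph (people, genres, other movies) still carry the traversal, which is why the filter is applied only at the end.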
Recommendation from subgraph (2)
Live Demo…
TeleVido.tv
Media content recommendation using Neo4j
  Movie recommendation
  Recommendation of movies in cinemas
  Recommendation of TV programs and schedules
Summary
Graph databases
Working with Neo4j and Ruby (on Rails)
Plugins and algorithms
  Document similarity
  Movie recommendation
  Recommendation from subgraph
TeleVido.tv