GraphConnect 2014 SF: Neo4j at Scale Using Enterprise Integration Patterns

Brad Nussbaum

CTO | [email protected] | @bradnussbaum

www.mediahound.com

Graph at Scale | 10/22/14

MediaHound started with a search…

But Entertainment is more than just Film & TV

Music Movies Television Books

Entertainment Preferences Are Scattered

All Content

one platform that connects:

All Sources All Devices

All Brands All Artists All Users

The Entertainment Graph

The Entertainment Graph powers meaningful recommendations, exciting data insights and comprehensive social discovery.

A comprehensive database that brings together movies, books, games, music, and TV, including the cast & crew, sources, reviews, categories, genres, lists and more!

Use Cases:

Academy Award Winners On Netflix

MATCH (c:Collection)-[:CONTAINS]-(m:Movie) WHERE (m)-[:MATCHED_SOURCE]-(n:NETFLIX) AND c.name="Academy Award Winners" RETURN m;

Movies and Shows Based On Zombie Books

MATCH (m:Media)-[:BASED_ON]-(b:Book)-[:HAS_TRAIT]-(z:Trait) WHERE (m.type="Movie" OR m.type="Show") AND z.name="Zombies" RETURN m;

To access ‘The Movie Graph’ mini app (which uses a different model than above), from your browser, run :play movie graph

Entertainment Recommendations

Collaborative Filtering:

If Joe, Amy and Steve like Gladiator, AND Joe and Amy like Toy Story, THEN MediaHound recommends Toy Story to Steve.
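The rule above can be sketched in a few lines of Python. This is only an illustration of the idea on an in-memory dictionary, not MediaHound's implementation; the function name `recommend` and the data layout are assumptions made for the example.

```python
from collections import Counter

def recommend(likes, user):
    """Suggest items liked by users who share at least one like with `user`."""
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user or not (mine & theirs):
            continue  # skip the user themselves and anyone with no overlap
        for item in theirs - mine:
            scores[item] += 1  # one vote per similar user who likes the item
    return [item for item, _ in scores.most_common()]

likes = {
    "Joe": {"Gladiator", "Toy Story"},
    "Amy": {"Gladiator", "Toy Story"},
    "Steve": {"Gladiator"},
}
print(recommend(likes, "Steve"))  # ['Toy Story']
```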

Graph Influencers

Joe is an early adopter.

Joe, Amy and Steve like several things in common.

MediaHound recommends Amy and Steve follow Joe.

Joe discovers the next big hit and shares it on his feed.

Amy and Steve see Joe’s post and give it a listen.

We needed our graph database to perform under sustained user write load AND during heavy batch update operations.

We needed to recommend media content in real time, which required many concurrent pattern matching operations on the graph.

We realized through trial and error that using the Transactional Cypher HTTP Endpoint was the BEST solution to control batch writes.

POST http://localhost:7474/db/data/transaction/commit
Accept: application/json; charset=UTF-8
Content-Type: application/json

{
  "statements": [
    { "statement": "MERGE(n:User{username:\"bradnussbaum\"})-[:FOLLOWS]->(m:User{username:\"bennussbaum\"});" },
    { "statement": "MERGE(n:User{username:\"bradnussbaum\"})<-[:FOLLOWS]-(m:User{username:\"bennussbaum\"});" }
  ]
}
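A minimal sketch of building the same request from Python's standard library. The helper name `build_commit_request` is an assumption for illustration; the URL, headers and statements are the ones shown on the slide. The request is only constructed here, not sent.

```python
import json
import urllib.request

COMMIT_URL = "http://localhost:7474/db/data/transaction/commit"

def build_commit_request(statements, url=COMMIT_URL):
    """Wrap Cypher statements in the JSON body the endpoint expects."""
    body = json.dumps({"statements": [{"statement": s} for s in statements]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Accept": "application/json; charset=UTF-8",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_commit_request([
    'MERGE(n:User{username:"bradnussbaum"})-[:FOLLOWS]->(m:User{username:"bennussbaum"});',
    'MERGE(n:User{username:"bradnussbaum"})<-[:FOLLOWS]-(m:User{username:"bennussbaum"});',
])
# urllib.request.urlopen(req) would submit the batch to a running server.
```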

We ran tests using the low-level kernel and found that sustained transaction writes performed optimally between 400 and 2,000 nodes and relationships per transaction.

http://neo4j.com/docs/stable/linux-performance-guide.html

git clone [email protected]:neo4j-contrib/tooling.git

Compare the difference between…

• Writing a single relationship (33 bytes) per transaction for 10k iterations, compared to

• Writing 1k relationships per transaction for 10 iterations.

**As of 2.2, Neo4j will batch writes on the server


We used Enterprise Integration Patterns (EIP) to create optimal batch sizes for each transaction.

• Splitters to break down larger messages

• Aggregators to combine single CQL statements together into a single batch transaction

• Throttling to control concurrent requests and requests per second
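The three patterns can be sketched in plain Python as follows. This is a simplified illustration, not the integration framework used in the talk: all names are hypothetical, splitting on ";" is naive (it would break on semicolons inside string literals), and the throttle only caps concurrent requests, not requests per second.

```python
import threading

def split(message):
    """Splitter: break one large message into individual statements."""
    return [s.strip() for s in message.split(";") if s.strip()]

def aggregate(statements, batch_size):
    """Aggregator: group single statements into batches of at most batch_size."""
    return [statements[i:i + batch_size]
            for i in range(0, len(statements), batch_size)]

class Throttle:
    """Throttle: allow at most max_concurrent sends in flight at once."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, send, batch):
        with self._slots:  # blocks while all slots are taken
            return send(batch)

statements = split("CREATE (a:A); CREATE (b:B); CREATE (c:C);")
batches = aggregate(statements, batch_size=2)
throttle = Throttle(max_concurrent=40)
sizes = [throttle.run(len, batch) for batch in batches]  # len stands in for a real send
```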

We frequently run 40+ concurrent write transactions to a 3-instance cluster for hours at a time.

Deadlocks can occur often with many concurrent write operations.

• Retry Transient Errors after a small period of time.

• Use the Error Index to split failed TX statements.
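A hedged sketch of both ideas. It assumes a hypothetical `send(statements)` callable that returns the list of error objects from the response (empty on success), and it treats the error index as a plain position in the statement list; neither is taken from the talk. The only Neo4j-specific fact relied on is that transient status codes start with `Neo.TransientError.`.

```python
import time

def is_transient(code):
    """Transient status codes are the ones worth retrying."""
    return code.startswith("Neo.TransientError.")

def commit_with_retry(send, statements, retries=3, delay=0.1, sleep=time.sleep):
    """Call send(statements); retry after a short pause while errors are transient."""
    errors = []
    for attempt in range(retries + 1):
        errors = send(statements)
        if not errors:
            return []
        if attempt == retries or not all(is_transient(e["code"]) for e in errors):
            break  # out of retries, or a non-transient error that a retry won't fix
        sleep(delay * (attempt + 1))  # wait a little longer each time
    return errors

def split_at(statements, failed_index):
    """Separate the statements before, at, and after the failing position."""
    return (statements[:failed_index],
            statements[failed_index],
            statements[failed_index + 1:])
```

The statements before the failing position can then be resubmitted on their own, and the failing statement inspected separately.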

Read here to learn all the error status codes, seriously: http://neo4j.com/docs/stable/status-codes.html






Test write throughput on your cluster with the push factor you plan to use in production and intentionally kill your master under load.

You need to have two load balancers:

• One for all the instances you want performing reads

• One for your master ONLY (send writes here)

Master Check: /db/manage/server/ha/master - Returns true|false

Slave Check: /db/manage/server/ha/slave - Returns true|false

Available Check: /db/manage/server/ha/available - Returns master|slave
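A small sketch of how a health check might map these responses onto the two load balancer pools. The function names are hypothetical, and only the response bodies listed above are interpreted; HTTP status codes and the actual load balancer configuration are out of scope.

```python
HA_BASE = "/db/manage/server/ha"

def check_path(kind):
    """Build the path for one of the three HA status checks."""
    if kind not in ("master", "slave", "available"):
        raise ValueError(f"unknown check: {kind}")
    return f"{HA_BASE}/{kind}"

def belongs_in_write_pool(master_body):
    """Only the instance answering 'true' on the master check takes writes."""
    return master_body.strip().lower() == "true"

def belongs_in_read_pool(available_body):
    """Any instance reporting itself as master or slave can serve reads."""
    return available_body.strip().lower() in ("master", "slave")
```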

Check your driver for transaction support. Embedded mode has full transaction support but most remote drivers do not at this time.

This will be changing in the near future…depending on which driver you use.

**Spring Data Neo4j is actively being developed to include these features as part of 2.2.

Ben Nussbaum

Director of Engineering | [email protected] | @bennussbaum

www.mediahound.com


We built custom algorithms that needed run-time decision making as Neo4j Extensions with Spring Data Neo4j.

https://github.com/AtomRain/neo4j-extensions

• Cache abstraction with Google’s Guava to build large in-memory indexes of nodes and relationships.

• Integration for job instructions and results to and from the broker.

• Async for batch job processing.

We took advantage of spot processing from AWS to run our custom extension algorithms.

• On-demand graph processing with as many instances at a time as needed (we have used up to 9).

• Concurrent job operations per spot.

• Cache optimizations based on Labels and context of the jobs.

We built a flexible job controller that enables concurrent job processing on spot instances.

• Large jobs are broken into smaller jobs that can be processed by a single spot instance.

• Spots process unit jobs and return results. If a spot dies, the job stays in the queue and another spot picks it up.

• Memory and CPU constraints on an instance make this a necessity, especially when processing 30M+ songs.
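The controller behaviour described above can be sketched with an in-memory queue. This is a single-process illustration under stated assumptions, not the actual controller: `split_job`, `run_jobs` and the `max_attempts` cap are hypothetical, and a raised `RuntimeError` stands in for a spot instance dying mid-job.

```python
from collections import deque

def split_job(items, unit_size):
    """Break one large job into unit jobs small enough for a single worker."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]

def run_jobs(units, worker, max_attempts=3):
    """Process unit jobs; a failed unit goes back on the queue to be picked up again."""
    queue = deque((unit, 0) for unit in units)
    results = []
    while queue:
        unit, attempts = queue.popleft()
        try:
            results.append(worker(unit))
        except RuntimeError:
            if attempts + 1 >= max_attempts:
                raise  # give up instead of looping forever
            queue.append((unit, attempts + 1))
    return results
```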

Spot instances run Neo4j in SINGLE mode and stay up to date using a Topic.

1. ESB sends Jobs to MQ

2. Spots consume job instructions, process and send results back to MQ

3. ESB posts job results to HA

4. On successful post, send updates to Topic

5. Spots consume from Topic to stay up to date

Batch jobs return thousands of CQL statements which must not be dependent on any statements before or after.

• Compound statements to create nodes and relationships for specific sub-graphs to avoid the need for layering wherever possible. If not…

• Run jobs in linear phases (layering) to create nodes first, then connect relationships.
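A minimal sketch of layering, assuming each job result is tagged as a node or relationship statement. The tagging scheme and the `layer` function are assumptions for illustration, and the Cypher strings are made-up examples. Within a phase the statements are independent of each other, so each phase can be batched and run concurrently.

```python
def layer(statements):
    """Group (kind, cypher) pairs into phases: all nodes first, then relationships."""
    nodes = [cypher for kind, cypher in statements if kind == "node"]
    rels = [cypher for kind, cypher in statements if kind == "relationship"]
    return [nodes, rels]

results = [
    ("relationship", "MATCH (a:Song {id: 1}), (b:Artist {id: 2}) MERGE (a)-[:BY]->(b)"),
    ("node", "MERGE (a:Song {id: 1})"),
    ("node", "MERGE (b:Artist {id: 2})"),
]
phases = layer(results)  # run every statement in phases[0] before any in phases[1]
```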


Q/A
