neo4j and python: playing with graph data - pycon india 2014 talk
DESCRIPTION
Neo4J and Python: Playing with graph data. This talk introduces the world of graphs, their utility and the efficient use of the Neo4J graph database for some super cool day-to-day applications with the help of py2neo. Follow the development of the talk at http://www.sonalraj.com/pycon14.html Check out the details at the PyCon India 2014 website: http://in.pycon.org/funnel/2014/252-neo4j-and-python-playing-with-graph-data
TRANSCRIPT
PyCon India 2014• • created by Sonal Raj •
Neo4j and Python: Playing with graph data
Graph Everything
Sonal Raj
The Plan for today
• Graphs and NOSQL
• Neo4j and Cypher
• Py2neo and REST
• Use Cases
Once upon a time..
• Relational databases ruled the earth . .
• Data was stored in Tables, Rows and Columns
• Connections using Primary keys, Foreign keys . .
• That’s all that is relational about them
• No on-the-fly structural (schema) changes
• Horrible for interconnected data ( joins, really? )
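To make the "joins, really?" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` and `knows` tables and their rows are hypothetical, invented only for illustration. Finding friends of friends already needs two joins on the link table, and every further hop adds another one.

```python
import sqlite3

# Hypothetical schema for illustration: people and a "knows" link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER REFERENCES person(id),
                        dst INTEGER REFERENCES person(id));
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol'), (4, 'Dave');
    INSERT INTO knows VALUES (1, 2), (2, 3), (3, 4);
""")

# Friends of friends: each extra hop costs another self-join on "knows".
FRIENDS_OF_FRIENDS = """
    SELECT p2.name
    FROM person p0
    JOIN knows k1 ON k1.src = p0.id
    JOIN knows k2 ON k2.src = k1.dst
    JOIN person p2 ON p2.id = k2.dst
    WHERE p0.name = ?
"""
friends_of_friends = [row[0] for row in conn.execute(FRIENDS_OF_FRIENDS, ("Alice",))]
print(friends_of_friends)
```

In a graph model the same question is a two-hop walk from the starting node, with no link table to join through.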
In the NOSQL Space
Elastic scaling – Scale out, not up
Big data, Transaction Friendly
Economical, can run on commodity hardware
End of the DBA rule
Flexible Data models
Graph Trivia
Where are Graphs . . .
Some Graphs we overlook . .
Apart from that
• Fraud analyses
• Investment securities & debt analysis
• Recommendation engines
• Impact analysis in networks
So, Why Graphs ?
• Increasing Connectivity of Data
• Increasing Semi-Structuredness
• Rising Complexity
Seven Bridges of Königsberg
Leonhard Euler in 1735
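The Königsberg puzzle can be checked in a few lines. This is a small sketch, not part of the talk: the land-mass labels A to D are illustrative names, and the check relies on the standard result that a connected multigraph has a walk crossing every edge exactly once only if it has zero or two nodes of odd degree.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# Labels are illustrative: A = central island, B and C = the two river banks,
# D = the eastern island.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(n for n, d in degree.items() if d % 2 == 1)

def has_euler_walk(edges):
    """Assumes the graph is connected; checks only the degree condition."""
    return len(odd_degree_nodes(edges)) in (0, 2)

print(odd_degree_nodes(BRIDGES), has_euler_walk(BRIDGES))
```

All four land masses have odd degree, so no such walk exists, which is the conclusion Euler reached.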
Property Graphs
- Has nodes
- Has properties for each node
- Has relationships
- Has properties for each relationship
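The four points above can be sketched as a tiny in-memory data structure. This is only an illustration of the property graph model, not how Neo4j stores data; the class names, the `since` property and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    start: int          # id of the start node
    rel_type: str
    end: int            # id of the end node
    properties: dict = field(default_factory=dict)

class PropertyGraph:
    """Toy property graph: nodes and relationships, each with properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, *labels, **properties):
        node = Node(len(self.nodes), set(labels), dict(properties))
        self.nodes[node.node_id] = node
        return node

    def relate(self, start, rel_type, end, **properties):
        rel = Relationship(start.node_id, rel_type, end.node_id, dict(properties))
        self.relationships.append(rel)
        return rel

graph = PropertyGraph()
alice = graph.add_node("Person", name="Alice")
bob = graph.add_node("Person", name="Bob")
knows = graph.relate(alice, "KNOWS", bob, since=2014)
```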
Building Blocks
Nodes
Relationships
Labels
Graph Database
Properties
Data Models
Native Graphs
Inherently store data as nodes and relationships.
The Other ones . . .
Data stored in tables, joins and aggregates to simulate a graph
Why Neo4j ?
Schema-less property graph
Handles complex connected data efficiently
Fully ACID Transactions
Highly Scalable, High Availability Clusters
REST API for servers. Can be embedded in applications on the JVM.
Cypher – a declarative querying solution
Graph DB with good native Python bindings . .
Cypher in action
• Highly expressive query language
• Cares about ‘what’ rather than ‘how’ to retrieve from the graph.
• Uses pattern matching expressions.
(1)-[:label]-(2)
• To make life easy for some, it is inspired by SQL.
START n=node(1), m=node(2)
MATCH (n)-[r:label]-(m)
RETURN r
Cypher in action
Create
CREATE (n:Person { name : 'Chuck Norris', title : 'Analyst' })
RETURN n
MATCH (a:Person),(b:Person)
WHERE a.name = 'Chuck' AND b.name = 'Rajani'
CREATE (a)-[r:RELTYPE { name : 'cannot_find' }]->(b)
RETURN r
Read
MATCH (n) RETURN n // everything is returned
MATCH (n:Label) RETURN n // all with specific label
MATCH (Titanic { title:'Titanic' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person
Cypher in action
Update
MATCH (n { name: 'Andres' })
SET n.surname = 'Taylor'
RETURN n
MATCH (peter { name: 'Peter' })
SET peter += { hungry: TRUE , position: 'Entrepreneur' }
Delete
MATCH (n { name: 'Peter' })
REMOVE n.title
REMOVE n:German
RETURN n
SET n.name = NULL // setting a property to NULL removes it
REST in peace !!
Create
POST http://localhost:7474/db/data/node
{
  "foo" : "bar"
}
POST http://localhost:7474/db/data/node/1/relationships
{
  "to" : "http://localhost:7474/db/data/node/10",
  "type" : "LOVES",
  "data" : {
    "foo" : "bar"
  }
}
POST http://localhost:7474/db/data/schema/index/person
{
  "property_keys" : [ "name" ]
}
REST in peace !!
Read
GET http://localhost:7474/db/data/node/144
GET http://localhost:7474/db/data/relationship/65
GET http://localhost:7474/db/data/relationship/61/properties
GET http://localhost:7474/db/data/schema/index/user
Update
PUT http://localhost:7474/db/data/relationship/66/properties
{
  "happy" : false
}
PUT http://localhost:7474/db/data/relationship/60/properties/cost
"deadly"
Delete
DELETE http://localhost:7474/db/data/node/308
DELETE http://localhost:7474/db/data/relationship/58
DELETE http://localhost:7474/db/data/schema/index/SomeLabel/name
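The REST calls on these slides can be prepared from Python with nothing but the standard library. The sketch below only builds the requests and does not send them. `build_request` is a hypothetical helper, and the headers are an assumption about what a JSON endpoint expects rather than something stated in the talk.

```python
import json
from urllib.request import Request

BASE = "http://localhost:7474/db/data"  # same server root as in the slides

def build_request(method, path, body=None):
    """Build (but do not send) a JSON request against the REST root."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Accept": "application/json"}  # assumed header
    if data is not None:
        headers["Content-Type"] = "application/json"  # assumed header
    return Request(BASE + path, data=data, headers=headers, method=method)

create_node = build_request("POST", "/node", {"foo": "bar"})
create_rel = build_request("POST", "/node/1/relationships", {
    "to": BASE + "/node/10",
    "type": "LOVES",
    "data": {"foo": "bar"},
})
set_cost = build_request("PUT", "/relationship/60/properties/cost", "deadly")
delete_node = build_request("DELETE", "/node/308")

# With a server running, a request can be sent with urllib.request.urlopen(...).
```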
Beauty of py2neo
For the pythonistas
As simple as that!
from py2neo import neo4j
graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")
from py2neo import node, rel
die_hard = graph_db.create(
    node(name="Bruce Willis"),
    node(name="John McClane"),
    node(name="Alan Rickman"),
    node(name="Hans Gruber"),
    node(name="Nakatomi Plaza"),
    rel(0, "PLAYS", 1),
    rel(2, "PLAYS", 3),
    rel(1, "VISITS", 4),
    rel(3, "STEALS_FROM", 4),
    rel(1, "KILLS", 3),
)
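In the `create` call above, the integers inside `rel(...)` appear to refer to the positions of the other arguments in the same call, so `rel(0, "PLAYS", 1)` links the first node to the second. The plain-Python sketch below imitates that lookup without py2neo or a database. `resolve_abstracts` is a hypothetical stand-in, with dicts playing the role of `node(...)` and tuples the role of `rel(...)`.

```python
def resolve_abstracts(*abstracts):
    """Toy stand-in for a batched create: dicts are nodes, 3-tuples are relationships."""
    resolved = []
    for item in abstracts:
        if isinstance(item, tuple):
            start, rel_type, end = item
            # Integers index into the argument list itself.
            resolved.append((abstracts[start]["name"], rel_type, abstracts[end]["name"]))
    return resolved

triples = resolve_abstracts(
    {"name": "Bruce Willis"},
    {"name": "John McClane"},
    {"name": "Alan Rickman"},
    {"name": "Hans Gruber"},
    {"name": "Nakatomi Plaza"},
    (0, "PLAYS", 1),
    (2, "PLAYS", 3),
    (1, "VISITS", 4),
    (3, "STEALS_FROM", 4),
    (1, "KILLS", 3),
)
for triple in triples:
    print(triple)
```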
For the pythonistas
graphdb
• clear()
• create(*abstracts)
• delete(*entities)
• delete_index(content_type, index_name)
• find(label, property_key=None, property_value=None)
• get_index(content_type, index_name)
• get_indexed_node(index_name, key, value)
• ...
• get_indexed_relationship(index_name, key, value)
• get_properties(*entities)
• match(start_node=None, rel_type=None, end_node=None, bidirectional=False, limit=None)
• match_one(start_node=None, rel_type=None, end_node=None, bidirectional=False)
• node(id_)
• get_or_create_index(content_type, index_name, config=None)
• get_or_create_indexed_node(index_name, key, value, properties=None)
Complexity Handling
“A graph database without traversals is just a persistent graph”
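As a rough idea of what a traversal does, here is a breadth-first walk over a small in-memory graph that follows only one relationship type up to a depth limit. It is a plain-Python sketch, not the Neo4j traversal framework; the `GRAPH` data and the `traverse` function are hypothetical.

```python
from collections import deque

# Hypothetical in-memory graph: node -> list of (relationship type, neighbour).
GRAPH = {
    "Alice": [("KNOWS", "Bob")],
    "Bob": [("KNOWS", "Carol"), ("WORKS_WITH", "Dave")],
    "Carol": [("KNOWS", "Easter")],
    "Dave": [],
    "Easter": [],
}

def traverse(graph, start, rel_type, max_depth):
    """Breadth-first traversal following only `rel_type`, up to `max_depth` hops."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for kind, neighbour in graph[node]:
            if kind == rel_type and neighbour not in seen:
                seen.add(neighbour)
                order.append((neighbour, depth + 1))
                queue.append((neighbour, depth + 1))
    return order

reachable = traverse(GRAPH, "Alice", "KNOWS", max_depth=2)
print(reachable)
```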
Paths with py2neo
# Create paths
from py2neo import neo4j, node
a, b, c = node(name="Alice"), node(name="Bob"), node(name="Carol")
abc = neo4j.Path(a, "KNOWS", b, "KNOWS", c)
d, e = node(name="Doctor"), node(name="Easter")
de = neo4j.Path(d, "KNOWS", e)
# Join paths
abcde = neo4j.Path.join(abc, "KNOWS", de)
# Commit to the db
abcde.get_or_create(graph_db)
Schema, Indices with py2neo
# The classes
py2neo.neo4j.Schema
py2neo.neo4j.Index
# Some of their methods
create_index(label, property_key)
drop_index(label, property_key)
get_indexed_property_keys(label)
add_if_none(key, value, entity)
# Apache Lucene query
people = graph_db.get_or_create_index(neo4j.Node, "People")
s_people = people.query("family_name:S*")
# Index content types
neo4j.Node
neo4j.Relationship
Cypher with py2neo
# Create transaction object
from py2neo import cypher
session = cypher.Session("http://localhost:7474/")
tx = session.create_transaction()
# Append statements, then execute or commit
tx.append("some cypher query")
tx.append("some cypher query")
tx.execute()
tx.append("some cypher query")
tx.commit()
# The classical way
from py2neo import neo4j
graph_db = neo4j.GraphDatabaseService()
query = neo4j.CypherQuery(graph_db, "your cypher query")
query.execute()
# query.stream()
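Assuming the transaction object batches its statements for Neo4j's transactional HTTP endpoint (`/db/data/transaction` in Neo4j 2.x), each `execute()` or `commit()` would send a JSON body listing the appended statements. The sketch below builds such a body with the standard library only. `transaction_payload` is a hypothetical helper, and the two queries are made-up examples written in the `{name}` parameter style of that Neo4j generation.

```python
import json

def transaction_payload(statements):
    """Turn (cypher, parameters) pairs into the JSON body of a transactional request."""
    return json.dumps({
        "statements": [
            {"statement": cypher, "parameters": params or {}}
            for cypher, params in statements
        ]
    })

payload = transaction_payload([
    ("CREATE (n:Person {name: {name}}) RETURN n", {"name": "Alice"}),
    ("MATCH (n:Person) RETURN count(n)", None),
])
print(payload)
```

Passing values as parameters rather than pasting them into the query string keeps quoting problems out of the Cypher text.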
Command Line neotool
# Syntax of operation
neotool [<option>] <command> <args>
Or python -m py2neo.tool ..
# Some serious examples
neotool clear
neotool cypher "start n=node(1) return n, n.name?"
neotool cypher-csv "start n=node(1) return n.name, n.age?"
neotool cypher-tsv "start n=node(1) return n.name, n.age?"
# Guess what, you can also access the shell
neotool shell
Neo4j level 2
• Batch Inserter
• High Availability
• Built-in online backup tools
• HTTPS support
Neo4J Framework
• GraphUnit, for unit testing neo4j
• Libraries for performance and API testing
• Batch Transaction tools
• Transaction Event tools
• Some other utilities . .
Use Cases
Recommendation Engines
Complex pattern matching
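A recommendation query is usually a pattern of the shape "people who liked what I liked also liked ...". The plain-Python sketch below walks that pattern over a small dictionary instead of a graph database. The `LIKES` data and the scoring rule (count of shared likes) are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical data: person -> set of movies they liked.
LIKES = {
    "Alice": {"Die Hard", "Titanic"},
    "Bob": {"Die Hard", "Titanic", "Speed"},
    "Carol": {"Die Hard", "Speed", "Heat"},
    "Dave": {"Heat"},
}

def recommend(likes, person):
    """Rank items liked by people who share at least one liked item with `person`."""
    mine = likes[person]
    scores = Counter()
    for other, theirs in likes.items():
        if other == person:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue
        for item in theirs - mine:
            scores[item] += overlap
    # Sort by score (descending), then by name, so the result is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

suggestions = recommend(LIKES, "Alice")
print(suggestions)
```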
Social Network Data
Many entities, highly interconnected
Map Data
Traversals and routing
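Routing over map data comes down to a weighted shortest-path search. Below is a compact sketch of Dijkstra's algorithm in plain Python. The `ROADS` map and its distances are hypothetical, and this is not a description of how Neo4j implements routing.

```python
import heapq

# Hypothetical road map: place -> {neighbour: distance in km}.
ROADS = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}

def shortest_route(roads, source, target):
    """Dijkstra's algorithm; returns (distance, path) or (None, []) if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        dist, place, path = heapq.heappop(queue)
        if place == target:
            return dist, path
        if place in settled:
            continue  # a shorter route to this place was already processed
        settled.add(place)
        for neighbour, km in roads[place].items():
            if neighbour not in settled:
                heapq.heappush(queue, (dist + km, neighbour, path + [neighbour]))
    return None, []

distance, route = shortest_route(ROADS, "A", "D")
print(distance, route)
```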
Python Family
Relatives for Neo4j
Thank You
Neo appreciates your patience.
• Sonal Raj
• http://www.sonalraj.com/
• http://github.com/sonal-raj/