Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Upload: sonal-raj

Post on 05-Dec-2014

Category:

Technology


DESCRIPTION

Neo4j and Python: Playing with graph data. This talk introduces the world of graphs, their utility, and the efficient use of the Neo4j graph database for some super cool day-to-day applications with the help of py2neo. Follow the development of the talk at http://www.sonalraj.com/pycon14.html and check out the details at the PyCon India 2014 website: http://in.pycon.org/funnel/2014/252-neo4j-and-python-playing-with-graph-data

TRANSCRIPT

Page 1: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

PyCon India 2014• • created by Sonal Raj •

Neo4j and Python Playing with graph data

Graph Everything

Sonal Raj

Page 2: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

The Plan for today

Step One: Graphs and NOSQL

Step Two: Neo4j and Cypher

Step Three: Py2neo and REST

Step Four: Use Cases

Page 4: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Once upon a time..

• Relational databases ruled the earth . .

• Data was stored in Tables, Rows and Columns

• Connections using Primary keys, Foreign keys . .

• That’s all that is relational about them

• No on-the-fly structural (schema) changes

• Horrible for Interconnected data ( joins, really? )

Page 9: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

In the NOSQL Space

RDBMS

Page 14: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

In the NOSQL Space

Elastic scaling – Scale out, not up

Big data, Transaction Friendly

Economical, can run on commodity hardware

End of the DBA rule

Flexible Data models

Page 15: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Graph Trivia

Page 16: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Where are Graphs . . .

Page 20: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Some Graphs we overlook . .

Page 23: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Apart from that

• Fraud Analyses

• Investment securities & debt analysis

• Recommendation Engines

• Impact Analysis in networks

Page 25: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

So, Why Graphs ?

• Increasing Connectivity of Data

• Increasing Semi-Structuredness

• Rising Complexity

Seven Bridges of Königsberg

Leonhard Euler in 1735
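
Euler's observation can be checked in a few lines. The sketch below assumes the usual textbook layout of the puzzle (four land masses, seven bridges); the labels A to D are illustrative names, not part of the talk. A connected graph has a walk that uses every edge exactly once only when it has zero or two vertices of odd degree.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# A is the central island; B, C and D are the other land masses (labels are illustrative).
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_euler_walk(edges):
    """True if a connected multigraph has a walk using every edge exactly once.

    Connectivity is assumed, not checked.
    """
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return odd in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

All four land masses have an odd number of bridges, so no such walk exists.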

Page 30: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Property Graphs

- Has nodes

- Has properties for each node

- Has Relationships

- Has properties for each relationship

Page 31: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Building Blocks

Nodes

Relationships

Labels

Graph Database

Properties
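
To make the building blocks concrete, here is a minimal in-memory sketch of a property graph in plain Python. It is only an illustration of the model (nodes, relationships, labels, properties); the class and method names are invented for this example and are not how Neo4j or py2neo store data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: list = field(default_factory=list)
    relationships: list = field(default_factory=list)

    def add_node(self, *labels, **properties):
        """Create a node carrying any number of labels and key/value properties."""
        node = Node(set(labels), dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, start, rel_type, end, **properties):
        """Create a typed, directed relationship with its own properties."""
        rel = Relationship(start, rel_type, end, dict(properties))
        self.relationships.append(rel)
        return rel

    def find(self, label):
        """Return all nodes that carry the given label."""
        return [n for n in self.nodes if label in n.labels]


graph = PropertyGraph()
alice = graph.add_node("Person", name="Alice")
bob = graph.add_node("Person", name="Bob")
knows = graph.relate(alice, "KNOWS", bob, since=2014)
print(len(graph.find("Person")), knows.type, knows.properties)
```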

Page 34: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Data Models

Native Graphs

Inherently store data as nodes and relationships.

The Other ones . . .

Data stored in tables, joins and aggregates to simulate a graph

Page 40: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Why Neo4j ?

Schema-less property graph

Handles complex connected data efficiently

Fully ACID Transactions

Highly Scalable, High Availability Clusters

REST API for servers. Can be embedded in applications on the JVM.

Cypher – a declarative querying solution

Graph DB with good native Python bindings . .

Page 41: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Cypher in action

• Highly expressive query language

• Cares about ‘what’ rather than ‘how’ to retrieve from the graph.

• Uses pattern matching expressions.

[Diagram: node 1 and node 2 joined by a relationship named "label"]

(1)-[:label]-(2)

Page 43: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Cypher in action

• To make life easy for some, it is inspired by SQL.

START n=node(1), m=node(2)
MATCH (n)-[r:label]-(m)
RETURN r

Page 44: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Cypher in action

Create

CREATE (n:Person { name : 'Chuck Norris', title : 'Analyst' })
RETURN n

MATCH (a:Person),(b:Person)
WHERE a.name = 'Chuck' AND b.name = 'Rajani'
CREATE (a)-[r:RELTYPE { name : 'cannot_find' }]->(b)
RETURN r

Read

MATCH (n) RETURN n // everything is returned

MATCH (n:Label) RETURN n // all with specific label

MATCH (Titanic { title: 'Titanic' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person

Page 45: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Cypher in action

Update

MATCH (n { name: 'Andres' })
SET n.surname = 'Taylor'
RETURN n

MATCH (peter { name: 'Peter' })
SET peter += { hungry: TRUE, position: 'Entrepreneur' }

Delete

MATCH (n { name: 'Peter' })
REMOVE n.title
REMOVE n:German
RETURN n

SET n.name = NULL // setting a property to NULL also removes it

Page 46: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

REST in peace !!

Create

POST http://localhost:7474/db/data/node
{
  "foo" : "bar"
}

POST http://localhost:7474/db/data/node/1/relationships
{
  "to" : "http://localhost:7474/db/data/node/10",
  "type" : "LOVES",
  "data" : {
    "foo" : "bar"
  }
}

POST http://localhost:7474/db/data/schema/index/person
{
  "property_keys" : [ "name" ]
}
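
The same three calls can be prepared from Python with only the standard library. This is a rough sketch: the `json_post` helper is a name made up for this example, and the requests are built but not sent, since sending them needs a Neo4j server listening on localhost:7474 as on the slide.

```python
import json
from urllib.request import Request

BASE = "http://localhost:7474/db/data"


def json_post(path, payload):
    """Build (but do not send) a JSON POST request for the REST API."""
    return Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Create a node, a relationship from node 1 to node 10, and a schema index.
create_node = json_post("/node", {"foo": "bar"})
create_rel = json_post(
    "/node/1/relationships",
    {"to": BASE + "/node/10", "type": "LOVES", "data": {"foo": "bar"}},
)
create_index = json_post("/schema/index/person", {"property_keys": ["name"]})

for req in (create_node, create_rel, create_index):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would perform the call against a running server.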

Page 47: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

REST in peace !!

Read

GET http://localhost:7474/db/data/node/144
GET http://localhost:7474/db/data/relationship/65
GET http://localhost:7474/db/data/relationship/61/properties
GET http://localhost:7474/db/data/schema/index/user

Update

PUT http://localhost:7474/db/data/relationship/66/properties
{
  "happy" : false
}

PUT http://localhost:7474/db/data/relationship/60/properties/cost
"deadly"

Delete

DELETE http://localhost:7474/db/data/node/308
DELETE http://localhost:7474/db/data/relationship/58
DELETE http://localhost:7474/db/data/schema/index/SomeLabel/name

Page 48: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Beauty of py2neo

Page 50: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

For the pythonistas

As simple as that!

from py2neo import neo4j, node, rel

# Connect to the server's REST endpoint
graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")

# Numbers in rel() are positions of the nodes within this create() call
die_hard = graph_db.create(
    node(name="Bruce Willis"),
    node(name="John McClane"),
    node(name="Alan Rickman"),
    node(name="Hans Gruber"),
    node(name="Nakatomi Plaza"),
    rel(0, "PLAYS", 1),
    rel(2, "PLAYS", 3),
    rel(1, "VISITS", 4),
    rel(3, "STEALS_FROM", 4),
    rel(1, "KILLS", 3),
)

Page 51: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

For the pythonistas

graphdb

• clear()

• create(*abstracts)

• delete(*entities)

• delete_index(content_type, index_name)

• find(label, property_key=None, property_value=None)

• get_index(content_type, index_name)

• get_indexed_node(index_name, key, value)

• ...

Page 52: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

For the pythonistas

graphdb

• get_indexed_relationship(index_name, key, value)

• get_properties(*entities)

• match(start_node=None, rel_type=None, end_node=None, bidirectional=False, limit=None)

• match_one(start_node=None, rel_type=None, end_node=None, bidirectional=False)

• node(id_)

• get_or_create_index(content_type, index_name, config=None)

• get_or_create_indexed_node(index_name, key, value, properties=None)

Page 53: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Complexity Handling

“ A graph database without traversals is just a persistent graph ”
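
As a rough illustration of what a traversal does, the sketch below runs a breadth-first search over the Die Hard relationships from the py2neo slide, held in a plain Python list rather than in Neo4j. The function names are invented for this example, and relationship direction is ignored for simplicity.

```python
from collections import deque

# (start, relationship type, end) triples, following the Die Hard example.
RELATIONSHIPS = [
    ("Bruce Willis", "PLAYS", "John McClane"),
    ("Alan Rickman", "PLAYS", "Hans Gruber"),
    ("John McClane", "VISITS", "Nakatomi Plaza"),
    ("Hans Gruber", "STEALS_FROM", "Nakatomi Plaza"),
    ("John McClane", "KILLS", "Hans Gruber"),
]


def neighbours(node):
    """Yield adjacent nodes, ignoring relationship direction."""
    for start, _type, end in RELATIONSHIPS:
        if start == node:
            yield end
        elif end == node:
            yield start


def shortest_path(source, target):
    """Breadth-first traversal; returns the nodes on a shortest path, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path("Bruce Willis", "Alan Rickman"))
```

A graph database does this kind of hop-by-hop walk natively, without scanning the full relationship list at every step as this toy version does.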

Page 54: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Paths with py2neo

# Create paths
from py2neo import neo4j, node

a, b, c = node(name="Alice"), node(name="Bob"), node(name="Carol")
abc = neo4j.Path(a, "KNOWS", b, "KNOWS", c)

d, e = node(name="Doctor"), node(name="Easter")
de = neo4j.Path(d, "KNOWS", e)

# Join paths
abcde = neo4j.Path.join(abc, "KNOWS", de)

# Commit to the db (graph_db is the GraphDatabaseService created earlier)
abcde.get_or_create(graph_db)

Page 55: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Schema, Indices with py2neo

# The classes
py2neo.neo4j.Schema
py2neo.neo4j.Index

# Some of their methods
create_index(label, property_key)
drop_index(label, property_key)
get_indexed_property_keys(label)
add_if_none(key, value, entity)

# Apache Lucene query
people = graph_db.get_or_create_index(neo4j.Node, "People")
s_people = people.query("family_name:S*")

Page 57: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

neo4j.Node

Page 58: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

neo4j.Relationship

Page 59: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Cypher with py2neo

# Create transaction object
from py2neo import cypher

session = cypher.Session("http://localhost:7474/")
tx = session.create_transaction()

# Append statements, then execute or commit
tx.append("some cypher query")
tx.append("some cypher query")
tx.execute()
tx.append("some cypher query")
tx.commit()

# The classical way
from py2neo import neo4j

graph_db = neo4j.GraphDatabaseService()
query = neo4j.CypherQuery(graph_db, "your cypher query")
query.execute()
# query.stream()

Page 60: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Command Line neotool

# Syntax of operation
neotool [<option>] <command> <args>
Or: python -m py2neo.tool ..

# Some serious examples
neotool clear
neotool cypher "start n=node(1) return n, n.name?"
neotool cypher-csv "start n=node(1) return n.name, n.age?"
neotool cypher-tsv "start n=node(1) return n.name, n.age?"

# Guess what, you can also access the shell
neotool shell

Page 62: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Neo4j level 2

• Batch Inserter

• High Availability

• Built-in online backup tools

• HTTPS support

Neo4J Framework.

• GraphUnit, for unit testing neo4j

• Libraries for performance and API testing

• Batch Transaction tools

• Transaction Event tools

• Some other utilities . .

Page 63: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Use Cases

Page 64: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Recommendation Engines

Complex pattern matching
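
One way to picture the pattern matching behind a recommendation is "people who share items with me also own these other items". The sketch below works on a small in-memory dictionary; the purchase data, names and scoring rule are all invented for illustration and are not taken from the talk.

```python
from collections import Counter

# Hypothetical "who bought what" data, invented for this example.
PURCHASES = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "carol": {"book", "chair", "desk"},
}


def recommend(user):
    """Rank items the user lacks, weighted by overlap with each other buyer."""
    mine = PURCHASES[user]
    scores = Counter()
    for other, items in PURCHASES.items():
        if other == user:
            continue
        overlap = len(mine & items)
        if overlap:
            # Every item the other buyer has and the user lacks earns 'overlap' points.
            for item in items - mine:
                scores[item] += overlap
    return [item for item, _ in scores.most_common()]


print(recommend("alice"))
```

In a graph database the same idea becomes a two-hop pattern from the user through shared items to other buyers and on to their items.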

Page 65: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Social Network Data

Many entities, highly interconnected

Page 66: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Map Data

Traversals and routing
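
Routing over map data is a weighted shortest-path problem. As a minimal sketch, the code below applies Dijkstra's algorithm to a tiny road map held in a dictionary; the place names and distances are hypothetical, and this is a standalone illustration rather than anything Neo4j-specific.

```python
import heapq

# Hypothetical road map: distances in km between invented places.
ROADS = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}


def shortest_route(source, target):
    """Dijkstra's algorithm; returns (total_distance, route) or None if unreachable."""
    heap = [(0, source, [source])]
    settled = set()
    while heap:
        dist, node, route = heapq.heappop(heap)
        if node == target:
            return dist, route
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight in ROADS[node].items():
            if nxt not in settled:
                heapq.heappush(heap, (dist + weight, nxt, route + [nxt]))
    return None


print(shortest_route("A", "D"))
```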

Page 68: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Python Family

Relatives for Neo4j

Neo4

Page 69: Neo4j and Python: Playing with graph data - PyCon India 2014 Talk

Thank You

Neo appreciates your patience.

• Sonal Raj

• http://www.sonalraj.com/

• http://github.com/sonal-raj/
