designing cypher - goto conference · 2016. 10. 15. · designing cypher (a graph query language)...

Designing Cypher (a graph query language)

Narrated by Tobias Lindaaker, Developer at Neo Technology

tobias@neotechnology.com #neo4j,#cypher @thobe

Once upon a time…

~2001

in the kingdom of Sweden

there was a DBMS that had some interesting things going for it…

a few years later…

This presentation has been given at StrangeLoop; the video is online: youtu.be/l-n8yj6_RgU

This is a Stand-Alone Sequel

~2016

The Precursors to Cypher

Timeline: 2001, 2006, 2010 (Embedded Java API; HTTP API, server model; custom code deployment)

July 2011: First release of Cypher

The Origin of Cypher

(query)--[MODELED_AS]--->(drawing)
   ^                         |
   |                         |
[IMPLEMENTS]          [TRANSLATED_TO]
   |                         |
   |                         v
(code)<-[IN_COMMENT_OF]-(ascii art)

MATCH (query)-[:MODELED_AS]->(drawing),
      (code)-[:IMPLEMENTS]->(query),
      (drawing)-[:TRANSLATED_TO]->(ascii_art),
      (ascii_art)-[:IN_COMMENT_OF]->(code)
WHERE query.id = {query_id}
RETURN code.source

The Origin of Cypher

v1: Read Only

START john=node:Person(name="John")
MATCH (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
RETURN foaf

July 2011 (neo4j 1.4)

v2: Graph writes (no update of search structures)

START john=node:Person(name="John")
MATCH (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
  AND NOT (john)-[:RECOMMENDATION]->(foaf)
CREATE (john)-[:RECOMMENDATION]->(foaf)
RETURN foaf

Oct 2012 (neo4j 1.8)

Neo4j 2.0: labels and proper indexes

Cypher “feature complete”

MATCH (john:Person {name: "John"}),
      (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
MERGE (john)-[:RECOMMENDATION]->(foaf)

Dec 2013 (neo4j 2.0)

A brief Cypher overview

Pattern matching: MATCH (n), (a)-[:REL]->(b)
+ filtering: WHERE n=b AND a.val < n.val
Returning results: RETURN x.name AS name ORDER BY x.score LIMIT 10
Creating data: CREATE (p:Person {name: "Tobias"})
Updating data: SET n.name = "John"
Deleting data: REMOVE p.age
               DELETE n

WITH: carrying results from one query part into the next

MATCH (a)-[:KNOWS]->(b)
WITH a, avg(b.age) AS frAge
WHERE frAge > 15
MATCH (a)<-[:FOLLOWS]-(c)
RETURN c.name
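To make the data flow of the WITH example concrete, here is a rough Python equivalent over a tiny in-memory data set. The names `knows`, `age` and `follows` and all the sample data are invented for illustration; this only mirrors the aggregate, filter, continue shape of the query and says nothing about how Neo4j executes it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data (not from the talk)
knows = [("ann", "bob"), ("ann", "cid"), ("dan", "eve")]   # (a, b): a KNOWS b
age = {"bob": 20, "cid": 30, "eve": 10}
follows = [("fay", "ann"), ("gus", "dan")]                  # (c, a): c FOLLOWS a

# MATCH (a)-[:KNOWS]->(b) WITH a, avg(b.age) AS frAge
friend_ages = defaultdict(list)
for a, b in knows:
    friend_ages[a].append(age[b])
fr_age = {a: mean(ages) for a, ages in friend_ages.items()}

# WHERE frAge > 15 (filters the aggregated rows, not the original matches)
kept = {a for a, average in fr_age.items() if average > 15}

# MATCH (a)<-[:FOLLOWS]-(c) RETURN c.name
result = [c for c, followed in follows if followed in kept]
```

Only `ann` survives the filter (her friends average 25), so the final step returns her single follower.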

OPTIONAL MATCH: match-or-null

MERGE: match-or-create, with ON MATCH and ON CREATE to perform updates
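The match-or-create idea behind MERGE can be sketched in a few lines of Python. The function `merge_node`, the dictionary-based `store` and the callback parameters are all hypothetical stand-ins; real MERGE works on whole patterns and has transactional semantics that this toy ignores.

```python
def merge_node(store, label, key, on_create=None, on_match=None):
    """Return the node identified by (label, key), creating it if it is absent."""
    node = store.get((label, key))
    if node is None:
        # Nothing matched: create the node, then run the ON CREATE action
        node = {"label": label, "key": key}
        store[(label, key)] = node
        if on_create is not None:
            on_create(node)
    elif on_match is not None:
        # Something matched: run the ON MATCH action on the existing node
        on_match(node)
    return node


def mark_created(node):
    node["created"] = True


def count_visit(node):
    node["seen"] = node.get("seen", 0) + 1


store = {}
first = merge_node(store, "Person", "Tobias", mark_created, count_visit)
second = merge_node(store, "Person", "Tobias", mark_created, count_visit)
```

The second call finds the node created by the first, so the store still holds one node and only the ON MATCH action runs the second time.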

Multiple versions

Put the version in the query, and support multiple versions of Cypher in the Neo4j Server

CYPHER 2.0 MATCH (n) RETURN n
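One way to picture version selection is a dispatcher that strips the `CYPHER <version>` prefix and hands the rest of the query to the matching compiler. The sketch below is an assumption about the general shape only: `split_version`, `run`, the `compilers` mapping and the default version are hypothetical names, not Neo4j internals.

```python
import re

# Optional "CYPHER <major>.<minor>" prefix, followed by the actual query text
_VERSION_PREFIX = re.compile(
    r"^\s*CYPHER\s+(\d+\.\d+)\s+(.*)$", re.IGNORECASE | re.DOTALL
)


def split_version(query, default_version="2.0"):
    """Return (version, query_body); fall back to a default when no prefix is given."""
    match = _VERSION_PREFIX.match(query)
    if match:
        return match.group(1), match.group(2)
    return default_version, query.strip()


def run(query, compilers):
    """Dispatch the query body to the compiler registered for its version."""
    version, body = split_version(query)
    if version not in compilers:
        raise ValueError("unsupported Cypher version: " + version)
    return compilers[version](body)


# Stand-in compilers that just tag the query with the version that handled it
compilers = {
    "1.9": lambda body: ("compiled by 1.9", body),
    "2.0": lambda body: ("compiled by 2.0", body),
}
```

With this shape, old queries keep running against the compiler they were written for, while unprefixed queries follow whatever the server treats as its default.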

Improve based on user feedback… good, but not as good as we hoped…

photo credits: Columbia Journalism Review

Query caching - amortise cost of optimisation
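A minimal sketch of the amortisation idea, assuming the cache is keyed on the query text: a query that uses parameters keeps the same text across calls and is compiled once, while a query with inlined literals is compiled once per distinct value. `PlanCache` and `fake_compile` are hypothetical; the real planner and its cache are far more involved.

```python
class PlanCache:
    """Toy plan cache: compile each distinct query text once, then reuse the plan."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._plans = {}
        self.compilations = 0

    def plan_for(self, query_text):
        plan = self._plans.get(query_text)
        if plan is None:
            plan = self._compile(query_text)
            self._plans[query_text] = plan
            self.compilations += 1
        return plan


def fake_compile(query_text):
    # Stand-in for parsing, analysis and optimisation
    return ("plan", query_text)


names = ["John", "Tobias", "Marty"]

# Same text every time; the value of {name} would be supplied separately
with_parameters = PlanCache(fake_compile)
for _ in names:
    with_parameters.plan_for("MATCH (p:Person {name: {name}}) RETURN p")

# A different text for every value, so every call misses the cache
with_literals = PlanCache(fake_compile)
for name in names:
    with_literals.plan_for('MATCH (p:Person {name: "%s"}) RETURN p' % name)
```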

Whole program analysis

MATCH (john:Person {name: "John"}),
      (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
RETURN friend.name

Whole program analysis

MATCH (john:Person {name: "John"}),
      (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
RETURN friend.name

      -[:KNOWS]-(foaf)

Recent additions: Procedures

CALL dbms.listProcedures() YIELD name, signature, description

MATCH (me:Person {name: {myName}}),
      (me)-[:KNOWS]-()-[:KNOWS]-(foaf)
WHERE NOT (me)-[:KNOWS]-(foaf)
CALL apoc.load.jdbc('mysql:…',
     "SELECT * FROM people WHERE id = " + foaf.id) YIELD row
RETURN foaf.name, row.address

it isn’t all roses

Semantically different things look similar

MATCH (n:Measurement) RETURN abs(n.value)

MATCH (n:Measurement) RETURN avg(n.value)

MATCH (a:Foo {key: {a}}), (b:Foo {key: {b}}),
      p=(a)-[:BAR*1..]-(b)
WHERE all(n IN nodes(p) WHERE n.value > {m})
RETURN length(p)
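The three queries above all look like function calls, yet they behave differently. A small Python analogy, using made-up measurement values, shows the difference in result shape: a scalar function yields one row per match, an aggregation collapses the matches into one row, and `all(...)` is a predicate with its own nested variable and WHERE.

```python
from statistics import mean

# Hypothetical n.value for three matched :Measurement nodes
values = [-2.0, 4.0, -8.0]

# RETURN abs(n.value): scalar function, one output row per matched node
abs_rows = [abs(v) for v in values]

# RETURN avg(n.value): aggregation, all matched rows collapse into a single row
avg_rows = [mean(values)]

# all(n IN nodes(p) WHERE n.value > {m}): a predicate over a collection;
# m plays the role of the {m} parameter
m = -10.0
all_above_m = all(v > m for v in values)
```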

Constructs with intricate semantics

Deprecated:
MATCH (a:Foo {key: {a}}), (b:Foo {key: {b}})
CREATE UNIQUE (a)-[:KNOWS]-(u)-[:KNOWS]-(b)

Better:
MATCH (a:Foo {key: {a}}), (b:Foo {key: {b}})
MERGE (a)-[:KNOWS]-(u)-[:KNOWS]-(b)

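Constructs with intricate semantics

Example graph: two nodes and one relationship, (key:x)-->(key:y)

MATCH (a)-->(b)<--(c)
RETURN a.key, b.key, c.key

Result: <no rows>

vs

MATCH (a)-->(b)
MATCH (b)<--(c)
RETURN a.key, b.key, c.key

Result: a.key: 'x', b.key: 'y', c.key: 'x'

Within a single MATCH a relationship is matched at most once, so the first query behaves like:

MATCH (a)-[x]->(b)
MATCH (b)<-[y]-(c)
WHERE x <> y
RETURN a.key, b.key, c.key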
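The difference between the two forms can be reproduced with a brute-force matcher over the two-node graph from the slide. This is only a toy simulation: `nodes`, `rels` and the two functions are invented here, and the graph (one relationship from the `x` node to the `y` node) is inferred from the results shown.

```python
# Toy graph: two nodes and a single directed relationship, x --> y
nodes = {1: {"key": "x"}, 2: {"key": "y"}}
rels = [("r1", 1, 2)]  # (relationship id, start node, end node)


def single_match():
    """MATCH (a)-->(b)<--(c): one pattern, so a relationship may be used only once."""
    rows = []
    for rel_ab, a, b in rels:
        for rel_cb, c, end in rels:
            if end == b and rel_ab != rel_cb:
                rows.append((nodes[a]["key"], nodes[b]["key"], nodes[c]["key"]))
    return rows


def two_matches():
    """MATCH (a)-->(b) MATCH (b)<--(c): the same relationship may satisfy both clauses."""
    rows = []
    for _rel_ab, a, b in rels:
        for _rel_cb, c, end in rels:
            if end == b:
                rows.append((nodes[a]["key"], nodes[b]["key"], nodes[c]["key"]))
    return rows
```

With only one relationship in the graph, `single_match()` has nothing left to bind to the second hop, while `two_matches()` reuses the relationship and binds `c` to the same node as `a`.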

“Syntactic sugar” vs single canonical syntax

MATCH (n) WHERE n.foo = "bar"
vs
MATCH (n {foo: "bar"})

MATCH (n WHERE foo < 10)
vs
MATCH (n) WHERE n.foo < 10

Predicates on variable length paths

MATCH (a)-[r*]->(b) WHERE all(x IN r WHERE x.weight > 0)

would be simpler if it could be written as:

MATCH (a)-[r* WHERE weight > 0]->(b)

although there are other problems that would come from that…

LOAD CSV will be replaced

LOAD CSV WITH HEADERS FROM "some.csv" AS line

becomes

CALL apoc.load.csv("some.csv") YIELD map AS line

Parameters avoid SQL injection

Labels and Relationship Types cannot be passed as parameters

Not possible: MATCH (n:{label}) SET n:{label}
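Since a label cannot be a parameter, an application that needs a dynamic label has to splice it into the query text itself. One common precaution, sketched below under that assumption, is to check the label against a fixed allow-list and keep every value as a parameter. `ALLOWED_LABELS` and `build_lookup_query` are hypothetical names; the `{name}` placeholder follows the parameter syntax used at the time of this talk.

```python
# Hypothetical allow-list of labels the application is willing to splice into a query
ALLOWED_LABELS = frozenset({"Person", "Measurement"})


def build_lookup_query(label, name):
    """Return (query_text, parameters) for looking up a node by label and name."""
    if label not in ALLOWED_LABELS:
        raise ValueError("label not allowed: %r" % (label,))
    # The label comes from the allow-list, so it is safe to place in the text;
    # the property value is never concatenated and travels as a parameter.
    query_text = "MATCH (n:%s) WHERE n.name = {name} RETURN n" % label
    return query_text, {"name": name}


query_text, parameters = build_lookup_query("Person", "John")
```

An attempt such as `build_lookup_query("Person) DETACH DELETE (m", "x")` is rejected before any query text is built.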

[Figure: railroad diagram of the Cypher grammar, https://s3.amazonaws.com/artifacts.opencypher.org/railroad/Cypher.svg (6/20/2016). Elements shown: QueryOptions; CREATE / DROP Index, UniqueConstraint, NodePropertyExistenceConstraint, RelationshipPropertyExistenceConstraint; LoadCSV, Start, Match, Unwind, Merge, Create, Set, Delete, Remove, Foreach, With, Return; UNION [ALL]; BulkImportQuery.]

openCypher

Opening the language design process

Implementations for other platforms

Compatibility test suite

Grammar specification

Reference implementation

Defining the next version of Cypher

https://github.com/openCypher/openCypher

Future features

Cypher keeps evolving

You can get involved through openCypher

Sub-queries are powerful

MATCH (me:Person {name: {myName}}),
      (me)-[:FRIEND]-(friend)
WITH me, collect(friend) AS friends
MATCH (me)-[:ENEMY]-(enemy)
RETURN me, friends, collect(enemy) AS enemies

Sub-queries for side effects

MATCH (me:Person {name: {myName}}),
      (me)-[:FRIEND]-(friend)
WITH me, collect(friend) AS friends
MATCH (me)-[:ENEMY]-(enemy)
DO {
  UNWIND friends AS friend
  MERGE (friend)-[:ENEMY]-(enemy)
}

DO will replace FOREACH

Existential sub-queries

MATCH (actor:Actor)
WHERE EXISTS {
  (actor)-[:ACTED_IN]->(movie),
  (other:Actor)-[:ACTED_IN]->(movie)
  WHERE other.name = actor.name AND actor <> other
}
RETURN actor

More Sub-queries

MATCH (me:User {name: {username}})-[:FOLLOWS]->(user)
WHERE user.country = {country}
MATCH {
  // authored tweets
  MATCH (user)<-[:AUTHORED]-(tweet:Tweet)
  RETURN tweet, tweet.time AS time
  UNION
  // favorited tweets
  MATCH (user)<-[:HAS_FAVOURITE]-(favorite)-[:TARGETS]->(tweet:Tweet)
  RETURN tweet, favorite.time AS time
}
RETURN DISTINCT tweet
ORDER BY time DESC LIMIT 100

Projections and comprehension

MATCH (person:Person {ssn: {mySSN}}),
      (person)-[emp:EMPLOYED_BY]->(employer)
WHERE NOT exists(emp.endDate)
WITH person, employer.name AS employer
ORDER BY emp.startDate DESC LIMIT 1
RETURN person {
  .ssn, .firstName, .lastName, employer,
  friends: [
    MATCH (person)-[:FRIEND]-(friend)
    WHERE friend.age > 12
    RETURN friend { .ssn, .firstName, .lastName }
  ]
}

Inspired by Facebook’s GraphQL

Returns a single column ‘person’ containing:

{
  ssn: "192168-0001", firstName: "John", lastName: "Smith",
  employer: "The Company, Inc.",
  friends: [
    { ssn: "009933-1126", firstName: "Marty", lastName: "McFly" },
    { ssn: "123987-4506", firstName: "Emmet", lastName: "Brown" }
  ]
}

The End