a walk in graph databases v1.0
Pierre De Wilde
4 May 2012
Global Brain Institute
VUB - ECCO Group
A Walk in Graph Databases
The Law of the Hammer
If the only tool you have is a hammer, everything looks like a nail.
Abraham Maslow - The Psychology of Science - 1966
The Law of the Relational Database
If the only tool you have is a relational database, everything looks like a table.
A Walk in Graph Databases - 2012
One size doesn't fit all
Scalability issue
- Scale up
- Scale out
Index-intensive issue
- Find data
- Join data
NoSQL?! No SQL? Not only SQL!
Scalability solutions
- Key-value stores
- Column databases
- Document databases
Index-intensive solution
- Graph databases
Query language for relational databases
SQL
ISUD or CRUD
Query language for graph databases
Gremlin is a graph traversal language
Traversal graph
Map of walk
Warm up
- Property Graph
- Graph Database
Walk with Gremlin
- Graph Manipulation
- Graph Traversal
Stretching
- Linked Data
- Global Graph
Graph
G = (V, E)
--. .-. .- .--. ....
One graph doesn't fit all
Marko A. Rodriguez and Peter Neubauer - Constructions from Dots and Lines - 2010
Property graph
A property graph is a directed, labeled, attributed multigraph.
Anatomy of a vertex
A vertex is composed of
- a unique identifier (id)
- a collection of properties
- a set of incoming edges (inE)
- a set of outgoing edges (outE)
Anatomy of an edge
An edge is composed of
- a unique identifier (id)
- an outgoing vertex (outV)
- a label
- an incoming vertex (inV)
- a collection of properties
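The two anatomies above can be sketched as plain data structures. This is an illustration in Python, not the Blueprints or Gremlin API: the class names, field names, and the `add_edge` helper are assumptions chosen to mirror the slide's id, properties, inE/outE, and outV/inV. The sample values reuse the marko/josh data shown later in the walk.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)
    in_e: list = field(default_factory=list)   # incoming edges (inE)
    out_e: list = field(default_factory=list)  # outgoing edges (outE)


@dataclass(eq=False)
class Edge:
    id: int
    out_v: Vertex  # the vertex the edge leaves (outV)
    label: str
    in_v: Vertex   # the vertex the edge enters (inV)
    properties: dict = field(default_factory=dict)


def add_edge(edge_id, out_v, label, in_v, **properties):
    """Hypothetical helper: create an edge and register it on both endpoints."""
    edge = Edge(edge_id, out_v, label, in_v, dict(properties))
    out_v.out_e.append(edge)
    in_v.in_e.append(edge)
    return edge


marko = Vertex(1, {"name": "marko", "age": 29})
josh = Vertex(4, {"name": "josh", "age": 32})
knows = add_edge(8, marko, "knows", josh, weight=1.0)

# Directed and labeled: the edge leaves marko and enters josh.
print(knows.out_v.properties["name"], knows.label, knows.in_v.properties["name"])
```

Because edges are objects with their own identifier, several edges may connect the same pair of vertices, which is what makes the structure a multigraph.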
Graph database
Key feature of a graph database
Index-free adjacency
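A rough way to picture index-free adjacency, sketched in Python under assumptions: the edge list is taken from the toy graph used later in the walk, and `Node`, `two_hops_by_join`, and `two_hops_by_adjacency` are hypothetical names. The table-style version searches the whole edge table on every hop, while the adjacency version only follows references stored on the vertex itself.

```python
# Relational-style edge table: (out vertex id, label, in vertex id).
EDGE_TABLE = [
    (1, "knows", 2), (1, "knows", 4), (1, "created", 3),
    (4, "created", 5), (4, "created", 3), (6, "created", 3),
]


def two_hops_by_join(start_id):
    """Each hop scans the full table (a global index would play the same role)."""
    first = [head for tail, _label, head in EDGE_TABLE if tail == start_id]
    return [head for mid in first
            for tail, _label, head in EDGE_TABLE if tail == mid]


class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.out = []  # direct references to adjacent Node objects


# The dictionary is only used while building the graph, not while walking it.
nodes = {i: Node(i) for i in range(1, 7)}
for tail, _label, head in EDGE_TABLE:
    nodes[tail].out.append(nodes[head])


def two_hops_by_adjacency(start):
    """Each hop follows local references; the rest of the graph is never touched."""
    return [far.id for near in start.out for far in near.out]


print(two_hops_by_join(1))
print(two_hops_by_adjacency(nodes[1]))
```

Both calls return the same vertices; the difference is that the cost of a hop in the second version depends on the vertex's own degree rather than on the size of the whole graph.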
Some graph database vendors
- Neo4j from Neo Technology: http://neo4j.org/
- OrientDB from Orient Technologies: http://www.orientdb.org/
- Dex from Sparsity-Technologies: http://www.sparsity-technologies.com/dex
- InfiniteGraph from Objectivity, Inc.: http://www.infinitegraph.com/
TinkerPop
Open source project in the graph space
TinkerPop family
https://github.com/tinkerpop
Gremlin
Gremlin is a graph traversal language
$ gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin>
Connect to a graph database
gremlin> g = new TinkerGraph(name)
gremlin> g = new Neo4jGraph(name)
gremlin> g = new OrientGraph(name)
gremlin> g = new DexGraph(name)
gremlin> g = new IGGraph(name)
Add a vertex / an edge
gremlin> v1 = g.addVertex()
gremlin> v2 = g.addVertex()
...
gremlin> g.addEdge(v1, 'knows', v2)
...
gremlin> g.loadGraphML(url)
Update a vertex
gremlin> v = g.getVertex(1)
==>v[1]
gremlin> v.getPropertyKeys()
==>age
==>name
gremlin> v.getProperty('name')
==>marko
gremlin> v.getProperty('age')
==>29
gremlin> v.setProperty('age',32)
==>32
gremlin> v.age
==>32
gremlin> v.name
==>marko
Update an edge
gremlin> e = g.getEdge(8)
==>e[8][1-knows->4]
gremlin> e.getPropertyKeys()
==>weight
gremlin> e.getProperty('weight')
==>1.0
gremlin> e.setProperty('weigth',0.9)
==>0.9
gremlin> e.map()
==>weigth=0.9
==>weight=1.0
gremlin> e.removeProperty('weigth')
==>0.9
Remove a vertex
gremlin> v = g.getVertex(3)
==>v[3]
gremlin> g.removeVertex(v)
==>null
Remove an edge
gremlin> e = g.getEdge(10)
==>e[10][4-created->5]
gremlin> g.removeEdge(e)
==>null
Disconnect from the graph database
gremlin> g.shutdown()
Graph traversal
Jump
- from vertex to edge
- from edge to vertex
- from vertex to vertex
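The three kinds of jump can be sketched in plain Python. This is not Gremlin: `out_e`, `in_v`, and `out` are hypothetical helper names that only mimic the steps used in this walk, and the edge list is assumed to be the toy graph the console examples run against (edge id 12 is inferred, since the transcript never prints it).

```python
# (edge id, out vertex id, label, in vertex id); id 12 is an assumption.
EDGES = [
    (7, 1, "knows", 2), (8, 1, "knows", 4), (9, 1, "created", 3),
    (10, 4, "created", 5), (11, 4, "created", 3), (12, 6, "created", 3),
]


def out_e(vertex):
    """Vertex to edge: the edges leaving a vertex."""
    return [edge for edge in EDGES if edge[1] == vertex]


def in_v(edge):
    """Edge to vertex: the vertex an edge points to."""
    return edge[3]


def out(vertex):
    """Vertex to vertex: a vertex-to-edge jump followed by an edge-to-vertex jump."""
    return [in_v(edge) for edge in out_e(vertex)]


print([edge[0] for edge in out_e(1)])       # ids of the edges leaving vertex 1
print(out(1))                               # vertices adjacent to vertex 1
print([far for near in out(1) for far in out(near)])  # two jumps away
```

The sketch returns elements in list order, so the ordering may differ from what the Gremlin console prints for the same graph.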
Graph traversal: starting the traversal
gremlin> g.v(1)
==>v[1]
Graph traversal: outgoing edges
gremlin> g.v(1).outE
==>e[7][1-knows->2]
==>e[9][1-created->3]
==>e[8][1-knows->4]
Graph traversal: incoming vertices
gremlin> g.v(1).outE.inV
==>v[2]
==>v[4]
==>v[3]
Graph traversal: outgoing edges (cont.)
gremlin> g.v(1).outE.inV.outE
==>e[10][4-created->5]
==>e[11][4-created->3]
Graph traversal: incoming vertices (cont.)
gremlin> g.v(1).outE.inV.outE.inV
==>v[5]
==>v[3]
Graph traversal: ending the traversal
gremlin> g.v(1).outE.inV.outE.inV.outE
Graph traversal: starting vertex
gremlin> g.v(1)
==>v[1]
Graph traversal: adjacent vertices
gremlin> g.v(1).out
==>v[2]
==>v[4]
==>v[3]
Graph traversal: adjacent vertices (cont.)
gremlin> g.v(1).out.out
==>v[5]
==>v[3]
Graph traversal: starting vertex
gremlin> g.v(1)
==>v[1]
Graph traversal: labeled outgoing edges
gremlin> g.v(1).outE('created')
==>e[9][1-created->3]
Graph traversal: labeled adjacent vertices
gremlin> g.v(1).outE('created').inV
==>v[3]
gremlin> g.v(1).out('created')
==>v[3]
Graph traversal: labeled adjacent (cont.)
gremlin> g.v(1).out('created').in('created')
==>v[1]
==>v[4]
==>v[6]
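The labeled out-then-in pattern reads as "who created what vertex 1 created". A plain-Python sketch of that idea follows, assuming the toy graph's edge list; `out` and `in_` are hypothetical helpers, not Gremlin steps.

```python
# (out vertex id, label, in vertex id), assumed from the toy graph in this walk.
EDGES = [
    (1, "knows", 2), (1, "knows", 4), (1, "created", 3),
    (4, "created", 5), (4, "created", 3), (6, "created", 3),
]


def out(vertex, label):
    """Follow outgoing edges that carry the given label."""
    return [head for tail, lbl, head in EDGES if tail == vertex and lbl == label]


def in_(vertex, label):
    """Follow incoming edges that carry the given label, backwards."""
    return [tail for tail, lbl, head in EDGES if head == vertex and lbl == label]


created = out(1, "created")
co_creators = [who for thing in created for who in in_(thing, "created")]

print(created)
print(co_creators)
```

The starting vertex shows up among its own co-creators, as it does in the console output; excluding it would take an extra filtering step.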
Graph traversal and ...
- index
- transform
- filter
- compute
- manipulate
- loop
- path
Graph traversal and index
gremlin> g.idx('vertices')[[name:'marko']]
==>v[1]
Graph traversal and transform
gremlin> g.v(1).outE.label.dedup
==>knows
==>created
gremlin> g.v(1).out('knows').name
==>vadas
==>josh
Graph traversal and filter
gremlin> g.v(1).out('knows').age
==>27
==>32
gremlin> g.v(1).out('knows').filter{it.age>30}.age
==>32
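Transform and filter map naturally onto list comprehensions. The sketch below is plain Python rather than Gremlin: the `PEOPLE` and `KNOWS` dictionaries are assumed stand-ins holding only the name and age values printed above, and `knows` is a hypothetical helper.

```python
# Properties of the two vertices that vertex 1 knows, as printed in the console.
PEOPLE = {
    2: {"name": "vadas", "age": 27},
    4: {"name": "josh", "age": 32},
}
KNOWS = {1: [2, 4]}  # vertex id -> ids of the vertices it knows


def knows(vertex):
    return KNOWS.get(vertex, [])


# Transform: replace each vertex by one of its properties.
names = [PEOPLE[v]["name"] for v in knows(1)]
ages = [PEOPLE[v]["age"] for v in knows(1)]

# Filter, then transform: keep only the vertices that pass the predicate.
ages_over_30 = [PEOPLE[v]["age"] for v in knows(1) if PEOPLE[v]["age"] > 30]

print(names, ages, ages_over_30)
```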
Graph traversal and compute
gremlin> g.v(1).outE.weight
==>0.5
==>1.0
==>0.4
gremlin> g.v(1).outE.weight.mean()
==>0.6333333353201548
Graph traversal and manipulate
gremlin> g.v(1).outE.sideEffect{it.weight+=0.1}.weight
==>0.6
==>1.1
==>0.5
Graph traversal and loop
gremlin> g.v(1).out.loop(1){it.loops<3}
==>v[5]
==>v[3]
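The loop above returns the same vertices as g.v(1).out.out earlier in the walk, so it can be read as "take the out step twice". A plain-Python sketch of that reading follows, with the toy graph's edges assumed and `repeat_out` a hypothetical helper name.

```python
# (out vertex id, in vertex id), assumed from the toy graph in this walk.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]


def out(vertex):
    return [head for tail, head in EDGES if tail == vertex]


def repeat_out(start, hops):
    """Apply the out step `hops` times, carrying every walker forward."""
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for vertex in frontier for nxt in out(vertex)]
    return frontier


print(repeat_out(1, 1))
print(repeat_out(1, 2))
```

Walkers that reach a vertex with no outgoing edges simply drop out, which is why two hops from vertex 1 leave only two results.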
Graph traversal and path
gremlin> g.v(1).outE.inV.path
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
==>[v[1], e[9][1-created->3], v[3]]
gremlin> g.v(1).out.path
==>[v[1], v[2]]
==>[v[1], v[4]]
==>[v[1], v[3]]
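A path is the history of a walker: every element it visited on the way to its current position. A minimal Python sketch of that bookkeeping, assuming the toy graph's edges; `extend_paths` is a hypothetical name, not a Gremlin step.

```python
# (out vertex id, in vertex id), assumed from the toy graph in this walk.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]


def extend_paths(paths):
    """Take one out step from the last vertex of every path, keeping the history."""
    return [path + [head]
            for path in paths
            for tail, head in EDGES if tail == path[-1]]


one_hop = extend_paths([[1]])
two_hops = extend_paths(one_hop)

print(one_hop)
print(two_hops)
```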
Global traversal: in-degree distribution
gremlin> m=[:].withDefault{0}; g.V.sideEffect{m[it.in.count()]+=1}.iterate(); m.sort()
==>0=2
==>1=3
==>3=1
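The same in-degree distribution can be computed in plain Python. The vertex ids and edges are assumed to be those of the toy graph; the variable names are illustrative.

```python
from collections import Counter

VERTICES = [1, 2, 3, 4, 5, 6]
# (out vertex id, in vertex id), assumed from the toy graph in this walk.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]

# In-degree of every vertex, including those with no incoming edge.
in_degree = {vertex: 0 for vertex in VERTICES}
for _tail, head in EDGES:
    in_degree[head] += 1

# Distribution: how many vertices share each in-degree.
distribution = Counter(in_degree.values())

print(sorted(distribution.items()))
```

Two vertices have no incoming edge, three have exactly one, and one vertex has three, which agrees with the console output above.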
Walk is ending
Gremlin is a flexible graph traversal language
Linked Data
http://www.w3.org/DesignIssues/LinkedData.html
Linked Data cloud
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Linked Data initiative
http://freeyourmetadata.org/
Linked Data and Gremlin
gremlin> g = new SparqlRepositorySailGraph("http://dbpedia.org/sparql")
gremlin> v = g.v('http://dbpedia.org/resource/Global_brain')
==>v[http://dbpedia.org/resource/Global_brain]
gremlin> v.out('http://www.w3.org/2000/01/rdf-schema#comment').has('lang','en').value
==>The Global Brain is a metaphor for the worldwide intelligent network...
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV
==>v[http://dbpedia.org/resource/Francis_Heylighen]
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV.outE('http://dbpedia.org/ontology/knownFor').inV
==>v[http://dbpedia.org/resource/Self-organization]
==>v[http://dbpedia.org/resource/Memetics]
==>v[http://dbpedia.org/resource/Global_brain]
Graph and Brain
Global Graph
I called this graph the Semantic Web, but maybe it should have been Giant Global Graph.
Tim Berners-Lee - timbl's blog - 2007
Internet: net of computers
World Wide Web: web of documents
Giant Global Graph: graph of metadata
Thank you
Logos created by Ketrina Yim for TinkerPop geeks
Images created by Flickr Creative Commons Artists
Graphs created by Memotive Concept Mapping tool
http://tinkerpop.com