a walk in graph databases v1.0

Post on 08-May-2015

Pierre De Wilde
4 May 2012
Global Brain Institute
VUB - ECCO Group

A Walk in Graph Databases

The Law of the Hammer

If the only tool you have is a hammer, everything looks like a nail.

Abraham Maslow - The Psychology of Science - 1966

The Law of the Relational Database

If the only tool you have is a relational database, everything looks like a table.

A Walk in Graph Databases - 2012

One size doesn't fit all

Scalability issue
- Scale up
- Scale out

Index-intensive issue
- Find data
- Join data

NoSQL ?! No SQL ? Not only SQL !

Scalability solutions
- Key-value stores
- Column databases
- Document databases

Index-intensive solution

Graph databases

Query language for relational databases

SQL

ISUD or CRUD

Query language for graph databases

Gremlin is a graph traversal language

Traversal graph

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Graph

G = (V, E)

--. .-. .- .--. ....

One graph doesn't fit all

Marko A. Rodriguez and Peter Neubauer - Constructions from Dots and Lines - 2010

Property graph

A property graph is a directed, labeled, attributed multigraph.

Anatomy of a vertex

A vertex is composed of
- a unique identifier (id)
- a collection of properties
- a set of incoming edges (inE)
- a set of outgoing edges (outE)

Anatomy of an edge

An edge is composed of
- a unique identifier (id)
- an outgoing vertex (outV)
- a label
- an incoming vertex (inV)
- a collection of properties
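The two anatomies above map naturally onto two small record types. The following is a minimal Python sketch, not the Blueprints API: the names `Vertex`, `Edge` and `add_edge` are hypothetical, and the sample data (marko, josh, edge 8) is taken from the console sessions later in the deck.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)
    in_e: list = field(default_factory=list)   # incoming edges (inE)
    out_e: list = field(default_factory=list)  # outgoing edges (outE)


@dataclass(eq=False)
class Edge:
    id: int
    out_v: Vertex  # the vertex the edge leaves (outV)
    label: str
    in_v: Vertex   # the vertex the edge enters (inV)
    properties: dict = field(default_factory=dict)


def add_edge(edge_id, out_v, label, in_v, **properties):
    """Create an edge and register it on both of its endpoints."""
    edge = Edge(edge_id, out_v, label, in_v, dict(properties))
    out_v.out_e.append(edge)
    in_v.in_e.append(edge)
    return edge


marko = Vertex(1, {"name": "marko", "age": 29})
josh = Vertex(4, {"name": "josh", "age": 32})
knows = add_edge(8, marko, "knows", josh, weight=1.0)
```

Because each edge has its own id, nothing prevents a second edge between the same two vertices, which is what makes the structure a multigraph.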

Graph database

Key feature of a graph database

Index-free adjacency
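Index-free adjacency means that each vertex holds direct references to its neighbours, so a hop does not require searching a global edge table or index. The sketch below contrasts the two styles in plain Python; the names `EDGE_TABLE`, `Node`, `two_hops_by_scan` and `two_hops` are hypothetical, and the edges are those of the toy graph used later in the deck.

```python
# Edge rows as a relational table might hold them: (out vertex, label, in vertex).
EDGE_TABLE = [
    (1, "knows", 2),
    (1, "knows", 4),
    (1, "created", 3),
    (4, "created", 5),
    (4, "created", 3),
    (6, "created", 3),
]


def out_by_scan(vertex_id):
    """Relational style: every hop searches the whole edge table."""
    return [in_v for out_v, _label, in_v in EDGE_TABLE if out_v == vertex_id]


def two_hops_by_scan(vertex_id):
    return [far for near in out_by_scan(vertex_id) for far in out_by_scan(near)]


class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.out = []  # direct references to adjacent Node objects


NODES = {i: Node(i) for i in range(1, 7)}
for out_v, _label, in_v in EDGE_TABLE:
    NODES[out_v].out.append(NODES[in_v])


def two_hops(start):
    """Graph style: each hop only follows references held by the node itself."""
    return [far for near in start.out for far in near.out]


print([n.id for n in two_hops(NODES[1])])  # [5, 3]
```

Only the starting vertex is found by id; after that, the cost of each hop depends on the degree of the current vertex rather than on the size of the graph.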

Some graph database vendors

- Neo4j from Neo Technology: http://neo4j.org/
- OrientDB from Orient Technologies: http://www.orientdb.org/
- Dex from Sparsity-Technologies: http://www.sparsity-technologies.com/dex
- InfiniteGraph from Objectivity, Inc.: http://www.infinitegraph.com/

TinkerPop

Open source project in the graph space

TinkerPop family

https://github.com/tinkerpop

Gremlin

Gremlin is a graph traversal language

$ gremlin.sh
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin>

Connect to a graph database

gremlin> g = new TinkerGraph(name)
gremlin> g = new Neo4jGraph(name)
gremlin> g = new OrientGraph(name)
gremlin> g = new DexGraph(name)
gremlin> g = new IGGraph(name)

Add a vertex / an edge

gremlin> v1 = g.addVertex()
gremlin> v2 = g.addVertex()
...
gremlin> g.addEdge(v1, 'knows', v2)
...
gremlin> g.loadGraphML(url)

Update a vertex

gremlin> v = g.getVertex(1)
==>v[1]
gremlin> v.getPropertyKeys()
==>age
==>name
gremlin> v.getProperty('name')
==>marko
gremlin> v.getProperty('age')
==>29
gremlin> v.setProperty('age',32)
==>32
gremlin> v.age
==>32
gremlin> v.name
==>marko

Update an edge

gremlin> e = g.getEdge(8)
==>e[8][1-knows->4]
gremlin> e.getPropertyKeys()
==>weight
gremlin> e.getProperty('weight')
==>1.0
gremlin> e.setProperty('weigth',0.9)
==>0.9
gremlin> e.map()
==>weigth=0.9
==>weight=1.0
gremlin> e.removeProperty('weigth')
==>0.9

Remove a vertex

gremlin> v = g.getVertex(3)
==>v[3]
gremlin> g.removeVertex(v)
==>null

Remove an edge

gremlin> e = g.getEdge(10)
==>e[10][4-created->5]
gremlin> g.removeEdge(e)
==>null

Disconnect from the graph database

gremlin> g.shutdown()

Graph traversal

Jump
- from vertex to edge
- from edge to vertex
- from vertex to vertex

Graph traversal: starting the traversal

gremlin> g.v(1)
==>v[1]

Graph traversal: outgoing edges

gremlin> g.v(1).outE
==>e[7][1-knows->2]
==>e[9][1-created->3]
==>e[8][1-knows->4]

Graph traversal: incoming vertices

gremlin> g.v(1).outE.inV
==>v[2]
==>v[4]
==>v[3]

Graph traversal: outgoing edges (cont.)

gremlin> g.v(1).outE.inV.outE
==>e[10][4-created->5]
==>e[11][4-created->3]

Graph traversal: incoming vertices (cont.)

gremlin> g.v(1).outE.inV.outE.inV
==>v[5]
==>v[3]

Graph traversal: ending the traversal

gremlin> g.v(1).outE.inV.outE.inV.outE
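The vertex-to-edge and edge-to-vertex jumps traced above can be imitated in a few lines of plain Python. This is only a sketch of the idea, not Gremlin's implementation: `out_e` and `in_v` are hypothetical helper names, and the edge list is read off the console output of the preceding slides.

```python
# (edge id, out vertex, label, in vertex), as printed by the console above.
EDGES = [
    (7, 1, "knows", 2),
    (8, 1, "knows", 4),
    (9, 1, "created", 3),
    (10, 4, "created", 5),
    (11, 4, "created", 3),
]


def out_e(vertices):
    """Vertex -> edge jump: every edge leaving any of the given vertices."""
    return [e for v in vertices for e in EDGES if e[1] == v]


def in_v(edges):
    """Edge -> vertex jump: the vertex each edge points to."""
    return [e[3] for e in edges]


step1 = out_e([1])     # edges 7, 8, 9
step2 = in_v(step1)    # vertices 2, 4, 3
step3 = out_e(step2)   # edges 10, 11 (only vertex 4 has outgoing edges)
step4 = in_v(step3)    # vertices 5, 3
step5 = out_e(step4)   # empty: vertices 5 and 3 have no outgoing edges
```

The last step returns nothing, which is why the final console line above prints no result: the walk simply runs out of edges.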

Graph traversal: starting vertex

gremlin> g.v(1)
==>v[1]

Graph traversal: adjacent vertices

gremlin> g.v(1).out
==>v[2]
==>v[4]
==>v[3]

Graph traversal: adjacent vertices (cont.)

gremlin> g.v(1).out.out
==>v[5]
==>v[3]

Graph traversal: starting vertex

gremlin> g.v(1)
==>v[1]

Graph traversal: labeled outgoing edges

gremlin> g.v(1).outE('created')
==>e[9][1-created->3]

Graph traversal: labeled adjacent vertices

gremlin> g.v(1).outE('created').inV
==>v[3]
gremlin> g.v(1).out('created')
==>v[3]

Graph traversal: labeled adjacent (cont.)

gremlin> g.v(1).out('created').in('created')
==>v[1]
==>v[4]
==>v[6]
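As the sessions above show, `out` is shorthand for `outE.inV`, and a label restricts which edges are followed. A minimal Python sketch of that behaviour follows; `out` and `in_` are hypothetical helpers, and the edge from vertex 6 to vertex 3 is inferred from the last console result.

```python
# (out vertex, label, in vertex), read off the console output above.
EDGES = [
    (1, "knows", 2),
    (1, "knows", 4),
    (1, "created", 3),
    (4, "created", 5),
    (4, "created", 3),
    (6, "created", 3),
]


def out(vertices, label=None):
    """Follow outgoing edges, optionally only those carrying `label`."""
    return [i for v in vertices for o, lbl, i in EDGES
            if o == v and label in (None, lbl)]


def in_(vertices, label=None):
    """Follow incoming edges backwards, optionally only those carrying `label`."""
    return [o for v in vertices for o, lbl, i in EDGES
            if i == v and label in (None, lbl)]


created = out([1], "created")          # what vertex 1 created
co_creators = in_(created, "created")  # everyone who created the same thing
```

Going out and then back in over the same label returns the starting vertex as well, which is why v[1] appears in the result above.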

Graph traversal and ...

- index
- transform
- filter
- compute
- manipulate
- loop
- path

Graph traversal and index

gremlin> g.idx('vertices')[[name:'marko']]
==>v[1]

Graph traversal and transform

gremlin> g.v(1).outE.label.dedup
==>knows
==>created
gremlin> g.v(1).out('knows').name
==>vadas
==>josh

Graph traversal and filter

gremlin> g.v(1).out('knows').age
==>27
==>32
gremlin> g.v(1).out('knows').filter{it.age>30}.age
==>32
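Transform and filter steps behave like a map and a predicate applied to whatever flows through the traversal. A rough Python equivalent, assuming the two vertices that vertex 1 knows carry the names and ages printed above (`KNOWN_BY_1` is a hypothetical name):

```python
# Properties of the vertices that vertex 1 'knows', as printed above.
KNOWN_BY_1 = [
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
]

# Transform: map each vertex to one of its properties.
names = [v["name"] for v in KNOWN_BY_1]

# Filter, then transform: keep vertices older than 30 and emit their age.
# The condition plays the role of the closure {it.age>30}.
ages_over_30 = [v["age"] for v in KNOWN_BY_1 if v["age"] > 30]
```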

Graph traversal and compute

gremlin> g.v(1).outE.weight
==>0.5
==>1.0
==>0.4
gremlin> g.v(1).outE.weight.mean()
==>0.6333333353201548
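The mean printed above is not exactly 1.9 / 3. One plausible explanation, and it is an assumption since the slides do not state the storage type, is that the weights are held as 32-bit floats and widened to 64-bit before averaging. The sketch below reproduces that effect with Python's standard `struct` module; `as_float32` is a hypothetical helper.

```python
import struct


def as_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


weights = [0.5, 1.0, 0.4]  # as displayed by the console

# Averaging the 64-bit values directly.
mean_double = sum(weights) / len(weights)

# ASSUMPTION: weights stored as 32-bit floats; 0.4 is not exactly representable,
# so its rounding error becomes visible once the value is widened.
mean_float32 = sum(as_float32(w) for w in weights) / len(weights)

print(mean_double)
print(mean_float32)
```

Under that assumption the second value agrees with the console output to the digits shown, while 0.5 and 1.0 pass through unchanged because they are exact in binary.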

Graph traversal and manipulate

gremlin> g.v(1).outE.sideEffect{it.weight+=0.1}.weight
==>0.6
==>1.1
==>0.5

Graph traversal and loop

gremlin> g.v(1).out.loop(1){it.loops<3}
==>v[5]
==>v[3]
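The loop above yields the same vertices as `g.v(1).out.out`, so the `out` step is applied twice. A simplified Python sketch of that repetition follows. It processes whole batches rather than individual traversers, and the starting value of the counter is an assumption chosen so that `loops < 3` reproduces the output above; the slides do not spell out Gremlin's exact counter semantics. `out` and `loop_out` are hypothetical names.

```python
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]  # (out vertex, in vertex)


def out(vertices):
    """One `out` step: all vertices reachable over one outgoing edge."""
    return [i for v in vertices for o, i in EDGES if o == v]


def loop_out(start, keep_looping):
    """Apply `out` once, then again for as long as keep_looping(loops) is true.

    ASSUMPTION: `loops` is 2 when the condition is first checked
    (one pass done, a second one pending).
    """
    current = out(start)
    loops = 2
    while keep_looping(loops):
        current = out(current)
        loops += 1
    return current


result = loop_out([1], lambda loops: loops < 3)
print(result)  # [5, 3]
```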

Graph traversal and path

gremlin> g.v(1).outE.inV.path
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
==>[v[1], e[9][1-created->3], v[3]]
gremlin> g.v(1).out.path
==>[v[1], v[2]]
==>[v[1], v[4]]
==>[v[1], v[3]]
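A path is the history of elements a traverser has visited, not just where it ended up. One way to picture it in Python is to carry a list along with each step; `out_with_path` is a hypothetical helper and the edges are those of the toy graph above.

```python
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]  # (out vertex, in vertex)


def out_with_path(paths):
    """Extend every path by one `out` step, keeping the full history."""
    return [path + [i] for path in paths for o, i in EDGES if o == path[-1]]


one_step = out_with_path([[1]])      # [[1, 2], [1, 4], [1, 3]]
two_steps = out_with_path(one_step)  # [[1, 4, 5], [1, 4, 3]]
```

Paths that reach a vertex with no outgoing edges are dropped at the next step, so only the two walks through vertex 4 survive a second hop.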

Global traversal: in-degree distribution

gremlin> m=[:].withDefault{0}; g.V.sideEffect{m[it.in.count()]+=1}.iterate(); m.sort()
==>0=2
==>1=3
==>3=1
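The one-liner above visits every vertex, counts its incoming edges, and tallies how many vertices share each count. The same computation in plain Python, as a sketch over the six vertices and six edges of the toy graph (`VERTICES` and `EDGES` are hypothetical names):

```python
from collections import Counter

VERTICES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]  # (out vertex, in vertex)

# In-degree of each vertex: how many edges point at it.
in_degree = Counter(in_v for _out_v, in_v in EDGES)

# Distribution: how many vertices share each in-degree.
# Counter returns 0 for vertices that no edge points at.
distribution = Counter(in_degree[v] for v in VERTICES)

print(sorted(distribution.items()))  # [(0, 2), (1, 3), (3, 1)]
```

Two vertices have no incoming edge, three have one, and vertex 3 alone has three, matching the console result.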

Walk is ending

Gremlin is a flexible graph traversal language

Linked Data

http://www.w3.org/DesignIssues/LinkedData.html

Linked Data cloud

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Linked Data initiative

http://freeyourmetadata.org/

Linked Data and Gremlin

gremlin> g = new SparqlRepositorySailGraph("http://dbpedia.org/sparql")
gremlin> v = g.v('http://dbpedia.org/resource/Global_brain')
==>v[http://dbpedia.org/resource/Global_brain]
gremlin> v.out('http://www.w3.org/2000/01/rdf-schema#comment').has('lang','en').value
==>The Global Brain is a metaphor for the worldwide intelligent network...
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV
==>v[http://dbpedia.org/resource/Francis_Heylighen]
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV.outE('http://dbpedia.org/ontology/knownFor').inV
==>v[http://dbpedia.org/resource/Self-organization]
==>v[http://dbpedia.org/resource/Memetics]
==>v[http://dbpedia.org/resource/Global_brain]
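In the session above, each RDF triple is walked as an edge: the subject is the outgoing vertex, the predicate is the label, and the object is the incoming vertex. The sketch below models that mapping offline in plain Python, using only the three knownFor triples visible in the console output; `TRIPLES`, `in_` and `out` are hypothetical names, and nothing here contacts DBpedia.

```python
KNOWN_FOR = "http://dbpedia.org/ontology/knownFor"
DBR = "http://dbpedia.org/resource/"

# (subject, predicate, object) triples read off the console output above.
TRIPLES = [
    (DBR + "Francis_Heylighen", KNOWN_FOR, DBR + "Self-organization"),
    (DBR + "Francis_Heylighen", KNOWN_FOR, DBR + "Memetics"),
    (DBR + "Francis_Heylighen", KNOWN_FOR, DBR + "Global_brain"),
]


def in_(resource, predicate):
    """Subjects pointing at `resource` via `predicate` (like inE(...).outV)."""
    return [s for s, p, o in TRIPLES if o == resource and p == predicate]


def out(resource, predicate):
    """Objects that `resource` points at via `predicate` (like outE(...).inV)."""
    return [o for s, p, o in TRIPLES if s == resource and p == predicate]


people = in_(DBR + "Global_brain", KNOWN_FOR)
topics = [t for person in people for t in out(person, KNOWN_FOR)]
```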

Graph and Brain

Global Graph

I called this graph the Semantic Web, but maybe it should have been Giant Global Graph.

Tim Berners-Lee - timbl's blog - 2007

Internet: net of computers

World Wide Web: web of documents

Giant Global Graph: graph of metadata

Thank you

Logos created by Ketrina Yim for TinkerPop geeks
Images created by Flickr Creative Commons Artists
Graphs created by Memotive Concept Mapping tool

http://tinkerpop.com
