![Page 1: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/1.jpg)
Copyright © ArangoDB GmbH, 2016
Handling Billions Of Edges in a Graph Database
Michael Hackstein @mchacki
![Page 2: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/2.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Schema-free Objects (Vertices)
‣ Relations between them (Edges)
‣ Edges have a direction
{ name: "alice", age: 32 }
{ name: "dancing" }
{ name: "bob", age: 35, size: 1,73m }
{ name: "reading" }
{ name: "fishing" }
hobby
hobby
hobby
hobb
y
![Page 3: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/3.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Schema-free Objects (Vertices)
‣ Relations between them (Edges)
‣ Edges have a direction
‣ Edges can be queried in both directions
‣ Easily query a range of edges (2 to 5)
‣ Undefined number of edges (1 to *)
‣ Shortest Path between two vertices
{ name: "alice", age: 32 }
{ name: "dancing" }
{ name: "bob", age: 35, size: 1,73m }
{ name: "reading" }
{ name: "fishing" }
hobby
hobby
hobby
hobb
y
![Page 4: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/4.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends of Alice
Bob
Charly DaveAlice
Eve Frank
![Page 5: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/5.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends of Alice
Bob
Charly DaveCharly
Bob
DaveAlice
Eve Frank
![Page 6: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/6.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends-of-friends of Alice
Alice
Bob
Charly
Eve
Dave
Frank
![Page 7: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/7.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all friends-of-friends of Alice
Alice
Bob
Charly
Eve
Dave
FrankEve Frank
![Page 8: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/8.jpg)
Copyright © ArangoDB GmbH, 2016
‣ What is the linking path between Alice and Eve
Alice
Bob
Charly
Eve
Dave
Frank
![Page 9: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/9.jpg)
Copyright © ArangoDB GmbH, 2016
‣ What is the linking path between Alice and Eve
Alice
Bob
Charly
Eve
Dave
FrankBobEve
Alice
![Page 10: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/10.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6 stations on my ticket
You are here
![Page 11: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/11.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Which Trainstations can I reach if I am allowed to drive a distance of at most 6 stations on my ticket
You are here
![Page 12: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/12.jpg)
Copyright © ArangoDB GmbH, 2016
Friend
‣ Give me all users that share two hobbies with Alice
Alice
Hobby1 Hobby2
![Page 13: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/13.jpg)
Copyright © ArangoDB GmbH, 2016
FriendFriend
‣ Give me all users that share two hobbies with Alice
Alice
Hobby1 Hobby2
![Page 14: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/14.jpg)
Copyright © ArangoDB GmbH, 2016
Alice
Product
ProductFriend
‣ Give me all products that at least one of my friends has bought together with the products I already own, ordered by how many friends have bought it and the products rating, but only 20 of them.
has_bought has_b
ought
has_boughtis_friend
![Page 15: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/15.jpg)
Copyright © ArangoDB GmbH, 2016
Alice
Product
ProductFriend Product
‣ Give me all products that at least one of my friends has bought together with the products I already own, ordered by how many friends have bought it and the products rating, but only 20 of them.
has_bought has_b
ought
has_boughtis_friend
![Page 16: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/16.jpg)
Copyright © ArangoDB GmbH, 2016
![Page 17: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/17.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.
![Page 18: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/18.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.‣ Give me the age distribution of all users
![Page 19: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/19.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Give me all users which have an age attribute between 21 and 35.‣ Give me the age distribution of all users‣ Group all users by their name
![Page 20: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/20.jpg)
Copyright © ArangoDB GmbH, 2016
![Page 21: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/21.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)
S
![Page 22: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/22.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S
BCA
S
![Page 23: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/23.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges
BCA
SS
A B
![Page 24: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/24.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)
DE
BCA
SS
A B
![Page 25: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/25.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges
DE
BCA
SS
E
A B
![Page 26: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/26.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E
DE
BCA
SS
E
A B
![Page 27: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/27.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E‣ Go back to the next unfinished vertex (B)
DE
BCA
SS
E
A B
![Page 28: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/28.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E‣ Go back to the next unfinished vertex (B)‣ We iterate down on (B)
DE F
BCA
SS
E
A B
![Page 29: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/29.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E‣ Go back to the next unfinished vertex (B)‣ We iterate down on (B)‣ We apply filters on edges
DE F
BCA
SS
E
A
F
B
![Page 30: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/30.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We first pick a start vertex (S)‣ We collect all edges on S‣ We apply filters on edges‣ We iterate down one of the new vertices (A)‣ We apply filters on edges‣ The next vertex (E) is in desired depth. Return
the path S -> A -> E‣ Go back to the next unfinished vertex (B)‣ We iterate down on (B)‣ We apply filters on edges‣ The next vertex (F) is in desired depth. Return
the path S -> B -> F
DE F
BCA
SS
E
A
F
B
![Page 31: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/31.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Once: ‣ Find the start vertex
‣ For every depth: ‣ Find all connected edges ‣ Filter non-matching edges ‣ Find connected vertices ‣ Filter non-matching vertices
Depends on indexes: Hash:
Edge-Index or Index-Free: Linear in edges: Depends on indexes: Hash: Linear in vertices: Only one pass:
O 1
1 n
n * 1 n
3n
![Page 32: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/32.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Linear sounds evil? ‣ NOT linear in All Edges O(E) ‣ Only Linear in relevant Edges n < E
‣ Traversals solely scale with their result size ‣ They are not effected at all by total amount of data ‣ BUT: Every depth increases the exponent: O((3n)d) ‣ "7 degrees of separation": (3n)6 < E < (3n)7
![Page 33: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/33.jpg)
Copyright © ArangoDB GmbH, 2016
‣ MULTI-MODEL database ‣ Stores Key Value, Documents, and Graphs ‣ All in one core
‣ Query language AQL ‣ Document Queries
‣ Graph Queries
‣ Joins
‣ All can be combined in the same statement
‣ ACID support including Multi Collection Transactions
+ +
![Page 34: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/34.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users RETURN user
![Page 35: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/35.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users FILTER user.name == "alice" RETURN user
Alice
![Page 36: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/36.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users FILTER user.name == "alice" FOR product IN OUTBOUND user has_bought RETURN product
Alice
![Page 37: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/37.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users FILTER user.name == "alice" FOR product IN OUTBOUND user has_bought RETURN product
Alicehas_bought
TV
![Page 38: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/38.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users FILTER user.name == "alice" FOR recommendation, action, path IN 3 ANY user has_bought FILTER path.vertices[2].age <= user.age + 5 AND path.vertices[2].age >= user.age - 5 FILTER recommendation.price < 25 LIMIT 10 RETURN recommendation
Alicehas_bought
TV
![Page 39: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/39.jpg)
Copyright © ArangoDB GmbH, 2016
FOR user IN users FILTER user.name == "alice" FOR recommendation, action, path IN 3 ANY user has_bought FILTER path.vertices[2].age <= user.age + 5 AND path.vertices[2].age >= user.age - 5 FILTER recommendation.price < 25 LIMIT 10 RETURN recommendation
Alicehas_bought
TVhas_bought
playstation.price < 25
PlaystationBob
alice.age - 5 <= bob.age && bob.age <= alice.age + 5
has_bought
![Page 40: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/40.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Many graphs have "Supernodes" ‣ Vertices with many inbound and/or outbound edges
‣ Traversing over them is expensive ‣ Often you only need a subset of edges
Bob Alice
![Page 41: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/41.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Remember Complexity? O((3n)d) ‣ Filtering of non-matching edges is linear for every depth
‣ Index all edges based on their vertices and arbitrary other attributes ‣ Find initial set of edges in identical time ‣ Less / No post-filtering required ‣ This decreases the n significantly
Alice
![Page 42: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/42.jpg)
Copyright © ArangoDB GmbH, 2016
‣ We have the rise of big data ‣ Store everything you can
‣ Dataset easily grows beyond one machine ‣ This includes graph data!
![Page 43: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/43.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Distribute graph on several machines (sharding)
‣ How to query it now? ‣ No global view of the graph possible any more ‣ What about edges between servers?
‣ In a shared environment network most of the time is the bottleneck ‣ Reduce network hops
‣ Vertex-Centric Indexes again help with super-nodes ‣ But: Only on a local machine
![Page 44: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/44.jpg)
First let's do the cluster thingy
![Page 45: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/45.jpg)
Copyright © ArangoDB GmbH, 2016
‣ ArangoDB is the first fully certified database including the persistence primitives for DC/OS
‣ ArangoDB's cluster resource management is based on Apache Mesos
‣ Later this year we will also support Kubernetes
![Page 46: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/46.jpg)
Copyright © ArangoDB GmbH, 2016
‣ ArangoDB can run clusters without it ‣ Setup Requires manual effort (can be scripted)
‣ This works: ‣ Automatic Failover (Follower takes over if leader dies) ‣ Rebalancing of shards ‣ Everything inside of ArangoDB
‣ This is based on Mesos: ‣ Complete self healing ‣ Automatic restart of ArangoDBs (on new machines)
➡ We suggest you have someone on call
![Page 47: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/47.jpg)
Demo Time DC/OS
![Page 48: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/48.jpg)
Now distribute the graph
![Page 49: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/49.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Only parts of the graph on every machine ‣ Neighboring vertices may be on different machines ‣ Even edges could be on other machines than their vertices
‣ Queries need to be executed in a distributed way ‣ Result needs to be merged locally
![Page 50: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/50.jpg)
Copyright © ArangoDB GmbH, 2016
Random Distribution
‣ Advantages: ‣ every server takes an equal portion of
data ‣ easy to realize ‣ no knowledge about data required ‣ always works
‣ Disadvantages: ‣ Neighbors on different machines ‣ Probably edges on other machines than
their vertices ‣ A lot of network overhead is required for
querying
![Page 51: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/51.jpg)
Copyright © ArangoDB GmbH, 2016
Random Distribution
‣ Advantages: ‣ every server takes an equal portion of
data ‣ easy to realize ‣ no knowledge about data required ‣ always works
‣ Disadvantages: ‣ Neighbors on different machines ‣ Probably edges on other machines than
their vertices ‣ A lot of network overhead is required for
querying
![Page 52: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/52.jpg)
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
![Page 53: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/53.jpg)
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
![Page 54: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/54.jpg)
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
![Page 55: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/55.jpg)
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
????
![Page 56: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/56.jpg)
Copyright © ArangoDB GmbH, 2016
Index-Free Adjacency
‣ ArangoDB uses an hash-based EdgeIndex (O(1) - lookup) ‣ The vertex is independent of it's edges ‣ It can be stored on a different machine
‣ Used by most other graph databases ‣ Every vertex maintains two lists of it's edges (IN and OUT) ‣ Do not use an index to find edges ‣ How to shard this?
????
![Page 57: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/57.jpg)
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products
‣ Most edges in same group ‣ Rare edges between groups
![Page 58: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/58.jpg)
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products
‣ Most edges in same group ‣ Rare edges between groups
![Page 59: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/59.jpg)
Copyright © ArangoDB GmbH, 2016
Domain Based Distribution
‣ Many Graphs have a natural distribution ‣ By country/region for People ‣ By tags for Blogs ‣ By category for Products
‣ Most edges in same group ‣ Rare edges between groups
ArangoDB Enterprise Edition uses Domain Knowledge
for short-cuts
![Page 60: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/60.jpg)
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
![Page 61: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/61.jpg)
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
![Page 62: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/62.jpg)
Copyright © ArangoDB GmbH, 2016
SmartGraphs - How it works
DB Server 1 DB Server 2 DB Server n
Coordinator
Foxx
Coordinator
Foxx
![Page 63: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/63.jpg)
Sneak Preview SmartGraphs
![Page 64: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/64.jpg)
Copyright © ArangoDB GmbH, 2016
![Page 65: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/65.jpg)
Copyright © ArangoDB GmbH, 2016
Database document write transactions per virtual CPU/second
ArangoDB 1,730
Aerospike 2,500
Cassandra 965
Couchbase 1,375
FoundationDB 750
*) Source: https://mesosphere.com/blog/2015/11/30/arangodb-benchmark-dcos/
![Page 66: Handling Billions Of Edges in a Graph Database - GotoLdn · Title: Handling Billions Of Edges in a Graph Database - GotoLdn.key Created Date: 10/14/2016 1:41:02 PM](https://reader035.vdocuments.us/reader035/viewer/2022062920/5f029cda7e708231d4052256/html5/thumbnails/66.jpg)
Copyright © ArangoDB GmbH, 2016
‣ Further questions? ‣ Follow us on twitter: @arangodb ‣ Join our slack: slack.arangodb.com
‣ Follow me on twitter/github: @mchacki ‣ Write me a mail: [email protected]
Thank You