scale out your graph across servers and clouds with orientdb

Post on 21-Jan-2018

168 Views

Category:

Software

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Scale Out Your Graph Across Servers and Clouds

with OrientDB

#gdsf17

Luca Garulli, Founder and CEO @lgarulli GraphDay - San Francisco - June 17, 2017

Copyright (c) - OrientDB LTD 2

We all want the same thing: an open source GraphDB that is

Fast, Flexible, Scalable and … Unbreakable!

Copyright (c) - OrientDB LTD 3

Complexity

Scal

ability

Single Thread

Multi Thread

Distributed

Systems/Complexity

Copyright (c) - OrientDB LTD 4

Master Node

Auto-Discovery

I’m the only one!

C

Copyright (c) - OrientDB LTD 5

Auto-Discovery

Master Node

Master Node

Connected!

C

Copyright (c) - OrientDB LTD 6

Master Node

Master Node

C

Updated distributed configuration is broadcasted to

all the connected clients

Clients See the Distributed Configuration

Copyright (c) - OrientDB LTD 77

Master Node

Master Node

CC

Master Node

Auto-reconnect in Case of Failure

In case of failure, the clients auto-reconnect to

the available nodes

Copyright (c) - OrientDB LTD 88

Master Node

Master Node

C

DBs are automatically deployed

to the newly joined nodes

DB DB

Database Auto-deploy

CC

C

Copyright (c) - OrientDB LTD 9

Replication: Under the Hood Client commits a transaction

99999

Master Node

Master Node

Master Node

HA Queue

HA Queue

HA Queue

C

Transaction

Requests

Copyright (c) - OrientDB LTD 10101010101010

Master Node

Master Node

Master Node

HA Queue

HA Queue

HA Queue

C

HA Queue

HA Queue

HA Queue

Replication: Under the Hood Response Handling

WriteQuorum = 2

Sends OK

OK

Requests

Responses

Copyright (c) - OrientDB LTD 11

Replication: Under the Hood Fix the unaligned node

11111111111111

Master Node

Master Node

Master Node

HA Queue

HA Queue

HA Queue

HA Queue

HA Queue

HA Queue

Requests

Responses

Fix

Commit

Copyright (c) - OrientDB LTD 12

Replication: Under the Hood 2pc messages

12121212121212

Fix (response != quorum)

Commit (response == quorum)

Rollback (quorum not reached)

Copyright (c) - OrientDB LTD 13

Master Node A

Optimistic MVCC Distributed Transaction

(2) tx task

(4) return result

(7) asynch* commit, rollback or fix

(1) Lock all records in order

+ Execute TX locally

Master Node B

(3) Lock all records in order

+ Execute TX locally

(5) Check results

+ unlock all records

(8) Execute the message and Unlock all records

C Node A becomes the transaction coordinator

(6) send response back to the client

Copyright (c) - OrientDB LTD 14

Consistency- Distributed Locks don’t block reads - During the transaction the old version of the

record is retrieved - 2pc message is asynchronous, so a record could

not be updated yet on server X - If you need higher consistency, don’t use load

balancer policy ROUND_ROBIN_REQUEST

Copyright (c) - OrientDB LTD 15

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }

Default Configuration (json)

deploys the database automatically on new

nodes

Copyright (c) - OrientDB LTD 16

Default Configuration (json)

writeQuorum=2 means at least 2

nodes must agree on writes. Set it to the majority to ensure

consistency

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }

Copyright (c) - OrientDB LTD 17

Default Configuration (json)

newNodeStrategy: static (default) or

dynamic. Static = once nodes join, they are

part of configuration

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }

Copyright (c) - OrientDB LTD 18

Default Configuration (json)

clusters contain the distributed

configuration per cluster

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }

Copyright (c) - OrientDB LTD 19

Default Configuration (json)

cluster “*” represents the default cluster

configuration. Any new node will join,

containing all the clusters

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] } } }

Copyright (c) - OrientDB LTD 20

Custom Setting per Cluster

cluster “customer” extends the default configuration with a custom writeQuorum

{ "autoDeploy": true, "writeQuorum": “majority”, "newNodeStrategy": “static”, "clusters": { "*": { "servers": ["<NEW_NODE>"] }, "customer": { "writeQuorum": 3, "servers": ["<NEW_NODE>"] } } }

Copyright (c) - OrientDB LTD

What about scalability?

21

Copyright (c) - OrientDB LTD 22

Performance with Reads

2222222222

Master Node

Master Node

Master Node

C

10,000 req/sec

C C

10,000 req/sec

10,000 req/sec

Copyright (c) - OrientDB LTD

Full replication provides linear scalability on reads,

but what about scalability on writes?

23

Copyright (c) - OrientDB LTD 24

Performance with Writes

2424242424

Master Node

C

12,000 req/sec

C C

Copyright (c) - OrientDB LTD 25

Performance with Writes

2525252525

Master Node

Master Node

C

7000 req/sec

C C

7000 req/sec

Copyright (c) - OrientDB LTD 26

Performance with Writes

2626262626

Master Node

Master Node

Master Node

C

5000 req/sec

C C

5000 req/sec

5000 req/sec

The more replicas you

configure, the more the propagation cost

will impact performance, based

on writeQuorum

Copyright (c) - OrientDB LTD

In order to scale up writes, you need

sharing + replication

We used a solution similar to RAID for Hard Drives

27

Copyright (c) - OrientDB LTD

Sharding in OrientDB v2.2

28

Copyright (c) - OrientDB LTD 29

Assign 1 Cluster per Node

2929

customer_usa customer_europe customer_china

Master Node usa

Master Node europe

Master Node china

Customer

Copyright (c) - OrientDB LTD

Master Node usa

Master Node china

Master Node europe

30

RAID for Databases

303030

customer_usa customer_europe customer_china

Customer

customer_europecustomer_usacustomer_china

Replica factor = 2

Copyright (c) - OrientDB LTD

Master Node usa

Master Node china

Master Node europe

31

Traversal with Spark (Pregel)

313131

C

https://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api

Copyright (c) - OrientDB LTD 32

Sharding ConfigurationLEGEND: X = Owner, o = Copy +---------------+-----------+----------+-------+-------+-------+ | | | |MASTER |MASTER |MASTER | | | | |ONLINE |ONLINE |ONLINE | +---------------+-----------+----------+-------+-------+-------+ |CLUSTER |writeQuorum|readQuorum| usa |europe |china | +---------------+-----------+----------+-------+-------+-------+ |* | 2 | 1 | X | | | |customer_usa | 2 | 1 | X | | o | |customer_europe| 2 | 1 | o | X | | |customer_china | 2 | 1 | | o | X | +---------------+-----------+----------+-------+-------+-------+

Copyright (c) - OrientDB LTD 33

Sharding Configuration

first node in list is the “owner” for that

cluster

{ "clusters": { “customer_usa": { "servers": [“usa”, “china”] }, “customer_europe": { "servers": [“europe”, “usa”] }, “customer_china": { "servers": [“china”, “europe”] } } }

Copyright (c) - OrientDB LTD 34

Static OwnerThe owner can be static. If the

owner is not online, new records can’t be

inserted in the cluster

{ "clusters": { "client_usa": { "owner": "usa", "servers" : [ "usa", "europe", "asia" ] } } } This assures

the owner is not assigned

dynamically

Copyright (c) - OrientDB LTD 353535353535

Master Node

C

5000 req/sec

C C

Performance with Sharding

Copyright (c) - OrientDB LTD 363636363636

Master Node

Master Node

C

5000 req/sec

C C

5000 req/sec

Performance with Sharding

Copyright (c) - OrientDB LTD 373737373737

Master Node

Master Node

C

5000 req/sec

C C

5000 req/sec

Performance with Sharding

Master Node

5000 req/sec

Master Node

5000 req/sec

Master Node

5000 req/sec

Each write is replicated on only 2 nodes

Copyright (c) - OrientDB LTD 38

Master Node

Master Node

CC

C

CC

C Master Node

CC

C

Master Node

CC

C

Master Node

CC

C

Master Node

CC

C

Master Node

C

C

Linear and Elastic Scalability on both Read & Writes!

Copyright (c) - OrientDB LTD 39

Replica only Nodes

3939

Replica NodeMaster

Node usa

Master Node usa

Master Node usa

Replica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica NodeReplica Node

Replica servers don’t concur in writeQuorum

Copyright (c) - OrientDB LTD 40

Server Role Configuration

“*” by default each node is a master.

Following nodes are replica only

{ "servers": { "*": “master”, “usa_r1": “replica", “usa_r2": “replica", “usa_r3": “replica", “europe_r1": “replica", “china1_r1": "replica" } }

Copyright (c) - OrientDB LTD 41

Master Node

Master Node

C

Load-Balancing on Client Side

Master Node

Copyright (c) - OrientDB LTD 42

Load-Balancing Configuration

final OrientGraphFactory factory = new OrientGraphFactory(“remote:localhost/demo");

factory.setConnectionStrategy( OStorageRemote.CONNECTION_STRATEGY.ROUND_ROBIN_CONNECT);

OrientGraph graph = factory.getTx();

Available strategies: - STICKY, - ROUND_ROBIN_CONNECT, - ROUND_ROBIN_REQUEST

Copyright (c) - OrientDB LTD 43

What if some records are not aligned

for **ANY** reason?

Copyright (c) - OrientDB LTD 44

OrientDB Auto-Repairer

- Executed in chain - It works in batch of 50 records per time (configurable) - Strategies: - Quorum: checks if it meets the configured write quorum - Content: checks the content - Majority: checks if at least there is a majority - Version: gets the higher version - DC (EE only). Example: dc{winner:asia}

Copyright (c) - OrientDB LTD 45

Master Node

A

Master Node

B

Auto-Repair Flow

(1) tx read [#10:33,#43:90,#12:23]

(3) return [{#10:33 v1},{#43:90 v5},{#12:23 v9}]

(2) lock [#10:33,#43:90,#12:

23](4) fix [{#43:90 v6}]

Copyright (c) - OrientDB LTD 46

Dynamic Timeouts (from v2.2.18)

orientdb> HA STATUS -latency -output=text

REPLICATION LATENCY AVERAGE (in milliseconds) +-------+-----+------+-----+ |Servers|node1|node2*|node3| +-------+-----+------+-----+ |node1 | | 0.60| 0.43| |node2* | 0.39| | 0.38| |node3 | 0.35| 0.53| | +-------+-----+------+-----+

Copyright (c) - OrientDB LTD 47

Cross Datacenter & Cloud

Dublin

Austin

Cross Data Centre and Cloud replication. Data

Centers don’t need to have the same number

of servers

Copyright (c) - OrientDB LTD 48

Data-Center (Enterprise only)"dataCenters": { "rome": { "writeQuorum": "all", "servers": [ "europe-0", "europe-1", "europe-2" ] }, "austin": { "writeQuorum": "all", "servers": [ "usa-0", "usa-1", "usa-2" ] } }

Copyright (c) - OrientDB LTD 49

Performance

Copyright (c) - OrientDB LTD 50

Yahoo Benchmark 3 Nodes

OrientDB is 2x-3x faster than Cassandra on the

same HW/SW configuration, same workload

Ops/sec

Copyright (c) - OrientDB LTD 51

OrientDB v3.1 (Q1 2018)- DHT-like algorithm = Distributed Sharded index

- Native support for batch-traversal (Pregel-like) without Spark

- Background refactoring of the Graph based on usage statistics

Copyright (c) - OrientDB LTD 52

Thank you!@lgarulli orientdb.com

top related