OrientDB & Hazelcast: In-Memory Distributed Graph Database
DESCRIPTION
OrientDB uses Hazelcast to achieve a distributed configuration with multi-master replication. Used together, they let you scale horizontally by adding new servers without stopping or reconfiguring the cluster. In this webinar, you’ll be introduced to OrientDB and how it compares to other NoSQL DBMSs. You will also learn how and why Hazelcast is used with OrientDB to achieve scale, elasticity and performance. Both Hazelcast and Orient Technologies provide professional open source support for their respective projects, which are released under the Apache software license.
TRANSCRIPT
Agenda
What is OrientDB
Master-Slave replication
New Distributed Architecture goals
Why we chose Hazelcast
Multi-Master replication features
Future roadmap
Copyright (c) - Orient Technologies LTD 5
Complex Domains
True Polymorphism
[Diagram: a Person class with Customer and Provider beneath it; attributes shown: name, surname, phone, email, type, delivery, discount]
ACID Transactions
db.begin();
try {
…
db.commit();
} catch( Exception e ) {
db.rollback();
}
Eventual consistency is cool, but many times you need full ACID transactions distributed across servers
SQL + extensions
Everybody knows SQL
We’ve just added a few new operators and functions to support tree and graph concepts
HTTP + JSON native support
[Diagram: a (C)lient talks to the OrientDB Server over HTTP + JSON]
{ "@class": "VIP", "name": "Jay", "surname": "Miner", "invented": "Amiga" }
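As a rough illustration of the HTTP + JSON interface, the sketch below builds (but does not send) a request that posts the document shown above. The host, port 2480, database name `demo`, credentials and the `/document/<database>` path are assumptions for illustration; check the REST documentation of your OrientDB version for the exact endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class HttpJsonSketch {
    // The document from the slide, as a JSON string.
    static final String VIP_JSON =
        "{ \"@class\": \"VIP\", \"name\": \"Jay\", \"surname\": \"Miner\", \"invented\": \"Amiga\" }";

    // Builds a POST request carrying the JSON document. Nothing is sent here.
    // baseUrl, database, user and password are illustrative values (assumptions).
    static HttpRequest buildCreateRequest(String baseUrl, String database,
                                          String user, String password) {
        String credentials = Base64.getEncoder()
            .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/document/" + database))
            .header("Authorization", "Basic " + credentials)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(VIP_JSON))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
            buildCreateRequest("http://localhost:2480", "demo", "admin", "admin");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, pass it to a `java.net.http.HttpClient` pointed at a running server.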
Commercial Friendly Apache license
FREE for any usage!
We like to keep it simple for users: no (A)GPL
Short history
In 2012, we had Master/Slave replication
While it scaled up well on reads, users complained of a single Master node bottleneck
It’s quite easy to scale up reads; the hard part is to scale up both reads and writes
How Master/Slave works
[Diagram: one Master Node and two Slave Nodes; clients (C) send their writes to the Master Node]
Master node is the bottleneck
Master/Slave
PROS:
- Relatively easy to develop
CONS:
- The master is the bottleneck for writes
- No matter how many servers you have, the throughput is limited by the Master node
What happened to OrientDB's M/S architecture?
This is the old MASTER/SLAVE replication
2012: new architectural goals
Multi-Master: all the nodes must accept writes
Sharding: split data into multiple partitions
Better Fail-Over
Simplified configuration with Auto-Discovery
How to achieve this?
We evaluated different approaches such as:
Akka (actor model)
JBoss Infinispan
Queue Engines: ActiveMQ, RabbitMQ, ZeroMQ
and Hazelcast, the winner!
Why?
- It has everything you need to build distributed software:
  - Auto-discovery
  - RPC
  - Synchronization
  - Queues and Topics
- Embeddable
- Easy API, good documentation
- Stable in terms of:
  - Run-time
  - API changes
- Mature product with many users in production
- Same license as OrientDB: Apache2
Clients see the distributed configuration
The updated distributed configuration is broadcast to all the connected clients
[Diagram: two Master Nodes and a connected client (C)]
Auto-reconnect in case of failure
In case of failure, the clients auto-reconnect to the available nodes
[Diagram: three Master Nodes with connected clients (C)]
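The reconnect behaviour described above can be sketched in plain Java. This is only an illustration of the idea, not OrientDB's client code: the class name, the node names and the `isReachable` check (a stand-in for a real connection attempt) are all hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

final class FailoverClientSketch {
    private final List<String> knownNodes;        // server list taken from the distributed configuration
    private final Predicate<String> isReachable;  // stand-in for a real connection attempt
    private String current;                       // node currently in use, or null

    FailoverClientSketch(List<String> knownNodes, Predicate<String> isReachable) {
        this.knownNodes = List.copyOf(knownNodes);
        this.isReachable = isReachable;
    }

    // Returns the node in use, switching to the first reachable known node
    // when there is no current node or the current one went down.
    Optional<String> currentNode() {
        if (current == null || !isReachable.test(current)) {
            current = knownNodes.stream().filter(isReachable).findFirst().orElse(null);
        }
        return Optional.ofNullable(current);
    }

    public static void main(String[] args) {
        Set<String> alive = new HashSet<>(List.of("node1", "node2", "node3"));
        FailoverClientSketch client =
            new FailoverClientSketch(List.of("node1", "node2", "node3"), alive::contains);
        System.out.println(client.currentNode());
        alive.remove("node1"); // simulate a node failure
        System.out.println(client.currentNode());
    }
}
```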
Auto-deploy of databases
Databases are automatically deployed to newly joining nodes
[Diagram: Master Nodes, each with a database (DB), and connected clients (C)]
Classes rely on Clusters to store records
By default 1 class -> 1 cluster
[Diagram: class Customer stored in cluster customer]
Classes can be split into more clusters
Define multiple clusters and assign them to each node
[Diagram: class Customer split into clusters customer_usa, customer_europe and customer_china]
Assign 1 cluster per Node
[Diagram: class Customer with clusters customer_usa, customer_europe and customer_china, each assigned to a different Master Node]
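A minimal sketch of the "one cluster per node" idea, in plain Java. The node names, the `<class>_<region>` naming rule and the lookup table are assumptions made for illustration; they are not OrientDB's actual configuration format or API.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ClusterRoutingSketch {
    // cluster name -> owning node (one cluster per node, as on the slide).
    // Node names are hypothetical.
    static final Map<String, String> CLUSTER_OWNER = new LinkedHashMap<>();
    static {
        CLUSTER_OWNER.put("customer_usa", "node1");
        CLUSTER_OWNER.put("customer_europe", "node2");
        CLUSTER_OWNER.put("customer_china", "node3");
    }

    // Hypothetical naming rule: <class>_<region>, lower-cased.
    static String clusterFor(String className, String region) {
        return (className + "_" + region).toLowerCase(Locale.ROOT);
    }

    // Finds the node that owns the cluster for the given class and region.
    static String nodeFor(String className, String region) {
        String cluster = clusterFor(className, region);
        String node = CLUSTER_OWNER.get(cluster);
        if (node == null) {
            throw new IllegalArgumentException("No node owns cluster " + cluster);
        }
        return node;
    }

    public static void main(String[] args) {
        System.out.println(clusterFor("Customer", "Europe") + " -> " + nodeFor("Customer", "Europe"));
    }
}
```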
What about sharding + replication?
We used a solution similar to RAID for hard drives
RAID for databases
Replica factor = 2
[Diagram: class Customer on three Master Nodes; each of the clusters customer_usa, customer_europe and customer_china is stored on two of the three nodes]
RAID for databases
Replica factor = 3
Each node owns all customers
[Diagram: three Master Nodes, each holding customer_usa, customer_europe and customer_china]
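To make the replica factor concrete, here is a small sketch that places each cluster on `replicaFactor` nodes. The round-robin placement rule and the node names are assumptions for illustration; the slides do not say how OrientDB actually chooses which nodes hold the copies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReplicaPlacementSketch {
    // Places cluster i on nodes i, i+1, ... (wrapping around),
    // so that each cluster ends up on replicaFactor distinct nodes.
    static Map<String, List<String>> place(List<String> clusters, List<String> nodes,
                                           int replicaFactor) {
        if (replicaFactor < 1 || replicaFactor > nodes.size()) {
            throw new IllegalArgumentException("replicaFactor must be between 1 and the node count");
        }
        Map<String, List<String>> placement = new LinkedHashMap<>();
        for (int i = 0; i < clusters.size(); i++) {
            List<String> owners = new ArrayList<>();
            for (int r = 0; r < replicaFactor; r++) {
                owners.add(nodes.get((i + r) % nodes.size()));
            }
            placement.put(clusters.get(i), owners);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<String> clusters = List.of("customer_usa", "customer_europe", "customer_china");
        List<String> nodes = List.of("node1", "node2", "node3");
        System.out.println(place(clusters, nodes, 2));
        System.out.println(place(clusters, nodes, 3)); // every node holds every cluster
    }
}
```

With a replica factor equal to the number of nodes, every node ends up holding every cluster, which matches the "each node owns all customers" case.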
Replication: under the hood
Client sends an INSERT request
[Diagram: a client (C) sends an INSERT to one of three Master Nodes; each node has an HZ Queue for requests]
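One way to picture the request path is a queue per node that receives a copy of each write. The sketch below uses `java.util.concurrent` queues as a local stand-in for the Hazelcast queues on the slide. It is an interpretation of the diagram, with hypothetical class and node names, and it is not the Hazelcast API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class RequestFanOutSketch {
    // One request queue per node; a local stand-in for a distributed queue.
    private final Map<String, BlockingQueue<String>> requestQueues = new LinkedHashMap<>();

    RequestFanOutSketch(List<String> nodes) {
        for (String node : nodes) {
            requestQueues.put(node, new LinkedBlockingQueue<>());
        }
    }

    // Enqueues the same request on every node's queue and
    // returns how many queues accepted it.
    int broadcast(String request) {
        int accepted = 0;
        for (BlockingQueue<String> queue : requestQueues.values()) {
            if (queue.offer(request)) {
                accepted++;
            }
        }
        return accepted;
    }

    // What a node would do: take the next pending request from its own
    // queue, or null if there is none.
    String poll(String node) {
        return requestQueues.get(node).poll();
    }

    public static void main(String[] args) {
        RequestFanOutSketch cluster = new RequestFanOutSketch(List.of("node1", "node2", "node3"));
        System.out.println(cluster.broadcast("INSERT INTO Customer SET name = 'Jay'"));
        System.out.println(cluster.poll("node2"));
    }
}
```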
Replication: under the hood
Response handling
WriteQuorum = 2
[Diagram: three Master Nodes, each with an HZ Queue for requests and an HZ Queue for responses; once the write quorum is reached, OK is sent to the client (C)]
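The write quorum can be read as "answer the client as soon as enough nodes have confirmed the write". The sketch below counts responses in arrival order under that reading; the method name and the boolean encoding of responses are assumptions for illustration.

```java
import java.util.List;

final class WriteQuorumSketch {
    // Walks the responses in arrival order (true = the node confirmed the write).
    // Returns how many responses had arrived when the quorum was reached,
    // or -1 if the quorum is never reached.
    static int responsesUntilOk(List<Boolean> responsesInArrivalOrder, int writeQuorum) {
        if (writeQuorum < 1) {
            throw new IllegalArgumentException("writeQuorum must be at least 1");
        }
        int confirmed = 0;
        for (int i = 0; i < responsesInArrivalOrder.size(); i++) {
            if (responsesInArrivalOrder.get(i)) {
                confirmed++;
            }
            if (confirmed >= writeQuorum) {
                return i + 1; // OK can be sent to the client at this point
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three nodes, WriteQuorum = 2: the client gets OK after the second confirmation.
        System.out.println(responsesUntilOk(List.of(true, true, false), 2));
    }
}
```

Under this reading, a slower or failed third node does not delay the answer to the client.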
Replication: under the hood
Fix the unaligned node
[Diagram: three Master Nodes with HZ Queues for requests and responses; a Fix message is sent to the unaligned node]
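The slide does not say how an unaligned node is detected. One plausible approach, sketched below purely as an assumption, is to compare the responses of all nodes and treat the most common response as the reference; any node that answered differently is a candidate for a fix. The response strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnalignedNodeSketch {
    // Returns the nodes whose response differs from the most common response.
    // On a tie, the response seen first wins (an arbitrary choice for this sketch).
    static List<String> nodesToFix(Map<String, String> responseByNode) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String response : responseByNode.values()) {
            counts.merge(response, 1, Integer::sum);
        }
        String reference = null;
        int best = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > best) {
                best = entry.getValue();
                reference = entry.getKey();
            }
        }
        List<String> unaligned = new ArrayList<>();
        for (Map.Entry<String, String> entry : responseByNode.entrySet()) {
            if (!entry.getValue().equals(reference)) {
                unaligned.add(entry.getKey());
            }
        }
        return unaligned;
    }

    public static void main(String[] args) {
        Map<String, String> responses = new LinkedHashMap<>();
        responses.put("node1", "record-A");
        responses.put("node2", "record-A");
        responses.put("node3", "record-B"); // this node disagrees with the others
        System.out.println(nodesToFix(responses));
    }
}
```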
Linear and elastic scalability on both reads & writes!
[Diagram: seven Master Nodes, each serving several clients (C)]
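A back-of-the-envelope model can show why sharded multi-master writes scale with the node count while Master/Slave writes do not. The formulas and all numbers below are simplifying assumptions for illustration (equal nodes, evenly spread writes, no coordination overhead); they are not measurements of OrientDB.

```java
final class WriteThroughputSketch {
    // Master/Slave: every write goes through the single master,
    // so adding slaves does not raise write throughput.
    static long masterSlaveWrites(int servers, long writesPerSecondPerNode) {
        if (servers < 1) {
            throw new IllegalArgumentException("need at least one server");
        }
        return writesPerSecondPerNode;
    }

    // Multi-master with sharding: each write is applied on replicaFactor nodes,
    // so the cluster-wide capacity is shared among the copies.
    static long multiMasterWrites(int servers, long writesPerSecondPerNode, int replicaFactor) {
        if (replicaFactor < 1 || replicaFactor > servers) {
            throw new IllegalArgumentException("replicaFactor must be between 1 and the server count");
        }
        return servers * writesPerSecondPerNode / replicaFactor;
    }

    public static void main(String[] args) {
        long perNode = 1_000; // hypothetical per-node write capacity
        System.out.println(masterSlaveWrites(6, perNode));
        System.out.println(multiMasterWrites(6, perNode, 2));
    }
}
```

In this model, doubling the servers doubles the multi-master figure as long as the replica factor stays fixed, while the Master/Slave figure does not change.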
Hazelcast’s role
Auto-discovery (Multicast/TCP-IP/Amazon)
Queues for requests and responses
Store metadata in distributed Maps
Distributed Locks
OrientDB’s Future Roadmap
OrientDB 2.0 (Sept 2014) has even better performance: +300% improvement on all the distributed operations
Pluggable conflict resolution strategy
Auto-discovery also by Clients