New Neo4j Auto HA Cluster
DESCRIPTION
In this talk, Michael Hunger is going to shed some light on the new High Availability architecture of the popular Neo4j graph database. We are going to look at the different variants of the Paxos protocol, master failover strategies, and the handling of cluster management state. This piece of infrastructure poses non-trivial challenges in distributed consensus-finding, which makes it an interesting session for anyone into scalable systems.
TRANSCRIPT
![Page 1: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/1.jpg)
Neo4j High Availability
New Auto-Cluster
1
Michael Hunger - @mesirii
![Page 2: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/2.jpg)
High Availability Cluster
2
๏Neo4j Enterprise
๏Master-Slave Replication
๏read-scaling and fault-tolerance
๏eventual consistency
•write to master (push_factor)
•write to slaves
![Page 3: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/3.jpg)
3 Separate Concerns (I)
3
๏Cluster Management
•Members join/leave/heartbeat
๏Failover
•Master Election
•Distribution of Master-Status
![Page 4: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/4.jpg)
3 Separate Concerns (II)
4
๏Replication
•synchronized id-generation
•distributed locks
•pull, push of transactions
•initial store synchronization
![Page 5: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/5.jpg)
Pre 1.9 - Zookeeper
5
![Page 6: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/6.jpg)
Pre 1.9
6
๏Apache Zookeeper took care of concerns
•Cluster Management
‣new members register with ZK
•Failover
‣ZK stores Master and last TX-Id
‣ZK uses ZAB to determine the new Master and distribute the information
![Page 7: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/7.jpg)
HA Cluster
7
[Diagram: a Master, two Slaves and an RO-Slave, alongside three Coordinators]
![Page 8: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/8.jpg)
Pre 1.9 - Problems
8
๏Additional setup and operations of a separate component
๏unreliable operation / hiccups
๏long-term stability
๏no dynamic reconfiguration of the ZK cluster, which is important for cloud setups
![Page 9: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/9.jpg)
Post 1.9 - Neo4j Auto Cluster
9
![Page 10: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/10.jpg)
Replace Zookeeper!?
10
๏Implement Multi-Paxos ourselves
๏simple, testable code
๏only covers
•cluster management,
•master election
![Page 11: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/11.jpg)
HA Cluster
11
![Page 12: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/12.jpg)
What is Paxos?
12
๏reliable consensus making
๏broadcasting
๏works even with unreliable communication
•lost messages
•delays, out-of-order delivery
๏does not guarantee progress
![Page 13: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/13.jpg)
What is Paxos?
13
![Page 14: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/14.jpg)
Implementation
14
๏everything is a State Machine
•SM = stateless enums + context
•Message = type enum + payload
•State = enum instance
•Transition = handle() messages, switch on msg-type, implement logic
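The bullets above can be sketched in Java roughly as follows. This is a minimal illustration of the "stateless enum + context" pattern, not Neo4j's actual code: `MessageType`, `Message`, `ClusterContext`, `ClusterState` and `StateMachine` are hypothetical names, and the JOIN/LEAVE logic is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Message = type enum + payload (illustrative message types, not Neo4j's).
enum MessageType { JOIN, LEAVE }

final class Message {
    final MessageType type;
    final Object payload;

    Message(MessageType type, Object payload) {
        this.type = type;
        this.payload = payload;
    }
}

// The context carries all mutable data, so the enum states stay stateless.
final class ClusterContext {
    final List<String> members = new ArrayList<>();
}

// State = enum instance; a transition is handle() switching on the message type.
enum ClusterState {
    START {
        @Override
        ClusterState handle(ClusterContext ctx, Message msg) {
            switch (msg.type) {
                case JOIN:
                    ctx.members.add((String) msg.payload);
                    return JOINED;
                default:
                    return this; // ignore messages that make no sense here
            }
        }
    },
    JOINED {
        @Override
        ClusterState handle(ClusterContext ctx, Message msg) {
            switch (msg.type) {
                case JOIN:
                    ctx.members.add((String) msg.payload);
                    return this;
                case LEAVE:
                    ctx.members.remove((String) msg.payload);
                    return ctx.members.isEmpty() ? START : this;
                default:
                    return this;
            }
        }
    };

    abstract ClusterState handle(ClusterContext ctx, Message msg);
}

// Holds the current state and feeds incoming messages through it.
final class StateMachine {
    private ClusterState state = ClusterState.START;
    private final ClusterContext ctx = new ClusterContext();

    void receive(Message msg) {
        state = state.handle(ctx, msg);
    }

    ClusterState state() {
        return state;
    }

    ClusterContext context() {
        return ctx;
    }
}
```

Because the states hold no fields of their own, a test can build a context, feed a sequence of messages through `receive`, and compare the resulting state and context directly.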
![Page 15: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/15.jpg)
Implementation (II)
15
๏everything is a State Machine
•use timeouts for reliability
•handle failing messages
•decouple network and time
‣for testability
•listeners react to messages and interact with the outside world, sync or async
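One way to decouple timeouts from real time is to let the caller advance the clock explicitly. The sketch below assumes a hypothetical `Timeouts` class whose `tick(now)` method fires every timeout message that is due; the names, the string-based messages and the `Consumer` sink are assumptions for illustration, not the talk's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Timeouts driven by an explicit tick(now) instead of the wall clock.
final class Timeouts {
    private static final class Pending {
        final long deadline;
        final String message;

        Pending(long deadline, String message) {
            this.deadline = deadline;
            this.message = message;
        }
    }

    private final Map<String, Pending> pending = new HashMap<>();
    private final Consumer<String> sink; // receives the timeout messages that fire

    Timeouts(Consumer<String> sink) {
        this.sink = sink;
    }

    // Register a timeout: 'message' is delivered once 'deadline' has been reached.
    void set(String key, long deadline, String message) {
        pending.put(key, new Pending(deadline, message));
    }

    // Cancel a timeout, e.g. because the awaited response arrived in time.
    void cancel(String key) {
        pending.remove(key);
    }

    // Advance to 'now' and fire every timeout whose deadline has passed.
    void tick(long now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Pending> e : pending.entrySet()) {
            if (e.getValue().deadline <= now) {
                due.add(e.getKey());
            }
        }
        for (String key : due) {
            sink.accept(pending.remove(key).message);
        }
    }
}
```

In production the `tick` call could be driven by a scheduler thread; in a test it can be called directly with any timestamp, so no test ever has to sleep.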
![Page 16: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/16.jpg)
Paxos
Implementation (II)
16
๏Paxos (3 roles)
•Proposer-SM
•Acceptor-SM
•Learner-SM
๏Cluster
•Heartbeat
[Diagram: Proposer, Acceptor, Learner, Heartbeat and ClusterState state machines]
![Page 17: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/17.jpg)
Multi-Paxos (happy path)
17
[Diagram: message flow between Proposer, Acceptor (2 * f + 1) and Learner]
•Proposer/Acceptor messages: PREPARE, PROMISE, ACCEPT, ACCEPTED or REJECTED, TIMEOUT
•Acceptor: matches promise? value match or no match; store value or reject
•Proposer: check and store responses; if quorum met, cancel timeout
•Learner: out-of-order message handling, store value, deliver all valid (atomic broadcast)
•Learner recovery: LEARN TIMEOUT (a value is missing), LEARN REQ to other Learner, answered with LEARN (have value) or LEARN FAIL (don't know); LEARN TIMEOUT (we still don't know)
...
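The acceptor's part of the PREPARE/PROMISE and ACCEPT/ACCEPTED exchange can be sketched as below. This is the textbook single-instance Paxos acceptor rule, written with hypothetical `Acceptor` and `Reply` classes; it is not taken from Neo4j's code and omits persistence, Multi-Paxos instance numbering and networking.

```java
// Reply sent back to the proposer: PROMISE, ACCEPTED or REJECTED.
final class Reply {
    final String type;
    final long ballot;
    final String value;

    Reply(String type, long ballot, String value) {
        this.type = type;
        this.ballot = ballot;
        this.value = value;
    }
}

// Acceptor for one Paxos instance, i.e. one value to agree on.
final class Acceptor {
    private long promisedBallot = -1; // highest ballot promised so far
    private long acceptedBallot = -1; // ballot of the accepted value, if any
    private String acceptedValue = null;

    // PREPARE(ballot): promise to ignore lower ballots, and report any value
    // accepted earlier so the proposer can re-propose it.
    Reply onPrepare(long ballot) {
        if (ballot > promisedBallot) {
            promisedBallot = ballot;
            return new Reply("PROMISE", acceptedBallot, acceptedValue);
        }
        return new Reply("REJECTED", promisedBallot, null);
    }

    // ACCEPT(ballot, value): store the value unless a higher ballot was promised.
    Reply onAccept(long ballot, String value) {
        if (ballot >= promisedBallot) {
            promisedBallot = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return new Reply("ACCEPTED", ballot, value);
        }
        return new Reply("REJECTED", promisedBallot, null);
    }
}
```

With 2 * f + 1 such acceptors, a proposer can proceed once f + 1 of them have answered, so the cluster tolerates f failed acceptors.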
![Page 18: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/18.jpg)
Multi-Paxos (happy path)...
18
[Diagram: same message flow as on slide 17, with an additional PROPOSE message to the Proposer]
![Page 19: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/19.jpg)
Acceptor State Machine
19
![Page 20: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/20.jpg)
Heartbeat State Machine
20
![Page 21: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/21.jpg)
Implementation (III)
21
๏HA Implementation uses state machines as infrastructure
๏notifications via listeners
๏piggyback heartbeat on messages
๏master election
•(all - failed) members have to agree
•Paxos broadcast (BC) needs a quorum of the total membership
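The two election conditions can be written down as simple arithmetic. The sketch below assumes a majority quorum (more than half of the total membership); the `Quorum` class and its method names are hypothetical, and how candidates are chosen is left out because the slides do not say.

```java
import java.util.Map;
import java.util.Set;

final class Quorum {
    // Smallest number of members that forms a majority of 'total'.
    static int majority(int total) {
        return total / 2 + 1;
    }

    // Number of failures a cluster of 'total' members can tolerate (f in 2 * f + 1).
    static int toleratedFailures(int total) {
        return (total - 1) / 2;
    }

    // The broadcast can only make progress while a majority of ALL members is alive.
    static boolean canBroadcast(int total, int failed) {
        return total - failed >= majority(total);
    }

    // The election is agreed when every non-failed member voted for 'candidate'.
    // Callers should also check canBroadcast, since the result still has to be
    // broadcast through Paxos.
    static boolean electionAgreed(Set<String> members, Set<String> failed,
                                  Map<String, String> votes, String candidate) {
        for (String member : members) {
            if (failed.contains(member)) {
                continue; // failed members are not asked
            }
            if (!candidate.equals(votes.get(member))) {
                return false;
            }
        }
        return true;
    }
}
```

For a three-member cluster this gives a majority of two: with one failed member the remaining two can still agree and broadcast the new master, with two failed members they cannot.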
![Page 22: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/22.jpg)
Multi-Paxos
22
๏everything is a State Machine
•use timeouts for reliability
•handle failing messages
•decouple network and time
‣for testability
•listeners react to messages and interact with the outside world, sync or async
![Page 23: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/23.jpg)
Unit-Testing
23
•Mock Time
‣fast running tests despite timeouts
•Mock Network
‣simulate delays, failing messages
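A mock network along these lines might look like the following sketch. `MockNetwork` is a hypothetical test helper, not the class used in Neo4j's test suite: it keeps messages in memory, delivers them after a fixed virtual delay, and loses any message matched by a configurable drop rule.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// In-memory network with virtual time: nothing sleeps, tests call tick().
final class MockNetwork {
    private static final class Envelope {
        final String to;
        final String message;
        final long deliverAt;

        Envelope(String to, String message, long deliverAt) {
            this.to = to;
            this.message = message;
            this.deliverAt = deliverAt;
        }
    }

    private final List<Envelope> inFlight = new ArrayList<>();
    private final Predicate<String> dropRule; // returns true for messages to lose
    private final long delay;                 // simulated latency in virtual millis
    private long now = 0;

    // Delivered messages, recorded as "recipient:message" for easy assertions.
    final List<String> delivered = new ArrayList<>();

    MockNetwork(long delay, Predicate<String> dropRule) {
        this.delay = delay;
        this.dropRule = dropRule;
    }

    void send(String to, String message) {
        if (dropRule.test(message)) {
            return; // simulate a lost message
        }
        inFlight.add(new Envelope(to, message, now + delay));
    }

    // Advance virtual time and deliver everything that is due.
    void tick(long millis) {
        now += millis;
        Iterator<Envelope> it = inFlight.iterator();
        while (it.hasNext()) {
            Envelope e = it.next();
            if (e.deliverAt <= now) {
                delivered.add(e.to + ":" + e.message);
                it.remove();
            }
        }
    }
}
```

A test can then advance virtual time by minutes in a single call, which is what keeps timeout-heavy tests fast.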
![Page 24: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/24.jpg)
Unit-Test-Example
24
![Page 25: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/25.jpg)
Setup
25
•Config
•Video
•Auto-Setup Script (Demo)
![Page 26: New Neo4j Auto HA Cluster](https://reader033.vdocuments.us/reader033/viewer/2022042700/559453061a28abc84f8b4717/html5/thumbnails/26.jpg)
Thank You - Questions?
26