new neo4j auto ha cluster

Post on 02-Jul-2015

2.713 Views

Category:

Technology

1 Downloads

Preview:

Click to see full reader

DESCRIPTION

In this talk, Michael Hunger is going to shed some light over the new High Availability architecture for the popular Neo4j Graph Database. We are going to look at the different variants of the Paxos protocol, master failover strategies and cluster management state handling. This piece of infrastructure poses non-trivial challenges to distributed consensus-finding, an interesting session for anyone into scalable systems.

TRANSCRIPT

Neo4jHigh Availability

New Auto-Cluster

1

Michael Hunger - @mesirii

High Availability Cluster

2

๏Neo4j Enterprise

๏Master-Slave Replication

๏read-scaling and fault-tolerance

๏eventual consistency

•write to master (push_factor)

•write to slaves

3 Separate Concerns (I)

3

๏Cluster Management

•Members join/leave/heartbeat

๏Failover

•Master Election

•Distribution of Master-Status

3 Separate Concerns (II)

4

๏Replication

•synchronized id-generation

•distributed locks

•pull, push of transactions

•initial store synchronization

Pre 1.9 - Zookeeper

5

Pre 1.9

6

๏Apache Zookeeper took care of concerns

•Cluster Management

‣new members register with ZK

•Failover

‣ZK stores Master and last TX-Id

‣ZK uses ZAB to determine new Masterand distribute information

HA Cluster

7

Master

Slave Slave

RO-Slave

Coordinator Coordinator

Coordinator

Pre 1.9 - Problems

8

๏Additional setup and operations of a separate component

๏unreliable operation / hiccups

๏longterm stability

๏no dynamic reconfig of the ZK cluster important for cloud setup

Post 1.9 - Neo4j Auto Cluster

9

Replace Zookeeper!?

10

๏Implement Multi-Paxos ourselves

๏simple, testable code

๏only covers

•cluster management,

•master election

HA Cluster

11

What is Paxos?

12

๏reliable consensus making

๏broadcasting

๏works even with unreliable communication

•message lost

•delays, invalid order

๏does not guarantee progress

What is Paxos?

13

Implementation

14

๏everything is a State Machines

•SM = stateless enums + context

•Message = type enum + payload

•State = enum instance

•Transition = handle() messages, switch on msg-type, implement logic

Implementation (II)

15

๏everything is a State Machines

•use timeouts for reliability

•handle failing messages

•decouple network and time

‣for testability

•listeners interact on messages with outside world, sync or async

Paxos

Implementation (II)

16

๏Paxos (3 roles)

•Proposer-SM

•Acceptor-SM

•Learner-SM

๏Cluster

•Heartbeat

Proposer

Acceptor

Learner

Heartbeat

ClusterState

LEARN FAIL

Proposer Acceptor(2 * f + 1)Learner

PREPARE

PROMISE

ACCEPT

TIMEOUT

TIMEOUT

ACCEPTEDOR

STORE VALUE

ORREJECT

REJECTED

VALUE MATCH

NO MATCH

MATCHES PROMISE?

NO

CHECK , STORE

RESPONSESIF QUORUM

MET, CANCEL TIMEOUT

OUT OF ORDER

MSG HANDLING

STORE VALUE

DELIVER ALL VALID

ATOMIC BC

LEARN TIMEOUTWE STILL DON'T KNOW

LEARN TIMEOUT

A VALUE IS MISSING

LEARN REQLEARN TIMEOUT

other Learner

LEARN

LEARN

ORDON'T KNOW

HAVE VALUE

PREPARE

Multi-Paxos (happy path)

17

...

18

LEARN FAIL

Proposer Acceptor(2 * f + 1)Learner

PROPOSE

PREPARE

PROMISE

ACCEPT

TIMEOUT

TIMEOUT

ACCEPTEDOR

STORE VALUE

ORREJECT

REJECTED

VALUE MATCH

NO MATCH

MATCHES PROMISE?

NO

CHECK , STORE

RESPONSESIF QUORUM

MET, CANCEL TIMEOUT

OUT OF ORDER

MSG HANDLING

STORE VALUE

DELIVER ALL VALID

ATOMIC BC

LEARN TIMEOUTWE STILL DON'T KNOW

LEARN TIMEOUT

A VALUE IS MISSING

LEARN REQLEARN TIMEOUT

other Learner

LEARN

LEARN

ORDON'T KNOW

HAVE VALUE

Multi-Paxos (happy path)...

Acceptor State Machine

19

Heartbeat State Machine

20

Implementation (III)

21

๏HA Implementation uses state machines as infrastructure

๏notifications via listeners

๏piggyback heartbeat on messages

๏master election

•(all - failed) have to agree

•Paxos BC needs quorum of total

Multi-Paxos

22

๏everything is a State Machines

•use timeouts for reliability

•handle failing messages

•decouple network and time

‣for testability

•listeners interact on messages with outside world, sync or async

Unit-Testing

23

•Mock Time

‣fast running tests despite timeouts

•Mock Network

‣simulate delays, failing messages

Unit-Test-Example

24

Setup

25

•Config

•Video

•Auto-Setup Script (Demo)

Thank You - Questions?

26

top related