masterless distributed applications with riak...

31
Masterless Distributed Applications With Riak Core Tim.Tang May 2016 1

Upload: others

Post on 11-Mar-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Masterless Distributed Applications With Riak Core

Tim.Tang May 2016

1

Page 2: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Riak Core?

2

Distributed, Scalable, Failure-tolerant

Page 3: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Riak Core?

3

Distributed, Scalable, Failure-tolerant

No central coordinator.Easy to setup/operate.

Page 4: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Riak Core?

4

Distributed, Scalable, Failure-tolerant

Horizontally scalable.Easy add more physical nodes.

Page 5: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Riak Core?

5

Distributed, Scalable, Failure-tolerant

No single point of failure. Self-healing.

Page 6: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Consistent Hash?• Limits reshuffling of keys when

hash table data structure is rebalanced (Add/Remove Nodes).

• Uses consistent hashing to determine where to store data on a primary replica as well as fallbacks if the primary is offline.

6

Page 7: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Why Consistent Hash?

7

Add Node 4Remove Node 3

Page 8: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: The Ring

8

Page 9: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: Virtual Node• One Erlang process per partition in the

consistent hashing ring.• One partition may have multi-vnodes.• Fundamental unit of replication, fault

tolerance, concurrency.• Receives work for its portion of the hash

space.

9

Page 10: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: Virtual Node Master• Keep track of all active vnodes on its node

receives messages from coordinating FSMs.• Translates partition numbers to local PIDs

and dispatches commands to individual vnodes.

• One vnode_master per Physical Node.

10

Page 11: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: N/R/W• N = number of replicas to store (on

distinct nodes)• R = number of replica responses

needed for a successful read per-request• W = number of replica responses

needed for a successful write perrequest

11

Page 12: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: N/R/W

12

Page 13: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: N/R/W

13

Page 14: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts:Preference List

14

Page 15: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts:Read Repair

15

• If a read detects that a vnode has stale data, it is repaired via asynchronous update.

• Passive Anti-Entropy, helps implement eventual consistency.

Page 16: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts: VClock

16

Ops NodeA([email protected]) NodeB([email protected]) NodeC([email protected])

NodeA +500 500 [{A,1}] 500 [{A,1}] 500 [{A,1}]

NodeA +200 700 [{A,2}] 700 [{A,2}] 700 [{A,2}]

NodeC + 300 1050 [{A,2}, {C,1}] 1050 [{A,2}, {C,1}] 1050 [{A,2}, {C,1}]

Network Split -- (A,B), (C)NodeC + 100 1050 [{A,2},{C,1}] 1050 [{A,2},{C,1}] 1150 [{A,2}, {C,2}]

NodeB + 500 1550 [{A,2}, {B,1}, {C,1}] 1550 [{A,2}, {B,1}, {C,1}] 1150 [{A,2}, {C,2}]

Network Repaired -- (A,B,C)NodeA + 50 1600 [{A,3}, {B,1}, {C,1}] 1600 [{A,3}, {B,1}, {C,1}] 1200 [{A,3}, {C,2}]

Get Request On NodeA, How To Merge Results?

Page 17: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts:VClock

17

• Last Write Wins (LWW)• Allow multiple versions to coexist, caller

reconcile the versions with full context.• Use riak_dt module to handle data

conflicting.

Page 18: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Concepts:Handoff

18

• Handoff Types:• Ownership: occurs when a new node joins

the cluster or the vnode needs to be moved.• Hinted: occurs when a "fallback" vnode took

the responsibility for a "primary" vnode but the primary vnode is reachable again.

• Repairs: repair handoff happens when your application explicitly calls riak_core_vnode_manager:repair/3.

• Resize: > Riak core 2.0, riak_core_ring:resize().

Page 19: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

MisConcepts:Fallback

19

• One Physical Node down, all vnodes(primary) on this physical node status will fallback to another physical node. Switch to type fallback.

• Fallback is never performed by another partition, it goes to another node but keeps the index.

• For some time fallback and primary vnodes coexists.

Page 20: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Routing With Consistent Hash

20

Page 21: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Adding A Node

21

Page 22: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Traditional Router

22

Page 23: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Riak Core Router

23

Page 24: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

How Do The Routers Reach Agreement?

24

• Each node has one copy of ring cache.• Compare ring state strictly by gossip protocol.• Ring knows each vnode status.

Page 25: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

What We Get “Out Of Box”• Physical Node cluster state management.• Ring state management.• Vnode placement and replication.• Cluster and ring state gossip protocols.• Consistent hashing utilities.• Handoff activities, covering set callbacks.• Rolling upgrade capability.• Key based request dispatch.• etc…

25

Page 26: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Building An Application On Riak Core

• MIDI demo => https://github.com/tim-tang/midi• Reference:

• http://marianoguerra.github.io/little-riak-core-book/index.html

• Rebar3 => https://www.rebar3.org/• rebar3_template_riak_core => https://github.com/

marianoguerra/rebar3_template_riak_core• Riak Core source => https://github.com/basho/

riak_core• Erlang/OTP 18.

Page 27: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Riak Core Pitfalls

• Cluster membership is controlled by a human, even when a node failure has been (correctly) detected by the cluster manager.

• Vnode distribution around the ring is sometimes suboptimal.

Page 28: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Is it a good fit?• It expects you to have a "key" that

links to a blob of data or service.• The key (or rather its chash)

determines its primary vnode and adjacent replicas.

• The data itself is opaque and has application context.

Page 29: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Not Mentioned• Merkle Trees (AAE): https://

github.com/basho/riak_core/blob/develop/docs/hashtree.md

• Ring Resizing: https://github.com/basho/riak_core/blob/develop/docs/ring-resizing.md

Page 30: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

References

• Riak Core Confliction resolution => https://github.com/tim-tang/try-try-try/tree/master/04-riak-core-conflict-resolution

• CRDT LASP => https://github.com/lasp-lang/lasp

• Why Vector Clock are Easy => http://basho.com/posts/technical/why-vector-clocks-are-easy/

Page 31: Masterless Distributed Applications With Riak Coretim-tang.github.io/images/pdf/Masterless_Distributed_Applications_With_Riak_Core.pdfMasterless Distributed Applications With Riak

Thanks!

Q&A.31