masterless distributed computing with riak core - euc 2010

52
Rusty Klophaus (@rklophaus) Basho Technologies Masterless Distributed Computing with Riak Core Erlang User Conference Stockholm, Sweden · November 2010 Monday, November 22, 2010

Upload: rusty-klophaus

Post on 06-May-2015

5.372 views

Category:

Technology


1 download

DESCRIPTION

Riak Core--an open-source Erlang library created by Basho Technologies that powers Riak KV and Riak Search--allows developers to build distributed, scalable, failure-tolerant applications based on a generalized version of Amazon's Dynamo architecture. In this talk, Rusty will explain why Riak Core was built, discuss what problems it solves and how it works, and walk through the steps to using Riak Core in an Erlang application.

TRANSCRIPT

Page 1: Masterless Distributed Computing with Riak Core - EUC 2010

Rusty Klophaus (@rklophaus)Basho Technologies

Masterless Distributed Computing with Riak Core

Erlang User ConferenceStockholm, Sweden · November 2010

Monday, November 22, 2010

Page 2: Masterless Distributed Computing with Riak Core - EUC 2010

Basho Technologies

2

About Basho Technologies• Offices in:

• Cambridge, Massachusetts • San Francisco, California

• Distributed company• ~20 people• Riak KV and Riak Search (both Open Source)• SLA-based Support and Enterprise Software ($)• Other Open Source Erlang developed by Basho:

• Webmachine, Rebar, Bitcask, Erlang_JS, Basho Bench

Monday, November 22, 2010

Page 3: Masterless Distributed Computing with Riak Core - EUC 2010

Riak KV and Riak Search

3

Riak KV

Key/Value Datastore

Map/Reduce, Lightweight Data Relations, Client APIs

Riak Search

Full-text search and indexing engine

Near Realtime Indexing, Riak KV Integration, Solr Support.

Common Properties

Both are distributed, scalable, failure-tolerant applications.

Both based on Amazon’s Dynamo Architecture.

Monday, November 22, 2010

Page 4: Masterless Distributed Computing with Riak Core - EUC 2010

Riak

KV

Riak

Search

Riak

Core

The Common Parts are called Riak Core

Monday, November 22, 2010

Page 5: Masterless Distributed Computing with Riak Core - EUC 2010

Riak

KV

Riak

Search

Riak

Core

The Common Parts are called Riak Core

Distribution / Scaling / Failure-Tolerance Code

Monday, November 22, 2010

Page 6: Masterless Distributed Computing with Riak Core - EUC 2010

Riak Core is an Open Source Erlang library

that helps you builddistributed, scalable, failure-tolerant

applications using a Dynamo-style architecture.

6

Monday, November 22, 2010

Page 7: Masterless Distributed Computing with Riak Core - EUC 2010

“We Generalized the Dynamo Architecture and Open-Sourced the Bits.”

7

Monday, November 22, 2010

Page 8: Masterless Distributed Computing with Riak Core - EUC 2010

What Areas are Covered?

Amazon’s Dynamo Paper highlighted to show parts covered in Riak Core.

Monday, November 22, 2010

Page 9: Masterless Distributed Computing with Riak Core - EUC 2010

Distributed, scalable, failure-tolerant.

9

Monday, November 22, 2010

Page 10: Masterless Distributed Computing with Riak Core - EUC 2010

Distributed, scalable, failure-tolerant.

No central coordinator. Easy to setup/operate.

10

Monday, November 22, 2010

Page 11: Masterless Distributed Computing with Riak Core - EUC 2010

Distributed, scalable, failure-tolerant.

Horizontally scalable; add commodity hardware

to get more X.

11

Monday, November 22, 2010

Page 12: Masterless Distributed Computing with Riak Core - EUC 2010

Distributed, scalable, failure-tolerant.

Always available. No single point of failure.

Self-healing.

12

Monday, November 22, 2010

Page 13: Masterless Distributed Computing with Riak Core - EUC 2010

Wait, doesn’t *Erlang* let you build distributed, scalable, failure-tolerant

applications?

13

Monday, November 22, 2010

Page 14: Masterless Distributed Computing with Riak Core - EUC 2010

Client

Service A Service B

Resource D

Service C

Queue E

Erlang makes it easy to connect the components of your application.

Monday, November 22, 2010

Page 15: Masterless Distributed Computing with Riak Core - EUC 2010

Service

Node A

Node E

Node I

Node M

Node B

Node F

Node J

Node N

Node C

Node G

Node K

Node O

Node D

Node H

Node L

. . .

Riak Core helps you build a service that harnesses the power of many nodes.

Monday, November 22, 2010

Page 16: Masterless Distributed Computing with Riak Core - EUC 2010

How does Riak Core work?

16

Monday, November 22, 2010

Page 17: Masterless Distributed Computing with Riak Core - EUC 2010

A Simple Interface...

Command ObjectName, Payload

Send commands, get responses.

How do we route the commands to physical machines?

Monday, November 22, 2010

Page 18: Masterless Distributed Computing with Riak Core - EUC 2010

Hash the Object Name

Command ObjectName, Payload

SHA1(ObjName), Payload

0 to 2^160

Monday, November 22, 2010

Page 19: Masterless Distributed Computing with Riak Core - EUC 2010

A Naive Approach

Command ObjectName, Payload

SHA1(ObjName), Payload

Node A Node B Node C Node D

Monday, November 22, 2010

Page 20: Masterless Distributed Computing with Riak Core - EUC 2010

A Naive Approach

Command

SHA1(ObjName), Payload

Node A Node B Node C Node D Node E

ObjectName, Payload

Existing routes become invalid when you add/remove nodes.

Monday, November 22, 2010

Page 21: Masterless Distributed Computing with Riak Core - EUC 2010

"All problems in computer science can be solved by

another level of indirection." - David Wheeler

21

Monday, November 22, 2010

Page 22: Masterless Distributed Computing with Riak Core - EUC 2010

Routing with Consistent Hashing

Command ObjectName, Payload

SHA1(ObjName), Payload

VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7

Node A Node B Node C Node D

Monday, November 22, 2010

Page 23: Masterless Distributed Computing with Riak Core - EUC 2010

Adding a Node

Command ObjectName, Payload

SHA1(ObjName), Payload

VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7

Node A Node B Node C Node D Node E

Monday, November 22, 2010

Page 24: Masterless Distributed Computing with Riak Core - EUC 2010

Removing a Node

Command ObjectName, Payload

SHA1(ObjName), Payload

VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7

Node A Node B Node C Node D Node E

Monday, November 22, 2010

Page 25: Masterless Distributed Computing with Riak Core - EUC 2010

The Ring

Hash Location

Monday, November 22, 2010

Page 26: Masterless Distributed Computing with Riak Core - EUC 2010

Writing Replicas (N Value)

Locations when N=3

Monday, November 22, 2010

Page 27: Masterless Distributed Computing with Riak Core - EUC 2010

Routing Around Failures

Locations when N=3and node 0 is down.

X

Monday, November 22, 2010

Page 28: Masterless Distributed Computing with Riak Core - EUC 2010

The Preflist

Preflist

Monday, November 22, 2010

Page 29: Masterless Distributed Computing with Riak Core - EUC 2010

Location of the Routing Layer

29

Monday, November 22, 2010

Page 30: Masterless Distributed Computing with Riak Core - EUC 2010

Router in the Middle Leads to SPOF

Client Client Client

Router

VNode

0

Node A Node B Node C Node D Node E

VNode

1

VNode

3

VNode

4

VNode

2

VNode

5

VNode

6

VNode

7

Monday, November 22, 2010

Page 31: Masterless Distributed Computing with Riak Core - EUC 2010

Riak Core - Router on Each Node

Client Client Client

Router Router Router RouterRouter

VNode

0

Node A Node B Node C Node D Node E

VNode

1

VNode

3

VNode

4

VNode

2

VNode

5

VNode

6

VNode

7

Monday, November 22, 2010

Page 32: Masterless Distributed Computing with Riak Core - EUC 2010

Eventually - Router in the Client

Client Client Client

VNode

0

Node A Node B Node C Node D Node E

VNode

1

VNode

3

VNode

4

VNode

2

VNode

5

VNode

6

VNode

7

Router RouterRouter

Why isn’t this done yet? Time and complexity.

Monday, November 22, 2010

Page 33: Masterless Distributed Computing with Riak Core - EUC 2010

How Do The Routers Reach Agreement?

Router Router Router RouterRouter

VNode

0

Node A Node B Node C Node D Node E

VNode

1

VNode

3

VNode

4

VNode

2

VNode

5

VNode

6

VNode

7

Monday, November 22, 2010

Page 34: Masterless Distributed Computing with Riak Core - EUC 2010

The Nodes Gossip Their World View

Local Ring State

IncomingRing State

Are rings equivalent?Strictly descendent?Or different?

Monday, November 22, 2010

Page 35: Masterless Distributed Computing with Riak Core - EUC 2010

Not MentionedVector ClocksMerkle TreesBloom Filters

35

Monday, November 22, 2010

Page 36: Masterless Distributed Computing with Riak Core - EUC 2010

Building an Applicationwith Riak Core

36

Monday, November 22, 2010

Page 37: Masterless Distributed Computing with Riak Core - EUC 2010

Building an Application on Riak Core?Two things to think about:

37

The Command SetCommand = ObjectName, PayloadThe commands/requests/operations that you will send through the system.

The VNode ModuleThe callback module that will receive the commands.

Monday, November 22, 2010

Page 38: Masterless Distributed Computing with Riak Core - EUC 2010

Writing a VNode Module

38

Startup/Shutdowninit([Partition]) -> {ok, State}

terminate(State) -> ok

Receive Incoming Commandshandle_command(Cmd, Sender, State) -> {noreply, State1} | {reply, Reply, State1}

handle_handoff_command(Cmd, Sender, State) ->{noreply, State1} | {reply, ok, State1}

Monday, November 22, 2010

Page 39: Masterless Distributed Computing with Riak Core - EUC 2010

Writing a VNode Module

39

Send and Receive Handoff Datahandoff_starting(Node, State) -> {Bool, State1}

encode_handoff_data(Data, State) -><<Binary>>.

handle_handoff_data(Data, Sender, State) ->{reply, ok, State1}

handoff_finished(Node, State) ->{ok, State1}

Monday, November 22, 2010

Page 40: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

40

application:start(riak_core).

Monday, November 22, 2010

Page 41: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

41

Supervise vnode processes.

Monday, November 22, 2010

Page 42: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

42

Start, coordinate, and supervise handoff.

Monday, November 22, 2010

Page 43: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

43

Maintain cluster membership information.

Monday, November 22, 2010

Page 44: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

44

Monitor node liveness, broadcast to registered modules.

Monday, November 22, 2010

Page 45: Masterless Distributed Computing with Riak Core - EUC 2010

Start the riak_core application

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

45

Send ring information to other nodes.Reconcile different views of the cluster.

Rebalance cluster when nodes join or leave.

Monday, November 22, 2010

Page 46: Masterless Distributed Computing with Riak Core - EUC 2010

In your application...

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

46

Start the vnodes for your application.Master = { riak_X_vnode_master, { riak_core_vnode_master, start_link, [riak_X_vnode] }, permanent, 5000, worker, [riak_core_vnode_master]},{ok, { {one_for_one, 5, 10}, [Master]} }.

Monday, November 22, 2010

Page 47: Masterless Distributed Computing with Riak Core - EUC 2010

In your application...

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

47

Tell riak_core that your applicationis ready to receive requests.

riak_core:register_vnode_module(riak_X_vnode),riak_core_node_watcher:service_up(riak_X, self())

Monday, November 22, 2010

Page 48: Masterless Distributed Computing with Riak Core - EUC 2010

In your application...riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

riak_core

riak_core_vnode_sup riak_core_handoff_*

riak_core_ring_*

riak_core_node_*

riak_core_gossip_*X_vnode

. . .

X_vnode

X_vnode

48

Join to an existing node in the cluster.

riak_core_gossip:send_ring(ClusterNode, node())

Monday, November 22, 2010

Page 49: Masterless Distributed Computing with Riak Core - EUC 2010

Start Sending Commands

49

# Figure out the preflist...{_Verb, ObjName, _Payload} = Command,PrefList = riak_core_apl:get_apl(ObjName, NVal, riak_X),

# Send the command...riak_core_vnode_master:command(PrefList, Command, riak_X_vnode_master)

Monday, November 22, 2010

Page 50: Masterless Distributed Computing with Riak Core - EUC 2010

Review

Riak KVOpen Source Key/Value datastore.

Riak Search

Full-text, near real-time search engine based on Riak Core.

Riak CoreOpen Source Erlang library that helps you build distributed, scalable, failure-tolerant applications using a Dynamo-style architecture.

50

Monday, November 22, 2010

Page 51: Masterless Distributed Computing with Riak Core - EUC 2010

Thanks! Questions?

Learn Morehttp://wiki.basho.comRead Amazon’s Dynamo Paper

Get the Codehttp://github.com/basho/riak_core

Get in [email protected] on Email@rklophaus on Twitter

51

Monday, November 22, 2010

Page 52: Masterless Distributed Computing with Riak Core - EUC 2010

END

Monday, November 22, 2010