Distributed System Coordination by Zookeeper and Introduction to
Kazoo Python Library
Jimmy Lai r97922028 [at] ntu.edu.tw
Dec. 22nd, 2014
Outline
1. Overview
2. Basics
3. Deployment
4. Recipes
5. References
Overview of Zookeeper
A Distributed System - Master-Worker
• Coordination tasks:
  1. Elect a new master when the master crashes.
  2. The master assigns tasks to workers.
  3. When a worker crashes, its tasks are re-assigned to another worker.
  4. When a worker finishes its task, the master assigns it a new one.
[Figure: one master coordinating six workers]
Distributed System
• An application that consists of programs running on a group of computers.
• Coordinating those programs is more difficult than writing a standalone program.
• Developers may spend too much time handling coordination, or end up with a fragile distributed system (e.g. race conditions, single points of failure).
Easy Distributed System by Zookeeper
• Common coordination tasks:
  • Naming service
  • Configuration management
  • Synchronization
  • Leader election
  • Message queue
  • Notification system
• Zookeeper provides a highly reliable API for these common coordination tasks.
http://en.wikipedia.org/wiki/Apache_ZooKeeper#Typical_use_cases
Powered By Zookeeper
• Zookeeper was built by Yahoo Research
• Users:
  • Hadoop, HBase
  • Solr
  • Neo4j
  • Flume
  • Facebook Messages
Benefits of Zookeeper
• With Zookeeper:
  • Developing a distributed system is simpler, more agile, and more robust.
  • Zookeeper itself is simple, fast, and replicated.
• Without Zookeeper:
  • Coordination is more difficult.
Benefits of Zookeeper
• Servers replicate data
• Each client connects to one of the servers
• Throughput test
  • Hardware: dual 2 GHz Xeon and two SATA 15K RPM drives
Zookeeper Basics
Znode (1/2)
• Based on a shared storage model: each client stores data in, and reads data from, the Zookeeper service
• File system-like API
• Znodes form a hierarchical tree; each znode may hold optional data and optional child znodes.
• A persistent znode remains until it is explicitly deleted.
• An ephemeral znode disappears when the client that created it crashes or closes its connection, or when any client deletes it.
Znode (2/2)
• A sequential znode is assigned a monotonically increasing integer at the end of its path, e.g. /path-1, /path-2 (Zookeeper zero-pads this counter to ten digits, e.g. /path-0000000001)
• Versions: each znode has a version number, which is incremented whenever its data changes
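The znode kinds and the version counter can be illustrated with a small in-memory model. This is only a sketch, not a ZooKeeper client: the class `TinyZnodeStore`, its method names, and the `owner` argument standing in for a session are all hypothetical, and the sequence counter is kept per path prefix as a simplification.

```python
class TinyZnodeStore:
    """In-memory toy model of znode behaviour (not a ZooKeeper client)."""

    def __init__(self):
        self.nodes = {}      # path -> {"data", "version", "owner"}
        self.counters = {}   # path prefix -> next sequence number (simplified)

    def create(self, path, data=b"", owner=None, sequence=False):
        if sequence:
            # Sequential znode: append a zero-padded, increasing counter.
            n = self.counters.get(path, 0)
            self.counters[path] = n + 1
            path = "%s%010d" % (path, n)
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        # owner=None models a persistent znode, otherwise an ephemeral one.
        self.nodes[path] = {"data": data, "version": 0, "owner": owner}
        return path

    def set(self, path, data):
        node = self.nodes[path]
        node["data"] = data
        node["version"] += 1   # every data change bumps the version
        return node["version"]

    def close_session(self, owner):
        # Ephemeral znodes vanish together with the session that made them.
        for path in [p for p, n in self.nodes.items() if n["owner"] == owner]:
            del self.nodes[path]


store = TinyZnodeStore()
first = store.create("/path-", sequence=True)
second = store.create("/path-", sequence=True)
store.create("/config", b"v1")
new_version = store.set("/config", b"v2")
store.create("/alive", owner="session-1")
store.close_session("session-1")
```

After this runs, `/config` is still present with its version bumped once, while `/alive` is gone because its hypothetical session was closed.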
Operations
• Primitive operations:
  • create /path data
  • delete /path
  • exists /path
  • setData /path data
  • getData /path
  • getChildren /path
Notification
• Set a watch with a znode read operation (getData, getChildren, exists) to be notified when the target changes
• A watch is:
  • a one-time trigger
  • delivered with an ordering guarantee: a client receives events in the order in which the changes occurred
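The one-time trigger behaviour can be sketched with a toy object. This is a simplification under stated assumptions: `WatchedValue` is a hypothetical in-memory class, and its callbacks receive the new data directly, whereas a real watch delivers an event and the client then reads the znode again.

```python
class WatchedValue:
    """Toy stand-in for a znode whose reads can leave a one-time watch."""

    def __init__(self, data):
        self.data = data
        self.watchers = []

    def get(self, watch=None):
        if watch is not None:
            self.watchers.append(watch)
        return self.data

    def set(self, data):
        self.data = data
        # Fire each pending watch once, then forget it.
        pending, self.watchers = self.watchers, []
        for callback in pending:
            callback(data)


events = []
node = WatchedValue(b"a")
node.get(watch=events.append)
node.set(b"b")   # triggers the watch
node.set(b"c")   # no watch is registered any more, so nothing fires

# Re-registering inside the callback gives continuous watching.
seen = []


def rewatch(_data):
    seen.append(node.get(watch=rewatch))


node.get(watch=rewatch)
node.set(b"d")
node.set(b"e")
```

Only the first change reaches `events`, while `rewatch` sees every change because it sets a fresh watch each time it is called.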
Session
• Session: a client creates a session by connecting to one of the servers, then starts issuing operations
• Session states:
  • connecting
  • connected
  • closed
  • not_connected
Example - implement a lock
• Spec: n clients try to get the lock at the same time, but only one of them can get it.
• Solution: each client tries to create an ephemeral znode, e.g. /lock. The first one to succeed holds the lock. The others, whose create fails, set a watch on the znode to learn when the lock is released and then try to acquire it again.
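The create-or-watch loop described above can be tried out with threads in a single process. This is a minimal sketch, assuming a hypothetical `ToyLockService` that plays the role of the /lock znode; it does not talk to Zookeeper, and a condition variable stands in for the watch.

```python
import threading


class ToyLockService:
    """In-memory stand-in for the /lock znode (not a ZooKeeper client)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._holder = None

    def try_create(self, client):
        # Like creating the ephemeral /lock: succeeds only if it is absent.
        with self._cond:
            if self._holder is None:
                self._holder = client
                return True
            return False

    def wait_for_delete(self):
        # Like a watch on /lock: returns once the znode is gone.
        with self._cond:
            while self._holder is not None:
                self._cond.wait()

    def delete(self, client):
        with self._cond:
            if self._holder == client:
                self._holder = None
                self._cond.notify_all()


def acquire(service, client):
    # Try to create; on failure wait for the release and try again.
    while not service.try_create(client):
        service.wait_for_delete()


service = ToyLockService()
stats = {"active": 0, "peak": 0, "entered": []}


def worker(name):
    acquire(service, name)
    stats["active"] += 1                      # critical section starts
    stats["peak"] = max(stats["peak"], stats["active"])
    stats["entered"].append(name)
    stats["active"] -= 1                      # critical section ends
    service.delete(name)


threads = [threading.Thread(target=worker, args=("client-%d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every client eventually enters the critical section, and `stats["peak"]` stays at 1 because only one holder exists at any time.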
Example - implement master-worker
• Spec:
  • Clients submit tasks
  • The master watches for new workers and tasks, and assigns tasks to available workers
  • A backup master takes over when the master fails
  • Workers register themselves and then watch for new tasks
Example - implement master-worker
• Solution:
  • ephemeral znode /master for master election
    • backup masters set up a watch on /master
  • persistent znode /workers
    • the master sets up a watch on /workers
    • each worker creates a znode under /workers, e.g. /workers/host1
  • persistent znode /tasks, with sequential child znodes
    • clients submit tasks by creating znodes under /tasks
  • persistent znode /assign
    • each worker sets up a watch on its corresponding znode under /assign, e.g. /assign/host1
    • the master assigns a task to a worker by creating a znode under /assign, e.g. /assign/host1/task1
    • a worker marks a task as done by updating the task's data to "done"
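The znode layout above can be walked through with plain dictionaries. This is a sketch only: the `tree` dict, the helper names, and the round-robin assignment policy are assumptions made for illustration (the slides do not say how the master chooses a worker), and no watches or Zookeeper calls are involved.

```python
# Nested dicts stand in for the persistent parents /workers, /tasks, /assign.
tree = {"/workers": {}, "/tasks": {}, "/assign": {}}


def register_worker(host):
    """Worker step: create /workers/<host> and its /assign/<host> node."""
    tree["/workers"][host] = b""
    tree["/assign"][host] = {}


def submit_task(payload):
    """Client step: create a sequentially named child under /tasks."""
    name = "task-%010d" % len(tree["/tasks"])
    tree["/tasks"][name] = payload
    return name


def assign_pending():
    """Master step: spread unassigned tasks over workers (round-robin)."""
    workers = sorted(tree["/workers"])
    if not workers:
        return
    assigned = {t for tasks in tree["/assign"].values() for t in tasks}
    pending = [t for t in sorted(tree["/tasks"]) if t not in assigned]
    for i, task in enumerate(pending):
        host = workers[i % len(workers)]
        tree["/assign"][host][task] = tree["/tasks"][task]


def finish(host, task):
    """Worker step: mark the task as done by updating its data."""
    tree["/assign"][host][task] = b"done"


register_worker("host1")
register_worker("host2")
for payload in (b"a", b"b", b"c"):
    submit_task(payload)
assign_pending()
finish("host1", "task-0000000000")
```

In a real deployment the master and the workers would react to watches instead of being called in sequence as they are here.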
Zookeeper Deployment
Zookeeper Server Run Modes
• Standalone: a single server
• Quorum: multiple servers replicate the data
  • the cluster uses majority voting to keep the data consistent, so it can tolerate the crash of fewer than half of its nodes
• ports: client (2181 by default); the quorum and election ports are set per server in the configuration (conventionally 2888 and 3888)
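The majority rule translates into simple arithmetic. The helper names below are hypothetical; the sketch only shows how many servers form a majority and how many crashes an ensemble of a given size can survive.

```python
def quorum_size(servers):
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers may crash while a majority is still alive."""
    return servers - quorum_size(servers)


# ensemble size -> (quorum size, tolerated failures)
table = {n: (quorum_size(n), tolerated_failures(n)) for n in (1, 3, 4, 5, 7)}
```

Note that four servers tolerate no more failures than three, which is why ensembles are usually sized with an odd number of servers.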
Clients
• Native primitive operations
  • C library
  • Java library
• Recipes (3rd party high level API)
  • Java: Curator (by Netflix)
  • Python: kazoo (by Mozilla and Zope)
Java Client Console
• bin/zkCli.sh -server 127.0.0.1:2181
• Commands
  • get path [watch]
  • ls path [watch]
  • set path data [version]
  • create path data acl
  • delete path [version]
  • setquota -n|-b val path
Python client - kazoo

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()
    # ... issue operations here ...
    zk.stop()

https://kazoo.readthedocs.org/en/latest/

    import os
    import socket

    from kazoo.client import KazooClient
    from kazoo.client import KazooState

    def my_listener(state):
        if state == KazooState.LOST:
            print('lost session')
        elif state == KazooState.SUSPENDED:
            print('disconnected from Zookeeper')
        elif state == KazooState.CONNECTED:
            # try to become the master
            print('connected')

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.add_listener(my_listener)
    zk.start()
    lock = zk.Lock('/master', '%s-%d' % (socket.gethostname(), os.getpid()))

    # basic read and write operations
    zk.ensure_path("/path")
    zk.set("/path", "data_string".encode('utf8'))
    start_key, stat = zk.get("/path")

Zookeeper Recipes
Common Recipes
• lock
• election
• counter
• barrier
• partitioner
• party
• queue
• watch
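Lock

    from kazoo.client import KazooClient

    zk = KazooClient()
    zk.start()
    lock = zk.Lock("/lockpath", "my-identifier")
    with lock:  # blocks waiting for lock acquisition
        pass  # do something with the lock
    # leaving the with block releases the lock; when calling lock.acquire()
    # directly instead, release it explicitly with lock.release()
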
Election

    zk = KazooClient()
    zk.start()
    election = zk.Election("/electionpath", "my-identifier")
    # blocks until the election is won, then calls
    # my_leader_function()
    election.run(my_leader_function)

Counter

    zk = KazooClient()
    zk.start()
    counter = zk.Counter("/int")
    counter += 2
    counter -= 1
    counter.value == 1
    counter = zk.Counter("/float", default=1.0)
    counter += 2.0
    counter.value == 3.0

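A shared counter can be built on the znode version number: read the value and its version, then write back only if the version is unchanged, retrying otherwise. The sketch below simulates that idea in memory. `VersionedCell`, `BadVersion`, and `add` are hypothetical names, and this is an illustration of the technique rather than a description of how kazoo implements its recipe.

```python
class BadVersion(Exception):
    """Raised when a write carries a stale version number."""


class VersionedCell:
    """Toy znode: a value plus a version that guards conditional writes."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def get(self):
        return self.value, self.version

    def set(self, value, version):
        if version != self.version:
            raise BadVersion()
        self.value = value
        self.version += 1


def add(cell, delta, interfere=None):
    """Read, modify, and write back; retry if someone else wrote first."""
    while True:
        value, version = cell.get()
        if interfere is not None:
            interfere()          # simulate another client writing in between
            interfere = None
        try:
            cell.set(value + delta, version)
            return
        except BadVersion:
            continue


cell = VersionedCell()
add(cell, 2)
# A competing +10 lands between our read and our write, forcing one retry.
add(cell, -1, interfere=lambda: add(cell, 10))
```

No update is lost: the final value reflects all three changes even though one write had to be retried.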
Barrier

    barrier = zk.Barrier("/barrier")
    barrier.create()
    # waiting clients block here until the barrier is removed
    barrier.wait()
    # the master releases the barrier with
    barrier.remove()

Partitioner

    from kazoo.client import KazooClient

    client = KazooClient()
    client.start()
    qp = client.SetPartitioner(
        path='/work_queues', set=('queue-1', 'queue-2', 'queue-3'))
    while 1:
        if qp.failed:
            raise Exception("Lost or unable to acquire partition")
        elif qp.release:
            qp.release_set()
        elif qp.acquired:
            for partition in qp:
                pass  # Do something with each partition
        elif qp.allocating:
            qp.wait_for_acquire()

Party

    party1 = zk.Party("/party1", "my-identifier")
    party2 = zk.Party("/party2", "my-identifier")
    party1.join()
    "my-identifier" in party1
    "my-identifier" not in party2

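Queue

    queue = zk.LockingQueue("/queue")
    # producer: enqueue every task
    for task in tasks:
        queue.put(task.encode('utf8'))
    # consumer: fetch the next task; call queue.consume() once it is processed
    task = queue.get()
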
Watch: watch a znode continuously

    @zk.DataWatch('/last_scanned_card_key')
    def my_func(data, stat, event):
        print("Data is %s" % data)
        print("Version is %s" % stat.version)
        print("Event is %s" % event)

References
• Flavio Junqueira, Benjamin Reed, ZooKeeper: Distributed Process Coordination, O'Reilly Media, Inc., November 25, 2013
• Zookeeper website, http://zookeeper.apache.org/