Page 1: Introduction to Zookeeper - O'Reilly Media, assets.en.oreilly.com/1/event/61/Manage Distributed... · 2011-08-02 · This warning reflects that Zookeeper is running in Standalone mode,

Introduction to Zookeeper

Copyright 2011 Cloudera Inc. All rights reserved

Today’s speaker: Tom Hanlon

[email protected]
• Senior Instructor at Cloudera

How I became aware of Zookeeper, and why that matters to you

• HBASE
– Uses Zookeeper to manage HA

What’s HBASE

• HBase is a massively scalable distributed data store that scales easily across many nodes

• Uses Zookeeper to manage failures and to provide information about the current state of the system

The Problem

• Managing distributed services is a challenge
• A recurring challenge with similar repeating issues
• What issues?
– Who is in charge?
– Am I in charge?
– Are we both in charge?
– What do I do when I start back after failing?
– What is the state of process X?
– Election of a primary out of a group of peers

The Solution...

• Modify your apps to deal with these issues
• Where to start?
• Not trivial
• 2 systems, Active and Standby, might suffice
• What about split brain?

The Solution continued

• You will quickly find that you need a similar set of services
• Distributed lock management
• Master election (slave discovery?)
• Information sharing, time based
• Some sort of hierarchy management
– Service B is dead, do not offer service C
– Includes startup hierarchy management

Ad-Hoc Solutions ... not good

• A collection of ad-hoc solutions is problematic in the long term

• Like services should be managed by the same tool
• A single place of reference is good

Zookeeper has what you need

Well, a significant portion of what you need.

Zookeeper can provide simple services that allow client services to coordinate or synchronize activities and provide a reference point to agree on basic information about the state of the current environment

Zookeeper to the Rescue

• Zookeeper http://zookeeper.apache.org/
• Built by Yahoo as a distributed system to manage distributed services in distributed systems.

• Clean focussed Open Source tool
• Does one thing and does it well

• A highly available distributed coordination service

Distributed Systems Need the same thing over and over again

• In order to be highly available we must deal with
– Leader election
– Quorum
– Temporal dependencies
– Consensus
– Shared locks
– etc.

Zookeeper has goals not unlike your own

• Reliability
• Availability
• Simplicity
• Ease of management
• Clarity

Distributed Lock Server.. and more

• Highly available, distributed metadata filesystem
• Runs on 3 or 5 (or more) machines
• Transparently elects Master
• Other machines are replicas that serve reads; updates go through the Master

Zookeeper as a Metadata Filesystem

• Zookeeper's model resembles a standard tree-based filesystem

• Nodes can store up to 1MB of data
• Nodes can have child nodes
• Nodes cannot be renamed
• No hard or soft links
• Guarantees
– Ordered updates, conditional updates, and watches

A Distributed metadata filesystem

• Over the network creation of nodes
• Over the network updating of metadata associated with a node
• Conditional updates
• Watches *

The distributed nature of Zookeeper is what makes these simple functions extremely useful

* Watch notification similar to Linux inotify

But wait.. there is more !!!

• Zookeeper can generate node names in order to avoid collisions

• Can build “presence” awareness systems using Ephemeral Nodes
– Nodes that exist as long as you are connected
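
The two features on this slide can be pictured with a small in-memory model. This is only a sketch, not Zookeeper client code: `ToyPresence`, `join`, `session_closed` and the `member-` prefix are hypothetical names invented for the illustration, and the zero-padded ten-digit suffix mirrors the way Zookeeper formats sequential node names.

```python
from itertools import count


class ToyPresence:
    """In-memory stand-in for one parent znode with ephemeral, sequential children."""

    def __init__(self):
        self._counter = count()   # per-parent counter; values are never handed out twice
        self._owner = {}          # child name -> id of the session that created it

    def join(self, session_id, prefix="member-"):
        # Sequential naming: a zero-padded counter is appended to the prefix,
        # so two clients using the same prefix cannot collide.
        name = f"{prefix}{next(self._counter):010d}"
        self._owner[name] = session_id
        return name

    def session_closed(self, session_id):
        # Ephemeral behaviour: nodes created by a session disappear when it ends.
        self._owner = {n: s for n, s in self._owner.items() if s != session_id}

    def members(self):
        return sorted(self._owner)


group = ToyPresence()
a = group.join(session_id=1)
b = group.join(session_id=2)
group.session_closed(1)          # client 1 goes away
print(a, b, group.members())
```

Listing the children of the parent node then doubles as a "who is alive right now" query, which is the presence system the slide refers to.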

Zookeeper can provide

• Naming
• Synchronization
• Leader election
• Group Management

Zookeeper details

• Presents a filesystem-like abstraction with a tree of znodes
• Each node can hold data and other nodes
• How much data? Limited to 1MB

• Also stored at each node: transaction ID, creation and modification timestamps, and version number.

Interaction with znodes

• create
• delete
• exists
• get data
• set data
• get children
• sync

More on Znodes

• All operations are:
– Atomic
– Persistent
– Ordered
• A per-node Access Control List defines who can perform operations

• ACL determines Read, Write, and Delete permissions

More API

• Sequential numbering
– i.e. create node /n/child_X, setting X to the next value of a per-parent counter that only increases

• Conditional updates via the version number
– A bit like transactional shared memory
– Do a read, make all your changes, try and commit
– If you fail, start again.
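
The read, change, commit, retry loop can be sketched without a server. Everything here is a toy model under assumed names (`ToyZnode`, `BadVersion`, `increment`); a real client would pass the version it read to the set operation and catch the library's own bad-version error instead.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class ToyZnode:
    def __init__(self, data):
        self.data = data
        self.version = 0

    def get(self):
        return self.data, self.version

    def set(self, data, expected_version):
        # The write succeeds only if nobody updated the node since we read it.
        if expected_version != self.version:
            raise BadVersion(f"expected {expected_version}, node is at {self.version}")
        self.data = data
        self.version += 1


def increment(node, interfere=None):
    """Read, modify, try to commit; start again if the commit fails."""
    attempts = 0
    while True:
        attempts += 1
        value, version = node.get()
        if interfere and attempts == 1:
            interfere()                  # simulate a competing writer
        try:
            node.set(value + 1, version)
            return attempts
        except BadVersion:
            continue


counter = ToyZnode(0)
# A hypothetical rival overwrites the node between our read and our write.
attempts = increment(counter, interfere=lambda: counter.set(100, counter.version))
print(attempts, counter.get())
```

The first commit is rejected because the rival moved the version from 0 to 1, so the loop re-reads and succeeds on the second attempt. No lock is held at any point, which is why the slide compares it to transactional shared memory.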

Watching a Node

• A watch can be set on a node, allowing action to occur when a node is updated

• In fact, when any state is changed on the server, the client can be notified via a watch, including session state changes.

• Sessions are persistent across server failures: if the server you are talking to fails, you can transparently negotiate session continuation with another server.
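
A watch can be pictured as a callback registered together with a read. The sketch below is a toy model with hypothetical names (`ToyWatchedNode`, the `"changed"` event string); it also models one detail the slide does not spell out, namely that Zookeeper's standard watches are one-time triggers and must be set again after they fire.

```python
class ToyWatchedNode:
    """In-memory node whose watches fire once and are then discarded."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        if watch is not None:
            self._watchers.append(watch)   # register interest along with the read
        return self.data

    def set(self, data):
        self.data = data
        # One-shot: detach the registered callbacks before firing them.
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback("changed")


events = []
node = ToyWatchedNode("starting")
node.get(watch=events.append)
node.set("running")      # fires the watch
node.set("stopping")     # no watch is registered any more, nothing fires
print(events, node.get())
```

A client that wants to follow every change therefore re-reads the node with a fresh watch inside its callback.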

API features

• Enough power to implement common concurrent objects
– Can do compare-and-swap on version number (and therefore CAS on data in two round trips)
– CAS has consensus number infinity in the wait-free hierarchy, so theoretically powerful, for what that’s worth

• Can build queues, semaphores and so on.
– Examples later

Yes, but does it scale?

• Typical deployment is 5 to 7 machines
– Always choose an odd number for maximum relative fault tolerance

• All updates go through same machine
– So single machine is bottleneck
– Partitioning would probably help

• Throughputs are fairly decent
– Tens of thousands of updates per second
– Works best in heavy read scenarios
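
The "odd number" advice follows from majority arithmetic, which a few lines of Python make concrete. The function names are made up for this illustration; the only assumption is that the ensemble needs a strict majority of its servers to keep operating.

```python
def quorum_size(servers):
    """Smallest strict majority of an ensemble."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """How many servers may fail while a majority is still reachable."""
    return servers - quorum_size(servers)


for n in (3, 4, 5, 6, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

Going from 3 to 4 servers (or from 5 to 6) raises the quorum size without raising the number of failures that can be survived, so the extra machine adds write cost but no fault tolerance.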

Is it Paxos?

• No, not quite
• In standard operation, protocols look very similar

• Paxos has to cope with arbitrary message re-orderings and competing leaders

DEMO....

Questions?

(Thank you for your time)

Appendix A: Resources

• Cloudera
– http://www.cloudera.com/blog
– http://wiki.cloudera.com/
– http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/

• Zookeeper
– http://zookeeper.apache.org/

Appendix B: Demo

ZOOKEEPER DEMO

In this demo we will show some of the functionality of Zookeeper and hopefully assist you in getting started with Zookeeper.

Introduction: Zookeeper is written in Java, and we can connect to it from Java. There is also a command line client, which we will use for our demo. Since Zookeeper presents a distributed system coordination platform as a filesystem or directory structure, there is also a tool, zkfuse, that "mounts" that filesystem in userspace. I have no information regarding the stability or production use of zkfuse, but it is worth checking out. There are Python (http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/) and other bindings so that you can start using Zookeeper in your application.

Getting started:

For this demo I downloaded Zookeeper, then gunzipped and untarred it in my home directory on my Mac.

$ tar xvf zookeeper-3.3.3.tar.gz

This will create a directory zookeeper-3.3.3

$ cd zookeeper-3.3.3

First we will configure a basic Zookeeper config file

$ mv conf/zoo_sample.cfg conf/zoo.cfg

Edit that file so that it resembles this..

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/tmp/zoo
# the port at which the clients will connect
clientPort=2181
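
Because initLimit and syncLimit are counted in ticks rather than milliseconds, it can help to work out the real timeouts. The following is a small sketch that parses the values from the file above; `parse_cfg` and `SAMPLE_CFG` are hypothetical names, and the parser handles only plain key=value lines and # comments.

```python
SAMPLE_CFG = """\
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zoo
clientPort=2181
"""


def parse_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


cfg = parse_cfg(SAMPLE_CFG)
tick_ms = int(cfg["tickTime"])
# initLimit and syncLimit are expressed in ticks, so multiply to get milliseconds.
init_timeout_ms = int(cfg["initLimit"]) * tick_ms
sync_timeout_ms = int(cfg["syncLimit"]) * tick_ms
print(init_timeout_ms, sync_timeout_ms)
```

With the demo values this gives 20 seconds for the initial synchronization phase and 10 seconds between a request and its acknowledgement.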

Start up Zookeeper in server mode. Note that in real environments you would want at least 3 Zookeeper servers working together for high availability.

$ bin/zkServer.sh start

You will get a warning:

2011-07-27 17:56:40,231 - WARN [main:QuorumPeerMain@105] - Either no config or no quorum defined in config, running in standalone mode

This warning reflects that Zookeeper is running in standalone mode, a single instance.

Zookeeper is a Java process, so we can easily verify that it is running by running jps as the user who launched Zookeeper, or as root.

$ jps
14761 Jps
14743 QuorumPeerMain

This shows that process ID 14743 is the Zookeeper server that we launched, technically referred to as a Quorum, but in our case a Quorum of 1.

With Zookeeper running we can explore some of the features it provides.

Administrative commands: Zookeeper provides a set of 4-letter words, or administrative information commands. For example:

$ echo ruok | nc localhost 2181
imok ## Zookeeper response

Note that there is no newline after "imok", so it is easy to lose this response visually in your terminal. Also note that the terminal that you started Zookeeper in will still be getting messages from Zookeeper. For purposes of demonstration I like to keep that terminal output available, so launch another terminal window to run the rest of the demos, such as 'ruok' and the other 4-letter words.

If Zookeeper was not running or was not "OK" you would get no response, silence.

MORE 4 LETTER WORDS

'stat' returns statistics. In order to use the 4-letter words we echo the word and pipe it through netcat to our Zookeeper. Note that this could be a remote instance; in our case we will connect locally.

STAT shows statistics

$ echo stat | nc localhost 2181
Zookeeper version: 3.3.3-1073969, built on 02/23/2011 22:27 GMT
Clients:
 /0:0:0:0:0:0:0:1%0:51486[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 6
Sent: 5
Outstanding: 0
Zxid: 0x0
Mode: standalone
Node count: 4
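
The same exchange can be scripted instead of typed into netcat. This is a rough sketch, not an official client: `four_letter_word` and `parse_stat` are hypothetical helpers, and the code assumes the server reads the four bytes and closes the connection after replying, as the nc session above suggests. Only the parser is exercised here, against the sample output from the slide.

```python
import socket


def four_letter_word(word, host="localhost", port=2181, timeout=5.0):
    """Send a command such as 'ruok' or 'stat' and return the full reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:            # assumed: server closes after replying
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(reply):
    """Turn the 'Key: value' lines of a stat reply into a dict."""
    fields = {}
    for line in reply.splitlines():
        key, sep, value = line.partition(": ")
        if sep and not key.startswith(" "):
            fields[key] = value
    return fields


SAMPLE = """\
Zookeeper version: 3.3.3-1073969, built on 02/23/2011 22:27 GMT
Clients:
 /0:0:0:0:0:0:0:1%0:51486[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 6
Sent: 5
Outstanding: 0
Zxid: 0x0
Mode: standalone
Node count: 4
"""

stats = parse_stat(SAMPLE)
print(stats["Mode"], stats["Node count"])
```

Against a running server, `parse_stat(four_letter_word("stat"))` would return the live values, and `four_letter_word("ruok")` would be expected to return the string "imok".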

DUMP shows sessions and ephemeral nodes

$ echo dump | nc localhost 2181
SessionTracker dump:
Session Sets (0):
ephemeral nodes dump:
Sessions with Ephemerals (0):

ENVI shows the environment

$ echo envi | nc localhost 2181
Environment:
zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
host.name=172.16.1.220
java.version=1.6.0_17
java.vendor=Apple Inc.
java.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
java.class.path=bin/../build/classes:bin/../build/lib/*.jar:bin/../zookeeper-3.3.3.jar:bin/../lib/log4j-1.2.15.jar:bin/../lib/jline-0.9.94.jar:bin/../src/java/lib/*.jar:bin/../conf:
java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
java.io.tmpdir=/var/folders/w4/w4mbu0k8FlG9o1hKAIOarU+++TI/-Tmp-/
java.compiler=<NA>
os.name=Mac OS X
os.arch=x86_64
os.version=10.6.4
user.name=thanlon
user.home=/Users/thanlon
user.dir=/Users/thanlon/zookeeper/zookeeper-3.3.3

See the documentation for further details regarding 4 letter words.

For the rest of the demonstration you should connect to Zookeeper with the command line tool. When you run this command you will get a bunch of INFO level messages; check that there are no WARN or more severe messages.

$ bin/zkCli.sh

Run "ls /" to get information of current 'nodes'

[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]

You see that Zookeeper uses Zookeeper to manage Zookeeper. Nepotism to be sure, but it makes sense here.

Launch another terminal and connect to Zookeeper with the CLI tool, keep both windows open.

Create a node that indicates that a service is starting and child services can begin starting.

[zk:… ] create /webapp 'starting ' null

* Note the space after starting; I think there might be issues with the CLI.

We are creating a znode with the data "starting" and an Access Control List of null. So anyone can view this node, anyone can create child nodes, and anyone can extract the data piece from this node.

Clean up the CLI artifacts..

set /webapp starting
cZxid = 0x52
ctime = Wed Jul 27 20:32:18 EDT 2011
mZxid = 0x54
mtime = Wed Jul 27 20:37:29 EDT 2011
pZxid = 0x52
cversion = 0
dataVersion = 2
aclVersion = 0
....

Do a get to confirm..

get /webapp

Perhaps our webapp is not running unless the backend DB is up. And perhaps there is a DBmaster and a DBslave. So from the other window create those.

[zk: localhost:2181(CONNECTED) 10] create /webapp/db_master "master " null

Suppose the DB servers are peers, and the first to start is instructed to be the master, and the second is instructed to be the slave.

So perhaps the second one tries to grab the master role..

create /webapp/db_master "master " null
Node already exists: /webapp/db_master

The node creation would fail and the server would then register as the slave.

create /webapp/dbslave "master " null
Created /webapp/dbslave
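
The "first create wins" pattern from this part of the demo can be sketched as a toy model. `ToyTree`, `NodeExists` and `claim_role` are hypothetical names, and the dictionary stands in for the Zookeeper tree; the point is only that create is atomic, so exactly one server can own the master path.

```python
class NodeExists(Exception):
    """Raised when create() targets a path that is already present."""


class ToyTree:
    def __init__(self):
        self._nodes = {}

    def create(self, path, data):
        # Exactly one caller can win a given path.
        if path in self._nodes:
            raise NodeExists(path)
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]


def claim_role(tree, server_name):
    """Try to become master; fall back to slave if the master node exists."""
    try:
        tree.create("/webapp/db_master", server_name)
        return "master"
    except NodeExists:
        tree.create("/webapp/dbslave", server_name)
        return "slave"


tree = ToyTree()
roles = {name: claim_role(tree, name) for name in ("db1", "db2")}
print(roles)
```

In a real deployment the master node would probably be created as an ephemeral node, so that the role is released automatically if the master's session ends; the demo uses ordinary nodes to keep the steps visible.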

Once our webapp has the services it needs to run, it can register as running, or perhaps put more of a payload in the data piece and note the IP address.

set /webapp "192.168.1.23"

get /webapp
"192.168.1.23"

#### End of DEMO ######