Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

APOLLO GROUP So Your Cluster Isn't Yahoo-sized (yet) Hadoop Operations: Starting Out Small Michael Arnold Principal Systems Engineer 14 June 2012

Upload: michael-arnold

Post on 06-Jul-2015

Category:

Technology


DESCRIPTION

Hadoop Summit 2012, Deployment and Operations track. Everyone hears about large clusters with thousands of machines and petabytes of storage, yet not everyone starts their first Hadoop deployment with dozens of cabinets of equipment. What do you do when your deployment is not quite that large? What decisions should you make now, and which should you postpone for later? This session is for SysAdmins who have not yet jumped into the Hadoop fray or have only recently done so. You will be presented with the knowledge gained from two years of operational experience at a (currently) small Hadoop site. We will discuss things that are initially important for a small (10-100 node) cluster and what happens when you outgrow your first deployment.

TRANSCRIPT

Page 1: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

APOLLO GROUP

So Your Cluster Isn't Yahoo-sized (yet)

Hadoop Operations: Starting Out Small

Michael Arnold

Principal Systems Engineer

14 June 2012

Page 2: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Agenda

Who

What (Definitions)

Decisions for Now

Decisions for Later

Lessons Learned

© 2012 Apollo Group

Page 3: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Who

Page 4: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Who is Apollo?

Apollo Group is a leading provider of higher education programs for working adults.

Page 5: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Who is Michael Arnold?

Systems Administrator

Automation geek

13 years in IT

I deal with:

–Server hardware specification/configuration

–Server firmware

–Server operating system

–Hadoop application health

–Monitoring all the above

Page 6: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

What

Definitions

Page 7: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Definitions

Q: What is a tiny/small/medium/large cluster?

A: By node count:

–Tiny: 1-9 nodes

–Small: 10-99 nodes

–Medium: 100-999 nodes

–Large: 1000+ nodes

–Yahoo-sized: 4000 nodes

Page 8: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Definitions

Q: What is a “headnode”?

A: A server that runs one or more of the following Hadoop processes:

–NameNode

–JobTracker

–Secondary NameNode

–ZooKeeper

–HBase Master

Page 9: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Decisions for Now

What decisions should you make now and which can you postpone for later?

Page 10: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Which Hadoop distribution?

Amazon

Apache

Cloudera

Greenplum

Hortonworks

IBM

MapR

Platform Computing

Page 11: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Should you virtualize?

Can be OK for small clusters, BUT:

–virtualization adds overhead

–can cause performance degradation

–cannot take advantage of Hadoop rack locality

Virtualization can be good for:

–functional testing of M/R job or workflow changes

–evaluation of Hadoop upgrades

Page 12: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

What sort of hardware should you be considering?

Inexpensive

Not “enterprisey” hardware

–No RAID*

–No redundant power*

Low power consumption

No optical drives

–get systems that can boot off the network

* except in headnodes

Page 13: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Plan for capacity expansion

Start racking at the bottom of the cabinet and work your way up.

Leave room in your cabinets for more machines.

Page 14: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Plan for capacity expansion (cont.)

Deploy your initial cluster in two cabinets

–One headnode, one switch, and several (five) datanodes per cabinet

Page 15: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Plan for capacity expansion (cont.)

Install a second cluster in the empty space in the upper half of each cabinet

Page 16: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Decisions for Later

What decisions should you make now and which can you postpone for later?

Page 17: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

What size cluster?

Depends upon your:

Budget

Data size

Workload characteristics

SLA

Page 18: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

What size cluster? (cont.)

Are your MapReduce jobs:

compute-intensive?

reading lots of data?

http://www.cloudera.com/blog/2010/08/hadoophbase-capacity-planning/
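For the storage-bound side of that question, a rough first estimate can be sketched as below. This is only a back-of-the-envelope sketch: the function name, the 25% scratch reservation, and the example figures are assumptions for illustration and do not come from the talk. The replication factor of 3 is the HDFS default.

```python
import math

def datanodes_needed(data_tb, usable_tb_per_node, replication=3, scratch_fraction=0.25):
    """Estimate how many datanodes are needed to store data_tb of user data.

    replication      -- HDFS replication factor (3 is the HDFS default)
    scratch_fraction -- assumed share of each node's disk kept free for
                        M/R intermediate output and logs (hypothetical value)
    """
    if not 0 <= scratch_fraction < 1:
        raise ValueError("scratch_fraction must be in [0, 1)")
    raw_tb = data_tb * replication                      # every block is stored `replication` times
    hdfs_tb_per_node = usable_tb_per_node * (1 - scratch_fraction)
    return math.ceil(raw_tb / hdfs_tb_per_node)

# Hypothetical example: 20 TB of data on nodes with 8 TB of usable disk each.
print(datanodes_needed(20, 8))
```

A compute-intensive workload would be sized by task slots and CPU rather than by disk, so treat this number as a lower bound, not a plan.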

Page 19: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Should you implement rack awareness?

If there is more than one switch in the cluster:

YES
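Rack awareness is normally wired up through a topology script: Hadoop calls the script with one or more datanode addresses as arguments and expects one rack path per argument on standard output. The script below is a minimal sketch of that contract. The subnet-to-cabinet mapping and the rack names are hypothetical; the script is referenced from the `topology.script.file.name` property in Hadoop 1.x/CDH3 (`net.topology.script.file.name` in later releases).

```python
#!/usr/bin/env python3
import sys

# Hypothetical layout: one /24 subnet behind each cabinet's switch.
RACKS = {
    "10.1.1": "/dc1/cabinet1",
    "10.1.2": "/dc1/cabinet2",
}
DEFAULT_RACK = "/default-rack"

def rack_for(address):
    """Map an IPv4 address to a rack path using its first three octets."""
    prefix = address.rsplit(".", 1)[0]
    return RACKS.get(prefix, DEFAULT_RACK)

def main(argv):
    # Hadoop may pass several addresses at once; answer one rack per line.
    for address in argv:
        print(rack_for(address))

if __name__ == "__main__":
    main(sys.argv[1:])
```

Unknown addresses fall back to a default rack so that a new node never causes the script to fail.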

Page 20: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Should you use automation?

If not in the beginning, then as soon as possible.

Boot disks will fail.

Automated OS and application installs:

–Save time

–Reduce errors

•Cobbler/Spacewalk/Foreman/xCAT/etc

•Puppet/Chef/CFEngine/shell scripts/etc

Page 21: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Lessons Learned

Page 22: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Keep It Simple

Don't add redundancy and features (server/network) that will make things more complicated and expensive.

Hadoop has built-in redundancies.

Don't overlook them.

Page 23: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Automate the Hardware

Twelve hours of manual work in the datacenter is not fun.

Make sure all server firmware is configured identically.

–HP SmartStart Scripting Toolkit

–Dell OpenManage Deployment Toolkit

–IBM ServerGuide Scripting Toolkit

Page 24: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Rolling upgrades are possible

(Just not of the Hadoop software.)

Datanodes can be decommissioned, patched, and added back into the cluster without service downtime.
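Decommissioning in Hadoop 1.x is driven by the file named in the `dfs.hosts.exclude` property: add the datanode's hostname, then ask the NameNode to re-read the file. The helper below sketches the first half and records the refresh command as a constant. The function name and the file path used in the comment are illustrative assumptions, not something from the talk.

```python
# Command that tells the NameNode to re-read its include/exclude files.
REFRESH_COMMAND = ["hadoop", "dfsadmin", "-refreshNodes"]

def exclude_datanode(exclude_path, hostname):
    """Add hostname to the dfs.hosts.exclude file.

    Returns True if the file was changed, False if the host was already listed.
    exclude_path is whatever dfs.hosts.exclude points at
    (for example a hypothetical /etc/hadoop/conf/dfs.exclude).
    """
    try:
        with open(exclude_path) as handle:
            hosts = [line.strip() for line in handle if line.strip()]
    except FileNotFoundError:
        hosts = []
    if hostname in hosts:
        return False
    hosts.append(hostname)
    with open(exclude_path, "w") as handle:
        handle.write("\n".join(hosts) + "\n")
    return True
```

After running `REFRESH_COMMAND`, wait until the NameNode reports the node as decommissioned before patching it. Removing the hostname from the file and refreshing again brings the node back.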

Page 25: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

The smallest thing can have a big impact on the cluster

Bad NIC/switchport can cause cluster slowness.

Slow disks can cause intermittent job slowdowns.

Page 26: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

HDFS blocks are weird

On ext3/ext4:

–Blocks smaller than the HDFS block size are not padded; they occupy only the actual size of the data.

–Each HDFS block is actually two files on the datanode's filesystem:

•The actual data and

•A metadata/checksum file

# ls -l blk_1058778885645824207*
-rw-r--r-- 1 hdfs hdfs 35094 May 14 01:26 blk_1058778885645824207
-rw-r--r-- 1 hdfs hdfs 283 May 14 01:26 blk_1058778885645824207_19155994.meta
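The 283-byte `.meta` file in the listing above can be reproduced with a little arithmetic. The sketch below assumes the default checksum settings (one 4-byte CRC32 per 512 bytes of data, `io.bytes.per.checksum`) and a 7-byte metadata header; these constants are my understanding of the on-disk format rather than something stated in the talk.

```python
META_HEADER_BYTES = 7      # assumed: 2-byte version + 1-byte checksum type + 4-byte chunk size
BYTES_PER_CHECKSUM = 512   # assumed default for io.bytes.per.checksum
CHECKSUM_BYTES = 4         # one CRC32 per chunk

def meta_file_size(block_bytes):
    """Expected size of the .meta file for a block file of block_bytes."""
    chunks = (block_bytes + BYTES_PER_CHECKSUM - 1) // BYTES_PER_CHECKSUM  # round up
    return META_HEADER_BYTES + chunks * CHECKSUM_BYTES

print(meta_file_size(35094))  # 283, matching the listing
```

The 35094-byte block needs 69 checksum chunks, so 69 × 4 + 7 = 283 bytes, which agrees with the file size shown.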

Page 27: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Do not prematurely optimize

Be careful tuning your datanode filesystems.

• mkfs -t ext4 -T largefile4 ... (probably bad)

• mkfs -t ext4 -i 131072 -m 0 ... (better)

/etc/mke2fs.conf

[fs_types]
	hadoop = {
		features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
		inode_ratio = 131072
		blocksize = -1
		reserved_ratio = 0
		default_mntopts = acl,user_xattr
	}
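Why `largefile4` is "probably bad" becomes clearer when the inode budget is worked out. The sketch below compares the two settings for a hypothetical 2 TB data disk. The `largefile4` profile in the stock mke2fs.conf uses one inode per 4 MiB; the disk size is an assumption, and real `mkfs` output will differ slightly because inodes are allocated per block group.

```python
def inode_count(fs_bytes, bytes_per_inode):
    """Approximate number of inodes mkfs creates for a given inode ratio."""
    return fs_bytes // bytes_per_inode

def max_hdfs_blocks(fs_bytes, bytes_per_inode):
    """Upper bound on HDFS blocks: each block needs two inodes
    (the data file and its .meta file); directories are ignored."""
    return inode_count(fs_bytes, bytes_per_inode) // 2

DISK_BYTES = 2 * 10**12  # hypothetical 2 TB datanode disk

for label, ratio in (("-T largefile4", 4 * 1024 * 1024), ("-i 131072", 131072)):
    print(label, inode_count(DISK_BYTES, ratio), max_hdfs_blocks(DISK_BYTES, ratio))
```

With one inode per 4 MiB, the disk runs out of inodes before it runs out of space whenever the average block file is smaller than roughly 8 MiB, which is easy to hit with many small files like the 35 KB block shown earlier.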

Page 28: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Use DNS-friendly names for services

hdfs://hdfs.delta.hadoop.apollogrp.edu:8020/

mapred.delta.hadoop.apollogrp.edu:8021

http://oozie.delta.hadoop.apollogrp.edu:11000/

hiveserver.delta.hadoop.apollogrp.edu:10000

Yes, the names are long, but I bet you can figure out how to connect to Bravo Cluster.
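The naming pattern on this slide is regular enough to generate: service name, then cluster name, then a common domain. The sketch below encodes the four endpoints shown and derives the equivalents for another cluster. The table and function names are illustrative, and the "bravo" result is only what the pattern implies, not a confirmed hostname.

```python
DOMAIN = "hadoop.apollogrp.edu"

# service -> (URI scheme prefix, port, trailing path), taken from the slide
SERVICES = {
    "hdfs": ("hdfs://", 8020, "/"),
    "mapred": ("", 8021, ""),
    "oozie": ("http://", 11000, "/"),
    "hiveserver": ("", 10000, ""),
}

def endpoint(service, cluster):
    """Build the endpoint string for a service on a named cluster."""
    scheme, port, suffix = SERVICES[service]
    return f"{scheme}{service}.{cluster}.{DOMAIN}:{port}{suffix}"

print(endpoint("hdfs", "delta"))
print(endpoint("oozie", "bravo"))
```

Because clients only ever see the service name, the daemon behind it can move to a different headnode with a DNS change instead of a client reconfiguration.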

Page 29: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Use a parallel, remote execution tool

SSH in a for loop is so 2010

pdsh/Cluster SSH/mussh/etc

FUNC/MCollective
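To show what these tools buy you over a serial loop, here is a minimal sketch of the same idea using only the Python standard library. It is not a replacement for pdsh or MCollective; the function names are made up for illustration, and it assumes key-based SSH access to every host.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_ssh(host, command, timeout=30):
    """Run one command on one host over SSH; returns (exit code, stdout)."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],  # BatchMode: never prompt for a password
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode, result.stdout

def run_everywhere(hosts, command, runner=run_ssh, max_workers=16):
    """Run command on all hosts concurrently; returns {host: runner result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```

Usage would look like `run_everywhere(["dn01", "dn02"], "uptime")` with hypothetical hostnames. The `runner` parameter exists so the fan-out logic can be exercised without touching a real machine.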

Page 30: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Make your log directories as large as you can.

20-100GB /var/log

–Implement log purging cronjobs or your log directories will fill up.

Beware: M/R jobs can fill up /tmp as well.
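A purge job can be very small. The sketch below deletes files under a directory that have not been modified for a given number of days; it is meant to be called from cron. The function name and the retention period are assumptions, and the right retention depends on how long you need job logs for debugging.

```python
import os
import time

def purge_old_logs(directory, max_age_days, now=None):
    """Delete regular files under directory older than max_age_days.

    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # File vanished or is not removable; skip it and keep going.
                continue
    return removed
```

Note that removing a file a daemon still holds open does not free the space until the daemon closes it, so purge rotated logs rather than the active one.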

Page 31: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Insist on IPMI 2.0 for out of band management of server hardware.

Serial Over LAN is awesome when booting a system.

Standardized hardware/temperature monitoring.

Simple remote power control.
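All three capabilities on this slide are reachable through `ipmitool` over the IPMI 2.0 `lanplus` interface. The sketch below only builds the argument lists, so it can be fed to `subprocess.run` or printed for review. The helper name, BMC hostname, and user are hypothetical; `-E` makes ipmitool read the password from the `IPMI_PASSWORD` environment variable instead of the command line.

```python
def ipmi_command(bmc_host, user, *action):
    """Build an ipmitool argument list for an IPMI 2.0 (lanplus) BMC."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E", *action]

BMC = "dn01-bmc.example.com"   # hypothetical BMC hostname
USER = "admin"                 # hypothetical BMC account

power_status = ipmi_command(BMC, USER, "chassis", "power", "status")   # remote power control
power_cycle = ipmi_command(BMC, USER, "chassis", "power", "cycle")
serial_console = ipmi_command(BMC, USER, "sol", "activate")            # Serial Over LAN
temperatures = ipmi_command(BMC, USER, "sdr", "type", "Temperature")   # sensor monitoring

print(" ".join(power_status))
```

Because the same commands work across vendors, monitoring and power scripts written this way survive a hardware refresh.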

Page 32: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Spanning-tree is the devil

Enable portfast on your server switch ports or the BMCs may never get a DHCP lease.

Page 33: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Apollo has rebuilt its cluster four times.

You may end up doing so as well.

Page 34: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Apollo Timeline

First build

Cloudera Professional Services helped install CDH.

Four nodes

OS built manually via USB CD-ROM.

CDH2

Page 35: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Apollo Timeline

Second build

Cobbler

All software deployment is via kickstart. Very little is in Puppet. Config files are deployed via wget.

CDH2

Page 36: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Apollo Timeline

Third build

OS filesystem partitioning needed to change.

Most software deployment still via kickstart.

CDH3b2

Page 37: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Apollo Timeline

Fourth build

HDFS filesystem inodes needed to be increased.

Full Puppet automation.

Added redundant/hotswap enterprise hardware for headnodes.

CDH3u1

Page 38: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Cluster failures at Apollo

Hardware

–disk failures (40+)

–disk cabling (6)

–RAM (2)

–switch port (1)

Software

–Cluster

•NFS (NN -> 2NN metadata)

–Job

•TT java heap

•Running out of /tmp or /var/log/hadoop

•Running out of HDFS space

Page 39: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Know your workload

You can spend all the time in the world trying to get the best CPU/RAM/HDD/switch/cabinet configuration, but you are running on pure luck until you understand your cluster's workload.

Page 40: Hadoop Operations: Starting Out Small / So Your Cluster Isn't Yahoo-sized (yet)

Questions?