Hadoop Hardware @Twitter: Size does matter

@joep and @eecraft, Hadoop Summit 2013, v2.3

DESCRIPTION

@joep and @eecraft Hadoop Summit 2013

TRANSCRIPT

Page 1: Hadoop Hardware @Twitter: Size does matter

Hadoop Hardware @Twitter: Size does matter.

@joep and @eecraft, Hadoop Summit 2013

v2.3

Page 2: Hadoop Hardware @Twitter: Size does matter

@Twitter #HadoopSummit2013

About us

Joep Rottinghuis
• Software Engineer @ Twitter
• Engineering Manager Hadoop/HBase team @ Twitter
• Follow me @joep

Jay Shenoy
• Hardware Engineer @ Twitter
• Engineering Manager HW @ Twitter
• Follow me @eecraft

HW & Hadoop teams @ Twitter, Many others

Page 3: Hadoop Hardware @Twitter: Size does matter

Agenda

• Scale of Hadoop Clusters
• Single versus multiple clusters
• Twitter Hadoop Architecture
• Hardware investigations
• Results

Page 4: Hadoop Hardware @Twitter: Size does matter

Scale

Scaling limits
• JobTracker: tens of thousands of jobs per day; tens of thousands of concurrent slots
• Namenode: 250-300 M objects in a single namespace
• Namenode at ~100 GB heap -> full GC pauses
• Shipping job jars to thousands of nodes
• JobHistory server at a few hundred thousand job history/conf files

[Figure: axis label "# Nodes"]

Page 5: Hadoop Hardware @Twitter: Size does matter

When / why to split clusters?

• In principle, preference for a single cluster
  • Common logs, shared free space, reduced admin burden, more rack diversity
• Varying SLAs
• Workload diversity
  • Storage intensive
  • Processing (CPU / Disk IO) intensive
  • Network intensive
• Data access
  • Hot, Warm, Cold

Page 6: Hadoop Hardware @Twitter: Size does matter

Cluster Architecture

Page 7: Hadoop Hardware @Twitter: Size does matter

Hardware investigations

Page 8: Hadoop Hardware @Twitter: Size does matter

Service criteria for hardware

• Hadoop does not need live HDD swap
• Twitter DC: No SLA on data nodes
• Rack SLA: Only 1 rack down at any time in a cluster

Page 9: Hadoop Hardware @Twitter: Size does matter

Baseline Hadoop Server (~ early 2012)

[Block diagram: two E56xx CPUs with DIMMs, PCH, NIC (GbE), HBA, Expander]

Characteristics:
• Standard 2U server
• 20 servers / rack
• E5645 CPU, dual 6-core
• 72 GB memory
• 12 x 2 TB HDD
• 2 x 1 GbE

Works for the general cluster, but...
• Need more density for storage
• Potential IO bottlenecks

Page 10: Hadoop Hardware @Twitter: Size does matter

Hadoop Server: Possible evolution

[Block diagram: two E5-26xx or E5-24xx CPUs with DIMMs, NIC (GbE; 10GbE?), HBA, Expander (16 x 2T? 16 x 3T? 24 x 3T?)]

Characteristics:
• + CPU performance?
• 20 servers / rack
• Candidate for DW

Can deploy into the general DW cluster, but...
• Too much CPU for storage intensive apps
• Server failure domain too large if we scale up disks

Page 11: Hadoop Hardware @Twitter: Size does matter

Rethinking hardware evolution

• Debunking myths
  • Bigger is always better
  • One size fits all
• Back to Hadoop Hardware Roots:
  • Scale horizontally, not vertically

Twitter Hadoop Server - “THS”

Page 12: Hadoop Hardware @Twitter: Size does matter

THS for backups

[Block diagram: E3-12xx CPU with DIMMs, PCH, NIC (GbE), SAS HBA]

Storage focus:
• Cost efficient (single socket, 3 TB drives)
• Less memory needed

Characteristics:
• + IO Performance
• Few fast cores
• E3-1230 V2 CPU
• 16 GB memory
• 12 x 3 TB HDD
• SSD boot
• 2 x 1 GbE

Page 13: Hadoop Hardware @Twitter: Size does matter

THS variant for Hadoop-Proc and HBase

[Block diagram: E3-12xx CPU with DIMMs, PCH, NIC (10GbE), SAS HBA]

Processing / throughput focus:
• Cost efficient (single socket, 1 TB drives)
• More disk and network IO per socket

Characteristics:
• + IO Performance
• Few fast cores
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 1 TB HDD
• SSD boot
• 1 x 10 GbE

Page 14: Hadoop Hardware @Twitter: Size does matter

THS for cold cluster

[Block diagram: E3-12xx CPU with DIMMs, PCH, NIC (GbE), SAS HBA]

Combination of previous 2 use cases:
• Space & power efficient
• Storage dense and some processing capabilities

Characteristics:
• Disk Efficiency
• Some compute
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 3 TB HDD
• 2 x 1 GbE

Page 15: Hadoop Hardware @Twitter: Size does matter

Rack-level view

                      Baseline        THS Backups     THS Proc        THS Cold
Power                 ~ 8 kW          ~ 8 kW          ~ 8 kW          ~ 8 kW
CPU sockets; DRAM     40; 1440 GB     40; 640 GB      40; 1280 GB     40; 1280 GB
Spindles; TB raw      240; 480 TB     480; 1,440 TB   480; 480 TB     480; 1,440 TB
Uplink; Internal BW   20; 40 Gbps     20; 80 Gbps     40; 400 Gbps    20; 80 Gbps

[Rack diagram labels: 1G TOR x 5, 10G TOR x 1]
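The rack-level figures can be reproduced from the per-server characteristics on the earlier slides. The following is a minimal sketch, not the authors' tooling. It assumes 40 THS servers per rack (inferred from the 40 single-socket CPUs per rack, not stated on the server slides) and treats "Internal BW" as the sum of all server NIC ports in the rack. The names `ServerSpec`, `rack_totals`, and `SPECS` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    name: str
    servers_per_rack: int  # 20 is from the baseline slide; 40 for THS is inferred
    sockets: int
    dram_gb: int
    hdds: int
    tb_per_hdd: int
    nic_ports: int
    gbps_per_port: int


def rack_totals(spec):
    """Multiply per-server characteristics out to one rack."""
    n = spec.servers_per_rack
    spindles = n * spec.hdds
    return {
        "cpu_sockets": n * spec.sockets,
        "dram_gb": n * spec.dram_gb,
        "spindles": spindles,
        "raw_tb": spindles * spec.tb_per_hdd,
        # Assumption: internal bandwidth = sum of server NIC bandwidth.
        "internal_gbps": n * spec.nic_ports * spec.gbps_per_port,
    }


SPECS = [
    ServerSpec("Baseline", 20, 2, 72, 12, 2, 2, 1),
    ServerSpec("THS Backups", 40, 1, 16, 12, 3, 2, 1),
    ServerSpec("THS Proc", 40, 1, 32, 12, 1, 1, 10),
    ServerSpec("THS Cold", 40, 1, 32, 12, 3, 2, 1),
]

if __name__ == "__main__":
    for spec in SPECS:
        print(spec.name, rack_totals(spec))
```

Under these assumptions the computed sockets, DRAM, spindles, raw TB, and internal bandwidth match the rack-level table for all four configurations. Power and uplink are not derived here because the slides give no per-server inputs for them.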

Page 16: Hadoop Hardware @Twitter: Size does matter

Processing performance comparison

Benchmark                                           Baseline Server    THS (-Cold)
TestDFSIO (write replication = 1)                   360 MB/s / node    780 MB/s / node
TeraGen (30 TB, replication = 3)                    1:36 hrs           1:35 hrs
TeraSort (30 TB, replication = 3)                   6:11 hrs           4:22 hrs
2 Parallel TeraSort (30 TB each, replication = 3)   10:36 hrs          6:21 hrs
Application #1                                      4:37 min           3:09 min
Application set #2                                  13:3 hrs           10:57 hrs
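To compare the two server types at a glance, the runtimes above can be converted to speedup ratios. This is an illustrative sketch only: the helper names `to_seconds` and `speedup` are hypothetical, and "Application set #2" is left out because its baseline value ("13:3 hrs") appears truncated in the transcript.

```python
def to_seconds(value, unit):
    """Convert 'H:MM' (unit='hrs') or 'M:SS' (unit='min') to seconds."""
    major, minor = (int(part) for part in value.split(":"))
    scale = {"hrs": 3600, "min": 60}[unit]
    return major * scale + minor * (scale // 60)


def speedup(baseline, ths, unit):
    """Baseline runtime divided by THS runtime; above 1.0 means THS is faster."""
    return to_seconds(baseline, unit) / to_seconds(ths, unit)


# (benchmark, baseline runtime, THS runtime, unit), as listed on the slide
RUNTIMES = [
    ("TeraGen", "1:36", "1:35", "hrs"),
    ("TeraSort", "6:11", "4:22", "hrs"),
    ("2 Parallel TeraSort", "10:36", "6:21", "hrs"),
    ("Application #1", "4:37", "3:09", "min"),
]

# TestDFSIO is reported as throughput, so the ratio is THS over baseline.
DFSIO_WRITE_RATIO = 780 / 360

if __name__ == "__main__":
    for name, base, ths, unit in RUNTIMES:
        print(f"{name}: {speedup(base, ths, unit):.2f}x")
    print(f"TestDFSIO write: {DFSIO_WRITE_RATIO:.2f}x")
```

On these figures TeraGen is essentially unchanged, while TeraSort and the two-job TeraSort run show the larger gains.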

Performance benchmark setup:
• Each cluster: 102 nodes of the respective type
• Efficient server = 3 racks, Baseline 5+ racks
• “Dated” stack: CentOS 5.5, Sun 1.6 JRE, Hadoop 2.0.3
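The rack counts in the setup follow from the node count and the rack density. A small sketch, assuming 20 baseline servers per rack (from the baseline slide) and 40 THS servers per rack (inferred from the 40 CPU sockets per rack in the rack-level view); `racks_for` is a hypothetical helper.

```python
def racks_for(nodes, servers_per_rack):
    """Return (full_racks, leftover_servers, racks_occupied)."""
    full, leftover = divmod(nodes, servers_per_rack)
    return full, leftover, full + (1 if leftover else 0)


if __name__ == "__main__":
    print("Baseline:", racks_for(102, 20))
    print("THS:", racks_for(102, 40))
```

With 102 nodes, the baseline fills 5 racks and spills 2 servers into another, which fits "5+ racks". The THS cluster occupies 3 racks.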

Page 17: Hadoop Hardware @Twitter: Size does matter

Results

Page 18: Hadoop Hardware @Twitter: Size does matter

LZO performance comparison

Page 19: Hadoop Hardware @Twitter: Size does matter

Recap

• At a certain scale it makes sense to split into multiple clusters
  • For us: RT, PROC, DW, COLD, BACKUPS, TST, EXP
• For large enough clusters, depending on use case, it may be worth choosing different HW configurations

Page 20: Hadoop Hardware @Twitter: Size does matter

Conclusion

@Twitter, our “Twitter Hadoop Server” not only saves many $$$, it is also faster!

Page 21: Hadoop Hardware @Twitter: Size does matter

#ThankYou

@joep and @eecraft

Come talk to us at booth 26