springfs: bridging agility and performance in elastic ......• hadoop/hdfs deployments at facebook...

SpringFS: Bridging Agility and Performance in Elastic Distributed Storage

Lianghong Xu Jim Cipar, Elie Krevat, Alexey Tumanov, Nitin Gupta,

Mike Kozuch (Intel), Greg Ganger

Carnegie Mellon University

http://www.pdl.cmu.edu/ Lianghong Xu © Feb 14!

Elasticity in Distributed Storage •  “Elasticity” in distributed storage:

•  ability to resize dynamically as workload varies •  More difficult than elastic computing

•  Benefits •  Re-use for other purposes or reduce energy usage •  Save machine hours (operating cost)

•  Most distributed storage is not elastic •  Designed for load balancing, not elasticity •  E.g., GFS and HDFS •  Deactivating servers may make data unavailable

http://www.pdl.cmu.edu/ 2 Lianghong Xu © Feb 14!

SpringFS Contributions

http://www.pdl.cmu.edu/ 3

“Agility”: speed of elastic resizing

Lianghong Xu © Feb 14!

Sierra

Some elasticity Max peak write performance

Rabbit

Max elasticity Sub-max peak write performance

SpringFS

minimize data migration minimize machine hours

Key metrics

1. Fill the gap

2. Propose and address agility

Agility is Important

<your name here> © 2/24/14!http://www.pdl.cmu.edu/ 4

“Burstiness” in the Facebook HDFS trace



> 50% potential saving of machine hours


Agility allows close tracking of workload variation


Outline •  Introduction

•  Background and motivation

•  SpringFS design

•  Evaluation

•  Conclusion


Non-elastic Example: HDFS


Almost all servers must be “on” to ensure 100% availability •  Little potential for elastic resizing

Primary replicas

Tertiary replicas

Secondary replicas

Assumption: 3-way replication


Server number Dat

a st

ored

on

serv

er X

N 1

Pseudo-random placement, even data layout

Data Layout in Elastic Storage •  General rule:

•  Take advantage of replication •  Always keep the first (primary) replicas “on” •  The other replicas can be activated on demand

•  Notable examples: Sierra [1] and Rabbit [2]

[1] E. Thereska et al. Sierra: Practical Power-proportionality for Data Center Storage. Eurosys 2011.

[2] J. Cipar et al. Robust and flexible power-proportional storage. SoCC 2010.


Sierra Data Layout


N/3

•  Peak write performance: N/3 (as good as HDFS)

Primary servers

Secondary and tertiary servers


Server number

Dat

a st

ored

on

serv

er X

N 1

•  #servers with primary replicas: N/3

Dat

a st

ored

on

serv

er X

Server number N 1

Rabbit Equal-work Data Layout


Primary servers p = N / e2 ≈ 0.13N


Peak write performance: p

#servers with primary replicas: p (theoretically the best)

Rabbit When Using Offloading


Dat

a st

ored

on

the

serv

er

Server number N


Primary replicas spread across all active servers

•  Peak write performance: N/3

•  #servers with primary replicas: N

Tradeoff Space


#ser

vers

with

prim

ary

repl

icas

Peak write performance (normalized to 1 server) 0

better

bette

r N HDFS

Rabbit (offloaded)

N/3

N/3 Sierra (all cases)

p

p Rabbit (no offload)

SpringFS (depending on workload)


Outline •  Introduction

•  Background and motivation


•  Evaluation

•  Conclusion


SpringFS Design •  Bounded write offloading

•  Dynamically constrain distribution of primary replicas

•  Read offloading •  Preferentially offload reads from write-heavy servers

•  Passive migration •  Delay migration on server re-integration


Bounded Write Offloading


Dat

a st

ored

on

serv

er X

Server number N p m

Offload set: automatically adapts to workload Lianghong Xu © Feb 14!

Bounded Write Offloading


N m


m = N/3 m = p Rabbit Sierra

SpringFS Implementation •  Modified instance of HDFS

•  Written in Java and Python

•  Built and used a Scriptable HDFS interface •  Data placement •  Load balancing •  Data Migration •  Implement SpringFS and Rabbit in the same system

•  Resizing agent •  Activate/deactivate servers according to workload


Outline •  Introduction and motivation

•  Background


•  Evaluation

•  Conclusion


Evaluation Overview •  Experiments with SpringFS prototype

•  Analysis with real-world traces •  Hadoop/HDFS deployments at Facebook (FB) and

Cloudera Customers (CC)

•  Summary of results •  SpringFS improves over state-of-the-art designs

– Reduces data migration – Reduces machine hour usage (often near-ideal) – Provides range of options between Rabbit and

Sierra <your name here> © 2/24/14!http://www.pdl.cmu.edu/ 20

Data Migration in SpringFS Prototype


0

50

100

150

200

250

300

350

400

450

0 5 10 15 20 25 30

Num

ber o

f blo

cks

to m

ove

Target active servers

Rabbit (no offload) Rabbit (offload=30) SpringFS(offload=6) SpringFS(offload=8) SpringFS(offload=10)


Experimental setup: •  30 DataNodes •  4 primary servers •  2GB file per server •  128MB block size

Same performance

Policy Analysis with the Facebook Trace


“Ideal” system: no migration for resizing

Area under curve: aggregate machine hour usage


Machine Hours (SpringFS vs. Ideal)


No cleanup work

Extra servers activated due to passive migration


#primary servers

Machine Hour Usage


SpringFS: 6-120% improvement

SpringFS within 5% off “Ideal”


Data Migration


SpringFS: 9-208X improvement


Conclusion •  SpringFS: a new elastic distributed storage

•  Fills the gap in state-of-the-art designs

•  Agility is important •  Ability to track workload burstiness

•  Address agility by minimizing data migration •  Bounded write offloading •  Read offloading + passive migration

•  Much lower machine hour usage


springfs: bridging agility and performance in elastic ......• hadoop/hdfs deployments at facebook...

Documents