Posted on 23-Jul-2020
EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding

Rashmi Vinayak, UC Berkeley

Joint work with Mosharaf Chowdhury, Jack Kosaian (U Michigan), Ion Stoica, Kannan Ramchandran (UC Berkeley)
Caching for data-intensive clusters

• Data-intensive clusters rely on distributed, in-memory caching for high performance
  - Reading from memory is orders of magnitude faster than from disk/SSD
  - Example: Alluxio (formerly Tachyon†)

†Li et al. SOCC 2014
Imbalances prevalent in clusters

Sources of imbalance:
• Skew in object popularity
  - Small fraction of objects highly popular: Zipf-like distribution; top 5% of objects 7x more popular than bottom 75%† (Facebook and Microsoft production cluster traces)
  †Ananthanarayanan et al. NSDI 2012
• Background network imbalance
  - Some parts of the network more congested than others: ratio of maximum to average utilization more than 4.5x with > 50% utilization (Facebook data-analytics cluster)
  - Similar observations from other production clusters†
  †Chowdhury et al. SIGCOMM 2013
• Failures/unavailabilities
  - Norm rather than the exception: median > 50 machine unavailability events every day in a cluster of several thousand servers† (Facebook data-analytics cluster)
  †Rashmi et al. HotStorage 2013

➡ Adverse effects: load imbalance, high read latency

Single copy in memory often not sufficient to get good performance
Popular approach: Selective Replication

• Uses some memory overhead to cache replicas of objects based on their popularity
  - more replicas for more popular objects

[Figure: three caching servers; object A on Server 1, object B on Server 2. With A twice as popular as B, Server 1 carries 2x load and Server 2 carries 1x. Caching a second replica of A on Server 3 evens the load to 1x on each server.]

• Used in data-intensive clusters† as well as widely used in key-value stores for many web services such as Facebook Tao‡

†Ananthanarayanan et al. NSDI 2011, ‡Bronson et al. ATC 2013
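The allocation step of selective replication can be sketched as a greedy loop that spends a replica budget on whichever object currently carries the highest per-replica load. This is our illustrative sketch for this transcript; the function name and the greedy rule are assumptions, not code from the systems cited above.

```python
# Hypothetical sketch: allocate extra replicas in proportion to popularity,
# flattening the skew from the most popular object down.

def allocate_replicas(popularity, extra_budget):
    """popularity: dict object -> access weight.
    extra_budget: number of additional replicas memory allows."""
    replicas = {obj: 1 for obj in popularity}  # every object has one copy
    for _ in range(extra_budget):
        # give the next replica to the object with the highest per-replica load
        hottest = max(popularity, key=lambda o: popularity[o] / replicas[o])
        replicas[hottest] += 1
    return replicas

# A hot object soaks up the extra copies:
print(allocate_replicas({"A": 8, "B": 1, "C": 1}, extra_budget=2))
# → {'A': 3, 'B': 1, 'C': 1}
```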
Memory Overhead vs. Read Performance & Load Balance

[Figure: schematic tradeoff chart with memory overhead on the x-axis and read performance & load balance on the y-axis. A single copy in memory sits at low overhead and poor performance; selective replication buys better performance with more overhead; EC-Cache reaches higher performance at lower overhead, using "Erasure Coding".]
Quick primer on erasure coding

• Takes in k data units and creates r "parity" units
• Any k of the (k+r) units are sufficient to decode the original k data units
• Example: k = 5, r = 4
  - data units d1 d2 d3 d4 d5; parity units p1 p2 p3 p4
  - read any 5 of the 9 units, then decode to recover d1 ... d5
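The "any k out of (k+r)" property can be seen concretely in the simplest erasure code: k data units plus a single XOR parity (r = 1). This is a hedged sketch for illustration only; EC-Cache itself uses Reed-Solomon codes, which provide the property for any r.

```python
# Special case r = 1: the parity is the byte-wise XOR of the k data units,
# so any k of the k+1 units recover the remaining one.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_units):
    """k data units -> k + 1 units; the last unit is the XOR parity."""
    parity = data_units[0]
    for unit in data_units[1:]:
        parity = xor_bytes(parity, unit)
    return data_units + [parity]

def recover_missing(surviving_units):
    """Any k of the k+1 units suffice: the XOR of the k survivors
    equals the single missing unit (data or parity)."""
    missing = surviving_units[0]
    for unit in surviving_units[1:]:
        missing = xor_bytes(missing, unit)
    return missing

# Example: k = 3 data units of 4 bytes each
units = encode([b"ECca", b"che!", b"demo"])
lost = units[1]                      # pretend one unit is unavailable
survivors = units[:1] + units[2:]    # the remaining k = 3 units
assert recover_missing(survivors) == lost
```

With r > 1 a Reed-Solomon code plays the same role, tolerating any r missing units instead of one.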
EC-Cache bird's eye view: Writes

• Object split into k data units
• Encoded to generate r parity units
• (k+r) units cached on distinct servers chosen uniformly at random

[Figure: Put of object X with k = 2, r = 1. X is split into d1 and d2, encoded to produce p1, and d1, d2, p1 are placed on three distinct caching servers.]
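The write path above can be sketched as follows. This is a minimal sketch under stated assumptions: an XOR stand-in parity for r = 1 (EC-Cache uses Reed-Solomon), and a hypothetical cache keyed by (server, object id).

```python
import random

def split(obj: bytes, k: int):
    """Split an object into k equal-size data units (zero-padded)."""
    unit_len = -(-len(obj) // k)                   # ceil(len / k)
    padded = obj.ljust(k * unit_len, b"\x00")
    return [padded[i * unit_len:(i + 1) * unit_len] for i in range(k)]

def xor_parity(units):
    """Stand-in parity for r = 1; EC-Cache itself uses Reed-Solomon."""
    parity = units[0]
    for unit in units[1:]:
        parity = bytes(a ^ b for a, b in zip(parity, unit))
    return parity

def put(obj_id, obj, k, servers, cache):
    """Split, encode, and cache the k+1 units on distinct servers
    chosen uniformly at random."""
    units = split(obj, k)
    units.append(xor_parity(units))
    chosen = random.sample(servers, len(units))    # distinct, uniform
    for server, unit in zip(chosen, units):
        cache[(server, obj_id)] = unit             # hypothetical key scheme
    return chosen

cache = {}
placed = put("X", b"some object bytes", k=2, servers=list(range(25)), cache=cache)
assert len(set(placed)) == 3 and len(cache) == 3
```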
EC-Cache bird's eye view: Reads

• Read from (k + Δ) units of the object chosen uniformly at random
  - "Additional reads"
• Use the first k units that arrive
• Decode the data units
• Combine the decoded units

[Figure: Get of object X with k = 2, r = 1, Δ = 1, so k + Δ = 3 units are read. The first two units to arrive (here d2 and p1) are decoded into d1 and d2, which are combined into X.]
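The read path, with late binding over k + Δ requests, can be sketched as below; the unit names, latencies, and helper signature are made up for illustration.

```python
import random

def get_first_k(unit_ids, k, delta, latency_of):
    """Read path sketch: request k + delta of the object's units, chosen
    uniformly at random, and use the first k that arrive (late binding).
    latency_of simulates each unit-read's latency."""
    requested = random.sample(unit_ids, k + delta)
    arrivals = sorted(requested, key=latency_of)   # simulated arrival order
    return arrivals[:k]                            # decode + combine would follow

# With k = 2, r = 1, delta = 1 all three units are read; a slow d1
# (e.g., on a congested server) is simply ignored.
latency = {"d1": 9.0, "d2": 2.0, "p1": 4.0}        # made-up latencies (ms)
first = get_first_k(["d1", "d2", "p1"], k=2, delta=1, latency_of=latency.get)
print(first)  # → ['d2', 'p1']
```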
Erasure coding: How does it help?

1. Finer control over memory overhead
  - Selective replication allows only integer control
  - Erasure coding allows fractional control
  - E.g., k = 10 allows control in multiples of 0.1

2. Object splitting helps in load balancing
  - Smaller granularity reads help to smoothly spread load
  - Analysis on a certain simplified model:

Theorem 1. For the setting described above:

    Var(L_EC-Cache) / Var(L_Selective Replication) = 1/k.

Proof: Let w > 0 denote the popularity of each of the files. The random variable L_Selective Replication is distributed as a Binomial random variable with F trials and success probability 1/S, scaled by w. On the other hand, L_EC-Cache is distributed as a Binomial random variable with kF trials and success probability 1/S, scaled by w/k. Thus we have

    Var(L_EC-Cache) / Var(L_Selective Replication)
        = [ (w/k)^2 (kF) (1/S) (1 - 1/S) ] / [ w^2 F (1/S) (1 - 1/S) ]
        = 1/k,

thereby proving our claim. ∎

Intuitively, the splitting action of EC-Cache leads to a smoother load distribution in comparison to selective replication. One can further extend Theorem 1 to accommodate a skew in the popularity of the objects. Such an extension leads to an identical result on the ratio of the variances. Additionally, the fact that each split of an object in EC-Cache is placed on a unique server further helps in evenly distributing the load, leading to even better load balancing.
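Theorem 1's variance ratio can be checked numerically. The sketch below simulates the theorem's simplified model, in which every split is placed independently and uniformly at random and every file has popularity w = 1; EC-Cache's actual distinct-server placement balances load even better, as noted above.

```python
import random

def server_load(num_files, num_servers, k):
    """Load on one fixed server when each of F files is cut into k splits
    of weight 1/k, each split placed independently and uniformly at
    random (k = 1 models selective replication)."""
    load = 0.0
    for _ in range(num_files * k):
        if random.randrange(num_servers) == 0:
            load += 1.0 / k
    return load

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

random.seed(0)
F, S, k, trials = 50, 10, 10, 3000
var_sr = variance([server_load(F, S, 1) for _ in range(trials)])
var_ec = variance([server_load(F, S, k) for _ in range(trials)])
# Theorem 1 predicts a ratio of 1/k = 0.1, up to sampling noise.
print(round(var_ec / var_sr, 3))
```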
5.2 Impact on Latency

Next, we focus on how object splitting impacts read latencies. Under selective replication, a read request for an object is served by reading the object from a server. We first consider naive EC-Cache without any additional reads. Under naive EC-Cache, a read request for an object is served by reading k of its splits in parallel from k servers and performing a decoding operation. Let us also assume that the time taken for decoding is negligible compared to the time taken to read the splits.

Intuitively, one may expect that reading splits in parallel from different servers will reduce read latencies due to the parallelism. While this reduction indeed occurs for the average/median latencies, the tail latencies behave in an opposite manner due to the presence of stragglers: one slow split read delays the completion of the entire read request.

In order to obtain a better understanding of the aforementioned phenomenon, let us consider the following simplified model. Consider a parameter p ∈ [0, 1] and assume that for any request, a server becomes a straggler with probability p, independent of all else. There are two primary contributing factors to the distributions of the latencies under selective replication and EC-Cache:

(a) Proportion of stragglers: Under selective replication, the fraction of requests that hit stragglers is p. On the other hand, under EC-Cache, a read request for an object will face a straggler if any of the k servers from where splits are being read becomes a straggler. Hence, a higher fraction (1 - (1 - p)^k) of read requests can hit stragglers under naive EC-Cache.

(b) Latency conditioned on absence/presence of stragglers: If a read request does not face stragglers, the time taken for serving a read request is significantly smaller under EC-Cache as compared to selective replication because splits can be read in parallel. On the other hand, in the presence of a straggler in the two scenarios, the time taken for reading under EC-Cache is about as large as that under selective replication.

Putting the aforementioned two factors together, we get that the relatively higher likelihood of a straggler under EC-Cache increases the number of read requests incurring a higher latency. The read requests that do not encounter any straggler incur a lower latency as compared to selective replication. These two factors explain the decrease in the median and mean latencies, and the increase in the tail latencies.

In order to alleviate the impact on tail latencies, we use additional reads and late binding in EC-Cache. Reed-Solomon codes have the property that any k of the collection of all splits of an object suffice to decode the object. We exploit this property by reading more than k splits in parallel, and using the k splits that are read first. It is well known that such additional reads help in mitigating the straggler problem and alleviate the effect on tail latencies [36, 82].
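The straggler fractions above, and how Δ additional reads shrink them, can be computed in closed form under the same independence model. The function name and the parameter values are ours, chosen for illustration.

```python
from math import comb

def p_delayed(p, k, delta=0):
    """Probability a read is delayed by stragglers when the first k of
    k + delta parallel unit-reads are used, and each read independently
    hits a straggler with probability p: i.e., more than delta of the
    k + delta reads are stragglers."""
    n = k + delta
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(delta + 1, n + 1))

p = 0.01                                 # per-server straggler probability
sr = p                                   # selective replication reads one unit
naive_ec = p_delayed(p, k=10)            # = 1 - (1 - p)**10, larger than p
ec_delta1 = p_delayed(p, k=10, delta=1)  # one additional read
assert sr < naive_ec                     # naive splitting worsens the tail...
assert ec_delta1 < sr                    # ...one additional read more than undoes it
```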
6 Evaluation

We evaluated EC-Cache through a series of experiments on Amazon EC2 [1] clusters using synthetic workloads and traces from Facebook production clusters. The highlights of the evaluation results are:

• For skewed popularity distributions, EC-Cache improves load balancing over selective replication by 3.3x while using the same amount of memory. EC-Cache also decreases the median latency by 2.64x and the 99.9th percentile latency by 1.79x (§6.2).

• For skewed popularity distributions and in the presence of background load imbalance, EC-Cache decreases the 99.9th percentile latency w.r.t. selective replication by 2.56x while maintaining the same benefits in median latency and load balancing as in the case without background load imbalance (§6.3).

• For skewed popularity distributions and in the presence of server failures, EC-Cache provides a graceful degradation as opposed to the significant degradation in tail latency faced by selective replication. Specifically, EC-Cache decreases the 99.9th percentile latency w.r.t. selective replication by 2.8x (§6.4).

• EC-Cache's improvements over selective replication increase as object sizes increase in production traces;
Erasure coding: How does it help?

3. Object splitting reduces median latency but hurts tail latency
  - Read parallelism helps reduce median latency
  - Straggler effect hurts tail latency (if no additional reads)

4. "Any k out of (k+r)" property helps to reduce tail latency
  - Read from (k + Δ) and use the first k that arrive
  - Δ = 1 often sufficient to rein in tail latency
Design considerations

1. Purpose of erasure codes
  - Storage systems: space-efficient fault tolerance
  - EC-Cache: reduce read latency; load balance
2. Choice of erasure code
  - Storage systems: optimize resource usage during reconstruction operations†; some codes do not have the "any k out of (k+r)" property
  - EC-Cache: no reconstruction operations in the caching layer (data persisted in underlying storage); the "any k out of (k+r)" property helps in load balancing and reducing latency when reading objects

†Rashmi et al. SIGCOMM 2014, Sathiamoorthy et al. VLDB 2013, Huang et al. ATC 2012
3. How do we use erasure coding: across vs. within objects
  - Storage systems: some encode across objects (e.g., HDFS-RAID), some within (e.g., Ceph); this choice does not affect fault tolerance
  - EC-Cache: need to encode within objects, to spread load across both data & parity units; encoding across objects incurs very high bandwidth overhead for reading an object using parities†

†Rashmi et al. SIGCOMM 2014, HotStorage 2013
Implementation

• EC-Cache on top of Alluxio (formerly Tachyon)
  - Backend caching servers cache data; they are unaware of erasure coding
  - EC-Cache client library handles all read/write logic
• Reed-Solomon code
  - "Any k out of (k+r)" property
• Intel ISA-L hardware acceleration library
  - Fast encoding and decoding
Evaluation set-up

• Amazon EC2
• 25 backend caching servers and 30 client servers
• Object popularity: Zipf distribution with high skew
• EC-Cache uses k = 10, Δ = 1
  - BW overhead = 10%
• Varying object sizes
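The 10% overhead figures follow directly from the parameters: one parity unit per k = 10 data units costs 10% extra memory, and Δ = 1 additional read per 10 required reads costs 10% extra read bandwidth. A quick check (variable names are ours):

```python
k, r, delta = 10, 1, 1          # configuration used in the evaluation

memory_overhead = r / k         # extra cache space from parity units
read_bw_overhead = delta / k    # extra traffic from additional reads

print(f"memory overhead = {memory_overhead:.0%}, BW overhead = {read_bw_overhead:.0%}")
# → memory overhead = 10%, BW overhead = 10%
```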
Load balancing

[Figure: data read (GB) per server, servers sorted by load, under Selective Replication and EC-Cache.]

• Percent imbalance metric:
  e.g., 5.5x at median for 100 MB objects, with an upward trend (§6.5).

• EC-Cache outperforms selective replication across a wide range of values of k, r, and Δ (§6.6).

6.1 Methodology

Cluster: Unless otherwise specified, our experiments use 55 c4.8xlarge EC2 instances. 25 of these machines act as the backend servers for EC-Cache, each with 8 GB cache space, and 30 machines generate thousands of read requests to EC-Cache. All machines were in the same Amazon Virtual Private Cloud (VPC) with 10 Gbps enhanced networking enabled; we observed around 4-5 Gbps bandwidth between machines in the VPC using iperf.

As mentioned earlier, we implemented EC-Cache on Alluxio [56], which, in turn, used Amazon S3 [2] as its persistence layer and runs on the 25 backend servers. We used DFS-Perf [5] to generate the workload on the 30 client machines.

Metrics: Our primary metrics for comparison are latency in reading objects and load imbalance across the backend servers.
Given a workload, we consider mean, median, and high-percentile latencies. We measure improvements in latency as:

    Latency Improvement = (Latency w/ Compared Scheme) / (Latency w/ EC-Cache)

If the value of this "latency improvement" is greater (or smaller) than one, EC-Cache is better (or worse).
We measure load imbalance using the percent imbalance metric λ, defined as follows:

    λ = ((Lmax − L*avg) / L*avg) × 100,    (1)

where Lmax is the load on the server which is maximally loaded and L*avg is the load on any server under an oracle scheme, where the total load is equally distributed among all the servers without any overhead. λ measures the percentage of additional load on the maximally loaded server as compared to the ideal average load. Because EC-Cache operates in the bandwidth-limited regime, the load on a server translates to the total amount of data read from that server. Lower values of λ are better. Note that the percent imbalance metric takes into account the additional load introduced by EC-Cache due to additional reads.
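The metric in Equation (1) is straightforward to compute from per-server loads; a small sketch (function name is ours):

```python
def percent_imbalance(loads):
    """λ = (Lmax - L*avg) / L*avg * 100, where L*avg is the per-server
    load under an oracle that spreads the total load evenly."""
    l_avg_star = sum(loads) / len(loads)
    return (max(loads) - l_avg_star) / l_avg_star * 100.0

# A server reading 4 GB against an even split of 8/3 GB is 50% over ideal.
print(round(percent_imbalance([4.0, 2.0, 2.0]), 6))  # → 50.0
```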
Setup: We consider a Zipf distribution for the popularity of objects, which is common in many real-world object popularity distributions [20, 23, 56]. Specifically, we consider the Zipf parameter to be 0.9 (that is, high skew).

Unless otherwise specified, we allow both selective replication and EC-Cache to use 15% memory overhead
[Figure: read latency (ms); caption not preserved in the transcript]
                        Mean  Median  95th  99th  99.9th
Selective Replication    245    238    286   435    1226
EC-Cache                  99     93    141   229     478

                        Mean  Median  95th  99th  99.9th
Selective Replication    242    238    283   340     881
EC-Cache                  96     90    134   193     492
Figure 8: Read latencies under skewed popularity of objects.
to handle the skew in the popularity of objects. Selective replication uses all the allowed memory overhead for handling popularity skew. Unless otherwise specified, EC-Cache uses k = 10 and Δ = 1. Thus, 10% of the allowed memory overhead is used to provide one parity to each object. The remaining 5% is used for handling popularity skew. Both schemes make use of the skew information to decide how to allocate the allowed memory among different objects in an identical manner: the number of replicas for an object under selective replication and the number of additional parities for an object under EC-Cache are calculated so as to flatten out the popularity skew to the extent possible, starting from the most popular object, until the memory budget is exhausted.

Moreover, both schemes use a uniform random placement policy to evenly distribute objects (splits in case of EC-Cache) across memory servers.

Unless otherwise specified, the size of each object considered in these experiments is 40 MB. We present results for varying object sizes observed in the Facebook trace in Section 6.5. In Section 6.6, we perform a sensitivity analysis with respect to all the above parameters.

Furthermore, we note that while the evaluations presented here are for the setting of high skew in object popularity, EC-Cache outperforms selective replication in scenarios with low skew in object popularity as well. Under high skew, EC-Cache offers significant benefits in terms of load balancing and read latency. Under low skew, while there is not much to improve in load balancing, EC-Cache will still provide latency benefits.

6.2 Skew Resilience

We begin by evaluating the performance of EC-Cache in the presence of skew in object popularity.

Latency Characteristics: Figure 8 compares the mean, median, and tail latencies of EC-Cache and selective replication. We observe that EC-Cache improves median and mean latencies by 2.64x and 2.52x, respectively. EC-Cache outperforms selective replication at high percentiles
Load balancing: results

• Percent imbalance metric: λ_SR = 43.45%, λ_EC = 13.14%
• > 3x reduction in load imbalance metric
Read latency

• Median: 2.64x improvement
• 99th and 99.9th percentiles: ~1.75x improvement

[Figure: read latency (ms), Selective Replication vs. EC-Cache. Mean: 242 vs. 96; Median: 238 vs. 90; 95th: 283 vs. 134; 99th: 340 vs. 193; 99.9th: 881 vs. 492.]
Varying object sizes

More improvement for larger object sizes

[Figure: median read latency (ms) vs. object size (10 to 90 MB), EC-Cache vs. Selective Replication; 5.5x improvement for 100 MB. 99th-percentile (tail) latency vs. object size; 3.85x improvement for 100 MB.]
Role of additional reads (Δ)

[Figure: CDF of read latency (ms) for EC-Cache with Δ = 0, EC-Cache with Δ = 1, and Selective Replication.]

Significant degradation in tail latency without additional reads (i.e., Δ = 0)
Additional evaluations in the paper

• With background network imbalance
• With server failures
• Write performance
• Sensitivity analysis for all parameters
Summary

• EC-Cache
  - Cluster cache employing erasure coding for load balancing and reducing read latencies
  - Demonstrates a new application and new goals for which erasure coding is highly effective
• Implementation on Alluxio
• Evaluation
  - Load balancing: > 3x improvement
  - Median latency: > 5x improvement
  - Tail latency: > 3x improvement

Thanks!