236601 - Coding and Algorithms for Memories Lecture 13

Upload: krista

Post on 22-Feb-2016

TRANSCRIPT

Page 1: 236601 - Coding and Algorithms  for  Memories Lecture 13

236601 - Coding and Algorithms for Memories

Lecture 13

Page 2: 236601 - Coding and Algorithms  for  Memories Lecture 13

Large Scale Storage Systems

• Big Data Players: Facebook, Amazon, Google, Yahoo,…

Cluster of machines running Hadoop at Yahoo! (Source: Yahoo!)

• Failures are the norm

Page 3: 236601 - Coding and Algorithms  for  Memories Lecture 13

Node failures at Facebook

[Figure: node failures at Facebook, plotted by date]

Source: M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, "XORing Elephants: Novel Erasure Codes for Big Data," VLDB 2013

Page 4: 236601 - Coding and Algorithms  for  Memories Lecture 13

Problem Setup
• Disks are stored together in a group (rack)
• Disk failures should be supported
• Requirements:
– Support as many disk failures as possible
– And yet…
• Optimal and fast recovery
• Low complexity

Page 5: 236601 - Coding and Algorithms  for  Memories Lecture 13

Reed Solomon Codes
• A code with parity check matrix of the form

[parity check matrix not preserved in this transcript]

where α is a primitive element of some extension field and O(α) > n-1

Claim: Every sub-matrix of size d×d has full rank
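The full-rank claim can be checked by brute force on a small instance. The sketch below is only an illustration under assumptions that are not on the slide: it takes the parity check matrix to be a Vandermonde matrix with d rows whose entry (i, j) is the primitive element raised to the power i·j, and it works in the prime field GF(7) with primitive element 3 instead of an extension field, so that plain modular arithmetic suffices. The names `vandermonde`, `rank_mod_p`, and `all_square_submatrices_full_rank` are hypothetical.

```python
from itertools import combinations

P = 7      # assumption: prime field GF(7), chosen to keep arithmetic simple
ALPHA = 3  # 3 has multiplicative order 6 modulo 7, so it is primitive


def vandermonde(d, n):
    """d x n matrix with entry (i, j) = ALPHA^(i*j) mod P (assumed form)."""
    return [[pow(ALPHA, i * j, P) for j in range(n)] for i in range(d)]


def rank_mod_p(matrix):
    """Rank over GF(P) by Gauss-Jordan elimination."""
    m = [row[:] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], P - 2, P)  # inverse via Fermat's little theorem
        m[rank] = [(x * inv) % P for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(x - factor * y) % P for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def all_square_submatrices_full_rank(d, n):
    """True if every choice of d columns gives a d x d matrix of rank d."""
    h = vandermonde(d, n)
    for cols in combinations(range(n), d):
        sub = [[row[j] for j in cols] for row in h]
        if rank_mod_p(sub) != d:
            return False
    return True


# n = 6 satisfies O(ALPHA) = 6 > n - 1 = 5, so the check should succeed
print(all_square_submatrices_full_rank(3, 6))
# n = 7 violates the condition: columns 0 and 6 coincide, so it should fail
print(all_square_submatrices_full_rank(3, 7))
```

The second call shows why the order condition matters in this toy setting: once n exceeds the order of the primitive element, two columns repeat and some square sub-matrix loses rank.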

Page 6: 236601 - Coding and Algorithms  for  Memories Lecture 13

Reed Solomon Codes
• Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
• Disadvantages:
– Require working over large fields
  Solution: EvenOdd Codes
– Need to read all the disks in order to recover even a single disk failure (inefficient rebuild)
  Solution: ZigZag Codes

Page 7: 236601 - Coding and Algorithms  for  Memories Lecture 13

The Repair Problem

[Figure: data blocks 1 to 10 and parity blocks P1 to P4]

• A disk is lost – Repair job starts

• Access, read, and transmit data of disks!

• Overuse of system resources during single repair

• Goal: Reduce repair cost in a single disk repair

• Facebook’s storage scheme (RS code):
– 10 data blocks
– 4 parity blocks
– Can tolerate any four disk failures

Page 8: 236601 - Coding and Algorithms  for  Memories Lecture 13

ZigZag Codes
• Designed by Itzhak Tamo, Zhiying Wang, and Jehoshua Bruck
• The goal: construct codes correcting the max number of erasures and yet allow efficient reconstruction if only a single drive fails

Page 9: 236601 - Coding and Algorithms  for  Memories Lecture 13

ZigZag Codes
• Lower bound: The min amount of data required to be read to recover a single drive failure
– (n,k) code: n drives, k information, and n-k redundancy
– M: size of a single drive in bits
• For an (n,n-2) code it is required to read at least 1/2 from the remaining drives, that is, at least (1/2)(n-1)M bits
– The last example is optimal
• In general, for an (n,n-r) code it is required to read at least 1/r from the remaining drives, that is, at least (1/r)(n-1)M bits
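To put numbers on the bound, the sketch below evaluates (1/r)(n-1)M and compares it with a repair that decodes from k = n-r whole surviving drives. The comparison baseline (reading k full drives) is an assumption about how a plain MDS repair would proceed, the drive size is measured in arbitrary units, and the function names are hypothetical.

```python
from fractions import Fraction


def repair_read_lower_bound(n, r, drive_size):
    """Lower bound on data read to rebuild one drive of an (n, n-r) code."""
    return Fraction(n - 1, r) * drive_size


def whole_drive_repair_read(n, r, drive_size):
    """Data read if the repair decodes from k = n - r complete drives."""
    return (n - r) * drive_size


M = 1  # size of one drive, in arbitrary units

# (4,2) code: the bound is half of the three remaining drives
print(repair_read_lower_bound(4, 2, M))    # 3/2

# 10 data blocks and 4 parity blocks, i.e. n = 14 and r = 4
print(repair_read_lower_bound(14, 4, M))   # 13/4
print(whole_drive_repair_read(14, 4, M))   # 10
```

For the 10+4 layout the bound is 3.25 drive sizes, against 10 drive sizes when ten complete blocks are read, which is the gap that repair-efficient codes aim to close.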

Page 10: 236601 - Coding and Algorithms  for  Memories Lecture 13

ZigZag Codes
• Example

info 1   info 2   info 3   Row parity   ZigZag parity

0 2 1 0
1 3 0 1
2 0 3 2
3 1 2 3

Page 11: 236601 - Coding and Algorithms  for  Memories Lecture 13

Network Coding for Distributed Storage
• Goal: show that, in general, for an (n,n-r) code it is required to read at least 1/r from the remaining drives, that is, at least (1/r)(n-1)M
• Network Coding for Distributed Storage, Dimakis, Godfrey, Wu, Wainwright, Ramchandran
• File of size M is partitioned into k pieces of size M/k
• The k pieces are encoded into n encoded pieces using an (n,k) MDS code

Page 12: 236601 - Coding and Algorithms  for  Memories Lecture 13

Network Coding for Distributed Storage

[Figure: pieces y1, y2 and encoded pieces x1, x2, x3, x4]

Page 13: 236601 - Coding and Algorithms  for  Memories Lecture 13

Network Coding for Distributed Storage

[Figure: pieces y1, y2; encoded pieces x1, x2, x3, x4; new node x5 with three incoming links labeled β (β = ?)]

Page 14: 236601 - Coding and Algorithms  for  Memories Lecture 13

Network Coding for Distributed Storage

[Figure: information flow graph with source S; nodes x1 to x5, each split into an "in" and an "out" vertex; α = 1 on each of x1 to x4; three links labeled β into x5 (β = ?); data collector DC]
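The value of β asked for in the figure can be worked out from the cut-set bound in the cited paper by Dimakis et al. The sketch below assumes the minimum-storage point of that bound, α = M/k and β = M/(k(d-k+1)), where M is the file size and d is the number of surviving nodes the new node contacts. This closed form is taken from that paper rather than from the slides, and `min_storage_repair` is a hypothetical helper name.

```python
from fractions import Fraction


def min_storage_repair(file_size, k, d):
    """Return (alpha, beta, total download) at the minimum-storage point.

    alpha: data stored per node, file_size / k
    beta:  data downloaded from each of the d helper nodes
    """
    alpha = Fraction(file_size, k)
    beta = Fraction(file_size, k * (d - k + 1))
    return alpha, beta, d * beta


# Setting of the figure: (4,2) code, alpha = 1 so the file has size 2,
# and the new node contacts d = 3 surviving nodes.
alpha, beta, total = min_storage_repair(file_size=2, k=2, d=3)
print(alpha, beta, total)  # 1 1/2 3/2

# An (n, n-r) code with d = n - 1 helpers and drive size 1 (file size k):
n, r = 14, 4
k = n - r
alpha, beta, total = min_storage_repair(file_size=k, k=k, d=n - 1)
print(total == Fraction(n - 1, r))  # True
```

With d = n-1 and k = n-r, each helper sends a 1/r fraction of what it stores, so the total download is (1/r)(n-1) drive sizes, which is the bound stated as the goal of this part of the lecture.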

Page 15: 236601 - Coding and Algorithms  for  Memories Lecture 13

ZigZag Codes
• Example

a   b   a+b   a+2d
c   d   c+d   c+b
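The example can be exercised directly. The sketch below reads the four columns as info 1 = (a, c), info 2 = (b, d), row parity = (a+b, c+d), and zigzag parity = (a+2d, c+b), and it assumes arithmetic modulo 3, since the coefficient 2 needs a field with more than two elements. The field choice and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

P = 3  # assumed field size: all arithmetic is modulo 3


def encode(a, b, c, d):
    """Return the four columns: info 1, info 2, row parity, zigzag parity."""
    return [
        (a, c),
        (b, d),
        ((a + b) % P, (c + d) % P),
        ((a + 2 * d) % P, (c + b) % P),
    ]


def rebuild_info1(b, row0, zig1):
    """Rebuild (a, c) from three symbols: b, a+b, and c+b."""
    return (row0 - b) % P, (zig1 - b) % P


def rebuild_info2(a, row0, zig0):
    """Rebuild (b, d) from three symbols: a, a+b, and a+2d."""
    inv2 = pow(2, P - 2, P)  # inverse of 2 modulo 3
    return (row0 - a) % P, ((zig0 - a) * inv2) % P


def survives_any_two_erasures():
    """Brute force: every pair of surviving columns determines the data."""
    for keep in combinations(range(4), 2):
        seen = set()
        for data in product(range(P), repeat=4):
            cols = encode(*data)
            seen.add(tuple(cols[i] for i in keep))
        if len(seen) != P ** 4:
            return False
    return True


cols = encode(1, 2, 0, 1)
print(cols)
# Lose info 1: read b, the first row parity, and the second zigzag parity
print(rebuild_info1(b=2, row0=cols[2][0], zig1=cols[3][1]))  # (1, 0)
print(survives_any_two_erasures())
```

Each single-column rebuild reads 3 of the 6 surviving symbols, that is, half of the remaining data, which matches the (1/2)(n-1)M bound for n = 4 with two symbols per drive.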