part8 ch3 raid
7/31/2019 Part8 Ch3 Raid
Copyright 2007 Koren & Krishna, Morgan-Kaufman Part.8.1
FAULT TOLERANT SYSTEMS
http://www.ecs.umass.edu/ece/koren/FaultTolerantSystems
Part 8 - RAID Systems
Chapter 3 - Information Redundancy

RAID - Redundant Arrays of Inexpensive (Independent) Disks
RAID1 - two mirrored disks
If one disk fails, the other can continue
If both work:
  Speeds up read accesses - divides them between the two disks
  Write accesses are slowed down
Computing reliability, availability, and MTTDL (mean time to data loss) of RAID1
[Figure: two mirrored disks, each storing the same bit column 1 0 0 1 0 0 1 1]

RAID1 - Reliability Calculation
Assumptions:
  disks fail independently
  failure process - Poisson process with rate $\lambda$
  repair time - exponential with mean time $1/\mu$
Markov chain: state - number of good disks
$$\frac{dP_2(t)}{dt} = -2\lambda P_2(t) + \mu P_1(t)$$
$$\frac{dP_1(t)}{dt} = -(\lambda+\mu) P_1(t) + 2\lambda P_2(t)$$
$$P_0(t) = 1 - P_1(t) - P_2(t)$$
$$P_2(0) = 1;\quad P_1(0) = P_0(0) = 0$$
Reliability at time t -
$$R(t) = P_1(t) + P_2(t) = 1 - P_0(t)$$
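
A minimal numerical sketch of this Markov chain, assuming transitions from state 2 to 1 at rate 2λ, from state 1 to 0 at rate λ, and from state 1 back to 2 at rate μ. The function names, the forward-Euler scheme, and the step count are illustrative choices, not part of the slides:

```python
import math

def raid1_reliability(lam, mu, t, steps=100_000):
    """Forward-Euler integration of the RAID1 reliability chain up to time t.

    Assumes the step t/steps is small compared with 1/(lam + mu).
    """
    p2, p1 = 1.0, 0.0                      # start with both disks good
    dt = t / steps
    for _ in range(steps):
        dp2 = -2.0 * lam * p2 + mu * p1
        dp1 = 2.0 * lam * p2 - (lam + mu) * p1
        p2 += dp2 * dt
        p1 += dp1 * dt
    return p1 + p2                         # R(t) = P1(t) + P2(t) = 1 - P0(t)

def no_repair_reliability(lam, t):
    """Closed form for mu = 0: at least one of two independent disks survives."""
    return 2.0 * math.exp(-lam * t) - math.exp(-2.0 * lam * t)
```

With `mu = 0` the integration can be checked against `no_repair_reliability`; any positive repair rate should give a higher value.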
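
RAID1 - MTTDL Calculation
Starting in state 2 at t=0 - mean time before entering state 1 = $1/(2\lambda)$
Mean time spent in state 1 is $1/(\lambda+\mu)$
Go back to state 2 with probability $q = \mu/(\lambda+\mu)$
or to state 0 with probability $p = \lambda/(\lambda+\mu)$
Probability of n visits to state 1 before transition to state 0 is $q^{n-1}p$
Mean time to enter state 0:
$$T_{2\to 0}(n) = n\left(\frac{1}{2\lambda} + \frac{1}{\lambda+\mu}\right) = n\,\frac{3\lambda+\mu}{2\lambda(\lambda+\mu)} = n\,T_{2\to 0}(1)$$
$$MTTDL = \sum_{n=1}^{\infty} q^{n-1} p\,T_{2\to 0}(n) = \sum_{n=1}^{\infty} n\,q^{n-1} p\,T_{2\to 0}(1) = \frac{T_{2\to 0}(1)}{p} = \frac{3\lambda+\mu}{2\lambda^2}$$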
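
The derivation can be sanity-checked numerically. The sketch below compares the closed form $(3\lambda+\mu)/(2\lambda^2)$ with a direct, truncated evaluation of the series over the number of visits to state 1. The function names and the truncation length are assumptions made for illustration:

```python
def raid1_mttdl(lam, mu):
    """Closed form: MTTDL = (3*lam + mu) / (2*lam**2)."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)

def raid1_mttdl_series(lam, mu, terms=10_000):
    """Truncated sum over n of q**(n-1) * p * n * T(1)."""
    p = lam / (lam + mu)                   # state 1 -> state 0 (data loss)
    q = mu / (lam + mu)                    # state 1 -> state 2 (repair)
    t_one = 1.0 / (2.0 * lam) + 1.0 / (lam + mu)
    total, weight = 0.0, p                 # weight holds q**(n-1) * p
    for n in range(1, terms + 1):
        total += weight * n * t_one
        weight *= q
    return total
```

The truncated series converges quickly only when μ is not much larger than λ. For realistic disks, where μ is far larger than λ, far more terms would be needed and the closed form is the practical choice.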

Approximate Reliability of RAID1
If $\mu \gg \lambda$, the transition rate into state 0 from the aggregate of states 1 and 2 is $1/MTTDL$
Approximate reliability:
$$R(t) = e^{-t/MTTDL}$$
[Plots: impact of disk lifetime; impact of disk repair time]
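
RAID1 - Availability Calculation
Markov chain: state - number of good disks
The solution to the differential equations:
$$P_2(t) = \frac{\mu^2}{(\lambda+\mu)^2} + \frac{2\lambda\mu}{(\lambda+\mu)^2}\,e^{-(\lambda+\mu)t} + \frac{\lambda^2}{(\lambda+\mu)^2}\,e^{-2(\lambda+\mu)t}$$
$$P_1(t) = \frac{2\lambda\mu}{(\lambda+\mu)^2} + \frac{2\lambda(\lambda-\mu)}{(\lambda+\mu)^2}\,e^{-(\lambda+\mu)t} - \frac{2\lambda^2}{(\lambda+\mu)^2}\,e^{-2(\lambda+\mu)t}$$
$$P_0(t) = 1 - P_1(t) - P_2(t)$$
Long-term availability -
$$A = P_2 + P_1 = 1 - P_0 = \frac{\mu(\mu+2\lambda)}{(\lambda+\mu)^2} = 1 - \frac{\lambda^2}{(\lambda+\mu)^2}$$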
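
A short sketch that evaluates the state probabilities and the long-term availability, assuming the closed forms $P_2(t) = [\mu^2 + 2\lambda\mu e^{-(\lambda+\mu)t} + \lambda^2 e^{-2(\lambda+\mu)t}]/(\lambda+\mu)^2$, $P_1(t) = [2\lambda\mu + 2\lambda(\lambda-\mu)e^{-(\lambda+\mu)t} - 2\lambda^2 e^{-2(\lambda+\mu)t}]/(\lambda+\mu)^2$ and $A = \mu(\mu+2\lambda)/(\lambda+\mu)^2$. The function names are illustrative:

```python
import math

def raid1_state_probabilities(lam, mu, t):
    """Return (P2, P1, P0) at time t, starting with both disks good."""
    s = lam + mu
    e1 = math.exp(-s * t)
    e2 = math.exp(-2.0 * s * t)
    p2 = (mu ** 2 + 2.0 * lam * mu * e1 + lam ** 2 * e2) / s ** 2
    p1 = (2.0 * lam * mu + 2.0 * lam * (lam - mu) * e1
          - 2.0 * lam ** 2 * e2) / s ** 2
    return p2, p1, 1.0 - p1 - p2

def raid1_long_term_availability(lam, mu):
    """A = P2 + P1 as t grows large: mu * (mu + 2*lam) / (lam + mu)**2."""
    return mu * (mu + 2.0 * lam) / (lam + mu) ** 2
```

These expressions agree with treating each disk as an independent two-state (up/down) process, which gives a quick cross-check: $P_2 + P_1$ at large t should approach `raid1_long_term_availability`.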

RAID2
A bank of data disks in parallel with Hamming-coded disks
d data disks and c code disks
i-th bit of each disk - bit of a (c+d)-bit word
From Hamming code theory - to permit the correction of one bit per word -
$$2^c \ge c + d + 1$$
We will not spend more time on RAID2 because other RAID designs impose much less overhead
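
To make the overhead concrete, the sketch below finds the smallest number of code disks c satisfying $2^c \ge c + d + 1$ for a given number of data disks d. The function name is an illustrative choice:

```python
def min_code_disks(d):
    """Smallest c with 2**c >= c + d + 1 (single-bit correction per word)."""
    c = 1
    while 2 ** c < c + d + 1:
        c += 1
    return c

# Code disks needed for a few hypothetical array sizes
overhead = {d: min_code_disks(d) for d in (1, 4, 11, 12)}
```

For example, 4 data disks need 3 code disks, whereas a single parity disk suffices in the parity-based designs discussed next.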

RAID3
Modification of RAID2
Observation - each disk has error-detection coding per sector - a bad sector can be identified
Bank of d data disks together with one parity disk
Data are bit-interleaved across the data disks
The i-th position of the parity disk contains the parity bit associated with the bits in the i-th position of each of the data disks

Error Detection and Correction in RAID3
i-th bits of each disk form a (d+1)-bit word - d data bits and 1 parity bit
If the j-th bit in the word is incorrect - the sector error-detecting code in the j-th disk will indicate a failure - the fault will be located - the remaining bits can be used to restore the faulty bit
Example: word - 11100; data bits - 1110; parity bit - 0
If even parity is being used - a bit is in error
Third disk indicates an error in the relevant sector and the other disks show no such errors
The correct word is 11000
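
The restoration step can be sketched as follows, assuming even parity and that the failed disk's position is already known from its sector-level error detection. The function name and the list-of-bits representation are illustrative:

```python
def restore_bit(word, failed_index):
    """Rebuild the bit at failed_index from the remaining bits (even parity).

    word is a list of d data bits followed by one parity bit.
    """
    others = [bit for i, bit in enumerate(word) if i != failed_index]
    fixed = list(word)
    fixed[failed_index] = sum(others) % 2   # makes the total number of 1s even
    return fixed

# The example from the slide: word 11100, third disk reports the error
corrected = restore_bit([1, 1, 1, 0, 0], failed_index=2)
```

Here `corrected` is `[1, 1, 0, 0, 0]`, matching the corrected word 11000.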
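
Reliability of RAID3
Similar analysis to RAID1: (d+1) disks instead of 2
System fails (data loss) if two or more disks fail
[Figure: Markov chain with states d+1, d, and F (data loss); failure transitions at rates $(d+1)\lambda$ and $d\lambda$, repair at rate $\mu$]
Mean time to data loss for this group is
$$MTTDL = \frac{(2d+1)\lambda + \mu}{d(d+1)\lambda^2}$$
The reliability is given approximately by
$$R(t) = e^{-t/MTTDL}$$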
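
A small sketch of these two formulas, assuming $MTTDL = [(2d+1)\lambda + \mu] / [d(d+1)\lambda^2]$ and $R(t) \approx e^{-t/MTTDL}$. The function names are illustrative, and the 24-hour mean repair time in the usage lines is a hypothetical value:

```python
import math

def raid3_mttdl(d, lam, mu):
    """Mean time to data loss for d data disks plus one parity disk."""
    return ((2 * d + 1) * lam + mu) / (d * (d + 1) * lam ** 2)

def approx_reliability(t, mttdl):
    """Exponential approximation R(t) = exp(-t / MTTDL)."""
    return math.exp(-t / mttdl)

lam = 1.0 / 500_000        # mean disk lifetime of 500,000 hours
mu = 1.0 / 24              # hypothetical mean repair time of 24 hours
mttdl_by_d = {d: raid3_mttdl(d, lam, mu) for d in (1, 2, 4, 8)}
```

With d = 1 the expression reduces to the RAID1 result $(3\lambda+\mu)/(2\lambda^2)$, and the values in `mttdl_by_d` shrink as d grows.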

Comparing Different RAID3 Systems
Unreliability of RAID3 systems for different values of d - mean lifetime of a single disk is 500,000 hours
The d=1 case - identical to RAID1
Reliability goes down as d increases

RAID4
Similar to RAID3 - but the unit of interleaving is a block of arbitrary size - a stripe
Advantage - a small read may be contained in one single data disk, rather than interleaved over all disks
Small read operations are faster in RAID4
Similarly for small write operations
Write - the affected data disk and the parity disk must be updated
Parity update is simple - a parity bit toggles if the data bit being written is different from the one being overwritten
Reliability model for RAID4 - identical to RAID3
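
The parity-toggle rule amounts to an XOR of the old parity with the old and new data. A minimal sketch, with blocks modeled as small integers and all names and values hypothetical:

```python
from functools import reduce
from operator import xor

def full_parity(blocks):
    """Parity block of a stripe: bitwise XOR of all data blocks."""
    return reduce(xor, blocks, 0)

def updated_parity(old_parity, old_data, new_data):
    """Small-write update: flip exactly the parity bits whose data bit changed."""
    return old_parity ^ (old_data ^ new_data)

stripe = [0b1010, 0b0110, 0b1111]   # hypothetical data blocks, one per data disk
parity = full_parity(stripe)

new_block = 0b0001                  # overwrite the block on the second data disk
new_parity = updated_parity(parity, stripe[1], new_block)
stripe[1] = new_block
```

Only the affected data disk and the parity disk are touched, yet `new_parity` equals the parity recomputed from the whole stripe.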

RAID5
Observation - parity disk can be the system bottleneck
In RAID4 - parity disk accessed in each write
In RAID5 - parity blocks interleaved among disks
Every disk has some data blocks and some parity blocks
Reliability model for RAID5 same as for RAID4
Only the performance model is different
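
One way to picture the interleaving is a rotating parity placement. The slides do not specify a particular layout, so the rotation rule below is an assumption chosen purely for illustration, as are the function names:

```python
def parity_disk(stripe_index, n_disks):
    """Disk holding the parity block of a stripe (assumed rotating placement)."""
    return (n_disks - 1 - stripe_index) % n_disks

def layout(n_stripes, n_disks):
    """One row per stripe: 'P' marks the parity block, 'D' a data block."""
    rows = []
    for s in range(n_stripes):
        p = parity_disk(s, n_disks)
        rows.append(["P" if disk == p else "D" for disk in range(n_disks)])
    return rows
```

Over any run of `n_disks` consecutive stripes, each disk holds the parity block exactly once, so parity writes are spread over all disks instead of concentrating on one.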

Modeling Correlated Failures
We assumed until now that disks are independent with respect to failures
Disk failures may be correlated - power supply and control are typically shared among multiple disks
Disk systems are usually made up of strings - consisting of disks that share power supply, cabling, cooling, and a controller
If any of these shared support items fail, the entire string can fail
If the string constitutes the RAID group - data loss can occur

Approximate Reliability of String
$\lambda_{str}$ - failure rate of the support elements (power, cabling, cooling, control) of a string
$\lambda_{indep}$ - approximate failure rate due to independent failures
If a RAID group is controlled by a single string - the aggregate failure rate of the group is
$$\lambda_{total} = \lambda_{indep} + \lambda_{str}$$
And the reliability is
$$R_{total}(t) = e^{-\lambda_{total}\,t}$$
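
A rough numerical sketch of the effect, assuming $\lambda_{total} = \lambda_{indep} + \lambda_{str}$ and taking $\lambda_{indep} \approx 1/MTTDL$ of a RAID1 pair. The disk lifetime, repair time, string lifetime, and mission time below are hypothetical inputs, and the function names are illustrative:

```python
import math

def raid1_mttdl(lam, mu):
    """RAID1 mean time to data loss: (3*lam + mu) / (2*lam**2)."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)

def unreliability(rate, t):
    """1 - exp(-rate * t), computed accurately for very small rate * t."""
    return -math.expm1(-rate * t)

lam_disk = 1.0 / 1_000_000      # hypothetical mean disk lifetime, hours
mu = 1.0 / 24                   # hypothetical mean repair time of 24 hours
lam_str = 1.0 / 150_000         # hypothetical mean string lifetime, hours
t = 10 * 8760                   # ten years of operation, in hours

lam_indep = 1.0 / raid1_mttdl(lam_disk, mu)
without_string = unreliability(lam_indep, t)
with_string = unreliability(lam_indep + lam_str, t)
```

Under these assumptions `with_string` exceeds `without_string` by several orders of magnitude, because the string failure rate dominates the aggregate rate.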

Impact of String Failures on RAID1
Similar results for RAID3 and higher levels
Figures of 150,000 hours for the mean string lifetime have been quoted in the literature
At least one manufacturer claims mean disk lifetimes of 1,000,000 hours
Grouping an entire RAID array as a single string increases unreliability by orders of magnitude
[Plot; label: Mean String Lifetime]

Orthogonal Arrangement of Strings and RAID Groups
Failure of a string affects only one disk in each RAID group
Since each RAID group can tolerate the failure of up to one disk, this reduces the impact of string failures
Data loss will happen only if any RAID group has at least two disks down at the same time
[Figure labels: String; RAID group]

Approximate Modeling of Orthogonal Systems
Data loss is caused by a sequence of events
A failure can be triggered by an individual disk failure or by a string failure - very low failure rates
We will find the (approximate) failure rate due to each - $\Lambda_{indiv}$; $\Lambda_{str}$
Sum of these two failure rates - the approximate overall failure rate -
$$\Lambda_{data\_loss} = \Lambda_{indiv} + \Lambda_{str}$$
It can then be used to approximately determine MTTDL - mean time to data loss, and reliability - probability of no data loss over any given period of time

Orthogonal Arrangement - Notations
d+1 strings, g RAID groups - total of (d+1)g disks
$f_{disk}(t)$ - density function of the disk repair time
$\lambda_{disk}$ - failure rate of a single disk
$\pi_{indiv}$ - probability that a given individual failure triggers data loss
Approximate rate (per disk) at which individual failures trigger data loss - $\lambda_{disk}\,\pi_{indiv}$
$\pi_{indiv}$ = probability that a second disk fails in the affected RAID group while the first failure is not yet repaired
This second failure has the rate $d(\lambda_{disk} + \lambda_{str})$ - the second disk failure can happen either due to an individual disk or string failure
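
Calculating Failure Rates - $\Lambda_{indiv}$
Conditioning on $\tau$ - repair time of first disk failure
$$\mathrm{Prob}\{\text{Data Loss} \mid \text{repair takes } \tau\} = 1 - e^{-d(\lambda_{disk}+\lambda_{str})\tau}$$
Unconditional probability of data loss -
$$\pi_{indiv} = \int_0^\infty \left(1 - e^{-d(\lambda_{disk}+\lambda_{str})\tau}\right) f_{disk}(\tau)\,d\tau = 1 - F^*_{disk}\big(d(\lambda_{disk}+\lambda_{str})\big)$$
$F^*_{disk}(\cdot)$ - the Laplace transform of $f_{disk}(\cdot)$
Approximate rate at which data loss is triggered by individual disk failure -
$$\Lambda_{indiv} = (d+1)\,g\,\lambda_{disk}\,\pi_{indiv}$$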
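
As a concrete instance, suppose the disk repair time is exponential with rate μ. This is an assumption made here for illustration, since the slides leave the repair-time density general. Its Laplace transform is $\mu/(\mu+s)$, so the probability that an individual failure triggers data loss becomes $s/(s+\mu)$ with $s = d(\lambda_{disk}+\lambda_{str})$. The sketch checks this against direct numerical integration of the conditional probability; the function names are hypothetical:

```python
import math

def pi_indiv_closed_form(d, lam_disk, lam_str, mu):
    """1 - F*(s) for an exponential repair density, with s = d*(lam_disk + lam_str)."""
    s = d * (lam_disk + lam_str)
    return s / (s + mu)                    # equals 1 - mu / (mu + s)

def pi_indiv_numeric(d, lam_disk, lam_str, mu, steps=200_000):
    """Midpoint-rule integral of (1 - exp(-s*tau)) * mu * exp(-mu*tau)."""
    s = d * (lam_disk + lam_str)
    upper = 50.0 / mu                      # truncate far into the tail
    h = upper / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += (1.0 - math.exp(-s * tau)) * mu * math.exp(-mu * tau)
    return total * h

def data_loss_rate_indiv(d, g, lam_disk, pi_indiv):
    """(d+1)*g disks, each triggering data loss at rate lam_disk * pi_indiv."""
    return (d + 1) * g * lam_disk * pi_indiv
```

The two evaluations of the probability should agree to several decimal places, which is a useful check before substituting a different repair-time density.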

Calculating Failure Rates - $\Lambda_{str}$
Total rate of string failures: $(d+1)\lambda_{str}$
When a string fails - we repair the string, and any individual disks affected by this string failure
Pessimistic assumption - a second failure can happen at any group or disk before all groups are fully restored
Example: arrival of a second string failure to the same string before the first failure has been repaired
Optimistic assumption - disks affected by string failure are immune to further failures before string and affected disks are fully restored
The difference between the two resulting failure rates determines how tight the bounds are

Pessimistic Calculation
$\tau$ - (random) time taken to repair the failed string and all disks affected by it
$f_{str}(\cdot)$ - probability density function of $\tau$
$F^*_{str}(\cdot)$ - Laplace transform of $f_{str}(\cdot)$
Pessimistic assumption - rate of additional failures
$$\lambda_{pess} = (d+1)\lambda_{str} + (d+1)\,g\,\lambda_{disk}$$
Conditioning upon $\tau$ - the probability of data loss
$$p_{pess} = 1 - e^{-\lambda_{pess}\tau}$$
Integrating on $\tau$ - unconditional pessimistic probability of data loss
$$\pi_{pess} = 1 - F^*_{str}(\lambda_{pess})$$

Optimistic Calculation
Optimistic assumption - rate of additional failures
$$\lambda_{opt} = d\,\lambda_{str} + d\,g\,\lambda_{disk}$$
Conditioning upon $\tau$ - the probability of data loss is
$$p_{opt} = 1 - e^{-\lambda_{opt}\tau}$$
Integrating on $\tau$ - unconditional optimistic probability of data loss
$$\pi_{opt} = 1 - F^*_{str}(\lambda_{opt})$$
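
Reliability of Orthogonal System
Rate of string failures triggering data loss
$$\Lambda_{str} = (d+1)\,\lambda_{str}\,\pi_{pess} \quad (\text{or } \pi_{opt})$$
Approximate rate of data loss in the system -
$$\Lambda_{data\_loss} \approx \Lambda_{indiv} + \Lambda_{str}$$
Mean Time To Data Loss -
$$MTTDL \approx \frac{1}{\Lambda_{data\_loss}}$$
System reliability -
$$R(t) \approx e^{-\Lambda_{data\_loss}\,t}$$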