part8 ch3 raid

Upload: himanshuagra

Post on 04-Apr-2018


  • 7/31/2019 Part8 Ch3 Raid


    Copyright 2007 Koren & Krishna, Morgan-Kaufman    Part.8.1

    FAULT TOLERANT SYSTEMS

    http://www.ecs.umass.edu/ece/koren/FaultTolerantSystems

    Part 8 - RAID Systems

    Chapter 3 - Information Redundancy


    RAID - Redundant Arrays of Inexpensive (Independent) Disks

    RAID1 - two mirrored disks

    If one disk fails, the other can continue

    If both work:

    speeds up read accesses - divides them among the two disks

    Write accesses are slowed down

    Computing reliability, availability, and MTTDL (mean time to data loss) of RAID1

    [Figure: two mirrored disks, each holding the same bit pattern 10010011]


    RAID1 - Reliability Calculation

    Assumptions:

    disks fail independently

    failure process - Poisson process with rate $\lambda$

    repair time - exponential with mean time $1/\mu$

    Markov chain: state - number of good disks

    Reliability at time t - $R(t)$

    $$\frac{dP_2(t)}{dt} = -2\lambda P_2(t) + \mu P_1(t)$$

    $$\frac{dP_1(t)}{dt} = -(\lambda+\mu) P_1(t) + 2\lambda P_2(t)$$

    $$P_0(t) = 1 - P_1(t) - P_2(t); \quad P_2(0) = 1; \quad P_0(0) = P_1(0) = 0$$

    $$R(t) = P_1(t) + P_2(t) = 1 - P_0(t)$$
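
The state equations of this slide can be integrated numerically. The following is a minimal sketch, not the authors' method: it uses a hand-written classical Runge-Kutta (RK4) step, and the function names, step count and sample rates are assumptions made for illustration. The no-repair closed form is included only as a sanity check.

```python
import math


def raid1_state_probabilities(lam, mu, t_end, steps=10_000):
    """Integrate the RAID1 reliability equations with classical RK4.

    State 0 (no good disks) is absorbing. Returns (P2, P1, P0) at t_end,
    starting from P2(0) = 1.
    """
    def deriv(p2, p1):
        dp2 = -2.0 * lam * p2 + mu * p1
        dp1 = -(lam + mu) * p1 + 2.0 * lam * p2
        return dp2, dp1

    h = t_end / steps
    p2, p1 = 1.0, 0.0
    for _ in range(steps):
        a2, a1 = deriv(p2, p1)
        b2, b1 = deriv(p2 + 0.5 * h * a2, p1 + 0.5 * h * a1)
        c2, c1 = deriv(p2 + 0.5 * h * b2, p1 + 0.5 * h * b1)
        e2, e1 = deriv(p2 + h * c2, p1 + h * c1)
        p2 += h * (a2 + 2.0 * b2 + 2.0 * c2 + e2) / 6.0
        p1 += h * (a1 + 2.0 * b1 + 2.0 * c1 + e1) / 6.0
    return p2, p1, 1.0 - p1 - p2


def raid1_reliability(lam, mu, t_end, steps=10_000):
    """R(t) = P1(t) + P2(t)."""
    p2, p1, _ = raid1_state_probabilities(lam, mu, t_end, steps)
    return p2 + p1


def no_repair_closed_form(lam, t):
    """Reliability of two independent disks without repair (mu = 0)."""
    return 2.0 * math.exp(-lam * t) - math.exp(-2.0 * lam * t)


if __name__ == "__main__":
    # Hypothetical rates, chosen only to make the effect of repair visible.
    print(raid1_reliability(0.01, 0.0, 100.0))
    print(raid1_reliability(0.01, 1.0, 100.0))
```

The step size should stay well below $1/\mu$; with realistic disk parameters ($\mu \gg \lambda$) the system is stiff and a finer grid or an implicit solver may be the better choice.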


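    RAID1 - MTTDL Calculation

    Starting in state 2 at t=0 - mean time before entering state 1 = $1/(2\lambda)$

    Mean time spent in state 1 is $1/(\lambda+\mu)$

    Go back to state 2 with probability $q = \mu/(\lambda+\mu)$

    or to state 0 with probability $p = \lambda/(\lambda+\mu)$

    Probability of n visits to state 1 before transition to state 0 is $q^{n-1}p$

    Mean time to enter state 0 with n visits to state 1:

    $$T_{2\to 0}(n) = n\left(\frac{1}{2\lambda} + \frac{1}{\lambda+\mu}\right) = n\,\frac{3\lambda+\mu}{2\lambda(\lambda+\mu)}$$

    $$MTTDL = \sum_{n=1}^{\infty} q^{n-1} p\, T_{2\to 0}(n) = \sum_{n=1}^{\infty} n\, q^{n-1} p\, T_{2\to 0}(1) = \frac{T_{2\to 0}(1)}{p} = \frac{3\lambda+\mu}{2\lambda^{2}}$$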
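
The closed form can be checked against a truncated version of the sum over visits to state 1. This is a small sketch; the function names and the number of terms are assumptions, and the truncation only converges quickly when $\mu$ is not much larger than $\lambda$.

```python
def mttdl_raid1(lam, mu):
    """Closed form: (3*lam + mu) / (2*lam**2)."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def mttdl_raid1_series(lam, mu, terms=10_000):
    """Truncated sum of q**(n-1) * p * n * T(1) over n = 1..terms."""
    p = lam / (lam + mu)          # probability of moving from state 1 to state 0
    q = mu / (lam + mu)           # probability of returning to state 2
    t1 = 1.0 / (2.0 * lam) + 1.0 / (lam + mu)   # mean time of one 2 -> 1 cycle
    total = 0.0
    weight = p                    # q**(n-1) * p for n = 1
    for n in range(1, terms + 1):
        total += weight * n * t1
        weight *= q
    return total


if __name__ == "__main__":
    # Hypothetical rates; both lines should print (almost) the same value.
    print(mttdl_raid1(1.0, 3.0))
    print(mttdl_raid1_series(1.0, 3.0))
```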


    Approximate Reliability of RAID1

    If $\mu \gg \lambda$, the transition rate into state 0 from the aggregate of states 1 and 2 is $1/MTTDL$

    Approximate reliability:

    $$R(t) = e^{-t/MTTDL}$$

    [Figures: Impact of Disk Lifetime; Impact of Disk Repair Time]
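
The effect of disk lifetime and repair time on the approximation can be explored with a few lines of code. The sketch below assumes $\lambda = 1/\text{(mean disk lifetime)}$ and $\mu = 1/\text{(mean repair time)}$; the sample lifetimes, repair times and the 10-year horizon are illustrative values, not taken from the slides.

```python
import math


def raid1_mttdl(disk_mttf, repair_time):
    """MTTDL = (3*lam + mu) / (2*lam**2) with lam = 1/MTTF, mu = 1/repair time."""
    lam = 1.0 / disk_mttf
    mu = 1.0 / repair_time
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def raid1_unreliability(t, disk_mttf, repair_time):
    """1 - exp(-t/MTTDL); expm1 keeps precision when the result is tiny."""
    return -math.expm1(-t / raid1_mttdl(disk_mttf, repair_time))


if __name__ == "__main__":
    horizon = 10 * 8760.0  # ten years, in hours (illustrative)
    for mttf in (100_000.0, 500_000.0, 1_000_000.0):
        for repair in (1.0, 24.0):
            u = raid1_unreliability(horizon, mttf, repair)
            print(f"MTTF={mttf:>9.0f} h  repair={repair:>4.0f} h  1-R={u:.3e}")
```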


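    RAID1 - Availability Calculation

    Markov chain: state - number of good disks

    The solution to the differential equations:

    $$P_2(t) = \frac{\mu^{2}}{(\lambda+\mu)^{2}} + \frac{2\lambda\mu}{(\lambda+\mu)^{2}}\, e^{-(\lambda+\mu)t} + \frac{\lambda^{2}}{(\lambda+\mu)^{2}}\, e^{-2(\lambda+\mu)t}$$

    $$P_1(t) = \frac{2\lambda\mu}{(\lambda+\mu)^{2}} + \frac{2\lambda(\lambda-\mu)}{(\lambda+\mu)^{2}}\, e^{-(\lambda+\mu)t} - \frac{2\lambda^{2}}{(\lambda+\mu)^{2}}\, e^{-2(\lambda+\mu)t}$$

    $$P_0(t) = 1 - P_1(t) - P_2(t)$$

    Long-term availability -

    $$A = P_2 + P_1 = 1 - P_0 = \frac{\mu^{2}+2\lambda\mu}{(\lambda+\mu)^{2}} = 1 - \frac{\lambda^{2}}{(\lambda+\mu)^{2}}$$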
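
The availability expressions can be evaluated directly. The sketch below assumes the closed forms that correspond to two disks failing and being repaired independently of each other (so that $P_2(t)$ is the square of a single disk's availability); that reading, the function names and the sample rates are assumptions made for illustration.

```python
import math


def raid1_availability_probs(lam, mu, t):
    """Return (P2, P1, P0) at time t, starting with both disks good."""
    s = lam + mu
    x = math.exp(-s * t)
    p2 = (mu ** 2 + 2.0 * lam * mu * x + lam ** 2 * x * x) / s ** 2
    p1 = (2.0 * lam * mu + 2.0 * lam * (lam - mu) * x
          - 2.0 * lam ** 2 * x * x) / s ** 2
    return p2, p1, 1.0 - p1 - p2


def long_term_availability(lam, mu):
    """A = 1 - lam**2 / (lam + mu)**2."""
    return 1.0 - (lam / (lam + mu)) ** 2


if __name__ == "__main__":
    # Hypothetical rates: one failure per 10 time units, one repair per unit.
    for t in (0.0, 1.0, 10.0, 100.0):
        p2, p1, p0 = raid1_availability_probs(0.1, 1.0, t)
        print(t, p2 + p1)
    print(long_term_availability(0.1, 1.0))
```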


    RAID2

    A bank of data disks in parallel with Hamming-coded disks

    d data disks and c code disks

    i-th bit of each disk - bit of a (c+d)-bit word

    From Hamming codes theory - to permit the correction of one bit per word -

    $$2^{c} \ge c + d + 1$$

    We will not spend more time on RAID2 because other RAID designs impose much less overhead
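
The Hamming bound gives the smallest number of code disks for a given number of data disks. A short sketch (the function name is hypothetical) that searches for the minimal c:

```python
def min_code_disks(d):
    """Smallest c with 2**c >= c + d + 1 (single-bit correction per word)."""
    if d < 1:
        raise ValueError("need at least one data disk")
    c = 1
    while 2 ** c < c + d + 1:
        c += 1
    return c


if __name__ == "__main__":
    for d in (1, 4, 11, 26, 57):
        c = min_code_disks(d)
        print(f"d={d:>2}  c={c}  overhead={c / d:.2f}")
```

For comparison, a single parity disk, as used in the RAID levels that follow, has an overhead of 1/d.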


    RAID3

    Modification of RAID2

    Observation - each disk has error-detection coding per sector - a bad sector can be identified

    Bank of d data disks together with one parity disk

    Data are bit-interleaved across the data disks

    The i-th position of the parity disk contains the parity bit associated with the bits in the i-th position of each of the data disks


    Error Detection and Correction in RAID3

    i-th bits of each disk form a (d+1)-bit word - d data bits and 1 parity bit

    If j-th bit in word is incorrect - sector error-detecting code in j-th disk will indicate a failure - fault will be located - remaining bits can be used to restore the faulty bit

    Example: word - 11100; data bits - 1110; parity bit - 0

    If even parity is being used - a bit is in error

    Third disk indicates an error in the relevant sector and the other disks show no such errors

    The correct word is 11000
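
The example on this slide can be reproduced in a few lines. The sketch below assumes even parity and that the failed position is already known from the per-sector error-detecting code; the function name is hypothetical.

```python
def restore_bit(word, failed_index):
    """Recompute the bit at failed_index so that the word has even parity.

    word is a list of 0/1 values (d data bits followed by 1 parity bit).
    """
    others = [b for i, b in enumerate(word) if i != failed_index]
    restored = sum(others) % 2      # XOR of all remaining bits
    fixed = list(word)
    fixed[failed_index] = restored
    return fixed


if __name__ == "__main__":
    # Word 11100 with the third disk (index 2) flagged as faulty.
    print(restore_bit([1, 1, 1, 0, 0], 2))
```

The same routine restores a lost parity bit, since the parity position is treated like any other position in the word.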


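    Reliability of RAID3

    Similar analysis to RAID1: (d+1) disks instead of 2

    System fails (data loss) if two or more disks fail

    [Figure: Markov chain with states d+1, d and F (data loss); failure transitions $(d+1)\lambda$ from d+1 to d and $d\lambda$ from d to F]

    Mean time to data loss for this group is

    $$MTTDL = \frac{(2d+1)\lambda + \mu}{d(d+1)\lambda^{2}}$$

    The reliability is given approximately by

    $$R(t) = e^{-t/MTTDL}$$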
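
The MTTDL expression follows from the same cycle argument used for RAID1. The derivation below is a sketch that assumes the same exponential repair rate $\mu$ from the one-disk-down state back to the all-good state; $T(1)$ denotes the mean time of one cycle and $p$ the probability that a cycle ends in data loss.

```latex
\begin{aligned}
T(1) &= \frac{1}{(d+1)\lambda} + \frac{1}{d\lambda+\mu}
      = \frac{(2d+1)\lambda+\mu}{(d+1)\lambda\,(d\lambda+\mu)}, \\
p    &= \frac{d\lambda}{d\lambda+\mu}, \\
MTTDL &= \frac{T(1)}{p}
      = \frac{(2d+1)\lambda+\mu}{d(d+1)\lambda^{2}} .
\end{aligned}
```

Setting $d = 1$ gives $(3\lambda+\mu)/(2\lambda^{2})$, the RAID1 result.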


    Comparing Different RAID3 Systems

    Unreliability of RAID3 systems for different values of d - mean lifetime of a single disk is 500,000 hours

    The d=1 case - identical to RAID1

    Reliability goes down as d increases
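
The trend can be tabulated from the approximate reliability formula. In the sketch below the 500,000-hour disk lifetime is the value quoted on the slide, while the 1-hour mean repair time and the 10-year horizon are assumptions chosen only for illustration.

```python
import math


def raid3_mttdl(d, lam, mu):
    """MTTDL = ((2d+1)*lam + mu) / (d*(d+1)*lam**2)."""
    return ((2 * d + 1) * lam + mu) / (d * (d + 1) * lam ** 2)


def raid3_unreliability(d, t, disk_mttf=500_000.0, repair_time=1.0):
    """1 - exp(-t/MTTDL); repair_time is an assumed value, not from the slides."""
    lam = 1.0 / disk_mttf
    mu = 1.0 / repair_time
    return -math.expm1(-t / raid3_mttdl(d, lam, mu))


if __name__ == "__main__":
    horizon = 10 * 8760.0  # ten years, in hours (illustrative)
    for d in (1, 2, 4, 8, 16):
        print(f"d={d:>2}  1-R={raid3_unreliability(d, horizon):.3e}")
```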


    RAID4

    Similar to RAID3 - but the unit of interleaving is a block of arbitrary size - a stripe

    Advantage - a small read may be contained in one single data disk, rather than interleaved over all disks

    Small read operations are faster in RAID4

    Similarly for small write operations

    Write - affected data disk and parity disk must be updated

    Parity update simple - parity bit toggles if data bit being written is different from one being overwritten

    Reliability model for RAID4 - identical to RAID3
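
The parity-toggle rule for small writes amounts to an XOR of the old parity with the old and new data. A minimal sketch, with blocks modelled as Python integers and hypothetical function names:

```python
from functools import reduce
from operator import xor


def full_parity(blocks):
    """Parity block computed from scratch: XOR of all data blocks."""
    return reduce(xor, blocks, 0)


def updated_parity(old_parity, old_data, new_data):
    """Small-write update: toggle exactly the parity bits whose data bits change."""
    return old_parity ^ old_data ^ new_data


if __name__ == "__main__":
    blocks = [0b1010, 0b0110, 0b1111]      # one stripe across three data disks
    parity = full_parity(blocks)
    new_block = 0b0001
    parity = updated_parity(parity, blocks[1], new_block)
    blocks[1] = new_block
    print(parity == full_parity(blocks))   # only two disks were touched
```

The update reads and writes only the affected data disk and the parity disk, which is why small writes do not need the whole stripe.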


    RAID5

    Observation - parity disk can be system bottleneck

    In RAID4 - parity disk accessed in each write

    In RAID5 - parity blocks interleaved among disks

    Every disk has some data blocks and some parity blocks

    Reliability model for RAID5 same as for RAID4

    Only the performance model is different
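
One way to picture the interleaving of parity blocks is a simple rotation. The slides do not specify a placement rule, so the rotation below is a hypothetical layout used only to show that every disk ends up with both data and parity blocks.

```python
def parity_disk(stripe, n_disks):
    """Hypothetical rotation: parity moves one disk to the left per stripe."""
    return (n_disks - 1 - stripe) % n_disks


def stripe_layout(stripe, n_disks):
    """Return 'P' for the parity disk of this stripe and 'D' for data disks."""
    p = parity_disk(stripe, n_disks)
    return ["P" if disk == p else "D" for disk in range(n_disks)]


if __name__ == "__main__":
    for stripe in range(5):
        print(stripe, " ".join(stripe_layout(stripe, 5)))
```

With this placement, writes to different stripes update parity on different disks, so no single disk is accessed on every write.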


    Modeling Correlated Failures

    We assumed until now that disks are independent with respect to failures

    Disk failures may be correlated - power supply and control are typically shared among multiple disks

    Disk systems are usually made up of strings - consisting of disks that share power supply, cabling, cooling, and a controller

    If any of these shared support items fail, the entire string can fail

    If the string constitutes the RAID group - data loss can occur


    Approximate Reliability of String

    $\lambda_{str}$ - failure rate of the support elements (power, cabling, cooling, control) of a string

    $\lambda_{indep}$ - approximate failure rate due to independent failures

    If a RAID group is controlled by a single string - the aggregate failure rate of the group is

    $$\lambda_{total} = \lambda_{indep} + \lambda_{str}$$

    And the reliability is

    $$R_{total}(t) = e^{-\lambda_{total} t}$$
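
A quick numerical comparison shows how strongly the string term can dominate. The sketch assumes $\lambda_{indep} \approx 1/MTTDL$ of a RAID1 pair, in line with the earlier aggregate-state approximation; the disk lifetime, repair time, string lifetime and horizon in the demo are illustrative assumptions.

```python
import math


def group_reliability(t, lam_indep, lam_str):
    """R_total(t) = exp(-(lam_indep + lam_str) * t)."""
    lam_total = lam_indep + lam_str
    return math.exp(-lam_total * t)


if __name__ == "__main__":
    lam = 1.0 / 500_000.0          # assumed disk failure rate (per hour)
    mu = 1.0                       # assumed repair rate (per hour)
    lam_indep = (2.0 * lam ** 2) / (3.0 * lam + mu)   # 1 / MTTDL of RAID1
    lam_str = 1.0 / 150_000.0      # assumed mean string lifetime of 150,000 h
    t = 8760.0                     # one year, in hours
    print(1.0 - group_reliability(t, lam_indep, 0.0))
    print(1.0 - group_reliability(t, lam_indep, lam_str))
```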


    Impact of String Failures on RAID1

    Similar results for RAID3 and higher levels

    Figures of 150,000 hours for the mean string lifetime have been quoted in the literature

    At least one manufacturer claims mean disk lifetimes of 1,000,000 hours

    Grouping an entire RAID array as a single string increases unreliability by orders of magnitude

    [Figure: impact of string failures on RAID1; axis label: Mean String Lifetime]


    Orthogonal Arrangement of Strings and RAID Groups

    Failure of a string affects only one disk in each RAID group

    Since each RAID group can tolerate the failure of up to one disk, this reduces the impact of string failures

    Data loss will happen only if any RAID group has at least two disks down at the same time

    [Figure: orthogonal arrangement; labels: String, RAID group]
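
The orthogonal arrangement can be modelled as a grid in which each disk is identified by a (string, group) pair. The sketch below, with hypothetical function names and sizes, checks that a single string failure removes exactly one disk from every RAID group.

```python
def orthogonal_layout(d, g):
    """d+1 strings and g RAID groups: one disk per (string, group) pair."""
    return [(s, grp) for s in range(d + 1) for grp in range(g)]


def disks_lost_per_group(layout, failed_string, g):
    """Count, for each RAID group, the disks taken down by one string failure."""
    lost = [0] * g
    for s, grp in layout:
        if s == failed_string:
            lost[grp] += 1
    return lost


if __name__ == "__main__":
    layout = orthogonal_layout(d=4, g=3)          # 5 strings x 3 groups
    print(len(layout))
    print(disks_lost_per_group(layout, failed_string=2, g=3))
```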


    Approximate Modeling of Orthogonal Systems

    Data loss is caused by a sequence of events

    A failure can be triggered by an individual disk failure or by a string failure - very low failure rates

    We will find the (approximate) failure rate due to each - $\Lambda_{indiv}$; $\Lambda_{str}$

    Sum of these two failure rates - the approximate overall failure rate - $\Lambda_{data\_loss}$

    $$\Lambda_{data\_loss} = \Lambda_{indiv} + \Lambda_{str}$$

    It can then be used to approximately determine MTTDL - mean time to data loss, and reliability - probability of no data loss over any given period of time


    Orthogonal Arrangement - Notations

    d+1 strings, g RAID groups - total of (d+1)g disks

    $f_{disk}(t)$ - density function of the disk repair time

    $\lambda_{disk}$ - failure rate of a single disk

    $\pi_{indiv}$ - probability that a given individual failure triggers data loss

    Approximate rate (per disk) at which individual failures trigger data loss - $\lambda_{disk}\,\pi_{indiv}$

    $\pi_{indiv}$ = probability that a second disk fails in the affected RAID group while the first failure is not yet repaired

    This second failure has the rate $d(\lambda_{disk} + \lambda_{str})$, where $\lambda_{str}$ is the failure rate of a single string - the second disk failure can happen either due to an individual disk or string failure


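    Calculating Failure Rates - $\Lambda_{indiv}$

    Conditioning on $\tau$ - repair time of first disk failure

    $$\text{Prob}\{\text{Data Loss} \mid \text{repair takes } \tau\} = 1 - e^{-d(\lambda_{disk}+\lambda_{str})\tau}$$

    Unconditional probability of data loss -

    $$\pi_{indiv} = \int_0^{\infty} \left(1 - e^{-d(\lambda_{disk}+\lambda_{str})\tau}\right) f_{disk}(\tau)\, d\tau = 1 - F^{*}_{disk}\big(d(\lambda_{disk}+\lambda_{str})\big)$$

    $F^{*}_{disk}(\cdot)$ - the Laplace transform of $f_{disk}(\cdot)$

    Approximate rate at which data loss is triggered by individual disk failure - the per-disk rate $\lambda_{disk}\,\pi_{indiv}$ summed over all (d+1)g disks

    $$\Lambda_{indiv} \approx (d+1)\,g\,\lambda_{disk}\,\pi_{indiv}$$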
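
For a concrete repair-time distribution the Laplace transform has a simple form. The sketch below assumes exponentially distributed disk repair times with rate `mu_disk` (an assumption; the slides leave $f_{disk}$ general), for which $F^{*}_{disk}(s) = \mu/(s+\mu)$, and checks the result against a direct midpoint-rule integration. All names and sample values are hypothetical.

```python
import math


def pi_indiv(d, lam_disk, lam_str, mu_disk):
    """1 - F*_disk(s) with s = d*(lam_disk + lam_str), exponential repair.

    For the density mu*exp(-mu*t), F*(s) = mu/(s + mu), so 1 - F*(s) = s/(s + mu).
    """
    s = d * (lam_disk + lam_str)
    return s / (s + mu_disk)


def pi_indiv_numeric(d, lam_disk, lam_str, mu_disk, steps=100_000):
    """Midpoint-rule integral of (1 - exp(-s*tau)) * f_disk(tau) over tau."""
    s = d * (lam_disk + lam_str)
    upper = 50.0 / mu_disk          # the density is negligible beyond this point
    h = upper / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        density = mu_disk * math.exp(-mu_disk * tau)
        total += (1.0 - math.exp(-s * tau)) * density
    return total * h


if __name__ == "__main__":
    print(pi_indiv(4, 0.01, 0.005, 1.0))
    print(pi_indiv_numeric(4, 0.01, 0.005, 1.0))
```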


    Calculating Failure Rates - $\Lambda_{str}$

    Total rate of string failures: $(d+1)\lambda_{str}$

    When a string fails - we repair the string, and any individual disks affected by this string failure

    Pessimistic assumption - a second failure at any group or disk before all groups are fully restored is counted as data loss

    Example: arrival of a second string failure to the same string before the first failure has been repaired

    Optimistic assumption - disks affected by the string failure are immune to further failures before the string and affected disks are fully restored

    The difference between the two failure rates determines how tight the bounds are


    Pessimistic Calculation

    $\tau$ - (random) time taken to repair the failed string and all disks affected by it

    $f_{str}(\cdot)$ - probability density function of $\tau$

    $F^{*}_{str}(\cdot)$ - Laplace transform of $f_{str}(\cdot)$

    Pessimistic assumption - rate of additional failures

    $$\lambda_{pess} = (d+1)\lambda_{str} + (d+1)g\lambda_{disk}$$

    Conditioning upon $\tau$ - the probability of data loss

    $$p_{pess} = 1 - e^{-\lambda_{pess}\tau}$$

    Integrating over $\tau$ - unconditional pessimistic probability of data loss

    $$\pi_{pess} = 1 - F^{*}_{str}(\lambda_{pess})$$


    Optimistic Calculation

    Optimistic assumption - rate of additional failures

    $$\lambda_{opt} = d\lambda_{str} + dg\lambda_{disk}$$

    Conditioning upon $\tau$ - the probability of data loss is

    $$p_{opt} = 1 - e^{-\lambda_{opt}\tau}$$

    Integrating over $\tau$ - unconditional optimistic probability of data loss

    $$\pi_{opt} = 1 - F^{*}_{str}(\lambda_{opt})$$
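
The two bounds can be compared numerically. The sketch below assumes an exponentially distributed string repair time with rate `mu_str` (an assumption; the slides keep $f_{str}$ general), so that $1 - F^{*}_{str}(s) = s/(s+\mu_{str})$. Function names and sample values are hypothetical.

```python
def additional_failure_rates(d, g, lam_disk, lam_str):
    """Return (lam_opt, lam_pess), the rates of additional failures."""
    lam_pess = (d + 1) * lam_str + (d + 1) * g * lam_disk
    lam_opt = d * lam_str + d * g * lam_disk
    return lam_opt, lam_pess


def string_loss_bounds(d, g, lam_disk, lam_str, mu_str):
    """Return (pi_opt, pi_pess) for exponential string repair with rate mu_str."""
    lam_opt, lam_pess = additional_failure_rates(d, g, lam_disk, lam_str)
    pi_opt = lam_opt / (lam_opt + mu_str)      # 1 - F*_str(lam_opt)
    pi_pess = lam_pess / (lam_pess + mu_str)   # 1 - F*_str(lam_pess)
    return pi_opt, pi_pess


if __name__ == "__main__":
    pi_opt, pi_pess = string_loss_bounds(d=4, g=5, lam_disk=0.001,
                                         lam_str=0.002, mu_str=1.0)
    print(pi_opt, pi_pess, pi_pess - pi_opt)
```

The gap between the two values shrinks as d grows, since the optimistic rate excludes only one of the d+1 strings and its g disks.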


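    Reliability of Orthogonal System

    Rate of string failures triggering data loss

    $$\Lambda_{str} = (d+1)\lambda_{str}\,\pi; \quad (\pi = \pi_{pess} \text{ or } \pi_{opt})$$

    Approximate rate of data loss in the system -

    $$\Lambda_{data\_loss} \approx \Lambda_{indiv} + \Lambda_{str}$$

    Mean Time To Data Loss -

    $$MTTDL \approx \frac{1}{\Lambda_{data\_loss}}$$

    System reliability -

    $$R(t) \approx e^{-\Lambda_{data\_loss}\, t}$$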
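
The pieces of the orthogonal model can be combined into a single estimate. The sketch below rests on several assumptions that are not in the slides: exponential repair times for disks and strings (rates `mu_disk`, `mu_str`), and a total individual-failure rate taken as the per-disk rate times the (d+1)g disks. All names and sample values are hypothetical.

```python
import math


def orthogonal_mttdl(d, g, lam_disk, lam_str, mu_disk, mu_str, pessimistic=True):
    """Approximate MTTDL of an orthogonal system of d+1 strings and g groups."""
    # Data loss triggered by individual disk failures.
    s = d * (lam_disk + lam_str)
    pi_indiv = s / (s + mu_disk)                 # 1 - F*_disk(s), exponential repair
    rate_indiv = (d + 1) * g * lam_disk * pi_indiv

    # Data loss triggered by string failures (pessimistic or optimistic bound).
    if pessimistic:
        lam_more = (d + 1) * lam_str + (d + 1) * g * lam_disk
    else:
        lam_more = d * lam_str + d * g * lam_disk
    pi_str = lam_more / (lam_more + mu_str)      # 1 - F*_str(lam_more)
    rate_str = (d + 1) * lam_str * pi_str

    return 1.0 / (rate_indiv + rate_str)


def reliability(t, mttdl):
    """R(t) = exp(-t / MTTDL)."""
    return math.exp(-t / mttdl)


if __name__ == "__main__":
    args = dict(d=4, g=5, lam_disk=1.0 / 500_000.0, lam_str=1.0 / 150_000.0,
                mu_disk=1.0, mu_str=0.5)
    low = orthogonal_mttdl(pessimistic=True, **args)
    high = orthogonal_mttdl(pessimistic=False, **args)
    print(low, high)
    print(reliability(10 * 8760.0, low), reliability(10 * 8760.0, high))
```

The pessimistic and optimistic values bracket the approximate MTTDL; how far apart they are indicates how tight the bounds are for the chosen parameters.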