March 6, 2012 Sam Siewert Lecture 7 ECEN 5653 RT Digital Media Block Devices and Filesystems


Lecture 7 ECEN 5653 - ecee.colorado.edu/ecen5653/ecen5653/lectures/pdf/Lecture7-BW.pdf


Hardware View of Device Interfaces

Analog I/O
– DAC analog output: servos, motors, heaters, ...
– ADC analog input: photodiodes, thermistors, ...

Digital I/O
– Direct TTL I/O or GPIO
– Digital Serial (I2C, SPI, ... - Chip-to-Chip)
– Bus Interfaces
    Parallel
    – PCI 2.x, PCI-X, SCSI, etc. (32-bit, 64-bit, synchronous parallel transfer)
    Differential Serial
    – USB
    – InfiniBand
    – GigE / 10GE Ethernet
    – Fibre Channel
    – SAS/SATA

Software View of Drivers

Character
– Register Control/Config, Status, Data
– Typical of Low-Rate I/O Interfaces (RS232)
– Linux User Space Buffer Drivers (Direct IO) – e.g. SCSI Generic

Block
– FIFOs, Dual-Port RAM and DMA
– Typical of High-Rate I/O Interfaces (Network, Storage)
– Only Interface for 512 Byte LBA/Sector HDDs

Network
– Driver Stacks
– OSI 7 Layer Model (Phy, Link, Network, Transport, Session, Presentation, Application)
– TCP/IP/Ethernet/Cat-6e

Linux Char Driver Design

Application Interface
– Application Policy
– Blocking/Non-Blocking
– Multi-thread access
– Abstraction

Device Interface
– SW/HW Interface
– Immediate Buffering
– Interrupt Service Routine

[Figure: Application(s) call the App/Device Interface (open/close, read/write, creat, ioctl), which returns EAGAIN, Block, Data, or Status. The App/Device Interface and the ISR share an Input Ring-Buffer and an Output Ring-Buffer; the ISR services the Hardware Device and signals with SemGive.]

Output path: If Output Ring-Buffer Full then {SemTake or EAGAIN} else {Process and Return}
Input path: If Input Ring-Buffer Empty then {SemTake or EAGAIN} else {Process and Return}
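The input-path rule from this slide (SemTake or EAGAIN when the input ring buffer is empty, otherwise process and return) can be sketched in user-space Python. This is a toy model under stated assumptions, not kernel code: the class InputRingBuffer, its methods isr_put and read, and the drop-when-full policy are hypothetical choices made for illustration, and a threading.Semaphore stands in for SemGive/SemTake.

```python
import errno
import threading
from collections import deque


class InputRingBuffer:
    """User-space toy model of a char driver's input path (not kernel code)."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._buf = deque()
        self._lock = threading.Lock()
        self._available = threading.Semaphore(0)  # counts unread samples

    def isr_put(self, sample):
        """Stand-in for the ISR: buffer one sample, then 'SemGive'."""
        with self._lock:
            if len(self._buf) >= self._capacity:
                return False          # assumed policy: drop the sample when full
            self._buf.append(sample)
        self._available.release()     # SemGive: wake one blocked reader
        return True

    def read(self, nonblocking=False):
        """Stand-in for the driver's read() entry point."""
        # SemTake: block until data arrives, or fail at once in non-blocking mode.
        if not self._available.acquire(blocking=not nonblocking):
            raise BlockingIOError(errno.EAGAIN, "input ring buffer empty")
        with self._lock:
            return self._buf.popleft()
```

A blocking caller uses buf.read(); a non-blocking caller uses buf.read(nonblocking=True) and handles BlockingIOError, which carries errno.EAGAIN. Keeping the semaphore count equal to the number of buffered samples is what lets the reader do its work in task context while the ISR stand-in only buffers and signals.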

Cached Memory and DMA

Cache Coherency
– Making sure that cached data and memory are in sync
– Can become out of sync due to DMAs and Multi-Processor Caches
– Push Caches Allow for DMA into and out of Cache Directly
– Cache Snooping by HW may Obviate Need for Invalidate

Drivers Must Ensure Cache Coherency
– Invalidate Memory Locations on DMA Read Completion
– Flush Cache Prior to DMA Write Initiation

IO Data Cache Line Alignment
– Ensure that IO Data is Aligned on Cache Line Boundaries
– Other Data That Shares Cache Line with IO Data Could Otherwise Be Errantly Invalidated
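The cache line alignment concern comes down to address arithmetic. The following is a minimal sketch, assuming a 64-byte cache line (the real size is CPU-specific) and hypothetical helper names; it only models which cache lines a buffer touches and performs no cache maintenance.

```python
CACHE_LINE = 64  # bytes; assumed for illustration, the real size is CPU-specific


def align_down(addr, line=CACHE_LINE):
    """Round an address down to the start of its cache line."""
    return addr & ~(line - 1)


def align_up(addr, line=CACHE_LINE):
    """Round an address up to the next cache line boundary."""
    return (addr + line - 1) & ~(line - 1)


def lines_touched(addr, length, line=CACHE_LINE):
    """Base addresses of every cache line covered by [addr, addr + length)."""
    return set(range(align_down(addr, line), align_up(addr + length, line), line))


def shares_line(io_addr, io_len, other_addr, other_len, line=CACHE_LINE):
    """True if the IO buffer and the other data occupy a common cache line."""
    return bool(lines_touched(io_addr, io_len, line)
                & lines_touched(other_addr, other_len, line))
```

For example, a 100-byte IO buffer at 0x1000 followed directly by a 4-byte variable at 0x1064 shares the line at 0x1040, so invalidating the buffer would also discard the cached variable. Padding the buffer to align_up(100) = 128 bytes moves the variable to 0x1080 and removes the overlap.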

Advantages of Abstracted Driver

Portability
– If SW/HW Interface changes, change the bottom half (BH)
– If Application Interface changes, change the top half (TH)

Testability (Test BH and TH Separately)
Single Point of Access and Maintenance
Enforce Multi-Thread Usage Policies
Separate Buffering and ISR from Usage
Common Application Entry Points
Scheduled I/O (Most Work in Task Context rather than ISR Context)

Linux Driver Writer Resources

– "Linux Device Drivers", 3rd Ed., by J. Corbet, A. Rubini, G. Kroah-Hartman, 2005, O'Reilly (ISBN 0-596-00590-3)
– "PCI System Architecture", 4th Edition, by Tom Shanley and Don Anderson, 1999, MindShare, Inc. (ISBN 0-201-30974-2)

Current and Detailed Linux Driver Developer References

– Jerry Cooperstein's Linux Developer Books - http://www.coopj.com/
– Cooperstein's Solutions - http://www.coopj.com/LDD/
– http://ecee.colorado.edu/~ecen5033/ecen5033/code/ssd/

Digital Media Filesystems

Three Types of Media Storage
– Direct Attached Storage – e.g. SATA (Serial ATA)
– Network Attached Storage – e.g. NFS
– Storage Area Networks – e.g. SAS (Serial Attached SCSI), Fibre Channel

Flash / RAM based SSD Still 10x or More Costly than Spinning Media
– Predictions for Demise of HDDs and RAID?
– Cost is the Driver

Fast Storage is Either SSD, RAID or Hybrid

RAID Operates on LBAs/Sectors (Sometimes Files)

SAN/DAS RAID
NAS – Filesystem on top of RAID

RAID-10, RAID-50, RAID-60
– Stripe Over Mirror Sets
– Stripe Over RAID-5 XOR Parity Sets
– Stripe Over RAID-6 Reed-Solomon or Double-Parity Encoded Sets
    EVENODD
    Row Diagonal Parity
    Minimum Density Codes (Liberation)
    Reed-Solomon Codes
– Generalized Erasure Codes
    Cauchy Reed-Solomon, LDPC (Low-Density Parity-Check Codes), WEAVER/HoVer

MDS (Maximum Distance Separable)
– For each Parity Device, Another Level of Fault Tolerance is Provided
– Larger Drives (Multi-terabyte), Larger Arrays (100's of drives), and Cost Reduction are Driving RAID6 and Higher Levels

RAID-10

RAID-0 Striping Over RAID-1 Mirrors

RAID-1 Mirror    RAID-1 Mirror    RAID-1 Mirror
A1   A1          A2   A2          A3   A3
A4   A4          A5   A5          A6   A6
A7   A7          A8   A8          A9   A9
A10  A10         A11  A11         A12  A12

Striped order: A1, A2, A3, …, A12
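The RAID-10 figure's mapping from striped order to mirror sets can be sketched as follows. This is an illustration under stated assumptions: 2-way mirrors, one block per stripe unit, 0-based block numbers (so A1 is block 0), and hypothetical function names; real implementations use multi-sector chunks.

```python
def raid10_location(block, num_mirror_sets):
    """Map a 0-based logical block to (mirror set, row); RAID-0 rotates across sets."""
    return block % num_mirror_sets, block // num_mirror_sets


def raid10_write(disks, block, payload, num_mirror_sets):
    """Write the payload to both members of the mirror set that owns the block."""
    mirror_set, row = raid10_location(block, num_mirror_sets)
    for member in (0, 1):  # assumed 2-way mirror: disks 2k and 2k+1 form set k
        disks[2 * mirror_set + member][row] = payload


# Rebuild the figure: three mirror sets (six disks), blocks A1..A12.
disks = [dict() for _ in range(6)]
for i in range(12):
    raid10_write(disks, i, f"A{i + 1}", num_mirror_sets=3)
```

After the loop, disks[0] and disks[1] both hold A1, A4, A7, A10 in rows 0 to 3, matching the first mirror pair in the figure.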

RAID5,6 XOR Parity Encoding

MDS Encoding, Can Achieve High Storage Efficiency with N+1: N/(N+1) and N+2: N/(N+2)

[Figure: Storage Efficiency (0% to 100%) vs. Number of Data Devices (1 to 20) for 1 XOR or 2 P,Q Encoded Devices; curves for RAID5 and RAID6]
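The two curves in the chart follow directly from N/(N+1) and N/(N+2). A short sketch that recomputes the points, with hypothetical function names:

```python
def storage_efficiency(n_data, m_coding):
    """Fraction of raw capacity that holds user data: n / (n + m)."""
    return n_data / (n_data + m_coding)


def efficiency_curve(m_coding, max_data=20):
    """Efficiency for 1..max_data data devices with m_coding coding devices."""
    return [storage_efficiency(n, m_coding) for n in range(1, max_data + 1)]


raid5 = efficiency_curve(1)  # one XOR parity device
raid6 = efficiency_curve(2)  # two coding devices (P, Q)

print(" N   RAID5   RAID6")
for n, (r5, r6) in enumerate(zip(raid5, raid6), start=1):
    print(f"{n:2d}  {r5:6.1%}  {r6:6.1%}")
```

Both curves rise quickly and then flatten: with a single data device RAID5 is at 1/2 and RAID6 at 1/3, while at 20 data devices they reach 20/21 and 20/22.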

RAID-50

RAID-0 Striping Over RAID-5 Sets

RAID-5 Set
A1       B1       C1       D1       P(ABCD)
E1       F1       G1       P(EFGH)  H1
I1       J1       P(IJKL)  K1       L1
M1       P(MNOP)  N1       O1       P1
P(QRST)  Q1       R1       S1       T1

RAID-5 Set
A2       B2       C2       D2       P(ABCD)
E2       F2       G2       P(EFGH)  H2
I2       J2       P(IJKL)  K2       L2
M2       P(MNOP)  N2       O2       P2
P(QRST)  Q2       R2       S2       T2

Striped order: A1,B1,C1,D1,A2,B2,C2,D2,E1,F1,G1,H1,…,Q2,R2,S2,T2
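The rotating-parity layout in the RAID-50 figure, and the order in which RAID-0 walks the two sets, can be generated with a few lines of Python. This is a sketch with hypothetical function names; it assumes parity starts on the last disk and moves one disk to the left per row, and single-letter block names limit it to 26 data blocks per set.

```python
from string import ascii_uppercase


def raid5_layout(num_disks, num_rows, set_id):
    """Rows of block labels for one RAID-5 set with rotating parity."""
    data_per_row = num_disks - 1
    letters = iter(ascii_uppercase)  # supports at most 26 data blocks per set
    rows = []
    for r in range(num_rows):
        names = [next(letters) for _ in range(data_per_row)]
        parity_col = (num_disks - 1 - r) % num_disks  # one disk left per row
        row = [f"{name}{set_id}" for name in names]
        row.insert(parity_col, f"P({''.join(names)})")
        rows.append(row)
    return rows


def host_order(layouts):
    """Data blocks in striped order: one row from each set in turn, parity skipped."""
    order = []
    for rows in zip(*layouts):
        for row in rows:
            order.extend(block for block in row if not block.startswith("P("))
    return order


set1 = raid5_layout(5, 5, 1)
set2 = raid5_layout(5, 5, 2)
order = host_order([set1, set2])
```

Here set1[1] is E1, F1, G1, P(EFGH), H1, and order begins A1, B1, C1, D1, A2, B2, C2, D2, E1 and ends Q2, R2, S2, T2, matching the striped order on the slide.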

RAID-60 (Reed-Solomon Encoding)

RAID-0 Striping Over RAID-6 Sets

[Figure: two RAID-6 sets of six disks each (Disk1 to Disk6). Every row of a set holds four data blocks plus a P and a Q parity block, and the parity blocks rotate across the disks from row to row.]

RAID-6 Set
Row  Data blocks    Parity blocks
1    A1 B1 C1 D1    P(ABCD) Q(ABCD)
2    E1 F1 G1 H1    P(EFGH) Q(EFGH)
3    I1 J1 K1 L1    P(IJKL) Q(IJKL)
4    M1 N1 O1 P1    P(MNOP) Q(MNOP)
5    Q1 R1 S1 T1    P(QRST) Q(QRST)

RAID-6 Set
Row  Data blocks    Parity blocks
1    A2 B2 C2 D2    P(ABCD) Q(ABCD)
2    E2 F2 G2 H2    P(EFGH) Q(EFGH)
3    I2 J2 K2 L2    P(IJKL) Q(IJKL)
4    M2 N2 O2 P2    P(MNOP) Q(MNOP)
5    Q2 R2 S2 T2    P(QRST) Q(QRST)

Striped order: A1,B1,C1,D1,A2,B2,C2,D2,E1,F1,G1,H1,…,Q2,R2,S2,T2

How RAID Relates to Erasure Codes

Erasure Codes Applied to Disk or SSD Devices

RAID is an Erasure Code

RAID-1 is an MDS EC (James Plank, U. of Tennessee)

Comparison of ECs

Data Devices = n
Coding Devices = m
Total = m+n

Storage Efficiency: R=n/(n+m)
– RAID1 2-Way, R=1/(1+1)=50%, MDS=1, 2x Read Speed-up, 1x Write
– RAID1 3-Way, R=1/(1+2)=33%, MDS=2, 3x Read, 1x Write
– RAID10 with 10 sets, R=10/(10+10)=50%, MDS=1, 20x Read, 10x Write
– RAID5 with 3+1 set, R=3/(3+1)=75%, MDS=1, 3x Read (Parity Check?), RMW Penalty, Striding Issues
– RAID6 with 5+2 set, R=5/(5+2)=71%, MDS=2, 5x Read, Reed-Solomon Encode on Write and RMW Penalty
– Beyond RAID6?
    Cauchy Reed-Solomon Scales, but Encode, Decode Complexity High
    Low-Density Parity-Check Codes, Simpler, but not MDS
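The MDS=1 entry for the RAID5 3+1 set means any single lost device can be rebuilt from the survivors. A minimal sketch of that property using byte-wise XOR parity, with hypothetical function names and tiny two-byte blocks chosen only for illustration:

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the stripe: n data blocks followed by one XOR parity block (m = 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XORing every surviving block."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # 3 data devices
stripe = encode(data)                           # 3+1 stripe
```

Calling recover(stripe, i) returns stripe[i] for every i from 0 to 3, whether the lost device held data or parity. A second simultaneous loss is not recoverable with one parity block, which is why RAID6 adds a second, independently computed coding device.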

Read, Modify Write Penalty

Any Update that is Less than the Full RAID5 or RAID6 Set Requires
1. Read Old Data and Parity – 2 Reads
2. Compute New Parity (From Old & New Data)
3. Write New Parity and New Data – 2 Writes

Only Way to Remove Penalty is a Write-Back Cache to Coalesce Updates and Perform Full-Set Writes Always

RAID-5 Set
A1       B1       C1       D1       P(ABCD)
E1       F1       G1       P(EFGH)  H1
I1       J1       P(IJKL)  K1       L1
M1       P(MNOP)  N1       O1       P1
P(QRST)  Q1       R1       S1       T1

Write A1: P(ABCD)new = A1new xor A1 xor P(ABCD)

A1  B1  C1  D1  P(ABCD)
0   0   0   0   0
0   0   0   1   1
0   0   1   0   1
0   0   1   1   0
0   1   0   0   1
0   1   0   1   0
0   1   1   0   0
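The parity update rule P(ABCD)new = A1new xor A1 xor P(ABCD) can be checked against a full recomputation. A small sketch with hypothetical function names and one-byte blocks; the four device accesses (two reads, two writes) appear only as comments because no real devices are involved.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """XOR parity over every data block in the stripe (a full-set write)."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor(parity, block)
    return parity


def rmw_parity(old_data, new_data, old_parity):
    """Read-modify-write: new parity from the old data, new data, and old parity."""
    # Reads: old_data and old_parity. Writes: new_data and the returned parity.
    return xor(xor(new_data, old_data), old_parity)


a1, b1, c1, d1 = b"\x0a", b"\x33", b"\x5c", b"\xf0"
p_old = full_parity([a1, b1, c1, d1])
a1_new = b"\x77"
p_new = rmw_parity(a1, a1_new, p_old)
```

p_new equals full_parity([a1_new, b1, c1, d1]), so the update is correct without reading B1, C1, or D1. That saving in reads is the point of the shortcut; the cost is the extra read of old data and old parity before each small write.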

Conclusion

IDF Paper on traditional RAID vs EC - http://ecee.colorado.edu/~ecen5033/ecen5033/papers/SF11_STOS004_101F.pdf

Deeper Dive Into Erasure Codes (James Plank FAST Presentation)

Lab 3 Discussion
– http://ecee.colorado.edu/~ecen5033/ecen5033/labs/lab3-hints.html
– Using RAM Disk to Explore MDADM

Linux RAID Demos

Driver Discussion