Operating Systems, ECE344
Ashvin Goel, ECE
University of Toronto
Disks and RAID
Outline
Disks
  o Disk scheduling algorithms
  o Redundancy in storage systems, RAID
File Systems
  o Overview of file system
  o File system design
  o Consistency and crash recovery
  o Sharing files
  o Unix file system
Disk Geometry
Disks have multiple platters
  o Each platter has a head
  o The different heads are attached to a single arm
  o The different heads can access data in parallel
Each platter has multiple concentric tracks
  o A cylinder consists of the same track across different platters
Each track has multiple sectors
  o A sector typically has preamble, data and ECC
[Figure: disk geometry, labeling the platters, a track, a sector, and a cylinder]
Disk Access Delays
Time to access a disk sector is determined by 3 delays
  o Seek time
      Time to move head to correct track
  o Rotational delay
      Time for disk to rotate to correct sector
  o Transfer time
      Time to read/write the bits of sector
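The three delays simply add up. A minimal sketch of that arithmetic follows; the function name, the parameters, and the example drive (7200 RPM, 512-byte sector) are assumptions for illustration, not figures from the slides. The rotational delay is taken as half a rotation, which is the usual average-case estimate.

```python
def access_time_ms(seek_ms, rpm, sector_bytes, transfer_mb_per_s):
    """Estimate the time to access one sector, in milliseconds."""
    # One full rotation takes 60,000 ms / RPM; on average the target
    # sector is half a rotation away from the head.
    rotation_ms = 60_000 / rpm
    rotational_delay_ms = rotation_ms / 2
    # Transfer time = bytes / (bytes per second), converted to ms.
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 4 ms seek, 7200 RPM, 512-byte sector, 100 MB/s.
print(f"{access_time_ms(4, 7200, 512, 100):.3f} ms")
```

With numbers like these, seek and rotation dominate and the transfer time is almost negligible.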
Disk Performance Trends
Capacity
  o 100% per year (2X every year)
Transfer rate (BW)
  o 40% per year (2X every two years)
  o Typically, 50-100 MB/s
  o Sector transfer time is 100-200 microseconds
Seek time and Rotation time
  o 8% per year (1/2 every 10 years)
  o Seek and rotation time are typically 4-8 ms today
  o Disk scheduling aims to minimize these times, especially seek time
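The "2X every N years" figures follow from compounding the yearly rate. A small sketch of that check, assuming each rate is a compound annual improvement (the helper name is made up for illustration):

```python
import math


def years_to_double(annual_rate):
    """Years for a quantity improving by annual_rate (0.40 = 40%) to double."""
    return math.log(2) / math.log(1 + annual_rate)


for label, rate in [("capacity", 1.00), ("transfer rate", 0.40), ("seek/rotation", 0.08)]:
    print(f"{label}: 2X in about {years_to_double(rate):.1f} years")
```

Under this assumption, 8% per year works out to a factor of two in about nine years, in line with the rough "1/2 every 10 years" figure.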
Disk Performance
Fastest: no head movement, no rotational latency
  o => Sequential access
Faster: no head movement, rotational latency
  o => Access to blocks on the same cylinder is faster
Slow: head movement, rotational latency
  o => Access to blocks on different cylinders
  o => The further apart the cylinders, the worse it is
Addressing Disks
Older disks required OS to specify all parameters for transferring data
  o E.g., cylinder #, track #, sector #, transfer size
Modern disks are more complicated
  o Not all sectors are the same size, sectors are remapped, etc.
Current disks provide a higher-level interface
  o Disk exports its data as a logical array of blocks
  o Disk maps logical blocks to its surface
  o OS code is simpler but disk parameters are hidden
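To make the "logical array of blocks" concrete, here is a sketch of the classic cylinder/head/sector (CHS) to logical block mapping. It assumes an idealized disk with the same number of sectors on every track, which, as noted above, modern disks do not have; the real mapping is internal to the drive. The geometry values (16 heads, 63 sectors per track) are illustrative only.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Map (cylinder, head, sector) to a logical block number.

    Sectors are numbered from 1 by convention; cylinders and heads from 0.
    """
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping, under the same idealized geometry."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


lba = chs_to_lba(5, 3, 17, heads_per_cylinder=16, sectors_per_track=63)
print(lba, lba_to_chs(lba, 16, 63))
```

Consecutive logical blocks fill a track first, then the other tracks of the same cylinder, so sequential logical access tends to avoid seeks.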
Layers of Abstraction
Program <Filename, Offset>
File system <Partition, Block#>
Device driver <Disk#, Sector#>
Disk Controller <Cylinder, Track, Sector>
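Two of these translations are plain arithmetic and can be sketched as follows. The block size (4096 bytes), sector size (512 bytes), partition start sector, and function names are all assumptions for illustration. The step in between, mapping a file's block index to a block number in the partition, depends on file system metadata and is not shown.

```python
def file_block_index(offset, block_size=4096):
    """Split a byte offset in a file into (block index within file, offset in block)."""
    return divmod(offset, block_size)


def block_to_sectors(block_no, block_size=4096, sector_size=512,
                     partition_start_sector=0):
    """Return the disk sector numbers that hold one partition block."""
    sectors_per_block = block_size // sector_size
    first = partition_start_sector + block_no * sectors_per_block
    return list(range(first, first + sectors_per_block))


print(file_block_index(10_000))                             # block 2, byte 1808
print(block_to_sectors(2, partition_start_sector=2048))     # eight consecutive sectors
```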
Disk Errors
Lots of errors possible
  o E.g., latent sector errors, mis-directed writes, etc.
  o Transient vs. hard errors
  o Some errors can be masked by ECC
Bad sectors
  o Allocate spare sectors per track
  o Block can be mapped to spare in various ways
      In the factory, by the device controller, by OS
  o OS can hide bad sectors by allocating them to a special hidden file
  o Physical backup programs have to be careful
Disk Scheduling Algorithms
Aim is to improve disk performance
Two methods
  o Reduce seeks and rotation
  o Read several blocks of data
Algorithms
  o First-come, first served (FCFS)
      Simple, fair, slow
  o Shortest seek first (SSF)
  o SCAN (Elevator)
Shortest Seek First (SSF)
Shortest seek first minimizes arm motion
Unlike FCFS, starvation is possible
[Figure: initial arm position and pending requests]
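A minimal sketch comparing FCFS and SSF by total arm movement. The starting cylinder and the request list are made-up values for illustration, and the sketch assumes all requests are already pending when scheduling starts.

```python
def total_seek(start, order):
    """Total number of cylinders the arm moves when servicing `order`."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def ssf_order(start, requests):
    """Shortest seek first: always pick the pending request closest to the arm."""
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


start, requests = 11, [1, 36, 16, 34, 9, 12]   # hypothetical workload
print("FCFS:", requests, total_seek(start, requests))
print("SSF: ", ssf_order(start, requests), total_seek(start, ssf_order(start, requests)))
```

In this example, SSF moves the arm 61 cylinders versus 111 for FCFS. The starvation risk shows up when new requests keep arriving near the arm: a far-away request such as cylinder 1 can be postponed indefinitely.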
SCAN (Elevator)
Use a bit to track outward or inward arm direction
Service the next pending request in same direction
When there are no more requests in the current direction, reverse direction
Increases seek compared to SSF but ensures no starvation
A variant is called C-SCAN
  o SCAN, but requests are serviced in one direction only (typewriter)
  o What are its benefits/drawbacks vs. SCAN?
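The rules above can be sketched as follows, with the direction bit as a boolean. As described, the arm reverses as soon as no requests remain in the current direction rather than travelling to the disk edge. The workload is a made-up example, and all requests are assumed to be pending up front.

```python
def total_seek(start, order):
    """Total number of cylinders the arm moves when servicing `order`."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def scan_order(start, requests, moving_up=True):
    """SCAN: service requests in the current direction, then reverse."""
    if moving_up:
        first = sorted(c for c in requests if c >= start)
        second = sorted((c for c in requests if c < start), reverse=True)
    else:
        first = sorted((c for c in requests if c <= start), reverse=True)
        second = sorted(c for c in requests if c > start)
    return first + second


def cscan_order(start, requests):
    """C-SCAN: always service upward, then jump back to the lowest request."""
    ahead = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return ahead + wrapped


start, requests = 11, [1, 36, 16, 34, 9, 12]   # hypothetical workload
for name, order in [("SCAN", scan_order(start, requests)),
                    ("C-SCAN", cscan_order(start, requests))]:
    print(name, order, total_seek(start, order))
```

In this example, C-SCAN moves the arm further (68 versus 60 cylinders) because of the return sweep. In exchange, every cylinder waits at most about one full sweep, whereas under SCAN the edge cylinders can wait nearly two sweeps between visits.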
Modern Disk Scheduling
Disks know their layout better than the OS
May ignore or undo disk scheduling in OS!
Redundancy in Storage Systems
Idea: Use many disks in parallel
  o Increases storage bandwidth, improves reliability
Redundant Array of Inexpensive Disks (RAID)
  o A storage system, not a file system
Files are striped across disks
  o Stripes on different disks can be read/written in parallel
  o Bandwidth increases with more disks
  o Better throughput for large requests
Choosing stripe size is important
  o Normally between 1 KB and 1 MB
  o Increase stripe size as average file size increases
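A sketch of how striping might map a logical block to a disk, assuming simple round-robin placement with no redundancy. The function name and the `blocks_per_stripe_unit` parameter are illustrative; real arrays may lay data out differently.

```python
def locate_block(logical_block, num_disks, blocks_per_stripe_unit=1):
    """Return (disk index, block number on that disk) for a logical block."""
    stripe_unit, offset_in_unit = divmod(logical_block, blocks_per_stripe_unit)
    # Stripe units are dealt out to the disks round-robin.
    disk = stripe_unit % num_disks
    unit_on_disk = stripe_unit // num_disks
    return disk, unit_on_disk * blocks_per_stripe_unit + offset_in_unit


# Logical blocks 0..7 on 4 disks: consecutive blocks land on different disks,
# so a large request can be served by all disks in parallel.
print([locate_block(b, num_disks=4) for b in range(8)])
```

A larger stripe unit keeps more consecutive blocks on one disk, so small requests touch a single disk while large requests still spread across several.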
RAID Levels 0, 1
RAID level 0: disk striping
  o Distributes data across several disks for speed
  o No redundancy
RAID level 1: mirroring
  o Backup solution
  o Write both, read either
  o 50% utilization
RAID Levels 4, 5
RAID level 4: block striping, dedicated parity disk
  o Calculate XOR value of stripes across disks
  o Store XOR stripe value on separate disk
  o Uses one extra disk for parity
      Utilization: (N-1)/N, N is the number of disks
RAID level 5: striping with distributed parity
  o Similar to 4 but parity information is distributed across all disks
  o Avoids bottleneck for parity disk
  o If a single disk dies, it has to be replaced
      Information for that disk is recreated from the other disks
RAID level 6:
  o Can recover from the loss of two disks
  o Allows (single) failure during disk recovery
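A minimal sketch of XOR parity and single-disk recovery, using short byte strings in place of disk blocks. The helper names are made up, and `parity_disk` shows only one simple way to rotate parity for RAID 5; actual layouts vary between implementations.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def utilization(num_disks):
    """Fraction of raw capacity holding data with one disk's worth of parity."""
    return (num_disks - 1) / num_disks


def parity_disk(stripe, num_disks):
    """Assumed rotation: parity for stripe s lives on disk s mod N."""
    return stripe % num_disks


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]   # blocks on three data disks
parity = xor_blocks(data)                         # stored on the parity disk

# Disk 1 dies: XOR the surviving data blocks with the parity block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1], utilization(4))
```

Recovery works because XOR-ing a value twice cancels it out, so combining the parity with every surviving block leaves exactly the missing block.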