SMD149 - Operating Systems - Disk Scheduling
Roland Parviainen
November 18, 2005
Outline
Introduction
Disk scheduling
Other methods
RAID
Introduction
Processor and memory speeds increase faster than secondary storage speeds
Hard drive speeds improve slower than their capacity
Long service delays for I/O bound processes
Need optimizations
Hard drive history
Important milestones
Punch cards and paper tape
1951 - Magnetic tape - Sequential access
1956 - First commercial hard disk, the IBM 350 RAMAC disk drive, 5 megabytes
1973 - IBM introduced the 3340 "Winchester" disk system (two 30 MB spindles)
First to use a sealed head/disk assembly (HDA)
"Winchester" was used to describe hard disks until the 1990s
1980 - First 5.25-inch Winchester drive, the Seagate ST-506, 5 megabytes
1991 - 100 megabyte hard drive
1995 - 2 gigabyte hard drive
1997 - 10 gigabyte hard drive
2005 - 500 gigabyte hard drive
Characteristics of Moving-head Disk Storage
Variable access speed
Depends on the location of data, position of read-write head
Magnetic disks, platters
rotating spindle (thousands of RPM)
Read-write head
attached to actuator (boom, moving arm assembly)
Head is above a circular track
Vertical set of tracks: cylinder
Seeking: moving to a cylinder
Accessing data
To access a particular record:
Seek operation: move the arm to the correct cylinder
Seek time
Rotate disk so data is under head
Rotational latency time
Data spins by head
Transmission time
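The three delays add up to the total access time. A quick illustrative calculation (the 7200 rpm, 8 ms average seek and 100 MB/s transfer rate are assumed example values, not figures from the lecture):

```python
def access_time_ms(seek_ms: float, rpm: float, request_bytes: int,
                   transfer_mb_s: float) -> float:
    """Seek time + average rotational latency + transfer time, in ms."""
    rotational_ms = 0.5 * 60_000.0 / rpm        # on average, half a revolution
    transfer_ms = request_bytes / (transfer_mb_s * 1_000_000) * 1000.0
    return seek_ms + rotational_ms + transfer_ms

t = access_time_ms(seek_ms=8.0, rpm=7200, request_bytes=4096, transfer_mb_s=100)
print(f"{t:.2f} ms")   # seek and rotation dominate for small requests
```

For a 4 KB request the transfer term is tiny (about 0.04 ms), which is why seek and rotational optimization matter so much.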
Data
Rotational speed: 4200-15000 rpm
Number of I/O operations per second: around 50 random or 100 sequential operations
Transfer rate
Inner zone: from 44.2 MB/sec to 74.5 MB/sec
Outer zone: from 74.0 MB/sec to 111.4 MB/sec
Random access time: from 5 ms to 15 ms
Why disk scheduling?
Processes generate requests simultaneously
Early systems: FCFS, First Come, First Served
Fair
High request rate - long waiting times
Random seek pattern
Arm might move from one end to the other
Better to reorder requests?
Disk scheduling
Looks at physical position of requested records
Avoid mechanical motion
Seek optimization, rotational optimization
Strategies
Criteria
Throughput
Mean response time
Variance of response time
SSTF
Shortest Seek Time First
Service request closest to read-write head
Advantages
Higher throughput and lower response times than FCFS
Reasonable solution for batch processing systems
Disadvantages
Does not ensure fairness
Possibility of indefinite postponement
High variance of response times
Response times generally unacceptable for interactive systems
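The selection rule can be sketched in a few lines of Python (an illustrative sketch, not course code; the cylinder numbers are invented):

```python
# SSTF sketch: always service the pending request whose cylinder is
# closest to the current head position.

def sstf_order(head: int, requests: list[int]) -> list[int]:
    """Return the order in which SSTF services `requests`."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

# The request far from the head (180 here) is served last - with a steady
# stream of nearby arrivals it could be postponed indefinitely.
print(sstf_order(50, [95, 180, 34, 119, 11, 123, 62, 64]))
```

The example also shows the fairness problem: request 180 waits until every closer request has been served.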
SCAN
Shortest seek time in preferred direction
Aims to reduce unfairness and variance of SSTF response times
Does not change direction until edge of disk reached
Similar characteristics to SSTF
Indefinite postponement still possible
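A sketch of the resulting service order (illustrative Python, not from the slides; cylinder numbers invented):

```python
# SCAN sketch: service requests in the current direction of travel,
# then reverse. The sweep continues to the edge of the disk; the travel
# past the last request services nothing, so the service order is
# determined by the two sorted runs below.

def scan_order(head: int, requests: list[int], direction: str = "up") -> list[int]:
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down if direction == "up" else down + up

print(scan_order(50, [95, 180, 34, 119, 11, 123, 62, 64]))
```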
C-SCAN
Similar to SCAN, but at the end of an inward sweep, the disk arm jumps (without servicing requests) to the outermost cylinder
Further reduces variance of response times at the expense of throughput and mean response times
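The jump changes only how the requests behind the head are ordered; a sketch with invented cylinder numbers:

```python
# C-SCAN sketch: service requests ahead of the head in one direction,
# then jump back (servicing nothing) and sweep the remaining requests
# in the same direction.

def cscan_order(head: int, requests: list[int]) -> list[int]:
    ahead = sorted(r for r in requests if r >= head)
    wrapped = sorted(r for r in requests if r < head)
    return ahead + wrapped   # `wrapped` is served after the jump

print(cscan_order(50, [95, 180, 34, 119, 11, 123, 62, 64]))
```

Every request is served on an upward sweep, which is what evens out the response times compared to SCAN.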
FSCAN and N-Step SCAN
Group requests into batches
FSCAN: "freeze" the disk request queue periodically, service only those requests in the queue at that time
N-Step SCAN: service only the first N requests in the queue at a time
N = 1: FCFS, N = infinite: SCAN
Advantages
Both strategies prevent indefinite postponement
Both reduce variance of response times compared to SCAN
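The batching idea can be sketched as follows (assumed behaviour, following the description above; a real implementation would also alternate sweep direction):

```python
# N-Step SCAN sketch: take the first N queued requests as a batch and
# serve each batch with one sweep; requests arriving meanwhile wait for
# a later batch, which prevents indefinite postponement.

def n_step_scan(head: int, queue: list[int], n: int) -> list[int]:
    order = []
    while queue:
        batch, queue = queue[:n], queue[n:]
        up = sorted(r for r in batch if r >= head)
        down = sorted((r for r in batch if r < head), reverse=True)
        sweep = up + down
        order.extend(sweep)
        head = sweep[-1]
    return order

print(n_step_scan(50, [95, 180, 34, 119, 11, 123, 62, 64], n=3))
```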
N-Step SCAN, N=3 (figure)
LOOK and C-LOOK
LOOK
Improvement on SCAN scheduling
Only performs sweeps large enough to service all requests
Does not move the disk arm to the outer edges of the disk if no requests for those regions are pending
Improves efficiency by avoiding unnecessary seek operations
High throughput
C-LOOK
Improvement on C-SCAN scheduling
Combination of LOOK and C-SCAN
Lower variance of response times than LOOK, at the expense of throughput
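The saving from reversing at the last pending request rather than the disk edge can be quantified with a small sketch (illustrative numbers; a 0-199 cylinder disk and a head at cylinder 50 are assumed):

```python
# Total head travel, in cylinders, for a given service order.
def head_travel(head: int, order: list[int]) -> int:
    total = 0
    for cyl in order:
        total += abs(cyl - head)
        head = cyl
    return total

requests = [95, 180, 34, 119, 11, 123, 62, 64]
# LOOK order from cylinder 50: sweep up to the last request (180),
# then reverse and sweep down.
look = sorted(r for r in requests if r >= 50) + \
       sorted((r for r in requests if r < 50), reverse=True)

print(head_travel(50, look))                    # LOOK reverses at cylinder 180
print(head_travel(50, look) + 2 * (199 - 180))  # SCAN also sweeps out to 199
```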
Summary
Rotational Optimization
Seek time formerly dominated performance concerns
Seek times and rotational latency are now of the same order of magnitude
Optimization by reducing rotational latency?
Important when accessing small pieces of data distributed throughout the disk surfaces
SLTF
Shortest Latency Time First
On a given cylinder, service the request with the shortest rotational latency first
Easy to implement
Achieves near-optimal performance for rotational latency
SPTF and SATF
Shortest Position Time First
Positioning time: Sum of seek time and rotational latency
SPTF first services the request with the shortest positioning time
Yields good performance
Can indefinitely postpone requests
Shortest Access Time First
Access time: positioning time plus transmission time
High throughput
Again, possible to indefinitely postpone requests
Both SPTF and SATF can implement LOOK to improve performance
Weakness
Both SPTF and SATF require knowledge of disk performance characteristics which might not be readily available
Increase rotational speed?
Hard drive geometry
Hard drives sometimes report incorrect geometry
Error correction, spare sectors, etc.
Sometimes the true geometry is available
This is a problem for disk scheduling algorithms
Examples
Linux
Default: elevator algorithm (LOOK variation of SCAN)
Can suffer from indefinite postponement
Deadline and anticipatory scheduling (LOOK)
Deadline
Two FIFO queues (read and write requests)
References to requests, with deadlines
Head of queues close to deadline
Reads: 500 ms, Writes: 5 s
Service requests that have expired, together with requests that almost expire
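A toy sketch of the expiry rule (an illustration of the idea, not the actual Linux implementation; the deadline values follow the slide, everything else is invented):

```python
import heapq

READ_DEADLINE = 0.5   # seconds, per the slide
WRITE_DEADLINE = 5.0  # seconds, per the slide

def pick_next(sorted_queue: list[int], fifo: list[tuple[float, int]],
              now: float) -> int:
    """Serve the oldest expired request, otherwise the best-positioned one.

    `sorted_queue` holds cylinders in seek order; `fifo` is a heap of
    (deadline, cylinder) pairs in arrival order.
    """
    if fifo and fifo[0][0] <= now:      # a request has expired
        _, cyl = heapq.heappop(fifo)
        sorted_queue.remove(cyl)
        return cyl
    cyl = sorted_queue.pop(0)           # normal seek-ordered service
    for entry in fifo:                  # drop its deadline entry too
        if entry[1] == cyl:
            fifo.remove(entry)
            heapq.heapify(fifo)
            break
    return cyl

fifo = [(0.0 + READ_DEADLINE, 180), (0.1 + READ_DEADLINE, 40)]  # two reads
heapq.heapify(fifo)
queue = sorted([180, 40], key=lambda c: abs(c - 50))  # head at cylinder 50
print(pick_next(queue, fifo, now=0.55))  # 180 has expired, so it jumps the queue
```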
Caching and Buffering
Disk cache buffer
Cache for disk data
Buffer for data
To delay writing of data
Need replacement strategies
Memory usage?
System failure with modified buffer?
Write-back caching
Write-through caching
Hard drives have their own buffer cache
Other disk performance techniques
Fragmentation/defragmentation
Place files that will be modified near free space
SCAN visits the midrange more often - place often referenced data there
Compression
When disk is idle, position the head correctly
RAID
Redundant Array of Independent Disks
Use several disks to improve
Capacity, reliability, speed, or a combination
David A. Patterson, Garth A. Gibson and Randy H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", SIGMOD Conference 1988
Combine multiple drives into one logical unit
Hardware and/or Software
Different RAID levels
Data striping and strips
Data is divided into strips
Strips are fixed-size blocks (bit-, byte-, or block-sized)
Corresponding strips across the disks form stripes
RAID 0 - Striped set
Not one of the original levels - no redundancy
Splits data evenly across two or more disks
Reliability decreases quickly as disks are added
Block size typically multiple of disk sector size
Each drive can seek independently
Fast seek times
Transfer speed: sum of drive speeds
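The even split can be made concrete with the standard striping address calculation (a sketch; three disks and block-granularity strips are assumed example values):

```python
# Map a logical block number to (disk index, block offset on that disk)
# for an n-disk RAID 0 array with block-sized strips.
def raid0_locate(logical_block: int, n_disks: int) -> tuple[int, int]:
    return logical_block % n_disks, logical_block // n_disks

# Consecutive logical blocks land on consecutive disks, which is why a
# large sequential transfer can use all spindles at once.
print([raid0_locate(b, n_disks=3) for b in range(6)])
```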
RAID 1 - Mirroring
Disk mirroring for redundancy
Each disk is duplicated
Reads can be served simultaneously for a pair
Writes one at a time (writes to both disks)
Half the capacity
Can survive multiple disk failures, as long as no mirrored pair loses both disks
Data regeneration
Can be done online
Hot swapping
RAID 2 (Bit-level Hamming ECC Parity)
Striped at the bit level
Hamming error-correcting codes (Hamming ECCs)
Detect up to two errors, correct one
Parity disks: roughly log2 of the number of data disks (10 data disks need 4, 25 need 5)
Writes require writing the parity - and thus reading the complete stripe
Read-modify-write cycle
Read requests read the full stripe (to verify parity)
Sometimes skipped
Not used in practice
XOR Parity
One parity block for the set of blocks
Parity calculation: A1 XOR A2 XOR A3 = Ap
Recovery: A1 XOR A2 XOR Ap = A3
One data block can be recovered
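The two identities can be checked directly (illustrative byte values):

```python
# XOR parity over equal-sized strips: the parity strip is the XOR of the
# data strips, and any single lost strip is the XOR of the survivors.
def xor_bytes(*strips: bytes) -> bytes:
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, b in enumerate(strip):
            out[i] ^= b
    return bytes(out)

a1, a2, a3 = b"\x0f", b"\x33", b"\x55"
ap = xor_bytes(a1, a2, a3)           # A1 XOR A2 XOR A3 = Ap
assert xor_bytes(a1, a2, ap) == a3   # A1 XOR A2 XOR Ap = A3 (recovery)
print(ap.hex())
```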
RAID 3 (Bit/byte level XOR ECC parity)
Bit or byte level striping
XOR ECC
One disk for parity
Can recover from a single disk failure
Recovery expensive
Most reads access full array
One write at a time
High transfer rate for single file
47 / 64
IntroductionDisk Scheduling
RAID
RAID 3
48 / 64
IntroductionDisk Scheduling
RAID
RAID 4 (Block level XOR ECC Parity)
Blocks instead of bits/bytes
Higher concurrency than level 3
Parity calculation easier
Ap’ = (Ad XOR Ad’) XOR Ap
Single write at a time - parity must always be written
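The update formula means a small write needs to read only the old data block and the old parity, not the whole stripe. Sketched with example byte values:

```python
# Small-write parity update: Ap' = (Ad XOR Ad') XOR Ap.
def update_parity(old_data: int, new_data: int, old_parity: int) -> int:
    return (old_data ^ new_data) ^ old_parity

# Stripe with data blocks 0x0f, 0x33, 0x55; parity is their XOR.
old_parity = 0x0f ^ 0x33 ^ 0x55
new_parity = update_parity(old_data=0x33, new_data=0x44, old_parity=old_parity)
assert new_parity == 0x0f ^ 0x44 ^ 0x55   # same as recomputing the full stripe
print(hex(new_parity))
```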
RAID 5 (Block-level Distributed XOR ECC Parity)
Removes the bottleneck of level 4
Parity blocks distributed among the array
Still requires a read-modify-write cycle for write requests (4 operations)
Parity logging: parity difference stored in memory
AFRAID: parity calculation is done when load is light
Maximum number of drives unlimited
Common practice: 14 or fewer
Two simultaneous drive failures cannot be recovered
High probability that a second drive fails before the first failure is detected, replaced and rebuilt
RAID 6
Extends RAID 5 with one more parity block
One XOR block and one Reed-Solomon block
Can handle two disk failures
Inefficient with small number of disks
Traditional RAID 5      Typical RAID 6
A1  A2  A3  Ap          A1  A2  A3  Ap  Aq
B1  B2  Bp  B3          B1  B2  Bp  Bq  B3
C1  Cp  C2  C3          C1  Cp  Cq  C2  C3
Dp  D1  D2  D3          Dp  Dq  D1  D2  D3
RAID 7
Storage Computer Corporation
Adds caching to RAID 3, 4
RAID 0+1, 01
A mirror of stripes
RAID 1 above several RAID 0 arrays
RAID 1
/--------------------------\
| |
RAID 0 RAID 0
/-----------------\ /-----------------\
| | | | | |
120 GB 120 GB 120 GB 120 GB 120 GB 120 GB
A1 A2 A3 A1 A2 A3
A4 A5 A6 A4 A5 A6
B1 B2 B3 B1 B2 B3
B4 B5 B6 B4 B5 B6
RAID 10
A stripe of mirrors
RAID 0 above several RAID 1 arrays
One drive from each RAID 1 set can fail
Fast write speeds
RAID 0
/-----------------------------------\
| | |
RAID 1 RAID 1 RAID 1
/--------\ /--------\ /--------\
| | | | | |
120 GB 120 GB 120 GB 120 GB 120 GB 120 GB
A1 A1 A2 A2 A3 A3
A4 A4 A5 A5 A6 A6
B1 B1 B2 B2 B3 B3
B4 B4 B5 B5 B6 B659 / 64
RAID 50, 5+0
RAID 0 above several RAID 5 arrays
Higher performance than RAID 5
RAID 0
/-------------------------------------------------\
| | |
RAID 5 RAID 5 RAID 5
/-----------------\ /-----------------\ /-----------------\
| | | | | | | | |
120 GB 120 GB 120 GB 120 GB 120 GB 120 GB 120 GB 120 GB 120 GB
A1 A2 Ap A3 A4 Ap A5 A6 Ap
B1 Bp B2 B3 Bp B4 B5 Bp B6
Cp C1 C2 Cp C3 C4 Cp C5 C6
D1 D2 Dp D3 D4 Dp D5 D6 Dp
RAID summary
Summary
Next: File systems
Sources
Course book
en.wikipedia.org