20061021 2 6_io_sched
Linux 2.6 Kernel I/O Scheduler: Introduction to the Concepts
from internet...
Content
● What is an I/O scheduler? (elevator)
● Linux I/O scheduler framework
● I/O scheduler
– Noop elevator
– Linus elevator
– Deadline I/O scheduler
– Anticipatory I/O scheduler
– Completely Fair Queuing
● Conclusion
What is an I/O scheduler?
● The I/O scheduler schedules the pending I/O requests in order to minimize the time spent moving the disk head. This, in turn, minimizes disk seek time and maximizes hard disk throughput.
● It also shares out and controls the bandwidth of the hard disk.
What is an I/O scheduler? (Cont.)
● Why do we need a scheduler?
– First of all, look inside a hard disk
What is an I/O scheduler? (Cont.)
● Why do we need a scheduler?
– How data is written and read
● The write head is an induction coil
● The read head is a magnetoresistive (MR) sensor
What is an I/O scheduler? (Cont.)
● Why do we need a scheduler?
– How data is addressed
● The head is moved to the cylinder/head/sector (C.H.S) position that corresponds to the LBA address
What is an I/O scheduler? (Cont.)
● Why do we need a scheduler?
– Seek time: time it takes to position the head at the track
– Rotational delay: time it takes for the beginning of the sector to reach the head
– Access time = seek time + rotational delay
– Transfer time: time required for sector data transfer
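As a rough worked example of the formula above, the following sketch computes access time for one request. The drive figures (7200 rpm, 9 ms average seek, 50 MB/s transfer rate) are illustrative assumptions, not numbers from the slides, and the average rotational delay is taken as half a revolution.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one full revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def access_time_ms(seek_ms, rpm):
    """Access time = seek time + rotational delay."""
    return seek_ms + avg_rotational_delay_ms(rpm)


def transfer_time_ms(nbytes, mb_per_s):
    """Time to move nbytes at a sustained rate given in MB/s (10**6 bytes/s)."""
    return nbytes / (mb_per_s * 1_000_000) * 1000.0


if __name__ == "__main__":
    # Assumed drive: 7200 rpm, 9 ms average seek, 50 MB/s sustained transfer.
    print(f"access time:   {access_time_ms(9.0, 7200):.2f} ms")
    print(f"4 KiB transfer: {transfer_time_ms(4096, 50):.3f} ms")
```

With these assumed figures, positioning the head costs far more than moving the data, which is why the scheduler concentrates on reducing seeks.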
What is an I/O scheduler? (Cont.)
● A disk seek is among the slowest operations in a computer
– A system would perform horribly without a suitable I/O scheduler
● The I/O scheduler arranges for the disk head to move in a single direction to minimize seeks
– Like the way an elevator moves between floors
– Achieves greater global throughput at the expense of fairness to some requests
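The single-direction sweep can be sketched as follows. This is a minimal illustration, not kernel code: requests are reduced to bare sector numbers, and the function names `elevator_order` and `head_travel` are made up for the example.

```python
def elevator_order(head, pending):
    """Serve requests ahead of the head in ascending order, then wrap
    around to the lowest pending sector (a one-way sweep)."""
    ahead = sorted(s for s in pending if s >= head)
    behind = sorted(s for s in pending if s < head)
    return ahead + behind


def head_travel(head, order):
    """Total distance the head moves when serving sectors in this order."""
    total = 0
    for sector in order:
        total += abs(sector - head)
        head = sector
    return total


if __name__ == "__main__":
    pending = [95, 10, 60, 20, 90]   # arrival (FIFO) order
    swept = elevator_order(50, pending)
    print(swept)                     # [60, 90, 95, 10, 20]
    print(head_travel(50, pending))  # 290 in arrival order
    print(head_travel(50, swept))    # 140 in elevator order
```

In this example the requests for sectors 10 and 20 arrive early but are served last, which is the fairness cost mentioned above.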
What is an I/O scheduler? (Cont.)
I/O scheduler's job
● Improve overall disk throughput by
– Reordering and sorting requests to reduce the disk seek time
– Merging requests to reduce the number of requests
● Prevent starvation
– Submit requests before their deadline
– Avoid reads being starved by writes
● Provide fairness among different processes
Linux I/O scheduler framework
● The Linux elevator is an abstraction layer to which different I/O schedulers can attach
● Merging mechanisms are provided by request queues
– Front or back merge of a request and a bio
– Merge two requests
● Sorting policy and merge decisions are made in elevators
– Pick a request to be merged with a bio
– Add a new request to the request queue
– Select the next request to be processed by block drivers
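Front and back merging can be illustrated with a small sketch. It assumes a request and a bio are each just a start sector plus a length; the `Request` class and `try_merge` function are hypothetical names for this example and do not mirror the kernel's structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    sector: int      # first sector covered
    nr_sectors: int  # number of contiguous sectors

    @property
    def end(self) -> int:
        """First sector after this request."""
        return self.sector + self.nr_sectors


def try_merge(req: Request, bio: Request) -> Optional[Request]:
    """Return the merged request, or None if the two are not adjacent."""
    if bio.sector == req.end:
        # Back merge: the bio starts exactly where the request ends.
        return Request(req.sector, req.nr_sectors + bio.nr_sectors)
    if bio.end == req.sector:
        # Front merge: the bio ends exactly where the request starts.
        return Request(bio.sector, req.nr_sectors + bio.nr_sectors)
    return None


if __name__ == "__main__":
    print(try_merge(Request(100, 8), Request(108, 8)))  # back merge
    print(try_merge(Request(100, 8), Request(92, 8)))   # front merge
    print(try_merge(Request(100, 8), Request(200, 8)))  # not adjacent
```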
Linux I/O scheduler framework
● Main idea of the Linux I/O scheduler framework
– Internal queue: used inside the I/O scheduler
– External queue: the queue visible to the device driver
Linux I/O scheduler
Noop elevator
● Suitable for truly random-access devices, such as a RAM disk
● Requests in the queue are kept in FIFO order
● Only the last request added to the request queue is tested for a possible merge
● For flash devices, some modification is needed to extend the life of the chip.
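The FIFO behaviour with a merge test against only the last request might look like the sketch below. It is an illustration under simplifying assumptions: requests are `(start sector, length)` tuples, only a back merge is attempted, and `NoopQueue` is a made-up name.

```python
from collections import deque


class NoopQueue:
    """FIFO request queue that tries to merge only with the newest entry."""

    def __init__(self):
        self._fifo = deque()  # entries are (start_sector, nr_sectors)

    def add(self, sector, nr_sectors):
        if self._fifo:
            last_sector, last_nr = self._fifo[-1]
            if last_sector + last_nr == sector:
                # Contiguous with the last request added: extend it.
                self._fifo[-1] = (last_sector, last_nr + nr_sectors)
                return
        self._fifo.append((sector, nr_sectors))

    def dispatch(self):
        """Hand the oldest request to the driver, or None if empty."""
        return self._fifo.popleft() if self._fifo else None


if __name__ == "__main__":
    q = NoopQueue()
    q.add(0, 8)
    q.add(8, 8)    # merged into (0, 16)
    q.add(100, 8)
    q.add(16, 8)   # adjacent to (0, 16), but that is no longer the last entry
    print(q.dispatch(), q.dispatch(), q.dispatch())
```

The last `add` shows the limitation: the request at sector 16 is contiguous with an older entry, but since only the newest entry is tested, it is queued separately.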
Linux I/O scheduler
Linus elevator
● The Linus Elevator functions almost exactly like the classic I/O scheduler (sort and merge).
● For the most part, this was great because simplicity is a good thing and the 2.4 kernel's I/O scheduler just worked.
● Unfortunately, in the I/O scheduler's quest to maximize global I/O throughput, a trade-off was made: local fairness, and in particular request latency, can easily go out the window.
Linux I/O scheduler
Deadline elevator
● The Deadline I/O Scheduler was introduced to solve the starvation issue surrounding the 2.4 I/O scheduler and traditional elevator algorithms in general.
Linux I/O scheduler
Deadline elevator (cont)
● Goal
– Reorder requests to improve I/O performance while simultaneously ensuring that no I/O request is starved
– Favor reads over writes
● Each request is associated with an expiration time
– Read: 500 ms, write: 5 sec
● Requests are inserted into
– A queue sorted by start sector (two queues: one for reads, one for writes)
– A FIFO list (two lists as well) sorted by expiration time
● Normally, requests are pulled from the sorted queues. However, if the request at the head of either FIFO list has expired, requests are still processed in sorted order, but starting from the first request in that FIFO list
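The interplay of the sorted queue and the FIFO list can be sketched for a single direction (reads only or writes only). This is a simplified illustration, not the kernel implementation: `DeadlineQueue` is a made-up name, requests are bare start sectors assumed to be unique, time is passed in explicitly in milliseconds, and read/write batching is left out.

```python
import bisect

READ_EXPIRE_MS = 500    # values quoted on the slide
WRITE_EXPIRE_MS = 5000


class DeadlineQueue:
    """One direction: a sector-sorted queue plus a FIFO ordered by deadline."""

    def __init__(self, expire_ms):
        self.expire_ms = expire_ms
        self.by_sector = []  # start sectors, ascending
        self.deadline = {}   # sector -> expiry time; dict order is FIFO order
        self.next_pos = 0    # where the current sweep continues

    def add(self, sector, now_ms):
        bisect.insort(self.by_sector, sector)
        self.deadline[sector] = now_ms + self.expire_ms

    def dispatch(self, now_ms):
        if not self.by_sector:
            return None
        oldest = next(iter(self.deadline))
        if self.deadline[oldest] <= now_ms:
            # Head of the FIFO has expired: restart the sweep there.
            sector = oldest
        else:
            # Continue the sorted sweep, wrapping to the lowest sector.
            i = bisect.bisect_left(self.by_sector, self.next_pos)
            sector = self.by_sector[i % len(self.by_sector)]
        self.by_sector.remove(sector)
        del self.deadline[sector]
        self.next_pos = sector + 1
        return sector


if __name__ == "__main__":
    q = DeadlineQueue(READ_EXPIRE_MS)
    for t, sector in [(0, 900), (10, 100), (20, 200), (30, 300)]:
        q.add(sector, now_ms=t)
    print([q.dispatch(t) for t in (100, 200, 600, 610)])  # [100, 200, 900, 300]
```

In the usage example, pure sector order would serve 900 last; because its deadline has passed by time 600, the sweep jumps to it before serving 300.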
Linux I/O scheduler
Deadline elevator (cont)
● The Deadline I/O Scheduler can enforce a soft deadline on I/O requests.
● Although it makes no promise that an I/O request is serviced before the expiration time, the I/O scheduler generally services requests near their expiration times.
● Deadline I/O Scheduler continues to provide good global throughput without starving any one request for an unacceptably long time. Because read requests are given short expiration times, the writes-starving-reads problem is minimized.
Linux I/O scheduler
Anticipatory elevator
● The deadline elevator works well, but it is not the best.
Linux I/O scheduler
Anticipatory elevator (Cont.)
● Key idea:
– Sometimes wait for the process whose request was serviced last.
– This keeps the disk idle for short intervals.
● But with informed decisions, this idea:
– Improves throughput
– Achieves desired proportions
● Balance the expected benefit of waiting against the cost of keeping the disk idle.
Linux I/O scheduler
Anticipatory elevator (Cont.)
● Based on the deadline I/O scheduler
● Suitable for desktops: good interactive performance
● Design shortcomings
– Assumes only 1 physical seeking head
● Bad for RAID devices
– Only 1 read request is dispatched to the disk controller at a time
● Bad for controllers that support TCQ (Tagged Command Queuing)
– Read anticipation assumes synchronous requests are issued by individual processes
● Bad for requests issued cooperatively by multiple processes
● Rough benefit-cost analysis
– Anticipate a better request if the mean think time of the process < 6 ms and the mean seek distance of the process < the seek distance of the next request
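The rough benefit-cost rule can be written as a single predicate. The sketch below only restates the two conditions from the slide; the function and parameter names are hypothetical, and how the per-process means are measured is outside its scope.

```python
ANTIC_THINKTIME_LIMIT_MS = 6.0  # threshold quoted on the slide


def should_anticipate(mean_thinktime_ms, mean_seek_distance,
                      next_request_seek_distance):
    """Wait for the last-serviced process only if it usually issues its
    next request quickly AND that request is usually closer to the head
    than the best request already queued."""
    return (mean_thinktime_ms < ANTIC_THINKTIME_LIMIT_MS
            and mean_seek_distance < next_request_seek_distance)


if __name__ == "__main__":
    # Sequential reader: short think time, short seeks. Worth waiting.
    print(should_anticipate(2.0, 64, 100_000))       # True
    # Slow process: think time too long. Do not idle the disk.
    print(should_anticipate(10.0, 64, 100_000))      # False
    # Random reader: its own seeks are longer than the queued request's.
    print(should_anticipate(2.0, 500_000, 100_000))  # False
```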
Linux I/O scheduler
Anticipatory elevator (Cont.)
● One-way elevator algorithm
– Limited backward seeks
● FIFO expiration times for reads and for writes
– When a request expires, interrupt the current elevator sweep
● Read and write request batching
– The scheduler alternates between dispatching read and write batches to the driver. The read (write) FIFO timeout values are tested only during read (write) batches.
● Read anticipation
– At the end of each read request, the I/O scheduler examines its next candidate read request from its sorted read list and decides whether to wait for a “better request”
Linux I/O scheduler
Anticipatory elevator (Cont.)
● Robert Love's testing results
I/O Scheduler and Kernel | Test 1 | Test 2
Linus Elevator on 2.4 | 45 seconds | 30 minutes, 28 seconds
Deadline I/O Scheduler on 2.6 | 40 seconds | 3 minutes, 30 seconds
Anticipatory I/O Scheduler on 2.6 | 4.6 seconds | 15 seconds
Test 1:
task 1
    while true
    do
        dd if=/dev/zero of=file bs=1M
    done
task 2
    time cat 200mb-file > /dev/null
Test 2:
task 1
    while true
    do
        cat big-file > /dev/null
    done
task 2
    time find . -type f -exec cat '{}' ';' > /dev/null
Linux I/O scheduler
CFQ elevator
CFQv2 (Completely Fair Queuing) I/O scheduler
● Goal
– Provide fair allocation of I/O bandwidth among all the initiators of I/O requests
● CFQ can be configured to provide fairness at per-process, per-process-group, per-user and per-user-group levels.
● Each initiator has its own request queue, and CFQ services these queues round-robin
– Data write-back is usually performed by the pdflush kernel threads. That means all data writes share the allotted I/O bandwidth of the pdflush threads
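Per-initiator queues serviced round-robin can be sketched as below. This is only an illustration of the idea: the class name `FairQueues` and the `quantum` parameter (how many requests a queue may dispatch per turn) are assumptions made for the example, not CFQ's actual tunables.

```python
from collections import deque


class FairQueues:
    """One request queue per initiator, serviced in round-robin turns."""

    def __init__(self, quantum=2):
        self.quantum = quantum  # assumed per-turn dispatch limit
        self.queues = {}        # initiator -> deque of requests

    def add(self, initiator, request):
        self.queues.setdefault(initiator, deque()).append(request)

    def dispatch_round(self):
        """Visit every initiator once, taking at most `quantum` requests."""
        out = []
        for initiator in list(self.queues):
            q = self.queues[initiator]
            for _ in range(min(self.quantum, len(q))):
                out.append((initiator, q.popleft()))
            if not q:
                del self.queues[initiator]  # drop empty queues
        return out


if __name__ == "__main__":
    fq = FairQueues(quantum=2)
    for i in range(1, 6):
        fq.add("A", f"a{i}")    # heavy initiator: five requests
    fq.add("B", "b1")           # light initiator: one request
    print(fq.dispatch_round())  # A gets two slots, then B is served
    print(fq.dispatch_round())
    print(fq.dispatch_round())
```

Initiator B's single request is served in the first round even though A queued five requests first, which is the fairness property CFQ aims for.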
Linux I/O scheduler
CFQ elevator (Cont.)
Red Hat says:
● The Completely Fair Queuing (CFQ) scheduler is the default algorithm in Red Hat Enterprise Linux 4.
● As the name implies, CFQ maintains a scalable per-process I/O queue and attempts to distribute the available I/O bandwidth equally among all I/O requests.
● CFQ is well suited for mid-to-large multi-processor systems and for systems which require balanced I/O performance over multiple LUNs and I/O controllers.
Conclusion
Every dog has its day.
● Every elevator has its advantages, even the NOOP elevator.
● Which elevator performs best depends on the application.
● Analyze the application and adopt the proper elevator, and you will achieve the best performance.
● Dynamic elevator selection is under discussion and development.
● The Linux kernel changes every day; follow the kernel mailing list for the latest news.
Thank you!
Q & A