unix ch03-03
DESCRIPTION
cvbTRANSCRIPT
Chap 3. The Buffer Cache
1. Buffer Headers2. Structure of the Buffer Pool3. Scenarios for Retrieval of a Buffer4. Reading and Writing Disk Blocks5. Advantages and Disadvantages
of the Buffer Cache
Why Buffer cacheTo reduce the frequency of Disk
AccessSlow disk access time
libraries
System call interface
File system
Buffer cachecharacter block
Device driver
ProcessControlSubsystem
ipcschedular
mm
Hardware control
hardware
User programUser level
Kernel level
Hardware level
Figure 2.1 block diagram of the System Kernel
Buffer HeadersBuffer
Memory array• Contains data from disk
Buffer header• Identifies the buffer• Has
– Device number» Logical file system num not a physical device
num– Block number– Pointer to a data array for the buffer
Memory array : Buffer header = 1 : 1
Buffer headerBuffer header
Dev num
Block num
status
ptr to data area
ptr to next buf on the hash queue
ptr to next buf on free list
ptr to previous bufon free list
ptr to previous bufon hash queue
States of bufferLockedValidDelayed writeReading or writingWaiting for free
Figure 3.1 buffer header
Structure of the Buffer PoolFree List
Doubly linked circular list Every buffer is put on the free list when boot When insert
• Push the buffer in the head of the list ( error case )• Push the buffer in the tail of the list ( usually )
When take• Choose first element
Hash Queue Separate queues : each Doubly linked list When insert
• Hashed as a function device num, block num
Scenarios for Retrieval of a BufferTo allocate a buffer for a disk block
Use getblk()• Has Five scenarios
algorithm getblkInput: file system number block numberOutput: locked buffer that can now be used for block{ while(buffer not found) { if(block in hash queue){ /*scenario 5*/ if(buffer busy){ sleep(event buffer becomes free) continue; } mark buffer busy; /*scenario 1*/ remove buffer from free list; return buffer; }
else{ if ( there are no buffers on free list) { /*scenario 4*/ sleep(event any buffer becomes free) continue; } remove buffer from free list; if(buffer marked for delayed write) { /*scenario 3*/ asynchronous write buffer to disk; continue; } /*scenario 2*/ remove buffer from old hash queue; put buffer onto new hash queue; return buffer; } }}
1st ScenarioBlock is in hash queue, not busy
Choose that buffer
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99
(a) Search for block 4 on first hash queue
1st ScenarioAfter allocating
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99
(b) Remove block 4 from free list
2nd ScenarioNot in the hash queue and exist free buff.
Choose one buffer in front of free list
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99
(a) Search for block 18 – Not in the cache
2nd ScenarioAfter allocating
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10 18
35 99
(b) Remove first block from free list, assign to 18
3rd ScenarioNot in the hash queue and there exists delayed wri
te buffer in the front of free list Write delayed buffer async. and choose next
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99
delay
delay
(a) Search for block 18, delayed write blocks on free list
3rd ScenarioAfter allocating
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28
18
64
17 5 97
98 50 10
3 35 99
writing
writing
(b) Writing block 3, 5, reassign 4 to 18
4th ScenarioNot in the hash queue and no free
buffer Wait until any buffer free and re-do
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99
(a) Search for block 18, empty free list
4th ScenarioProcess A Process B
Cannot find block b on the hash queue
No buffer on free list
Sleep Cannot find block b on hash queue
No buffers on free list
Sleep
Takes buffer from free list Assign to block b
Somebody frees a buffer: brelse
Figure 3.10 Race for free buffer
4th ScenarioWhat to remind
When process release a buffer wake all process waiting any buffer cache
5th ScenarioBlock is in hash queue, but busy
Wait until usable and re-do
Blkno 0 mod 4
Freelist header
Blkno 1 mod 4
Blkno 2 mod 4
Blkno 3 mod 4
28 4 64
17 5 97
98 50 10
3 35 99busy
(a) Search for block 99, block busy
5th ScenarioProcess A Process B
Allocate buffer to block b Lock buffer Initiate I/O Sleep until I/O done
brelse(): wake up others
Find block b on hash queue Buffer locked, sleep
Buffer does not contain block b start search again
Process C
I/O done, wake up
time
Sleep waiting for any free buffer ( scenario 4 )
Get buffer previously assigned to block breassign buffer to buffer b’
Reading Disk BlocksUse bread()If the disk block is in buffer cache
How to know the block is in buffer cache• Use getblk()
Return the data without disk accessElse
Calls disk driver to schedule a read request Sleep When I/O complete, disk controller interrupts the Proce
ss Disk interrupt handler awakens the sleeping process Now process can use wanted data
Reading Disk Blocks
Algorithm breadInput: file system block numberOutput: buffer containing data{ get buffer for block(algorithm getblk); if(buffer data valid) return buffer; initiate disk read; sleep(event disk read complete); return(buffer);}
Algorithm breadInput: file system block numberOutput: buffer containing data{ get buffer for block(algorithm getblk); if(buffer data valid) return buffer; initiate disk read; sleep(event disk read complete); return(buffer);}
algorithmalgorithm
Reading Disk Blocks
Block device fileBlock device file
High-level blockDevice handler
High-level blockDevice handler
Buffer cacheBuffer cache Low-level blockDevice handler
Low-level blockDevice handler DISK
RAM
block_read()block_write()bread()breada()
getblk() ll_rw_block()
In linuxIn linux
Figure 13-3 block device handler architecture for buffer I/O operation in Understanding the Linux Kernel
Reading Disk BlocksRead Ahead
Improving performance• Read additional block before request
Use breada()
Algorithm breadaInput: (1) file system block number for immediate read (2) file system block number for asynchronous readOutput: buffer containing data for immediate read{ if (first block not in cache){ get buffer for first block(algorithm getblk); if(buffer data not valid) initiate disk read; }
Algorithm breadaInput: (1) file system block number for immediate read (2) file system block number for asynchronous readOutput: buffer containing data for immediate read{ if (first block not in cache){ get buffer for first block(algorithm getblk); if(buffer data not valid) initiate disk read; }
AlgorithmAlgorithm
Reading disk Block
if second block not in cache){ get buffer for second block(algorithm getblk); if(buffer data valid) release buffer(algorithm brelse); else initiate disk read; } if(first block was originally in cache) { read first block(algorithm bread) return buffer; } sleep(event first buffer contains valid data); return buffer;}
if second block not in cache){ get buffer for second block(algorithm getblk); if(buffer data valid) release buffer(algorithm brelse); else initiate disk read; } if(first block was originally in cache) { read first block(algorithm bread) return buffer; } sleep(event first buffer contains valid data); return buffer;}
Algorithm-contAlgorithm-cont
Synchronous write the calling process goes the sleep awaiting I/O completion and rele
ases the buffer when awakens. Asynchronous write
the kernel starts the disk write. The kernel release the buffer when the I/O completes
Delayed write The kernel put off the physical write to disk until buffer reallocated Look Scenario 3
Relese Use brelse()
Writing disk Block
Writing Disk Block
Algorithm bwriteInput: bufferOutput: none{ Initiate disk write; if (I/O synchronous){ sleep(event I/O complete); relese buffer(algorithm brelse); } else if (buffer marked for delayed write) mark buffer to put at head of free list;}
Algorithm bwriteInput: bufferOutput: none{ Initiate disk write; if (I/O synchronous){ sleep(event I/O complete); relese buffer(algorithm brelse); } else if (buffer marked for delayed write) mark buffer to put at head of free list;}
algorithmalgorithm
Release Disk Block
Algorithm brelseInput: locked bufferOutput: none{ wakeup all process; event, waiting for any buffer to become free; wakeup all process; event, waiting for this buffer to become free; raise processor execution level to allow interrupts; if( buffer contents valid and buffer not old) enqueue buffer at end of free list; else enqueue buffer at beginning of free list lower processor execution level to allow interrupts; unlock(buffer);}
Algorithm brelseInput: locked bufferOutput: none{ wakeup all process; event, waiting for any buffer to become free; wakeup all process; event, waiting for this buffer to become free; raise processor execution level to allow interrupts; if( buffer contents valid and buffer not old) enqueue buffer at end of free list; else enqueue buffer at beginning of free list lower processor execution level to allow interrupts; unlock(buffer);}
algorithmalgorithm
Advantages and Disadvantages Advantages
Allows uniform disk access Eliminates the need for special alignment of user buffers
• by copying data from user buffers to system buffers, Reduce the amount of disk traffic
• less disk access Insure file system integrity
• one disk block is in only one buffer
Disadvantages Can be vulnerable to crashes
• When delayed write requires an extra data copy
• When reading and writing to and from user processes
What happen to buffer until now
Allocated buffer
Mark busy
release
Using getblk() – 5 scenarios
Preserving integrity
Using brelse algorithm
manipulate Using bread, breada, bwrite
Reference LINUX KERNEL INTERNALS
Beck, Bohme, Dziadzka, Kunitz, Magnus, Verworner The Design of the Unix operating system
Maurice j.bach Understanding the LINUX KERNEL
Bovet, cesati In linux
Buffer_head : include/linux/fs.h Bread : fs/buffer.c Brelse : include/linux/fs.h