Chapter 11.2 File System Implementation – Part 2
11.2/40 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts
Chapter 11: File System Implementation
Chapter 11.1
File-System Structure
File-System Implementation
Directory Implementation
Chapter 11.2
Allocation Methods
Chapter 11.3
Free-Space Management
Recovery
Log-Structured File System
11.4 Allocation Methods
An allocation method refers to how the disk blocks that store file data (records) are allocated to files.
There are three primary approaches:
Contiguous allocation
Linked allocation
Indexed allocation
Contiguous Allocation of Disk Space
Each file occupies a set of contiguous blocks on the disk.
Blocks have a linear ordering, so disk head movement (a disk seek) is only to the next sectors on the track, or to the next track within the cylinder, etc.
The number of disk seeks is therefore minimal, since all blocks are kept together.
The directory entry typically holds only the address of the first block and the number of blocks. This is all that is needed.
File access is very straightforward.
For sequential access, the file system keeps track of the last block referenced and can readily read the next block (see FCB format).
For random access to block i of a file that starts at block b, we can go directly to block b + i.
Biggest problem: file growth. Is totally new space required, or is there some other mechanism? (Discussed ahead.) Extents may help, but this is still a significant problem…
Let’s look and see what a file might look like…
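The b + i calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DirectoryEntry class and the physical_block function are hypothetical names, and logical block numbers are assumed to start at 0. The 'mail' values (start 19, six blocks) are the ones shown in the visual for this section.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    """Hypothetical directory entry for a contiguously allocated file."""
    name: str
    start: int   # disk address of the first block (b)
    length: int  # number of blocks in the file


def physical_block(entry: DirectoryEntry, i: int) -> int:
    """Return the disk block that holds logical block i (0-based) of the file."""
    if not 0 <= i < entry.length:
        raise IndexError(f"block {i} is outside file '{entry.name}'")
    return entry.start + i  # direct access: b + i, no chain to follow


mail = DirectoryEntry("mail", start=19, length=6)
print(physical_block(mail, 0))  # first block of the file
print(physical_block(mail, 5))  # last block of the file
```

Because the address is computed rather than looked up, the cost is the same for every block of the file.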
Contiguous Allocation of Disk Space - Visual
Can easily see starting block number and number of blocks for each file.
See ‘count’ starts at 0 on the disk.
‘Mail’ starts at block 19 for six blocks.
All allocations are contiguous!
Note: there are holes!
This is simplistic, however.
Contiguous Allocation of Disk Space
Finding Space – allocation schemes:
Both first fit and best fit work pretty well, with first fit generally a bit faster. (We will see how the system keeps track of available blocks ahead…)
Worst fit is undesirable in terms of both time and storage utilization.
All contiguous allocation schemes have external fragmentation issues. This could be a major or a minor problem in managing an overall disk resource.
Down side: Most installations schedule downtime during low system usage, when the disk can be compacted and the external fragments brought together.
Compaction can be done off-line – generally best. Users get a ‘warning’ of impending ‘non-availability’, like at 3am, etc.: save your files, the system will not be available for a while.
Disks can be ‘reorganized’ and garbage collected… We have ‘periodic maintenance’ and ‘system saves’ and compaction… More later…
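The three fit strategies differ only in which free hole they pick. The following is a rough sketch, not how any particular file system does it: the free list is assumed to be a plain list of (start, length) holes, and the function names and the sample holes are made up for illustration.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (start block, length in blocks)


def first_fit(holes: List[Hole], need: int) -> Optional[Hole]:
    """Return the first hole that is large enough (stops searching early)."""
    for hole in holes:
        if hole[1] >= need:
            return hole
    return None


def best_fit(holes: List[Hole], need: int) -> Optional[Hole]:
    """Return the smallest hole that is large enough (scans the whole list)."""
    fits = [h for h in holes if h[1] >= need]
    return min(fits, key=lambda h: h[1]) if fits else None


def worst_fit(holes: List[Hole], need: int) -> Optional[Hole]:
    """Return the largest hole, provided it is large enough."""
    fits = [h for h in holes if h[1] >= need]
    return max(fits, key=lambda h: h[1]) if fits else None


free_holes = [(2, 4), (8, 6), (17, 2), (25, 3)]  # made-up free list
print(first_fit(free_holes, 3))
print(best_fit(free_holes, 3))
print(worst_fit(free_holes, 3))
```

Whichever hole is chosen, the leftover part of it becomes a smaller hole, which is where external fragmentation comes from.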
Extent-Based Systems
How much space is needed for the file? Oftentimes we do not know!
Lots of times, files cannot be extended ‘in place.’ So, what to do?
Can take system offline, allocate more space; move the data, and then restart the system
Very costly in run time.
We often overestimate required space – can be very wasteful, especially if all the ‘required’ newly requested space is really not used / needed.
Can find a new, larger space, copy the file into the new space, and release the old space.
But this involves down time, possibly rerunning a process, and other management considerations.
Some systems use extent-based file systems and they allocate disk blocks in extents
An extent is a contiguous chunk of disk blocks
A file consists of a basic allocation plus one or more extents.
IBM uses a SPACE parameter: A process requests an original allocation of say 10 tracks and 2 possible extents of one track each. Ten are allocated and two are held in reserve and used if needed.
Extents are ‘linked in’ as needed.
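With extents, finding a block means walking a short list of contiguous runs instead of doing a single addition. A minimal sketch, assuming each file keeps an ordered list of (first block, block count) pairs; the locate function and the layout numbers are hypothetical, and the example counts blocks rather than the tracks used in the IBM example.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (first disk block, number of blocks)


def locate(extents: List[Extent], i: int) -> int:
    """Map logical block i (0-based) to a disk block by walking the extent list."""
    if i < 0:
        raise IndexError("logical block numbers start at 0")
    remaining = i
    for start, length in extents:
        if remaining < length:
            return start + remaining  # contiguous within this extent
        remaining -= length           # skip this extent entirely
    raise IndexError(f"block {i} is beyond the end of the file")


# Made-up layout: a 10-block basic allocation plus two 1-block extents.
file_extents = [(100, 10), (250, 1), (37, 1)]
print(locate(file_extents, 9))   # still inside the basic allocation
print(locate(file_extents, 10))  # first extent
print(locate(file_extents, 11))  # second extent
```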
Linked Allocation of Disk Space
With linked allocation, we no longer have the problems of the contiguous allocation scheme.
Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk.
The directory points to the first block, and each block points to the next block. (Of course, the links take up some of the space in each block.)
For a new file: create a new entry in the directory – no final size is needed. The pointer is set to null, and each request for space requires the space management system to find a free block and link it in.
No external fragmentation, and the file can grow. The disk need not be compacted because of this kind of allocation.
Major disadvantage: it is effective only for sequential access. To reach a given block we must follow the pointers from the start of the file until we find it, so it is not efficient if we need a direct-access capability.
Also, pointers do take up some space, if one adds them up!
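Why direct access is slow here can be seen by counting reads. The sketch below models the disk as a Python dictionary, which is purely an assumption for illustration; the read_block function and the block numbers are made up.

```python
from typing import Dict, Optional, Tuple

# Hypothetical disk: block number -> (data, number of the next block or None).
Disk = Dict[int, Tuple[str, Optional[int]]]


def read_block(disk: Disk, first: int, i: int) -> Tuple[str, int]:
    """Return (data, disk reads used) for logical block i (0-based)."""
    if i < 0:
        raise IndexError("logical block numbers start at 0")
    current: Optional[int] = first
    reads = 0
    while current is not None:
        data, next_block = disk[current]  # one disk read per link followed
        reads += 1
        if reads == i + 1:
            return data, reads
        current = next_block
    raise IndexError(f"block {i} is beyond the end of the file")


# Made-up file whose blocks are scattered over the disk.
disk: Disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 10),
    10: ("D", 25),
    25: ("E", None),  # a null pointer marks the last block
}
print(read_block(disk, first=9, i=0))
print(read_block(disk, first=9, i=3))
```

Reaching logical block i costs i + 1 reads, whereas contiguous allocation needs one.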
Linked Allocation of Disk Space - Clusters
Lots of times clusters of blocks are allocated rather than single blocks.
If so, the pointers occupy much less space, and efficiency is improved because the blocks of a cluster are located in contiguous locations.
But, of course, this increases internal fragmentation, since more space is wasted when a cluster is only partly full.
Clusters are nevertheless used in most systems.
There are inherent dangers in linked allocation, chiefly dropping (losing or damaging) a pointer:
Could link into a protected area
Could link into some other file
Could simply lose your data!!!
Potential Solution 1 – often used: have a doubly-linked list.
Potential Solution 2 – store the file name and relative block number in each block – but this requires more space! And these links add up!
So there are issues with linked allocation. Let’s see what linked allocation looks like….
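The space saved by clustering is easy to quantify. The figures used here (512-byte blocks, 4-byte pointers, 4-block clusters) are assumptions chosen for illustration, and pointer_overhead is a hypothetical helper.

```python
def pointer_overhead(block_size: int, pointer_size: int,
                     blocks_per_cluster: int = 1) -> float:
    """Fraction of each allocation unit used up by its single 'next' pointer."""
    return pointer_size / (block_size * blocks_per_cluster)


# One pointer per block versus one pointer per 4-block cluster.
print(f"{pointer_overhead(512, 4):.2%}")
print(f"{pointer_overhead(512, 4, blocks_per_cluster=4):.2%}")
```

The overhead falls in proportion to the cluster size, at the price of more unused space in each file's last cluster.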
Linked Allocation - Visual
Note: Starting location only is stored in the directory.
All else is linked!
Why might you think that, in addition to the starting link, the last link is also stored in the directory?
Linked Allocation with File Allocation Table
Many disks use a FAT (File Allocation Table), which is a data structure on disk and located at the beginning of each volume.
The directory has one entry per file, and this entry points into the FAT for a particular file reference.
(The FAT is indexed by block number)
Each FAT entry contains the block number of the ‘next’ block in the file, so the chain can be followed within the table itself rather than by reading the data blocks.
Final block in the table has a special end of file mark. (See next slide)
Remember: linked allocation only permits sequential access!
Unused blocks in the FAT have a 0 table value.
When more space is needed for the linked file, the file management system finds an available block (value 0 in the FAT), replaces the old last block’s EOF value with that block number, and marks the new block as EOF. (simply a singly-linked list…)
Downside: This scheme may result in a lot of disk head movement (to the FAT, then to the data block), which definitely slows things down.
Solution: Cache the FAT for sure.
Advantage: random-access is greatly improved because any block can be accessed via the FAT access, particularly if the FAT is in cache, if we know the block number.
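The FAT operations described above can be sketched as follows. The table is modelled as a Python list indexed by block number, with 0 for an unused block as in the text; the EOF value of -1, the function names, and the block numbers are assumptions made for this example.

```python
from typing import List

FREE = 0   # unused block, as in the text
EOF = -1   # hypothetical end-of-file mark


def chain(fat: List[int], start: int) -> List[int]:
    """Follow the links in the table and return the file's blocks in order."""
    blocks = []
    current = start
    while current != EOF:
        if current == FREE or fat[current] == FREE:
            raise ValueError("chain runs into a free block")
        blocks.append(current)
        current = fat[current]
    return blocks


def append_block(fat: List[int], start: int) -> int:
    """Link the first free block onto the end of the file; return its number."""
    last = chain(fat, start)[-1]
    # Block 0 is skipped: 0 doubles as the FREE mark, so it cannot be a link target.
    new_block = next((b for b in range(1, len(fat)) if fat[b] == FREE), None)
    if new_block is None:
        raise RuntimeError("no free blocks left")
    fat[last] = new_block  # overwrite the old EOF mark
    fat[new_block] = EOF   # the new block becomes the end of the file
    return new_block


fat = [FREE] * 12
fat[4], fat[7], fat[2] = 7, 2, EOF  # made-up file: 4 -> 7 -> 2
print(chain(fat, 4))
new = append_block(fat, 4)
print(new, chain(fat, 4))
```

If the whole table is cached in memory, chain() touches no data blocks at all, which is why the FAT helps random access.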
File-Allocation Table - Visual
Indexed by block number.
Indexed Allocation of Disk Space
In linked allocation, we
don’t have the external fragmentation problem and we
don’t have the size declaration problem, but
we also do not have direct access capability without the FAT because the pointers to the blocks are within the blocks and hence must be retrieved.
Indexed Allocation brings all pointers (links) together into the index block.
Each file has its own index built as an array of block addresses.
To access block i of the file, we use the index:
the i-th entry of the index block
holds the disk location of that block.
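A minimal sketch of a per-file index block, assuming it is a fixed-size array in which entry i holds the disk address of the file's block i and unused entries are empty. The names lookup and append_entry and the block numbers are hypothetical.

```python
from typing import List, Optional

IndexBlock = List[Optional[int]]  # entry i -> disk address of file block i


def lookup(index: IndexBlock, i: int) -> int:
    """Return the disk address of logical block i (0-based)."""
    if not 0 <= i < len(index) or index[i] is None:
        raise IndexError(f"file has no block {i}")
    return index[i]


def append_entry(index: IndexBlock, free_block: int) -> int:
    """Record any free disk block in the first unused slot; return the slot."""
    for slot, address in enumerate(index):
        if address is None:
            index[slot] = free_block
            return slot
    raise OverflowError("index block is full")


# Made-up index block with room for eight entries, five of them in use.
index_block: IndexBlock = [9, 16, 1, 10, 25, None, None, None]
print(lookup(index_block, 2))
print(append_entry(index_block, 30))
```

Every lookup costs the same, which gives direct access, and any free block can be appended, so there is no external fragmentation.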
Indexed Allocation of Disk Space
Indexed allocation supports direct access without external fragmentation. Any free block will suffice when a block needs to be added to the file.
Pointer overhead is generally greater than with linked allocation because an entire index block must be allocated for each file, even if only a few of its entries are used.
This index itself will occupy at least one block of disk storage. (Of course, it can be cached during use – and generally is.)
So how large should the index block be?
Want it to be small, since every indexed file will have one, but we want a sufficient number of entries to support large file access.
If it is too small for a large file, we might need to link several index blocks.
Several implementations of this, as we shall see.
Example of Indexed Allocation - Visual
Shows records in block 19 as well as unused space…
Structure of the Index Block
Linked scheme: an index block is usually one disk block long, but we can link several index blocks together for particularly large files.
Multilevel index: First index block may only be a set of pointers to a second level index block. These in turn point to the data blocks.
IBM uses this organization for its indexed sequential files, which it calls Key Sequenced Data Sets (KSDS).
It calls the outermost block the index set, followed by the sequence set followed by the data themselves organized into what they call control areas and control intervals…
Note: with 4 KB blocks and 4-byte pointers, a two-level index would allow a file size of up to 4 GB.
Combined Scheme: (used by Unix) keeps the first set of pointers of the index block in the file’s inode
This scheme involves a number of direct and indirect blocks and we will not spend time on this one.
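The 4 GB figure for a two-level index can be checked with a little arithmetic. The sketch assumes 4-byte block pointers (the pointer size is not given on the slide) and index blocks filled entirely with pointers; max_file_size is a hypothetical helper.

```python
def max_file_size(block_size: int, pointer_size: int, levels: int) -> int:
    """Largest file, in bytes, addressable through `levels` levels of index blocks."""
    pointers_per_block = block_size // pointer_size
    data_blocks = pointers_per_block ** levels
    return data_blocks * block_size


GIB = 2 ** 30
# 4096 / 4 = 1024 pointers per index block, so two levels reach
# 1024 * 1024 data blocks of 4096 bytes each.
print(max_file_size(4096, 4, levels=2) // GIB, "GB")
```

Each extra level multiplies the reachable size by the number of pointers per block, at the cost of one more index read per access when the index is not cached.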
Indexed Allocation – Mapping (Cont.)
[Figure: outer index, index table, file]
General mappings with multiple indices.
Some systems have ‘coarse’ indices followed by ‘fine’ indices, etc….
[Figure: structure of a KSDS. The index component is made up of the index set and, below it, the sequence set; the data component is made up of control areas, each divided into control intervals.]
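[Figure: KSDS example. Index set records I1 and I2 sit above sequence set records S1, S2 and S3, which sit above data control intervals D1 to D4 (grouped into control areas) holding key values 1, 3, 5, 9, 35, 36, 42, 43 and 62 plus free space. Index entries pair a high key with a pointer: 9/S1 and 62/S2 in the index set; 3/D1, 9/D2, 36/D3 and 62/D4 in the sequence set. Key values extremely exaggerated!]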
Performance
The choice of an allocation method is largely dependent upon how the data needs to be accessed.
Contiguous Allocation – requires only one access to get to the data block.
Keep initial address in memory and calculate disk addresses from there.
Linked Allocation – keep the address of the next block in memory and can read it directly.
Major disadvantage – no random access, and access to a specific block might well require multiple reads to get ‘to’ that record.
Some systems support direct-access files by using contiguous allocation and sequential-access files by using linked allocation.
The type of access must be declared when the file is created.
Sequential files will be linked
Direct access files will be contiguous and can support both direct access and sequential access, such as indexed sequential file organizations.
Performance - 2
Indexed Allocation – If the index is in memory, accesses are quick.
Retaining the index in memory does require space, but it is often held in cache.
If space is available, then this is good.
If space is not available, then reading the index and then the data requires two I/Os – and this is not desirable.
For multiple index blocks, more reads might be needed.
Performance using indexed allocation depends on the index structure, the size of the file, and the position of the block desired.
Caching the index file(s) is significantly helpful if space is available.
There are a number of other approaches to optimization. Your book notes that, because disks are so much slower than the CPU, it is often not unreasonable to add thousands of extra instructions to the operating system to save just a few disk-head movements.
“Furthermore, this disparity is increasing over time, to the point where hundreds of thousands of instructions reasonably could be used to optimize head movements.” Discuss.
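As a starting point for that discussion, the access costs described on the two Performance slides can be put side by side. This is a simplified model, not a measurement: it counts disk reads only, assumes the directory entry is already in memory, and disk_reads is a hypothetical helper.

```python
def disk_reads(method: str, i: int, index_levels: int = 1,
               index_cached: bool = False) -> int:
    """Disk reads needed to fetch logical block i (0-based) of a file."""
    if method == "contiguous":
        return 1                       # address is computed as b + i
    if method == "linked":
        return i + 1                   # follow i links, then read the block
    if method == "indexed":
        # Uncached: one read per index level, then the data block itself.
        return 1 if index_cached else index_levels + 1
    raise ValueError(f"unknown method: {method}")


for method in ("contiguous", "linked", "indexed"):
    print(method, disk_reads(method, i=99))
```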
End of Chapter 11.2