file management introduction to operating systems: module 12
TRANSCRIPT
File Management
Introduction to Operating Systems: Module 12
File-system implementation
File-system structure Allocation methods Free-space management Directory implementation Efficiency and performance
File-system structure What are files?
Fundamental abstraction for organizing information on secondary storage
Named collection of information File system resides mainly on disk
Exceptions exist: /proc, /dev File system organized into a tree-like structure File manager administers the collection by:
Storing the information on a device Mapping the block storage to the logical view Allocating/de-allocating storage Providing file directories
Records
Applications
Structured Record Files
Record-Stream Translation
Stream-Block Translation
Byte Stream Files
Storage device
Information structure View of information varies based
on needs and requirements Applications need structured
perspective Via programming language primitives Or provided by OS
Translation to byte-streams Facilitate storage to media Bi-directional
Still must organize blocks on storage devices
Where does object serialization fit in?
Record-oriented files Manage & store set of records on list For example, mail messages are records with
Header, sender, subject, receiver, body Manipulated by editor, mailer (sender and receiver), browser, etc.
Structured sequential file is a named sequence of logical records indexed by non-negative numbers. Random access is only possible if records have fixed length
Operations include: Fileid=open(filename) close(fileid) Getrecord(fileid, record) putrecord(fileid, record) Seek(fileid, position)
File
Record-oriented files
A disk contains many files Files are sets of records Records are sets of fields Fields are variable types
Disk
File
File
File
Record
Record
Record
Record
Record
Record
RecordField 1
Field 2
Field 3
Field 4
Field 5
Stream-Block Translation
b0 b1 b2 bi......
Low level files
Byte-Streams Translated to Block Streams
Byte-Stream is a Named Sequence of Bytes of Unsigned Integers
File Pointer Maintains Position within Stream
After open operation, File Pointer is at the beginning
Byte stream file interface
Operations Fileid = open(filename) Close(fileid) Read(fileid, buffer, length) Write(fileid, buffer, length) Seek(fileid, fileposition)
File manager maps filenames to a collection of physical blocks on storage device Device drivers to read/write blocks Write operation allocates unused blocks Delete operation deallocate blocks Approach taken part of file management implementation strategy
Implementing low level files
Secondary storage device contains: Volume directory (sometimes a root directory for a file
system) External file descriptor for each file Contents of files
Manages blocks Assigns blocks to files (descriptor indexed) Keeps track of available blocks
Maps file to/from device byte stream
An open operation
Multi-step process Locate the external file descriptor Extract information needed to read/write file Authenticate that the process can access the file Create internal file descriptor in primary memory Create entry in a “per process” open file status table Allocate resources (buffers) for file usage
Close operation completes all pending operations, releases I/O buffers, frees locks, updates external file descriptor, deallocates file status table entry
fid = open(“fileA”, flags);…read(fid, buffer, len);
0 stdin1 stdout2 stderr3 ...
Descriptor Table
File structure
inode
Opening a UNIX file
Changes to in-memory inode occur when inode copied back to secondary storage either periodically or when file is closed
A typical file control block
In-memory file system structures
Block management
Assigning storage blocks to the file Fixed sized, k, blocks File of length m requires N = m/k blocks Byte bi is stored in block i/k File manager has three basic strategies for
allocating and managing blocks: Contiguous allocation Linked lists Indexed allocation
Contiguous allocation
Each file occupies contiguous blocks on the disk Simple – only starting location (block #) and length
(number of blocks) are required Random access is possible Wasteful of space
dynamic storage-allocation problem Files cannot grow
Contiguous allocation of disk space
Extent-based systems
Many newer file systems (I.E. Veritas file system) use a modified contiguous allocation scheme
Extent-based file systems allocate blocks in extents An extent is a contiguous set of disk blocks. A file
consists of one or more extents
Linked allocation Each file is a linked list of disk
blocks: blocks may be scattered anywhere on the disk.
Simple need only starting address
Free-space management system no waste of space
No random access Mapping is O(n)
pointerblock =
Linked allocation
File-allocation table FAT
Modified linked allocation
All the pointers are stored together The resulting structure
might fit in memory
Used for floppies
Indexed allocation
Brings all pointers together into the index block. Logical view.
index table
Example of indexed allocation
Indexed allocation
Need index table Random access Dynamic access without external fragmentation, but have
overhead of index block. Mapping from logical to physical in a file of maximum
size of 256k words and block size of 512 words. We need only 1 block for index table
Mapping from logical to physical in a file of unbounded length (block size of 512 words) Linked scheme – link blocks of index table (no limit on size)
Two-level index
INODE
outer-index
index table file
Combined scheme: UNIX (4k/block)
Free-space management
Bit vector (n blocks) Bit map requires extra space. Example:
Block size = 212 bytesDisk size = 230 bytes (1 gigabyte)N = 230/212 = 218 bits (or 32K bytes)
Easy to get contiguous files Linked list (free list)
Cannot get contiguous space easily No waste of space
Free-space management Need to protect:
Pointer to free list Bit map
Must be kept on disk Copy in memory and disk may differ Cannot allow for block[i] to have a situation where bit[i] = 1 in
memory and bit[i] = 0 on disk Solution:
Set bit[i] = 1 in disk Allocate block[i] Set bit[i] = 1 in memory
Linked free space list on disk
Efficiency and performance Efficiency dependent on:
Disk allocation and directory algorithms Types of data kept in file’s directory entry
Performance Disk cache
Separate section of main memory for frequently used blocks Free-behind and read-ahead
Techniques to optimize sequential access Improve PC performance by dedicating section of memory as virtual disk, or RAM
disk.
Various disk-caching locations
Managing the byte stream
Implementation of byte stream on top of contiguous set of blocks
Packing and unpacking blocks Must read-ahead on input (convert secondary storage blocks to
byte stream - unpack) Must write-behind on output (convert byte strings to secondary
storage blocks - pack) Block I/O
Buffer several blocks Memory mapped files
Directories
A set of logically associated files and sub directories File manager provides set of controls:
Enumerate: return file list and nested directories Copy: duplicate a file Rename: change symbolic name of file Delete: remove file from directory
Release all blocks and file descriptor
Traverse: maintain at least hierarchical structure and ability to navigate
Unix supports graphs (through symbolic links)
Directory structures
How should files be organized within directory? Flat name space: all files appear in a single directory Hierarchical name space
Directory contains files and subdirectories Each file/directory appears as an entry in exactly one other
directory -- a tree Popular variant: all directories form a tree, but a file can have
multiple parents
Dominance of directory structure in other tools Email folders in Netscape, Outlook Organizing designs in Rational Rose
Directory implementation
Device directory A device can contain a collection of files Easier to manage if there is a single root for all files: the device
root directory
File directory Typical implementations have directories implemented as a file
with a special format Entries are pointers to other files or subdirectories Symbolic linkage is possible (graphs)
Visibility of directory structure controlled by protection and security mechanism
Directory implementation
Linear list of file names with pointer to the data blocks Simple to program Time-consuming to execute
Hash table – linear list with hash data structure Decreases directory search time Collisions – situations where two file names hash to the
same location Fixed size
/
bin usr etc foo
bill nutt
abc
bar
blah
cde xyz
UNIX mount commandMount bar at foo