BIL 244 – System Programming, Lecture 5 (Chapter 5 of Robbins Book): Files and Directories

Upload: elemaniaq

Post on 30-May-2018


8/14/2019 BIL244 Lecture05 Files&Drirectories

http://slidepdf.com/reader/full/bil244-lecture05-filesdrirectories 1/30

BIL 244 – System Programming 

Lecture 5: Chapter 5 of Robbins Book

Files and Directories


UNIX File System Navigation

• Operating systems organize physical disks into file systems to provide high-level logical access to the actual bytes of a file.

• A file system is a collection of files and attributes such as location and name. Instead of specifying the physical location of a file on disk, an application specifies a filename and an offset. The operating system translates these to the physical location of the file through its file systems.

• A directory is a file containing directory entries that associate a filename with the physical location of a file on disk.

• When disks were small, a simple table of filenames and their positions was a sufficient representation for the directory. Larger disks require a more flexible organization, and most file systems organize their directories in a tree structure. This representation arises quite naturally when the directories themselves are files.


UNIX File System Navigation

• The absolute or fully qualified pathname specifies all of the nodes in the file system tree on the path from the root to the file itself. The absolute path starts with a slash (/) to designate the root node and then lists the names of the nodes down the path to the file within the file system tree.


The current working directory

• A program does not always have to specify files by fully qualified pathnames. At any time, each process has an associated directory, called the current working directory, that it uses for pathname resolution.

• If a pathname does not start with /, the system resolves it as if the fully qualified path of the current working directory were prepended. Hence, pathnames that do not begin with / are sometimes called relative pathnames because they are specified relative to the fully qualified pathname of the current directory.

• A dot (.) specifies the current directory, and a dot-dot (..) specifies the directory above the current directory.

• The root directory has both dot and dot-dot pointing to itself.
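
The statement about dot-dot in the root directory can be checked from a program. The following is a minimal sketch, not part of the original slides: it assumes the stat function (covered later in this lecture) and a hypothetical helper name, samefile. Two pathnames are treated as the same file when their device ID and file serial number match.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both pathnames resolve to the same file
   (same device ID and file serial number), 0 if they do not, -1 on error. */
int samefile(const char *path1, const char *path2) {
   struct stat s1, s2;

   if (stat(path1, &s1) == -1 || stat(path2, &s2) == -1)
      return -1;
   return (s1.st_dev == s2.st_dev) && (s1.st_ino == s2.st_ino);
}
```

Under these assumptions, samefile("/", "/..") should return 1, as should samefile(".", "./.").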


The current working directory

• The PWD environment variable specifies the current working directory of a process. Do not directly change this variable, but rather use the getcwd function to retrieve the current working directory and use the chdir function to change the current working directory within a process.

• The chdir function causes the directory specified by path to become the current working directory for the calling process.

• The getcwd function returns the pathname of the current working directory. The buf parameter of getcwd represents a user-supplied buffer for holding the pathname of the current working directory. The size parameter specifies the maximum length pathname that buf can accommodate, including the trailing string terminator.

• If successful, chdir returns 0. If unsuccessful, chdir returns –1 and sets errno. If successful, getcwd returns a pointer to buf. If unsuccessful, getcwd returns NULL and sets errno.

#include <unistd.h>

int chdir(const char *path);

char *getcwd(char *buf, size_t size);
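
As a rough illustration of how the two calls fit together, the sketch below changes the working directory and then asks the system where the process ended up. The helper name changedir is hypothetical, and the caller supplies the buffer, as the slides recommend.

```c
#include <unistd.h>

/* Hypothetical helper: make newdir the current working directory and store
   its fully qualified pathname in buf (of size bytes).
   Returns 0 on success, -1 on failure with errno set by chdir or getcwd. */
int changedir(const char *newdir, char *buf, size_t size) {
   if (chdir(newdir) == -1)
      return -1;
   if (getcwd(buf, size) == NULL)
      return -1;
   return 0;
}
```

The change affects only the calling process. A shell that runs such a program keeps its own working directory.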


The current working directory

• If buf is not NULL, getcwd copies the name into buf. If buf is NULL, POSIX states that the behavior of getcwd is unspecified.

• In some implementations, getcwd uses malloc to create a buffer to hold the pathname. (Do not rely on this behavior!)

• You should always supply getcwd with a buffer large enough to fit a string containing the pathname.

• The PATH_MAX constant may or may not be defined in limits.h. The optional POSIX constants can be omitted from limits.h if their values are indeterminate but larger than the required POSIX minimum. For PATH_MAX, the _POSIX_PATH_MAX constant specifies that an implementation must accommodate pathname lengths of at least 255.

• A vendor might allow PATH_MAX to depend on the amount of available memory space on a specific instance of a specific implementation.


A complete program to output the current working directory

#include <limits.h>
#include <stdio.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 255
#endif

int main(void) {
   char mycwd[PATH_MAX];

   if (getcwd(mycwd, PATH_MAX) == NULL) {
      perror("Failed to get current working directory");
      return 1;
   }
   printf("Current working directory: %s\n", mycwd);
   return 0;
}


The current working directory

• A more flexible approach uses the pathconf function to determine the real value for the maximum path length at run time. The pathconf function is one of a family of functions that allows a program to determine system and runtime limits in a platform-independent way.

• The sysconf function takes a single argument, which is the name of a configurable systemwide limit such as the number of clock ticks per second (_SC_CLK_TCK) or the maximum number of processes allowed per user (_SC_CHILD_MAX).

• The pathconf and fpathconf functions report limits associated with a particular file or directory.

• The fpathconf function takes a file descriptor and the limit designator as parameters, so the file must be opened before a call to fpathconf.

• The pathconf function takes a pathname and a limit designator as parameters, so it can be called without the program actually opening the file.

• The sysconf function returns the current value of a configurable system limit that is not associated with files. Its name parameter designates the limit.


The current working directory

• If successful, these functions return the value of the limit. If unsuccessful, these functions return –1 and set errno.

#include <unistd.h>

long fpathconf(int fildes, int name);
long pathconf(const char *path, int name);
long sysconf(int name);
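
A short sketch of calling sysconf follows. The helper name showlimit is hypothetical. One detail not stated on the slide: POSIX also lets these functions return –1 without changing errno when a limit is indeterminate, so the sketch clears errno first to tell the two cases apart.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print one sysconf limit under the given label.
   Returns the limit, or -1 if it is indeterminate or the call failed. */
long showlimit(const char *label, int name) {
   long value;

   errno = 0;
   value = sysconf(name);
   if (value == -1 && errno != 0)
      perror(label);                           /* real error, such as EINVAL */
   else if (value == -1)
      printf("%s: no definite limit\n", label);
   else
      printf("%s: %ld\n", label, value);
   return value;
}
```

For example, showlimit("Clock ticks per second", _SC_CLK_TCK) and showlimit("Processes per user", _SC_CHILD_MAX) report the two limits named above.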


A program that uses pathconf to output the current working directory

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
   long maxpath;
   char *mycwdp;

   if ((maxpath = pathconf(".", _PC_PATH_MAX)) == -1) {
      perror("Failed to determine the pathname length");
      return 1;
   }
   if ((mycwdp = (char *) malloc(maxpath)) == NULL) {
      perror("Failed to allocate space for pathname");
      return 1;
   }
   if (getcwd(mycwdp, maxpath) == NULL) {
      perror("Failed to get current working directory");
      return 1;
   }
   printf("Current working directory: %s\n", mycwdp);
   return 0;
}


Directory Access

• Directories should not be accessed with the ordinary open, close and read functions. Instead, they require specialized functions whose corresponding names end with "dir": opendir, closedir and readdir.

• The opendir function provides a handle of type DIR * to a directory stream that is positioned at the first entry in the directory.

• The readdir function reads a directory by returning successive entries in a directory stream pointed to by dirp. The readdir function returns a pointer to a struct dirent structure containing information about the next directory entry. The readdir function moves the stream to the next position after each call.

• The closedir function closes a directory stream, and the rewinddir function repositions the directory stream at its beginning. Each function has a dirp parameter that corresponds to an open directory stream.


Directory Access

• opendir provides a handle for the other functions.

• readdir gets the next entry in the directory.

• rewinddir restarts from the beginning.

• closedir closes the handle.

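Note that, like strtok, readdir is not reentrant: the pointer it returns may refer to storage that a later call on the same directory stream overwrites.

#include <dirent.h>

DIR *opendir(const char *filename);
struct dirent *readdir(DIR *dirp);
void rewinddir(DIR *dirp);
int closedir(DIR *dirp);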
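
To show what rewinddir does, the sketch below reads one directory stream to the end twice, rewinding in between. The helper names countentries and counttwice are hypothetical and do not come from the slides.

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper: count the entries remaining in an open directory
   stream (this typically includes the dot and dot-dot entries). */
long countentries(DIR *dirp) {
   long count = 0;

   while (readdir(dirp) != NULL)
      count++;
   return count;
}

/* Count the entries of path twice over the same stream, rewinding between
   the two passes. Returns 0 on success and -1 if opendir fails. */
int counttwice(const char *path, long *first, long *second) {
   DIR *dirp;

   if ((dirp = opendir(path)) == NULL)
      return -1;
   *first = countentries(dirp);     /* stream is now at the end */
   rewinddir(dirp);                 /* back to the first entry */
   *second = countentries(dirp);
   closedir(dirp);
   return 0;
}
```

If no other process modifies the directory in the meantime, both counts should agree. Without the rewinddir call, the second pass would find no entries.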


A program to list files in a directory.

#include <dirent.h>
#include <errno.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
   struct dirent *direntp;
   DIR *dirp;

   if (argc != 2) {
      fprintf(stderr, "Usage: %s directory_name\n", argv[0]);
      return 1;
   }
   if ((dirp = opendir(argv[1])) == NULL) {
      perror("Failed to open directory");
      return 1;
   }
   while ((direntp = readdir(dirp)) != NULL)
      printf("%s\n", direntp->d_name);
   while ((closedir(dirp) == -1) && (errno == EINTR)) ;
   return 0;
}


Accessing file status information

• This section describes three functions for retrieving file status information. The fstat function accesses a file with an open file descriptor. The stat and lstat functions access a file by name.

• stat is given the name of a file.

• fstat is used for open files.

• lstat does the same thing as stat except that if the file is a symbolic link, it gives information about the link, rather than the file it is linked to.

#include <sys/stat.h>

int lstat(const char *restrict path, struct stat *restrict buf);
int stat(const char *restrict path, struct stat *restrict buf);
int fstat(int fildes, struct stat *buf);
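
The difference between stat and lstat matters most when testing for symbolic links. The following is a sketch only, with a hypothetical helper name issymlink. It uses lstat because stat follows the link and reports on the target instead.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns nonzero if path itself is a symbolic link,
   and 0 otherwise (including when lstat fails). */
int issymlink(const char *path) {
   struct stat statbuf;

   if (lstat(path, &statbuf) == -1)
      return 0;
   return S_ISLNK(statbuf.st_mode);
}
```

Replacing lstat with stat here would make the test useless: for a link whose target exists, stat describes the target, so S_ISLNK would report false.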


Accessing file status information

• The contents of the struct stat are system dependent, but the standard says that it must contain at least the following fields:

dev_t   st_dev;    /* device ID of device containing file */
ino_t   st_ino;    /* file serial number */
mode_t  st_mode;   /* file mode */
nlink_t st_nlink;  /* number of hard links */
uid_t   st_uid;    /* user ID of file */
gid_t   st_gid;    /* group ID of file */
off_t   st_size;   /* file size in bytes (regular files) */
                   /* path size (symbolic links) */
time_t  st_atime;  /* time of last access */
time_t  st_mtime;  /* time of last data modification */
time_t  st_ctime;  /* time of last file status change */
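
As an illustration of reading these members, the sketch below prints a few of them for a given pathname. The helper name printstatus is hypothetical. The casts are there because the exact integer types behind ino_t, nlink_t, off_t and uid_t are system dependent.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: print a few of the required struct stat members.
   Returns 0 on success, -1 if stat fails. */
int printstatus(const char *path) {
   struct stat statbuf;

   if (stat(path, &statbuf) == -1)
      return -1;
   printf("%s: serial number %llu, %lu hard link(s), %lld bytes, uid %lu\n",
          path,
          (unsigned long long)statbuf.st_ino,
          (unsigned long)statbuf.st_nlink,
          (long long)statbuf.st_size,
          (unsigned long)statbuf.st_uid);
   return 0;
}
```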


The following function displays the time that the file path was last accessed.

#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

void printaccess(char *path) {
   struct stat statbuf;

   if (stat(path, &statbuf) == -1)
      perror("Failed to get file status");
   else
      printf("%s last accessed at %s", path, ctime(&statbuf.st_atime));
}


The isdirectory function returns true (nonzero) if path is a directory, and false (0) otherwise.

#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

int isdirectory(char *path) {
   struct stat statbuf;

   if (stat(path, &statbuf) == -1)
      return 0;
   else
      return S_ISDIR(statbuf.st_mode);
}


Unix File System Implementation

• Disk formatting divides a physical disk into regions called partitions.

• Each partition can have its own file system associated with it. A particular file system can be mounted at any node in the tree of another file system.

• The topmost node in a file system is called the root of the file system.

• The root directory of a process (denoted by /) is the topmost directory that the process can access.

• All fully qualified paths in UNIX start from the root directory /.

[Figure: Structure of a typical UNIX file system]


UNIX file implementation

• POSIX does not mandate any particular representation of files on disk, but traditionally UNIX files have been implemented with a modified tree structure.

• Directory entries contain a filename and a reference to a fixed-length structure called an inode.

• The inode contains information about the file size, the file location, the owner of the file, the time of last status change, time of last access, time of last modification, permissions and so on.

• In addition to descriptive information about the file, the inode contains pointers to the first few data blocks of the file. If the file is large, the indirect pointer is a pointer to a block of pointers that point to additional data blocks. If the file is still larger, the double indirect pointer is a pointer to a block of indirect pointers. If the file is really huge, the triple indirect pointer contains a pointer to a block of double indirect pointers.
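
The pointer scheme above fixes the largest file an inode can address. The sketch below works through the arithmetic. The function name maxblocks and the example figures (10 direct pointers, 4096-byte blocks, 4-byte block pointers) are illustrative assumptions, not values taken from the slides or from any particular file system.

```c
/* Number of data blocks addressable by an inode that has ndirect direct
   pointers plus one single, one double, and one triple indirect pointer.
   blocksize and ptrsize are given in bytes. */
unsigned long long maxblocks(unsigned ndirect, unsigned blocksize,
                             unsigned ptrsize) {
   unsigned long long p = blocksize / ptrsize;   /* pointers per block */

   return ndirect      /* reached directly from the inode */
        + p            /* through the single indirect block */
        + p * p        /* through the double indirect block */
        + p * p * p;   /* through the triple indirect block */
}
```

With the assumed figures, each block holds 1024 pointers, so maxblocks(10, 4096, 4) is 10 + 1024 + 1024² + 1024³ = 1074791434 blocks, a little over 4 TiB of data. The triple indirect term dominates the total.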


Inodes

Schematic structure of a traditional UNIX file.


Directory implementation

• A directory is a file containing a correspondence between filenames and file locations.

• UNIX has traditionally implemented the location specification as an inode number, but as noted above, POSIX does not require this.

• The inode itself does not contain the filename. When a program references a file by pathname, the operating system traverses the file system tree to find the filename and inode number in the appropriate directory.

• Once it has the inode number, the operating system can determine other information about the file by accessing the inode.


Directory Implementation

• A directory implementation that contains only names and inode numbers has the following advantages.

1. Changing the filename requires changing only the directory entry. A file can be moved from one directory to another just by moving the directory entry, as long as the move keeps the file on the same partition or slice.

2. Only one physical copy of the file needs to exist on disk, but the file may have several names or the same name in different directories. Again, all of these references must be on the same physical partition.

3. Directory entries are of variable length because the filename is of variable length. Directory entries are small, since most of the information about each file is kept in its inode. Manipulating small variable-length structures can be done efficiently. The larger inode structures are of fixed length.
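
The first advantage can be observed directly: renaming a file within one file system should leave its inode number unchanged. The sketch below assumes the standard rename function from stdio.h and a hypothetical helper name, renamekeepsinode.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: rename oldpath to newpath and report whether the
   file serial number (inode number) stayed the same.
   Returns 1 if unchanged, 0 if changed, -1 on error. */
int renamekeepsinode(const char *oldpath, const char *newpath) {
   struct stat before, after;

   if (stat(oldpath, &before) == -1)
      return -1;
   if (rename(oldpath, newpath) == -1)   /* fails across file systems */
      return -1;
   if (stat(newpath, &after) == -1)
      return -1;
   return before.st_ino == after.st_ino;
}
```

Only the directory entry changes here. The inode and the data blocks are not copied, which is why the operation takes the same time regardless of file size.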


Hard Links and Symbolic Links

UNIX directories have two types of links: links and symbolic links.

– A link, sometimes called a hard link, is an association between a filename and an inode.

– A symbolic link, sometimes called a soft link, is a file that stores a string used to modify the pathname when it is encountered during pathname resolution.

• Each inode contains a count of the number of hard links to the inode.

• When a file is created, a new directory entry is created and a new inode is assigned.

• Additional hard links can be created with
ln oldname newname
or with the link system call.


Links

• A new hard link to an existing file creates a new directory entry but assigns no other additional disk space.

• A new hard link increments the link count in the inode.

• A hard link can be removed with the rm command or the unlink system call.

• These decrement the link count.

• The inode and associated disk space are freed when the count is decremented to 0.
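
The link count is visible to programs through the st_nlink member of struct stat. The sketch below follows the count as a second name is added and removed. The helper names linkcount and linkdemo are hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return the hard-link count of path, or -1 on error. */
long linkcount(const char *path) {
   struct stat statbuf;

   if (stat(path, &statbuf) == -1)
      return -1;
   return (long)statbuf.st_nlink;
}

/* Add a second name for an existing file, then remove it again, printing
   the link count at each step. Returns 0 if the count went up by one and
   came back down, -1 otherwise. */
int linkdemo(const char *path, const char *newname) {
   long start, linked, end;

   if ((start = linkcount(path)) == -1 || link(path, newname) == -1)
      return -1;
   linked = linkcount(path);        /* expected: start + 1 */
   if (unlink(newname) == -1)
      return -1;
   end = linkcount(path);           /* expected: start again */
   printf("link count: %ld -> %ld -> %ld\n", start, linked, end);
   return (linked == start + 1 && end == start) ? 0 : -1;
}
```

Both names must be on the same file system, since a hard link refers to an inode number that is meaningful only within one file system.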


Symbolic Links

• A symbolic link is a special type of file that contains the name of another file.

• A reference to the name of a symbolic link causes the operating system to use the name stored in the file, rather than the name itself.

• Symbolic links are created with the command:
ln -s oldname newname

• Symbolic links do not affect the link count in the inode.

• Unlike hard links, symbolic links can span file systems.
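
The name stored in a symbolic link can be read back with the POSIX readlink function, which the slides do not cover. The sketch below wraps it in a hypothetical helper, linkcontents. readlink does not append a string terminator, so the helper adds one.

```c
#include <unistd.h>

/* Hypothetical helper: copy the name stored in the symbolic link linkpath
   into buf as a terminated string. Returns the length of the stored name,
   or -1 on error (for example, if linkpath is not a symbolic link).
   A result of size - 1 may mean the name was truncated to fit. */
ssize_t linkcontents(const char *linkpath, char *buf, size_t size) {
   ssize_t len;

   if (size == 0)
      return -1;
   len = readlink(linkpath, buf, size - 1);   /* leave room for '\0' */
   if (len == -1)
      return -1;
   buf[len] = '\0';
   return len;
}
```

The stored name is only a string. The call succeeds even when no file with that name exists, which is how a symbolic link can be left dangling after its target is removed.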


Simple File

• Assume that the directory entry of a file name1 in directory /dirA is as shown below.

[Figure: A directory entry, inode, and data block for a simple file]


The following shell command creates an entry name2 in dirB that refers to the same inode as /dirA/name1:

>ln /dirA/name1 /dirB/name2

Equivalently, the following program fragment performs the same action (the figure illustrates the result):

#include <stdio.h>
#include <unistd.h>
....
if (link("/dirA/name1", "/dirB/name2") == -1)
   perror("Failed to make a new link in /dirB");
.....


link and unlink

• The link function creates a new directory entry for the existing file specified by path1 in the directory specified by path2.

• If successful, the link function returns 0. If unsuccessful, link returns –1 and sets errno.

• Similarly, the unlink function removes the directory entry specified by path. If the file's link count drops to 0 and no process has the file open, unlink frees the space occupied by the file.

• If successful, the unlink function returns 0. If unsuccessful, unlink returns –1 and sets errno.

#include <unistd.h>

int link(const char *path1, const char *path2);

#include <unistd.h>

int unlink(const char *path);
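
The condition that no process has the file open is easy to overlook. The sketch below removes a file's only name while a descriptor is still open and then reads the data back through that descriptor. The helper name readafterunlink is hypothetical, and the open, write, lseek and read calls are standard POSIX functions not covered in this lecture.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demonstration: create path, write 5 bytes, unlink it while
   it is still open, then read the data back through the descriptor.
   Returns the number of bytes read back, or -1 on error. */
ssize_t readafterunlink(const char *path) {
   char buf[16];
   ssize_t n;
   int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

   if (fd == -1)
      return -1;
   if (write(fd, "hello", 5) != 5 || unlink(path) == -1) {
      close(fd);
      return -1;
   }
   /* The name is gone, but the open descriptor keeps the file alive. */
   if (lseek(fd, 0, SEEK_SET) == -1) {
      close(fd);
      return -1;
   }
   n = read(fd, buf, sizeof(buf));
   close(fd);   /* link count is 0 and nothing has it open: space is freed */
   return n;
}
```

A call such as readafterunlink("/tmp/scratch") should return 5 even though the pathname no longer exists by the time the read happens. The pathname used here is only an example.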


Creating and removing symbolic links

• Create a symbolic link by using the ln command with the -s option or by invoking the symlink function.

• The path1 parameter of symlink contains the string that will be the contents of the link, and path2 gives the pathname of the link. (path2 is the newly created link and path1 is what the new link points to.)

• If successful, symlink returns 0. If unsuccessful, symlink returns –1 and sets errno.

#include <unistd.h>

int symlink(const char *path1, const char *path2);


The following command creates a symbolic link /dirB/name2:

>ln -s /dirA/name1 /dirB/name2

Similarly, the following code segment performs the same action:

#include <stdio.h>
#include <unistd.h>
....
if (symlink("/dirA/name1", "/dirB/name2") == -1)
   perror("Failed to create symbolic link in /dirB");
.....