1 advanced programming in unix 1 file i/o hua lisystems programmingcs2690file i/o

38
1 Advanced programming in UNIX 1 File I/O Hua Li Systems Programming CS2690 File I/O

Post on 20-Dec-2015

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

1

Advanced programming in UNIX

1 File I/O

Hua LiSystems ProgrammingCS2690 File I/O

Page 2: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

2

File Descriptors

• All the open files are referred to the kernel by file descriptors.

• A file descriptor is a non-negative integer.

• When we open an existing file or create a new file, the kernel returns a file descriptor to the process.

Hua LiSystems ProgrammingCS2690 File I/O

Page 3: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

3

open Function

• A file is opened or created by calling the open function.

Hua LiSystems ProgrammingCS2690 File I/O

#include <sys/types.h>

#include <sys/stat.h>

#include <fcntl.h>

int open(const char *pathname, int oflag, … /* , mode_t mode */);

Returns: file descriptor if OK, -1 on error

Page 4: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

4

• The pathname is the name of the file to open or create.

• The oflag argument can be constants : (defined in /usr/include/sys/fcntl.h)

• O_RDONLY Open for reading only. (integer 0)

• O_WRONLY Open for writing only. (integer 1) • O_RDWR Open for reading and writing. (integer 2)

Hua LiSystems ProgrammingCS2690 File I/O

Page 5: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

5

• Some other constants are as follows:

• O_APPEND Append to the end of file on each write. (0x08)

O_CREATE Create the file if it doesnt exist.

Hua LiSystems ProgrammingCS2690 File I/O

Page 6: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

6

• File Access Permissions (mode parameter)

• All the file types (directories, etc.) have permissions.

• There are nine permission bits for each file, divided into three categories.

• The file access permission bits defined in <sys/stat.h> as follows:

Hua LiSystems ProgrammingCS2690 File I/O

Page 7: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

7

Hua LiSystems ProgrammingCS2690 File I/O

other-read

other-write

other-execute

S_IROTH

S_IWOTH

S_IXOTH

group-read

group-write

group-execute

S_IRGRP

S_IWGRP

S_IXGRP

user-read

user-write

user-execute

S_IRUSR

S_IWUSR

S_IXUSR

Meaningmode mask

Page 8: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

8

creat Function

• A new file can also be created by:

Hua LiSystems ProgrammingCS2690 File I/O

#include <sys/types.h>

#include <sys/stat.h>

#include <fcntl.h>

int creat(const char *pathname, mode_t mode);

Returns: file descriptor opened for write-only if OK, -1 on error.

Page 9: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

9

close Function

• An open file is closed by:

Hua LiSystems ProgrammingCS2690 File I/O

#include <unistd.h>

int close(int filedes);

Returns: 0 if OK, -1 on error.

Page 10: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

10

lseek Function

• Every open file has an associated “current file offset”.

• This is a nonnegative integer that measures the number of bytes from the beginning of the file.

• Read and write operations normally start at the current file offset and cause the offset to be incremented by the number of bytes read or written.

• By default, this offset is initialized to 0 when a file is opened, unless the O_APPEND option is specified.

Hua LiSystems ProgrammingCS2690 File I/O

Page 11: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

11

• An open file can be explicitly positioned by calling lseek.

• If whence is SEEK_SET, the file’s offset is set to offset bytes from the beginning of the file.

• If whence is SEEK_CUR, the file’s offset is set to its current value plus the offset. The offset can be positive or negative.

Hua LiSystems ProgrammingCS2690 File I/O

#include <sys/types.h>

#include<unistd.h>

off_t lseek(int filedes, off_t offset, int whence);

Returns: new file offset if OK, -1 on error

Page 12: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

12

• If whence is SEEK_END, the file’s offset is set to the size of the file plus the offset. The offset can be positive or negative.

• Since a successful call to lseek returns the new file offset, we can seek zero bytes from the current position to determine the current offset.

• off_t currpos;

• currpos = lseek(fd, 0, SEEK_CUR);

• This technique can also be used to determine if the referenced file is capable of seeking: if the file descriptor refers to a pipe or FIFO, lseek returns -1 and sets errno to EPIPE.

Hua LiSystems ProgrammingCS2690 File I/O

Page 13: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

13

Example

• The following program tests its standard input to see if it is capable of seeking:

Hua LiSystems ProgrammingCS2690 File I/O

#include <sys/types.h>

#include “ourhdr.h”

int main (void)

{

if (lseek(STDIN_FILENO, 0, SEEK_CUR) = = -1)

printf(“cannot seek\n”);

Page 14: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

14

Hua LiSystems ProgrammingCS2690 File I/O

else

printf(“seek OK\n”);

exit(0);

}

Page 15: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

15

read Function

• Data is read from an open file with the read function.

• If the read is successful, the number of bytes read is returned.

Hua LiSystems ProgrammingCS2690 File I/O

#include <unistd.h>

ssize_t read(int filedes, void *buff, size_t nbytes);

Returns: number of bytes read, 0 if end of file, -1 on error

Page 16: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

16

• If the end of file is encountered, 0 is returned.

• The read operation starts at the file’s current offset.

• Before a successful return, the offset is incremented by the number of bytes actually read.

Hua LiSystems ProgrammingCS2690 File I/O

Page 17: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

17

write Function

• Data is written to an open file with the write function.

• The return value is usually equal to the nbytes argument, otherwise an error has occurred.

• For a regular file, the write starts at the file’s current offset. If the O_APPEND option was specified in the open, the file’s offset is set to the current end of file before each write operation.

Hua LiSystems ProgrammingCS2690 File I/O

#include <unistd.h>

ssize_t write(int filedes, const void *buff, size_t nbytes);

Returns: number of bytes written if OK, -1 on error

Page 18: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

18

The following program copies (reads) from standard input (and writes) to standard output. In this example, we use the two defined names STDIN_FILENO and STDOUT_FILENO from <unistd.h>

Hua LiSystems ProgrammingCS2690 File I/O

#include “ourhdr.h”

#define BUFFSIZE 8192

int main (void)

{

int n;

Page 19: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

19

Hua LiSystems ProgrammingCS2690 File I/O

char buf[BUFFSIZE];

while ( (n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)

if (write(STDOUT_FILENO, buf, n) != n)

err_sys(“write error”);

if (n < 0)

err_sys(“read error”);

exit(0);

}

Page 20: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

20

File Sharing

• Unix supports the sharing of open files between different processes.

• Three data structures (Process table, file table, V-node)

are used by the kernel to describe the file sharing.

Hua LiSystems ProgrammingCS2690 File I/O

Page 21: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

21

Process Table

• Every process has an entry in the process table.

• Within each process table entry is a table of open file descriptors, which we can think of as a vector, with one entry per descriptor.

• Associated with each file descriptor are:

• The file descriptor flags.

• A pointer to a file table entry.

Hua LiSystems ProgrammingCS2690 File I/O

Page 22: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

22

File Table

• The kernel maintains a file table for all open files.

• Each file table entry contains:

• The file status flags for the file (read, write, append, sync, nonblocking, etc.).

• The current file offset.

• A pointer to the v-node table entry for the file.

Hua LiSystems ProgrammingCS2690 File I/O

Page 23: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

23

v-node structure

• Each open file (or device) has a v-node structure.

• The v-node (virtual node) contains information about the type of file and pointers to functions that operate on the file.

• For most files the v-node also contains the i-node (information node) for the file.

• This information is read from disk when the file is opened, so that all the pertinent information about the file is readily available.

• For example, the i-node contains the owner of the file, the size of the file, the device the file is located on, pointers to where the actual data blocks for the file are located on disk, and so on.

Hua LiSystems ProgrammingCS2690 File I/O

Page 24: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

24

• The following figure shows a pictorial arrangement of the above three tables for a single process that has two different files open—one file is open on standard input (file descriptor 0) and the other is open on standard output (file descriptor 1).

Hua LiSystems ProgrammingCS2690 File I/O

Page 25: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

25

• The following figure shows two independent processes with the same file open:

Hua LiSystems ProgrammingCS2690 File I/O

Page 26: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

26

• If two independent processes have the same file open we could have the arrangement shown in the above figure.

• We assume here that the first process has the file open on descriptor 3, and the second process has that same file open on descriptor 4.

• Each process that opens the file gets its own file table entry, but only a single v-node table entry is required for a given file.

• One reason each process gets its own file table entry is so that each process has its own current offset for the file.

Hua LiSystems ProgrammingCS2690 File I/O

Page 27: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

27

Atomic Operations

Appending to a File

• Assuming two independent processes, A and B are appending to the same file.

• If A does lseek and positions its byte offset to the end of the file, say 1500, but, before it does a write, the kernel switches processes.

Hua LiSystems ProgrammingCS2690 File I/O

Page 28: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

28

• Now B starts running again. And, suppose, before the kernel switches again, B is able to do lseek and position its byte offset to 1500, the current end of the file, and, is also able to write, say upto 1600, which is now B’s current file offset.

• Now, if the kernel switches processes, and A starts running again, and it executes its write, it will write from its byte offset 1500, which will overwrite the data that B wrote to the file.

• Thus our problem arises, because our logical operation of “position to the end of file and write” requires two separate function calls.

Hua LiSystems ProgrammingCS2690 File I/O

Page 29: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

29

• The solution is to have the positioning to the current end of file and the write to be an atomic operation with regard to other processes.

• Unix provides an atomic way to do this operation if we set the O_APPEND flag when a file is opened.

• This causes the kernel to position the file to its current end of file before each write.

• Thus we no longer have to call lseek before each write.

Hua LiSystems ProgrammingCS2690 File I/O

Page 30: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

30

• Atomic operation refers to an operation that is composed of multiple steps, where, either all the steps are performed, or none is performed.

Hua LiSystems ProgrammingCS2690 File I/O

Page 31: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

31

dup and dup2 Functions

• An existing file descriptor is duplicated by either of the following functions:

Hua LiSystems ProgrammingCS2690 File I/O

#include <unistd.h>

int dup(int filedes);

Both return: new file descriptor if OK, -1 on error

Page 32: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

32

• The new file descriptor that is returned as the value of the functions shares the same file table entry as the filedes argument.

Hua LiSystems ProgrammingCS2690 File I/O

Page 33: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

33

• The following figure refers to the kernel data structures after dup:

Hua LiSystems ProgrammingCS2690 File I/O

Page 34: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

34

• In the above figure, we are assuming that the process executesnewfd = dup(1);

when its started.

• We assume the next available descriptor is 3 (which it probably is, since 0, 1, and 2 are opened by the shell).

• Since both descriptors point to the same file table entry they share the same file status flags (read, write, append, etc.) and the same current file offset.

Hua LiSystems ProgrammingCS2690 File I/O

Page 35: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

35

Example

• The following program takes a single command-line argument that specifies a file descriptor and prints a description of the file flags for that descriptor.

Hua LiSystems ProgrammingCS2690 File I/O

Page 36: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

36

Hua LiSystems ProgrammingCS2690 File I/O

#include <sys/types.h>

#include <fcntl.h>

#include “ourhdr.h”

int main(int argc, char *argv[])

{

int accmode, val;

if (argc != 2)

err_quit(“usage: a.out <descriptor#>”);

Page 37: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

37

Hua LiSystems ProgrammingCS2690 File I/O

if ( (val = fcntl(atoi(argv[1]), F_GETFL, 0)) < 0)

err_sys(“fcntl error for fd %d”, atoi(argv[1]));

accmode = val & O_ACCMODE; /* O_ACCMODE=3*/

if (accmode = = O_RDONLY) printf(“read only”);

else if (accmode = = O_WRONLY) printf(“write only”);

else if (accmode = = O_RDWR) printf(“read write”);

else err_dump(“unknown access mode”);

if (val & O_APPEND) printf(“, append”);

Page 38: 1 Advanced programming in UNIX 1 File I/O Hua LiSystems ProgrammingCS2690File I/O

38

• Here we use the feature test macro _POSIX_SOURCE and conditionally compile the file access flags.

Hua LiSystems ProgrammingCS2690 File I/O

if (val & O_NONBLOCK) printf(“, nonblocking”);

#if !defined(_POSIX_SOURCE) && defined(O_SYNC)

if (val & O_SYNC) printf(“, synchronous writes”);

#endif

putchar(‘\n’);

exit(0);

}