Stat, mmap, process syncing and file system interface
Nezer J. Zaidenberg
Agenda
We will discuss how files are accessed in UNIX.
Many system calls will be introduced (most of them reintroduced).
We will also discuss shared memory and synchronization between processes.
Prior knowledge
open(2) and fopen(3) – open a file for reading/writing.
close(2) and fclose(3) – close a file that was opened for reading/writing.
read(2), write(2), fread(3), fwrite(3)
opendir(3), readdir(3) – read directory contents
All of these are covered in K&R2, chapter 8.
Stat(2)
Stat(2) is a system call used to learn about a file.
Information received:
  Access permissions
  Size
  Owner
  Group
  File last access time, change time and modification time (change time = file information changed, modification time = content changed)
  Etc.
Stat(2) returns a struct with all the file information.
Stat(2)
NAME
stat, lstat, fstat -- get file status
SYNOPSIS
#include <sys/types.h>
#include <sys/stat.h>
int stat(const char *path, struct stat *sb);
int lstat(const char *path, struct stat *sb);
int fstat(int fd, struct stat *sb);
Struct stat (1/2)
struct stat {
    dev_t st_dev;     /* device inode resides on */
    ino_t st_ino;     /* inode's number */
    mode_t st_mode;   /* inode protection mode */
    nlink_t st_nlink; /* number of hard links to the file */
    uid_t st_uid;     /* user-id of owner */
    gid_t st_gid;     /* group-id of owner */
    dev_t st_rdev;    /* device type, for special file inode */
Struct stat (2/2)
    struct timespec st_atimespec; /* time of last access */
    struct timespec st_mtimespec; /* time of last data modification */
    struct timespec st_ctimespec; /* time of last file status change */
    off_t st_size;     /* file size, in bytes */
    quad_t st_blocks;  /* blocks allocated for file */
    u_long st_blksize; /* optimal file sys I/O ops blocksize */
    u_long st_flags;   /* user defined flags for file */
    u_long st_gen;     /* file generation number */
};
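A minimal sketch of calling stat(2) and reading a few of the fields listed above. The helper names file_size and print_file_info are made up for this illustration; only the portable fields (st_size, st_mode, st_uid, st_gid, st_nlink) are used, since the exact struct layout differs between systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the file size in bytes, or -1 on error. */
long long file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long long)sb.st_size;
}

/* Hypothetical helper: prints a few struct stat fields. Returns 0 or -1. */
int print_file_info(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1) {
        perror("stat");
        return -1;
    }
    printf("size:  %lld bytes\n", (long long)sb.st_size);
    printf("mode:  %o\n", (unsigned)(sb.st_mode & 0777)); /* permission bits only */
    printf("owner: uid %ld, gid %ld\n", (long)sb.st_uid, (long)sb.st_gid);
    printf("links: %ld\n", (long)sb.st_nlink);
    return 0;
}
```

A caller would typically pass a path from argv, for example print_file_info(argv[1]).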
Access permissions
Like everything in the IDF, file system permissions in UNIX are divided into 3 parts:
  ME – what I can do.
  MY GROUP – (every user has a group) – what my group can do.
  OTHER – what everybody else can do.
ACCESS PERMISSIONS
And of course, what each of them can do also divides into 3:
  Permission to read (file contents, directory contents)
  Permission to write (file contents, add files to directory)
  Permission to execute (file, cd to directory)
FILE SYSTEM PERMISSIONS ARE MANDATORY!
We use 3 octal digits to represent permissions.
Read = 4, write = 2, execute = 1. We sum the permission numbers and get the digit.
Some examples
Permission   Meaning
644          6 (= 4+2): I can read and write. My group and everybody else can read.
777          7 (= 4+2+1): everybody can read, write and execute.
705          7 (= 4+2+1), 5 (= 4+1): I can read, write and execute. My group can't do anything, but other people can read and execute.
700          Only I can read, write and execute the file.
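The octal digits in the table can be decoded bit by bit. The sketch below turns a mode such as 0644 into the familiar "rw-r--r--" form; mode_to_string is a hypothetical helper written for this illustration, not a library function.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: fills buf (at least 10 bytes) with e.g. "rw-r--r--". */
void mode_to_string(mode_t mode, char buf[10])
{
    static const char letters[] = "rwxrwxrwx";
    int i;

    memcpy(buf, "---------", 10); /* 9 dashes plus the terminating NUL */
    for (i = 0; i < 9; i++) {
        /* Bit 8 is owner-read (0400), bit 0 is other-execute (0001). */
        if (mode & (1u << (8 - i)))
            buf[i] = letters[i];
    }
}
```

The same function works on a mode taken from stat(2) if it is first masked with 0777.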
lseek(2)
Move to a specific location in a file.
Usage example:
  Large database in a specific file
  User is interested in a specific value
  lseek(2) to the right location and read the value
lseek(2)
NAME
lseek -- reposition read/write file offset
SYNOPSIS
#include <unistd.h>
off_t
lseek(int fildes, off_t offset, int whence);
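The database usage example can be sketched as follows, assuming a toy "database" that is simply a file of packed int records. Both helper names (write_ints, read_int_at) and the record format are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes n ints to path as a tiny "database". */
int write_ints(const char *path, const int *values, size_t n)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    ssize_t want = (ssize_t)(n * sizeof(int));
    int ok;

    if (fd == -1)
        return -1;
    ok = (write(fd, values, n * sizeof(int)) == want);
    close(fd);
    return ok ? 0 : -1;
}

/* Hypothetical helper: reads only the index-th int, skipping the rest.
 * Returns 0 on success, -1 on error or if the record does not exist. */
int read_int_at(const char *path, off_t index, int *out)
{
    int fd = open(path, O_RDONLY);
    int ok;

    if (fd == -1)
        return -1;
    ok = lseek(fd, index * (off_t)sizeof(int), SEEK_SET) != (off_t)-1 &&
         read(fd, out, sizeof(int)) == (ssize_t)sizeof(int);
    close(fd);
    return ok ? 0 : -1;
}
```

Seeking past the end of the file is not an error for lseek(2) itself; the following read(2) returns 0 bytes, which the helper reports as -1.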
creat(2)
SYNOPSIS
#include <fcntl.h>
int
creat(const char *path, mode_t mode);
// open(2) with O_CREAT can also be used to create the file.
dup(2)
SYNOPSIS
#include <unistd.h>
int
dup(int oldd);
int
dup2(int oldd, int newd);
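A small sketch of what duplicating a descriptor means: the copy returned by dup(2) refers to the same open file, so the two descriptors share one file offset. The helper name write_through_dup is an assumption made for this illustration.

```c
#include <fcntl.h>     /* open(2) flags for the caller */
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg through a duplicate of fd and returns the
 * file offset seen through the ORIGINAL descriptor afterwards (-1 on error). */
off_t write_through_dup(int fd, const char *msg)
{
    int copy = dup(fd);
    size_t len = strlen(msg);
    off_t pos;

    if (copy == -1)
        return (off_t)-1;
    if (write(copy, msg, len) != (ssize_t)len) {
        close(copy);
        return (off_t)-1;
    }
    close(copy);                  /* fd itself stays open */
    pos = lseek(fd, 0, SEEK_CUR); /* offset moved: both share one open file */
    return pos;
}
```

dup2(2) works the same way but lets the caller choose the new descriptor number; dup2(fd, STDOUT_FILENO), for example, is the usual way to redirect standard output to a file.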
File locking
Different flavours of UNIX offer different file locking.
Two system calls are used:
  flock(2) – on BSD UNIX (Mac OS X, FreeBSD, Darwin)
  fcntl(2) – on SVR4 UNIX (Linux, Solaris, HP-UX)
File locking works between processes, not between threads.
Use fcntl(2) locking on TAU.
flock(2)
SYNOPSIS
#include <sys/file.h>
#define LOCK_SH 1 /* shared lock */
#define LOCK_EX 2 /* exclusive lock */
#define LOCK_NB 4 /* don't block when locking */
#define LOCK_UN 8 /* unlock */
int
flock(int fd, int operation);
fcntl(2)
SYNOPSIS
#include <fcntl.h>
int
fcntl(int fd, int cmd, ...); /* for locking, the third argument is a struct flock * */
OFTEN USED FOR FILE LOCKING ON SYS V SYSTEMS
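A minimal sketch of fcntl(2) record locking applied to a whole file. The helper name lock_whole_file and its wait parameter are assumptions made for this illustration; the F_SETLK, F_SETLKW, F_WRLCK, F_RDLCK and F_UNLCK constants and struct flock are the standard POSIX ones.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: applies type (F_WRLCK, F_RDLCK or F_UNLCK) to the
 * whole file. With wait != 0 the call blocks until the lock is granted;
 * otherwise it fails immediately if another process holds a conflicting lock. */
int lock_whole_file(int fd, short type, int wait)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "until end of file" */
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}
```

These locks are advisory and owned by the process: they only protect against other processes that also ask for a lock, and they are released when the process closes the file.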
Memory mapping
mmap(2)
munmap(2)
msync(2)
mmap(2)
Maps a file to memory.
This memory can be shared (among processes) or locked.
Check mprotect(2) for locking options (beyond our scope).
Use msync(2) to write changes.
Use munmap(2) to write and free memory.
Homework: use read(2) and write(2) to copy a file. Then use mmap and memory copy. What works faster?
mmap(2)
NAME
mmap -- map files or devices into memory
SYNOPSIS
#include <sys/mman.h>
void *
mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t offset);
munmap(2)
NAME
munmap -- remove a mapping
SYNOPSIS
#include <sys/mman.h>
int munmap(void *addr, size_t len);
msync(2)
NAME
msync -- synchronize a mapped region
SYNOPSIS
#include <sys/mman.h>
int msync(void *addr, size_t len, int flags);
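The three calls fit together roughly as in the sketch below: map a file, change it with an ordinary memory write, flush with msync(2), then unmap. The helper name set_first_byte is an assumption made for this illustration, and the file is assumed to exist and be non-empty.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrites the first byte of an existing, non-empty
 * file through a shared mapping. Returns 0 on success, -1 on error. */
int set_first_byte(const char *path, char c)
{
    struct stat sb;
    char *p;
    int rc = -1;
    int fd = open(path, O_RDWR);

    if (fd == -1)
        return -1;
    if (fstat(fd, &sb) == 0 && sb.st_size > 0) {
        p = mmap(NULL, (size_t)sb.st_size, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = c;                                   /* plain memory write */
            rc = msync(p, (size_t)sb.st_size, MS_SYNC); /* flush to the file */
            if (munmap(p, (size_t)sb.st_size) == -1)
                rc = -1;
        }
    }
    close(fd);
    return rc;
}
```

MAP_SHARED is what makes the write visible in the file (and to other processes mapping it); with MAP_PRIVATE the change would stay in this process only.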
Example (1/2)
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h> /* mmap() is declared in this header */
#include <fcntl.h>
#include <string.h>   /* memcpy() */
#include <unistd.h>   /* lseek(), write() */
/* FILE_MODE is not defined on the slide; 0644 is assumed here */
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)
int main (int argc, char *argv[]) // will be used as cp source dest
{
    int fdin, fdout;
    char *src, *dst;
    struct stat statbuf;
    if (argc != 3)
        return 1;
    /* error checks on the calls below are omitted for brevity */
    fdin = open (argv[1], O_RDONLY);
    fdout = open (argv[2], O_RDWR | O_CREAT | O_TRUNC, FILE_MODE);
Example (2/2)
    fstat (fdin, &statbuf); // get size
    lseek (fdout, statbuf.st_size - 1, SEEK_SET); // go to the location corresponding to the last byte
    write (fdout, "", 1); // output file is now the size of the input file
    /* mmap the input and output files */
    src = mmap (0, statbuf.st_size, PROT_READ, MAP_SHARED, fdin, 0);
    dst = mmap (0, statbuf.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);
    memcpy (dst, src, statbuf.st_size);
    return 0;
} // if we wanted to write and continue we would have to use munmap or msync
Sync
                Process                               Thread
Shared memory   Shared memory via mmap(2)             Everything is shared
Mutex           Mutex via flock(2)                    Mutex via pthread_mutex
Cond            Cond via socket and message passing   Cond via pthread_cond
Why do mmap(2) and memcpy(3) work faster than read(2) and write(2) when copying files?
Explanation
When using read(2) and write(2) we copy from OS buffers to user buffers and back.
This copy is redundant. mmap(2) uses the OS buffers directly, making fewer memory copies. Therefore, we should expect mmap to work faster.
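For comparison, here is a sketch of the read(2)/write(2) copy that the explanation refers to. The helper name copy_read_write and the 4096-byte buffer size are assumptions made for this illustration; the comments mark the two copies between kernel and user space.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies src to dst with read(2)/write(2).
 * Returns the number of bytes copied, or -1 on error. */
long copy_read_write(const char *src, const char *dst)
{
    char buf[4096]; /* the user-space buffer the explanation refers to */
    long total = 0;
    ssize_t n;
    int in = open(src, O_RDONLY);
    int out;

    if (in == -1)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {   /* kernel -> user copy */
        ssize_t done = 0;
        while (done < n) {                          /* user -> kernel copy */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w == -1) {
                n = -1;
                break;
            }
            done += w;
        }
        if (n == -1)
            break;
        total += done;
    }
    close(in);
    close(out);
    return n == -1 ? -1 : total;
}
```

Timing this against an mmap-based copy on the same large file is one way to check the claim on a given system.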
Linking
Symlink(2)
Link(2)
Create soft and hard links
Types of links
Hard link
  A new (additional) name for a file.
  The file is stored once with two (or more) names.
  There is no “master” and “slave”; all names are equal.
  Problems
    Who is charged for the file for quota?
    How to save links after backup?
Symbolic (soft) link
  The new file contains the path of the linked file.
  Most APIs follow symbolic links by reading the content and opening the linked file.
  If the linked file is deleted we have a broken link: a link that points at nothing.
link(2) and unlink(2)
Unlink – deletes a directory entry. Removes a “link” from a directory. When all links are removed the file is deleted. (Most files are linked once.)
Link – creates a hard link, giving a new name to a file. The file now resides in both the original and the new directory.
Unlink(2)
NAME
unlink -- remove directory entry
SYNOPSIS
#include <unistd.h>
int
unlink(const char *path);
link(2)
NAME
link -- make a hard file link
SYNOPSIS
#include <unistd.h>
int
link(const char *path1, const char *path2);
symlink(2)
NAME
symlink -- make symbolic link to a file
SYNOPSIS
#include <unistd.h>
int
symlink(const char *path1, const char *path2);
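The difference between the two kinds of link can be observed with stat(2) and lstat(2). In this sketch all three helper names (make_links, hard_link_count, is_symlink) are made up for the illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates file, then a hard link and a symbolic link to it. */
int make_links(const char *file, const char *hard, const char *soft)
{
    int fd = open(file, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return -1;
    close(fd);
    if (link(file, hard) == -1)
        return -1;
    return symlink(file, soft);
}

/* Hypothetical helper: number of hard links to path, or -1 on error.
 * stat(2) follows symbolic links, so a broken link reports -1. */
long hard_link_count(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long)sb.st_nlink;
}

/* Hypothetical helper: 1 if path itself is a symbolic link, 0 if not,
 * -1 on error. lstat(2) is used because stat(2) would follow the link. */
int is_symlink(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) == -1)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}
```

After make_links, unlinking the original name leaves the data reachable through the hard link (link count drops from 2 to 1), while the symbolic link becomes broken.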
Signals - revisited
Debt from homework
Set signal handler
signal(3) – old interface
sigaction(2) – new interface (more generic)
These functions take a signal number and a pointer to a signal handler function.
When the signal is caught, the handler function is called.
Signal(3)
SYNOPSIS
#include <signal.h>
void (*
signal(int sig, void (*func)(int)))(int);
// or in the equivalent but easier to read typedef'd version:
typedef void (*sig_t) (int);
sig_t
signal(int sig, sig_t func);
Sigaction(2) (1/2)
SYNOPSIS
#include <signal.h>
struct sigaction {
    union {
        void (*__sa_handler)(int);
        void (*__sa_sigaction)(int, struct __siginfo *, void *);
    } __sigaction_u;  /* signal handler */
    int sa_flags;     /* see signal options below */
    sigset_t sa_mask; /* signal mask to apply */
};
Sigaction(2) (2/2)
#define sa_handler __sigaction_u.__sa_handler
#define sa_sigaction __sigaction_u.__sa_sigaction
int
sigaction(int sig, const struct sigaction * restrict act,
    struct sigaction * restrict oact);
Kill(2)
SYNOPSIS
#include <signal.h>
int
kill(pid_t pid, int sig);
What you may do in a sig handler
Signal handlers are very low-level code. They are invoked by the kernel as an interrupt to the program (even in the middle of another system call) – some of you got EINTR on select(2).
Therefore we must not get another interrupt inside a signal handler.
Use only memory operations and non-blocking system calls. (Specifically, avoid file I/O in signal handlers.)
Getting another signal inside a signal handler may result in undefined behavior or even endless recursion.
Things to specifically avoid
Don't start a new process from a signal handler. (If the process dies and you get SIGCHLD you will end up in endless recursion.)
Avoid calling syslog from a signal handler.
Adding entries to a log queue in the signal handler and removing them from the queue later is a good alternative.
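The log-queue alternative could look roughly like the sketch below: the handler only stores the signal number in a small ring buffer, and normal code writes the log entries later. All names (queue, head, tail, enqueue_signal, drain_log_queue) and the queue size are assumptions made for this illustration, and the sketch assumes handlers do not interrupt each other.

```c
#include <signal.h>
#include <stdio.h>

#define QUEUE_SIZE 16 /* assumption: one slot stays empty, so 15 pending entries */

static volatile sig_atomic_t queue[QUEUE_SIZE];
static volatile sig_atomic_t head = 0; /* advanced only by the handler */
static volatile sig_atomic_t tail = 0; /* advanced only by normal code */

/* Signal handler: memory operations only, no syslog, no file I/O. */
void enqueue_signal(int sig)
{
    int next = (head + 1) % QUEUE_SIZE;

    if (next != tail) { /* drop the entry if the queue is full */
        queue[head] = sig;
        head = next;
    }
}

/* Called from the main loop, outside any handler. Returns entries logged. */
int drain_log_queue(FILE *log)
{
    int count = 0;

    while (tail != head) {
        fprintf(log, "caught signal %d\n", (int)queue[tail]);
        tail = (tail + 1) % QUEUE_SIZE;
        count++;
    }
    return count;
}
```

The handler would be installed with signal(3) or sigaction(2), and the main loop would call drain_log_queue(stderr), or a syslog-based equivalent, at a point where blocking I/O is safe.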
Interrupted system call
You get EINTR on select(2).
Check chapter 10.5 in Advanced Programming in the UNIX Environment.
Also read the select(2) man page and look for EINTR.
Basically, if you get a signal such as SIGCHLD, select will fail. This is a condition you have to check for, but you don't have to die (actually, you should not die).