Stat, mmap, process syncing and file system interface

Nezer J. Zaidenberg

Upload: sheila-simmons

Post on 16-Dec-2015


TRANSCRIPT

Page 1: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Stat, mmap, process syncing and file system interface

Nezer J. Zaidenberg

Page 2: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Agenda

We will discuss how files are accessed in UNIX.

Lots of system calls will be introduced (most of them reintroduced).

We will also discuss shared memory and syncing between processes.

Page 3: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Prior knowledge

open(2) and fopen(3) – open a file to read/write.

close(2) and fclose(3) – close an open file.

read(2), write(2), fread(3), fwrite(3)

opendir(3), readdir(3) – read directory contents

All of these are covered in K&R2 chapter 8.

Page 4: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

stat(2)

stat(2) is a system call used to learn about a file.

Information received:
- Access permissions
- Size
- Owner
- Group
- File last access time (atime), change time (ctime) and modification time (mtime). ctime = file information (inode metadata) changed, mtime = content changed.
- Etc.

stat(2) fills in a struct with all the file information.

Page 5: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

stat(2)

NAME

stat, lstat, fstat -- get file status

SYNOPSIS

#include <sys/types.h>
#include <sys/stat.h>

int stat(const char *path, struct stat *sb);
int lstat(const char *path, struct stat *sb);
int fstat(int fd, struct stat *sb);
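A minimal sketch of how the calls in the synopsis are typically used. The helper names `file_size` and `is_directory` are hypothetical and chosen only for illustration; the caller supplies the `struct stat` storage and the kernel fills it in.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size of the file at `path` in bytes, or -1 if stat(2) fails. */
long long file_size(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long long)sb.st_size;
}

/* Return 1 if `path` names a directory, 0 otherwise (including errors). */
int is_directory(const char *path)
{
    struct stat sb;

    return stat(path, &sb) == 0 && S_ISDIR(sb.st_mode);
}
```

Replacing stat() with lstat() in these helpers would report on a symbolic link itself instead of the file it points to.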

Page 6: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Struct stat (1/2)

struct stat {
dev_t st_dev; /* device inode resides on */
ino_t st_ino; /* inode's number */
mode_t st_mode; /* inode protection mode */
nlink_t st_nlink; /* number of hard links to the file */
uid_t st_uid; /* user-id of owner */
gid_t st_gid; /* group-id of owner */
dev_t st_rdev; /* device type, for special file inode */

Page 7: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Struct stat (2/2)

struct timespec st_atimespec; /* time of last access */
struct timespec st_mtimespec; /* time of last data modification */
struct timespec st_ctimespec; /* time of last file status change */
off_t st_size; /* file size, in bytes */
quad_t st_blocks; /* blocks allocated for file */
u_long st_blksize; /* optimal file sys I/O ops blocksize */
u_long st_flags; /* user defined flags for file */
u_long st_gen; /* file generation number */
};

Page 8: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Access permission

Like everything in the IDF, file system permissions in UNIX are divided into 3 parts:

ME – what I can do.

MY GROUP – (every user has a group) – what my group can do.

OTHER – what everybody else can do.

Page 9: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

ACCESS PERMISSIONS

And of course what everybody can do also divides into 3:
- Permission to read (file contents, directory contents)
- Permission to write (file contents, add files to directory)
- Permission to execute (file, cd to directory)

FILE SYSTEM PERMISSIONS ARE MANDATORY!

We use 3 octal digits to represent permissions.

Read = 4, write = 2, execute = 1. We sum the permission numbers and get the digit.

Page 10: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Some examples

Permission   Meaning
644          6 = 4+2: I can read and write. My group and everybody else can read.
777          7 = 4+2+1: everybody can read, write and execute.
705          7 = 4+2+1, 5 = 4+1: I can read, write and execute. My group can't do anything, but other people can read and execute.
700          Only I can read, write and execute the file.
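The octal digits can be decoded bit by bit. The sketch below turns a mode such as 0644 into the familiar `rw-r--r--` form; the function name `mode_to_string` is hypothetical, and only the low nine permission bits are examined.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Write the nine permission characters for `mode` (for example "rw-r--r--")
 * into `out`, which must have room for 10 bytes including the terminator. */
void mode_to_string(mode_t mode, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";

    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read (0400); bit 0 is other-execute (0001). */
        mode_t bit = (mode_t)1 << (8 - i);
        out[i] = (mode & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}
```

Passing `sb.st_mode` from a stat(2) call works as well, since the file type bits sit above the nine bits tested here.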

Page 11: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

lseek(2)

Move to a specific location in a file.

Usage example:
- Large database in a specific file
- User interested in a specific value
- lseek(2) to the right location and read the value

Page 12: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

lseek(2)

NAME

lseek -- reposition read/write file offset

SYNOPSIS

#include <unistd.h>

off_t
lseek(int fildes, off_t offset, int whence);
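The database scenario can be sketched as follows, assuming the file holds fixed-size records stored back to back. The `struct record` layout and the name `read_record` are hypothetical and serve only to illustrate the seek-then-read pattern.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record; the layout is an assumption for this sketch. */
struct record {
    int  id;
    char name[28];
};

/* Read record number `index` (0-based) from `fd` into `out`.
 * Returns 0 on success, -1 on a seek error, read error or short read. */
int read_record(int fd, long index, struct record *out)
{
    off_t where = (off_t)index * (off_t)sizeof(struct record);

    if (lseek(fd, where, SEEK_SET) == (off_t)-1)
        return -1;

    ssize_t n = read(fd, out, sizeof *out);
    return n == (ssize_t)sizeof *out ? 0 : -1;
}
```

Because every record has the same size, the offset is a single multiplication and no earlier record has to be read.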

Page 13: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

creat(2)

SYNOPSIS

#include <fcntl.h>

int
creat(const char *path, mode_t mode);

// open(2) can also be used for this: creat(path, mode) is equivalent to
// open(path, O_CREAT | O_TRUNC | O_WRONLY, mode).

Page 14: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

dup(2)

SYNOPSIS

#include <unistd.h>

int
dup(int oldd);

int
dup2(int oldd, int newd);
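A common use of dup(2) and dup2(2) is redirecting a descriptor such as standard output to a file and restoring it afterwards. This is a sketch of that pattern only; the helper name `redirect_fd` and the 0644 mode are assumptions.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make descriptor `target` (for example STDOUT_FILENO) refer to `path`.
 * Returns a duplicate of the old `target`, which the caller can restore
 * later with dup2(saved, target), or -1 on error. */
int redirect_fd(int target, const char *path)
{
    int saved = dup(target);            /* keep the old destination alive */
    if (saved == -1)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }
    if (dup2(fd, target) == -1) {       /* target now points at the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                          /* target keeps the file open */
    return saved;
}
```

After `int saved = redirect_fd(STDOUT_FILENO, "out.txt");` everything written to standard output lands in the file until `dup2(saved, STDOUT_FILENO)` undoes the redirection.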

Page 15: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

File locking

Different flavours of UNIX offer different file locking.

Two system calls are used:
- flock(2) – on BSD UNIX (Mac OS X, FreeBSD, Darwin)
- fcntl(2) – on SVR4-style UNIX (Linux, Solaris, HP-UX)

File locking only works between processes, not threads.

Use fcntl(2) locking on TAU.

Page 16: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

flock(2)

SYNOPSIS

#include <sys/file.h>

#define LOCK_SH 1 /* shared lock */
#define LOCK_EX 2 /* exclusive lock */
#define LOCK_NB 4 /* don't block when locking */
#define LOCK_UN 8 /* unlock */

int
flock(int fd, int operation);

Page 17: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

fcntl(2)

SYNOPSIS

#include <fcntl.h>

int
fcntl(int fd, int cmd, ... /* arg */);

OFTEN USED FOR FILE LOCKING ON SYS V SYSTEMS (for locking, arg is a pointer to struct flock)

Page 18: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Memory mapping

mmap(2)

munmap(2)

msync(2)

Page 19: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

mmap(2)

Maps a file to memory.

This memory can be shared (among processes) or locked.

Check mprotect(2) for the locking option (beyond our scope).

Use msync(2) to write changes.

Use munmap(2) to write and free memory.

Homework: use read(2) and write(2) to copy a file. Then use mmap and memory copy. What works faster?

Page 20: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

mmap(2)

NAME

mmap -- map files or devices into memory

SYNOPSIS

#include <sys/mman.h>

void *
mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t offset);

Page 21: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

munmap(2)

NAME

munmap -- remove a mapping

SYNOPSIS

#include <sys/mman.h>

int munmap(void *addr, size_t len);

Page 22: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

msync(2)

NAME

msync -- synchronize a mapped region

SYNOPSIS

#include <sys/mman.h>

int msync(void *addr, size_t len, int flags);

Page 23: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Example (1/2)

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h> /* mmap() is declared in this header */
#include <fcntl.h>
#include <unistd.h> /* lseek(), write() */
#include <string.h> /* memcpy() */

/* FILE_MODE is not defined on the slide; 0644 is an assumed value */
#define FILE_MODE (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)

int main (int argc, char *argv[]) // will be used as: cp source dest
{
int fdin, fdout;
char *src, *dst;
struct stat statbuf;

fdin = open (argv[1], O_RDONLY);
fdout = open (argv[2], O_RDWR | O_CREAT | O_TRUNC, FILE_MODE);

Page 24: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Example (2/2)

fstat (fdin, &statbuf); // get size
lseek (fdout, statbuf.st_size - 1, SEEK_SET); // go to the location corresponding to the last byte
write (fdout, "", 1); // output file is now the size of the input file

/* mmap the input and output files */
src = mmap (0, statbuf.st_size, PROT_READ, MAP_SHARED, fdin, 0);
dst = mmap (0, statbuf.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);

memcpy (dst, src, statbuf.st_size);

return 0;
} // if we wanted to write and continue we would have to use munmap or msync
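The slide code omits error handling to fit on two pages. A fuller sketch of the same idea is given below, with two departures that are assumptions of this sketch rather than part of the slides: it sizes the output with ftruncate(2) instead of the lseek/write trick, and it wraps the logic in a hypothetical function `mmap_copy`.

```c
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Copy file `from` to file `to` using mmap(2) and memcpy(3).
 * Returns 0 on success, -1 on any error. */
int mmap_copy(const char *from, const char *to)
{
    int rc = -1, fdin = -1, fdout = -1;
    void *src = MAP_FAILED, *dst = MAP_FAILED;
    size_t len = 0;
    struct stat sb;

    if ((fdin = open(from, O_RDONLY)) == -1) goto out;
    if (fstat(fdin, &sb) == -1) goto out;
    if ((fdout = open(to, O_RDWR | O_CREAT | O_TRUNC, 0644)) == -1) goto out;

    len = (size_t)sb.st_size;
    if (len == 0) { rc = 0; goto out; }          /* mmap rejects length 0 */
    if (ftruncate(fdout, sb.st_size) == -1) goto out;

    src = mmap(NULL, len, PROT_READ, MAP_SHARED, fdin, 0);
    if (src == MAP_FAILED) goto out;
    dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);
    if (dst == MAP_FAILED) goto out;

    memcpy(dst, src, len);
    if (msync(dst, len, MS_SYNC) == 0)           /* push changes to the file */
        rc = 0;
out:
    if (dst != MAP_FAILED) munmap(dst, len);
    if (src != MAP_FAILED) munmap(src, len);
    if (fdout != -1) close(fdout);
    if (fdin != -1) close(fdin);
    return rc;
}
```

The output file must be opened O_RDWR even though it is only written, because a shared writable mapping needs read access to the descriptor.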

Page 25: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Sync

                Process                               Thread
Shared memory   Shared memory via mmap(2)             Everything is shared
Mutex           Mutex via flock(2)                    Mutex via pthread_mutex
Cond            Cond via socket and message passing   Cond via pthread_cond

Page 26: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Why do mmap(2) and memcpy(3) work faster than read(2) and write(2) when copying files?

Page 27: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Explanation

With read(2) and write(2) we copy from the OS buffers to user buffers and back.

This copy is redundant. mmap(2) just uses the OS buffers, making fewer memory copies. Therefore, we should expect mmap to work faster.
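For comparison, a read/write copy might look like the sketch below. The user-space buffer `buf` is exactly the intermediate stop that the explanation calls redundant. The function name `rw_copy` and the 4096-byte buffer size are assumptions.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy file `from` to file `to` with read(2) and write(2).
 * Returns 0 on success, -1 on error. */
int rw_copy(const char *from, const char *to)
{
    char buf[4096];   /* data travels kernel -> buf -> kernel */
    int rc = 0;
    ssize_t n;

    int fdin = open(from, O_RDONLY);
    if (fdin == -1)
        return -1;
    int fdout = open(to, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fdout == -1) {
        close(fdin);
        return -1;
    }

    while ((n = read(fdin, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write(2) may write less than asked */
            ssize_t w = write(fdout, buf + off, (size_t)(n - off));
            if (w == -1) {
                rc = -1;
                goto done;
            }
            off += w;
        }
    }
    if (n == -1)
        rc = -1;
done:
    close(fdout);
    close(fdin);
    return rc;
}
```

Timing this against an mmap-based copy on a large file is one way to check the expectation; the result can depend on file size and on what is already cached.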

Page 28: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Linking

symlink(2)

link(2)

Create soft and hard links

Page 29: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Types of links

Hard Link
- A new (additional) name for a file.
- The file is stored once with two (or more) names.
- There is no “master” and “slave” afterwards. All names are equal.

Problems
- Who is charged for the file for quota?
- How to save links after backup?

Symbolic (soft) Link
- A new file that contains the path of the linked file.
- Most APIs follow symbolic links by reading the content and opening the linked file.
- If the linked file is deleted, we have a broken link: a link that points at nothing.

Page 30: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

link(2) and unlink(2)

Unlink – delete a directory entry. Removes a “link” from a directory. When all links are removed (and no process still has the file open) the file is deleted. (Most files are linked once.)

Link – create a hard link – give a new name to a file. The file now resides in both the original and the new directory.

Page 31: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

unlink(2)

NAME

unlink -- remove directory entry

SYNOPSIS

#include <unistd.h>

int
unlink(const char *path);

Page 32: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

link(2)

NAME

link -- make a hard file link

SYNOPSIS

#include <unistd.h>

int
link(const char *path1, const char *path2);

Page 33: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

symlink(2)

NAME

symlink -- make symbolic link to a file

SYNOPSIS

#include <unistd.h>

int
symlink(const char *path1, const char *path2);

Page 34: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Signals - revisited

Debt from homework

Page 35: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Set signal handler

signal(3) – old interface

sigaction(2) – new interface (more generic)

These functions take a signal number and a pointer to a signal handler function.

When the signal is caught, the handler function is called.

Page 36: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

signal(3)

SYNOPSIS

#include <signal.h>

void (*signal(int sig, void (*func)(int)))(int);

// or in the equivalent but easier to read typedef'd version:

typedef void (*sig_t) (int);

sig_t
signal(int sig, sig_t func);

Page 37: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

sigaction(2) (1/2)

SYNOPSIS

#include <signal.h>

struct sigaction {
union {
void (*__sa_handler)(int);
void (*__sa_sigaction)(int, struct __siginfo *, void *);
} __sigaction_u; /* signal handler */
int sa_flags; /* see signal options below */
sigset_t sa_mask; /* signal mask to apply */
};

Page 38: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

sigaction(2) (2/2)

#define sa_handler __sigaction_u.__sa_handler
#define sa_sigaction __sigaction_u.__sa_sigaction

int
sigaction(int sig, const struct sigaction * restrict act,
struct sigaction * restrict oact);

Page 39: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

kill(2)

SYNOPSIS

#include <signal.h>

int
kill(pid_t pid, int sig);

Page 40: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

What you may do in a signal handler

Signal handlers are very low-level code. They are invoked by the kernel as an interrupt to the program (even while it is in the middle of another system call). Some of you got EINTR on select(2).

Therefore we must not get another interrupt inside a signal handler.

Use only memory operations and non-blocking system calls (specifically, avoid file I/O in signal handlers).

Getting another signal inside a signal handler may result in undefined behavior or even endless recursion.

Page 41: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Things to specifically avoid

Don't start a new process from a signal handler (if the process dies and you get SIGCHLD, you will end up in endless recursion).

Avoid calling syslog from a signal handler.

Adding entries to a log queue in the signal handler and removing them from the queue later is a good alternative.

Page 42: Stat, mmap, process syncing and file system interface Nezer J. Zaidenberg

Interrupted system call

You get EINTR on select(2).

Check chapter 10.5 in Advanced Programming in the UNIX Environment.

Also do man select(2) and check EINTR.

Basically, if you get a signal such as SIGCHLD, select will fail. This is a condition you have to check, but you don't have to die (actually, you should not die).