file system


Upload: svein

Post on 23-Jan-2016


Page 1: File System

1

File System

1. Data Structure
2. Functions

Lecture 05: File System

Page 2: File System

2

Kernel Data Structure for File

[Figure: Process 1 and Process 2, each with a PCB; the objects CPU, mem and File; the File has an FCB. Legend: a table is a data structure; an object is hardware or software.]

Page 3: File System

3

Meta-data for a File

• Information the kernel needs for a file:
– owner (e.g. Clinton)
– protection (e.g. rwx r-- r--)
– device (e.g. disk)
– content (e.g. sector address)
– device driver routines (e.g. read(), open())
– current access position (e.g. offset)
– …

• In the Linux kernel, the read/write system calls are sequential: each call continues from the current offset.
• Try “man 2 read” for the system call parameters. No offset is passed; it is assumed.
• For random access, use the lseek() system call, which moves the offset.
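The sequential behaviour and the effect of lseek() can be tried with a short sketch. It assumes a POSIX system; the function name offset_demo and the scratch file name are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Writes "0123456789" to a scratch file, reads two bytes sequentially,
 * then moves the offset with lseek() and reads one more byte.
 * The three bytes read are stored in out[0..2].
 * Returns 0 on success, -1 on any failure. */
int offset_demo(char out[3])
{
    char path[] = "/tmp/offset_demo_XXXXXX";   /* hypothetical scratch name */
    assert(out != NULL);

    int fd = mkstemp(path);                    /* creates and opens the file */
    if (fd < 0)
        return -1;
    unlink(path);                              /* file disappears once fd is closed */

    int ok = write(fd, "0123456789", 10) == 10
          && lseek(fd, 0, SEEK_SET) == 0       /* rewind: offset = 0 */
          && read(fd, &out[0], 1) == 1         /* reads '0', offset becomes 1 */
          && read(fd, &out[1], 1) == 1         /* reads '1', offset becomes 2 */
          && lseek(fd, 7, SEEK_SET) == 7       /* random access: offset = 7 */
          && read(fd, &out[2], 1) == 1;        /* reads '7' */
    close(fd);
    return ok ? 0 : -1;
}
```

Note that read() never receives an offset argument; the kernel keeps it per open file and only lseek() changes it explicitly.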

Page 4: File System

4

contiguous allocation

scattered allocation

Page 5: File System

5

Contents of File FA may be stored on disk non-contiguously, in units of disk sectors.

[Figure: four File content sectors scattered over the disk]

• Why not contiguous allocation?
 (O) fast – if the whole content is read or written sequentially; used for swap, device copy, …
 (X) space management – many small, useless holes (external fragmentation)

Page 6: File System

6

[Figure: four File content sectors and one File metadata block on disk]

Kernel maintains metadata for each file

Page 7: File System

7

[Figure: four File content sectors and one File metadata block on disk]

File metadata includes pointers to data sectors

Page 8: File System

8

[Figure: four File content sectors and the File metadata on disk, plus a second copy of the File metadata]

open() retrieves the metadata from disk into main memory, but not the contents: they are too big!

Page 9: File System

9

[Figure: four File content sectors and the File metadata on disk, plus a second copy of the File metadata]

This metadata has pointers to data sectors

Page 10: File System

10

Split Metadata for file

[Figure: the File content sectors of file FX and its FX metadata; processes PA, PB and PC, each shown with FX metadata]

Page 11: File System

11

Split Metadata for file

– owner
– protection information
– device
– pointer to file content
– device driver routines
– offset

Systemwide information: all processes share a single copy in memory (the “inode” struct).

Private information: each process has its own copy, since processes access different parts of the file (the “file” struct).

Page 12: File System

12

So we have two data structures for each file:

(system) file table – private information, per-process data: the offset, i.e. the next byte position to read or write.

inode table – shared (systemwide) information, a single copy globally: the other information, which changes less frequently.

[Figure: PA and PB each have their own offset entry in the (system) file table; the inode table holds the other info.]

Page 13: File System

13

/*
 * One file structure is allocated for each open/creat/pipe call.
 * Main use is to hold the read/write offset.
 */
struct file {
    char  f_flag;
    char  f_count;       /* reference count */
    int   f_inode;       /* pointer to inode structure */
    char *f_offset[2];   /* read/write character pointer */
} file[NFILE];

/* flags */
#define FREAD  01
#define FWRITE 02
#define FPIPE  04

Page 14: File System

14

struct inode {
    char  i_flag;
    char  i_count;     /* reference count */
    int   i_dev;       /* device where inode resides */
    int   i_number;    /* i number, 1-to-1 with device address */
    int   i_mode;
    char  i_nlink;     /* directory entries */
    char  i_uid;       /* owner */
    char  i_gid;       /* group of owner */
    char  i_size0;     /* most significant of size */
    char *i_size1;     /* least sig */
    int   i_addr[8];   /* device addresses constituting file */
    int   i_lastr;     /* last logical block read (for read-ahead) */
} inode[NINODE];

Page 15: File System

15

Sharing Files

• Example
– (Case-1) who, grep -- pipe file
 • % who | grep
 • share inode (pipe file)
 • do not share offset
– (Case-2) parent/child -- tty file
 • % vi
 • share inode (tty file)
 • share offset

[Figure: who and grep connected by a pipe; sh and vi both attached to tty(in)]
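The two kinds of sharing can be observed from user space. The sketch below is an illustration under assumptions, not the lecture's code: it assumes a POSIX system, uses dup() as a stand-in for the parent/child case (a descriptor inherited across fork() refers to the same file table entry in the same way), and uses a second open() of the same file for the "same inode, separate offset" case. The name sharing_demo is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* After 4 bytes are read through the original descriptor:
 *   pos[0] = offset seen through a dup()ed descriptor (shared entry)
 *   pos[1] = offset seen through a second open() (separate entry)
 * Returns 0 on success, -1 on failure. */
int sharing_demo(long pos[2])
{
    char path[] = "/tmp/sharing_demo_XXXXXX";   /* hypothetical scratch name */
    char buf[4];
    assert(pos != NULL);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    int shared = dup(fd);                   /* same file table entry, same offset */
    int separate = open(path, O_RDONLY);    /* new file table entry, same inode */
    unlink(path);

    int ok = shared >= 0 && separate >= 0
          && write(fd, "abcdefgh", 8) == 8
          && lseek(fd, 0, SEEK_SET) == 0
          && read(fd, buf, 4) == 4;         /* moves the offset of fd to 4 */
    if (ok) {
        pos[0] = (long)lseek(shared, 0, SEEK_CUR);     /* follows fd */
        pos[1] = (long)lseek(separate, 0, SEEK_CUR);   /* unaffected */
    }
    if (shared >= 0) close(shared);
    if (separate >= 0) close(separate);
    close(fd);
    return ok ? 0 : -1;
}
```

The dup()ed descriptor reports offset 4 because it shares the file table entry; the separately opened one still reports 0 although both lead to the same inode.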

Page 16: File System

16

Sharing files

[Figure: processes who, grep, vi, sh and game; a (system) file table with four offset entries; an inode table with three inodes, for the pipe file, the game file and the tty device. Labels: pipe, process group, “$ who | grep”, “$ vi”.]

Page 17: File System

17

Device switch table
• 2-dim array which maps (device name, operation name) => device driver routine
• device independence (above: file, below: device)

[Figure: devswtab[] with columns open, close, read, write, ioctl; each cell holds the starting address of a driver routine, e.g. Read_lp]

Page 18: File System

18

struct cdevsw {
    int (*d_open)();
    int (*d_close)();
    int (*d_read)();
    int (*d_write)();
    int (*d_sgtty)();
} cdevsw[];

[Figure: one cdevsw[] entry with fields d_open, d_close, d_read, d_write, d_ioctl; the d_read field points to Read_lp]

Actually, it is a one-dimensional array of structs, not a two-dimensional array.
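A minimal user-space sketch of the same idea: an array of structs of function pointers, indexed by device number. The two toy drivers ("null" and "zero") and the names devsw, dev_read and dev_write are hypothetical and only stand in for real driver routines such as Read_lp.

```c
#include <assert.h>

/* One row per device; one function pointer per operation. */
struct devsw {
    int (*d_open)(void);
    int (*d_close)(void);
    int (*d_read)(char *buf, int n);
    int (*d_write)(const char *buf, int n);
};

/* Hypothetical "null" driver: reads return end-of-file, writes are discarded. */
static int null_open(void)  { return 0; }
static int null_close(void) { return 0; }
static int null_read(char *buf, int n)        { (void)buf; (void)n; return 0; }
static int null_write(const char *buf, int n) { (void)buf; return n; }

/* Hypothetical "zero" driver: reads fill the buffer with zero bytes. */
static int zero_read(char *buf, int n)
{
    for (int i = 0; i < n; i++)
        buf[i] = 0;
    return n;
}

static const struct devsw devswtab[] = {
    { null_open, null_close, null_read, null_write },   /* device 0 */
    { null_open, null_close, zero_read, null_write },   /* device 1 */
};
#define NDEV ((int)(sizeof devswtab / sizeof devswtab[0]))

/* Device-independent entry points: the caller names only the device number. */
int dev_read(int dev, char *buf, int n)
{
    assert(dev >= 0 && dev < NDEV);
    return devswtab[dev].d_read(buf, n);
}

int dev_write(int dev, const char *buf, int n)
{
    assert(dev >= 0 && dev < NDEV);
    return devswtab[dev].d_write(buf, n);
}
```

Code above the table only calls dev_read() or dev_write(); which routine runs is decided by the row selected, which is the device independence the slide refers to.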

Page 19: File System

19

Kernel tables after open(/a/b)

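[Figure: the user structure of PA, the (system) file table and the inode table, which now holds in-core inodes for /, a and b; on disk, the inodes of /, a and b and the data blocks. The in-core inode of b carries the device name.]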

Page 20: File System

20

Kernel tables after open(/a/b)

[Figure: as before, with a new (system) file table entry holding the offset for b; the inode table holds /, a and b; on disk, the inodes of /, a and b and the data blocks.]

Page 21: File System

21

Kernel tables after open(/a/b)

[Figure: as before, with u_ofile[] slots 0 to 4 in the user structure of PA; slot 4 leads to the (system) file table entry (offset) and on to the in-core inode of b. fd = 4.]

Page 22: File System

22

File descriptor table (or open file table)

• An array in struct user (the u_ofile[] array)
• per-process open file information
• a new entry is made whenever the program calls open() or creat()

fd = open(“/a/b”, …)

 (1) pathname of the file to open: the symbolic name
 (2) the kernel opens the file
 (3) a file descriptor is returned: the internal representation

– fd is an integer (“file descriptor”), numbered from 0, 1, 2, …
 • 0, 1, 2 are reserved for the standard input, output and error files
– used as an index into
 • the u_ofile[] array (file descriptor table, open file table)
 • the starting point to access the file (the entry points to the system file table)
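The chain fd → u_ofile[] → file table entry → inode can be modelled in a few lines. This is a toy user-space sketch, not kernel code: the struct layouts are simplified, NOFILE is set to 5 only to match the slots 0 to 4 drawn in the slides, and fd_alloc and fd_to_inode are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define NOFILE 5   /* descriptor slots per process (assumption: 0..4 as drawn) */

struct inode { int i_number; long i_size; };                          /* shared, systemwide */
struct file  { long f_offset; int f_count; struct inode *f_inode; };  /* one per open() */
struct user  { struct file *u_ofile[NOFILE]; };                       /* one per process */

/* Installs fp in the lowest free slot and returns that slot number (the fd),
 * or -1 if the table is full. */
int fd_alloc(struct user *u, struct file *fp)
{
    assert(u != NULL && fp != NULL);
    for (int fd = 0; fd < NOFILE; fd++) {
        if (u->u_ofile[fd] == NULL) {
            u->u_ofile[fd] = fp;
            fp->f_count++;            /* one more reference to this entry */
            return fd;
        }
    }
    return -1;
}

/* Follows fd -> u_ofile[fd] -> file -> inode; returns NULL for a bad fd. */
struct inode *fd_to_inode(const struct user *u, int fd)
{
    assert(u != NULL);
    if (fd < 0 || fd >= NOFILE || u->u_ofile[fd] == NULL)
        return NULL;
    return u->u_ofile[fd]->f_inode;
}
```

Choosing the lowest free slot is why a process whose slots 0 to 3 are already in use receives fd = 4 from its next open().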

Page 23: File System

23

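Kernel data structure for file

[Figure: the user structure of PA holds u_ofile[], the per-process open file table (file descriptor table), with slots 0 to 4 indexed by fd. Its entries point to (system) file table entries (offset), which point to inode table entries; from the inode, devswtab leads to the device routine.]

Note: “file handle” is the Windows name; it extends this notion to the network.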

Page 24: File System

24

Kernel Data Structure

[Figure: Process 1 and its user structure; for file FX, the offset and the inode; a read() call goes through devswtab (columns r, w, o, c).]

Page 25: File System

25

(System) file table

• struct file
• One entry for each open/creat/pipe
• may be shared (if the offset is shared)
• content
– offset
– counter (number of processes sharing this entry)
– pointer to inode table
– r/w/p flag

Page 26: File System

26

Inode table
• includes most of the information for a file
• shared by all processes
• changed less frequently (than the offset)
• content (while on disk)
– protection mode
– owner
– size
– pointer to sectors
– etc.

Page 27: File System

27

In-core Inode
• content (while on disk)
– protection mode
– owner
– size
– time
– array of pointers to disk blocks
• plus (at load time)
– counter (number of processes sharing the file)
– device name (major/minor device number)
– i-number (location of the inode on disk)
– status (locked, mount point, …)

Page 28: File System

28

[Figure: four File content blocks, reached from the pointer array within the inode]

Now you can reach any data block through the in-core inode.
These pointers are stored in an array within the inode.

Page 29: File System

29

Balanced tree

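• Example: 10 GB disk, 1K sectors
 # of sectors = 10,000,000,000 / 1000 = 10,000,000 sectors
 each sector pointer ----- 24 bits (2^24 = 16,777,216 > 10,000,000)
• The slides take a sector to hold about 50 sector pointers, written as (1000/24). Strictly, 24 bits is 3 bytes, so a 1000-byte sector has room for about 333 pointers.

[Figure: one index sector holding ~50 pointers]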

Page 30: File System

30

Balanced tree

• Example (continued): 10 GB disk, 1K sectors, 24-bit sector pointers, ~50 pointers per sector

[Figure: four index sectors, each holding ~50 pointers]

Page 31: File System

31

Balanced tree

• Example (continued): 10 GB disk, 1K sectors, 24-bit sector pointers, ~50 pointers per sector

[Figure: a tree of index sectors, one ~50-pointer sector at the top and two rows of three ~50-pointer sectors below]

Page 32: File System

32

Balanced tree

• Balanced tree of order ~ M (special insert/delete algorithm)
• Top level, master index
• UNIX – skewed tree

Page 33: File System

33

direct 0

direct 1

direct 2

direct 3

direct 4

direct 5

direct 6

direct 7

direct 8

direct 9

single indirect

double indirect

triple indirect

Data block

pointer array within inode

Page 34: File System

34

[Figure: the pointer array within the inode (direct 0 to direct 9, single indirect, double indirect, triple indirect) leading to data blocks]

Fast for small files (created by human beings at terminal keyboards); slower for big files. (timesharing application)

Page 35: File System

35

Offset vs Disk Block

[Figure: the pointer array within the inode (direct 0 to direct 9, single indirect, double indirect, triple indirect) and the data blocks, annotated with file offsets: ~1 KB, ~2 KB, ~9 KB, ~109 KB, ~10109 KB. Example offset: 57821.]
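How an offset selects a pointer in the skewed tree can be sketched as follows. The block size, the number of direct pointers and the number of pointers per index block are parameters; the values used in the note below (1 KB blocks, 10 direct pointers, 100 pointers per index block) are assumptions chosen to roughly match the ~109 KB and ~10109 KB labels. The names layout and classify_offset are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum level { DIRECT, SINGLE_INDIRECT, DOUBLE_INDIRECT, TRIPLE_INDIRECT, TOO_BIG };

struct layout {
    long block_size;       /* bytes per data block */
    long ndirect;          /* direct pointers in the inode */
    long ptrs_per_block;   /* pointers held by one index block */
};

/* Reports which pointer of the inode covers `offset`.
 * *index receives the block's position within that level
 * (for DIRECT, the direct pointer number). */
enum level classify_offset(const struct layout *l, long offset, long *index)
{
    assert(l != NULL && index != NULL && offset >= 0);
    long blk = offset / l->block_size;   /* logical block number in the file */
    long p = l->ptrs_per_block;

    if (blk < l->ndirect) { *index = blk; return DIRECT; }
    blk -= l->ndirect;
    if (blk < p)          { *index = blk; return SINGLE_INDIRECT; }
    blk -= p;
    if (blk < p * p)      { *index = blk; return DOUBLE_INDIRECT; }
    blk -= p * p;
    if (blk < p * p * p)  { *index = blk; return TRIPLE_INDIRECT; }
    return TOO_BIG;
}
```

With 1024-byte blocks, 10 direct pointers and 100 pointers per index block, offset 57821 falls in logical block 56, which is entry 46 of the single indirect block: one extra disk access compared with a direct block.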

Page 36: File System

36

Linux
• 1st-12th pointers – direct pointers
• 13th pointer – indirect pointer
• 14th pointer – doubly indirect pointer
• 15th pointer – triply indirect pointer
• ---------------
• Max 4096 GB of file data if
– block address - 32 bits
– block size - 4096 bytes
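The 4096 GB figure can be checked with a little arithmetic. The sketch below assumes that a 32-bit block address occupies 4 bytes, so that a 4096-byte index block holds 1024 pointers; the function name max_file_bytes is hypothetical.

```c
#include <assert.h>

/* Bytes addressable through ndirect direct pointers plus one single,
 * one double and one triple indirect pointer. */
unsigned long long max_file_bytes(unsigned long long block_size,
                                  unsigned long long addr_bytes,
                                  unsigned long long ndirect)
{
    assert(addr_bytes > 0 && block_size >= addr_bytes);
    unsigned long long p = block_size / addr_bytes;   /* pointers per index block */
    unsigned long long blocks = ndirect + p + p * p + p * p * p;
    return blocks * block_size;
}
```

For 4096-byte blocks, 4-byte addresses and 12 direct pointers this gives 4,402,345,721,856 bytes. The triple indirect term alone is 1024 × 1024 × 1024 blocks × 4096 bytes = 4096 GB, which is the figure quoted; the direct, single and double indirect pointers add only about 4 GB more.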

Page 37: File System

37

Disk Space for ...

[Figure: the disk divided into a space for inodes, holding several inodes, and a space for data blocks, holding several data blocks]

File data size --- varies
inode size --- fixed

Page 38: File System

38

Space for inode in Disk (each inode has a fixed size)

[Figure: inode 0, inode 1, inode 2, …, inode n laid out in sequence]

i-number: ordinal number (position in sequence) of the inode on disk

If I know (disk, i-number), I can access the file content.

[Figure: disk name and i-number lead to the inode content]

Page 39: File System

39

Directory file (it is also a file; content: <name, pointer> pairs)

file name    i-number
“a”          7
“b”          1
“bin”        3
“dev”        772

i-number = 3: the 3rd inode on disk, which leads to the data blocks.

Q: file name – is there a limit on characters?
Q: # of files – is there a limit?

Page 40: File System

40

Kernel tables before open(/a/b)

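[Figure: the (system) file table and the inode table, which holds only the in-core inode of /; the user structure of PA; on disk, the inode area and data area with the inodes and data blocks of /, a and b.]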

Page 41: File System

41

Kernel tables before open(/a/b)

[Figure: as before; the data blocks of / are shown with the directory entries a → 7, bin → 11, x → 8.]

Page 42: File System

42

Kernel tables before open(/a/b)

[Figure: as before; the data blocks of / hold the entries a → 7, bin → 11, x → 8, and the data blocks of a hold the entries b → 3, usr → 21, y → 6.]

Page 43: File System

43

open(“/a/b”)

/ (Directory): inode (i) and data blocks; entries a → 7, x → 6, y → 8, bin → 11, dev → 40

/a (Directory): inode (i) and data blocks; entries w → 7, u → 6, b → 8, ch → 11, temp → 40

/a/b (Regular File): inode (i) and data blocks; content of this file

Page 44: File System

44

open(“/a/b”, …)

• The kernel system call open() scans the pathname
– 1st -- root directory file:
 • get inode 0 in the disk inode space
 • read the data blocks of the root directory file
 • search for file name “a”
 • get the corresponding i-number for file “a”
– 2nd -- “a” file:
 • get the inode of file “a” from disk (it is a directory file)
 • get the data blocks of directory file “a”
 • search for file name “b”
 • get the corresponding i-number for file “b”

[Figure: the pathname / a / b scanned from left to right; root directory entries a → 7, bin → 12, dev → 11]

Page 45: File System

45

(continued)

– file “b”:
 • read the inode of “b” from disk (regular file)
 ---- the given pathname “/a/b” ends here ----
 • set up kernel data structures for file “b”
  – insert the inode into the in-core inode table
  – new entry in the system file table (offset set to zero)
  – new entry in u_ofile[] in user
  – return the file descriptor
  – open() is done

[Figure: the pathname / a / b]
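The component-by-component scan can be sketched with a toy in-memory "disk". Everything here is an assumption for illustration: directories are small fixed arrays of <name, i-number> pairs, the root is taken to be inode 0 as in the slides, and the names inode_toy, dirent_toy, dir_lookup and resolve are hypothetical. A real kernel reads inodes and directory blocks from disk at each step.

```c
#include <assert.h>
#include <string.h>

#define ROOT_INO 0   /* the slides fetch inode 0 for the root directory */
#define MAXENT   4   /* directory entries per toy directory */

struct dirent_toy { const char *name; int ino; };   /* <name, i-number> */

struct inode_toy {
    int is_dir;                       /* 1 for a directory file */
    struct dirent_toy ent[MAXENT];    /* directory content, if is_dir */
};

/* Searches one directory for a name of length len; returns its i-number or -1. */
static int dir_lookup(const struct inode_toy *dir, const char *name, size_t len)
{
    if (!dir->is_dir)
        return -1;
    for (int i = 0; i < MAXENT && dir->ent[i].name != NULL; i++) {
        if (strlen(dir->ent[i].name) == len &&
            memcmp(dir->ent[i].name, name, len) == 0)
            return dir->ent[i].ino;
    }
    return -1;
}

/* Resolves an absolute pathname to an i-number, one component per step. */
int resolve(const struct inode_toy *itab, int ninodes, const char *path)
{
    assert(itab != NULL && path != NULL && path[0] == '/');
    int ino = ROOT_INO;
    while (*path != '\0') {
        while (*path == '/')            /* skip separators */
            path++;
        if (*path == '\0')
            break;
        size_t len = strcspn(path, "/");               /* length of this component */
        ino = dir_lookup(&itab[ino], path, len);       /* name -> i-number */
        if (ino < 0 || ino >= ninodes)
            return -1;
        path += len;
    }
    return ino;
}
```

Each loop iteration corresponds to one "get inode, read directory blocks, search name" round in the list above, which is why a long pathname costs many disk accesses.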

Page 46: File System

46

Kernel tables after open(/a/b)

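[Figure: the (system) file table and the inode table, which now holds the in-core inodes of / and a; the user structure of PA; on disk, the inode area and data area with the inodes and data blocks of /, a and b.]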

Page 47: File System

47

Kernel tables after open(/a/b)

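[Figure: the user structure of PA, the (system) file table and the inode table, which now holds in-core inodes for /, a and b; on disk, the inodes of /, a and b and the data blocks. The in-core inode of b carries the device name.]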

Page 48: File System

48

Kernel tables after open(/a/b)

[Figure: as before, with a new (system) file table entry holding the offset for b; the inode table holds /, a and b; on disk, the inodes of /, a and b and the data blocks.]

Page 49: File System

49

Kernel tables after open(/a/b)

[Figure: as before, with u_ofile[] slots 0 to 4 in the user structure of PA; slot 4 leads to the (system) file table entry (offset) and on to the in-core inode of b. fd = 4.]

Page 50: File System

50

Kernel tables after open(/a/b)

[Figure: as before: u_ofile[] slots 0 to 4 in the user structure of PA, the (system) file table entry with the offset, and the inode table with /, a and b.]

fd = 4 is returned.

Once you have fd, you can access b’s inode after only 3 memory accesses.

Page 51: File System

51

(continued)
• open(“/a/b”) is very costly -- many disk accesses
– once (open or creat) is enough: translate (pathname => fd) once, and save it
– do not use the pathname in subsequent calls
 • read(), write(), close(), …
– use the file descriptor instead
 • read(fd, ...), write(fd, ...)
• Try “man read” … only one system call …

Page 52: File System

52

Try …

• man 2 read: read(int fd, …)
• man 2 open: open(char *pathname, …)
• man 3 fread: fread(…, FILE *file)

Page 53: File System

53

C functions for file

• Wait a minute …
– I used printf(), scanf(), getchar() … but never used read(), write() before …?
– I used FILE * … but never used fd (file descriptor) before …?

Right, most people use library functions, and the library then invokes system calls.
Remember? The library cannot perform I/O directly: library functions are in my (user) address space.

Page 54: File System

54

System calls for files

creat(), open(), close()
read(), write()
lseek() – move offset
stat() – get inode content

• All others are library functions
– e.g. scanf(), gets(), getchar(), …
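As an illustration of "stat() gets inode content", the sketch below wraps the POSIX stat() call and copies out three of the fields the inode holds. It assumes a POSIX system; the wrapper name file_info and its parameter list are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Looks up the inode behind `path` and reports its size, whether it is a
 * directory, and its i-number. Returns 0 on success, -1 if stat() fails. */
int file_info(const char *path, long long *size, int *is_dir,
              unsigned long long *inum)
{
    struct stat st;
    assert(path != NULL && size != NULL && is_dir != NULL && inum != NULL);

    if (stat(path, &st) != 0)     /* the kernel fills st from the inode */
        return -1;
    *size = (long long)st.st_size;
    *is_dir = S_ISDIR(st.st_mode) ? 1 : 0;
    *inum = (unsigned long long)st.st_ino;
    return 0;
}
```

The same information for many files at once is what “ls -li” prints; it also works through the stat() family of calls.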

Page 55: File System

55

System call vs. library call

                system call (in kernel)    library call (in a.out, user)
tty files       read()                     scanf() – format
                                           getchar() – char
                                           gets() – string
all files       read()                     fscanf(), fgetc(), fgets()
                                           fread() – any number
handle          fd                         FILE * (struct in lib)

Page 56: File System

56

FILE vs fd

[Figure: in the user part of a.out, my code (main(), add(), sub()) calls library functions such as fopen() and printf(); the library FILE struct holds a count and a pointer for its local buffer, and the file descriptor. A system call such as write() enters the kernel through trap(); there fd indexes u_ofile[] (slots 0 to 4) in user, leading to the (system) file table (offset), the inode table (/, a, b) and the data blocks.]

When the local buffer (in FILE) becomes empty, the read() system call fills this buffer again.

Page 57: File System

57

Example: open
1. my a.out calls the library function fopen(“/a/b/c”)
2. fopen() creates a struct FILE for /a/b/c
3. the library invokes the system call open(“/a/b/c”)
 – the kernel sets up tables (inode, user, …, u_ofile[])
 – the kernel returns the file descriptor fd
 – fopen() saves fd in the FILE (for future use)
 – fopen() returns a FILE *
4. my a.out saves the FILE * (for future use)
5. all future calls pass the FILE *, e.g. getc(fp)
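The steps above can be observed from a program. The sketch below assumes a POSIX system (fileno() and mkstemp() are POSIX, not ISO C); the function name stdio_demo and the scratch file are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a 12-byte scratch file, opens it with fopen() and reads one
 * character with getc(). Reports the descriptor stored in the FILE, the
 * character read, and the kernel's offset for that descriptor afterwards.
 * Returns 0 on success, -1 on failure. */
int stdio_demo(int *fd_out, int *first, long *kernel_off)
{
    char path[] = "/tmp/stdio_demo_XXXXXX";   /* hypothetical scratch name */
    assert(fd_out != NULL && first != NULL && kernel_off != NULL);

    int tmp = mkstemp(path);
    if (tmp < 0)
        return -1;
    int written = write(tmp, "hello world\n", 12) == 12;
    close(tmp);
    if (!written) {
        unlink(path);
        return -1;
    }

    FILE *fp = fopen(path, "r");   /* library call; open() happens inside */
    unlink(path);
    if (fp == NULL)
        return -1;
    *fd_out = fileno(fp);          /* the fd that fopen() saved in the FILE */
    *first = getc(fp);             /* library refills its buffer via read() */
    *kernel_off = (long)lseek(*fd_out, 0, SEEK_CUR);
    fclose(fp);
    return 0;
}
```

Although the program asked for a single character, the kernel offset is usually well past 1 afterwards, because the library read a whole buffer in one read() call; later getc() calls are served from that buffer without entering the kernel.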

Page 58: File System

58

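Example: getchar()

#include "syscalls.h"   /* header named in the source; assumed to supply read(), BUFSIZ and EOF
                           (on a POSIX system: <unistd.h> and <stdio.h>) */

int getchar(void)   /* library function -- copied into my a.out */
{
    static char buf[BUFSIZ];    /* library local buffer (data structure in library) */
    static char *bufp = buf;    /* pointer to the next character */
    static int n = 0;           /* counter: characters left in buf */

    /* Is the library local buffer empty? */
    if (n == 0) {
        /* Yes: invoke the read() system call and fill up the local buffer */
        n = read(0, buf, sizeof(buf));   /* system call */
        bufp = buf;
    }
    /* test with >= 0 so that the last buffered character is returned too */
    return (--n >= 0) ? (unsigned char) *bufp++ : EOF;   /* return a character */
}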

Page 59: File System

59

Functions for file handling

• So, you usually use the library …
– printf() for formatting (such as %s, %d)
– getchar() for performance …
• But all library I/O functions end up invoking system calls
 (library functions are “user” code and cannot do I/O directly)
• They are front ends that provide you with convenience, performance …

Many library functions may exist, but there is only one read() system call.