[537] file-system implementation

129
[537] File-System Implementation Chapter 40 Tyler Harter 11/05/14

Upload: others

Post on 30-Nov-2021

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: [537] File-System Implementation

[537] File-System Implementation

Chapter 40 Tyler Harter

11/05/14

Page 2: [537] File-System Implementation

Review File-System API

Page 3: [537] File-System Implementation

File NamesThree types of names: - inode number - path - file descriptor !

Why?

Page 4: [537] File-System Implementation

File Namesinode! - unique name - remember file size, permissions, etc !path! - easy to remember - hierarchical !file descriptor! - avoid frequent traversal - remember multiple offsets

Page 5: [537] File-System Implementation

File API

int fd = open(char *path, int flag, mode_t mode) !read(int fd, void *buf, size_t nbyte) !write(int fd, void *buf, size_t nbyte) !close(int fd)

Page 6: [537] File-System Implementation

Special Calls

fsync(int fd) !rename(char *oldpath, char *newpath) !flock(int fd, int operation)

Page 7: [537] File-System Implementation

How do you delete a file?

Page 8: [537] File-System Implementation

How do you delete a file? !

You don’t! It’s garbage collected when there are no more names (fds or paths)

Page 9: [537] File-System Implementation

012345

fd table

location = … size = …

inode 123

file data

a.out:123, … location = … size = …

inode 2root dir

other file

(per process)

Inodes, Paths, FDs

Page 10: [537] File-System Implementation

012345

fd table

location = … size = …

inode 123

file data

a.out:123, … location = … size = …

inode 2root dir

other file

(per process)

offset = 0 inode =

fd

Inodes, Paths, FDs

Page 11: [537] File-System Implementation

012345

fd table

location = … size = …

inode 123

file data

a.out:123, … location = … size = …

inode 2root dir

other file

(per process)

offset = 128 inode =

fd

Inodes, Paths, FDs

Page 12: [537] File-System Implementation

012345

fd table

location = … size = …

inode 123

file data

a.out:123, … location = … size = …

inode 2root dir

other file

(per process)

offset = 128 inode =

fd

opened /a.out, read 128 bytes

Inodes, Paths, FDs

Page 13: [537] File-System Implementation

Implementation

Page 14: [537] File-System Implementation

Implementation1. On-disk structures - how do we represent files, directories? !

2. Access methods - what steps must reads/writes take?

Page 15: [537] File-System Implementation

Disk Structures

Page 16: [537] File-System Implementation

Persistent StoreGiven: big array of bytes/blocks. Want: to add some structure/organization. !What 537 project (so far) is most similar?

Page 17: [537] File-System Implementation

Persistent StoreGiven: big array of bytes/blocks. Want: to add some structure/organization. !What 537 project (so far) is most similar? p3a: malloc. !You could build a persistent malloc that saves to disk (instead of to memory)! - use offsets instead of ptrs, writes instead of stores

Page 18: [537] File-System Implementation

Persistent Malloc vs. FSWhat features does a file system provide beyond what a persistent malloc would provide?

Page 19: [537] File-System Implementation

Persistent Malloc vs. FSWhat features does a file system provide beyond what a persistent malloc would provide? !

String names Hierarchy (names within names) Changeable file sizes Sharing across processes …

Page 20: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 21: [537] File-System Implementation

FS Structs: Empty Disk

D D D D D D D D0 7

D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

Page 22: [537] File-System Implementation

Data Blocks

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D D D D D D

Page 23: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 24: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 25: [537] File-System Implementation

Inodes

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Page 26: [537] File-System Implementation

Inodes

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Page 27: [537] File-System Implementation

inode!16

inode!17

inode!18

inode!19

inode!20

inode!21

inode!22

inode!23

inode!24

inode!25

inode!26

inode!27

inode!28

inode!29

inode!30

inode!31

Inodes are typically 128 or 256 bytes (depends on the FS). !

So 16 - 32 inodes per inode block.

Inode Block

Page 28: [537] File-System Implementation

inode!16

inode!17

inode!18

inode!19

inode!20

inode!21

inode!22

inode!23

inode!24

inode!25

inode!26

inode!27

inode!28

inode!29

inode!30

inode!31

Inodes are typically 128 or 256 bytes (depends on the FS). !

So 16 - 32 inodes per inode block.

Inode Block

Page 29: [537] File-System Implementation

Inodetype!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Page 30: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

file or directory?

Page 31: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

user and permissions

Page 32: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

size in bytes and blocks

Page 33: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

access time, create time

Page 34: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

how many paths

Page 35: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

N data blocks

Page 36: [537] File-System Implementation

Inodes

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Page 37: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

Assume 4-byte addrs. What is an upper bound

on the file size? (assume 256-byte inodes)

Page 38: [537] File-System Implementation

type!uid!rwx!size!

blocks!time!

ctime!links_count!

addrs[N]

Inode

Assume 4-byte addrs. What is an upper bound

on the file size? (assume 256-byte inodes)

!How to get larger files?

Page 39: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 40: [537] File-System Implementation

inode!

data! data! data! data!

Page 41: [537] File-System Implementation

inode!

indirect! indirect! indirect! indirect!

Page 42: [537] File-System Implementation

inode!

indirect! indirect! indirect! indirect!

indirects are stored in regular data blocks.

Page 43: [537] File-System Implementation

inode!

indirect! indirect! indirect! indirect!

what if we want to optimize for small files?

Page 44: [537] File-System Implementation

inode!

indirect!data! data! data!

what if we want to optimize for small files?

Page 45: [537] File-System Implementation

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Assume 256 byte sectors. What is offset for inode with number 0?

Page 46: [537] File-System Implementation

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Assume 256 byte sectors. What is offset for inode with number 4?

Page 47: [537] File-System Implementation

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Assume 256 byte sectors. What is offset for inode with number 40?

Page 48: [537] File-System Implementation

Various Link StructuresTree (usually unbalanced)! - with indirect blocks - e.g., ext3 !Extents! - store offset+size pairs - e.g., ext4 !Linked list! - each data block points to the next - e.g., FAT

Page 49: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 50: [537] File-System Implementation

DirectoriesFile systems vary. !

Common design: just store directory entries in files. !

Various formats could be used - lists - b-trees

Page 51: [537] File-System Implementation

Simple List Example

valid name inode111

...

foo

1343580

1 bar 23

Page 52: [537] File-System Implementation

Simple List Example

valid name inode110

...

foo

1343580

1 bar 23

unlink(“foo”)

Page 53: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 54: [537] File-System Implementation

AllocationHow do we find free data blocks or free inodes?

Page 55: [537] File-System Implementation

AllocationHow do we find free data blocks or free inodes? !

Free list. !

Bitmaps. !

Tradeoffs?

Page 56: [537] File-System Implementation

Bitmaps

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D D I I I I I

Page 57: [537] File-System Implementation

Data Bitmap

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D D d I I I I I

Page 58: [537] File-System Implementation

Inode Bitmap

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D i d I I I I I

Page 59: [537] File-System Implementation

Opportunity for Inconsistency (fsck)

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D i d I I I I I

Page 60: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - data block - inode table - indirect block - directories - data bitmap - inode bitmap - superblock

Page 61: [537] File-System Implementation

SuperblockNeed to know basic FS metadata, like: - block size - how many inodes are there - how much free data !

Store this in a superblock

Page 62: [537] File-System Implementation

Super Block

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

D i d I I I I I

Page 63: [537] File-System Implementation

Super Block

0 7D D D D D D D D8 15

D D D D D D D D16 23

D D D D D D D D24 31

D D D D D D D D32 39

D D D D D D D D40 47

D D D D D D D D48 55

D D D D D D D D56 63

S i d I I I I I

Page 64: [537] File-System Implementation

Structure OverviewStructures: - superblock - data block - data bitmap - inode table - inode bitmap - indirect block - directories

Page 65: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Page 66: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block!

Page 67: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block! Data Bitmap!

Page 68: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block! Data Bitmap!

Inode Table!

Page 69: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block! Data Bitmap!

Inode Table! Inode Bitmap!

Page 70: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block!

Inode Table!

Data Bitmap!

Inode Bitmap!

directories

Page 71: [537] File-System Implementation

Structure OverviewCore Performance

Super Block!

Data Block!

Inode Table!

Data Bitmap!

Inode Bitmap!

directories indirects

Page 72: [537] File-System Implementation

Operations

Page 73: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 74: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 75: [537] File-System Implementation

mkfsDifferent version for each file system (e.g., mkfs.ext4, mkfs.xfs, mkfs.btrfs, etc) !

Initialize metadata (bitmaps, inode table). !

Create empty root directory. !

Demo…

Page 76: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 77: [537] File-System Implementation

/dev/sda1 on / /dev/sdb1 on /backups AFS on /home/tyler

/

backups home

bak1 bak2 bak3

etc bin

tyler

.bashrc

Page 78: [537] File-System Implementation

/dev/sda1 on / /dev/sdb1 on /backups AFS on /home/tyler

/

backups home

bak1 bak2 bak3

etc bin

tyler

.bashrc

mount harter@galap-1:… /home/tyler/537

Page 79: [537] File-System Implementation

/dev/sda1 on / /dev/sdb1 on /backups AFS on /home/tyler harter@galap-1:… on /home/tyler/537

/

backups home

bak1 bak2 bak3

etc bin

tyler

537

p1 p2

.bashrc

Page 80: [537] File-System Implementation

mountAdd the file system to the FS tree. !

Minimally requires reading superblock. !

Demo…

Page 81: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 82: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

Page 83: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

Page 84: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

readread

Page 85: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

readread

readwrite

Page 86: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

readread

readwrite

write

Page 87: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

readread

readwrite

readwrite

write

Page 88: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

create /foo/bar

readread

readread

readwrite

readwrite

write

write

Page 89: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 90: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

databar

Page 91: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

bardata

read

Page 92: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

bardata

readread

Page 93: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

bardata

readreadwrite

Page 94: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

bardata

read

write

readwrite

Page 95: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

write to /foo/bar

bardata

readreadwrite

writewrite

Page 96: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 97: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

Page 98: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

read

Page 99: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

readread

Page 100: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

readread

read

Page 101: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

readread

readread

Page 102: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

open /foo/bar

databar

readread

readread

read

Page 103: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 104: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

read /foo/bar

databar

Page 105: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

read /foo/bar

databar

read

Page 106: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

read /foo/bar

databar

readread

Page 107: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

read /foo/bar

databar

readread

write

Page 108: [537] File-System Implementation

OperationsFS - mkfs - mount !File - create - write - open - read - close

Page 109: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

close /foo/bar

databar

Page 110: [537] File-System Implementation

data inode root foo bar root foobitmap bitmap inode inode inode data data

close /foo/bar

databar

nothing to do on disk!

Page 111: [537] File-System Implementation

Efficiency

Page 112: [537] File-System Implementation

Efficiency

How can we avoid this excessive I/O for basic ops?

Page 113: [537] File-System Implementation

Efficiency

How can we avoid this excessive I/O for basic ops? !Cache for: - reads - write buffering

Page 114: [537] File-System Implementation

StructuresWhat data is likely to be read frequently? - superblock - data block - data bitmap - inode table - inode bitmap - indirect block - directories

Page 115: [537] File-System Implementation

Unified Page CacheInstead of a dedicated file-system cache, draw pages from a common pool for FS and processes. !

API change: - read - shrink_cache (Linux)

Page 116: [537] File-System Implementation

LRU ExampleOps

read 1 read 2 read 3 read 4 shrink shrink read 1 read 2 read 3 read 4

Hits miss miss miss miss

- -

miss miss hit hit

State 1 1,2 1,2,3 1,2,3,4 2,3,4 3,4 1,3,4 1,2,3,4 1,2,3,4 1,2,3,4

Page 117: [537] File-System Implementation

Write BufferingWhy does procrastination help?

Page 118: [537] File-System Implementation

Write BufferingWhy does procrastination help? !Overwrites, deletes, scheduling. !Shared structs (e.g., bitmaps+dirs) often overwritten.

Page 119: [537] File-System Implementation

Write BufferingWhy does procrastination help? !Overwrites, deletes, scheduling. !Shared structs (e.g., bitmaps+dirs) often overwritten. !We decide: how much to buffer, how long to buffer… - tradeoffs?

Page 120: [537] File-System Implementation

Structure Review

Page 121: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Page 122: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block!

Page 123: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block! Data Bitmap!

Page 124: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block! Data Bitmap!

Inode Table!

Page 125: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block! Data Bitmap!

Inode Table! Inode Bitmap!

Page 126: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block!

Inode Table!

Data Bitmap!

Inode Bitmap!

directories

Page 127: [537] File-System Implementation

Structure ReviewCore Performance

Super Block!

Data Block!

Inode Table!

Data Bitmap!

Inode Bitmap!

directories indirects

Page 128: [537] File-System Implementation

Summary/FutureWe’ve described a very simple FS. - basic on-disk structures - the basic ops !

Future questions: - how to allocate efficiently? - how to handle crashes?

Page 129: [537] File-System Implementation

Announcement

Office hours today at 1pm, in office. !

Discussion tomorrow. p4b.