Advanced file systems: LFS and Soft Updates
Ken Birman (based on slides by Ben Atkin)
Overview of talk
- Unix Fast File System
- Log-Structured File System
- Soft Updates
- Conclusions
The Unix Fast File System
- Berkeley Unix (4.2BSD)
- Low-level index nodes (inodes) correspond to files
- Reduces seek times by better placement of file blocks
  - Tracks grouped into cylinders
  - Inodes and data blocks grouped together
  - Fragmentation can still affect performance
File system on disk
On-disk regions, in order:
- super block: disk layout
- freespace map: inodes and blocks in use
- inodes: inode size < block size
- data blocks
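File representation
Inode fields:
- file size
- link count
- access times
- ...
- data blocks (direct pointers)
- indirect block
- double indirect
- triple indirect
[Figure: the direct pointers lead straight to data blocks; the indirect, double indirect and triple indirect pointers reach data blocks through one, two and three levels of pointer blocks.]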
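The direct, indirect, double indirect and triple indirect pointers determine how many pointer blocks must be read to reach a given file block. A minimal sketch of that lookup follows. The helper names `locate_block` and `max_blocks`, the 12 direct pointers and the 1024 pointers per block are illustrative assumptions; the slides give no concrete numbers.

```python
def locate_block(n, direct=12, ptrs=1024):
    """Map logical block number n to (level, path of indices).

    direct and ptrs are assumed values for illustration only.
    The path lists the index used at each pointer level.
    """
    if n < 0:
        raise ValueError("block number must be non-negative")
    if n < direct:
        return ("direct", [n])
    n -= direct
    if n < ptrs:
        return ("indirect", [n])
    n -= ptrs
    if n < ptrs ** 2:
        return ("double indirect", [n // ptrs, n % ptrs])
    n -= ptrs ** 2
    if n < ptrs ** 3:
        return ("triple indirect",
                [n // ptrs ** 2, (n // ptrs) % ptrs, n % ptrs])
    raise ValueError("block number exceeds what the inode can address")


def max_blocks(direct=12, ptrs=1024):
    """Largest number of data blocks one inode can address."""
    return direct + ptrs + ptrs ** 2 + ptrs ** 3
```

Small files stay within the direct pointers and need no extra reads; only large files pay for the deeper levels.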
Inodes and directories
- Inode doesn't contain a file name
  - Directories map file names to inodes
  - Inode can be in multiple directories
- Low-level file system doesn't distinguish files and directories
  - Separate system calls for directory operations
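The point that one inode can appear in several directories can be observed from user space with hard links. The sketch below assumes a POSIX-style file system that supports hard links; the function name `link_demo` and the file names are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Give one file two directory entries; return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as root:
        dir_a = os.path.join(root, "a")
        dir_b = os.path.join(root, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        first = os.path.join(dir_a, "report.txt")
        second = os.path.join(dir_b, "alias.txt")
        with open(first, "w") as f:
            f.write("hello")
        # Add a second directory entry that names the same inode.
        os.link(first, second)
        st1, st2 = os.stat(first), os.stat(second)
        return st1.st_ino == st2.st_ino, st1.st_nlink
```

Both names report the same inode number, and the inode's link count reflects the two directory entries.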
FFS implementation
- Most operations do multiple disk writes
  - File write: update block, inode modify time
  - Create: write freespace map, write inode, write directory entry
- Write-back cache improves performance
  - Benefits due to high write locality
  - Disk writes must be a whole block
  - Syncer process flushes writes every 30s
FFS crash recovery
- Asynchronous writes are lost in a crash
  - Fsync system call flushes dirty data
- Incomplete metadata operations can cause disk corruption (order is important)
- FFS metadata writes are synchronous
  - Large potential decrease in performance
  - Some OSes cut corners
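An application that cannot afford to lose an asynchronous write has to ask for the flush itself. A minimal sketch using Python's `os.fsync` follows; the helper names `durable_write` and `demo` and the file name are illustrative.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes and request a flush to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move data from Python's buffer into the kernel cache
        os.fsync(f.fileno())  # ask the kernel to write the file's dirty data to disk


def demo():
    """Write a small file durably and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "journal.bin")
        durable_write(path, b"committed")
        with open(path, "rb") as f:
            return f.read()
```

The call blocks until the flush completes, which is the performance cost the later slides try to avoid for metadata.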
After the crash
- Fsck file system consistency check
  - Reconstructs freespace maps
  - Checks inode link counts, file sizes
- Very time consuming
  - Has to scan all directories and inodes
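The two fsck tasks named above (recounting links and rebuilding the freespace map) can be sketched on an in-memory model. This is a toy illustration, not the real fsck: the function `toy_fsck` and its dictionary layout are assumptions made for the example, and directory entries that name a missing inode are ignored.

```python
def toy_fsck(inodes, directories, total_blocks):
    """Recount links and rebuild the free block list by a full scan.

    inodes:      {inode number: {"links": int, "blocks": [block numbers]}}
    directories: {directory name: {file name: inode number}}
    Returns (wrong_links, free_blocks), where wrong_links maps an inode
    number to (recorded link count, link count found by scanning).
    """
    # Pass 1: scan every directory and count references to each inode.
    seen = {}
    for entries in directories.values():
        for ino in entries.values():
            seen[ino] = seen.get(ino, 0) + 1

    # Pass 2: compare recorded link counts with what the scan found.
    wrong_links = {
        ino: (meta["links"], seen.get(ino, 0))
        for ino, meta in inodes.items()
        if meta["links"] != seen.get(ino, 0)
    }

    # Pass 3: blocks owned by referenced inodes are in use; the rest are free.
    used = set()
    for ino, meta in inodes.items():
        if seen.get(ino, 0) > 0:
            used.update(meta["blocks"])
    free_blocks = sorted(set(range(total_blocks)) - used)
    return wrong_links, free_blocks
```

Every pass touches all directories or all inodes, which is why the check scales with the size of the file system rather than with the amount of work lost in the crash.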
Overview of talk
- Unix Fast File System
- Log-Structured File System
- Soft Updates
- Comparison and conclusions
The Log-Structured File System
- CPU speed increases faster than disk speed
- Caching improves read performance
- Little improvement in write performance
  - Synchronous writes to metadata
  - Metadata access dominates for small files
  - e.g. five seeks and I/Os to create a file
LFS design
- Increases write throughput from 5-10% of disk bandwidth to 70%
  - Removes synchronous writes
  - Reduces long seeks
- Improves over FFS
  - "Not more complicated"
  - Outperforms FFS except for one case
LFS in a nutshell
- Boost write throughput by writing all changes to disk contiguously
  - Disk as an array of blocks, append at end
  - Write data, indirect blocks, inodes together
  - No need for a free block map
- Writes are written in segments
  - ~1MB of contiguous disk blocks
  - Accumulated in cache and flushed at once
Log operation
[Figure: the kernel buffer cache holds the active segment (inode blocks and data blocks); the disk holds the log, marked with its log head and log tail.]
Locating inodes
- Positions of data blocks and inodes change on each write
  - Write out inode, indirect blocks too!
- Maintain an inode map
  - Compact enough to fit in main memory
  - Written to disk periodically at checkpoints
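The interplay between the append-only log and the inode map can be sketched with a toy in-memory model. The class `ToyLFS` and its methods are hypothetical names for illustration, and the sketch simplifies heavily: every write rewrites the whole file, and there are no segments, indirect blocks or checkpoints.

```python
class ToyLFS:
    """Toy log-structured store: the log only grows, the inode map moves."""

    def __init__(self):
        self.log = []        # the "disk": an append-only list of blocks
        self.inode_map = {}  # inode number -> log address of the newest inode

    def _append(self, block):
        self.log.append(block)
        return len(self.log) - 1

    def write(self, ino, data_blocks):
        """Append the data blocks, then a fresh inode that points at them."""
        addrs = [self._append(("data", d)) for d in data_blocks]
        self.inode_map[ino] = self._append(("inode", ino, addrs))

    def read(self, ino):
        """Find the newest inode through the map, then follow its pointers."""
        _, _, addrs = self.log[self.inode_map[ino]]
        return [self.log[a][1] for a in addrs]

    def live_addresses(self):
        """Log addresses still reachable from the inode map."""
        live = set()
        for addr in self.inode_map.values():
            live.add(addr)
            live.update(self.log[addr][2])
        return live
```

After a file is overwritten, its old data blocks and old inode stay in the log but are no longer reachable from the inode map; those unreachable blocks are the space a cleaner would later reclaim.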
Cleaning the log
- Log is infinite, but disk is finite
  - Reuse the old parts of the log
- Clean old segments to recover space
  - Writes to disk create holes
  - Segments ranked by "liveness", age
  - Segment cleaner "runs in background"
- Group slowly-changing blocks together
  - Copy to new segment or "thread" into old
Cleaning policies
- Simulations to determine best policy
  - Greedy: clean based on low utilisation
  - Cost-benefit: use age (time of last write)
- Measure write cost
  - Time disk is busy for each byte written
  - Write cost 1.0 = no cleaning
$$\frac{\text{benefit}}{\text{cost}} = \frac{(\text{free space generated}) \times (\text{age of segment})}{\text{cost}}$$
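The two policies can pick different segments, which a short sketch makes concrete. The slide leaves the terms abstract; the code below fills them in with the expressions from the LFS paper by Rosenblum and Ousterhout, which are not on the slide: for a segment with utilisation u, free space generated is 1 - u, cost is 1 + u (read the segment, write back the live fraction), and write cost is 2 / (1 - u). The function names and the sample segments are made up for illustration.

```python
def write_cost(u):
    """Write cost for cleaning segments of utilisation u (0 <= u < 1).

    Formula assumed from the LFS paper; an empty segment needs no
    cleaning, so its write cost is 1.0.
    """
    if not 0 <= u < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return 1.0 if u == 0 else 2.0 / (1.0 - u)


def cost_benefit(u, age):
    """Benefit-to-cost ratio: (free space generated * age) / cost."""
    return (1.0 - u) * age / (1.0 + u)


def pick_greedy(segments):
    """Greedy policy: clean the segment with the lowest utilisation."""
    return min(segments, key=lambda s: s["u"])["name"]


def pick_cost_benefit(segments):
    """Cost-benefit policy: clean the segment with the highest ratio."""
    return max(segments, key=lambda s: cost_benefit(s["u"], s["age"]))["name"]


# Hypothetical segments: a recently written one and an old, fuller one.
SEGMENTS = [
    {"name": "hot", "u": 0.5, "age": 1.0},
    {"name": "cold", "u": 0.75, "age": 100.0},
]
```

With these sample values, `pick_greedy(SEGMENTS)` chooses the emptier hot segment, while `pick_cost_benefit(SEGMENTS)` chooses the cold one, because space reclaimed from cold data is likely to stay free for longer.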
Greedy versus Cost-benefit
Cost-benefit segment utilisation
LFS crash recovery
- Log and checkpointing
  - Limited crash vulnerability
  - At checkpoint flush active segment, inode map
- No fsck required
LFS performance
- Cleaning behaviour better than simulated predictions
- Performance compared to SunOS FFS
  - Create-read-delete 10000 1 KB files
  - Write 100 MB file sequentially, read back sequentially and randomly
Small-file performance
Large-file performance
Overview of talk
- Unix Fast File System
- Log-Structured File System
- Soft Updates
- Conclusions
Soft updates
- Alternative mechanism for improving performance of writes
  - All metadata updates can be asynchronous
  - Improved crash recovery
  - Same on-disk structure as FFS
The metadata update problem
- Disk state must be consistent enough to permit recovery after a crash
  - No dangling pointers
  - No object pointed to by multiple pointers
  - No live object with no pointers to it
- FFS achieves this by synchronous writes
- Relaxing sync. writes requires update sequencing or atomic writes
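Why update sequencing is enough to rule out dangling pointers can be checked by simulating a crash after every prefix of a write sequence. The sketch below models a file create as two writes labelled "inode" and "dirent"; the function name and the labels are illustrative assumptions, not taken from any real implementation.

```python
def dangling_after_crash(write_order):
    """Return the crash points that leave a dangling directory entry.

    write_order lists the writes of one file create in the order they
    reach disk. A crash after k writes leaves the first k on disk. The
    state is bad when the directory entry is on disk but the inode it
    points to has not been initialised there.
    """
    bad = []
    for k in range(len(write_order) + 1):
        on_disk = set(write_order[:k])
        if "dirent" in on_disk and "inode" not in on_disk:
            bad.append(k)
    return bad
```

`dangling_after_crash(["inode", "dirent"])` finds no bad crash point, while `dangling_after_crash(["dirent", "inode"])` reports a dangling pointer if the crash falls between the two writes. Synchronous writes enforce the safe order by waiting; soft updates aim to enforce it without waiting.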
Design constraints
- Do not block applications unless fsync
- Minimise writes and memory usage
- Retain 30-second flush delay
- Do not over-constrain disk scheduler
  - It is already capable of some reordering
Dependency tracking
- Asynchronous metadata updates need ordering information
  - For each write, pending writes which precede it
- Block-based ordering is insufficient
  - Cycles must be broken with sync. writes
  - Some blocks stay dirty for a long time
  - False sharing due to coarse (block-level) granularity
Circular dependency example
Inode block: inode #32, inode #33, inode #34, inode #35
Directory block: a.txt 89, b.pdf 32, c.doc 366, ...
Circular dependency example
Inode block: inode #32, inode #33, inode #34, inode #35
Directory block: a.txt 89, b.pdf 32, c.doc 366, d.txt 34, ...
Create file d.txt: the inode must be initialised before the directory entry is added.
Circular dependency example
Inode block: inode #32, inode #33, inode #34, inode #35
Directory block: a.txt 89, c.doc 366, d.txt 34, ...
Remove file b.pdf: the directory entry must be removed before the inode is deallocated.
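The create and the remove together form a cycle only when dependencies are tracked per block. The sketch below checks this with a depth-first search. The node labels and the `BLOCK_OF` mapping are a hypothetical encoding of the example, in which inodes #32 and #34 share one inode block and both directory entries share one directory block.

```python
def has_cycle(edges):
    """Detect a cycle in a dependency graph.

    edges: iterable of (before, after) pairs, meaning 'before' must
    reach disk ahead of 'after'.
    """
    graph = {}
    for before, after in edges:
        graph.setdefault(before, set()).add(after)
        graph.setdefault(after, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: the path loops onto itself
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


# Dependencies between individual structures (hypothetical labels).
POINTER_LEVEL = [
    ("inode #34", "dirent d.txt"),  # create: initialise inode first
    ("dirent b.pdf", "inode #32"),  # remove: clear entry first
]

# The same dependencies after collapsing each structure to its block.
BLOCK_OF = {
    "inode #32": "inode block",
    "inode #34": "inode block",
    "dirent d.txt": "directory block",
    "dirent b.pdf": "directory block",
}
BLOCK_LEVEL = [(BLOCK_OF[a], BLOCK_OF[b]) for a, b in POINTER_LEVEL]
```

`has_cycle(POINTER_LEVEL)` is false while `has_cycle(BLOCK_LEVEL)` is true: neither block can be written first without violating one of the two orderings, even though the underlying updates are unrelated.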
Update implementation
- Update list for each pointer in cache
  - FS operation adds update to each affected pointer
  - Update incorporates dependencies
- Updates have "before", "after" values for pointers
  - Roll-back, roll-forward to break cycles
Circular dependency example
Inode block: inode #32, inode #33, inode #34, inode #35
Directory block: a.txt 89, b.pdf 32, c.doc 366, d.txt 34, ...
Figure annotations: "roll back", "remove".
Rollback allows dependency to be suppressed.
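Roll-back can be sketched as building a temporary image of the directory block in which every update with an unsatisfied dependency is reverted to its "before" value. This is a simplified illustration under assumed data structures, not the actual soft updates code: the function `block_image_for_write`, the update dictionaries and the `satisfied` flag are all hypothetical.

```python
def block_image_for_write(entries, pending):
    """Return directory block contents that are safe to write right now.

    entries: in-cache directory block, {file name: inode number}
    pending: updates to this block, each a dict with
             "name", "before", "after" (None means no entry) and
             "satisfied" (True once its prerequisites are on disk).
    The cached entries are left untouched, so roll-forward is implicit.
    """
    image = dict(entries)
    for upd in pending:
        if upd["satisfied"]:
            continue
        if upd["before"] is None:
            image.pop(upd["name"], None)         # undo an added entry
        else:
            image[upd["name"]] = upd["before"]   # restore the old pointer
    return image


# Cached block after "create d.txt" and "remove b.pdf".
CACHED = {"a.txt": 89, "c.doc": 366, "d.txt": 34}
PENDING = [
    # d.txt may not appear on disk until inode #34 has been written.
    {"name": "d.txt", "before": None, "after": 34, "satisfied": False},
    # Removing the b.pdf entry has no prerequisite.
    {"name": "b.pdf", "before": 32, "after": None, "satisfied": True},
]
```

With these values the image written to disk contains a.txt and c.doc only: the removal of b.pdf goes out, the addition of d.txt is held back. Once the inode block is on disk, the d.txt update becomes satisfied and the next write of the directory block includes it.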
Soft updates details
- Blocks are locked during roll-back
  - Prevents processes from seeing stale cache
- Existing updates never get new dependencies
  - No indefinite aging
- Memory usage is acceptable
  - Updates block if usage becomes too high
Recovery with soft updates
- "Benign" inconsistencies after crashes
  - Freespace maps may miss free entries
  - Link counts may be too high
- Fsck is still required
  - Need not run immediately
  - Only has to check in-use inodes
  - Can run in the background
Soft updates performance
- Recovery time on 76% full 4.5GB disk
  - 150s for FFS fsck versus 0.35s ...
- Microbenchmarks
  - Compared soft updates, async writes, FFS
  - Create, delete, read for 32MB of files
- Soft updates versus update logging
  - Sdet benchmark of "user scripts"
  - Various degrees of concurrency
Create and delete performance
[Figure panels: Create files; Delete files]
Read performance
Overall create traffic
Soft updates versus logging
Conclusions
- Papers were separated by 8 years
  - Much controversy regarding LFS-FFS comparison
- Both systems have been influential
  - IBM Journalling file system
  - Ext3 filesystem in Linux
  - Soft updates come enabled in FreeBSD