Why panic()? Improving Reliability through Restartable File Systems
![Page 1: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/1.jpg)
Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift
![Page 2: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/2.jpg)
Data Availability
Applications require data
- Use FS to reliably store data
Both hardware and software can fail
Typical solution
- Large clusters for availability
- Reliability through replication
[Figure: GFS masters and slave nodes]
![Page 3: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/3.jpg)
Replication is infeasible for desktop environments
Wouldn't RAID work?
- Can only tolerate hardware failures
FS crashes are more severe
- Services/applications are killed
- Requires OS reboot and recovery
Need: better reliability in the event of file system failures
[Figure: applications on top of the OS and FS, with a RAID controller and disks below]
![Page 4: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/4.jpg)
Motivation
Background
Restartable file systems
Advantages and limitations
Conclusions
![Page 5: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/5.jpg)
![Page 6: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/6.jpg)
File systems already detect failures (example from ReiserFS):

```c
int journal_mark_dirty(….)
{
    struct reiserfs_journal_cnode *cn = NULL;
    if (!cn) {
        cn = get_cnode(p_s_sb);
        if (!cn) {
            reiserfs_panic(p_s_sb, "get_cnode failed!\n");
        }
    }
}

void reiserfs_panic(struct super_block *sb, ...)
{
    BUG();
    /* this is not actually called, but makes reiserfs_panic() "noreturn" */
    panic("REISERFS: panic %s\n", error_buf);
}
```

Recovery: simplified by a generic recovery mechanism
![Page 7: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/7.jpg)
1. Code to recover from all failures
   - Not feasible in reality
2. Restart on failure
   - Previous work has taken this approach

FS need: stateful and lightweight recovery

| | Heavyweight | Lightweight |
|---|---|---|
| Stateless | Nooks/Shadow, Xen, Minix, L4, Nexus | SafeDrive, Singularity |
| Stateful | CuriOS, EROS | |
![Page 8: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/8.jpg)
Goal: build a lightweight and stateful solution to tolerate file-system failures
Solution: a single generic recovery mechanism for any file system failure
1. Detect failures through assertions
2. Clean up resources used by the file system
3. Restore file-system state from before the crash
4. Continue to service new file system requests

FS failures: completely transparent to applications
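
The four recovery steps can be pictured with a small user-space analogy. This is only a sketch under stated assumptions: every name in it (`FS_ASSERT`, `fs_update`, `fs_inject_fault`, and so on) is hypothetical, `setjmp`/`longjmp` stands in for whatever unwinding the kernel implementation uses, and a single integer stands in for file-system state.

```c
#include <setjmp.h>

/* All names are hypothetical; this is a user-space analogy, not kernel code. */

static jmp_buf recovery_point;   /* where control resumes when a check fails */
static int restarts;             /* number of simulated file-system restarts */
static int fs_state;             /* stand-in for in-memory file-system state */
static int checkpointed_state;   /* last known-good copy of fs_state */
static int inject_fault;         /* test hook: make the next check fail once */

/* Step 1: a failed check triggers a restart instead of a panic. */
#define FS_ASSERT(cond) do { if (!(cond)) longjmp(recovery_point, 1); } while (0)

static void fs_do_update(int value)
{
    fs_state = -1;               /* state is half-modified when the fault hits */
    FS_ASSERT(!inject_fault);
    fs_state = value;
}

/* Entry point used by callers; a restart is invisible apart from the counter. */
int fs_update(int value)
{
    if (setjmp(recovery_point) != 0) {
        fs_state = checkpointed_state;   /* steps 2 and 3: discard and restore */
        inject_fault = 0;                /* assume the fault was transient */
        restarts++;
    }
    fs_do_update(value);                 /* step 4: serve (or re-run) the request */
    checkpointed_state = fs_state;
    return fs_state;
}

void fs_inject_fault(void)  { inject_fault = 1; }
int  fs_restart_count(void) { return restarts; }
```

A caller that invokes `fs_update(7)` after `fs_inject_fault()` still gets 7 back; only `fs_restart_count()` reveals that a restart happened, which is the sense in which the failure is transparent.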
![Page 9: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/9.jpg)
Transparency
- Multiple applications using the FS upon crash
- Intertwined execution
Fault tolerance
- Handle a gamut of failures
- Transform to fail-stop failures
Consistency
- OS and FS could be left in an inconsistent state
![Page 10: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/10.jpg)
FS consistency is required to prevent data loss
- Not all FSes support crash consistency
- FS state is constantly modified by applications
Periodically checkpoint FS state
- Mark dirty blocks as copy-on-write
- Ensure each checkpoint is atomically written
On crash: revert to the last checkpoint
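
A minimal sketch of checkpoint-and-revert with copy-on-write, assuming a tiny in-memory block store. The names (`block_write`, `checkpoint`, `revert_to_checkpoint`) and sizes are illustrative only, and the atomic on-disk write of a checkpoint is not modeled here.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS 8
#define BLOCK_SIZE 16

/* Hypothetical in-memory "disk": live blocks plus copies kept for rollback. */
static char live[NBLOCKS][BLOCK_SIZE];
static char saved[NBLOCKS][BLOCK_SIZE];
static int  has_saved[NBLOCKS];   /* 1 if the block changed since the checkpoint */

/* Copy-on-write: preserve the checkpointed contents before the first change. */
void block_write(int blk, const char *data)
{
    assert(blk >= 0 && blk < NBLOCKS);
    if (!has_saved[blk]) {
        memcpy(saved[blk], live[blk], BLOCK_SIZE);
        has_saved[blk] = 1;
    }
    strncpy(live[blk], data, BLOCK_SIZE - 1);
    live[blk][BLOCK_SIZE - 1] = '\0';
}

const char *block_read(int blk)
{
    assert(blk >= 0 && blk < NBLOCKS);
    return live[blk];
}

/* Take a checkpoint: the current contents become the new rollback target. */
void checkpoint(void)
{
    memset(has_saved, 0, sizeof has_saved);
}

/* On crash: restore every block modified since the last checkpoint. */
void revert_to_checkpoint(void)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (has_saved[i]) {
            memcpy(live[i], saved[i], BLOCK_SIZE);
            has_saved[i] = 0;
        }
    }
}
```

Only blocks touched since the last checkpoint are copied, so the cost of a checkpoint interval scales with the amount of modified data rather than with the size of the store.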
![Page 11: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/11.jpg)
[Figure: timeline across the Application, VFS, and File System layers spanning Epoch 0 and Epoch 1, with open("file"), write(), read(), write(), write(), and close() operations. Legend: Completed, In-progress, Crash.]
1. Periodically create checkpoints
2. File system crash
3. Unwind in-flight processes
4. Move to the most recent checkpoint
5. Replay completed operations
6. Re-execute unwound processes
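
The last three steps of the sequence (move to the checkpoint, replay completed operations, re-execute unwound ones) can be sketched as a loop over an operation log. The types and names below (`toy_fs`, `logged_op`, `recover`) are assumptions made for illustration, and the file system is reduced to a single file size.

```c
#include <assert.h>
#include <stddef.h>

enum op_status { OP_COMPLETED, OP_IN_PROGRESS };

struct logged_op {
    int bytes;               /* toy write: appends this many bytes to the file */
    enum op_status status;   /* state of the operation when the crash hit */
};

/* Hypothetical file-system state: the file size and its checkpointed value. */
struct toy_fs {
    long size;
    long checkpoint_size;
};

static void apply_write(struct toy_fs *fs, int bytes)
{
    fs->size += bytes;
}

/* Returns the number of in-flight operations that were re-executed. */
int recover(struct toy_fs *fs, const struct logged_op *log, size_t n)
{
    int reexecuted = 0;

    assert(fs != NULL);
    fs->size = fs->checkpoint_size;          /* move to the most recent checkpoint */

    for (size_t i = 0; i < n; i++)           /* replay completed operations */
        if (log[i].status == OP_COMPLETED)
            apply_write(fs, log[i].bytes);

    for (size_t i = 0; i < n; i++)           /* re-execute unwound operations */
        if (log[i].status == OP_IN_PROGRESS) {
            apply_write(fs, log[i].bytes);
            reexecuted++;
        }

    return reexecuted;
}
```

Completed operations are replayed first because applications have already observed their results; in-flight operations have returned nothing yet, so they can simply run again.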
![Page 12: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/12.jpg)
File systems are constantly modified
- Hard to identify a consistent recovery point
Naïve solution: prevent any new FS operation and call sync
- Inefficient and unacceptable overhead
![Page 13: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/13.jpg)
All requests go through the VFS layer
File systems write to disk through the page cache
Control requests to the FS and dirty pages to disk
[Figure: applications issue requests through the VFS to the file systems (ext3, VFAT), which reach the disk through the page cache]
![Page 14: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/14.jpg)
[Figure: the App / VFS / File System / Page Cache / Disk stack, shown for a regular file system and for Membrane; the Membrane panel carries two STOP markers]
![Page 15: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/15.jpg)
Some FSes have a built-in crash-consistency mechanism
- Journaling or snapshotting
Seamlessly integrate with these mechanisms
- Need FSes to indicate the beginning and end of a transaction
- Works for data and ordered journaling modes
- Need to combine writeback mode with COW
![Page 16: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/16.jpg)
Log operations at the VFS level
- Need not modify existing file systems
Operations: open, close, read, write, symlink, unlink, seek, etc.
- Read:
Logs are thrown away after each checkpoint
What about logging writes?
![Page 17: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/17.jpg)
Mainly used for replaying writes
Goal: reduce the overhead of logging writes
Solution: grab data from the page cache during recovery
[Figure: three panels of the VFS / File System / Page Cache stack handling Write(fd, buf, offset, count)]
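
The idea of taking write data from the page cache at recovery time, rather than copying it into the log, can be sketched as follows. This is an illustration under assumptions, not the actual implementation: `page_cache`, `write_record`, `logged_write`, and `replay_writes` are hypothetical names, and the page cache is modeled as a plain array that outlives the restarted file system.

```c
#include <string.h>

#define NPAGES 4
#define PAGE_SIZE 32
#define LOG_CAPACITY 16

/* Hypothetical page cache: it sits outside the file system, so its contents
 * survive a file-system restart. */
static char page_cache[NPAGES][PAGE_SIZE];

/* A logged write records where the data went, not the data itself. */
struct write_record {
    int page;      /* page-cache slot holding the written data */
    int length;    /* number of valid bytes in that page */
};

static struct write_record write_log[LOG_CAPACITY];
static int log_len;

/* Normal path: data lands in the page cache; only metadata is logged. */
int logged_write(int page, const char *buf, int count)
{
    if (page < 0 || page >= NPAGES || count < 0 || count > PAGE_SIZE ||
        log_len >= LOG_CAPACITY)
        return -1;
    memcpy(page_cache[page], buf, (size_t)count);
    write_log[log_len].page = page;
    write_log[log_len].length = count;
    log_len++;
    return count;
}

/* Recovery path: rebuild the restarted file system's view ("disk" here) by
 * copying each logged write's data back out of the page cache. */
int replay_writes(char disk[NPAGES][PAGE_SIZE])
{
    for (int i = 0; i < log_len; i++) {
        const struct write_record *r = &write_log[i];
        memcpy(disk[r->page], page_cache[r->page], (size_t)r->length);
    }
    return log_len;
}
```

Each log entry stays a few bytes long regardless of how much data the write carried, which is where the logging overhead is saved.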
![Page 18: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/18.jpg)
![Page 19: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/19.jpg)
![Page 20: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/20.jpg)
Setup
![Page 21: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/21.jpg)
![Page 22: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/22.jpg)
![Page 23: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/23.jpg)
Restart ext2 during random-read micro benchmark
![Page 24: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/24.jpg)
| Data (MB) | Recovery time (ms) |
|---|---|
| 10 | 12.9 |
| 20 | 13.2 |
| 40 | 16.1 |
![Page 25: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/25.jpg)
Improves tolerance to file system failures
- Builds trust in new file systems (e.g., ext4, btrfs)
Quick-fix bug patching
- Developers transform corruptions into restarts
- Restart instead of extensive code restructuring
Encourages more integrity checks in FS code
- Assertions could be seamlessly transformed into restarts
- File systems become more robust to failures/crashes
![Page 26: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/26.jpg)
Only tolerates fail-stop failures
Not address-space based
- Faults could corrupt other kernel components
FS restart may be visible to applications
- e.g., inode numbers could change after restart
[Figure: the Application / VFS / File System stack in Epoch 0, before the crash and after crash recovery. The same operations appear in both panels: create("file1"), stat("file1"), write("file1", 4k). file1 is reported with inode# 15 in one panel and inode# 12 in the other, giving an inode# mismatch.]
![Page 27: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/27.jpg)
Failures are inevitable in file systems
- Learn to cope and not hope to avoid them
Generic recovery mechanism for FS failures
- Improves FS reliability and availability of data
Users: install new FSes with confidence
Developers: ship FSes faster, as not all exception cases are now show-stoppers
![Page 28: Why panic () ? Improving Reliability through Restartable File Systems](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a8b550346895dc8007c/html5/thumbnails/28.jpg)
Questions and Comments
Advanced Systems Lab (ADSL)
University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl