Page 1: 1 SQCK: A Declarative File System Checker Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau University of Wisconsin


SQCK: A Declarative File System Checker

Haryadi S. Gunawi, Abhishek Rajimwale,

Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

University of Wisconsin – Madison

OSDI ’08 – December 9th, 2008

Page 2:


Corrupt file systems

File systems
  Store massive amounts of data
  Must be reliable

Corrupted file system images
  Due to hardware errors, file system bugs, etc.
  Need to be repaired a.s.a.p.

Page 3:


Who should repair?

Does journaling (write-ahead log) help?
  No, only for crashes

Does the file system repair itself online?
  No, not enough machinery

Fsck: the last line of defense
  It’s a “must have” utility
    − XFS: “no need fsck ever”, but it ended up deploying an fsck
  Must be fully reliable

Page 4:


But … fsck is complex

Fsck has a big task
  Turn any corrupt image into a consistent image
  E.g. check if a data block is shared by two inodes

How are they implemented?
  Written in C → hard to reason about
  Large and complex
    − Ext2 fsck: 150 checks in 16 KLOC
    − XFS fsck: 340 checks in 22 KLOC
  Hundreds of cluttered if-check statements

Bottom line: fsck code is “untouchable”
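
As a rough illustration of the kind of cross-check named on this slide (a data block shared by two inodes), the sketch below works on an in-memory map from inode number to block list. The function name and the input format are assumptions made for this example; a real checker would first have to walk the on-disk inode and indirect-block structures to build such a map.

```python
from collections import defaultdict

def find_shared_blocks(inode_blocks):
    """Return {block: [inodes]} for every block claimed by more than one inode.

    inode_blocks is a hypothetical mapping {inode number: [block numbers]}.
    """
    owners = defaultdict(set)
    for ino, blocks in inode_blocks.items():
        for blk in blocks:
            owners[blk].add(ino)
    return {blk: sorted(inos) for blk, inos in owners.items() if len(inos) > 1}

# Toy input: inodes 12 and 13 both claim block 101.
example = {12: [100, 101], 13: [101, 102], 14: [103]}
shared = find_shared_blocks(example)
print(shared)
```

Even this single check needs information from every inode at once, which hints at why hundreds of such checks are hard to keep readable in hand-written C.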

Page 5:


Two Questions

Are current checkers really reliable?

If not, how should we build robust checkers?

Page 6:


e2fsck is unreliable

Analyze e2fsck (ext2 file system checker)

Findings:

Inconsistent repair
    − The file system becomes unreadable

Consistent but not “correct”
    − Fsck deletes valid directory entries
    − Fsck loses a huge number of files

Page 7:


SQCK

Lesson: Complexity is the enemy of reliability
  Big task + bad design → complexity → unreliability
  Need a higher-level approach for simplicity

SQCK (SQL-based Fsck)
  Use a declarative query language to write checks
  Put simply: write fewer lines of code

Evaluation
  Simple and reliable: e2fsck in 150 queries (vs. 16 KLOC of C)
  More: Great flexibility and reasonable performance

Page 8:


Outline

Introduction

Analysis of e2fsck

SQCK Design

SQCK Evaluation


Page 9:


Methodology

E2fsck task: cross-check all ext2 metadata
  An indirect pointer should not point to the superblock
  A subdir should only be accessible from one directory

Inject single corruption
  Observe how e2fsck repairs a single corruption
  Only corrupt on-disk pointers
    − Corrupt an indirect pointer to point to the superblock
    − Corrupt a directory entry to point to another directory
  Usually, a corrupt pointer is simply cleared to zero
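
The slides do not say how the corruptions were injected. As a minimal sketch of the idea, the snippet below overwrites a single 32-bit little-endian block pointer inside a byte buffer standing in for a file system image. The function name, the toy buffer, the offset, and the target block number are all assumptions for illustration; finding the right offset in a real ext2 image requires parsing its layout, which is not shown.

```python
import struct

def corrupt_pointer(image: bytearray, offset: int, new_target: int) -> int:
    """Overwrite one 32-bit little-endian block pointer in place.

    Returns the original pointer value so the change can be undone.
    """
    (old_value,) = struct.unpack_from("<I", image, offset)
    struct.pack_into("<I", image, offset, new_target)
    return old_value

# Toy "image": 16 zero bytes with a pointer to block 1234 stored at offset 4.
toy_image = bytearray(16)
struct.pack_into("<I", toy_image, 4, 1234)

SUPERBLOCK_BLOCK = 1  # assumed block number, for illustration only
old = corrupt_pointer(toy_image, 4, SUPERBLOCK_BLOCK)
(new,) = struct.unpack_from("<I", toy_image, 4)
print(old, new)
```

Changing exactly one pointer per run keeps each experiment easy to interpret: whatever the checker does afterwards is a reaction to that single corruption.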

Page 10:


Inconsistent (Out-of-order) Repair

[Figure: repair of a corrupt indirect block. Ideal fsck: 1. Check bad indirect pointer, 2. Check indirect content. The other panel shows the steps in the reverse order: 2. Check indirect content, 1. Check bad indirect pointer.]

Page 11:


Consistent but Incorrect Repair (1)

[Figure: directory trees with nodes a1, b1, a2, b2 shown in several panels; one panel is labeled “Ideal fsck”.]

Kidnapping problem!

E2fsck does not use all available information

Page 12:


Result Summary

Four problems
  Inconsistent
  Information-incomplete
  Policy-inconsistent
  Insecure

E2fsck does not handle all corruptions
  “Warning: Programming bug in e2fsck! Or some bonehead (you) is checking a mounted (live) filesystem.”

Not simple implementation bugs
  Difficult to combine available information
  Difficult to ensure correct ordering

Page 13:


Outline

Introduction

Analysis of e2fsck

SQCK Design

SQCK Evaluation


Page 14:


Fsck Properties

Hundreds of checks
Complex cross-checks
Must be ordered correctly

Taxonomy of checks in e2fsck:

                        Single instance   Multiple instances
  Same structure              63                 11
  Different structures        12                 35

[Figure: example structures A { x y } and B { m n } illustrating single and multiple instances of the same or different structures.]

Page 15:


A Declarative Approach

Lesson: Complexity is the enemy of reliability

SQCK
  Use a declarative query language (e.g. SQL), why?
    − It is declarative: high-level intent is clear
    − Fit for cross-checking massive information

Goals achieved
  Simple: e2fsck in 150 queries (vs. 16 KLOC of C)
  Reliable: Each check/query is easy to understand
  Flexible: Plug in/out different queries

Page 16:


Using SQCK

Take a fs image
Load metadata to db tables
  Temporary tables
  Ex: InodeTable, GroupDescTable, DirEntryTable
Run checks and repairs (in the form of queries)
Flush any modification, and delete tables

[Figure: file system image, database tables, checks + repairs]

Page 17:


Declarative check (example 1)

Cross-checking a single instance of a structure

“Find block bitmap that is not located within its block group”

    first_block = sb->s_first_data_block;
    last_block = first_block + blocks_per_group;
    for (i = 0, gd = fs->group_desc; i < fs->group_desc_count; i++, gd++) {
        /* the last group may be shorter than blocks_per_group */
        if (i == fs->group_desc_count - 1)
            last_block = sb->s_blocks_count;
        if ((gd->bg_block_bitmap < first_block) ||
            (gd->bg_block_bitmap >= last_block)) {
            px.blk = gd->bg_block_bitmap;
            if (fix_problem(BB_NOT_GROUP, ...))
                gd->bg_block_bitmap = 0;
        }
        ...
    }

    SELECT *
    FROM GroupDescTable G
    WHERE G.blockBitmap NOT BETWEEN G.start AND G.end
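
To make the query above concrete, here is a small sketch that loads two made-up group descriptors into a temporary table, runs the check, applies a repair that mirrors the C code (clear the bitmap pointer to zero), and collects the rows marked dirty. It uses Python's built-in sqlite3 module as a stand-in for the MySQL back end mentioned in the talk. The column names grp and dirty, the sample values, and the helper name check_and_repair are assumptions for this example; the end column is quoted because END is an SQL keyword.

```python
import sqlite3

# Hypothetical parsed group descriptors: (grp, start, end, blockBitmap).
GROUP_DESCS = [
    (0, 1, 8192, 3),        # bitmap lies inside its own group
    (1, 8193, 16384, 42),   # corrupt: bitmap points outside group 1
]

CHECK = (
    'SELECT G.grp FROM GroupDescTable G '
    'WHERE G.blockBitmap NOT BETWEEN G.start AND G."end"'
)
# Assumed repair, modeled on the C code: clear the pointer, mark the row dirty.
REPAIR = (
    'UPDATE GroupDescTable SET blockBitmap = 0, dirty = 1 '
    'WHERE blockBitmap NOT BETWEEN start AND "end"'
)

def check_and_repair(rows):
    db = sqlite3.connect(":memory:")
    # Load metadata into a temporary table.
    db.execute(
        'CREATE TEMP TABLE GroupDescTable ('
        'grp INTEGER PRIMARY KEY, start INTEGER, "end" INTEGER, '
        'blockBitmap INTEGER, dirty INTEGER DEFAULT 0)'
    )
    db.executemany(
        'INSERT INTO GroupDescTable (grp, start, "end", blockBitmap) '
        'VALUES (?, ?, ?, ?)', rows)
    # Run the check, then the repair.
    bad = [row[0] for row in db.execute(CHECK)]
    db.execute(REPAIR)
    # Rows that would be flushed back to the image.
    dirty = db.execute(
        'SELECT grp, blockBitmap FROM GroupDescTable WHERE dirty = 1'
    ).fetchall()
    # Delete the table once the modifications have been collected.
    db.execute('DROP TABLE GroupDescTable')
    db.close()
    return bad, dirty

bad_groups, to_flush = check_and_repair(GROUP_DESCS)
print(bad_groups, to_flush)
```

Writing the flushed rows back into the image is left out, since it depends on the on-disk layout.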

Page 18:


Declarative check (example 2)

Cross-checking multiple instances of the same structure

“Find false parents (i.e. directory entries that point to a subdirectory that already belongs to another directory)”
  Must read all directory entries in dir data blocks
  Wrong implementation in e2fsck (the kidnapping problem)


Page 19:


Declarative check (example 2)

    if ((dot_state > 1) &&
        (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode))) {
        // e2fsck_get_dir_info
        // is 20 lines long
        subdir = e2fsck_get_dir_info(dirent->inode);
        ...
        if (subdir->parent) {
            // subdir already has a parent: clear this entry
            if (fix_problem(LINK_DIR, ..)) {
                dirent->inode = 0;
                goto next;
            }
        } else {
            // first directory seen claiming subdir becomes its parent
            subdir->parent = ino;
        }
    }

Page 20:


Declarative check (example 2)

    SELECT F.*    -- returns the false parent(s)
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE
      -- P says C is its child
      P.entry_num >= 3 AND P.entry_ino = C.ino AND
      -- and C says P is its parent
      C.entry_num = 2 AND C.entry_ino = P.ino AND
      -- F also says C is its child
      F.entry_num >= 3 AND F.entry_ino = C.ino AND F.ino <> P.ino AND
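
The query on this slide ends with a dangling AND, so any further conditions are not recoverable from the transcript. The sketch below runs only the visible conditions against a tiny made-up directory tree, using Python's sqlite3 module as a stand-in database. It assumes that entry 1 of a directory is ".", entry 2 is "..", and entries from 3 onward are real children, which is how the query appears to use entry_num. The inode numbers and the tree itself are invented for illustration.

```python
import sqlite3

# (ino, entry_num, entry_ino): directory `ino` holds an entry pointing to
# `entry_ino`. Assumed numbering: 1 is ".", 2 is "..", >= 3 are children.
DIR_ENTRIES = [
    (2, 1, 2), (2, 2, 2), (2, 3, 12), (2, 4, 13),   # root: children a1, b1
    (12, 1, 12), (12, 2, 2), (12, 3, 14),           # a1: child a2
    (13, 1, 13), (13, 2, 2), (13, 3, 14),           # b1: corrupt entry -> a2
    (14, 1, 14), (14, 2, 12),                       # a2: ".." names a1
]

FALSE_PARENTS = """
SELECT F.ino, F.entry_num, F.entry_ino
FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
WHERE P.entry_num >= 3 AND P.entry_ino = C.ino      -- P says C is its child
  AND C.entry_num = 2  AND C.entry_ino = P.ino      -- C says P is its parent
  AND F.entry_num >= 3 AND F.entry_ino = C.ino      -- F also claims C
  AND F.ino <> P.ino
"""

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE DirEntryTable "
    "(ino INTEGER, entry_num INTEGER, entry_ino INTEGER)"
)
db.executemany("INSERT INTO DirEntryTable VALUES (?, ?, ?)", DIR_ENTRIES)

# Each returned row is a directory entry held by a false parent.
false_parents = db.execute(FALSE_PARENTS).fetchall()
print(false_parents)
```

In this toy tree the only row returned is b1's entry for a2, because a2's own ".." entry agrees with a1. The query uses information from both sides of the parent/child link, which is the information the e2fsck code on the previous slide does not consult.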



Page 21:


Declarative Repairs

Running declarative checks is only part of the problem
Must also perform the declarative repairs

A repair = An update query
  Some repairs simply update a few fields
    ...SET T.field = newValue, T.dirty = 1

A repair = A series of queries
  Ex: Reconnect an orphan directory to the lost+found directory
  Combine a series of queries with C code
    − All repairs are written in SQL
    − C code is only used for connecting them
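
The orphan-directory repair is only named on the slide, not shown. The sketch below is one plausible way to express it as a series of queries, with Python playing the connecting role the talk assigns to C code and sqlite3 standing in for the database. The table columns follow the example-2 query plus a dirty flag. The lost+found inode number, the entry numbering (1 is ".", 2 is "..", 3 and up are children), the sample rows, and the function name are assumptions for illustration; this is not SQCK's actual repair.

```python
import sqlite3

ROOT_INO = 2         # root directory inode in ext2
LOST_FOUND_INO = 11  # assumed inode of lost+found in this toy example

def reconnect_orphans(db):
    """Attach every directory that no parent lists to lost+found."""
    # Query 1: directories (rows holding a "." entry) that no parent lists.
    orphans = [row[0] for row in db.execute(
        "SELECT D.ino FROM DirEntryTable D "
        "WHERE D.entry_num = 1 AND D.ino <> ? "
        "AND NOT EXISTS (SELECT 1 FROM DirEntryTable P "
        "                WHERE P.entry_num >= 3 AND P.entry_ino = D.ino) "
        "ORDER BY D.ino", (ROOT_INO,))]
    for orphan in orphans:
        # Query 2: next free entry slot in lost+found.
        (next_num,) = db.execute(
            "SELECT MAX(entry_num) + 1 FROM DirEntryTable WHERE ino = ?",
            (LOST_FOUND_INO,)).fetchone()
        # Query 3: add an entry in lost+found pointing to the orphan.
        db.execute(
            "INSERT INTO DirEntryTable VALUES (?, ?, ?, 1)",
            (LOST_FOUND_INO, next_num, orphan))
        # Query 4: point the orphan's ".." entry at lost+found.
        db.execute(
            "UPDATE DirEntryTable SET entry_ino = ?, dirty = 1 "
            "WHERE ino = ? AND entry_num = 2", (LOST_FOUND_INO, orphan))
    return orphans

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE DirEntryTable (ino INTEGER, entry_num INTEGER, "
    "entry_ino INTEGER, dirty INTEGER DEFAULT 0)"
)
db.executemany(
    "INSERT INTO DirEntryTable (ino, entry_num, entry_ino) VALUES (?, ?, ?)",
    [
        (2, 1, 2), (2, 2, 2), (2, 3, 11),   # root lists lost+found
        (11, 1, 11), (11, 2, 2),            # lost+found, no children yet
        (20, 1, 20), (20, 2, 15),           # orphan: its parent 15 is gone
    ])

reconnected = reconnect_orphans(db)
dirty_rows = db.execute(
    "SELECT ino, entry_num, entry_ino FROM DirEntryTable "
    "WHERE dirty = 1 ORDER BY ino, entry_num").fetchall()
print(reconnected, dirty_rows)
```

All decisions about what to change live in the SQL statements; the host code only sequences them and passes values from one query to the next.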

Page 22:


Outline

Introduction

Analysis of e2fsck

SQCK Design

SQCK Evaluation


Page 23:


SQCK Evaluation

Complexity
  150 queries in 1100 lines of SQL statements (compared to 16,000 lines of C in e2fsck)

Reliability
  Pass hundreds of corruption scenarios

Flexibility
  Add new checks/repairs
  Enable different versions of e2fsck

Performance
  Introduce some optimizations

Page 24:


SQCK vs. e2fsck

Reasonable performance
  First generation of SQCK (with MySQL)
  Within 1.5x of e2fsck

Future optimizations
  Hierarchical checks
  Concurrent queries

Page 25:


Conclusion

Complexity is the enemy of reliability

Recovery code is complex

SQCK: Build recovery tools with a higher-level approach

Page 26:


Thank you! Questions?

ADvanced Systems Laboratory
