provable semantics for storage stacks - usenix · beyond storage apis: provable semantics for...

47
BEYOND STORAGE APIS: PROVABLE SEMANTICS FOR STORAGE STACKS UNIVERSITY OF WISCONSIN-MADISON RAMNATTHAN ALAGAPPAN VIJAY CHIDAMBARAM THANUMALAYAN PILLAI REMZI ARPACI-DUSSEAU ANDREA ARPACI-DUSSEAU AWS ALBARGHOUTHI

Upload: others

Post on 12-Mar-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

BEYOND STORAGE APIS: PROVABLE SEMANTICS FOR STORAGE STACKS

UNIVERSITY OF WISCONSIN-MADISON

RAMNATTHAN ALAGAPPAN

VIJAY CHIDAMBARAM

THANUMALAYAN PILLAI

REMZI ARPACI-DUSSEAU

ANDREA ARPACI-DUSSEAU

AWS ALBARGHOUTHI

Page 2: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

2

Application

Laptops

Page 3: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

2

Application

Laptops Desktops

Page 4: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

2

Application

Laptops Desktops

Mobile Devices

Page 5: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

2

Application

Laptops Desktops

Mobile Devices

Private and Public Clouds

Page 6: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

2

Application

Laptops Desktops

Mobile Devices

Private and Public Clouds

Heterogeneity of environments is increasing

Page 7: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

STORAGE STACKS: DEEP AND DIVERSE

3

Windows IO stack has 18 layers! [ThereskaSOSP13]

Page 8: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

STORAGE STACKS: DEEP AND DIVERSE

3

Application

SSTF

Disk

ext4

Workstation

Application

CFQ

SSD

btrfs

Laptop

Application

CFQ

SSD

F2FS

Mobile

Windows IO stack has 18 layers! [ThereskaSOSP13]

Page 9: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

APPLICATION PORTABILITY

Applications should be portable between environments

- Reduce development effort and bugs

- Avoid vendor lock-in

4

Amazon EC2 OpenStack Nova

Application 1 Application 2

Common API

Page 10: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

API Compatibility is not enough!

Page 11: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

API Compatibility is not enough!

Application correctness depends upon unspecified properties

of the storage stack

Page 12: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

API Compatibility is not enough!

Application correctness depends upon unspecified properties

of the storage stack

Results: data corruption, data loss, unavailability [PillaiOSDI14]

Page 13: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

6

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

CFQ

SSD

F2FS

Mobile

Dat

acen

ter

Page 14: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

6

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

CFQ

SSD

F2FS

Mobile

Dat

acen

ter

?

??

Quick, automated check at deployment

Page 15: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

6

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

CFQ

SSD

F2FS

Mobile

Dat

acen

ter

?

??

Quick, automated check at deployment

Page 16: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

7

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

Dat

acen

ter

Page 17: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

7

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

Dat

acen

ter

Which is the best node to deploy this application to?

Page 18: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

7

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

Dat

acen

ter

??

Which is the best node to deploy this application to?

Page 19: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

7

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

Dat

acen

ter

??

Which is the best node to deploy this application to?

Page 20: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THE VISION

7

Application

Disk

ext4

Node 1

SSD

btrfs

Node 1

Dat

acen

ter

??

Which is the best node to deploy this application to?

Best: least # of stack layers, least utilized, etc.

Page 21: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OUTLINE

Introduction

Portability Bug Study

First Steps Toward The Vision

The Road Ahead

8

1

3

2

4

Page 22: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

PORTABILITY BUG STUDY

Portability bug: bug that occurs when an application is moved to a different environment

Studied public bug databases

- Android deployed on different mobile devices

- Applications run on cloud platforms and on NFS

Performed our own experiments based on previous work [PillaiOSDI14]

9

Page 23: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OPERATIONS NOT SUPPORTED

10

SQLite creates temporary files by opening a file and unlinking them

Not supported by the daemon emulating FAT32

on the sd card

FAT32 File System

Page 24: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

UNEXPECTED ERROR CODES

11

fsync() on FreeBSD returns ENOLCK

even on success

MySQL restarts when it sees that error

NFS

Page 25: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

ORDERING REQUIREMENTS NOT MET

12

Cloud

Partial File!log

filepart2

filepart 1

filepart1

File System

Inotify

Page 26: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

13

ext4

Workstation

Guarantee: Committed data can always be read back after a crash

POSIX

btrfs

Laptop

POSIX

All file systems are not created equal: On the complexity of crafting crash-consistent applications, OSDI 2014

Page 27: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

13

ext4

Workstation

Guarantee: Committed data can always be read back after a crash

Guarantee: Committed data can always be read back after a crash

POSIX

btrfs

Laptop

POSIX

All file systems are not created equal: On the complexity of crafting crash-consistent applications, OSDI 2014

Page 28: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

13

ext4

Workstation

POSIX

btrfs

Laptop

POSIX

Write A Write B Write C

A

A B A B C

B

All file systems are not created equal: On the complexity of crafting crash-consistent applications, OSDI 2014

Page 29: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

13

ext4

Workstation

We term this an Application Crash Vulnerability

POSIX

btrfs

Laptop

POSIX

All file systems are not created equal: On the complexity of crafting crash-consistent applications, OSDI 2014

Page 30: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

13

ext4

Workstation

We term this an Application Crash Vulnerability

POSIX

btrfs

Laptop

POSIX

API compatibility is not enough!

All file systems are not created equal: On the complexity of crafting crash-consistent applications, OSDI 2014

Page 31: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OUTLINE

Introduction

Portability Bug Study

First Steps Toward The Vision

The Road Ahead

14

1

3

2

4

Page 32: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

Formally verify that an application

will run correctly on a given storage stack

15

What application requires

<=What

storage stack provides

Page 33: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

WHY IS THIS HARD?

16

Application requirements can be complex

- e.g., append(“AB”) should result in file containing A or AB

- if then else form

Binary or numerical checks are not sufficient

<= What storage stack provides

What application requires

Page 34: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

WHY IS THIS HARD?

16

Application requirements can be complex

- e.g., append(“AB”) should result in file containing A or AB

- if then else form

Binary or numerical checks are not sufficient

Need expressive language for specifying application requirements

<= What storage stack provides

What application requires

Page 35: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

WHY IS THIS HARD?

17

Disk provides atomic reads and writes

File system provides atomic metadata operations

Disk

ext3

PostgresPostgres provides ACID transactions

<= What storage stack provides

What application requires

Page 36: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

WHY IS THIS HARD?

17

Disk provides atomic reads and writes

File system provides atomic metadata operations

Disk

ext3

PostgresPostgres provides ACID transactionsNeed to dynamically compute

guarantees provided by the stack

<= What storage stack provides

What application requires

Page 37: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OVERVIEW

18

Application

CFQ

Disk

ext4

Model guarantees the application requires from storage as a theorem

Model guarantees provided by each layer of the storage stack

as axioms

Ex: application will be crash-consistent if all writes are ordered and atomic

Ex: disk guarantees sector-level reads and writes are atomic even with crash

Page 38: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OVERVIEW

18

Application

CFQ

Disk

ext4

Model guarantees the application requires from storage as a theorem

Model guarantees provided by each layer of the storage stack

as axioms

Ex: application will be crash-consistent if all writes are ordered and atomic

Ex: disk guarantees sector-level reads and writes are atomic even with crash

Prove application theorem using axioms from storage stack

Page 39: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

SYSTEM

19

Application Requirements

Stack Configuration Library of

stack-layer specifications

Result

PROVER

E.g., put() must be atomic

E.g., put() is atomic on given stack

E.g., Key-Value Store on top of disk

Page 40: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

SYSTEM

20

Application Requirements

Stack Configuration Library of

stack-layer specifications

Result

E.g., put() must be atomic

E.g., put() is atomic on given stack

E.g., Key-Value Store on top of disk

Page 41: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

SYSTEM

20

Application Requirements

Stack Configuration Library of

stack-layer specifications

Result

E.g., put() must be atomic

E.g., put() is atomic on given stack

Use the proof assistant to manually write machine-checked proofs

E.g., Key-Value Store on top of disk

Page 42: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

EXPERIENCE WITH ISABELLE

Modelled a simple 2-layer stack

- Key-value store on top of a disk

Proved put() is atomic

About 160 lines of code (lots of trial and error)

Code available at: https://github.com/ramanala/StorageStackSemantics

21

Page 43: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

EXPERIENCE WITH ISABELLE

Modelled a simple 2-layer stack

- Key-value store on top of a disk

Proved put() is atomic

About 160 lines of code (lots of trial and error)

Code available at: https://github.com/ramanala/StorageStackSemantics

21

Page 44: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

OUTLINE

Introduction

Portability Bug Study

First Steps Toward The Vision

The Road Ahead

22

1

3

2

4

Page 45: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

CHALLENGES

Obtaining specifications

- Developer provides/written by grad students

- How to figure out automatically?

Automatic proofs

- Use Z3 instead of Isabelle?

Proofs without specifications

- Know a layer provides guarantees, without knowing how

Verifying implementations

23

Page 46: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

CONCLUSION

The promise of software-defined storage

- Increases in performance, flexibility, and utilization

- Unspoken aspect: application correctness!

Simply ensuring API compatibility is not enough

- Storage semantics are complex and nuanced

PL tools like SMT solvers/proof assistants can help match application to diverse storage stacks

Interesting, significant challenges on path ahead

24

Page 47: PROVABLE SEMANTICS FOR STORAGE STACKS - USENIX · beyond storage apis: provable semantics for storage stacks university of wisconsin-madison ramnatthan alagappan vijay chidambaram

THANK YOU! QUESTIONS?

SOURCE CODE AT:HTTP://CS.WISC.EDU/~VIJAYC

VIJAY CHIDAMBARAM UNIVERSITY OF WISCONSIN MADISON [email protected] | HTTP://CS.WISC.EDU/~VIJAYC