Speculative Execution in a Distributed File System
Ed Nightingale
Peter Chen
Jason Flinn
University of Michigan
Motivation
• Why are distributed file systems slow(er)?
– Synchronous network messages provide consistency
– Synchronous disk writes provide safety
• Common workaround: sacrifice guarantees for speed
• Can a DFS be safe, consistent, and fast?
– Yes! With OS support for speculative execution
Big Idea: Slow Way
• Client sends RPC Req to Server
• Block! until RPC Resp arrives
Big Idea: Speculator
1) Checkpoint
2) Speculate!
3) Correct?
– Yes: discard ckpt.
– No: restore process & re-execute
• Guarantees without blocking I/O!
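The three steps on this slide can be sketched in user-level Python. This is only an illustration under stated assumptions: Speculator itself works inside the OS kernel, and the names `speculative_call`, `remote_call`, and `continue_with` are hypothetical. The checkpoint is modeled as a deep copy of a state dictionary, and the in-flight RPC as a task on a thread pool.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def speculative_call(state, predicted, remote_call, continue_with):
    """Run continue_with on a predicted reply while remote_call is in flight."""
    checkpoint = copy.deepcopy(state)           # 1) Checkpoint
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(remote_call)      # RPC Req sent, caller does not block
        continue_with(state, predicted)         # 2) Speculate!
        actual = pending.result()               # RPC Resp arrives
    if actual == predicted:                     # 3) Correct?
        return state                            # Yes: discard ckpt.
    state = checkpoint                          # No: restore ...
    continue_with(state, actual)                # ... and re-execute
    return state


def record_reply(state, reply):
    # Hypothetical continuation: remember which reply the process acted on.
    state["seen"].append(reply)


good = speculative_call({"seen": []}, "v1", lambda: "v1", record_reply)
bad = speculative_call({"seen": []}, "v1", lambda: "v2", record_reply)
```

In the second call the prediction is wrong, so the work done on "v1" is thrown away and the continuation runs again on the real reply.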
Conditions for Success
• Operations are highly predictable
– Conflicts are rare
• Checkpoints are cheaper than network I/O
– 52 µs for small process
• Computers have resources to spare
– Need memory and CPU cycles for speculation
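A rough way to see why the first two conditions matter is a toy cost model. The model is an assumption, not something taken from the slides: a successful speculation costs only the checkpoint, while a failed one costs the checkpoint plus the full round trip. The 52 µs figure is from the slide, the 30 ms round trip is borrowed from the delay used in the evaluation, and the 1% failure rate is purely illustrative.

```python
def expected_stall_us(rtt_us, checkpoint_us, p_fail):
    """Expected stall per operation when speculating (toy model)."""
    # Success: pay only the checkpoint.
    # Failure: pay the checkpoint, then wait out the round trip anyway.
    return checkpoint_us + p_fail * rtt_us


CHECKPOINT_US = 52        # from the slide: small process
RTT_US = 30_000           # assumption: the 30 ms delay from the evaluation
P_FAIL = 0.01             # assumption: conflicts are rare

blocking_us = RTT_US
speculating_us = expected_stall_us(RTT_US, CHECKPOINT_US, P_FAIL)
speedup = blocking_us / speculating_us
```

Under these assumptions the expected stall drops from 30 ms to roughly a third of a millisecond. The model ignores the CPU and memory spent on wasted work, which is what the third condition is about.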
Outline
• Motivation
• Implementing speculation
• Multi-process speculation
• Using Speculator
• Evaluation
Implementing Speculation
[Figure: timeline of a Process with its Checkpoint, Spec, and Undo log]
1) System call
2) Create speculation
Speculation Success
[Figure: timeline of a Process with its Checkpoint, Spec, and Undo log]
1) System call
2) Create speculation
3) Commit speculation
Speculation Failure
[Figure: timeline of a Process with its Checkpoint, Spec, and Undo log; the Process appears a second time after the failure]
1) System call
2) Create speculation
3) Fail speculation
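The create, commit, and fail steps from the last three slides can be sketched as a small data structure. Everything here is a hypothetical stand-in: the slides do not say how the kernel stores a checkpoint or an undo log, so the sketch uses a shallow copy of a flat dictionary and a list of undo callables.

```python
class Speculation:
    """Hypothetical stand-in for a kernel speculation record."""

    def __init__(self, process_state):
        self.checkpoint = dict(process_state)   # shallow copy taken at create time
        self.undo_log = []                      # callables that reverse side effects

    def record_undo(self, undo_fn):
        self.undo_log.append(undo_fn)

    def commit(self):
        # Speculation succeeded: the checkpoint and undo entries are no longer needed.
        self.checkpoint = None
        self.undo_log.clear()

    def fail(self, process_state):
        # Speculation failed: undo side effects newest-first, then restore the process.
        while self.undo_log:
            self.undo_log.pop()()
        process_state.clear()
        process_state.update(self.checkpoint)


proc = {"pc": 10}
files = {}

spec = Speculation(proc)                        # 2) Create speculation
files["tmp"] = "data"                           # speculative side effect
spec.record_undo(lambda: files.pop("tmp"))
proc["pc"] = 20                                 # process keeps running

spec.fail(proc)                                 # 3) Fail speculation
```

After `fail`, both the process state and the file table are back to what they were when the speculation was created.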
Ensuring Correctness
• Spec processes often affect external state
• Three ways to ensure correct execution
– Block
– Buffer
– Propagate speculations (dependencies)
System Calls
• Block calls that externalize state
– Allow read-only calls (e.g. getpid)
– Allow calls that modify only task state (e.g. dup2)
• File system calls -- need to dig deeper
– Mark file systems that support Speculator
• Examples:
– getpid: call sys_getpid()
– reboot: block until specs resolved
– mkdir: allow only if fs supports Speculator
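The policy on this slide amounts to a lookup performed when a speculative process makes a system call. A minimal sketch follows, assuming hypothetical names (`Action`, `policy`) and call sets that contain only the examples named in the slides.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block until specs resolved"


READ_ONLY = {"getpid"}          # never externalize state
TASK_STATE_ONLY = {"dup2"}      # modify only the calling task
FS_CALLS = {"mkdir", "stat"}    # need a per-file-system decision


def policy(syscall, is_speculative, fs_supports_speculator=False):
    """Decide what to do with a system call from a possibly speculative process."""
    if not is_speculative:
        return Action.ALLOW
    if syscall in READ_ONLY or syscall in TASK_STATE_ONLY:
        return Action.ALLOW
    if syscall in FS_CALLS and fs_supports_speculator:
        return Action.ALLOW
    return Action.BLOCK         # default for everything else, e.g. reboot
```

Blocking by default keeps the sketch safe: a call is allowed to proceed speculatively only when it has been explicitly classified.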
Output Commits
[Figure: timeline of a Process with two Checkpoints, Spec(stat) and Spec(mkdir), and an Undo log; the outputs “stat worked” and “mkdir worked” are shown alongside]
1) sys_stat
2) sys_mkdir
3) Commit speculation
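One way to read this slide is that output produced by a speculative process is buffered and released only once the speculations it depends on commit. The sketch below illustrates that reading; the class `OutputBuffer` and its methods are hypothetical, and the dependency sets are chosen for illustration.

```python
class OutputBuffer:
    """Hypothetical buffer that holds output until its speculations resolve."""

    def __init__(self):
        self.pending = []      # (text, set of unresolved speculation ids)
        self.released = []     # what the outside world has actually seen

    def write(self, text, depends_on):
        self.pending.append((text, set(depends_on)))
        self._flush()

    def commit(self, spec_id):
        for _, deps in self.pending:
            deps.discard(spec_id)
        self._flush()

    def fail(self, spec_id):
        # Drop every output that depended on the failed speculation.
        self.pending = [(t, d) for t, d in self.pending if spec_id not in d]
        self._flush()

    def _flush(self):
        # Release in order; stop at the first output that is still speculative.
        while self.pending and not self.pending[0][1]:
            self.released.append(self.pending.pop(0)[0])


out = OutputBuffer()
out.write("stat worked", {"stat"})
out.write("mkdir worked", {"stat", "mkdir"})
out.commit("stat")
after_stat = list(out.released)
out.commit("mkdir")
after_mkdir = list(out.released)
```

Releasing strictly in order means a later message never overtakes an earlier one that is still waiting on a speculation.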
Multi-Process Speculation
• Processes often cooperate
– Example: “make” forks children to compile, link, etc.
– Would block if speculation limited to one task
• Allow kernel objects to have speculative state
– Examples: inodes, signals, pipes, Unix sockets, etc.
– Propagate dependencies among objects
– Objects rolled back to prior states when specs fail
Multi-Process Speculation
[Figure: pid 8000, pid 8001, and inode 3456, each with Checkpoints; undo entries Chown-1 and Write-1; speculations Spec 1 and Spec 2]
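Dependency propagation among kernel objects can be sketched as set union: whenever a speculative object modifies or is observed by another object, the other object inherits its speculations. The scenario below is one plausible reading of the figure's labels, not a reconstruction of the figure, and `KObject`, `modify`, `observe`, and `affected_by` are hypothetical names.

```python
class KObject:
    """Hypothetical kernel object: a process, inode, pipe, and so on."""

    def __init__(self, name):
        self.name = name
        self.deps = set()      # ids of speculations this object depends on
        self.undo = []         # undo entries such as "Chown-1", newest last


def modify(actor, target, undo_entry):
    # The target records how to undo the change and inherits the actor's speculations.
    target.undo.append(undo_entry)
    target.deps |= actor.deps


def observe(actor, source):
    # Reading speculative state makes the reader speculative too.
    actor.deps |= source.deps


def affected_by(spec_id, objects):
    # Objects that would have to roll back if spec_id failed.
    return [o.name for o in objects if spec_id in o.deps]


p8000 = KObject("pid 8000")
p8001 = KObject("pid 8001")
inode = KObject("inode 3456")

p8000.deps.add("Spec 1")            # pid 8000 is running speculatively
modify(p8000, inode, "Chown-1")
modify(p8000, inode, "Write-1")
observe(p8001, inode)               # pid 8001 reads the speculative inode

to_roll_back = affected_by("Spec 1", [p8000, p8001, inode])
```

Because pid 8001 read state written under Spec 1, it ends up in the rollback set even though it never made a speculative call itself.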
Multi-Process Speculation
• What we handle:
– DFS objects, RAMFS, Ext3, Pipes & FIFOs
– Unix Sockets, Signals, Fork & Exit
• What we don’t (i.e. we block):
– System V IPC
– Multi-process write-shared memory
Example: NFSv3 Linux
[Sequence diagram: Client 1, Server, Client 2; operations shown: Open B (Getattr), Modify B (Write), Commit]
Example: SpecNFS
[Sequence diagram: Client 1, Server, Client 2; operations shown: Modify B (speculate), Open B (Getattr, speculate), Write+Commit]
Problem: Mutating Operations
• Client 1: 1. cat foo > bar
• Client 2: 2. cat bar
• bar depends on cat foo
• What does client 2 see in bar?
Solution: Mutating Operations
• Server determines speculation success/failure
– State at server never speculative
• Send server the hypothesis the speculation is based on
– List of speculations an operation depends on
• Requires server to track failed speculations
• Requires in-order processing of messages
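The server-side rule can be sketched as follows, using the cat foo > bar example from the previous slide. This is a sketch under assumptions: `Server.handle`, its parameters, and the way a hypothesis is expressed as a predicate are all hypothetical, and messages are simply assumed to arrive in the order the client sent them.

```python
class Server:
    """Hypothetical server; its own state is never speculative."""

    def __init__(self):
        self.files = {}
        self.failed = set()        # ids of speculations known to have failed

    def handle(self, spec_id, depends_on, hypothesis_ok, apply):
        # Reject if any speculation this one depends on failed, or if the
        # client's hypothesis does not match the server's real state.
        if self.failed & set(depends_on) or not hypothesis_ok(self.files):
            self.failed.add(spec_id)
            return False           # client must roll back
        apply(self.files)
        return True


srv = Server()
srv.files["foo"] = "new"

# s1: client 1 read foo, assuming its cached copy "old" was current.
ok1 = srv.handle("s1", [], lambda f: f["foo"] == "old", lambda f: None)

# s2: the write to bar was computed from that read, so it depends on s1.
ok2 = srv.handle("s2", ["s1"], lambda f: True, lambda f: f.update(bar="old"))
```

Because s1 fails, the dependent write s2 is rejected as well, so bar never receives data derived from the stale copy of foo and client 2 cannot observe it.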
Group Commit
• Previously sequential ops now concurrent
• Sync ops usually committed to disk
• Speculator makes group commit possible
[Figure: two Client and Server timelines showing write and commit messages, sequential versus grouped]
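The benefit of group commit can be illustrated by counting disk synchronizations. The `Disk` class and both functions below are hypothetical; the point is only that operations which arrive together can share one synchronous disk write instead of paying for one each.

```python
class Disk:
    """Hypothetical disk that counts synchronous writes."""

    def __init__(self):
        self.syncs = 0
        self.durable = []

    def sync(self, records):
        self.syncs += 1
        self.durable.extend(records)


def commit_each(disk, writes):
    # Sequential clients: one synchronous disk write per operation.
    for w in writes:
        disk.sync([w])


def group_commit(disk, writes):
    # Concurrent operations share a single synchronous disk write.
    disk.sync(list(writes))


sequential, grouped = Disk(), Disk()
commit_each(sequential, ["w1", "w2", "w3"])
group_commit(grouped, ["w1", "w2", "w3"])
```

Both disks end up with the same durable records, but the grouped one pays for a single sync rather than three.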
Putting it all Together: SpecNFS
• Apply Speculator to an existing file system
• Modified NFSv3 in Linux 2.4 kernel
– Same RPCs issued (but many now asynchronous)
– SpecNFS has same consistency, safety as NFS
– Getattr, lookup, access speculate if data in cache
– Create, mkdir, commit, etc. always speculate
Putting it all Together: BlueFS
• Design a new file system for Speculator
– Single copy semantics
– Synchronous I/O
• Each file, directory, etc. has version number
– Incremented on each mutating op (e.g. on write)
– Checked prior to all operations
– Many ops speculate and check version async
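The version-number scheme can be sketched with a store that bumps a counter on every mutating operation and a client that reads from its cache while the version check is still outstanding. `VersionedStore` and `ClientCache` are hypothetical names, and the asynchronous check is modeled as a callable the caller invokes later.

```python
class VersionedStore:
    """Hypothetical server-side store with one version number per object."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def write(self, name, value):
        self.data[name] = value
        self.version[name] = self.version.get(name, 0) + 1   # bump on mutating op

    def check(self, name, cached_version):
        return self.version.get(name, 0) == cached_version


class ClientCache:
    """Hypothetical client that speculates on cached data."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def fetch(self, name):
        self.cache[name] = (self.store.data[name], self.store.version[name])

    def speculative_read(self, name):
        value, ver = self.cache[name]
        # The caller may use value at once; verify stands in for the
        # asynchronous version check and returns True if the speculation holds.
        def verify():
            return self.store.check(name, ver)
        return value, verify


store = VersionedStore()
store.write("f", "a")
client = ClientCache(store)
client.fetch("f")

value1, verify1 = client.speculative_read("f")
fresh = verify1()                    # cache still current

store.write("f", "b")                # another client mutates f
value2, verify2 = client.speculative_read("f")
stale = not verify2()                # speculation must be rolled back
```

The second read still returns the cached value immediately, and the failed check is what tells the client that everything computed from it has to be undone.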
Apache Benchmark
• SpecNFS up to 14 times faster
[Figure: bar charts of Time (seconds) for NFS, SpecNFS, BlueFS, and ext3; panels: No delay (axis 0 to 300) and 30 ms delay (axis 0 to 4500)]
The Cost of Rollback
• With all files out of date, SpecNFS is still up to 11x faster
[Figure: bar charts of Time (seconds) for NFS, SpecNFS, and ext3; panels: No delay (axis 0 to 140) and 30 ms delay (axis 0 to 2000); series: No files invalid, 10% files invalid, 50% files invalid, 100% files invalid]
Conclusion
• Speculator greatly improves performance of existing distributed file systems
• Speculator enables new file systems to be safe, consistent and fast
Group Commit & Sharing State
[Figure: bar charts of Time (seconds) for NFS, SpecNFS, and BlueFS; panels: 0 ms delay (axis 0 to 500) and 30 ms delay (axis 0 to 4500); series: Default, No prop, No grp commit, No grp commit & no prop]
Apache Benchmark
[Figure: bar charts of Time (seconds) broken down by phase: Untar, Configure, Make, Remove; panels: No delay (axis 0 to 300) and 30 ms delay (axis 0 to 4500)]
Related Work
• Chang & Gibson, Fraser & Chang
– Speculative pre-fetching
• Time Warp
– Virtual Time: distributed simulations
• Hardware branch prediction
• Transactional file systems