Prophecy: Using History for High-Throughput Fault Tolerance
![Page 1: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/1.jpg)
Prophecy: Using History for High-Throughput Fault Tolerance
Siddhartha Sen
Joint work with Wyatt Lloyd and Mike Freedman
Princeton University
![Page 2: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/2.jpg)
Non-crash failures happen
Model as Byzantine (malicious)
![Page 3: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/3.jpg)
Mask Byzantine faults
Clients Service
![Page 4: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/4.jpg)
Mask Byzantine faults
Replicated service
Clients
Throughput
![Page 5: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/5.jpg)
Mask Byzantine faults
Replicated service
Clients
Throughput
Linearizability (strong consistency)
![Page 6: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/6.jpg)
Byzantine fault tolerance (BFT)
• Low throughput
• Modifies clients
• Long-lived sessions
![Page 7: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/7.jpg)
Prophecy
• High throughput + good consistency
• No free lunch:
  – Read-mostly workloads
  – Slightly weakened consistency
![Page 8: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/8.jpg)
Byzantine fault tolerance (BFT)
• Low throughput
• Modifies clients
• Long-lived sessions
D-Prophecy
Prophecy
![Page 9: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/9.jpg)
Traditional BFT reads
Clients
Replica Group
application
Agree?
![Page 10: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/10.jpg)
A cache solution
Clients
Replica Group
application
cache
Agree?
![Page 11: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/11.jpg)
A cache solution
Clients
Replica Group
application
cache
Agree?
Problems:
• Huge cache
• Invalidation
![Page 12: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/12.jpg)
A compact cache
Clients
Replica Group
application
cache

| Requests | Responses |
|---|---|
| req1 | resp1 |
| req2 | resp2 |
| req3 | resp3 |
![Page 13: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/13.jpg)
A compact cache
Clients
Replica Group
application
cache

| Requests | Responses |
|---|---|
| sketch(req1) | sketch(resp1) |
| sketch(req2) | sketch(resp2) |
| sketch(req3) | sketch(resp3) |
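
The compact cache on this slide can be illustrated with a minimal Python sketch. It assumes a sketch is a truncated SHA-256 digest; the slides do not say which hash or length is used, and `SKETCH_BYTES`, `sketch`, and `SketchTable` are hypothetical names chosen for illustration.

```python
import hashlib

SKETCH_BYTES = 8  # assumed sketch length; the slides do not give one


def sketch(data: bytes) -> bytes:
    """Return a short, fixed-size digest of data (truncated SHA-256)."""
    return hashlib.sha256(data).digest()[:SKETCH_BYTES]


class SketchTable:
    """Maps sketch(request) -> sketch(response) instead of storing full payloads."""

    def __init__(self):
        self._table = {}  # bytes -> bytes

    def record(self, request: bytes, response: bytes) -> None:
        """Remember the sketch of the response seen for this request."""
        self._table[sketch(request)] = sketch(response)

    def matches(self, request: bytes, response: bytes) -> bool:
        """True if response agrees with the sketch stored for request."""
        expected = self._table.get(sketch(request))
        return expected is not None and expected == sketch(response)


table = SketchTable()
table.record(b"GET /index.html", b"<html>hello</html>")
print(table.matches(b"GET /index.html", b"<html>hello</html>"))
print(table.matches(b"GET /index.html", b"<html>tampered</html>"))
```

Each entry costs a fixed number of bytes regardless of page size, which is what addresses the "huge cache" problem from the earlier slide.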
![Page 14: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/14.jpg)
A sketcher
Clients
Replica Group
application
sketcher
![Page 15: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/15.jpg)
A sketcher
Clients
Replica Group
[Diagram labels: sketch, webpage]
![Page 16: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/16.jpg)
Executing a read
Clients
Replica Group
Agree?
Fast, load-balanced reads
[Diagram labels: sketch, webpage]
![Page 17: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/17.jpg)
Executing a read
Clients
Replica Group
Agree?
[Diagram labels: sketch, webpage]
![Page 18: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/18.jpg)
Executing a read
Clients
Replica Group
[Diagram labels: sketch, webpage]
key-value store
replicated state machine
![Page 19: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/19.jpg)
Executing a read
Clients
Replica Group
Agree?
[Diagram labels: sketch, webpage]
Maintain a fresh cache
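
The "Executing a read" slides can be read as the following flow, shown here as a hedged Python sketch rather than the authors' implementation. It assumes that a fast read goes to one randomly chosen replica, that the sketch of its response is compared with the stored entry, and that a miss or mismatch falls back to a replicated read that refreshes the entry. The `Sketcher` class, its method names, and the majority vote standing in for the BFT agreement protocol are all illustrative assumptions.

```python
import hashlib
import random
from collections import Counter


def sketch(data: bytes) -> bytes:
    """Truncated SHA-256 digest; the hash and length are assumptions."""
    return hashlib.sha256(data).digest()[:8]


class Sketcher:
    def __init__(self, replicas):
        self.replicas = replicas  # list of callables: request -> response
        self.table = {}           # sketch(request) -> sketch(response)
        self.fast_reads = 0
        self.slow_reads = 0

    def replicated_read(self, request: bytes) -> bytes:
        """Stand-in for a full BFT read: ask every replica, keep the most common answer."""
        votes = Counter(replica(request) for replica in self.replicas)
        response, _count = votes.most_common(1)[0]
        return response

    def read(self, request: bytes) -> bytes:
        key = sketch(request)
        replica = random.choice(self.replicas)  # load-balance across replicas
        response = replica(request)
        if self.table.get(key) == sketch(response):
            self.fast_reads += 1
            return response
        # Miss or mismatch: fall back to the replicated read and refresh the entry.
        self.slow_reads += 1
        response = self.replicated_read(request)
        self.table[key] = sketch(response)
        return response


def honest(request: bytes) -> bytes:
    return b"page for " + request


def faulty(request: bytes) -> bytes:
    return b"garbage"


s = Sketcher([honest, honest, honest, faulty])
for _ in range(20):
    assert s.read(b"/index.html") == b"page for /index.html"
print(s.fast_reads, s.slow_reads)
```

The write path, which would also have to update the table to keep it fresh, is not modeled here, and neither is a replica that returns an old response whose sketch still matches.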
![Page 20: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/20.jpg)
Did we achieve linearizability?
NO!
![Page 21: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/21.jpg)
Executing a read
Clients
Replica Group
[Diagram labels: sketch, webpage]
![Page 22: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/22.jpg)
Executing a read
Clients
Replica Group
Agree?
[Diagram labels: sketch, webpage]
![Page 23: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/23.jpg)
Executing a read
Clients
Replica Group
Agree?
[Diagram labels: sketch, webpage]
Fast reads may be stale
![Page 24: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/24.jpg)
Load balancing
Clients
Replica Group
Agree?
[Diagram labels: sketch, webpage]
$\Pr(k\ \text{stale}) = g^k$
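
To get a feel for how quickly this probability shrinks, here is a small worked example. It assumes, since the slide does not define the symbol, that g is the probability that a single load-balanced fast read returns a stale result and that successive reads are independent. The value 0.25 is purely illustrative.

```python
def prob_k_stale(g: float, k: int) -> float:
    """Probability that k independent fast reads are all stale, given per-read probability g."""
    return g ** k


for k in (1, 2, 3, 5):
    print(k, prob_k_stale(0.25, k))
# 1 0.25
# 2 0.0625
# 3 0.015625
# 5 0.0009765625
```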
![Page 25: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/25.jpg)
D-Prophecy vs. BFT
Traditional BFT:
• Each replica executes read
• Linearizability
D-Prophecy:
• One replica executes read
• “Delay-once” linearizability
Clients
Replica Group
![Page 26: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/26.jpg)
Byzantine fault tolerance (BFT)
• Low throughput
• Modifies clients
• Long-lived sessions
D-Prophecy
Prophecy
![Page 27: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/27.jpg)
Key-exchange overhead
11%
3%
![Page 28: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/28.jpg)
Internet services
Clients
Replica Group
![Page 29: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/29.jpg)
A proxy solution
Sketcher
Clients
Replica Group
Proxy
Consolidate sketchers
![Page 30: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/30.jpg)
A proxy solution
Sketcher
Clients
Replica Group
Trusted
Sketcher must be fail-stop
![Page 31: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/31.jpg)
A proxy solution
Sketcher
Clients
Replica Group
Trusted
Sketcher must be fail-stop
• Trust middlebox already
• Small and simple
![Page 32: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/32.jpg)
Executing a read
Prophecy
Sketcher
Trusted
Clients
Replica Group
Fast, load-balanced reads
![Page 33: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/33.jpg)
Prophecy
Sketcher
Clients
Replica Group
Trusted
Fast reads may be stale
![Page 34: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/34.jpg)
Delay-once linearizability
![Page 35: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/35.jpg)
Delay-once linearizability
W, R, W, W, R, R, W, R
Read-after-write property
![Page 36: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/36.jpg)
Delay-once linearizability
W, R, W, W, R, R, W, R
Read-after-write property
![Page 37: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/37.jpg)
Example application
• Upload embarrassing photos
  1. Remove colleagues from ACL
  2. Upload photos
  3. (Refresh)
• Weak consistency may reorder
• Delay-once preserves order
![Page 38: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/38.jpg)
Byzantine fault tolerance (BFT)
• Low throughput
• Modifies clients
• Long-lived sessions
D-Prophecy
Prophecy
![Page 39: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/39.jpg)
Implementation
• Modified PBFT
  – PBFT is stable, complete
  – Competitive with Zyzzyva et al.
• C++, Tamer async I/O
  – Sketcher: 2000 LOC
  – PBFT library: 1140 LOC
  – PBFT client: 1000 LOC
![Page 40: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/40.jpg)
Evaluation
• Prophecy vs. proxied-PBFT
  – Proxied systems
• D-Prophecy vs. PBFT
  – Non-proxied systems
![Page 41: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/41.jpg)
Evaluation
• Prophecy vs. proxied-PBFT
  – Proxied systems
• We will study:
  – Performance on “null” workloads
  – Performance with real replicated service
  – Where system bottlenecks, how to scale
![Page 42: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/42.jpg)
Basic setup
Sketcher
Clients (100)
Replica Group (PBFT)
(concurrent)
![Page 43: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/43.jpg)
Fraction of failed fast reads
Alexa top sites: < 15%
![Page 44: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/44.jpg)
Small benefit on null reads
![Page 45: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/45.jpg)
Apache webserver setup
Sketcher
Clients
Replica Group
![Page 46: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/46.jpg)
Large benefit on real workload
3.7x
2.0x
![Page 47: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/47.jpg)
Benefit grows with work
94s (Apache)
Null workloads are misleading!
![Page 48: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/48.jpg)
Benefit grows with work
![Page 49: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/49.jpg)
Single sketcher bottlenecks
![Page 50: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/50.jpg)
Scaling out
![Page 51: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/51.jpg)
Scales linearly with replicas
![Page 52: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/52.jpg)
Summary
• Prophecy good for Internet services
  – Fast, load-balanced reads
• D-Prophecy good for traditional services
• Prophecy scales linearly while PBFT stays flat
• Limitations:
  – Read-mostly workloads (measurement study corroborates)
  – Delay-once linearizability (useful for many apps)
![Page 53: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/53.jpg)
Thank You
![Page 54: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/54.jpg)
Additional slides
![Page 55: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/55.jpg)
Transitions
• Prophecy good for read-mostly workloads
• Are transitions rare in practice?
![Page 56: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/56.jpg)
Measurement study
• Alexa top sites
• Access main page every 20 sec for 24 hrs
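
The polling schedule above implies a fixed number of samples per site, and a transition rate can be computed from consecutive snapshots. The sketch below assumes a "transition" means that two consecutive fetches of the same page differ, which the slide does not spell out. `transition_fraction` is a hypothetical helper, and the byte strings stand in for fetched pages.

```python
import hashlib

POLL_INTERVAL_S = 20
DURATION_S = 24 * 60 * 60
SAMPLES_PER_SITE = DURATION_S // POLL_INTERVAL_S


def transition_fraction(snapshots):
    """Fraction of consecutive snapshot pairs whose content differs."""
    if len(snapshots) < 2:
        return 0.0
    digests = [hashlib.sha256(s).digest() for s in snapshots]
    changes = sum(a != b for a, b in zip(digests, digests[1:]))
    return changes / (len(digests) - 1)


print(SAMPLES_PER_SITE)  # 4320
print(transition_fraction([b"v1", b"v1", b"v2", b"v2", b"v2"]))  # 0.25
```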
![Page 57: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/57.jpg)
Mostly static content
![Page 58: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/58.jpg)
Mostly static content
15%
![Page 59: Prophecy : Using History for High-Throughput Fault Tolerance](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681598b550346895dc6cf4e/html5/thumbnails/59.jpg)
Dynamic content
• Rabin fingerprinting on transitions
• 43% differ by single contiguous change
• Sampled 4000 of them, over half due to:
  – Load balancing directives
  – Random IDs in links, function parameters
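
One way to test whether two versions of a page differ by a single contiguous change is sketched below. This is a simplification and not the study's method: it compares pages line by line with Python's `difflib` instead of using Rabin fingerprinting to pick chunk boundaries. The function names and the sample pages are illustrative.

```python
import difflib


def count_changed_runs(old: str, new: str) -> int:
    """Number of separate contiguous regions (measured in lines) that differ."""
    matcher = difflib.SequenceMatcher(
        None, old.splitlines(), new.splitlines(), autojunk=False
    )
    return sum(1 for tag, *_rest in matcher.get_opcodes() if tag != "equal")


def single_contiguous_change(old: str, new: str) -> bool:
    """True if the two versions differ in exactly one contiguous region."""
    return count_changed_runs(old, new) == 1


page_a = "header\nlink?id=123\nbody\nfooter"
page_b = "header\nlink?id=456\nbody\nfooter"
page_c = "header\nlink?id=456\nbody\nfooter v2"

print(single_contiguous_change(page_a, page_b))  # True
print(single_contiguous_change(page_a, page_c))  # False
```

A random ID embedded in a link, as in `page_b`, shows up as exactly one changed region, which matches the kind of dynamic content the slide attributes many of the transitions to.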