cap theorem, eventual consistency, and crdt'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf ·...

25
CAP Theorem, Eventual Consistency, and CRDT’s Distributed Systems Spring 2019 Lecture 16

Upload: others

Post on 15-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

CAP Theorem, Eventual Consistency, andCRDT’s

Distributed Systems Spring 2019

Lecture 16

Page 2: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Agenda

• Last time: How consistency models can be implemented• Primary-based methods

• CAP theorem and its implications• Most famous observation made about distributed systems in

the last 2 decades• Eventual Consistency• CRDT’s

Attend talk at 2.30 in Luddy 3166Literally black-magic with wireless signals (like WiFi)

Page 3: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Recap: Primary based protocol

Data store

Primary serverfor item x

Client Client

Backup server

W1. Write requestW2. Forward request to primaryW3. Tell backups to updateW4. Acknowledge updateW5. Acknowledge write completed

W1

W2

W3 W3

W3

W4 W4

W4

W5

R1. Read requestR2. Response to read

R1 R2

Page 4: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Tradeoffs In Consistency Models

• Sequential Consistency: Blocking writes via multicasting• Eventual Consistency: Non-blocking writes lazily propagated• What if we cannot write to a replica?

• Perhaps due to Network or hardware failure• Should the write operation continue?

• For strong consistency, we cannot allow write to proceed• System becomes “unavailable” to the client

Page 5: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

CAP Theorem

C, A, and P:

• Consistency: Strong Consistency (Linearizability)• Availability: Clients can always perform operations• Partition Tolerance: If some replicas are unreachable, systemfunction should not be compromised

CAP theorem: Pick any two*With a few important caveats!

Page 6: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

CAP Theorem

Page 7: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

CAP Theorem Details

• Partition tolerance usually cannot be sacrificed• But, dont need to sacrifice either C or A when there are nopartitions.• Partitions usually detected via timeouts• System can enter “C” or “A” mode if partition detected

• Cancel operation and have reduced availability• Continue operation and risk inconsistency

Page 8: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Partitions

Page 9: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Dealing with Partitions

• No global view of partitions: some replicas may beunreachable from certain sources• Can have a “one-sided” partition

• Offline-mode: long periods of unavailability• Mask unavailability: log operations and replay later

• Credit-card machines• “Merge” paritions using version vectors, or last-write-wins

• Source control systems typically use this• Prohibit “risky” operations (such as deletes)

Page 10: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Compensating for Mistakes

• Mistakes made during partitions can be fixed in many ways• Airline overbooking: compensation is literal $• Amazon shopping carts: Union of two carts. Deleted itemsmay reappear, and customer manually corrects final cart, orescalates to customer service• ATMs: Withdrawals can proceed even when partitioned. Butbanks place a hard-limit on withdrawals to limit the risk.

Page 11: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

PACELC

• If there is a Partition, how does the system tradeoff A and C• Else, how does it tradeoff latency and C• Generalization of CAP• Latency differences between synchronous and asyncoperations, location of primary, etc.

Page 12: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Eventual Consistency In a Nutshell

Page 13: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Eventual Consistency

• If no additional updates are made to a given data item, allreads to that item will eventually return the same value.• No guarantees about when the replicas will converge• Usually implemented through async writebacks• Conflicts merged/resolved asynchronously in the background• On conflict: Arbitrate or roll-back

What are the safety and liveness properties?

Page 14: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

More Eventual Consistency

• Probabilistically Bounded Staleness• New approaches can provide stricter consistency guaranteeson top of eventually consistent stores• COPS, Eiger, Bolt-on causal Consistency, ...

Page 15: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Strong Eventual Consistency

• Two nodes can receive the same set of updates in a differentorder• Their view of the shared state is identical

• No Conflicts!• Any conflicting updates are merged automatically

• Possible to implement without centralized server/leader• Local execution can proceed without waiting for otherprocesses• Intermittent connectivity useful for mobile devices, offlineoperations, etc

Page 16: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

CRDTs

• (Conflict-free/Commutative/Convergent) Replicated DataTypes• Key idea: Use commutative operations for state convergence• Provably converge to a sequentially consistent result• States should form a “semi-lattice”

• Partial order, and Least Upper Bound always exists• Values of different replicas merged using lattice-join operation• Can be implemented with Last-writer-wins• CRDT’s allow using values and not causal-histories to

merge• State vs. Operation-based CRDT

Page 17: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Event counting with CRDT’s

• Simple growth counter (increment operations arecommutative)• All writes (updates) made only to local copy/version• Each replica maintains a vector of values V [1..N]

• Similar to vector clock timestamps• Value of the counter is sum of all local values (

∑V )

• Merge is pairwise max of the two vectors

Page 18: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Event counting with CRDT’s

• Simple growth counter (increment operations arecommutative)• All writes (updates) made only to local copy/version• Each replica maintains a vector of values V [1..N]

• Similar to vector clock timestamps• Value of the counter is sum of all local values (

∑V )

• Merge is pairwise max of the two vectors

Page 19: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

General Counters with CRDT’s

• CRDT’s work when the states are monotonically increasing• Value of object itself used for merging, no timestamps needed• What if we want to support increment and decrement ops?

• State (value) can decrease, and hence not monotonic!

• Maintain two monotonically increasing counters• One (P) for increments, and another (N) for decrements• Value is

∑P −

∑N

Page 20: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

General Counters with CRDT’s

• CRDT’s work when the states are monotonically increasing• Value of object itself used for merging, no timestamps needed• What if we want to support increment and decrement ops?

• State (value) can decrease, and hence not monotonic!

• Maintain two monotonically increasing counters• One (P) for increments, and another (N) for decrements• Value is

∑P −

∑N

Page 21: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Sets with CRDT’s

• Parition-tolerant sets: One each for added and deleted items• Actual set is Added−Deleted.• Can prune sets by applying delete operations, when system isunpartitioned

• Deep result: Combining two CRDT’s results is still a CRDT• CRDT’s can be combined to form complex CRDT’s

• Sets, maps, counters, graphs, sequences• JSON objects

• Popular usecase: Collaborative document editing

Page 22: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Sets with CRDT’s

• Parition-tolerant sets: One each for added and deleted items• Actual set is Added−Deleted.• Can prune sets by applying delete operations, when system isunpartitioned

• Deep result: Combining two CRDT’s results is still a CRDT• CRDT’s can be combined to form complex CRDT’s

• Sets, maps, counters, graphs, sequences• JSON objects

• Popular usecase: Collaborative document editing

Page 23: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

Ordered Lists with CRDT’s

• Attach a unique location-id=(index, node-id) to each element• “Hello” → H:0a, e:1a, l:2a,...

• Operation-based: Add/delete operations specified relative tolocation• Insert “!” after 4a

• Merge lists by replaying operations• If conflict, skip over existing list elements with greater ID

Page 24: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

References

• https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed• Amazon Dynamo paper• Werner Vogels:https://www.infoq.com/presentations/availability-consistency• https://jvns.ca/blog/2016/11/19/a-critique-of-the-cap-theorem/• Martin Kleppmann. A Critique of the CAP Theorem• CRDT papers by Marc Shapiro et.al.

Page 25: CAP Theorem, Eventual Consistency, and CRDT'shomes.sice.indiana.edu/prateeks/ds/lec16.pdf · Recap: Primarybasedprotocol Data store Primary server for item x Client Client Backup

END

Please attend the talk at 2.30 inLuddy 3166