Richard Low
[email protected] @acunu
@richardalow
Cassandra London Meetup, 5 Sept 2011
Fault tolerance in Cassandra
Tuesday, 6 September 2011
Menu
• Failure modes
• Maintaining availability
• Recovery
Failure modes
Failures are the norm
• With more than a few nodes, something goes wrong all the time
• Don’t want to be down all the time
Failure causes
• Hardware failure
• Bug
• Power
• Natural disaster
Failure modes
• Data centre failure
• Node failure
• Disk failure
• Temporary
• Permanent
Failure modes
• Network failure
• One node
• Network partition
• Whole data centre
Failure modes
• Operator failure
• Delete files
• Delete entire database
• Incorrect configuration
Failure modes
• Want a system that can tolerate all the above failures
• Make assumptions about probabilities of multiple events
• Be careful when assuming independence
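A minimal sketch of why the independence assumption matters. The probabilities used here are made-up illustrative numbers, not measurements, and the two function names are hypothetical. It compares the chance of losing every replica when nodes fail independently with the chance when a shared cause (for example a common power feed) can take out all replicas at once.

```python
def p_all_lost_independent(p_node: float, rf: int) -> float:
    """Probability that all rf replicas are down, assuming independent failures."""
    return p_node ** rf


def p_all_lost_shared_cause(p_node: float, rf: int, p_shared: float) -> float:
    """Same, but with probability p_shared a common cause kills every replica.

    If the common cause does not fire, nodes are assumed to fail independently.
    """
    return p_shared + (1 - p_shared) * p_node ** rf


if __name__ == "__main__":
    # Hypothetical numbers: 1% chance a node is down, 0.1% chance of a shared outage.
    print(p_all_lost_independent(0.01, 3))
    print(p_all_lost_shared_cause(0.01, 3, 0.001))
```

With these assumed numbers the shared cause dominates: the independent estimate is about three orders of magnitude too optimistic.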
Solutions
• Do nothing
• Make boxes bulletproof
• Replication
Availability
How do we maintain availability in the presence of failure?
Replication
• Buy cheap nodes and cheap disks
• Store multiple copies of the data
• Don’t care if some disappear
Replication
• What about consistency?
• What if I can’t tolerate out-of-date reads?
• How do we restore a replica?
RF and CL
• Replication factor
• How many copies
• How much failure we can tolerate
• Consistency Level
• How many nodes must be contactable for an operation to succeed
Simple example
• Replication factor 3
• Uniform network topology
• Read and write at CL.QUORUM
• Strong consistency
• Available if any one node is down
• Can recover if any two nodes fail
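The strong consistency claim for this example can be checked by brute force: with RF 3, any two replicas that acknowledged a QUORUM write must share at least one node with any two replicas answering a QUORUM read. The sketch below is an illustration only; the helper names are hypothetical and replicas are modelled as plain integers.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Number of replicas in a quorum: a strict majority of rf."""
    return rf // 2 + 1


def always_overlap(rf: int, write_count: int, read_count: int) -> bool:
    """True if every possible write set shares a replica with every read set."""
    replicas = range(rf)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, write_count)
        for read in combinations(replicas, read_count)
    )


if __name__ == "__main__":
    rf = 3
    print(quorum(rf))                                  # quorum size for RF 3
    print(always_overlap(rf, quorum(rf), quorum(rf)))  # QUORUM writes, QUORUM reads
    print(always_overlap(rf, 1, 1))                    # one replica each way
```

Reading and writing a single replica each can miss the latest write, which is why the slide pairs QUORUM reads with QUORUM writes.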
In general
• RF N, reads and writes at CL.QUORUM
• Available if ceil(N/2)-1 nodes fail
• Can recover if N-1 nodes fail
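The general formulas can be written out as a small calculation. This is a sketch with hypothetical function names; it derives the tolerable failures from the quorum size (a strict majority of N) and checks that the result matches the ceil(N/2)-1 expression above.

```python
import math


def quorum_size(rf: int) -> int:
    """Replicas that must respond at CL.QUORUM: a strict majority of rf."""
    return rf // 2 + 1


def max_failures_while_available(rf: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return rf - quorum_size(rf)


def max_failures_recoverable(rf: int) -> int:
    """Replicas that can be lost while at least one copy of the data survives."""
    return rf - 1


def slide_formula(rf: int) -> int:
    """The ceil(N/2)-1 expression, for comparison."""
    return math.ceil(rf / 2) - 1


if __name__ == "__main__":
    for rf in range(1, 8):
        print(rf, quorum_size(rf), max_failures_while_available(rf),
              max_failures_recoverable(rf))
```

Note that going from an odd RF to the next even RF raises the quorum size without tolerating any more failures.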
Multi data centre
• Cassandra knows location of hosts
• Through the snitch
• Can ensure replicas in each DC
• NetworkTopologyStrategy
• => can cope with whole DC failure
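A rough arithmetic sketch of what losing a whole data centre means for replica counts. The data centre names and per-DC replica counts are invented for illustration, and the functions are hypothetical helpers, not Cassandra APIs. It counts the replicas left after one DC fails and checks whether a quorum over all replicas is still reachable.

```python
def replicas_left(replicas_per_dc: dict, failed_dc: str) -> int:
    """Replicas still reachable once every node in failed_dc is gone."""
    return sum(n for dc, n in replicas_per_dc.items() if dc != failed_dc)


def data_survives(replicas_per_dc: dict, failed_dc: str) -> bool:
    """At least one copy remains outside the failed data centre."""
    return replicas_left(replicas_per_dc, failed_dc) >= 1


def quorum_reachable(replicas_per_dc: dict, failed_dc: str) -> bool:
    """A majority of all replicas (across every DC) is still reachable."""
    total = sum(replicas_per_dc.values())
    return replicas_left(replicas_per_dc, failed_dc) >= total // 2 + 1


if __name__ == "__main__":
    two_dcs = {"DC1": 3, "DC2": 3}            # assumed layout
    three_dcs = {"DC1": 3, "DC2": 3, "DC3": 3}
    print(data_survives(two_dcs, "DC1"), quorum_reachable(two_dcs, "DC1"))
    print(data_survives(three_dcs, "DC1"), quorum_reachable(three_dcs, "DC1"))
```

Under these assumptions, two data centres keep the data safe but cannot serve a cluster-wide quorum after one is lost, so which consistency level stays available depends on how replicas are spread.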
Recovery
Recovery
• Want to maintain replication factor
• Ensures recovery guarantees
• Methods:
• Automatic
• Manual
Automatic
Automatic processes
• Eventually move replicas towards consistency
• The ‘eventual’ in ‘eventual consistency’
Hinted Handoff
• Hints
• Stored on any node
• When a node is temporarily unavailable
• Delivered when the node comes back
• Can use CL.ANY
• Writes not immediately readable
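The idea of hinted handoff can be sketched with a toy in-memory model. This is not Cassandra's implementation: the Node class and both functions are hypothetical, hints are kept on the coordinator for simplicity, and timestamps and hint expiry are ignored.

```python
class Node:
    """Toy node: a key/value store plus hints held on behalf of down nodes."""

    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # list of (target_node, key, value)


def write(coordinator: Node, replicas: list, key: str, value: str) -> int:
    """Write to every live replica and store a hint for each one that is down.

    Returns the number of replicas that actually applied the write.
    """
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
            acks += 1
        else:
            coordinator.hints.append((replica, key, value))
    return acks


def deliver_hints(holder: Node) -> None:
    """Replay stored hints to targets that are back up; keep the rest."""
    remaining = []
    for target, key, value in holder.hints:
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    c.up = False
    print(write(a, [a, b, c], "k", "v"))  # replicas that applied the write
    c.up = True
    deliver_hints(a)
    print(c.data)
```

If every replica is down, the write produces only hints and zero acknowledgements, which mirrors why a write accepted at CL.ANY is not immediately readable.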
Read Repair
• Since we have done a read, might as well repair any old copies
• Compare values, update any out of sync
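A minimal sketch of the compare-and-update step, assuming each replica is a dict mapping a key to a (timestamp, value) pair and that the highest timestamp wins. The function name is hypothetical, and the sketch contacts every replica, which a real read need not do.

```python
def read_with_repair(replicas: list, key: str):
    """Return the newest value for key and push it to any stale replica."""
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None
    # The version with the highest timestamp is treated as current.
    newest = max(versions, key=lambda version: version[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair a missing or out-of-date copy
    return newest[1]


if __name__ == "__main__":
    r1 = {"k": (2, "new")}
    r2 = {"k": (1, "old")}
    r3 = {}
    print(read_with_repair([r1, r2, r3], "k"))
    print(r2, r3)
```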
Manual
Repair: method
• Ensures a node is up to date
• Run ‘nodetool -h <node> repair’
• Reads through all the data on the node
• Builds a Merkle tree
• Compares with replicas
• Streams differences
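The build, compare and stream steps can be illustrated with a small hash tree. This is a simplified sketch, not how repair is implemented: rows are bucketed by a hash of the key rather than by token range, the bucket count must be a power of two, and all function names are hypothetical. Only the buckets whose hashes differ would need to be streamed.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, buckets: int) -> int:
    """Assign a key to a leaf bucket (stand-in for a token range)."""
    return int.from_bytes(h(key.encode())[:4], "big") % buckets


def leaf_hashes(rows: dict, buckets: int) -> list:
    """One hash per bucket, covering every row that falls in that bucket."""
    groups = [[] for _ in range(buckets)]
    for key in sorted(rows):
        groups[bucket_of(key, buckets)].append(f"{key}={rows[key]}")
    return [h("\n".join(group).encode()) for group in groups]


def build_tree(leaves: list) -> list:
    """Return the tree as a list of levels: leaves first, root last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_leaves(tree_a: list, tree_b: list) -> list:
    """Descend from the root, following only subtrees whose hashes differ."""
    stack = [(len(tree_a) - 1, 0)]
    out = []
    while stack:
        level, i = stack.pop()
        if tree_a[level][i] == tree_b[level][i]:
            continue  # identical subtree, nothing to stream
        if level == 0:
            out.append(i)
        else:
            stack.extend([(level - 1, 2 * i), (level - 1, 2 * i + 1)])
    return sorted(out)


if __name__ == "__main__":
    local = {"a": "1", "b": "2", "c": "3"}
    remote = dict(local, b="stale")
    print(differing_leaves(build_tree(leaf_hashes(local, 8)),
                           build_tree(leaf_hashes(remote, 8))))
```

Comparing roots first means two replicas that already agree exchange a single hash rather than their full contents.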
Repair: when
• After a node has been down for a long time
• After increasing replication factor
• At least every 10 days (the default gc_grace_seconds) to ensure tombstones are propagated
• Can be used to restore a failed node
Replace a node: method
• Bootstrap a new node with token <old_token>-1
• Tell the existing nodes that the old node is dead
• nodetool removetoken <old_token>
Replace a node: when
• Complete node failure
• Cannot replace failed disk
• Corruption
Restore from backup: method
• Stop Cassandra on the node
• Copy SSTables from backup
• Restart Cassandra
• May take a while reading indexes
Restore from backup: when
• Disk failure
• with no RAID rebuild available
• Operator error
• Corruption
• Hacker
Thanks :)
@acunu @richardalow