Apache Cassandra: building a production app on an eventually-consistent DB
Oliver Lockwood
Prague, 20-21 October 2016
Agenda
• Brief introduction to Cassandra
• Gotchas when using an eventually-consistent DB
• Performing DB schema and data evolution in Cassandra for a production app
Introduction to Cassandra: What it is, and what it’s good for
• NoSQL database
• Distributed architecture with no “master” – highly scalable and resilient
• Write-optimised
• Eventual consistency
http://www.datastax.com/dbas-guide-to-nosql
Introduction to Cassandra: How storage, reads, writes and conflict resolution work
• Replication factor = how many copies
• Replication strategy determines storage location
• Contact points used initially
• Client connection is to cluster
• Co-ordinator could be any node (based on load balancing policy)
• Storage is independent of co-ordinator
• Last Write Wins for conflicts
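To make the replication bullets concrete, here is a minimal sketch of how a key might be mapped to its replicas on a ring. It is a simplification for illustration only: the node names, the `token` and `replicas_for` helpers, and the use of MD5 are all assumptions (Cassandra's default partitioner is Murmur3, and real clusters use virtual nodes and pluggable replication strategies).

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (illustrative hash, not Cassandra's)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def replicas_for(key, nodes, replication_factor):
    """Walk the ring clockwise from the key's token, picking distinct nodes."""
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    # First node at or after the key's token, wrapping round the ring.
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node1", "node2", "node3", "node4"]
owners = replicas_for("user:42", nodes, replication_factor=3)
print(owners)  # three distinct nodes; which ones depends on the hash
```

Note that the result depends only on the key and the ring, not on which node the client happened to contact, which is the sense in which storage is independent of the co-ordinator.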
http://www.slideshare.net/DataStax/understanding-data-consistency-in-apache-cassandra
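"Last Write Wins" can be sketched in a few lines. The `Cell` type and `resolve` function below are hypothetical names used for illustration, and tie-breaking between equal timestamps is deliberately simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: object   # None stands in for a delete marker (tombstone)
    timestamp: int  # microseconds since the epoch


def resolve(a, b):
    """Last Write Wins: keep whichever cell carries the higher timestamp.

    Equal timestamps are resolved arbitrarily here (first argument wins);
    this sketch does not model Cassandra's real tie-breaking rules.
    """
    return a if a.timestamp >= b.timestamp else b


from_client_1 = Cell("alice@example.com", timestamp=1_000_000)
from_client_2 = Cell("alice@example.org", timestamp=1_000_005)
winner = resolve(from_client_1, from_client_2)
print(winner.value)  # alice@example.org
```

The order in which replicas receive the two writes does not matter; only the timestamps do.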
[Diagram labels: Client, Client 2]
Introduction to Cassandra: What it’s not good for
http://planetcassandra.org/blog/flite-breaking-down-the-cql-where-clause/
Gotchas: Lessons we learned the hard way
• Distributed nature of Cassandra depends on synchronized clocks
• What happens if clocks drift?
• INSERT, DELETE, READ from a single client.
• What if Node 3’s clock is slow?
https://blog.logentries.com/2014/03/synchronizing-clocks-in-a-cassandra-cluster-pt-1-the-problem/
http://datascale.io/how-to-create-a-cassandra-cluster-in-aws/
[Diagram: Client sends (1) INSERT, then (2) DELETE]
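The slow-clock scenario can be simulated without a cluster. The sketch below assumes server-side timestamps, i.e. the co-ordinator stamps each write with its own clock; the `Node` class, the 50 ms offset and the in-memory "row" are all made up for illustration.

```python
class Node:
    """A co-ordinator whose local clock may be offset from true time."""

    def __init__(self, name, clock_offset_us=0):
        self.name = name
        self.clock_offset_us = clock_offset_us

    def stamp(self, true_time_us, value):
        # The co-ordinator's own clock decides the write's timestamp.
        return (true_time_us + self.clock_offset_us, value)


TOMBSTONE = None  # stand-in for a delete marker


def read(writes):
    """Last Write Wins: a read sees the write with the highest timestamp."""
    return max(writes, key=lambda w: w[0])[1]


def insert_then_delete(delete_coordinator):
    node1 = Node("node1")
    insert = node1.stamp(1_000_000, "row")                   # (1) INSERT
    delete = delete_coordinator.stamp(1_000_001, TOMBSTONE)  # (2) DELETE
    return read([insert, delete])


healthy = insert_then_delete(Node("node3"))
drifted = insert_then_delete(Node("node3", clock_offset_us=-50_000))
print(healthy)  # None: the delete wins, as the client intended
print(drifted)  # row: the delete is stamped "earlier" than the insert and loses
```

With Node 3 running 50 ms slow, the DELETE it co-ordinates carries an older timestamp than the INSERT, so the row appears to come back.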
Gotchas: Lessons we learned the hard way
Demo!
http://stackoverflow.com/questions/17474830/configuring-cassandra-with-private-ip-for-internode-communications
https://github.com/oliverlockwood/aws-ansible-cassandra
Gotchas: Lessons we learned the hard way - resolution
• Node 3’s clock is slow
• Use client-side timestamps? CQL protocol v3 supports this.
• Avoid time-sensitive query patterns
http://www.datastax.com/dev/blog/java-driver-2-1-2-native-protocol-v3
[Diagram: Client sends (1) INSERT, then (2) DELETE]
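One way to picture client-side timestamps is a generator on the client that never hands out the same or a lower value twice, so a DELETE issued after an INSERT always carries the higher timestamp regardless of which node co-ordinates it. The sketch below is a stand-alone illustration, not a driver API: `MonotonicTimestamps` is a hypothetical class, the `users` table is invented, and the statements are only built as strings using CQL's `USING TIMESTAMP` clause.

```python
import time


class MonotonicTimestamps:
    """Hands out strictly increasing microsecond timestamps (not thread-safe)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last = 0

    def next(self):
        now = int(self._clock() * 1_000_000)
        # If the clock has not advanced (or went backwards), bump by 1 microsecond.
        self._last = max(now, self._last + 1)
        return self._last


timestamps = MonotonicTimestamps()
insert_ts = timestamps.next()
delete_ts = timestamps.next()

# Hypothetical table and values; literals are inlined only to keep the sketch short.
insert_cql = (
    "INSERT INTO users (id, name) VALUES (42, 'oliver') "
    f"USING TIMESTAMP {insert_ts}"
)
delete_cql = f"DELETE FROM users USING TIMESTAMP {delete_ts} WHERE id = 42"
print(insert_cql)
print(delete_cql)
```

This only orders writes from a single client, which matches the INSERT, DELETE, READ scenario above; with several clients, their clocks still need to agree.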
Schema evolution in Cassandra: Introduction
• DB schemas evolve – accept it!
• Automation is better than manual processes
• For RDBMS: Flyway, Liquibase etc.
• For Cassandra… cqlmigrate!
https://flywaydb.org/
http://www.liquibase.org/
Schema evolution in Cassandra: Introducing cqlmigrate
https://github.com/sky-uk/cqlmigrate
http://developers.sky.com/internal/ovp/cassandra/schema/evolution/2016/07/05/cqlmigrate/
Schema evolution in Cassandra: Diving deeper into cqlmigrate
• Schema update operations are recorded, so each CQL file is applied only once
• Locking mechanism uses LWT to avoid race conditions
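The "applied only once" idea can be sketched as follows. This is not cqlmigrate's code: the `migrate` function, the in-memory `applied` dict (standing in for a tracking table) and the decision to fail when an already-applied file has changed are all assumptions made for illustration.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when a previously applied file no longer matches its record."""


def checksum(text):
    return hashlib.sha1(text.encode()).hexdigest()


def migrate(files, applied, execute):
    """Apply each (name, cql) pair in order, skipping files already recorded.

    `applied` maps file name -> checksum and stands in for a tracking table.
    Returns the names of the files that were actually run this time.
    """
    ran = []
    for name, cql in files:
        digest = checksum(cql)
        if name in applied:
            if applied[name] != digest:
                raise ChecksumMismatch(name)
            continue  # already applied, nothing to do
        execute(cql)
        applied[name] = digest
        ran.append(name)
    return ran


executed = []
applied = {}
files = [
    ("001_create_users.cql", "CREATE TABLE users (id int PRIMARY KEY, name text)"),
    ("002_add_email.cql", "ALTER TABLE users ADD email text"),
]
first_run = migrate(files, applied, executed.append)
second_run = migrate(files, applied, executed.append)
print(first_run)   # ['001_create_users.cql', '002_add_email.cql']
print(second_run)  # []
```

Running the same set of files on every application start-up is then safe, because the second and later runs are no-ops.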
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/past/03F/notes/paxos-simple.pdf
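A lock built on lightweight transactions relies on conditional writes such as `INSERT ... IF NOT EXISTS`, which succeed for exactly one contender. The sketch below imitates that behaviour in memory so that it runs without a cluster; the `LockTable` class, the lock name and the table layout in the comments are hypothetical, and a local mutex stands in for the Paxos round that Cassandra performs.

```python
import threading


class LockTable:
    """In-memory stand-in for a table written only with conditional statements."""

    def __init__(self):
        self._rows = {}
        self._mutex = threading.Lock()  # plays the role of the LWT's atomicity

    def insert_if_not_exists(self, name, client):
        # Imitates: INSERT INTO locks (name, client) VALUES (?, ?) IF NOT EXISTS
        with self._mutex:
            if name in self._rows:
                return False
            self._rows[name] = client
            return True

    def delete_if_owner(self, name, client):
        # Imitates: DELETE FROM locks WHERE name = ? IF client = ?
        with self._mutex:
            if self._rows.get(name) != client:
                return False
            del self._rows[name]
            return True


table = LockTable()
winners = []


def contend(client):
    if table.insert_if_not_exists("schema_migration", client):
        winners.append(client)


threads = [threading.Thread(target=contend, args=(f"client-{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(winners))  # 1: only one client may run the schema update
released = table.delete_if_owner("schema_migration", winners[0])
```

Several application instances starting at once would each try to take the lock, and only the winner applies the CQL files while the others wait or retry.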
Schema evolution in Cassandra: Diving deeper into cqlmigrate
Demo!
https://github.com/oliverlockwood/cqlmigrate-example-app
In conclusion: Takeaway menu
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/past/03F/notes/paxos-simple.pdf