Designing Resilient Application Platforms with Apache Cassandra - Hayato Shimizu (DataStax)
DESCRIPTION
Presented at JAX London 2013. All too often I have observed infrastructure designs for deploying Java applications come as an afterthought for businesses, technical analysts, and application developers. Technology choices are frequently made without the final deployment infrastructure being discussed. The talk covers the design considerations for building resilient applications and application deployment platforms across multiple data centres, and how organisations can leverage technologies such as Apache Cassandra to achieve this.
TRANSCRIPT
Building Highly Available Services Using Cassandra
USE jax_london;
SELECT * FROM presenters WHERE name = 'Hayato Shimizu';

 name           | title               | company  | area
----------------+---------------------+----------+------
 Hayato Shimizu | Solutions Architect | DataStax | EMEA
Apache Cassandra
• Created by Avinash Lakshman and Prashant Malik at Facebook
• Amazon Dynamo + Google BigTable
• Highly distributed database with data replication for redundancy
• Active-Active Multi DC, master-less design - no single point of failure
• High throughput!
• Linearly scalable - volume, throughput
• Used by many mission critical applications and services
• 2.0 is out!
©2013 DataStax Confidential. Do not distribute without consent. 2
C* Architecture – Data Replication
• Token Range 0 -> 2^127 - 1 in Ring Formation
• Consistent Hashing Algorithm
• Replica nodes placed clockwise around the ring
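The ring layout on this slide can be illustrated with a small sketch. It is a simplification, not Cassandra's actual partitioner: the key is hashed with MD5 and folded into the 0 to 2^127 - 1 range, the owner is the first node whose token is at or after the key's token, and further replicas are the next nodes clockwise. The names `token_for`, `replicas_for`, and the four-node `ring` are hypothetical and used only for illustration.

```python
import hashlib
from bisect import bisect_left

TOKEN_SPACE = 2 ** 127  # tokens run from 0 to 2**127 - 1, as on the slide


def token_for(key):
    """Map a partition key to a token (simplified: MD5 folded into the token range)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for(key, ring, replication_factor):
    """Return the nodes holding `key`: the owner plus the next nodes clockwise."""
    tokens = sorted(ring)
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, token_for(key)) % len(tokens)
    count = min(replication_factor, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


# Hypothetical four-node ring with evenly spaced tokens.
ring = {i * (TOKEN_SPACE // 4): "node%d" % (i + 1) for i in range(4)}
print(replicas_for("Hayato Shimizu", ring, replication_factor=3))
```

With a replication factor of 3 on four nodes, every key lands on three consecutive nodes of the ring, so losing one node still leaves two copies.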
C* Architecture - No Single Point of Failure
• The client load balances across nodes
• Do not use a hardware load balancer
C* Architecture - Multi DC Replication
C* Architecture – Data Consistency
• C* offers TUNABLE consistency
• Client decides consistency per query
• ANY, ONE, TWO, THREE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL
• QUORUM = (replication_factor / 2) + 1, rounded down to a whole number
• Replication Factor = 3 can maintain Quorum with tolerance of 1 node failure
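The quorum arithmetic can be checked with a few lines of Python. This is only a worked example of the formula on the slide, using integer division; the helper names `quorum` and `tolerated_failures` are made up for illustration.

```python
def quorum(replication_factor):
    """QUORUM = (replication_factor / 2) + 1, using integer division."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print("RF=%d: quorum=%d, tolerated failures=%d"
          % (rf, quorum(rf), tolerated_failures(rf)))
```

Note that a replication factor of 2 needs both replicas for a quorum, so it tolerates no failures at that consistency level, which is one reason a replication factor of 3 is a common starting point.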
Setting Up Cassandra for Multi DC
On each node, edit the following file:
conf/cassandra-rackdc.properties
With the following entries:
dc=DC1
rack=RACK1
On each node, edit the following file:
conf/cassandra.yaml
With the following entry:
endpoint_snitch: GossipingPropertyFileSnitch
Create the keyspace:
CREATE KEYSPACE new_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3};
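For the keyspace above, with three replicas in each of DC1 and DC2, the number of replica acknowledgements each quorum-style consistency level needs can be worked out as follows. This is a sketch of the arithmetic only; the function names are hypothetical and nothing here talks to a cluster.

```python
# Replication settings taken from the CREATE KEYSPACE statement above.
replication = {"DC1": 3, "DC2": 3}


def quorum(replica_count):
    return replica_count // 2 + 1


def acks_for_quorum(replication):
    """QUORUM counts replicas across all data centres together."""
    return quorum(sum(replication.values()))


def acks_for_local_quorum(replication, local_dc):
    """LOCAL_QUORUM only counts replicas in the coordinator's data centre."""
    return quorum(replication[local_dc])


def acks_for_each_quorum(replication):
    """EACH_QUORUM needs a quorum in every data centre."""
    return {dc: quorum(rf) for dc, rf in replication.items()}


print("QUORUM:", acks_for_quorum(replication))
print("LOCAL_QUORUM in DC1:", acks_for_local_quorum(replication, "DC1"))
print("EACH_QUORUM:", acks_for_each_quorum(replication))
```

QUORUM needs 4 of the 6 replicas and so always involves the remote data centre, while LOCAL_QUORUM needs only 2 replicas in the local one. That difference matters for the latency and WAN-failure discussion later in the talk.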
C* Architecture – Data Centre Configuration
[Diagram: Data 1 to Data 4, each stored as three replicas]
Cassandra Architecture - Writes
[Diagram: INSERT INTO… -> Commit log and memtable -> SSTable]
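The write path in the diagram can be modelled with a toy class. This is a deliberately simplified sketch, not how Cassandra is implemented: `ToyNode` and `memtable_limit` are hypothetical, everything lives in memory, and real commit logs are segmented files rather than a list.

```python
class ToyNode:
    """Toy model of a write: commit log first, then memtable, later an SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []  # append-only record of writes, for durability
        self.memtable = {}    # in-memory view of recent writes
        self.sstables = []    # immutable flushed tables, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the commit log
        self.memtable[key] = value            # 2. update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush when the memtable is full

    def flush(self):
        # Write the memtable out as a sorted, immutable table.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        # Simplification: flushed data no longer needs its commit log entries.
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable wins
            if key in table:
                return table[key]
        return None


node = ToyNode()
node.write("a", 1)
node.write("b", 2)  # reaches the limit and triggers a flush
node.write("a", 3)  # newer value lives in the memtable
print(node.read("a"), node.read("b"), node.sstables)
```

Writes are sequential appends plus an in-memory update, which is one reason the write path supports the high throughput mentioned earlier in the deck.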
Tooling
• Cassandra Download (http://planetcassandra.org)
• DataStax Enterprise Download (http://www.datastax.com/download)
• DataStax Java Driver (http://github.com/datastax/java-driver)
• DataStax DevCenter (http://www.datastax.com)
Building Highly Available Services
Single Data Centre
• Resiliency through C* Data Replication
Multi DC – Active/Passive
• Wasteful
• Do you test this? Does it actually work when it fails over?
• What is the decision point for failing over?
• Do you try to fix your problem in the active DC?
• Is it a manual process?
• How long does it take to fail over to the passive DC?
• How many people and which departments will need to be involved?
• Incident managers?
Active-Everywhere is the Norm
Datacenter
Cloud
Source: http://www.datastax.com/resources/whitepapers/bigdata
Design Considerations - Active-Everywhere DC Strategies
• 24 x 7 services are what businesses and consumers now expect
• Service failure costs money and damages reputation
• 99.999+% service uptime?
• Data Replication Strategies
  • Consistent data replication across all DCs
  • Eventually consistent replication across DCs
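To put the 99.999% uptime target in perspective, the yearly downtime budget it allows can be computed directly. This is plain arithmetic, assuming a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent):
    """Minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print("%s%% -> %.2f minutes/year"
          % (target, downtime_minutes_per_year(target)))
```

Five nines leaves roughly five minutes of downtime per year in total, which is far less than a typical manual failover takes.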
Design Considerations - Data Replication Strategies
• Latency is not going away - embrace it
• Possible Solutions
  • Sharded users
  • Full data consistency in all DCs
  • Eventual consistency to other DCs
Design Considerations - Full Consistency Across All Data Centres
• Does your service really require this?
• Performance considerations
• Think about your service usage patterns
• Failure scenarios
  • WAN link failure
Design Considerations - Eventual Consistency Across DCs
• Identify data access patterns for each service
• Data access patterns
  • Write-Only
  • Read-Only
  • Mixture of both
  • Access frequency
Design Considerations - Failure Scenarios
• Data centre total failure - natural disaster, power, etc.
• Network storm
• Network kit firmware upgrade failure
• SAN upgrades - wrong Fibre Channel cable pulled out
• WAN link failure
• Service dependency failure
• Etc.
• Failure probabilities - do your maths!
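One way to "do the maths" is to estimate how likely it is that every data centre is down at once. The sketch below assumes data centre failures are independent, which is a strong assumption: shared dependencies or a correlated event can break it. The per-DC availability figure is a made-up input, not a measurement.

```python
def combined_availability(per_dc_availability, dc_count):
    """Probability that at least one DC is up, assuming independent failures."""
    probability_all_down = (1 - per_dc_availability) ** dc_count
    return 1 - probability_all_down


# Hypothetical figure: each DC is available 99.9% of the time.
for dcs in (1, 2, 3):
    print("%d DC(s): %.7f%%" % (dcs, combined_availability(0.999, dcs) * 100))
```

Under these assumptions, two active data centres at three nines each combine to roughly six nines, but only if the service can keep serving from whichever one remains, which is the case for an active-everywhere design.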
User Session Persistence to One DC
[Diagram: DC1 (C*, Service, Session 1) and DC2 (C*, Service, Session 2), linked by async replication]
DC Session Persistence Technique 1
• GTM - Global Traffic Management
• DNS based solution
• Hardware / SaaS solutions
• Traffic weighting for each DC
• Persistence guaranteed in private network using hardware
• Internet facing slightly more difficult - DNS RFC spec
DC Session Persistence Technique 2
• A famous company providing edge based load balancing
• Users connect via their service
• Cookie / query string based
[Diagram: Edge Load Balancer connected over https to DC1 and DC2, with async replication between the DCs]
DC Session Persistence Technique 3
Application Tier Resilience
• Make it fault tolerant - stateless
• Make it horizontally scalable
• Load balancer stickiness - really?
• Use C* to store sessions - sessions will recover in a DR scenario
[Diagram: App Tier instances sharing Session1 through Cassandra replication]
Seamless Application Releases & System Maintenance
• 99.999+% SLA includes maintenance!
• C* rolling upgrades
• Kernel patching etc.
• Schema changes - C* will help
• Code should now handle the data structure versions
• Code deployment - statelessness will help here again!
Business Intelligence
Embracing the Cloud
• High demand can kill your service - make it scalable
• Bursting into the cloud for peak load
• Flexible provisioning model
• DR on the cheap
Conclusions
Conclusion
• Developers - think about your infrastructure. Don't just leave it to the Ops or DevOps teams.
• Ops / DevOps Engineers - think about the applications and learn how they work.
• Collaborate with each other.
• Building out resilient infrastructure is not that hard. It just requires some thought, communication, and execution.
• Think about scale.
• Keep IT Simple
• Use great tools like Cassandra
Thank You
Twitter @hayato_shimizu @datastaxEU @planetcassandra @cassandraEUROPE
Downloads http://planetcassandra.org http://www.datastax.com http://cassandra.apache.org
Thank You
Q&A