ABOUT NETFLIX
NETFLIX
ACTIVE-ACTIVE
WHAT IS ACTIVE-ACTIVE?
Also called dual active, active-active describes a network of independent processing nodes in which each node has access to a replicated database. Traffic intended for a failed node is either passed on to an existing node or load balanced across the remaining nodes.
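The failover behavior in this definition can be illustrated with a minimal sketch. It is not Netflix's routing layer: the node names (node-a, node-b, node-c), the health map, and the round-robin choice by request number are all hypothetical, chosen only to show traffic for a failed node being spread across the remaining nodes.

```python
from typing import Dict, List


def healthy_nodes(health: Dict[str, bool]) -> List[str]:
    """Return the names of nodes currently marked healthy, in sorted order."""
    return sorted(name for name, ok in health.items() if ok)


def route(preferred: str, health: Dict[str, bool], request_id: int) -> str:
    """Pick a node for a request.

    Use the preferred node when it is healthy. Otherwise spread requests
    across the remaining healthy nodes by request_id (simple round robin).
    """
    if health.get(preferred, False):
        return preferred
    survivors = healthy_nodes(health)
    if not survivors:
        raise RuntimeError("no healthy nodes available")
    return survivors[request_id % len(survivors)]


# Hypothetical cluster state: node-a has failed, the other two are up.
health = {"node-a": False, "node-b": True, "node-c": True}

# Requests that would have gone to node-a alternate between the survivors.
targets = [route("node-a", health, i) for i in range(4)]
print(targets)
```

Because every node has access to the replicated database, any surviving node can serve the redirected requests; the sketch only decides where they go.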
WHY ACTIVE-ACTIVE?
ENTERPRISE IT SOLUTIONS
WEB-SCALE CLOUD SOLUTIONS
RAPID SCALING
HIGH AVAILABILITY
DOES AN INSTANCE FAIL?
• It can, plan for it
• Bad code / configuration pushes
• Latent issues
• Hardware failure
• Test with Chaos Monkey
DOES A ZONE FAIL?
• Rarely, but it has happened before
• Routing issues
• DC-specific issues
• App-specific issues within a zone
• Test with Chaos Gorilla
DOES A REGION FAIL?
• Full region: unlikely, very rare
• Individual services can fail region-wide
• Most likely cause: a region-wide configuration issue
• Test with Chaos Kong
EVERYTHING FAILS… EVENTUALLY
• Keep your services running by embracing isolation and redundancy
• Construct a highly agile and highly available service from ephemeral and assumed-broken components
ISOLATION
• Changes in one region should not affect others
• Regional outage should not affect others
• Network partitioning between regions should not affect functionality / operations
REDUNDANCY
• Make more than one (of pretty much everything)
• Specifically, distribute services across Availability Zones and regions
HISTORY: X-MAS EVE 2012
• Netflix multi-hour outage
• US-East-1 regional Elastic Load Balancing issue
• “...data was deleted by a maintenance process that was inadvertently run against the production ELB state data”
ACTIVE-ACTIVE ARCHITECTURE
THE PROCESS
IDENTIFYING CLUSTERS FOR AA
SNITCH CHANGES
• Ec2Snitch: uses private IPs
• Ec2MultiRegionSnitch: uses public IPs
PRIAM.MULTIREGION.ENABLE = TRUE
• storage_port: uses private IPs
• ssl_storage_port: uses public IPs
SPIN UP NODES IN NEW REGION
[Diagram: APP, us-east-1, us-west-2]
UPDATE KEYSPACE
Update keyspace <keyspace> with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {us-east : 3, us-west-2 : 3};
• us-east : 3 is the existing region and its replication factor
• us-west-2 : 3 is the new region and its replication factor
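The statement above uses the older cassandra-cli syntax. As a hedged sketch, the same change can be expressed as a CQL ALTER KEYSPACE statement; the helper below only builds the statement text from a map of datacenter name to replication factor. The helper name alter_keyspace_cql and the keyspace name my_keyspace are hypothetical, and the statement would still have to be run through cqlsh or a driver.

```python
from typing import Dict


def alter_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CQL ALTER KEYSPACE statement using NetworkTopologyStrategy."""
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    # One 'datacenter': replication_factor pair per region, in insertion order.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Existing region (us-east) plus the new region (us-west-2), three replicas each.
statement = alter_keyspace_cql("my_keyspace", {"us-east": 3, "us-west-2": 3})
print(statement)
```

Keeping the existing region in the map matters: the statement replaces the whole replication setting, so omitting us-east would drop its replicas from the definition.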
REBUILD NEW REGION
Run nodetool rebuild us-east-1 on all us-west-2 nodes
RUN NODETOOL REPAIR
VALIDATION
BENCHMARKING GLOBAL CASSANDRA
WRITE-INTENSIVE TEST OF CROSS-REGION REPLICATION CAPACITY
• 16 x hi1.4xlarge SSD nodes per zone = 96 total
• 192 TB of SSD in six locations, up and running Cassandra in 20 minutes
[Diagram: Cassandra replicas in Zones A, B, and C of US-West-2 (Oregon) and in Zones A, B, and C of US-East-1 (Virginia), with test load, validation load, interzone traffic, and interregional traffic]
• 1 million writes at CL.ONE (wait for one replica to ack)
• 1 million reads after 500 ms at CL.ONE, with no data loss
• Interregional traffic: up to 9 Gbit/s, 83 ms
• 18 TB backup restored from S3 using Priam
TEST FOR THUNDERING HERD
TEST FOR RETRIES
FAILURE RETRY
KEY METRICS USED
• 99th / 95th percentile read latency (client and C*)
• Dropped metrics on C*
• Exceptions on C*
• Heap usage on C*
• Threads pending on C*
CONFIGURATION FOR TEST
• 24-node C* cluster on SSDs
• 220 client instances
• 70+ JMeter instances
C* IOPS
TOTAL READ IOPS
TOTAL WRITE IOPS
95th LATENCY
99th LATENCY
CHECK FOR CEILING
NETWORK PARTITION
us-east-1 us-west-2
TAKEAWAYS
REPAIRS AFTER EXTENSION ARE PAINFUL!
TIME TO REPAIR DEPENDS ON
• Number of regions
• Number of replicas
• Data size
• Amount of entropy
ADJUST GC_GRACE AFTER EXTENSION
• Column family setting
• Defined in seconds
• Default: 10 days (864,000 seconds)
• Tweak gc_grace settings to accommodate the time taken to repair
• BEWARE of deleted columns
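The gc_grace arithmetic can be sketched as follows. Tombstones for deleted columns become eligible for removal once gc_grace has elapsed, so if a repair runs longer than that, deleted data can reappear. The helper name, the 2-day safety margin, and the rule of never going below the default are assumptions made for illustration, not a recommendation from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DEFAULT_GC_GRACE_SECONDS = 10 * SECONDS_PER_DAY  # default of 10 days


def gc_grace_for_repair(repair_days: float, margin_days: float = 2.0) -> int:
    """Return a gc_grace value in seconds that covers a repair plus a margin.

    The result never drops below the 10-day default (an assumed policy).
    """
    needed = int((repair_days + margin_days) * SECONDS_PER_DAY)
    return max(DEFAULT_GC_GRACE_SECONDS, needed)


# A 5-day repair fits inside the default; a 14-day repair does not.
short_repair = gc_grace_for_repair(5)
long_repair = gc_grace_for_repair(14)
print(short_repair, long_repair)
```

The value is set per column family, so each column family being repaired after the extension needs its own check.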
RUNBOOK
PLAN FOR CAPACITY
CONSISTENCY LEVEL
• Check the client for its consistency level setting
• In a multi-region cluster, QUORUM is not the same as LOCAL_QUORUM
• Recommended consistency levels: LOCAL_ONE (CASSANDRA-6202) for reads and LOCAL_QUORUM for writes
• For region resiliency, avoid ALL or QUORUM calls
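A short worked example shows why QUORUM and LOCAL_QUORUM differ once the cluster spans two regions. It assumes the replication settings used earlier in the deck (three replicas in us-east and three in us-west-2) and the usual quorum rule of floor(replicas / 2) + 1; the function names are hypothetical.

```python
from typing import Dict


def quorum(replicas: int) -> int:
    """Replicas that must respond for a quorum: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


def quorum_sizes(replicas_per_dc: Dict[str, int], local_dc: str) -> Dict[str, int]:
    """Compare the cluster-wide QUORUM with the LOCAL_QUORUM of one region."""
    total = sum(replicas_per_dc.values())
    return {
        "QUORUM": quorum(total),
        "LOCAL_QUORUM": quorum(replicas_per_dc[local_dc]),
    }


replicas = {"us-east": 3, "us-west-2": 3}
sizes = quorum_sizes(replicas, "us-east")
print(sizes)

# If us-west-2 is unreachable, only the 3 local replicas can answer.
reachable = replicas["us-east"]
quorum_ok = reachable >= sizes["QUORUM"]
local_quorum_ok = reachable >= sizes["LOCAL_QUORUM"]
print(quorum_ok, local_quorum_ok)
```

With six replicas in total, QUORUM needs four responses, which a single region with three replicas cannot supply, while LOCAL_QUORUM needs only two local responses. ALL fails for the same reason, since it requires every replica in both regions.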
HOW DO WE KNOW IT WORKS? CREATE CHAOS!
Benchmark… Time-consuming, but worth it!