Architecting for the Cloud: Hoping for the best, prepared for the worst
AWS Loft: Behind the scenes with Cotap
Design for automation
● AutoScalingGroups
● Hardware: CloudFormation
● Software: Configuration management
● Cattle not Cats
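As a minimal sketch of "Hardware: CloudFormation" combined with AutoScalingGroups, a template can be generated from code and kept under version control. The resource types below are real CloudFormation types; the logical names (`WebLaunchConfig`, `WebGroup`), the placeholder AMI ID, and the instance type are assumptions for illustration and are not taken from the talk. A newer setup would likely use a launch template instead of a launch configuration.

```python
import json


def build_template(ami_id, instance_type, min_size, max_size):
    """Return a CloudFormation template (as a dict) with one Auto Scaling group."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Hypothetical logical name: describes how each instance is launched.
            "WebLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {"ImageId": ami_id, "InstanceType": instance_type},
            },
            # Hypothetical logical name: keeps between min_size and max_size
            # instances alive and replaces any that fail.
            "WebGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "LaunchConfigurationName": {"Ref": "WebLaunchConfig"},
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                },
            },
        },
    }


# "ami-00000000" is a placeholder, not a real image.
template = build_template("ami-00000000", "t2.micro", 2, 4)
print(json.dumps(template, indent=2))
```

Because no instance is named or configured by hand, any one of them can be terminated and replaced by the group, which is the point of "Cattle not Cats".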
Monitoring & Alerting
● Cost of
  ○ Interruptions
  ○ Waking somebody up
● Channels
● Self-healing infrastructure
● External monitoring
● Page only when critical
Situation                          Channel                   Page
Disk full 60%                      Chat, Email               ✗
Disk full 90%                      Chat, Email, PagerDuty    ✓
Chef not running for > 30m         Chat, Email               ✗
Redis not running for > 3 x 5s     Chat, Email, PagerDuty    ✓
ElasticSearch N-1                  Chat, Email               ✗
ElasticSearch N-2                  Chat, Email, PagerDuty    ✓
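The escalation rules in the table can be sketched as plain routing functions: every alert goes to chat and email, and only the critical ones also page. The function names are hypothetical, and reading "> 3 x 5s" as "down for more than three 5-second checks" is an assumption, since the slide does not spell it out.

```python
CHAT_AND_EMAIL = ["Chat", "Email"]
PAGE = ["Chat", "Email", "PagerDuty"]
NO_ALERT = []


def disk_alert(percent_full):
    """60% full notifies chat and email; 90% full also pages."""
    if percent_full >= 90:
        return PAGE
    if percent_full >= 60:
        return CHAT_AND_EMAIL
    return NO_ALERT


def chef_alert(minutes_since_last_run):
    """A stale Chef run is worth knowing about, but not worth a page."""
    return CHAT_AND_EMAIL if minutes_since_last_run > 30 else NO_ALERT


def redis_alert(seconds_down):
    """Assumed reading of '> 3 x 5s': down for more than three 5-second checks."""
    return PAGE if seconds_down > 3 * 5 else NO_ALERT


def elasticsearch_alert(expected_nodes, live_nodes):
    """Losing one node (N-1) notifies; losing two or more (N-2) pages."""
    missing = expected_nodes - live_nodes
    if missing >= 2:
        return PAGE
    if missing == 1:
        return CHAT_AND_EMAIL
    return NO_ALERT


print(disk_alert(72))
print(disk_alert(93))
print(elasticsearch_alert(5, 3))
```

Keeping the paging threshold separate from the notification threshold is what keeps the cost of interruptions low: the team sees the early warning in chat, and somebody is woken up only when the situation is critical.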
Platform to fail
● Easy creation of temporary “Stacks”
● Branches can get their own hardware
● Clients can talk to a branch
● QA happens on Sandbox
● Exact copy of Production
● Scale up/down based on needs
● Different Region (us-east-1)
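One way to picture temporary stacks per branch is a single set of stack parameters in which only the name and the sizing differ between environments. This is a sketch under assumptions: the helper names, the `branch-` prefix, the instance types, and the group sizes are all hypothetical. The sanitizing step reflects the CloudFormation rule that stack names may contain only letters, digits, and hyphens, must start with a letter, and are limited to 128 characters.

```python
import re

# Hypothetical sizing: same keys everywhere, only the values differ.
ENVIRONMENTS = {
    "production": {"InstanceType": "c4.xlarge", "MinSize": 4, "MaxSize": 12},
    "sandbox": {"InstanceType": "t2.medium", "MinSize": 1, "MaxSize": 2},
}


def stack_name_for(branch, prefix="branch"):
    """Turn a git branch name into a valid CloudFormation stack name."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", branch).strip("-")
    return f"{prefix}-{slug}"[:128]


def stack_parameters(environment, branch=None):
    """Parameters for a long-lived environment, or for a temporary branch stack."""
    params = dict(ENVIRONMENTS[environment])
    params["StackName"] = environment if branch is None else stack_name_for(branch)
    return params


print(stack_parameters("production"))
print(stack_parameters("sandbox", branch="feature/new-login_flow"))
```

Since a branch stack is built from the same description as Production, it can be thrown away after QA and recreated at will.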
Disaster Recovery
● Multi-AZs
● Traffic routing
● Multi-Regions (S3 too)
● AutoScalingGroups Min:1 Max:1
● Off-site backups (VPN + Disks)
● RPO + RTO
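RPO (recovery point objective) is the maximum amount of data, measured in time, that may be lost; RTO (recovery time objective) is the maximum time a recovery may take. A small worked check makes the two concrete. The timestamps, the one-hour objectives, and the restore steps below are made-up numbers for illustration, not figures from the talk.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, now, rpo):
    """True if a failure right now would lose no more data than the RPO allows."""
    return now - last_backup <= rpo


def meets_rto(restore_steps, rto):
    """True if the summed duration of all restore steps fits within the RTO."""
    return sum(restore_steps.values(), timedelta()) <= rto


now = datetime(2015, 6, 1, 12, 0)
last_backup = datetime(2015, 6, 1, 11, 20)  # hypothetical: 40 minutes ago

# Hypothetical restore plan for bringing the service up in another region.
restore_steps = {
    "create stack": timedelta(minutes=10),
    "restore data": timedelta(minutes=25),
    "switch traffic": timedelta(minutes=5),
}

print(meets_rpo(last_backup, now, rpo=timedelta(hours=1)))
print(meets_rto(restore_steps, rto=timedelta(hours=1)))
```

Running the restore plan regularly and timing each step is the only way to know whether the numbers fed into such a check are realistic.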
Security
● MFA
● Public key distribution
● Root key rotation
● Private/Public Subnets
● ACLs/Security Groups
● Update AMIs
● Trusted Advisor!
Cost Control
● Tags
  ○ Role
  ○ Environment
● Cost Explorer
● Threshold alerting
● Share monthly
● Export to CSV
● Right-Scale (ASG)
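Once resources carry Role and Environment tags, an exported CSV can be summed per tag and compared against a threshold. The sketch below assumes a simplified, hypothetical column layout (`Role`, `Environment`, `Cost`) and invented amounts; a real billing export has a different and much wider format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per tagged group of resources.
SAMPLE_CSV = """Role,Environment,Cost
web,production,420.50
web,sandbox,61.25
db,production,310.00
db,sandbox,45.75
"""


def cost_by_tag(csv_text, tag):
    """Sum the Cost column for each distinct value of the given tag column."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[tag]] += float(row["Cost"])
    return dict(totals)


def over_threshold(totals, threshold):
    """Return the tag values whose total cost exceeds the threshold."""
    return sorted(name for name, cost in totals.items() if cost > threshold)


by_environment = cost_by_tag(SAMPLE_CSV, "Environment")
print(by_environment)
print(over_threshold(by_environment, threshold=500))
```

The same totals can be shared with the team each month, so that a jump in one role or environment is noticed before the bill arrives.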
4 rules of 5 nines.
● All changes have to be under version control
● No instance should be launched manually
● All changes are deployed to Sandbox first
● Production is just a more powerful Sandbox