Don’t Let Your Site Go Down: Running High Availability Drupal Websites with Acquia and AWS
TRANSCRIPT
Thomas Jones, Sr. Solutions Architect, Amazon Web Services
Andrew Kenney, VP Platform Engineering, Acquia
Agenda
• Introduction
  – Challenges Building & Maintaining Sites
  – The Costs of Failure
• The Future Is The Cloud: Amazon Web Services
  – Why AWS
  – Building Fault-Tolerant Applications In the Cloud
• The Cloud & Drupal: Acquia
  – Building Fully Redundant, Fault-Tolerant Environments
  – Cloud Platform vs. DIY Hosting
• Q&A
Creating killer websites is hard…
Site Failure Scenarios
• Machine loss / service outage
• Network disruption
• Storage system, database, etc. failure
• Traffic spike
• Security attack / DDoS
• Failed code deployment
• Bad code
• Human error
The Cost of Failure Is Huge
THE 13 WEBSITES THAT CRASHED DURING SUPER BOWL 2013
You might think...
It won’t happen to me.
But it will.
Best Practices For High Availability
1. Ensure redundancy for all components
  – Servers
  – Services
  – Data centers
2. Utilize elastic scalability
3. Plan for failure
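The case for redundancy in practice 1 comes down to simple arithmetic. The following is a minimal sketch, not something shown in the talk: it assumes component failures are independent (real outages are often correlated), and the 99% figure is purely illustrative.

```python
def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent, which is an idealization.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


def serial_availability(layers: list[float]) -> float:
    """Availability of a stack in which every layer must be up."""
    result = 1.0
    for layer in layers:
        result *= layer
    return result


if __name__ == "__main__":
    # Hypothetical figure: each server is up 99% of the time.
    one = parallel_availability(0.99, 1)
    two = parallel_availability(0.99, 2)
    print(f"one server: {one:.4%}, two servers: {two:.4%}")
    # A site is only as available as the product of its layers.
    print(f"three redundant layers: {serial_availability([two, two, two]):.4%}")
```

The serial function shows why redundancy has to cover every component: one non-redundant layer caps the availability of the whole stack.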
Amazon Web Services
Broad & Deep Set of Cloud Services
• Compute: Amazon EC2, Amazon EMR, Amazon ELB, Amazon WorkSpaces
• Networking: Amazon VPC, Amazon Route 53, AWS Direct Connect
• Storage: Amazon S3, Amazon Glacier, Amazon EBS, Amazon Import Exp
• Database: Amazon RDS, Amazon DynamoDB, Amazon ElastiCache, Amazon Redshift
• App Services: Amazon CloudFront, Amazon CloudSearch, Amazon SWF, Amazon SQS, Amazon SNS, Amazon SES, Amazon Kinesis
• Management: Amazon IAM, Amazon CloudWatch, AWS CloudFormation, AWS Trusted Advisor, AWS Data Pipeline, AWS OpsWorks, AWS CloudHSM, AWS Marketplace, AWS CloudTrail, AWS Elastic Beanstalk
• AWS Premium Support, AWS Professional Services, AWS Training
Amazon Global Infrastructure
• 10 AWS Regions
  – US East (Virginia)
  – US West (N. California)
  – US West 2 (Oregon)
  – EU West (Ireland)
  – Japan (Tokyo)
  – South America (Sao Paulo)
  – ASP 1 (Singapore)
  – ASP 2 (Sydney)
  – GovCloud
  – BJS 1 (Beijing), Limited Preview
• 25 Availability Zones
• 51 Edge locations
Map legend: AWS Region, Edge Location
Gartner Magic Quadrant for Cloud Infrastructure as a Service (August 19, 2013)
Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong ([email protected]). Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
“AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers in this Magic Quadrant.”
Scale Matters
• $5.2B retail business
• 7,800 employees
• A whole lot of servers
Every day, AWS adds enough server capacity to power that whole $5B enterprise.
Pace of Innovation
Significant Product & Feature Launches, by year:
• 2008: 24
• 2009: 48
• 2010: 61
• 2011: 82
• 2012: 159
• 2013: 280
Cloud Computing Benefits
• No Up-Front Capital Expense
• Self-Service Infrastructure
• Deploy
• Easily Scale Up and Down
• Low Cost: Pay Only for What You Use
• Improve Agility & Time-to-Market
Cloud Computing Fault-Tolerance Benefits
• No Up-Front HA Capital Expense
• Self-Service DR Infrastructure
• Deploy
• Easily Deliver Fault-Tolerant Applications
• Low Cost Backups
• Pay For DR Only When You Use It
• Improve Agility & Time-to-Recovery
What is “fault-tolerant”?
• Degrees of risk mitigation
• Automated
• Tested
Old Barriers to HA Now Surmountable
Old-School Fault Tolerance: Build Two
Fault Tolerance Using Availability Zones
But what about:
• SITE UPTIME
• SECURITY & COMPLIANCE
• BACKUPS
• TESTING
• UPDATES
Acquia Cloud
What is Acquia Cloud?
• Drupal-tuned platform-as-a-service
• The most comprehensive toolset for developing, deploying, and maintaining Drupal sites
• The most secure, scalable, and reliable hosting environment for Drupal sites
• Built on Amazon Elastic Compute Cloud (EC2)
10 Layers of PaaS
Acquia Cloud PaaS provides the space to build, test, tune & deploy web apps in a secure way that guarantees performance.
• Virtual Machine
• Drupal Support
• Network Services
• Remote Administration
• High Availability
• Configuration Management
• Optimization
• Monitoring
• Development Lifecycle
• Custom Code
Challenges of HA for Drupal
• Drupal is dynamic, and traffic spikes can occur
  – Website traffic can quickly scale
  – Operates best with a reverse proxy cache such as Varnish in front of Drupal
• System dependencies
  – POSIX filesystem expected
  – HA database expected
  – Memcache recommended
Designing HA for Drupal
• Like Noah’s Ark: two of everything
• Automate scaling quickly and reliably
• Leverage regions and availability zones
• Select reliable synchronization technologies
  – Database: MySQL master-master replication
  – Persistent file system: GlusterFS on EBS
  – “Trust but verify”
Acquia Cloud Enterprise: HA Infrastructure
Load Balancers
• Elastic Load Balancers
• Acquia Load Balancers: Elastic IP addresses, Varnish cache, Nginx for load balancing
Web Servers
• Drupal-tuned
• Scale vertically or horizontally
File Systems
• HA file system via GlusterFS
• POSIX compatible
Databases
• MySQL 5.5
• Master-master replication
[Diagram: an AWS Elastic IP or Elastic Load Balancer in front of Availability Zone A and Availability Zone B. Each zone shows BAL (Varnish, nginx) and DED (Apache, Memcache, MySQL, user files on GlusterFS), with Drupal deployed via Git/SVN.]
If availability is your lifeblood, you need multi-region failover.
• Run your site from 2 AWS regions
• Use an enterprise-class DB replication technology: Tungsten Replicator from Continuent with Enterprise Support
• Utilize a content distribution network (CDN)
  – Durability
  – Manageability
  – Security
[Diagram: two regions, US-West and US-East. In each region an AWS Elastic IP or AWS Elastic Load Balancer fronts Availability Zone A and Availability Zone B, with components labeled BAL (Varnish, nginx), WEB (Apache, Memcache, user files, Drupal via Git/SVN), and FSDB Mesh (MySQL, user files on GlusterFS). A CDN and DNS sit in front of both regions.]
Diagram annotations:
• The content delivery network keeps the site visible to anonymous traffic even while the origin region is down.
• An enhanced DNS provider (Akamai eDNS, Dyn, etc.) allows very low TTLs and quick global DNS changes.
• Traffic flows to the active region.
• Tungsten MySQL multi-region replication.
• One-way file rsync.
• Multi-master MySQL replication (no master promotion necessary). Drupal still writes to only one master at a time, in the active region only.
• A DNS change is required to fail over to the backup region.
• DNS: www.domain.com, o.www.domain.com
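Since a DNS change is what moves traffic to the backup region, the decision of when to make that change matters. The sketch below is one hedged way to frame it, not Acquia's mechanism: the `FailoverState` class, the `record_health_check` function, the three-failure threshold, and the choice of which region starts as active are all hypothetical. It only decides when to switch; updating the DNS record itself is left to the provider's own tooling.

```python
from dataclasses import dataclass


@dataclass
class FailoverState:
    """Tracks which region is active and how many checks have failed in a row."""
    active: str
    standby: str
    failures: int = 0


def record_health_check(state: FailoverState, healthy: bool, threshold: int = 3) -> bool:
    """Record one health-check result for the active region.

    Returns True when the regions were swapped, meaning the caller should
    repoint DNS at the new active region. The threshold avoids failing over
    on a single transient error; its value here is an assumption.
    """
    if healthy:
        state.failures = 0
        return False
    state.failures += 1
    if state.failures < threshold:
        return False
    # Enough consecutive failures: promote the standby region.
    state.active, state.standby = state.standby, state.active
    state.failures = 0
    return True


if __name__ == "__main__":
    state = FailoverState(active="US-East", standby="US-West")
    for result in (False, False, True, False, False, False):
        if record_health_check(state, result):
            print(f"fail over: point DNS at {state.active}")
```

Because multi-master replication means no master promotion is needed, the swap itself can stay this simple; low DNS TTLs determine how quickly clients follow it.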
Nothing Is Irreplaceable
• All components of the platform can tolerate failure
• Automate everything
• Simulate and handle failures: Netflix’s “Chaos Monkey”
• General best practices:
  – Disaster recovery
  – Replication
  – Backups
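Before terminating real instances, the claim that every component can tolerate failure can be checked on paper. The following is a minimal sketch of that idea: a static check, not the random instance termination that Chaos Monkey performs. The node names and roles are hypothetical.

```python
def single_points_of_failure(nodes: dict[str, str]) -> list[str]:
    """Return the nodes whose loss would leave their role with no member.

    `nodes` maps a node name to the role it serves (for example "web").
    """
    spofs = []
    for victim, role in nodes.items():
        # Roles still covered once the victim is removed.
        surviving_roles = {r for name, r in nodes.items() if name != victim}
        if role not in surviving_roles:
            spofs.append(victim)
    return spofs


if __name__ == "__main__":
    # Hypothetical inventory: two of everything except the database.
    inventory = {
        "bal-a": "balancer",
        "bal-b": "balancer",
        "web-a": "web",
        "web-b": "web",
        "db-a": "database",
    }
    print(single_points_of_failure(inventory))
```

A check like this only covers the inventory; actually killing a node in a test environment is what confirms that failover works.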
Recovering From Failure
[Diagram labels: Backup Validator, EBS Volumes, Amazon Elastic Block Store, Amazon S3, Snapshots, mysqldump, Drupal Filesystem, Percona MySQL, EBS Backed Filesystem(s)]
• Nightly or user-initiated DB dumps. Retention policy: 3 days retained.
• EBS snapshots are performed every four hours. Retention policy:
  – 4-hour backups: 1 day
  – Daily backups: 1 week
  – Weekly backups: 1 month
  – Monthly backups: 3 months
• EBS snapshots are automatically saved to Amazon's S3 service, providing distributed saves to multiple AWS availability zones.
• The backup validator regularly audits snapshots to ensure they are recoverable.
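The tiered snapshot retention policy can be expressed as a small selection function. This is a sketch under stated assumptions, not Acquia's implementation: the slide does not say which snapshot counts as the daily, weekly, or monthly one, so the code assumes the earliest snapshot in each calendar day, ISO week, and calendar month, and approximates 1 month as 30 days and 3 months as 90 days.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots: list[datetime], now: datetime) -> set[datetime]:
    """Return the snapshot timestamps retained under the tiered policy.

    Assumed interpretation: the daily, weekly, and monthly backup is the
    earliest snapshot of each calendar day, ISO week, and calendar month.
    """
    keep: set[datetime] = set()
    seen_days, seen_weeks, seen_months = set(), set(), set()
    for snap in sorted(snapshots):
        age = now - snap
        if age < timedelta(0):
            continue  # ignore timestamps from the future
        day = snap.date()
        week = tuple(snap.isocalendar())[:2]  # (ISO year, ISO week)
        month = (snap.year, snap.month)
        first_of_day = day not in seen_days
        first_of_week = week not in seen_weeks
        first_of_month = month not in seen_months
        seen_days.add(day)
        seen_weeks.add(week)
        seen_months.add(month)
        if age <= timedelta(days=1):
            keep.add(snap)  # every 4-hour snapshot: 1 day
        elif first_of_day and age <= timedelta(days=7):
            keep.add(snap)  # daily backups: 1 week
        elif first_of_week and age <= timedelta(days=30):
            keep.add(snap)  # weekly backups: 1 month (assumed 30 days)
        elif first_of_month and age <= timedelta(days=90):
            keep.add(snap)  # monthly backups: 3 months (assumed 90 days)
    return keep


if __name__ == "__main__":
    now = datetime(2014, 3, 1)
    history = [now - timedelta(hours=4 * i) for i in range(720)]
    kept = snapshots_to_keep(history, now)
    print(f"{len(kept)} of {len(history)} snapshots retained")
```

Anything not in the returned set is a candidate for deletion; a validator would still need to confirm that the retained snapshots can actually be restored.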
We Won’t Let You Fail.
• 24 x 7 x 365 critical issue response
• 99.95% SLA for infrastructure AND application
• World-class team of Drupalists
  – 50+ professionals
  – 250+ years of combined Drupal experience
  – 50,000+ customer requests completed each year
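A 99.95% figure is easier to reason about as a downtime budget. The arithmetic below is a sketch only: how an SLA is actually measured (measurement window, exclusions, credits) depends on the contract, and the function name is hypothetical. At 99.95%, the budget works out to roughly 21.6 minutes in a 30-day month.

```python
def downtime_budget_minutes(sla_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an uptime percentage over a period."""
    if not 0.0 <= sla_percent <= 100.0:
        raise ValueError("sla_percent must be between 0 and 100")
    if period_days < 0:
        raise ValueError("period_days must not be negative")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        minutes = downtime_budget_minutes(99.95, days)
        print(f"{label}: {minutes:.1f} minutes of allowed downtime")
```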
How is Acquia Cloud Enterprise different?
• Infrastructure & Application Health
• Security Scanning
• Third Party Tools
• Acquia Insight
• Acquia Platform Health
• Acquia Uptime Monitoring
Questions
• For more information, visit www.acquia.com
• Contact us:
  – [email protected]
  – 888.9.ACQUIA
• Follow us on Twitter: @acquia
• Comments welcome:
THANK YOU