Deployment Checkup: How to Regularly Tune Your Cloud Environment - RightScale Compute 2013
April 25-26
San Francisco
cloud success starts here
Deployment Checkup: How to Regularly
Tune Your Cloud Environment
Brian Adler, Sr. Services Architect, RightScale
#RightscaleCompute
Tune Ups - Why They Make Sense
• Over time, deployments grow organically
• Some things are done quickly to solve an immediate issue and
are never readdressed
• “If it ain’t broke, don’t fix it”
• Squeaky wheels get the grease. But over-greased wheels tend
to get ignored – and can be costly
Whys and Whats
• Deployment Checkups (the “why”)
• Cost
• Performance
• Best Practices
• Deployment Considerations (the “what”)
• Cost Optimization
• Server Utilization
• High Availability and Disaster Recovery (HA/DR) Implementation
• Security
• Best Practices
Deployment Checkup
Cost Optimization
Cost Optimization
• Unused/Unneeded Resources
• EBS Volumes
• EBS Snapshots
• EIPs
• S3 objects
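A minimal sketch of how unused resources might be flagged from an inventory snapshot. The `inventory` dictionary and the `find_unused` helper are hypothetical; the field names (`State`, `AssociationId`) are assumed to mirror what EC2 describe calls return, where an unattached volume reports the state `available` and an unassociated Elastic IP has no association ID. Snapshots and S3 objects are left out because judging them "unneeded" depends on retention policy.

```python
from typing import Dict, List


def find_unused(inventory: Dict[str, List[dict]]) -> Dict[str, List[str]]:
    """Return IDs of resources that look unused in an inventory snapshot."""
    # A volume in the "available" state is not attached to any instance.
    volumes = [v["VolumeId"] for v in inventory.get("volumes", [])
               if v.get("State") == "available"]
    # An Elastic IP without an association is allocated but serving nothing.
    eips = [a["PublicIp"] for a in inventory.get("addresses", [])
            if not a.get("AssociationId")]
    return {"volumes": volumes, "eips": eips}


# Hypothetical sample data for illustration only.
sample = {
    "volumes": [
        {"VolumeId": "vol-1", "State": "in-use"},
        {"VolumeId": "vol-2", "State": "available"},
    ],
    "addresses": [
        {"PublicIp": "198.51.100.10", "AssociationId": "eipassoc-1"},
        {"PublicIp": "198.51.100.11"},
    ],
}
print(find_unused(sample))
```

Candidates found this way still deserve a human review before deletion, since a detached volume may be an intentional backup.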
Cost Optimization
• Storage
• Don’t over-allocate initially. Increase capacity when needed.
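One way to act on "increase capacity when needed" is a simple threshold rule, sketched below. The function name, the 80% threshold, and the 25% growth step are all illustrative assumptions, not values from the talk.

```python
import math


def grow_if_needed(allocated_gib: int, used_gib: float,
                   threshold: float = 0.8, step: float = 1.25) -> int:
    """Return the allocation to use next: unchanged below the threshold,
    otherwise the current allocation grown by `step` and rounded up."""
    if allocated_gib <= 0:
        raise ValueError("allocated_gib must be positive")
    if used_gib / allocated_gib < threshold:
        return allocated_gib
    return math.ceil(allocated_gib * step)


print(grow_if_needed(100, 50))   # half full: stays at 100
print(grow_if_needed(100, 85))   # past 80%: grows to 125
```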
Cost Optimization
• Bandwidth
• Cross-AZ
• Generally acceptable/recommended for HA configurations
• Cross-region/cloud
• Compress/Optimize
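As a rough illustration of the "compress" advice for cross-region or cross-cloud traffic, the sketch below gzips a payload before it would be sent and compares sizes. The `compress_for_transfer` helper and the sample log data are hypothetical; real savings depend entirely on how compressible the data is.

```python
import gzip


def compress_for_transfer(payload: bytes) -> bytes:
    """Gzip a payload before shipping it across regions or clouds."""
    return gzip.compress(payload)


# Repetitive text such as logs usually compresses well.
log_lines = b"GET /index.html 200\n" * 1000
packed = compress_for_transfer(log_lines)
print(f"raw: {len(log_lines)} bytes, compressed: {len(packed)} bytes")

# The receiving side restores the original bytes.
restored = gzip.decompress(packed)
```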
Deployment Checkup
Server Utilization
Server Utilization
• Choose the right sized instance for the task at hand
• Utilize Reserved Instances (PlanForCloud can be of great
benefit here)
• Something odd happened around this time… Use alerts to be notified, and use load testing to find the right instance size for the task
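Once load testing has produced peak figures, choosing the right size can be reduced to "smallest option that fits with some headroom". The sketch below assumes a hypothetical catalog with made-up size names and capacities; it does not describe any real instance types, and the 25% headroom is an arbitrary default.

```python
from typing import List, Optional, Tuple

# Hypothetical catalog: (name, cores, memory in GiB), ordered smallest first.
CATALOG: List[Tuple[str, int, int]] = [
    ("small", 1, 2),
    ("medium", 2, 4),
    ("large", 4, 8),
    ("xlarge", 8, 16),
]


def smallest_fit(peak_cores: float, peak_mem_gib: float,
                 headroom: float = 0.25,
                 catalog: List[Tuple[str, int, int]] = CATALOG) -> Optional[str]:
    """Return the first (smallest) size covering the measured peak plus headroom."""
    need_cores = peak_cores * (1 + headroom)
    need_mem = peak_mem_gib * (1 + headroom)
    for name, cores, mem in catalog:
        if cores >= need_cores and mem >= need_mem:
            return name
    return None  # nothing in the catalog is big enough


print(smallest_fit(1.5, 3.0))
```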
Server Utilization
• CPU affinity
• taskset
• irqbalance
• Spread the load if/when possible
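`taskset` sets a process's CPU affinity from the shell; the same scheduler call is exposed in Python on Linux as `os.sched_setaffinity`. The sketch below pins the current process to a subset of its allowed CPUs and then restores the original mask. The `pin_current_process` helper is hypothetical, and it returns `None` on platforms where the call is unavailable. Interrupt distribution, which `irqbalance` handles, is a separate concern not shown here.

```python
import os
from typing import Optional, Set


def pin_current_process(max_cpus: int = 1) -> Optional[Set[int]]:
    """Restrict this process to at most `max_cpus` of its allowed CPUs.

    Returns the resulting affinity set, or None if the platform
    does not support sched_setaffinity (for example, non-Linux systems).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))   # 0 means "this process"
    chosen = set(allowed[:max(1, max_cpus)])
    os.sched_setaffinity(0, chosen)
    return set(os.sched_getaffinity(0))


if hasattr(os, "sched_getaffinity"):
    original = os.sched_getaffinity(0)
    print("pinned to:", pin_current_process(1))
    os.sched_setaffinity(0, original)           # undo the pinning
```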
Server Utilization
• Memory
• Numerous different memory options – find the right fit
• 613 MiB instances through 244 GiB instances
Server Utilization
• Load average
• Good indicator of overall capacity utilization
• 1.0 x Number of Cores = fully utilized
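The rule of thumb above can be checked directly: divide the load average by the core count, and a result of 1.0 means fully utilized. The `load_per_core` helper is a hypothetical name; `os.getloadavg` and `os.cpu_count` are standard-library calls, with `getloadavg` available only on Unix-like systems.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Normalize a load average by core count; 1.0 means fully utilized."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


# A load average of 4.0 on a 4-core server is exactly at capacity.
print(load_per_core(4.0, 4))

# On Unix-like systems, the same check can use live values.
if hasattr(os, "getloadavg"):
    try:
        one_min, five_min, fifteen_min = os.getloadavg()
        cores = os.cpu_count() or 1
        print(f"15-minute load per core: {load_per_core(fifteen_min, cores):.2f}")
    except OSError:
        pass  # load average could not be obtained
```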
Server Utilization
• Monitoring and Alerts
• Use them! Find small problems before they become big problems
• Look for trends and act on them (not on spikes or anomalies)
• Look for underutilization as well as overutilization
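Acting on trends rather than spikes can be approximated by averaging over a window before comparing against thresholds, as in the sketch below. The window size and the 20%/80% thresholds are illustrative assumptions, and `classify_trend` is a hypothetical helper, not part of any monitoring product.

```python
from statistics import mean
from typing import Sequence


def classify_trend(samples: Sequence[float], window: int = 5,
                   low: float = 20.0, high: float = 80.0) -> str:
    """Classify utilization (in percent) from the average of the last `window` samples."""
    if len(samples) < window:
        return "insufficient-data"
    recent = mean(samples[-window:])
    if recent > high:
        return "overutilized"
    if recent < low:
        return "underutilized"
    return "ok"


# A single spike to 98% does not trigger action on its own.
print(classify_trend([30, 35, 98, 32, 31]))
# Sustained high usage does.
print(classify_trend([85, 90, 88, 92, 95]))
```

Checking the low threshold as well as the high one covers the underutilization case, which is where cost savings tend to hide.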
Deployment Checkup
High Availability and Disaster Recovery
HA/DR Considerations
• Avoid single points of failure (SPOF)
• Use Availability Zones (AZ) to your benefit
• Always place one of each component (load balancers, app servers,
databases) in at least two AZs
• Replicate data across AZs (HA) and back up or replicate across
regions/clouds for failover (DR)
• Set up monitoring, alerts, and operations to identify problems and
automate their resolution or the failover process
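The "at least two AZs per component" rule lends itself to an automated check. The sketch below assumes a hypothetical server list in which each entry carries a `tier` and an `az` field, and reports tiers that sit in fewer than two zones.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def tiers_with_spof(servers: Iterable[dict], min_zones: int = 2) -> List[str]:
    """Return the tiers whose servers span fewer than `min_zones` availability zones."""
    zones: Dict[str, Set[str]] = defaultdict(set)
    for server in servers:
        zones[server["tier"]].add(server["az"])
    return sorted(tier for tier, azs in zones.items() if len(azs) < min_zones)


# Hypothetical deployment: the database tier lives in a single zone.
deployment = [
    {"tier": "load balancer", "az": "zone-a"},
    {"tier": "load balancer", "az": "zone-b"},
    {"tier": "app", "az": "zone-a"},
    {"tier": "app", "az": "zone-b"},
    {"tier": "database", "az": "zone-a"},
]
print(tiers_with_spof(deployment))
```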
HA/DR Implementation
• Availability Zone Distribution
HA/DR Implementation
• Availability Zone and Regional Distribution
Deployment Checkup
Security
Security
• Security Groups
Security
• iptables
Security – OS Patching
• Upcoming ServerTemplate release (v13.4) will have recipes to:
• Unfreeze Security repos on Ubuntu
• Unfreeze Upstream repos on CentOS
• Perform security update
Deployment Checkup
Best Practices
Best Practices
• Image Bundling - Friends don’t let friends bundle
• Use ServerTemplate with base RightImage
• Configure server at boot time
• Use EBS-backed images to speed boot times if needed
• Rare cases where bundling is recommended
• Manual install of software required
• Boot time is unacceptably long to respond to a dynamic event
Best Practices
• No SPOF
• Distribute servers in each tier across multiple AZs
• Use Deployments as Application containers
Best Practices
• Commit your ServerTemplates (don’t use HEAD)
Best Practices
• Use Credentials for all sensitive inputs
Best Practices
• Use autoscaling in combination with right-sized instances for
the task at hand
• Automate all operations – no manual changes
• Reboot – all is lost
• Automation is documentation
Summary
• Deployment sprawl can lead to cost and operational
inefficiencies
• Cost Optimizations can be found hiding in many places
• Find the right size resource for the job
• Don’t use a hammer if it ain’t a nail
• High Availability should be designed in from the start
• Look for SPOF and find ways to eliminate them
• Disaster Recovery can be affordable
• Security – open up only what you have to
• Best Practices promote operational efficiencies
Questions?