Deploying Software in an Autoscaled AWS Environment

Jeff Horwitz, Director of Cloud Engineering, ShopRunner

Presented at Philly DevOps, January 20, 2015



Page 2: Deploying Software in an Autoscaled AWS Environment

• Members get 2-day free shipping and returns

• Benefits apply across a variety of retailers

• Extending our reach into China w/ Alipay

Page 3: Deploying Software in an Autoscaled AWS Environment

Applications

• Many different applications (10+)

• Each with its own repository and set of servers

• Multiple deployments per day

Page 4: Deploying Software in an Autoscaled AWS Environment

AWS @ ShopRunner

• Infrastructure is 100% in the cloud

• AWS + other services

• Heavy use of VPC, AutoScaling, CloudFormation

Page 5: Deploying Software in an Autoscaled AWS Environment

How We Launch

• Everything launched via a CloudFormation Stack

• Use nested stacks to stay DRY

• single_instance.json

• autoscaling_group.json

• CloudInit bootstraps each instance

• Puppet applies role-specific configurations

Page 6: Deploying Software in an Autoscaled AWS Environment

ASG in CloudFormation

    ...
    "ContentGroup": {
      "Type": "AWS::CloudFormation::Stack",
      "Properties": {
        "Parameters": {
          "ServerEnvironment": "prd",
          "ServerRole": "content",
          "InstanceType": "m3.medium",
          "LoadBalancerNames": { "Ref": "ContentELB" },
          "AvailabilityZones": { "Fn::Join": [ ",", { "Ref": "AvailabilityZones" } ] },
          "VPCZoneIdentifier": { "Fn::Join": [ ",", { "Ref": "ASGroupSubnets" } ] },
          "SecurityGroupIds": { "Fn::Join": [ ",", { "Ref": "ContentSecurityGroup" } ] },
          "DesiredASCapacity": 3,
          "MinASSize": 3,
          "MaxASSize": 6
        },
        "TemplateURL": "https://s3.amazonaws.com/BUCKET/cloudformation/autoscaling_group.json",
        "TimeoutInMinutes": 30
      }
    }
    ...

Page 7: Deploying Software in an Autoscaled AWS Environment

Waiting for Puppet

• Puppet can take some time to run

• Group shouldn't go live until Puppet is finished

• Use CloudFormation Wait Conditions

• Wait for stack status CREATE_COMPLETE
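The last bullet can be sketched as a small polling loop. This is a minimal illustration, not the speaker's tooling: `wait_for_stack` is a hypothetical helper, and `cfn` is assumed to be a boto3 CloudFormation client (or any object with a compatible `describe_stacks` method) supplied by the caller.

```python
import time

# Statuses after which a stack creation attempt will not reach CREATE_COMPLETE.
TERMINAL_FAILURES = {"CREATE_FAILED", "ROLLBACK_COMPLETE", "ROLLBACK_FAILED"}


def wait_for_stack(cfn, stack_name, timeout=1800, interval=15, sleep=time.sleep):
    """Poll until the stack reports CREATE_COMPLETE.

    `cfn` is assumed to be a boto3 CloudFormation client created by the
    caller; the timeout and interval defaults are illustrative only.
    """
    waited = 0
    while True:
        stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
        status = stack["StackStatus"]
        if status == "CREATE_COMPLETE":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"stack {stack_name} ended in {status}")
        if waited >= timeout:
            raise TimeoutError(f"stack {stack_name} still {status} after {waited}s")
        sleep(interval)
        waited += interval
```

Because the wait condition only completes once every instance has signalled, CREATE_COMPLETE here implies that Puppet has finished on the whole group.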

Page 8: Deploying Software in an Autoscaled AWS Environment

Puppet Wait Conditions

    "PuppetWaitHandle": {
      "Type": "AWS::CloudFormation::WaitConditionHandle",
      "Properties": {}
    },

    "PuppetWaitCondition": {
      "Type": "AWS::CloudFormation::WaitCondition",
      "DependsOn": "AutoScalingGroup",
      "Properties": {
        "Handle": { "Ref": "PuppetWaitHandle" },
        "Timeout": "1800",
        "Count": { "Ref": "DesiredASCapacity" }
      }
    },

Page 9: Deploying Software in an Autoscaled AWS Environment

Signal Wait Handler

    ...
    "command": { "Fn::Join": [ "", [
      "/opt/aws/bin/cfn-signal -s $success ",
      "-r \"puppet agent exited with code $rc\" ",
      "-i \"puppet-signal-$EC2_INSTANCE_ID\" '",
      { "Ref": "PuppetWaitHandle" },
      "'"
    ...
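The slide does not show how `$success` and `$rc` are derived. The sketch below is one possible reading, assuming the agent was run with `--detailed-exitcodes`, where exit codes 0 and 2 mean the run succeeded. The function name `cfn_signal_argv` and the `ok_codes` default are hypothetical, not taken from the talk.

```python
def cfn_signal_argv(rc, instance_id, wait_handle_url, ok_codes=(0, 2)):
    """Build the cfn-signal argument list for one instance.

    Assumption: `rc` is the exit code of `puppet agent` run with
    --detailed-exitcodes (0 = no changes, 2 = changes applied).
    """
    success = "true" if rc in ok_codes else "false"
    return [
        "/opt/aws/bin/cfn-signal",
        "-s", success,                                  # success or failure
        "-r", f"puppet agent exited with code {rc}",    # human-readable reason
        "-i", f"puppet-signal-{instance_id}",           # unique ID per signal
        wait_handle_url,                                # the wait handle's URL
    ]


if __name__ == "__main__":
    # Placeholder values; a real run would pass the result to subprocess.run.
    print(cfn_signal_argv(2, "i-0123abcd", "https://example.invalid/handle"))
```

The unique `-i` value per instance matters because the wait condition counts distinct signal IDs toward its `Count`.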

Page 10: Deploying Software in an Autoscaled AWS Environment

Legacy Deployments

• one long-lived AS group per application

• per-application scripts launch AS groups

• scripts pull code from git into an EBS volume

• create snapshot and upload ID to S3

• rsync volume to servers in existing AS group

• restart services as necessary

Page 11: Deploying Software in an Autoscaled AWS Environment

Problems

• scripts w/o CloudFormation diverge quickly

• can't easily launch multiple versions

• no association with a tag/branch/commit

• rsync changes code on running servers

• can't easily stage new code before deploying

• can't easily warm servers before deploying

• no clean or consistent rollback procedure

Page 12: Deploying Software in an Autoscaled AWS Environment

Solutions

• stop treating our infrastructure like it's static

• create new stacks for each deployment

• store state in etcd

• stop deploying code changes

• start deploying stacks
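"Create new stacks for each deployment" might look roughly like the following sketch. The naming scheme and the `launch_deployment` helper are hypothetical; `cfn` is assumed to be a boto3 CloudFormation client passed in by the caller, and the template and parameters are whatever the deployment needs.

```python
def deployment_stack_name(app, env, build_id):
    """Hypothetical naming scheme: one stack per (app, environment, build).

    CloudFormation stack names may contain only letters, digits and
    hyphens, so other characters in the inputs are replaced.
    """
    raw = f"{app}-{env}-{build_id}"
    return "".join(c if c.isalnum() or c == "-" else "-" for c in raw)


def launch_deployment(cfn, app, env, build_id, template_url, parameters):
    """Create a brand-new stack instead of changing a running one."""
    name = deployment_stack_name(app, env, build_id)
    cfn.create_stack(
        StackName=name,
        TemplateURL=template_url,
        Parameters=[
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
    )
    return name
```

Since each build gets its own stack, the old stack stays untouched and remains available to fail back to.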

Page 13: Deploying Software in an Autoscaled AWS Environment

Tenets of SR Deployments

• Unit of deployment is the stack

• Deployed servers are immutable

• Deployments are reproducible

• Fail back to old stacks, fail forward to new stacks

• DB migrations should be backwards compatible

• Test on the same configuration as production

Page 14: Deploying Software in an Autoscaled AWS Environment

ELB Catch-22

• new instances added to ELB once running

• autoscaling needs services to start automatically

• what if we're not ready?

• what if the service is actually broken?

• wait to associate ELB w/ ASG? can't do that!

Page 15: Deploying Software in an Autoscaled AWS Environment

Delay Service Start?

• configure instances not to start services on launch and only start services when ready to deploy

• manage with manual steps or custom code

• initial launch versus scale-out event

• feature flags (etcd, other orchestration)

Page 16: Deploying Software in an Autoscaled AWS Environment

Lifecycle Hooks FTW

• Register hooks for ASG lifecycle events

• Lifecycle halts until told to proceed

• Can launch our group but tell it not to go live
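As a rough sketch of the idea, a launch hook keeps new instances in a wait state until something explicitly completes the lifecycle action. The helper names, the hook name and the timeout below are hypothetical; `asg` is assumed to be a boto3 Auto Scaling client supplied by the caller.

```python
LAUNCHING = "autoscaling:EC2_INSTANCE_LAUNCHING"
HOOK_NAME = "hold-until-released"  # hypothetical hook name


def hold_new_instances(asg, group_name, timeout=3600):
    """Register a hook so launching instances pause before going in service."""
    asg.put_lifecycle_hook(
        LifecycleHookName=HOOK_NAME,
        AutoScalingGroupName=group_name,
        LifecycleTransition=LAUNCHING,
        HeartbeatTimeout=timeout,   # seconds to wait before DefaultResult applies
        DefaultResult="ABANDON",    # if nobody releases the instance, drop it
    )


def release_instance(asg, group_name, instance_id, go=True):
    """Tell the paused lifecycle to proceed (go) or to give up (no-go)."""
    result = "CONTINUE" if go else "ABANDON"
    asg.complete_lifecycle_action(
        LifecycleHookName=HOOK_NAME,
        AutoScalingGroupName=group_name,
        InstanceId=instance_id,
        LifecycleActionResult=result,
    )
    return result
```

Each waiting instance needs its own `release_instance` call, so a release step would loop over the instances in the group.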

Page 17: Deploying Software in an Autoscaled AWS Environment

Autoscaling Lifecycle Hooks

Copied from AWS documentation at http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroupLifecycle.html


Page 20: Deploying Software in an Autoscaled AWS Environment

Pre-deployment: Lonely ELB

Page 21: Deploying Software in an Autoscaled AWS Environment

Deployment #1: Launch Autoscaling Group

ASG v1

PENDING

Cache warming

Launch status check

"go/no-go"

OOPS I launched the wrong thing -- run away!

Page 22: Deploying Software in an Autoscaled AWS Environment

Deployment #1: Deploy Autoscaling Group

ASG v1

GO LIVE

Page 23: Deploying Software in an Autoscaled AWS Environment

Deployment #1: Deploy Autoscaling Group

ASG v1

Page 24: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Launch Autoscaling Group

ASG v1 ASG v2

PENDING

Page 25: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Deploy Autoscaling Group

ASG v1 ASG v2

GO LIVE

Page 26: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Multiple ASG Backends

ASG v1 ASG v2

Page 27: Deploying Software in an Autoscaled AWS Environment

Deployment #2: REVERT!

ASG v1 ASG v2

Page 28: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Multiple ASG Backends

ASG v1 ASG v2

Page 29: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Remove ASG v1

ASG v1 ASG v2

Page 30: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Suspend Processes on ASG v1

ASG v1 ASG v2

No scaling

No ELB

Page 31: Deploying Software in an Autoscaled AWS Environment

Deployment #2: Delete ASG v1

ASG v2

Page 32: Deploying Software in an Autoscaled AWS Environment

Deployment #2: ASG v2 Deployed

ASG v2

Page 33: Deploying Software in an Autoscaled AWS Environment

Suspend/Resume

• Launch

• Terminate

• AddToLoadBalancer

• AlarmNotification

• AZRebalance

• HealthCheck

• ReplaceUnhealthy

• ScheduledActions
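The processes listed above are the names accepted by the suspend and resume calls. A minimal sketch, assuming `asg` is a boto3 Auto Scaling client supplied by the caller; the wrapper functions and the validation step are illustrative additions, not part of the talk.

```python
# Process names as listed on the slide.
SCALING_PROCESSES = {
    "Launch", "Terminate", "AddToLoadBalancer", "AlarmNotification",
    "AZRebalance", "HealthCheck", "ReplaceUnhealthy", "ScheduledActions",
}


def _checked(processes):
    """Reject names that are not on the list above before calling AWS."""
    unknown = set(processes) - SCALING_PROCESSES
    if unknown:
        raise ValueError(f"unknown scaling processes: {sorted(unknown)}")
    return sorted(processes)


def suspend(asg, group_name, processes):
    asg.suspend_processes(
        AutoScalingGroupName=group_name,
        ScalingProcesses=_checked(processes),
    )


def resume(asg, group_name, processes):
    asg.resume_processes(
        AutoScalingGroupName=group_name,
        ScalingProcesses=_checked(processes),
    )
```

Which processes to suspend on an outgoing group is a policy choice; the talk does not spell out the exact set.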

Page 34: Deploying Software in an Autoscaled AWS Environment

Standby State

• Takes instances out of service without removing them from the autoscaling group

• Resources are still managed by the group

• Option to maintain capacity while in standby

• Once ready, return the instance to service

• Great for debugging w/o affecting capacity
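A sketch of the debugging use case, assuming `asg` is a boto3 Auto Scaling client supplied by the caller. The helper names are hypothetical. Passing `ShouldDecrementDesiredCapacity=False` is the "maintain capacity" option: the group launches a replacement while the instance is in standby.

```python
def pull_for_debugging(asg, group_name, instance_id, keep_capacity=True):
    """Move one instance to Standby so it can be inspected safely."""
    asg.enter_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        # False means desired capacity is unchanged, so a replacement launches.
        ShouldDecrementDesiredCapacity=not keep_capacity,
    )


def return_to_service(asg, group_name, instance_id):
    """Put the instance back in service once debugging is finished."""
    asg.exit_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
    )
```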

Page 35: Deploying Software in an Autoscaled AWS Environment

Attach/Detach Instances

• Relatively new feature

• Use to attach to a pre-launch testing ASG/ELB

• Move instances to production ASG when ready
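Moving an instance from a testing group to the production group could be sketched as a detach followed by an attach. This is an illustration only: `promote_instance` and the group names are hypothetical, and `asg` is assumed to be a boto3 Auto Scaling client supplied by the caller.

```python
def promote_instance(asg, instance_id, testing_group, production_group):
    """Detach an instance from the testing ASG and attach it to production."""
    asg.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=testing_group,
        # Shrink the testing group so it does not launch a replacement.
        ShouldDecrementDesiredCapacity=True,
    )
    # Attaching raises the production group's desired capacity by one,
    # so the group's maximum size must leave room for it.
    asg.attach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=production_group,
    )
```

A real version would also wait for the detach to finish before attaching, since an instance can belong to only one group at a time.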

Page 36: Deploying Software in an Autoscaled AWS Environment

Deployment Procedure

1. Build the app.

2. Create snapshot and register it in etcd.

3. Launch a deployment with the build snapshot.

4. Perform pre-launch tasks (warming, etc.).

5. Release deployment (completes lifecycle).

6. Revert to or remove old deployment.

7. Delete old deployment.
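The seven steps can be read as a small state machine in which a deployment only moves forward through allowed transitions. The sketch below is one interpretation, not the speaker's tooling: the stage names, the `Deployment` class and the "abandoned" exit before release are assumptions made for illustration.

```python
from enum import Enum


class Stage(Enum):
    BUILT = "built"                                # step 1
    REGISTERED = "snapshot registered"             # step 2
    LAUNCHED = "launched, not yet live"            # step 3
    PREPARED = "pre-launch tasks done"             # step 4
    RELEASED = "released"                          # step 5
    OLD_REMOVED = "old deployment removed"         # step 6 (remove)
    REVERTED = "reverted to old deployment"        # step 6 (revert)
    OLD_DELETED = "old deployment deleted"         # step 7
    ABANDONED = "abandoned before release"         # assumed no-go exit


# Allowed transitions; anything else is rejected.
NEXT = {
    Stage.BUILT: {Stage.REGISTERED},
    Stage.REGISTERED: {Stage.LAUNCHED},
    Stage.LAUNCHED: {Stage.PREPARED, Stage.ABANDONED},
    Stage.PREPARED: {Stage.RELEASED, Stage.ABANDONED},
    Stage.RELEASED: {Stage.OLD_REMOVED, Stage.REVERTED},
    Stage.OLD_REMOVED: {Stage.OLD_DELETED},
}


class Deployment:
    def __init__(self, build_id):
        self.build_id = build_id
        self.stage = Stage.BUILT
        self.history = [Stage.BUILT]

    def advance(self, new_stage):
        """Move to new_stage, refusing steps that skip the procedure."""
        if new_stage not in NEXT.get(self.stage, set()):
            raise ValueError(f"cannot go from {self.stage.name} to {new_stage.name}")
        self.stage = new_stage
        self.history.append(new_stage)
        return self
```

For example, `Deployment("build-42").advance(Stage.RELEASED)` raises `ValueError`, because a build cannot be released before it has been registered, launched and prepared.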

Page 37: Deploying Software in an Autoscaled AWS Environment

Future Work

• Test instances with a pre-launch testing ELB

• Register Jenkins builds for deployment

• Support multiple environments

• UI/Dashboard