scaling up continuous deployment
DESCRIPTION
How do you continue to ship 50 times a day when you're constantly hiring more engineers? How can you continue when every day you write more tests that need to run on every commit? This talk covers how to scale up Continuous Integration and Continuous Deployment infrastructure, for teams as small as a handful of engineers and as large as hundreds of engineers.
TRANSCRIPT
Scaling Up Continuous Deployment
Timothy Fitz (.com)
“Continuous deployment involves deploying early and often, so as to avoid
the pitfalls of "deployment hell". The practice aims to reduce timely rework and thus
reduce cost and development time.”
Scalability
• Maintaining availability, performance and happiness
• As a function of # of people
• As a function of # of tests
Availability
• The build must always be green
• Set a “green SLA”
  – 99% green
  – Never red for > 15m
• Measure, track and report on these numbers
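One way to measure the “green SLA” above is to replay the build-status history and compute both numbers directly. The following is a minimal sketch, not a description of any particular CI server: the event format (a time-sorted list of `(timestamp, is_green)` status changes), the function names `green_sla` and `meets_sla`, and the assumption that the build starts out green are all illustrative.

```python
from datetime import datetime, timedelta


def green_sla(events, start, end):
    """Return (fraction_green, longest_red) for the window [start, end).

    events: time-sorted list of (timestamp, is_green) status changes.
    Assumption: the build is green before the first event.
    """
    green_time = timedelta(0)
    longest_red = timedelta(0)
    red_run = timedelta(0)
    prev_time, prev_green = start, True
    # A sentinel event at `end` closes the final interval.
    for ts, is_green in list(events) + [(end, True)]:
        ts = min(max(ts, start), end)  # clamp to the reporting window
        span = ts - prev_time
        if prev_green:
            green_time += span
            red_run = timedelta(0)
        else:
            red_run += span
            longest_red = max(longest_red, red_run)
        prev_time, prev_green = ts, is_green
    return green_time / (end - start), longest_red


def meets_sla(fraction_green, longest_red,
              min_green=0.99, max_red=timedelta(minutes=15)):
    """Check the two targets from the slide: 99% green, never red > 15m."""
    return fraction_green >= min_green and longest_red <= max_red
```

Feeding this a day or a week of status changes gives the two figures to track and report; the thresholds are parameters so a team can start with a looser target and tighten it.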
Performance
• Measure (intent to) commit to time of deploy
  – Goal: < 5 minutes
• Measure local development test loop
  – Goal: < 2s
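The commit-to-deploy measurement can be as simple as pairing each commit timestamp with its deploy timestamp and summarizing the gaps. This is a sketch under assumptions: the input format (pairs of `datetime` values) and the function name `latency_report` are hypothetical, and the 300-second default mirrors the 5-minute goal on the slide.

```python
from datetime import datetime, timedelta
from statistics import median


def latency_report(commit_deploy_pairs, goal_seconds=300.0):
    """Summarize commit-to-deploy latency against a goal.

    commit_deploy_pairs: iterable of (committed_at, deployed_at) datetimes.
    """
    secs = sorted((deployed - committed).total_seconds()
                  for committed, deployed in commit_deploy_pairs)
    if not secs:
        raise ValueError("no deploys to report on")
    within = sum(1 for s in secs if s < goal_seconds)
    return {
        "median_s": median(secs),
        "worst_s": secs[-1],
        "within_goal": within / len(secs),  # fraction under the goal
    }
```

Reporting the worst case alongside the median matters here, since a few slow deploys are what engineers tend to remember.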
Happiness
• CI/CD System is a product
• Software Engineers are the customer
• Keep your customer happy!
Testing Pyramid
How do you make tests fast?
• Tests can exercise large amounts of code without being slow
• Minimize system calls (no I/O, no disk)
• Minimize test data size
• Make sure all systems are cheap to instantiate/teardown
• No external state makes tests more reliable
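As an illustration of the points above, a test can swap a disk- or network-backed dependency for an in-memory stand-in that is cheap to create and leaves no state behind. The class names below (`InMemoryStore`, `VisitCounter`) are hypothetical and exist only for this sketch.

```python
class InMemoryStore:
    """Stand-in for a database or cache: a dict, no I/O, no disk."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class VisitCounter:
    """Code under test; it only needs something with get() and put()."""

    def __init__(self, store):
        self._store = store

    def record(self, user):
        count = (self._store.get(user) or 0) + 1
        self._store.put(user, count)
        return count


def test_counts_visits_per_user():
    # A fresh store per test: nothing to set up, nothing to tear down.
    counter = VisitCounter(InMemoryStore())
    assert counter.record("alice") == 1
    assert counter.record("alice") == 2
    assert counter.record("bob") == 1
```

Because each test builds its own store, tests cannot interfere with one another, which is also what makes them safe to run in parallel.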
Run Tests in Parallel
• Multiprocess
• Multimachine
• Multi-VM
• Instant multi-VM: http://circleci.com
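Whichever level of parallelism is used, the suite has to be split across workers, and an uneven split leaves the whole run waiting on the slowest worker. The sketch below balances tests by historical duration with a greedy longest-first assignment. The function name `shard_tests` and the duration dictionary are assumptions for illustration; this is not how any specific CI product assigns tests.

```python
import heapq


def shard_tests(durations, workers):
    """Split tests into `workers` shards with roughly equal total runtime.

    durations: dict mapping test name -> last known runtime in seconds.
    Greedy heuristic: take the slowest remaining test and give it to the
    currently least-loaded worker.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    heap = [(0.0, i) for i in range(workers)]  # (total load, worker index)
    shards = [[] for _ in range(workers)]
    # Sort slowest first; ties broken by name to keep the result stable.
    for name, secs in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        shards[i].append(name)
        heapq.heappush(heap, (load + secs, i))
    return shards
```

Each shard can then be handed to a separate process, machine, or VM, and the wall-clock time of the run approaches the load of the heaviest shard.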
Hardware Scale
• CI Cluster will get huge
  – Function of cumulative engineering man-months
  – Rule of thumb: 10% of your cluster size
• You will need a CI/CD DevOps person
  – CI cluster monitoring / alerting
  – Configuration Management critical
Scale testing infrastructure recap
• Write the right kind of tests
• Make those tests as fast as possible
• Run those tests in parallel
People / Roles
• Sheriff
  – Designated reverter / problem troubleshooter
  – Common pattern (IMVU, Chromium, Firefox)
• CD “Product Owner”
  – Held accountable for SLA / Performance
  – Manage infrastructure backlog
Single trunk
• Do this until it doesn’t work for you
• Gets painful in the 16 – 32 developer range
• Faster commit->deploy reduces the pain
  – But effort becomes prohibitive
“Try” pipeline
• Conceptually, a second tree that “doesn’t matter” but still gets tested for feedback
• Buildbot implements a patch-pushing version
• Takes a significant amount of pressure off of trunk builds
CI Server takes active role
• Server automatically reverts red commits
• Server merges green commits to trunk
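The two server behaviors above can be sketched as two small functions: one that gates candidate commits before they reach trunk, and one that removes commits found to be red after landing. Everything here is a simplified model under stated assumptions: trunk is a plain list, `is_green` is a caller-supplied test runner, and "revert" is modeled as removing the commit, where a real system would typically add a revert commit instead.

```python
def land_if_green(commit, is_green, trunk):
    """Pre-merge gate: the server merges a commit only if its tests pass."""
    if is_green(commit):
        trunk.append(commit)
        return True
    return False


def autorevert(trunk, is_green):
    """Post-merge check: drop commits that turn out red, return them.

    Assumption: each commit can be tested and reverted independently.
    """
    reverted = []
    for commit in list(trunk):  # iterate over a copy while mutating trunk
        if not is_green(commit):
            trunk.remove(commit)
            reverted.append(commit)
    return reverted
```

In both cases the engineer is taken out of the critical path: a red commit costs its author a notification rather than costing the whole team a broken trunk.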
Feature branches
• All incremental development happens on branches, branches land when feature is “ready”
• If “feature” is kept small, can be 2-3 per engineer per week on average
• Less continuous, but scales much better
  – Feature branches tested before merge
Merge tree
• Tree per team / feature
• Trees merged into trunk daily (if green)
• Scale up via tree of trees (of trees…)
• Again, less continuous
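The “tree of trees” idea can be modeled as a recursive, bottom-up merge in which a tree's pending changes move to its parent only when that tree is green. The `Tree` class and `daily_merge` function are hypothetical names for this sketch, and it assumes the green flag is already known; a real setup would re-run tests on the parent after each merge.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tree:
    name: str
    green: bool
    pending: List[str] = field(default_factory=list)   # changes not yet merged up
    children: List["Tree"] = field(default_factory=list)


def daily_merge(tree):
    """Merge green children into `tree`, deepest trees first."""
    for child in tree.children:
        daily_merge(child)  # let grandchildren land in the child first
        if child.green:
            tree.pending.extend(child.pending)
            child.pending.clear()
        # A red child keeps its changes until it goes green again.
```

A red team tree delays only that team's changes, which is the trade-off named on the slide: trunk stays healthy, but each change takes longer to reach it.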
Federation
• Each team gets their own deploy pipeline
• Requires SOA / component architecture
• Each team can set their own CD pace
• “Enterprise Ready”
Recap
• Single trunk + Try pipeline / Autorevert
• Feature Branches
• Merge Tree
• Federation
Questions?
Timothy Fitz (.com)