10 Deploys a Day - a case study of continuous delivery at Envato
DESCRIPTION
A presentation to the BankWest Solution Delivery team and the Perth DevOps Meetup describing the delivery processes at Envato that enable us to deploy 10 times a day.
TRANSCRIPT
10 Deploys a Day: a case study of continuous delivery at Envato
PRESENTATION TEMPLATE from Envato’s Graphic River - http://graphicriver.net/item/karbon-keynote-presentation-template/2580765
john viner
About us
7 years ago - a designer scratching an itch creates a Flash-plugin marketplace
and now …
‣ 3.2m members ‣ 4.6m items ‣ 10 authors have sold > $1m
‣ 130m page views per month ‣ 20m visits per month ‣ Item sold every 10 sec
Envato
our marketplace Sites
WordPress - our biggest market
> 3400+ Themes for Sale
> Most Popular Theme has sold 37,500 times and generated $2m in gross revenue
> Average price of a Theme is $42
> Average sales per TF author is 2,075
> Provides a Passive Income source!
19% of the web runs on WordPress! (4B monthly page views)
we’ve grown - and continue to grow
> ThemeForest is now ranked ~#180 on the Alexa Rankings
> My development team was 8 people 15 months ago and is now 24
the marketplace team
‣ 40 - in melbourne ‣ 30 - remote around the world
‣ delivery team
- 19 Back End Developers
- 2 Front End Developers
- 1 Operations Developer
- 5 Product Managers and UX Designers
‣ 6 Teams ‣ finance, ‣ back office, ‣ front end, ‣ maintenance ‣ search ‣ 10x
marketplace tech stack
‣ Ruby on Rails web app ‣ MySQL database ‣ Elasticsearch ‣ Rackspace managed hosting in USA
> 90M App Server Requests per Week
> Backend Response time of 148 ms
> Front end Response time of 2.5 sec
> 45 Virtual Machines
> 4 Physical Machines (DBs, NFS)
> Avg 6 deploys a day - Peak of 10
> Average 100 commits a day
> Peak Load of 15,000 req per minute
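To put the figures above on one scale, here is a small worked calculation. It uses only the numbers quoted on the slide; the assumption that requests are spread over a full 7 x 24 hour week is ours, and real traffic will be burstier than an average suggests.

```python
# Back-of-the-envelope arithmetic using only the figures from the slide.
MINUTES_PER_WEEK = 7 * 24 * 60          # 10,080 minutes

requests_per_week = 90_000_000          # "90M App Server Requests per Week"
peak_per_minute = 15_000                # "Peak Load of 15,000 req per minute"
commits_per_day = 100                   # "Average 100 commits a day"
deploys_per_day = 6                     # "Avg 6 deploys a day"

average_per_minute = requests_per_week / MINUTES_PER_WEEK
peak_to_average = peak_per_minute / average_per_minute
commits_per_deploy = commits_per_day / deploys_per_day

print(f"average load: {average_per_minute:,.0f} req/min")
print(f"peak is {peak_to_average:.2f}x the average")
print(f"about {commits_per_deploy:.0f} commits per deploy")
```

On these numbers the peak is well under twice the weekly average, and a typical deploy carries roughly 17 commits.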
THIS IS HOW WE ROLL
pull work, talk to Product, write code, test code, deploy code, verify production, repeat … about twice a week per developer
> Story on the wall
> flip or no flip?
> Create a Branch
> write failing test
> write code
> Run Local Tests
> Run full personal build
> create Pull request
> advertise pull request (IRC)
> wait for +1s (code review)
> Merge Pull request
> Full Master Build Green
> notify team in IRC
> deploy master to Production
> watch System - monitoring tools
> rollback if failed
> Rinse and Repeat
optional
> deploy to staging
> show product owner
> manual test
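The two gates in the flow above (code review before merge, green master build before deploy) can be sketched as plain predicates. This is an illustrative sketch only: the `PullRequest` class, the function names and the `REQUIRED_PLUS_ONES` threshold are hypothetical, and the slides do not say how many +1s are required.

```python
from dataclasses import dataclass

# Assumption: the talk does not state how many +1s are needed; 2 is a placeholder.
REQUIRED_PLUS_ONES = 2


@dataclass
class PullRequest:
    plus_ones: int              # +1s collected from reviewers in code review
    personal_build_green: bool  # result of the author's full personal build


def can_merge(pr: PullRequest) -> bool:
    """A pull request is merged only after a green personal build and enough +1s."""
    return pr.personal_build_green and pr.plus_ones >= REQUIRED_PLUS_ONES


def can_deploy(merged: bool, master_build_green: bool) -> bool:
    """Master goes to production only once the full master build is green."""
    return merged and master_build_green


pr = PullRequest(plus_ones=2, personal_build_green=True)
print(can_merge(pr))                                         # gate 1: review
print(can_deploy(merged=True, master_build_green=False))     # gate 2: CI
```

Note that neither gate involves a separate testing, change-control or release role; both are checks the developer and their peers perform.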
ENVATO
in pictures
Story
Tests
Code
flippin good
> Flip allows us to roll out changes to a % of users
> Something might be flipped off for months
> But we Always Deploy It
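One common way to roll a change out to a percentage of users is to hash each user into a stable bucket and compare it with the rollout percentage. The sketch below illustrates that general technique; it is not Envato's Flip implementation, and the function names and the feature name `new_checkout` are made up for the example.

```python
import hashlib


def bucket(feature: str, user_id: int) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def flipped_on(feature: str, user_id: int, rollout_percent: int) -> bool:
    """True if this user falls inside the rollout percentage for the feature."""
    return bucket(feature, user_id) < rollout_percent


# The code is always deployed; the flip decides who actually sees it.
enabled = sum(flipped_on("new_checkout", uid, 10) for uid in range(10_000))
print(f"{enabled} of 10,000 users see the feature at a 10% rollout")
```

Because the bucket depends only on the feature name and the user id, a user who has the feature at 10% still has it at 50%, and setting the percentage to 0 turns it off for everyone without a deploy.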
Code Review
Deploy
CI Build
quality
You Need Deep Dev and Operational Experience, right?
> Luke - Graduate with 4 months Exp
> Rakesh - Swinburne Industry Student
> Ben - Front End Developer
> Emmanuel - 2 yrs Dev Exp
4 Other Developers Mentioned
Across 4 Separate Teams
coordinating reviews and deploys … by IRC
now check the systems
New Relic Deploy With Change
Traceability
things can break … And Do
deploys gone bad
Rollbar Error Logging
the post incident review recommends?
> 1 - Improve our ability to Automatically roll back on monitored 500 errors!
> 2 - NO Recommendation to Improve Preventative Testing!
> Why? - Too Costly and complex to simulate production
> Why? - Would slow development flow
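The first recommendation, rolling back automatically on monitored 500 errors, could be driven by a check like the one below. This is a hedged sketch: the function name, the 5% error-rate threshold and the 50-request minimum sample are assumptions for illustration, not values from the talk.

```python
from typing import Sequence


def should_rollback(
    status_codes: Sequence[int],
    max_error_rate: float = 0.05,   # assumption: tolerate up to 5% server errors
    min_requests: int = 50,         # assumption: ignore samples that are too small
) -> bool:
    """Decide whether the responses seen since a deploy justify a rollback."""
    if len(status_codes) < min_requests:
        return False
    server_errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return server_errors / len(status_codes) > max_error_rate


healthy = [200] * 98 + [500] * 2      # 2% errors after the deploy
broken = [200] * 80 + [500] * 20      # 20% errors after the deploy
print(should_rollback(healthy), should_rollback(broken))
```

A check like this trades prevention for fast detection, which matches the review's choice not to add more pre-production testing.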
so who operates this site anyway?
> 16 Developers on a Weekly Rotation
> They come out of their team
> They are Trained, have a Buddy for First Week
> Supported by a Secondary
> They are First Line of Support for 168 hours straight
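A weekly rotation with a primary and a secondary can be expressed as simple modular arithmetic. In the sketch below the developer names are placeholders, and picking the previous week's primary as the secondary is our assumption; the slides do not say how the secondary is chosen.

```python
from typing import List, Tuple

HOURS_PER_SHIFT = 7 * 24  # the "168 hours straight" of a one-week shift


def on_call(developers: List[str], week: int) -> Tuple[str, str]:
    """Return (primary, secondary) for a given week number."""
    primary = developers[week % len(developers)]
    # Assumption: last week's primary backs up this week's primary.
    secondary = developers[(week - 1) % len(developers)]
    return primary, secondary


developers = [f"dev{i}" for i in range(16)]   # placeholder names for 16 developers
for week in range(3):
    print(week, on_call(developers, week))
```

With 16 developers, each person is primary once every 16 weeks.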
1 developer
all developers
you build it, you run it
Tooling is:
> Scout, New Relic and Pingdom monitoring and generating alerts
> PagerDuty forwarding to the On-Call Developer
> Rollbar to track Errors
> IRC to find out what went wrong
> ALL Alerts are also pumped into IRC so everyone knows what's going on and to get Help!
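The routing rule described above (page the on-call developer, and also send every alert to IRC) can be sketched without any real integrations. Everything here is hypothetical: the `Alert` class, the `#alerts` channel name and the tuple format stand in for whatever the monitoring, paging and chat tools actually exchange.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alert:
    source: str    # e.g. "Pingdom", "New Relic", "Scout"
    message: str


def route_alert(alert: Alert, on_call_developer: str) -> List[Tuple[str, str, str]]:
    """Return (channel, recipient, text) deliveries for one alert."""
    text = f"[{alert.source}] {alert.message}"
    return [
        ("pager", on_call_developer, text),   # wakes the first line of support
        ("irc", "#alerts", text),             # hypothetical channel; visible to everyone
    ]


for delivery in route_alert(Alert("Pingdom", "site is down"), "dev3"):
    print(delivery)
```

Sending every alert to the shared channel is what lets the rest of the team notice and help without being paged.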
we reflect on failures
and act to correct
but we aren’t great at operations
> We are just Good Enough right now, Just
> The rotation leaves us without a continuous view of system health
> We don’t always clean up behind us, root causes often come back
> Monitoring, Metrics, Performance insights are all Operational Skills that many developers don’t have
> We are hiring Operational Specialists for the first time
> Our Single DevOps guy is actively training all the devs - AJs Cafe
so what do we not do
> No - testing role or formal testing gate
> NO - change control gate
> No - formal comms of every change
> NO - release management
> No - separate deployment team
> NO - Separate operations team
none of these are feasible at 10 deploys a day
how do we solve the problems that these practices solve ?
‣ No Testing Capability
> TDD and 1000s of automated tests
> Use Production Users as your test team
> Flips help us control the rollout
> We test major infrastructure and architecture changes in production as well … using Flips
> We Fail Fast and Loud
‣ No change control
> Operations (i.e. Developers) know of changes before they happen because they review them as PRs
> There are changes we cannot “sign off” by testing in pre-prod environments
> A green build is our sign off
> Devs have skin-in-game so are incentivised for the code to work!
> Small Batch efficiency (small deltas are easier to identify and roll back)
‣ No release management
> Releases are SMALL, no need to define or document
> Releases self-document and radiate through IRC
OK How did we do it?
‣ it was our startup DNA
> The Team started and stayed very small (fewer than 8 devs) for 4 years
> They started with this structure and process and have never changed.
‣ it scales
> We’ve tripled the team, and split into 5 streams
> We’ve been able to scale this model
‣ but with a lag
> But it took 3 months to see the results
so do you want to drink the Kool-Aid?
‣ it might not be for you
> We have tolerant, savvy users
> We have users that come back to complete a purchase if it fails
> We are in CONTROL of our full stack
‣ If you want it, you have to change
> people
> processes
> assumptions
> Tech is the easy part
> You can’t optimise your “gate-keeping” processes to 10 deploys
‣ a leap is required
> Existing processes can get you close, say once a fortnight or a week
> A leap is required to get to many times a day
distributed collaboration vs central control
‣ quality ‣ release planning ‣ deployment ‣ operations
did I say Continuous Delivery or deployment?
meet the real rockstars at Envato …
the microlancer team
automatically deploy every commit to master … always
what's the difference?
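As the terms are commonly used, continuous delivery keeps master deployable and leaves the decision to deploy with a person, while continuous deployment ships every green commit to master automatically. The sketch below encodes that distinction; the `mode` strings and function name are illustrative, not taken from any Envato tooling.

```python
def should_deploy(mode: str, master_green: bool, human_requested: bool) -> bool:
    """Decide whether a commit on master goes to production."""
    if not master_green:
        return False                  # neither model deploys a red build
    if mode == "deployment":
        return True                   # every green commit ships automatically
    if mode == "delivery":
        return human_requested        # green and deployable, but a person triggers it
    raise ValueError(f"unknown mode: {mode}")


print(should_deploy("delivery", master_green=True, human_requested=False))
print(should_deploy("deployment", master_green=True, human_requested=False))
```

In these terms the marketplace flow, where a developer deploys master after a green build, is continuous delivery, and the Microlancer team's "deploy every commit to master" is continuous deployment.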
what are our next challenges?
‣ continuous automated reactive deployments ‣ flips + split testing with every product feature ‣ flexibility of our infrastructure ‣ improved operations