Delivery at Scale
Manu Cupcic and Adrian Perreau
Scale
● > 38 PB on our HDFS
● > 1 billion ad impressions per day
● > 5000 servers worldwide
cost of change
number of devs
The problem
Once upon a time
Single C# Repository
Build on local machines
Continuous Integration??
Running applications
No Testing Culture
Weekly Merges
20 engineers
50 engineers
Weekly Merges: Disaster!
140 engineers
First attempt at fixing
Bring in the horsepower
Internal Open Source Model
Splitting the Repository
Hmmmm… NuGets!
Problems
Repositories everywhere
2012 => 33 repositories
2013 => 160+ repositories
Spaghetti and NuGets
criteo.frontend
criteo.display
criteo.cookies
common.utils
criteo.display
criteo.urls
criteo.objects
criteo.rendering
criteo.bidding
criteo.serve
criteo.sql
criteo.data
criteo.banners
criteo.memcache
flash
criteo.biere
Repository Ownership
All repos to be owned by a team
Merge requests for repos you don’t own
Component Teams
Conway’s Vengeance
“Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.”
Teams dealt with unclear interfaces, and animosity grew.
Propagating changes
SOX compliance (an example)
Required updating the SDK (lowest level repository)
Program Manager full time for over one month
Some TLAs were 5 SDK versions behind
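To make the propagation cost concrete, here is a minimal sketch of what updating the lowest-level repository implies when every repository consumes the others as versioned packages. The repository names are borrowed from the slides, but the dependency edges, the `sdk` key and the `propagation_order` helper are assumptions made up for illustration, not the real graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical map: repository -> repositories it consumes as packages.
DEPENDS_ON = {
    "sdk": set(),
    "common.utils": {"sdk"},
    "criteo.data": {"common.utils"},
    "criteo.display": {"common.utils", "criteo.data"},
    "criteo.serve": {"criteo.display"},
}


def propagation_order(depends_on, changed):
    """List the repos to rebuild and republish after `changed`, dependencies first."""
    # Collect every repository that depends on `changed`, directly or transitively.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for repo, deps in depends_on.items():
            if repo not in affected and deps & affected:
                affected.add(repo)
                grew = True
    # Order only the affected repositories so each one is rebuilt after its inputs.
    subgraph = {repo: depends_on[repo] & affected for repo in affected}
    return list(TopologicalSorter(subgraph).static_order())


for step, repo in enumerate(propagation_order(DEPENDS_ON, "sdk"), start=1):
    print(f"{step}. bump, rebuild and republish {repo}")
```

In this toy graph a single SDK change already means five sequential releases, each owned by a different team. With 160+ repositories the chain gets long enough that some consumers stay several versions behind.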
Release problem
Code freeze during peak season:
Development continued
When code unfroze:
One month's worth of code to integrate
Took 5 months to release again
It wasn’t scaling!
Team had almost doubled
TLAs could take over a year to release next time
We could lose our ability to release forever
So why did we fail?
[Diagram: developers, owners of application; commits ae64ca3, 57d21a9, 4567a81, 9aad4cb, f478ff3, 47ac581, 71da3b5]
Integrate early
[Diagram: developers; commits 57d21a9, 4567a81, 9aad4cb, f478ff3, 47ac581, 71da3b5]
Trunk based development
TBD is a real pain...
“Branch by Abstraction” is a technique for making a large-scale change to a software system in gradual way that allows you to release the system regularly while the change is still in-progress.
Martin Fowler
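The technique quoted above can be sketched in a few lines. This is only an illustration: `CookieStore`, its two implementations, `make_cookie_store` and `render_banner` are hypothetical names and do not come from the talk. Callers depend on the abstraction only, so the new implementation can be merged into trunk and released while it is still switched off.

```python
from abc import ABC, abstractmethod


class CookieStore(ABC):
    """Abstraction layer (hypothetical): the only type callers know about."""

    @abstractmethod
    def get(self, user_id: str) -> str:
        ...


class LegacyCookieStore(CookieStore):
    """Existing behaviour, kept working on trunk during the migration."""

    def get(self, user_id: str) -> str:
        return f"legacy:{user_id}"


class NewCookieStore(CookieStore):
    """Replacement, built incrementally on trunk behind the same interface."""

    def get(self, user_id: str) -> str:
        return f"new:{user_id}"


def make_cookie_store(use_new: bool) -> CookieStore:
    """Single switch point; the flag could come from configuration."""
    return NewCookieStore() if use_new else LegacyCookieStore()


def render_banner(store: CookieStore, user_id: str) -> str:
    """Client code: unchanged whichever implementation is active."""
    return f"banner for {store.get(user_id)}"


print(render_banner(make_cookie_store(use_new=False), "u1"))
print(render_banner(make_cookie_store(use_new=True), "u1"))
```

Once every caller runs on the new implementation, the legacy class and the switch can be deleted in an ordinary commit, with no long-lived branch involved.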
But we love it
What did it take?
Build infrastructure
Virtual team for 6 months.
Afterwards, team became permanent.
Killing the NuGet lag
● Move every git repository to the new code review tool
● Revert to the last version in production
● Build master continuously and show progress
● Deploy in preproduction then production
Test plans progress
Our setup
in great (technical) details
Code reviews with Gerrit
MOAB
...
...
...
Building and publishing project 70783/70784 : XXX ... [OK]
Building and publishing project 70784/70784 : XXX.UTest... [OK]
70784 projects successfully built (100.0 %).
10:09:12 cbs assembly-set update --moabId 13998
assemblyset.json written.
10:09:20 cbs export
Build 6a306a43dd147cd6a6fcacbf40e20b25f3f69845 exported.
10:10:23 cbs assembly-set upload
Assemblyset pushed to [email protected]:qa/moab.git
10:10:31 cbs tag HEAD build/current
The MOAB
The MOAB pipeline
The sandboxes
a datacenter
a sandbox
How we deploy
datacenter
1. scp
2. bittorrent
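The slides do not say why the second step is peer-to-peer, but a back-of-the-envelope model shows one plausible motivation. The sketch below assumes an idealized swarm in which every server that already holds the build uploads it to exactly one more server per round; `rounds_to_reach` is a hypothetical helper, and the 5000 figure is the server count quoted at the start of the talk.

```python
def rounds_to_reach(servers: int, seeded: int = 1) -> int:
    """Rounds needed until `servers` hosts hold the build, assuming the number
    of holders doubles every round (idealized peer-to-peer model)."""
    if servers < 1 or seeded < 1:
        raise ValueError("servers and seeded must be positive")
    rounds = 0
    have = seeded
    while have < servers:
        have *= 2  # every current holder uploads to one new host
        rounds += 1
    return rounds


# One host seeded by scp, then peer-to-peer fan-out to 5000 servers.
print(rounds_to_reach(5000))  # 13 rounds, versus 4999 sequential copies from one source
```

Real swarms transfer chunks in parallel and are limited by network topology, so this is only an order-of-magnitude argument.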
How we deploy
Automatic metrics checks
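As an illustration of what an automatic metrics check can look like, here is a minimal sketch that compares metrics sampled before and after a deployment and rejects the rollout when any of them drops too far. The metric names, the 5% threshold and the `metrics_look_healthy` function are assumptions for the example, not the checks actually used.

```python
def metrics_look_healthy(baseline: dict, candidate: dict,
                         max_relative_drop: float = 0.05) -> bool:
    """Return False if any baseline metric is missing from `candidate`
    or has dropped by more than `max_relative_drop` (0.05 = 5%)."""
    for name, before in baseline.items():
        after = candidate.get(name)
        if after is None:
            return False  # a metric that disappears is treated as a failure
        if before > 0 and (before - after) / before > max_relative_drop:
            return False
    return True


# Hypothetical metrics sampled before and after deploying to one sandbox.
before = {"impressions_per_s": 1000.0, "clicks_per_s": 10.0}
after = {"impressions_per_s": 990.0, "clicks_per_s": 9.9}

if metrics_look_healthy(before, after):
    print("continue rollout")
else:
    print("stop and roll back")
```

A gate like this lets a deployment pipeline halt on its own instead of relying on someone watching dashboards.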
cost of change
number of devs *
Conclusion
* including 5% working full time on engineering tools
We’re not the only ones
Mandatory “we’re hiring” slide
Questions?