delivery at scale

Post on 07-Jul-2015

Category: Engineering

DESCRIPTION

Have you ever wondered how large software companies with an engineering culture make sure they can deliver software to production over and over? How do you coordinate 100+ software engineers so that there are no bottlenecks and quality is not compromised? In this talk you will see how a Continuous Delivery system was implemented at Criteo, the fastest-growing IT company in EMEA in 2012.

Before the project started there were 160+ code repositories in dependency hell. They were built independently, and releases to production were error-prone and painful.

You will see the technical architecture behind a successful implementation of a Continuous Delivery system. The system was made up of a Gerrit code review tool connected to a Jenkins build pipeline, building 160 repositories with over 7M lines of code. We will explore different architectural choices, such as the branching system, hot fixes, and sandbox and pre-production environments, and how these were developed and used by the large R&D department.

Authors: Adrian Perreau de Pinninck, Manu Cupcic

TRANSCRIPT

Delivery at Scale

Manu Cupcic and Adrian Perreau

Scale

● > 38 PB on our HDFS

● > 1 billion ad impressions per day

● > 5000 servers worldwide

[Graph: cost of change vs. number of devs]

The problem

Once upon a time

Single C# Repository

Build on local machines

Continuous Integration??

Running applications

No Testing Culture

Weekly Merges

20 engineers

50 engineers

Weekly Merges: Disaster!

140 engineers

First attempt at fixing

Bring in the horse power

Internal Open Source Model

Splitting the Repository

Hmmmm… NuGets!

Problems

Repositories everywhere

2012 => 33 repositories

2013 => 160+ repositories

Spaghetti and NuGets

[Dependency diagram of NuGet packages: criteo.frontend, criteo.display, criteo.cookies, common.utils, criteo.urls, criteo.objects, criteo.rendering, criteo.bidding, criteo.serve, criteo.sql, criteo.data, criteo.banners, criteo.memcache, flash, criteo.biere]

Repository Ownership

● All repos to be owned by a team

● Merge requests for repos you don’t own

Component Teams

Conway’s Vengeance

“Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.”

Teams dealt with unclear interfaces, and animosity grew stronger.

Propagating changes

SOX compliance (an example)

● Required updating the SDK (lowest level repository)

● Program Manager full time for over one month

● Some TLAs were 5 SDK versions behind
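Why a change to the lowest-level repository was so expensive to propagate can be illustrated with a small sketch. The package names below come from the slides, but the dependency edges between them and the name `sdk` for the lowest-level repository are assumptions made for illustration. The sketch lists every repository that has to be rebuilt and re-released after `sdk` changes, ordered so that each one comes after its dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency edges: package -> packages it depends on.
# Package names appear in the slides; the edges and "sdk" are assumptions.
DEPENDS_ON = {
    "sdk": set(),
    "common.utils": {"sdk"},
    "criteo.memcache": {"common.utils"},
    "criteo.sql": {"common.utils"},
    "criteo.data": {"criteo.sql", "criteo.memcache"},
    "criteo.bidding": {"criteo.data"},
    "criteo.frontend": {"criteo.bidding", "common.utils"},
}


def affected_by(changed, depends_on):
    """Return `changed` plus every package that depends on it, directly or transitively."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for package, deps in depends_on.items():
            if package not in affected and deps & affected:
                affected.add(package)
                grew = True
    return affected


def rebuild_order(changed, depends_on):
    """List the affected packages so that each one comes after its dependencies."""
    affected = affected_by(changed, depends_on)
    # Restrict the graph to the affected packages only.
    subgraph = {p: depends_on[p] & affected for p in affected}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    for step, package in enumerate(rebuild_order("sdk", DEPENDS_ON), start=1):
        print(step, package)
```

With one published package per repository, each entry in this order is a separate publish-and-upgrade step, which is one plausible reading of why the update on the slide needed a month of full-time coordination.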

Release problem

Code freeze during peak season:

● Development continued

When code unfroze:

● One month's worth of code to integrate

● Took 5 months to release again

It wasn’t scaling!

Team had almost doubled

TLAs could take over a year to release next time

We could lose our ability to release forever

So why did we fail?

[Diagram: developers and owners of application, with commits ae64ca3, 57d21a9, 4567a81, 9aad4cb, f478ff3, 47ac581, 71da3b5]

Integrate early

[Diagram: developers, with commits 57d21a9, 4567a81, 9aad4cb, f478ff3, 47ac581, 71da3b5]

Trunk based development

TBD is a real pain...

“Branch by Abstraction” is a technique for making a large-scale change to a software system in gradual way that allows you to release the system regularly while the change is still in-progress.

Martin Fowler
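A minimal sketch of the technique, using hypothetical class names that do not come from the talk: callers depend only on an abstraction, the old and new implementations live side by side on trunk, and a single flag decides which one is used until the migration is finished.

```python
from abc import ABC, abstractmethod


class BannerStore(ABC):
    """Abstraction layer introduced in front of the code being replaced."""

    @abstractmethod
    def load(self, banner_id: int) -> str:
        ...


class LegacyBannerStore(BannerStore):
    """Existing implementation; stays the default while the change is in progress."""

    def load(self, banner_id: int) -> str:
        return f"legacy-banner-{banner_id}"


class NewBannerStore(BannerStore):
    """Replacement, merged to trunk incrementally behind the flag."""

    def load(self, banner_id: int) -> str:
        return f"new-banner-{banner_id}"


def make_banner_store(use_new_store: bool) -> BannerStore:
    """Single switch point; callers never name a concrete class."""
    return NewBannerStore() if use_new_store else LegacyBannerStore()


def render(store: BannerStore, banner_id: int) -> str:
    """Client code written only against the abstraction."""
    return f"<div>{store.load(banner_id)}</div>"


if __name__ == "__main__":
    print(render(make_banner_store(use_new_store=False), 42))
    print(render(make_banner_store(use_new_store=True), 42))
```

Because the unfinished implementation is unreachable while the flag is off, every commit can land on trunk and still be released, which is what makes the technique compatible with trunk based development.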

But we love it

What did it take?

Build infrastructure

Virtual team for 6 months. Afterwards, the team became permanent.

Killing the NuGet lag

● Move every git repository to the new code review tool

● Revert to the last version in production

● Build master continuously and show progress

● Deploy in preproduction then production

Test plans progress

Our setup in great (technical) detail

Code reviews with Gerrit

MOAB

...

...

...

Building and publishing project 70783/70784 : XXX ... [OK]

Building and publishing project 70784/70784 : XXX.UTest... [OK]

70784 projects successfully built (100.0 %).

10:09:12 cbs assembly-set update --moabId 13998

assemblyset.json written.

10:09:20 cbs export

Build 6a306a43dd147cd6a6fcacbf40e20b25f3f69845 exported.

10:10:23 cbs assembly-set upload

Assemblyset pushed to git@git.corp.criteo.com:qa/moab.git

10:10:31 cbs tag HEAD build/current

The MOAB

The MOAB pipeline

The sandboxes

[Diagram labels: a datacenter, a sandbox]

How we deploy

[Diagram label: datacenter]

1. scp

2. BitTorrent

Automatic metrics checks
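The slides do not say which metrics are checked or how. As a sketch only, with assumed metric names and an assumed 10% tolerance, a check of this kind might compare metrics sampled after a deployment against a baseline and flag the deployment when any of them degrades too much.

```python
def degraded_metrics(baseline, current, tolerance=0.10, higher_is_better=frozenset()):
    """Return the names of metrics that worsened by more than `tolerance` (a fraction).

    Metrics are treated as "lower is better" (latency, error rate) unless they are
    listed in `higher_is_better`. A metric missing from `current` counts as degraded.
    """
    failed = []
    for name, base in baseline.items():
        value = current.get(name)
        if value is None:
            failed.append(name)
            continue
        if name in higher_is_better:
            worse = value < base * (1 - tolerance)
        else:
            worse = value > base * (1 + tolerance)
        if worse:
            failed.append(name)
    return sorted(failed)


def deployment_is_healthy(baseline, current, **options):
    """True when no metric degraded beyond the tolerance."""
    return not degraded_metrics(baseline, current, **options)


if __name__ == "__main__":
    # Hypothetical metric names and values, for illustration only.
    before = {"latency_ms": 20.0, "error_rate": 0.01, "requests_per_s": 1000.0}
    after = {"latency_ms": 21.0, "error_rate": 0.05, "requests_per_s": 990.0}
    print(degraded_metrics(before, after, higher_is_better={"requests_per_s"}))
```

A deployment tool could run such a check after each step of a rollout and stop or roll back when it fails; whether Criteo's checks worked this way is not stated in the slides.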

Conclusion

[Graph: cost of change vs. number of devs *]

* including 5% working full time on engineering tools

We’re not the only ones

Mandatory “we’re hiring” slide

Questions?
