building a distributed build system at google scale (strangeloop 2016)

Post on 19-Jan-2017

505 Views

Category:

Software

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Building a Distributed Build System

at ScaleAysylu Greenberg

September 17, 2016

Aysylu Greenberg

@aysylu22

Building Distributed Build System

at Google Scale

Building Distributed Build System

at Google Scale

Build Scenario

Project with dependencies

Makefilemain.o : main.c defs.h cc -c main.cmain.c : curl –O http://remote/main.c

Build Scenario

Project with dependencies

Find dependenciesMakefilemain.o : main.c defs.h cc -c main.cmain.c: curl –O http://remote/main.c

Build Scenario

Project with dependencies

Find dependencies

Build project with the dependencies

Download build artifacts

Test Scenario

Project with dependencies

Find dependencies

Build project with the dependencies

Download build artifacts Run the test

Get the results of the test

Building Distributed Build System

at Google Scale

Building Distributed Build System

at Google Scale

Scale• Engineers: >30,000 developers in 40+ offices

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit• Petabytes of output artifacts

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit• Petabytes of output artifacts• 1 repository

Working in One Repository • Linear revision history

Working in One Repository • Linear revision history• Extensive code sharing & reuse

• Everything can be cross-referenced• Large-scale refactoring for codebase modernization• Easier collaboration across teams

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management

• No confusion about authoritative version of the file• No forking of shared libraries• Components for library releases = Git subtree or Git

subcomponents to separate release from WIP versions

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source• Optimizations to avoid compiling same artifacts

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source• Optimizations to avoid compiling same artifacts• Decouple each team's processes as much as

possible

Building Distributed Build System

at Google Scale

Building Distributed Build System

at Google Scale

Towards Distributed Build System

Towards Distributed Build System

Towards Distributed Build System

Towards Distributed Build System

Towards Distributed Build System

Towards Distributed Build System

BuildRabbit:Distributed Build System

BuildRabbit: Distributed Build System

IN THE CLOUD

What does BuildRabbit do?

Continuous integration

system

BuildRabbit

Source System

Google engineers &

teams

Release infrastructure

Integration testing

infrastructure

Build artifact storage

Blaze

Building Distributed Build System

at Google Scale

Building Distributed Build System

at Google Scale

From Push to Pull: Client / Server

Client forUser

BuildRabbitScheduler

BuildRabbitWorker

From Push to Pull: Build Service

User

Persistent Queue

BuildRabbit Worker

Build ArtifactsBuild Progress

Info

eventRPCstream

From Push to Pull: Build Service

User

Persistent Queue

BuildRabbit Worker

Build ArtifactsBuild Progress

Info

eventRPCstream

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

RESILIENCE LOGIC

RESILIENCE LOGIC

Continuous integration

system

BuildRabbit

Source System

Google engineers &

teams

Release infrastructure

Integration testing

infrastructure

Build artifact storage

Blaze

Migration with Zero DowntimePhoto by moonjazz / CC BY SA

MIGRATE BACKENDS FIRST

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

FOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

TARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

DECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

MAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

PRACTICE THE LAUNCHMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

PRACTICE THE LAUNCH, A LOTMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

HAVE A SOLID ROLLBACK PLANPRACTICE THE LAUNCH, A LOTMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODE

Distributed Design Patterns: Distributed Control

Design implementation is scaled from 1 machine

Bigger overhead during evolution

Distributed Design Patterns: Centralized Control

Distributed design from the start

Bigger overhead during design phase

Building a Distributed Build System

at ScaleAysylu Greenberg

@aysylu22

top related