building a distributed build system at google scale (strangeloop 2016)

64
Building a Distributed Build System at Scale Aysylu Greenberg September 17, 2016

Upload: aysylu-greenberg

Post on 19-Jan-2017

505 views

Category:

Software


1 download

TRANSCRIPT

Page 1: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building a Distributed Build System

at ScaleAysylu Greenberg

September 17, 2016

Page 2: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Aysylu Greenberg

@aysylu22

Page 3: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 4: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 6: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Build Scenario

Project with dependencies

Makefilemain.o : main.c defs.h cc -c main.cmain.c : curl –O http://remote/main.c

Page 7: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Build Scenario

Project with dependencies

Find dependenciesMakefilemain.o : main.c defs.h cc -c main.cmain.c: curl –O http://remote/main.c

Page 8: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Build Scenario

Project with dependencies

Find dependencies

Build project with the dependencies

Download build artifacts

Page 9: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Test Scenario

Project with dependencies

Find dependencies

Build project with the dependencies

Download build artifacts Run the test

Get the results of the test

Page 10: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 11: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 12: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices

Page 13: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day

Page 14: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC

Page 15: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit

Page 16: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit• Petabytes of output artifacts

Page 17: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Scale• Engineers: >30,000 developers in 40+ offices• Commits: 15K by humans + 30K by robots/day• Source code: 2 billion LOC• Builds and tests: 5M per day through BuildRabbit• Petabytes of output artifacts• 1 repository

Page 20: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history

Page 21: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history• Extensive code sharing & reuse

• Everything can be cross-referenced• Large-scale refactoring for codebase modernization• Easier collaboration across teams

Page 22: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management

• No confusion about authoritative version of the file• No forking of shared libraries• Components for library releases = Git subtree or Git

subcomponents to separate release from WIP versions

Page 23: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source

Page 24: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source• Optimizations to avoid compiling same artifacts

Page 25: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Working in One Repository • Linear revision history• Extensive code sharing & reuse• Simplified dependency management• Predictable, repeatable builds from source• Optimizations to avoid compiling same artifacts• Decouple each team's processes as much as

possible

Page 26: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 27: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 29: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 30: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 31: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 32: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 33: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 34: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Towards Distributed Build System

Page 35: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

BuildRabbit:Distributed Build System

Page 36: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

BuildRabbit: Distributed Build System

IN THE CLOUD

Page 37: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

What does BuildRabbit do?

Page 38: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Continuous integration

system

BuildRabbit

Source System

Google engineers &

teams

Release infrastructure

Integration testing

infrastructure

Build artifact storage

Blaze

Page 39: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 40: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building Distributed Build System

at Google Scale

Page 41: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

From Push to Pull: Client / Server

Client forUser

BuildRabbitScheduler

BuildRabbitWorker

Page 42: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

From Push to Pull: Build Service

User

Persistent Queue

BuildRabbit Worker

Build ArtifactsBuild Progress

Info

eventRPCstream

Page 43: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

From Push to Pull: Build Service

User

Persistent Queue

BuildRabbit Worker

Build ArtifactsBuild Progress

Info

eventRPCstream

Page 44: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

Page 45: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

Page 46: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

RESILIENCE LOGIC

RESILIENCE LOGIC

Page 47: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Continuous integration

system

BuildRabbit

Source System

Google engineers &

teams

Release infrastructure

Integration testing

infrastructure

Build artifact storage

Blaze

Page 48: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Migration with Zero DowntimePhoto by moonjazz / CC BY SA

Page 49: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

MIGRATE BACKENDS FIRST

Page 50: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

Page 51: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

FOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 52: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

TARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 53: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

Page 54: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

DECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 55: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Client for

User

BuildRabbitScheduler

BuildRabbitWorker

User

Persistent Queue

BuildRabbit Worker

Build Progress InfoBuild Artifacts

Page 56: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

MAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 57: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

PRACTICE THE LAUNCHMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 58: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

PRACTICE THE LAUNCH, A LOTMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODEMIGRATE BACKENDS FIRST

Page 59: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

HAVE A SOLID ROLLBACK PLANPRACTICE THE LAUNCH, A LOTMAXIMUM VISIBILITY INTO SYSTEM STATEDECOUPLE LAUNCH OF SERVICESTARGET LAUNCH-FRIENDLY CLIENTSFOCUS ON MIXED MODE

Page 62: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Distributed Design Patterns: Distributed Control

Design implementation is scaled from 1 machine

Bigger overhead during evolution

Page 63: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Distributed Design Patterns: Centralized Control

Distributed design from the start

Bigger overhead during design phase

Page 64: Building A Distributed Build System at Google Scale (StrangeLoop 2016)

Building a Distributed Build System

at ScaleAysylu Greenberg

@aysylu22