saltconf14 - brendan burns, google - management at google scale

Management at Google Scale: Converging managed infrastructure between Google and the Cloud community. Brendan Burns, Staff Software Engineer

Upload: saltstack

Post on 11-May-2015


DESCRIPTION

As a leading developer of highly scalable, large-scale Web services, Google was forced early on to develop systems to support the deployment and management of diverse workloads at an immense scale. As the broader developer community embraces cloud technologies we see significant parallels between the internal management infrastructure which Google has built over the last decade, and open source management technologies of today. This talk will describe Google's experience in managing large-scale compute services, draw parallels to open source efforts underway today, and sketch out how our past experience shapes our future development of the Google Cloud Platform.

TRANSCRIPT

Page 1: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Google confidential │ Do not distribute

Management at Google Scale

Converging managed infrastructure between Google and the Cloud community

Brendan Burns

Staff Software Engineer

Page 2: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Google Cloud Platform

Storage: Cloud Storage, Cloud SQL, Cloud Datastore

Compute: Compute Engine, App Engine

App Services: BigQuery, Cloud Endpoints

Page 3: SaltConf14 - Brendan Burns, Google - Management at Google Scale

For the past 15 years, Google has been building the fastest, most powerful, highest-quality cloud infrastructure on the planet.

Images by Connie Zhou

Page 4: SaltConf14 - Brendan Burns, Google - Management at Google Scale

We’ve had some practice

Page 5: SaltConf14 - Brendan Burns, Google - Management at Google Scale

So, what have we learned?

• Declarative Management for sanity

• Containers for idempotency and reproducibility

• Task Introspection (or how I learned to forget about SSH)

Page 6: SaltConf14 - Brendan Burns, Google - Management at Google Scale

A view into my life

• Google engineer for 6 years

• Search Infrastructure (Realtime Search, Google+ Search …)

• Cloud Infrastructure

• Build software to expect failure

• Never had [email protected], despite web search oncall for 4+ years

Page 7: SaltConf14 - Brendan Burns, Google - Management at Google Scale

So, what have we learned?

• Declarative Management for sanity

• Containers for idempotency and reproducibility

• Task Introspection (or how I learned to forget about SSH)

Page 8: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Declarative Management

Imperative management leads to “Snowflake” servers

Page 9: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Declarative Management

Separate the textual declaration from its physical (or virtual) manifestation
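The separation described on this slide can be illustrated with a minimal sketch: a declaration is plain data, and a planner compares it with the observed state to work out what must change. This is not Google's internal tooling or any real library. The names `Action`, `plan`, and `converge`, and the resource dictionaries, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    verb: str  # "create", "update", or "delete"
    name: str  # hypothetical resource name


def plan(declared: dict[str, dict], observed: dict[str, dict]) -> list[Action]:
    """Return the actions that would move `observed` to match `declared`."""
    actions = []
    for name in sorted(declared):
        if name not in observed:
            actions.append(Action("create", name))
        elif observed[name] != declared[name]:
            actions.append(Action("update", name))
    for name in sorted(observed):
        if name not in declared:
            actions.append(Action("delete", name))
    return actions


def converge(declared: dict[str, dict], observed: dict[str, dict]) -> dict[str, dict]:
    """Apply the plan to a copy of `observed` and return the resulting state."""
    result = dict(observed)
    for action in plan(declared, observed):
        if action.verb == "delete":
            del result[action.name]
        else:
            result[action.name] = dict(declared[action.name])
    return result


if __name__ == "__main__":
    declared = {"web": {"replicas": 3}, "db": {"replicas": 1}}
    observed = {"web": {"replicas": 2}, "cache": {"replicas": 1}}
    for action in plan(declared, observed):
        print(action.verb, action.name)
```

Because the planner only looks at the difference between the two states, running `converge` a second time on its own output yields an empty plan. That is the property that keeps servers from drifting into "snowflakes".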

Page 10: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Declarative Management

Reasoning in a formal declaration (and version control) unlocks tremendous potential

Page 11: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Declarative Management

Declarative configurations facilitate re-use

Page 12: SaltConf14 - Brendan Burns, Google - Management at Google Scale

So, what have we learned?

• Declarative Management for sanity

• Containers for idempotency and reproducibility

• Task Introspection (or how I learned to forget about SSH)

Page 13: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Containers

Google has a long history with containers (Process CGroups, LMCTFY [https://github.com/google/lmctfy])

Of late, there has been a great deal of external interest as well.

Page 14: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Containers

Google has a long history with containers (Process CGroups, LMCTFY [https://github.com/google/lmctfy])

What are containers good for?

Page 15: SaltConf14 - Brendan Burns, Google - Management at Google Scale

So, what have we learned?

• Declarative Management for sanity

• Containers for idempotency and reproducibility

• Task Introspection (or how I learned to forget about SSH)

Page 16: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Introspection (or how I learned to forget about SSH)

Containers don’t really have SSH (well, they can, but…)

Still want containers to be self-contained

Page 17: SaltConf14 - Brendan Burns, Google - Management at Google Scale

?

?

There’s an exciting road ahead...

Page 18: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Resources/Contact

[email protected]

Eric Johnson’s talk

I’ll be at the Google booth

Walk up and say “Hi”

Page 19: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Thomas Hatch, SaltStack CTO

Page 20: SaltConf14 - Brendan Burns, Google - Management at Google Scale

The Top Six Things You Didn’t Know About SaltStack

Page 21: SaltConf14 - Brendan Burns, Google - Management at Google Scale

1. Fast, flexible comms protocol

• SaltStack provides options
• Different solutions for different problems
• Flexibility and plug-ability
• ØMQ
  – Super fast
• SSH
  – For certain use cases
  – 50x faster than other SSH-based tools
• RAET
  – UDP or TCP
  – Even faster
  – More control over job queuing and prioritization
  – More infrastructure visibility

Page 22: SaltConf14 - Brendan Burns, Google - Management at Google Scale

2. Salt Virt

• Doesn’t get much attention
• Salt originally designed as a cloud controller (Butter)
• A completely different approach to cloud management
  – Database free
  – Evolving but being used in production

Page 23: SaltConf14 - Brendan Burns, Google - Management at Google Scale

3. Declarative or imperative? Yes.

• Stick a fork in this debate
• Most flexible configuration management
• Finite order execution is a core Salt design principle
• 0.17 introduced more state ordering choice
• Compiler and run time
  – Salt modularity
  – No sacrifice or compromise of speed
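One way to picture how a fixed, predictable execution order can coexist with declared dependencies is the sketch below. It is not Salt's compiler and makes no claim about Salt's internals. The function `order_states` and the state IDs are hypothetical. The sketch assumes states run in the order they were declared, except that anything a state requires is pulled ahead of it.

```python
def order_states(states: dict[str, list[str]]) -> list[str]:
    """Order state IDs by declaration order, running requirements first.

    `states` maps a state ID to the IDs it requires; dict insertion order
    stands in for the order of declarations in a file.
    """
    ordered: list[str] = []
    done: set[str] = set()
    in_progress: set[str] = set()

    def visit(state_id: str) -> None:
        if state_id in done:
            return
        if state_id in in_progress:
            raise ValueError(f"circular requirement involving {state_id!r}")
        if state_id not in states:
            raise KeyError(f"unknown state {state_id!r}")
        in_progress.add(state_id)
        for required in states[state_id]:
            visit(required)  # requirements always run before the state itself
        in_progress.discard(state_id)
        done.add(state_id)
        ordered.append(state_id)

    for state_id in states:  # otherwise fall back to declaration order
        visit(state_id)
    return ordered


if __name__ == "__main__":
    declared = {
        "nginx_service": ["nginx_pkg", "nginx_conf"],
        "nginx_pkg": [],
        "nginx_conf": ["nginx_pkg"],
        "motd": [],
    }
    print(order_states(declared))
```

The same input always produces the same order, so a run is reproducible whether the author thinks of the file as a declaration or as a script.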

Page 24: SaltConf14 - Brendan Burns, Google - Management at Google Scale

4. Generic device automation

• Minion proxy for network devices (Juniper, Arista, Broadcom, F5, etc.)
• Not just executing CM routines
• Finite device control w/ remote execution
• Easy to communicate with and control these typically dumb devices
• Stateful configuration and one-off queries
• Integrated with standard Salt workflows and methodologies

Page 25: SaltConf14 - Brendan Burns, Google - Management at Google Scale

5. The Salt test suite

• More stable Salt releases
• Pedro Algarvio!
• Running live tests constantly on real infra
  – Jenkins
  – Spinning up VMs on Rackspace to run tests
  – Hooked into Docker containers
• PyLint coverage (thx Hulu & LogiLab)
• Test coverage doubled in three months

Page 26: SaltConf14 - Brendan Burns, Google - Management at Google Scale

6. The SaltStack name

• Not SLC
• FLOSS Weekly realization
• Gimli, son of Gloin
• Ubiquitous nature of Salt

Page 27: SaltConf14 - Brendan Burns, Google - Management at Google Scale

Thank You