Puppet Camp Sydney Feb 2014 - A Build Engineering Team’s Journey of Infrastructure as Code

Post on 10-May-2015

DESCRIPTION

A Build Engineering Team’s Journey of Infrastructure as Code - the challenges that we’ve faced and the practices that we implemented as we went along our journey. 

TRANSCRIPT

Monday, 10 February 14

@peterleschev

Husband, Father of 3 & Atlassian

Build Engineering Team Lead

Peter Leschev

A Build Engineering Team’s Journey of Infrastructure as Code

Build Engineering today @ Atlassian

• Build platform & services used internally within the company
• 60k builds per month
• 35k automated tests for JIRA

Build Engineering today @ Atlassian

• 600 build agents (own hardware + EC2 instances)
• include SCM clients, JDKs, JVM build tools, databases, headless browser testing, python builds, NodeJS, installers & more
• Maintain 20 AMIs of various build configurations
• 6 Bamboo Servers
• maven.atlassian.com / 6 Nexus instances
• Monitoring - opsview / graphite / statsd

Infrastructure as Code = Puppet + SCM?

3 years ago...

• Manually maintained snowflakes
• Started using Puppet

Production rollout

[Diagram: puppetmaster and build agents]

Production rollout failure

[Diagram: puppetmaster and build agents]

Lifecycle of an infra change

[Chart: Confidence of Change, from NONE to HIGH, across the stages Dev, Rollout, Soak in Prod]

http://atlassian.com/git

https://bitbucket.org/

Style in Pull Requests

Puppet Lint
https://github.com/rodjek/puppet-lint
Tim Sharpe @rodjek

• Automated style checking
• Set up an automated build that runs checks & posts results
• Still need to implement a ratchet build
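The talk does not describe the ratchet build any further. One possible shape, as a minimal sketch: the build fails only when the number of puppet-lint warnings rises above the best count recorded so far, and the baseline tightens whenever the count drops. The function name `ratchet` and the warning counts below are made up for illustration, and how the count is extracted from puppet-lint output is left out.

```python
from typing import Tuple


def ratchet(current_warnings: int, baseline: int) -> Tuple[bool, int]:
    """Return (passed, new_baseline) for a ratchet-style lint check.

    Hypothetical helper: the build passes while the warning count does not
    exceed the stored baseline, and the baseline only ever moves down.
    """
    if current_warnings < 0 or baseline < 0:
        raise ValueError("warning counts must be non-negative")
    passed = current_warnings <= baseline
    new_baseline = min(current_warnings, baseline)
    return passed, new_baseline


if __name__ == "__main__":
    baseline = 40                   # made-up starting count
    for count in (40, 35, 37, 30):  # made-up counts from successive builds
        ok, baseline = ratchet(count, baseline)
        print(f"warnings={count} passed={ok} baseline={baseline}")
```

With a check like this, existing warnings do not block anyone, but a pull request can never make the warning count worse.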

Lifecycle of an infra change

[Chart: Confidence of Change, from NONE to HIGH, across the stages Dev, Code review, Rollout, Soak in Prod; series: initial + Code review]

Using Staging for Development

• Coding on Puppet Master
• Culture of manually modifying production - Configuration Drift
• Impact on Builds

[Diagram: puppetmaster, build agents, staging puppet environment]

Vagrant
http://www.vagrantup.com/
Mitchell Hashimoto @mitchellh

• Easily spin up Infrastructure locally on your laptop
• Disposable / reproducible environments
• Machine provisioning via VirtualBox / VMware / AWS
• Configuration applied via Shell Scripts / Puppet / Chef
• Develop and test infrastructure changes locally

Vagrant

[Diagram: Vagrantfile + vagrant basebox]

Vagrant

Spins up a local VM to a known state
Destroy the VM when done
Make some puppet changes and then run: [command not captured in transcript] to apply your changes
SSH into your VM using: [command not captured in transcript] to check your changes
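The commands shown on this slide did not survive in the transcript. Assuming they were the standard Vagrant CLI subcommands (`vagrant up`, `vagrant provision`, `vagrant ssh`, `vagrant destroy`), the edit-and-check loop could be scripted roughly as below. The `STEPS` table, the `run_step` helper and its `dry_run` flag are hypothetical names introduced for this sketch.

```python
import subprocess
from typing import Dict, List

# Assumed mapping of the slide's steps to standard Vagrant subcommands.
STEPS: Dict[str, List[str]] = {
    "spin up a local VM to a known state": ["vagrant", "up"],
    "apply your puppet changes": ["vagrant", "provision"],
    "check your changes": ["vagrant", "ssh"],
    "destroy the VM when done": ["vagrant", "destroy", "-f"],
}


def run_step(step: str, dry_run: bool = True) -> List[str]:
    """Return the command for a step; execute it only when dry_run is False."""
    cmd = STEPS[step]
    if not dry_run:
        # Must be run from the directory that contains the Vagrantfile.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    for step in STEPS:
        print(f"{step}: {' '.join(run_step(step))}")
```

The default is a dry run so that the sketch can be read and executed without Vagrant installed.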

Lifecycle of an infra change

[Chart: Confidence of Change, from NONE to HIGH, across the stages Dev, Code review, Rollout, Soak in Prod; series: initial + Code review + Vagrant]

Vagrant != Production

• Vagrant basebox differences with production machines
• Originally using publicly available vagrant baseboxes
• Installed packages were the biggest difference
• Generating a basebox manually was a painful process

Veewee
Automated Vagrant basebox generation
https://github.com/jedi4ever/veewee
Patrick Debois @patrickdebois

[Diagram: Ubuntu installation ISO + Veewee (definitions.rb, preseed.cfg, postinstall.sh) = vagrant basebox]

Basebox generation via CI

• Latest basebox generated in CI & published to fileshare
• No need to generate baseboxes locally

There are still differences!

• VirtualBox Guest additions
• Reduced to a minimum

Common Preseed / Postinstall

[Diagram: preseed.cfg + postinstall.sh shared by vagrant basebox, custom ISOs and PXEBoot]

Packer
http://packer.io
Mitchell Hashimoto @mitchellh

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee]

Developing locally

Rolling out to staging

Rolling out to production

Broken build agents!

Cucumber

• Behaviour Driven Development

Cucumber & Vagrant

[Diagram: Vagrant with a Custom Provisioner runs "puppet apply" and "cucumber *.features" via ssh on the VirtualBox VM]

Disadvantages

• Requires cucumber dependencies to be installed on tested VM
• Tests run within the VM making testing firewall rules harder

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee + Cukes]

“But it works on my machine!” – Every Developer

Continuous Integration

• ‘From scratch’ provisioning
• Confidence that you can rebuild in disaster

“The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them” – Tim Bell, CERN

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, CI & Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee + Cukes + CI]

Provisioning from scratch is slow

Spread out CI

Moved from sequential to parallel provisioning

[Diagram: provision VM1, VM2, VM3 and VM4 one after another (before) versus side by side (after)]
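The talk does not say how the parallel provisioning was wired into CI. As a minimal illustration of the sequential-to-parallel idea only, the sketch below fans a provisioning function out over several VMs with a thread pool. `provision_all` and `fake_provision` are hypothetical names, and the fake step just sleeps where a real one would provision a VM.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def provision_all(vms: Iterable[str],
                  provision: Callable[[str], bool],
                  max_workers: int = 4) -> Dict[str, bool]:
    """Run provision(vm) for every VM concurrently and collect the results."""
    names = list(vms)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(provision, names))
    return dict(zip(names, results))


def fake_provision(name: str) -> bool:
    """Placeholder for a real provisioning step; it only sleeps briefly."""
    time.sleep(0.1)
    return True


if __name__ == "__main__":
    start = time.time()
    outcome = provision_all(["VM1", "VM2", "VM3", "VM4"], fake_provision)
    print(outcome, f"{time.time() - start:.2f}s")
```

In a CI server the same effect is more likely achieved by running one job per VM on separate agents, which is also why the number of available machines becomes the limit.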

There are only so many Mac Pros you can steal

The ones I have my eye on....

Profiling Puppet Runs

Add “--evaltrace” to puppet apply

Collect and show the longest occurrences of: “Evaluated in ([\d\.]+) seconds”
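The collection step can be sketched as a small log filter. The regular expression is the one from the slide; the function name `slowest`, the `top` parameter and the sample log lines are assumptions made for this illustration, and the exact prefix Puppet prints before "Evaluated in" may differ from the samples.

```python
import re
from typing import Iterable, List, Tuple

# Pattern taken from the slide.
EVALUATED = re.compile(r"Evaluated in ([\d\.]+) seconds")


def slowest(lines: Iterable[str], top: int = 10) -> List[Tuple[float, str]]:
    """Return the `top` log lines with the longest evaluation times."""
    timings = []
    for line in lines:
        match = EVALUATED.search(line)
        if not match:
            continue
        try:
            seconds = float(match.group(1))
        except ValueError:
            continue  # e.g. a stray "1.2.3" that the loose pattern lets through
        timings.append((seconds, line.strip()))
    timings.sort(key=lambda item: item[0], reverse=True)
    return timings[:top]


if __name__ == "__main__":
    # Made-up sample lines standing in for captured `puppet apply --evaltrace` output.
    sample = [
        "Info: /Stage[main]/Jdk/Package[jdk]: Evaluated in 42.10 seconds",
        "Notice: Finished catalog run",
        "Info: /Stage[main]/Git/Package[git]: Evaluated in 3.75 seconds",
    ]
    for seconds, line in slowest(sample):
        print(f"{seconds:8.2f}s  {line}")
```

Feeding it the saved output of a full run gives a ranked list of the resources worth optimising first.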

Profiling Cucumber runs

http://itshouldbeuseful.wordpress.com/2010/11/10/find-your-slowest-running-cucumber-features/

Delta Provisioning

• Provision locally & for CI
• Faster & different class of problems found
• Matches production state

[Diagram: ‘from scratch’ provision: provision VM1, then on success export VM1 box to fileshare; delta provision: import VM1 box from fileshare, then provision VM1]
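Read this way, the two pipelines on the slide can be sketched as follows. This is an interpretation of the diagram rather than the team's implementation: the fileshare is modelled as a plain dictionary, and `from_scratch_run`, `delta_run` and the `Provision` callback type are hypothetical names.

```python
from typing import Callable, Dict, Optional

# A provisioning callback: takes the name of the starting image, returns success.
Provision = Callable[[str], bool]


def from_scratch_run(vm: str, basebox: str, provision: Provision,
                     fileshare: Dict[str, str]) -> bool:
    """Provision from the bare basebox; publish the resulting box only on success."""
    if not provision(basebox):
        return False
    fileshare[vm] = f"{vm}.box"  # stand-in for exporting the VM and uploading it
    return True


def delta_run(vm: str, provision: Provision,
              fileshare: Dict[str, str]) -> Optional[bool]:
    """Provision on top of the last published box; None if none is published yet."""
    box = fileshare.get(vm)
    if box is None:
        return None
    return provision(box)


if __name__ == "__main__":
    share: Dict[str, str] = {}
    print(from_scratch_run("VM1", "basebox", lambda image: True, share))
    print(delta_run("VM1", lambda image: True, share))
```

Because the delta run starts from a machine that already has the previous configuration applied, it exercises the same kind of upgrade that production machines go through.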

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, CI & Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI]

Infrequent Releases

Painful Puppet Rollouts

• Puppet runs impacted running builds
• Disabling all the build agents
• Performing the roll out
• git clone / librarian-puppet / symlink update on puppetmaster
• Manually kick off puppet on all the build agents
• Enabling all the build agents
• Set of Puppet environments for every bamboo server

Graceful Service restarts

Bamboo Agent JVM process watches for a touch file & shuts down when idle (written as a Bamboo Plugin)
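The real implementation is a Bamboo plugin running inside the agent JVM and is not shown in the talk. Purely to illustrate the touch-file pattern, here is a minimal polling sketch; `should_shut_down`, `watch`, the polling interval and the `max_polls` limit are all hypothetical.

```python
import os
import tempfile
import time
from typing import Callable, Optional


def should_shut_down(touch_file: str, is_idle: Callable[[], bool]) -> bool:
    """Shut down only when a restart was requested AND no build is running."""
    return os.path.exists(touch_file) and is_idle()


def watch(touch_file: str,
          is_idle: Callable[[], bool],
          shut_down: Callable[[], None],
          poll_seconds: float = 5.0,
          max_polls: Optional[int] = None) -> bool:
    """Poll for the touch file; call shut_down() once the agent is idle."""
    polls = 0
    while max_polls is None or polls < max_polls:
        if should_shut_down(touch_file, is_idle):
            shut_down()
            return True
        time.sleep(poll_seconds)
        polls += 1
    return False


if __name__ == "__main__":
    marker = os.path.join(tempfile.mkdtemp(), "restart-requested")
    open(marker, "w").close()  # e.g. dropped by a configuration run
    watch(marker, is_idle=lambda: True,
          shut_down=lambda: print("shutting down"),
          poll_seconds=0.1, max_polls=3)
```

The point of the pattern is that the configuration run only requests a restart, and the agent decides when it is safe, so running builds are not interrupted.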

Puppet Environments

• BEFORE - Multiple puppet envs for each Bamboo Server
  • jbac_staging
  • jbac_production
  • cbac_staging
  • cbac_production
  • etc
• AFTER - Changed to use ‘staging’ & ‘production’ only

Updates on Puppetmaster

• BEFORE: Manually on puppetmaster
  • git clone the puppet tree
  • run librarian-puppet to pull external modules
  • Update staging / production symlink
• AFTER: Bamboo build which performs the above steps automatically
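The three manual steps on this slide could be automated along these lines. This is a sketch under assumptions, not the team's Bamboo build: the function names, paths and repository URL are placeholders, `librarian-puppet install` is assumed to be the command used to pull modules, and the symlink swap relies on POSIX rename semantics.

```python
import os
import subprocess
import tempfile


def switch_symlink(link_path: str, target_dir: str) -> None:
    """Point link_path at target_dir by renaming a temporary link over it."""
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target_dir, tmp)
    os.replace(tmp, link_path)  # atomic on POSIX, so readers never see a gap


def rollout(repo_url: str, checkout_dir: str, link_path: str) -> None:
    """git clone the puppet tree, pull external modules, then flip the symlink."""
    subprocess.run(["git", "clone", repo_url, checkout_dir], check=True)
    subprocess.run(["librarian-puppet", "install"], cwd=checkout_dir, check=True)
    switch_symlink(link_path, checkout_dir)


if __name__ == "__main__":
    # Demonstrates only the symlink flip, using throwaway directories.
    root = tempfile.mkdtemp()
    old, new = os.path.join(root, "tree-1"), os.path.join(root, "tree-2")
    os.mkdir(old)
    os.mkdir(new)
    link = os.path.join(root, "production")
    switch_symlink(link, old)
    switch_symlink(link, new)
    print(os.readlink(link))
```

Flipping the symlink last means a failed clone or module install leaves the currently served tree untouched.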

Less Human interaction + More automation = Higher Confidence

Less Human Effort = Increased frequency of releases

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, CI & Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases]

“Should I be scared?” – Peter Leschev, 3 months ago

“I’m scared!” – Peter Leschev, 3 years ago

Hipchat integration

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, CI & Rollout, Soak in Prod; series: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases + Notification]

Lifecycle of an infra change

[Chart: Confidence in Change, from NONE to HIGH, across the stages Dev, Code review, CI & Rollout, Soak in Prod; series: before, after]

Confidence in Change

or

Finding & fixing problems sooner rather than later

Commit Graph

Snowflakes

Pets

Cattle

Stateless Machines

We’re still on the Journey

Come join us!

atlassian.com/jobs

Questions?

Thank you!
