bol.com - what would you do if you could do it all over without getting fired

About us

Jos Houtman: - professional bit byter - 10 years on the job - [email protected]

Guido Bakker: - stubborn guy from Hoorn who next time will be on stage! - 15 years on the job and the technical orchestrator - @guido_bakker

Niels van de Wall: - responsible for IT operations - 15 years on the job - [email protected]

>6.700.000 products >12 categories >4.000.000 customers

>8.000.000 products >4.000.000 customers

>150 engineers >50 applications

>30 scrum teams


What happened under the hood last 2 years

start

platform and automation

ways of working

team

……

Build team


With passionate, experienced professionals who:
•  have a great feel for how to deal with risky situations
•  love to automate and structurally improve things based on measurements
•  take the lead and own it

and…..

have the right attitude!


The BAD:
•  Takes time to find the right people…. we've succeeded but next time…
•  New team, new ways of working, new platform…… initially takes more time and energy to fine-tune

The UGLY:
•  Pressure cooking… joiners needed to go through an aggressive ramp-up period!

The GOOD:
•  Got the right people just in time without concessions
•  To-be colleagues were observed on how they behave and deal with matters on the command line.
•  Building and running done by the same team
•  Ownership and focus
•  Automation mindset
•  Fun!



start

ways of working

team

platform and automation

……


Platform and automation

Principles:
•  single version of truth
•  no manual actions
•  if it isn't highly available, it's bad
•  set boundaries, be conditional
•  measure and monitor everything
•  manage all environments the same
•  only peer reviewed changes


Asset management with API

Goal: holds the truth of our infrastructure and is used during the whole lifetime of an asset.
•  provisioning: os, hostname, network, etc.
•  configuration: role
•  operation: state determines monitoring

visibility
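As a rough illustration of "state determines monitoring", the sketch below models an asset record in Python. The field names, the state values and the should_monitor helper are assumptions made for this example; the talk does not show the actual API or its schema.

```python
from dataclasses import dataclass

# Hypothetical lifecycle states; the talk does not list the real ones.
MONITORED_STATES = {"production"}


@dataclass
class Asset:
    """One record in the asset store (illustrative fields only)."""
    hostname: str   # used during provisioning
    os: str         # used during provisioning
    network: str    # used during provisioning
    role: str       # drives configuration
    state: str      # drives operation, e.g. monitoring


def should_monitor(asset: Asset) -> bool:
    """Return True when the asset's state calls for monitoring."""
    return asset.state in MONITORED_STATES


web = Asset("web001", "linux", "frontend", "webserver", "production")
new = Asset("web002", "linux", "frontend", "webserver", "provisioning")
print(should_monitor(web), should_monitor(new))
```

Keeping the decision in one function means monitoring follows the asset record, rather than being maintained in a second place.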


The GOOD:
•  Administration is up-to-date and enforced
•  Tie key components together with APIs/scripts/whatever
•  Changes are cheap
•  Strict naming scheme allows for easier automation.

The BAD:
•  It's a good start but needs more to it!
•  Majority of infrastructure information ended up in hiera.

The UGLY:
•  Need for a place to store infrastructure information
•  No CLI
•  Truth needs to be available



start

ways of working


……


Configuration management

Source: http://www.craigdunn.org/2012/05/239/


Config – hiera data

•  Hiera is suboptimal as a data source for complex information used by different modules / functionality

•  Solution: custom functions to retrieve only subsections of a hiera hash
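The idea of retrieving only a subsection of a large hash can be sketched in Python as follows. This is not the Puppet function from the talk; the hash_subsection helper, the dotted-path convention and the sample data are all assumptions made for illustration.

```python
def hash_subsection(data, path, default=None):
    """Walk a dotted path through nested dicts and return only that part.

    Returns `default` when any key along the path is missing.
    """
    node = data
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Hypothetical infrastructure hash, standing in for one large hiera key.
infrastructure = {
    "vlans": {
        "frontend": {"id": 10, "subnet": "10.0.10.0/24"},
        "backend": {"id": 20, "subnet": "10.0.20.0/24"},
    },
    "dns": {"servers": ["10.0.0.53"]},
}

# A module that only cares about the frontend VLAN asks for just that part.
print(hash_subsection(infrastructure, "vlans.frontend"))
```

Each consumer then depends only on the part of the data it needs, instead of on the shape of the whole hash.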


Config – deployments

•  Complete state is maintained, puppet installs releases.

•  Rundeck does orchestration of puppet runs, database deploys, restarts

•  More tomorrow by Steven Meunier


Config – monitoring

•  Exported resources to configure nagios checks.

•  checks defined on abstraction levels: role, os, etc.

•  then exported in the various classes of the profile layer
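Defining checks per abstraction level rather than per host can be modelled roughly as below. This Python sketch only illustrates the lookup; it does not use Puppet exported resources, and the CHECKS_BY_LEVEL table, the checks_for helper and the level values are assumptions, with check names chosen purely as examples.

```python
# Hypothetical mapping: abstraction level -> value -> checks to apply.
CHECKS_BY_LEVEL = {
    "os": {"linux": ["check_disk", "check_load"]},
    "role": {"webserver": ["check_http"]},
}


def checks_for(host):
    """Collect the checks a host gets from each abstraction level it belongs to."""
    checks = []
    for level, mapping in CHECKS_BY_LEVEL.items():
        value = host.get(level)
        for check in mapping.get(value, []):
            if check not in checks:  # avoid duplicates across levels
                checks.append(check)
    return checks


print(checks_for({"name": "web001", "os": "linux", "role": "webserver"}))
```

A new host with an existing role and OS picks up its checks without any per-host definition.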


The GOOD:
•  Define on abstraction levels, not individual systems
•  Monitoring, logging and metrics are an integral part of our profiles
•  No separate deployment needed after installation
•  2 hours from scratch to a fully working environment
•  Destroyed and rebuilt entire environments

The BAD:
•  Puppet(db) slow due to amount of resources
•  Prone to dependency hell

The UGLY:
•  Double administration necessary in hiera
•  Exported resources is the wrong choice for most problems



start

……


ways of working


Ways of working – next steps

Collaboration & shared responsibility

Continuous delivery


Technology