Puppet Camp Berlin 2015: Andrea Giardini | Configuration Management @ CERN: Going Agile with Style

Post on 15-Jul-2015

Configuration Management @CERN

Going Agile with Style

Andrea Giardini

CERN

andrea.giardini@cern.ch

24 April 2015 - PuppetCamp Berlin

Configuration Management @ CERN 2

Outline

Introduction: What is CERN; Datacenters Overview

Puppet @ CERN: Current Infrastructure; Modules, Hostgroups and Environments

Configuration Management: Managing Changes; Tools

Conclusions

What is CERN

- European Organization for Nuclear Research

- Situated on the border between Switzerland and France

- 21 Member states

- Big challenges

Big Challenges - The FCC

The LHC

The Detectors

Data Flow

Datacenters in Numbers

Two datacenters:

- Budapest

- Geneva

Two dedicated links:

- 2 x 100 Gbps

The number of resources is growing year by year. As of today:

- 15k servers

- 100 PB on tape

- 200 PB on disk

Going Agile

Requirements started to grow

- An agile approach was needed

A few years ago we started using OpenStack to deploy virtual machines for our users and Puppet to configure the services.

Our Setup

We started using Puppet a few years ago and, since then, things have evolved a lot...

We changed the configuration of our Puppet masters several times in order to keep up with the requests, and we found out that:

- Puppet scales horizontally quite well

- The NFS filer underneath... does not

NFS is used to share configurations and Puppet code between different masters. All the masters used to mount the same shared folder...

Clusters and Pools

- Catalog compilation time ∼ 90 sec

- ∼ 180 catalogs / minute

- ∼ 17k Puppet hosts

- Batch ∼ 300 cores

- Interactive ∼ 12 cores
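
As a rough sanity check, the figures above can be combined with Little's law (work in flight = throughput x time per item). The sketch below assumes that each compilation keeps roughly one core busy for its whole duration and that every host needs one catalog per cycle; neither assumption is stated in the slides.

```python
# Back-of-the-envelope check of the figures above.
# Assumptions (not from the talk): one compilation keeps one core busy
# for its whole duration, and each host requests one catalog per cycle.

COMPILE_SECONDS = 90        # ~90 sec per catalog
CATALOGS_PER_MINUTE = 180   # ~180 catalogs / minute
PUPPET_HOSTS = 17_000       # ~17k Puppet hosts


def compilations_in_flight(catalogs_per_minute: float, compile_seconds: float) -> float:
    """Little's law: items in flight = arrival rate * time per item."""
    return catalogs_per_minute / 60 * compile_seconds


def minutes_to_serve_all(hosts: int, catalogs_per_minute: float) -> float:
    """Time for every host to receive one catalog at the given rate."""
    return hosts / catalogs_per_minute


in_flight = compilations_in_flight(CATALOGS_PER_MINUTE, COMPILE_SECONDS)
full_cycle = minutes_to_serve_all(PUPPET_HOSTS, CATALOGS_PER_MINUTE)

print(f"concurrent compilations: ~{in_flight:.0f}")
print(f"minutes to serve every host once: ~{full_cycle:.0f}")
```

Under these assumptions about 270 compilations run at the same time, which is the same order of magnitude as the ∼ 300 batch cores, and one full pass over all hosts takes roughly an hour and a half.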

Few concepts

- Modules (∼ 280): The various modules available should be viewed as a library that your hostgroup code can reuse.

- Hostgroups (∼ 160): Groups of nodes that are part of the same service and have some configurations in common.

- Environments (∼ 180): Collections of modules and hostgroups at different development levels.

Environments allow us to . . .

Environment "production" → all modules/hostgroups from the "master" branch

Environment "qa" → all modules/hostgroups from the "qa" branch

Custom environments (for testing purposes):

- Possibility to set a default branch

- Specify a specific branch for one or more modules/hostgroups
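
The branch selection described above can be pictured as a lookup with a fallback: an explicit per-module or per-hostgroup override wins, otherwise the environment's default branch applies. The following is a minimal sketch of that rule only; the dictionary layout, the function name and the example module and branch names are hypothetical and are not taken from the talk.

```python
# Illustrative only: the dictionary layout and all names below are
# hypothetical, not the actual environment format used at CERN.

def resolve_branch(environment: dict, kind: str, name: str) -> str:
    """Return the branch a module or hostgroup is taken from.

    kind is "modules" or "hostgroups". An explicit override wins;
    otherwise the environment's default branch is used.
    """
    overrides = environment.get("overrides", {}).get(kind, {})
    return overrides.get(name, environment["default"])


production = {"default": "master"}
qa = {"default": "qa"}
# Custom test environment: everything from master except one module under test.
apache_test = {
    "default": "master",
    "overrides": {"modules": {"apache": "new_vhost_layout"}},
}

print(resolve_branch(production, "modules", "apache"))    # master
print(resolve_branch(qa, "hostgroups", "webservers"))     # qa
print(resolve_branch(apache_test, "modules", "apache"))   # new_vhost_layout
print(resolve_branch(apache_test, "modules", "ntp"))      # master
```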

Manage changes

Three important concepts:

- Modules

- Hostgroups

- Environments

A configuration change has to be approved through a request in Jira.

Every git repo has at least two branches:

- master

- qa

Puppet Run

Jens

Jens creates Puppet environments for the Puppet Masters

- Using repository metadata and a list of environment definitions

- Allows dynamic environments and isolates Puppet code for different services

Has recently been open-sourced on GitHub:

https://github.com/cernops/jens

Useful for those running different services under the same puppet infrastructure
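
To make the idea of generated environments concrete, here is a small sketch that builds one environment directory out of per-branch checkouts using symbolic links. It is not how Jens is implemented: the directory layout, the function name and the example module names are assumptions made for illustration, and an empty directory stands in for a real git checkout.

```python
import os
import tempfile


def build_environment(root: str, env_name: str, default_branch: str,
                      overrides: dict, modules: list) -> dict:
    """Create <root>/environments/<env_name>/modules/<module> symlinks.

    Each link points at <root>/clone/<module>/<branch>, a hypothetical
    per-branch checkout. Returns {module: branch} for inspection.
    """
    env_modules = os.path.join(root, "environments", env_name, "modules")
    os.makedirs(env_modules, exist_ok=True)
    chosen = {}
    for module in modules:
        branch = overrides.get(module, default_branch)
        checkout = os.path.join(root, "clone", module, branch)
        os.makedirs(checkout, exist_ok=True)  # stand-in for a real git checkout
        link = os.path.join(env_modules, module)
        if os.path.islink(link):
            os.remove(link)  # re-point the link if the branch changed
        os.symlink(checkout, link)
        chosen[module] = branch
    return chosen


with tempfile.TemporaryDirectory() as root:
    result = build_environment(root, "apache_test", "master",
                               {"apache": "feature"}, ["apache", "ntp"])
    apache_link = os.path.join(root, "environments", "apache_test",
                               "modules", "apache")
    apache_target = os.readlink(apache_link)
    print(result)
    print(apache_target)
```

Because every environment is only a set of links, two services can point at different branches of the same module without affecting each other.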

Configuration Change Process

Configuration change process:

- Modify a module on a feature branch

- Create a custom environment and test the module

- Open a ticket on Jira and announce the change

- Merge to qa

- After one week, merge to production

Service managers use the same module for different services: we need to be sure that all the service managers are happy with the change before merging it to production.
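
The last two steps amount to a simple gate: a change may go to production only after a full week in qa and only if no service manager has objected. A minimal sketch of that rule follows; the function name and the `objections` counter are hypothetical, and the talk does not say how the check is actually enforced.

```python
from datetime import datetime, timedelta

QA_SOAK = timedelta(weeks=1)  # "After one week, merge to production"


def ready_for_production(merged_to_qa: datetime, now: datetime,
                         objections: int = 0) -> bool:
    """True once the change has spent a full week in qa with no objections."""
    return objections == 0 and now - merged_to_qa >= QA_SOAK


merged = datetime(2015, 4, 17, 10, 0)
print(ready_for_production(merged, datetime(2015, 4, 20, 10, 0)))                # False
print(ready_for_production(merged, datetime(2015, 4, 24, 10, 0)))                # True
print(ready_for_production(merged, datetime(2015, 4, 24, 10, 0), objections=1))  # False
```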

Jenkins and Continuous Integration process

- Machines are built and tested before merging a change to production

- More automation, less manual work

- Still work in progress, but looks promising

Dashboard

Automating procedures - RunDeck

- Tedious, error-prone tasks replaced by executable code

- Handing off operational tasks to others

- Procedures as a list of individual and atomic steps

- Ability to react to failures
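
The idea of a procedure as a list of atomic steps that can react to failures can be sketched in a few lines. This is a generic illustration and does not use the RunDeck API; the `Step` type, the rollback-in-reverse policy and the example step names are assumptions.

```python
from typing import Callable, List, NamedTuple


class Step(NamedTuple):
    name: str
    run: Callable[[], None]
    undo: Callable[[], None]


def run_procedure(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo the completed steps in reverse.

    Returns a log of what happened.
    """
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        try:
            step.run()
        except Exception as exc:
            log.append(f"FAILED {step.name}: {exc}")
            for finished in reversed(done):
                finished.undo()
                log.append(f"undone {finished.name}")
            return log
        done.append(step)
        log.append(f"ok {step.name}")
    return log


def noop() -> None:
    pass


def fail() -> None:
    raise RuntimeError("DNS update rejected")


log = run_procedure([
    Step("disable alarms", noop, noop),
    Step("update DNS", fail, noop),
    Step("re-enable alarms", noop, noop),
])
print("\n".join(log))
```

Keeping each step small and paired with an undo action is what makes it safe to hand the procedure to someone who did not write it.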

Renaming hosts

MCollective

Framework for server orchestration and parallel job execution

Problems in the past with big clusters (> 3000 nodes)

Latest improvements:

- Direct addressing

- New PuppetDB discovery method

- Threaded mode

- Batched requests
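
Batching is the easiest of these improvements to illustrate: instead of addressing several thousand nodes at once, the target list is split into fixed-size groups that are contacted one after another. The sketch below shows only that splitting step in plain Python; it is not MCollective code, and the node names and batch size are made up.

```python
from typing import Iterator, List, Sequence


def batches(nodes: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` nodes."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(nodes), size):
        yield list(nodes[start:start + size])


# Hypothetical cluster of 3200 nodes, addressed 500 at a time.
nodes = [f"node{i:04d}.example.org" for i in range(3200)]
groups = list(batches(nodes, 500))

print(len(groups), len(groups[0]), len(groups[-1]))  # 7 500 200
```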

Configuration Drifts

Configuration drifts started to be a problem:

- Out-of-sync machines

- Possibility for service managers to have snapshots

- Possibility to freeze their environment

It’s not easy to keep all the configuration in sync

Package Inventory

Centralized service for package inventory:

- Using Elasticsearch

- Queryable using a CLI

- Compare a set of hosts

- Reports differences and misalignments

- Package history
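
Comparing a set of hosts boils down to finding packages whose version is not identical everywhere. A minimal sketch of that comparison is shown below on an in-memory dictionary; the host names, package versions and function name are invented for illustration, and the real service queries Elasticsearch rather than a local structure.

```python
from typing import Dict


def package_differences(inventory: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return packages whose version differs, or is missing, across hosts.

    The result maps package -> {host: version or "absent"}.
    """
    all_packages = set().union(*(pkgs.keys() for pkgs in inventory.values()))
    diffs: Dict[str, Dict[str, str]] = {}
    for package in sorted(all_packages):
        versions = {host: pkgs.get(package, "absent")
                    for host, pkgs in inventory.items()}
        if len(set(versions.values())) > 1:
            diffs[package] = versions
    return diffs


# Hypothetical inventory for three hosts.
inventory = {
    "web01": {"bash": "4.1.2-29", "openssl": "1.0.1e-30", "httpd": "2.2.15-39"},
    "web02": {"bash": "4.1.2-29", "openssl": "1.0.1e-42", "httpd": "2.2.15-39"},
    "web03": {"bash": "4.1.2-29", "openssl": "1.0.1e-42"},
}

diffs = package_differences(inventory)
for package, versions in diffs.items():
    print(package, versions)
```

Packages that are identical on every host (here `bash`) are left out, so the report contains only the misalignments.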

Conclusions

Moving from a traditional infrastructure to an Agile one allowed us to:

- Optimize our resources

- Speed up the development cycle

- Reduce intervention time

- Have more free time :)

Conclusions

Puppet gives us the right combination of elasticity and efficiency

- Big community

- Active development

- Highly customizable

Questions?

Andrea Giardini

andrea.giardini@cern.ch

@GiardiniAndrea
