Running at Scale: Practical Performance Tuning with Puppet - PuppetConf 2013
DESCRIPTION
"Running at Scale: Practical Performance Tuning with Puppet" by Sam Kottler, Engineer, Red Hat.
Presentation Overview: This session covers production issues I've seen running Puppet in large environments, from managing a single master with hundreds of hosts to real-life patterns for building high-availability clusters that scale to tens of thousands of agents. It also covers how to deploy networked filesystems that perform well under high load and when streaming files to many hosts simultaneously.
Speaker Bio: Sam Kottler is a software engineer in the Virtualization R&D group at Red Hat. He has helped build infrastructure for leading startups, including Digg.com, Acquia, and Venmo, and is a contributor to Puppet, the Fedora Project, Drupal, and RubyGems.org. Sam speaks around the world on internet security, systems automation, and software architecture.
TRANSCRIPT
Sam Kottler | Puppet at Scale
Puppet at Scale
Sam Kottler
@samkottler
About me
● Worked on large-scale infra for the web @ Venmo, Acquia, and Digg
● Rubygems.org infrastructure
● Bundler core
● Fedora developer
● Core committer on the Foreman
What we'll cover
1. Some basics
2. Master vs. masterless deployment
3. CA management
4. Clustering
5. Node management
6. Development + deployment practices
Why we care
● Hyperscale computing
● Massive, multi-DC infrastructure
● Dynamic environments
● The Cloud ™
Master vs. masterless
Provisioning nodes with a master
1. New node comes online
2. A script is run to install packages and configure /etc/hosts
3. The agent gets run, generates a CSR, and sends it to the master
4. The cert gets signed based on an autosign rule or `puppet cert --sign <nodename>`
5. Puppet runs
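The autosign rule in step 4 can be pictured as a whitelist check on the certname carried by the CSR. The Python sketch below approximates the matching style of Puppet's autosign.conf (an exact certname, or a leading `*.` glob standing for one or more subdomain labels). The function name and the sample rules are hypothetical, and the real check is performed by the master itself.

```python
def autosign_allows(certname, patterns):
    """Return True if certname matches any autosign.conf-style pattern.

    Approximation: a pattern is either an exact certname or a leading
    "*." glob that stands for one or more subdomain labels.
    """
    certname = certname.lower()
    for raw in patterns:
        pattern = raw.strip().lower()
        if not pattern:
            continue  # skip blank lines
        if pattern.startswith("*."):
            suffix = pattern[1:]  # e.g. ".web.example.com"
            # Require at least one label in front of the suffix.
            if certname.endswith(suffix) and len(certname) > len(suffix):
                return True
        elif certname == pattern:
            return True
    return False


# Hypothetical rules, one per line as they might appear in autosign.conf.
RULES = ["*.web.example.com", "db01.example.com"]
```

Anything the rules reject would fall back to the manual `puppet cert --sign <nodename>` path from the same step.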
Provisioning nodes without a master
1. New node comes online knowing its role
2. A script runs to install packages and retrieve the Puppet code as a package or tarball
3. puppet apply
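One way to sketch the masterless flow in Python: the node's role is handed to Puppet as a custom fact through a `FACTER_role` environment variable, and `puppet apply` is pointed at a local manifest. The `role` fact name, the default paths, and the `build_apply` helper are assumptions for illustration, not something the slides specify.

```python
import os


def build_apply(role, modulepath="/etc/puppet/modules",
                manifest="/etc/puppet/manifests/site.pp"):
    """Return (env, argv) for a masterless run of a node with the given role."""
    env = dict(os.environ)
    # Facter turns FACTER_<name> environment variables into facts,
    # so manifests can branch on the role fact.
    env["FACTER_role"] = role
    argv = ["puppet", "apply", "--modulepath=" + modulepath, manifest]
    return env, argv
```

A bootstrap script could then run it with `subprocess.check_call(argv, env=env)` once the code has been unpacked.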
Certificate authority
● Used by Puppet to authenticate agents
● CSR generated by the agent and signed by the CA
● Shared CRL across all CA machines
Clustering patterns
● CA has lots of state
● Masters should be stateless
● Reduce the number of file shares
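Keeping CA state in one place while the masters stay stateless usually means routing certificate traffic separately from everything else. The sketch below shows that routing decision in Python, assuming the Puppet 3-era URL layout of `/<environment>/<endpoint>/<name>`. The pool names and the `pick_pool` helper are hypothetical, and in practice this rule would live in the load balancer or proxy configuration.

```python
import re

# Assumption: certificate-related endpoints all start with "certificate"
# (certificate, certificate_request, certificate_revocation_list, ...).
CA_PATH = re.compile(r"^/[^/]+/certificate")


def pick_pool(path):
    """Return "ca" for requests that touch CA state, else "masters"."""
    return "ca" if CA_PATH.match(path) else "masters"
```

With this split, catalog and file requests can go to any master in the pool, and only CSR, certificate, and CRL requests reach the stateful CA.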
DNS-based clustering
Load balanced clustering
Masters across data-centers
● Shared CA vs. per-region
● Deploy in stages across data-centers
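Deploying in stages can be sketched as a loop that moves to the next data center only after the previous one looks healthy. In this minimal Python version, `deploy` and `healthy` are hypothetical callables supplied by the caller (for example, a wrapper around the deploy tool and a check on Puppet run reports).

```python
def deploy_in_stages(datacenters, deploy, healthy):
    """Deploy to each data center in order; stop at the first unhealthy one.

    Returns the list of data centers that were deployed and verified.
    """
    done = []
    for dc in datacenters:
        deploy(dc)
        if not healthy(dc):
            break  # leave the remaining data centers on the old release
        done.append(dc)
    return done
```

Ordering the list from least to most critical data center limits the blast radius of a bad change.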
Multi-cluster
Node classification
External node classifiers
● Output YAML based on external data
● The Foreman, Puppet Enterprise, Puppet Dashboard
● Your own custom data source
● Key integration source with your own CMDB
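A custom ENC is an executable that receives the node's certname as its only argument and prints a YAML document with `classes`, `parameters`, and `environment` keys. The Python sketch below renders that document from an in-memory dictionary. `INVENTORY` is a hypothetical stand-in for a CMDB lookup, and the hand-rolled YAML output only covers this simple shape (a list of class names and flat string parameters).

```python
import json

# Hypothetical stand-in for a CMDB or database lookup.
INVENTORY = {
    "web01.example.com": {
        "classes": ["nginx", "ntp"],
        "parameters": {"datacenter": "london"},
        "environment": "production",
    },
}


def render_enc(certname, inventory=INVENTORY):
    """Render ENC output for certname as a YAML document."""
    node = inventory.get(certname, {})
    classes = sorted(node.get("classes", []))
    parameters = sorted(node.get("parameters", {}).items())
    lines = ["---"]
    lines.append("classes:" if classes else "classes: []")
    for name in classes:
        # json.dumps yields a double-quoted string, which YAML also accepts.
        lines.append("  - " + json.dumps(name))
    lines.append("parameters:" if parameters else "parameters: {}")
    for key, value in parameters:
        lines.append("  " + key + ": " + json.dumps(str(value)))
    lines.append("environment: " + node.get("environment", "production"))
    return "\n".join(lines) + "\n"
```

A script wrapping this would print `render_enc(sys.argv[1])` and be wired in through the `node_terminus = exec` and `external_nodes` settings in puppet.conf.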
Packaging for masterless
https://github.com/skottler/librarian-masterless-packaging
● Use /etc/puppet/modules (or modulepath)
● Build RPMs/debs for distribution
● Publish packages to a repo
● Install/update packages on all machines
Distributed runs
● Run Puppet based on changes in your code
● MCollective/SSH/cron
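Triggering runs from code changes needs a way to tell which modules a change touched. A minimal Python sketch, assuming a repository laid out as `modules/<name>/...` (the layout and the `changed_modules` helper are hypothetical):

```python
def changed_modules(paths):
    """Return the sorted module names touched by a list of changed file paths."""
    names = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Only paths of the form modules/<name>/<something> count.
        if len(parts) >= 3 and parts[0] == "modules":
            names.add(parts[1])
    return sorted(names)
```

The paths could come from `git diff --name-only` between two releases, and the resulting module names could then select which hosts get a run over MCollective or SSH.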
Deployment
● Masters are just another deployment target!
● Build CI pipelines
● One-click deployments to masters
● Lint and test your modules
cap puppetmaster deploy DC=london
Controlled releases
● Separate hosts into groups to do red/black releases
● Build smaller sub-groups of canary hosts
● Monitor your puppet runs
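Splitting hosts into release groups and a smaller canary set works best when the assignment is stable from run to run. The Python sketch below derives both from a hash of the hostname. The group names follow the red/black wording above, while the 5% canary default and the helper names are assumptions for illustration.

```python
import hashlib


def bucket(hostname, buckets):
    """Map a hostname to a stable integer in range(buckets)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def release_group(hostname):
    """Assign a host to the red or black release group."""
    return ("red", "black")[bucket(hostname, 2)]


def is_canary(hostname, percent=5):
    """Return True for roughly `percent` percent of hosts."""
    # A separate hash input keeps canary choice independent of the group.
    return bucket("canary:" + hostname, 100) < percent
```

Because the assignment depends only on the hostname, every master and deployment tool computes the same groups without sharing state.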