Automating Monitoring with Puppet
Chris Mague, Moovweb, May 23, 2012
Where I Want to be.....
+
What I'll Settle For.....
Requirements
Rock solid stability
Automated node addition (discovery)
Scales horizontally
Service dependency models
Easy to write plugins
Promotes sane workflows
Unified front end view
Flexible configuration
Tool Stack
What????
but #monitoringsucks and #ihatenagios
How could you?
In defense of Nagios
Been around since 1996
Has Service dependencies
Easy to write plugins
Easy-ish to troubleshoot
ROCK SOLID
Valid attacks on Nagios
No automated discovery
It's complicated to set up
Text files, really?
Front end won't win any beauty contests
Development is slow
Stats collection is a PITA
Solutions
Use Icinga!
Use Puppet to auto configure
Stats: leave them to Graphite. It's really good at that
Big boys and girls learn their tools
Icinga
Fork of Nagios
Configurations are compatible
More solid architecture (core, API, web, IDODB)
Nice front end, nice mobile front end
Can use NRPE
High Level View
Configure Icinga Servers
using Puppet Standard Types
Things to configure with Standard Types
icinga.cfg (file) => icinga main config file
Apache icinga.conf (file) => http access to each server
cgiauth.cfg (file) => cgi access
cgi.cfg (file) => options, users
templates.cfg (file, got lazy) => use for basic classes
idomod.cfg (template) => template for hostname to DB
Configure Icinga using Nagios Types
Puppet Nagios Types
nagios_command
nagios_contact
nagios_contactgroup
nagios_host
nagios_hostdependency
nagios_hostescalation
nagios_hostextinfo
nagios_hostgroup
nagios_service
nagios_servicedependency
nagios_serviceescalation
nagios_serviceextinfo
nagios_servicegroup
nagios_timeperiod
Configuring Hosts
Overview
Detailed Overview
Store Configs
Store puppet info in a DB
Retrieve information from DB
Share info across nodes
Use thin_storeconfigs
Set up on puppet master
Exporting Nagios_host Resources
Export = Save to DB
Use facter for dynamic data
PRO TIP: use ENC
PRO TIP: use targets
PRO TIP: hostgroups
PRO TIP: use tags
PRO TIP: Use your ENC
PRO TIP: use targets
Use cfg_dir in icinga.cfg
Create a unique file per host or service
Addition and removal are now super easy
Also, the default dirs are in a horrible place: /etc/nagios
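The one-file-per-host layout can be pictured outside Puppet. What follows is only a minimal Python sketch of the resulting files, not the deck's Puppet code: in Puppet the file is chosen with the `target` parameter on the Nagios types. The directory `/etc/icinga/objects/hosts`, the `generic-host` template name, and both function names are assumptions made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical directory that icinga.cfg would point at with a cfg_dir line.
CFG_DIR = PurePosixPath("/etc/icinga/objects/hosts")


def host_target(host_name: str) -> PurePosixPath:
    """Return the unique config file path for one host."""
    return CFG_DIR / f"{host_name}.cfg"


def render_host(host_name: str, address: str, hostgroups: list) -> str:
    """Render a Nagios/Icinga host object definition as text."""
    lines = [
        "define host {",
        "    use         generic-host",  # hypothetical template name
        f"    host_name   {host_name}",
        f"    address     {address}",
        f"    hostgroups  {','.join(hostgroups)}",
        "}",
    ]
    return "\n".join(lines) + "\n"
```

Because every host owns exactly one file, removing a host means deleting that file and nothing else has to be edited.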
PRO TIP: hostgroups
Add machines to a hostgroup
Add services to a hostgroup
New machines inherit all of the services associated with a hostgroup
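Why this saves work can be shown with a toy model. This is a sketch under assumptions, not how Icinga resolves membership internally: the host names, hostgroup names, and service names are invented for the example.

```python
# Hypothetical data: which hostgroups each host belongs to ...
HOST_GROUPS = {
    "web01": ["linux", "web"],
    "db01": ["linux", "db"],
}
# ... and which service checks are attached to each hostgroup.
GROUP_SERVICES = {
    "linux": ["check_load", "check_disk"],
    "web": ["check_http"],
    "db": ["check_mysql"],
}


def services_for(host, host_groups=HOST_GROUPS, group_services=GROUP_SERVICES):
    """Collect every service a host inherits through its hostgroups."""
    services = []
    for group in host_groups.get(host, []):
        for service in group_services.get(group, []):
            if service not in services:  # keep order, drop duplicates
                services.append(service)
    return services
```

Adding a `web02` entry with the same two hostgroups gives it the same three checks as `web01` without touching any service definition.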
PRO TIP: use tags
Tags allow you to filter resources so that you only realize those resources that you need
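The export, store, and collect-by-tag flow can be modelled in a few lines. This is a simplified Python stand-in, assuming an in-memory list in place of the storeconfigs database; `Store`, `ExportedResource`, and the tag values are hypothetical names, and real Puppet does this with exported resources and collectors.

```python
from dataclasses import dataclass, field


@dataclass
class ExportedResource:
    rtype: str                      # e.g. "nagios_host"
    title: str                      # e.g. the node's hostname
    tags: set = field(default_factory=set)


class Store:
    """In-memory stand-in for the storeconfigs database."""

    def __init__(self):
        self._resources = []

    def export(self, resource):
        """A node saves a resource for other nodes to use."""
        self._resources.append(resource)

    def collect(self, rtype, tag):
        """The monitoring server realizes only the matching resources."""
        return [r for r in self._resources
                if r.rtype == rtype and tag in r.tags]
```

A production Icinga server would collect with the tag `prod` and never see hosts exported with `staging`.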
Configuring Services/Commands
Icinga Services
OR 'Stuff I want to monitor'
Associate with a hostgroup
Use a target
Icinga Commands
OR 'What actually gets run'
Use Macros to set paths in resource.cfg
Dependencies
PRO TIP: Dependencies
Unreliable services
Cut down on the number of alerts
Tell me what's really wrong
Route alerts accordingly
Nagios_servicedependencies
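The effect dependencies have on alert volume can be sketched as a filter. This is a deliberately simplified model, far simpler than what Icinga actually evaluates; the function name, the service names, and the two-state OK/CRITICAL view are assumptions for illustration.

```python
def alerts_to_send(states, depends_on):
    """Return failing services whose dependencies are all healthy.

    states:     service name -> "OK" or "CRITICAL"
    depends_on: service name -> list of services it depends on
    """
    alerts = []
    for service, state in states.items():
        if state == "OK":
            continue
        parents = depends_on.get(service, [])
        if any(states.get(parent, "OK") != "OK" for parent in parents):
            # A dependency is already failing; its alert is the useful one.
            continue
        alerts.append(service)
    return sorted(alerts)
```

With a web app that depends on MySQL and both CRITICAL, only the MySQL alert goes out, which is the one that says what is really wrong.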
NRPE
Runs on client
Secured via SSL
Has ACLs
Runs as nobody
Can run commands
Useful for other things...
Configuring NRPE
NRPE Checks
Plugins
exchange.nagios.org
Writing Plugins
Write in any language
Output 1 line to stdout
NRPE/Icinga/Nagios all use exit codes to determine status
Run by hand to check
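A plugin that follows these rules can be very small. Here is a minimal Python sketch, assuming a disk-usage check with made-up thresholds of 80% and 90%; the function names and message wording are illustrative, while the exit codes 0, 1, 2, and 3 are the standard OK, WARNING, CRITICAL, and UNKNOWN values.

```python
import shutil

# Standard plugin exit codes understood by NRPE, Icinga, and Nagios.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, one-line message)."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def main(path="/"):
    """Print exactly one status line and return the exit code."""
    try:
        usage = shutil.disk_usage(path)
        percent_used = 100.0 * usage.used / usage.total
    except (OSError, ZeroDivisionError) as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, message = evaluate(percent_used)
    print(message)
    return code
```

To use it as a real check, end the script with `sys.exit(main())` (after `import sys`), run it by hand, and inspect the exit status with `echo $?`.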
Workflows
Watching Monitoring
Scheduling Downtime
Filtering
Alerting
#monitoringisawesome
REMOVE unreliable checks
Just MONITOR; don't bolt things on, especially stats
TIER your monitoring
Use timeperiods for sanity
Delegate responses
Use dependencies to pin down problems quickly
Work smart
Resources
Icinga http://icinga.org
Puppet http://docs.puppetlabs.com/references/latest/type.html#nagioscommand
NRPE http://nagios.sourceforge.net/docs/3_0/addons.html
IRC: ##infra-talk, #icinga, #puppet
@maguec, #gaijin (freenode), http://blog.mague.com
Thanks: Yvonne Kong, Michael Catlin, Juan Ortega, Anthony Kong, Puppet Labs, Icinga Team