Transcript
Page 1: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

From #MonitoringSucks to

#MonitoringLove

(and back)

@KrisBuytaert OSMC 2014 , Nuremberg, Germany

Page 2: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Kris Buytaert

● I used to be a Dev

● Then became an Op

● Chief Trolling Officer and Open Source Consultant @inuits.eu

● Everything is an effing DNS Problem

● Building Clouds since before the bookstore

● Organising Conferences

● Evangelizing devops

Page 3: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

An opinionated talk about the Open Source Monitoring tooling landscape

In which I hope to learn from YOU

Page 4: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#devops=~C(L)AMS

● Culture

● (Lean)

● Automation

● Monitoring and Measurement

● Sharing

● Damon Edwards and John Willis

Gene Kim

Page 5: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Monitoring is usually an afterthought

ENOBUDGET, ENOTIME

Page 6: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

A 2008 OLS Paper

● We have bloated Java tools

● Some open Core stuff

● DIY folks want traditional Nagios

● DBA Required

Page 7: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringsucks

● John Vincent (@lusis), June 2011

● A sub #devops movement

● https://github.com/monitoringsucks/

Page 8: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Why #monitoringsucks

● Manual config (GUI)

● Not in sync with reality

● Hosts only

● Services sometimes

● Application never

● Chaos or out of sync with reality

● Alert Fatigue

Page 9: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Let's forget about

● Tools with no (stable) API

● Tools with strong focus on GUI

● Unless you are an SME with < 100 nodes

● Zenoss, Hyperic, GroundWork, ....

● P.S. : don't even mention proprietary software to me

Page 10: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

What we want

● Small , well suited components

• Collect

• Transport / Mangle

• Store

• Analyse

• Act / Alert

• Visualize
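The chain of components above could be sketched as a handful of small, composable functions. This is only an illustration: every function name, the hard-coded samples and the threshold are assumptions made for the example, not tools or values from the talk.

```python
from statistics import mean

def collect():
    """Collect: return raw samples (hard-coded here; a real collector would poll hosts)."""
    return [("web01", "load", 0.7), ("web02", "load", 3.9), ("web03", "load", 1.1)]

def transport(samples):
    """Transport / mangle: normalise tuples into dicts with a dotted metric name."""
    return [{"metric": f"{host}.{name}", "value": value} for host, name, value in samples]

def store(events, db):
    """Store: append each value to an in-memory series keyed by metric name."""
    for event in events:
        db.setdefault(event["metric"], []).append(event["value"])
    return db

def analyse(db, threshold):
    """Analyse: return the metrics whose average exceeds the threshold."""
    return sorted(name for name, values in db.items() if mean(values) > threshold)

def act(offenders):
    """Act / alert: turn offending metrics into alert messages."""
    return [f"ALERT {name} above threshold" for name in offenders]

def visualize(db):
    """Visualize: a crude text bar per metric."""
    return [f"{name:12} {'#' * round(mean(values) * 2)}" for name, values in sorted(db.items())]

db = store(transport(collect()), {})
alerts = act(analyse(db, threshold=2.0))
```

Because each stage only depends on the shape of the data it receives, any one of them could be swapped for a different tool without touching the others, which is the point of wanting small components.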

Page 11: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringlove

•Ulf Mansson #devopsdays Rome 2011

•A new era of tooling

•#monitoringlove hacksessions @inuits

•#monitorama

Page 15: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Icinga

•2009 Fork

•I consider Nagios dead

•Vibrant Community (or they stalk me)

•Throw great parties in Nuremberg

•Nobody can pronounce it anyhow

•https://github.com/Inuits/puppet-icinga/

Page 16: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Stored Configs
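The idea behind stored configs is that every node exports a description of itself and the monitoring server collects those exports into its own configuration. The talk does this with Puppet; the sketch below only imitates the pattern in Python. The host names, addresses and the `generic-host` template are assumptions for the example.

```python
def export_host(name, address):
    """What a node would 'export' about itself (a plain dict in this sketch)."""
    return {"host_name": name, "address": address, "use": "generic-host"}

def render_host(resource):
    """Render one exported resource as a Nagios/Icinga 1.x style host object."""
    lines = ["define host {"]
    for key in ("use", "host_name", "address"):
        lines.append(f"    {key:<10} {resource[key]}")
    lines.append("}")
    return "\n".join(lines)

def collect(exported):
    """What the monitoring server would do: gather all exports into one config."""
    ordered = sorted(exported, key=lambda resource: resource["host_name"])
    return "\n\n".join(render_host(resource) for resource in ordered)

# Two hypothetical nodes exporting themselves, in arbitrary order.
exported = [export_host("web02", "10.0.0.12"), export_host("web01", "10.0.0.11")]
config = collect(exported)
```

Since the configuration is generated from what the nodes report, a host that is added or removed shows up in monitoring without anyone editing it by hand.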

Page 17: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringlove But the love was about :

Page 18: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Sensu

● Awesome for non-static environments

● Scaling a clustered RabbitMQ ?

● This is Europe, U no do cloud

Page 19: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Automation of #monitoring brought back

the #love

Page 20: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

●Autodetection

●Multiplexing

●Trend Forecasting

I love CheckMK

Page 21: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

•Autodetection ?

•Service,

•Business Functionalities

•eg. vhosts etc

•Single Source of Truth

I hate CheckMK

Page 22: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Monitoring a service vs

Monitoring a Service

Page 23: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

definition of done:

monitored and in production

Page 24: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

A software project is not done until your last end user is dead

Page 25: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Culture,

Automation,

Measurement : measure all the things

Sharing

Page 26: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Deploy Statistics

● Time To Deploy

● Deploy Frequency

● Lifecycle frequency

● Map to other metrics
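As a rough illustration of the first two statistics, time to deploy and deploy frequency can be derived from nothing more than start and end timestamps. The deploy log below is made up for the example, and the function names are assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical deploy log: (started, finished) pairs.
deploys = [
    (datetime(2014, 11, 3, 10, 0), datetime(2014, 11, 3, 10, 12)),
    (datetime(2014, 11, 4, 15, 30), datetime(2014, 11, 4, 15, 38)),
    (datetime(2014, 11, 7, 9, 0), datetime(2014, 11, 7, 9, 10)),
]

def time_to_deploy(deploys):
    """Average duration of a deploy."""
    total = sum((end - start for start, end in deploys), timedelta())
    return total / len(deploys)

def deploy_frequency(deploys, window_start, window_end, period=timedelta(days=1)):
    """Number of deploys started inside the window, expressed per period."""
    count = sum(1 for start, _ in deploys if window_start <= start < window_end)
    return count / ((window_end - window_start) / period)

average = time_to_deploy(deploys)
per_week = deploy_frequency(
    deploys, datetime(2014, 11, 3), datetime(2014, 11, 10), period=timedelta(days=7)
)
```

Once these numbers exist as time series they can be laid next to error rates or response times, which is one way to read "map to other metrics".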

Page 27: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

CollectD all the metrics, at high intervals

Page 28: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Oldschool graphite
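Part of Graphite's appeal is how little it takes to feed it: Carbon's plaintext protocol is one line per sample, in the form "path value timestamp". A minimal sketch follows; the metric path, host and port are assumptions for the example (2003 is Carbon's usual plaintext port), and nothing is sent unless `send_to_carbon` is called.

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one sample for Carbon's plaintext protocol: path, value, epoch seconds."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"

def send_to_carbon(lines, host="localhost", port=2003):
    """Send pre-formatted lines to a Carbon listener (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))

# Hypothetical metric path in the dotted naming style Graphite uses.
line = graphite_line("servers.web01.load.shortterm", 0.42, timestamp=1416873600)
```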

Page 29: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Self Service Gdash based pipelines

Puppetized Templates (wip)

Page 30: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Gdash

Page 31: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Grafana

Page 32: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Graphite++

● Dashboards

• Grafana

● Engines :

• InfluxDB

• Cyanite

Page 33: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Triggers on Graphs

● Export Java Metrics

● JMXTrans

● Export JMXConfigs

● Configure NRPE Check

● Export NagiosCheck

● Collect JMX Exports on JMXTransNode

● Graph Em

● Collect Icinga Configs on Icinga
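The last link in that chain, a check that fires on graphed data, can be sketched as a function that turns a Graphite render response (requested with format=json) into a Nagios-style status. The metric name, thresholds and sample payload are assumptions for the example; fetching the JSON over HTTP is left out.

```python
import json

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes

def check_series(render_json, warn, crit):
    """Map the latest non-null datapoint of the first series to a plugin status."""
    series = json.loads(render_json)
    if not series:
        return UNKNOWN, "no series returned"
    # Graphite returns [value, timestamp] pairs; value is null where data is missing.
    values = [value for value, _timestamp in series[0]["datapoints"] if value is not None]
    if not values:
        return UNKNOWN, "no datapoints"
    latest = values[-1]
    message = f"{series[0]['target']} = {latest}"
    if latest >= crit:
        return CRITICAL, message
    if latest >= warn:
        return WARNING, message
    return OK, message

# Hypothetical payload in the shape the render API returns for format=json.
sample = json.dumps([{
    "target": "jmx.app01.heap.used_pct",
    "datapoints": [[61.0, 1416873540], [87.5, 1416873600], [None, 1416873660]],
}])
status, message = check_series(sample, warn=80, crit=95)
```

A real plugin would print the message and exit with the status code, so it can be called through NRPE like any other check.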

Page 34: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Aggregation

● Alert on streams

● Alert on aggregated metrics
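To make the difference with per-host checks concrete, here is a minimal sketch that alerts only when the mean across all hosts stays above a threshold for a full window of rounds. The class name, threshold, window size and sample error rates are assumptions for the example.

```python
from collections import deque
from statistics import mean

class AggregateAlert:
    """Alert when the fleet-wide mean stays above a threshold for a full window."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # only the last `window` means are kept

    def observe(self, per_host_values):
        """Feed one round of per-host samples; return True if an alert should fire."""
        self.recent.append(mean(per_host_values.values()))
        full = len(self.recent) == self.recent.maxlen
        return full and all(value > self.threshold for value in self.recent)

alert = AggregateAlert(threshold=0.05, window=3)
rounds = [
    {"web01": 0.01, "web02": 0.30},  # one noisy host, but the window is not full yet
    {"web01": 0.02, "web02": 0.02},  # mean below threshold, breaks the streak
    {"web01": 0.08, "web02": 0.10},
    {"web01": 0.09, "web02": 0.07},
    {"web01": 0.06, "web02": 0.12},  # third consecutive round above threshold
]
fired = [alert.observe(sample) for sample in rounds]
```

A single host spiking once never pages anyone here; only a sustained problem across the aggregate does, which is one way to cut down on alert fatigue.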

Page 37: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Riemann

● I still don't get it ?

● Distributed Top

● Do you like Clojure ?

● Riemann Health plugin ?

● s/riemann-health/collectd/g;

● Output to graphite

Page 38: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Graphs to Knowledge

Skyline

•Oculus

•Creating Information out of this data

•Big data

•Machine Learning
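As a very small example of turning graph data into information, the sketch below flags the newest point of a series when it falls more than three standard deviations from the mean of the earlier points. This is a generic three-sigma rule chosen for illustration; it is not the implementation used by Skyline or Oculus, and the function name and sample data are assumptions.

```python
from statistics import mean, stdev

def is_anomalous(series, sigmas=3.0):
    """Flag the last point if it deviates more than `sigmas` standard deviations
    from the mean of the points before it."""
    history, latest = series[:-1], series[-1]
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]  # flat history: any change is unusual
    return abs(latest - mean(history)) > sigmas * spread

# Hypothetical request-rate samples hovering around 100.
steady = [100, 102, 98, 101, 99, 100, 103, 97, 100]
normal = is_anomalous(steady + [101])
spike = is_anomalous(steady + [140])
```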

Page 39: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

But I have log files..

Page 43: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Logs and Metrics

● Graylog2

● ELSA (Enterprise Log Search and Archive)

● ELK Stack

Page 44: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

● Collect from anywhere

● Filter

● Send anywhere

● Queuing
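The four stages listed above can be imitated in a few lines to show how they fit together: raw lines are collected onto a queue, parsed by a filter, and handed to any number of outputs. The log format, regular expression and function names are assumptions for the example and do not represent any particular log shipper.

```python
import queue
import re

# Assumed log format for the example: "<timestamp> <LEVEL> <message>"
LINE = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")

def collect(lines, buffer):
    """Collect: push raw lines from any source onto a queue."""
    for line in lines:
        buffer.put(line)

def filter_event(raw):
    """Filter: parse a raw line into a structured event, or None if it does not match."""
    match = LINE.match(raw)
    return match.groupdict() if match else None

def ship(buffer, outputs):
    """Send: drain the queue and hand every parsed event to each output."""
    while not buffer.empty():
        event = filter_event(buffer.get())
        if event is not None:
            for output in outputs:
                output(event)

errors, everything = [], []

def keep_errors(event):
    """An output that only keeps events at ERROR level."""
    if event["level"] == "ERROR":
        errors.append(event)

buffer = queue.Queue()
collect(
    ["2014-11-25T10:00:00Z INFO started", "garbage", "2014-11-25T10:00:05Z ERROR db timeout"],
    buffer,
)
ship(buffer, [everything.append, keep_errors])
```

The queue between collection and shipping is what lets the collecting side keep going when a downstream output is slow.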

Page 46: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Black on White ?

Page 47: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

APM

But what about my apps ?

Half the world cheers about SAAS tools :(

Page 48: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Packetbeat

● Traffic Flow through network

● Transactions causing errors

● SQL per HTTP

● API call usage

Page 49: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

PacketBeat

Page 50: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

This new “D” hype

Page 51: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Containers are the new black

● 1 process per container

● Metric collection ?

● Service health ?

Page 52: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

So you want service registration of your healthy (containerized) applications ?

Page 53: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Enter Consul.io

● Service discovery

● Failure detection

● Using Gossip built on top of Serf

● Random node 2 node communication

● A HashiCorp project

Page 54: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Consul

● Uses monitoring_plugins for health

● Creates unhealthy dns setups

● Sensu alike

● Key-Value store

● Consul_template => fills your templates

Page 55: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Everything is a freaking dns problem

Page 56: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Self Healing

● Pacemaker Corosync (OCF resource that monitors your service)

● Mesos

● Kubernetes

● Scale changes, Consensus Models change

Page 57: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

So your DC fails

Whom to alert when ?

Page 59: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

'New' kids on the block

● Flapjack

● flapjack.io

● monitoring notification routing + event processing system

● OpenDuty

● github.com/szechuen/OpenDuty

● Duty management

Page 60: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

My Alerting Strategy

Is still in beta

Page 61: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

And back :(

In 2014 I'm still running the same check for

- service registration (consul)

- high availability (pacemaker/corosync)

- monitoring (icinga)
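The duplication is easy to see when written down: one real check, and three sets of exit-code conventions wrapped around it. The sketch below assumes a plain TCP connect as the check; the function names are hypothetical, while the exit codes are the ones each consumer documents (Nagios plugin codes, the OCF monitor action, Consul script checks).

```python
import socket

def service_is_up(host, port, timeout=2.0):
    """The one real check: can we open a TCP connection to the service?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Exit-code conventions of the three consumers.
NAGIOS_OK, NAGIOS_CRITICAL = 0, 2        # Nagios/Icinga plugin codes
OCF_SUCCESS, OCF_NOT_RUNNING = 0, 7      # OCF resource agent 'monitor' codes
CONSUL_PASSING, CONSUL_CRITICAL = 0, 2   # Consul: 0 passing, 1 warning, anything else failing

def exit_code(up, consumer):
    """Translate one boolean result into the exit code each consumer expects."""
    codes = {
        "icinga": (NAGIOS_OK, NAGIOS_CRITICAL),
        "pacemaker": (OCF_SUCCESS, OCF_NOT_RUNNING),
        "consul": (CONSUL_PASSING, CONSUL_CRITICAL),
    }
    ok, failed = codes[consumer]
    return ok if up else failed
```

The check logic is identical in all three places; only the wrapper differs, yet in practice it still ends up being configured and run three times.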

Page 62: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

But I love where Monitoring is heading

We have far fewer false positives

And we have a Maintainable Monitoring Infra

Kinda

Page 63: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Your next trip to Gent !

CfgMgmtcamp.eu February 2 and 3, 2015

CFP is Open !

Page 64: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Contact

[email protected]

Further Reading

@krisbuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.eu/

Inuits
Duboistraat 50
2060 Antwerpen
Belgium
891.514.231
+32 475 961221

