time to say goodbye to your nagios based setup

So you want to switch off ? Time to say goodbye to your Nagios based setup! © 2014 - Olivier Jan - Check my Website @olivjan - [email protected]

Upload: check-my-website

Post on 02-Jul-2015

Category: Technology


DESCRIPTION

Time to say goodbye to your Nagios based setup. Discover all the new cool tools out there to do some more efficient monitoring. A talk made at OSMC 2014. https://www.youtube.com/watch?v=_BAWi9Zhmic

TRANSCRIPT

Page 1: Time to say goodbye to your Nagios based setup

So you want to switch off ?

Time to say goodbye to your Nagios based setup!

© 2014 - Olivier Jan - Check my Website - @olivjan - [email protected]

Page 2: Time to say goodbye to your Nagios based setup

About me

❖ System admin and architect

❖ Co-founder of « Communauté Francophone de la Supervision Libre »

❖ Writer of the book « Nagios 3 au cœur de la supervision Open Source »

❖ Co-founder of Check my Website, a SaaS service for remote monitoring of websites and applications (current)

Page 3: Time to say goodbye to your Nagios based setup

Content

❖ Why switch off? The good, and maybe not so good, reasons to do so!

❖ Which way to take ?

❖ Building a monitoring solution without Nagios :

❖ Tools available

❖ A personal work in progress

❖ Migrating from Nagios to this kind of solution

Page 4: Time to say goodbye to your Nagios based setup

Some reasons to switch off…

❖ The godfather of OSS monitoring is dead as an Open Source project ?

❖ Can’t do better with it

❖ Cool new kids out there

❖ Better « cloud » support

❖ Clear distinction between monitoring states, metrics and messages

❖ Better charting solution

❖ Near realtime monitoring

❖ Routing, aggregation, correlation…

❖ YOUR reasons ;)

Page 5: Time to say goodbye to your Nagios based setup

Which way to take ?

❖ The « 4 musketeers »

❖ Naemon

❖ Icinga 2

❖ Shinken

❖ Centreon

❖ Reboot from building blocks

❖ Collect

❖ Store

❖ Visualize

❖ Alert

Page 6: Time to say goodbye to your Nagios based setup

Tools : Collecting metrics and messages

❖ Packetbeat (metrics & messages)

❖ Rsyslog, NXLog, Syslog-ng (messages)

❖ sFlow Toolkit, Host sFlow

❖ Logstash-forwarder (messages)

❖ Collectd (metrics)

❖ Diamond (metrics)

❖ OSquery, WMI (metrics)

❖ Network level (sFlow)

❖ System Level

❖ Application Level

Page 7: Time to say goodbye to your Nagios based setup

Tools : External collecting

❖ End user perspective

❖ Checks run as close as possible to the end user

❖ Application behavior

❖ Real User Monitoring

❖ Webpagetest

❖ Selenium

❖ PhantomasJS

❖ Boomerang

❖ Bucky

Page 8: Time to say goodbye to your Nagios based setup

Tools : Routing metrics and messages

❖ Messages : Logstash, Flume, Fluentd

❖ Metrics : StatsD

❖ Metrics : Carbon Relay NG

One or more messages can fire an event

Page 9: Time to say goodbye to your Nagios based setup

Tools : Databases

❖ Graphite : The most used.

❖ OpenTSDB : HBase

❖ KairosDB : Cassandra

❖ InfluxDB : The most promising ?

❖ Elasticsearch : Index database

Page 10: Time to say goodbye to your Nagios based setup

Tools : Visualizing metrics and messages

❖ Kibana

❖ Grafana

❖ Dashboards collection

Page 11: Time to say goodbye to your Nagios based setup

Tools : Alerting

❖ Seyren : Alerting dashboard for Graphite.

❖ Cabot : Get alerted when services go down or metrics go crazy

❖ Bosun : An advanced, open-source monitoring and alerting system

❖ Skyline : Real-time anomaly detection system

❖ Oculus : Anomaly correlation component of Etsy's Kale system

❖ Esper : Complex Event Processing

Page 12: Time to say goodbye to your Nagios based setup

The French Monitoring Community Xperience

❖ Reboot from building blocks

❖ Collect

❖ Store

❖ Visualize

❖ Alert

Page 13: Time to say goodbye to your Nagios based setup

The French Monitoring Community Xperience

Is it working ? What is not working ?

Page 14: Time to say goodbye to your Nagios based setup

Collecting metrics : Collectd

❖ InfluxDB Collectd proxy

❖ Written in Go, like InfluxDB

❖ Temporary solution

❖ Native Collectd plugin

LoadPlugin network
<Plugin network>
  # proxy address
  Server "127.0.0.1" "8096"
</Plugin>

❖ PHP5-FPM metrics

❖ Nginx metrics

❖ MariaDB metrics

❖ System metrics

❖ <metricname>:<value>|<type>

Page 15: Time to say goodbye to your Nagios based setup

Collecting messages : Rsyslog

❖ Nearly ready log consumption

❖ Native distribution package

❖ Nginx log, MySQL slow query log

template(name="ls_json" type="list" option.json="on") {
  constant(value="{")
  constant(value="\"@timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"@version\":\"1")
  constant(value="\",\"message\":\"") property(name="msg")
  constant(value="\",\"host\":\"") property(name="hostname")
  constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
  constant(value="\",\"programname\":\"") property(name="programname")
  constant(value="\",\"procid\":\"") property(name="procid")
  constant(value="\"}\n")
}

Page 16: Time to say goodbye to your Nagios based setup

Collecting @ network level : Packetbeat

❖ Specific agent

❖ Collect traffic for

❖ HTTP

❖ MySQL

❖ PostgreSQL

❖ Redis

Page 17: Time to say goodbye to your Nagios based setup

Routing messages : Logstash

❖ Inputs

❖ Codecs/filters

❖ Outputs

input {
  udp {
    port => 10514
    codec => "json"
    type => "syslog"
  }
}

filter {
  # This replaces the host field with the host that generated the message (sysloghost)
  if [sysloghost] {
    mutate {
      replace => [ "host", "%{sysloghost}" ]
      remove_field => "sysloghost"
    }
  }
}

output {
  elasticsearch { host => localhost }
}
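To make the Logstash pipeline above concrete, here is a minimal Python sketch of what a client and the filter stage do. It is an illustration under assumptions, not part of the talk: the function names `apply_sysloghost_filter` and `send_event` are hypothetical, the filter is only mimicked in Python (Logstash itself does this work), and the default address assumes the `udp` input listens on 127.0.0.1:10514.

```python
import json
import socket


def apply_sysloghost_filter(event):
    """Mimic the mutate filter: copy sysloghost into host, then drop sysloghost."""
    result = dict(event)  # work on a copy so the caller's dict is untouched
    if result.get("sysloghost"):
        result["host"] = result.pop("sysloghost")
    return result


def send_event(event, address=("127.0.0.1", 10514)):
    """Send one JSON-encoded event as a single UDP datagram (assumed listener)."""
    payload = json.dumps(event).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)
    return payload
```

Calling `apply_sysloghost_filter({"host": "relay01", "sysloghost": "web01"})` yields an event whose `host` is `web01` and which no longer carries `sysloghost`, which is the effect the filter comment describes.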

Page 18: Time to say goodbye to your Nagios based setup

Routing metrics : StatsD

❖ Is now a protocol implemented in all languages

❖ InfluxDB plugin

❖ Collectd can behave as a StatsD daemon (plugin)

❖ Very easy to push metrics

echo "foo:1|c" | nc -u -w0 127.0.0.1 8125
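The same push can be done from application code. The sketch below is a Python equivalent of the `echo | nc` one-liner, assuming a StatsD-compatible daemon on 127.0.0.1:8125 as in that command; the helper names `format_statsd` and `send_statsd` are hypothetical and no client library is implied.

```python
import socket


def format_statsd(name, value, metric_type):
    """Build a StatsD datagram body: <metricname>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_statsd(name, value, metric_type="c", address=("127.0.0.1", 8125)):
    """Fire-and-forget one metric over UDP; no reply is expected."""
    payload = format_statsd(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)
    return payload
```

Here `c` marks a counter; `g` (gauge) and `ms` (timer) are the other common StatsD types. Because the transport is UDP, a missing daemon does not slow down or break the application.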

Page 19: Time to say goodbye to your Nagios based setup

Storing metrics : InfluxDB

❖ Make it behave like Graphite

❖ graphite-api

❖ carbon-relay-ng

❖ graphite-influxdb

❖ Cluster, cluster, cluster

❖ Designed for events and metrics
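If the store is made to behave like Graphite, producers can keep speaking the Graphite plaintext protocol: one `path value timestamp` line per data point. The following Python sketch assumes a Graphite-compatible listener (for example carbon-relay-ng in front of the database) on 127.0.0.1:2003; that address, the metric path and the helper names are illustrative assumptions.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point in the Graphite plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_graphite(lines, address=("127.0.0.1", 2003)):
    """Send already formatted lines over one TCP connection (assumed listener)."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))
```

For example, `graphite_line("servers.web01.load", 0.42, 1400000000)` produces the line `servers.web01.load 0.42 1400000000`.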

Page 20: Time to say goodbye to your Nagios based setup

Storing messages : Elasticsearch

❖ Index database

❖ Cluster, cluster, cluster

❖ Full text search

Page 21: Time to say goodbye to your Nagios based setup

Visualizing @ network level : Packetbeat

❖ Kibana 3 modified version

❖ Dashboards ready out of the box

Page 22: Time to say goodbye to your Nagios based setup

Visualizing metrics : Grafana

❖ Compatible

❖ Graphite

❖ InfluxDB

❖ OpenTSDB

❖ Built on Kibana 3

Page 23: Time to say goodbye to your Nagios based setup

Visualizing messages : Kibana 4

❖ Easy install

❖ Interactive dashboards

❖ Multiple indices

Page 24: Time to say goodbye to your Nagios based setup

What's missing ? Wishes

❖ Alerting

❖ External monitoring

❖ Repository for dashboards…

❖ Making sense of metrics and messages

Page 25: Time to say goodbye to your Nagios based setup

Alerting reboot

❖ Alert only on end user problems from an end user perspective

❖ IRC, Chat channel…

❖ Alert thresholds based on history vs static thresholds

❖ Statistics functions

❖ Boolean conditions

❖ Dynamic thresholds

❖ Anomaly detection

❖ Standard deviation
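As an illustration of a history-based threshold, the sketch below flags a value that falls more than `k` standard deviations away from the mean of recent history. It is one simple way to combine the ideas listed above, not the method of any tool named in the talk; the function names and the default `k=3.0` are assumptions.

```python
from statistics import mean, stdev


def dynamic_bounds(history, k=3.0):
    """Return (low, high) bounds: mean of the history +/- k standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    centre = mean(history)
    spread = stdev(history)  # sample standard deviation
    return centre - k * spread, centre + k * spread


def is_anomalous(value, history, k=3.0):
    """True when value lies outside the bounds derived from history."""
    low, high = dynamic_bounds(history, k)
    return value < low or value > high
```

With a history of `[100, 102, 98, 101, 99]`, a new value of 103 stays inside the bounds while 150 is flagged. A static threshold would need retuning whenever the normal level drifts; here the bounds follow the data.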

Page 26: Time to say goodbye to your Nagios based setup

Coming from Nagios

❖ Graphios will inject perfdata into Graphite or InfluxDB

❖ Check_graphite can query the Graphite API from Nagios for alerts based on history

❖ Logstash will send events to NSCA

❖ Nagios log in Kibana with Grok %{NAGIOSLOGLINE}

❖ Keep Nagios for states ?
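To show what injecting perfdata into Graphite involves, here is a minimal Python sketch that turns a Nagios performance data string (`label=value[UOM];warn;crit;min;max`) into Graphite plaintext lines. It is not Graphios code; the function name, the regular expression and the metric prefix are assumptions made for illustration, and thresholds are simply ignored.

```python
import re

# Label (quoted or not), "=", then the numeric value; unit and thresholds are skipped.
PERFDATA_RE = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<plain>[^\s'=]+))=(?P<value>-?\d+(?:\.\d+)?)"
)


def perfdata_to_graphite(prefix, perfdata, timestamp):
    """Convert a Nagios perfdata string into 'path value timestamp' lines."""
    lines = []
    for match in PERFDATA_RE.finditer(perfdata):
        label = match.group("quoted") or match.group("plain")
        # Graphite paths use dots as separators, so sanitise the label.
        label = re.sub(r"[^A-Za-z0-9_-]", "_", label)
        lines.append(f"{prefix}.{label} {match.group('value')} {int(timestamp)}")
    return lines
```

For a check returning `time=0.012s;1.000;2.000;0.000 size=5120B;;;0`, the function emits one line for `time` and one for `size` under the chosen prefix, ready to be sent to a Graphite-compatible listener.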

Page 27: Time to say goodbye to your Nagios based setup

Questions ?

@olivjan

[email protected]