metrics with ganglia
gareth rushgrove | morethanseven.net Collecting Metrics With Ganglia and Friends Cambridge Geek Night 28th March 2011 http://www.flickr.com/photos/memestate/45986749

Upload: gareth-rushgrove

Post on 16-Jan-2015

DESCRIPTION

Talk about using Ganglia and other tools for storing all kinds of web application metrics for both operations and business purposes. Presented at Cambridge Geek Night.

TRANSCRIPT

Page 1: Metrics with Ganglia

gareth rushgrove | morethanseven.net

Collecting Metrics With Ganglia and Friends

Cambridge Geek Night 28th March 2011

http://www.flickr.com/photos/memestate/45986749

Page 2: Metrics with Ganglia

Gareth Rushgrove

Page 3: Metrics with Ganglia

Work at FreeAgent

freeagentcentral.com

Page 4: Metrics with Ganglia

Blog at morethanseven.net

Page 5: Metrics with Ganglia

Curate devopsweekly.com

Page 6: Metrics with Ganglia

Covering (Business Version)

- Capacity planning metrics

- Metrics for your application

- Business analytics

- Having everything in one place

Page 7: Metrics with Ganglia

Covering (Tech Version)

- Ganglia: Store metrics and view graphs

- Logster: Get log files into Ganglia

- Gmetric: Get anything into Ganglia

- Syslog: Using Loggly to view individual log items

Page 8: Metrics with Ganglia

Everyone Uses Something Like?

Page 9: Metrics with Ganglia

Use Something Like This Too

Page 10: Metrics with Ganglia

What is Ganglia?

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.

ganglia.sourceforge.net

Page 11: Metrics with Ganglia

Example: vagrantbox.es

Page 12: Metrics with Ganglia

Load Averages

Page 13: Metrics with Ganglia

CPU

Page 14: Metrics with Ganglia

Aggregate Graphs

Page 15: Metrics with Ganglia

Across Entire Cluster

Page 16: Metrics with Ganglia

Predicting When Your System Will Fail

A strategy for anticipating future workloads of your computers, with the aim of creating a computing environment that can handle future workload

IBM

Page 17: Metrics with Ganglia

Disk Space
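
One way to turn a disk space graph into a prediction is to fit a straight line through recent readings and see where it crosses the disk's capacity. The sketch below does that with a plain least-squares fit. The function name, the readings and the 100 GB capacity are all made up for illustration, and real usage rarely grows this evenly.

```python
def forecast_full(samples, capacity):
    """Estimate the time at which usage reaches capacity.

    samples is a list of (time, used) pairs. Returns None when usage is
    flat or shrinking, because a straight line never reaches capacity then.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all readings taken at the same time
    # Least-squares slope and intercept of used = slope * time + intercept
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    return (capacity - intercept) / slope


# Hypothetical readings: (day number, GB used) on a 100 GB disk
readings = [(0, 40.0), (1, 42.0), (2, 44.0), (3, 46.0)]
day_full = forecast_full(readings, capacity=100.0)
print("Disk predicted full on day %.1f" % day_full)
```

With these invented readings the disk grows by 2 GB a day from 40 GB, so the line reaches 100 GB on day 30.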

Page 18: Metrics with Ganglia

Monitoring Your Application

Page 19: Metrics with Ganglia

86.26.7.33 - - [26/Mar/2011:20:39:52 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.1" 200 2081 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"

Web Server Logs

Page 20: Metrics with Ganglia

Logster from Etsy

Page 21: Metrics with Ganglia

Tail a log file and filter each line to generate metrics that can be sent to common monitoring packages.

Options:
  -p METRIC_PREFIX, --metric-prefix=METRIC_PREFIX
        Add prefix to all published metrics. This is for people that may multiple instances of same service on same host.
  --gmetric-options=GMETRIC_OPTIONS
        Options to pass to gmetric such as -d 180 -c /etc/ganglia/gmond.conf (default). These are passed directly to gmetric.
  --graphite-host=GRAPHITE_HOST
        Hostname and port for Graphite collector, e.g. graphite.example.com:2003
  -s STATE_DIR, --state-dir=STATE_DIR
        Where to store the logtail state file. Default location /var/run
  -d, --dry-run
        Parse the log file but send stats to standard output.
  -D, --debug
        Provide more verbose logging for debugging.

Logster

Page 22: Metrics with Ganglia

logster SampleGangliaLogster /../access.log

Logster Command Line

Page 23: Metrics with Ganglia

HTTP Responses with a 2xx Status Code
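
To show roughly what sits behind a graph like this, here is a minimal standalone sketch that reads access-log lines in the format shown earlier and counts responses by status class. It does not use Logster's parser classes; the regular expression, the function name and the 404 sample line are assumptions made for illustration.

```python
import re
from collections import Counter

# Start of a combined-format line: host ident user [time] "request" status size
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_status_classes(lines):
    """Return a Counter keyed by status class, e.g. '2xx' or '4xx'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        counts[match.group("status")[0] + "xx"] += 1
    return counts


sample = [
    '86.26.7.33 - - [26/Mar/2011:20:39:52 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"',
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"',
    # Invented line, added only so the sample has a non-2xx response
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET /missing HTTP/1.0" 404 120 "-" "FunkLoad/1.14.0"',
]
counts = count_status_classes(sample)
print(counts["2xx"], counts["4xx"])
```

Each count could then be published as its own metric, one per status class.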

Page 24: Metrics with Ganglia

The Ganglia Metric Client (gmetric) announces a metric on the list of defined send channels defined in a configuration file

Usage: gmetric [OPTIONS]...
  -V, --version          Print version and exit
  -c, --conf=STRING      The configuration file to use for finding send channels (default='/etc/ganglia/gmond.conf')
  -n, --name=STRING      Name of the metric
  -v, --value=STRING     Value of the metric
  -t, --type=STRING      Either string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING     Unit of measure for the value e.g. Kilobytes, Celcius (default='')
  -s, --slope=STRING     Either zero|positive|negative|both (default='both')
  -x, --tmax=INT         The maximum time in seconds between gmetric calls (default='60')
  -d, --dmax=INT         The lifetime in seconds of this metric (default='0')
  -S, --spoof=STRING     IP address and name of host/device (colon separated) we are spoofing (default='')
  -H, --heartbeat        spoof a heartbeat message (use with spoof option)

Gmetric

Page 25: Metrics with Ganglia

Gmetric Scripts for Common Applications

Page 26: Metrics with Ganglia

gmetric -n sales -v 200 -t float

Gmetric Command Line
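
The same call can be made from application code. The sketch below builds the gmetric command as an argument list, using only the -n, -v, -t and -u flags from the usage text above. The function names are made up for illustration, and actually sending a metric assumes the gmetric binary is installed and gmond is configured on the host.

```python
import subprocess


def gmetric_command(name, value, metric_type="float", units=None):
    """Build the gmetric argument list for one metric value."""
    cmd = ["gmetric", "-n", name, "-v", str(value), "-t", metric_type]
    if units:
        cmd += ["-u", units]
    return cmd


def send_metric(name, value, **kwargs):
    """Run gmetric; raises CalledProcessError on a non-zero exit status."""
    # Assumes gmetric is on PATH and can read its gmond configuration
    subprocess.check_call(gmetric_command(name, value, **kwargs))


print(" ".join(gmetric_command("sales", 200)))
```

Building a list rather than a shell string means metric names and values are passed to gmetric as-is, with no shell quoting to get wrong.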

Page 27: Metrics with Ganglia

Our Custom Metric in Ganglia

Page 28: Metrics with Ganglia

import subprocess

from bottle import route, run, abort, default_app

@route('/:name/:value')
def index(name, value):
    # Build an argument list and skip the shell, so URL input cannot inject commands
    cmd = ['gmetric', '-n', name, '-v', value, '-t', 'float']
    try:
        subprocess.check_call(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return "Success: %s" % ' '.join(cmd)
    except subprocess.CalledProcessError:
        # gmetric exited with a non-zero status
        abort(500, "Error")

app = default_app()

Gmetric HTTP Interface

Page 29: Metrics with Ganglia

http://../sales/200

Gmetric URL
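
A client for this interface only needs to request the right URL. The sketch below builds it from a base address, a metric name and a value. The host metrics.example.com is a placeholder, since the slide elides the real one, and the function names are made up for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen


def metric_url(base_url, name, value):
    """Build the /name/value URL, percent-encoding both path segments."""
    return "%s/%s/%s" % (
        base_url.rstrip("/"),
        quote(str(name), safe=""),
        quote(str(value), safe=""),
    )


def report(base_url, name, value, timeout=5):
    """Request the metric URL and return the response body as text."""
    with urlopen(metric_url(base_url, name, value), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder host; substitute wherever the HTTP interface is deployed
print(metric_url("http://metrics.example.com", "sales", 200))
```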

Page 30: Metrics with Ganglia

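import subprocess
import SocketServer  # Python 2 module name; renamed socketserver in Python 3

class GmetricTCPHandler(SocketServer.BaseRequestHandler):

    def handle(self):
        # Expect a single "name value" message, e.g. "sales 200"
        self.data = self.request.recv(1024).strip()
        items = self.data.split(' ')
        try:
            # Argument list, no shell, so client input cannot inject commands
            cmd = ['gmetric', '-n', items[0], '-v', items[1], '-t', 'float']
            subprocess.check_call(
                cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            # Reply over the socket; a value returned from handle() is discarded
            self.request.sendall("Success: %s" % ' '.join(cmd))
        except Exception:
            self.request.sendall("Error")

if __name__ == "__main__":
    HOST, PORT = "0.0.0.0", 8001
    server = SocketServer.TCPServer((HOST, PORT), GmetricTCPHandler)
    server.serve_forever()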

Gmetric TCP Interface

Page 31: Metrics with Ganglia

sales 200

Gmetric TCP
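
Sending that message from Python takes a few lines. The sketch below is written for Python 3 and assumes the TCP interface above is listening on port 8001. The function names and the host passed in are up to the caller. It does not wait for a reply.

```python
import socket


def format_payload(name, value):
    """Encode a metric as the 'name value' message the TCP handler splits on."""
    return ("%s %s" % (name, value)).encode("ascii")


def send_metric(host, port, name, value, timeout=5):
    """Open a connection, send one metric, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(format_payload(name, value))


print(format_payload("sales", 200))
```

A call such as send_metric("localhost", 8001, "sales", 200) would then deliver the same "sales 200" message.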

Page 32: Metrics with Ganglia

Syslog

Syslog is a standard for logging program messages. It allows separation of the software that generates messages from the system that stores them and the software that reports and analyzes them.

Wikipedia
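
As an illustration of that separation, an application can hand its messages to syslog and leave storage and analysis to other tools. The sketch below uses SysLogHandler from the Python standard library. The logger name, the message and the localhost:514 UDP address are assumptions, and what happens to the messages depends on how the local syslog daemon is configured.

```python
import logging
import logging.handlers


def build_syslog_logger(name, address=("localhost", 514)):
    """Return a logger that forwards records to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    # Prefix each message with the logger name so it can be filtered later
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical application name and event
log = build_syslog_logger("myapp")
log.info("signup completed user_id=%s", 42)
```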

Page 33: Metrics with Ganglia

Loggly - Logging as a Service

Page 34: Metrics with Ganglia

View logs

Page 35: Metrics with Ganglia

Logstash

Page 36: Metrics with Ganglia

Graylog2

Page 37: Metrics with Ganglia

Other Things You Could Monitor

- Database table sizes

- Cache hits

- Time taken for test runs

- Codebase size

- Signups, sales, subscriptions

- Twitter followers

Page 38: Metrics with Ganglia

What Next?

- Wikipedia: http://ganglia.wikimedia.org/

- Install Ganglia: deb and rpm packages available

- Add system metrics: web servers, databases

- Add business metrics: users, sales, tweets

- Try Loggly or at least investigate syslog

Page 39: Metrics with Ganglia

Reading

Page 40: Metrics with Ganglia

CBGN11

2 months free on FreeAgent

Page 41: Metrics with Ganglia

Questions?

http://flickr.com/photos/psd/102332391/