Operations Playbook: Monitoring and Automation - RightScale Compute 2013

Automation + Monitoring Chris Deutsch, RightScale Operations Cloud Management Platform

Upload: rightscale

Post on 20-Aug-2015

Page 1: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

Automation + Monitoring Chris Deutsch, RightScale Operations

Cloud Management Platform

Page 2: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

What I'll be talking about
• Meet RightScale Operations
• Monitoring
  o How monitoring works on RightScale
  o How to build a custom monitor
  o How we monitor web servers and Cassandra
• Automation
  o The RightScale API
  o The chimp command line tool
  o How we automate releases
• Tips from Ops

Page 3: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

RightScale Operations
• Deployed over 5 continents
• Over 700 cloud servers administered
• RightScale runs on RightScale

Page 4: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: what is it?
• open source metric collection tool
• modular architecture
• uses the ubiquitous rrdtool
• more information: http://collectd.org/

Page 5: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: built-in plugins
• host monitoring
  o cpu
  o disk space
  o disk I/O
  o memory
  o network
• application monitoring
  o process state
  o memory use
  o cpu usage

Page 6: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

How does monitoring work?

Page 7: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

How does monitoring work?

Page 8: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

How does monitoring work?

Page 9: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

How does monitoring work?

Page 10: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins
• Custom plugins written using the Exec plugin
• Can be written in any language
• Ruby, Python and Perl are common
• Simple

Page 11: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins
What we're going to look at:
• building an example monitor using the Exec plugin
• HTTP error code monitor
• Cassandra database server monitor

Page 12: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

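collectd: custom plugins: example
Reference: https://collectd.org/wiki/index.php/Plugin:Exec

example.rb:

#!/usr/bin/ruby
# Emit one sample every 20 seconds in the Exec plugin's PUTVAL format.
$stdout.sync = true  # flush each line so collectd reads it right away
while true do
  time = Time.now.to_i
  # The identifier must be quoted, so the inner quotes are escaped.
  puts "PUTVAL \"host/cpu-0/cpu_overview\" interval=20 #{time}:1"
  sleep 20
end

/etc/init.d/collectd/example.conf:

<Plugin exec>
  Exec "nobody" "example.rb"
</Plugin>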
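The example script hard-codes the host name and the interval. As a hedged variation, the sketch below reads both from the COLLECTD_HOSTNAME and COLLECTD_INTERVAL environment variables, which the Exec plugin documentation describes as being set for the scripts it launches. The helper names (putval_line, sample_value, emit_samples) and the fallback defaults are assumptions made for illustration and are not from the talk.

```ruby
# Sketch of an Exec-plugin script that takes host and interval from the
# environment instead of hard-coding them. Helper names are hypothetical.

# Build one PUTVAL line; identifier is "host/plugin-instance/type-instance".
def putval_line(identifier, interval, time, value)
  "PUTVAL \"#{identifier}\" interval=#{interval} #{time}:#{value}"
end

# Placeholder reading (assumption): a real plugin would measure something.
def sample_value
  1
end

# Write `iterations` samples to `out`, sleeping one interval between them.
def emit_samples(out, iterations)
  host = ENV.fetch("COLLECTD_HOSTNAME", "localhost")     # fallback is an assumption
  interval = ENV.fetch("COLLECTD_INTERVAL", "20").to_f   # fallback is an assumption
  out.sync = true                                        # do not buffer output
  iterations.times do |i|
    out.puts putval_line("#{host}/cpu-0/cpu_overview", interval, Time.now.to_i, sample_value)
    sleep interval if i < iterations - 1
  end
end
```

Under collectd the script would call the same logic in an endless loop against $stdout; the fixed iteration count here only makes the sketch easy to try by hand.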

Page 13: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins: http codes

Page 14: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins: http codes

https://gist.github.com/christopherdeutsch/db2380a47b62730ddf69
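The gist linked above is the speaker's own monitor and is not reproduced here. As a rough, independent sketch of the same idea, the code below counts HTTP status classes (2xx, 4xx, 5xx, ...) in access-log lines and formats them as PUTVAL lines. It assumes the common/combined log format, where the status code follows the quoted request line; the plugin name http_codes and both function names are hypothetical.

```ruby
# Count HTTP status classes from access-log lines (assumed common log format).
def count_status_classes(lines)
  status_pattern = /" (\d{3}) /   # three-digit status right after the quoted request
  counts = Hash.new(0)
  lines.each do |line|
    match = status_pattern.match(line)
    next unless match              # skip lines that do not look like log entries
    counts["#{match[1][0]}xx"] += 1
  end
  counts
end

# Turn the counts into PUTVAL lines; "http_codes" is a made-up plugin name,
# "gauge" is one of collectd's standard types.
def putval_lines(counts, host, interval, time)
  counts.sort.map do |status_class, n|
    "PUTVAL \"#{host}/http_codes/gauge-#{status_class}\" interval=#{interval} #{time}:#{n}"
  end
end
```

A plugin built this way would typically read only the log lines written since its previous run, then print the result of putval_lines once per interval.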

Page 15: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins: cassandra
• Cassandra is a key-value data store (aka NoSQL) server
• data is stored on a ring
• a ring consists of nodes
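To make "data is stored on a ring" concrete, here is a deliberately simplified token-ring lookup. It is an illustration only, not Cassandra's partitioner: tokens are derived from node names with MD5 (an assumption made for brevity, since real clusters assign tokens), and replication is ignored. The function names are hypothetical.

```ruby
require "digest/md5"

# Map any string to a large integer position on the ring.
def token_for(text)
  Digest::MD5.hexdigest(text).to_i(16)
end

# Build the ring as a list of [token, node] pairs sorted by token.
def build_ring(nodes)
  nodes.map { |node| [token_for(node), node] }.sort
end

# A key belongs to the first node whose token is >= the key's token,
# wrapping around to the first node when the key sorts past the last token.
def node_for(ring, key)
  key_token = token_for(key)
  pair = ring.find { |token, _node| token >= key_token } || ring.first
  pair[1]
end
```

Because every key maps to exactly one position, a monitor can reason about the ring node by node: if a node is down, the keys in its range are the ones affected.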

Page 16: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

collectd: custom plugins: cassandra

Page 17: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: the rightscale api

Page 18: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: the rightscale api
• RightScale API is RESTful and easy to traverse
• right_api_client - Ruby client library
• CloudFlows - the future
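"Easy to traverse" usually means each resource carries links to its related resources. The slides do not show a response body, so the sketch below uses a made-up document with an assumed "links" array of rel/href pairs; the resource names, hrefs and the link_href helper are all hypothetical. It performs no network calls and does not use right_api_client.

```ruby
require "json"

# Hypothetical response body; the field names and hrefs are assumptions.
def sample_server_json
  <<~JSON
    {
      "name": "app-server-1",
      "links": [
        {"rel": "self", "href": "/api/servers/1"},
        {"rel": "deployment", "href": "/api/deployments/7"},
        {"rel": "current_instance", "href": "/api/clouds/1/instances/ABC"}
      ]
    }
  JSON
end

# Return the href for the given relation, or nil if the resource has no such link.
def link_href(resource, rel)
  link = resource.fetch("links", []).find { |candidate| candidate["rel"] == rel }
  link && link["href"]
end
```

Following a link is then just another GET against the returned href, which is what lets a script walk from a server to its deployment without building URLs by hand.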

Page 19: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: the command line
• needed a tool that would let us be lazy
• the "chimp" executes commands on servers
• let's jump into a demo

Page 20: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: chimp
• select what to update using tags
• update across multiple deployments
• update one server at a time so service isn't disrupted
• track success/failure
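The behaviour listed above (select by tag, update one server at a time, track success and failure) can be sketched in a few lines of plain Ruby. This is not chimp's implementation or its command-line interface; the server hashes, tag strings and function names are assumptions chosen for illustration.

```ruby
# Pick the servers that carry a given tag. Servers are plain hashes here.
def select_by_tag(servers, tag)
  servers.select { |server| server[:tags].include?(tag) }
end

# Run the block against one server at a time and record the outcome of each,
# so a failure on one server is tracked without losing the rest of the run.
def rolling_run(servers)
  results = { succeeded: [], failed: [] }
  servers.each do |server|
    begin
      yield server
      results[:succeeded] << server[:name]
    rescue StandardError => e
      results[:failed] << [server[:name], e.message]
    end
  end
  results
end
```

This sketch keeps going after a failure; a stricter rolling update could stop at the first failed server so that a bad change never reaches the remaining ones.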

Page 21: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: scripting languages
• having a command line tool lets us use scripting languages like bash or ruby to automate common tasks
• we ended up using Ruby rake files to tie it all together

Page 22: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: a RightScale release
• chimp used to run commands on servers
• supports "rolling" operations
• uses tag service to scope operations
• we use rake to organise tasks that make up a release
• developed chimpd so we could run more commands in parallel
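The last bullet is about running more commands in parallel. As a generic illustration of that idea, and not a description of how chimpd works internally, the sketch below runs jobs through a fixed number of worker threads and records which succeeded or failed. The function name and the result format are assumptions.

```ruby
# Run each job through the given block using a bounded pool of worker threads.
# Returns an array of [job, :ok, value] or [job, :failed, message] entries.
def run_in_parallel(jobs, workers: 4, &block)
  queue = Queue.new
  jobs.each { |job| queue << job }
  results = Queue.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true)          # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                    # no work left for this worker
        end
        begin
          results << [job, :ok, block.call(job)]
        rescue StandardError => e
          results << [job, :failed, e.message]
        end
      end
    end
  end

  threads.each(&:join)
  Array.new(results.size) { results.pop }
end
```

Results arrive in completion order rather than submission order, so callers that care about ordering should sort by job. Capping the worker count is what keeps a large release from overwhelming the systems it talks to.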

Page 23: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

automation: chimp release
• RightScale has released chimp as open source!
• gem install right_chimp

Page 24: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

tips
• assume instances will die eventually
• always reboot test ServerTemplates
• use tags. everywhere. all the time.
• use chimp to make ad-hoc queries
• monitor not just host metrics but system metrics
• design everything to be runnable in a server array

Page 25: Operations Playbook: Monitoring and Automation - RightScale Compute 2013

Thanks!

Chris Deutsch, RightScale Operations
@ispeakdeutsch
