nagios conference 2012 - sheeri cabral - alerting with mysql and nagios
DESCRIPTION
Sheeri Cabral's presentation on using Nagios plugin to generate dynamic values for MySQL Alerts. The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcnaTRANSCRIPT
![Page 1: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/1.jpg)
Alerting With MySQL and Nagios
http://bit.ly/nagios_mysql2012
Sheeri CabralSenior DB Admin/Architect
Mozilla@sheeri
![Page 2: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/2.jpg)
What is Monitoring?
Threshold alerting
Graphing/trending
![Page 3: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/3.jpg)
Why Monitor?
Problem alerting
Find patterns– capacity planning
– troubleshooting
Early warning for potential issues
![Page 4: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/4.jpg)
What to Alert?
Problems you can fix– max_connections
– long running queries
– locked queries
– backup disk space
![Page 5: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/5.jpg)
Nagios is great because...
...anyone can write a plugin
![Page 6: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/6.jpg)
The problem with Nagios...
...anyone can write a plugin
![Page 7: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/7.jpg)
Official Nagios Pluginsfor MySQL
check_mysql
check_mysql_query
![Page 8: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/8.jpg)
check_mysql
db connectivity slave running slave lag using seconds_behind_master
![Page 9: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/9.jpg)
check_mysql_query
Checks the output of a query is within a certain range (numerical)
![Page 10: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/10.jpg)
System vars Status vars Caching Calculations
check_mysql yes yes no no
check_mysql (standard) no yes no no
check_mysqld_status no yes no no
check_mysql_stats yes no yes no
check_mysqld no many no no
check_mysql_health yes yes no Hard-coded
check_mysql yes yes no Hard-coded
Third party plugins
![Page 11: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/11.jpg)
“Let us know what you'd like to see!”
![Page 12: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/12.jpg)
What I wanted
System vars Status vars Caching Calculations
mysql_health_check.pl yes yes yes flexible
![Page 13: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/13.jpg)
mysql_health_check.pl
![Page 14: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/14.jpg)
Caching
Save information to a file
![Page 15: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/15.jpg)
Caching
Save information to a file --cache-dir /path/to/dir/
![Page 16: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/16.jpg)
Caching
Save information to a file --cache-dir /path/to/dir/
Use the file instead of connecting again
![Page 17: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/17.jpg)
Caching
Save information to a file --cache-dir /path/to/dir/
Use the file instead of connecting again
--max-cache-age <seconds>
![Page 18: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/18.jpg)
Caching
Save information to a file --cache-dir /path/to/dir/
Use the file instead of connecting again
--max-cache-age <seconds>
--no-cache to force connection
![Page 19: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/19.jpg)
--mode=varcomp
%metadata{varstatus}
![Page 20: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/20.jpg)
--mode=varcomp
%metadata{varstatus} SHOW GLOBAL VARIABLES SHOW GLOBAL STATUS
![Page 21: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/21.jpg)
--mode=varcomp
%metadata{varstatus} SHOW GLOBAL VARIABLES SHOW GLOBAL STATUS
![Page 22: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/22.jpg)
--mode=varcomp
%metadata{varstatus} SHOW GLOBAL VARIABLES SHOW GLOBAL STATUS
--expression allows word replacement
![Page 23: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/23.jpg)
--mode=varcomp
%metadata{varstatus} SHOW GLOBAL VARIABLES SHOW GLOBAL STATUS
--expression allows word replacement
--warning --critical are flexible
![Page 24: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/24.jpg)
Sample Command Definitiondefine command {
command_name check_mysql_tmp_tables
command_line $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass
![Page 25: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/25.jpg)
Sample Command Definitiondefine command {
command_name check_mysql_tmp_tables
command_line $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass
--cache-dir=/var/lib/nagios/mysql_cache
--max-cache-age=300
![Page 26: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/26.jpg)
Sample Command Definitiondefine command {
command_name check_mysql_tmp_tables
command_line $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass
--cache-dir=/var/lib/nagios/mysql_cache
--max-cache-age=300
--mode=varcomp
--expression=
"Created_tmp_tables/Uptime_since_flush_status"
--warning=">(8/60)" --critical=">(10/60)"
}
![Page 27: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/27.jpg)
Sample Command Definition
--mode=varcomp
--expression=
"Created_tmp_tables/Uptime_since_flush_status"
--warning=">(8/60)" --critical=">(10/60)"
}
![Page 28: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/28.jpg)
Sample Service Definition
define service { use generic-service host_name __HOSTNAME__ service_description MySQL Temp Tables check_command check_mysql_tmp_tables}
![Page 29: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/29.jpg)
Rates Compare to last run
![Page 30: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/30.jpg)
Rates Compare to last run
mode=lastrun-varcomp
![Page 31: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/31.jpg)
Rates Compare to last run
mode=lastrun-varcomp
current{expr}
![Page 32: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/32.jpg)
Rates Compare to last run
mode=lastrun-varcomp
current{expr}
lastrun{expr}
![Page 33: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/33.jpg)
Query Rate
mysql_health_check.pl [host,user,pass]
![Page 34: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/34.jpg)
mysql_health_check.pl [host,user,pass]
--mode lastrun-varcomp
Query Rate
![Page 35: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/35.jpg)
mysql_health_check.pl [host,user,pass]
--mode lastrun-varcomp
--expression "(current{Queries} - lastrun{Queries})
Query Rate
![Page 36: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/36.jpg)
mysql_health_check.pl [host,user,pass]
--mode lastrun-varcomp
--expression "(current{Queries} - lastrun{Queries})
/ (current{Uptime} – lastrun{Uptime})"
Query Rate
![Page 37: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/37.jpg)
mysql_health_check.pl [host,user,pass]
--mode lastrun-varcomp
--expression "(current{Queries} - lastrun{Queries})
/ (current{Uptime} – lastrun{Uptime})"
--comparison ">80"
Query Rate
![Page 38: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/38.jpg)
Other Modes
--mode=long-query --mode=locked-query
%metadata{proc_list} SHOW FULL PROCESSLIST
![Page 39: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/39.jpg)
Sample Command Definition
define command {
command_name check_mysql_locked_queries
command_line $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass
--cache-dir=/var/lib/nagios/mysql_cache
--max-cache-age=300
--mode=locked-query
--warning=$ARG1$ --critical=$ARG2$
}
![Page 40: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/40.jpg)
Extending Information
sub fetch_server_meta_data {}
add a new hash key to %metadata
$metadata{varstatus} =
$dbh->selectall_arrayref(
q|SHOW GLOBAL VARIABLES|);
![Page 41: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/41.jpg)
Extending Information
sub fetch_server_meta_data {}
add a new hash key to %metadata
$metadata{innodb_status} =
$dbh->selectall_arrayref(
q|SHOW ENGINE INNODB STATUS|);
![Page 42: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/42.jpg)
For example
• %metadata{innodb_status}– SHOW ENGINE INNODB STATUS
• Already exists, unused
![Page 43: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/43.jpg)
For example
• %metadata{innodb_status}– SHOW ENGINE INNODB STATUS
• Already exists, unused
• %metadata{master_status}– SHOW MASTER STATUS
![Page 44: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/44.jpg)
For example
• %metadata{innodb_status}– SHOW ENGINE INNODB STATUS
• Already exists, unused
• %metadata{master_status}– SHOW MASTER STATUS
• %metadata{slave_status}– SHOW SLAVE STATUS
![Page 45: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/45.jpg)
“Standard” checks
% max connections
--expression 'Threads_connected/max_connections*100'
![Page 46: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/46.jpg)
“Standard” checks
% max connections
--expression 'Threads_connected/max_connections*100'
InnoDB enabled
--expression “have_innodb”
--comparison=“ne 'YES'”
![Page 47: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/47.jpg)
Did MySQL Crash?
Nagios set to check every 5 minutes
![Page 48: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/48.jpg)
Did MySQL Crash?
Nagios set to check every 5 minutes
Might miss a crash
![Page 49: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/49.jpg)
Did MySQL Crash?
Nagios set to check every 5 minutes
Might miss a crash
Uptime!
--expression 'Uptime'
--comparison=“<1800”
![Page 50: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/50.jpg)
“Standard” checks
read_only for slaves
--expression “read_only”
--comparison=“ne 'YES'”
![Page 51: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/51.jpg)
“Standard” checks
read_only for slaves
--expression “read_only”
--comparison=“ne 'YES'”
% of sleeping connections
# connected, # running, # max connections
![Page 52: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/52.jpg)
“Standard” checks
read_only for slaves
--expression “read_only”
--comparison=“ne 'YES'”
% of sleeping connections
# connected, # running, # max connections
--expression="(Threads_connected-Threads_running)/max_connections * 100"
--comparison=">$ARG1$"
![Page 53: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/53.jpg)
Limitations
One check/calculation per Nagios service– But, you can use many variables
– Cached output
Does not output for performance data– Not hard to modify, just no need yet
![Page 54: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/54.jpg)
Where to get it
https://github.com/palominodb/PalominoDB-Public-Code-Repository/tree/master/nagios/
www.palominodb.com->Community->Projects
![Page 55: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/55.jpg)
Other Nagios Plugins
![Page 56: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/56.jpg)
Nagios Plugin for Partitions
table_partitions.pl --host
--user --pass
--database --table
--range [days|weeks|months]
--verify #
![Page 57: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/57.jpg)
Nagios Plugin for Partitions
table_partitions.pl --host db1.mozilla.com
--user nagiosuser --pass nagiospass
--database addons --table user_addons
--range months
--verify 3
![Page 58: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/58.jpg)
After using pt-table-checksum (master)
Slaves have a table with checksum
this_crc vs master_crc
Nagios Plugin for Checksums
![Page 59: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/59.jpg)
Nagios Plugin for Checksums
check_table_checksums.pl -H host
--user username --pass password
-T checksum_table
-I checksum_freshness
-b dbs,to,skip
https://bugzilla.mozilla.org/show_bug.cgi?id=789058
![Page 60: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/60.jpg)
Mozilla is Hiring!
http://careers.mozilla.org/
DBA – MySQL, Postgres, NoSQL (some or all)
http://careers.mozilla.org/en-US/position/oTqLWfwK
SysAdmin – RHEL, CentOS, Fedora
http://careers.mozilla.org/en-US/position/o97xWfwtUS, Canada, UK, France, Spain, Germany, Netherlands,
Sweden, Denmark, Poland, China, Japan, New Zealand
In Mountain View, Senior Windows/Unix SysAdmin
http://careers.mozilla.org/en-US/position/oZmJWfwK
![Page 61: Nagios Conference 2012 - Sheeri Cabral - Alerting With MySQL and Nagios](https://reader034.vdocuments.us/reader034/viewer/2022042623/549823aaac7959412e8b550c/html5/thumbnails/61.jpg)
More Resources
www.palominodb.com->Community->Projects
kimtag.com/mysql
OurSQL podcast (oursql.com)
slides: http://bit.ly/nagios_mysql2012
MySQL Administrator's Bible
youtube.com/tcation
http://planet.mysql.com