Applications and System Monitoring with Opsview/Nagios

RAL Retreat 2011

Copyright © 2011, University Corporation for Atmospheric Research (UCAR). All rights reserved.

What is Opsview


Opsview vs. Nagios

[Comparison columns: Nagios, Opsview]


Opsview vs. SNMP

[Diagram: Opsview Server, with SNMP protocols (SNMP, SNMP Trap) and Nagios proprietary protocols (NRPE, NSCA)]


Opsview vs. RAL Tools

Opsview


Opsview Concepts

Host

Services: Fan Speed, Temperature, Ping, Memory/RAM, Clock Synchronization, CPU, Disk Space, RAID

Service Checks


How does it work?

Opsview Server

Nagios Remote Plugin Executor (NRPE)

Scheduled Execution

Allowed Commands Only


What about firewalls?

Opsview Server

Nagios Service Check Acceptor (NSCA)

No response?


Opsview Web Interface

[Screenshot labels: Comment, Scheduled Downtime, Graph Available, Unhandled Colors, Handled Colors]


Automatic Notification Management

Notification only after multiple failures

[Diagram labels: Ping, CPU, Disk, Memory, NRPE Dependencies]

OK, Critical, OK, Warning, Critical = Flap Detection
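
The "notification only after multiple failures" rule can be illustrated with a small sketch. It is a simplification, not Opsview's actual logic: the function name `first_notification` and the idea of passing a retry limit plus a list of exit codes are assumptions made for illustration.

```sh
# Hypothetical helper: given a retry limit and a series of check results
# (plug-in exit codes, oldest first), print the 1-based position of the
# check at which a notification would first be sent, or "none".
first_notification() {
    max_attempts=$1; shift
    failures=0
    position=0
    for code in "$@"; do
        position=$((position + 1))
        if [ "$code" -eq 0 ]; then
            failures=0                     # an OK result resets the count
        else
            failures=$((failures + 1))     # Warning, Critical or Unknown
            if [ "$failures" -ge "$max_attempts" ]; then
                echo "$position"
                return 0
            fi
        fi
    done
    echo "none"
}
```

With a limit of 3, a service that alternates between OK and Critical never notifies, because each OK result resets the failure count; a state that changes this often is what flap detection is meant to catch.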


Manual Notification Management


Accessing Opsview

•  Multiple instances at RAL
•  E-mail helpdesk@rap.ucar.edu to request access.
•  SNAT instance: https://opsview.rap.ucar.edu
•  Just type "opsview" in your browser.


Custom Nagios Plug-in Development

A Nagios plug-in is any executable that:
– Prints a one-line status to stdout; and
– Has an exit code to indicate status:
   •  0 – OK
   •  1 – Warning
   •  2 – Critical
   •  3 – Unknown; and
– Optionally appends performance data to the one-line status:
   •  |'Graph Label'=value;warning threshold;critical threshold;min y-axis value;max y-axis value
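
The definition above can be sketched as a shell function. This is a minimal illustration, not one of the RAL plug-ins: the function name `check_threshold`, its arguments, and the wording of the status lines are all assumptions.

```sh
# Hypothetical check: compare an integer value against warning and critical
# thresholds, print a one-line status, and return the matching status code.
check_threshold() {
    label=$1; value=$2; warn=$3; crit=$4
    case $value in
        ''|*[!0-9]*)
            echo "UNKNOWN - $label value '$value' is not a non-negative integer"
            return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value (limit $crit)"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value (limit $warn)"
        return 1
    fi
    echo "OK - $label is $value"
    return 0
}
```

A script built on this would call the function once and finish with `exit $?`, so that the function's return value becomes the exit code of the plug-in.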


Performance Data

|'Graph Label'=value;warning threshold;error threshold;min y-axis value;max y-axis value

|'Age of madis decoded'=4389s;6400;7200;0;7500
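
A plug-in can assemble this suffix with a single `printf`. The sketch below assumes a hypothetical helper named `status_line`; the status text passed to it in the test is invented, while the label and numbers are those of the example above.

```sh
# Hypothetical helper: print a status line with performance data appended.
# Arguments: status text, graph label, value (with optional unit),
# warning threshold, error threshold, min y-axis value, max y-axis value.
status_line() {
    printf "%s |'%s'=%s;%s;%s;%s;%s\n" "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}
```

Everything before the `|` is the human-readable status; everything after it is the performance data used for graphing.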


Passive Checks

•  Means your software is sending a message to a Nagios server

•  Formatted string provided at stdin for send_nsca executable

•  Perl API developed (cvs/apps/nagios/src/passive)
   – Subroutine takes parameters and re-formats as necessary

•  Service check must still be configured in Opsview


Active Check Example

Many re-usable plug-ins in cvs/apps/nagios/src/plugins

Usage ./check_mdv_data_time.pl
   -u <mdvUrl>         full URL to the MDV data set
   -l <maxDataAge>     maximum age of the latest data before being considered late (seconds)
   -m <maxDataAge>     maximum age of the latest data before being considered missing (seconds)
   -n <dataSetName>    name of the data set -- used in an alert message
   -h                  show this message
Example:
   ./check_mdv_data_time.pl -u mdvp:://<host>::<path> -l 1200 -m 3600 -n MyData


Passive Check Example

A passive check is a call from application code to the send_nsca binary, which reads one line per result on stdin:

<hostname>[tab]<descrip>[tab]<return_code>[tab]<plugin_output>[newline]

printf "magen-c1-int1\tdata archive\t0\tdata archive was successful\n" | send_nsca -H magen-dev-admin -c nagios/etc/send_nsca.cfg
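
Building the line in a small function keeps the tab-separated format in one place. This is a sketch under assumptions: the function name `nsca_message` and its return-code validation are not part of the RAL code.

```sh
# Hypothetical helper: print one passive-check result in the tab-separated
# format read by send_nsca.
# Arguments: hostname, service description, return code (0-3), plugin output.
nsca_message() {
    case $3 in
        0|1|2|3) ;;
        *) echo "return code must be 0-3, got '$3'" >&2
           return 1 ;;
    esac
    # printf expands \t and \n itself, regardless of which shell runs it.
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Usage (requires the send_nsca binary and its configuration file):
# nsca_message magen-c1-int1 "data archive" 0 "data archive was successful" \
#     | send_nsca -H magen-dev-admin -c nagios/etc/send_nsca.cfg
```

`printf` is used rather than `echo` because shells differ on whether `echo` expands `\t`.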


Passive Check Example

There is a Perl API available to make this call easier:

require "cvs/apps/nagios/src/passive/perl/sendNSCA.pm";
…
($success, $errorMsgs) = &sendNSCA(
    "magen-dev-admin",    # nagios host
    "magen-c1-int1",      # host where service is checked
    "data archive",       # service check name
    0, "data archive was successful" );    # status and message


Installing Custom Plug-ins

•  Copy plug-in files to: nagios/libexec
•  Edit nagios/etc/nrpe.cfg to allow plug-ins to be used
•  Restart the opsview-agent process
•  Configure active check using Opsview
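
NRPE runs only commands defined in its configuration, on lines of the form `command[<name>]=<command line>`. The sketch below prints such a line; the helper name `nrpe_command`, the command name, and the paths in the commented example are placeholders, not RAL's actual setup.

```sh
# Hypothetical helper: print an NRPE command definition line.
# $1 = command name referenced by the active check
# $2 = command line that NRPE runs for it
nrpe_command() {
    printf 'command[%s]=%s\n' "$1" "$2"
}

# Example (placeholder path and arguments), appended to the NRPE configuration:
# nrpe_command check_mdv_data_time \
#     "nagios/libexec/check_mdv_data_time.pl -u mdvp:://<host>::<path> -l 1200 -m 3600 -n MyData" \
#     >> nagios/etc/nrpe.cfg
```

The new definition takes effect only after the agent is restarted, as the steps above note.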


Saving and Installing Configurations

•  Opsview configuration is stored in a database
•  Only way to restore user/project configuration is to restore the database
