diagnostic system monitoring

Data Warehouse | 8/29/22 Diagnostic System Monitoring vs. Operational System Monitoring Kevin Jesse Data Warehouse Team | University IT David Andruczyk Web Services Team | University IT

Upload: kevin-jesse

Post on 30-Nov-2014


TRANSCRIPT

  • 1. Data Warehouse | July 13, 2013 Kevin Jesse Data Warehouse Team | University IT David Andruczyk Web Services Team | University IT
  • 2. Diagnostic monitoring refers to collecting ALL (or as many as possible) known system metrics at periodic intervals over time. This information shows fluctuations in areas of the system that may or may not impact operational use. It also provides detailed system metrics that can be used for further tuning.
  • 3. Operational monitoring refers to collecting KEY system metrics at periodic intervals over time. This information allows you to refine the initial configuration so that it is more tailored to your requirements. It also prepares you to address new problems that might appear on their own or following upgrades, increases in volume, or new deployments.
  • 4. Example check output:
      • Apache Server Status
          • OK 0.031554 seconds response time. Idle 29, busy 1, open slots 470
          • WARNING 0.029917 seconds response time. Idle 27, busy 353, open slots 120
      • Open Files
          • OK: Open files is 9028 of 819200
      • System Core Files
          • OK - 0 Core(s) found
      • Java JVM Threads
          • JMX OK - ThreadCount=352
          • JMX WARNING - ThreadCount=683
      • Total Number of Processes
          • PROCS CRITICAL: 770 processes
  • 5. Example check output:
      • Apache HTTP/HTTPS
          • HTTP OK: HTTP/1.1 200 OK - 245 bytes in 0.032 second response time
      • System CPU
          • 24 CPU, average load 3.2% < 50% : OK
      • System Disk Usage
          • DISK OK - free space: / 6717 MB (92% inode=99%)
      • System Memory
          • OK - 79444M free
      • System Interfaces
          • OK: host 'localhost', interfaces up: 7, down: 0, dormant: 0
  • 6. Benefits
      • Helps identify key operational metrics
      • Helps with a holistic view of a system (performing poorly vs. down)
      • Gives additional insight into the system
      • Allows for quicker understanding of a failure based on data
      • Proactive monitoring of services, which can forecast impending system failure
      • Allows SMEs to have more visibility
      • Gives vendors access to additional data for troubleshooting
    Risks or Downside
      • Overuse or redundant monitoring
      • Initial implementation can have a high technical cost with SME
      • Overwhelming amount of data to analyze
      • Alert overload from misconfiguration
      • Two systems to maintain (diagnostic and operational)
  • 7. Trend or Prediction Analysis
      • Identification of overall performance metrics
      • Misconfigurations in the larger system
      • Can help to identify and pinpoint system abuse
      • Early detection via warning signals that an abnormality is occurring helps avoid the shock/panic factor
      • Early detection of abnormalities vs. system down
      • Allows more time for analysis, assisting with scenario/what-if planning
      • Insight into enhancements that would otherwise go unnoticed
  • 8. Nagios, Cacti, AWStats, Logwatch, Up.Time, SCOM, Tripwire, SolarWinds, Zabbix, Munin, GroundWork, Big Brother, NfSen, MRTG, Hyperic HQ, Tivoli. See http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
  • 9. Diagnostic monitoring is something that SMEs specialize in along with their other skills. Many SMEs prefer to add a monitoring station as an individual component of a larger cluster or platform system. This helps an administrator focus on tuning rather than being affected by other alerts or misconfigurations in the monitoring station. Smaller systems with fewer overall metrics may not warrant standing up a dedicated monitoring station. These systems would benefit most from a collaborative, centralized diagnostic monitoring station.
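
The OK / WARNING / CRITICAL lines in the example check output follow the usual threshold pattern: a sampled metric is compared against a warning level and a critical level. The following is a minimal Python sketch of that pattern, not the configuration used in the slides. The `Check` class, the metric names, and every threshold value are hypothetical; only the sampled values (683 JVM threads, 770 processes) are taken from the transcript.

```python
from dataclasses import dataclass

# States ordered from least to most severe.
STATES = ("OK", "WARNING", "CRITICAL")


@dataclass
class Check:
    """A single threshold check. Thresholds here are illustrative assumptions."""
    name: str
    warn: float  # value at or above this is WARNING
    crit: float  # value at or above this is CRITICAL

    def evaluate(self, value: float) -> str:
        if value >= self.crit:
            return "CRITICAL"
        if value >= self.warn:
            return "WARNING"
        return "OK"


def run_checks(checks, sample):
    """Return {name: (state, value)} for each check that has a sampled value."""
    return {
        c.name: (c.evaluate(sample[c.name]), sample[c.name])
        for c in checks
        if c.name in sample
    }


def worst_state(results):
    """Return the most severe state in the results, or OK if there are none."""
    return max(
        (state for state, _ in results.values()),
        key=STATES.index,
        default="OK",
    )


# Hypothetical thresholds; sampled values are the ones shown on slide 4.
checks = [
    Check("jvm_threads", warn=500, crit=1000),
    Check("processes", warn=500, crit=700),
]
sample = {"jvm_threads": 683, "processes": 770}

results = run_checks(checks, sample)
for name, (state, value) in results.items():
    print(f"{name} {state} - value={value}")
print("overall:", worst_state(results))
```

In this framing, an operational setup would register only the few key checks that should page someone, while a diagnostic setup would record every sampled value over time regardless of state so that trends can be analyzed later.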