The Northwestern Mutual Life Insurance Company – Milwaukee, WI Application Monitoring Jeremy Kalsow

Post on 23-Dec-2015

The Northwestern Mutual Life Insurance Company – Milwaukee, WI

Application Monitoring

Jeremy Kalsow

Why Application Monitoring

• Majority of all corporations

• Northwestern Mutual

• Total 1,000+ servers

• Team is 6 people

• Team uses 16 servers

• Average 50 applications per server

• Need a way to know status fast

What is it?

• The ability to monitor performance and availability

• Gather metrics

• Show trends

• Pretty pictures for management

Why?

• Trends predict future problems

• Solve application issues faster

• Uptime relates directly to profit for many companies

• View all applications, servers, databases and other items being monitored with a single dashboard.

Types of Monitoring

• Fault

• Performance

• Configuration

• Security

• Accounting

Fault

• Detects major errors

• Easy to implement

• Examples
  – Network loss
  – Database connectivity

• Very Important

Fault

Component    | Type of Monitoring    | What to Monitor      | When to Monitor
Hardware     | CPU utilization       | CPU load             | Load > 99% for x minutes
Hardware     | Memory utilization    | Memory load          | Load > 99% for x minutes
Storage      | System                | Available space      | System out of space
Applications | Application available | Application working  | Working or error
Applications | Application logs      | Error log monitoring | If error occurred
Databases    | Database online       | Database is online   | Database is up/down
Network      | Latency               | Latency              | Latency > acceptable range
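
The "Load > 99% for x minutes" rule from the fault table can be sketched as a check over recent samples. This is a minimal illustration, not the presenter's implementation: the function name `sustained_breach`, the one-sample-per-minute cadence, and the sample values are all assumptions.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[float], threshold: float, min_consecutive: int) -> bool:
    """Return True if the most recent `min_consecutive` samples all exceed `threshold`."""
    if min_consecutive <= 0 or len(samples) < min_consecutive:
        return False
    return all(value > threshold for value in samples[-min_consecutive:])


# Hypothetical CPU load readings, one per minute (oldest first).
cpu_load = [42.0, 99.5, 99.7, 99.9, 100.0, 99.6]

# Here x = 5 minutes: alert only if the last five readings are all above 99%.
if sustained_breach(cpu_load, threshold=99.0, min_consecutive=5):
    print("FAULT: CPU load above 99% for 5 minutes")
```

Requiring several consecutive breaches, rather than alerting on a single reading, keeps short spikes from raising a fault.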

Performance

• Slow Performance

• Service Level Agreements

• Metrics

• Old and New Metrics

• Visual Display

Performance

http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html

Configuration

• Configuration variables

• Connectivity

• Speed

• Performance

• Proactive

• Servers and Applications

Configuration

• Why would the configuration change?

• Hardware

• Storage

• Service packs

• Hot fixes

• Windows Updates
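
One way to notice such changes is to compare configuration snapshots taken before and after a maintenance window. The sketch below assumes a snapshot is a flat dictionary; the keys and values (`service_pack`, `hotfix`, `HF-001`, and so on) are hypothetical and not taken from any real system.

```python
def diff_config(before: dict, after: dict) -> dict:
    """Map each changed key to its (old, new) values; None marks a missing key."""
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots of one server's configuration.
before = {"service_pack": "SP1", "disk_gb": 500, "max_connections": 200}
after = {"service_pack": "SP2", "disk_gb": 500, "max_connections": 200, "hotfix": "HF-001"}

for key, (old, new) in diff_config(before, after).items():
    print(f"{key}: {old!r} -> {new!r}")
```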

Security

• Attempts to access the system

• Open ports

• Inventories

• Firewall

• Packets

• System events

• Blocked Exploits

Accounting

• Monitors Usage

• Generally used for fees

• Profit/Loss

• Examples
  – Electric company
  – Northwestern Mutual

Types of Monitoring Recap

• Fault

• Performance

• Configuration

• Security

• Accounting

Types of Monitoring Recap

• Historical data

• Baseline test

• Current test

• Performance disagreements
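
Comparing a current test against a baseline can be sketched as a ratio of mean response times. The 10% tolerance, the function names, and the timings below are illustrative assumptions rather than figures from the presentation.

```python
from statistics import mean
from typing import Sequence


def regression_ratio(baseline_ms: Sequence[float], current_ms: Sequence[float]) -> float:
    """Ratio of the current mean response time to the baseline mean."""
    return mean(current_ms) / mean(baseline_ms)


def has_regressed(baseline_ms: Sequence[float], current_ms: Sequence[float],
                  tolerance: float = 0.10) -> bool:
    """True if the current run is slower than baseline by more than `tolerance`."""
    return regression_ratio(baseline_ms, current_ms) > 1.0 + tolerance


# Hypothetical response times in milliseconds.
baseline = [200.0, 210.0, 190.0]   # historical baseline test
current = [250.0, 260.0, 240.0]    # current test

print(f"ratio: {regression_ratio(baseline, current):.2f}")
print("regressed" if has_regressed(baseline, current) else "within tolerance")
```

Keeping both sets of numbers on record gives each side the same data to point to when a performance disagreement comes up.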

Types of Monitoring Recap

• Allows for trends to be seen

• Modifications can be made

• Trends over multiple releases
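
A trend can be turned into a forecast with a simple least-squares line. The sketch below assumes evenly spaced samples (for example one reading per day) and uses made-up disk-usage percentages; a straight-line fit is a simplification chosen for illustration, not a method named in the slides.

```python
from typing import Optional, Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for samples taken at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def samples_until(ys: Sequence[float], limit: float) -> Optional[float]:
    """Estimated number of further samples before the trend reaches `limit`.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_hit = (limit - intercept) / slope
    return max(0.0, x_hit - (len(ys) - 1))


# Hypothetical disk usage in percent, one reading per day.
disk_used = [70.0, 72.0, 74.0, 76.0, 78.0]
print(samples_until(disk_used, limit=100.0))
```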

Types of Monitoring Recap

• Monitoring is important

• Not enough time is given

• Implemented after discovery of an issue

• Monitoring only in areas of known problems

• Adding monitoring requires time and money

Challenges of Application Monitoring

• Various types of systems

• Shared

• Clustered

• Virtualized

• Production logging

Shared Systems

• 1 server / Multiple applications

• System resources are shared

• Tracking individual usage is difficult

• Many applications may be impacted

• Server without access (production)

Clustered Systems

• Applications on more than one server

• Avoid single point of failure

• May be hard to target the issue

Production Logging

• Generally Limited

• Most errors repeated in test

• Application downtime

• Use of company resources

Implement Application Monitoring

• Plan Early

• Monitor Proactively

• Create a Recovery Plan

• Create and use SLAs

Plan Early

• Planning stage

• Add monitoring during development

• Late additions cover known issues

Monitor Proactively

• Harder to implement

• Issues are dealt with before the end user knows

Monitor Proactively

• Tools-based approach

• Easy and relatively fast setup

• No code

• Multiple applications

Monitor Proactively

• Logging is directly in the code

• Less efficient

• More specific

• Developers have less time
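
Logging placed directly in the code might look like the decorator below, which records the duration and outcome of each call. It is a sketch under assumptions: the logger name `app.monitoring` and the function `lookup_policy` are hypothetical stand-ins, not part of any system described in the slides.

```python
import functools
import logging
import time

logger = logging.getLogger("app.monitoring")  # hypothetical logger name


def monitored(func):
    """Log how long each call takes and whether it succeeded or raised."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.exception("%s failed after %.1f ms", func.__name__, elapsed_ms)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s ok in %.1f ms", func.__name__, elapsed_ms)
        return result
    return wrapper


@monitored
def lookup_policy(policy_id: str) -> dict:
    """Hypothetical application function standing in for real business logic."""
    return {"policy_id": policy_id, "status": "active"}


print(lookup_policy("P-1"))
```

Each function has to be wrapped individually, which is where the extra developer time goes, but the resulting log entries name the exact call that was slow or failed.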

Create a Recovery Plan

• Fast resolution

• Knowledge management

Recovery Plan Template

Service Level Agreements

• The percentage of time that the services will be up (uptime)

• How many people can use the application at once without performance issues

• Performance metrics and benchmarks to be used with performance monitoring alerts

• The rules for notification announcements

• What statistics will be monitored, and when and where they will be available

• Acceptable response time
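
An uptime percentage in an SLA translates directly into a downtime budget. The sketch below assumes a 30-day reporting period and a 99.9% target, both chosen only for illustration; under those assumptions the budget comes to about 43.2 minutes.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Downtime budget, in minutes, implied by an uptime target over the period."""
    return period_days * MINUTES_PER_DAY * (100.0 - uptime_percent) / 100.0


def meets_sla(downtime_minutes: float, uptime_percent: float, period_days: int = 30) -> bool:
    """True if the measured downtime stays within the budget."""
    return downtime_minutes <= allowed_downtime_minutes(uptime_percent, period_days)


budget = allowed_downtime_minutes(99.9)
print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
print(meets_sla(downtime_minutes=30.0, uptime_percent=99.9))
```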

Service Level Agreements

Using the Statistics

• Visual display

• Alerts

• Tickets

Visual (Dashboard)

• Easily view statistics

• Comparison results

• Trend comparison

• Cross Platform

• Auto-generated management reports

Dashboard

Alerts and Tickets

• Auto-generated alerts

• Tickets for queue system

• Vital information in each

Alerts and Tickets

• Most common: Email

• Text, popup, printout, recording and more

• Tickets: auto-generated

• Knowledge databases

• Common fixes and resolutions
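
An auto-generated email alert that carries the vital information and a known fix could be assembled as below. The `Alert` fields, the addresses, and the server and application names are hypothetical; the message is only built here, and sending it (for example through an SMTP relay) is left out.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Alert:
    """Vital information for one alert (field names are illustrative)."""
    application: str
    server: str
    check: str
    detail: str
    known_fix: str = ""


def to_email(alert: Alert, sender: str, recipient: str) -> EmailMessage:
    """Build, but do not send, an email describing the alert."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[ALERT] {alert.application} on {alert.server}: {alert.check}"
    lines = [
        f"Application: {alert.application}",
        f"Server: {alert.server}",
        f"Check: {alert.check}",
        f"Detail: {alert.detail}",
    ]
    if alert.known_fix:
        # Pulled from a knowledge database of common fixes, if one exists.
        lines.append(f"Known fix: {alert.known_fix}")
    msg.set_content("\n".join(lines))
    return msg


alert = Alert(
    application="claims-portal",          # hypothetical application
    server="app-server-07",               # hypothetical server
    check="database connectivity",
    detail="connection refused",
    known_fix="restart the connection pool",
)
msg = to_email(alert, "monitor@example.com", "oncall@example.com")
print(msg["Subject"])
```

The same `Alert` record could feed a ticket in a queue system, so the alert and the ticket carry identical details.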

Application Monitoring

• Maximize application uptime

• Higher end-user satisfaction

• Higher Profit

References

• Polozoff, A. (2003, April 9). Proactive Application Monitoring. IBM - United States. Retrieved October 20, 2011, from http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html 

• Choice. (2009, December 20). Application Monitoring. Adminschoice - Unix Made Easy. Retrieved October 31, 2011, from http://adminschoice.com/application-monitoring

• Application Monitoring Software - uptime software. (n.d.). Server Monitoring Software - IT Systems Management, Capacity Planning, Application and Server Monitoring Tool by uptime software. Retrieved October 31, 2011, from http://www.uptimesoftware.com/application-monitoring.php 

• Marko, K. (2005, December 30). Proactive Application Monitoring. Processor.com: Data Center IT Equipment at Processor, Routers, Storage, Rackmount Servers, Computer Room Cabling and Flooring. Retrieved October 29, 2011, from http://www.processor.com/editorial/article.asp?article=articles%2Fp2752%2F43p52%2F43p52.asp

• IT Service Level Agreement Templates | ContinuityPlanTemplates. (n.d.). ContinuityPlanTemplates | Free Business Continuity Plan (BCP) Templates. Retrieved October 30, 2011, from http://www.continuityplantemplates.com/it-service-level-agreement-templates

XML

Upcoming events with Dashboard

• Ability to display visualized graphs and other pertinent information

• Ability to click a failed component and have the system auto-generate a ticket

• Ability to alert others of the issue found

• Performance monitoring as well as fault