neebula e book event consoles-slideshare

eBook

Upload: kateneeb

Post on 14-Jun-2015


Category:

Technology



DESCRIPTION

Any NOC operator is familiar with the Sisyphean task of handling hundreds, or even thousands, of console messages day in and day out. Even when the right event is tracked, its cryptic language does not provide much context to the problem or help assess the impact on the business. This free eBook provides guidelines and tips on how to optimize your event console to significantly improve the event management experience, correlate IT events to business services, and improve root cause analysis. You’ll also learn how you could take a leap forward by automating service mapping and impact assessment to significantly improve the way you manage events and IT changes.

TRANSCRIPT

Page 1: Neebula e book  event consoles-slideshare

eBook

Page 2: Neebula e book  event consoles-slideshare

Success factors for effective root cause analysis

www.neebula.com


Most companies do not use the event consoles to their fullest potential.

Introduction

Any NOC operator is familiar with the Sisyphean task of handling hundreds, or even thousands, of console messages day in and day out. Even when the right event is tracked, its cryptic language does not provide much context to the problem or help assess the impact on the business.

This eBook provides guidelines and tips on how to optimize your event console to significantly improve the event management experience, correlate IT events to business services, and improve root cause analysis. You’ll also learn how you could take a leap forward by automating service mapping and impact assessment to significantly improve the way you manage events and IT changes.


Page 3: Neebula e book  event consoles-slideshare

Common misuses of event consoles


NOC Operator challenges

While data centers keep expanding and increasing in complexity, NOC operators are still expected to handle the flood of events coming out of a growing number of systems.

But managing the sheer volume of events is only one of the challenges NOC operators face.

Event text is meaningless. It typically uses technical language that only domain experts can understand.

Context is missing. Event consoles categorize events based on the severity of the IT component (‘Spectrum: RadWare CHASSIS DOWN IP:10.0.1.101 is Fatal’). As an operator, you have no way of determining the context and impact of the event on the business service or services (e.g., the eBanking application). An event defined as ‘critical’ may in fact have a limited impact on a business service, or vice versa.
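To make the missing context concrete, here is a minimal Python sketch of an enrichment step that attaches business services to a raw console message. The service map, its single entry linking 10.0.1.101 to eBanking, and the `enrich` function are assumptions made for illustration; they are not taken from any particular event console.

```python
import re

# Hypothetical mapping from a component's IP address to the business
# services that depend on it. In practice this would come from a service map.
SERVICE_MAP = {
    "10.0.1.101": ["eBanking"],
}

# Matches the "IP:<address>" fragment seen in the sample message.
IP_PATTERN = re.compile(r"IP:(\d{1,3}(?:\.\d{1,3}){3})")


def enrich(event_text):
    """Attach business-service context to a raw console message."""
    match = IP_PATTERN.search(event_text)
    ip = match.group(1) if match else None
    return {
        "text": event_text,
        "ip": ip,
        # Unknown components get an empty list rather than a guess.
        "services": SERVICE_MAP.get(ip, []),
    }


event = enrich("Spectrum: RadWare CHASSIS DOWN IP:10.0.1.101 is Fatal")
print(event["services"])
```

With such a lookup in place, the operator sees which business service is affected alongside the technical severity, instead of the severity alone.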

Page 4: Neebula e book  event consoles-slideshare


Real events vs. derived events. A failure of one IT component typically triggers multiple derived event messages. For example, when a router fails, multiple error messages are triggered from all components behind the router. It is extremely difficult for operators to separate real events from derived events in order to identify the actual root cause of the problem.
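The router example can be sketched in Python as follows. This is only an illustration of the idea, assuming a simple tree-shaped topology in which every component sits behind at most one upstream component; the component names and the `root_causes` function are hypothetical.

```python
# Hypothetical topology: each component maps to the component it sits behind.
# None marks the top of the tree.
UPSTREAM = {
    "web-1": "switch-a",
    "web-2": "switch-a",
    "switch-a": "router-1",
    "router-1": None,
}


def root_causes(failing):
    """Return the failing components that have no failing component upstream.

    Everything else in `failing` is treated as a derived event.
    Assumes the topology has no cycles.
    """
    failing = set(failing)
    roots = set()
    for component in failing:
        parent = UPSTREAM.get(component)
        # Walk upstream until we hit another failing component or the top.
        while parent is not None and parent not in failing:
            parent = UPSTREAM.get(parent)
        if parent is None:
            roots.add(component)
    return roots


print(root_causes(["web-1", "web-2", "switch-a", "router-1"]))
```

When the router and everything behind it report errors, only the router is returned as a real event. When a single web server fails on its own, that server is the root cause.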

Impact is unclear. Event consoles provide no way of assessing the actual impact of an event. A web server that crashed may - or may not - be a critical event. This would depend on the wider context - whether there is a single web server, or whether the server is one of four servers in a server farm.
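The web server example amounts to a small rule: the impact of a failure depends on how many equivalent servers remain. A minimal Python sketch, assuming all servers in the farm are interchangeable; the function name and the labels "none", "degraded" and "outage" are made up for illustration.

```python
def assess_impact(total_servers, failed_servers):
    """Classify the impact of failures in a farm of interchangeable servers."""
    if total_servers <= 0 or not 0 <= failed_servers <= total_servers:
        raise ValueError("server counts are inconsistent")
    if failed_servers == 0:
        return "none"
    if failed_servers == total_servers:
        # Nothing is left to serve requests.
        return "outage"
    # Some capacity remains, so the service keeps running with less headroom.
    return "degraded"


print(assess_impact(1, 1))  # a single web server that crashed
print(assess_impact(4, 1))  # one of four servers in a farm
```

The same crash is an outage in the first case and only reduced capacity in the second, which is the distinction a standard event console does not make.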

To summarize, NOC operators manage a massive amount of ‘noise’ with very little help in assessing the impact of events and identifying the root cause of problems.

Next, we’ll look at some key techniques to alleviate these challenges, such as correlation and enrichment rules. Indeed, the capabilities we’ll review are part of most event consoles, but they are often misused, or not used at all.

Is it a real problem? Assessing the impact of an IT component failure on the business service is quite difficult using a standard event console.