4/10/18
OPMANTEK
NETWORK MANAGEMENT AND IT AUDIT SOFTWARE
Developing a Strategic NOC Service using Opmantek’s Commercial Open-Source Solutions, v2 – April 2018
Housekeeping
• Attendees will be on mute during the presentation to prevent interruptions from feedback and background noise.
• If you wish to ask a question, please use GoToWebinar’s chat.
• We will have a Q&A session at the end and have allowed plenty of time.
• This session will be recorded and made available to all attendees.
Topics for Today
• Differences Between a Traditional NOC Model and a Strategic NOC
• Developing a Service Catalog with Measurable SLAs
• Architecting a Solution for Fast Client On-Boarding and Scalability
• Identify Your Fully Loaded Costs in Deploying and Growing the Strategic NOC
IT Service Management Maturity Model
Level 0: CHAOTIC
• Ad Hoc
• Undocumented
• Unpredictable
• Multiple help desks
• Minimal IT operations
• User call notification
Level 1: REACTIVE
• Fight fires
• Inventory
• Desktop software distribution
• Initiate problem management process
• Alert and event management
• Measure component availability (up/down)
Level 2: PROACTIVE
• Analyze trends
• Set thresholds
• Predict problems
• Measure application availability
• Automate
• Mature problem, configuration, change, asset and performance mgmt. processes
Level 3: SERVICES
• IT as a service provider
• Define services, classes, pricing
• Understand costs
• Guarantee SLAs
• Measure and report service availability
• Integrate processes
• Capacity Mgmt.
Level 4: VALUE
• IT as a strategic business partner
• IT and business metric linkage
• IT/business collaboration improves business process
• Real-time infrastructure
• Business planning
Progression labels in the figure: Tool Leverage; Operational Process Engineering; Service Delivery Process Engineering; Service & Account Management; Manage IT as a Business
Axis: Increasing Performance & Value to Organization
DIFFERENCES BETWEEN A TRADITIONAL NOC MODEL AND A STRATEGIC NOC
Traditional NOC Model
• NOC is embedded within a larger network/server/application support process
• May include maintenance functions (e.g. remote desktop, patching, anti-virus, etc.)
• Staff time is split between fault resolution and routine equipment maintenance, including equipment refresh projects
• Monitoring focuses on equipment state (often siloed between network & server)
• Fault response is primarily reactive, with little or no automation
What the Traditional NOC model has to offer…
Strategic NOC
• Understands that Customers/Lines of Business care about user satisfaction, not equipment state
• Focus is on monitoring application performance and user experience, ensuring end-to-end quality of the network
• Offers a clearly defined list of services and SLAs, and ties these to pricing/value
• Initial fault response is automated
• Self Service is a key component at all levels
• NOC is actively involved with DevOps, Application Development, Test/QA, & Deployment
• NOC maintains proactive communication channels with Customers/Lines of Business
The strategic NOC model is about improving collaboration and response
Benefits
• Reduced time spent on routine responses
• Improved reaction time to UX-impacting faults
• Reduced time to fault resolution, and
• Ability to predict outages and degradation
Why Invest in Converting to a Strategic NOC Model?
DEVELOPING A SERVICE CATALOG WITH MEASURABLE SLAS
Service Catalog
• Application Monitoring (at least 2 tiers suggested; a 3-tier system is recommended)
  • Bronze: monitors underlying equipment and services required for application
  • Silver: Bronze plus custom automated response to defined faults
  • Gold: Silver plus synthetic transactions and deployed UX monitors
• Performance Trending
  • Trend underlying equipment to understand UX impact and investment needs
• Self Service
  • Offer custom dashboards so Customer/LOB can see application performance in near real time
What Services and SLAs will Your Strategic NOC Offer?
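One way to keep the tiers above consistent is to define each tier as the previous tier plus its additions. The following is a minimal sketch only; the class and function names are hypothetical and are not part of any Opmantek product.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ServiceTier:
    """One entry in a hypothetical service catalog."""
    name: str
    features: Tuple[str, ...]


# Each tier is built from the one below it, mirroring "Silver = Bronze plus ...".
BRONZE = ServiceTier("Bronze", ("monitor underlying equipment and services",))
SILVER = ServiceTier(
    "Silver",
    BRONZE.features + ("custom automated response to defined faults",),
)
GOLD = ServiceTier(
    "Gold",
    SILVER.features + ("synthetic transactions", "deployed UX monitors"),
)

CATALOG = {tier.name: tier for tier in (BRONZE, SILVER, GOLD)}


def includes(tier_name: str, feature: str) -> bool:
    """Return True if the named tier offers the given feature."""
    return feature in CATALOG[tier_name].features
```

Building tiers cumulatively means a change to Bronze automatically flows into Silver and Gold, so the catalog and the SLA paperwork are less likely to drift apart.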
Service Level Agreements
• How much downtime can your Customer/LOB absorb each day/week/month/quarter?
• How does network/application downtime affect business income?
• Fault/problem response time: usually measured to first touch, not resolution
• Hours/Days of operation: use of on-call technicians will impact Response Times
What Can Your NOC Team Support?
SLA Downtime
SLA       Minutes/Month   Hours/Year
97.00%    1,314.90        263.0
98.00%    876.60          175.3
99.00%    438.30          87.7
99.90%    43.83           8.8
99.99%    4.38            0.9
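The table values can be reproduced with a few lines of arithmetic. This sketch assumes a year of 365.25 days (8,766 hours) and twelve equal months, which is the assumption that matches the figures above; the function name is illustrative.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours; assumed, chosen to match the table


def allowed_downtime(sla_percent: float):
    """Return (minutes per month, hours per year) of downtime an SLA permits."""
    down_fraction = (100.0 - sla_percent) / 100.0
    hours_per_year = HOURS_PER_YEAR * down_fraction
    minutes_per_month = hours_per_year * 60 / 12
    return round(minutes_per_month, 2), round(hours_per_year, 1)


if __name__ == "__main__":
    for sla in (97.0, 98.0, 99.0, 99.9, 99.99):
        minutes, hours = allowed_downtime(sla)
        print(f"{sla:6.2f}%  {minutes:10,.2f} min/month  {hours:6.1f} h/year")
```

The same function can be used to test a proposed SLA against what the NOC team can realistically respond to within its hours of operation.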
ARCHITECTING A SOLUTION FOR FAST CLIENT ONBOARDING AND SCALABILITY
Open-Source
• NMIS: Core performance and fault monitoring
Commercial Solutions
• OAE: Scheduled discovery and auditing
• opHA: Supports horizontal and vertical scalability
• opCharts: Customer Portal w/customized dashboards
• opEvents: Automated event response
• opTrend: Predictive trend analytics
Architecting a Solution
Equipment Sizing
• NMIS – Performance and Fault Monitoring
  • An individual server w/6-8 vCPU and 16-24 GB RAM can support 3-5k devices
  • Total devices served by an individual server depends on the total number of interfaces collected, device latency and response time, and storage performance
• More devices than an individual NMIS server can support?
  • Add opHA to support horizontal and vertical scaling
How Much Equipment Is Needed to Support the Services and SLA?
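As a rough planning aid, the 3-5k devices-per-server guideline above can be turned into a best-case/worst-case poller count. This is a sketch under that single assumption; real capacity also depends on interface counts, latency, and storage, and the function names are illustrative.

```python
import math


def pollers_needed(total_devices: int, devices_per_server: int) -> int:
    """Smallest number of polling servers that covers total_devices."""
    if total_devices < 0 or devices_per_server <= 0:
        raise ValueError("device counts must be non-negative and capacity positive")
    return math.ceil(total_devices / devices_per_server)


def poller_range(total_devices: int, low_capacity: int = 3000, high_capacity: int = 5000):
    """Return (best case, worst case) server counts for the 3-5k guideline."""
    best = pollers_needed(total_devices, high_capacity)
    worst = pollers_needed(total_devices, low_capacity)
    return best, worst


if __name__ == "__main__":
    best, worst = poller_range(12_000)
    print(f"12,000 devices: between {best} and {worst} polling servers")
```

Planning against the worst-case figure leaves headroom for interface-heavy devices and slow links.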
Example Scaled Server Architecture
Slave: Poller01, Poller02, Poller03 (…)
Master: Master01, Master02 (…), Portal01 (…)
• Additional polling servers are added as needed for customer or geographic expansion
• Optionally, additional servers can be added to serve new NOCs if latency is > 100 ms
Opmantek Application Flow
[Figure: Opmantek application flow. Components shown: Subnet, Poller, Master, NMIS, opEvents, opConfig, opHA, opCharts, opReports, opFlow Collector, opFlow. Data-flow labels: cli data, syslog, SNMP / WMI, trap, Netflow Data, metadata, events, meta-events, api, service monitor, reports, summary, detail-Link.]
opCharts
WHY – Self-service dashboards reduce client interruptions while giving the client a sense of control and transparency; for billable clients this can be an up-sell or a service differentiator
• An implementation of opCharts is exposed to the internet via a reverse proxy
• Client accounts are created within opCharts; this can be scripted
• Custom Dashboards, Maps, Charts and Business Services are assigned to that user
• User can only see the elements you give them access to
Customer Portal with Customized Dashboards
opTrend
WHY – Equipment works differently in the real world than in the vendor’s best-case lab. By learning what is normal for each device, opTrend replaces static thresholds with device-specific baselines
Dynamic trending replaces static thresholds for alerting
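To illustrate the general idea of a dynamic threshold, the sketch below derives an alert level from each device's own history (mean plus k standard deviations). This is a generic textbook technique chosen for illustration; it is not a description of opTrend's actual algorithm, and the function names and the default k = 3 are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def dynamic_threshold(history: Sequence[float], k: float = 3.0) -> float:
    """Alert level based on the device's own baseline (needs >= 2 samples)."""
    return mean(history) + k * stdev(history)


def is_anomalous(history: Sequence[float], value: float, k: float = 3.0) -> bool:
    """True if value is above what is normal for this particular device."""
    return value > dynamic_threshold(history, k)


if __name__ == "__main__":
    quiet_device = [18, 20, 22, 19, 21]  # CPU %, normally idle
    busy_device = [78, 80, 82, 79, 81]   # CPU %, normally heavily loaded
    # A static 75% threshold would miss the first case and fire on the second.
    print(is_anomalous(quiet_device, 60))  # unusual for this device
    print(is_anomalous(busy_device, 82))   # normal for this device
```

The point of the example is the contrast: the same reading can be an incident on one device and business as usual on another.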
Client On-Boarding and Scaling
• How will you onboard new clients?
• Are new clients added to existing polling servers, or will new servers be provisioned?
• How will new devices and services be added to polling servers?
• Will the service be charged back to the client?
• For each client, identify:
  • What applications are key to their LOB?
  • Services and SLA per Application?
  • What type(s) of synthetic transactions will the applications support?
Reduce Friction, Automate Where Possible, Document Everywhere
IDENTIFYING YOUR FULLY LOADED COSTS
Fully Loaded Cost
• Employee Fully Loaded Cost rate is generally 1.49x (SAP) to 2.34x annual salary
  • 1.25x employment taxes and benefits
  • 1.75x office space, equipment
  • 1.25x related management expenses and non-billable work
• Salary of Network Engineer II/III in Charlotte, NC is $89,287
• Annual Fully Loaded Cost = $133,037 - $208,932
Tracking Value to the Business Starts With Understanding Your Costs
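The annual range above is simply the salary multiplied by the low and high fully loaded rates. A minimal sketch of that arithmetic, using only the figures from the slide (the function name is illustrative):

```python
def fully_loaded_cost(annual_salary: float, multiplier: float) -> float:
    """Annual cost of an employee once overheads are included."""
    return annual_salary * multiplier


SALARY = 89_287                  # Network Engineer II/III, Charlotte, NC (from the slide)
LOW_RATE, HIGH_RATE = 1.49, 2.34  # fully loaded rate range (from the slide)

low = fully_loaded_cost(SALARY, LOW_RATE)
high = fully_loaded_cost(SALARY, HIGH_RATE)

if __name__ == "__main__":
    # Matches the slide's range to within a dollar of rounding.
    print(f"${low:,.2f} - ${high:,.2f}")
```

Substituting a local salary and the organization's own overhead rate gives a comparable figure for any other market.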
Staffing a 24/7 NOC
• A single engineer can effectively support 8-12k devices/shift
  • Assumes a properly configured and load-balanced implementation of NMIS, opCharts, opEvents, opConfig, and opTrend, as well as appropriate user/admin training
  • Assumes automation to add/update/retire devices
  • Assumes at least Silver Service level: monitors underlying equipment and services required for application, plus custom automated response to defined faults
  • Assumes normal system operation
• Minimum staffing = 2 engineers/shift = 9 FTE = $1,197,333 - $1,880,388 / year fully loaded
Tracking Value to the Business Starts With Understanding Your Costs
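The 9 FTE figure is consistent with covering two seats around the clock with staff who each work a 40-hour week. The slide does not state how it was derived, so the 40-hour week here is an assumption; the per-FTE costs are the fully loaded range from the previous slide.

```python
import math

HOURS_PER_WEEK = 24 * 7  # 168 hours of coverage needed per seat


def fte_for_coverage(engineers_per_shift: int, hours_per_fte_week: float = 40.0) -> int:
    """Whole FTEs needed to keep engineers_per_shift seats filled 24/7.

    hours_per_fte_week = 40 is an assumption, not a figure from the slides.
    """
    return math.ceil(engineers_per_shift * HOURS_PER_WEEK / hours_per_fte_week)


FTE = fte_for_coverage(2)                       # 2 * 168 / 40 = 8.4, rounded up
COST_LOW, COST_HIGH = 133_037, 208_932          # fully loaded cost per FTE (from the slides)
annual_low, annual_high = FTE * COST_LOW, FTE * COST_HIGH

if __name__ == "__main__":
    print(f"{FTE} FTE = ${annual_low:,} - ${annual_high:,} / year")
```

This simple model ignores leave, training, and turnover, which would push the real headcount higher.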
Return On Investment (ROI)
• An average Enterprise-class business experiences 262.8 hrs of system downtime/year
  • This equates to an operating SLA of 97%, well below most operational expectations
• Opmantek’s solutions have been shown to reduce downtime by an average of 68%
  • This reduces downtime from 262.8 to ~84 hrs, increasing SLA to 99.0%
  • For a $150MM business this creates savings in revenue and productivity of $2MM
• This equates to an ROI of 93%
  • Investment generally pays for itself in < 10 months
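The downtime and SLA figures above follow from straightforward arithmetic, sketched below. It assumes an 8,760-hour year (365 days), since 3% of 8,760 is the slide's 262.8 hours. The dollar savings, ROI, and payback figures depend on inputs not given in the slides and are not reproduced here.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumed, since 3% of it is 262.8


def downtime_after_reduction(baseline_hours: float, reduction: float) -> float:
    """Remaining annual downtime after a fractional reduction (0.68 = 68%)."""
    return baseline_hours * (1.0 - reduction)


def availability_percent(downtime_hours: float) -> float:
    """Availability implied by a given number of downtime hours per year."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    before = 262.8
    after = downtime_after_reduction(before, 0.68)
    print(f"before: {before:.1f} h/year, {availability_percent(before):.1f}% availability")
    print(f"after:  {after:.1f} h/year, {availability_percent(after):.1f}% availability")
```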
CONTACT FOR FOLLOW UP
Commercial enquiries:
Tom Wiri, Account Executive, +1 (512) [email protected]
Technical enquiries:
Mark Henry, Senior Engineer, +1 (207) [email protected]