ITIL Best Practice for Software Companies

Post on 11-Nov-2014

Category:

Software

DESCRIPTION

Detailed outline of the Information Technology Infrastructure Library (ITIL), a set of practices for IT service management (ITSM) that focuses on aligning IT services with business needs, applied to software companies

TRANSCRIPT

Introduction to Service Management

ITIL Services for Software Companies

Daniel Brody © 2014

Service Components

Process Orientated Working

Incident Management

Problem Management

Service Level Management

Change Management

What is ITIL?

IT Infrastructure Library

• A set of books describing a code of best practice for IT service provision

• UK Government

• First edition – Late 80’s

• Revised in 2000

• Non-proprietary

• Platform independent

ITIL

Two Books

Service Delivery

Service Support

ITIL’s Service Management

Service Support

Focuses on the day to day operation and support of IT Services

Service Delivery

Focuses on long term planning and improvement of IT Service provision

ITIL Publications

Planning to Implement Service Management

The Business Perspective

Application Management

ICT Infrastructure Management

Security Management

Service Support

Service Delivery

[Figure: the ITIL publication framework. Service Management (Service Support, Service Delivery) sits between The Business (The Business Perspective) and The Technology (ICT Infrastructure Management), together with Planning to Implement Service Management, Applications Management and Security Management.]

IT Service Management….

Foundation Support

Configuration Management

Mission

To identify, control and audit the information required to manage IT services by defining and maintaining a database of controlled items, their status, lifecycles and relationships and any information needed to manage the quality of IT services cost effectively

Asset Management

vs

Configuration Management

Objectives

• Identify & record management information

• Account for all IT assets & configurations

• Control the information in the database

• Ensure that information reflects reality

• Provide a basis for management

• Provide status of components

Key Terms

Configuration

• Hardware

• Software

• Documentation

• Communication

• Environmental Equipment

• Staff

Anything that needs to be controlled

Configuration Item (CI)

[Figure: a PC broken down into Keyboard, Monitor and System Unit; the lower-level components shown are CD ROM, Stiffy Drive, Memory, Hard Drive and CPU.]

• A component within a configuration

• A configuration in its own right

How low do you go???

• CI Levels

– Lowest level of independent change

– Who are you and what are you doing?

– Information value vs collection effort

Attributes

[Figure: the PC breakdown again (Keyboard, Monitor, System Unit, CD ROM, Stiffy Drive, Memory, Hard Drive, CPU), with Memory singled out as the example CI.]

Key Terms

• Relationship
– Primary
– Secondary

• Baseline
– Snapshot of a CI at a time or stage

• Variant
– A baseline with minor differences

• Model, Version and Copy numbers
– Type
– Unique
– Version / Copy

Life Cycles

• Stages in the life of a CI

• Allow CIs to be moved and tracked

[Figure: lifecycle stages shown: Ordered, Delivered, Set up, Installed, Withdrawn, Maintenance.]
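As an illustration of how lifecycle stages let a CI be moved and tracked, here is a minimal sketch. It is not part of the original slides: the class name `ConfigurationItem`, the `ALLOWED` table and the choice of which moves between stages are permitted are all assumptions made for the example. Only the stage names come from the slide.

```python
from dataclasses import dataclass, field

# Stage names come from the slide; which moves are allowed is an assumption.
ALLOWED = {
    "Ordered": {"Delivered"},
    "Delivered": {"Set up"},
    "Set up": {"Installed"},
    "Installed": {"Maintenance", "Withdrawn"},
    "Maintenance": {"Installed", "Withdrawn"},
    "Withdrawn": set(),
}


@dataclass
class ConfigurationItem:
    """Hypothetical CI record that tracks its lifecycle stage and history."""
    ci_id: str
    status: str = "Ordered"
    history: list = field(default_factory=list)

    def move_to(self, new_status: str) -> None:
        # Refuse any move that the lifecycle table does not define.
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.ci_id}: {self.status} -> {new_status} not allowed")
        self.history.append(self.status)
        self.status = new_status


pc = ConfigurationItem("PC-0001")
for stage in ("Delivered", "Set up", "Installed"):
    pc.move_to(stage)
print(pc.status, pc.history)
```

Keeping the previous stages in `history` is one simple way to support the historical reporting that Status Accounting asks for.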

Key Activities

Stages in Configuration Management

• Identification

• Control

• Status Accounting

• Verification / Audit

Identification

• Logical
– What items need to be recorded?
– What do we need to know about them?

• Physical
– Marking items that are under Configuration Management control

Logical & Physical

Basic Principles

• CIs must be uniquely identified

• Prominent & clearly visible

• Meaningful naming

• Copy numbers must be catered for

• Cater for growth

Control

• Information in the CMDB
– Access
– Changes
– Adding new items

• To achieve control
– Agree and freeze CI specification
– Only allow changes through change management

Status Accounting

• Uses lifecycles and attributes

• Records and reports on
– Current data
– Historical data

Verification & Audit

• Does the CMDB reflect reality?

• Accuracy is improved by
– Active rather than passive CMDB
– Automatic updating
– Integration with other processes
– Automatic checks
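The question "does the CMDB reflect reality?" can be illustrated with a small automatic check. This is only a sketch under assumptions: the function name `audit`, the dictionary layout (CI identifier mapped to attributes) and the sample data are invented for the example, and a real audit would draw on a discovery tool rather than hand-written dictionaries.

```python
def audit(cmdb: dict, discovered: dict) -> dict:
    """Compare CMDB records with what was actually found (CI id -> attributes)."""
    missing = sorted(set(cmdb) - set(discovered))      # recorded but not found
    unrecorded = sorted(set(discovered) - set(cmdb))   # found but not recorded
    mismatched = sorted(
        ci for ci in set(cmdb) & set(discovered) if cmdb[ci] != discovered[ci]
    )
    return {"missing": missing, "unrecorded": unrecorded, "mismatched": mismatched}


# Hypothetical sample data.
cmdb = {"PC-0001": {"memory_mb": 512}, "PC-0002": {"memory_mb": 256}}
found = {"PC-0001": {"memory_mb": 1024}, "PC-0003": {"memory_mb": 256}}
report = audit(cmdb, found)
print(report)
```

Each of the three lists points to a different follow-up: a missing CI, an unauthorised addition, or a change that bypassed Change Management.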

The CMDB

[Figure sequence: separate departmental databases (Accounts, Sales, Marketing, Manufacturing, Inventory, Purchasing, Distribution, HR), each feeding Management Information, are brought together in a Corporate Database. In the same way, the IT processes (Network, SLM, Change, Problem, Capacity, Financial, Service Desk, Performance) feed Management Information through a shared CMDB, which may be a virtual CMDB over underlying databases.]

Benefits, costs & problems

Benefits

• Accurate information & documentation on CIs

• Control of valuable CIs

• Legal Obligations

• Financial & expenditure planning

• Registration of Software Changes

• Contingency planning

• Improving Release Management

• Improved security

• Trending data

Costs

• Staff costs
– Initial audit
– Management

• HW & SW identification & level of control

• Number of users who have access

• Need for tailoring

• Diversity & quality of information

• Level of integration

Possible Problems

• Incorrect CI level

• Emergency changes

• Over-ambitious schedules

• Circumvention of procedures

• Manual systems

• Over expectation

• Isolated implementation

• Difficult without Change Management

• Difficult to cost justify

• No operational use of the system

Change Management

Mission

To manage all changes that could impact on IT’s ability to deliver services through a formal, centralised process of approval, scheduling and control to ensure that the IT Infrastructure stays aligned to business requirements with minimum risk

Objectives

• Manage the process of:
– Requesting changes
– Assessing changes
– Authorising changes
– Implementing changes

• Prevent unauthorised changes

• Minimise disruption

• Ensure proper research and relevant input

• Coordinate build, test and implementation

Scope

· Hardware

· System Software

· Communications Equipment and Software

· ‘Live’ Application Software

· All documentation, plans and procedures relevant to the running, support and maintenance of live systems

· Environmental Equipment

Key Terms

Key Terms and Roles

• Request for Change (RFC)
– Contains all necessary information to make the change

• Change Advisory Board (CAB)
– Assesses resource requirements and impact
– Advises the Change Manager

• CAB Emergency Committee (CAB/EC)
– Urgent changes
– 1-3 senior staff

• Forward Schedule of Change (FSC)
– Details of approved changes & dates

• Projected Service Availability (PSA)
– Best time for change to be implemented

• Change Model
– Pre-defined path

• Standard Change
– Pre-authorised change

Change Management Procedure

[Flowchart: Normal Change Procedure]

• Initiate change
• Filter requests; reject where appropriate
• Allocate initial priority; if urgent, go to the urgent procedure
• Decide category; if a Change Model applies, go to the Change Model procedure
• Categories: Minor, Significant, Major
– Authorises and schedules change; reports action to CAB
– Circulates RFCs to CAB members
– Refers RFC upwards; IT Director decides, then passes to CAB for actioning
• Assess impact and resources; confirm priority and schedule
• Authorised? (Yes / No)

[Flowchart: Normal Change Implementation Procedure]

• From the normal change procedure
• Build change; prepare testing & back-out plans
• Independent testing (on failure, return to build)
• Co-ordinate implementation
• Working? If no, back out / refer back to CAB
• Update documentation
• Monitor / review change
• Successful? If yes, close; if no, back to start

[Flowchart: Urgent Change Procedure]

• Urgent CAB or CAB/EC meeting
• Assess impact, resource requirements and urgency
• Urgent? If no, go to the normal procedure
• Urgently prepare the change
• Time for test? If yes, urgent testing
• Co-ordinate implementation
• Satisfactory? If no, implement back-out plans; the change is referred back to CAB/EC
• Ensure records are brought up to date
• Review change
• Satisfactory? If yes, close; if no, back to start
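The routing decisions at the start of the change procedure (filter, urgent, Change Model, category) can be sketched as a small function. This is a hedged illustration only: the `RFC` class, its field names and the returned strings are hypothetical, and pairing the three categories with the three actions in the flowchart follows the usual reading of the diagram rather than anything the transcript states explicitly.

```python
from dataclasses import dataclass


@dataclass
class RFC:
    """Hypothetical Request for Change; field names are illustrative."""
    summary: str
    valid: bool = True              # passes the initial filter
    urgent: bool = False
    has_change_model: bool = False  # a pre-defined path exists for this change
    category: str = "minor"         # "minor", "significant" or "major"


def route(rfc: RFC) -> str:
    """Return the next step for an RFC, checking in flowchart order."""
    if not rfc.valid:
        return "reject"
    if rfc.urgent:
        return "urgent procedure"
    if rfc.has_change_model:
        return "change model procedure"
    # Assumed mapping of category to action.
    routes = {
        "minor": "authorise and schedule, report to CAB",
        "significant": "circulate RFC to CAB members",
        "major": "refer RFC upwards",
    }
    return routes[rfc.category]


print(route(RFC("Replace failed disk", urgent=True)))
print(route(RFC("New payroll release", category="major")))
```

The order of the checks matters: an urgent change is diverted before categorisation, which mirrors the flowchart.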

Benefits, Costs & Problems

Benefits to Business

• Greater IT & business alignment

• Higher availability

• Increased productivity

• More communication – greater trust

• IT can handle more changes

• Balance between need for change & potential impact

Financial Benefits

• More accurate forecasting

• Better quality decisions

• Reduction in amount of rework

IT Benefits

• Easier to meet SLAs

• Fewer change failures

• Back out plans – easier restore

• Valuable input for problem & availability

• Increased productivity of IT Staff

Costs

• Software costs

• Integration & modification

• Staff

• Accommodation

Possible Problems

• Bureaucratic procedures

• Resistance to “control culture”

• Bypassing of procedures

• Integration to Configuration Management

• Inaccurate information

• Handling urgent changes

• Detecting unauthorised changes

• Too broad scope for a change

• Unclear ownership

Release Management

Mission

To take a holistic view of a change to an IT Service and ensure that all aspects of a release, both technical and non-technical, are considered together

Why Release Management?

• Large or critical hardware roll-outs

• Major software roll-outs

• Bundling or batching related sets of changes

In-house applications

“Other” software

Utility Software

System Software

Hardware Specifications

Assembly Instructions

User Manuals

Key Concepts

Key Concepts

• Release
– Collection of authorised changes
– Major / minor / emergency

• Definitive Hardware Store (DHS)
– Storage of hardware spares

• CMDB
– Definitions of planned releases
– Records of CIs impacted by release
– Information about the target environment

Key Concepts

• Definitive Software Library (DSL)
– Physical secure storage
– Source code & original media

• Build Management
– Controlled environment
– Compiled on dedicated “build hardware”

• Release Policy
– Roles, responsibility & content
– Form part of initial planning

• Release Unit
– Components released together

Release Units

• Systems, suites, programs and modules

• Factors affecting the level of release

– Number and extent of changes

– Number of changes that can be managed

– Available resources and time

– Ease of implementation

– Complexity of the release

Release Units

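[Figure: release unit hierarchy. The IT Infrastructure contains System 1, System 2 and System 3; System 2 contains Suite 2.1, Suite 2.2 and Suite 2.3; Suite 2.2 contains Program 2.2.1, Program 2.2.2 and Program 2.2.3; Program 2.2.2 contains Module 2.2.2.1, Module 2.2.2.2 and Module 2.2.2.3.]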

Development Releases - and

• Managed by development

• Must not affect live services

• Should not require production resources

• Customer agreement obtained

• Usage covered in SLAs

• Must not replace live systems

• Must be licensed

Normal Release - Full

• All components built, tested, distributed & implemented together

• Better integrated testing

• Easier to detect & rectify problems

• Complex & will require more resources

Normal Release – Delta

• Partial release

• Contains only new or changed items

• Not as stable as full releases

• Authorisation of a delta release depends on:
– Size of a full release compared to the delta
– Urgency of required facilities
– Number of changes already made
– Potential business impact
– Available resources

Normal Release – Package

• Combination of release units

• Reduces number and frequency of releases

• Better integration and testing

• Less old or incompatible software

• Could result in delays to fixes or enhancements

• Greater potential for disruption

[Figure: Full Release, Delta Release and Package Release compared, illustrated with components C1 to C4 and modules M1 to M4.]
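The difference between a full and a delta release can be shown with a short sketch. The function name `delta_release`, the module names and the version strings are assumptions for illustration; the slides define a delta release only as containing new or changed items.

```python
def delta_release(live: dict, candidate: dict) -> dict:
    """Return only the new or changed items (component name -> version)."""
    return {name: ver for name, ver in candidate.items() if live.get(name) != ver}


# Hypothetical versions currently live and in the candidate build.
live = {"M1": "1.0", "M2": "1.0", "M3": "1.0"}
candidate = {"M1": "1.0", "M2": "1.1", "M3": "1.0", "M4": "1.0"}

full = dict(candidate)                  # full release: every component together
delta = delta_release(live, candidate)  # delta release: new or changed only
print(delta)
```

Here the delta contains two of four modules. That smaller footprint is what the authorisation criteria above weigh against the better integration testing of a full release.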

Urgent Releases

• Disruptive and error prone

• Often used to bypass Change Management

• Controls are essential
– Use software from the DSL
– Software must be replaced through the DSL
– Must follow Change Management
– CMDB must be updated
– Version control
– Testing and documentation
– Give notice

Back-Out Plan

• Documents actions that will restore service

• Still part of change

• Two approaches
– A full reversal of release
– Contingency plans to restore as much as possible

• Should be verified and tested

Key Activities

[Figure: Release Management key activities, underpinned by the Configuration Management Database (CMDB) & Software Library: Release policy; Release planning; Design & develop, or order & purchase software; Build & configure the release; Fit-for-purpose testing; Release acceptance; Roll-out planning; Communication, preparation & training; Distribution & installation. The activities span the Development Environment, the Controlled Test Environment and the Live Environment.]

Release Policy

• Basis of subsequent activities

• Management roles & responsibilities

Release Planning

• Agreeing release content

• Planning phases of releases

• Produce schedule

• Assess hardware at target site

• Plan resource requirements

• Obtain quotes if upgrades are required

• Produce back out plans

• Develop quality plan

• Plan acceptance of support groups

Designing, Building & Configuring

• Components assembled in controlled process

• All components of release should be under Configuration control

Testing & Release Acceptance

• Before going to live

• Types of testing
– Functional testing
– Operational testing
– Performance testing
– Integration testing
– Testing & back out plans

• Final acceptance & sign off – part of Change

• Rejection treated as failed change

Rollout Planning

• Wholesale / “big bang”

• Phased roll outs
– Geographical
– Functional
– Technological
– Combination

Communication, Preparation & Training

• Support staff & customers

• Training

• Parallel working

• Involvement in acceptance process

• Rollout planning meetings

Distribution & Installation

• Distribution
– Equipment reaches destination in time & intact
– Secure storage areas
– Checked against relevant documentation
– Final check before implementation

• Installation
– Functional checks of equipment
– Automate deployment
– Installation routines
– Include check of target
– User checklists?

[Flowchart: Normal flow of software. Software ordered, or software developed and supplied; acceptance checks (if not OK, rectification action); software placed in DSL; final approval; package built in test environment; operational acceptance testing (with an OK? check); build in live environment; distribute to live environment; implemented on live environment. The CMDB is shown alongside the flow.]

Benefits, Costs & Problems

Business Benefits

• Minimum disruption

• Better quality of service

• Fewer & less frequent releases

• Effective scheduling of users for testing

• Overall reduction in business risk

• Business knows what to expect & can plan

Financial Benefits

• Assets more controlled

• Less time & resources spent on rework

• More responsive to revenue producing opportunities

• Prevention of duplication of activities

IT Benefits

• Consistent quality of releases

• Centralised control

• Improved quality and control of changes

• Effective planning of staff activities

• Number of regressions is reduced

• Easier detection of unauthorised and incorrect versions

• Less blame shifting

Costs

• Storage costs

• Build, test and archive environments

• Secure equipment stores

• Software distribution tools

• Network bandwidth

• Telecommunications

• Staff and training

Problems

• Circumvention of procedures

• Emergency fixes

• Distribution of builds directly from development

• Uncoordinated implementation of Software and Hardware

• Resources not available for testing

• Test results are invalid

• Process is seen to be unclear or bureaucratic

Relationships

• Configuration Management

• Change Management

• Problem Management

• Service Desk

• Project Management

• Developers and suppliers

Service Desk

The Service Desk

Structure not a process

• Drive & improve service to the business

• Single point of contact
– Advice
– Guidance
– Rapid restoration to service

Role of the Service Desk

• Support the incident & problem management functions

• Provide a central point of contact
– Preventing the same incident being reported to different people over & over
– Preventing the loss of incidents
– Preventing technical people being disrupted
– Preventing unnecessary work if the error is already known

Objectives

• Single point of contact for reporting of incidents

• Accurately record information about incidents

• Co-ordinate activities to restore service to normal

• Support the incident & problem management functions

• Provide management information

• Provide support & advice to business

Key Elements & Processes

Service Desk Functions

• Log Incident

• Pre-scan phase
– Not a Known Error
– Proper procedures have been followed
– Required supporting evidence is complete & present

• Incident Management

• Service Desk remains responsible

• Responsible for escalation

• Regularly feeds back to user

Service Desk & Change Management

• Log Changes & cross reference to problems

• Issue change schedules

• Monitor & track changes & assist with escalation

• Inform users of change once complete & update change schedules

Common Features of Service Desks

• A single point of contact for all users

• A central log of all incidents

• Each incident uniquely numbered and date/time stamped

• Diagnostic scripts and other aids

• Configuration Management Support Tools

• Known Error Lists

• An impact coding system

Common Features of Service Desks

• Automatic escalation procedures based on impact, priority and elapsed time

• Telephone and electronic mail communication with all support staff

• Interface to Service Level Agreements

• Regular progress reporting

• Classification of incidents at call closure

• Regular management summaries of calls received and resolved

Service Desk Structures

Local Service Desk

• Local desk meeting local needs

• Support staff also local

• Becomes impractical with multiple locations

• Several local desks – operational standards

• Common processes across all locations

[Figure: Local Service Desk. Local users contact the Service Desk (first line support), which is backed by Third Party Support, Network & Operations Support, Application Support and Desktop Support.]

Centralised Service Desk

[Figure: Centralised Service Desk. Customer Sites 1, 2 and 3 contact a single Service Desk, which is backed by second line support: Third Party Support, Network & Operations Support, Application Support and Desktop Support.]

[Figure: Virtual Service Desk. Service desks in Paris, Sydney, Cape Town, London, Toronto and Durban, plus a third party supplier service desk, share the Service Management Database(s). Local and remote users at the user sites reach the virtual desk by telephone, fax, modem, LAN, WAN and the Internet.]

Virtual Desks

• Physical location immaterial

• Used for global organisation

• Benefits include
– Reduced operational costs
– Consolidated management overview
– Improved usage of available skills
– Knowledge sharing

• Onsite assistance still required

Outsourcing

• Have outsourcers use your Service Desk tool

• Keep ownership of management information

• Ensure suitably skilled staff

• Request details of staff

• Monitor value for money

• Check supplier dependencies

• Ensure deliverables are clearly understood

Service Desk

Skill Sets

Staff Profiles

• Understanding of business

• Understanding of IT Infrastructure

• Exceptional interpersonal skills

Technically Unskilled Staff

• Centralised Service Desks

• Emphasis on interpersonal skills

• Large call volumes, little support

• Administrates and coordinates calls

• Relies on diagnostic scripts and other tools

• Technical staff are not distracted or demotivated

• No in-depth support

• Potential job satisfaction is high

Technically Skilled Staff

• Lower call volumes, greater support

• Longer call times

• May become too involved in technical aspects

• Job satisfaction issues

• Customer satisfaction issues

• Peak time staffing issues

• Familiarity breeds contempt

Expert Staff

• Resolve all calls

• Staff are more important than procedures

• Will play the role of technical departments

Incident Management

Definition of an Incident

Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or reduction in, the quality of that service

Includes

• New services

• Automatically registered events

Mission

To minimise the impact of service disruptions to the business by restoring that service through effective management of incidents

Scope

• Inputs
– Incident details from service desk
– Configuration details
– Matched incidents, problems & known errors
– Resolution details
– RFC

• Outputs
– RFC for resolution
– Resolved & closed incidents
– Communication to Customers
– Management information

Objectives

• Restoration of service as quickly as possible

• Ensure timely resolution of all incidents

• Identify trends that may assist in incident resolution

• Assist problem management in identifying trends

Key Concepts

Incident Handling

• Service Desk owns Incidents

• Progress reporting

• Incident Lifecycles
– New
– Accepted / Assigned
– Scheduled
– WIP
– On Hold / Waiting
– Resolved
– Closed

Levels of Support

• 1st Line Support
– Service Desk

• 2nd Line Support
– Incident Management

• 3rd Line Support
– Specialist Group

Key Concepts (cont.)

• Ownership & Communication
– Monitor status against open Incidents
– Incidents passed between support groups
– Affected users are kept informed
– Check for similar Incidents
– Incidents that are likely to exceed SLA times

• Escalation
– Functional Escalation
– Hierarchical Escalation

Classification

• Category

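[Figure: example category tree with the top-level categories Hardware, Software and Network. Entries shown beneath them include Operating System, Application, Financials, Line, Network Connection, Modem, Monitor, Mouse, Printer, Terminal and Connector.]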

Impact Code

IMPACT CODE   DESCRIPTION                                                        TARGET RESOLUTION TIME

High          Major service unavailable; many users affected                     1 Hour

Medium        Customer terminal or printer down; cannot function                 4 Hours

Low           Customer terminal or printer experiencing intermittent failure     8 Hours
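The impact codes and target resolution times can drive a simple automatic escalation check. The sketch below is illustrative only: the function names `due_by` and `needs_escalation`, and the `warn_fraction` threshold of 75% of the target time, are assumptions rather than anything the slides prescribe. Only the three target times come from the table.

```python
from datetime import datetime, timedelta

# Target resolution times taken from the impact code table.
TARGET = {
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
    "Low": timedelta(hours=8),
}


def due_by(impact: str, logged_at: datetime) -> datetime:
    """Latest time by which the incident should be resolved."""
    return logged_at + TARGET[impact]


def needs_escalation(impact: str, logged_at: datetime, now: datetime,
                     warn_fraction: float = 0.75) -> bool:
    """True once the given fraction of the target time has elapsed.

    The 0.75 default is an assumed threshold, not an ITIL value.
    """
    return (now - logged_at) >= TARGET[impact] * warn_fraction


logged = datetime(2014, 11, 11, 9, 0)
print(due_by("Medium", logged))
print(needs_escalation("Medium", logged, datetime(2014, 11, 11, 12, 0)))
```

Escalating before the target is breached, rather than after, gives the Service Desk time to act on incidents that are likely to exceed SLA times.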

Key Concepts (cont.)

• Incidents, Problems and Known Errors
– Incidents are events or occurrences that degrade or disrupt a Service
– A Problem is the underlying cause, not yet diagnosed, of one or more Incidents
– Known Errors are
  • Problems that have been diagnosed and have not yet been rectified
  • Problems that have been diagnosed and for which a resolution or circumvention exists
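Matching a new Incident against Known Errors can be sketched as a lookup. Everything named here is hypothetical: the `KnownError` fields, the substring matching rule and the sample record are assumptions for the example, and a real Known Error database would use richer classification data than a text search.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownError:
    """Hypothetical Known Error record."""
    error_id: str
    ci_type: str
    symptom: str
    workaround: str


def match_known_error(ci_type: str, description: str,
                      known_errors: list) -> Optional[KnownError]:
    """Return the first Known Error for the same CI type whose symptom
    text appears in the incident description, or None if nothing matches."""
    text = description.lower()
    for ke in known_errors:
        if ke.ci_type == ci_type and ke.symptom.lower() in text:
            return ke
    return None


kedb = [KnownError("KE-7", "printer", "paper jam", "Use tray 2 until the roller is replaced")]
hit = match_known_error("printer", "User reports a paper jam on floor 3", kedb)
print(hit.workaround if hit else "No match: pass to Problem Management")
```

A match lets first-line support offer the circumvention immediately; no match means the incident may point to a new Problem.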

Incident Management Activities

Reasons for classification

• Identifying the service the Incident is related to

• Associate the Incident with the SLA

• Selecting the most suitable support team

• Indication of the impact and/or severity

• Match Incidents to Known Errors

• Determine a reporting structure

Incident detecting & recording

• Service Desk

• System Monitoring tools

Initial classification & support (a Service Request goes to the Service Request Procedure)

• Capturing base & initial data

• Diagnostic Scripts

• Known Error Database

• Skill Levels

• Knowledge base and/or expert software

[Figure: Incident Management activities in sequence: Incident detecting & recording; Initial classification & support (a Service Request goes to the Service Request Procedure); Investigation & diagnosis; Resolution & recovery; Incident Closure. Ownership, monitoring, tracking and communication run alongside all of them.]

• Support group accepts assignment

• Advise if work around can be provided

• Attempt resolution

• Record all details

• Monitor status against open Incidents

• Incidents passed between support groups

• Affected users are kept informed

• Check for similar Incidents

• Incidents likely to exceed SLA times

• Escalation


[Flowchart: investigation and diagnosis of an Incident, drawing on the CMDB, the Incident, Problem & Known Error databases, and diagnostic data (system dumps and journals). Steps shown: support staff allocation; basic fact gathering; enquiries on historical data; diagnostic data search; diagnosis and resolution / circumvention action; allocate further support where no diagnosis or circumvention is found or the escalation threshold is exceeded; liaise with Problem Management to create a Problem or Known Error record where appropriate; Incident closure. The record carries an Incident progress summary and a free-format text record (who, when, what, why, results, correlations, dumps, IDs etc).]

Problem Management

Mission

To minimise the disruption of IT services by organising IT resources to resolve problems, preventing them from recurring and recording information that will improve the way in which IT deals with problems, resulting in higher levels of availability and productivity.

Scope

• Reactive
– Solving problems in response to incidents

• Proactive
– Solving problems before incidents occur

The main goal of Problem Management is the detection of the underlying causes of an incident and their subsequent resolution and prevention

Objectives

• Identify, manage and resolve problems

• Prevent recurrence of problems

• Reduce the number and severity of problems

• Minimise impact to business

• Ensure right level of staff

• Record & manage information

• Ensure vendor compliance when resolving problems

Key Concepts

[Figure: an Incident leads to a Problem, then a Known Error, then a Change. The matching activities are Incident Control (Service Desk / Incident Management), Problem Control and Error Control (Problem Management), and Change Control (Change Management).]

Problem vs Error Control

Problem Control

Transforms Problems into Known Errors

Error control

Resolving Known Errors via the Change Management process

Problem Control

Problem Control

Problem Identification

Problem Classification

Problem Investigation and Diagnosis

Problem Identification

• Initial support could not match the Incident to a known problem

• Analysis of Incidents

• Analysis of IT infrastructure

• Significant or Major Incidents

Problem Classification

• Impact
– Direct effect on the business

• Urgency
– The measure of business criticality based on impact and business need

• Priority
– The order in which a series of items should be addressed
– P = I x S x U

Defining Priority

Priority – sequence in which an Incident or Problem needs to be resolved

Impact – measure of the business criticality of an incident

Severity – what is the effect on the infrastructure / resources?

Urgency – extent to which the resolution of a Problem or error can bear delay

Priority = Impact x Severity x Urgency (P = I x S x U)
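A worked example may make the P = I x S x U formula concrete. The slides do not give a rating scale, so the sketch below assumes each factor is rated 1 (low) to 3 (high) and that a larger product means the item is handled sooner. The function name `priority_score` and the problem identifiers are invented for the illustration.

```python
def priority_score(impact: int, severity: int, urgency: int) -> int:
    """P = I x S x U, with each factor on an assumed 1 (low) to 3 (high) scale."""
    for value in (impact, severity, urgency):
        if value not in (1, 2, 3):
            raise ValueError("ratings must be 1, 2 or 3")
    return impact * severity * urgency


# Hypothetical open problems: (impact, severity, urgency).
problems = {
    "PRB-1": (3, 2, 3),   # 3 x 2 x 3 = 18
    "PRB-2": (1, 1, 2),   # 1 x 1 x 2 = 2
    "PRB-3": (2, 2, 2),   # 2 x 2 x 2 = 8
}

# Highest score first: the order in which the problems would be addressed.
ranked = sorted(problems, key=lambda p: priority_score(*problems[p]), reverse=True)
print(ranked)
```

Because the factors are multiplied, a low rating on any one of them pulls the overall priority down sharply, which an additive score would not do.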

Investigation & Diagnosis

• Diagnosis of root cause

• Update of problem record

• May reclassify at closure

• Methods of problem Analysis

Error Control

Error Control Activities

Error identification & recording

Error assessment

Error resolution recording

Error closure

Error resolution monitoring

Error Identification

• Identification of root cause

• Identification of work around

• Two sources of known errors
– Problem control
– Development

Error Assessment

• Assessment of resolution
– Priority, impact & urgency

• Logging of change

Error Resolution Recording

• Resolution recording
– Known error database

• Data on all CIs are available for incident matching

Error Closure

• Closed together with any Incidents

• Resolved or Closed pending review

Error Resolution monitoring

– Not responsible for RFCs
– Monitor progress
– Escalate through CAB if necessary

Proactive Problem Management

Proactive Problem Management

• Identifying and resolving problems before incidents occur

• Activities include:
– Trend Analysis
– Targeting support action
– Providing information to business
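Trend analysis can start as simply as counting closed incidents per CI and category to spot recurring failures. This is a minimal sketch under assumptions: the record layout, the function name `recurring` and the threshold of three occurrences are all chosen for the example.

```python
from collections import Counter


def recurring(incidents: list, threshold: int = 3) -> list:
    """Return ((ci, category), count) pairs seen at least `threshold` times,
    most frequent first. The threshold is an assumed cut-off."""
    counts = Counter((i["ci"], i["category"]) for i in incidents)
    return [(key, n) for key, n in counts.most_common() if n >= threshold]


# Hypothetical closed incidents.
closed = [
    {"ci": "PRN-12", "category": "Hardware"},
    {"ci": "PRN-12", "category": "Hardware"},
    {"ci": "PC-0001", "category": "Software"},
    {"ci": "PRN-12", "category": "Hardware"},
]
candidates = recurring(closed)
print(candidates)
```

Each result is a candidate for a proactive Problem record, so that the underlying cause is addressed before further incidents occur.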
