ITIL® Foundation Certification Notes: Service Operation: Processes
6.5 Processes
6.5.1 Incident Management Process
An incident is defined as an unplanned interruption to
an IT service, a reduction in the quality of an IT service,
or a failure of a CI (configuration item) that has not yet
impacted an IT service.
Incident Management is responsible for progressing all
incidents from reporting through to closure. Ownership of
incidents usually rests with the service desk.
A) Purpose of Incident Management Process
The purpose of Incident Management is to restore
normal (i.e. agreed) service operation as quickly as
possible after an incident has been detected or recorded.
B) Objectives of Incident Management Process
1) Make sure that standardized procedures and methods
are used for prompt and efficient response, analysis,
documentation, ongoing management and reporting of incidents.
2) Improve communication and visibility of incidents.
3) Improve business perception of IT through a
professional approach to resolving and reporting
incidents quickly.
4) Align incident management activities and priorities
with those of the business.
5) Maintain user satisfaction with the quality of
IT services.
C) Scope of Incident Management Process
Handle all incidents (any event which disrupts, or which could
disrupt, a service), whether reported through the service desk
or raised by event management tool alerts. Incidents can also be
reported and/or logged by technical staff.
D) Basic Concepts
1) Incident: An incident is defined as an unplanned interruption
to an IT service, a reduction in the quality of an IT service,
or a failure of a CI (configuration item) that has not yet
impacted an IT service.
2) Timescales: Timescales must be agreed for all incident
handling according to their priority; this includes response
and resolution targets. All support groups should be made
fully aware of these timescales. The service management tool
should be used to automate timescales and to escalate the
incident as required, based on predefined rules.
3) Incident Model: An incident model is a template that
can be reused for recurrent incidents. It can be practical
to predefine 'standard' incident models and apply them when
incidents occur. Models speed up logging and make
handling more efficient.
4) Major Incident: Define what constitutes a major incident
and follow predefined procedures for handling it. Users need
to be kept informed of progress.
5) Incident Status Tracking: Example values for the
incident status field:
1) New - an incident is submitted but is not assigned
to a group or resource for resolution.
2) Assigned - an incident is assigned to a group
or resource for resolution.
3) In process - the incident is in the process of
being investigated for resolution.
4) Resolved - a resolution has been put in place.
5) Closed - the user has agreed that the incident
has been resolved and that normal state operations have
been restored.
6) Expanded Incident Lifecycle: The detailed stages
in the lifecycle of an incident, namely detection,
diagnosis, repair, recovery and restoration. The expanded
incident lifecycle is used to help understand all contributions
to the impact of incidents and to plan for how these
could be controlled or reduced.
7) Impact: A measure of the effect of an incident,
problem, or change on business processes. Impact is
often based on how service levels will be affected.
Impact and urgency are used to assign priority.
8) Urgency: A measure of how long it will be until
an incident, problem or change has a significant impact
on the business.
9) Priority: A category used to identify the relative
importance of an incident, problem or change, based
on impact and urgency. High priority (Priority 1) is
given for an incident with high impact and high urgency.
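The relationship between impact, urgency, priority and timescales can be sketched in a few lines of Python. This is only an illustration: the three-level scale, the priority codes 1 to 5, the resolution targets and the names `Level`, `PRIORITY_MATRIX`, `RESOLUTION_TARGET`, `priority_for` and `needs_escalation` are all assumptions, not values from ITIL. Real values are agreed with the business and recorded in the SLA.

```python
from datetime import datetime, timedelta
from enum import IntEnum


class Level(IntEnum):
    """Assumed three-level scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical matrix keyed by (impact, urgency); priority 1 is the highest.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): 1,
    (Level.HIGH, Level.MEDIUM): 2,
    (Level.MEDIUM, Level.HIGH): 2,
    (Level.HIGH, Level.LOW): 3,
    (Level.MEDIUM, Level.MEDIUM): 3,
    (Level.LOW, Level.HIGH): 3,
    (Level.MEDIUM, Level.LOW): 4,
    (Level.LOW, Level.MEDIUM): 4,
    (Level.LOW, Level.LOW): 5,
}

# Hypothetical resolution targets per priority (would come from the SLA).
RESOLUTION_TARGET = {
    1: timedelta(hours=1),
    2: timedelta(hours=8),
    3: timedelta(hours=24),
    4: timedelta(hours=48),
    5: timedelta(days=5),
}


def priority_for(impact: Level, urgency: Level) -> int:
    """Look up the priority code for a given impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]


def needs_escalation(priority: int, logged_at: datetime, now: datetime) -> bool:
    """True when the incident has been open longer than its resolution target."""
    return now - logged_at > RESOLUTION_TARGET[priority]
```

For example, `priority_for(Level.HIGH, Level.HIGH)` gives priority 1 under this assumed matrix, and a tool could call `needs_escalation` periodically to apply its predefined escalation rules.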
E) Incident Management Process Activities
The lifecycle of an incident is as follows:
1) Incident Identification - ideally, detect the incident through
event management before the user notices or reports it (incident
management as a whole remains a reactive process)
2) Incident Logging - log ALL incidents for service-level
management reporting and problem management. Information may
include:
a) unique reference number
b) incident category
c) impact, urgency and priority
d) steps to resolution and known errors
e) time from logging to closure
f) activities undertaken to resolve the incident and
when these took place
g) resolution date and time
h) closure category, date and time
3) Incident Categorization - use a simple categorization
for effective implementation
4) Incident Prioritization - consider business impact and
urgency; the incident is to be resolved within a pre-agreed time
that depends on its priority, and the priority may change during
the lifecycle
5) Initial Diagnosis - the service desk diagnoses the fault
and tries to resolve it using the known error database (maintained
by problem management), incident models or other tools (incident
matching)
6) Incident Escalation - incidents remain owned by the service
desk, which needs to track them until closure.
a) Functional escalation - the service desk is unable to resolve
the incident within a given time and passes it to another
support group.
b) Hierarchic escalation - inform management of major
incidents and of incidents that are not progressing against
their SLA target time.
7) Investigation and Diagnosis - try to find out what has
happened and how to resolve it
8) Resolution and Recovery - test potential resolutions to
ensure the incident has been solved without causing adverse
consequences
9) Incident Closure - contact the user to verify the resolution,
review the categorization and finish the documentation. Closed
incidents may be re-opened if the incident recurs. Any appropriate
function can close the incident.
Figure: Lifecycle of incidents
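The status values and lifecycle steps described above can be modelled as a small state machine. The sketch below is illustrative only: the `Incident` class, its fields, and the `ALLOWED` transition table are assumptions made for this example, and a real service management tool would normally make such rules configurable.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    IN_PROCESS = "In process"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transition rules, not prescribed by ITIL.
ALLOWED = {
    Status.NEW: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.IN_PROCESS},
    # Re-assignment models functional escalation to another group.
    Status.IN_PROCESS: {Status.ASSIGNED, Status.RESOLVED},
    # The user either confirms the fix or reports that it did not work.
    Status.RESOLVED: {Status.IN_PROCESS, Status.CLOSED},
    # A closed incident may be re-opened if it recurs.
    Status.CLOSED: {Status.IN_PROCESS},
}


@dataclass
class Incident:
    reference: str              # unique reference number
    category: str               # incident category
    status: Status = Status.NEW
    history: list = field(default_factory=list)

    def move_to(self, new_status: Status, when: datetime, note: str = "") -> None:
        """Change status if the transition is allowed and record it in the history."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"{self.status.value} -> {new_status.value} is not allowed"
            )
        self.history.append((when, self.status, new_status, note))
        self.status = new_status
```

Keeping every transition in `history` gives the record of activities undertaken and when they took place that incident logging asks for.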
F) Incident Management Process - Interfaces with other stages
of the ITIL Service Lifecycle
1) Interfaces with Service Design:
a) Service Level Management: The ability to resolve incidents
in a specified time is a key part of delivering an agreed
level of service.
b) Information Security Management: Incident management
provides security-related incident information as needed to
support service design activities and to give a full picture
of the effectiveness of the security measures as a whole,
based on insight into all security incidents.
c) Capacity Management: Incident management provides
a trigger for performance monitoring where there appears
to be a performance problem.
d) Availability Management: Availability management will
use incident management data to determine the availability
of IT services and look at where the incident lifecycle
can be improved.
2) Interfaces with Service Transition:
a) Service asset and configuration management:
Provides data used to identify and progress incidents
and to assess the impact of an incident; also contains
information on which categories of incident to assign
to which support group.
In turn, incident management can maintain the status
of faulty CIs. It can also assist service asset and
configuration management to audit the infrastructure
when working to resolve an incident.
b) Change Management:
Where a change is needed to implement a workaround
or resolution, it will be logged as an RFC and progressed
through change management.
In turn, incident management is able to detect and
resolve incidents that arise from failed changes.
3) Interfaces with Service Operation:
a) Problem management:
For some incidents, it will be appropriate for problem
management to investigate and resolve the underlying
cause to prevent or reduce the impact of recurrence.
Incident management provides a reporting point for
these.
Problem management can provide known errors for faster
incident resolution through workarounds to restore service.
b) Access management:
Incidents should be raised when unauthorized access
attempts and security breaches have been detected.
A history of incidents should also be maintained
to support forensic investigation activities and the
resolution of access breaches.