Could Service Level Management Help Avoid Data Service Provider Outages?

Published: August 24 2009, 03:41 PM
by Michael King

I recently ran across the CIO article "With Recent Outages, Big Data Service Providers Take Hits," and it got me thinking. Over the past 12 months, I have read a number of articles reporting service outages at many Service Providers. According to these articles, the causes range from human error to facility issues (e.g., power outages) to spikes in capacity demand. The consequences of some of these outages ranged from an apology to customers to millions of dollars in financial penalties paid by Service Providers to their customers.

As I read through many of these articles, a question occurred to me: could Service Level Management (SLM) have helped to avoid these service outages? My answer is: quite possibly. A good SLM process depends on, and is strengthened by, its relationships with other established ITIL management processes. These processes include:

  • Availability Management - This process monitors and measures the availability of the systems and components that make up a service offering. This information is an input to the SLM process, where it is used to analyze, process and report SLA results.
  • Capacity Management - Similar to the Availability process, Capacity Management will provide input into the SLM process on service component capacity. This data will be used to analyze, process and report SLA results.
  • Incident and Problem Management - The SLM process provides input data to the Incident Management process: when an SLA threshold warning or breach occurs, an incident ticket is created. If multiple incident tickets have been created, they can be tracked to determine whether a problem needs to be opened in the Problem Management process.
  • Change Management - If the SLM process has raised an issue and a problem has ultimately been identified, an RFC (Request for Change) can be created to modify the troubled service and correct the problem. This RFC will also trigger activity in the Configuration Management process.
  • Services Continuity Management - The SLM process also provides input data to the Services Continuity process. During regular service reviews, Service Providers can look at SLA statuses and trends to determine whether action is needed to prevent breaches before they occur, or whether the SLA needs to be renegotiated due to changes in usage patterns.
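To make the hand-off between these processes concrete, here is a minimal Python sketch of how an availability measurement might be checked against SLA thresholds, with each warning or breach opening an incident and repeated incidents suggesting a problem record. Every name in it (SlaTarget, IncidentLog, the 99.9/99.95 thresholds, the three-incident rule) is a hypothetical chosen for illustration, not something taken from ITIL or from any particular SLM product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SlaTarget:
    """Hypothetical SLA definition for one service."""
    service: str
    target_pct: float   # measured availability below this is a breach
    warning_pct: float  # measured availability below this is a warning


@dataclass
class IncidentLog:
    """Hypothetical stand-in for an Incident Management system."""
    problem_after: int = 3  # assumed rule: N incidents suggest a problem record
    incidents: List[str] = field(default_factory=list)

    def record(self, service: str, status: str) -> bool:
        """Open an incident; return True when a problem record looks warranted."""
        self.incidents.append(f"{service}:{status}")
        count = sum(1 for i in self.incidents if i.startswith(service + ":"))
        return count >= self.problem_after


def availability_pct(uptime_minutes: float, total_minutes: float) -> float:
    """Availability as a percentage of the measurement period."""
    return 100.0 * uptime_minutes / total_minutes


def evaluate(target: SlaTarget, measured_pct: float) -> str:
    """Classify a measurement as 'ok', 'warning', or 'breach'."""
    if measured_pct < target.target_pct:
        return "breach"
    if measured_pct < target.warning_pct:
        return "warning"
    return "ok"


if __name__ == "__main__":
    target = SlaTarget("hosting", target_pct=99.9, warning_pct=99.95)
    log = IncidentLog()
    # 30 minutes of downtime in a 30-day month (43,200 minutes)
    status = evaluate(target, availability_pct(43170, 43200))
    if status != "ok":
        needs_problem = log.record(target.service, status)
        print(status, "- raise problem record:", needs_problem)
```

In a real deployment the measurements would come from Availability and Capacity Management tooling, and the incident and problem records would live in a service desk system rather than in an in-memory list.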

Simply put, if a Service Provider conducts regular Quality of Service (QoS) reviews of its SLAs, OLAs (Operational Level Agreements), and UCs (Underpinning Contracts), there is a good chance that service breaches and degraded service quality can be caught before a complete service outage occurs. These QoS reviews may also cover associated KPIs and metrics, as well as related incident, problem, and change tickets, because together they provide a more holistic view of a service. By catching issues before they become complete outages, the Service Provider will ultimately save money by avoiding SLA penalties, maintain good relations with its customers, and build its reputation as a reliable Service Provider.
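One way a QoS review could spot degradation before a breach is to project the trend of recent measurements forward by one review period. The sketch below does this with an ordinary least-squares line fitted to past availability figures. The function names, the sample figures, and the choice of a straight-line fit are all assumptions made for illustration; a real review would weigh many more inputs.

```python
from typing import Sequence


def projected_next(values: Sequence[float]) -> float:
    """Fit a least-squares line to equally spaced values and extrapolate one step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two review periods to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # value predicted for the next period


def needs_attention(values: Sequence[float], target_pct: float) -> bool:
    """True when the SLA is still met now but the trend projects a breach."""
    return values[-1] >= target_pct and projected_next(values) < target_pct


if __name__ == "__main__":
    # Hypothetical monthly availability figures for one service
    monthly = [99.99, 99.97, 99.95, 99.93]
    print(needs_attention(monthly, target_pct=99.92))
```

A flag from a check like this would be a prompt for discussion in the service review, for example about added capacity or a renegotiated SLA, rather than an automatic action.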

By: Michael King
Michael King is a Senior Engineering Services Architect in CA’s Service Management group. Michael has over 19 years of experience in IT that includes software engineering, operations management, systems integrations, and process reengineering. Currently, Michael concentrates on Service Level Management...