Published: September 26 2008, 12:23 AM, by Peter Doherty
I just ran a series of seminars here in Oz on what has really changed between ITIL V2 and V3 and the 5 killer areas where people should look to drive value. I will not bore you with the differences because there are a million blogs and sites that cover that. What I will cover is some of the new processes and how you can get quick returns.
Becoming proactive is very important to providing high levels of service and customer satisfaction, because if the first time you know about a disruption to service is when the phone rings, you are in a bad place. One of the ways that ITIL V3 addresses this is through the Service Operation Event Management process.
In V2 nearly all Events were considered to be Incidents whether they affected Services or not - what a pain, and what a waste of resources in dealing with them. In V3 Event Management has its own process, and this is the cornerstone of becoming proactive.
Events can come from many different sources. The infrastructure or software itself can generate events about its operational state and errors, and these will generally go to logs. These logs may get looked at manually or through log monitoring applications, which have the intelligence to recognise important events in the log and propagate them to a monitoring platform. But who really cares about these events? Someone has to, or why are they being generated?
It is generally the platform or application people who care about them from an operational perspective, but rarely do they have the time to review them.
Another area that generates events is the network and systems management tools, like those CA has, where key metrics are monitored through either agents or agentless methods. Thresholds can be set against these, and if the thresholds are breached an event is created - yet more events coming in. The problem is that we generally set these thresholds around things like CPU and memory utilisation. When the CPU or memory spikes, or where we see sustained breaches, we create an event.
But still, so what? Without baselining the environment these are all generic thresholds that tell us nothing. Does it really matter that the CPU is running at 90%? Well, it might, but it might also be normal behaviour that indicates that everything is humming away fine.
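To make the 90% CPU question concrete, here is a rough sketch of a generic threshold next to one derived from a baseline. Everything in it is an assumption for illustration only: the 80% generic limit, the sample histories, and the "mean plus three standard deviations" rule are made up, and real monitoring tools will have their own ways of baselining.

```python
from statistics import mean, stdev

# Hypothetical one-size-fits-all limit, chosen only for illustration.
GENERIC_CPU_THRESHOLD = 80.0


def baseline_threshold(samples, k=3.0):
    """Return mean + k sample standard deviations of the history.

    This is one simple way to baseline; it is an assumption here,
    not a rule prescribed by ITIL or by any particular tool.
    """
    return mean(samples) + k * stdev(samples)


def breaches(value, threshold):
    """True when a reading is above the given threshold."""
    return value > threshold


# Made-up CPU history (%) for a batch server that normally runs hot.
batch_history = [88, 91, 90, 89, 92, 90, 91, 89, 90, 90]
# Made-up CPU history (%) for a web server that normally idles.
web_history = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]

batch_limit = baseline_threshold(batch_history)
web_limit = baseline_threshold(web_history)

# 90% on the batch server trips the generic limit but sits inside its baseline.
print(breaches(90, GENERIC_CPU_THRESHOLD), breaches(90, batch_limit))
# 60% on the web server passes the generic limit but is far outside its baseline.
print(breaches(60, GENERIC_CPU_THRESHOLD), breaches(60, web_limit))
```

With these made-up numbers the generic limit raises a false positive on the batch server and misses an abnormal jump on the web server, while the baselined limits get both right.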
So setting generic thresholds without a baseline is a bit fruitless, and this is why people get so many false positive Incidents out of monitoring tools. There is no 'one size fits all' threshold.
Once you do baseline and start setting intelligent thresholds, you will start creating more meaningful events. You still want to keep information about those spikes, but the operational people can look at that from a trending and performance perspective, as the spikes are not impacting Services.
Unless you work in a small organisation, you are probably going to be faced with a multitude of monitoring tools generating events. Where this is the case, you need to put in a layer of correlation that all events are directed to.
Some events will instantly be recognised as needing an Incident, and at other times an Incident will be created by correlating a number of disparate events within a finite time range. Once you decide you do want to create an Incident, what do you create it on? Certainly not the box that it came from. You want to generate it for the Service that has been impacted, so there is a need to understand the relationship between what generated the event and the business service that it affects. This is one of the things that V3 forces us to think about - how does something relate to a Service?
Now you are seeing Incidents being created, possibly prior to a customer impact, and that is what being proactive is all about.
I will cover the other areas in subsequent posts - but it is not all about me (well yes it is) so what do you think?
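For those who like to see the correlation idea in code, here is a minimal sketch of it, with events mapped to the Service they affect. It is not how any particular product works. The `Event` fields, the `SERVICE_MAP` contents, the 300 second window and the "three warnings" rule are all hypothetical, and in practice the component-to-service relationship would come from something like a CMDB.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some reference point
    source: str       # component that raised the event
    severity: str     # "critical" or "warning" (hypothetical levels)


# Hypothetical component-to-service relationships.
SERVICE_MAP = {
    "db01": "Online Banking",
    "web01": "Online Banking",
    "mail01": "Email",
}


def correlate(events, window=300.0, min_events=3):
    """Return the sorted names of Services that need an Incident.

    A critical event raises an Incident straight away. Warnings raise
    one only when min_events of them hit the same Service within
    `window` seconds of each other.
    """
    incidents = set()
    recent_by_service = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.timestamp):
        service = SERVICE_MAP.get(ev.source)
        if service is None:
            continue  # unmapped component: keep for trending only
        if ev.severity == "critical":
            incidents.add(service)
            continue
        recent = recent_by_service[service]
        recent.append(ev.timestamp)
        # Discard warnings that have fallen out of the time window.
        while recent and ev.timestamp - recent[0] > window:
            recent.pop(0)
        if len(recent) >= min_events:
            incidents.add(service)
    return sorted(incidents)


events = [
    Event(0, "db01", "warning"),
    Event(50, "mail01", "warning"),
    Event(100, "web01", "warning"),
    Event(200, "db01", "warning"),
]
print(correlate(events))
```

Note that the Incident is raised against the Service name and not against `db01` or `web01`, which is the point made above about not creating it on the box the event came from.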
By: Peter Doherty
Peter Doherty is an ITILv3 contributing author and a Principal Consultant for CA. With 25 years IT experience in Service Management as well as Enterprise Network and Systems Management, Peter Doherty is CA’s foremost Service Management evangelist in the Asia Pacific region. His day-to-day responsibility...