Not being led astray by vendors

Do vendors really understand problem management or do they lead their customers astray? Do they lie? This is the typical vendor take:
  • Incident management is the ability to create a ticket.
  • Problem management is the ability to associate one ticket with another.
  • Root cause analysis is the ability to type in a description field on a ticket.
Vendors call this ticketing system a service desk; previously it was a help desk. The migration from help desk to service desk was a simple step that involved the act of renaming. However, to further pull the wool over the customer's eyes, a special ticket was created, called a change request. This ticket was the same as any other ticket, except that it had an approver field.
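To make the point concrete, here is a minimal Python sketch of the vendor "data model" described above. The class and field names are my own invention, not any particular product's schema; the point is how little is actually there.

```python
# A purely illustrative sketch of the vendor "data model" described above.
# All class and field names are hypothetical; no particular product is implied.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Ticket:
    """The vendor's 'incident management': the ability to create a ticket."""
    ticket_id: str
    description: str                                           # 'root cause analysis' lives here, as free text
    related_tickets: List[str] = field(default_factory=list)   # 'problem management' is this list


@dataclass
class ChangeRequest(Ticket):
    """The 'special' ticket: identical to any other, plus an approver field."""
    approver: Optional[str] = None
```

That is roughly the whole data model; the rest is marketing.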
Some vendors also drank the CMDB Kool-Aid and offered up a pimped-up inventory database. This was meant to keep the house of cards from crashing down.
"A fool with a tool is still a fool!" I have been such a type of fool and along the way probably spent unnecessary moolah! So I mused as to what combination of tools, knowledge and learning would constitute a practical benefit to problem management. My musings and wanderings lead me to discover the problem management "sweet spot!" The problem management "sweet spot" is the diagnosis of a service failure of severe negative business consequence. It is not the detection or the actual act of return to service. It is not the physical infrastructure or the logical application. It is all about prioritizing the diagnosis of an immediate workaround, then investigating a sustainable resolution.
There are as many vendors selling detection tools as there are Schmidts in Germany. However, detection is never the underlying issue, as your customers will usually tell you (and often in a not too pleasant manner!) A return to service, on the other hand, is totally dependent on a diagnosis. Diagnosis is extremely difficult to nail down to a set time period. When 5h1t happens it is not possible to say with absolute certainty that a diagnosis can be nailed within a fixed time period, say five minutes. Yet many service providers are quite prepared to commit this to an "SLA" and ride their luck.
So back to a tool for problem management. Does one exist? Problem management is dependent on knowledge created from learning. This is not provided by a tool, but the output should at least be stored in some type of repository.
The most practical tools currently available are not bought for a million dollars but for a measly few. They are called pen and paper, and they are the basis for the start of documentation.

I am not trying to be a heretic like those weird consultants who proclaim enterprise architecture can be done with spreadsheets. It is about being practical and using a set of tools. Most tools in IT that provide service, network and systems management are disproportionately priced for the value and business case they address. And then, to crown it all, none really do root cause analysis as defined in Heinrich's multiple causation model. The exception is the open source ones, which at least have a perfect price!
Besides the Bic pen and A4 counter book, the next most important tool is the voice multipoint conference. A tool does not communicate or involve stakeholders. (And automated emails are an overrated spam creator!) This leads to the further form of collaboration whose backbone is documentation. I have previously blogged about the Techie curse, which is a reluctance to create and work with documentation. A Pareto analysis of the major reasons for delays in time to diagnose would put the lack of appropriate documentation right at the top! If this is the issue, then the most important tool for problem management is one that addresses this oversight.
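Here is a small Python sketch of what such a Pareto analysis might look like. The delay categories and counts are invented purely for the example; the only point is that one reason tends to dominate the cumulative percentage.

```python
# A minimal sketch of a Pareto analysis of diagnosis delays.
# The categories and counts below are invented for illustration only.
from collections import Counter

delay_reasons = Counter({
    "no up-to-date documentation": 42,
    "could not reach the right people": 11,
    "no access to the affected system": 7,
    "waiting on vendor support": 5,
    "other": 4,
})

total = sum(delay_reasons.values())
cumulative = 0
for reason, count in delay_reasons.most_common():
    cumulative += count
    print(f"{reason:35s} {count:3d}  {100 * cumulative / total:5.1f}% cumulative")
```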
A key cornerstone of documentation is Standard Operating Procedures, which need to be readily available and accessible. A tool that fulfils these requirements is a wiki. This is a type of portal to which any problem management collaboration is directed. It provides an appropriate use of Web 2.0 social networking tools in a service management context. The worst form of collaboration is email. It stinks and is often misconstrued.
Any resolution to a problem is never complete until the paperwork is done.
