
The relevance of Murphy's Law to problem management

Murphy’s Law states: "if anything can go wrong, it will." The first reported use of the term Murphy’s Law is in a 1952 book by Anne Roe, quoting an unnamed physicist. The observation inherent in Murphy's Law, for which so many IT professionals have an affinity, is highly relevant to the field of problem management. Murphy’s Law correlates closely with Heinrich’s Incident Pyramid (described below). In complex technological systems such as those found in IT, incidents are inevitable. Both "Murphy" and Heinrich point to that inevitability: one is an adage and the other is research, but they reach a similar conclusion. The means to combat "go wrong" lies in IT Safety. The remit of IT Safety is to reduce the rate at which shit happens ("go wrong"). It is possible to reduce that rate, say from once a day to once a week, by using safer processes that lengthen the time between near misses. This improves safety in IT.
The Incident Pyramid originated in 1931, when H.W. Heinrich described it in his book, Industrial Accident Prevention: A Scientific Approach. The Incident Pyramid proposes that for every 300 unsafe acts there are 29 minor injuries and one major injury. It can be read as corroborating evidence for Murphy's Law, which first appeared in print 21 years later.

Besides the Incident Pyramid, the book also illustrates Heinrich's theory of incident causation. Unsafe acts lead to minor injuries and, over time, to major injury. All incidents occur as a result of many factors or multiple causes. Root Cause Analysis based on this theory is used in incident investigations: the obvious physical circumstance of the incident is investigated to determine its cause, then what led to that cause, and so on until no further factors can be identified. To avoid highlighting functional inadequacies, many organizations simply attribute most incidents to human error or to a failure to follow safety rules. This dishonesty is often labelled scapegoating. The habit of blaming major incidents on humans damages IT Safety.
In 1969, the Insurance Company of North America conducted a subsequent study using more than 1.7 million incidents reported by nearly 300 companies in 21 industrial groups. That study revealed a similar pattern to Heinrich’s but with slight deviations in the ratios. For each serious injury, there were 10 minor injuries, 30 property-damage incidents and 600 near-miss incidents that resulted in no injury or property damage.
The incident pyramid from Dresser-Rand.
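
To make the ratios above concrete, here is a minimal Python sketch that scales a pyramid to an observed count at one tier. The ratios are the ones quoted in the text. The tier labels, the function name scale_pyramid and the figure of 120 near misses are made up for illustration, and treating the pyramid as a strictly proportional relationship is a simplifying assumption, not something either study claims for IT incidents.

```python
from fractions import Fraction

# Ratios as quoted in the text. The dictionary keys are illustrative labels.
HEINRICH_1931 = {"major_injury": 1, "minor_injury": 29, "unsafe_act": 300}
INA_1969 = {
    "serious_injury": 1,
    "minor_injury": 10,
    "property_damage": 30,
    "near_miss": 600,
}


def scale_pyramid(ratios, base_tier, observed):
    """Scale every tier proportionally to an observed count at one tier.

    Assumes the pyramid ratios hold linearly, which is a simplification.
    """
    factor = Fraction(observed, ratios[base_tier])
    return {tier: float(count * factor) for tier, count in ratios.items()}


# Hypothetical input: 120 near misses logged over some period.
estimate = scale_pyramid(INA_1969, "near_miss", 120)
for tier, value in estimate.items():
    print(f"{tier}: {value:g}")
```

Under that assumption, 120 near misses correspond to 6 property-damage incidents, 2 minor injuries and 0.2 serious injuries. Read the other way, lowering the count at the base of the pyramid lowers the expected count at the top, which is the argument for IT Safety made above.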
