EHS - June 2017
By: Terry L. Mathis
When accidents happen, organizations analyze them to determine causation. Causation is not always one-dimensional, i.e., one cause producing one effect; therefore, we seek to determine root cause. The theory is that removing or addressing the root cause is the most elegant and effective preventative measure. However, there are several problems with this theory.
Accidents do not always have a single root cause, and root causes cannot always be accurately determined. There are several popular methods for determining root cause, and they have some interesting differences. All suggest that asking "why?" is key. Ask why the accident happened to determine the immediate cause. Then ask why that cause happened, and so on. Some suggest root cause is always the fifth underlying cause; others interpret it differently. In his book, The Apollo Root Cause Method, Dean Gano suggests all accidents have multiple causation chains. He thinks these chains tend to be either behavioral or conditional and that there can be multiples of either or both. This assumption complicates the theory of a singular root cause that can be determined and addressed to prevent future accidents. Gano also points out that we tend to assume we have determined the root cause when we have simply run out of information to answer the next "why" question.
As we follow the causal chain(s), asking why at each level, sometimes the next underlying cause has multiple causes. This means causation is more of a tree diagram than a linear chain. Even if the analysis starts with a single chain, the chain can split at any level into multiple causal lines. Arriving at the conclusion that an accident has multiple root causes suggests a more complex solution to prevent future accidents. Does one have to address all or just some of the root causes, and, if some, which ones?
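The tree idea can be made concrete with a small sketch. The Python below is only an illustration: the `Cause` class, the `root_causes` function and the conveyor incident are all hypothetical, and they are not drawn from Gano's method or from any real investigation. Each answer to "why?" becomes a child node, and the candidate root causes are the nodes with no further answer recorded beneath them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    """One node in a causal tree: a cause plus the causes found behind it."""
    description: str
    kind: str  # "behavioral" or "conditional" (labels assumed for illustration)
    underlying: List["Cause"] = field(default_factory=list)


def root_causes(cause):
    """Return every leaf under `cause`, i.e. each cause with no recorded 'why' behind it."""
    if not cause.underlying:
        return [cause]
    leaves = []
    for child in cause.underlying:
        leaves.extend(root_causes(child))
    return leaves


# Hypothetical incident: the causal line splits at the first and second levels.
incident = Cause("Hand caught in conveyor", "conditional", [
    Cause("Guard was removed", "conditional", [
        Cause("Guard blocks routine cleaning", "conditional"),
    ]),
    Cause("Conveyor was running during cleaning", "behavioral", [
        Cause("Lockout skipped to save time", "behavioral"),
        Cause("Lockout point is far from the work area", "conditional"),
    ]),
])

for leaf in root_causes(incident):
    print(f"{leaf.kind}: {leaf.description}")
```

Even this small example ends with three candidate root causes of mixed type. A leaf here marks only where the recorded answers stop, which may reflect missing information rather than a true root.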
Even if root causes are accurately identified, they cannot always be neutralized. In some instances, the risks that can cause accidents are critical to the mission of the organization. In our work with nuclear power plants, we were often told jokingly that if we could just eliminate radiation, steam and high voltage, the plant would be a relatively safe place to work. Certainly, not all root causes are underlying risks. In some instances, root causes of accidents involve the failure of risk-control measures or the failure of workers to utilize these measures. In other cases, root causes can involve mechanical failure, which can often be addressed. Mechanical design issues can be more challenging. Many mechanical equipment configurations create pinch points or line-of-fire issues that can be neither eliminated nor effectively guarded. Many of our clients keep a list of risks they have identified but for which they have not yet found good solutions.
This partial list of the limitations of root-cause analysis is not a condemnation of the practice but rather an attempt to put it in proper perspective. Too many view root-cause analysis as the ultimate way to address accident prevention. Those with technical training tend to apply the concept of root cause to people issues as they have been taught to apply it to machines or circuits. If a machine malfunctions and the root cause is not addressed, the problem will recur. This is not necessarily true of human performance issues. Humans can adopt precautions and work-arounds that can prevent accidents even with the primary risk still present. This means accident investigations should look for such preventative measures even if they are not root causes. There is a potential danger that strict adherence to root-cause analysis processes could blind investigators to other possible solutions.
Root-cause analysis, like so much of traditional safety, is a reactive intervention. An accident must occur before the analysis can propose any action to impact safety. It is neither predictive nor prescriptive before the accident. This does not mean root-cause analysis cannot prevent accidents. It simply means it cannot do so proactively. Like all reactive safety, root-cause analysis is driven by a lagging indicator. As safety improves, lagging indicators decrease. Unfortunately, before they reach zero they often lose their statistical significance and thus become less useful in preventing accidents over the short term. Many safety programs make great improvements using lagging indicators until those very indicators become so few and far between that they are no longer effective. Reacting to single data points, even if done well, may not address the lower-probability accidents that tend to remain when high-probability accidents have been prevented.
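A rough calculation shows why small counts lose statistical significance. The sketch below assumes yearly incident counts behave like Poisson-distributed events, a common simplification that the article itself does not state. Under that assumption the chance variation in a count of n is about the square root of n, so the relative uncertainty is roughly 1 divided by the square root of n. The function name and the yearly counts are hypothetical.

```python
import math


def relative_uncertainty(count):
    """Approximate relative standard deviation of a Poisson count: 1 / sqrt(count)."""
    if count <= 0:
        raise ValueError("count must be positive")
    return 1.0 / math.sqrt(count)


# Hypothetical yearly incident counts for a site whose safety record is improving.
yearly_counts = [100, 25, 4]

for n in yearly_counts:
    share = relative_uncertainty(n)
    print(f"{n:>3} incidents: about +/-{share:.0%} variation from chance alone")
```

Under these assumptions, a site with 100 incidents a year sees roughly 10 percent chance variation, while a site with 4 sees roughly 50 percent. At that point a year-to-year change of one or two incidents says little about whether safety actually improved.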
Root-cause analysis can potentially compete with innovation in three ways:
- It can consume the organization with reactive activities and put it in fire-fighting mode rather than proactive improvement mode. Most safety departments and programs have limited bandwidth (budget, personnel, time, etc.) and once that is spent, there is nothing left for other efforts.
- It can become a blame game in which the goal is to fix blame and assume assignment of blame fixes the problem. We have seen organizations where the most common corrective action from root-cause analysis was retraining a worker. In many cases the worker's knowledge or skills were impacted by another influence, but that influence was not down the causal chain.
- It can create a mindset that safety is problem-solving rather than innovating. Problem solving is largely aimed at issues which disrupt the status quo and can be addressed to return the workplace to a previous condition. Innovation is not a previous condition, but a totally new and improved one.
Root-cause analysis is a serious and valuable tool for safety improvement if used in the proper context with an appreciation of both its power and its limitations. Effective use of this tool begins with recognizing that addressing root cause is not the only way to prevent accidents and with keeping an eye open to other preventative opportunities.