Mark Paradies

Why didn’t senior management reach the same conclusions that my boss did when presented with the same facts? The problem was that our management couldn’t critically review their own management systems. They thought that they were doing great, and any other feedback was outside their paradigm. They were not being self-critical. They could not face the facts. And no one else was willing to tell them that they needed to improve (lots of “yes men” around them).

First, they had a hard time getting past blaming the operators and supervisors (five were fired). There was an internal BP group (the Bonse report – see Bonse Main Report.pdf) that recommended management discipline (blaming the lower levels of senior management). However, no immediate disciplinary action was taken.

Within two years, all the senior line management from the Refinery General Manager to the CEO were gone (none were fired immediately as part of the incident response). So the ability of that management to learn didn’t make any difference – they were gone!

Second, senior management needs to understand root cause analysis. This may be more common today than it was thirty plus years ago because more senior managers have had some experience with advanced root cause systems (TapRooT®). They need to understand their (management’s) impact on management systems and their impact on investigations and implementation of corrective actions.

First, let me say that my high-reliability organization experience was in Admiral Rickover’s Nuclear Navy. Seven years and two ships – the USS Arkansas and the USS Long Beach – both nuclear-powered cruisers. I had close friends on submarines and nuclear-powered carriers. To learn more about the Nuke Navy record and Rickover’s philosophy, read the series of articles at:

I think the answer is ZERO. Rickover’s Nuclear Navy was (and still is, even though he has been gone for almost forty years) an organization that achieves the amazing record of ZERO reactor safety related accidents. There were no fatalities, no major releases of radioactive material, and no core-melt accidents. And this record was set during Rickover’s leadership when the Navy was running hundreds of reactors at sea every year. Today that’s a record of sixty years of continuous operation of many nuclear reactors without a major reactor-related accident.

Let’s use a sports example. If you are a great athlete, you have the chance to excel at a sport. But you probably can’t be good at every sport. The world’s greatest basketball player may not be able to switch over and become the world’s greatest baseball player … or golfer. They might be better than average, but they probably won’t be great.

If I were going to get advice about becoming a high-reliability organization (quite a challenge if you aren’t one), I would talk to someone who has lived in a high-reliability organization and who has also worked at a non-high-reliability organization. Or, better yet, I would hire that person to help lead the effort. I would not become part of someone’s university research detached from the experience of living in the organization.

She participated in the investigation and was pleasantly surprised. The investigation identified a number of Causal Factors including her “screw up.” But, to her surprise, they didn’t just stop there and blame her. They looked at the reasons for her mistake. They found there were three “root causes” that could be fixed (improvements that could be made) that would stop the mistake from being made in the future.

Back early in his career, he had been an engineer involved in the construction and startup of a major facility. One day when they were doing testing, the electrical power to some vital equipment was lost and then came back on “by itself.” This caused damage to some of the equipment and a delay in the startup of the plant. An investigation was performed, but no reason could be found for the power failure or for the power coming back on. No one admitted to being in the vicinity of the breaker, and the breaker was closed when it was checked after the incident.

Thirty years later they held an unofficial reunion of people who had worked on the project. At dinner, people shared funny stories about others and events that had happened. An electrician shared his story about accidentally opening the wrong breaker (they weren’t labeled) and then, when he heard alarms going off, re-closing the breaker and leaving the area. He said, “Well, I’m retired and they can’t punish me for it now.”

That electrician’s actions had been the cause of the incident. The refinery manager telling the story added that the electrician probably would have been fired if he had admitted what he had done at the time. The refinery manager then added that, “It is a good thing that we use TapRooT® and know better than to react to incidents that way. Now we look for and find root causes that improve our processes.”

Back in 1985, my boss suggested I (Mark Paradies) look at better ways to analyze and trend root causes. I already understood human factors (from training for my master’s degree from the University of Illinois) and equipment reliability (from my experience in the Nuclear Navy). Therefore, I had a good start on the knowledge I needed to develop a system for root cause analysis.

He explained that senior management would go nuts if we said there was a leadership issue. We argued for quite a while. I said that management would just have to accept that they, too, could improve. He explained that I had done such a good job NOT having blame categories in the system that I shouldn’t have a blame category for leadership and that’s how the company’s management would view a Leadership category.

Eventually, he convinced me that we could come up with a better name. He wanted to call it “Organizational Factors.” I said that it should be “Management.” He said that was just as bad as Leadership. He wanted to call it “Systems.” I thought that clouded the responsibility for where the changes needed to be made. Finally, I suggested “Management System.” It took a while, but he finally agreed.

The first place it spread to was INPO (the Institute of Nuclear Power Operations). I had interviewed for a job there, and they had tried to hire me, but I had already taken another job. However, I stayed in contact with Joe Bishop, who was in charge of developing HPES (the Human Performance Evaluation System) for INPO. When they had a version of the system ready, he sent me a copy for review. I commented that they had missed the Management System causes. Sure enough, my work on Management System made it into INPO’s HPES system.

With almost 1 out of 5 having significant concerns, and two-thirds having some concerns, it made me wonder about the blame being placed on the ships’ Commanding Officers and crews. Were they set up for failure by a training program that sent officers to sea who didn’t have the skills needed to perform their jobs as Officer of the Deck and Junior Officer of the Deck?

According to an article in The Maritime Executive, Lt. j.g. Sarah Coppock, Officer of the Deck during the USS Fitzgerald collision, pleaded guilty to charges to avoid facing a court-martial. Was she properly trained, or would the Navy’s evaluators have had “concerns” with her abilities if she had been evaluated BEFORE the collision? Was this accident due to the abbreviated training that the Navy instituted to save money?

In other blame-related news, the Chief Boatswain’s Mate on the USS McCain pleaded guilty to dereliction of duty for the training of personnel to use the Integrated Bridge Navigation System, which had been newly installed on the McCain four months before he arrived. His total training on the system was 30 minutes of instruction by a “master helmsman.” He had never used the system on a previous ship and had requested additional training and documentation on the system, but he had not received any help prior to the collision.

You might laugh at these root causes but they are included in real systems that people are required to use. The “operator is stupid” root cause might fit in the “reasoning capabilities less than adequate,” the “incorrect performance due to mental lapse,” the “poor judgment/lack of judgment,” or the “insufficient mental capabilities” categories.

To deal with human frailties, we implement best practices to stop simple memory lapses from becoming incidents. In other words, that’s why we have checklists, good human engineering, second checks when needed, and supervision. The root causes listed on the back side of the TapRooT® Root Cause Tree® are linked to human performance best practices that make human performance more reliable so that a simple memory lapse doesn’t become an accident.

What happens when you make a pick list with blame categories like those in the bulleted list above? The categories get overused. It is much easier to blame the operator (they had less than adequate motor skills) than to find out why they moved the controls the wrong way. It’s easy to say there was a “behavior issue.” It is difficult to understand why someone behaved the way they did. TapRooT® looks beyond behavior and simple motor skill errors to find real root causes.