Download the PDF
It was supposed to be NASA’s first manned Apollo mission to Earth orbit, but it was over before it ever got off the ground.
Scheduled to blast off in late February, 1967, the Apollo Command/Service Module was larger and more complex than any spacecraft had ever been. Astronauts Gus Grissom, Edward White, and Roger Chaffee had trained extensively for their mission, and the spacecraft was given a passing grade in August of 1966.
On January 27, 1967, the crew gathered for what was supposed to be a routine pre-launch test. The test was considered non-hazardous because neither the spacecraft nor the launch vehicle was loaded with fuel, and the pyrotechnic systems were disabled.
Fully pressure-suited, Grissom, White, and Chaffee entered the command module, were strapped in, and were connected to the spacecraft’s oxygen and communications systems. The test proceeded with only minor issues for the next five hours. Then, around 6:31 pm EST, it took a dramatic turn for the worse when Chaffee reported, “We’ve got a fire in the cockpit.” Seventeen seconds later, the transmission from the crew ended with a cry of pain. All three crew members died in the tragedy.
The fire aboard Apollo 1 serves as an example of how the root cause analysis process can be applied to a specific incident. Root cause analysis methodology follows three steps: define the problem, analyze its causes, and identify solutions.
Each step will be discussed below.
Cause Mapping any incident, the Apollo 1 fire included, requires defining the problem by asking four questions: What is the problem? When did it happen? Where did it happen? How did it impact the goals?
Though it may seem simple, the first question, “What is the problem?”, can elicit myriad responses. Was the problem the fire? The fact that the astronauts White, Grissom, and Chaffee were killed? Difficulty in communicating between the crew and mission control? An arced wire? The more closely we examine the Apollo fire, the more elements we can identify that culminated in a tragic loss of life.
Our cause mapping professionals note all such problem ideas without judgment, and analyze them later in the root cause analysis exercise.
Root cause analysis requires specifying a date in order to be able to gauge change. In this case, the Apollo 1 fire took place on January 27, 1967, around 6:30 pm EST.
Location is also an important piece of context that root cause analysis takes into account. For the Apollo 1 fire, this includes the immediate location (aboard the Apollo 1 / Saturn space vehicle), the facility (Cape Canaveral), and the task that was being performed at the time (launch pad test).
When performing root cause analysis on any specific incident, it is important to note how the problem impacted several goals, and to be as specific about defining those goals as possible. For Apollo 1, for example, the goal might appear to have been simply to launch the spacecraft into orbit; in fact, a closer look reveals that the Apollo fire involved five distinct goals, each of which was impacted by the problem.
The safety goal was affected by the deaths of three astronauts. Vehicle and property goals were impacted by the rupture of the command module. The resulting delay in the mission impacted the mission goals. Finally, rescue efforts had an effect on the labor and time goals.
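The problem outline described above can be captured as a small record. The following is a minimal sketch, not part of the Cause Mapping method itself: the `ProblemOutline` class and its field names are hypothetical, and the values are taken from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemOutline:
    """Hypothetical container for the four problem-definition questions."""
    what: list[str]                 # every candidate problem, noted without judgment
    when: str                       # date and time of the incident
    where: dict[str, str]           # location, facility, and task being performed
    impacted_goals: dict[str, str] = field(default_factory=dict)  # goal -> impact


apollo1 = ProblemOutline(
    what=[
        "Fire in the command module",
        "Three astronauts killed",
        "Difficulty communicating with mission control",
        "Arced wire",
    ],
    when="January 27, 1967, around 6:30 pm EST",
    where={
        "location": "Apollo 1 / Saturn space vehicle",
        "facility": "Cape Canaveral",
        "task": "Launch pad test",
    },
    impacted_goals={
        "safety": "Deaths of three astronauts",
        "vehicle and property": "Rupture of the command module",
        "mission": "Delay in the mission",
        "labor and time": "Rescue efforts",
    },
)

# Print each impacted goal alongside the impact recorded for it.
for goal, impact in apollo1.impacted_goals.items():
    print(f"{goal}: {impact}")
```

Keeping `what` as a list reflects the point that all problem ideas are recorded first and analyzed later.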
For any Cause Map, dissecting the problem and the goals is the starting point. In the analysis step, the incident is broken down into causes, which are captured on the Cause Map. First, we write down the goals that were affected as defined in the problem outline. Next, we can ask why an incident occurred and trace its causes.
For the fire on Apollo 1, the most significant problem was the human tragedy: three astronauts were killed. Their deaths impacted the safety goal. This is the first cause-effect relationship in the analysis.
The root cause analysis continues by asking “why?” questions, moving to the right of the cause-effect relationship above.
Why did the three astronauts die? Because of a lack of oxygen.
By continuing this line of inquiry, we develop the Cause Map further.
Why was there a lack of oxygen? Because of a fire in the command module.
Why was there a fire? Any fire requires heat, fuel and oxygen.
Though no single ignition source was conclusively identified, the heat source was believed to be electric arcs, which are formed when electricity jumps from one electrode to another, creating an electrical current. Evidence indicating that several arcs had occurred supports this theory, and is included beneath the cause box. The arcing is believed to be due to inadequate insulation, as well as its installation: there were numerous issues with design and workmanship of the insulation onboard.
Due to discrepancies between specifications, extensive combustible materials were in the cabin, providing the fuel for the fire.
Pure oxygen was used in the cabin to match the atmosphere planned for flight, in which the crew would breathe pure oxygen at reduced pressure.
The fire, however, was not solely responsible for the deaths of the three astronauts. They were also killed because they could not escape the command module.
Why were they trapped? The astronauts were unable to exit the command module before being overcome by the lack of oxygen because the hatch took five minutes to open. This, in turn, was due to a change in hatch design that minimized the possibility of accidental opening, made after a misfire of the previous hatch design on the Mercury Liberty Bell 7 nearly drowned Gus Grissom in 1961.
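The chain of “why?” questions above can be sketched as a small directed graph in which each effect points to its causes. This is an illustrative sketch only: the dictionary layout, the node labels, and the helper names `ask_why` and `leaf_causes` are assumptions made for this example, and the relationships are limited to those stated in the text.

```python
# Each effect maps to the causes recorded for it (moving "to the right" on the map).
cause_map = {
    "Safety goal impacted": ["Three astronauts killed"],
    "Three astronauts killed": [
        "Lack of oxygen",
        "Could not escape the command module",
    ],
    "Lack of oxygen": ["Fire in the command module"],
    "Fire in the command module": [
        "Heat: electric arcs",
        "Fuel: combustible materials in cabin",
        "Oxygen: pure-oxygen cabin atmosphere",
    ],
    "Heat: electric arcs": ["Inadequate insulation"],
    "Fuel: combustible materials in cabin": ["Discrepancies between specifications"],
    "Could not escape the command module": ["Hatch took five minutes to open"],
    "Hatch took five minutes to open": [
        "Hatch redesigned to prevent accidental opening"
    ],
}


def ask_why(effect, depth=0):
    """Return the map as indented lines, asking 'why?' depth-first."""
    lines = ["  " * depth + effect]
    for cause in cause_map.get(effect, []):
        lines.extend(ask_why(cause, depth + 1))
    return lines


def leaf_causes(effect):
    """Return causes for which no further 'why?' has been recorded yet."""
    causes = cause_map.get(effect, [])
    if not causes:
        return [effect]
    found = []
    for cause in causes:
        for leaf in leaf_causes(cause):
            if leaf not in found:
                found.append(leaf)
    return found


print("\n".join(ask_why("Safety goal impacted")))
```

The leaves returned by `leaf_causes` are simply the points where the analysis stops in this sketch; a fuller investigation would keep asking “why?” at each of them.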
Every issue has its causes, and should be broken down into a sufficient level of detail to prevent the incident, or at least to reduce the risk of its occurring to an acceptable level. This is why solutions and work processes at, for example, a coffee shop are not as thorough or detailed as they are at an airline or nuclear power facility. The risk or impact to the goals dictates how effective the solutions should be. Lower-risk incidents call for less detailed investigations, while a significant risk to an organization’s goals requires a much more thorough analysis.
Once the Cause Map is built to a sufficient level of detail with supporting evidence, it can be used to develop solutions. The Cause Map identifies all possible solutions for a given issue so that the best solutions can be implemented. Because root cause analysis captures this level of detail, it reveals many more possible solutions than an oversimplified analysis that considers only the most immediate problems and causes.
Solutions can be documented directly on the Cause Map, and are typically placed in a green box directly above the cause that the solution controls. At this stage, all solutions are considered and listed on the Cause Map.
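Attaching solutions to the causes they control can be sketched as a simple lookup. This is a hypothetical illustration: the helper names are invented for the example, the solutions come from the improvements listed later in this article, and the pairing of each solution with a particular cause is an assumption made here, not a record of NASA's own analysis.

```python
from collections import defaultdict

# cause -> candidate solutions recorded against it
solutions_by_cause = defaultdict(list)


def add_solution(cause, solution):
    """Record a candidate solution against the cause it is meant to control."""
    solutions_by_cause[cause].append(solution)


# Illustrative pairings only (assumed for this sketch).
add_solution("Combustible materials in cabin",
             "Fewer combustible materials in the command module")
add_solution("Electric arcs from inadequate insulation", "Better wiring")
add_solution("Fire in the command module", "Fire extinguisher")
add_solution("Fire in the command module", "Emergency oxygen masks")


def uncontrolled(causes):
    """Return the causes that have no candidate solution yet."""
    return [cause for cause in causes if not solutions_by_cause.get(cause)]


print(uncontrolled([
    "Fire in the command module",
    "Hatch took five minutes to open",
]))
```

Listing every candidate first, and only then choosing among them, mirrors the order described above.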
After the analysis is complete, the best solutions are selected based on their impact on the organization’s goals. Shown below are the action items implemented as a result of the Apollo 1 incident.
A major factor in the death of the three astronauts was the fact that they could not get the hatch open; furthermore, because the hatch was designed to be opened from the inside, the engineers on the outside of the spacecraft could do nothing to help.
A spark from faulty insulation around the wires ignited the fire. The wires were coated with Teflon, which is an excellent material for insulation; Teflon is also, however, a soft material that wears away easily. This Teflon coating had indeed worn away, exposing electrical wires.
Extensive combustible materials (mostly nylon and Velcro) were used in the cabin, as these seemed to solve the problem of holding equipment in place. The crew had voiced concerns over the amount of flammable material back in August of 1966, and indeed, the flammable material onboard accelerated the fire.
The cabin had been pressurized with pure oxygen at a level of 16.7 pounds per square inch, creating an environment in which materials that are not normally highly flammable will burst into flame.
After the tragedy on Apollo 1, NASA was determined to learn from the incident. The command module was extensively evaluated and redesigned, and the Apollo program continued. The next manned mission, launched in October of 1968, was successful in part because the Apollo 1 fire prompted NASA to make several key improvements, including an onboard TV camera, emergency oxygen masks, improved radio communications, a fire extinguisher, fewer combustible materials in the command module, better wiring, and a new system for stabilizing atmospheric conditions. In 1969, an Apollo 1 patch was left on the moon’s surface by Apollo 11 crew members in memory of the mission that never made it off the ground.
The root cause analysis method focuses on the basics of the cause-effect principle so that it can be applied consistently to everyday issues as well as catastrophic, high-risk issues. Focusing on the basics of the cause-effect principle makes the Cause Mapping approach to root cause analysis a simple and effective method for investigating safety, environmental, compliance, customer, production, equipment, or service issues.
Click on “Download PDF” above to download a PDF showing the Root Cause Analysis Investigation.
Want to see more space-related cause maps? Check out our root cause analysis of the 2003 loss of the Columbia space shuttle on re-entry, or the 1986 Challenger explosion.