Fault Tree Analysis is the method of choice when you need to understand not just what caused a failure, but how combinations of events can conspire to produce a catastrophic outcome. Developed by Bell Labs in 1961 for the Minuteman missile program and later adopted by NASA, nuclear power, and the automotive safety industry, FTA uses Boolean logic to build a visual map from a top-level failure down to every root cause that could contribute to it. This guide explains the gate logic, walks through the 7-step process, shows three complete examples, and tells you exactly when to use FTA instead of FMEA or 5 Whys.

What Is Fault Tree Analysis?

A fault tree is a top-down, deductive diagram that starts with a single top event — an undesired failure, accident, or safety incident — and breaks it down into all possible causes using logic gates. The analysis answers: “What combinations of events, if they occurred simultaneously, would cause this top event?”

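Three key characteristics define FTA: it is top-down (it starts from the top event), deductive (it reasons from the effect back to its causes), and built on Boolean logic gates.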

FTA in one sentence: Starting from an undesired top event, decompose it through AND/OR logic gates until you reach basic events (component failures or human errors), then identify the smallest combinations of basic events that cause the top event — these are your minimal cut sets and highest-priority risks.

AND Gates and OR Gates: The Logic of FTA

Every node in a fault tree connects to its children through either an AND gate or an OR gate. Getting these right is the most important skill in FTA.

OR Gate

The output event occurs if any one of the input events occurs. One failure is sufficient. Represents parallel failure paths — each path independently leads to the output.

Example: “Loss of power” at a load with no backup supply occurs if the mains supply fails OR the circuit breaker trips OR the supply cable is cut.

AND Gate

The output event occurs only if all input events occur simultaneously. All safeguards must fail at once. Represents redundancy — where multiple independent protections exist.

Example: “Undetected fire” occurs only if Detector A fails AND Detector B fails AND the manual inspection is missed.

Why this matters for risk: OR gates increase system risk because any single failure propagates upward. AND gates represent safety redundancy — they are where defence-in-depth protects you. A system with all OR gates between the top event and basic events has no redundancy at all; every basic event is a single point of failure.
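The gate semantics can be sketched in a few lines of Python. This is a minimal illustration rather than a standard FTA library: the nested-tuple representation, the `occurs` function, and the event names (including the hypothetical coolant example) are assumptions made for this sketch.

```python
# Minimal sketch of AND/OR gate semantics (illustrative names, not a standard API).
# A node is either a basic-event name (str) or a tuple: (gate_type, [children]).

def occurs(node, failed):
    """Return True if the event at `node` occurs, given the set of failed basic events."""
    if isinstance(node, str):          # basic event
        return node in failed
    gate, children = node
    results = [occurs(child, failed) for child in children]
    if gate == "AND":
        return all(results)            # every input must occur
    if gate == "OR":
        return any(results)            # one input is enough
    raise ValueError(f"unknown gate type: {gate}")

# AND gate: the fire goes undetected only if every safeguard fails.
undetected_fire = ("AND", ["detector_a_fails", "detector_b_fails", "inspection_missed"])

# OR gate: any one path is sufficient (hypothetical coolant example).
no_coolant_flow = ("OR", ["pump_motor_fails", "inlet_blocked", "pipe_ruptures"])

print(occurs(undetected_fire, {"detector_a_fails"}))   # False: two safeguards still hold
print(occurs(no_coolant_flow, {"inlet_blocked"}))      # True: a single failure propagates
```

Because gates can nest, the same function evaluates a whole tree once the top event is written as a gate over intermediate gates.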

Other FTA Symbols

Minimal Cut Sets: Where the Real Risk Lives

After building the fault tree, the most important analytical step is finding the minimal cut sets (MCS).

A cut set is any combination of basic events whose simultaneous occurrence causes the top event. A minimal cut set is a cut set with no unnecessary members: remove any one event from it and the remaining events no longer cause the top event.

Cut set priority ranking

Single-event MCS: Highest priority. One failure alone causes the top event. This is a single point of failure (SPOF) and requires immediate action.
Two-event MCS: High priority. Two independent failures must coincide. Risk increases when one failure is latent (undetected until the second occurs).
Three+ event MCS: Lower priority. Three or more simultaneous independent failures required. Probability is low if events are truly independent.

Identifying MCSs tells you exactly where to focus corrective action: eliminate single-event MCSs first (add redundancy or remove the failure mode), then work through two-event MCSs.
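For small trees, the cut sets can be derived mechanically. The sketch below is one simple way to do it, assuming a nested-tuple tree and hypothetical event names: OR gates collect their children's cut sets, AND gates combine one cut set from each child, and a final absorption pass drops any cut set that contains a smaller one. It is a teaching sketch, not a production algorithm, and it will grow quickly on large trees.

```python
from itertools import product

# A node is a basic-event name (str) or a tuple: (gate_type, [children]).

def cut_sets(node):
    """Expand a tree into cut sets (not yet minimal)."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any child's cut set is a cut set of the OR gate.
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":
        # Take one cut set from every child and merge them.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate type: {gate}")

def minimal_cut_sets(node):
    """Remove duplicates and any cut set that strictly contains another one."""
    sets = set(cut_sets(node))
    minimal = [s for s in sets if not any(other < s for other in sets)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))

# Hypothetical tree: A alone, or A with B, or B with C.
tree = ("OR", ["A", ("AND", ["A", "B"]), ("AND", ["B", "C"])])

for cs in minimal_cut_sets(tree):
    print(len(cs), sorted(cs))
# {A, B} is absorbed by {A}; the minimal cut sets are {A} and {B, C}.
```

Sorting by size puts single-event cut sets first, which matches the priority ranking above.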

How to Perform a Fault Tree Analysis: 7 Steps

  1. Define the top event precisely. State the exact undesired outcome: “Main landing gear fails to extend on approach”, not “landing gear problem”. Vague top events produce vague trees. Include the system boundary and operating conditions.
  2. Assemble a cross-functional team. Include design engineers, operators, maintenance staff, and safety specialists. FTA done alone misses failure modes that practitioners know from experience.
  3. Identify the immediate causes of the top event. Ask: what direct conditions would cause this? Connect them with AND or OR as appropriate. This is the most judgement-intensive step — getting the gate type wrong changes the entire risk picture.
  4. Decompose recursively until you reach basic events. For each intermediate event, repeat step 3. Keep decomposing until you reach component-level failures or human errors that cannot be broken down further. Typical fault trees have 3–7 levels.
  5. Identify all minimal cut sets. Work through the tree systematically using Boolean algebra or software to find every minimal combination of basic events that causes the top event. Flag all single-event MCSs as single points of failure.
  6. Perform quantitative analysis if data is available. Assign failure probabilities to basic events (from reliability data, field history, or engineering estimates). Calculate top event probability using the cut set probabilities. This step is optional but required for safety certifications.
  7. Define corrective actions and maintain the tree. For each high-priority MCS: add redundancy, strengthen detection, or eliminate the failure path. Document the revised tree. Update the fault tree whenever the design, process, or operating environment changes.
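Steps 1 to 4 produce a tree that is easy to hold as plain data. The sketch below shows one possible representation, assuming a nested-tuple format; every event below the landing-gear top event is hypothetical and chosen only for illustration. It collects the basic events and measures the depth of the tree, which gives a quick sanity check against the typical 3 to 7 levels.

```python
# A node is a basic-event name (str) or a tuple: (description, gate_type, [children]).
# All intermediate and basic events here are hypothetical.
tree = (
    "Main landing gear fails to extend on approach", "AND", [
        ("Normal extension fails", "OR", [
            "hydraulic_pressure_lost",
            "selector_valve_jammed",
        ]),
        ("Alternate extension fails", "OR", [
            "uplock_release_cable_broken",
            "crew_omits_alternate_procedure",
        ]),
    ],
)

def basic_events(node):
    """Collect the names of all basic events under a node."""
    if isinstance(node, str):
        return {node}
    _, gate, children = node
    if gate not in ("AND", "OR"):
        raise ValueError(f"unknown gate type: {gate}")
    events = set()
    for child in children:
        events |= basic_events(child)
    return events

def depth(node):
    """Number of levels in the tree, counting basic events as the last level."""
    if isinstance(node, str):
        return 1
    return 1 + max(depth(child) for child in node[2])

print(sorted(basic_events(tree)))
print(depth(tree))  # 3 levels: top event, intermediate events, basic events
```

Keeping the description next to each gate makes the data reviewable by the cross-functional team without a diagramming tool.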

Fault Tree Example 1: Aircraft Engine Shutdown (Aerospace)

Top Event: In-flight engine shutdown (IFSD)

Context: Simplified fault tree for a turbofan engine in-flight shutdown event, used as part of SAE ARP4761 safety analysis for aircraft type certification.

Simplified Fault Tree: In-Flight Engine Shutdown

In-Flight Engine Shutdown (OR gate)
  Fuel supply failure (AND gate)
  Engine control failure (OR gate)
  Mechanical seizure (AND gate)

Fuel supply failure (AND gate): Requires primary fuel pump failure AND backup pump failure simultaneously — two-event MCS. Low probability due to redundancy.

Engine control failure (OR gate): In this simplified tree, FADEC Channel A failure OR Channel B failure OR loss of power to both channels each leads to shutdown, giving three separate single-event MCS paths. Each is a single point of failure and must achieve a target probability of < 10⁻⁷ per flight hour.

Mechanical seizure (AND gate): Requires bearing failure AND oil starvation together — two-event MCS.

Minimal cut sets

Single-event: FADEC Channel A failure alone causes IFSD
Single-event: FADEC Channel B failure alone causes IFSD
Single-event: Dual-channel power loss alone causes IFSD
Two-event: Primary pump failure + backup pump failure
Two-event: Bearing failure + oil starvation

Action: The three single-event FADEC MCSs drive the design to dual-channel architecture with independent power supplies and cross-channel monitoring. Each channel must independently meet the 10⁻⁷ per hour target.
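The cut sets for this example can be cross-checked by brute force. The sketch below writes the simplified tree as a Boolean function and tries combinations of failed events from smallest to largest, skipping any combination that already contains a known cut set. The event identifiers are shorthand invented for this sketch, and the approach only suits small trees built from AND and OR gates, since the number of combinations doubles with every basic event.

```python
from itertools import combinations

# Shorthand identifiers for the basic events in the simplified IFSD tree.
EVENTS = [
    "primary_pump", "backup_pump",
    "fadec_a", "fadec_b", "fadec_power",
    "bearing", "oil_starvation",
]

def ifsd(failed):
    """Top event logic as described in the example."""
    fuel_supply = "primary_pump" in failed and "backup_pump" in failed           # AND
    engine_control = any(e in failed for e in ("fadec_a", "fadec_b", "fadec_power"))  # OR
    seizure = "bearing" in failed and "oil_starvation" in failed                # AND
    return fuel_supply or engine_control or seizure                             # OR

def minimal_cut_sets(events, top):
    """Enumerate combinations by increasing size; keep only minimal ones."""
    found = []
    for size in range(1, len(events) + 1):
        for combo in combinations(events, size):
            candidate = frozenset(combo)
            if any(known <= candidate for known in found):
                continue  # contains a smaller cut set, so it is not minimal
            if top(candidate):
                found.append(candidate)
    return found

mcs = minimal_cut_sets(EVENTS, ifsd)
for cs in mcs:
    print(len(cs), sorted(cs))
```

The result is three single-event and two two-event cut sets, in line with the list above.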

Fault Tree Example 2: Chemical Tank Overflow (Process Safety)

Top Event: Hazardous chemical tank overflow

Context: Process safety FTA for a storage tank containing a corrosive liquid at a chemical plant. The method follows IEC 61025, and FTA is one of the process hazard analysis techniques accepted under OSHA Process Safety Management (PSM). The tank has two independent high-level shutdown systems.

Top event decomposition (OR gate): Tank overflow occurs if either the fill system delivers too much product OR the overflow detection and shutdown fails to activate in time.

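Fill system delivers excess (OR gate): The fill valve sticks open OR the operator sets the fill rate incorrectly.

Detection and shutdown fails (AND gate): Both independent shutdown systems must fail simultaneously: level switch LS-101 fails AND level switch LS-102 fails.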

Minimal cut sets — Tank Overflow

Single-event: Fill valve sticks open alone causes overflow
Single-event: Operator sets fill rate incorrectly alone causes overflow
Two-event: LS-101 fails + LS-102 fails simultaneously

Actions: Both single-event MCSs are addressed first: (1) add a solenoid valve with position feedback and a hard interlock on the fill valve; (2) add a flow meter with a totalizer alarm to detect fill rate errors before the tank overflows. For the two-event MCS: implement staggered testing of LS-101 and LS-102 on alternating weeks so that a latent failure in one switch is found before the other switch also fails.
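A quick way to see the effect of actions (1) and (2) is a single-point-of-failure scan: fail each basic event on its own and check whether the top event occurs. The sketch below follows the tree as written in this example. The events `interlock_fails` and `totalizer_alarm_fails` are hypothetical and represent one assumed way of modelling the new safeguards as extra inputs behind AND gates.

```python
# Single-point-of-failure scan on the tank overflow logic.
# `interlock_fails` and `totalizer_alarm_fails` are hypothetical events that
# model actions (1) and (2) as additional safeguards.

def overflow_before(failed):
    fill_excess = ("fill_valve_sticks_open" in failed
                   or "operator_sets_wrong_fill_rate" in failed)
    shutdown_fails = "ls_101_fails" in failed and "ls_102_fails" in failed
    return fill_excess or shutdown_fails

def overflow_after(failed):
    # Each fill failure now needs its new safeguard to fail as well.
    valve_path = "fill_valve_sticks_open" in failed and "interlock_fails" in failed
    operator_path = ("operator_sets_wrong_fill_rate" in failed
                     and "totalizer_alarm_fails" in failed)
    shutdown_fails = "ls_101_fails" in failed and "ls_102_fails" in failed
    return valve_path or operator_path or shutdown_fails

EVENTS = [
    "fill_valve_sticks_open", "operator_sets_wrong_fill_rate",
    "ls_101_fails", "ls_102_fails",
    "interlock_fails", "totalizer_alarm_fails",
]

def single_points_of_failure(top, events):
    """Return every basic event that triggers the top event on its own."""
    return [event for event in events if top({event})]

spof_before = single_points_of_failure(overflow_before, EVENTS)
spof_after = single_points_of_failure(overflow_after, EVENTS)
print(spof_before)  # the two fill-system events
print(spof_after)   # empty list: every remaining cut set needs two failures
```

Under these modelling assumptions, both single-event cut sets become two-event cut sets, which is the intended outcome of the corrective actions.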

Fault Tree Example 3: Web Application Data Breach (Software / Cybersecurity)

Top Event: Unauthorized access to customer PII database

Context: Security FTA for a SaaS application handling customer personally identifiable information (PII). Performed as part of a SOC 2 Type II preparation and threat modelling exercise.

Top event decomposition (OR gate): Unauthorized PII access occurs via external attack OR insider threat OR accidental exposure.

External attack path (AND gate): The attacker reaches the database (firewall misconfiguration) AND bypasses authentication (SQL injection vulnerability).

Insider threat path (OR gate): Malicious employee with DB access OR compromised contractor credential — both are single-event MCSs.

Accidental exposure (OR gate): DB credentials in public code repo OR unencrypted backup on public storage — both single-event MCSs.

Highest-priority minimal cut sets

Single-event: DB credentials committed to public GitHub repository
Single-event: Unencrypted DB backup in public S3 bucket
Single-event: Malicious employee with unrestricted DB access
Single-event: Compromised contractor credential (no MFA)
Two-event: Firewall misconfiguration + SQL injection vulnerability

Actions (in priority order): (1) Scan all repos with a secret detection tool and rotate all exposed credentials immediately; (2) audit S3 bucket ACLs and enforce encryption at rest; (3) implement least-privilege DB access with role separation; (4) enforce MFA on all contractor accounts; (5) deploy a WAF with an SQLi ruleset and schedule quarterly pen testing.
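Once the cut sets are known, ranking them is a matter of sorting by size. The sketch below applies the priority rule from the cut set ranking to this example; the identifiers and the `rank` helper are illustrative names, not part of any standard tool.

```python
# Cut sets from the data breach example, written as sets of shorthand identifiers.
cut_sets = [
    {"firewall_misconfiguration", "sql_injection_vulnerability"},
    {"db_credentials_in_public_repo"},
    {"unencrypted_backup_in_public_bucket"},
    {"malicious_employee_with_db_access"},
    {"compromised_contractor_credential"},
]

# Priority labels by cut set size; anything with three or more events is "lower".
PRIORITY = {1: "highest (single point of failure)", 2: "high"}

def rank(sets):
    """Sort cut sets by size and attach a priority label to each."""
    ordered = sorted(sets, key=len)  # stable sort keeps the input order within a size
    return [(PRIORITY.get(len(cs), "lower"), sorted(cs)) for cs in ordered]

ranked = rank(cut_sets)
for priority, events in ranked:
    print(f"{priority}: {' + '.join(events)}")
```

Size is only a first-pass ranking. Where probabilities are available, the quantitative step can reorder cut sets of the same size.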

FTA vs FMEA vs 5 Whys vs Fishbone

Tool | Direction | Timing | Best used for | Key strength
Fault Tree Analysis | Top-down | Proactive or reactive | Specific high-consequence event; combination failures; safety certification | Models AND logic, where multiple safeguards must all fail
FMEA | Bottom-up | Proactive | Comprehensive coverage of ALL failure modes in a product or process | Systematic: nothing slips through; produces prioritised RPN risk register
5 Whys | Linear (single chain) | Reactive | Single known failure; quick root cause in 15–30 min | Fast, no training required, works with small teams
Fishbone diagram | Lateral (categories) | Reactive | Complex problem; brainstorm all possible causes across departments | Visual, team-based, covers 6M or 6P categories comprehensively
Pareto Analysis | Statistical | Reactive / analytical | Prioritise which problem to investigate first using 80/20 rule | Data-driven prioritisation: tells you where to focus first

When to choose FTA specifically: Use FTA when (1) you are analysing a specific catastrophic or safety-critical event, (2) you need to model combinations of simultaneous failures where AND logic matters, (3) a safety standard requires it (SAE ARP4761, IEC 61025, ISO 26262, NRC PRA), or (4) you need to calculate failure probability from component reliability data. For everything else, 5 Whys or FMEA is faster and sufficient.

Quantitative FTA: Calculating Top Event Probability

When component failure rate data is available, FTA can produce a numerical probability for the top event. This is required in aerospace (FAA/EASA certification), nuclear (NRC probabilistic risk assessment), and automotive functional safety (ISO 26262).

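The calculation uses the minimal cut sets. Assuming the basic events are independent, the probability of a cut set is the product of its basic event probabilities, and when those probabilities are small the top event probability is approximately the sum of the cut set probabilities.

For a two-event MCS with P(A) = 10⁻³ and P(B) = 10⁻³ over the same one-hour exposure, the combined probability is 10⁻³ × 10⁻³ = 10⁻⁶ for that hour. This is how redundancy achieves the extreme reliability targets required in aviation and nuclear power.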
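The arithmetic can be sketched as follows, assuming independent basic events. The function names are illustrative, event `C` and its probability are hypothetical, and the two top-event formulas are the common rare-event approximation (sum of cut set probabilities) and the slightly tighter upper bound 1 − ∏(1 − P(MCS)).

```python
from math import prod

def cut_set_probability(cut_set, p):
    """Product of basic-event probabilities (assumes independent events)."""
    return prod(p[event] for event in cut_set)

def top_event_rare_approx(cut_sets, p):
    """Rare-event approximation: sum of the cut set probabilities."""
    return sum(cut_set_probability(cs, p) for cs in cut_sets)

def top_event_upper_bound(cut_sets, p):
    """Upper bound that treats the cut sets as if they were independent."""
    return 1.0 - prod(1.0 - cut_set_probability(cs, p) for cs in cut_sets)

# A and B use the 1e-3 values from the text; C is a hypothetical single-event MCS.
p = {"A": 1e-3, "B": 1e-3, "C": 1e-5}
cut_sets = [{"A", "B"}, {"C"}]

print(cut_set_probability({"A", "B"}, p))      # about 1e-6
print(top_event_rare_approx(cut_sets, p))      # about 1.1e-5
print(top_event_upper_bound(cut_sets, p))      # marginally below the sum
```

In this example the single-event cut set dominates the total even though its event is a hundred times less likely than A or B individually, which is why single points of failure are ranked first.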

For qualitative FTA (finding cut sets without probabilities), no special software is needed. For quantitative FTA with complex trees, tools like OpenFTA, CAFTA, or Relex are commonly used.

Common FTA Mistakes

Frequently Asked Questions

What is fault tree analysis?

Fault Tree Analysis (FTA) is a top-down, deductive failure analysis method that uses Boolean logic (AND and OR gates) to map all possible combinations of events that could lead to a specific undesired top event — such as a system failure, accident, or safety incident. Starting from the top event, you decompose it through logic gates down to basic events (component failures or human errors). Identifying the minimal cut sets — the smallest combinations of basic events that cause the top event — reveals single points of failure and guides corrective action.

What is the difference between an AND gate and an OR gate in FTA?

An OR gate outputs a failure if ANY one of its input events occurs. It represents parallel failure paths where one failure alone is sufficient — these are higher risk because no redundancy protects the output. An AND gate outputs a failure only if ALL input events occur simultaneously — it represents true redundancy where multiple independent safeguards must all fail at once. OR gates increase risk; AND gates contain it. Getting the gate type right is the most critical skill in FTA.

What is a minimal cut set in fault tree analysis?

A minimal cut set (MCS) is the smallest combination of basic events whose simultaneous occurrence causes the top event. Single-element MCSs are the most critical — they are single points of failure (one event alone triggers the top event). Two-element MCSs require two simultaneous failures. Identifying MCSs tells you exactly where to focus risk reduction: always eliminate single-element MCSs first by adding redundancy or removing the failure mode.

When should you use fault tree analysis instead of FMEA or 5 Whys?

Use FTA when: (1) you need to analyse a specific high-consequence event (fire, crash, data breach), (2) combination failures matter — where multiple safeguards must fail simultaneously, (3) you need to calculate the probability of the top event from component failure rates, or (4) a safety standard requires it (SAE ARP4761, IEC 61025, ISO 26262). Use FMEA for comprehensive coverage of all failure modes. Use 5 Whys when a single failure has already occurred and you need the root cause quickly.

What industries use fault tree analysis?

FTA is mandatory or widely used in: aerospace (SAE ARP4761 for aircraft certification, FAA/EASA type certification), nuclear power (NRC probabilistic risk assessment), chemical and process industries (IEC 61025, OSHA PSM), automotive (ISO 26262 functional safety for electronic systems), medical devices (IEC 62304, ISO 14971), and defence (system safety programmes under MIL-STD-882; MIL-STD-1629A covers the complementary FMECA method). It is increasingly applied in software, cybersecurity, and financial risk modelling.

What is the difference between FTA and FMEA?

FTA is top-down: you start with one specific undesired top event and work downward to find all combinations of causes. FMEA is bottom-up: you start with every possible component failure and work upward to determine its effect on the system. FTA is better for modelling failure combinations (AND gate logic) and calculating system-level failure probability. FMEA is better for comprehensive coverage — ensuring no failure mode is missed. In safety-critical systems like aircraft and nuclear plants, both are typically required.

Can fault tree analysis be done without specialised software?

Yes — for qualitative FTA (identifying cut sets and logic structure), you can draw a fault tree on paper, a whiteboard, or any diagramming tool (Visio, Lucidchart, draw.io). Manual FTA is standard practice for team-based analysis and is entirely sufficient for most root cause analysis and safety review purposes. Quantitative FTA (calculating top event probability from component failure rates for large, complex trees) benefits from dedicated tools like OpenFTA, CAFTA, or Isograph. For learning and everyday quality work, manual FTA on a whiteboard is the right starting point.
