
6 Failure Patterns You Should Understand as a Maintenance Engineer


In the 1960s, maintenance programs were designed around the idea that equipment components had expected lifetimes. The approach involved replacing or overhauling these components at set intervals to avoid failures, leading to extensive maintenance activities such as inspections, rebuilds, and overhauls that required disassembling many parts. Despite these efforts, equipment failure rates remained high.


In 1978, Stanley Nowlan and Howard Heap of United Airlines (UAL) published a report titled "Reliability-Centered Maintenance (RCM)" that revolutionized the industry. Its findings were startling, and applying them significantly improved reliability, reduced maintenance costs, and enhanced equipment safety. The report was sponsored by the US Department of Defense (DOD) to serve as a guideline for implementing RCM on military equipment.


The United Airlines report disproved the old beliefs by showing that fixed-interval overhauls often did not enhance reliability; instead, they could lead to more failures. Nowlan and Heap's research showed that only about 11% of failures were linked to aircraft age (Patterns A, B, and C: 4% + 2% + 5%), while the majority were either random or caused by the very maintenance procedures meant to prevent them.


Patterns of Failure


The United Airlines report identified six unique failure patterns in equipment. Understanding these patterns can explain how decreasing maintenance activities can enhance efficiency and reduce costs.


The shape of the failure curve enables us to determine if the failure mode occurred during the early stages of life, due to random factors, or as a result of wear and aging. These curves are created by plotting the equipment's failure rate over time.
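As a rough illustration of how such a curve can be built, the sketch below estimates the failure rate in each age interval from a list of failure ages. The function name `failure_rate_by_interval`, the ten failure ages, and the one-year interval width are all made up for this example, and it assumes every unit runs until it fails (no units are withdrawn early).

```python
def failure_rate_by_interval(failure_ages, interval_width, n_intervals):
    """Estimate the failure rate in each age interval.

    Rate = failures in the interval / units still working at its start.
    Assumes every unit runs until failure (no censored data).
    """
    rates = []
    for i in range(n_intervals):
        start = i * interval_width
        end = start + interval_width
        # Units that had not yet failed when the interval began.
        at_risk = sum(1 for age in failure_ages if age >= start)
        # Units that failed inside this interval.
        failed = sum(1 for age in failure_ages if start <= age < end)
        rates.append(failed / at_risk if at_risk else 0.0)
    return rates


# Hypothetical failure ages (in years) for ten identical units.
ages = [0.2, 0.5, 0.8, 3.1, 5.4, 7.7, 8.2, 8.6, 8.9, 9.3]
rates = failure_rate_by_interval(ages, interval_width=1.0, n_intervals=10)
```

Plotting `rates` against the interval number gives the failure curve. In this made-up data the rate is high in the first year, low in the middle years, and climbs again at the end.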







How to Read Failure Curves

Assets can fail in different ways, and these can be illustrated using curves. Each curve shows the probability of failure over time: moving to the right along the curve represents the passage of time or the duration of equipment usage, and the further the curve is from the x-axis, the higher the probability of failure. Note whether the curve is rising (moving away from the x-axis), falling (approaching the x-axis), or flat, as this shows how the probability of failure changes over time.


The curves are made up of three possible phases:


Constant Failure Phase (Wealth Phase):

  • In this phase, the curve is flat and the failure rate is low, predictable, or both.

  • This is the phase in which the asset is used and the value (wealth) is derived from it


Infant Mortality Phase (Early Life Phase):

  • This phase occurs during equipment start up after installation

  • Possible causes of faults in this phase include design, installation, and shipment issues.


Wear-out Phase (End-of-Life or Breakdown Phase):

  • In this phase, failures increase sharply toward the end of the asset's useful life.

  • Maintenance departments often closely monitor what occurs to assets at this point.

  • Many organizations opt to trade in or dispose of equipment once it reaches this phase.

  • It is common for high-tech equipment to become obsolete before reaching this stage (for example, your desktop computer may become obsolete long before wearing out).




Pattern A: The Bathtub Curve

  • It is the best-known failure pattern, although not the most frequent one.

  • It combines the infant mortality, constant failure, and wear-out curves

  • The pattern involves an initial phase of infant mortality, followed by a period of constant failure probability, until the failure rate increases towards the end of the asset's useful life.

  • This pattern represents about 4% of failures.

  • Example:

    Trucks exhibit high failure rates initially due to labor and parts defects as well as inherent design flaws. Once these issues are eliminated, the vehicles enter a flat section of the curve until a critical system experiences significant wear. After that, the overall reliability of the vehicle decreases, and the number of maintenance incidents increases until complete failure takes place.
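One common way to sketch a bathtub shape (an assumption here, not something taken from the United Airlines report) is to add three Weibull failure rates: one that falls with age, one that is constant, and one that rises with age. The function names and every parameter value below are illustrative only, with time measured in years.

```python
def weibull_hazard(t, shape, scale):
    """Weibull failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three failure rates; all parameter values are illustrative."""
    infant = weibull_hazard(t, shape=0.5, scale=50.0)    # falls with age
    random_ = weibull_hazard(t, shape=1.0, scale=20.0)   # constant
    wear_out = weibull_hazard(t, shape=5.0, scale=12.0)  # rises with age
    return infant + random_ + wear_out


# Failure rate shortly after start-up, in mid-life, and late in life.
early, middle, late = (bathtub_hazard(t) for t in (0.1, 5.0, 15.0))
```

With these numbers the mid-life rate is lower than both the early and the late rate, which is the bathtub shape described above.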





Pattern B: The Wear-out Curve

  • The probability of failure is low and roughly constant until near the end of the life cycle, then it increases rapidly. It is similar to the bathtub curve, but without the high infant mortality rate.

  • Replacing parts before the expected wear-out phase improves an asset’s reliability.

  • This type of failure is expected for about 2% of all components

  • Example: This pattern is typical of mechanical systems that wear until they reach a certain point, after which they are at significant risk of failure.
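To see why planned replacement helps a part that wears out but not one that fails at random, the sketch below compares the chance of getting through two operating intervals with and without a replacement at the midpoint. It assumes a Weibull life model, which is a simplification chosen for this example; the function names, the shape and scale values, and the five-year interval are all hypothetical.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability of surviving to age t: R(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def survival_over_two_intervals(interval, shape, scale, replace_midway):
    """Chance of getting through two intervals without a failure."""
    if replace_midway:
        # A new part starts each interval, so the two chances multiply.
        return weibull_reliability(interval, shape, scale) ** 2
    return weibull_reliability(2 * interval, shape, scale)


# Wear-out part (shape 4): replacing at the midpoint helps a lot.
worn_kept = survival_over_two_intervals(5.0, 4.0, 10.0, replace_midway=False)
worn_swapped = survival_over_two_intervals(5.0, 4.0, 10.0, replace_midway=True)

# Random-failure part (shape 1): replacing makes no difference.
rand_kept = survival_over_two_intervals(5.0, 1.0, 10.0, replace_midway=False)
rand_swapped = survival_over_two_intervals(5.0, 1.0, 10.0, replace_midway=True)
```

For the wear-out part, `worn_swapped` is much higher than `worn_kept`. For the random-failure part, the two values are equal, so the replacement only adds cost.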




Pattern C: The Fatigue Curve (Increasing)

  • The probability of failure slowly increases over time or utilization. There is no defined break-point before which you can plan replacement.

  • As there is no clearly defined wear-out point, part replacement is determined when the probability has reached a point unacceptable to your operation. Most change-outs occur after 67% or 75% of life.

  • This effect is common for items that are subject to direct wear.

  • This pattern applies to approximately 5% of components.
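The idea of replacing a part once its failure probability becomes unacceptable can be sketched as follows. The example assumes a failure rate that rises in a straight line with age, h(t) = base_rate + slope * t, which is only one possible model. The function names, the rates, and the 5% monthly risk limit are hypothetical values picked for illustration.

```python
import math


def conditional_failure_probability(age, period, base_rate, slope):
    """Chance of failing within `period`, given survival to `age`.

    Assumes a linearly rising failure rate h(t) = base_rate + slope * t.
    """
    def cumulative_hazard(t):
        return base_rate * t + 0.5 * slope * t * t

    increase = cumulative_hazard(age + period) - cumulative_hazard(age)
    return 1.0 - math.exp(-increase)


def replacement_age(limit, period, base_rate, slope, max_age=1000):
    """First age (in whole periods) at which the risk exceeds `limit`."""
    for age in range(0, max_age, period):
        risk = conditional_failure_probability(age, period, base_rate, slope)
        if risk > limit:
            return age
    return None  # the limit is never exceeded before max_age


# Illustrative numbers: ages in months, risk limit of 5% per month.
age = replacement_age(limit=0.05, period=1, base_rate=0.01, slope=0.001)
```

With these made-up numbers the monthly risk first exceeds the 5% limit at month 41, so that is where this operation would plan the change-out. A stricter limit moves the replacement earlier; a looser one moves it later.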




Pattern D: The Break-in Curve (Increasing then stable)

  • The probability of failure increases rapidly and then settles at a constant conditional probability for the remainder of the component’s life.

  • This pattern reflects about 7% of all failures.

  • Example: An electric heating element in a hot water heater. The probability of failure increases when the unit is turned on and then stabilizes at a random level.




Pattern E: The Random Curve

  • The probability of failure remains the same across all time periods (e.g., the chance of failure in month 109 is equal to the chance of failure in month 23). Failures may occur due to unexpected or random events.

  • This means failure can happen at any point in the asset's life. This pattern accounts for 14% of all failures.

  • Example: This particular curve is frequently observed in electronics and systems that become obsolete before wearing out, as the curve appears flat during the period of interest.
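The month 23 versus month 109 comparison can be checked with a short calculation. A constant failure rate corresponds to the exponential survival curve R(t) = exp(-rate * t); the function name and the 2% per month rate below are assumptions made for this example.

```python
import math


def monthly_failure_probability(month, rate_per_month):
    """Chance of failing during `month`, given survival to its start.

    With a constant failure rate, survival is R(t) = exp(-rate * t).
    """
    survive_to_start = math.exp(-rate_per_month * (month - 1))
    survive_to_end = math.exp(-rate_per_month * month)
    return (survive_to_start - survive_to_end) / survive_to_start


p23 = monthly_failure_probability(23, rate_per_month=0.02)
p109 = monthly_failure_probability(109, rate_per_month=0.02)
```

Both values come out the same (just under 2%), because an item that has survived this far is no more likely to fail next month than a new one. This is why replacing such an item on a schedule does not improve reliability.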




Pattern F: The Infant-Mortality Curve

  • Initially, the failure rate is high but then stabilizes at a constant level for the rest of its life.

  • Defects, production errors, and human mistakes often contribute to this high rate of early failures.

  • This pattern is widespread and accounts for 68% of all failures.

  • Most complex systems experience a surge in failures at the beginning due to material or workmanship defects. Manufacturers address this by offering warranties that cover these initial failures.



Conclusion


1- Age-related Failures:

  • Out of the six failure patterns mentioned, only three (A, B, and C) show a clear correlation between aging or time passing and a higher likelihood of failure.

  • These three patterns can be classified as "age-related" failures, accounting for around 11% of all failures (4% + 2% + 5%). In these cases, replacing components after a specific period benefits machine and asset reliability.


2- Random Failures:

  • The remaining patterns (D, E, and F) account for approximately 89% of all failures (7% + 14% + 68%) and occur randomly with respect to age. To address this prevalent issue and enhance asset reliability despite the unpredictable nature of these failures, we apply the P-F model, which will be the subject of an upcoming article.



Thank you for dropping by!

I'm Abdelrahman Shaker, an electrical engineer specializing in maintenance since 2018. On the blog, I'll be sharing crucial information, valuable experiences, and insightful tips about engineering, productivity, and various other topics. Join me as we explore and delve into the world of engineering and beyond
