Friday, September 4, 2009

Reliability and Reliability Engineering (Part 1)

In my previous entry, 8 Dimensions of Quality, I noted that Dr. David A. Garvin identified Reliability as one of the key strategic objectives that quality practitioners can focus on. In this series I will discuss Reliability Management and Reliability Engineering in more detail. I will depend heavily on two great books, the “Quality Engineering Handbook” by Thomas Pyzdek (second edition, 2003, Marcel Dekker, Inc.) and of course “Juran’s Quality Handbook” by Joseph Juran and Blanton Godfrey (fifth edition, 1999, McGraw-Hill). I will also incorporate some examples of how to execute Reliability Analysis in JMP. I am using JMP 7.0.1 for this and will use its documentation as a reference.


Definition of Reliability


In the entry 8 Dimensions of Quality, I differentiated quality and reliability by an analogy. If quality is a snapshot of the goodness of a product at a certain point in time, reliability is the consistency of that level of quality as time goes on. Thus, if quality is a photograph, reliability is a video. For a formal definition, however, we will quote from Pyzdek’s book:
“…reliability is defined as the probability that a product or system will perform a specified function for a specified time without failure.”
In addition, Pyzdek cautions that for a reliability figure to be meaningful, it has to be defined within the context of specific operating conditions. You do not expect a laptop to work at the same level of quality when it is submerged in water as when it is used in a room environment. For that exact reason, a warranty is limited to a pre-defined correct way of using a product. In the same manner, warranty claims are first checked for validity by comparing the conditions under which the product failed against the conditions stipulated in the warranty.
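
To make the definition concrete, here is a minimal Python sketch (not JMP) that estimates reliability at a specified time as the fraction of tested units still working at that time. The failure times are hypothetical numbers I made up for illustration, and the function name empirical_reliability is my own. It assumes every unit was run under the same operating conditions and was observed until it failed.

```python
def empirical_reliability(failure_times, t):
    """Estimate R(t): the fraction of tested units that survive beyond time t."""
    if not failure_times:
        raise ValueError("need at least one observed failure time")
    survivors = sum(1 for ft in failure_times if ft > t)
    return survivors / len(failure_times)


# Hypothetical hours-to-failure for ten units tested under identical conditions.
hours_to_failure = [120, 340, 560, 610, 790, 820, 950, 1010, 1200, 1500]

r_500 = empirical_reliability(hours_to_failure, 500)    # 8 of 10 units survive
r_1000 = empirical_reliability(hours_to_failure, 1000)  # 3 of 10 units survive
print(f"R(500 h)  = {r_500:.2f}")
print(f"R(1000 h) = {r_1000:.2f}")
```

Note that the estimate only makes sense for the conditions under which the units were tested, which is exactly the caution Pyzdek gives.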


Key Measures of Reliability


The following is a list of the common measures of reliability.
  • Mean Time to Fail (MTTF) or Mean Time to First Failure (MTFF) – This applies to products (or systems in general) that cannot be repaired once they fail or break down. When buying a light bulb, for example, we assess how long it will last before it finally burns out. The longer the time before it fails, the better we say that light bulb is.
  • Mean Time between Failures (MTBF) – This applies to products (or systems in general) that break down but can be repaired and returned to use. MTBF is defined as the average exposure a product accumulates before it fails again. This exposure may be measured in units of time or in counts of usage. Within the context of the unit used, the higher the value of MTBF, the better the reliability is. Thus a machine that breaks down 20 days after a repair is better than one that lasts only half a day. In the same way, an oven that has to be recalibrated with a thermocouple every 20 uses is much worse than an oven that needs recalibration only after every 500 uses. The failure times behind both MTBF and MTTF can often be modelled satisfactorily by an Exponential distribution or by its generalized form, the Weibull distribution.
  • Failure Rate – This is the value of 1/MTBF. It is defined as the number of failures per unit of exposure. From the example above, an MTBF of 20 days means a failure rate of 1 failure per 20 days, or 0.05 failures per day. Failure rate is important because, when it is roughly constant, the number of failures in a given period can often be modelled by a Poisson distribution, which simplifies the analysis.
  • Mean Time to Repair (MTTR) – This measures the average amount of time the product, tool, or system in general is down. It is defined as the elapsed time between the occurrence of a failure and re-endorsement for use. Because of this definition, MTTR is only applicable to repairable entities.
  • Availability – This is defined as the proportion of time a product, tool, or system in general is in a usable or operable state. It is therefore the ratio of the time it is not under repair to the total time. Thus, Availability = MTBF / (MTBF + MTTR).
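
The measures in the list above can be tied together with a short Python sketch. The uptime and downtime figures are hypothetical, chosen so that the MTBF matches the 20-day example, and all function names are my own. The Poisson and Exponential calculations assume a constant failure rate, which is an assumption and not something that holds for every product.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def failure_rate(mtbf):
    """Failures per unit of exposure, i.e. 1/MTBF."""
    return 1.0 / mtbf


def availability(mtbf, mttr):
    """Proportion of time in an operable state: MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def poisson_probability(k, rate, exposure):
    """P(exactly k failures in `exposure`), assuming a constant failure rate."""
    expected = rate * exposure
    return math.exp(-expected) * expected ** k / math.factorial(k)


def exponential_reliability(t, mtbf):
    """P(no failure up to time t), assuming exponential times between failures."""
    return math.exp(-t / mtbf)


# Hypothetical maintenance log for one machine, in days.
days_between_failures = [18, 22, 25, 15]  # running time before each breakdown
days_under_repair = [0.5, 1.0, 0.25, 0.25]  # downtime after each breakdown

mtbf = mean(days_between_failures)  # 20.0 days
mttr = mean(days_under_repair)      # 0.5 days
rate = failure_rate(mtbf)           # 0.05 failures per day

print(f"MTBF         = {mtbf:.1f} days")
print(f"MTTR         = {mttr:.2f} days")
print(f"Failure rate = {rate:.3f} failures per day")
print(f"Availability = {availability(mtbf, mttr):.4f}")
print(f"P(no failure in 20 days) = {poisson_probability(0, rate, 20):.4f}")
print(f"R(20 days)               = {exponential_reliability(20, mtbf):.4f}")
```

The last two lines print the same value. Under a constant failure rate, the Poisson probability of zero failures in a period and the Exponential probability of surviving that period are the same quantity, which is why the two distributions are usually mentioned together.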


The Life Cycle Model


Most failures of a product occur either at the early stage (in cases where there is an issue in the manufacturing process) or at the late stage, when wear, tear, and degradation begin to take effect. This concept is called the System Life Cycle, usually modelled by what is known as the Bathtub Curve. A typical Bathtub Curve is shown below.



The Bathtub Curve above is produced by a Beta Distribution. In practice, though, the infant stage is modelled by a Weibull distribution. That is, we can see the Bathtub Curve as an overlap of Weibull distributions, one of which is for the Infant Stage, as shown below:
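
One way to read this "overlap" numerically is to add up Weibull hazard (failure rate) curves, one per stage. The Python sketch below does that with shape and scale values that I picked purely for illustration. They are assumptions, not values taken from the books or from JMP. A shape below 1 gives a decreasing failure rate (infant stage), a shape of 1 gives a constant rate, and a shape above 1 gives an increasing rate (wear-out).

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three Weibull hazards; the parameters are illustrative assumptions."""
    infant = weibull_hazard(t, shape=0.5, scale=100.0)     # decreasing rate
    constant = weibull_hazard(t, shape=1.0, scale=200.0)   # flat rate
    wear_out = weibull_hazard(t, shape=5.0, scale=1000.0)  # increasing rate
    return infant + constant + wear_out


# Evaluate the combined failure rate at a few hypothetical ages (hours).
for age in (1, 50, 300, 800, 1500):
    print(f"t = {age:>5} h   failure rate = {bathtub_hazard(age):.5f} per hour")
```

The printed rate starts high, drops to a flat middle section, and rises again at the end, which is the bathtub shape.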





..to be continued
