mean time between failures

• Mean Time Between Failures https://store.theartofservice.com/the-mean-time-between-failures-toolkit.html

Upload: nigel-lane

Post on 28-Dec-2015


Category:

Documents



TRANSCRIPT

Page 1: Mean Time Between Failures

• Mean Time Between Failures


Page 2: Mean Time Between Failures

History of computing hardware - Second generation: transistors

Problems with the reliability of early batches of point contact and alloyed junction transistors meant that the machine's mean time between failures was about 90 minutes, but this improved once the more reliable bipolar junction transistors became available.

Page 3: Mean Time Between Failures

Time-multiplexed optical shutter - General features

Mean time between failures: The first component expected to fail in a TMOS display is the illumination system. LEDs usually have a 100,000-hour MTBF under continuous operation; as TMOS uses LEDs at a 1/3 duty cycle, the maximum expected MTBF is 300,000 hours.
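
The duty-cycle arithmetic can be sketched as follows. This is a minimal illustration, assuming (as the excerpt implicitly does) that LED wear accrues only while the LED is on, so the rated MTBF scales inversely with the duty cycle. The function name is hypothetical.

```python
def mtbf_at_duty_cycle(rated_mtbf_hours: float, duty_cycle: float) -> float:
    """Scale a continuous-operation MTBF by the fraction of time the part is on.

    Assumption: wear accrues only during on-time, so MTBF grows as 1 / duty_cycle.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    return rated_mtbf_hours / duty_cycle


tmos_mtbf = mtbf_at_duty_cycle(100_000, 1 / 3)
print(round(tmos_mtbf))  # 300000
```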

Page 4: Mean Time Between Failures

Time-multiplexed optical shutter - Advantages

Mean Time Between Failures: TMOS life could reach 300,000 hours, exceeding the 10,000 hours of OLED, 30,000 hours of plasma displays, 40,000 hours of CRTs and the 100,000 hours of LCD.

Page 5: Mean Time Between Failures

Parallel computing - Application checkpointing

As a computer system grows in complexity, the mean time between failures usually decreases.
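
One way to see why MTBF drops as a system grows: if the system fails whenever any single component fails, and components fail independently at constant rates (both are modelling assumptions, not claims from the excerpt), the failure rates add. A minimal sketch, with a hypothetical function name:

```python
def series_system_mtbf(component_mtbfs_hours):
    """MTBF of a system that fails when any one component fails.

    Assumption: independent components with constant failure rates,
    so the rates (1 / MTBF) simply add.
    """
    total_failure_rate = sum(1.0 / m for m in component_mtbfs_hours)
    return 1.0 / total_failure_rate


one_node = series_system_mtbf([100_000])
thousand_nodes = series_system_mtbf([100_000] * 1000)
print(one_node, thousand_nodes)
```

Under these assumptions, a thousand nodes rated at 100,000 hours each give a system MTBF of roughly 100 hours.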

Page 6: Mean Time Between Failures

Service-level agreement

In this case the SLA will typically have a technical definition in terms of mean time between failures (MTBF), mean time to repair or mean time to recovery (MTTR); various data rates; throughput; jitter; or similar measurable details.

Page 7: Mean Time Between Failures

Solar micro-inverter - Micro-inverters

Mean time between failures (MTBF) figures are quoted in hundreds of years. [http://www.enphaseenergy.com/downloads/Enphase_M190_Datasheet.pdf Enphase Microinverter M190], Enphase Energy

Page 8: Mean Time Between Failures

Life expectancy

The term life expectancy may also be used in the context of manufactured objects, although the related term shelf life is used for consumer products and the terms mean time to breakdown (MTTB) and mean time before failures (MTBF) are used in engineering.

Page 9: Mean Time Between Failures

Parallel programming - Application checkpointing

As a computer system grows in complexity, the mean time between failures usually decreases.

Page 10: Mean Time Between Failures

SSD - Flash-based SSDs

These applications require the exceptional mean time between failures (MTBF) rates that solid-state drives achieve, by virtue of their ability to withstand extreme shock, vibration and temperature ranges.

Page 11: Mean Time Between Failures

Computer power supply - Life span

Life span is usually measured in mean time between failures (MTBF). Higher MTBF ratings are preferable for longer device life and reliability. Quality construction using industrial-grade electrical components, or a larger or higher-speed fan, can contribute to a higher MTBF rating by keeping critical components cool. Overheating is a major cause of PSU failure. A calculated MTBF value of 100,000 hours (about 11 years of continuous operation) is fairly common.

Page 12: Mean Time Between Failures

Maintenance, repair, and operations - MRO software

Reliability data: MTBF (mean time between failures), MTTB (mean time to breakdown), MTBR (mean time between removals),

Page 13: Mean Time Between Failures

Glossary of fuel cell terms - Mean time between failures

Mean time between failures (MTBF) is the mean (average) time between failures of a system, and is often attributed to the useful life of the device, i.e., not including 'infant mortality' or 'end of life' if the device is not repairable.

Page 14: Mean Time Between Failures

Service life

Service life is different from a predicted life, or MTTF/MTBF (mean time to failure/mean time between failures)/MFOP (maintenance-free operating period).

Page 15: Mean Time Between Failures

Service-level agreement

In this case the SLA will typically have a technical definition in terms of mean time between failures (MTBF), mean time to repair or mean time to recovery (MTTR); various data rates; throughput; jitter; or similar measurable details.

Page 16: Mean Time Between Failures

Pump - Pump repairs

Examining pump repair records and MTBF (mean time between failures) is of great importance to responsible and conscientious pump users.

Page 17: Mean Time Between Failures

Ring laser gyroscope - Description

Many tens of thousands of RLGs are operating in inertial navigation systems and have established high accuracy, with better than 0.01°/hour bias uncertainty, and mean time between failures in excess of 60,000 hours.

Page 18: Mean Time Between Failures

Functional safety - Achieving Functional Safety

4. Verification that the system meets the assigned SIL, ASIL, PL or agPL by determining the Mean Time Between Failures and the Safe Failure Fraction (SFF), along with appropriate tests. The Safe Failure Fraction is the probability of the system failing in a safe state: the dangerous (or critical) states are identified from a Failure Mode and Effects Analysis (FMEA) or a Failure Mode, Effects, and Criticality Analysis (FMECA) of the system.

Page 19: Mean Time Between Failures

Fault-tolerant system - Examples

Hardware fault-tolerance sometimes requires that broken parts can be taken out and replaced with new parts while the system is still operational (in computing known as hot swapping). Such a system implemented with a single backup is known as 'single point tolerant', and represents the vast majority of fault-tolerant systems. In such systems the mean time between failures should be long enough for the operators to have time to fix the broken devices (mean time to repair).

Page 20: Mean Time Between Failures

Logistic engineering

Logistics engineers work with complex mathematical models that consider elements such as mean time between failures (MTBF), mean time to failure (MTTF), mean time to repair (MTTR), failure mode and effects analysis (FMEA), statistical distributions, queueing theory, and a host of other considerations.

Page 21: Mean Time Between Failures

SCADA - Operational philosophy

The reliability of such systems can be calculated statistically and is stated as the mean time to failure, which is a variant of Mean Time Between Failures (MTBF).

Page 22: Mean Time Between Failures

Monte Carlo method - Engineering

In reliability engineering, one can use Monte Carlo simulation to generate mean time between failures and mean time to repair for components.
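
A minimal sketch of such a simulation follows. It assumes exponentially distributed up-times and repair times (a common but not universal modelling choice, and not something the excerpt specifies). The parameter values and function name are illustrative only.

```python
import random


def simulate_mtbf_mttr(true_mtbf, true_mttr, n_cycles, seed=0):
    """Simulate n_cycles of run-then-repair and return the sample means.

    Assumption: up-times and repair times are exponentially distributed.
    """
    rng = random.Random(seed)
    uptimes = [rng.expovariate(1.0 / true_mtbf) for _ in range(n_cycles)]
    repairs = [rng.expovariate(1.0 / true_mttr) for _ in range(n_cycles)]
    return sum(uptimes) / n_cycles, sum(repairs) / n_cycles


est_mtbf, est_mttr = simulate_mtbf_mttr(1_000.0, 8.0, n_cycles=50_000)
print(f"estimated MTBF ~ {est_mtbf:.0f} h, MTTR ~ {est_mttr:.1f} h")
```

With enough simulated cycles the sample means settle close to the parameters fed in. In practice the same loop would be driven by per-component distributions fitted to field data.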

Page 23: Mean Time Between Failures

High availability - System design for high availability

Zero downtime system design means that modeling and simulation indicate that mean time between failures significantly exceeds the period of time between planned maintenance, upgrade events, or system lifetime. Zero downtime involves massive redundancy, which is needed for some types of aircraft and for most kinds of communications satellite. The Global Positioning System is an example of a zero downtime system.

Page 24: Mean Time Between Failures

Data corruption - Overview

Hardware and software failure are the two main causes for data loss. Background radiation, head crashes, and aging or wear of the storage device fall into the former category, while software failure typically occurs due to bugs in the code.

Page 25: Mean Time Between Failures

Commodity hardware

At some point, the number of discrete systems in a cluster will be greater than the mean time between failures (MTBF) for any hardware platform, no matter how reliable, so fault tolerance must be built into the controlling software. [http://www.morganclaypool.com/doi/abs/10.2200/S00193ED1V01Y200905CAC006] [http://insidehpc.com/2008/06/02/google-fellow-sheds-some-light-on-infrastructure-robustness-in-face-of-failure] Purchases should be optimized on cost-per-unit-of-performance, not just absolute performance-per-CPU at any cost.

Page 26: Mean Time Between Failures

Eight dimensions of quality - Reliability

This dimension reflects the probability of a product malfunctioning or failing within a specified time period. Among the most common measures of reliability are the mean time to first failure, the mean time between failures, and the failure rate per unit time. Because these measures require a product to be in use for a specified period, they are more relevant to durable goods than to products and services that are consumed instantly.

Page 27: Mean Time Between Failures

Failure rate

In practice, the mean time between failures (MTBF, 1/λ) is often reported instead of the failure rate.

Page 28: Mean Time Between Failures

Balanced Automatics Recoil System - Direct impingement

These combined factors reduce service life of these parts, reliability, and mean time between failures. Major Thomas P

Page 29: Mean Time Between Failures

AirPort Time Capsule - Features

Apple states that the Hitachi Deskstar meets or exceeds the 1 million hours mean time between failures (MTBF) recommendation for server-grade hard drives. [http://db.tidbits.com/article/9479 Time Capsule Ships with Support for USB Drive Backups]

Page 30: Mean Time Between Failures

Entry-Level Power Supply Specification - Life span

Life span is usually specified in mean time between failures (MTBF), where higher MTBF ratings indicate longer device life and better reliability. Using higher quality electrical components at less than their maximum ratings or providing better cooling can contribute to a higher MTBF rating because lower stress and lower operating temperatures decrease component failure rates.

Page 31: Mean Time Between Failures

Mean time between failure

'Mean time between failures (MTBF)' is the predicted elapsed time between inherent failures of a system during operation. (Jones, James V., Integrated Logistics Support Handbook, page 4.2) MTBF can be calculated as the arithmetic mean (average) time between failures of a system.
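
The arithmetic-mean calculation can be sketched as below. The function name and the sample operating periods are hypothetical and serve only to show the computation.

```python
def mtbf_from_uptimes(uptimes_hours):
    """Arithmetic mean of observed operating periods between failures."""
    if not uptimes_hours:
        raise ValueError("need at least one observed interval")
    return sum(uptimes_hours) / len(uptimes_hours)


# Hypothetical log: hours of operation before each of four failures.
print(mtbf_from_uptimes([120.0, 95.0, 210.0, 175.0]))  # 150.0
```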

Page 32: Mean Time Between Failures

Lambda - Lower-case letter λ

Lambda denotes the failure rate of devices and systems in reliability theory, and it is measured in failure events per hour. Numerically, this lambda is also the reciprocal of the mean time between failures.

Page 33: Mean Time Between Failures

Reliability, availability and serviceability (computer hardware) - Definitions

Reliability can be characterized in terms of mean time between failures (MTBF), with reliability = exp(-t/MTBF).
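
The formula translates directly into code. The sketch below simply evaluates exp(-t/MTBF), which holds under a constant-failure-rate model. The function name is illustrative.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of running t_hours without failure, exp(-t / MTBF)."""
    return math.exp(-t_hours / mtbf_hours)


# At t equal to the MTBF, reliability is exp(-1), about 0.368.
print(f"{reliability(100_000, 100_000):.3f}")
```

One consequence worth noting: under this model only about 37% of units are still running when the elapsed time equals the MTBF, so MTBF is not a guaranteed lifetime.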

Page 34: Mean Time Between Failures

M60 machine gun - M60E2

This version achieved a mean time between failures of 1,669 during testing in the 1970s.

Page 35: Mean Time Between Failures

Life Expectancy Index

The term life expectancy is most often used in the context of human populations, but is also used in plant or animal ecology; life tables (also known as actuarial tables). The term life expectancy may also be used in the context of manufactured objects, although the related term shelf life is used for consumer products and the terms mean time to breakdown (MTTB) and mean time between failures (MTBF) are used in engineering.

Page 36: Mean Time Between Failures

Run Book Automation - Runbook Automation

According to Gartner, the growth of RBA has coincided with the need for IT operations executives to enhance IT operations efficiency measures, including reducing mean time to repair (MTTR), increasing mean time between failures (MTBF), and automating provisioning of IT resources.

Page 37: Mean Time Between Failures

Manchester computers - Transistor Computer

Problems with the reliability of early batches of transistors meant that the machine's mean time between failures was about 90 minutes, which improved once the more reliable junction transistors became available.

Page 38: Mean Time Between Failures

Peltier cooler - Construction

Has a long life, with mean time between failures (MTBF) exceeding 100,000 hours.

Page 39: Mean Time Between Failures

Commodity server

At some point, the number of discrete systems in a cluster will be greater than the mean time between failures (MTBF) for any hardware platform, no matter how reliable, so fault tolerance must be built into the controlling software. [http://www.morganclaypool.com/doi/abs/10.2200/S00193ED1V01Y200905CAC006] [http://insidehpc.com/2008/06/02/google-fellow-sheds-some-light-on-infrastructure-robustness-in-face-of-failure] Purchases should be optimized on cost-per-unit-of-performance, not just absolute performance-per-CPU at any cost.

Page 40: Mean Time Between Failures

Mean time to repair

MTTR is often part of a maintenance contract, where a system whose MTTR is 24 hours is generally more valuable than one whose MTTR is 7 days if mean time between failures is equal, because its Operational Availability is higher.
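
The comparison can be made concrete with the simplified steady-state formula availability = MTBF / (MTBF + MTTR). This is a sketch under that assumption; formal definitions of operational availability may also count logistics and administrative delays. The 1,000-hour MTBF is a made-up figure.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state fraction of time the system is up (simplified model)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


fast_repair = availability(1_000, 24)      # MTTR of 24 hours
slow_repair = availability(1_000, 7 * 24)  # MTTR of 7 days
print(f"{fast_repair:.3f} vs {slow_repair:.3f}")  # 0.977 vs 0.856
```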

Page 41: Mean Time Between Failures

Synchronizer

In electronics, whenever there is signal transfer between two systems operating at different frequencies, or at the same frequency with different phases, a synchronizer block is used as an interface so that the signal from the transmitter block is reliably interpreted by the receiver. The block usually uses metastability-hardened flip-flops offering single or double latency delays at the output. This block ensures that there is no metastability for a target MTBF, i.e., Mean Time Between Failures.

Page 42: Mean Time Between Failures

Fault tolerant system - Examples

In such systems the mean time between failures should be long enough for the operators to have time to fix the broken devices (mean time to repair) before the backup also fails.
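
As a rough sketch of why the ratio of the two quantities matters: assuming the backup fails at a constant rate (an exponential model the excerpt does not state), the chance that it fails during one repair window is 1 - exp(-MTTR/MTBF). The function name and figures are hypothetical.

```python
import math


def prob_backup_fails_during_repair(mtbf_hours, mttr_hours):
    """Chance the backup fails before the primary is repaired.

    Assumption: constant failure rate, repair takes exactly mttr_hours.
    """
    return 1.0 - math.exp(-mttr_hours / mtbf_hours)


p = prob_backup_fails_during_repair(mtbf_hours=10_000, mttr_hours=24)
print(f"{p:.4%}")
```

When MTBF is much larger than MTTR, this probability is approximately MTTR/MTBF, so halving the repair time roughly halves the exposure.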

Page 43: Mean Time Between Failures

Passenger car (rail) - Kawasaki

Kawasaki has been manufacturing passenger rail cars at its facility in Lincoln, Nebraska since 2001. Kawasaki's Lincoln plant has manufactured rail cars for MBTA, NYCT, PATH, and MNR, with cars that have led the way with the industry's best MTBF (Mean Time Between Failures). Kawasaki Rail Car was the first American rail car manufacturer to achieve the International Organization for Standardization ISO-9002 certification.

Page 44: Mean Time Between Failures

Water pump - Pump repairs

Examining pump repair records and mean time between failures (MTBF) is of great importance to responsible and conscientious pump users.

Page 45: Mean Time Between Failures

M109 howitzer - M109 KAWEST

New electrical system increases reliability (better than Mil STD 1245A, higher operational readiness, increased mean time between failures, fault-finding diagnostics with test equipment).

Page 46: Mean Time Between Failures

Hard disk failure - Causes

HDD manufacturers typically specify a mean time between failures (MTBF) or an Annualized Failure Rate (AFR), which are population statistics that cannot predict the behavior of an individual unit.

Page 47: Mean Time Between Failures

Hard disk failure - Metrics of failures

The mean time between failures (MTBF) of SATA drives is usually specified to be about 1.2 million hours (some drives, such as the Western Digital Raptor, are rated at 1.4 million hours MTBF), while SAS/FC drives are rated for upwards of 1.6 million hours.
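
To relate these MTBF ratings to an annualized failure rate, one common conversion is AFR = 1 - exp(-hours per year / MTBF). The sketch below assumes a constant failure rate and continuous operation; it is not a vendor-specified method, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days x 24 h of continuous operation


def annualized_failure_rate(mtbf_hours):
    """Fraction of a drive population expected to fail in one year of
    continuous operation, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


for mtbf in (1_200_000, 1_400_000, 1_600_000):
    print(f"{mtbf:>9,} h MTBF -> AFR {annualized_failure_rate(mtbf):.2%}")
```

Under this model, ratings above a million hours correspond to well under 1% of drives failing per year, which illustrates why the figure describes a population rather than the lifetime of any single drive.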

Page 48: Mean Time Between Failures

Charging handle

One issue is the mean time between failures due to metal fatigue.

Page 49: Mean Time Between Failures

Life expectancy at birth

Life expectancy is also used in plant or animal ecology; life tables (also known as actuarial tables). The term life expectancy may also be used in the context of manufactured objects, although the related term shelf life is used for consumer products and the terms mean time to breakdown (MTTB) and mean time between failures (MTBF) are used in engineering.

Page 50: Mean Time Between Failures

Water turbine - Time line

Around 1890, the modern fluid bearing was invented, now universally used to support heavy water turbine spindles. As of 2002, fluid bearings appear to have a mean time between failures of more than 1300 years.

Page 51: Mean Time Between Failures

Direct impingement - Evaluation

These combined factors reduce service life of these parts, reliability, and mean time between failures. Major Thomas P