The economics of repeated tube thickness surveys

John Price*

Department of Mechanical Engineering, Monash University, P.O. Box 197, 3145 Caulfield East, Melbourne, Vic., Australia

Abstract

The use of tube thickness surveys in boilers is an example of a commonly applied condition monitoring (CM) technique for maintenance, and it leads to condition-based maintenance (CBM) of the boiler tubes. There are, however, limits to the economics of this type of strategy which are frequently overlooked in discussion of CBM strategies.

This paper considers several models of maintenance strategies. It examines the conditions under which breakdown maintenance (BM), routine total replacement (routine maintenance, RM) and condition-based replacement (which for simplicity is referred to as CM) are economical. Some general rules about the economical range of each strategy are developed. The case study examines the use of ultrasonic testing of boiler tubes in power stations in some detail.

© 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Condition monitoring; Breakdown maintenance; Tube thickness measurement; Warning margin

1. Introduction

Many textbooks and papers discuss the basis for maintenance strategies (Higgins [1] for example). The choice of maintenance strategies involving breakdown maintenance (BM), routine maintenance (RM) and condition monitoring (CM) can be quite complicated and there have been some studies of this issue [2,3]. This paper uses simple statistical numerical models to consider the economic ranges of the various options. Assumptions for the models always include:

• each component is dominated by a single failure mode, and

• the maintenance activity is replacement and completely restores the equipment to original performance.

2. Models of maintenance strategies

2.1. The basic model

The first model considers the case where we have N

components which can fail in a particular mode causing a

cost of $X or be replaced (or repaired) before failure with a

cost of $Y. For our base example it will be assumed that

there are 1000 identical components and that the failure cost

is $100,000 while the replacement cost is $1000.

The model assumes that the components have a statistical distribution of lives. One such distribution might be the normal distribution, with a life expectancy (mean) of E years and a standard deviation of fE years, where f is a fraction of the mean. For example, let us assume the expected life (mean life) is 1 year and f is 0.1, so the standard deviation is 0.1 years. To give a feeling for the numbers involved, at 0.9 years 15.9% of the components would have failed. The curve is shown in Fig. 1.
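The 15.9% figure can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name and its defaults are illustrative and simply encode the base example (mean life 1 year, standard deviation equal to 0.1 of the mean).

```python
from statistics import NormalDist

def fraction_failed_normal(t, mean_life=1.0, sd_fraction=0.1):
    """Fraction of components failed by time t when lives are normally distributed."""
    # The standard deviation is expressed as a fraction of the mean life.
    lives = NormalDist(mu=mean_life, sigma=sd_fraction * mean_life)
    return lives.cdf(t)

# One standard deviation below the mean life.
print(f"{fraction_failed_normal(0.9):.1%}")  # prints 15.9%
```

The value is simply the standard normal probability of falling more than one standard deviation below the mean.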

Fig. 1 also shows a Weibull distribution of similar shape. The Weibull distribution has a cumulative distribution function of

$$F(t;\alpha,\beta) = 1 - e^{-(t/\beta)^{\alpha}}$$

where t is the time at which the cumulative number of failures is taken (years), β is the scale parameter, taken here as the mean time between failures (MTBF), and α is the shape factor.

The Weibull function was developed to fit the failure distribution of wear-out type failures. The matching of the two distributions shows that the normal distribution, which many people find easier to understand, can be made similar in shape to the Weibull distribution. The Weibull distribution is more correct, especially as the shape factor decreases (as the normal distribution’s standard deviation increases), since the Weibull distribution does not permit failures before zero time.
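The comparison between the two distributions can be tabulated directly. The sketch below assumes the Weibull scale parameter equals the quoted MTBF of 1 year and uses the parameter pairs given in the caption of Fig. 1; the helper name `weibull_cdf` is illustrative.

```python
import math
from statistics import NormalDist

def weibull_cdf(t, shape, scale):
    """Cumulative fraction failed by time t; no failures are possible before t = 0."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))

# (Weibull shape factor, normal SD) pairs from the caption of Fig. 1.
for shape, sd in [(10, 0.1), (5, 0.2)]:
    normal = NormalDist(mu=1.0, sigma=sd)
    for t in (0.0, 0.5, 0.9, 1.0):
        print(f"shape={shape:2d}  t={t:.1f}  "
              f"Weibull={weibull_cdf(t, shape, 1.0):.6f}  normal={normal.cdf(t):.6f}")
```

At t = 0 the normal distribution still assigns a small positive probability of failure, and that probability grows as the standard deviation increases, whereas the Weibull value is exactly zero.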

International Journal of Pressure Vessels and Piping 79 (2002) 555–559

PII: S0308-0161(02)00089-3

* Tel.: +61-3-9903-2868; fax: +61-3-9903-2766.

2.2. Breakdown and routine total replacement strategies (BM and RM)

First consider a total breakdown strategy. In this case the components would fail with a total cost of $NX. In our example $NX is 1000 × $100,000 and is much larger than the repair cost, $NY (1000 × $1000), so a breakdown strategy is not attractive. Note that if Y is very similar to X, which is the case for many situations (such as light bulbs), then a breakdown strategy can be satisfactory.

Now consider routine replacement. Routine replacement means that there would be total replacement of all the components with new components. This must occur significantly before the mean life expectancy of the component, at a time we shall designate as R years. A simple analysis gives the cost of the total replacement as $NY, and this averages out at an expenditure rate of $NY/R per year. There will, however, be an expected number of failures, n, before replacement occurs, and these will have a cost of $nX, or $nX/R per year. This cost must be added to the replacement costs, and the total cost for the replacement strategy is $(NY + nX)/R per year.

This cost has a minimum between zero and the MTBF. Two cases are shown in Fig. 2. As will be seen from Fig. 2, the minimum is significantly below the MTBF. The minimum cost point occurs earlier when there is a larger variation in the failure lives (smaller shape factor, α).
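The position of the minimum can be located numerically. The following sketch evaluates the cost rate (NY + nX)/R with n taken as N times the Weibull cumulative failure fraction at R, and finds the cheapest interval by a simple grid search. The function names, the grid step and the assumption that the Weibull scale equals the MTBF are illustrative choices, not part of the original model description.

```python
import math

def weibull_cdf(t, shape, scale=1.0):
    """Cumulative fraction of components failed by time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def rm_cost_rate(r, n=1000, x=100_000.0, y=1_000.0, shape=10.0, scale=1.0):
    """Average cost per year, (N*Y + n*X)/R, when everything is replaced every r years."""
    expected_failures = n * weibull_cdf(r, shape, scale)
    return (n * y + expected_failures * x) / r

def best_interval(step=0.001, **kwargs):
    """Grid search for the replacement interval with the lowest cost rate."""
    scale = kwargs.get("scale", 1.0)
    candidates = [i * step for i in range(1, int(1.5 * scale / step))]
    return min(candidates, key=lambda r: rm_cost_rate(r, **kwargs))

for shape in (10.0, 5.0):
    r = best_interval(shape=shape)
    n_fail = 1000 * weibull_cdf(r, shape)
    print(f"shape={shape:g}: replace every {r:.2f} years, {n_fail:.1f} failures expected, "
          f"${rm_cost_rate(r, shape=shape) / 1e6:.2f} million/year")
```

With the base example values this search lands close to the optima quoted in the caption of Fig. 2 (about 0.51 years and about 0.30 years).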

Fig. 3 shows that the time of minimum cost in the total replacement strategy is also influenced by the ratio of the replacement cost to the failure cost (f). As f approaches 1, the minimum cost point approaches the MTBF.
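The dependence on the cost ratio can be made explicit with a short approximate derivation. It assumes that the failure fraction at the optimum is small, so that the Weibull cumulative function can be replaced by its leading term, and it writes n = N F(R), f = Y/X, with α the shape factor and β the scale (taken as the MTBF). These steps are a sketch added for illustration, not a result stated in the paper.

```latex
\begin{align*}
C(R) &= \frac{N\,[\,Y + X\,F(R)\,]}{R}
      \approx N\left[\frac{Y}{R} + \frac{X\,R^{\alpha-1}}{\beta^{\alpha}}\right],
      \qquad F(R) \approx (R/\beta)^{\alpha} \ \text{for } R \ll \beta, \\
\frac{dC}{dR} &= 0
  \;\Longrightarrow\;
  \frac{Y}{R^{2}} = \frac{(\alpha-1)\,X\,R^{\alpha-2}}{\beta^{\alpha}}
  \;\Longrightarrow\;
  R^{*} \approx \beta\left(\frac{f}{\alpha-1}\right)^{1/\alpha},
  \qquad f = \frac{Y}{X}, \\
\alpha = 10,\ \beta = 1:&\quad
  f = 0.01 \Rightarrow R^{*} \approx 0.51, \qquad
  f = 0.1 \Rightarrow R^{*} \approx 0.64 .
\end{align*}
```

The two numerical values agree with the optima quoted for Figs. 2 and 3. The approximation becomes less accurate as f grows towards 1, because the failure fraction at the optimum is then no longer small.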

2.3. Condition monitoring

If a component can have its condition monitored then a third replacement strategy becomes possible. In this example we assume that the condition of a component can be measured by a signal which gives a warning of the failure of the component.

The signal from components being monitored can follow a range of possibilities. Sometimes the component will fail when the signal is relatively low (a ‘bad’ case), while sometimes the component will survive to higher levels of signal (a ‘good’ case). This range of possibilities is shown in Fig. 4.

Fig. 4 also shows that in order for the CM to work an alarm level must be set. When the signal exceeds the alarm level then the component is replaced. Clearly this alarm level must be set at a point lower than a bad case, so that components will be replaced before they fail.

There are also other factors affecting the positioning of the alarm signal level as shown in Fig. 4. These relate to the possibility that:

• there are errors in the signal itself,

• the signal will not be monitored continuously, or

• it is not always possible to replace the component immediately when the alarm signal is reached, due to operational constraints.

In these cases a safety margin must be set on the alarm signal, so that even given these problems, the required proportion of the components will be replaced before failure.
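One way to picture the safety margin is the sketch below. It is purely illustrative and rests on assumptions that are not in the paper: the signal level at which a component fails and the measurement error are taken to be independent and normally distributed, and all numbers are hypothetical values in arbitrary signal units. It covers only the effect of signal error, not intermittent monitoring or delayed replacement.

```python
from statistics import NormalDist

def alarm_level(mean_failure_signal, sd_failure_signal, sd_measurement, required_fraction):
    """Alarm level that is reached before failure for the required fraction of components.

    Hypothetical model: failure signal level and measurement error are
    independent normal variables, so their standard deviations add in quadrature.
    """
    combined_sd = (sd_failure_signal ** 2 + sd_measurement ** 2) ** 0.5
    # Margin below the mean failure signal, in combined standard deviations.
    z = NormalDist().inv_cdf(required_fraction)
    return mean_failure_signal - z * combined_sd

# Hypothetical numbers: mean failure signal 100, scatter 10, measurement error 5.
level = alarm_level(100.0, 10.0, 5.0, required_fraction=0.999)
print(f"alarm level = {level:.1f}")
```

In this sketch a larger measurement error or a higher required fraction pushes the alarm level down, which is the loss of useful life that the margin costs.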

Fig. 1. Comparison of normal and Weibull distributions. The normal distribution has a mean of 1 year and a standard deviation of 0.1 years, and the Weibull distribution has an MTBF of 1 year and a shape factor of 10. The dashed lines show the same distributions with an SD of 0.2 years and a shape factor of 5.

Fig. 2. There is an optimum time for routine replacement. The full curve is for components with a Weibull distribution of failure, a mean time between failures of 1 year and α = 10. The minimum cost is at about 0.51 years, at which point there is an expectation of 1.2 failures among 1000 components. This system costs an average of $2.2 million/year. The dashed curve is for an MTBF of 1 year and α = 5. The minimum is at about 0.30 years, after about 2.4 failures in 1000 components. The system costs an average of $4.13 million/year.

Fig. 3. The location of the minimum depends on the cost of replacement. In this case replacements cost $10,000. With the ratio f equal to 0.1 the minimum shifts to 0.64 years, with a minimum average cost of $17.2 million/year; 11.2 failures are expected at this point. The whole curve is now much more expensive because a replacement costs $10,000 instead of the $1000 of Fig. 2.


Given that these requirements on the alarm level for the

signal exist, it can be easily shown that many components

will be replaced very early. As is shown in Fig. 4, if there is

a scatter between the good and bad cases, then a significant

proportion of the life of the good cases will be lost. The

proportion of lost service increases and there is also lost

service even for the bad cases when a safety margin is set in

the alarm signal level.

This loss of service represents a penalty for using the CM

which can be very important. Also if CM cannot exclude the

possibility of in-service failures then these add to the cost of

using CM.

Let us consider two different CM situations. Once again we have 1000 components which cost $1000 to replace or $100,000 if they fail in service. The MTBF is 1 year and the shape factor is 10, as shown in Fig. 1. We will also assume CM costs $100 per component per year.

In Fig. 5 the cases considered are CM systems with a mean of either 1 month or 3 months warning of failure. The savings over an RM system where all components are replaced at the optimum time are shown in Fig. 5(a).

The plot considers the situation in which not all failures are prevented by the CM system. The possibility of the CM system not being 100% successful is represented in this model by a standard deviation of the warning time, which is plotted here as a ratio to the mean of the warning time. This can also be represented as the number of failures not prevented by CM, as is shown in Fig. 5(b).
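The link between the SD/mean ratio and the number of failures not prevented can be illustrated as follows. The sketch assumes one plausible reading of the model: the warning time is normally distributed, and a failure counts as not prevented when the warning time is zero or negative. Under that assumption the missed fraction depends only on the ratio. The function name is illustrative.

```python
from statistics import NormalDist

def failures_not_prevented(sd_over_mean, n=1000):
    """Expected failures per n components that occur before any warning is given.

    Assumption: warning time ~ Normal(mean, sd), and warning <= 0 means a missed failure.
    """
    # P(warning <= 0) for Normal(mean, sd) equals Phi(-mean/sd) = Phi(-1/ratio).
    p_missed = NormalDist().cdf(-1.0 / sd_over_mean)
    return n * p_missed

for ratio in (0.2, 0.3, 0.4, 0.5):
    print(f"SD/mean = {ratio:.1f}: {failures_not_prevented(ratio):.2f} missed per 1000")
```

The missed count rises very steeply with the ratio, which is why a modest loss of reliability in the CM system quickly matters.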

Fig. 5(a) shows that at high reliabilities the CM system can save large sums of money when compared to the optimum RM strategy as shown in Fig. 2. The costs of failures and replacements in the system can be halved. However, if the CM system is not very reliable and fails to prevent more than about seven failures per 1000 components, then RM is to be preferred.

A CM system can be very inaccurate and still save

money. As shown in Fig. 5(a) it needs only to predict

failures an average of 6 months before the failure when the

MTBF is 1 year to save some costs in this example.

However, the reliability of the CM system is very

important. If more than 0.7% of components fail, then CM is

more expensive than total routine replacement (RM).
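The trade-off between CM and RM can be explored with a rough cost sketch. The cost formula below is an assumption made for illustration and is not the model behind Fig. 5: each component is taken to be replaced, on average, one mean warning time before the end of its mean Weibull life, a missed failure adds the failure cost, and monitoring adds $100 per component per year. The RM benchmark is the $2.2 million/year quoted for Fig. 2.

```python
import math

def cm_cost_rate(missed_fraction, warning_years, n=1000, x=100_000.0, y=1_000.0,
                 cm_cost=100.0, shape=10.0, scale=1.0):
    """Average yearly cost of CM under the illustrative assumptions described above."""
    mean_life = scale * math.gamma(1.0 + 1.0 / shape)  # mean of a Weibull-distributed life
    service_life = mean_life - warning_years           # average life actually used
    per_component = (y + missed_fraction * x) / service_life
    return n * (per_component + cm_cost)

RM_COST = 2.2e6  # optimum routine replacement cost quoted for Fig. 2, $/year

for warning in (1 / 12, 3 / 12):
    for missed_per_1000 in (0, 5, 10):
        cost = cm_cost_rate(missed_per_1000 / 1000, warning)
        print(f"warning {warning * 12:.0f} months, {missed_per_1000:2d} missed per 1000: "
              f"saving ${(RM_COST - cost) / 1e6:+.2f} million/year")
```

Even this simplified version shows the same qualitative behaviour: a fully reliable CM system roughly halves the cost, and the saving disappears once a fraction of a percent of failures go unprevented. The exact break-even will differ from the figure because the underlying model is not reproduced here.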

Fig. 5. The savings which can be achieved by a CM system depend on its ability to prevent failures. (a) The mean warning time is not so important as the reliability of the CM system at preventing failures. (b) The ratio SD/mean of the warning time for the CM system is related to the number of failures not prevented.

Fig. 4. Setting the alarm signal for continuous CM.


3. A case study—boiler tubes

An important case of the use of CM is the extensive use

of thickness measurement of boiler tubes in large power

stations. The thickness is a good measure of the damage

done by wear and corrosion mechanisms to the tubes. The

thickness measurements are carried out with fairly simple

ultrasonic probes.

There are other more complicated measurements that

are carried out to determine creep life exhaustion of the

higher temperature tubes. These measurements require a

calculation based on the tube internal oxide thickness,

which can only be measured using sophisticated ultrasonic

equipment. These measurements are in the author’s

experience significantly less reliable than the simple

thickness measurements.

The measurements can only be carried out when the boiler is out of service and require people to enter the boilers. As a result these methods are not continuous CM, but rather intermittent. It is difficult to do these measurements more frequently than once a year, and in many situations they can only be done once in every four years.

To examine this situation, the case chosen has 1000 tubes

with a MTBF of 30 years and a shape factor of 10 as shown

in Fig. 6.
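For this tube bank, the time at which the first failure is expected follows from inverting the Weibull cumulative function at a failure fraction of 1 in 1000. The sketch assumes the Weibull scale parameter equals the quoted MTBF of 30 years; the function name is illustrative.

```python
import math

def weibull_time_at_fraction(fraction, shape, scale):
    """Time by which the given fraction of components has failed (inverse Weibull CDF)."""
    return scale * (-math.log(1.0 - fraction)) ** (1.0 / shape)

# One failure expected among 1000 tubes.
t_first = weibull_time_at_fraction(1 / 1000, shape=10.0, scale=30.0)
print(f"first failure expected at about {t_first:.1f} years")
```

The result is close to 15 years, in line with the caption of Fig. 6.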

Using a replacement cost of $2000 per tube and a failure

cost of $200,000 per tube then the total routine replacement

of the 1000 tubes reaches a minimum at 15 years with a total

life cost of $146,000 per year as shown in Fig. 7.
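The quoted cost can be reproduced from the cost rate (NY + nX)/R of Section 2.2. The arithmetic below evaluates it at R = 15 years, assuming the Weibull distribution of Fig. 6 with its scale taken as the 30 year MTBF.

```latex
\begin{align*}
n &= N\,F(R) = 1000\left[1 - e^{-(15/30)^{10}}\right] \approx 0.98, \\
\frac{NY + nX}{R} &\approx \frac{1000 \times 2000 + 0.98 \times 200{,}000}{15}
  = \frac{2{,}196{,}000}{15} \approx 146{,}000 \ \text{\$/year}.
\end{align*}
```

Roughly one failure is therefore expected per replacement cycle, and it contributes about a tenth of the yearly cost.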

If CM is conducted continuously, is very reliable and

costs nothing, then considerable savings are to be made.

Fig. 8 shows that the savings are over $76,000 which is

almost half the cost of RM. If CM is unreliable, which as

shown in Fig. 8 is when the SD of the distribution exceeds

30% of the mean warning time, then CM cannot save

money, and total routine replacement is cheaper.

3.1. A more realistic model of CM

A more developed model of CM of the tube thickness gives a cost to the inspection. This will be set at $100 per tube. Moreover these inspections cannot be conducted continuously but can be conducted only once every 4 years.

Fig. 6. Weibull distribution of MTBF 30 years, shape factor 10. For comparison a normal distribution of mean 30 years, standard deviation 3 years is plotted. The first failure in 1000 tubes is expected in year 15.

Fig. 7. The total routine replacement of the 1000 tubes produces minimum total life cost at 15 years. The minimum cost is $146,000/year for a replacement cost of $2000, and a failure cost of $200,000.

Fig. 8. If CM is continuous, reliable and costs nothing, significant savings are to be made. Maximum savings are $77,000 per year which approximately halves cost.

Fig. 9. Failures occur even with CM every 4 years. The CM has a mean warning time, at the time it is done, of 4 years. This mean is reduced by one year for every year after the CM is conducted. The SD of the CM system is 0.5 years, which is reliable in terms of Fig. 8.

Fig. 10. Savings using CM over routine total replacement. RM is best at 15 years. CM is probably not worth doing until about the time first failure is expected (15 years). CM permits operation till about 19 years, after this the 1000 tubes cannot be operated profitably and should all be replaced. It is assumed to cost $100 to inspect each component so the savings using CM only every 4 years are greater.

Let us set the warning level of the CM at 4 years, that is,

the mean time for failure after the warning level is reached

is 4 years. The SD on the signal will be set at 0.5 years which

is very reliable according to Fig. 8.

If inspections are only carried out every 4 years, there will be an increasing number of failures which are not prevented by CM with each succeeding year. This is modelled by moving the time closer to the mean of the normal distribution by one year for each year after the inspection. The effect of this is to produce a curve as shown in Fig. 9.
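The growth in unprevented failures between inspections can be sketched as follows. This is one reading of the description above, stated as an assumption: the remaining warning time is normally distributed with a mean of 4 years at the inspection, reduced by one year for each year elapsed, with an SD of 0.5 years, and a failure is missed when the remaining warning time is zero or negative.

```python
from statistics import NormalDist

def missed_fraction(years_since_inspection, warning_at_inspection=4.0, sd=0.5):
    """Probability that a tube at the warning level fails before the next look at it."""
    remaining = warning_at_inspection - years_since_inspection
    # Fraction of the warning-time distribution that has already run out.
    return NormalDist(mu=remaining, sigma=sd).cdf(0.0)

for years in range(0, 5):
    print(f"{years} years after inspection: missed fraction = {missed_fraction(years):.6f}")
```

Under this reading almost nothing is missed in the first two years after an inspection, about 2% is missed by the third year, and half by the fourth.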

By year 28 on average 394 out of 1000 tubes would have been replaced, and by year 32 on average 851 would have been replaced. By year 36 virtually the entire bank would have been replaced. In this analysis it is assumed that these new tubes would not start failing by year 36, which is statistically a minute error. However, in reality that assumption may not be true, since piecemeal replacement activities are often far from perfect.
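The replacement counts quoted here coincide with the Weibull cumulative failure fraction of Fig. 6 evaluated at the same years, which can be checked with the sketch below. It assumes a scale parameter of 30 years (the quoted MTBF) and a shape factor of 10; treating the replaced count as N times this fraction is an inference from the numbers, not a statement made in the paper.

```python
import math

def weibull_cdf(t, shape=10.0, scale=30.0):
    """Cumulative fraction of tubes that have reached the end of their life by year t."""
    return 1.0 - math.exp(-((t / scale) ** shape))

for year in (28, 32, 36):
    print(f"year {year}: about {round(1000 * weibull_cdf(year))} of 1000 tubes")
```

The values for years 28 and 32 come out at 394 and 851, and the value for year 36 is within a few tubes of the whole bank.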

The effect of CM in this case has been to extend the life

of the tubes from 15 years if total replacement RM was

practiced to a mean of 26 years.

However, if the costs are now considered, the advantage of CM becomes less clear. Even the few failures experienced after year 20 in Fig. 9 are expensive, and the inspections themselves are costly. If these factors are considered, it is found that there are no savings using CM after year 19, as shown in Fig. 10. Even though the number of in-service failures is reduced with more frequent inspections, the cost of the inspections outweighs the savings, as shown in Fig. 10.

4. Conclusions

The findings of this work are summarised in Table 1.

4.1. Boiler tube thickness surveys

In the particular case of the use of CM in relation to

boiler tube failures it was found that the savings to be

made using CM are actually quite limited when the costs

of conducting CM are taken into account.

The conclusion of the model in the paper is that the

cheapest way to run the example tube bank is as follows.

1. Operate the bank until the first failure or preferably just

before the first failure (at 15 years) (BM).

2. Carry out CM to remove tubes at risk over the next 4

years (CM).

3. Replace the entire bank at 19 years (RM).

This case illustrates a situation where all three maintenance strategies may be relevant for the single set of components.

References

[1] Higgins LR. Maintenance engineering handbook. New York: McGraw Hill; 1995.

[2] Bahrami-Ghasrcharmi K, Price JWH, Mathew J. Optimum inspection frequencies for manufacturing systems. Int J Qual Reliab Management 1999;15(3):250.

[3] Marquez AC, Heguedas AS. Models for maintenance optimization: a study for repairable systems and finite time periods. J Reliab Engng Syst Saf; in press.

Table 1

The situations in which various maintenance strategies can be economic

| Situation | Recommended maintenance strategy | Comments |
|---|---|---|
| Cost of failure similar to cost of replacement | BM | Light bulbs |
| Lifetimes of components show very little scatter | RM | Lack of accuracy in CM causes loss of average component life |
| Lifetimes of components vary over a wide range | CM | Crossover to CM producing savings depends on cost and reliability of CM |
| CM has low reliability | RM | This case occurs when SD of CM exceeds 30% of mean warning time |
| Mixed case. Perhaps the general case and probably correct for boiler tube thickness surveys | BM, CM, RM | CM is useful over a limited region of time |
