The economics of repeated tube thickness surveys
John Price*
Department of Mechanical Engineering, Monash University, P.O. Box 197, 3145 Caulfield East, Melbourne, Vic., Australia
Abstract
The use of tube thickness surveys in boilers is an example of a commonly applied condition monitoring (CM) technique for maintenance
and it leads to condition-based maintenance (CBM) of the boiler tubes. There are, however, limits to the economics of this type of strategy
which are frequently overlooked in discussion of CBM strategies.
This paper considers several models of maintenance strategies. It examines the conditions under which breakdown maintenance (BM), routine total
replacement (routine maintenance, RM) and condition-based replacement (which for simplicity is referred to as CM) are economical. Some
general rules about the economical range of each strategy are developed. The case study examines the use of ultrasonic testing of boiler tubes
in power stations in some detail.
© 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Condition monitoring; Breakdown maintenance; Tube thickness measurement; Warning margin
1. Introduction
Many text books and papers discuss the basis for
maintenance strategies (Higgins [1] for example). The
choice of maintenance strategies involving breakdown
maintenance (BM), routine maintenance (RM) and
condition monitoring (CM) can be quite complicated and there
have been some studies of this issue [2,3]. This paper uses
simple statistical numerical models to consider the
economic ranges of the various options. Assumptions for the
models always include:
• each component is dominated by a single failure mode, and
• the maintenance activity is replacement and completely
restores the equipment to original performance.
2. Models of maintenance strategies
2.1. The basic model
The first model considers the case where we have N
components which can fail in a particular mode causing a
cost of $X or be replaced (or repaired) before failure with a
cost of $Y. For our base example it will be assumed that
there are 1000 identical components and that the failure cost
is $100,000 while the replacement cost is $1000.
The model assumes that the components have a statistical
distribution of lives. One such distribution might be the
normal distribution, with a life
expectancy (mean) of E years and a standard deviation of
fE years, where f is a fraction of the mean. For example let us
assume the expected life (mean life) is 1 year and f is 0.1, so
the standard deviation is 0.1 years. To give a feeling for the
numbers involved, at 0.9 years (one standard deviation below
the mean) 15.9% of the components
would have failed. The curve is shown in Fig. 1.
Fig. 1 also shows a Weibull distribution of similar shape.
The Weibull distribution has a cumulative distribution
function of
$$F(t;\alpha,\beta) = 1 - e^{-(t/\beta)^{\alpha}}$$
where t is the time at which the cumulative number of
failures is taken (years), β is the scale parameter (the
characteristic life, which is taken here as the mean time
between failures, MTBF) and α is the shape factor.
The Weibull function was developed to fit the failure
distribution of wear-out type failures. The matching of the
two distributions shows that the normal distribution, which
many people find easier to understand, can be made similar
in shape to the Weibull distribution. The Weibull
distribution is more correct, especially as the shape factor
decreases (the normal distribution’s standard deviation
increases), since the Weibull distribution does not permit
failures before zero time.
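The comparison of the two distributions can be checked numerically. The following is a minimal sketch, assuming the Weibull form F(t) = 1 - exp(-(t/β)^α) with β = 1 year and α = 10 and a normal distribution with mean 1 year and standard deviation 0.1 years, as in Fig. 1. The function names are illustrative and not taken from the paper.

```python
import math

def normal_cdf(t, mean, sd):
    """Fraction of components failed by time t for normally distributed lives."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))

def weibull_cdf(t, shape, scale):
    """Weibull CDF, F = 1 - exp(-(t/scale)**shape); no failures at or before t = 0."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))

if __name__ == "__main__":
    # Tabulate both curves at a few times (years) around the mean life.
    for t in (0.5, 0.7, 0.9, 1.0, 1.1):
        n = normal_cdf(t, 1.0, 0.1)
        w = weibull_cdf(t, 10.0, 1.0)
        print(f"t = {t:.1f} y   normal = {n:.4f}   Weibull = {w:.4f}")
```

At t = 0.9 years the normal curve gives about 15.9%, the figure quoted above. With a wider normal distribution (for example a standard deviation of 0.5 years) the normal CDF is visibly non-zero at t = 0, while the Weibull CDF is zero there, which is the point made about failures before zero time.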
0308-0161/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0308-0161(02)00089-3
International Journal of Pressure Vessels and Piping 79 (2002) 555–559
www.elsevier.com/locate/ijpvp
* Tel.: +61-3-9903-2868; fax: +61-3-9903-2766.
E-mail address: [email protected] (J. Price).
2.2. Breakdown and routine total replacement strategies (BM and RM)
First consider a total breakdown strategy. In this case the
components would fail with a total cost of $NX. In our
example $NX is 1000 × $100,000 and is much larger than
the repair cost, $NY (1000 × $1000), so breakdown strategy
is not attractive. Note that if Y is very similar to X, which is
the case for many situations (such as light bulbs), then
breakdown strategy can be satisfactory.
Now consider routine replacement. Routine replacement
means that there would be total replacement of all the
components with new components. This must occur
significantly before the mean life expectancy of the
component at a time we shall designate as R years. A
simple analysis gives the cost of the total replacement is
$NY and this averages out at an expenditure rate of $NY/R
per year. There will, however, be an expected number of
failures, n, before replacement occurs, and these will have a
cost of $nX or $nX/R per year. This cost must be added to
the replacement costs and the total cost for the replacement
strategy is $(NY + nX)/R per year.
This cost has a minimum between zero and the MTBF.
Two cases are shown in Fig. 2. As will be seen from Fig. 2,
the minimum is significantly below the MTBF. The
minimum cost point moves earlier with a larger variation in the
failure lives (smaller shape factor, α).
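A rough calculation shows why the optimum sits well below the MTBF. The sketch below is not taken from the paper. It assumes the Weibull form F(t) = 1 - exp(-(t/β)^α) with β taken as the MTBF, an expected number of failures n = N F(R), and that F(R) is small near the optimum so that F(R) ≈ (R/β)^α. The symbols C(R) for the annual cost and R* for the optimum replacement time are introduced here for convenience.

```latex
C(R) = \frac{NY + nX}{R}, \qquad n = N\,F(R) \approx N\left(\frac{R}{\beta}\right)^{\alpha}

\frac{dC}{dR} = 0
\;\Longrightarrow\;
(\alpha - 1)\,X\left(\frac{R^{*}}{\beta}\right)^{\alpha} = Y
\;\Longrightarrow\;
R^{*} \approx \beta\left[\frac{Y}{(\alpha - 1)\,X}\right]^{1/\alpha}
```

With Y/X = 0.01 and β = 1 year this approximation gives R* ≈ 0.51 years for α = 10 and R* ≈ 0.30 years for α = 5, in line with the minima quoted for Fig. 2.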
Fig. 3 shows that the time of minimum cost in the total
replacement strategy is also influenced by the ratio of the
replacement cost to the failure cost (f = Y/X; this f is not the
standard deviation fraction of Section 2.1). As f approaches 1,
the minimum cost point approaches the MTBF.
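The routine replacement cost curve can be reproduced with a few lines of code. This is a minimal sketch rather than the author's model: it assumes the expected number of failures before replacement is n = N F(R) with F the Weibull CDF (scale taken as the MTBF), and it finds the minimum by a simple grid search. All function names are illustrative.

```python
import math

def weibull_cdf(t, shape, scale):
    """Weibull CDF, F = 1 - exp(-(t/scale)**shape), for t > 0."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def routine_cost_per_year(R, N, Y, X, shape, scale):
    """Annual cost (N*Y + n*X)/R, assuming n = N*F(R) failures before time R."""
    n = N * weibull_cdf(R, shape, scale)
    return (N * Y + n * X) / R

def optimum_interval(N, Y, X, shape, scale, steps=10000):
    """Grid search over R in (0, scale] for the lowest annual cost."""
    best_R, best_cost = None, float("inf")
    for i in range(1, steps + 1):
        R = scale * i / steps
        cost = routine_cost_per_year(R, N, Y, X, shape, scale)
        if cost < best_cost:
            best_R, best_cost = R, cost
    return best_R, best_cost

if __name__ == "__main__":
    cases = (
        ("Fig. 2, full curve", 1000.0, 10.0),
        ("Fig. 2, dashed curve", 1000.0, 5.0),
        ("Fig. 3", 10000.0, 10.0),
    )
    for label, Y, shape in cases:
        R, cost = optimum_interval(1000, Y, 100000.0, shape, 1.0)
        n = 1000 * weibull_cdf(R, shape, 1.0)
        print(f"{label}: R = {R:.2f} y, n = {n:.1f}, cost = ${cost / 1e6:.2f} million/year")
```

Under these assumptions the minima fall at about 0.51, 0.30 and 0.64 years respectively, close to (though not necessarily identical with) the values quoted in the captions of Figs. 2 and 3.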
2.3. Condition monitoring
If a component can have its condition monitored then a
third replacement strategy becomes possible. In this
example we assume that the condition of a component can
be measured by a signal which gives a warning of the failure
of the component.
The signal from components being monitored can follow
a range of possibilities. Sometimes the component will fail
when the signal is relatively low (a ‘bad’ case), while
sometimes the component will survive to higher levels of signal
(a ‘good’ case). This range of possibilities is shown in Fig. 4.
Fig. 4 also shows that in order for the CM to work an
alarm level must be set. When the signal exceeds the alarm
level then the component is replaced. Clearly this alarm
level must be set at a point lower than a bad case, so that
components will be replaced before they fail.
There are also other factors affecting the positioning of
the alarm signal level as shown in Fig. 4. These relate to the
possibilities that:
• there are errors in the signal itself,
• the signal will not be monitored
continuously, or
• it is not always possible to replace the
component immediately when the alarm signal is reached, due to
operational constraints.
In these cases a safety margin must be set on the alarm
signal, so that even given these problems, the required
proportion of the components will be replaced before
failure.
Fig. 1. Comparison of normal and Weibull distributions. The normal
distribution has a mean of 1 year and a standard deviation of 0.1 years and the Weibull
distribution has an MTBF of 1 year and a shape factor of 10. The dashed lines
show the same distributions with an SD of 0.2 years and a shape factor of 5.
Fig. 2. There is an optimum time for routine replacement. Full curve is for
components with a Weibull distribution of failure, mean time between
failures of 1 year and α = 10. The minimum cost is at about 0.51 years, at
which point there is an expectation of 1.2 failures among 1000 components.
This system costs an average of $2.2 million/year. Dashed curve is for MTBF of
1 year, α = 5. Minimum is at about 0.30 years after about 2.4 failures in 1000
components. The system costs an average of $4.13 million/year.
Fig. 3. The location of the minimum depends on the cost of replacement. In this
case replacements cost $10,000. With the ratio f equal to 0.1 the minimum
shifts to 0.64 years, with a minimum average cost of $17.2 million/year. 11.2
failures are expected at this point. The whole curve is now much more
expensive because a replacement costs $10,000 instead of $1000 in Fig. 2.
Given that these requirements on the alarm level for the
signal exist, it can be easily shown that many components
will be replaced very early. As is shown in Fig. 4, if there is
a scatter between the good and bad cases, then a significant
proportion of the life of the good cases will be lost. The
proportion of lost service increases and there is also lost
service even for the bad cases when a safety margin is set in
the alarm signal level.
This loss of service represents a penalty for using the CM
which can be very important. Also if CM cannot exclude the
possibility of in-service failures then these add to the cost of
using CM.
Let us consider two different CM situations. Once
again we have 1000 components which cost $1000 to
replace or $100,000 if they fail in service. The MTBF is
1 year and the shape factor is 10 as shown in Fig. 1.
We will also assume CM costs $100 per component per
year.
In Fig. 5 the cases considered are CM systems with a
mean of either 1 month or 3 months warning of failure. The
savings over an RM system where all components are
replaced at the optimum time are shown in Fig. 5(a).
The plot considers the situation that not all failures may
be prevented by the CM system. The possibility of the CM
system not being 100% successful is represented in this
model by a standard deviation of the warning time, which is plotted here as a
ratio to the mean of the warning time. This can also be
represented as the number of failures not prevented by CM,
as is shown in Fig. 5(b).
Fig. 5(a) shows that at high reliabilities the CM system
can save large sums of money when compared to the
optimum RM strategy as shown in Fig. 2. The costs of
failures and replacements in the system can be halved.
However, if the CM system is not very reliable and fails to
prevent more than about seven failures per 1000
components, then RM is to be preferred.
A CM system can be very inaccurate and still save
money. As shown in Fig. 5(a) it needs only to predict
failures an average of 6 months before the failure when the
MTBF is 1 year to save some costs in this example.
However, the reliability of the CM system is very
important. If more than 0.7% of components fail, then CM is
more expensive than total routine replacement (RM).
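The trade-off between CM reliability and cost can be illustrated with a back-of-envelope calculation. This is a sketch under stated assumptions and is not the model behind Fig. 5: it assumes each monitored component is replaced, on average, one mean warning time before the end of its Weibull mean life, that a given number of components per 1000 are missed and fail in service, and that the optimum RM cost is the $2.2 million/year quoted in the Fig. 2 caption. The function names are hypothetical.

```python
import math

N = 1000                   # components
Y = 1000.0                 # replacement cost per component ($)
X = 100000.0               # in-service failure cost per component ($)
CM_COST = 100.0            # monitoring cost per component per year ($)
SHAPE, SCALE = 10.0, 1.0   # Weibull parameters of Fig. 1
RM_COST = 2.2e6            # optimum RM cost per year, from the Fig. 2 caption ($)

def mean_life():
    """Mean of the Weibull life distribution: scale * Gamma(1 + 1/shape)."""
    return SCALE * math.gamma(1.0 + 1.0 / SHAPE)

def cm_annual_cost(missed, warning):
    """Assumed accounting: components are replaced on average `warning` years
    before the end of their mean life; `missed` per N fail in service instead."""
    interval = mean_life() - warning
    return N * CM_COST + (N * Y + missed * X) / interval

def breakeven_missed(warning):
    """Missed failures per N at which CM costs the same as optimum RM."""
    interval = mean_life() - warning
    return ((RM_COST - N * CM_COST) * interval - N * Y) / X

if __name__ == "__main__":
    for months in (1, 3):
        w = months / 12.0
        print(f"{months} month warning: perfect CM ${cm_annual_cost(0, w):,.0f}/year, "
              f"break-even at {breakeven_missed(w):.1f} missed failures per {N}")
```

Under this simplified accounting a perfectly reliable CM system with 1 month of warning costs a little over half of the optimum RM cost, and the break-even point lies at a handful of missed failures per 1000 components, the same order as the threshold of about seven quoted above.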
Fig. 5. The savings which can be achieved by a CM system depend on its ability to prevent failures. (a) The mean warning time is not so important as the
reliability of the CM system at preventing failures. (b) The ratio SD/mean of the warning time for the CM system is related to the number of failures not
prevented.
Fig. 4. Setting the alarm signal for continuous CM.
3. A case study—boiler tubes
An important case of the use of CM is the extensive use
of thickness measurement of boiler tubes in large power
stations. The thickness is a good measure of the damage
done by wear and corrosion mechanisms to the tubes. The
thickness measurements are carried out with fairly simple
ultrasonic probes.
There are other more complicated measurements that
are carried out to determine creep life exhaustion of the
higher temperature tubes. These measurements require a
calculation based on the tube internal oxide thickness,
which can only be measured using sophisticated ultrasonic
equipment. These measurements are in the author’s
experience significantly less reliable than the simple
thickness measurements.
The measurements can only be carried out when the
boiler is out of service and require people to enter the
boiler. As a result these methods are not continuous CM,
but rather intermittent. It is difficult to do these
measurements more frequently than once a year, and in
many situations they can only be done once in every four
years.
To examine this situation, the case chosen has 1000 tubes
with a MTBF of 30 years and a shape factor of 10 as shown
in Fig. 6.
Using a replacement cost of $2000 per tube and a failure
cost of $200,000 per tube then the total routine replacement
of the 1000 tubes reaches a minimum at 15 years with a total
life cost of $146,000 per year as shown in Fig. 7.
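The case study figures can be checked with a short calculation. This is a minimal sketch, assuming the same accounting as Section 2.2 (annual cost = (NY + nX)/R with n = N F(R)) and a Weibull distribution with scale 30 years and shape factor 10. Replacement times are scanned in whole years only, and the function names are illustrative.

```python
import math

N = 1000                    # tubes
Y = 2000.0                  # replacement cost per tube ($)
X = 200000.0                # failure cost per tube ($)
SHAPE, SCALE = 10.0, 30.0   # Weibull shape factor and MTBF (years)

def fraction_failed(t):
    """Weibull CDF: fraction of tubes expected to have failed by year t."""
    return 1.0 - math.exp(-((t / SCALE) ** SHAPE))

def annual_cost(R):
    """Average cost per year if the whole bank is replaced every R years."""
    return (N * Y + N * fraction_failed(R) * X) / R

def first_failure_time():
    """Year at which one failure in N is expected, i.e. N * F(t) = 1."""
    return SCALE * (-math.log(1.0 - 1.0 / N)) ** (1.0 / SHAPE)

# Scan whole-year replacement intervals up to the MTBF.
best_year = min(range(1, 31), key=annual_cost)

if __name__ == "__main__":
    print(f"cheapest replacement interval: {best_year} years")
    print(f"cost at that interval: ${annual_cost(best_year):,.0f} per year")
    print(f"first failure expected at about year {first_failure_time():.1f}")
```

Under these assumptions the scan picks 15 years at roughly $146,000 per year, and the first failure among 1000 tubes is expected at about the same time, consistent with Figs. 6 and 7.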
If CM is conducted continuously, is very reliable and
costs nothing, then considerable savings are to be made.
Fig. 8 shows that the savings are over $76,000 per year, which is
almost half the cost of RM. If CM is unreliable, which as
shown in Fig. 8 is when the SD of the distribution exceeds
30% of the mean warning time, then CM cannot save
money, and total routine replacement is cheaper.
Fig. 6. Weibull distribution of MTBF 30 years, shape factor 10. For
comparison a normal distribution of mean 30 years, standard deviation 3
years is plotted. The first failure in 1000 tubes is expected in year 15.
Fig. 7. The total routine replacement of the 1000 tubes produces minimum
total life cost at 15 years. The minimum cost is $146,000/year for a
replacement cost of $2000, and a failure cost of $200,000.
Fig. 8. If CM is continuous, reliable and costs nothing, significant savings
are to be made. Maximum savings are $77,000 per year which
approximately halves cost.
Fig. 9. Failures occur even with CM every 4 years. The CM has a mean
warning time, at the time it is done, of 4 years. This mean is reduced by one
year for every year after the CM is conducted. The SD of the CM system is
0.5 years, which is reliable in terms of Fig. 8.
Fig. 10. Savings using CM over routine total replacement. RM is best at 15
years. CM is probably not worth doing until about the time first failure is
expected (15 years). CM permits operation till about 19 years, after this the
1000 tubes cannot be operated profitably and should all be replaced. It is
assumed to cost $100 to inspect each component so the savings using CM
only every 4 years are greater.
3.1. A more realistic model of CM
A more developed model of CM of the tube thickness
gives a cost to the inspection. This will be set at $100 per
tube. Moreover these inspections cannot be conducted
continuously but can be conducted only once every 4
years.
Let us set the warning level of the CM at 4 years, that is,
the mean time for failure after the warning level is reached
is 4 years. The SD on the signal will be set at 0.5 years which
is very reliable according to Fig. 8.
If inspections are only carried out every 4 years, there
will be an increasing number of failures which are not
prevented by CM with each succeeding year. This is modelled
by moving the time closer to the mean of the normal
distribution by one year for each year after the inspection.
The effect of this is to produce a curve as shown in Fig. 9.
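One possible reading of this step can be sketched numerically. The code below is an interpretation, not the author's implementation: it assumes that a tube which only just passed an inspection has a remaining life that is normally distributed with mean 4 years and SD 0.5 years, so that the chance it fails before the next inspection grows as the years since inspection approach the mean warning time. The name `missed_probability` is hypothetical.

```python
import math

MEAN_WARNING = 4.0   # mean time to failure once the warning level is reached (years)
SD_WARNING = 0.5     # standard deviation of that warning time (years)

def normal_cdf(x, mean, sd):
    """Cumulative normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def missed_probability(years_since_inspection):
    """Assumed chance that a tube sitting at the warning level when inspected
    has failed by the given number of years after that inspection."""
    return normal_cdf(years_since_inspection, MEAN_WARNING, SD_WARNING)

if __name__ == "__main__":
    for k in range(1, 5):
        print(f"{k} year(s) after inspection: {missed_probability(k):.2e}")
```

On this reading the risk is negligible in the first two years after an inspection, a few percent by the third year and one half by the fourth, which is the qualitative behaviour described for Fig. 9.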
By year 28 on average 394 out of 1000 tubes would be
replaced, by year 32 on average 851 would have been
replaced. By year 36 virtually the entire bank would have
been replaced. In this analysis it is assumed that these new
tubes would not start failing by year 36, which is statistically
a minute error. However, in reality that assumption may not
be true since piecemeal replacement activities are often far
from perfect.
The effect of CM in this case has been to extend the life
of the tubes from 15 years if total replacement RM was
practiced to a mean of 26 years.
However, if the costs are now considered the advantage
of CM becomes less clear. Even the few failures
experienced after year 20 in Fig. 9 are expensive, and the
inspections themselves are costly. If these factors are
considered it is found that there are no savings using CM
after year 19 as shown in Fig. 10. Even though the number
of in-service failures is reduced with more frequent
inspections, the cost of the inspections outweighs the savings
as shown in Fig. 10.
4. Conclusions
The findings of this work are summarised in Table 1.
4.1. Boiler tube thickness surveys
In the particular case of the use of CM in relation to
boiler tube failures it was found that the savings to be
made using CM are actually quite limited when the costs
of conducting CM are taken into account.
The conclusion of the model in the paper is that the
cheapest way to run the example tube bank is as follows.
1. Operate the bank until the first failure or preferably just
before the first failure (at 15 years) (BM).
2. Carry out CM to remove tubes at risk over the next 4
years (CM).
3. Replace the entire bank at 19 years (RM).
This case illustrates a situation where all three
maintenance strategies may be relevant for the single set of
components.
References
[1] Higgins LR. Maintenance engineering handbook. New York: McGraw
Hill; 1995.
[2] Bahrami-Ghasrcharmi K, Price JWH, Mathew J. Optimum inspection
frequencies for manufacturing systems. Int J Qual Reliab Management
1999;15(3):250.
[3] Marquez AC, Heguedas AS. Models for maintenance optimization: a
study for repairable systems and finite time periods. J Reliab Engng
Syst Saf; in press.
Table 1
The situations in which various maintenance strategies can be economic
Situation | Recommended maintenance strategy | Comments
Cost of failure similar to cost of replacement | BM | Light bulbs
Lifetimes of components show very little scatter | RM | Lack of accuracy in CM causes loss of average component life
Lifetimes of components vary over a wide range | CM | Cross over to CM causing savings depends on cost and reliability of CM
CM has low reliability | RM | This case occurs when SD of CM exceeds 30% of mean warning time
Mixed case. Perhaps the general case and probably correct for boiler tube thickness surveys | BM, then CM, then RM | CM is useful over a limited region of time