
IEEE TRANSACTIONS ON RELIABILITY, VOL. 38, NO. 2, 1989 JUNE

Practical Considerations in Developing an Instrument-Maintenance Plan

Michael A. S. Guth
Consultant, Oak Ridge

Key Words - Preventive maintenance (PM), Programmed maintenance plan, Nuclear reactor, Risk analysis.

Reader Aids -
Purpose: Present a case study
Special math needed for derivations: None
Special math needed to use results: Boolean algebra, Probability theory
Results useful to: Instrument-maintenance managers

Summary & Conclusions - This article develops a general set of considerations to explain how a consistent, well-organized, prioritized, and adequate time-allowance program plan for routine maintenance can be constructed. The analysis is supplemented with experience from the High Flux Isotope Reactor (HFIR) at US Oak Ridge National Laboratory (ORNL).

After defining the preventive maintenance (PM) problem, the instruments on the schedule were selected based on manufacturers' design specifications, quality assurance requirements, prior classifications, experiences with the incidence of breakdowns or calibration, and dependencies among instruments. The effects of repair error in PM should also be studied. The HFIR requires 3 full-time technicians to perform both PM and unscheduled maintenance.

Some techniques from risk and fault-tree analyses can be borrowed in studying cause-consequence relations between instruments and maintenance. Examples of false-positive and false-negative signals on the HFIR are given, as well as some suggestions for how to model the breakdown incidence. An alternative approach uses approximate statistical distributions and the mean value of probabilities for repair needs. These distributions can vary from knife-edged to hyper-exponential.

Searching for congestion periods will assist in the allocation of resources to meet both PM and unscheduled maintenance needs. This article reviews some concepts from queuing theory to determine anticipated breakdown patterns. In practice, the pneumatic instruments have a much longer lifetime than the electric/electronic instruments on various reactors at ORNL. This article concludes with a discussion of some special considerations and of risk aversion in choosing a maintenance schedule.

1. INTRODUCTION

Most large engineering systems incorporate a programmed plan for preventive maintenance (PM) on the sensors and instruments in the system. Over time, some of the sensors tend to drift out of calibration or otherwise require routine maintenance. Development of a plan for scheduling PM helps to assure the operator that the sensor signals are still accurate. (In many places throughout this paper, the words sensor and instrument are used somewhat interchangeably.)

In first implementing a PM plan, a field engineer or repair technician may find more instruments to service than can be accommodated at a given time. The engineer must set up a priority for which instruments need to be serviced first, which are most critical for operations, and which allow some time-flexibility for maintenance. Once a balanced plan for PM is developed, it should be possible to project PM needs into the future and eliminate potential periods of congestion.

This article develops general considerations to explain how a consistent, well-organized, prioritized, and adequate time-allowance program plan for routine PM can be constructed. Unanticipated, and therefore unscheduled, maintenance requirements can still disrupt the program schedule; however, with sufficient planning the disruption should be resolved within the day-to-day operation of the PM schedule. The considerations integrate some general theories on reliability and maintenance with the particular experiences of the High Flux Isotope Reactor (HFIR) at Oak Ridge National Laboratory (ORNL).

Many safety trips or alarms are tied to sensor readings and not directly to the position or state of the components. Thus a faulty sensor can trigger system shutdowns or alarms even when the actual state of the system is normal. This article also contrasts the operation of a system under a bare-bones maintenance philosophy vs PM standards.

A second objective is to merge techniques from risk and reliability analysis with the design of PM and decision-support tools. This paper discusses how to incorporate cause-consequence relations as well as view statistical data on maintenance or aging of instruments from a risk-analysis perspective. The reader can then decide whether these risk-analysis techniques should be incorporated into his own PM plan.

A third objective is to examine the effects of redundancy in sensors to see what gains are received from having multiple copies of the same sensor. Section 5 also examines how redundancy can reduce or eliminate situations in which total reliance must be placed on a particular sensor while other action is being taken. The paper reviews the ranking of alternative maintenance schemes from a safety objective to determine if servicing instruments in a particular order increases the risk of an accident.

Practical illustrations and insights explain how engineers and managers set realistic reliability and maintenance requirements for an engineering system. We examine use of reliability data from the field as compared with manufacturers' reliability tests. Some of the obstacles to achieving worthwhile reliability requirements are explained. This practical paper minimizes the use of detailed mathematical models, statistical data, and theoretical work; it maximizes the use of field experiences and case-history observations.

0018-9529/89/0600-0253$01.00 © 1989 IEEE


Section 2 identifies the maintenance time constraint problem, provides some sources of information for deciding which instruments should be on the PM schedule, contains some sample calculations on time spent for PM, and looks at the HFIR staffing specifically to determine if and how long the maintenance goals will fall behind schedule with the existing repair staff. Section 3 broadly discusses sensor validation and comprises such issues as cause-consequence relations, detection of faulty sensors, use of smart sensors, redundancy, and in-place calibration. Congestion periods, stability, and selected concepts from queuing theory are reviewed in section 4. Section 5 discusses some special maintenance considerations including instrument dependency on the same power sources, humans interacting with and observing instruments, and signal use in control systems. Section 6 presents alternative approaches to PM that are based on likelihood vs severity of accidents.

2. IDENTIFICATION OF THE PROBLEM

The first step in setting up a preventive maintenance (PM) schedule is to develop a short and succinct statement of the problem. For the High Flux Isotope Reactor (HFIR), this statement might be:

    Routine PM currently requires servicing of 850 instruments on a programmed PM schedule. How can these instruments be calibrated and serviced by the repair staff and still allow the staff sufficient flexibility to handle unanticipated breakdowns?

2.1 Selecting Instruments for PM Schedule

The sensors in the engineering system should be divided into those that should and those that should not be on the scheduled PM list. Permanent or non-calibrated sensors should not appear on the list because they will not affect the routine PM schedule. The permanent sensors can influence the number of unanticipated breakdowns. The non-serviced sensors will probably be assumed to be accurate over some reasonable life for the instrument.

To determine which sensors should be on a routine PM schedule, we found 5 sources of information to be helpful.

1. The manufacturers' design specifications often include suggestions for calibration needs based on time and amount of use.

2. Many large engineering systems, particularly those subject to Government regulation, have quality assurance documentation summarizing the PM requirements to keep the system in operation.

3. The operations department for the system might have classified the instruments or sensors into categories based on their anticipated maintenance needs. Such was the case with the HFIR.

4. The field engineer or maintenance repairman will have first-hand experience with servicing the instruments; he can give insight into which instruments currently on the PM schedule show little need for servicing and which sensors, not presently on the list, should be added.

5. Instruments which are relied upon by other instruments might need to be on the PM list. These instruments can affect the failure rates of systems that depend on their readings [1].

The HFIR PM schedule was developed through a joint meeting of a representative of the HFIR Operations Division, the HFIR Field Engineer, and the Instrument Foreman for the Instrumentation & Controls (I&C) Division. The Operations Division representative provided input about the instruments most critical to keep the reactor operating. The Field Engineer advised as to which areas of the HFIR had been most in need of maintenance, and the Instrument Foreman provided general knowledge on the repair rates of instruments in both reactor and non-reactor plants. Where manufacturers' specifications called for recalibration (eg, annually), this figure was used as a lower bound, so that such an instrument might be on a 6- or 12-month PM schedule. Appendix A explains what instruments in the HFIR primary pressure system are on the PM schedule.

2.2 Repair Error in Preventive Maintenance

The issue of sensors being routinely serviced or calibrated, even when the service is not needed, raises the question of whether the sensors or the system might be inadvertently damaged in the process of this PM. Concern over PM error (eg, forgetting to realign a dial after PM on an instrument) would seem more warranted in situations where the instrument was working properly prior to the PM. Thus, PM error is more likely to avoid detection when the component was working properly than if the component needed servicing and the technician subsequently repaired the device to its proper state of operation.

At present on the HFIR, following routine calibration or PM on an instrument, the technician performs an operational check of the component after placing it back in service to determine that it is working properly. Repair error has not been a problem on the HFIR; hence, experience suggests that no additional guidelines or requirements are necessary. If managers were concerned about technician repair errors, they could formulate a checklist to be completed by the technician or a supervisor to ensure that the component was restored to proper function after servicing. However, such a checklist task would likely become tedious and cumbersome.

2.3 HFIR PM Hours

To illustrate the estimation of PM hours, one can examine the maintenance requirements for the HFIR. Of the 1192 instruments on the I&C Division inventory list for the HFIR, approximately 950 instruments are on a routine PM schedule. Of those instruments on the PM schedule, approximately 650 are on a schedule of 12 months service or less. A calculation for the month of 1987 July revealed that 61 instruments were calibrated and/or serviced for a total of 69 hours of work, viz, an average service of 1.1 hours/instrument. However, the technician lacked time to complete the PM tasks on some of the redundant units kept on the shelf during July, and these tasks were subsequently completed in 1987 August. The best time estimate for PM at the HFIR is approximately 90-110 hours/month.

Experience with the HFIR has shown that most of the instrument breakdown reports were written at night while the reactor was operating. The I&C Division personnel received 3-4 breakdown reports per night with an average time of 3-4 hours for unscheduled maintenance on each instrument report. Thus unscheduled maintenance averaged about 12 hours/night. Many of the breakdown reports were false alarms in that the reactor operators were concerned about a reading they were receiving from some instrument and wanted it checked, but upon examination the instrument was in working order.

Based on experience with the HFIR and other reactors and treatment plants operated by ORNL, the instrument foreman uses a rule-of-thumb of assigning 250 instruments/technician. This figure is not written in any operating procedures or manual but is based solely on manpower experience. Using some of these average figures as background, it is useful to turn to the specific HFIR unscheduled maintenance and PM schedules from both a supply and demand perspective.
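The averages quoted in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are not from the paper, and taking the midpoint of each 3-4 range as the "average night" is an assumption made here to show how a figure near 12 hours/night can arise.

```python
import math

# Figures quoted in section 2.3; variable names are illustrative only.
instruments_serviced = 61   # instruments calibrated and/or serviced in 1987 July
pm_hours_logged = 69        # total PM hours logged for those instruments
avg_hours_per_instrument = pm_hours_logged / instruments_serviced

# Unscheduled work: 3-4 breakdown reports per night, 3-4 hours per report.
reports_per_night = (3, 4)
hours_per_report = (3, 4)
nightly_low = reports_per_night[0] * hours_per_report[0]
nightly_high = reports_per_night[1] * hours_per_report[1]
# Assumption: treat the midpoint of each range as the typical value.
nightly_mid = (sum(reports_per_night) / 2) * (sum(hours_per_report) / 2)

# Foreman's rule of thumb: 250 instruments per technician.
instruments_on_pm = 950
technicians_by_rule = instruments_on_pm / 250

print(f"average PM time: {avg_hours_per_instrument:.2f} hours/instrument")
print(f"unscheduled load: {nightly_low}-{nightly_high} hours/night, "
      f"midpoint {nightly_mid:.2f}")
print(f"rule-of-thumb staffing: {technicians_by_rule:.1f} technicians "
      f"(rounded up: {math.ceil(technicians_by_rule)})")
```

Under these assumptions the rule of thumb applied to all 950 scheduled instruments suggests more than 3 technicians, which is one reason the rule is examined more closely in the staffing discussion that follows.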

2.3.1 Demand for Maintenance Services

If unscheduled maintenance requires an average of 12 hours/day and the HFIR is kept operating 25 days/month, then approximately 300 working hours would be needed for unscheduled maintenance on a monthly basis. However, the objective of initiating a PM plan is to reduce the number of unanticipated breakdowns. Suppose the PM plan meets this objective, and the unscheduled maintenance requirement is cut in half to 150 working hours/month, viz, an average of 6 hours/day for the 25 days of operation.

If an average of 1.1 hours is spent on each of the 950 instruments on the HFIR PM schedule then approximately 1045 hours are needed for this work. Distributed over a 12-month period, the 1045 hours amounts to approximately 90 hours/month. If we consider special requests for equipment verification prior to experiments or other unique circumstances then we might want to add a cushion of 10-20 hours/month to handle special PM. Thus we arrive at a figure of approximately 100-110 hours/month for PM, which fits the current best estimates of HFIR PM needs. Adding the 150 hours/month for unscheduled maintenance, 90 hours/month for routine PM, and 20 hours/month for special PM yields approximately 260 hours/month on the demand side.
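The demand arithmetic above, and the matching supply arithmetic of section 2.3.2, can be laid out as a small calculation. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the defaults are the figures quoted in the text (12 hours/day, 25 days/month, a 50% reduction from PM, 950 instruments at 1.1 hours each, 20 hours/month of special PM, 3 technicians at 22 days of 8 hours, 70% productive time, and a 40/60 split between unscheduled work and PM), and the paper rounds the intermediate results where the code does not.

```python
def monthly_demand(unscheduled_hours_per_day=12.0, operating_days=25,
                   reduction_from_pm=0.5, instruments=950,
                   hours_per_instrument=1.1, special_pm=20.0):
    """Hours/month of maintenance demanded (section 2.3.1 figures as defaults)."""
    unscheduled = unscheduled_hours_per_day * operating_days * (1 - reduction_from_pm)
    routine_pm = instruments * hours_per_instrument / 12   # annual PM spread over 12 months
    return {
        "unscheduled": unscheduled,
        "routine_pm": routine_pm,
        "special_pm": special_pm,
        "total": unscheduled + routine_pm + special_pm,
    }


def monthly_supply(technicians=3, work_days=22, hours_per_day=8,
                   productive_fraction=0.7, unscheduled_share=0.4):
    """Hours/month of maintenance supplied (section 2.3.2 figures as defaults)."""
    paid = technicians * work_days * hours_per_day
    available = paid * productive_fraction                 # after leave, meetings, etc.
    return {
        "paid": paid,
        "available": available,
        "unscheduled": available * unscheduled_share,
        "pm": available * (1 - unscheduled_share),
    }


if __name__ == "__main__":
    demand = monthly_demand()
    for staff in (3, 2):
        supply = monthly_supply(technicians=staff)
        print(f"{staff} technicians: demand {demand['total']:.0f} h/month, "
              f"supply {supply['available']:.0f} h/month")
```

Because the unrounded routine-PM figure is about 87 hours/month rather than 90, the computed demand total comes out slightly under the 260 hours/month quoted in the text; the comparison between demand and supply is unaffected.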

2.3.2 Supply of Maintenance Services

Prior to 1988, the HFIR had one instrument technician to handle all the maintenance; however, from 1985-1988 the reactor was shut down. The HFIR now has 3 instrument technicians and an engineering technologist assigned to maintenance tasks for the HFIR. Assuming 22 work-days/month and 8 work-hours/workday then each technician is paid for approximately 176 hours/month. If all 3 technicians worked on HFIR maintenance directly then the work would total approximately 528 hours/month.

The combination of vacation time, holidays, sick leave, and attendance at safety meetings takes up between 30-35% of the technicians' available work-time. If the technicians spend only 70% of their time on maintenance activities, the combined work time for the 3 is approximately 370 hours/month (528 × 0.7 ≈ 370).

Each technician spends approximately 40% of the work-time on unscheduled maintenance and 60% on PM. Unscheduled maintenance takes 40% of the 370 hours/month, viz, 148 hours/month. That leaves 222 hours/month for PM under the instrument foreman's rule-of-thumb. Table 1 summarizes these results.

These calculations show that the unscheduled maintenance hours have a nearly perfect equilibrium between demand and supply, but the supply of PM hours appears to exceed the combined routine and special PM demand by 112 hours/month (222 vs 110) if all 3 technicians perform routine PM. In fact, one of the technicians is expected to handle changes in equipment required for various experiments at HFIR.

If the HFIR repair staff were limited to 2 technicians then they would be paid for approximately 352 hours/month and have 70% of that time (approximately 250 hours/month) to spend on actual instrument maintenance. Adding column 1 of table 1 yields 260 hours/month, so that there might be a close match between the HFIR maintenance demand and the supply of services from 2 technicians.

Five other considerations also affect the manpower decisions for implementing a successful PM plan.

1. The 90-100 hours/month PM requirement might have left out the extra maintenance amenities that could be afforded with some extra time. For example, the technician might not have time to clean up oil or other materials used in his maintenance/calibration work. A third technician could be justified to allow for extra time to do a more thorough job with each instrument.

2. The instrument foreman's rule-of-thumb of 250 instruments/technician suggests that the HFIR staff would be slightly overworked with only 2 technicians. If the figures used in these calculations have omitted relevant work requirements that are considered in the foreman's rule-of-thumb then HFIR maintenance requirements might need to include the part-time services of the engineering technologist.

3. Limiting the maintenance to 250 instruments/technician on the HFIR might be too generous: The entire HFIR safety system with approximately 240 instruments can be serviced in 3 days. Under the PM procedures written by the field engineer, some of these instruments can be tested/calibrated simultaneously. Thus one large group of instruments on the PM plan actually requires much less time for servicing than the average for the remaining HFIR instruments.

4. Timing is important. The restarted HFIR will run for 25 days and then require a 4-day shutdown for maintenance. Some of the PM activities can be executed only during reactor shutdowns, and it is possible that only 2 technicians, even if they are working overtime, might not complete their scheduled PM tasks during the 4-day shutdown. To analyze this constraint further, the list of PM tasks must be subdivided into: a) tasks that can be completed during reactor operations, and b) tasks that require a shutdown. The I&C Division inventory list of instruments is being updated to include this information.

5. Having additional personnel allows more time for filling out maintenance paperwork. A PM plan is only as good as the input it receives [2]. If a technician completes work on an instrument and fails to report the work then the PM system acts as if that work has not been completed and omits the work from the cumulative totals. To cut down on manpower spent filing paperwork, a computer terminal has been installed on-site at the HFIR to record maintenance/service and to make the reporting requirements less tedious.

In the past, the HFIR shutdown period could last from 14 hours to 3 days. It generally overlapped with nights or weekends, and the Operations Division could not afford to pay overtime for PM. Because the instrument technician had little time to perform PM, a shutdown was required. The chance that I&C personnel might have 2 working days between the hours of 8:00 am and 4:30 pm was slim. Now the HFIR shutdown is anticipated to last a minimum of 4 days, which is announced in advance and thus possible to plan around.

TABLE 1
Demand and Supply of Maintenance Services for HFIR (hours/month)

                             Demand    Supply
Unscheduled maintenance        150       148
Routine PM                      90       222
Special PM                      20       (figured in above)

3. INSTRUMENT VALIDATION

The second major task in developing a preventive maintenance (PM) schedule is to look at alternative instrument validation techniques for allocating time. The most reasonable method of validating instruments involves local testing with monitors. For example, the field engineer or instrument technician tests display devices in the High Flux Isotope Reactor (HFIR) by disconnecting the devices from the system, connecting his own test equipment, and then applying a specified signal to see that the devices register the correct value.

Another method of instrument validation comes from determining the cause-consequence relations for the associated component failure. Knowledge of the consequences of component failures assists the planner in distinguishing actual events from faulty-sensor signals. If failure of a given component is known to cause an observable event, then failure to witness this event could indicate that the problem rests with the given sensor signal rather than being a component failure.

It is often helpful to depict graphically the cause-consequence relations, either in the form of a semantic network or a fault tree. Fault trees impose a more rigid form on the relations by: 1) passing events through Boolean logic, 2) requiring the implications to hold in both directions so that one can move up or down the fault tree, and 3) incorporating restrictions on circular or overlapping branches of the tree. Appendix B contains a partial listing of cause-consequence relations that can be easily depicted in fault trees and are contained in the HFIR quality assurance documentation.

Once the cause-consequence relations of the components have been identified, it should be possible to isolate particular events stemming from abnormal conditions in the reactor. One method of detecting the events is direct observation; another is to rely on alarms or annunciators. These events should then be compared to distinguish the difference in appearance between an abnormal condition and a seemingly abnormal condition caused by sensor failure.

In the HFIR, the operator relies mainly on the instrument readings available in the control room. Only during his once-per-shift equipment inspections will he walk around the plant to check on other instrument readings. Thus, since operators rely more on sensor signals than on actual observations to determine abnormal conditions during reactor operations, it is helpful to discuss more rigorously how sensor failure can be incorporated into conventional risk-assessment studies.

Faulty sensors can lead to 2 patterns of observed failures: false positives and false negatives. In a false-positive pattern, sensor failure can trigger an alarm of some safety system even when the true state of the system is normal. In a false-negative pattern, the sensor can fail to register some abnormal system-condition and give instead the appearance that all components are working properly.

False positives from sensor failure can be incorporated into traditional fault-trees by including another parameter for sensor failure at each step where it can have an impact.

Example 1

The 2 initiating events, A and B together, cause some consequence, C1. A Boolean equation for this relationship is

$$A \land B \Rightarrow C_1 \tag{3-1}$$

Notation

∧      Boolean AND operator
∨      Boolean OR operator
A, B   initiating events
C1     a consequence event: An alarm sounds (alarm-trip)
C2     a consequence event: A separate alarm sounds
S1     event: Sensor s1 does not calibrate correctly
S2     event: Sensor s2 calibrates correctly and is working

The terms calibrate and working are a matter of degree.

Some faulty sensor can independently cause the consequence, C1, which could be an alarm-trip. By joining an event S1 to the l.h.s. of (3-1) with a Boolean OR gate, as shown in (3-2), an alternative source can be introduced to explain observations of C1 when the true component initiating events, A and B, have not both occurred.

$$(A \land B) \lor S_1 \Rightarrow C_1 \tag{3-2}$$

Eq. (3-2) can be loosely translated as: if (both A and B are failures) and/or (sensor S1 fails) then an alarm is tripped, C1.

Example 2

False negatives can be modeled in a similar fashion with the Boolean AND operator and a sensor-failure parameter. Begin with a causal relation similar to (3-1), except that now we are interested in showing how another consequence, C2, might not be observed even when both A and B have occurred, as shown in (3-3).

$$(A \land B) \land S_2 \Rightarrow C_2 \tag{3-3}$$

That is, (3-3) shows that occurrence of both A and B is necessary but not sufficient to cause C2 to occur. With the addition of S2, the associated sensor must be both calibrated and working properly for the anticipated consequence C2 to occur. Otherwise, even if both A and B occur, the consequence C2 does not occur.

Discussion of Examples

The numerical values related to S1 & S2 derive from estimates of instrument failure-rates. Any instrument that remained in perfect calibration/repair would have: Pr(S2) = 1 and Pr(S1) = 0. However, S1 & S2 refer to 2 distinct relationships. The S1 value might come from one particular instrument and the S2 from another. If S1 & S2 are both based on the same instrument then that sensor might be sufficiently far from calibration to cause C1 to occur. In terms of probabilities derived from relative frequencies, an analyst could assign (occur, NOT occur) values of (0.1, 0.9) to S1 and (0.8, 0.2) to S2. For additional explanations, see [3].

On the HFIR one example of a false-positive signal is the fairly frequent (monthly) spurious trip of an annunciator due to electrical noise. We might know that the intended cause for the annunciator to go off is the joint event (A AND B). When the consequence (annunciator going off) is observed, but not the causes, then a preliminary hypothesis for the observation is sensor failure. The sensor might be picking up electrical noise or there might be a fault in the electrical system. If a particular annunciator, on the average, goes off 13 times a year, and 2 of the 13 times are spurious then the probabilities assigned to S1 are (2/13, 11/13).

An example of a false negative on the HFIR has occurred on the resistance-bulb calibration for the cooling tower. On several occasions, 1 of the 4 resistance bulbs has gone out of calibration, usually indicating a low temperature. If the fans are not on, or are not working properly, then information about rising water-temperatures is not correctly conveyed through the uncalibrated resistance-bulb. If, on the average, out of 280 days of operation the resistance bulb was working properly on 275 days then the probabilities assigned to S2 are (275/280, 5/280). This logic implies that in 5/280 trials, the consequence (a signal that the fans are not properly controlled) would not appear even if the true reactor component state were abnormal.

Few, if any, instruments on the HFIR have failure rates with known or relevant frequencies. Most instrument breakdowns on the HFIR are unique and hence do not lend themselves to probabilistic calculations on failure rates. On the other hand, many instruments on the HFIR need a 2-year calibration cycle. If the instrument is calibrated annually then the operator can rest assured of the accuracy of the signals coming into the control room. If the routine calibration is delayed to 2 years then the sensor signals become more questionable towards the latter part of the cycle.

Only one type of instrument on the HFIR, an early design of an operational amplifier using an electro-mechanical chopper, has breakdowns that approach a pattern. At one time the HFIR had about 100 such amplifiers in service. After the breakdown pattern was observed, it was found that they could be replaced by then state-of-the-art integrated circuits for less money than the repair/upkeep on the old amplifiers. Therefore, advances in microelectronics led to cost savings by substitution of another type of instrument rather than repair of the existing type.

Using risk analysis for instrument validation requires collecting data on individual sensor failure-rates or aging processes. In general, the failure rate and aging process depend on the routine PM plan, so that some simultaneity-bias enters these figures. The lifetime of an instrument can be extended through routine maintenance/repair, even beyond that lifetime guaranteed by the manufacturer.

One source of statistics on failure or aging rates of instruments is the log-books of repairmen, eg, service hours, nature of the problem, time for repair, frequency of repairs. For HFIR sensors, the data on failure rates are not very complete. The HFIR has been in operation since ca 1965. The first system of collecting and recording information on system repairs (MAINS) went into effect about 1976. The instrument system was changed to MAJIC ca 1986 January. MAJIC had some debugging problems with the entry of data; it did not get back into satisfactory operation until 1986 October.

Thus 10 years of data are missing from the first years of HFIR operations. The data on MAINS are available, at some effort, on hard copy. The data on MAJIC for the first 8 months apparently contain some gaps. Thus there exists no consistent data set on HFIR maintenance, and what data do exist might not be readily accessible or reliable, because certain information on repair incidence was not recorded.

What emerges from applying risk analysis to the PM scheduling problem is a perspective of ranking various accidents/events that could occur if the sensors are not properly serviced/maintained. Risk assessment combines probabilities of events with their seriousness, to arrive at a risk factor. By focusing on these 2 variables, it is possible to re-examine the design of a PM plan with a view toward eliminating all important risks, either through increased PM on these components or coupling additional safety systems to prevent the effects of an accident or a failure from spreading.
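The Boolean relations of Examples 1 and 2 and the relative-frequency estimates from the Discussion of Examples can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the relations are read as "the alarm appears exactly when the left-hand side is true", and the counts (2 spurious trips out of 13; 275 good days out of 280) are the ones quoted in the text.

```python
from fractions import Fraction
from itertools import product


def alarm_c1(a, b, s1):
    """Example 1: alarm C1 appears if (A and B) occur or sensor fault S1 occurs."""
    return (a and b) or s1


def alarm_c2(a, b, s2):
    """Example 2: alarm C2 appears only if A and B occur and sensor S2 is working."""
    return a and b and s2


def relative_frequency(occurrences, trials):
    """Return the (occur, NOT occur) probability pair from observed counts."""
    p = Fraction(occurrences, trials)
    return p, 1 - p


states = [False, True]

# False positives: C1 is observed although A and B have not both occurred.
false_positives = [(a, b, s1) for a, b, s1 in product(states, repeat=3)
                   if alarm_c1(a, b, s1) and not (a and b)]

# False negatives: A and B have both occurred but C2 is not observed.
false_negatives = [(a, b, s2) for a, b, s2 in product(states, repeat=3)
                   if (a and b) and not alarm_c2(a, b, s2)]

s1_probs = relative_frequency(2, 13)      # spurious annunciator trips per year
s2_probs = relative_frequency(275, 280)   # days the resistance bulb worked properly

if __name__ == "__main__":
    print("false-positive states (A, B, S1):", false_positives)
    print("false-negative states (A, B, S2):", false_negatives)
    print("S1 (occur, NOT occur):", s1_probs)
    print("S2 (occur, NOT occur):", s2_probs)
```

Enumerating the truth table makes the asymmetry explicit: every false positive requires the sensor fault S1, and the only false negative is the case in which both initiating events occur while S2 does not hold.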

    3 .1 Alternative Logic on Anticipated Running Time

    Morse [4] discusses the conventional logic applied to in-strument PM based on deriving a statistical distribution ofbreakdowns, then calculating lifetimes based on that distribu-tion. In general, calculation of these breakdown distributionsrequires knowing the performance of the same type of instru-ment across various environments as well as time-seriesknowledge about the history of failure rates in each environ-ment. The data on failure rates for the HFIR are not very good,but even assuming that, for example, the incidence of repairson some instrument amounted to 6jobs over the 20-year period,the paucity of data precludes useful statistical inference.Because of the lack of individual data-collection, managersof engineering systems must often trust the manufacturersdesign specifications on breakdown/recalibration rates. In theHFIR, the manufacturers specifications are taken as a lowerbound on required PM, and, when necessary, the PM on par-ticular instruments is increased over manufacturers specifica-tions - ased on first-hand experience with the instrument per-formance on the HFIR.Accordingly, many of the theoretical papers onmaintenance, in this or similar journals, assume distributionsfor instrument breakdowns to analyze the maintenance problemmore rigorously. For example, the theoretical literature oftenassumes that the breakdown follows a Weibull or lognormaldistribution. Without going into detail on the properties of eachdistribution, it is helpful to maintenance planners to see figure1which shows survivor functions for 3distributions (schedules).Schedule A shows a knife-edged distribution along somemean service time, Tm . The pattern implies that the overwhelm-ing majority of individual instruments (of that type) require ser-vice after or near time, Tm . This distribution corresponds toan instrument with a well-known breakdown time, and littledeviation from that time.

    Schedule B shows an exponential distribution which mightapply to an instrument with a variety of moving parts that canmalfunction. Or, the instrument might depend on many ad-justments that can deviate from proper working order.Schedule C shows a hyper-exponential distribution whichis more convex than the exponential distribution. It could bethe distribution for an instrument that, when perfectly calibrated,works properly for a long time. However, if not perfectlycalibrated, the instrument soon needs repair. This distributionlends itself to a bifurcation of the PM/calibration activity onthe instrument, whether it is perfect or not.Khandelwal et al. [5 ] describe a PM decision problem formachines subject to deterioration and uncertain breakdowns.The bang-bang control solution is derived from optimal-controltheory. Reliability calculations that use mean repair times, aswell as a discussion of the caveats with mean imes, are in [6,7].

    Fig. 1. Survivor Functions for 3 Types of Breakdown Distributions (horizontal axis: time, with mean service time Tm marked)
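The three schedules in figure 1 can be sketched numerically. The following is a minimal illustration, not the article's own model: the knife-edged case is approximated by a normal lifetime with a small relative spread, and the hyper-exponential case by a two-branch mixture of exponentials with the same overall mean. The function names and the parameters `spread`, `weight`, and `ratio` are hypothetical.

```python
import math

def survivor_knife_edged(t, t_mean, spread=0.05):
    """Schedule A: nearly all units survive until about t_mean, then need service."""
    # Assumed form: normal lifetime with standard deviation spread * t_mean.
    sigma = spread * t_mean
    return 0.5 * (1.0 - math.erf((t - t_mean) / (sigma * math.sqrt(2.0))))

def survivor_exponential(t, t_mean):
    """Schedule B: constant failure rate 1 / t_mean."""
    return math.exp(-t / t_mean)

def survivor_hyperexponential(t, t_mean, weight=0.5, ratio=9.0):
    """Schedule C: mixture of a short-lived and a long-lived branch, same overall mean."""
    # Branch means chosen so that weight * m_short + (1 - weight) * m_long == t_mean.
    m_short = t_mean / (weight + (1.0 - weight) * ratio)
    m_long = ratio * m_short
    return (weight * math.exp(-t / m_short)
            + (1.0 - weight) * math.exp(-t / m_long))

if __name__ == "__main__":
    for t in (0.2, 0.5, 1.0, 2.0, 5.0):
        print(t,
              round(survivor_knife_edged(t, 1.0), 4),
              round(survivor_exponential(t, 1.0), 4),
              round(survivor_hyperexponential(t, 1.0), 4))
```

In this sketch the hyper-exponential curve falls faster than the exponential at first (the poorly calibrated branch) and more slowly later (the well calibrated branch), which is the convexity that Schedule C describes.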

    3.2 Other Conceptual Issues

    Several other conceptual issues for sensor validation include the use of smart sensors, which can be either calibrated by remote control or compensated for internal error. For example, a new generation of pressure transducers contains a valve manifold that allows calibration to be adjusted once the sensor has been placed in operation. Figure 2 shows that valves can be opened during operation to create a static pressure of 1000 psi on both sides of the transducer. In the past it was necessary to calibrate the transducer in the shop at 0 psi static. The engineer then had to reset the zero point on the signal from the transducer at a static pressure of 1000 psi and hope that the span calibration remained intact at operational pressures. With the development of smart sensors, the calibration can be set to the correct value when the 1000 psi is placed on both ends of the transducer.

    Fig. 2. Pressure Transducer with 1000 psi Static Pressure Hammer (1000 psi applied to both sides of the transducer)

    Similarly, a new generation of transmitters can adjust for internal errors. This enhanced sensory capability leads to the question of whether the benefit of increased calibration with smart sensors is worth the up-front costs.

    For the HFIR it is difficult to conceive of situations or experiments where remote calibration of sensors is helpful. The need for remote calibration depends on instrument location (eg, if the instrument is underground or behind a lead shield). The instruments requiring recalibration after being placed in operation on the HFIR are generally accessible, and they have local adjustment capabilities.

    Sensors or instruments with built-in compensation capabilities could help to correct sensors that are subject to fluctuations in electrical current, or in air pressure for pneumatic instruments. A potential application at the HFIR is for signals based on other signals, such as heat power, for which a 2% deviation in the temperature and flow probes can lead to an 8% deviation in the heat-power calculation. The related issue is how much one is willing to pay to have the maximum variation in the heat-power signal reduced from 8% to 1%.

    Another conceptual issue for PM deals with the availability of spare components in inventory and the ability to service the instrument during normal operations. One-of-a-kind instruments generally have no inventory spares, and they often require a reactor shutdown before any maintenance can take place. However, many of the instruments on the safety, servo, and counting channels of the HFIR are tracked in triplicate. As a result, one panel of instruments can be removed from service and serviced while the reactor is operating.

    Finally, tradeoffs between in-place calibration/repair and removing the instrument and taking it to a shop should be included in calculations for sensor PM requirements. Moreover, some sensors require a system shutdown or the removal of various obstacles before they can be serviced. Thus while actual repair time on a particular instrument might take only 2 hours, it might take a day to remove obstacles, and up to a week before the shop has time to work on the instrument.

    4. SEARCH FOR CONGESTION PERIODS

    The objective of designing a routine preventive maintenance (PM) plan is to avoid situations in which an engineer or repairman has too many instruments or sensors to service/maintain at a given time. The incidence of congestion periods generally depends on the repair frequency of the sensors under study, the number of instruments or sensors in the study, and the priority for working on the sensors.

    PM priority should be given to those sensors and instruments whose failure can cause the most serious, as well as the most frequent, consequences. Safety considerations must prevail over convenience or cost. Consequently, development of a PM plan must consider the effects of various accident-related scenarios stemming from instrument failure. The PM plan might pose questions such as: What is the worst accident that can happen if the instruments are serviced in the current priority ranking? What is the worst event that can occur if the priority ranking is changed? After several iterations the most important sensors should be identifiable.

    Time flexibility in repairs and redundancy of sensors are 2 important factors in eliminating congestion periods. Where the sensors are redundant, or a sensor can be serviced without affecting operations, congestion periods can usually be alleviated if not eliminated. The redundancy can take the form of either at least 2 identically functioning instruments on-line at a given time, or an inventory of spare parts.

    On the High Flux Isotope Reactor (HFIR) the instruments and sensors that form the safety system are tracked in triplicate, and a safety-trip requires 2-out-of-3 to activate. Thus when one of the sensors needs PM or repair, it can be taken out of service while the HFIR is operating. Scheduling-time constraints pose the greatest difficulty on the HFIR for those sensors and instruments that have no redundancy. The non-redundant instruments include such parts as the chemical treatment and de-aerator. PM on the sensors associated with the primary pressure-system also requires scheduling during a shutdown, since the pressure-system sensors have no built-in redundancy.

    In looking at congestion problems it is helpful to take some techniques from queuing theory. If an instrument needs calibration or service but no repairman is available, then it is added to the waiting queue. The service times for the instruments are random variables. Three important parameters to consider from queuing theory are:

    1. The waiting time for repair on each instrument
    2. The busy period during which one or more repairmen are busy
    3. The queue size (number of instruments in the queue).

    Two aspects of PM on equipment distinguish it from other queuing processes:

    1. The possibility of PM introduces a simultaneity: the PM required to keep the system working is a function of the amount of PM. This characteristic further implies some control over the unanticipated nature of breakdowns, so that these instances can be controlled or reduced [8].
    2. There is a finite population that can potentially break down. Once all of these have broken down and are in the queue for repair, no more can enter the system. For most other queuing applications the effective population is infinite.

    When developing a program to ensure some PM objective (eg, all the sensors related to the primary pressure on the HFIR are properly calibrated and in working order), the PM schedule must be integrated with the service requirements for the rest of the instruments. Viewed as a separate plan to achieve some special objective, the PM plan would not reveal congestion periods that are evident when it is viewed as part of an overall schedule.

    Once a preliminary PM schedule is developed, the stability of the plan should be tested by adding some unanticipated failures of instruments. These additional exogenous shocks can show what points in the PM schedule have sufficient flexibility to accommodate unscheduled maintenance. For real PM plans, the number of unanticipated breakdowns in instruments is inversely related to the time spent on PM.

    Borrowing some concepts from perturbation theory, the planner could adjust the parameters of the model (mean time for routine PM, mean time for unscheduled maintenance, number of instruments needing special PM, number of instruments needing routine PM, number of instruments in the system, etc.) to determine the sensitivity of the PM schedule to these parameters.

    In simple queuing theory, any maintenance is assumed to restore the item to good-as-new. This assumption violates the usual capital theory, in which capital equipment wears out over time. Maintenance, in capital theory, can prolong the life of the instrument but cannot extend it indefinitely.

    To bridge the gap between the queuing theory and capital theory perspectives, at least for the HFIR, it is helpful to distinguish between calibration and some forms of repair. Recalibration generally does restore instruments to good-as-new in terms of calibration; but some forms of repair follow the capital theory view that a restored instrument is merely bad-as-old.

    Figure 3 shows the bathtub curve for the failure rate of an instrument; failure rate is on the vertical axis and time is on the horizontal axis. The early period is called infant mortality; the final period is called wearout. In between, the failure rate is more or less constant at Pi and represents the normal working period.
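The bathtub shape, and the difference between good-as-new and bad-as-old service, can be illustrated with a small sketch. The functional form below (a decaying infant-mortality term, a constant term, and a linear wearout term) and every parameter value are assumptions chosen only for illustration; they are not fitted to HFIR data.

```python
import math

def bathtub_failure_rate(age, base_rate=0.01, infant_scale=0.05,
                         infant_decay=2.0, wearout_start=20.0,
                         wearout_slope=0.005):
    """Failure rate at a given age: infant mortality + constant + wearout."""
    infant = infant_scale * math.exp(-infant_decay * age)   # decays quickly
    wearout = wearout_slope * max(0.0, age - wearout_start)  # rises late in life
    return infant + base_rate + wearout

def rate_after_service(t, service_time, good_as_new=True):
    """Failure rate at time t for an instrument serviced at service_time."""
    if good_as_new and t >= service_time:
        # Queuing-theory view: service resets the instrument's effective age.
        return bathtub_failure_rate(t - service_time)
    # Capital-theory view: service leaves the accumulated age unchanged.
    return bathtub_failure_rate(t)

if __name__ == "__main__":
    for t in (0.0, 1.0, 10.0, 25.0, 31.0):
        print(t, round(bathtub_failure_rate(t), 4))
    print(rate_after_service(31.0, 30.0, good_as_new=True))
    print(rate_after_service(31.0, 30.0, good_as_new=False))
```

Under the good-as-new assumption the serviced instrument re-enters the early, decreasing part of the curve, which is one way to represent a short period of elevated failure rate after maintenance.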

    Fig. 3. Instrument Failure Rate (vertical axis: probability of failure; horizontal axis: time)

    From a capital theory perspective, the upward sloping part of the graph is the period in which maintenance can change the magnitude but not the direction of the slope; ie, beyond a certain time, repairs are only palliative, not restorative. Put yet another way, once the instrument lifetime is used up, repairs are only a temporary fix.

    Figure 4 illustrates the queuing theory concept of complete restoration. Suppose the instrument is serviced at time tx. Rather than following the ascending dashed line, the failure rate for the instrument then jumps to the new descending schedule, followed by another long horizontal trend at failure rate Pi. The time between tx and ty represents the period in which the instrument, put back into operation, has a relatively high but decreasing failure rate due to maintenance error.

    Fig. 4. Instrument Failure Rates with Maintenance (vertical axis: probability of failure; horizontal axis: time, with tx and ty marked)

    Moving from theoretical to practical illustration, the likelihood of instruments wearing out before the engineering system itself is mothballed depends on each application. For the HFIR, most of the instruments will probably outlast the reactor; this list of instruments includes the permanent as well as the routinely maintained sensors. Therefore, it is reasonable to view most of the newly serviced instruments on the HFIR as good-as-new. However, for other engineering applications, use of the sensor may wear it out, and the capital theory approach should be used.

    One interesting point that emerged from history with reactors at Oak Ridge National Laboratory (ORNL) is that one would anticipate the pneumatic instruments to wear out before the electric instruments, because the pneumatic instruments have parts that rub against one another. Experience has shown the opposite to be true: the pneumatic instruments last longer than the electric instruments. The pneumatic instruments have soft conversions; the electric instruments have more rigid conversions. Many of the pneumatic instruments from the 1950s are still in fine working order and online at reactors in 1989. The electric instruments are less reliable. Digital electronic instruments break down from static electric charges and glitches; the analog electronic instruments are sensitive to fluctuations in ac voltage levels and temperature.

    The choice of times at which to engage in PM depends on whether the repair work is viewed from the good-as-new or the bad-as-old perspective. One can envision a control-theory problem with the objective function defined in terms of keeping the instrument in good working order, and the control variable being the amount of maintenance put into the instrument to keep it working. If the maintenance restores the instrument to good-as-new then the solution probably is bang-bang control, where the planner uses an all-or-nothing approach to maximize the instrument lifetime. In contrast, if the instrument is likely to wear out in any case then the cost of PM probably outweighs the benefits.
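The finite-population repair queue discussed in this section can be explored numerically. The following is a minimal sketch under assumptions the article does not make: times between breakdowns and repair times are taken to be exponentially distributed, so the number of instruments out of service follows a simple birth-death process. The function name and all rates are hypothetical.

```python
def finite_source_queue(n_instruments, n_technicians, failure_rate, service_rate):
    """Steady state of a repair queue with a finite population of instruments.

    failure_rate: breakdowns per working instrument per unit time (assumed).
    service_rate: repairs completed per busy technician per unit time (assumed).
    Returns (probabilities of k instruments down, mean number down,
             mean number waiting for a technician).
    """
    weights = [1.0]  # unnormalised probability of k = 0 instruments down
    for k in range(1, n_instruments + 1):
        # Only instruments still working can break down (finite population).
        arrival = (n_instruments - (k - 1)) * failure_rate
        # At most n_technicians repairs proceed at once.
        departure = min(k, n_technicians) * service_rate
        weights.append(weights[-1] * arrival / departure)
    total = sum(weights)
    probs = [w / total for w in weights]
    mean_down = sum(k * p for k, p in enumerate(probs))
    mean_waiting = sum(max(0, k - n_technicians) * p
                       for k, p in enumerate(probs))
    return probs, mean_down, mean_waiting

if __name__ == "__main__":
    # Perturb one parameter at a time to see how sensitive congestion is.
    for rate in (0.05, 0.10, 0.20):
        _, down, waiting = finite_source_queue(10, 3, rate, 1.0)
        print(rate, round(down, 3), round(waiting, 3))
```

Varying one argument while holding the others fixed gives a simple version of the sensitivity test described above; a planner could also add a few extra breakdowns to represent exogenous shocks.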

    5. SPECIAL CONSIDERATIONS

    There are five special considerations for preventive maintenance (PM).


    1. Common causes for degradation or failure that defeat redundancy. Instruments share a common power source, shelf/location, circuits, cooling source, etc. The High Flux Isotope Reactor (HFIR) has been checked for common reliance by instruments on power source or other attributes. As a general rule, similar instruments or sensors have been designed to rely upon different electrical sources so that a failure in one area does not affect the sensors in another. For example, the instruments in the safety system are tracked in triplicate. Thus core-inlet temperature is measured on panels A, B, C, with power source and wiring for panel A physically independent of panels B, C.

    A power failure to one of the panels could affect all the instruments on that panel. Each of the panels is connected to a separate battery bank that supplies electrical power to the panel in the event of a utility power failure. The HFIR also has backup generators; only if these fail to start does the HFIR rely solely on the reserve electricity in the battery banks.

    The HFIR has one important exception to the separate power source & circuitry rule: the process systems, which include the cooling-tower temperature-control and the pH treatment of primary water, do share a common power source. Moreover, in order to service a particular instrument in one of the process systems, it might be necessary to open a circuit breaker. Although drawings exist that show the relation between a particular breaker and its associated instruments, no one has studied the total impact on the HFIR facility when a family of instruments is taken out of service, for example, when a circuit breaker is opened.

    Tylee [9] evaluates the functional redundancy approach to detecting instrument failures in nuclear power plant instrumentation. His real-time method uses a bank of Kalman filters for each instrument to generate optimal estimates of the plant state. By performing consistency checks among the outputs of appropriate filters, Tylee can identify failed instruments.

    2. The number of instruments to be serviced. This number partially determines the manner in which PM is undertaken. The HFIR has 1132 instruments on the I&C Division inventory list, and of this total, approximately 850 are on the programmed PM schedule. The kind of individual attention paid to instruments, as well as the variety of problems that can be checked, may be limited by the vast number of instruments in a reactor. Given a long queue of instruments waiting for PM, a reactor instrument technician would likely have less time to spend on individual instruments and might feel pressured to complete his PM tasks. An analogy is preparing meals: the way a person serves a meal to one person differs from the way he would serve meals to an entire family. The method of serving a family, in turn, differs from the method of serving over 1000 employees.

    3. The number of instruments on the PM list. This number, more so than just the time constraint, affects a) decisions about purchasing tools for on-site repairs vs shop repairs, as well as b) the number of employees in the PM program.

    4. The extent of human interaction with the instruments. A sensor can continue giving bad readings until it is observed by some operator or technician. In one scenario, an observer notices a particular instrument or sensor malfunction, yet fails to take corrective action immediately. The observer might conclude that the instrument is unimportant and need not be calibrated/repaired immediately. Thus the instrument is allowed to remain unrepaired until such time as: a) a reading from that instrument is actually needed, or b) some accident occurs in which the instrument is needed to correct the system state. In developing a PM plan, it is helpful to determine the impact of allowing a bad sensor to remain unrepaired.

    The spare instruments kept in inventory at the HFIR can fall into this category of disuse and disrepair. The value of redundant or spare parts is questionable if they are not known to be in proper working condition. To resolve this issue, the inventory of spare instruments has been added to the PM plan. Now the I&C Division maintains closer control over the working condition of the spares.

    5. The manner in which signals from a sensor are used by the system. Sensor signals can feed directly to: a) a control system, b) a recorder, or c) a local display only. Some sensors that serve only as local gauges or instruments, and which were placed on the reactor only for convenience, have been left off the PM plan. From a risk-analysis perspective, studying various accident-related scenarios for each instrument left off the PM schedule could help predict whether such an instrument would be needed in a crisis. However, all instruments on the safety system that are part of any control system and that give output signals to the control room have been placed on the PM plan.
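The triplicated safety channels with 2-out-of-3 trip logic, and the way a common cause can defeat that redundancy (item 1 above), can be expressed in Boolean and probabilistic terms. This is a minimal sketch: it assumes the three channels fail independently with the same probability `q`, and it models a common cause as a single extra event with probability `q_common` that disables all three channels at once. Both the function names and these modelling assumptions are illustrative, not taken from HFIR documentation.

```python
from itertools import product

def trip_2oo3(a, b, c):
    """Boolean 2-out-of-3 vote: trip when at least two channels demand it."""
    return (a and b) or (a and c) or (b and c)

def prob_vote_fails(q):
    """Probability that 2 or more of 3 independent channels have failed."""
    return 3 * q**2 * (1 - q) + q**3

def prob_vote_fails_common(q, q_common):
    """Same, plus an assumed common cause that fails all channels together."""
    return q_common + (1 - q_common) * prob_vote_fails(q)

def prob_vote_fails_enumerated(q):
    """Cross-check of prob_vote_fails by enumerating all 8 channel states."""
    total = 0.0
    for failed in product([False, True], repeat=3):
        if sum(failed) >= 2:
            p = 1.0
            for f in failed:
                p *= q if f else (1 - q)
            total += p
    return total

if __name__ == "__main__":
    print(trip_2oo3(True, True, False))
    print(prob_vote_fails(0.1), prob_vote_fails_common(0.1, 0.01))
```

Even a small common-cause probability adds directly to the failure probability of the voted function, which is why shared power sources and circuits deserve attention in a PM plan.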

    6. ATTITUDES TOWARD RISK

    Preventive maintenance (PM) is not free. In general, the anticipated benefits of increased PM must be weighed against the costs. Risk assessment commonly assumes that some forms of risk, no matter how intolerable, cannot be completely eliminated. Risk assessment often delivers a list of alternatives that can reduce the probability of some accident/event so that its risk factor is acceptable. Risk factor is:

    \[ F_i = P_i \, S_i \tag{6-1} \]

    Notation

    Fi  risk factor of an event
    Pi  probability of the event
    Si  severity of the event
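A risk-factor ranking can be sketched in a few lines. The example below assumes the risk factor is the product of an event's probability and its severity, and it uses invented event names and numbers purely for illustration; `rank_by_risk_factor` is a hypothetical helper, not part of any HFIR procedure.

```python
def rank_by_risk_factor(events):
    """Rank events by risk factor, highest first.

    events: list of (name, probability, severity) tuples.
    Assumes risk factor = probability * severity.
    """
    scored = [(name, probability * severity)
              for name, probability, severity in events]
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical events: (name, probability per year, severity in dollars).
    example = [
        ("gauge drifts", 0.5, 1.0),
        ("pump trips", 0.1, 10.0),
        ("loss of cooling", 0.01, 1000.0),
    ]
    for name, factor in rank_by_risk_factor(example):
        print(name, factor)
```

A rare but severe event can outrank a frequent but minor one, which is consistent with giving PM priority to instruments whose failure has the most serious consequences.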

    The planner for the PM schedule has an objective function that uses both the severity and the probability of an accident to influence which instruments are on the PM schedule and what their priorities are. The assumptions about risk attitudes influence why some managers use more PM than others do.

    Consider the operation of an engineering system with no PM plan and only one type of accident. The system components are repaired on a bare-bones approach. The costs are measured in terms of the severity of the accident and quantified in dollars, the same as benefits.

    Notation

    U  planner's objective function, viz, utility; U is a function of benefit
    B  benefit from an engineering system when running with no PM, and no accident
    C  cost to operations from an accident
    p  probability of an accident

    The s-expected (average) utility from running an engineering system without a PM plan is:

    \[ E\{U\} = (1 - p)\,U(B) + p\,U(B - C) \tag{6-2} \]

    When E{U} exceeds the s-expected (average) utility from implementing a PM plan, then the engineering-system operators undertake a bare-bones approach to PM. Therefore, any change that increases E{U} increases the incidence of operations without any PM. Similarly, decreases in E{U} increase the benefits from adopting a PM plan.

    The statement, "The likelihood of an accident has proportionally more impact than the severity of an accident on the decision to run without a PM plan," can be expressed in terms of derivatives:

    \[ -\frac{\partial E\{U\}}{\partial p}\,\frac{p}{E\{U\}} > -\frac{\partial E\{U\}}{\partial C}\,\frac{C}{E\{U\}} \tag{6-3a} \]

    From (6-2), the derivative of E{U} with respect to p is -[U(B) - U(B - C)], and the derivative with respect to C is -p U'(B - C), so (6-3a) becomes

    \[ \frac{[U(B) - U(B - C)]\,p}{E\{U\}} > \frac{p\,U'(B - C)\,C}{E\{U\}} \tag{6-3b} \]

    Hence (6-3) holds if

    \[ \frac{U(B) - U(B - C)}{C} > U'(B - C). \tag{6-4} \]

    Inequality (6-4) simply requires U to be convex; ie, the planner prefers risk over the certainty equivalent. For example, if given a choice between a 60/40 lottery of receiving $100 or $0, and a certain $60 ($60 = 0.6 x $100 + 0.4 x $0), a risk averse person will, by definition, choose the $60, while a person who prefers risk will, by definition, prefer the fair lottery. If the odds remain the same but the certain payoff is lowered to an unfair $50, then the marginally risk averse person might accept the lottery over the slightly unfair certain payoff.

    Are operators of engineering-systems risk averse, or do they prefer risk? The answer most likely varies across industry because risk-bearing can be a source of profits in the private sector. If the question is limited to nuclear plants then all personnel associated with managing the plant are risk averse. In addition, regulatory constraints added to the objective function essentially eliminate all gains from operating without a working PM plan. Even if the reactor operated without incident, the administrators would be penalized for failing to take adequate precautions.

    Additional inferences can be drawn from this model. For example, increasing the severity of the accident by raising the cost thereof, or increasing the probability of an accident, will lower the s-expected (average) utility from operating without a PM plan. To the extent that managers have other regulatory constraints to satisfy, a PM plan might be implemented in any case. If the plant managers are risk averse then the model in this section suggests that an increase in the probability of accidents has a more important impact than an increase in the severity of accidents to induce more managers to incorporate PM plans. If the managers prefer risk then the severity becomes more important than the probability of an accident in determining who implements PM.

    ACKNOWLEDGMENT

    Funding for this research was provided by appointment to the US Department of Energy Laboratory Cooperative Postgraduate Research Training Program administered by Oak Ridge Associated Universities. Don Asquith, Field Engineer for the High Flux Isotope Reactor (HFIR), provided invaluable assistance and most of the information on experiences with the HFIR reported herein. I also thank Bill Zabriske, Charlie Allen, and two anonymous referees for their helpful suggestions.

    APPENDIX A: Preventive Maintenance (PM) for the Primary Pressure System on the HFIR

    The primary pressure system on the HFIR comprises the following instruments:

    1. Channel A Flux - measured 0 to 150%
    2. Channel B Flux - measured 0 to 150%
    3. Channel C Flux - measured 0 to 150%
    4. FM258 Letdown Cleanup Flow - measured 0 to 200 gpm
    5. HICM377 Secondary Flow Control Valve - demand signal showing 0 to 100% closed for the 36 inch valve
    6. HICM377A Secondary Flow Control Valve - demand signal showing 0 to 100% closed for the 10 inch valve (attached to the inlet temperature controller)
    7. FM216 Pressurizer Pump Flow - measured 0 to 200 gpm; this flow is measured after the letdown flow has passed through chemical processing and is returning to the primary system.
    8. PM127 Primary Pressure - measured 0 - 1500 psi. This sensor actually measures from 3 - 15 lbs, which it then translates to the 0 - 1500 psi scale.
    9. PM127A Pressure Control Valve Position - measured 0 to 100% open; a demand signal for the valve to open, not a feedback signal from the valve itself.
    10. #1 Primary Flow - measured 0 - 20 000 gpm
    11. #2 Primary Flow - measured 0 - 20 000 gpm
    12. #3 Primary Flow - measured 0 - 20 000 gpm
    13. #1 Inlet Temperature - measured 75 - 200 degrees F
    14. #2 Inlet Temperature - measured 75 - 200 degrees F
    15. #3 Inlet Temperature - measured 75 - 200 degrees F
    16. #1 Outlet Temperature - measured 75 - 200 degrees F
    17. #2 Outlet Temperature - measured 75 - 200 degrees F
    18. #3 Outlet Temperature - measured 75 - 200 degrees F
    19. FM300 Secondary Flow - measured 0 - 25 000 gpm
    20. TM310B1 Cooling Tower Inlet Temperature - measured 20 - 120 degrees F; both this signal and the cooling tower outlet temperature are measured by a resistance bulb.
    21. TM310A1 Cooling Tower Outlet Temperature - measured 20 - 120 degrees F
    22. #1 Rod Position - measured 0 - 27 inches
    23. #2 Rod Position - measured 0 - 27 inches
    24. #3 Rod Position - measured 0 - 27 inches
    25. #4 Rod Position - measured 0 - 27 inches
    26. #5 Rod Position - measured 0 - 27 inches

    Two other sensors, not presently linked to any other system, are:

    FM128 Low Pressure - local meter, provides no signal
    FM104 Backup Pressure Sensor to FM127 - appears as digital LED in control room

    Notation

    psi  pounds per square inch, gauge pressure
    gpm  gallons per minute
    F    Fahrenheit
    lbs  pounds
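Item 8 in the list above describes a sensor whose 3 - 15 lb signal is translated to a 0 - 1500 psi scale. Assuming that translation is linear (the article does not say so explicitly), it can be sketched as follows; the function name and default arguments are hypothetical.

```python
def pneumatic_to_pressure(signal, sig_lo=3.0, sig_hi=15.0,
                          out_lo=0.0, out_hi=1500.0):
    """Map a 3-15 lb pneumatic signal onto the 0-1500 psi scale.

    Assumes a linear relation between the signal and the displayed pressure.
    """
    if not sig_lo <= signal <= sig_hi:
        # A reading outside the signal range suggests a fault or miscalibration.
        raise ValueError("signal outside the expected 3-15 range")
    fraction = (signal - sig_lo) / (sig_hi - sig_lo)
    return out_lo + fraction * (out_hi - out_lo)

if __name__ == "__main__":
    for s in (3.0, 9.0, 15.0):
        print(s, pneumatic_to_pressure(s))
```

With a linear mapping, a zero-point error in the signal shifts every reading by the same amount while a span error scales with pressure, which is why zero and span are checked separately during calibration.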

    Summary of Which Sensors are Routinely Calibrated and How Often

    PM on the 3 flux channels can be separated into PM on the ion chamber and PM on the instrument itself. Each of the three ion chambers is serviced on a 3 year basis, staggered so that one chamber is serviced each year. The PM check takes only about 2 hours; the chambers are readily accessible. The instrument itself is serviced every 6 months, and this service effort requires between 2 - 3 hours. The HFIR has a total of 9 flux sensors: 3 on the safety, 3 on the servo, and 3 on the counting channels.

    The letdown cleanup flow, pressurizer pump flow, and three primary flow sensors are not calibrated. The manufacturer's specifications are taken as true and accurate. The devices are installed and not generally serviced. This practice is particularly true of the three primary flow sensors, which are on the venturi (an hour-glass shaped tube with an orifice for measuring flow), are permanent, and are not calibrated. The secondary flow is measured by a Dahl tube (a funnel-shaped tube with an orifice), and is not calibrated.

    The core-inlet temperature sensors are routinely PMd only on the safety side. Each of the three transmitters is serviced annually and requires about 30 minutes for calibration. The resistance bulbs are on a 3 year plan, staggered so that only one bulb is serviced in a given year. It is quite an ordeal to check the calibration of these bulbs. First the primary water must be drained from the system, which requires a minimum of 8 hours. Once the bulb is removed it is sent to the Standards Office, which is located in a building about 1.5 miles from the HFIR, to be immersed in a bath. The resistance bulbs rarely if ever show signs of drifting out of calibration. But since these bulbs are part of the safety system, they are serviced just to be sure that they have stayed in calibration.

    The core outlet temperature sensors are not part of the safety system; hence, they are not routinely serviced like the core inlet temperature sensors. Neither the HICM377 36 inch controller valve nor the HICM377A 10 inch controller valve requires routine PM. Repairs on the valves, if ever needed, are split between Instrumentation & Controls personnel for the top part of the valve (the controller box) and Plant & Equipment personnel for the valve itself.

    Primary pressure PM127 is serviced annually. The calibration check takes about 1 hour, but it may take all day before the sensor can be removed and taken to the shop. The PM127A pressure controller valve is serviced annually, and the PM check takes about one hour to complete; the valve is easily accessed in the control room.

    The 5 rod-position sensors are not on a routine PM schedule. Because of the way they are locked into place by bolts, they do not drift out of calibration over time. The only part that requires servicing is the pointer, which is visually compared to a yardstick in the subpile room during each restart.

    The cooling tower inlet temperature TM310B1 sensor contains one transmitter and one resistance bulb. The cooling tower outlet temperature TM310A1 sensor contains 4 transmitters and 4 resistance bulbs, with the sensor reading being the average of the four. The transmitters are serviced annually. Because of the location of the resistance bulbs, it is not economically feasible to service them on a routine basis.
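The staggered 3 year plan described above for the ion chambers and resistance bulbs can be written as a simple rotation. This is only a sketch: the unit labels and the starting year are hypothetical, and the real schedule may be anchored to different dates.

```python
def staggered_schedule(units, first_year, n_years):
    """Assign one unit per year in rotation, so each is serviced every len(units) years."""
    return {first_year + i: units[i % len(units)] for i in range(n_years)}

if __name__ == "__main__":
    # Hypothetical labels for the three ion chambers on the flux channels.
    plan = staggered_schedule(["chamber A", "chamber B", "chamber C"], 1989, 6)
    for year, unit in plan.items():
        print(year, unit)
```

Spreading the three units over three years keeps the annual workload level and leaves two units untouched in any given year.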

    APPENDIX B. Illustration of Cause-Consequence Relations Contained in HFIR Quality Assurance Documentation

    Each event below is followed by its causes, which are combined through an OR gate.

    Event: primary cleanup pump fails to start on request
    Causes:
    1. basic pump failure
    2. basic motor failure
    3. overload relay tripped
    4. control room switch turned off
    5. local switch turned off
    6. diesel engine #1 fails AND normal power outage
    7. auxiliary contact on transfer switch fails AND normal power outage
    8. flow switch 217 fails AND primary cleanup flow drops below 75 gpm

    Event: reduction in primary cleanup flow
    Causes:
    1. primary cleanup pump fails to start on request

    Event: primary recirculating pump seals wear
    Causes:
    1. reduction in primary cleanup flow

    Event: primary cleanup pump fails off during operation
    Causes:
    1. basic motor failure
    2. overload relay tripped
    3. control room switch turned off
    4. local switch turned off
    5. timer relay TR-2 fails after flow switch 217 clears
    6. timer relay TR-2 fails after auto transfer switch #1 goes back to normal

    Event: pump bowl leak
    Causes:
    1. basic pump failure
    2. mechanical seal failure
    3. pump vent open
    4. pump drain valve open

    Event: bearings and seals fail after extended use
    Causes:
    1. loss of cooling water

    Event: fuel-cladding failure
    Causes:
    1. sufficient flow blocked or diverted away from fuel region
    2. power transients occur

    Event: reactivity-control lost or hindered
    Causes:
    1. control plates are jammed
    2. extension tubes are jammed
    3. shock tubes are jammed
    4. tracks are jammed
    5. moderator shifts to alter the core flux distribution
    6. reflector material shifts to alter the core flux distribution

    Event: power transients occur
    Causes:
    1. reactivity-control lost or hindered
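The cause-consequence relations in this appendix translate directly into Boolean expressions. The sketch below encodes only the first event and its two downstream consequences; the dictionary keys are hypothetical names invented for this illustration, not identifiers from the HFIR quality assurance documentation.

```python
BASIC_EVENTS = [
    "pump_failure", "motor_failure", "overload_relay_tripped",
    "control_room_switch_off", "local_switch_off", "diesel_1_fails",
    "normal_power_outage", "transfer_switch_contact_fails",
    "flow_switch_217_fails", "cleanup_flow_below_75_gpm",
]

def all_clear():
    """State in which no basic event has occurred."""
    return dict.fromkeys(BASIC_EVENTS, False)

def pump_fails_to_start(s):
    """OR gate over the eight listed causes; three of them are AND pairs."""
    return any([
        s["pump_failure"],
        s["motor_failure"],
        s["overload_relay_tripped"],
        s["control_room_switch_off"],
        s["local_switch_off"],
        s["diesel_1_fails"] and s["normal_power_outage"],
        s["transfer_switch_contact_fails"] and s["normal_power_outage"],
        s["flow_switch_217_fails"] and s["cleanup_flow_below_75_gpm"],
    ])

def reduction_in_cleanup_flow(s):
    """Consequence of the pump failing to start on request."""
    return pump_fails_to_start(s)

def recirculating_pump_seals_wear(s):
    """Consequence of reduced primary cleanup flow."""
    return reduction_in_cleanup_flow(s)

if __name__ == "__main__":
    state = all_clear()
    state["diesel_1_fails"] = True
    print(recirculating_pump_seals_wear(state))   # diesel failure alone
    state["normal_power_outage"] = True
    print(recirculating_pump_seals_wear(state))   # diesel failure AND outage
```

Encoding the relations this way makes it easy to ask which single instrument failures, or which pairs, are sufficient to produce a given consequence.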

    REFERENCES

    [1] Winfrid G. Schneeweiss, The failure of systems with dependent control, IEEE Trans. Reliability, vol R-35, 1986 Dec, pp 512-517.
    [2] J. B. Fussell, J. S. Arendt, System reliability engineering methodology: A discussion on the state of the art, Nuclear Safety, vol 20, Sep-Oct 1979, pp 541-550.
    [3] M. A. S. Guth, A probabilistic foundation for vagueness and imprecision in fault tree analysis, revision submitted to IEEE Trans. Reliability, 1988, (TR87-042/1).
    [4] P. M. Morse, Queues, Inventories and Maintenance, John Wiley & Sons, 1958.
    [5] D. N. Khandelwal, Jaydev Sharma, L. M. Ray, Optimal periodic maintenance policy for machines subject to deterioration and random breakdown, IEEE Trans. Reliability, vol R-28, 1979 Oct, pp 328-330.
    [6] S. E. Emoto, R. E. Schafer, On the specification of repair time requirements, IEEE Trans. Reliability, vol R-29, 1980 Apr, pp 13-16.
    [7] John M. Sheppard, Discussion of On the specification of repair time requirements, IEEE Trans. Reliability, vol R-30, 1981 Apr, pp 36-37.
    [8] L. Takacs, Introduction to the Theory of Queues, Oxford University Press, 1962.
    [9] J. Louis Tylee, On-line failure detection in nuclear power plant instrumentation, IEEE Trans. Automatic Control, vol AC-28, 1983 Mar, pp 406-415.


    AUTHOR

    Dr. M. A. S. Guth; RJO Enterprises; 116 Oklahoma Avenue; Oak Ridge, Tennessee 37830-8604 USA.

    Michael Anthony Stephen Guth was born in Oak Ridge, Tennessee on 1962 August 1. He completed his BA (Economics) from Rice University in 1982, his MS (Economics) from California Institute of Technology in 1984, and his PhD (Economics) from the University of Tennessee in 1988. He worked as a system analyst and economist at the NASA Jet Propulsion Laboratory from 1982 - 1984, as an economist and postgraduate research fellow at Oak Ridge National Laboratory from 1985 - 1988, and since 1988 July as a Senior Technical Specialist with RJO Enterprises. His research interests include uncertainty theory, risk evaluation, and mathematical modeling of decision processes. He is a member of the American Economics Association and the Operations Research Society of America.

    Manuscript TR87-704 received 1987 October 8; revised 1988 September 1.

    IEEE Log Number 24512
