
State Probability of a Series-parallel Repairable System

with Two-types of Failure States

Gregory Levitin

Reliability Department, Planning, Development and Technology Division,

Israel Electric Corporation Ltd., P.O. Box 10, Haifa, 31000 Israel

Tieling Zhang, Min Xie

Department of Industrial and Systems Engineering

National University of Singapore, Singapore 117576

ABSTRACT

This paper presents a method for the analysis of series-parallel safety-critical systems whose states can be classified as failure-safe or failure-dangerous. The method combines a Markov chain with the universal generating function technique. In the

model considered, both periodic inspection and repair (perfect and imperfect) of system

elements are taken into account. The system state distributions and the overall system

safety function are derived based on the developed model. The proposed method is

applicable to complex systems for analyzing state distributions and it is also useful in

decision-making such as determining the optimal proof-test interval or repair resource

allocation. An illustrative example is given.

Keywords: Availability, Safety-critical system, Markov model, Universal generating function,

Periodic inspection, Failure-safe, Failure-dangerous


1. Introduction

Safety is of paramount concern for large and complex systems such as nuclear power and

chemical processing plants, aircraft navigation control system, power transmission and

high speed railway networks, and so on. The complexity of large systems raises many

important problems concerning safety such that it may be very difficult or even impossible

to ensure that the systems will always behave as expected under all foreseeable conditions.

Dangerous faults may be caused by not only random hardware failure but also systematic

faults inadvertently designed into the system. Safety analysis or risk assessment for such a

system thus becomes a complex problem that involves study of human factors (human

error), production process, manufacturing control, on-line measurement or test and repair,

diagnosis with periodic inspections and so on. See Dominguez-Garcia et al. (2006), DeLong

et al. (2005), Cowing et al. (2004), Marseguerra et al. (2004), Burgazzi (2003) for some

related discussions on the recent reliability related research for safety-critical systems.

The use of safety-critical systems represents taking proactive measures to prevent a

process plant from occurrence of dangerous events. For example, emergency shutdown

controllers are widely used in chemical processing industry. Their function is to monitor a

plant process and to identify if the process is operating within the acceptable limits. If the

process moves outside of an acceptable operation range, the controller automatically shuts

the process down in a safe manner (Bukowski, 2001). To analyze safety-critical systems properly, dangerous and non-dangerous failures should be distinguished; they correspond to the failure-dangerous and failure-safe states of the system, respectively.

The international standard IEC 61508 (1998) includes two frameworks: One is risk

reduction with Safety-Related System (SRS) and the other is the Overall Safety Life-cycle.

Since its publication, it has been widely adopted in various safety related studies and

applications (see, e.g., Faller, 2004, Hokstad and Corneliussen, 2004, Zhang et al., 2003,

Nunns, 2000, and Knegtering and Brombacher, 1999). A typical architecture of SRS is regarded to consist

of components with diagnosis and periodic inspection, where the failures in each

component are classified into detectable and undetectable. There are a number of studies

on safety-critical systems which correspond to different specific system structures, see,

e.g., some recent references such as Kang and Jang (2006), Kim et al. (2005), Weber et al.

(2005), Lee et al. (2004), Latif-Shabgahi (2004) and Son and Seong (2003).


Periodic inspection is important for safety-critical systems and it has been studied in

reliability analysis in general (see, e.g., Cui et al., 2004, Biswas, 2003, Bris et al., 2003,

Bukowski, 2001). In various studies of safety-critical system performance, the effects of periodic inspection have been either ignored or modeled by assigning considerably longer average repair times to unrecognized degraded states (Zhang et al., 2003, 2006). In practice, an unrecognized fault cannot be repaired until the next periodic inspection (proof-test), so the repair of such faults is carried out at predetermined times. However, only very few studies have addressed this problem. Bukowski (2001) gives a method of

incorporating periodic inspection and repair into Markov model in which both perfect and

imperfect inspection and repair can be modeled. However, in Bukowski (2001), the

situation that both unrecognized and recognized degraded states may exist simultaneously

was not included in the Markov model. Because an unrecognized failure can only be found at a periodic inspection, the two kinds of faults can coexist for some period of time.

The purpose of this paper is to present a method for evaluating the probabilities of

failure-safe and failure-dangerous states for arbitrary complex series-parallel systems with

imperfect diagnostics and imperfect periodic inspections and repairs of elements. Each kind of element failure, whether failure-safe or failure-dangerous, can be either detected or undetected. The emphasis is on the exact state probability or availability of such a system. See Bowles and Dobbins (2004), Chandrasekhar et al. (2004) and Carrasco (2004) for some related studies of other systems.

The remainder of this paper presents a Markov model for determining the state distribution of a single system element (Section 2), the universal generating function technique for determining the state distribution of the entire system (Section 3), and an illustrative example (Section 4).

Acronyms & Notations

FD failure-dangerous state

FS failure-safe state

W operational state

G set of states of element (system): G = {W, FS, FD}

φ structure function

φpar structure function for elements connected in parallel

φser structure function for elements connected in series

Sj random discrete state variable of element j

sjk k-th realization of Sj: sjk ∈ G

Fd detected failure

Fu undetected failure

FDd detected failure-dangerous

FDu undetected failure-dangerous

FSd detected failure-safe

FSu undetected failure-safe

pfd probability of failure on demand

pfdD probability of failure-dangerous on demand

pfdS probability of failure-safe on demand

Λ system transition rate matrix

0k zero column vector of size k×1

1k unit column vector of size k×1

PW(t), PFS(t), PFD(t) probability that a subsystem or the entire system is in state W, FS, FD at time t

λsd, λdd, λdu, λsu failure rate of FSd, FDd, FDu, FSu

μsd, μdd, μdu, μsu repair rate of FSd, FDd, FDu, FSu

d fraction of detected failures that are detected correctly

TI proof-test interval

Assumptions

1. System is composed of elements and each element can experience two categories of

failures: Dangerous and non-dangerous, corresponding respectively to failure-

dangerous and failure-safe events. Failure-dangerous and failure-safe events are

independent.

2. Both categories of failures can be detected and undetected.

3. Detected and undetected failures constitute independent events.

4. Failure rates for both kinds of failures are constant.

5. The element is in operation state if no failure event (detected or undetected) has

occurred.

6. The element is in failure-safe state if at least one non-dangerous failure (detected or

undetected) has occurred and no dangerous failure has occurred.


7. The element is in failure-dangerous state if at least one dangerous failure (detected or

undetected) has occurred.

8. The elements are independent and can undergo periodic inspections at different times.

9. The state of any composition of elements is unambiguously defined by the states of

these elements and the nature of elements interaction in the system.

10. The elements’ interaction is represented by series-parallel block diagram.

2. State distribution of single system element

According to IEC 61508, the typical system structure is composed of elements to which

diagnosis and periodic inspection and repair are applied. Failure-safe or failure-dangerous

events can occur independently. The failure category depends on the effects of a fault

occurrence. For example, if a failure results in shutdown of a properly operating process, it is of the failure-safe (FS) type. This type of failure is variously referred to as a false trip or a false alarm. However, if a safety-critical system fails to perform an operation that is required to shut down a process, hazardous results could follow; an example is the failure of a monitor that controls an important process. This type of failure is generally called failure-dangerous (FD).

Both FS and FD events can be detected or undetected. A detected failure is detected instantly by diagnostic devices. An imperfect diagnosis model presumes that a fraction d of detected failures can be detected instantaneously by diagnostic devices. Whenever a failure of this kind is detected, on-line repair is initiated. Failures that cannot be detected by the diagnostic devices, or that remain undetected because of imperfect diagnosis, are considered undetected failures. These failures can be found only by the proof-test (periodic inspection) at the end of a proof-test interval. We assume that the failure rates of detected failure-safe and failure-dangerous events (λsd and λdd, respectively) and of undetected failure-safe and failure-dangerous events (λsu and λdu, respectively) can be calculated or elicited from tests.

The state of any single element can be represented as combination of two independent

states corresponding to detected and undetected failures. Each of the two failures can be in

three different states of no failure (state W), failure of category FS and failure of category

FD. According to assumptions 5-7, the state of each element can be determined based on

each combination of states of failures using Table 1.


Table 1. States of single element.

The state of each element j can be represented by a discrete random variable Sj that

takes values from the set G = {W, FS, FD}. In order to obtain the element state

distribution pjW = Pr(Sj = W), pjFS = Pr(Sj = FS) and pjFD = Pr(Sj = FD), one should sum the probabilities of all combinations of states of detected and undetected failures that result in the element states W, FS and FD, respectively. Based on element state transition analysis, one can obtain the Markov state transition diagram presented in Fig. 1. In this diagram, each possible combination of the states of detected and undetected failures (marked inside the circles) belongs to one of the three sets corresponding to the three different states of the element defined according to Table 1.
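The mapping from a pair of failure states to an element state, as given by assumptions 5-7, can be sketched in a few lines. This is an illustrative sketch only; the function name `element_state` and the string labels are assumptions made here, not names from the paper.

```python
# States of a single failure channel (detected or undetected).
W, FS, FD = "W", "FS", "FD"


def element_state(detected: str, undetected: str) -> str:
    """Combine the detected- and undetected-failure states into an element state."""
    if FD in (detected, undetected):
        return FD  # assumption 7: any dangerous failure dominates
    if FS in (detected, undetected):
        return FS  # assumption 6: a safe failure and no dangerous failure
    return W       # assumption 5: no failure of either kind


# Enumerate the nine combinations that become the nine Markov states.
combos = [(d, u) for d in (W, FS, FD) for u in (W, FS, FD)]
groups = {s: [c for c in combos if element_state(*c) == s] for s in (W, FS, FD)}
for state, members in groups.items():
    print(state, len(members), members)
```

The enumeration groups the nine combinations into one operational, three failure-safe and five failure-dangerous combinations, which is the partition the Markov diagram relies on.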

In practice, no repair action is applied to an undetected failure until the next proof-test. In general, the periodic inspection and repair take a very short time compared with the proof-test interval TI, and the whole system stops operation (is in the down state) during periodic inspection and repair. Therefore, it is reasonable to set the repair rates for undetected failures to μdu = μsu = 0 when analyzing the behavior of a safety-critical system within the proof-test interval (unlike the equivalent repair rates for μdu and μsu used in Zhang et al. (2003)).

                      Detected failure
Undetected failure    W      FSd    FDd
W                     W      FS     FD
FSu                   FS     FS     FD
FDu                   FD     FD     FD

Fig. 1. Markov state transition diagram used for calculating state distribution of a single element.

According to Fig. 1, the following group of equations describes the element’s

behavior:

$$\dot{P}_j(t) = P_j(t)\,\Lambda_j \tag{1}$$

where Pj(t) = (pj1(t), pj2(t), …, pj9(t)) is the vector of state probabilities, $\dot{P}_j(t)$ is the derivative of Pj(t) with respect to t, and Λj is the transition rate matrix (see Appendix). According to Table 1, state 1 in the Markov diagram corresponds to state W of the element, states 2 - 4 correspond to state FS of the element and states 5 - 9 correspond to state FD of the element. Having the solution Pj(t) of Eq. (1) for any element j, one can obtain pjW = pj1, pjFS = pj2 + pj3 + pj4 and pjFD = pj5 + pj6 + pj7 + pj8 + pj9. The solution of Eq. (1) can be expressed as

$$P_j(t) = P_j(0)\exp(\Lambda_j t), \quad t \ge 0; \tag{2}$$

$$P_j(t) = P_j(nT_I^{+})\exp\bigl(\Lambda_j (t - nT_I)\bigr), \quad nT_I^{+} \le t < (n+1)T_I^{+}, \quad n = 0, 1, 2, \ldots$$

To account for imperfect inspection and repair, it is assumed that an undetected fault may not be repaired to an as-good-as-new condition, so some faults may still exist after inspection and repair. A matrix Mji is used to describe this behavior. Each entry of Mji is the probability that the element moves from one state to another as a result of the i-th proof-test. Thus, we have

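$$P_j(T_I^{+}) = P_j(T_I)\,M_{j1} = P_j(0)\exp(\Lambda_j T_I)\,M_{j1} \tag{3}$$

$$P_j(2T_I^{+}) = P_j(2T_I)\,M_{j2} = P_j(0)\exp(\Lambda_j T_I)\,M_{j1}\exp(\Lambda_j T_I)\,M_{j2}$$

$$\begin{aligned}
P_j(nT_I^{+}) &= P_j(nT_I)\,M_{jn} \\
&= P_j((n-1)T_I^{+})\exp(\Lambda_j T_I)\,M_{jn} \\
&= P_j((n-2)T_I^{+})\exp(\Lambda_j T_I)\,M_{j(n-1)}\exp(\Lambda_j T_I)\,M_{jn} \\
&= \cdots \\
&= P_j(0)\exp(\Lambda_j T_I)\,M_{j1}\exp(\Lambda_j T_I)\,M_{j2}\cdots\exp(\Lambda_j T_I)\,M_{j(n-1)}\exp(\Lambda_j T_I)\,M_{jn}, \quad n = 1, 2, 3, \ldots
\end{aligned} \tag{4}$$

[Fig. 1, diagram not reproduced: nine states labelled (detected, undetected). Set W: (W, W). Set FS: (W, FSu), (FSd, W), (FSd, FSu). Set FD: (W, FDu), (FDd, W), (FSd, FDu), (FDd, FSu), (FDd, FDu). The transitions are labelled with the failure and repair rates.]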

In Eq. (4), n is the index of the proof-test interval and Mji (i = 1, 2, 3, …, n) is the matrix associated with the i-th proof-test.
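The single-element calculation of Eqs. (1)-(4) can be sketched numerically as follows. This is a minimal pure-Python sketch, not the authors' implementation: the state ordering, the helper names (`generator`, `expm`, `propagate`, `grouped`), the perfect-inspection matrix and the conversion of 8760 hours per year are all assumptions made here for illustration, and the rates from Section 4 are taken to be per-hour values.

```python
from itertools import product

W, FS, FD = "W", "FS", "FD"
# Nine Markov states as (detected, undetected) pairs; this ordering is the
# sketch's own and need not match the state numbering of Fig. 1.
STATES = list(product((W, FS, FD), repeat=2))
INDEX = {s: i for i, s in enumerate(STATES)}


def element_state(detected, undetected):
    """Table 1: FD dominates, then FS, otherwise W."""
    if FD in (detected, undetected):
        return FD
    return FS if FS in (detected, undetected) else W


def generator(lam_sd, lam_dd, lam_su, lam_du, mu_sd, mu_dd):
    """Transition rate matrix with zero repair rates for undetected failures."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for (d, u), i in INDEX.items():
        moves = []
        if d == W:  # a detected failure may occur
            moves += [((FS, u), lam_sd), ((FD, u), lam_dd)]
        else:       # a detected failure is under on-line repair
            moves.append(((W, u), mu_sd if d == FS else mu_dd))
        if u == W:  # an undetected failure may occur
            moves += [((d, FS), lam_su), ((d, FD), lam_du)]
        for target, rate in moves:
            q[i][INDEX[target]] += rate
            q[i][i] -= rate
    return q


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def expm(a, terms=18):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    k = 0
    while norm / 2 ** k > 0.5:
        k += 1
    scaled = [[x / 2 ** k for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(k):
        result = matmul(result, result)
    return result


def propagate(p0, q, ti, inspections):
    """Eq. (4): alternate exp(Lambda*TI) with one inspection matrix per proof-test."""
    step = expm([[x * ti for x in row] for row in q])
    p = list(p0)
    for m in inspections:
        p = matmul(matmul([p], step), m)[0]
    return p


def grouped(p):
    """Sum the nine state probabilities into the element states W, FS, FD."""
    out = {W: 0.0, FS: 0.0, FD: 0.0}
    for prob, (d, u) in zip(p, STATES):
        out[element_state(d, u)] += prob
    return out


# Fuel supply system rates from Section 4, taken here as per-hour values.
q = generator(2.56e-5, 8.9e-6, 1e-5, 1e-6, 0.25, 0.0833)
ti_hours = 1.5 * 8760                      # assumes 8760 hours per year
p0 = [1.0] + [0.0] * 8                     # start in (W, W)
before_test = matmul([p0], expm([[x * ti_hours for x in row] for row in q]))[0]
perfect = [[1.0] + [0.0] * 8 for _ in STATES]   # hypothetical perfect proof-test
after_test = propagate(p0, q, ti_hours, [perfect])
print(grouped(before_test))
print(grouped(after_test))
```

Building the generator from state labels rather than from fixed indices keeps the grouping into W, FS and FD independent of any particular state numbering. An imperfect proof-test would be modeled by passing a row-stochastic matrix other than `perfect` to `propagate`.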

3. State distribution of the entire series-parallel system

In order to obtain the state distribution of the entire system, the procedure used in this paper is based on the universal generating function (u-function) technique. This method was introduced in Ushakov (1987) and has been shown to be very effective for the reliability evaluation of different types of multi-state systems; see Levitin et al. (1998) and Lisnianski and Levitin (2003). A comprehensive description of the method and its numerous applications in reliability engineering can be found in Levitin (2005). For some recent and related applications, see, e.g., Levitin (2004, 2005) and Korczak et al. (2005).

The u-function of a discrete random variable Y is defined as a polynomial

$$u(z) = \sum_{k=1}^{K} q_k z^{y_k}, \tag{5}$$

where the variable Y has K possible values and qk is the probability that Y takes the value

of yk. In our case, the polynomial u(z) can define state distributions, i.e. it represents all of

the possible mutually exclusive states of the element (or any subsystem) by relating the

probabilities of each state to the value that takes the random state variable corresponding

to this element (subsystem) in that state. Note that the performance distribution of the

basic element j (probability mass function of discrete random variable Sj) can now be

represented as

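$$u_j(z) = \sum_{k=1}^{3} \Pr(S_j = s_{jk})\, z^{s_{jk}} = p_{jFD}\, z^{s_{j1}} + p_{jFS}\, z^{s_{j2}} + p_{jW}\, z^{s_{j3}}, \tag{6}$$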

where sj1 = FD, sj2 = FS, sj3 = W for any j.

To obtain the u-function of a subsystem consisting of two elements, composition

operators are introduced. These operators determine the u-function for two elements


connected in parallel and in series, respectively, using simple algebraic operations on the

individual u-functions of basic elements. All the composition operators take the form

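$$u_j(z) \otimes_{\varphi} u_i(z) = \sum_{k=1}^{3}\sum_{h=1}^{3} \Pr(S_j = s_{jk})\,\Pr(S_i = s_{ih})\, z^{\varphi(s_{jk},\, s_{ih})}. \tag{7}$$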

The obtained u-function relates the probability of each combination of states of the

independent elements (which is equal to the product of the probabilities of these states) to

the value that the random state variable of the entire subsystem takes when this

combination is realized. The function φ(·) in the composition operators expresses the dependence of the entire subsystem state on the states of both of its elements. The definition of the function φ(·) depends strictly on the physical nature of the system and on the nature of the interaction of the system elements.

The structure functions for pairs of elements connected in parallel and in series should be defined for any specific application based on analysis of system functioning. For example, in the widely applied conservative approach the following assumptions are made. Any subsystem consisting of two parallel elements is in the failure-dangerous state if at least one of the elements is in the failure-dangerous state; it is in the operational state if at least one of the elements is in the operational state and neither is in the failure-dangerous state. In the remaining cases, the subsystem is in the failure-safe state. This can be expressed by the structure function φpar(·) presented in Table 2. A subsystem consisting of two elements connected in series is in the operational state if both elements are in the operational state, whereas it is in the failure-dangerous state if at least one of the elements is in the failure-dangerous state. In the remaining cases, the subsystem is in the failure-safe state. This can be expressed by the structure function φser(·) presented in Table 3.

Table 2. Structure function for pair of elements connected in parallel.

             Element 1
Element 2    W      FS     FD
W            W      W      FD
FS           W      FS     FD
FD           FD     FD     FD

Table 3. Structure function for pair of elements connected in series.

             Element 1
Element 2    W      FS     FD
W            W      FS     FD
FS           FS     FS     FD
FD           FD     FD     FD

In the numerical realization of the composition operator in Eq. (7), we can encode the states W, FS and FD by the integers 3, 2 and 1, respectively, so that sjk = k for any j. In our case, k = 1, 2, 3. It can be seen that in this case the functions φpar(·) and φser(·) defined above take the form:

$$\varphi_{par}(s_{jk}, s_{ih}) = \begin{cases} 1, & \min(s_{jk}, s_{ih}) = 1, \\ \max(s_{jk}, s_{ih}), & \text{otherwise,} \end{cases} \qquad \varphi_{ser}(s_{jk}, s_{ih}) = \min(s_{jk}, s_{ih}).$$

Note that the nine possible different combinations of element states produce only three

possible states of the subsystem. The probabilities of combinations that produce the same

subsystem state should be summed in order to obtain this state probability. This can be

done by collecting terms with equal exponents in the u-function obtained by Eq. (7).

Finally, any subsystem state distribution can be represented by the u-function taking the

form of Eq. (6).
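As a worked illustration of collecting like terms, the closed-form state probabilities of a pair of independent elements i and j can be written out directly. The expressions below follow from the conservative structure functions described above; they are a derivation sketch added for clarity, with P denoting the pair and p the individual elements.

```latex
% Pair of independent elements i and j, with p_{iW} + p_{iFS} + p_{iFD} = 1 and likewise for j.
\begin{align*}
\text{Both connections:}\quad & P_{FD} = p_{iFD} + p_{jFD} - p_{iFD}\,p_{jFD},\\
\text{Series:}\quad & P_{W} = p_{iW}\,p_{jW}, \qquad P_{FS} = 1 - P_{W} - P_{FD},\\
\text{Parallel:}\quad & P_{FS} = p_{iFS}\,p_{jFS}, \qquad P_{W} = 1 - P_{FS} - P_{FD}.
\end{align*}
% In both cases the safety of the pair is
% P_W + P_FS = 1 - P_FD = (1 - p_{iFD})(1 - p_{jFD}).
```

Under these structure functions the two connections differ only in how the non-dangerous probability mass is split between W and FS; the failure-dangerous probability of the pair is the same in both cases.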

Any subsystem consisting of two elements can be further treated as a single equivalent

element with a performance distribution that is equal to the performance distribution of

this subsystem. Consecutively applying the composition operators and replacing pairs of

elements by equivalent elements, one can obtain the u-function representing the

performance distribution of the entire system.
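The composition operator of Eq. (7), including the collection of like terms, can be sketched as a short function over dictionaries. The dictionary representation, the names `compose`, `phi_ser` and `phi_par`, and the element probabilities are assumptions chosen here for illustration; the parallel function encodes the verbal rule given above (FD if either element is FD, otherwise W if either is W, otherwise FS).

```python
from collections import defaultdict

FD, FS, W = 1, 2, 3  # integer encoding of the states used in the text


def phi_ser(a, b):
    """Series pair: the worse of the two encoded states."""
    return min(a, b)


def phi_par(a, b):
    """Parallel pair: FD if either element is FD, otherwise the better state."""
    return FD if min(a, b) == FD else max(a, b)


def compose(u1, u2, phi):
    """Multiply probabilities, map each state pair through phi, merge like terms."""
    out = defaultdict(float)
    for s1, p1 in u1.items():
        for s2, p2 in u2.items():
            out[phi(s1, s2)] += p1 * p2
    return dict(out)


# Hypothetical element distributions, for illustration only.
u_a = {W: 0.90, FS: 0.07, FD: 0.03}
u_b = {W: 0.80, FS: 0.15, FD: 0.05}
u_series = compose(u_a, u_b, phi_ser)
u_parallel = compose(u_a, u_b, phi_par)
print(u_series)
print(u_parallel)
```

Each u-function is stored as a mapping from encoded state to probability, so merging terms with equal exponents reduces to summing into the same dictionary key.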

The recursive algorithm

The following recursive algorithm obtains the u-function that represents the entire system

state distribution:

Step 1. Obtain the state probabilities for each element j using the Markov

transition diagram method presented in Section 2.

Step 2. Define the u-functions uj(z) for each element j using Eq. (6).

Step 3. If the system contains a pair of elements connected in parallel or in series, replace this pair with an equivalent element whose u-function is obtained by the operator of Eq. (7) with the structure function φpar(·) or φser(·), respectively.

Step 4. If the system contains more than one element, return to Step 3.

Otherwise, the algorithm stops.


The coefficients of the obtained u-function are equal to probabilities of operational,

failure-safe and failure-dangerous states of the entire system.
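The reduction in Steps 3 and 4 can be sketched as a recursion over a nested description of the block diagram. The tuple encoding of the structure, the function name `reduce_system`, the element names and the probabilities are hypothetical; the example structure (two parallel fuel supply systems in series with one turbine block) is this sketch's reading of a single generating unit.

```python
from collections import defaultdict

FD, FS, W = 1, 2, 3  # integer encoding of the states used in the text
PHI = {
    "ser": min,
    "par": lambda a, b: FD if min(a, b) == FD else max(a, b),
}


def compose(u1, u2, phi):
    """Composition operator of Eq. (7) with like terms merged."""
    out = defaultdict(float)
    for s1, p1 in u1.items():
        for s2, p2 in u2.items():
            out[phi(s1, s2)] += p1 * p2
    return dict(out)


def reduce_system(node, elements):
    """Return the u-function of an element name or of ("ser"|"par", child, ...)."""
    if isinstance(node, str):
        return elements[node]
    kind, first, *rest = node
    u = reduce_system(first, elements)
    for child in rest:  # replace pairs by equivalent elements, one at a time
        u = compose(u, reduce_system(child, elements), PHI[kind])
    return u


# Hypothetical state probabilities at one time instant, for illustration only.
elements = {
    "fuel_1": {W: 0.90, FS: 0.07, FD: 0.03},
    "fuel_2": {W: 0.90, FS: 0.07, FD: 0.03},
    "turbine": {W: 0.95, FS: 0.03, FD: 0.02},
}
unit = ("ser", ("par", "fuel_1", "fuel_2"), "turbine")
u_unit = reduce_system(unit, elements)
safety = u_unit.get(W, 0.0) + u_unit.get(FS, 0.0)
print(u_unit, safety)
```

Repeating the call for a grid of time points, with element probabilities taken from the Markov model at each point, would give the state probabilities and the safety as functions of time.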

With the state probabilities of each element in the form of functions of time, one can

use the algorithm presented above to get the probability values corresponding to any given

time. Finally, the entire system state probabilities and the overall system safety (defined as

the sum of operational probability and failure-safe state probability) as functions of time

can be obtained. In the following section, we use an example to illustrate the procedure

described here.

4. Illustrative example

Consider a combined-cycle power plant with two generating units. Each unit consists of a gas turbine block and fuel supply systems. The fuel to each turbine block can be supplied by two parallel systems. The simplified reliability block diagram of the plant is presented in Fig. 2. Each fuel supply system as well as each turbine block can experience both safe and dangerous failures (detected and undetected).

Fig. 2. Reliability block diagram of combined-cycle power plant. [Diagram not reproduced: elements 1 to 4 are fuel supply systems and elements 5 and 6 are turbine blocks; fuel supply systems 1 and 2 feed turbine block 5, and fuel supply systems 3 and 4 feed turbine block 6.]

The parameters of the fuel supply systems are: λsd = 2.56×10^-5, λsu = 10^-5, λdd = 8.9×10^-6, λdu = 1×10^-6; μsd = 0.25, μdd = 0.0833, μsu = μdu = 0; d = 0.99; TI = 1.5 years. The fuel supply systems are statistically identical, but the inspection times of systems 2 and 4 are shifted 0.5 year earlier relative to the inspection times of systems 1 and 3. The matrix Mji associated with each fuel supply system is M1i (i = 1, 2, 3, 4), as shown in Eq. (A2) in the Appendix.

The turbine blocks are also statistically identical. The parameters of the turbine blocks are: λsd = 2.56×10^-5, λsu = 6.540×10^-6, λdd = 7.9×10^-6, λdu = 7.8×10^-7; μsd = 0.25, μdd = 0.0625, μsu = μdu = 0; d = 0.99; TI = 2 years. The matrix Mji associated with each turbine block is M2i (i = 1, 2, 3), as shown in Eq. (A3) in the Appendix.
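The proof-test schedule implied by these parameters can be checked with a small calculation. The sketch below assumes a conversion of 8760 hours per year, which the paper does not state, and the function name `proof_test_times` is hypothetical; the horizon of 65000 hours is the analysis period used in the example.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year; not stated in the paper


def proof_test_times(ti_years, horizon_hours, shift_years=0.0):
    """Proof-test instants in hours, optionally shifted earlier by shift_years."""
    interval = ti_years * HOURS_PER_YEAR
    t = interval - shift_years * HOURS_PER_YEAR
    times = []
    while t <= horizon_hours:
        times.append(t)
        t += interval
    return times


fuel_tests = proof_test_times(1.5, 65000)
turbine_tests = proof_test_times(2.0, 65000)
print(len(fuel_tests), fuel_tests)
print(len(turbine_tests), turbine_tests)
```

Under that assumption the unshifted schedules contain four proof-tests for a fuel supply system and three for a turbine block within 65000 hours, which agrees with the number of matrices M1i (i = 1, 2, 3, 4) and M2i (i = 1, 2, 3) listed for the example. The `shift_years` argument can represent the inspections of systems 2 and 4 that take place 0.5 year earlier.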

The probabilities pjW(t), pjFS(t) and pjFD(t) for each system element, obtained by solving Eqs. (2) and (3) over a period of 65000 hours, are presented in Figs. 3-5. The probabilities PW(t), PFS(t) and PFD(t) for a single generating unit and for the entire system (with the structure functions defined in accordance with Tables 2 and 3), obtained using the algorithm given in Section 3, are also presented in Figs. 3-5. These figures show that the probabilities for a single generating unit and for the entire system also vary periodically.

The system safety S(t) = PW(t) + PFS(t) as a function of time is presented in Fig. 6.

Fig. 3. Probabilities of working states. [Plot not reproduced: PW (vertical axis, 0.84 to 1) versus t in thousands of hours (0 to 60); curves for elements 1,3, elements 2,4, elements 5,6, single unit, and system.]

Fig. 4. Probabilities of failure-safe states. [Plot not reproduced: PS (vertical axis, 0 to 0.12), horizontal axis 0 to 60; curves for elements 1,3, elements 2,4, elements 5,6, single unit, and system.]

Fig. 5. Probabilities of failure-dangerous states. [Plot not reproduced: PD (vertical axis, 0 to 0.08), horizontal axis 0 to 60; curves for elements 1,3, elements 2,4, elements 5,6, single unit, and system.]

Fig. 6. Overall system safety. [Plot not reproduced: S (vertical axis, 0.9 to 1), horizontal axis 0 to 60.]

5. Conclusions

In this paper a method is proposed for the study of series-parallel systems with

imperfect diagnostics and imperfect periodic inspections and repairs of elements. Element

failures can be failure-safe and failure-dangerous and can be either detected or undetected.

The proposed model incorporates periodic inspection and repair (both perfect and

imperfect) of system elements. A Markov model is used to determine the state distribution of a single system element, while the universal generating function technique is used to determine the state distribution of the entire system. The presented example shows that the procedure

can be easily implemented to estimate the state probabilities and the overall safety of a

safety-critical system.

The method presented in this paper can be applied to different research fields such as

power generation units, electronic devices and chips, data storage based on redundant

array of inexpensive disks (Katz et al., 1989; Gibson and Patterson, 1993, etc.) and so on. It

can be used for evaluating safety of a fault-tolerant single-chip multiple microprocessors

architecture (Yao, et al., 2004) which represents a promising solution to partly mitigate the

system faults and to increase the system dependability in mission-critical applications.


Acknowledgement:

This research was carried out while the first author was visiting National University of

Singapore supported by the research grant R-266-000-020-112 at National University of

Singapore. The authors would like to thank three referees for their constructive comments.

References

Biswas, A.; Sarkar, J. and Sarkar, S. (2003). Availability of a periodically inspected system, maintained under an imperfect-repair policy. IEEE Transactions on Reliability, 52 (3), 311-318.

Bowles, J.B. and Dobbins, J.G. (2004). Approximate reliability and availability models for high availability and fault-tolerant systems with repair. Quality and Reliability Engineering International, 20 (7), 679-697.

Bris, R., Chatelet, E. and Yalaoui, F. (2003). New method to minimize the preventive maintenance cost of series-parallel systems. Reliability Engineering & System Safety, 82 (3), 247-255.

Bukowski, J.W. (2001). Modeling and analyzing the effects of periodic inspection on the performance of safety-critical systems, IEEE Transactions on Reliability, 50 (2), 321 – 329.

Burgazzi, L. (2003). Reliability evaluation of passive systems through functional reliability assessment. Nuclear Technology, 144 (2), 145-151.

Carrasco, J.A. (2004). Solving large interval availability models using a model transformation approach. Computers & Operations Research, 31 (6), 807-861.

Chandrasekhar, P.; Natarajan, R. and Yadavalli, V.S.S. (2004). A study on a two unit standby system with Erlangian repair time. Asia-Pacific Journal of Operational Research, 21 (3), 271-277

Cowing, M.M.; Pate-Cornell, M.E. and Glynn, P.W. (2004). Dynamic modeling of the tradeoff between productivity and safety in critical engineering systems. Reliability Engineering & System Safety, 86 (3), 269-284.

Cui, L.R.; Loh, H.T. and Xie, M. (2004). Sequential inspection strategy for multiple systems under availability requirement. European Journal of Operational Research, 155 (1), 170-177.

DeLong, T.A.; Smith, D.T. and Johnson, B.W. (2005). Dependability metrics to assess safety-critical systems. IEEE Transactions on Reliability, 54, 498-505.

Dominguez-Garcia, A.D.; Kassakian, J.G. and Schindall, J.E. (2006). Reliability evaluation of the power supply of an electrical power net for safety-relevant applications. Reliability Engineering & System Safety, 91, 505-514.

Faller, R. (2004). Project experience with IEC 61508 and its consequences. Safety Science, 42 (5), 405-422.

Gibson, G.A. and Patterson, D.A. (1993). Designing Disk Arrays for High Data Reliability, Journal of Parallel and Distributed Computing, 17, 4 – 27.

Goble, W.M. (1998). Control Systems Safety Evaluation and Reliability, 2nd ed. ISA.

Hokstad, P. and Corneliussen, J. (2004). Loss of safety assessment and the IEC 61508 standard. Reliability Engineering & System Safety, 83 (1), 111-120.

IEC 61508 (1998). Functional safety of electric/electronic/programmable electronic safety-related systems, Parts 1–7, October 1998–May 2000.

Inagaki, T. and Ikebe, Y. (1989). Performance analysis of a safety monitoring system under human-machine interface of safety-presentation type, Microelectronics and Reliability, 29 (2), 165 – 175.

Kang, H.G. and Jang, S.C. (2006). Application of condition-based HRA method for a manual actuation of the safety features in a nuclear power plant. Reliability Engineering & System Safety, 91, 627-633.

Katz R.H.; Gibson G.A. and Patterson D. (1989). Disk System Architectures for High Performance Computing, Proceedings of the IEEE, 77, No. 12, pp. 1842 – 1858.

Kim, H.; Lee, H. and Lee, K. (2005). The design and analysis of AVTMR (all voting triple modular redundancy) and dual-duplex system. Reliability Engineering & System Safety, 88, 291-300.

Korczak, E.; Levitin, G. and Ben Haim, H. (2005). Survivability of series-parallel systems with multilevel protection. Reliability Engineering & System Safety, 66, 45-54.

Knegtering, B. and Brombacher, A.C. (1999). Application of micro Markov models for quantitative safety assessment to determine safety integrity levels as defined by the IEC 61508 standard for functional safety. Reliability Engineering & System Safety, 66 (2), 171-175.

Latif-Shabgahi, G.; Bass, J.M. and Bennett, S. (2004). Taxonomy for software voting algorithms used in safety-critical systems. IEEE Transactions on Reliability, 53 (3), 319-328.

Lee, D.Y.; Han, J.B. and Lyou, J. (2004). Reliability analysis of the reactor protection system with fault diagnosis. Key Engineering Materials, 270, 1749-1754.

Levitin, G. (2004). A universal generating function approach for the analysis of multi-state systems with dependent elements. Reliability Engineering & System Safety, 66, 285-292.

Levitin, G. (2005). Uneven allocation of elements in linear multi-state sliding window system. European Journal of Operational Research, 163, 418-433.

Levitin, G.; Lisnianski, A.; Ben-Haim, H. and Elmakis, D. (1998). Redundancy optimization for series-parallel multi-state systems, IEEE Transactions on Reliability, 47 (2), 165-172.

Lisnianski, A. and Levitin, G. (2003). Multi-state System Reliability, World Scientific, Singapore.

Levitin, G. (2005). The Universal Generating Function in Reliability Analysis and Optimisation. Springer-Verlag: Berlin, Springer Series in Reliability Engineering.

Marseguerra, M.; Zio, E. and Podofillini, L. (2004). A multiobjective genetic algorithm approach to the optimization of the technical specifications of a nuclear safety system. Reliability Engineering & System Safety, 84 (1), 87-99.

Nunns, S.R. (2000). Conformity assessment of safety related systems to IEC 61508 - the CASS initiative. Computing & Control Engineering Journal, 11 (1), 33-39.

Olbrich, T; Richardson, A.M.D. and Bradley, D.A. (1996). Built-in self-test and diagnostic support for safety critical Microsystems, Microelectronics and Reliability, 36, 1125– 1136.

Son, H.S. and Seong, P.H. (2003). Development of a safety critical software requirements verification method with combined CPN and PVS: a nuclear power plant protection system application. Reliability Engineering & System Safety, 80 (1), 19-32.

Ushakov I., (1987). Optimal standby problems and a universal generating function, Soviet Journal of Computer System Science, 25, 79-82.

Wang, D. and Inagaki, T. (1994).Time-dependent optimality of an alarm subsystem, Microelectronics and Reliability, 34, 1623 – 1633.

Weber, W.; Tondok, H. and Bachmayer, M.B. (2005). Enhancing software safety by fault trees: experiences from an application to flight critical software. Reliability Engineering & System Safety, 89, 57-70.

Yao, W.B.; Wang D.S. and Zheng W.M. (2004). A Fault-tolerant Single-chip Multiprocessor, ACSAC 2004 Proceedings of Advances in Computer Systems Architecture: 9 th Asia-Pacific Conference, Pen-Cheng Yew and Jingling Xue (eds.), Berlin: Springer, 2004, p. 137-145.

Zhang, T.L.; Long, W. and Sato, Y. (2003). Availability of systems with self-diagnostic components—applying Markov model to IEC 61508-6, Reliability Engineering & System Safety, 80, 133 – 141.

Zhang, T.L.; Xie, M. and Horigome, M. (2006). Availability and reliability of k-out-of-(M plus N): G warm standby systems. Reliability Engineering & System Safety, 91, 381-387.

Zhou, Z. (1987). Analysis of a two unit standby redundant fail-safe system. Microelectronics and Reliability, 27, 469 – 474.


Appendix

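The transition rate matrix for one element is

$$\Lambda_j = \begin{pmatrix}
-c & \lambda_{su} & \lambda_{sd} & \lambda_{du} & \lambda_{dd} & 0 & 0 & 0 & 0 \\
0 & -(\lambda_{sd}+\lambda_{dd}) & 0 & 0 & 0 & \lambda_{sd} & 0 & \lambda_{dd} & 0 \\
\mu_{sd} & 0 & -(\lambda_{su}+\lambda_{du}+\mu_{sd}) & 0 & 0 & \lambda_{su} & \lambda_{du} & 0 & 0 \\
0 & 0 & 0 & -(\lambda_{sd}+\lambda_{dd}) & 0 & 0 & \lambda_{sd} & 0 & \lambda_{dd} \\
\mu_{dd} & 0 & 0 & 0 & -(\lambda_{su}+\lambda_{du}+\mu_{dd}) & 0 & 0 & \lambda_{su} & \lambda_{du} \\
0 & \mu_{sd} & 0 & 0 & 0 & -\mu_{sd} & 0 & 0 & 0 \\
0 & 0 & 0 & \mu_{sd} & 0 & 0 & -\mu_{sd} & 0 & 0 \\
0 & \mu_{dd} & 0 & 0 & 0 & 0 & 0 & -\mu_{dd} & 0 \\
0 & 0 & 0 & \mu_{dd} & 0 & 0 & 0 & 0 & -\mu_{dd}
\end{pmatrix} \tag{A1}$$

where c = λsd + λdd + λdu + λsu.

In the matrices below, the columns correspond to p1, p2, …, p9, and 0k and 1k denote the zero and unit column vectors of size k×1.

The matrices M1i (i = 1, 2, 3, 4) for fuel supply system are

$$M_{11} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.90 & 0.10 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.80 & 0 & 0 & 0.20 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right),$$

$$M_{12} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.88 & 0.12 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.776 & 0 & 0 & 0.224 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right),$$

$$M_{13} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.85 & 0.15 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.747 & 0 & 0 & 0.253 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right),$$

$$M_{14} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.808 & 0.192 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.711 & 0 & 0 & 0.289 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right). \tag{A2}$$

The matrices M2i (i = 1, 2, 3) for turbine block are

$$M_{21} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.92 & 0.08 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.85 & 0 & 0 & 0.15 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right),$$

$$M_{22} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.804 & 0.096 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.832 & 0 & 0 & 0.168 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right),$$

$$M_{23} = \left(\begin{array}{cccc|ccccc}
1 & 0 & 0 & 0 & & & & & \\
0.882 & 0.118 & 0 & 0 & & & & & \\
1 & 0 & 0 & 0 & 0_9 & 0_9 & 0_9 & 0_9 & 0_9 \\
0.810 & 0 & 0 & 0.190 & & & & & \\
1_5 & 0_5 & 0_5 & 0_5 & & & & &
\end{array}\right). \tag{A3}$$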

Gregory Levitin received a PhD degree in Industrial Automation from Moscow Research Institute of Metalworking Machines in 1989. From 1982 to 1990 he worked as software engineer and research associate in the field of industrial automation. From 1991 to 1993 he worked at the Technion (Israel Institute of Technology) as a postdoctoral fellow at the faculty of Industrial Engineering and Management. Dr. Levitin is presently an engineer-expert at the Reliability Department of the Israel Electric Corporation and adjunct senior lecturer at the Technion. His current interests are in operations research and artificial intelligence applications in reliability and power engineering. In this field Dr. Levitin has published over 100 papers and two books. He is senior member of IEEE. He serves in editorial boards of IEEE Transactions on Reliability and Reliability Engineering and System Safety.

Tieling Zhang received a Ph.D. in engineering from Tokyo University of Mercantile Marine in 2001. He has six years’ experience of teaching, three years’ working in industry and a few years holding research positions. Currently he is with Hitachi GST, Singapore. He has 30 articles included in peer-review journals and international conference proceedings. He holds a new practical patent of China. His research interests include system reliability, maintainability and safety, system optimization and vibration control.

Min Xie received his Ph.D. in Quality Technology from Linkoping University, Sweden, in 1987. Dr Xie has been active in reliability and quality related research since then. He has authored or co-authored over 100 articles in refereed journals and 6 books, including Software Reliability Modelling by World Scientific, Statistical Models and Control Charts for High Quality Processes by Kluwer Academic Publisher, and Weibull Models by John Wiley & Sons. He is a department editor of IIE Transactions, an associate editor of IEEE Trans on Reliability, and on the editorial board of several other journals. He is a fellow of IEEE.
