partial stroke test



  • Form 7013 Issue 1 Page 1 of 10

    Assessment of PVST in Accordance with IEC 61508:2010

    For many years, diagnostics have been implemented for field devices (smart devices and final elements), comprising shutdown valves, actuators and solenoid valves, using online measurement and testing technologies. Process plants strive for maximum efficiency and so aim for continuous operation without sudden shutdowns, very often relying on condition monitoring to optimize maintenance activities. Online product diagnostics is one of the measures used to identify critical conditions so that maintenance can be scheduled reliably, extending the operating life of the plant and reducing cost. Partial stroke testing (PST) is widely used in the final element to ensure product availability, extend the proof test interval and improve the probability of failure on demand (PFD), but not the safe failure fraction. PST can be integrated internally or implemented externally to the final element. Before the release of IEC 61508:2010 (Ed. 2), many final elements were certified to a safety integrity level of SIL 3 on the basis of PST being claimed as a tool for detecting dangerous undetected failures, without looking more deeply at the indirect effects on the type classification of the product and, subsequently, on the overall safety instrumented function. This paper discusses some of the changes made in IEC 61508:2010 (edition 2) that may affect the assessment of a final element with diagnostics, the implications of these changes for product certification, and their impact on currently certified products assessed to edition 1. An example is reviewed with and without PST using fault tree analysis, and recommendations are made for reviewing the overall SIL as a result of these changes.

    By: Dr H El-Sayed, FS Consultant, SIRA Test and Certification Ltd, [email protected]

    1 Introduction

    Partial stroke testing (PST) is a useful feature of the final element and has been used widely in safety instrumented systems, mainly because final elements are considered to be slave devices. The use of PST has contributed to improving product availability in the safety instrumented function (SIF) and is considered a measure for enhancing confidence that the final element will respond to a demand on a safety function such as an emergency shutdown (ESD). The technique has been used over the years and is well known. When this feature is present in the final element, a list of questions must be asked, such as:

    Will this tool help to reveal all or part of the claimed undetected dangerous failures (λDU)?

    What is the impact of this feature on the type classification of the final element (is it type A or type B as defined in IEC 61508-2, clause 7.4.4.1 [1])?

    Is this function defined in IEC 61508 (ed. 2) as a diagnostic function or as a partial proof test?

    Does it make any difference whether this function is performed externally or internally to the final element?

    What is the impact on the safe failure fraction, and how is this function treated under IEC 61508 edition 1 in comparison with edition 2?

    What is the status of products already certified under IEC 61508 (ed. 1) in light of the changes introduced in ed. 2?

    How is the functional safety capability of a product defined in IEC 61508 (ed. 2)?

    What is the safety integrity level (SIL) of the final element with and without PST?

    The list goes on and on.

    The objective of this paper is not to discuss techniques for implementing PST. It is rather to clarify the issues stated above, since the author has identified improper assessments of a number of certified products, which have created confusion among end users and system integrators trying to construct a proper safety instrumented system. Some safety system designers simply follow product certificates, ignoring which standard and edition the products were certified to, without looking deeply into the hardware architectural constraints and the systematic capability assessment achieved. Clients may have been reluctant to ask for the assessment reports, and have not questioned whether these certificates are actually treated with the same rigour and consistency as ATEX or IECEx certificates, which may include accreditation and so on.

    2 The past (ed. 1) and the present (ed. 2) of IEC 61508

    Since the release of the first edition of IEC 61508:2001 [1], the market has been flooded with products certified to various safety integrity levels (SIL 1, 2 or 3) using a quantitative approach such as failure mode and effect analysis (FMEA, sometimes termed FMEDA) or methods based on field feedback data (known as proven in use). The market is swamped with products claiming SIL 3 with a hardware fault tolerance of zero (no hardware redundancy, HFT = 0). Some final elements claimed dangerous failure rates of less than 5 FIT; a solenoid, for example, claimed λDU of less than 0.5 FIT with a PFDavg of 2E-7, i.e. an MTBF of 500,000 years. How can such a product be two orders of magnitude better than others? Experienced process safety engineers will recognise that this type of assessment is not realistic. Claiming such low dangerous failure rates will of course lead to vulnerable SIFs, because redundancy is reduced (HFT = 0) and proof testing appears to be no longer required.

    A number of changes have been introduced in edition 2 to clarify which types of failure are involved in determining the safe failure fraction (SFF) and the role of diagnostics in determining the overall SFF. For example, edition 1 did not specify no effect (NE) failures and no part (NP) failures. These two types of failure were treated as safe failures and added to the total of the identified safe failures. This made the proportion of safe failures significantly higher than that of dangerous failures, so that the overall SFF delivered a higher SIL. Edition 2 makes clear (part 2, Annex C) that these types of failure shall not play any part in the calculation of the diagnostic coverage or the safe failure fraction.

    With regard to diagnostics, edition 2 (part 2, clause 7.4.9.4-J) [2] specifies that the failure rate of the diagnostics due to random hardware failures must be considered when determining the SFF or diagnostic coverage. The diagnostic part contributes to the detection of dangerous failures, considerably improving the diagnostic coverage and supporting a claim for a higher SIL. Because the diagnostic part plays an important role in detecting a percentage of the dangerous failures, and subsequently in achieving a better SFF, it must be considered as part of the subsystem, and its failure rates and failure modes must be considered as well. Edition 2 therefore counts the failure rate of the diagnostic part within the overall failure rate, because it plays a part in achieving the diagnostic coverage and meeting the architectural constraints. The diagnostic part was not clearly defined in edition 1 and was left for individuals to interpret. So under edition 1, credit for diagnostic coverage and SFF was claimed while the random hardware failures of the diagnostics themselves were left out.

    According to edition 2 [2], diagnostics can be internal or external: clause 7.4.9.4 of part 2 refers to failures of a safety function resulting from random hardware failures that are detected by diagnostic tests internal to the element or that are detectable by diagnostics external to the element. Diagnostics are also identified as a credit, as defined in NOTE 2: the diagnostic coverage and diagnostic test interval are required to allow credit to be claimed, in the hardware safety integrity model of the E/E/PE safety-related system, for the action of the diagnostic tests performed in the element (part 2, clauses 7.4.5.2, 7.4.5.3 and 7.4.5.4) [2]. In particular, where external diagnostics are used to detect failure modes of a specific function, sufficient information shall be provided to facilitate the development of an external diagnostics capability. The information shall include details of failure modes and their failure rates (part 2, Annex D) [2]. As stated above, diagnostics as described in edition 2 can be internal or external to the subsystem. In either case, a random hardware failure assessment of the diagnostics is required and shall be taken into consideration.
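The effect of counting no effect and no part failures as safe can be illustrated with a minimal sketch. The failure rates below are hypothetical illustration values, not data from any product, and the function name `sff` is an assumption of this sketch rather than anything defined in the standard.

```python
def sff(lam_safe: float, lam_dd: float, lam_du: float) -> float:
    """Safe failure fraction: (safe + dangerous detected) / all counted failures."""
    total = lam_safe + lam_dd + lam_du
    return (lam_safe + lam_dd) / total


# Hypothetical failure rates in FIT (illustration only, not product data).
lam_s, lam_dd, lam_du = 200.0, 0.0, 200.0
lam_ne_np = 600.0  # no effect (NE) plus no part (NP) failures

# Edition 1 practice described above: NE/NP failures added to the safe total.
sff_ed1 = sff(lam_s + lam_ne_np, lam_dd, lam_du)

# Edition 2: NE/NP failures play no part in the calculation.
sff_ed2 = sff(lam_s, lam_dd, lam_du)

print(f"SFF with NE/NP counted as safe: {sff_ed1:.0%}")
print(f"SFF with NE/NP excluded:        {sff_ed2:.0%}")
```

With these assumed numbers the same hardware moves from an SFF of 80% to 50% once the NE and NP failures are excluded, which is the kind of shift that can change the SIL permitted by the architectural constraints.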

    3 Classification of partial stroke testing

    PST is one of the features that most valve manufacturers claim in their product specifications. It is a method of returning the subsystem close to as-good-as-new condition between full proof tests (full stroke tests), which are performed to evaluate the complete subsystem parameters and performance. PST is widely described in articles and vendor data sheets [4, 5, 6], to the extent that a valve with PST is said to be deployable in SIL 3: PST is interpreted as a kind of diagnostic test, thereby eliminating the need for redundancy (HFT = 0) and reducing the probability of failure on demand due to random hardware failures. Some claims offer unsubstantiated arguments for an increase in the safe failure fraction in order to derive a higher SIL, i.e. valves with PST are sold as a very cost-effective alternative, using one valve instead of two on SIL 3 safety instrumented systems.

    Under edition 1, the above argument was widely adopted in the industry. Several end users, vendors and system integrators bought into these promotional marketing claims that a single device with PST is capable of SIL 3, delivering three orders of magnitude of risk reduction with only a single valve and no redundancy; as one expert commented, it is "too good to be true" [3]. The claims are accepted by the market on the argument that, as long as the valves are certified by recognised certifying agencies, no substantial questions need to be raised about the product assessment and the validity of its results.

    Under edition 2, the definition of diagnostics and its application becomes clearer, as described in the following sections. One of the main reasons for this paper is that the market is still under the impression that PST is a diagnostic tool, i.e. that it is referred to as diagnostic coverage (DC) and granted credit accordingly, rather than being counted, in a wider scope, as proof test coverage complementary to the coverage of a full stroke test. The author will first quote the definition of diagnostics given in IEC 61508 edition 2 and then consider a separate example to illustrate the effect of diagnostics on the PFD calculation and on the safe failure fraction.

    4 Diagnostics in accordance with IEC 61508 ed. 2

    IEC 61508 edition 2 [7] defines diagnostic coverage as the fraction of dangerous failures detected by automatic on-line diagnostic tests. This fraction is computed as the rate of detected dangerous failures divided by the total rate of dangerous failures. If PST is considered to be an on-line diagnostic function, then credit shall only be taken for it if the sum of the diagnostic test interval and the time to repair a detected failure is less than the mean time to restoration (MTTR) used in the calculation to determine the achieved safety integrity for that safety function. MTTR consists of the time to detect the failure (diagnostic time) plus the mean repair time, which in turn consists of the time spent before starting the repair, the effective time to repair and the time before the component is put back into operation (IEC 61508-4) [7].

    According to part 2, clause 7.4.5.3, when quantifying the effect of random hardware failures of a subsystem operating in high demand or continuous mode, credit shall only be taken for the diagnostic tests if:

    a) The sum of the diagnostic test interval and the time to perform the specified action to achieve or maintain a safe state is less than the process safety time; or

    b) In high demand mode of operation, the ratio of the diagnostic test rate to the demand rate equals or exceeds 100.

    As per clause 7.4.5.4, the diagnostic test interval of any subsystem operating in low demand mode shall be such that the sum of the diagnostic test interval and the time to perform the repair of a detected failure is less than the MTTR used in the calculation to determine the achieved safety integrity for that safety function. The maximum active time quoted to complete one PST is approximately 2 minutes [8], but PST is typically run at a relatively low rate, with weeks to months between successive tests. Based on the above definition, if PST is claimed as a diagnostic test, its test interval plus the repair time must fit within the MTTR window, which is between 8 and 72 hours. If PST is instead performed only every few weeks to months, the implication is that the MTTR would be an immensely high value [9]. As stated in [9], the key difference is that a true diagnostic increases the proportion of detected failures and decreases the proportion of undetected failures, thereby improving the diagnostic coverage, which may in turn improve the safe failure fraction. If PST is considered as a partial proof test, it does not enhance the SFF; it is a tool that can be used to improve the probability of failure on demand (PFD). The next section considers the implications of using PST as a diagnostic test, rather than as a partial proof test, in an industrial application.
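The timing rules quoted above can be sketched as two simple checks. This is only an illustration of the conditions as paraphrased in this section; the function names and the example intervals are assumptions of this sketch, and all times are in hours.

```python
def low_demand_credit_allowed(test_interval_h: float, repair_time_h: float,
                              mttr_h: float) -> bool:
    """Low demand rule (clause 7.4.5.4 as paraphrased above):
    diagnostic test interval + repair time must be less than the MTTR."""
    return test_interval_h + repair_time_h < mttr_h


def high_demand_credit_allowed(test_interval_h: float, action_time_h: float,
                               process_safety_time_h: float,
                               demand_rate_per_h: float = 0.0) -> bool:
    """High demand / continuous rule (clause 7.4.5.3 as paraphrased above):
    a) test interval + action time < process safety time, or
    b) in high demand mode, diagnostic test rate / demand rate >= 100."""
    cond_a = test_interval_h + action_time_h < process_safety_time_h
    cond_b = False
    if demand_rate_per_h > 0.0:
        test_rate_per_h = 1.0 / test_interval_h
        cond_b = test_rate_per_h / demand_rate_per_h >= 100.0
    return cond_a or cond_b


# Hypothetical cases: a daily test fits inside a 72 h MTTR, a weekly PST does not.
daily_ok = low_demand_credit_allowed(test_interval_h=24, repair_time_h=8, mttr_h=72)
weekly_ok = low_demand_credit_allowed(test_interval_h=168, repair_time_h=8, mttr_h=72)
print(daily_ok, weekly_ok)
```

Under these assumed values, a PST run weekly or less often cannot satisfy the low demand condition against an MTTR of 72 hours, which is the point made in [9].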

    5 Implication of PST on a safety instrumented function

    As discussed above, diagnostics can be implemented internally, as specific hardware and firmware fully integrated with the actuated valve, which can be activated manually or remotely using the DCS or logic solver. Diagnostics can also be run remotely on line or configured as part of the DCS. In either case, if such a technique is coupled to the valve assembly so that it becomes one piece of hardware, and a claim is made for dangerous detected failures, the full assembly becomes type B, as stated in several references [4, 10].

    In the following example, PST is assumed to be an external diagnostic function integrated locally at the valve site and interfaced directly with the logic solver. The diagnostic test period is configured as 2 days, and the diagnostic coverage factor is assumed to be 70%. The example is designed to study the impact on the SFF and the PFDavg. Table 2 of IEC 61508-1 and Table 3 of IEC 61508-2 will be consulted for the PFDavg and SFF, respectively. Note that the random hardware failure rates are considered constant within the useful life of the valve assembly, with regular maintenance provided. The example is illustrated using fault tree analysis (FTA), and the failure rate data are taken from currently marketed products from different manufacturers. It is assumed that the target of the SIF is SIL 2. The requirement is to keep the full proof test interval at one year. Figure 1 shows a schematic block diagram of an overpressure SIF on a flow line. The intention here is to make use of PST as diagnostic coverage to improve the PFDavg. The author wants to show what the implication would be for the SFF, even if PST is considered as diagnostics. This is for illustration only, to show that if PST is used as a diagnostic to reduce undetected dangerous failures, the resulting SFF does not give the element a higher SIL. MTTR is taken to be no more than 72 hours to meet the requirement of clause 7.4.5.4, as mentioned above.

    The SIS consists of a SIL 2 pressure transmitter (type B) with an assumed SFF of 92%. Its dangerous undetected failure rate is λDU = 0.004 f/yr (or 456 FIT) and its PFD at a proof test interval of 1 yr is 2E-3, so it is SIL 2. The logic solver is already SIL 3. The block valve and the solenoid taken together have λDU = 0.025 f/yr (or 2.85E-6 per hour, PFD = 1.25E-2) and an averaged SFF of 55%; the combination is type A and is SIL 1. The PST positioner controller is type B with λDU = 0.005 f/yr and an SFF of 93% (market value).
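The unit conversions and single-element PFD values quoted above can be checked with a short sketch. The helper names are assumptions of this sketch, a year is taken as 8760 hours, and the PFDavg uses the simple approximation λDU x TI/2 that appears in the example.

```python
HOURS_PER_YEAR = 8760.0


def per_year_to_fit(rate_per_year: float) -> float:
    """Convert failures per year to FIT (failures per 1E9 hours)."""
    return rate_per_year / HOURS_PER_YEAR * 1e9


def pfd_avg(lam_du_per_year: float, test_interval_years: float) -> float:
    """Simplified average PFD of a single element: lambda_DU * TI / 2."""
    return lam_du_per_year * test_interval_years / 2.0


pt_fit = per_year_to_fit(0.004)   # pressure transmitter, about 456 FIT
vs_fit = per_year_to_fit(0.025)   # valve + solenoid, about 2854 FIT (2.85E-6 per hour)
pt_pfd = pfd_avg(0.004, 1.0)      # 2E-3
vs_pfd = pfd_avg(0.025, 1.0)      # 1.25E-2

print(f"PT:  {pt_fit:.0f} FIT, PFDavg = {pt_pfd:.2e}")
print(f"V+S: {vs_fit:.0f} FIT, PFDavg = {vs_pfd:.2e}")
```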

    Figure 1: Overpressure SIF block diagram. [Diagram labels: pressure transmitter (PT), logic solver, solenoid (S), block valve (V), PST positioner controller, position indicator; connections: V/mA, fieldbus, air.]

    The probability of failure of the overpressure SIF can be expressed as an FTA, using one year as the proof test interval and with no PST involved, as shown in Figure 2.

    Figure 2: Fault Tree Analysis of the overpressure SIF. Note: PST not included.

    Failure rate data:
    PT (type B): λDU = 0.004 f/yr
    LS (type B): PFDavg = 1.5E-4
    V+S (type A): λDU = 0.025 f/yr

    Fault tree results, with test interval TI = 1 yr:

    $$PFD_{avg} = \lambda_{DU} \times \frac{TI}{2}$$

    PT: PFDavg = 2E-3 (SIL 2)
    LS: PFDavg = 1.5E-4 (SIL 3)
    V+S: PFDavg = 1.25E-2
    Overall PFDavg = 1.465E-2 (SIL 1)
    Overall SIL by SFF = SIL 1
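The fault tree without PST reduces to an OR gate over the three subsystems, which for small probabilities is approximated by a sum. The sketch below reproduces the overall figure and looks up the low demand SIL band of Table 2 of IEC 61508-1. The function name `sil_from_pfd` is an assumption of this sketch, and the rare-event (sum) approximation is a simplification.

```python
def sil_from_pfd(pfd: float) -> int:
    """Low demand SIL band per Table 2 of IEC 61508-1 (0 means no SIL)."""
    if pfd < 1e-5:
        return 4  # the table defines no band beyond SIL 4
    bands = [(1e-5, 1e-4, 4), (1e-4, 1e-3, 3), (1e-3, 1e-2, 2), (1e-2, 1e-1, 1)]
    for low, high, sil in bands:
        if low <= pfd < high:
            return sil
    return 0


# PFDavg of each subsystem, from the failure rate data of the example (TI = 1 yr).
pfd_pt = 0.004 * 1.0 / 2.0   # pressure transmitter
pfd_ls = 1.5e-4              # logic solver
pfd_vs = 0.025 * 1.0 / 2.0   # valve + solenoid, no PST

# OR gate, rare-event approximation: the SIF fails if any subsystem fails.
pfd_sif = pfd_pt + pfd_ls + pfd_vs

print(f"PT  PFDavg = {pfd_pt:.3e} -> SIL {sil_from_pfd(pfd_pt)}")
print(f"LS  PFDavg = {pfd_ls:.3e} -> SIL {sil_from_pfd(pfd_ls)}")
print(f"V+S PFDavg = {pfd_vs:.3e} -> SIL {sil_from_pfd(pfd_vs)}")
print(f"SIF PFDavg = {pfd_sif:.3e} -> SIL {sil_from_pfd(pfd_sif)}")
```

The final element dominates the sum, which is why the example then concentrates on the valve and solenoid.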

    If PST is included with a diagnostic coverage factor of 70% and, as stated above, the partial stroke test is assumed to run every 2 days, the full stroke test interval (TIFS) is 1 year and the mean repair time (MRT) is 1 day, then the PFDavg of the final element can be evaluated as follows. Note that Eq. 1 does not include PST.

    Equation 2 is an approximation of the PFDavg of the final element when PST is included. Since the PST hardware is part of the final element assembly, the claim for detected dangerous failures is conditional on the availability of the PST hardware. When the PST hardware is available, the diagnostic is performed automatically by the on-line logic solver or by local firmware. This means the second part of Eq. 2 is zero. When the PST hardware is not available, the PFDavg of the (V+S) reverts to Eq. 1, unless a shutdown takes place. This can be represented as shown in Figure 3.

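    Figure 3: Final element with PST. [Fault tree values recovered from the figure:]

    PST hardware: PFDavg-pst = 2.5E-3, so 1 - PFDavg-pst = 1 - 2.5E-3 = 0.9975
    V&S with PST hardware available: PFDavg_FS = 3.75E-3 and PFDavg_PS = 9.5E-5, sum 3.845E-3; weighted by 0.9975 gives 3.84E-3
    V&S with PST hardware unavailable: PFDavg-vs = 1.25E-2; weighted by 2.5E-3 gives 3.125E-5
    Result (Eq. 2): PFDavg = 3.87E-3 (SIL 2) as per Table 2 of IEC 61508-1
    What is the SIL by SFF, as per Table 3 of IEC 61508-2?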
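The values in Figure 3 can be reproduced with the following sketch. The split of the final element PFDavg into a partial stroke term, a full stroke term and a fallback term weighted by the PST hardware unavailability is a reading of the figure values, not a formula quoted from the paper; the variable names and the 365-day year are assumptions of this sketch.

```python
DAYS_PER_YEAR = 365.0

lam_du_vs = 0.025            # valve + solenoid, failures per year
lam_du_pst = 0.005           # PST positioner controller, failures per year
dc = 0.70                    # diagnostic coverage claimed for PST
ti_fs = 1.0                  # full stroke (proof) test interval, years
ti_ps = 2.0 / DAYS_PER_YEAR  # partial stroke test interval, years
mrt = 1.0 / DAYS_PER_YEAR    # mean repair time, years

# PFDavg of the PST hardware itself, proof tested once a year.
pfd_pst_hw = lam_du_pst * ti_fs / 2.0

# Failures revealed by PST: exposed for half a PST interval plus the repair time.
pfd_ps = dc * lam_du_vs * (ti_ps / 2.0 + mrt)

# Failures revealed only by the full stroke test.
pfd_fs = (1.0 - dc) * lam_du_vs * ti_fs / 2.0

# Valve + solenoid without any PST credit (PST hardware unavailable).
pfd_vs_no_pst = lam_du_vs * ti_fs / 2.0

# Weight the two cases by the availability of the PST hardware.
pfd_final = (1.0 - pfd_pst_hw) * (pfd_fs + pfd_ps) + pfd_pst_hw * pfd_vs_no_pst

print(f"PFDavg_PS = {pfd_ps:.3e}")
print(f"PFDavg_FS = {pfd_fs:.3e}")
print(f"Final element PFDavg = {pfd_final:.3e}")
```

The computed partial stroke term is slightly above the 9.5E-5 shown in the figure, which appears to be a truncated value; the final result still rounds to 3.87E-3, inside the SIL 2 band.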

    The example above has demonstrated that the PFDavg of the final element is improved from SIL 1 to SIL 2 under the PFD criteria. The next question is: what is the final SFF of the element, as a measure of the architectural constraints defined in IEC 61508-2, clause 7.4.2.2? Bear in mind that the PST positioner, which is already type B, is supporting the (V+S), which is type A. According to edition 2, and as stated above, the random hardware failures of the diagnostics shall be taken into consideration in the element (see clause 7.4.9.4-J). That makes the final assembly type B, as credit has been claimed for detecting the undetected dangerous failures. The SFF of the assembly is then calculated as follows.

    For a given SFF = 55% and λDU = 0.025 f/yr, X0 = 0.031 f/yr, where X0 is the sum of the safe and dangerous detected failure rates without PST. Since PST is implemented as a diagnostic, part of the dangerous undetected failure rate (λDU) is reduced by the diagnostic coverage factor (70%). That makes the new SFF 0.86, or 86% (see Eq. 3 and Eq. 4 in the summary of the calculation).

    In this equation, the hardware of the PST positioner was not included in the overall assembly when working out the overall SFF. The PST hardware was only used as a means of claiming diagnostics for the final element. The credit gained from this hardware makes the final product somewhat more complicated: as a result of this diagnostic the product falls under type B, and hence, by reference to IEC 61508-2 Table 3, the product is SIL 1. As the above example shows, PST improved the SIL in terms of the low demand PFD bands of Table 2 of IEC 61508-1. Using the same feature to claim credit in the PFD and then using that credit to improve the SFF is not applicable, because it violates the above clauses: diagnostic credit cannot be claimed if the tools implementing the diagnostics are ignored and not included in the architectural constraints assessment. Of course, a remotely controlled PST is a supportive tool if used as a partial test to extend the full proof test interval. Using this feature for the wrong objective may lead to unsatisfactory results.

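    Summary of the calculation of the final element

    V+S: λDU = 0.025 f/yr; PST hardware: λDU = 0.005 f/yr; DC = 70%; partial stroke test interval TI_PS = 2 days; full stroke test interval TI_FS = 1 yr; MRT = 1 day.

    $$PFD_{avg,pst} = 0.005 \times \frac{1\ \text{yr}}{2} = 2.5\text{E-}3$$

    $$PFD_{avg,PS} = DC \cdot \lambda_{DU} \left( \frac{TI_{PS}}{2} + MRT \right) = 9.5\text{E-}5$$

    $$PFD_{avg,FS} = (1 - DC) \cdot \lambda_{DU} \cdot \frac{TI_{FS}}{2} = 3.75\text{E-}3$$

    PST hardware available: (3.75E-3 + 9.5E-5) x 0.9975 = 3.84E-3
    PST hardware unavailable: 0.0125 x 0.0025 = 3.125E-5
    PFDavg = 3.84E-3 + 3.125E-5 = 3.87E-3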

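    With X = λS + λDD (safe plus dangerous detected failure rates) and no PST:

    $$SFF_{V+S,0} = \frac{X_0}{X_0 + \lambda_{DU}} \qquad \text{(Eq. 3)}$$

    $$X_0 = \frac{SFF_{V+S,0} \cdot \lambda_{DU}}{1 - SFF_{V+S,0}} = \frac{0.55 \times 0.025}{0.45} \approx 0.031\ \text{f/yr} \qquad \text{(Eq. 4)}$$

    With PST claimed as a diagnostic (DC = 70%):

    $$SFF_{V+S,1} = \frac{X_0 + DC \cdot \lambda_{DU}}{X_0 + \lambda_{DU}} = 0.86 \text{ or } 86\%$$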
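The SFF steps above can be checked numerically, together with the architectural constraint lookup for HFT = 0. The sketch below is an illustration only: the function names are assumptions, and the lookup covers just the HFT = 0 rows of the type A and type B tables in IEC 61508-2 (Tables 2 and 3 of edition 2), where 0 means not allowed.

```python
def x0_from_sff(sff0: float, lam_du: float) -> float:
    """Solve SFF0 = X0 / (X0 + lam_du) for X0 (safe + dangerous detected rate)."""
    return sff0 * lam_du / (1.0 - sff0)


def sff_with_dc(x0: float, lam_du: float, dc: float) -> float:
    """SFF after a fraction dc of lam_du is moved from undetected to detected."""
    return (x0 + dc * lam_du) / (x0 + lam_du)


def max_sil_hft0(sff: float, element_type: str) -> int:
    """Maximum SIL allowed by the architectural constraints at HFT = 0."""
    if element_type == "A":
        limits = [(0.99, 3), (0.90, 3), (0.60, 2), (0.0, 1)]
    else:  # type B
        limits = [(0.99, 3), (0.90, 2), (0.60, 1), (0.0, 0)]
    for threshold, sil in limits:
        if sff >= threshold:
            return sil
    return 0


x0 = x0_from_sff(0.55, 0.025)        # about 0.031 f/yr
sff_1 = sff_with_dc(x0, 0.025, 0.70)  # about 0.865, quoted above as 86%

print(f"X0 = {x0:.4f} f/yr, SFF with PST = {sff_1:.3f}")
print("V+S alone, type A, SFF 55%:", "SIL", max_sil_hft0(0.55, "A"))
print("V+S with PST, type B:", "SIL", max_sil_hft0(sff_1, "B"))
```

Both cases come out at SIL 1: the gain in SFF from the claimed diagnostic is offset by the assembly becoming type B.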

    6 Evaluating product certification

    Basically, every product should be quantitatively and qualitatively assessed as stated in clause 7.4.2.2 of IEC 61508-2. The stated parts are as follows:

    a) Hardware safety integrity, which consists of:

    1- Route 1H (based on FMEA) or Route 2H (based on field data)
    2- Quantifying the random hardware failures, HFT and SFF

    b) Systematic safety integrity (systematic capability), which can be achieved by selecting any one of the routes: Route 1S (avoidance and control), Route 2S (proven in use) or Route 3S (software only)

    c) Data process communication (for remote PST)

    The following questions need to be considered:

    How should the product certificate be understood?
    What is the safety integrity level capability of the product safety function, and how is it determined?
    What information shall the certificate detail in order to allow precise selection of the required products for a given SIF?

    The product certificate shall be based on the points listed above. If any of the points is not specified, then requesting the assessment report for clarification may have to be considered. The product certificate should state the method used for the random hardware failure assessment, e.g. FMEA, FTA or Markov, along with the systematic safety integrity assessment (Routes 1S, 2S, 3S). The latter is measured by qualitative means, i.e. evidence of strong documentation, corrective actions, modification and project management; it is mainly measured against Annex B of IEC 61508-2. For example, a product certificate may state a systematic capability (SC x, where x is 1, 2 or 3) without making clear whether the product was fully assessed against the above points. A systematic capability on its own does not mean that the product is able to meet a specific safety integrity level unless the PFD (for low demand) or the PFH (for high demand or continuous mode) is stated. For example, a systematic capability of SC 3 is a statement that the manufacturer has met all the systematic safety integrity measures (e.g. the documentation, the design techniques and measures, and the functional safety management used throughout the product realisation lifecycle) to a level of rigour that satisfies SIL 3.
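One way to read the points above is that the SIL a product can support is limited by the weakest of three results: the PFD band, the architectural constraints and the systematic capability. The sketch below illustrates that reading; the class and field names are assumptions of this sketch, and the SC 3 value in the example is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CertificateData:
    """Hypothetical summary of what a certificate or assessment report states."""
    sil_by_pfd: int             # SIL band from PFDavg (Table 2 of IEC 61508-1)
    sil_by_architecture: int    # SIL allowed by type, HFT and SFF
    systematic_capability: int  # SC figure (1, 2 or 3)

    def claimable_sil(self) -> int:
        """The supported SIL is limited by the lowest of the three results."""
        return min(self.sil_by_pfd, self.sil_by_architecture,
                   self.systematic_capability)


# Final element of the example: SIL 2 by PFD with PST, SIL 1 by architecture
# (type B, HFT = 0, SFF 86%), with an assumed SC 3 for illustration.
final_element = CertificateData(sil_by_pfd=2, sil_by_architecture=1,
                                systematic_capability=3)
print("Claimable SIL:", final_element.claimable_sil())
```

A certificate that reports only one of the three figures leaves the other two limits unknown, which is why the assessment report may need to be requested.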

    7 Conclusion

    PST is an invaluable on-line tool, which is very useful in improving the PFD of the SIF because it helps to reveal part of the undetected dangerous failures when treated as a partial proof test, not as a diagnostic test. It should not be used to influence the calculation of the safe failure fraction (SFF), as has been done for some products under IEC 61508 (ed. 1). If it is used as an automated diagnostic tool, the additional programmable firmware shall be assessed as type B to identify any undetected dangerous failures and should be included in the overall architectural constraints assessment of the fully integrated ESD valve. The assessment shall be carried out to IEC 61508 (ed. 2), and PVST credit shall not be used to overcome any redundancy requirement.

    It is apparent that product SIL classification does not rely only on PFD values; it also relies on the architectural constraints. As stated in [9], ESD valves should be assessed not only on PFD but also on SFF without PST, unless the MTTR claimed includes the PST interval and the timing requirements are met in terms of the process safety time and the constraints implied by the MTTR, as mentioned in IEC 61508-2. A product certificate should clearly indicate the hardware assessment route, the HFT, the SFF and the systematic safety integrity route, together with the systematic capability (SC x) figure. The latter shall not be taken as the only measure of a product's suitability for use at a particular SIL, as it could be a misleading factor in the product certificate. The certification of any product certified under IEC 61508 (ed. 1) shall be reviewed against the latest edition. The safety integrity level (SIL) of the final element does not rely on PST features, irrespective of whether they are implemented internally or externally in the SIF. SIL relies on the type, the HFT and the systematic capability.

    Acknowledgements

    The author would like to thank Dr Bassem Alachkar, Paul Reeve, Harvey Dearden and James Lynskey for spending the time to proofread this paper.

    References

    [1] BS EN 61508:2002, Functional safety of electrical/electronic/programmable electronic safety-related systems.
    [2] BS EN 61508-2:2010, Ed. 2.0, Functional safety of electrical/electronic/programmable electronic safety-related systems, Part 2: Requirements for electrical/electronic/programmable electronic safety-related systems.
    [3] Chris O'Brien, "Too good to be true", April 2012, Safety automation element list.
    [4] Robin McCrea-Steele, "Partial Stroke Testing: Implementing for the Right Reasons", ISA EXPO 2005, 25-27 October 2005.
    [5] A.F.M. Prins, "Partial Stroke Testing", Yokogawa, System Centre Europe, 2010.
    [6] Bill Mostia, "Partial Stroke Testing, Simple or Not?", Control magazine, Nov. 2003.
    [7] BS EN 61508-4:2010, Ed. 2.0, Functional safety of electrical/electronic/programmable electronic safety-related systems, Part 4: Definitions and abbreviations.
    [8] Translation of special print from atp, Automatisierungstechnische Praxis, Volume 47, Issue 4, 2005.
    [9] Harvey Dearden, "Partial Stroke Testing: Diagnostic or Proof Test?", Inst. M&C, vol. 46, no. 5, June 2013.
    [10] Web Guideline, M-2790-x-11, SIS automated block valves (ABV) Assemblies, draft copy, Sept. 2012.