acm symposium on applied computing nicosia cyprus march 16, 2004

76
OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY Assessing the Effect of Failure Severity, Coincident Failures and Usage-Profiles on the Reliability of Embedded Control Systems Frederick T. Sheldon, Ph.D. Oak Ridge National Laboratory Kshamta Jerath Microsoft Corporation ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

Upload: micol

Post on 01-Feb-2016

32 views

Category:

Documents


0 download

DESCRIPTION

Frederick T. Sheldon, Ph.D. Oak Ridge National Laboratory Kshamta Jerath Microsoft Corporation. Assessing the Effect of Failure Severity, Coincident Failures and Usage-Profiles on the Reliability of Embedded Control Systems. ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Assessing the Effect of Failure Severity, Coincident Failures and Usage-Profiles on the

Reliability of Embedded Control Systems

Frederick T. Sheldon, Ph.D.Oak Ridge National Laboratory

Kshamta JerathMicrosoft Corporation

ACM Symposium on Applied Computing

Nicosia CyprusMarch 16, 2004

Page 2: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

2Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

The Modeling Cycle

Descriptive modelingComputational

modelingMaking it tractable

Model solutionValidation and model

refinementOperationalProposed

SystemMeasure

s

System Model(abstract level)

Computational Models

Model Solutions

OperationalValidation

Modified Modeland Data

Performance Measures

Real SystemProposed System

Computational Modeling

Descriptive Modeling

Validation and Model Refinement

Model Solution

Page 3: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

3Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

State Transition System

Deciding how the faults affect nominal and off nominal operation

Failure modesLoss of vehicleLoss of stabilityDegraded functionOver/Under-steer

Automatic pumpingof the brakes

Normalbraking

Pressure tothe brakes

Rear endslides out

Turning thesteering wheel

Normalturn

Apply brakes to tires onopposite side going into theslide

Apply brakes to tires onside going into the slide

Operatingthe car

Over-steerFront tiresslide

Turning

Under-steer

Slipping of anyone wheel

Braking

EngageABS

Accelerate

Apply brakes to RR tire

Activateaccerator

pedal

Normalacceleration

Slipbetween

LR tire androad

SlipbetweenRR tireand road

Right RearSlipage

Left RearSlipageApply brakes to LR tire

Page 4: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

4Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Page 5: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

5Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Agenda

Synopsis, Goals, Definition and Motivation Example Embedded System – The Anti-lock

Braking System Modeling Strategy, SPN Models and SAN

Models Reliability Analysis Results and Discussion Conclusion and Scope of Future Work

I

II

III

IV

V

Page 6: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

6Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis: Stochastic Modeling Case Study of Anti-lock Braking System

Problem/Domain: Model road vehicle ABS emphasizing failure severity, coincident failures and usage profiles using SPNs and SANs formalisms.

Challenges: Need to handle large state space –complex systems often include many layers

of complexity and numerous constituent components For realistic results we must model components to a sufficient level of detail Models should be scalable and extensible to accommodate the larger context

Benefits: Greater insight about contribution of components and non-functional factors to the overall system reliability. Establishes a framework for studying important factors that determine system

reliability Related work:

F.T. Sheldon and K. Jerath, “Specification, safety and reliability analysis using Stochastic Petri Net models”, in Proc. Int’l Symp. on Applied Computing , Nicosia Cyprus, pp. 826-833, Mar. 14-17, 2004.

Page 7: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

7Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis: Stochastic Modeling Case Study of Anti-lock Braking System

Problems/Results: Transient analysis of SPNs (using Stochastic Petri Net Package v. 6) and Stochastic Activity Network (UltraSAN v. 3.5) models was carried out and the results compared for validation purposes. Results emphasized the importance of modeling failure severity, coincident

failures and usage-profiles for measuring system reliability. Status/Plans:

Carry out the sensitivity analysis for the models developed to gain an insight into which components affect reliability more than others.

Model the entire system. ABS is a small part of the Dynamic Driving Regulation system and shares components with the ESA (Electronic Steer Assistance) and TC (Traction Control).

Simulation needed to model of the entire system. The model of the system would be too complex to allow numerical means of analysis.

Validate the results of the analysis against real data (should data become available).

Page 8: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

8Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

ESA Electronic Steer Assistance

TC Traction Control

DDR (Dynamic Driving Regulation System)

Sensitivity Analysis

Simulation

PT Power Transmission

ABS (Anti-lock Braking System)

Experiments (Monitoring of real system)

Problem Identification & Requirements Analysis

Compose Model using Stochastic Petri Nets

Analysis using Stochastic Petri Net Package v. 6

Comparison of results (Semi-validation)

Complete validation by comparison against real data

feedback feedback

Compose Model using Stochastic Activity Nets

Analysis using UltraSAN v. 3.5 and Möbius

Page 9: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

9Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Goals

Model and analyze the Anti-lock Braking System (ABS) of a passenger vehicle.

Model severity of failures, coincident failures and usage-profiles.

Carry out the reliability analysis using different stochastic formalisms – Stochastic Petri Nets (SPNs) and Stochastic Activity Networks (SANs).

Develop an approach that is generic and extensible for this application domain.

Page 10: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

10Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Definition (1) Model: An abstraction of a system that includes

sufficient detail to facilitate an understanding of system behavior.

Reliability: Probability that a system will deliver intended functionality/quality for a specified period of time, given that the system was functioning properly at the start of this period.

Failure: An observed departure of the external result of operation from requirements or user expectations.

Page 11: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

11Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Definition (2) Severity of failure: The impact the failure has on

the operation of the system. An example of a service impact classification is critical, major and minor.

Coincident failures: All failures are not independent. Components generally interact with each other during operation and affect the probability of failure of other components.

Usage-Profiles: Quantitative characterization of how a system (hardware and software) is used. (a.k.a. operational profiles, workload)

Page 12: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

12Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Motivation Reliability analysis of an ABS model to

predict/estimate the likelihood and characteristic properties of failures occurring in the system. Reliability function & Mean Time To Failure (MTTF).

The need for a realistic, scalable & extensible model Important to model severity and coincident failures Important to model usage-profiles

Comparing results from two stochastic formalisms – SPNs and SANs Validation by comparison against actual data beyond the

scope of this research.

Page 13: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

13Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis, Goals, Definition and Motivation Example Embedded System – The Anti-

lock Braking System Modeling Strategy, SPN Models and SAN

Models Reliability Analysis Results and Discussion Conclusion and Scope of Future Work

Part II

Page 14: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

14Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Anti-lock Braking System (1)

An integrated part of the braking system of vehicle. Prevents wheel lock up during emergency stop by

modulating wheel pressure. Permits the driver to maintain steering control while

braking.

Main Components Wheel speed sensors. Electronic control unit (controller). Hydraulic control unit (hydraulic pump). Valves.

Page 15: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

15Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Anti-lock Braking System (2)

Functioning Wheel speed sensors measure wheel-speed. The electronic control unit (ECU) “reads” signals from

the wheel speed sensors. If a wheel’s rotation suddenly decreases, the ECU

orders the hydraulic control unit (HCU) to reduce the line pressure to that wheel’s brake.

The HCU reduces the pressure in that brake line by controlling the valves present there.

Once the wheel resumes normal operation, the control restores pressure to that wheel’s brake.

Page 16: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

16Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Top Level Schematic of ABS

Rear

R1 090

Anti-lock Breaking / Anti-skid Controller

Disc break (4 indpt)Wheel speed sensor (4 indpt)B1-4 = Brakes (LF, RF, LR, RR)S1-4 = Speed sensors (LF, RF, LR, RR)R1-2 Turning angles (of the vehicle and the tires respectively)

BrakePressure

Masterbreakcylinder

Electronic brakecontrol module(EBCM)

RR

LF

LR

RF0

R290

Hydraulicmodulator valveassembly

2

2 4

B1 B2

B3 B4S3 S4

S1 S2

Accerometer

Top level schematic showing sensors, processing and actuators

Page 17: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

17Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Detailed Schematic

BrakeActuationSystem

IV(F)

SV(F)

PRV (F)

BPP

DPS FL FR

IV(R)

SV(R)

PRV (R)

BPP

DPS RL RR

FR

Rear Disc Brakes

Hydrolic Control UnitRear (HC) SubunitFront (HC) Subunit

PD PD

Front Disc Brakes

EM

Tank

PS PS PSPSFVDV DV DV DV FVFVFV

Front Right Rear Left Rear RightFront Left

DC DC

HPC

Legend:BPP Back Pressure PumpDC Damping Chamber (w/ mixing)DPS Down Pressure StorageDV Drain ValveEM Electric MotorFL/R Front Left/RightFR Floating RotorFV Filling Valve

HC Hydrolic ControlHPC Hydrolic pressure circuitIV Intake ValveL/R Left/RightPD Pulsing DamperPRV Pressure Releave ValvePS Pressure SensorRL/R Rear Left/RightSV Switching Valve

Main BrakeCylinder

Page 18: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

18Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

ABS Assumptions

Modes of operation (different levels of degraded performance failure severity) Normal operation Degraded mode Lost stability mode

Lifetime of a vehicle: 300-600 hrs/yr for an average of 10-15 yrs (i.e. 3000-9000 hrs)

Four-channel four-sensor ABS scheme

Page 19: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

19Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Failure Rates of Components†

Component # Base Failure Rate

Probability

Degraded Operation

Loss of Stability

Loss of Vehicle

Wheel Speed Sensor 4 2.00E-11 0.38 0.62 -

Pressure Sensor 4 1.50E-11 0.64 0.36 -

Main Brake Cylinder 1 1.00E-11 - - 1.0

Pressure Limiting Valve 2 6.00E-13 - 0.22 0.78

Inlet Valve 4 6.00E-13 - 0.18 0.82

Drain Valve 4 6.00E-13 - 0.19 0.81

Toggle Switching Valve 2 6.00E-13 1.0 - -

Hydraulic Pump 2 6.80E-11 - - 1.0

Pressure Tank 2 2.00E-12 - - 1.0

Controller 1 6.00E-12 0.2 0.4 0.4

Tubing 1 3.00E-12 0.33 - 0.67

Piping 1 4.00E-12 0.33 - 0.67

Obtained from DaimlerChrysler. The data has been falsified for publishing as part of this research. †

Page 20: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

20Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis, Goals, Definition and Motivation Example Embedded System – The Anti-lock

Braking System Modeling Strategy, SPN Models and SAN

Models Reliability Analysis Results and Discussion Conclusion and Scope of Future Work

Part III

Page 21: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

21Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Modeling Mathematical (numerical solution) method

Defined over a given probability space and indexed by the parameter t (time).

Markov Processes Memoryless property: Future development depends

only on the current state and not how the process arrived in that state.

Markov Reward Models (MRM): Associate reward rates with state occupancies in Markov processes.

Common solution method for performability.

Page 22: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

22Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Modeling Challenges Practical Issues

Obtaining reliability data Limited ability of capturing interactions b/w

components Need to estimate fault correlation b/w components Incorporating usage information Direct validation of results

Problems in stochastic modeling Large state space: Size of the Markov model grows

exponentially with no. of components in the model. Stiffness: Due to the different orders of magnitude of

failure rates.

Page 23: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

23Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Petri Nets (SPNs)

Graphical and mathematical tool for describing and studying concurrent, asynchronous, distributed, parallel, non-deterministic and/or stochastic systems.

Concise description of the system, which can be automatically converted to underlying Markov chains.

Bipartite directed graph whose nodes are divided into two disjoint sets: places and transitions.

Page 24: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

24Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Petri Net SymbolsPlaces (drawn as circles) represent conditions.

Transitions (drawn as bars) represent events. Timed transitions and Immediate transitions.

Arcs (drawn as arrows) signify which combination of events must hold before/after an event. Input arcs and Output arcs.

Inhibitor arcs (drawn as circle-headed arcs) test for zero marking condition.

Tokens (drawn as small filled circles) denote the conditions holding at any given time.

Page 25: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

25Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Petri Net Package Stochastic Petri Net Package (SPNP) allows

specification of Stochastic Reward Nets (SRNs) and the computation of steady-state, transient, cumulative, time-averaged measures.

SRNs are specified using CSPL (C-based Stochastic Petri net Language).

Sparse Matrix techniques are used to solve the underlying Markov Reward Model (MRM).

Version 6

Page 26: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

26Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Severity and Coincident Failures (1)

Assumptions Exponential Failure Rates to

allow Markov chain analysis Levels of failure severity:

degraded mode, loss of stability (LOS) and loss of vehicle (LOV)

Impact of failure on failure rates:• Degraded – two orders of

magnitude• LOS – four orders of magnitude

Limited number of inter-dependencies modeled

Tubing

HydraulicPump

ControllerPressure

Tank

ToggleSwitch

DrainValve

Inlet Valve

Modeled

Not modeled

Inter-dependencies b/w components

Page 27: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

27Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Severity and Coincident Failures (2)

All ABS components represented in the global model.

Components grouped according to their cardinality.

degraded_operation, loss_of_stability and loss_of_vehicle places model severity of failure.

Next slide shows controller detail…

start

braking

axlecentral

central_op axle_op

mbrakecyl controller tubing piping

FLWheel

FRWheel RRWheelRLWheelaxleCentral

loss_of_vehicleloss_of_stabilitydegraded_operation

The SPN Model for ABS

Page 28: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

28Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

start

braking

axlecentral

central_op axle_op

mbrakecyl controller tubing piping

FLWheel

FRWheel RRWheelRLWheelaxleCentral

loss_of_vehicleloss_of_stabilitydegraded_operation

Page 29: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

29Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Severity and Coincident Failures (3)

controller

controllerOpcontrollerFail

failedController

controllerDegradedOp controllerLOSOp controllerLOVOp

controllerDegraded controllerLOS

degraded_operation loss_of_stability loss_of_vehicle

Model of an ABS component w/ coincident failures

Every component either functions “normally” as shown by controllerOp or “fails” as shown by controllerFail.

Failed component may cause degraded-operation, loss-of-stability or loss-of-vehicle.

Degraded-operation/ loss-of- stability: component continues to operate with increased failure rate (by 2 and 4 orders of magnitude respectively).

Page 30: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

30Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

controller

controllerOpcontrollerFail

failedController

controllerDegradedOp controllerLOSOp controllerLOVOp

controllerDegraded controllerLOS

degraded_operation loss_of_stability loss_of_vehicle

Page 31: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

31Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Severity and Coincident Failures (4)

Each failure transition has a variable rate determined by a corresponding function.

Failure of component B affects failure rate of component A by including the condition:

if failedB then

failureA = failureA * order

where order is 100 in case of degraded operation and 10000 in case of loss of stability.

double controllerRate(){ double controller_rate = 0.0000006;

if (mark("controllerLOS") > 0) return controller_rate * 10000;

if ((mark("controllerDegraded") > 0) || mark("tubingDegraded") > 0)) return controller_rate * 100;

return controller_rate;}

Variable Rate to Model Coincident Failures

Page 32: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

32Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Usage-Profiles (1)

User’s interact with the system in an intermittent fashion, resulting in operational workload profiles that alternate between periods of “active” and “passive” use.

Assumptions Exponential Failure Rates to allow Markov chain

analysis. Infinite repair rate all repairs occur instantaneously. Exponentially distributed workload. Two usage-profiles: Low usage and High usage which

are two orders of magnitude different.

Page 33: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

33Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

controller

controllerOpcontrollerFail

failedController

controllerDegradedOp controllerLOSOp controllerLOVOp

controllerDegraded controllerLOS

degraded_operationloss_of_stability loss_of_vehicle

inUseControllerrepairableController

alphamu

repair

Model of an ABS component w/ usage-profiles

SPN Models Representing Usage-Profiles (2)

When a component fails, check if it was in “active” use or not.

The parameter 1/mu indicates the mean duration of active use while the parameter 1/alpha indicates the mean duration of passive use.

Failure of component in “active” mode only affects reliability.

Page 34: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

34Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

controller

controllerOpcontrollerFail

failedController

controllerDegradedOp controllerLOSOp controllerLOVOp

controllerDegraded controllerLOS

degraded_operationloss_of_stability loss_of_vehicle

inUseControllerrepairableController

alphamu

repair

Page 35: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

35Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Models Representing Usage-Profiles (3)

State explosion problem due to increased number of states.

Work-around: The model was simplified to incorporate the usage parameters while calculating the failure rate itself for each component.

The value of mu was assumed to be 2.5 for infrequent use periods and 250 for frequent use periods.

Variable Rate to Model usage-profiles

double controllerRate(){ double controller_rate = 0.0000006;  // usage parameter controller_rate += controller_rate * mu;  if (mark("controllerLOS") > 0) return controller_rate * 10000;

if ((mark("controllerDegraded") > 0) || (mark("tubingDegraded") > 0))

return controller_rate * 100; return controller_rate;}

Page 36: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

36Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Measure Reliability measure expressed

in terms of expected values of reward rate functions.

The reliab() function defines a single set of 0/1 rewards.

Used as an input argument to

void pr_expected(char* string, double (*func)())

provided by SPNP that computes the expected value of the measure returned by func.

Function to calculate reliability reward

double reliab(){ double reward; if((mark("loss_of_vehicle") >= 1) || (mark("loss_of_stability") >= 3) || (mark("degraded_operation") >= 5)) reward = 0; else reward = 1; return reward;}

Page 37: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

37Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Halting Condition Necessary to explicitly impose a

halting condition because the developed SPN models recycle tokens.

The system is assumed to fail when> 5 components function in a

degraded mode, or > 3 components cause loss of

stability, or the failure of an important

component causes loss of vehicle.Function to evaluate for Halting Condition

int halt() { if((mark("loss_of_vehicle") >= 1) || (mark("loss_of_stability") >= 3) || (mark("degraded_operation") >= 5)) return 0; else return 1; }

*When this function evaluates to zero, the marking is considered to be absorbing.

Page 38: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

38Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Activity Networks (SANs) A generalization of SPNs, permit the

representation of concurrency, fault tolerance, and degradable performance in a single model.

Use graphical primitives, are more compact and provide greater insight into the behavior of the network.

Permit both the representation of complex interactions among concurrent activities (as can be represented in SPNs) and non-determinism in actions taken at the completion of some activity.

Page 39: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

39Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Stochastic Activity Network Modeling Constructs

Places (drawn as circles) represent the state of the modeled system

Activities (drawn as ovals) represent events. Timed and Instantaneous activities. Case probabilities (as circles on right of activity).

Input Gates (triangles with point connected to activity) control the enabling of activities.

Output Gates (triangles with flat side connected to activity) define the marking changes that occur when activity completes.

Page 40: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

40Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

UltraSAN An X-windows based software tool for evaluating

systems represented as SANs. Three main tools: SAN editor, composed model

editor, performance model editor. Analytical solvers as well as simulators available. Steady-state and transient solutions are possible. Reduced base model construction used to

overcome largeness of state-space problem. Version 3.5

Page 41: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

41Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Severity and Coincident Failures (1)

Assumptions Exponential Failure Rates to

allow Markov chain analysis Levels of failure severity:

degraded mode, loss of stability (LOS) and loss of vehicle (LOV)

Impact of failure on failure rates:• Degraded – two orders of magnitude• LOS – four orders of magnitude

Limited number of inter-dependencies modeled

Tubing

HydraulicPump

ControllerPressure

Tank

ToggleSwitch

DrainValve

Inlet Valve

Modeled

Not modeled

Inter-dependencies b/w components

Page 42: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

42Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Severity and Coincident Failures (2)

Three individual SAN sub-models: Central_1, Central_2 and Wheel (replicated four times).

The division into three sub-categories done to facilitate representation of coincident failures.

Avoid replication of sub-nets where unnecessary.

The Composed SAN Model for ABS

Hydraulic Pump

Toggle Switch Pressure Tank

DrainValve InletValve

Replicate(4) Central

Wheel

Join

Main brake cylinder

Piping

Hydraulic Pump

Pressure Sensor Speed Sensor Toggle Switch

Limiting Valve

Pressure Tank

DrainValve InletValve DrainValve InletValve

Replicate(4) Central_1

Wheel

Join

Tubing Controller

Central_2

Page 43: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

43Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Severity and Coincident Failures (3)

All subnets share common places: degraded, LOS, LOV and halted.

Presence of tokens in degraded, LOS, and LOV places indicates degraded operation, loss of stability and loss of vehicle resp.

Output cases of an activity have different probabilities to model conflict between the outcome of failure.

Central_2 subnet with the Controller component highlighted

central_2 central2_op

central2_out

hydraulicPump

pressureTank

controller

tubing

hydraulicPumpFail

pressureTankFail

controllerFail

tubingFail

toggleSwitchDegraded_out

controllerLOS_out

tubingDegraded_out

toggleSwitchDegraded

controllerLOS

tubingDegraded

LOV

LOS

degraded

halt_testhalttoggleSwitch toggleSwitchFail

controllerDegraded_out

controllerDegraded

halted

HPFailInhibit

PTFailInhibit

TSFailInhibit

haltInhibit

CFailInhibit

TFailInhibit

Page 44: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

44Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

central_2 central2_op

central2_out

hydraulicPump

pressureTank

controller

tubing

hydraulicPumpFail

pressureTankFail

controllerFail

tubingFail

toggleSwitchDegraded_out

controllerLOS_out

tubingDegraded_out

toggleSwitchDegraded

controllerLOS

tubingDegraded

LOV

LOS

degraded

halt_testhalttoggleSwitch toggleSwitchFail

controllerDegraded_out

controllerDegraded

halted

HPFailInhibit

PTFailInhibit

TSFailInhibit

haltInhibit

CFailInhibit

TFailInhibit

Page 45: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

45Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Severity and Coincident Failures (4)

Degraded-operation/ loss-of- stability: failure rate increases (by 2 and 4 orders of magnitude respectively).

Failure of component A to degraded mode causes the failure rate of component B to increase by 2 orders.

Failure of component A to a loss of stability mode causes the failure rate of component B to increase by 4 orders.

Activity Rates Model Severity and Coincident Failures

Activity

controllerFail

RateProbability

MARK(controllerLOS) !=0?controllerRate*10000:

(MARK(controllerDegraded) !=0|| MARK(tubingDegraded) !=0

?controllerRate*100:controllerRate)

Case1 Case2

0.4 0.4

Case3

0.2

hydraulicPumpFail

MARK(controllerLOS) !=0?hydraulicPumpRate*10000:

(MARK(controllerDegraded) !=0?hydraulicPumpRate*100

:hydraulicPumpRate)

1.0 - -

Page 46: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

46Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Activity

controllerFail

RateProbability

MARK(controllerLOS) !=0?controllerRate*10000:

(MARK(controllerDegraded) !=0|| MARK(tubingDegraded) !=0

?controllerRate*100:controllerRate)

Case1 Case2

0.4 0.4

Case3

0.2

hydraulicPumpFail

MARK(controllerLOS) !=0?hydraulicPumpRate*10000:

(MARK(controllerDegraded) !=0?hydraulicPumpRate*100

:hydraulicPumpRate)

1.0 - -

Activity Rates Model Severity and Coincident Failures

Page 47: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

47Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Usage-Profiles (1)

Assumptions Exponential Failure Rates to allow Markov

chain analysis. Infinite repair rate: all repairs occur

instantaneously. Exponentially distributed workload. Two usage-profiles: Low usage and High

usage which are one order of magnitude different.

Page 48: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

48Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Models Representing Usage-Profiles (2)

When a component fails, check if it was in “active” use or not.

Failure of component in “active” mode only affects reliability.

Work around the state explosion problem by incorporating the usage parameters while calculating the failure rate of component (lambda+mu).

mu same for all componentsState Diagram for reliability evaluation

00 01

10 F

alpha

mu

mu

lambda lambdav

mu-active use rate alpha-passive use rate

lambda-failure rate v-repair rate

Page 49: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

49Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Measure Reward rates specified using

a predicate and function.

If the system is not in an absorbing state (system failed), reliability is a function of the number of tokens in degraded, LOS and LOV.

For normal operation, the function evaluates to 1. Reliability is 0 when the predicate evaluates to false, by default.

Reward Rate to Calculate Reliability

Predicate: MARK(halted)==0 Function: 1.0/(1+MARK(degraded)+MARK(LOS)

+MARK(LOV))

Page 50: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

50Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Halting Condition

central_2 central2_op

central2_out

hydraulicPump

pressureTank

controller

tubing

hydraulicPumpFail

pressureTankFail

controllerFail

tubingFail

toggleSwitchDegraded_out

controllerLOS_out

tubingDegraded_out

toggleSwitchDegraded

controllerLOS

tubingDegraded

LOV

LOS

degraded

halt_testhalttoggleSwitch toggleSwitchFail

controllerDegraded_out

controllerDegraded

halted

HPFailInhibit

PTFailInhibit

TSFailInhibit

haltInhibit

CFailInhibit

TFailInhibit

SAN Halting Condition Depicted

Input condition on each activity states that it is enabled only if there is no token in halted place (common to all subnets).

Presence of token in halted place indicates an absorbing state.

Page 51: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

51Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

central_2 central2_op

central2_out

hydraulicPump

pressureTank

controller

tubing

hydraulicPumpFail

pressureTankFail

controllerFail

tubingFail

toggleSwitchDegraded_out

controllerLOS_out

tubingDegraded_out

toggleSwitchDegraded

controllerLOS

tubingDegraded

LOV

LOS

degraded

halt_testhalttoggleSwitch toggleSwitchFail

controllerDegraded_out

controllerDegraded

halted

HPFailInhibit

PTFailInhibit

TSFailInhibit

haltInhibit

CFailInhibit

TFailInhibit

Page 52: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

52Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis, Goals, Definition and Motivation Example Embedded System – The Anti-lock

Braking System Modeling Strategy, SPN Models and SAN

Models Reliability Analysis Results and Discussion Conclusion and Scope of Future Work

Part IV

Page 53: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

53Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Analysis Results

Transient Analysis carried out using SPNP (Stochastic Petri Net Package) version 6 on a Sun Ultra 10 (400 MHz) with 500 MB memory.

164,209 tangible markings of which 91,880 were absorbing.

Approximate running time of the solver was 144-168 hrs.

Page 54: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

54Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Results for Coincident Failures and Severity (1)

The Y-axis gives the measure of interest i.e. reliability, the time range (0 to 50K hrs) is along X-axis.

MTTF for the model with coincident failures (784,856.4 hrs) is ~421 hrs less than without coincident failures (785,277.6 hrs).

Reliability of ABS

0.75

0.8

0.85

0.9

0.95

1

1.05

Time (in hrs)

Without coincident failures

With Coincident Failures

MTTF (w/o) = 785277.6 hrs.MTTF (with)= 784856.4 hrs.

SPN Reliability Analysis Results for Coincident Failures and Severity

Page 55: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

55Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Analysis Results for Coincident Failures and Severity

Page 56: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

56Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Results for Coincident Failures and Severity (2)

Graph shows the difference between the reliability functions.

Start diverging around 350 hrs of operation.

The difference in reliability between the two cases becomes marked (after 13K hrs) only beyond the average lifetime of the vehicle (3K-9K hrs).Difference in Reliability Functions

Difference in reliability functions

0

0.00002

0.00004

0.00006

0.00008

0.0001

0.00012

0.00014

0.00016

0.00018

Time (in hours)

Page 57: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

57Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Difference in Reliability Functions (With and without coincident failures)

Page 58: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

58Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Analysis Results for Usage Profiles

Reliability Analysis with Usage Profiles

0

0.2

0.4

0.6

0.8

1

1.2

Time (in hours)

Low Usage

High Usage

MTTF (Low Usage) = 775111.7 hrs.MTTF (High Usage) = 771022.9 hrs.

SPN Reliability Results for Usage Profiles

MTTF for the high usage case is 771,022.9 hrs as opposed to 775,111.7 hrs for the low usage case, a difference of ~ 4089 hrs

Reliability of the system with heavy usage decreases alarmingly (!) within the first 1K hrs, while the reliability of the system with low usage decreases perceptibly (!!) only after 2.5K hrs of operation and then steadily thereafter

Page 59: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

59Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SPN Reliability Analysis Results for Usage Profiles

Page 60: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

60Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Results

Transient Analysis carried out using UltraSAN version 3.5 on a Sun Ultra 10 (400 MHz) with 500 MB memory.

859,958 states generated.

Approximate running time of the solver (transient solver trs) was 120-144 hrs.

Page 61: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

61Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Results for Coincident Failures and Severity

Reliability functions diverge perceptibly after around 1K hrs of operation, difference increases w/ time.

After 5K hrs the difference is 0.025, after 10K hrs 0.049.

Time to failure for model with coincident failures is 25,409 hrs, for model without coincident failures is 29,167 hrs (diff. of 3,758 hrs).

SAN Reliability Analysis Results for Coincident Failures and Severity

Reliability of the ABS

0

0.2

0.4

0.6

0.8

1

1.2

10 50 100 500 1000 5000 10000 20000 30000 40000 50000

Time (in hours)

Without Coincident Failures

With Coincident Failures

Page 62: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

62Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Analysis Results for Coincident Failures and Severity

Page 63: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

63Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Usage Profiles Results

System Reliability with heavy usage decreases alarmingly after 100hrs, while the reliability of the system with low usage decreases only perceptibly after 100hrs of operation.

At the extreme end of average lifetime (9Khrs) of the vehicle, reliability has dropped to ~0 for heavy usage and to ~0.4 for low usage.

Time to failure for model with low usage is 12,262hrs, for model with high usage is 1,687 hrs (diff. of 10,575hrs).

SAN Reliability Analysis Results for Usage Profiles

Reliability of ABS with Usage-Profiles

0

0.2

0.4

0.6

0.8

1

1.2

0 10 50 100 500 1000 5000 10000 20000 30000 40000 50000Time (in hours)

Low Usage

High Usage

Page 64: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

64Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

SAN Reliability Analysis Results for Usage-Profiles

Page 65: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

65Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Comparing the SPN & SAN Results (1)

Because it is beyond the scope of this research to validate the results from the analytic experiments against real data, . . . we compare the results from SPN & SAN analyses.

The difference in the range of actual reliability values between the SPN and SAN models may be attributed to the different ways in which the reliability reward is defined. See the plots where both curves are in the same graph

Severity and Coincident Failures SPNs - The curves for the two cases completely overlapped. SANs - The curves diverge after 1K hrs of operation.

Page 66: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

66Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Comparison of SPN and SAN Reliability Results for Models Representing Severity and Coincident Failures

Page 67: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

67Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Comparison of SPN and SAN Reliability Results for Models Representing Usage-Profiles (with failure severity and coincident failures)

Page 68: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

68Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Comparing the SPN & SAN Results (2)

Usage Profiles SPNs – Reliability for high usage decreases alarmingly

within first 1K hrs, for low usage only after 2.5K hrs. SANs - Reliability for high usage decreases alarmingly

after 100 hrs, for low usage only perceptibly after 100 hrs. Results from both models agree on the fact that failure

severity, coincident failures and usage-profiles contribute significantly to predicting system reliability.

Which of these results is more realistic? Comparing results does not make up for validation

against real data.

Page 69: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

69Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Comparing the SPN & SAN Results (3)

Criteria SPN Models SAN ModelsAssumptions Same Same

Reliability measure Different Different

Number of states 164,209 859,958

Solvers’ Running time 144-168 hrs 120-144 hrs

Reliability at 9Khrs (severity & co.failures)

9.5792578e-01 vs. 9.5792653e-01

7.3672e-01 vs. 7.8600e-01

Reliability at 9Khrs (usage-profiles)

8.9621556e-01 vs. 7.6658329e-01

4.455167e-01 vs. 3.130521e-03

Page 70: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

70Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Synopsis, Goals, Definition and Motivation Example Embedded System – The Anti-lock

Braking System Modeling Strategy, SPN Models and SAN

Models Reliability Analysis Results and Discussion Conclusion and Scope of Future Work

Part V

Page 71: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

71Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Conclusions (1) Modeling and Analysis: The Anti-lock Braking

System of a passenger vehicle was modeled (with emphasis on failure severity, coincident failures and usage profiles) and analyzed.

Realistic Models: The models were built incrementally to achieve the best balance between faithfulness to the real system and keeping the model tractable at the same time.

Extensible Models: The models developed can be easily extended to incorporate different levels of severity, other coincident failures and usage levels.

Page 72: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

72Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Conclusions (2) Two stochastic formalisms: Stochastic Petri Nets &

Stochastic Activity Networks, were used to analyze the developed models for reliability measures.

Results justified the modeling strategy adopted and highlighted the importance of modeling severity, coincident failures and usage-profiles while examining system reliability.

This research has successfully established a framework for investigating system reliability and the basis for further investigations in this application domain.

Page 73: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

73Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Future Work (1) Sensitivity Analysis: The analysis of the effect of

small variations in system parameters on the output measures and can be studied by computing the derivatives of the output measures with respect to the parameter.

Model the entire system: The ABS is a small part of the DDR (Dynamic Driving Regulation) system which consists of other subsystems like the Electronic Steering Assistance (ESA) and the traction control (TC).

Page 74: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

74Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Future Work (2) Simulation: Evaluate the (complex) model

numerically in order to estimate the desired true characteristics of the system.

Validation: Results from experiments on the real system to validate analysis results to incrementally arrive at a realistic model.

Generalization of modeling strategy for modeling both software and hardware components and the way of representing severity, coincident failures and usage profiles.

Page 75: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

75Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

Contact Information

Frederick T. Sheldon, Ph.D. and Tom Potok, Ph.D.

Software Engineering for Dependable for Systems

Applied Software Engineering Laboratory

Rick: 865-576-1339Tom: 865-574-0834Fax: 865-574-6275

URL: http://www.csm.ornl.gov/~sheldon

http://computing.ornl.gov/cse_home/acer.shtml

Page 76: ACM Symposium on Applied Computing Nicosia Cyprus March 16, 2004

76Software Engineering for Secure Dependable Systems Applied SE Research, Computational Sciences and Engineering Division

OAK RIDGE NATIONAL LABORATORY – U.S. DEPARTMENT OF ENERGY

The End