aspira dependability prediction with ultrasan · 2000-10-08 · aspira dependability prediction...



Post on 14-Mar-2020


Page 1: Aspira Dependability Prediction with UltraSAN · 2000-10-08 · Aspira Dependability Prediction with UltraSAN Aspira Systems Engineering Bryce Kuhlman Steve Beaudet 2 10-4-00 Vision

1 10-4-00

Aspira Dependability Prediction with UltraSAN

Aspira Systems Engineering

Bryce Kuhlman

Steve Beaudet

2 10-4-00

Vision Statement

A standardized method of system dependability modeling that streamlines communication between designers and dependability analysts and provides a framework for the development of 99.999% available systems.
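To make the 99.999% target concrete, the sketch below converts a steady-state availability into an annual downtime budget. The function name and the 365.25-day year are assumptions chosen for illustration and are not taken from the slides.

```python
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def annual_downtime_minutes(availability: float) -> float:
    """Expected unavailable minutes per year for a steady-state availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three, four, and five nines.
    for target in (0.999, 0.9999, 0.99999):
        print(f"{target:.5f} -> {annual_downtime_minutes(target):.2f} min/year")
```

Under these assumptions, five nines leaves a budget of roughly 5.3 minutes of downtime per year, which is why every source of service interruption has to be accounted for.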


3 10-4-00

Goals

• Modeling used as a tool to develop detailed understanding of system dependability performance

• Analysis throughout the entire product life cycle.

• Streamlined communication between product engineers, 3rd party vendors, and dependability analysts.

• One model, many measures (with distributions).

• Timely modeling and analysis

– Low cycle time for new models and analysis (usually less than one week, depending on the product complexity and familiarity).

– Models evolve with the design; more detail is added as it becomes available.

– Trade studies defined by availability team and system designers drive design for High Availability.

4 10-4-00

Modeling Process

• Two levels of modeling: description and computation

• Dependability Description Model (DDM)

– Explanation of how the system works in an Availability sense.

– Standardized framework based on existing methodologies in dependability analysis and system design which can be easily understood by all.

– Provides detailed system descriptions.

– Designers become an integral part of system dependability evaluation.

– Focus is on description, not evaluation, and therefore circumvents the need for expertise in particular modeling techniques (Markov Process, SPN, SAN, etc.)

– To achieve 99.999% availability, all possible sources of service interruption must be addressed.

• Dependability Computation Model (DCM)

– Calculation of measures defined in DDM

– UltraSAN


5 10-4-00

Dependability Description Model

• Measures

– Availability = Probability of a user being able to set up a new connection

– Reliability = Probability of an existing connection not being dropped

– Maintenance = Number of maintenance events necessary

– Bellcore / TL-9000: outage, DPM, OFM, etc.

• Model Assumptions

• System Description

– Dependability Block Model

• Identification of Serial Blocks – Common failure impact, detection, response, repair

– Dependency Graphs

• Block Dependability Information

– General Information

– Failure Information

– Detection Information

– Recovery Information

– Notification Information

– Repair Information

– Upgrade Information

6 10-4-00

Dependability Description Model

• For each serial block, for each type of information

– Description of design aspect

– Impact on other components

– Applicable Parameters

• Time distribution & parameters

• Probability

– Basis for Parameter Estimate

– Effect(s) of failed activity

• Reference next escalation level of detection or effect of failed detection, next level of response
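One way to picture a DDM entry is as a structured record per serial block. The sketch below is only an illustration: the class names, field names, and the sample values are hypothetical, while the seven information types and the per-type fields follow the lists on the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Information types listed under "Block Dependability Information".
INFO_TYPES = ("general", "failure", "detection", "recovery",
              "notification", "repair", "upgrade")


@dataclass
class Activity:
    """One entry of a given information type for a serial block (hypothetical layout)."""
    description: str                           # description of design aspect
    impact: str = ""                           # impact on other components
    time_distribution: Optional[str] = None    # e.g. "lognormal"
    time_parameters: Dict[str, float] = field(default_factory=dict)
    probability: float = 1.0                   # probability the activity succeeds
    basis: str = ""                            # basis for parameter estimate
    on_failure: Optional[str] = None           # next escalation level, if any


@dataclass
class SerialBlock:
    name: str
    info: Dict[str, List[Activity]] = field(default_factory=dict)

    def add(self, info_type: str, activity: Activity) -> None:
        if info_type not in INFO_TYPES:
            raise ValueError(f"unknown information type: {info_type}")
        self.info.setdefault(info_type, []).append(activity)

    def missing_info_types(self) -> List[str]:
        """Information types that still have no entry for this block."""
        return [t for t in INFO_TYPES if t not in self.info]


if __name__ == "__main__":
    # Hypothetical block and values, for illustration only.
    block = SerialBlock("port adapter")
    block.add("detection", Activity(
        description="heartbeat timeout",
        time_distribution="lognormal",
        time_parameters={"mu": 0.0, "sigma": 0.5},
        probability=0.95,
        basis="engineering estimate",
        on_failure="operator notices alarm",
    ))
    print(block.missing_info_types())
```

A completeness check such as `missing_info_types` reflects the point that every source of service interruption must be described before the model is computed.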


7 10-4-00

Dependability Computation Model (DCM)

• Simulation and analysis for the purpose of estimating measurements defined in the DDM.

• Created by dependability analysts based on information contained in the DDM.

• Model precisely how the system behaves.

8 10-4-00

UltraSAN Selection Factors

• Output measures are accompanied by estimated distributions

– Distribution shape gives insight

– Expected performance of individual networks and small populations can be understood

• Monte Carlo simulation utilized to avoid state-space explosion and to support non-exponential time distributions.

– All details specified in the DDM can be modeled using UltraSAN simulation.

• Model how design works. Avoids Markov Model simplifications

• One model to estimate all measures.

• Composed model supports modular programming, model reuse, and development time reduction.

• Exceptionally fast simulation time.
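The idea of Monte Carlo simulation with non-exponential times, yielding a distribution rather than a single number, can be sketched in plain Python. This is not UltraSAN code: the single repairable component, the Weibull and lognormal parameters, and all function names are assumptions for illustration.

```python
import random
import statistics


def simulate_availability(horizon_h, rng, weibull_scale_h=8760.0, weibull_shape=1.5,
                          repair_mu=0.0, repair_sigma=0.5):
    """One replication: alternate up and down periods, return the fraction of time up."""
    remaining = horizon_h
    up_time = 0.0
    while remaining > 0.0:
        # Time to failure: Weibull (scale, shape), truncated at the horizon.
        up = min(rng.weibullvariate(weibull_scale_h, weibull_shape), remaining)
        up_time += up
        remaining -= up
        # Time to repair: lognormal, truncated at the horizon.
        down = min(rng.lognormvariate(repair_mu, repair_sigma), remaining)
        remaining -= down
    return up_time / horizon_h


def run_study(replications=200, horizon_h=10 * 8760.0, seed=1):
    """Run independent replications and summarize the availability distribution."""
    rng = random.Random(seed)
    results = sorted(simulate_availability(horizon_h, rng) for _ in range(replications))
    return {
        "mean": statistics.mean(results),
        "p05": results[int(0.05 * replications)],  # approximate 5th percentile
        "min": results[0],
    }


if __name__ == "__main__":
    print(run_study())
```

Reporting a low percentile alongside the mean is what lets the expected performance of an individual network or a small population be read off, rather than only the fleet average.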


9 10-4-00

Modeling Capability Details

• Rate Distributions (exponential, Weibull, lognormal, etc.)

– Failure time distributions

– Failure detection, response, and notification (time distribution and probability)

• Distributions that reflect real experience

• Detection and Response

– Multiple levels of detection and response escalation

– Effects of protocols and packet networks in fault management

• Software Architecture

– Model details of how the software modules work and fail together, how they interact, and their relationship to the hardware.

• Event edge (time-independent) impacts
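Multi-level detection with a probability and a time distribution at each level can be sketched as follows. The level names, coverages, and timings are hypothetical, and the sketch assumes that a level which misses the fault still consumes its sampled time before escalation.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class DetectionLevel:
    name: str
    coverage: float                                  # probability this level detects the fault
    sample_time_s: Callable[[random.Random], float]  # time spent at this level


def time_to_detect(levels: List[DetectionLevel],
                   rng: random.Random) -> Tuple[Optional[str], float]:
    """Walk the escalation chain; return (detecting level or None, elapsed seconds)."""
    elapsed = 0.0
    for level in levels:
        # Assumption: time is spent at a level whether or not it detects.
        elapsed += level.sample_time_s(rng)
        if rng.random() < level.coverage:
            return level.name, elapsed
    return None, elapsed  # fault escaped every level


if __name__ == "__main__":
    # Hypothetical three-level chain, for illustration only.
    chain = [
        DetectionLevel("hardware watchdog", 0.90, lambda r: r.expovariate(1 / 0.5)),
        DetectionLevel("heartbeat audit", 0.95, lambda r: r.lognormvariate(2.0, 0.4)),
        DetectionLevel("operator alarm", 0.99, lambda r: r.weibullvariate(600.0, 1.2)),
    ]
    rng = random.Random(42)
    print(time_to_detect(chain, rng))
```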

10 10-4-00

Modeling Capability Details

• Repair Dependency

– Example: If a single port on a multi-port adapter fails, the entire port adapter and all of its connections must be disabled to replace the port adapter.

• Operational Dependency

– Failure of some elements disables other elements.

– Example: If a processor fails, all applications are disabled and cannot fail until the processor is brought back online.

• Procedural Errors

– Failures caused by network operators performing routine or specialized operations on the network.

• Maintenance strategies

• Planned upgrade
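Operational dependency lends itself to a small graph sketch: given which elements depend on which, find everything that a single failure disables. The element names and the dictionary layout are assumptions for illustration, loosely following the processor example above.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: element -> elements it depends on.
EXAMPLE_DEPENDS_ON: Dict[str, List[str]] = {
    "app_a": ["processor"],
    "app_b": ["processor"],
    "session_mgr": ["app_a"],
    "port_1": ["port_adapter"],
    "port_2": ["port_adapter"],
}


def disabled_by(failed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every element disabled, directly or transitively, when `failed` goes down."""
    # Invert the map so each element lists its direct dependents.
    dependents: Dict[str, List[str]] = {}
    for element, parents in depends_on.items():
        for parent in parents:
            dependents.setdefault(parent, []).append(element)

    disabled: Set[str] = set()
    stack = [failed]
    while stack:
        current = stack.pop()
        for child in dependents.get(current, []):
            if child not in disabled:
                disabled.add(child)
                stack.append(child)
    disabled.discard(failed)  # the failed element itself is not counted as "disabled"
    return disabled


if __name__ == "__main__":
    print(sorted(disabled_by("processor", EXAMPLE_DEPENDS_ON)))
    print(sorted(disabled_by("port_adapter", EXAMPLE_DEPENDS_ON)))
```

In a simulation, the disabled set would also be excluded from failure sampling until the failed element returns to service, matching the rule that disabled applications cannot fail.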


11 10-4-00

UltraSAN Modeling Process

• Defined UltraSAN templates cover extensive range of configurations and procedures:

– Component failure (HW and SW)

– Redundancy (active and standby)

– Operational and repair dependency

– Detection and response time and probability

– Detection and response escalation

– Repair / replacement

– Maintenance/Procedural Error

– Upgrades

• Standardized measurement definitions

• Standards for naming conventions, time-increment, variable usage

• Detailed model validation procedures

• Strict configuration management guidelines

12 10-4-00

Desired UltraSAN Enhancements

• GUI enhancements

– cut and paste, rename

– more robust text/code editor

– complete model compilation at all levels of definition to circumvent mandatory subnet->composed->reward->study sequence

• Additional model composition formalisms (graph models, etc.)

• Path-based reward variables

• Integration with Design of Experiments functionality for evaluation of sensitivities

• Token specification (colored tokens, data structures, etc.)

• Easier specification of user-defined functions

• Triangular distribution

• Architecture-independent multi-processor runs

• Port to Windows 2000

• Alternate project documentation format (HTML, PDF, etc.)

• Improved documentation

• Worked complex examples

– Including tricks


13 10-4-00

Summary

• The attainment of 5 NINES Availability performance requires detailed design specifically targeted for availability enhancement

• A process has been developed that drives and records the Availability design detail

• The use of UltraSAN has allowed us to calculate the results of that implementation detail

– Easy to learn without extensive mathematical background

– Deals with large state space reflective of design detail

– Rapid simulation time

– Distributions – as part of inputs and in outputs