traceable engineering of fault-tolerant soss · traceable engineering of fault-tolerant soss ......

58
SoSECIE 09/15/2015 1 Traceable Engineering of Fault-Tolerant SoSs Zoe Andrews, Claire Ingram, Richard Payne and Alexander Romanovsky Newcastle University, UK Jon Holt and Simon Perry Scarecrow Consultants Limited, UK SoSECIE webinar: 09/15/2015

Upload: buitu

Post on 05-Aug-2018

224 views

Category:

Documents


0 download

TRANSCRIPT

SoSECIE 09/15/2015 1

Traceable Engineering of Fault-Tolerant SoSsZoe Andrews, Claire Ingram, Richard Payne and Alexander RomanovskyNewcastle University, UK

Jon Holt and Simon PerryScarecrow Consultants Limited, UK

SoSECIE webinar: 09/15/2015

SoSECIE 09/15/2015 2

Outline

• Introduction

• Engineering Fault-Tolerant SoSs

• Conclusions

SoSECIE 09/15/2015 3

Outline

• Introduction

– Systems of Systems

– Systems of Systems and Dependability

– The Fault Modelling and Analysis Method

• Engineering Fault-Tolerant SoSs

• Conclusions

SoSECIE 09/15/2015 4

Our Research …

Design technology (foundations, methods, tools) for:

• Systems of Systems (SoS)

• Cyber-Physical Systems (CPS)

We focus on model-based design:

• Models as a basis for collaborative development

• Machine-assisted analysis of models as a means of managing development risk

"HVAC Ventilation Exhaust" by PictorialEvidence - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons -http://commons.wikimedia.org/wiki/File:HVAC_Ventilation_Exhaust.jpg#/media/File:HVAC_Ventilation_Exhaust.jpg

SoSECIE 09/15/2015 5

• Strongly supported by European Union

• Current Priorities:

– Research: development of sound toolchains for verifying global properties of CPSs (INTO-CPS)

– Innovation Support: creating ecosystems in model-based engineering for CPS (CPSE Labs)

– Strategic Actions: Roadmaps and Transatlantic Research Agenda (Road2CPS, TAMS4CPS)

Current Projects

www.tams4cps.eu

@TAMS4CPS www.into-cps.au.dk@IntoCps

www.cpse-labs.eu@CPSE_Labs

www.road2cps.eu

@ROAD2CPS

SoSECIE 09/15/2015 6

Systems of Systems

• Assembly and integration of independent “constituent systems”interacting via a networked infrastructure

• Collectively offer a new global (“emergent”) service on which value and reliance is placed.

• Independence

• Distribution

• Evolution

• Emergence

• Model-based SoS Engineering as a way of mastering complexity

SoSECIE 09/15/2015 7

COMPASS Technology

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

Semi-formal Modelling• SysML used for SoS modelling • Guidelines for Requirements, Architecture, Integration• SoS modelling profiles• Architectural patterns and extensible architectural frameworks (AFs)

SoSECIE 09/15/2015 8

COMPASS Technology

Formal Modelling• CML allows representation of behavioural semantics of the SoS• Supports contract specification• Describes functionality, object-orientation, concurrency, real-time, mobility. • Can be extended to new paradigms

process CallCentreProc = begin…actionsMERGE1(r) = (dcl e: set of ERUId @ e := findIdleERUs(); if e = {}then DECISION2(r)else

(dcl e1: ERUId @ e1 := allocateIdleERU(e, r); MERGE2(e1, r)))

process InitiateRescue = CallCentreProc[| SEND_CHANNELS |] RadioSystemProc[| RCV_CHANNELS |] ERUsProc

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

Semi-formal Modelling• SysML used for SoS modelling • Guidelines for Requirements, Architecture, Integration• SoS modelling profiles• Architectural patterns and extensible architectural frameworks (AFs)

SoSECIE 09/15/2015 9

COMPASS Technology

Formal Modelling• CML allows representation of behavioural semantics of the SoS• Supports contract specification• Describes functionality, object-orientation, concurrency, real-time, mobility. • Can be extended to new paradigms

process CallCentreProc = begin…actionsMERGE1(r) = (dcl e: set of ERUId @ e := findIdleERUs(); if e = {}then DECISION2(r)else

(dcl e1: ERUId @ e1 := allocateIdleERU(e, r); MERGE2(e1, r)))

process InitiateRescue = CallCentreProc[| SEND_CHANNELS |] RadioSystemProc[| RCV_CHANNELS |] ERUsProc

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocate

idle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescue

info to ERU

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

: Radio System

: Process

message

«Fault Activation»

: Fault 1 activation

: Process

message

«Fault Activation»

: Fault 1 activation

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«erroneous»

: Drop message«Error Detection»

: Error 1 detection

«Failure Event»

: Target not attended

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

ERU1 : ERU

: Service

rescue

: Receive

message

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

Semi-formal Modelling• SysML used for SoS modelling • Guidelines for Requirements, Architecture, Integration• SoS modelling profiles• Architectural patterns and extensible architectural frameworks (AFs)

Tool-supported Analysis • Model-checker • Automated proof• Static Fault Analysis• Test generation• Simulation• Model-in-Loop Test• Exploration of design space

SoSECIE 09/15/2015 10

SoS and Dependability

Independence

Distribution

Evolution

Emergence

Challenge:

CSs are autonomous and independent, responsibility for preventing SoS failure when a CS fails or withdraws from the SoS is unclear

Approach:

Record explicitly where responsibility for fault tolerance lies

SoSECIE 09/15/2015 11

SoS and Dependability

Independence

Distribution

Evolution

Emergence

Challenge:

Networking technology that SoS-level functionality relies on may introduce faults

Approach:

Model-based support for exploration of recovery strategies that can be applied when networking failures occur

SoSECIE 09/15/2015 12

SoS and Dependability

Independence

Distribution

Evolution

Emergence

Challenge:

CSs evolve independently, need to preserve dependability under evolutions of the SoS

Approach:

Model-based traceability facilitates impact analysis to understand the impact of unexpected changes

SoSECIE 09/15/2015 13

SoS and Dependability

Independence

Distribution

Evolution

Emergence

Challenge:

Need to analyse the emergent dependability of the SoS, but constituents systems may be black boxes

Approach:

Document fault assumptions for constituents and provide tool support for analysis of emergent dependability properties

SoSECIE 09/15/2015 14

SoS and Dependability

SoS failure “a deviation of the service provided by the SoS

from expected (correct) behaviour”

SoS error “the part of the SoS state that can lead to its

subsequent service failure”

SoS fault “the adjudged or hypothesised cause of an error”

The focus of our work is on fault-tolerant SoSs

• preventing SoS failures from arising in the presence of faults

• achieved by detecting errors and conducting system recovery

Avizienis, A.; Laprie, J.-C.; Randell, B. & Landwehr, C., "Basic concepts and taxonomy of dependable and secure computing," Dependable and Secure Computing, IEEE Transactions on ,

vol.1, no.1, pp.11,33, Jan.-March 2004.

SoSECIE 09/15/2015 15

Methodology

Architectural Design

Fault Analysis

Requirements Engineering

Traceability

Fault Tolerance

Verification

Informal Requirements

SoSECIE 09/15/2015 16

Technologies

Fault Modelling

Architectural Framework

Fault Analysis

Architectural Framework

HiP-HOPS

SoS-ACRE

TraceabilityPattern

Fault Tolerance

Verification Plugin

Fault Analysis

Tool

SoSECIE 09/15/2015 17

Technologies

Fault Modelling

Architectural Framework

Fault Analysis

Architectural Framework

HiP-HOPS

SoS-ACRE

TraceabilityPattern

Fault Tolerance

Verification Plugin

ModellingFault

Analysis Tool

Analysis

SoSECIE 09/15/2015 18

COMPASS AF Framework (CAFF)

Approach

CAFF APPROACH

DefineOntology

Define Viewpoints

Define Rules

SoSECIE 09/15/2015 19

Ontology

System of Systems

Constituent System

System

COMPASS AF Framework (CAFF)

Approach

CAFF APPROACH

DefineOntology

Define Viewpoints

Define Rules

SoSECIE 09/15/2015 20

Ontology

System of Systems

Constituent System

System

COMPASS AF Framework (CAFF)

Approach

CAFF APPROACH

DefineOntology

Define Viewpoints

Define Rules

Viewpoints use modelling elements from ontology

Viewpoints

Fault Tolerance Connections Viewpoint

Fault/Error/Failure Definition Viewpoint

Threats Chain

Viewpoint

Erroneous/Recovery Processes Viewpoint

Fault Tolerance Structure Viewpoint

SoSECIE 09/15/2015 21

Ontology

System of Systems

Constituent System

System

COMPASS AF Framework (CAFF)

Approach

CAFF APPROACH

DefineOntology

Define Viewpoints

Define Rules

Viewpoints

Fault Tolerance Connections Viewpoint

Fault/Error/Failure Definition Viewpoint

Threats Chain

Viewpoint

Erroneous/Recovery Processes Viewpoint

Fault Tolerance Structure Viewpoint

Rules restrict how the viewpoints may be applied

Rules

Rule FMAF01: All faults, errors and failures modelled in the Fault Modelling Architectural Framework must be defined in the

Fault/Error/Failure Definition View (FEFDV).

Rule FMAF02: All constituents included in a Threats Chain View (TCV) must be included in the corresponding Fault Tolerance Structure View (FTSV) and Fault Tolerance Connections View

(FTCV).

SoSECIE 09/15/2015 22

Outline

• Introduction

• Engineering Fault-Tolerant SoSs

– Case Study

– Requirements Engineering

– Fault Modelling

– Requirements Tracing

– Fault Analysis

• Conclusions

SoSECIE 09/15/2015 23

Emergency Response SoS

SoSECIE 09/15/2015 24

Emergency Response SoS

SoSECIE 09/15/2015 25

Emergency Response SoS

SoSECIE 09/15/2015 26

Emergency Response SoS

???

SoSECIE 09/15/2015 27

Requirements Engineering

• Apply SoS-ACRE: a structured way of engineering and managing the requirements of an SoS

• Key elements of SoS-ACRE for fault tolerance:– define the fault tolerance requirements of the SoS (what

faults)

– examine the fault tolerance requirements in the context of the CSs

– identify error detection and recovery scenarios

We focus on the requirements engineering processes

SoSECIE 09/15/2015 28

ER SoS: Context Interaction

Requirement: Tolerate Radio System Failure

SoSECIE 09/15/2015 29

ER SoS: Context Interaction

Requirement: Tolerate Radio System Failure

SoSECIE 09/15/2015 30

ER SoS: Context Interaction

SoSECIE 09/15/2015 31

Fault Modelling

• Defined a Fault Modelling Architectural Framework (FMAF) + SysML profile

• Prompts an SoS engineer to consider the impact of faults at the early stages of design

• A coherent set of views & concepts for designing fault-tolerant SoSs

FMAF

ViewpointsOntology Rules

SoSECIE 09/15/2015 32

Fault Modelling

FMAF

ViewpointsOntology Rules

• Defined a Fault Modelling Architectural Framework (FMAF) + SysML profile

• Prompts an SoS engineer to consider the impact of faults at the early stages of design

• A coherent set of views & concepts for designing fault-tolerant SoSs

SoSECIE 09/15/2015 33

FMAF: Ontology Definition

SoSECIE 09/15/2015 34

FMAF: Ontology Definition

SoSECIE 09/15/2015 35

FMAF: Ontology Definition

• SoS failure: “a deviation of the service provided by the SoSfrom expected (correct) behaviour”

• SoS error: “the part of the SoS state that can lead to its subsequent service failure”

• SoS fault: “the adjudged or hypothesised cause of an error”

SoSECIE 09/15/2015 36

FMAF: Viewpoints

FMAF

Viewpoints

BehaviourStructureOntology Rules

SoSECIE 09/15/2015 37

FMAF: Viewpoints

FMAF

Viewpoints

BehaviourStructureOntology Rules

Structural Viewpoints

Fault Tolerance Connections

Fault Tolerance Structure

Threats ChainFault/Error/Failure

Definition

SoSECIE 09/15/2015 38

FMAF: Viewpoints

FMAF

Viewpoints

BehaviourStructureOntology Rules

Behavioural Viewpoints

RecoveryFault Activation

Erroneous/Recovery Scenarios

Erroneous/Recovery Processes

Structural Viewpoints

Fault Tolerance Connections

Fault Tolerance Structure

Threats ChainFault/Error/Failure

Definition

SoSECIE 09/15/2015 39

FMAF: Viewpoints

FMAF

Viewpoints

BehaviourStructureOntology Rules

Behavioural Viewpoints

RecoveryFault Activation

Erroneous/Recovery Scenarios

Erroneous/Recovery Processes

Structural Viewpoints

Fault Tolerance Connections

Fault Tolerance Structure

Threats ChainFault/Error/Failure

Definition

SoSECIE 09/15/2015 40

ER SoS: Threats Chain

SoSECIE 09/15/2015 41

ER SoS: Threats Chain

SoSECIE 09/15/2015 42

ER SoS: Threats Chain

SoSECIE 09/15/2015 43

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Radio System

: Processmessage

«Fault Activation»

: Fault 1 activation

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

For the rescue

the ERU was

diverted from

ER SoS: Fault Activation

SoSECIE 09/15/2015 44

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Radio System

: Processmessage

«Fault Activation»

: Fault 1 activation

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

For the rescue

the ERU was

diverted from

ER SoS: Fault Activation

SoSECIE 09/15/2015 45

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

Initiate Rescue Fault Activation [Fault 1]

«Fault Activation View» {faultsOfInterest = Complete Failure of the Radio System}

CC : Call Centre : Radio System ERU1 : ERU

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

CC : Call Centre

: Start rescue

: Find idle ERUs

: Allocateidle ERU

: Divert ERU

: Log diversion

: Start rescue

: Wait

: Send rescueinfo to ERU

: Radio System

: Processmessage

«Fault Activation»

: Fault 1 activation

: Processmessage

«Fault Activation»

: Fault 1 activation

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Erroneous ...

: Drop message

«Failure Event»

: Target not attended

«Error Detection»

: Error 1 detection ...

«Start Recovery»

: Start Recovery 1

ERU1 : ERU

: Servicerescue

: Receivemessage

«End Recovery»

: End Recovery 1

[idle ERU]

[no idle

ERU]

[higher

criticality]

[lower

criticality]

For the rescue

the ERU was

diverted from

ER SoS: Fault Activation

SoSECIE 09/15/2015 46

Requirements Tracing

• Traceability Pattern– a structured way of defining

traceability

– in this case we will use it for fault tolerance requirements traceability

Detected by dependency

(FMAF Threats Chain Viewpoint)Validation View

(SoS-ACRE Viewpoint)

Use case

(SoS-ACRE Context Interaction Viewpoint)

SoSECIE 09/15/2015 47

Requirements Tracing (ER SoS)

SoSECIE 09/15/2015 48

Fault Analysis

• Assists an SoS engineer in analysing

the impact of CS/component level

faults on SoS functionalities

• Provide modelling and tool support for static fault analysis

– Fault Analysis Architectural Framework (FAAF) + SysML profile

– ‘Ergonomic profile’ and translation tool for Atego Modeler

• Fault analysis capabilities provided by HiP-HOPS tool

– Fault Tree Analysis

– Failure Modes and Effects Analysis

SoSECIE 09/15/2015 49

Fault Analysis: Example

SoS

CS1

CS2

CS3

SoSECIE 09/15/2015 50

Fault Analysis: Example

SoS

CS1

CS2

CS3

A

B

C

SoSECIE 09/15/2015 51

Fault Analysis: Example

A B

C

SoS failure

SoS

CS1

CS2

CS3

A

B

C

SoSECIE 09/15/2015 52

Tool Overview

1. Ergonomic profiling

SoSECIE 09/15/2015 53

Tool Overview

1. Ergonomic profiling2. Exporting models for analysis

SoSECIE 09/15/2015 54

SoSECIE 09/15/2015 55

Outline

• Introduction

• Engineering Fault-Tolerant SoSs

• Conclusions

SoSECIE 09/15/2015 56

Conclusions

Systems of systems have characteristics that motivate a structured approach to fault modelling and analysis of SoSs:

• Modelling support for requirements engineering, architectural design, requirements tracing of fault-tolerant SoSs

• Modelling and tool support for fault tolerance verification and static fault analysis

Future directions:

• Tool-supported integration of the fault modelling and fault analysis architectural frameworks and SysML profiles

• Support for modes of operation, either functional modes or degraded modes (e.g. partial functionality in presence of errors)

• Multi-objective optimisation, trading off attributes such as dependability, cost and energy consumption

SoSECIE 09/15/2015 57

[email protected]

[email protected]

[email protected]

@ZoeAndrewsNcl

@_Claire_Ingram

@riffio

thecompassclub.org

http://sourceforge.net/projects/compassresearch/files/StaticFaultAnalysis/

This work is part of the COMPASS project: research into model-based techniques for developing, maintaining and analysing SoSs

SoSECIE 09/15/2015 58

Further Reading

• Andrews, Z.; Fitzgerald, J.; Payne, R. and Romanovsky, A. “Fault Modelling for Systems of Systems,” in Proceedings of the 11th International Symposium on Autonomous DecentralisedSystems (ISADS 2013), March 2013, pp. 59–66.

• Andrews, Z.; Payne, R.; Romanovsky, A.; Didier, A. L. and Mota, A. 2013. “Model-based development of fault tolerant systems of systems” in Proceedings of the 2013 IEEE International Systems Conference (SysCon 2013).

• Ingram, C.; Riddle, S.; Fitzgerald J.; Al-Lawati, S.A.H.J.; Alrbaiyan, A., “SoS Fault Modelling at the Architectural Level in an Emergency Response Case Study” in Proceedings of the Workshop on Engineering Dependable Systems of Systems, 2014 (EDSoS 2014)

• Andrews, Z.; Ingram, C.; Payne, R.; Plat, N., SysML Fault Modelling in a Traffic Management System of Systems” in Proceedings of the 2014 IEEE International Systems of Systems Engineering Conference (SoSE 2014).

• Andrews, Z.; Ingram, C.; Payne, R.; Romanovsky, A.; Perry, S. & Holt, J., Traceable Engineering of Fault-Tolerant SoSs, in “International INCOSE Symposium, Las Vegas, USA”, June-July 2014.

• COMPASS Deliverables D24.2, D33.3 and D43.3: http://www.compass-research.eu/• Fault Analysis Tool: http://sourceforge.net/projects/compassresearch/files/StaticFaultAnalysis/

• Atego Modeler: http://www.atego.com/products/atego-modeler/• HiP-HOPS: http://hip-hops.eu/