Post on 03-Jan-2016


Fault Detection in a HW/SW CoDesign Environment

Prepared by A. Gaye Soykök

Outline

• Introduction
• System Specification
• Fault model
• Some terminology
• Methodology Analysis
• Reliable communication
• HW/SW Partitioning

Introduction

• System reliability aspects are generally considered toward the end of the design process, at low abstraction levels

• Working at low abstraction levels introduces more overhead

• Not all systems can be considered at low levels

• It is better to handle fault detection at higher levels

• It is better to assess whether fault detection should be done in HW or SW for the sake of system performance

Introduction

• At system level, several parameters are considered and one design is chosen among several alternatives
– Time constraints
– Power consumption
– Testability
– Area

Introduction

• Fault detection facilities are introduced at system level
– HW/SW binding of components is affected

• System Specification: which parts are critical and need fault detection

• Design methodologies: how these detection facilities are applied either in HW or SW

• HW/SW partitioning: which parts are in SW and which are in HW, guided by the methodologies

System Specification

• Language must support .. The user should be able to specify which sections require reliability aspects
– For example: SystemC or OCCAM

• Architecture: CPU (DSP or general purpose), coprocessors (ASIC or FPGA)

Fault Model

• Single functional failure
– Any number of physical faults causes a functional module to perform incorrectly
– The HW is faulty; software is affected by the hardware
– The CPU, the communication channels, one of the coprocessors, or the memory may fail
– A module failure is detected before any other module fails

• Temporal, architectural, and informational redundancy are adopted

Some Terminology

• Nominal: the original system function elements

• Checking: redundant elements used for fault detection

• Checker: the element that compares the checking and nominal results

• Each of these elements can be independently implemented in either HW or SW
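
The three roles can be illustrated with a minimal software-only sketch. The function names (`nominal_sum`, `checking_sum`, `checker`) and the choice of a summation as the nominal function are assumptions made for illustration; they do not come from the slides.

```python
def nominal_sum(values):
    """Nominal element: the original system function."""
    total = 0
    for v in values:
        total += v
    return total


def checking_sum(values):
    """Checking element: a redundant computation of the same result.

    It walks the data in reverse order so that it does not simply
    repeat the nominal code path.
    """
    total = 0
    for v in reversed(values):
        total += v
    return total


def checker(nominal_result, checking_result):
    """Checker element: compare the two results and raise on a mismatch."""
    if nominal_result != checking_result:
        raise RuntimeError("fault detected: nominal and checking results differ")
    return nominal_result


data = [3, 1, 4, 1, 5]
result = checker(nominal_sum(data), checking_sum(data))
```

Here all three elements run in SW on the same processor; any of them could instead be bound to HW, which is the choice the following slides enumerate.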

HW or SW

• Nominal SW, Checker SW, Checking SW
– Checking and checker are executed either by the system processor or by a dedicated processor
– Ex: self-checking SW, assertions, dual processor, and VLIW

HW or SW (Cont’d)

• Nominal SW, Checker HW, Checking SW
– Interface for functional redundancy check, VLIW with hardware, DMA checker

• Nominal SW, Checker HW, Checking HW
– CED (concurrent error detection) solutions are implemented totally in HW
– Ex: dynamically configurable checker

HW or SW (Cont’d)

• Nominal HW, Checker HW, Checking HW
– Classical approach
– Ex: duplication, TSC (totally self-checking) devices

Methodologies Analysis - Concepts

• Number and type of processing elements

• Whether special architecture is necessary

• Synchronization issues between processing elements

• Allocation of checker memory space

• Checker structure and complexity

• Selection of a checker methodology to raise errors in case of mismatches

Methodologies Analysis - Metrics

• Detection latency: the time between the instant an error occurs and the instant it is detected

• Coverage: how many of the existing faults can be detected

• Performance degradation: overhead caused by fault detection facilities compared to nominal functions
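
As a rough illustration, the three metrics above could be computed from the log of a fault-injection campaign. The `InjectedFault` record, the function names, and the reading of coverage as a fraction of injected faults are all assumptions of this sketch, not definitions taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InjectedFault:
    """One record of a hypothetical fault-injection campaign."""
    injected_at: float            # time the error occurred
    detected_at: Optional[float]  # time it was detected, None if missed


def coverage(faults):
    """Fraction of injected faults that were detected."""
    detected = sum(1 for f in faults if f.detected_at is not None)
    return detected / len(faults)


def mean_detection_latency(faults):
    """Average time between error occurrence and detection (detected faults only)."""
    latencies = [f.detected_at - f.injected_at
                 for f in faults if f.detected_at is not None]
    return sum(latencies) / len(latencies)


def performance_degradation(nominal_time, protected_time):
    """Relative overhead of the fault detection facilities over the nominal function."""
    return (protected_time - nominal_time) / nominal_time


faults = [
    InjectedFault(0.0, 2.0),
    InjectedFault(5.0, 6.0),
    InjectedFault(9.0, None),   # missed fault
    InjectedFault(12.0, 15.0),
]
```

With this made-up log, three of the four faults are detected, so `coverage(faults)` is 0.75 and the mean latency over the detected faults is 2.0 time units.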

Methodologies Analysis – Metrics (Cont’d)

• Material cost: cost of physical components

• Design Cost: effort needed to design the system

Reliable Communication

• Apart from data processing, communication also needs to be reliable

• Hardware redundancy: line duplication

• Information redundancy: data encoding

• Most effective when data encoding is used where SW is involved and hardware sections employ dedicated lines (duplicated, encoded)
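
A minimal sketch of the two redundancy styles follows. The slides do not name a particular code, so the single even-parity bit used here is an assumption chosen for brevity, as are all function names.

```python
def encode_even_parity(byte):
    """Append one parity bit so the 9-bit word has an even number of 1s."""
    parity = bin(byte).count("1") % 2
    return (byte << 1) | parity


def check_even_parity(word):
    """Return True if the received 9-bit word still has even parity."""
    return bin(word).count("1") % 2 == 0


def decode(word):
    """Strip the parity bit, raising if a transmission error is detected."""
    if not check_even_parity(word):
        raise ValueError("parity error: transmitted word was corrupted")
    return word >> 1


def duplicated_lines_agree(line_a, line_b):
    """Hardware-redundancy analogue: two copies of a line must carry the same value."""
    return line_a == line_b


word = encode_even_parity(0b10110010)
corrupted = word ^ 0b100   # flip a single bit "in transit"
```

A single parity bit only detects an odd number of flipped bits; a real design would pick the code according to the coverage it needs.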

HW/SW Partitioning

• After the system has been specified, the methodologies have been assessed, and different alternatives have been produced with their cost functions, the partitioning step takes place

• Evaluate the cost functions and the constraints of the user

• Reliability aspects make it more complex

Partition in two stages!

HW/SW Partitioning (Cont’d)

• First level: classical aspects and functions are taken into account

• Second level: given the first solution, reliability aspects are introduced, and the solution with the best trade-off that still satisfies the first-level constraints is chosen from the solution set

• If no reliability constraints are given, the second level is not carried out

HW/SW Partitioning (Cont’d)

• If a specific architecture is required for reliability (for example, dual processor), the first level benefits from earlier partitioning solutions

• A solution may not exist after the reliability constraints are introduced, and the first level may need to be repeated
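
The two-level flow, including the repeat of the first level, might look like the sketch below. The candidate dictionaries, the use of area as the trade-off criterion, and the idea of repeating the first level with a relaxed area budget are assumptions of this sketch; the slides do not say how the first level is repeated.

```python
def first_level(candidates, max_area, max_time):
    """First level: keep partitions that meet the classical constraints."""
    return [c for c in candidates
            if c["area"] <= max_area and c["time"] <= max_time]


def second_level(solutions, min_coverage):
    """Second level: among first-level solutions, keep those that meet the
    reliability constraint and return the one with the lowest area, or None."""
    reliable = [s for s in solutions if s["coverage"] >= min_coverage]
    if not reliable:
        return None
    return min(reliable, key=lambda s: s["area"])


def partition(candidates, max_area, max_time, min_coverage=None,
              relax_steps=(0, 10, 20)):
    """Run both levels; if the second level finds nothing, repeat the first
    level with a relaxed area budget (an assumed repair strategy)."""
    for extra_area in relax_steps:
        solutions = first_level(candidates, max_area + extra_area, max_time)
        if not solutions:
            continue
        if min_coverage is None:
            # No reliability constraint: the second level is skipped.
            return min(solutions, key=lambda s: s["area"])
        chosen = second_level(solutions, min_coverage)
        if chosen is not None:
            return chosen
    return None


candidates = [
    {"name": "all_sw",     "area": 10, "time": 90, "coverage": 0.60},
    {"name": "mixed",      "area": 40, "time": 50, "coverage": 0.85},
    {"name": "duplicated", "area": 75, "time": 45, "coverage": 1.00},
]
```

With an area budget of 60, a coverage requirement of 1.0 cannot be met at first; only after the budget is relaxed to 80 does the `duplicated` candidate pass both levels.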

HW/SW Partitioning (Cont’d)

• Reliability constraints, which drive the second stage, may be
– Hard, e.g. 100% fault coverage
– Soft, e.g. any fault coverage

• Parameters considered
– Fault coverage
– Performance degradation
– Detection latency
– Area overhead
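
One way to combine the four parameters with a hard or soft constraint is a weighted cost function, sketched below. The `Solution` fields, their units, the weights, and the example values are all hypothetical; the slides name the parameters but give no formula.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    fault_coverage: float           # fraction of faults detected, 0..1
    performance_degradation: float  # relative slowdown, 0..1
    detection_latency: float        # in clock cycles (assumed unit)
    area_overhead: float            # relative extra area, 0..1


def cost(s, weights=(1.0, 1.0, 0.01, 1.0)):
    """Weighted cost (lower is better). The weights are illustrative only."""
    w_cov, w_perf, w_lat, w_area = weights
    return (w_cov * (1.0 - s.fault_coverage)
            + w_perf * s.performance_degradation
            + w_lat * s.detection_latency
            + w_area * s.area_overhead)


def choose(solutions, hard_min_coverage=None):
    """Hard constraint: discard solutions below the required coverage.
    Soft constraint (hard_min_coverage=None): accept any coverage and
    let the cost function reward higher coverage."""
    if hard_min_coverage is not None:
        solutions = [s for s in solutions
                     if s.fault_coverage >= hard_min_coverage]
    return min(solutions, key=cost, default=None)


solutions = [
    Solution("sw_assertions", 0.70, 0.30, 20, 0.05),
    Solution("hw_duplication", 1.00, 0.02, 1, 1.00),
    Solution("mixed", 0.90, 0.10, 5, 0.40),
]
```

Under the soft constraint the cheapest trade-off (`mixed`) wins; requiring 100% coverage as a hard constraint leaves only `hw_duplication`, despite its area overhead.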

Conclusion

• Design for reliability has been merged into the HW/SW codesign process, resulting in a final design that has on-line fault detection properties

• Future work is introducing fault tolerance into the HW/SW codesign process