noc error correction

Upload: navin-kumar

Post on 03-Mar-2016


DESCRIPTION

Useful start on NOC error correction

TRANSCRIPT

  • Multi-bit Error Correction in NOC Links

    Vamsi Yadav Chavali - 108110021

    Jayasimha Bezewada - 108110020

    Navin kumar - 108110057

    Suravarupu Sandeep - 108110092

  • Contents

    Introduction

    Review of existing papers

    Analysis of ECC, CAC, and LPC codes

    Proposed method

  • Introduction

    Even after a chip passes manufacturing test and is integrated into a system, the combination of newer deep-submicron (DSM) technologies and lower supply voltages makes it more vulnerable to:

    Transient effects

    Radiation, electromagnetic interference

    Permanent effects

    Crosstalk, device aging, and physical wear-out.

    Thus the chip designer faces the challenge of creating a robust design using unreliable DSM technologies.

  • The usual approaches to deal with on-line faults are based on redundancy

    Information redundancy

    Cyclic codes

    Crosstalk avoidance coding (CAC)

    Time redundancy

    Data retransmission

    Space redundancy

    Triple modular redundancy (TMR)

    Spare wires

    Combination of these approaches
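
    As a minimal sketch of space redundancy, TMR can be modelled as a bitwise majority vote over three copies of the same word. The function name tmr_vote and the sample values are illustrative assumptions, not taken from the slides.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three redundant copies of a word."""
    # A result bit is 1 when at least two of the three copies have a 1 there.
    return (a & b) | (a & c) | (b & c)


word = 0b10110010
upset = word ^ 0b00001000  # one copy suffers a single-bit upset

# The two intact copies outvote the corrupted one.
assert tmr_vote(word, upset, word) == word
```

    The vote masks any fault pattern in which no bit position is corrupted in more than one copy, at the cost of tripling the wires or logic.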

  • The Major Constraints

    Direct Costs

    Silicon area - As codec complexity increases, so does the silicon area. With recent 3D integrated circuits, however, area may be less of a concern.

    Power dissipation - Power dissipation also grows with complexity, but so do the error detection and correction capabilities. Stronger protection allows the logic-1 voltage level to be lowered, as is common in today's low-voltage chips. The additional errors this causes are compensated by the better detection circuit, so the residual bit error rate stays roughly constant, and with proper balancing power efficiency can be improved.

    Buffers also increase power dissipation.

    Codec delay - As complexity increases, the codec delay increases but the retransmission rate decreases. With proper balancing, the net delay can therefore be reduced.

    Indirect Costs

    Network congestion - Packet retransmission increases congestion in the network, which in turn increases the overall packet delay.
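
    The codec-delay trade-off above can be illustrated with a deliberately simple model. It assumes each transmission attempt fails independently with probability p_retx, so the mean number of attempts is 1 / (1 - p_retx), and it ignores the acknowledgement round trip and any congestion effects. The function, its parameter names, and the cycle counts are hypothetical, not measurements from the slides.

```python
def expected_delay(t_codec: float, t_link: float, p_retx: float) -> float:
    """Mean delay to deliver one flit under independent retransmissions."""
    if not 0.0 <= p_retx < 1.0:
        raise ValueError("p_retx must be in [0, 1)")
    # Each attempt costs codec plus link time; attempts repeat until one succeeds.
    return (t_codec + t_link) / (1.0 - p_retx)


# Hypothetical numbers: a weak code with a fast codec but frequent retransmission,
# against a stronger code with a slower codec but rare retransmission.
weak = expected_delay(t_codec=1, t_link=2, p_retx=0.5)
strong = expected_delay(t_codec=2, t_link=2, p_retx=0.05)
assert strong < weak
```

    Under these assumed numbers the slower codec still gives the lower net delay, which is the balancing effect the slide describes.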

  • The three main components of NOC link coding

    ECC (Error correction codes)

    These include Hamming and cyclic codes for detecting and correcting transient errors.

    CAC (Crosstalk avoidance codes)

    Crosstalk is mainly caused by transitions in opposite directions on adjacent wires and by the capacitive coupling between them. A popular approach is to use space redundancy.

    LPC (Low-power codes)

    Low-power coding generally works by reducing the number of transitions on the link.
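
    The three code families above can each be sketched in a few lines. The block below shows a Hamming(7,4) encoder and single-error-correcting decoder for ECC, a counter of opposite-direction transitions on adjacent wires (the pattern a CAC tries to avoid), and bus-invert coding as one well-known transition-reducing scheme for LPC. The slides do not name specific schemes, so these choices and all function names are assumptions made for illustration.

```python
def hamming74_encode(d):
    """Encode 4 data bits as the codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(c):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # 1-based position of the error, 0 if none
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


def opposing_adjacent_transitions(prev, curr):
    """Count adjacent wire pairs that switch in opposite directions."""
    delta = [c - p for p, c in zip(prev, curr)]  # +1 rising, -1 falling, 0 steady
    return sum(1 for a, b in zip(delta, delta[1:]) if a * b == -1)


def bus_invert(prev_bus, data):
    """Return (word_to_send, invert_flag); invert if over half the wires would toggle."""
    toggles = sum(p != d for p, d in zip(prev_bus, data))
    if toggles > len(data) / 2:
        return [1 - d for d in data], 1
    return list(data), 0


data = [1, 0, 1, 1]
received = hamming74_encode(data)
received[4] ^= 1  # inject a single transient bit error on the link
assert hamming74_decode(received) == data
```

    Bus-invert coding needs one extra wire to carry the invert flag, so it is itself a small use of space redundancy.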

  • Error location

    Another important factor is that permanent faults, which may arise from thermal causes or device aging, can cause rapid performance degradation.

    With error location and avoidance methods, however, graceful degradation is possible.

    Error location can be implemented in two ways:

    State machine

    Periodic checking
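
    The slides do not detail either method. As one possible reading of periodic checking, the sketch below sends all-zeros and all-ones test words over a link and reports any wire whose received value does not follow the pattern. The link model, the helper make_faulty_link, and the function names are hypothetical.

```python
def make_faulty_link(stuck):
    """Build a toy link in which the wires listed in `stuck` are stuck at a fixed value."""
    def link(word):
        return [stuck.get(i, bit) for i, bit in enumerate(word)]
    return link


def locate_stuck_wires(link, width):
    """Send all-0 and all-1 test words; return {wire_index: stuck_value}."""
    faults = {}
    for pattern_bit in (0, 1):
        received = link([pattern_bit] * width)
        for i, bit in enumerate(received):
            if bit != pattern_bit:
                faults[i] = bit  # the wire did not follow the test pattern
    return faults


link = make_faulty_link({2: 1})  # wire 2 is stuck at 1
assert locate_stuck_wires(link, width=4) == {2: 1}
```

    Once a faulty wire is located, its traffic could be steered onto a spare wire, which is how location and avoidance together allow graceful degradation.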