
Fault Diagnosis Overview

David Lavo

UC Santa Cruz

January 13, 2005


©2005 David Lavo Fault Diagnosis Overview 2

Outline

• Introduction: What is Fault Diagnosis?
• Components: What’s involved?
• Algorithm details: How does it work?
• Diagnosis in practice: How does it really work?
• Research: Why does (or doesn’t) it work? How should it work?

What is Fault Diagnosis?

• A guess as to what’s wrong with a malfunctioning circuit
• Narrows the search for physical root cause
• Makes inferences based on observed behavior
• Usually based on the logical operation of the circuit

VLSI Fault Diagnosis (in One Slide)

[Figure: flow diagram with the labels Tests, Defective Circuit, Observed Behavior, Diagnosis Algorithm, Diagnosis, Location or Fault, and Physical Analysis.]

Two Types of Diagnosis

• Circuit Partitioning (“Effect-Cause” Diagnosis)
  – Identify fault-free or possibly-faulty portions
  – Identify suspect components, logic blocks, interconnects
• Model-Based Diagnosis (“Cause-Effect” Diagnosis)
  – Assume one or more specific fault models
  – Compare behavior to fault simulations

Circuit Partitioning

• Separate known-good portions of circuit from likely areas of failure
• Simplest method: identify failing flip-flops
  – Tester can identify failing flops or outputs
  – Input cone of logic is suspect
  – Intersection of multiple cones is highly suspect
  – Single clock pulse with scan can be used for sequential/functional fails
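The cone-intersection idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FANIN` netlist and its node names are hypothetical, and intersecting cones implicitly assumes a single defect explains every failing flip-flop.

```python
from typing import Dict, List, Set

# Hypothetical netlist: each node maps to the nodes that drive it (its fan-in).
# Primary inputs have no drivers.
FANIN: Dict[str, List[str]] = {
    "a": [], "b": [], "c": [], "d": [],
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["c", "d"],
    "ff1": ["n1", "n2"],
    "ff2": ["n2", "n3"],
}


def input_cone(node: str, fanin: Dict[str, List[str]]) -> Set[str]:
    """Return the node plus every node that can reach it through fan-in edges."""
    cone: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in cone:
            continue
        cone.add(current)
        stack.extend(fanin.get(current, []))
    return cone


def suspects(failing: List[str], fanin: Dict[str, List[str]]) -> Set[str]:
    """Intersect the input cones of all failing flip-flops or outputs."""
    cones = [input_cone(node, fanin) for node in failing]
    return set.intersection(*cones) if cones else set()


if __name__ == "__main__":
    # Both flops fail, so only nodes feeding both of them remain suspect.
    print(sorted(suspects(["ff1", "ff2"], FANIN)))
```

With both `ff1` and `ff2` failing, only the shared logic (`n2` and the inputs `b` and `c` that feed it) stays in the suspect set.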

Back-Tracing Failures

[Figure]

aka Effect-Cause Diagnosis

• Reasoning based on observed behavior and expected (good-circuit) functions
• Commonly used at system and board levels
• Tries to separate good and suspect areas
• Advantage: Simple and general
• Disadvantage: Not very precise, often gives no indication of defect mechanism

Cause-Effect Diagnosis

• Start from possible causes (fault models), compare to observed effects
• A simulator is used to predict behavior of the circuit in the presence of various faults
• Match prediction(s) against observed behavior
• Advantage: Implicates a mechanism as well as a location
• Disadvantage: Can be fooled by unmodeled defects

Cause-Effect Diagnosis

[Figure: Tests are applied to the Defective Circuit and to the Fault Simulator. The circuit yields a Behavior Signature (010001010100010101010 …); the simulator yields Candidate Signatures (010100110000101010100 …, 101000100001011101100 …, 010100010100011101100 …, 000111000101010011110 …). The Diagnosis Algorithm performs Comparison & Conclusion.]


Components of Fault Diagnosis

• Fault models
• Fault simulators
• Fault dictionaries
• Diagnosis algorithms

Fault Models

• A fault model is an abstraction of a type of defect behavior
• A fault instance is the application of a model to a circuit wire, node, gate, etc.
• Used to create and evaluate test sets
• For diagnosis, they can be used to simulate and predict faulty behaviors

Stuck-at Fault Model

• The most-used fault model (by far)
• Simple to simulate and enumerate
• Effective for testing, fault grading, and diagnosis of some defects
• Many defects are not well represented by the stuck-at model

[Figure: gate with inputs A and B illustrating node A stuck-at 1, annotated with fault-free/faulty logic values (0/1, 0/1, 1).]

Bridging Fault Model

• Shorts are a common defect type in CMOS
• Different bridging fault models have varying accuracy and precision, from simplistic to very sophisticated
• Difficult or impractical to enumerate

[Figure: nodes X and Y bridged; node X forces Y to a value of 0. Logic values shown: 0, 1, 1, 1, 0, and 1/0 (fault-free/faulty).]
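To make the enumeration problem concrete, the sketch below compares the number of stuck-at instances with the number of two-line bridge pairs for a given node count. It also models the dominance behavior from the figure in the simplest possible way. The node counts are hypothetical, and practical tools typically restrict bridge candidates (for example to physically adjacent nets), which this sketch does not attempt.

```python
from math import comb
from typing import Tuple


def stuck_at_instances(num_nodes: int) -> int:
    """Each node can be stuck-at 0 or stuck-at 1: two instances per node."""
    return 2 * num_nodes


def two_line_bridge_pairs(num_nodes: int) -> int:
    """Unordered pairs of distinct nodes that could be shorted together."""
    return comb(num_nodes, 2)


def dominance_bridge(x: int, y: int) -> Tuple[int, int]:
    """Assumed dominance model: X keeps its value and forces Y to the same value."""
    return x, x


if __name__ == "__main__":
    for nodes in (1_000, 1_000_000):  # hypothetical circuit sizes
        print(nodes, stuck_at_instances(nodes), two_line_bridge_pairs(nodes))
    # X = 0 and fault-free Y = 1: the bridge pulls Y down to 0 (the "1/0" case).
    print(dominance_bridge(0, 1))
```

The stuck-at count grows linearly with the number of nodes, while the bridge-pair count grows quadratically.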

Some Diagnostic Fault Models

[Figure: four panels labeled Gate Fault, Net Fault, Bridging Fault, and Path Fault.]

Fault Simulators

• A fault simulator can simulate instances of a particular fault model
• Inputs:
  – Circuit (netlist)
  – Test set
  – Faultlist (list of fault instances)
• Output: circuit response
• Usually, simulates the presence of a single fault instance (“single-fault assumption”)
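A minimal sketch of a single-fault stuck-at simulator, taking the three inputs listed above. The three-gate netlist, the node names, and the test vectors are hypothetical; a production simulator would handle far larger circuits and multiple outputs.

```python
from typing import Dict, List, Optional, Tuple

Fault = Tuple[str, int]  # (node name, stuck-at value)

# Hypothetical netlist in topological order: (output node, gate type, input nodes).
NETLIST: List[Tuple[str, str, List[str]]] = [
    ("n1", "AND", ["a", "b"]),
    ("n2", "NOT", ["c"]),
    ("out", "OR", ["n1", "n2"]),
]
PRIMARY_INPUTS = ["a", "b", "c"]
OUTPUT = "out"


def evaluate(gate: str, values: List[int]) -> int:
    """Evaluate one gate on 0/1 input values."""
    if gate == "AND":
        return int(all(values))
    if gate == "OR":
        return int(any(values))
    if gate == "NOT":
        return 1 - values[0]
    raise ValueError(f"unknown gate type: {gate}")


def simulate(vector: Dict[str, int], fault: Optional[Fault] = None) -> int:
    """Return the output value for one test vector with at most one stuck-at fault."""
    def apply_fault(node: str, value: int) -> int:
        # Override the node value only if this node carries the injected fault.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    values = {name: apply_fault(name, vector[name]) for name in PRIMARY_INPUTS}
    for output, gate, inputs in NETLIST:
        good = evaluate(gate, [values[i] for i in inputs])
        values[output] = apply_fault(output, good)
    return values[OUTPUT]


def fault_simulate(test_set: List[Dict[str, int]],
                   faultlist: List[Fault]) -> Dict[Fault, List[int]]:
    """Circuit response (one output bit per vector) for each fault instance."""
    return {fault: [simulate(vector, fault) for vector in test_set]
            for fault in faultlist}


TEST_SET = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 0, "b": 0, "c": 0},
]
FAULTLIST: List[Fault] = [("a", 1), ("n1", 0), ("out", 0)]

if __name__ == "__main__":
    print("fault-free:", [simulate(v) for v in TEST_SET])
    for fault, response in fault_simulate(TEST_SET, FAULTLIST).items():
        print(fault, response)
```

Each fault is injected on its own, which is the single-fault assumption in code form.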

Fault Dictionaries

• A fault dictionary is a database of the simulated responses for all faults in the faultlist
• Used by some diagnosis algorithms for convenience:
  – Fast: no simulation at time of diagnosis
  – Self-contained: netlist, simulator, and test set not needed after dictionary creation
• Can be very large, however!
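As a rough sketch of why a dictionary makes diagnosis fast, the code below indexes pre-simulated signatures so that an observed signature can be looked up without any simulation. The fault names and signatures are hypothetical, and only exact matches are handled.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Signature = Tuple[int, ...]

# Hypothetical pre-simulated pass/fail signatures, one bit per test vector.
FAULT_DICTIONARY: Dict[str, Signature] = {
    "n1 stuck-at 0": (1, 0, 0, 1),
    "n1 stuck-at 1": (0, 1, 1, 0),
    "n2 stuck-at 0": (1, 0, 0, 1),
    "n3 stuck-at 1": (0, 0, 1, 0),
}


def build_index(dictionary: Dict[str, Signature]) -> Dict[Signature, List[str]]:
    """Invert the dictionary: map each signature to the faults that produce it."""
    index: Dict[Signature, List[str]] = defaultdict(list)
    for fault, signature in dictionary.items():
        index[signature].append(fault)
    return dict(index)


def lookup(observed: Sequence[int], index: Dict[Signature, List[str]]) -> List[str]:
    """Return the faults whose stored signature exactly matches the observation."""
    return index.get(tuple(observed), [])


if __name__ == "__main__":
    index = build_index(FAULT_DICTIONARY)
    print(lookup([1, 0, 0, 1], index))
    print(lookup([1, 1, 1, 1], index))
```

The netlist and simulator are not touched at lookup time; only the stored signatures are needed.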

The Full-Response Dictionary

• For each fault ( f ), store the response to each test vector ( v )
• For each vector, store the expected output response ( o ): one bit per output, pass ( 0 ) or fail ( 1 )
• Total storage requirement: f × v × o bits

The Pass-Fail Dictionary

• For each fault, store only the test vector responses
• One bit per vector, pass ( 0 ) or fail ( 1 )
• Total storage requirement: f × v bits
• Much smaller than full-response, and often practical for even very large circuits
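A quick calculation shows the size gap between the two dictionary types, reading the slide totals as the products f × v × o and f × v. The fault, vector, and output counts below are hypothetical.

```python
def full_response_bits(faults: int, vectors: int, outputs: int) -> int:
    """Full-response dictionary: one bit per fault, per vector, per output."""
    return faults * vectors * outputs


def pass_fail_bits(faults: int, vectors: int) -> int:
    """Pass-fail dictionary: one bit per fault, per vector."""
    return faults * vectors


if __name__ == "__main__":
    f, v, o = 100_000, 1_000, 500  # hypothetical circuit and test-set sizes
    full = full_response_bits(f, v, o)
    compact = pass_fail_bits(f, v)
    print(f"full-response: {full / 8 / 1e9:.2f} GB")
    print(f"pass-fail:     {compact / 8 / 1e6:.2f} MB")
    print(f"ratio:         {full // compact}x")
```

The ratio between the two is simply the number of outputs o, which is why the pass-fail form stays practical as circuits grow.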

Dynamic Diagnosis

• Alternative to dictionary-based diagnosis
• Fault simulation is only done for certain faults, based on test results
  – Only simulate faults in input cones of failing flip-flops/outputs
• Dictionary is eliminated, but requires complete netlist and test pattern file
• Used by most commercial ATPG tools: Mentor Fastscan, Synopsys, Cadence, etc.


Algorithm Details

• Role of a diagnosis algorithm
• Scoring methods
• Types of diagnosis algorithms

Diagnosis Algorithms

• Algorithms compare observed behavior to predicted behaviors
• An algorithm attempts to “explain” the observed failures with fault candidates
• The job of a diagnosis algorithm is to report the best fault candidate(s)
• “Best” is determined by the scoring method

Fault Candidate Scoring

• Two common scoring methods
  – Match/mismatch points
  – Fault candidate probability
• Other common scorings:
  – Hamming distance
  – Set intersection/overlap
  – Nearest neighbor
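The three "other" scorings can be sketched on pass/fail signatures as follows. The signatures are hypothetical, and two choices here are assumptions rather than anything the slides specify: set overlap is measured with the Jaccard index, and nearest neighbor is taken to mean smallest Hamming distance.

```python
from typing import Dict, Sequence, Set


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of test vectors on which two signatures disagree."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def failing_set(signature: Sequence[int]) -> Set[int]:
    """Indices of the failing vectors (bits equal to 1)."""
    return {i for i, bit in enumerate(signature) if bit == 1}


def overlap(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index of the two failing sets (assumed overlap measure)."""
    fa, fb = failing_set(a), failing_set(b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 1.0


def nearest_neighbor(observed: Sequence[int],
                     candidates: Dict[str, Sequence[int]]) -> str:
    """Candidate whose signature has the smallest Hamming distance to the observation."""
    return min(candidates, key=lambda name: hamming_distance(observed, candidates[name]))


OBSERVED = [1, 0, 1, 1, 0]
CANDIDATES = {  # hypothetical candidate signatures
    "f1": [1, 0, 1, 0, 0],
    "f2": [0, 1, 1, 1, 0],
    "f3": [0, 0, 0, 0, 1],
}

if __name__ == "__main__":
    for name, signature in CANDIDATES.items():
        print(name, hamming_distance(OBSERVED, signature), overlap(OBSERVED, signature))
    print("nearest:", nearest_neighbor(OBSERVED, CANDIDATES))
```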

Match/mismatch Point Scoring

• Award points for matching observed failures
• Optionally deduct points for not predicting fails
• Nonprediction: A behavior not predicted by candidate
• Misprediction: A prediction not fulfilled by behavior
• Commercial tools (e.g. Fastscan) are usually biased to lowest nonprediction
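The definitions above map directly onto set operations over failing test vectors. In the sketch below the candidate names and failing sets are hypothetical. Ranking by lowest nonprediction follows the bias described on the slide, while breaking ties by fewest mispredictions is an added assumption.

```python
from typing import Dict, List, NamedTuple, Set


class Score(NamedTuple):
    matches: int         # observed failures the candidate predicted
    nonpredictions: int  # observed failures the candidate did not predict
    mispredictions: int  # predicted failures that were not observed


def score(observed_fails: Set[int], predicted_fails: Set[int]) -> Score:
    """Count matches, nonpredictions, and mispredictions for one candidate."""
    return Score(
        matches=len(observed_fails & predicted_fails),
        nonpredictions=len(observed_fails - predicted_fails),
        mispredictions=len(predicted_fails - observed_fails),
    )


def rank(observed_fails: Set[int], candidates: Dict[str, Set[int]]) -> List[str]:
    """Order candidates by fewest nonpredictions, then fewest mispredictions."""
    def key(name: str):
        s = score(observed_fails, candidates[name])
        return (s.nonpredictions, s.mispredictions)
    return sorted(candidates, key=key)


OBSERVED_FAILS = {1, 4, 7}  # indices of failing test vectors (hypothetical)
CANDIDATES = {
    "A": {1, 4},        # misses vector 7
    "B": {1, 4, 7, 9},  # explains everything but also predicts vector 9
    "C": {1, 4, 7},     # exact match
}

if __name__ == "__main__":
    for name in rank(OBSERVED_FAILS, CANDIDATES):
        print(name, score(OBSERVED_FAILS, CANDIDATES[name]))
```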

Probabilistic Scoring

• Probability score based on matches and mismatches and error assumptions
  – Weights for non- and mis-prediction
  – Different prediction probabilities for different fault candidates (bridges vs. stuck-at)
• Usually normalized so that total of all candidates equals 1.0
• UCSC method uses probabilities to compare stuck-at candidates to bridges in same diagnosis
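One simple way to turn match and mismatch counts into normalized probabilities is sketched below. This is not the UCSC method itself: the likelihood formula, the per-model error probabilities in `MODEL_ERRORS`, and the candidate data are all illustrative assumptions.

```python
from typing import Dict, Set, Tuple

# Hypothetical error assumptions per fault model:
# (probability a predicted failure is not observed,
#  probability an observed failure was not predicted).
MODEL_ERRORS: Dict[str, Tuple[float, float]] = {
    "stuck-at": (0.10, 0.05),
    "bridge": (0.30, 0.05),
}


def likelihood(observed: Set[int], predicted: Set[int], model: str) -> float:
    """Unnormalized score from matches, mispredictions, and nonpredictions."""
    p_mis, p_non = MODEL_ERRORS[model]
    matches = len(observed & predicted)
    mispredictions = len(predicted - observed)
    nonpredictions = len(observed - predicted)
    return (1 - p_mis) ** matches * p_mis ** mispredictions * p_non ** nonpredictions


def normalized_scores(observed: Set[int],
                      candidates: Dict[str, Tuple[str, Set[int]]]) -> Dict[str, float]:
    """Scale the likelihoods so that all candidate scores sum to 1.0."""
    raw = {name: likelihood(observed, predicted, model)
           for name, (model, predicted) in candidates.items()}
    total = sum(raw.values())
    if total == 0:
        return raw
    return {name: value / total for name, value in raw.items()}


OBSERVED_FAILS = {1, 4, 7}
CANDIDATES = {  # name -> (fault model, predicted failing vectors), all hypothetical
    "n5 stuck-at 1": ("stuck-at", {1, 4, 7, 9}),
    "n5-n8 bridge": ("bridge", {1, 4, 7, 9}),
    "n2 stuck-at 0": ("stuck-at", {1, 4}),
}

if __name__ == "__main__":
    for name, p in normalized_scores(OBSERVED_FAILS, CANDIDATES).items():
        print(f"{name}: {p:.3f}")
```

Because the bridge model is assumed to mispredict more often, the unobserved failure on vector 9 costs the bridge candidate less than it costs the stuck-at candidate with the same predictions. That is how stuck-at and bridge candidates can share one ranking.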

Types of Diagnosis Algorithms

• Stuck-at
  – Most common, best supported by tools
  – Surprisingly effective (~60% exact matches)
  – Very fast
• IDDQ
  – Orthogonal set of failing data
  – Requires interpretation of tester results
  – Not well supported by tools

IDDQ Threshold Setting

[Figure: plot with a vertical axis from 0 to 180 and a horizontal axis from 0 to 200.]

Types of Diagnosis Algorithms (Cont)

• Bridging-fault
  – May better represent common CMOS faults
  – More complicated fault model
  – Biggest problem: candidate selection
• Other possible (future) directions:
  – Functional fails
  – Delay fails
  – Parametric failures


Diagnosis in Practice

• Using a diagnosis
• Translating the results: circuit navigation
• Evaluating diagnosis quality
• Commercial diagnosis tools

Using a Diagnosis

• Fault diagnosis is used to aid physical inspection and root-cause identification
• Diagnosis output is logical, not physical:
  – Abstract faults (such as stuck-at)
  – Gates, ports (nodes), and nets
  – No information about location or size
• Translation to physical location requires navigation of circuit

Types of Circuit Navigation

• Netlist
  – Examine RTL (Verilog/VHDL, etc.) for gates and data paths
• Schematic
  – Symbolic view of gates and wires
• Layout/artwork
  – Graphical view of metal lines, poly, vias, cell boundaries, etc.

Circuit Netlist

module TOP (CLK, Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin,
            Wr_RAM, Wr_Rreg, RAM_Addr, ATG_TESTMODE, BIST_TESTMODE, SDout,
            TwoOnes, OneOne, NoOnes, TwoZeros, OneZero, NoZeros);

  input CLK;
  inout Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin, Wr_RAM;
  inout [2:0] RAM_Addr;
  inout ATG_TESTMODE;
  inout BIST_TESTMODE;
  inout SDout, OneZero, NoZeros;
  inout TwoOnes, OneOne, NoOnes, TwoZeros, Wr_Rreg;

  // Tie off cells
  TLOW tielow1 (.Q(tielow));
  THIGH tiehigh1 (.Q(tiehigh));

  // Inverted CLK
  wire CLK_N;
  INVFF clkinv (.Q(CLK_N), .A(CLK));

  // PADS
  PADNMIOSCM0H08N05B50 PAD001_StartOut (.PUEN(tiehigh), .PDE(tielow),
      .IEN(tielow), .I(StartOut_I), .SIGNAME(StartOut),
      .INMODE(in_mode_avail), .TESTI(jumper001), .TESTIEN(tiehigh),
      .SCANIN(jumper001), .OUTMODE(out_mode_avail), .TESTO(tiehigh),
      .TESTOEN(tiehigh), .O(tielow), .OEN(tiehigh));

Netlist Navigation

• Either use a text editor on the netlist, or use the browser function in a simulator
• Browsers allow you to trace forward and backward and see logic values
• Can be used to view hierarchy and functional blocks
• Can be tedious

Circuit Schematic

[Figure]

Schematic Navigation

• Either hand-drawn (from netlist navigation) or tool-generated gate symbols and wires
• Schematic tools in simulators also allow forward and backward traversal and display of logic values
• Used to verify fault propagation
• Does not reflect physical distances

Circuit Artwork

[Figure]

Layout (Artwork) Navigation

• Use routing/floorplanning tools to view artwork
• Can usually input cell or wire name and tool will highlight the object
• Useful for determining (x,y) values
• Also good for evaluating physical implications of a set of fault candidates
  – Faults clustered in a small area are good
  – Faults/nets spread around large die areas are bad
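One possible way to quantify "clustered" versus "spread out" is to compare the bounding box of the candidate (x,y) locations with the die area. The metric, the coordinates, and the die dimensions below are illustrative assumptions, not something a layout tool is known to report.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) for a non-empty list of locations."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def spread_fraction(points: List[Point], die_width: float, die_height: float) -> float:
    """Fraction of the die area covered by the candidates' bounding box."""
    x0, y0, x1, y1 = bounding_box(points)
    return ((x1 - x0) * (y1 - y0)) / (die_width * die_height)


# Hypothetical candidate locations on a 1000 x 1000 die (arbitrary units).
CLUSTERED = [(100, 100), (110, 105), (120, 98)]
SCATTERED = [(10, 10), (990, 20), (500, 980)]

if __name__ == "__main__":
    print("clustered:", spread_fraction(CLUSTERED, 1000, 1000))
    print("scattered:", spread_fraction(SCATTERED, 1000, 1000))
```

A small fraction suggests the candidates can be inspected in one region; a fraction near 1.0 suggests the search area is most of the die.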

Fault Proximity

[Figure: two layout views. Faults contained in small area: physical examination is possible. Net runs across die: physical examination is almost impossible.]

Evaluating a Diagnosis

• A diagnosis without one or a few strong (high-scoring) candidates is usually poor
• Can indicate:
  – Multiple defects
  – Unmodeled (complex) behavior
  – Inappropriate algorithm
• If the diagnosis is poor, either try another algorithm or look for more data (failures)

Evaluating a Diagnosis (cont)

• Many diagnoses (~60%) implicate a single stuck-at fault
• Usually a good sign, but you must consider equivalent faults
• Many defects can mimic a stuck-at fault, without being a short to Vdd or Gnd
• Consider nearby nodes also, if practical
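Equivalent faults are ones that no test can tell apart. A small sketch on a single 2-input AND gate (chosen only for illustration) checks this exhaustively: an input stuck-at 0 produces exactly the same responses as the output stuck-at 0, so a diagnosis naming one of them implicates the other as well.

```python
from itertools import product
from typing import Optional, Tuple

Fault = Tuple[str, int]  # (pin name: "a", "b", or "out"; stuck-at value)


def and_gate(a: int, b: int, fault: Optional[Fault] = None) -> int:
    """2-input AND gate with an optional single stuck-at fault on one pin."""
    if fault is not None:
        pin, value = fault
        if pin == "a":
            a = value
        elif pin == "b":
            b = value
    out = a & b
    if fault is not None and fault[0] == "out":
        out = fault[1]
    return out


def response(fault: Optional[Fault]) -> Tuple[int, ...]:
    """Gate output for every input combination, in the order 00, 01, 10, 11."""
    return tuple(and_gate(a, b, fault) for a, b in product((0, 1), repeat=2))


def equivalent(f1: Fault, f2: Fault) -> bool:
    """Two faults are equivalent if no input combination distinguishes them."""
    return response(f1) == response(f2)


if __name__ == "__main__":
    print(equivalent(("a", 0), ("out", 0)))  # input s-a-0 vs. output s-a-0
    print(equivalent(("a", 1), ("out", 1)))  # input s-a-1 vs. output s-a-1
```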

Dominance Bridging Fault

[Figure: FIB short between a strong inverter and a weak inverter. Annotation: top candidate is stuck-at fault on this node.]

Candidate #2 is Best

[Figure: FIB short with Candidate #1, Candidate #2, and Candidate #3 marked.]

Commercial Tool: Mentor Graphics

• ATPG tool: Fastscan
• Stuck-at diagnosis only
• No IDDQ capability
• Orders candidates by number of matched failures (biased to lowest non-prediction)
• Also has netlist & schematic browser
• Based on Waicukauski & Lindbloom (D&T ’89)

Commercial Tool: Synopsys

• ATPG tool: TetraMAX
• J. Waicukauski moved to Synopsys after writing Fastscan
• Diagnosis capability unknown: assumed to be similar to Fastscan

Commercial Tool: Cadence

• ATPG tool: Encounter Test
• Test and diagnosis tools purchased from IBM
• IBM has had good diagnosis research, but Encounter’s capabilities are unknown
• Also of interest: Silicon Ensemble - routing tool
• Graphical artwork viewer
• Good for highlighting nets and cells based on diagnosis results
• Good for determining (x,y) and producing screen shots


Prior Art

• Waicukauski & Lindbloom, IEEE Design & Test, Aug. ’89
  – Most widely-used algorithm for commercial tools
  – Finds candidates to match individual tests, attempts to “explain” all failing tests
• Abramovici & Breuer, IEEE Trans. Computers, June ’80
  – Effect-cause diagnosis
  – Permanent stuck-at fault assumption
• Aitken & Maxwell, HP Journal, Feb. ’95
  – Analysis of relative importance of models vs. algorithms
• Lavo, Larrabee, et al., Proceedings of ITC ’98
  – Probabilistic scoring
  – Mixed-model diagnosis
• Bartenstein et al., Proceedings of ITC ’01
  – SLAT: Single Location At-a-Time diagnosis
  – Focus on matching per-vector results

Prior Art (cont)

• Jee & Ferguson, Proceedings of ISTFA ’93
  – Carafe: Inductive Fault Analysis (IFA)
  – Examine circuit to determine likely failure locations
• Aitken, Proceedings of ITC ’95
  – Using FIBs to insert defects
  – Calibrate/evaluate diagnosis methods
• Henderson & Soden, Proceedings of ITC ’97
  – Probabilistic physical failure analysis
• Nigh, Vallett, et al., Proceedings of ITC ’98
  – Large-scale, multi-company SEMATECH experiment
  – Failure analysis of timing and IDDQ fails

Research Directions

• Complex defect behaviors
  – Beyond stuck-at and 2-line bridges
  – Intermittent faults
  – Delay and timing-related defects
  – Parametric & process-related defects
  – Multiple simultaneous defects
  – Is there a simple, inductive way to infer complex defects?

Research Directions (cont)

• Diagnosability
  – What makes a particular circuit easy or hard to diagnose?
  – What can we do to make diagnosis easier?
• Evaluation of diagnoses
  – What makes a good diagnosis?
  – Can we quantify our confidence in a diagnosis?

Research Directions (cont)

• Integration with physical FA & yield improvement
  – Can we incorporate process information?
  – Can we produce a “physical diagnosis”?
  – On-line (or even on-chip) diagnosis
• Commercial toolflow integration
  – Can diagnosis tools use industry-standard data formats?
  – Can commercial tools be scripted or programmed to do better diagnosis?