Page 1 (source file: strouce/class/elec6970/DSPBIST.pdf)

Built-In Self-Test of DSPs in Virtex-4 FPGAs

Charles Stroud
Dept. of Electrical & Computer Engineering
Auburn University

(Funded by NSA)

Page 2

C. Stroud 1/08, VLSI D&T Seminar

Outline of Presentation
  History of DSP Architectures in FPGAs
  Overview of Virtex-4 DSP
  Prior Testing R&D vs. Our Analysis for:
    Literature on DSP test not applicable
      No papers published on DSPs in FPGAs
    Literature on Multipliers and Adders
      Application to Virtex-4 DSPs
  BIST for DSPs in Virtex-4
    Architecture, Operation, and Implementation
    Timing and Fault Injection Analysis
  Summary and Conclusions
    Plans for application to Virtex-5

Page 3

Xilinx FPGA Architectures
  4000/Spartan
    N×N array of unit cells
    Unit cell = CLB + routing
    Fast carry logic in CLBs for adders
  Virtex/Spartan-2
    M×N array of unit cells
    Carry logic + AND gate for array multipliers
    4K block RAMs at edges
  Virtex-2/Spartan-3
    18K block RAMs in array
    18x18-bit multipliers with each RAM
      "based on modified Booth architecture"
  Virtex-4/Virtex-5
    Added 48-bit DSP cores w/ multipliers
  Altera includes 9x9 multipliers
    "based on modified Booth architecture"

[Figure: device array layouts; blocks labeled PC]

Page 4

Virtex-4 DSP Architecture
  2 DSP slices per tile
    16-256 tiles in 1-8 columns
  Each DSP includes:
    3-input, 48-bit adder/subtractor
      P = Z ± (X + Y + Cin)
      Optional accum reg
    18x18-bit 2's-comp multiplier (w/o adder)
  User controlled operational modes
    For X, Y, & Z MUXs
  Configuration bits control other MUXs
    Pipelining registers
    Accumulator register

[Figure: the two DSP slices of a tile, each with a multiplier (×), X, Y, and Z MUXs, and an adder/subtractor (±); ports A(18), B(18), C(48), and P(48); inputs for cascading; outputs w/ dedicated routing]
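
As a rough behavioral sketch of the datapath on this slide (not the Xilinx implementation), the adder/subtractor equation P = Z ± (X + Y + Cin) and the 18x18-bit 2's-complement multiply can be modeled in Python. The function names, the sign extension of the product, and the wrap to 48 bits are assumptions made for illustration.

```python
MASK48 = (1 << 48) - 1  # P is a 48-bit result


def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def dsp_multiply(a, b):
    """18x18-bit two's-complement product as a 48-bit pattern (sign extension assumed)."""
    return (to_signed(a, 18) * to_signed(b, 18)) & MASK48


def dsp_add(x, y, z, cin=0, subtract=False):
    """P = Z +/- (X + Y + Cin), wrapped to 48 bits."""
    inner = x + y + cin
    return (z - inner if subtract else z + inner) & MASK48
```

For example, `dsp_add(1, 2, 10, cin=1)` adds and `dsp_add(1, 2, 10, cin=1, subtract=True)` subtracts the same inner sum from Z.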

Page 5

Multiplier and Adder Architectures
  Test algorithm depends on architecture
    But architecture is not specified in data sheets
      Eliminate sequential logic architectures
      "Based on modified Booth"
  Adder choices include:
    Ripple carry
    Carry select
    Carry save
    Carry-look-ahead (CLA)
      Our assumption based on area/performance analysis
      But multiple types of CLA
  Multiplier choices include:
    Array
    Booth
    Modified Booth
    Wallace tree
    Modified Booth/Wallace tree
      Our assumption based on area/performance analysis
  Our goal: find/develop architecture independent test algorithm(s)

Page 6

Array Multiplier Test Algorithm
  Kalyana Kantipudi's MS thesis
    10 vectors give 100% fault coverage for C6288, a 16x16-bit array multiplier
  18x18-bit array multiplier results
    Only achieved ≈ 95% fault coverage
      Pattern expansion required for 16x16-bit to 18x18-bit
      Potential for mistakes if patterns not expanded properly
  Modified Booth multiplier results
    ≈ 62% with carry-save adder
    ≈ 37% with CLA
  Conclusion: array multiplier test vectors do not adequately test modified Booth multiplier

  Chris Erickson's results
    Note difference in fault coverage (FC) wrt adder implementation

Page 7

Modified Booth Test Algorithms
  Two test algorithms using 8-bit counter (256 vectors)
  "Low Power BIST for Wallace Tree-based Fast Multipliers"
    Bakalis, Kalligeros, Nikolos, Vergos & Alexiou
    Proc. Int. Symp. on Quality of Electronic Design, pp. 433-438, 2000
    5x3 connections with 5 inputs to Booth encoding
      But which side is Booth encoding?
      Our approach: run both 5x3 and 3x5 algorithms
  "Effective Built-In Self-Test for Booth Multipliers"
    Gizopoulos, Paschalis & Zorian
    IEEE Design & Test of Computers, pp. 105-111, 1998
    4x4 connections to multiplier inputs
      Our approach: also include 4x4 if fault coverage improves

[Figure: n×n multiplier with two n-bit inputs (one through Booth encoding) and a 2n-bit output; an 8-bit counter (MSB to LSB) split 4/4 for the 4×4 algorithm, 5/3 for the 5×3 algorithm, and 3/5 for the 3×5 algorithm. Callout: algorithm used in Srinivas Garimella's MS thesis for Virtex-2 multipliers]
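
One way to picture the counter-based algorithms is as a split of the 8-bit count into an MSB field and an LSB field, each fanned out to one multiplier operand. The sketch below assumes each field is simply repeated (tiled) across the operand width; the slides do not give the exact wiring, so `tile_bits`, `counter_vectors`, and the tiling rule itself are illustrative assumptions rather than the published connection scheme.

```python
def tile_bits(field, field_width, operand_width):
    """Repeat a field_width-bit value across operand_width bits, starting at the LSB."""
    out = 0
    for pos in range(operand_width):
        bit = (field >> (pos % field_width)) & 1
        out |= bit << pos
    return out


def counter_vectors(msb_width, lsb_width, operand_width=18):
    """Yield (operand_from_msbs, operand_from_lsbs) for every counter value."""
    for count in range(1 << (msb_width + lsb_width)):
        msb_field = count >> lsb_width
        lsb_field = count & ((1 << lsb_width) - 1)
        yield (tile_bits(msb_field, msb_width, operand_width),
               tile_bits(lsb_field, lsb_width, operand_width))
```

Under this assumption, `counter_vectors(5, 3)`, `counter_vectors(3, 5)`, and `counter_vectors(4, 4)` would each produce 256 operand pairs, one per counter state.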

Page 8

4x4 Booth Multiplier Test Algorithm
  18x18-bit array multiplier results
    ≈ 99.99% (1 undetected fault)
  Booth multiplier results
    ≈ 90% with ripple-carry adder
    ≈ 90% with carry-save adder
    ≈ 70% with CLA
  Conclusion: modified Booth multiplier test vectors do test array multiplier
  But Modified-Booth/Wallace-Tree appears to be most likely candidate for Virtex-4 DSP multiplier implementation
    Also for Virtex-5 and Altera

  Chris Erickson's results
    Note difference in FC wrt adder implementation

Page 9

Other Multiplier Results
  4x4-bit implementations
    Exhaustive test patterns
      Undetected faults are undetectable
    Same as 4x4, 5x3, & 3x5 algorithm for 4x4-bit multiplier
  Simulation results discrepancy for array multiplier
    4 undetected faults in 4x4-bit implementation
    1 undetected fault in 18x18 multiplier w/ 4x4 algorithm in Chris Erickson's results

Chitanya Bandi's Results
  Multiplier      # faults   # detected   # undetect   FC
  Array           272        268          4            98.5%
  Signed Array    337        320          17           95.0%
  Wallace Tree    283        280          3            98.9%

Page 10

8×8 Modified-Booth/Wallace-Tree
  Fault simulation results:
    5x3 plus 3x5 give best fault coverage
    No additional faults detected with 4x4

Chitanya Bandi's Results (note: used ripple carry adder to sum partial products)
  Multiplier       Test         #         Total    Faults     Not        Fault      Effective
  Version          Algorithm    Vectors   Faults   Detected   Detected   Coverage   FC
  No reduction     Exhaustive   65,536    3,402    2,520      882        74.1%      100%
                   4×4          256       3,402    2,477      925        72.8%      99.0%
  With reduction   Exhaustive   65,536    2,341    2,031      310        86.8%      100%
                   4×4          256       2,341    2,005      336        85.6%      98.7%
                   5×3          256       2,341    2,011      330        85.9%      99.0%
                   3×5          256       2,341    2,015      326        86.1%      99.2%
                   4×4 & 5×3    512       2,341    2,015      326        86.1%      99.2%
                   5×3 & 3×5    512       2,341    2,029      312        86.7%      99.9%
                   4×4 & 3×5    512       2,341    2,019      322        86.2%      99.4%
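
The two coverage columns in the table on this slide can be related with a short calculation. The sketch below assumes that "Effective FC" divides by the number of faults the exhaustive test detects (treating the rest as undetectable), which is a reading that matches the "with reduction" rows; the function names are illustrative.

```python
def fault_coverage(detected, total_faults):
    """Fault coverage in percent: detected faults over all modeled faults."""
    return 100.0 * detected / total_faults


def effective_coverage(detected, detected_by_exhaustive):
    """Coverage in percent relative to the faults an exhaustive test can detect."""
    return 100.0 * detected / detected_by_exhaustive


# "With reduction" rows: 2,341 total faults, 2,031 detected exhaustively.
fc_5x3_3x5 = round(fault_coverage(2029, 2341), 1)
eff_5x3_3x5 = round(effective_coverage(2029, 2031), 1)
```

With the 5×3 & 3×5 row (2,029 detected), this gives 86.7 and 99.9, in line with the table.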

Page 11

Carry-Look-Ahead Adder
  Recall CLA was more difficult to test
  Basic CLA is 4-bits
    4-bit CLAs then combined to form larger adders
      Ripple CLAs
      2 types based on Lookahead Carry Unit (LCU):
        Ripple LCU
        Multi-stage LCU

  Gi = Ai•Bi
  Pi = Ai+Bi
  C1 = G0 + P0•C0
  C2 = G1 + G0•P1 + P1•P0•C0
  C3 = G2 + G1•P2 + G0•P1•P2 + P2•P1•P0•C0
  C4 = G3 + G2•P3 + G1•P2•P3 + G0•P1•P2•P3 + P3•P2•P1•P0•C0
  PG = P0•P1•P2•P3
  GG = G3 + G2•P3 + G1•P2•P3 + G0•P1•P2•P3

[Figure: four full adders (inputs A0-A3 and B0-B3, sums S0-S3) pass P0-P3 and G0-G3 to a 4-bit carry look-ahead block, which takes C0, returns C1-C3 to the full adders, and outputs C4, PG, and GG]
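
The carry equations on this slide translate directly into code. The following is a small behavioral sketch of one 4-bit CLA block using the same Gi, Pi, Ci, PG, and GG definitions; the function name `cla4` and its return layout are illustrative choices.

```python
def cla4(a, b, c0):
    """4-bit carry look-ahead add; returns (sum, c4, pg, gg)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [A[i] & B[i] for i in range(4)]  # generate:  Gi = Ai AND Bi
    P = [A[i] | B[i] for i in range(4)]  # propagate: Pi = Ai OR Bi
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (G[0] & P[1]) | (P[1] & P[0] & c0)
    c3 = G[2] | (G[1] & P[2]) | (G[0] & P[1] & P[2]) | (P[2] & P[1] & P[0] & c0)
    gg = G[3] | (G[2] & P[3]) | (G[1] & P[2] & P[3]) | (G[0] & P[1] & P[2] & P[3])
    pg = P[0] & P[1] & P[2] & P[3]
    c4 = gg | (pg & c0)  # same as the expanded C4 equation
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (A[i] ^ B[i] ^ carries[i]) << i  # full-adder sum bit
    return total, c4, pg, gg
```

Larger adders would chain C4 of one block into C0 of the next (ripple CLA) or feed PG and GG into a lookahead carry unit, as the slide lists.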

Page 12

CLA Test Algorithms
  "On the Adders with Minimum Tests"
    Kajihara and Sasao
    Proc. VLSI Test Symp., pp. 10-15, 1997 (VTS'97)
    10 vectors detect all single and multiple faults
      In any size ripple CLA (not an LCU implementation)
  "Scalable Test Generators for High-Speed Datapath Circuits"
    Al-Asaad, Hayes, and Murray
    J. Electronic Testing, vol. 12, pp. 111-125, 1998 (JETTA'98)
    2×(N+1) vector sequence (for an N-bit adder)
    TPG implementation requires:
      N+1-bit shift register
      N XOR gates, N XNOR gates, and 1 inverter

Page 13

CLA BIST Scheme
  Easy BIST circuit to implement
    But we found a problem in design
    2 missing patterns needed for 100% FC
      Replace inverter with flip-flop
      2×(N+2) vector sequence

[Figure: N+1-bit serial shift register; adjacent stages Qi and Qi+1 produce the adder inputs Ai and Bi, with one output going to the CLA carry-in; a reset label marks the start of the sequence]

Test vector sequence (table shows 9-bit operands, 22 vectors):
  Ai          Bi          Cin
  111111111   000000000   0
  111111111   000000000   1
  111111110   000000000   1
  111111101   000000001   1
  111111011   000000011   1
  111110111   000000111   1
  111101111   000001111   1
  111011111   000011111   1
  110111111   000111111   1
  101111111   001111111   1
  011111111   011111111   1
  000000000   111111111   1
  000000000   111111111   0
  000000001   111111111   0
  000000010   111111110   0
  000000100   111111100   0
  000001000   111111000   0
  000010000   111110000   0
  000100000   111100000   0
  001000000   111000000   0
  010000000   110000000   0
  100000000   100000000   0
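
The Ai/Bi/Cin bit table on this slide appears to follow a regular pattern: two all-ones/all-zeros vectors, a walking 0 in A against a growing run of 1s in B, and then the complementary half. The sketch below reproduces that pattern in closed form for any width N. It is a software restatement of the table, not the gate-level shift-register TPG, and the function name is an assumption.

```python
def cla_bist_vectors(n):
    """Return the 2*(n+2) vectors (a, b, cin) following the pattern in the slide's table."""
    ones = (1 << n) - 1
    vectors = [(ones, 0, 0), (ones, 0, 1)]
    for p in range(n):
        # walking 0 at bit p of A, low p bits of B set, carry-in 1
        vectors.append((ones ^ (1 << p), (1 << p) - 1, 1))
    vectors += [(0, ones, 1), (0, ones, 0)]
    for p in range(n):
        # walking 1 at bit p of A, low p bits of B cleared, carry-in 0
        vectors.append((1 << p, ones & ~((1 << p) - 1), 0))
    return vectors


def format_vector(vector, n):
    """Render one vector as 'A B Cin' bit strings of width n."""
    a, b, cin = vector
    return f"{a:0{n}b} {b:0{n}b} {cin}"
```

Calling `cla_bist_vectors(9)` gives the 22 rows shown on the slide; for the 48-bit DSP adder the same rule would give 2×(48+2) = 100 vectors.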

Page 14

Fault Simulation Results
  JETTA'98 approach gives best overall fault coverage regardless of adder implementation
    Undetected faults in JETTA'98 approach can be detected
      Results in "New BIST" column for 2×(N+2) vector sequence
  JETTA'98 also claims similar BIST approach for Modified-Booth multiplier
    But description of test algorithm is very sketchy

  48-bit CLA Adder    Gate                 Test Algorithm
  Implementation      Delays    #Faults    VTS'97    JETTA'98    New BIST
  Ripple CLA          28        1392       100%      99.9%       100%
  Multi-stage LCU     10        1506       95.9%     99.9%       100%
  Ripple LCU          12        1542       95.7%     99.9%       100%

Page 15

Adder in Virtex-4 DSP
  Adder has 3 input ports
    P = Z ± (X + Y + Cin)
  We interpret this as a 2-stage CLA adder/subtractor implementation
  Apply test patterns to each stage in turn
    2 clock cycles per vector
    OPMODE control

[Figure: two 48-bit CLA blocks in series, with inputs labeled (X MUX) A port, (Y MUX) B port, (Z MUX) C port, CIN, and Subtract. Callouts: Clock cycle #1, X test vector; Clock cycle #2, Y test vector; Clock cycle #2, Z test vector]

Page 16

DSP BIST Modes & Sequences
  Test pattern sequence
    Four groups of 256 clock cycles (ccs) each
    Allows control of operational modes (OPMODEs) of DSP
  Test mode controlled by 4-bit shift register
    Bits include: Test Mode (2), Invert Control Signals, Reset
    Contents loaded via Boundary Scan interface
      Reduces the number of downloads to FPGA

  Control signals: constant during the first and second 256 ccs, pseudo-random during the third and fourth 256 ccs

  Mode (Test) 00 (multiply)
    First 256 ccs:   P = A×B
    Second 256 ccs:  P = A×B
    Third 256 ccs:   P = A×B+C
    Fourth 256 ccs:  P = A:B+C
  Mode (Test) 01 (adder), Preg=1 only
    First 256 ccs:   P = Z(C); P = X(P)+Y(C)
    Second 256 ccs:  P = Y(C); P = Y(C)+Z(P)
    Third 256 ccs:   P = Z(C); P = Y(C)+Z(P)
    Fourth 256 ccs:  P = Y(C); P = Y(C)+Z(ShiftP)
  Mode (Test) 10 (cascade)
    First 256 ccs:   P1 = A:B+Z(PC); P0 = Z(C)
    Second 256 ccs:  P1 = A:B+Z(ShiftPC); P0 = Z(C)
    Third 256 ccs:   P1 = Z(C); P0 = A:B+Z(PC)
    Fourth 256 ccs:  P1 = Z(C); P0 = A:B+Z(ShiftPC)
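
The mode table on this slide amounts to a lookup keyed by test mode and by which group of 256 clock cycles is running. A minimal sketch of that lookup follows; the dictionary layout, the function name, and the left-to-right ordering of the four groups are assumptions read off the slide, and the operation strings are copied from it rather than decoded into OPMODE bits.

```python
GROUP_LENGTH = 256  # clock cycles per group

# Operations per test mode for the first through fourth group of 256 ccs.
SEQUENCES = {
    "00": ["P = A×B", "P = A×B", "P = A×B+C", "P = A:B+C"],
    "01": ["P = Z(C); P = X(P)+Y(C)", "P = Y(C); P = Y(C)+Z(P)",
           "P = Z(C); P = Y(C)+Z(P)", "P = Y(C); P = Y(C)+Z(ShiftP)"],
    "10": ["P1 = A:B+Z(PC); P0 = Z(C)", "P1 = A:B+Z(ShiftPC); P0 = Z(C)",
           "P1 = Z(C); P0 = A:B+Z(PC)", "P1 = Z(C); P0 = A:B+Z(ShiftPC)"],
}


def bist_step(mode, cycle):
    """Return (group number, operation, pseudo_random_controls) for a clock cycle."""
    group = cycle // GROUP_LENGTH
    if not 0 <= group < 4:
        raise ValueError("cycle outside the 4 x 256 clock-cycle sequence")
    # Control inputs are constant in groups 1-2 and pseudo-random in groups 3-4.
    return group + 1, SEQUENCES[mode][group], group >= 2
```

In hardware this selection would be made by the TPG's FSM; the sketch only shows the bookkeeping.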

Page 17

BIST Architecture
  2 TPGs drive alternate rows of DSP tiles
    TPG drives both DSPs in tile
    Prevents faulty TPG from escaping detection
  DSPs driven by different TPGs compared by ORAs
    Like DSPs compared
      Slice 0 compared to slice 0
      Slice 1 compared to slice 1
  Top DSPs compared to bottom DSPs in circular comparison

[Figure: TPG0 and TPG1 driving alternate DSP tiles (slices s0 and s1) in a column, with ORAs between tiles and a BSCAN shift register supplying the test mode]
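
The circular comparison described on this slide can be sketched as a pairing rule: each tile's slices are compared with the like slices of the next tile in the column, and the last tile wraps around to the first. The helper names and the tile numbering below are assumptions for illustration.

```python
def tpg_for_tile(tile):
    """Alternate rows of tiles are driven by TPG0 and TPG1."""
    return tile % 2


def circular_ora_pairs(num_tiles):
    """List ((tile, slice), (neighbor_tile, slice)) pairs compared by ORAs."""
    pairs = []
    for tile in range(num_tiles):
        neighbor = (tile + 1) % num_tiles  # bottom tile wraps to the top
        for slice_id in (0, 1):            # slice 0 to slice 0, slice 1 to slice 1
            pairs.append(((tile, slice_id), (neighbor, slice_id)))
    return pairs
```

With an even number of tiles per column, every compared pair is driven by different TPGs and every DSP is observed by two sets of ORAs; an odd count would make the wraparound pair share a TPG.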

Page 18

TPG Architecture
  Counter ⇒ 5×3 and 3×5 multiplier test to ports A&B
  Shift register ⇒ 2×(N+2) vector adder test to port C
  FSM ⇒ OPMODE control for 4 group sequences
  LFSR ⇒ pseudo-random patterns to other control inputs during last two groups of 256 clock cycles

[Figure: TPG (Counter, Shift Register, LFSR, FSM) driving DSP slice 0 and DSP slice 1; bus widths 36 (A and B ports), 48 (C port), 7 (OPMODE), and 32 (control); each slice's 48-bit P port goes to the ORAs]
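
The slide does not give the LFSR's width or feedback polynomial, so the following is only a generic illustration of how an LFSR produces pseudo-random control patterns. It assumes an 8-bit register with feedback from bits 0, 2, 3, and 4, a choice that cycles through all 255 non-zero states.

```python
def lfsr8_step(state):
    """Advance an 8-bit Fibonacci LFSR by one clock (feedback from bits 0, 2, 3, 4)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def lfsr8_sequence(seed=1, count=256):
    """Return `count` successive states starting from a non-zero seed."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("an all-zero seed locks up the LFSR")
    states = []
    for _ in range(count):
        states.append(state)
        state = lfsr8_step(state)
    return states
```

A TPG built this way would need a wider register, or several LFSR bits fanned out, to cover all the control inputs shown in the figure.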

Page 19

ORA Implementation
  Old comparison-based ORA
    Logic 1 latched in FF due to mismatches
    Configuration memory readback used to get results
  CLBs have dedicated carry chain for fast adders and counters
  New ORA latches logic 0 due to mismatch
    Carry chain performs iterative OR function
    Single pass/fail indication at end of BIST sequence
    Only read configuration memory to get failing results for diagnosis

[Figure: each ORA is a LUT comparing DSPi output k with DSPj output k, with a flip-flop and a carry-chain MUX (inputs 0 and 1, carry-in, carry-out); the ORA outputs (O) are chained between TDI and TDO]
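
A behavioral sketch of the new ORA may help: each ORA flip-flop starts at 1 and latches 0 on the first mismatch, and the carry chain ORs the individual failures into one pass/fail bit. The class and function names are assumptions, and the model ignores how the LUT and carry MUX are actually configured.

```python
class ORA:
    """Output response analyzer for one pair of DSP output bits."""

    def __init__(self):
        self.ff = 1  # 1 = no mismatch seen so far

    def clock(self, out_i, out_j):
        """Compare the two outputs; a mismatch latches logic 0 permanently."""
        if out_i != out_j:
            self.ff = 0


def chain_fail(oras):
    """Iterative OR along the carry chain: 1 if any ORA recorded a mismatch."""
    carry = 0  # carry-in at the start of the chain
    for ora in oras:
        carry |= ora.ff ^ 1
    return carry
```

Only when `chain_fail` reports 1 would the individual flip-flop contents need to be read back for diagnosis.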

Page 20

BIST Configurations
  5 downloads to FPGA
    1 compressed download (<50% of full config)
    + 4 partial reconfigurations (<0.5% of full config)
      only change DSP configuration bits
  7 BIST sequences
    BIST configurations #2 & #3 ran twice
      different control register values for multiplier/adder test algorithms

  BIST     Pipeline                  Signals        B Input Source       Test Modes Applied
  Config   Registers                 Active Level   Slice0    Slice1     Multiply   Adder     Cascade
  1        All Regs=0                High           Direct    Direct     Yes (1)    No        No
  2        All Regs=1                High           Direct    Direct     Yes (2)    Yes (3)   No
  3        A&Breg=2, Other Regs=1    Low            Direct    Direct     Yes (4)    Yes (5)   No
  4        All Regs=1                High           Direct    Cascade    No         No        Yes (6)
  5        All Regs=1                Low            Cascade   Direct     No         No        Yes (7)

  Cascade tests: bottom row failures due to unconnected cascade inputs
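
The download figures on this slide can be combined into a rough bound on total download time. Taking one full configuration download as the unit, and assuming each of the 4 partial reconfigurations is individually below 0.5% of it:

```latex
\[
T_{\text{total}} \;<\; \underbrace{0.50}_{\text{1 compressed download}}
\;+\; 4 \times \underbrace{0.005}_{\text{each partial reconfiguration}}
\;=\; 0.52
\]
```

That is, the five downloads together take less than 52% of the time of one full download.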

Page 21

Cascade Mode Testing
  One slice from pair put in cascade mode at a time
  Circular comparison of slices sees identical behavior
  Cascade inputs to bottom DSP are not connected
    Expected failures in ORAs comparing that DSP's outputs

Page 22

DSP BIST Implementations
  Circular comparison per DSP column
  Each slice in tile compared with its counterpart
    slice0-to-slice0
    slice1-to-slice1
  CLB carry chain used to provide pass/fail indication
  Only read config memory contents to get results for diagnosis

[Figure: BIST layout on an SX25 showing TPG0, TPG1, ORAs, DSPs, and BSCAN between TDI and TDO]

Page 23

Automated BIST Configurations
  C program generates .XDL file
  .XDL to .NCD
    xdl -xdl2ncd bist.ncd
  FPGA Editor
    Design Rule Check
    Route design
  .NCD to .BIT
    BitGen
  Download into FPGA
  .NCD to .XDL
    Modification program for generating remaining 4 BIST configurations

[Figure: tool flow among BIST Programs, XDL file, XDL.exe, NCD file, FPGA Editor, BitGen.exe, BIT file, and download]

Page 24

DSP BIST Implementations
  Brad Dutton
    Generated BIST configurations for all Virtex-4 FPGAs and verified BIST on LX25, LX60, SX35, & FX12 via download and execution

[Figure: BIST layouts on LX15 and FX12 devices showing TPG0, TPG1, ORAs, DSPs, and a PowerPC block]

Page 25

BIST Timing Analysis

[Figure: bar chart of maximum clock frequency (MHz), axis 0 to 150, for Config 1 through Config 5. Annotation: bogus timing analysis by Xilinx tools due to unused cascade path with no pipeline registers. David Baumann's results]

Page 26

BIST Timing Analysis
  Based on configuration #3
  Fmax function of #DSPs & size of array
  4 TPGs might improve Fmax

[Figure: bar chart of maximum clock frequency (MHz), axis 0 to 80, for FX12, FX25, FX40, FX60, FX100, SX25, SX35, SX55, LX15, LX25, LX40, LX60, LX80, and LX100; each bar is annotated with #DSPs (32 to 512) and #DSP columns (1 to 8)]

Page 27

Physical Fault Injection
  Faulty FPGAs are difficult to find
    1 ORCA with faulty PLB & 2 ORCAs with faulty routing
  Physical fault insertion
    Etch package down to bare die and "zap"
  We use fault injection emulation
    Modify configuration bits before or after download (RMW)
    Can inject single and/or multiple faults
      Stuck-at faults & bridging faults
    Faults limited to effects of configuration bits

  Mustfa Ali's work

  System or BIST configuration file:  1101 1101 0000 1000 1011 0101
  Fault mask:                         0000 0000 1100 0000 0011 0000
  Stuck-at values:                    0110 0110 1110 0110 0100 0000
  Download file (to FPGA):            1101 1101 1100 1000 1000 0101
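
The fault mask example on this slide corresponds to a simple bitwise rule: wherever the mask is 1, the configuration bit is replaced by the stuck-at value; elsewhere it is left alone. A minimal sketch, with the function name chosen here for illustration:

```python
def inject_faults(config_bits, fault_mask, stuck_values):
    """Force masked configuration bits to their stuck-at values."""
    return (config_bits & ~fault_mask) | (stuck_values & fault_mask)


# Bit strings from the slide.
config = int("110111010000100010110101", 2)
mask = int("000000001100000000110000", 2)
stuck = int("011001101110011001000000", 2)
download = inject_faults(config, mask, stuck)
```

Formatting `download` as a 24-bit string gives the download file shown on the slide, with two bits stuck at 1 and two bits stuck at 0.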

Page 28

Fault Injection Emulation Results
  1) Download BIST configuration
  2) Manipulate configuration bit via read-modify-write
  3) Run BIST sequence
  4) Get BIST results

[Figure: bar chart of # BIST configs detecting fault (axis 0 to 6), with stuck-at-0 and stuck-at-1 bars for each DSP configuration bit: Cinib, Csel0ib, Csel1ib, Subib, Op0ib, Op1ib, Op2ib, Op3ib, Op4ib, Op5ib, Op6ib, Ceaib, Cebib, Cemib, Cepib, Ceccrtlib, Cecinsubib, Cecinib, Rstaib, Rstbib, Rstmib, Rstpib, Rstctlib, Rstcinib, Areg0b, Areg2b, Breg0b, Breg2b, Mreg0b, Preg0b, Cinreg0b, Cselreg0b, Opreg0b, Subreg0b, Clkib, Cascb, Creg0b, Cecib, &tnocfgb]
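
The four steps on this slide suggest a campaign loop over configuration bits, stuck-at values, and BIST configurations. The sketch below shows only that bookkeeping. The `bist_fails` callback is hypothetical: on real hardware it would stand for downloading a BIST configuration, forcing one bit by read-modify-write, running the sequence, and reading the pass/fail result.

```python
def run_campaign(config_bit_names, bist_configs, bist_fails):
    """Count, per (bit, stuck-at value), how many BIST configs detect the fault.

    bist_fails(config, bit_name, stuck_at) is a caller-supplied (hypothetical)
    function returning True when the BIST run reports a failure.
    """
    results = {}
    for bit_name in config_bit_names:
        for stuck_at in (0, 1):
            detections = sum(
                1 for config in bist_configs
                if bist_fails(config, bit_name, stuck_at)
            )
            results[(bit_name, stuck_at)] = detections
    return results
```

The resulting counts have the same shape as the chart's data: one stuck-at-0 and one stuck-at-1 value per configuration bit.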

Page 29

Summary
  Investigated known test algorithms for multipliers and adders
  Looked for architecture independent tests with highest fault coverage
  JETTA'98 approach easy to implement
    Needs modification for 100% FC
  7 DSP BIST sequences with 5 downloads
    New ORA eliminates config memory readback
  Total testing time < 52% of 1 full download
    Using compressed and partial reconfiguration
    Only DSP configuration bits need to be changed
  Application to Virtex-5 DSPs

Page 30

BIST Approach for Virtex-5 DSP
  Larger multiplier but same test algorithm
  Logical operations but 48-bit cascade of A:B allows direct testing
  Pattern detect but known algorithm for = comparator
  Optional regs like V4 but data sheets have less info