

IEICE TRANS. INF. & SYST., VOL.E89-D, NO.10 OCTOBER 2006

2616

PAPER

Compression/Scan Co-design for Reducing Test Data Volume, Scan-in Power Dissipation, and Test Application Time

Yu HU†a), Member, Yinhe HAN†, Student Member, Xiaowei LI†, Huawei LI†, Nonmembers, and Xiaoqing WEN††, Member

SUMMARY LSI testing is critical to guarantee that chips are fault-free before they are integrated into a system, so as to increase the reliability of the system. Although full-scan is a widely adopted design-for-testability technique for LSI design and testing, there is a strong need to reduce the test data Volume, scan-in Power dissipation, and test application Time (VPT) of full-scan testing. Based on an analysis of the characteristics of the variable-to-fixed run-length coding technique and the random access scan architecture, this paper presents a novel design scheme to tackle all VPT issues simultaneously. Experimental results on ISCAS'89 benchmarks have shown on average 51.2%, 99.5%, 99.3%, and 85.5% reduction effects in test data volume, average scan-in power dissipation, peak scan-in power dissipation, and test application time, respectively.

key words: compression, run-length coding, random access scan, power dissipation, test application time

1. Introduction

Full-scan is a widely adopted technique for design and testing of LSI chips. In a full-scan sequential circuit, the memory elements such as D flip-flops (DFFs) are replaced with scan cells, e.g. multiplexed DFFs; thus the memory elements become pseudo primary inputs (PPIs) and pseudo primary outputs (PPOs). This scheme dramatically reduces the difficult problem of sequential automatic test pattern generation (ATPG) to the relatively easy problem of combinational ATPG; therefore, the test generation time and the test set quality are greatly improved.

Although the full-scan technique is commonly supported by commercial tools, its biggest disadvantage is that it suffers from large test data Volume, excessive test Power dissipation, and prolonged test application Time due to scan operations. Here, these three problems are combined into the concept of the "VPT issues". As the integration scale doubles every eighteen months, the VPT issues of scan design are becoming more and more serious.

Large test data volume requires large channel capacity of the automatic test equipment (ATE). High power dissipation during the scanning of test data can cause circuit reliability hazards, instant circuit damage [1], or faulty test results [2]. Long test application time increases the chip test cost.

Manuscript received August 8, 2005.
Manuscript revised December 26, 2005.

†The authors are with the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100080, China.

††The author is with the Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka-shi, 820-8502 Japan.

a) E-mail: [email protected]
DOI: 10.1093/ietisy/e89-d.10.2616

Although many methods have been proposed to tackle the VPT issues in recent years, most of them have focused on only one or two of the VPT issues, instead of addressing all of them simultaneously.

Compression methods such as run-length coding [3], statistical coding (SC) [4], Golomb coding [5], frequency-directed run-length coding (FDR) [6], alternating run-length coding using FDR (ARL) [7], extended frequency-directed run-length coding (EFDR) [8], variable-length input Huffman coding (VIHC) [9], and variable-tail coding [10] reduce test data volume by encoding the original test set TD provided by the circuit vendor into a smaller test set TE. TE is stored in the memory of ATE channels. During scan testing, TE is transferred to an on-chip decoder, which reverts TE to TD. The methods in [11]-[13] compress the test data to seeds; a linear mapping network, which can be either an LFSR or an XOR network, then decompresses the seeds to TD. Since the data volume of either TE or the seeds is less than that of TD, these compression methods reduce the requirement for ATE channel capacity. Optimally mapping unspecified bits in test data to 0's or 1's [7] can reduce scan-in and scan-out power dissipation as well.

Scan design methods [14]-[17] reduce test data volume and test application time by partitioning and broadcasting. Scan chains are partitioned into shorter segments, and the same test data is broadcast to multiple segments. When the grain of segments is a scan cell, the scan structure is named a scan tree in [15]-[17] or a scan forest in [18]. VirtualScan [19] is also a broadcasting design with a broadcast network to reduce the correlation in test data. In [20], scan-in power dissipation is saved by freezing scan chain segments, and test data volume is reduced by using the test response to generate the next test stimuli. In [21] and [22], the test response is also reused to generate the next stimuli with dynamic reconfigurable scan chains and circular scan. The method in [23] reorders the test data to reduce the Hamming distance between the response vector and the next stimuli vector, and utilizes random access scan (RAS) [24] to reduce VPT simultaneously. Since [23] is based on the test vector reordering technique and the RAS structure, we denote the technique proposed in [23] as RRAS.

Each issue in the VPT problem is so critical that all three must be solved. Our goal is to reduce all the VPT parts simultaneously. This is achieved by reducing test data volume with variable-to-fixed run-length coding and reducing scan-in power dissipation as well as test application time with the RAS structure. A heuristic algorithm is proposed to arrange the scan cells in RAS to obtain a high compression ratio.

Copyright © 2006 The Institute of Electronics, Information and Communication Engineers

The rest of this paper is organized as follows: Section 2 first describes the characteristics of the variable-to-fixed run-length coding and the RAS structure, respectively, then presents a compression/scan co-design (CSCD) scheme. Section 3 analyzes the VPT reduction effects. Experimental results on larger ISCAS'89 benchmark circuits are shown in Sect. 4, and Sect. 5 concludes this paper.

2. CSCD Scheme

2.1 Variable-to-Fixed Run-Length Coding

In variable-to-fixed run-length coding, the number of consecutive 0's is encoded as a fixed-length codeword. The fixed length is denoted as the block length b. Since the run-length is the number of consecutive 0's, the coding is suitable for compressing skewed test data whose 0-probability exceeds its 1-probability. In fact, for the stuck-at fault model, an important property of the test data set TD generated by ATPG is that TD usually has a very high ratio of unspecified bits. These unspecified bits can be randomly mapped to either logic value 0 or 1 without any fault coverage loss. By mapping the unspecified bits to 0's, the test data can be well skewed. Table 1 shows an example of variable-to-fixed run-length coding with block length b=3.

Obviously, the last few bits unable to be encoded are all 0's, because the run-length coding always requires that the last bit be 1 except when the run-length equals 2^b - 1. In [3], when the last few bits at the end of TD cannot be encoded, extra bits are padded to solve the problem. In the case of the 3-bit run-length coding, for example, if the last two bits are 00, then the codeword 111 will be used to encode the bits. Unlike the padding method of [3], our CSCD run-length coding technique encodes the last few bits as the number of 0's. For the above example, 00 will be encoded as 010. The CSCD decoder is specially designed to correctly decode such a codeword, e.g. 010 will be reverted to 00.
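The encoding rule of Table 1, together with the CSCD treatment of trailing 0's, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `rl_encode` and the representation of test data as a Python string of '0'/'1' characters are hypothetical.

```python
def rl_encode(bits: str, b: int) -> list[str]:
    """Encode a 0/1 string into b-bit variable-to-fixed run-length codewords."""
    max_run = 2 ** b - 1              # run-length carried by the all-ones codeword
    codewords = []
    run = 0                           # 0's seen since the last emitted codeword
    for bit in bits:
        if bit == "0":
            run += 1
            if run == max_run:        # 2^b - 1 zeros with no terminating 1
                codewords.append("1" * b)
                run = 0
        else:                         # a 1 terminates the current run
            codewords.append(format(run, f"0{b}b"))
            run = 0
    if run:                           # trailing 0's: CSCD encodes their count
        codewords.append(format(run, f"0{b}b"))
    return codewords


print(rl_encode("00000100001001", 3))   # ['101', '100', '010']
print(rl_encode("00", 3))               # ['010']
```

Note that the trailing-0 codeword is indistinguishable from a regular codeword when read in isolation; in the CSCD scheme the decoder resolves it from the row width, which this encoder-only sketch does not model.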

Table 1 Example of 3-bit variable-to-fixed run-length coding.

Original data   Run-length   Codeword
1               0            000
01              1            001
001             2            010
0001            3            011
00001           4            100
000001          5            101
0000001         6            110
0000000         7            111

It is clear from Table 1 that the characteristic of variable-to-fixed run-length coding is that the relative address (RA) of a 1 in the test data is the run-length plus one. The absolute address (AA) is defined as the number of bits from the designated bit to the first bit. For each 1 in the test data, we get its AA by adding its RA to the AA of the last bit of the most recently decoded test data. For example, let TD = 00000100001001. In this case, we have TE = 101100010 with b=3. The AA of the 1 on the left in TD is 6, that of the 1 in the middle is 11, and that of the 1 on the right is 14, as shown below:

TD:  000001   00001    001
TE:  101      100      010
RA:  5+1=6    4+1=5    2+1=3
AA:  6        6+5=11   11+3=14
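The address arithmetic of this example can be reproduced with a short sketch. It assumes a single unbounded row whose data end with a 1, so trailing-0 codewords and row boundaries are not handled; the function name `absolute_addresses` is hypothetical.

```python
def absolute_addresses(codewords: list[str]) -> list[int]:
    """Return the absolute address (1-based) of every 1 encoded by the codewords."""
    b = len(codewords[0])
    all_ones = 2 ** b - 1
    last = 0                      # AA of the last bit decoded so far
    addresses = []
    for cw in codewords:
        run = int(cw, 2)
        if run == all_ones:       # 2^b - 1 zeros only: advance, no 1 to report
            last += run
        else:                     # RA = run-length + 1, AA = previous AA + RA
            last += run + 1
            addresses.append(last)
    return addresses


print(absolute_addresses(["101", "100", "010"]))   # [6, 11, 14]
```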

On the other hand, scan cells in the conventional serial scan chain have to be accessed serially. The characteristic of the RAS structure is that any of its scan cells can be selected directly by the address decoders. This characteristic of RAS can be naturally combined with the characteristic of the run-length coding. Whenever the absolute address of a 1 in the test data is provided to the address decoder, the state of the corresponding scan cell is flipped. In the following sections, this combination will be described in more detail.

2.2 CSCD Architecture

As shown in Fig. 1, the CSCD architecture consists of three parts: Decoder, Address Decoder, and RAS. The Decoder part has an Address Register Bank (ARB), an Adder, a UnitAdder, two MUXs (MUXa and MUXb), two AND gates (ANDa and ANDb), a NOT gate, an OR gate and a NOR gate. The Address Decoder part has a Counter, an X Address Decoder, and a Y Address Decoder. The RAS part is constructed with the scan cells of the circuit-under-test. The unique technique used here is that the address y0 is left unused. This is specially designed to handle the case that the last few bits in TD are all 0's.

Scan cells are classified into two blocks according to the 0-probability and 1-probability of test data. When the 0-probability of a scan cell is greater than its 1-probability, the scan cell is classified to RAS0; otherwise, it is classified to RAS1. The purpose of this classification is to reduce the state transitions during setting test data. The states of the scan cells that are most likely to have the test bit 0 are initialized to 0, and the states of the scan cells that are most likely to have the test bit 1 are initialized to 1. For the sake of clarity, the scan cells are placed regularly in Fig. 1; in practice, however, they can be distributed in the circuit-under-test.

Let the size of RAS be (r + 1) × c, where r + 1 is the number of rows of RAS and c is the number of columns. c is set to an integral power of 2 minus 1. It is clear that any scan cell can be accessed by using a ⌈log2 (r + 1)⌉-bit X Address Decoder and a log2 (c + 1)-bit Y Address Decoder. An advantage of RAS is that the clock skew becomes less critical, because there is no need to shift the test data along the connected scan cells one by one. Primary inputs (PIs) are multiplexed to apply the encoded test data. To avoid transitions of Decoder and Address Decoder in the function mode, two MUXs (MUXc and MUXd) are inserted after the PIs. Two extra pins are needed in the CSCD architecture: one is TC, and the other is RST. TC is the signal that selectively sets the circuit-under-test into function mode or test mode. RST resets RAS0 to 0 and RAS1 to 1 whenever it is asserted. For the sake of simplicity, the test clock signal, which is generally connected to ARB, Counter, and each scan cell, is not shown in Fig. 1. As for test response analysis, we assume that test responses are compacted with an XOR network and a Multiple Input Signature Register (MISR) as in [18].

Fig. 1 CSCD architecture.

Fig. 2 CSCD RAS scan cells: (a) RAS0 scan cell; (b) RAS1 scan cell.

Table 2 The operation mode of RAS scan cells.

Mode           TC   xi & yj   RST
function       1    U         U
test: reset    0    U         1
test: select   0    1         0
test: hold     0    0         0

The RAS0 and RAS1 scan cells are different from conventional multiplexed DFFs. An AND gate and an OR gate are inserted in front of the MUX as illustrated in Fig. 2. Comparing the CSCD RAS scan cell design with the conventional multiplexed DFF design makes it clear that, since the AND gate and the OR gate are not on the signal propagation path, the CSCD RAS scan cell design has the same impact on the delay as the conventional multiplexed DFF.

Table 2 shows how TC, the address select signal (xi & yj), and RST together determine the operation mode of a RAS scan cell (in Table 2, 1 denotes logic 1, 0 denotes logic 0, and U denotes the unspecified state). When TC=1, the scan cells are in the function mode. When TC=0, the scan cells are in the test mode. In the test mode, when RST=1, RAS0 scan cells are reset to 0's and RAS1 scan cells are reset to 1's. For a RAS0 scan cell, once it is selected by the X and Y address decoders, its state is set to 1; for a RAS1 scan cell, once it is selected by the X and Y address decoders, its state is set to 0. The unselected scan cells hold their present states. The hold mode prevents transitions at the scan cells from rippling into the combinational logic. Obviously, this helps reduce the extra power dissipation of the combinational logic.
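The operation modes of Table 2 can be summarized as a small behavioral model. This is only a sketch of the next-state behavior described above, not a gate-level description of Fig. 2; the function name `ras_cell_next_state` and the `functional_value` argument (the value captured from the circuit in function mode) are assumptions introduced for illustration.

```python
def ras_cell_next_state(kind, state, tc, select, rst, functional_value=None):
    """Next state of a RAS scan cell.

    kind   -- 0 for a RAS0 cell, 1 for a RAS1 cell
    state  -- present state of the cell
    tc     -- 1 selects function mode, 0 selects test mode
    select -- address select signal (xi & yj)
    rst    -- reset signal
    """
    if tc == 1:
        return functional_value   # function mode: behaves as an ordinary flip-flop
    if rst == 1:
        return kind               # reset: RAS0 -> 0, RAS1 -> 1
    if select == 1:
        return 1 - kind           # select: RAS0 -> 1, RAS1 -> 0
    return state                  # hold: keep the present state


# A RAS0 cell after reset, then selected once, then left unselected.
s = ras_cell_next_state(0, 1, tc=0, select=0, rst=1)   # reset
s = ras_cell_next_state(0, s, tc=0, select=1, rst=0)   # select
s = ras_cell_next_state(0, s, tc=0, select=0, rst=0)   # hold
print(s)   # 1
```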

The stuck-at faults in Decoder, Address Decoder, RAS and the response analyzer (e.g. XOR net and MISR) can be tested by the following steps: 1) RAS0 scan cells are initialized to 0's and RAS1 scan cells are initialized to 1's by RST=1 and TC=0. 2) The response analyzer (e.g. XOR net and MISR) compacts the values of RAS to a signature, and outputs the signature to ATE for comparison. 3) Set RST=0 and TC=0, and select the scan cells one by one in ascending order; the states of the RAS0 scan cells then become 1's, and the states of the RAS1 scan cells become 0's. 4) The response analyzer again compacts the values of RAS and outputs the signature to ATE for comparison. After the DFT circuitry is tested, conventional test data generated by ATPG tools can be encoded by run-length coding, then transferred to PIs to test the circuit-under-test.


2.3 Scan Cell Arrangement

In the CSCD scheme, the original test data TD provided by the circuit vendor must be partitioned and reordered to TD0 and TD1 according to the addresses of scan cells, since scan cells have been classified to two blocks: RAS0 and RAS1. TD0 and TD1 are two non-overlapping sets with TD0 for RAS0 and TD1 for RAS1. All unspecified bits in TD0 are mapped to 0's and all unspecified bits in TD1 are mapped to 1's. Therefore, the test data TD0 and TD1 are skewed more effectively than by simply randomly mapping unspecified bits in TD to 0's or 1's. Obviously, this way of mapping unspecified bits is good for gaining a better encoding effect. Then TD0 is encoded to the variable-to-fixed run-length code TE0. To simplify the encoding and the decoding processes, TD1 is first inverted bitwise, then encoded to TE1 in the same way as TD0.
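The partitioning, unspecified-bit mapping, and inversion described above can be sketched for a single test vector. The sketch assumes that unspecified bits are written as 'X', that the set of cell indices belonging to RAS1 is already known, and that the cells of each block keep their original relative order (the reordering by scan cell arrangement is not modeled). The function name `split_and_fill` is hypothetical.

```python
def split_and_fill(vector: str, ras1_cells: set[int]) -> tuple[str, str, str]:
    """Split one test vector into its TD0 part, TD1 part, and inverted TD1 part."""
    td0, td1 = [], []
    for idx, bit in enumerate(vector):
        if idx in ras1_cells:
            td1.append("1" if bit == "X" else bit)   # X -> 1 in TD1
        else:
            td0.append("0" if bit == "X" else bit)   # X -> 0 in TD0
    td1_str = "".join(td1)
    # Invert TD1 so that it is dominated by 0's like TD0 before encoding.
    td1_inverted = td1_str.translate(str.maketrans("01", "10"))
    return "".join(td0), td1_str, td1_inverted


print(split_and_fill("0X1X1XX0", {2, 4, 5, 7}))   # ('0000', '1110', '0001')
```

After this step both the TD0 part and the inverted TD1 part consist mostly of 0's, so the same run-length encoder can be applied to each.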

Since the original test data TD will be reordered according to the addresses of RAS scan cells before being encoded, the arrangement of scan cells determines the encoding effect. Obviously, enumerating all possible scan cell permutations to get the minimum size of the encoded test set is infeasible, since for k scan cells there are k! such permutations. As a result, a heuristic algorithm is proposed to near-optimally arrange scan cells. In the case of RAS0 scan cell arrangement, for example, suppose that there are k0 scan cells. We have a matrix of p0 × q (k0 ≤ p0 × q) empty boxes where we can put scan cells. One box can have at most one scan cell. From Table 1, we can see that for the variable-to-fixed run-length coding, the best encoding effect can be attained when a codeword represents 2^b - 1 bits of original data. Moreover, the bigger the block size b, the more significant the encoding effect. The steps of the scan cell arrangement algorithm are as follows:

(1) Sort the scan cells by their 1-probability in descending order.

(2) Choose the scan cells whose 1-probability values are no less than a threshold value as "anchor" scan cells.

(3) Distribute the anchors uniformly in the box matrix, and make sure that the distance between two neighboring anchors in a row is the block size of the run-length coding.

(4) Put the other scan cells zigzag into the empty boxes one by one until there are no more scan cells.

Step (3) is for the purpose of obtaining a bigger block size b. Step (4) is to make the run-length closer to 2^b - 1. These two steps together help in pursuing a near-optimal scan cell arrangement. Once a scan cell is arranged, a serial number is assigned to the scan cell. The integral part of the serial number indicates the sequence number of the pass to arrange the scan cells. The fractional part of the serial number indicates the sequence number of the scan cell arranged in this pass. Figure 3 shows an example of 5 x 7 arranged RAS0 scan cells.
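Steps (1) and (2) of the algorithm can be sketched as follows. The placement of anchors and the zigzag filling in steps (3) and (4) depend on layout details that are not modeled here. The sketch assumes that the test vectors of one block are given as equal-length '0'/'1' strings with unspecified bits already mapped; the function names `one_probabilities` and `pick_anchors` are hypothetical.

```python
def one_probabilities(vectors: list[str]) -> list[float]:
    """1-probability of each scan cell: fraction of vectors holding a 1 there."""
    count = len(vectors)
    width = len(vectors[0])
    return [sum(v[j] == "1" for v in vectors) / count for j in range(width)]


def pick_anchors(probs: list[float], threshold: float) -> tuple[list[int], list[int]]:
    """Sort cells by descending 1-probability and split them at the threshold."""
    order = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    anchors = [j for j in order if probs[j] >= threshold]
    others = [j for j in order if probs[j] < threshold]
    return anchors, others


probs = one_probabilities(["0100", "0101", "0001", "0100"])
print(probs)                       # [0.0, 0.75, 0.0, 0.5]
print(pick_anchors(probs, 0.45))   # ([1, 3], [0, 2])
```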

In Fig. 3, 1.01, 1.02, ..., 4.05 are the serial numbers of arranged RAS0 scan cells. Scan cells arranged earlier have higher 1-probabilities and smaller serial numbers. Since the anchors are arranged first, the integral part of their serial numbers is always 1.

Fig. 3 Example of arranged RAS0 scan cells.

When k0 < p0 × q, there are spare boxes after all scan cells have been put into the box matrix. These spare boxes become dummy scan cells whose states are always 0 in RAS0, and TD0 is padded with 0's according to the addresses of the dummy scan cells. For the same reason, the states of the dummy scan cells are always 1 in RAS1, and TD1 is padded with 1's according to the addresses of the dummy scan cells.

2.4 Test Data Decompression

The CSCD scheme reorders the original test data TD to TD0 and TD1 because scan cells have been classified to RAS0 and RAS1. Then CSCD compresses the test data by variable-to-fixed run-length coding. This is different from the RRAS technique. RRAS reorders the test data to obtain pairs of test response vectors and test stimuli vectors with minimum Hamming distance, then compresses the test data by storing the bits in which a test stimuli vector differs from its paired test response vector.

The test data decompression procedure of the CSCD scheme is as follows: in Fig. 1, when a run-length codeword is transferred to PIs, Adder adds the codeword to the content of ARB. Then UnitAdder increases the output of Adder by one. At the same time the ANDa gate conducts the bitwise-AND operation on the codeword with 1's. If the result of the bitwise-AND operation at ANDa is 0, the output of NOT is 1. Then the output of UnitAdder is provided to the Y Address Decoder through MUXa, as well as to ARB through MUXb. If the result of the bitwise-AND operation is 1, the codeword contains only 1's. Therefore, the output of Adder is directly provided to ARB through MUXb while 0's are provided to the Y Address Decoder through MUXa. No scan cell is selected because the first address y0 is left unused. It can be seen that ARB stores the absolute Y address of the last bit of the most recently decoded test data.

When the last codeword of a row is transferred to PIs, for the corresponding test data in TD0/TD1, there are two possible cases: (1) the last bit is 1 and (2) the last bit is 0.

Fig. 4 Example of test data decompression. (RAS rows X0 to X2; entries of (xi, yj) that are not legible in the source are left blank.)

Codeword    PIs   (xi, yj)   ARB   Counter
codeword1   101   (0, 6)     6     0
codeword2   111   (0, 0)     13    0
codeword3   001              0     1
codeword4   001   (1, 2)     2     1
codeword5   110   (1, 9)     9     1
codeword6   110              0     2
codeword7   110   (2, 7)     7     2
codeword8   011   (2, 11)    11    2
codeword9   100              0     3

For the first case, the output of UnitAdder is the column address of the scan cell at the end of the row. Since this address is an integral power of 2 minus 1, the output of ANDb is 1. Therefore INC is asserted. At the next rising edge of the test clock, ARB is reset to 0 by the output of the NOR gate, and the asserted INC lets Counter increase by 1. The output of the X Address Decoder then points to the next row of RAS. For the second case, UnitAdder overflows and the carry-out bit (CO) is set to 1. In this case, ARB is reset to 0 and Counter increases by 1.

After the last row of a TE1 vector is decompressed, TC is asserted to make RAS operate in function mode, and the test data is applied to the circuit-under-test. Then the test response is captured and compacted by an XOR network and MISR. After that, RAS0, ARB and Counter are reset to 0 and RAS1 is reset to 1.

Figure 4 is an example to illustrate how the CSCD scheme decompresses the encoded test data. Assume that scan cells have been classified into a 2-row-by-15-column RAS0 and a 1-row-by-15-column RAS1. Since the numbers of rows and columns are 3 and 15, respectively, the widths of Counter and ARB are 2 bits and 4 bits, respectively. RAS0, ARB, and Counter are initialized to 0, and RAS1 is initialized to 1. In Fig. 4, a test vector of TD0/TD1 is shown in RAS0 ∪ RAS1, which is encoded to codeword1 through codeword9. The column "PIs" indicates the codeword transferred to PIs, and the column "(xi, yj)" indicates the X-address and the Y-address of the selected scan cell. "ARB" and "Counter" indicate the values of ARB and Counter, respectively, when the next codeword is transferred to PIs. The straight lines under the values of (xi, yj), ARB, and Counter emphasize that ARB is reset to 0 and Counter increases by 1.
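The decompression procedure can be checked with a small behavioral simulation, using the nine codewords as read from Fig. 4 (101, 111, 001, 001, 110, 110, 110, 011, 100). This is a sketch under stated assumptions rather than a model of the actual circuit: the function name `cscd_decode` is hypothetical, the carry-out of UnitAdder is modeled as the condition y > c, and the row is also assumed to end when an all-ones codeword makes UnitAdder overflow, a case the text does not spell out.

```python
def cscd_decode(codewords, b, c):
    """Simulate ARB and Counter for b-bit codewords and a RAS with c columns.

    Returns a list of (codeword, selected cell or None, ARB, Counter), where
    ARB and Counter are the values seen when the next codeword arrives.
    """
    all_ones = 2 ** b - 1
    arb, counter, trace = 0, 0, []
    for cw in codewords:
        value = int(cw, 2)
        total = arb + value              # Adder output
        y = total + 1                    # UnitAdder output
        carry_out = y > c                # overflow past the last column
        selected = None
        if value == all_ones:
            arb = total                  # only 0's decoded; y0 selects no cell
        elif not carry_out:
            selected = (counter, y)      # flip the cell at row `counter`, column y
            arb = y
        # Row ends when its last cell is selected or when UnitAdder overflows.
        if carry_out or (selected is not None and y == c):
            arb = 0
            counter += 1
        trace.append((cw, selected, arb, counter))
    return trace


fig4 = ["101", "111", "001", "001", "110", "110", "110", "011", "100"]
for row in cscd_decode(fig4, b=3, c=15):
    print(row)
```

Under these assumptions the simulated ARB and Counter values agree with those listed in Fig. 4, and the third codeword selects the last cell of row 0, which corresponds to case (1) above.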

3. Effect Analysis

3.1 Test Data Volume Reduction

In this work, the test data volume is reduced by the run-length coding. Since the encoding and decoding of TD1 is conducted in the same manner as that of TD0, we only analyze the test data volume reduction with TD0 in the following.

According to the theorem of the variable-to-fixed run-length coding [25], the size of the encoded test data TE0 is given by

$$RL = \frac{bn}{2^b - 1} + b\sum_{i=1}^{h}\delta_i$$

where $0 \le \delta_i < 1$, n is the total number of bits in TD0, h is the total number of 1's in TD0, b is the block size, and $l_i$ is the run-length in the i-th block. From this formula, for a given test data set, it is clear that a bigger block size b and a smaller $\delta_i$ result in a shorter RL. If each $l_i + 1$ is an integral multiple of $2^b - 1$, then $\delta_i = 0$ and

$$RL_{min} = \frac{bn}{2^b - 1}$$

This is the lower bound of the variable-to-fixed run-length coding.

The encoding effect is indicated by the compression ratio. The compression ratio is given by

$$CR = \frac{\text{No. of bits in } T_D - \text{No. of bits in } T_E}{\text{No. of bits in } T_D} \times 100\%$$

From the discussions above, it is clear that the upper bound of the compression ratio is given by

$$CR_{max} = \frac{n - RL_{min}}{n} \times 100\%$$
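As a worked illustration, the compression ratio and its upper bound can be evaluated for the example of Sect. 2.1 (n = 14 bits encoded into 9 bits with b = 3). The sketch assumes the lower bound RL_min = bn/(2^b - 1) discussed above; the function names are hypothetical.

```python
def rl_lower_bound(n: int, b: int) -> float:
    """Lower bound on the encoded size for n bits and block size b."""
    return b * n / (2 ** b - 1)


def compression_ratio(bits_td: int, bits_te: int) -> float:
    """Compression ratio in percent."""
    return (bits_td - bits_te) / bits_td * 100.0


def cr_upper_bound(n: int, b: int) -> float:
    """Upper bound on the compression ratio in percent."""
    return (n - rl_lower_bound(n, b)) / n * 100.0


print(rl_lower_bound(14, 3))                 # 6.0
print(round(compression_ratio(14, 9), 1))    # 35.7
print(round(cr_upper_bound(14, 3), 1))       # 57.1
```

The upper bound depends only on b: (n - RL_min)/n simplifies to 1 - b/(2^b - 1), which is about 57.1% for b = 3.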

3.2 Scan-in Power Dissipation Reduction

The scan power dissipation includes both the scan-in and the scan-out power dissipation. For the traditional scan chain architecture, the scan-in procedure and the scan-out procedure are symmetric. However, many low-power compression techniques (e.g. ARL in [7]) usually operate only on the test stimuli data and not on the test response data, with the result that the scan-in power dissipation is usually less than the scan-out power dissipation.

On the other hand, for the CSCD architecture, the scan-in procedure and the scan-out procedure are asymmetric. The test responses are directly transferred to MISR through the XOR network, hence there are no transitions on the scan cells and the combinational circuit-under-test during the scan-out procedure. Therefore, only the XOR network and the MISR contribute to the scan-out power dissipation.

Since the XOR network is a small combinational circuit and MISR is relatively shorter than the rows of RAS, the scan-out power dissipation is less than the scan-in power dissipation. In this work, we analyze the scan-in power dissipation, but it is reasonable to infer that if the scan-in power dissipation of CSCD is less than that of the techniques based on the scan chain architecture, the CSCD architecture will also outperform those techniques in the scan-out power dissipation.

The weighted transition metric (WTM) has been shown to be strongly correlated with the power dissipation in the circuit during scan operation [26]. Therefore, we estimate the scan-in power dissipation with WTM. The weight assigned to a transition is the difference between the size of the scan chain and the position in the vector in which the transition occurs. For a test vector, the number of weighted transitions is given by

$$\text{Weighted Transitions} = \sum (\text{Size of Scan Chain} - \text{Position of Transition})$$

And the total scan-in power dissipation (PD) of the conventional serial single scan design is the sum of the WTM of the test data TD,

$$PD = \sum_{i=1}^{|T_D|} \text{Weighted Transitions}_i$$

where $|T_D|$ is the number of test vectors in TD. The average scan-in power dissipation (APD) and the peak scan-in power dissipation (PPD) are given by

$$APD = PD / |T_D|$$

$$PPD = \max_{i=1,\ldots,|T_D|} (\text{Weighted Transitions}_i)$$

For the CSCD architecture, the total scan-in power dissipation has no relation to the position of transitions and is equal to the sum of the number of 1's in TD0 and the number of 0's in TD1. Therefore, PD, APD, and PPD are given by

$$PD = \sum_{i=1}^{|T_D|} \left( T_{i,0}(1) + T_{i,1}(0) \right)$$

$$APD = PD / |T_D|$$

$$PPD = \max_{i=1,\ldots,|T_D|} \left( T_{i,0}(1) + T_{i,1}(0) \right)$$

where $T_{i,0}(1)$ denotes the number of 1's in the i-th vector of TD0, and $T_{i,1}(0)$ denotes the number of 0's in the i-th vector of TD1.
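The two power estimates can be compared with a short sketch. It assumes the common convention for WTM in which a transition between bit j and bit j+1 (1-based) of a vector of length L has weight L - j; the exact indexing used in [26] should be checked against that reference. The function names are hypothetical.

```python
def weighted_transitions(vector: str) -> int:
    """WTM of one vector for a serial scan chain as long as the vector."""
    length = len(vector)
    return sum(length - pos                       # weight of the transition
               for pos in range(1, length)        # pos = 1-based position j
               if vector[pos - 1] != vector[pos])


def cscd_transitions(td0_vector: str, td1_vector: str) -> int:
    """CSCD scan-in transitions: 1's in the TD0 part plus 0's in the TD1 part."""
    return td0_vector.count("1") + td1_vector.count("0")


def power_summary(per_vector: list[int]) -> tuple[int, float, int]:
    """Return (PD, APD, PPD) from the per-vector transition counts."""
    pd = sum(per_vector)
    return pd, pd / len(per_vector), max(per_vector)


print(weighted_transitions("0101"))                   # 6
print(cscd_transitions("00000100001001", "1110"))     # 4
print(power_summary([4, 2, 6]))                       # (12, 4.0, 6)
```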

3.3 Test Application Time Reduction

For a conventional serial single scan design, the test appli-cation time is the product of the length of the scan chain andthe number of the test vectors in TD [27]. However, for theCSCD architecture, one cycle is needed to set the state oftheselected RAS scan cell. Therefore, the number of codewordin a TEOVeCtOris the number oftest cycles needed to load theTDOVeCtOr. For the same reason, the number of codeword

2621

in a TEI VeCtOr is the number of test cycles needed to loadthe TDI VeCtOr. After both TDOand TDI VeCtOrS are loadedinto RASo and RAS1, One CyCle is needed to capture the testresponse, and one more cycle is needed to reset RASo to Oand RASl tO 1. Therefore, the total test application time ofCSCD is given by

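$$TAT = \sum_{i=1}^{|T_E|} \left( T_{i,0}(\text{codeword}) + T_{i,1}(\text{codeword}) + 2 \right)$$

where $T_{i,0}(\text{codeword})$ and $T_{i,1}(\text{codeword})$ are the numbers of codewords in the i-th vectors of TE0 and TE1, and |TE| is the number of test vectors in TE. Since the variable-to-fixed run-length coding does not reduce the number of test vectors, |TE| is equal to |TD|.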

In the example of Fig. 4, a test vector of TD is encoded into codeword1 through codeword9. Among these codewords, codeword1 to codeword6 belong to TE0, while codeword7 to codeword9 belong to TE1. Therefore, we need 9 cycles to load the 9 codewords, and 2 more cycles to capture the test response and to reset RAS. Hence TAT is 11.
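The cycle count can be checked with a few lines of Python. This is a minimal sketch in which the per-vector codeword counts are passed in directly; the function names are illustrative, and the single-vector input mirrors the Fig. 4 example (6 codewords in TE0, 3 in TE1).

```python
from typing import List


def cscd_tat(te0_codewords: List[int], te1_codewords: List[int]) -> int:
    """Total CSCD test application time in cycles.

    te0_codewords[i] and te1_codewords[i] are the numbers of codewords of
    vector i in TE0 and TE1. Each vector costs one cycle per codeword plus
    two extra cycles (capture, then reset of RAS0 and RAS1).
    """
    if len(te0_codewords) != len(te1_codewords):
        raise ValueError("both lists need one entry per test vector")
    return sum(c0 + c1 + 2 for c0, c1 in zip(te0_codewords, te1_codewords))


def serial_scan_tat(chain_length: int, num_vectors: int) -> int:
    """Serial single scan: chain length times number of vectors."""
    return chain_length * num_vectors


if __name__ == "__main__":
    print(cscd_tat([6], [3]))  # the Fig. 4 example
```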

4. Experimental Results

Experiments were performed on the full-scan versions of the ISCAS'89 benchmark circuits to evaluate the proposed CSCD scheme. The test data generated by the MinTest ATPG program [28] with full fault coverage were used in our experiments.

Since the threshold value in our heuristic algorithm impacts the number of anchors and the size of the encoded test data, we tried threshold values from 0.05 to 0.45 with a step of 0.05. The number of RAS columns also impacts the arrangement of scan cells; therefore we tried 31, 63, 127, 511, and 1023 columns. The experiments were implemented in Matlab version 6.1 and executed on a PC with a 2.6 GHz Pentium 4 processor and 256 MB of memory. All experiments were completed in several seconds of CPU time.
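The parameter sweep described above amounts to a small grid search. The sketch below enumerates the threshold and RAS-column values listed in the text and picks the pair that maximizes a user-supplied score. The `evaluate` callback is a hypothetical stand-in for running the heuristic and encoder on one circuit; it is not part of the paper, whose experiments were run in Matlab.

```python
from itertools import product
from typing import Callable, List, Tuple

# Thresholds 0.05, 0.10, ..., 0.45 and the RAS column counts from the text.
THRESHOLDS: List[float] = [round(0.05 * k, 2) for k in range(1, 10)]
RAS_COLUMNS: List[int] = [31, 63, 127, 511, 1023]


def best_configuration(
    evaluate: Callable[[float, int], float],
) -> Tuple[float, int]:
    """Return the (threshold, columns) pair with the highest score.

    `evaluate` is assumed to return a figure of merit such as the
    compression ratio obtained for that configuration.
    """
    return max(product(THRESHOLDS, RAS_COLUMNS), key=lambda cfg: evaluate(*cfg))


if __name__ == "__main__":
    # Dummy score that peaks at (0.25, 63), used only to exercise the sweep.
    print(best_configuration(lambda t, c: -abs(t - 0.25) - abs(c - 63)))
```

With 9 thresholds and 5 column counts the sweep covers 45 configurations per circuit.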

Table 3 shows the impact of the threshold on the test data volume (V), the scan-in power dissipation (P), and the test application time (T) when the number of RAS columns is 63. The "Threshold" row gives the threshold value that achieved the best compression ratio. The rows "No. of bits in TD" and "No. of bits in TE" give the total number of bits in TD and TE for each benchmark circuit, respectively. The row "No. of specific bits in TD" gives the number of specified bits in TD. APD, PPD, and TAT are the average scan-in power dissipation, the peak scan-in power dissipation, and the test application time. The "CSCD" rows are the experimental results of the proposed co-design scheme. The "SS" rows are the experimental results of the conventional single serial scan design.

From Table 3, it can be seen that the CSCD scheme achieved up to 81.20%, 99.85%, 99.80%, and 96.60% reduction in test data volume, average scan-in power dissipation, peak scan-in power dissipation, and test application time, respectively. Among the seven benchmark circuits, the compression ratio of s13207 was the highest because its


Table 3  Impact of the threshold on VPT.

Circuit                           s5378      s9234     s13207     s15850     s35932     s38417     s38584
Threshold                          0.25       0.25       0.25       0.25       0.20       0.25       0.20
No. of bits in TD                23,754     39,273    165,200     76,986     28,208    164,736    199,104
No. of specific bits in TD        6,505     10,601     11,313     12,657     18,251     52,582     35,287
No. of bits in TE                12,679     19,424     31,056     26,496     27,413     75,076     85,612
CSCD  APD                         2,129      2,969      3,113      3,492      6,596     11,212     12,919
CSCD  PPD                            80         96        253        316      1,062        717        800
CSCD  TAT                         2,855      4,203      5,648      4,668     15,183     17,217     17,395
SS    APD                       179,621    310,121    841,249    791,401  3,484,687  7,445,825  6,820,597
SS    PPD                         5,018      5,881     40,026     42,788    522,424    324,826    257,819
SS    TAT                        24,079     39,679    166,136     77,723     29,987    166,499    200,704
Compression ratio (%)             46.62      50.54      81.20      65.58       2.82      54.43      57.00
APD reduction (%)                 98.81      99.04      99.63      99.56      99.81      99.85      99.81
PPD reduction (%)                 98.41      98.37      99.37      99.26      99.80      99.77      99.69
TAT reduction (%)                 88.14      89.41      96.60      93.99      49.37      89.66      91.33

Table 4  Comparison of compression ratio (%) with other approaches.

Circuit   No. of bits in TD      SC   Golomb     FDR    EFDR    VIHC     ARL    RRAS   CSCD (Best)   CSCD (63)
s5378                23,754   24.19    40.70   48.19   51.93   51.52   50.77   57.92         48.15       46.62
s9234                39,273   35.52    43.34   44.88   45.89   54.84   44.96   56.22         51.23       50.54
s13207              165,200   71.13    74.78   78.67   81.85   83.21   80.23   82.26         82.11       81.20
s15850               76,986   40.16    47.11   52.87   67.99   60.68   65.83   60.00         66.80       65.58
s35932               28,208   65.12    98.51   10.19   80.21   66.47     N/A   25.27         12.44        2.82
s38417              164,736   31.11    44.12   54.53   60.57   54.51   60.55   58.61         55.49       54.43
s38584              199,104   31.12    47.71   52.85   62.91   56.97   61.13   65.28         57.46       57.00
Average                       46.96    56.61   48.88   64.49   61.17   60.58   57.95         53.54       51.17

Table 5  Comparison of scan-in power dissipation and test application time with other approaches.

            APD Reduction (%)           PPD Reduction (%)           TAT Reduction (%)
Circuit     ARL     RRAS    CSCD        ARL     RRAS    CSCD        ARL     RRAS    CSCD
s5378       78.02   99.04   98.81       31.06   91.42   98.41       48.27   46.53   88.14
s9234       76.20   99.14   99.04       28.02   74.43   98.37       70.99   42.18   89.41
s13207      92.68   99.19   99.63       36.66   89.20   99.37       20.24   79.02   96.60
s15850      85.27   99.63   99.56       82.25   79.11   99.26       41.12   51.98   93.99
s35932      N/A     99.49   99.81       N/A     99.04   99.80       N/A      5.19   49.37
s38417      81.25   99.998  99.85       40.82   95.18   99.77       51.17   48.50   89.66
s38584      82.52   99.84   99.81       16.25   39.17   99.69       42.15   51.20   91.33
Average     82.02   99.56   99.50       39.18   82.12   99.25       45.89   41.46   85.50


ratio of specific bits is the lowest: 11313/165200 = 0.0685. The lowest compression ratio was for s35932. The reasons were that its ratio of specific bits is the highest, 18251/28208 = 0.6470, and that the number of 0's and the number of 1's in TD were very close. We observed that the standard deviation of the 0-probability and the 1-probability is 0.1285, which is very small. Table 3 indicates that the CSCD scheme based on variable-to-fixed run-length coding is suitable for properly skewed test data.

Table 4 presents the comparison of the proposed CSCD scheme with other compression approaches including SC [4], Golomb [5], FDR [6], EFDR [8], ARL for minimizing WTM [7], and RRAS [23]. The "CSCD (Best)" column is the best compression ratio achieved over all combinations of the threshold and the number of RAS columns, and "CSCD (63)" is the optimal compression ratio when the number of RAS columns is 63. The bold entries indicate the best results. "N/A" in the row of "s35932" indicates that the experimental result for s35932 is not provided in [7].

The "Average" row indicates that EFDR achieves the smallest encoded test data set. However, it does not address the scan-in power dissipation and test application time issues. Since ARL and RRAS also address the problems of scan-in power dissipation and test application time, we compare the CSCD scheme with these techniques in Table 5. Due to physical design concerns such as place and route, we limited the number of RAS columns to 63. RRAS has no such restriction; the experimental results in [23] show that the number of its RAS columns ranges from 256 to 2048.

In Table 5, the APD, PPD, and TAT Reduction columns give the reductions in average scan-in power dissipation, peak scan-in power dissipation, and test application time, respectively. For example, the APD reduction is given by


Table 6  Hardware overhead of serial scan and CSCD.

S5 2 1 8 7 1!2 14 ・7 9 2 !6 8 2 ・5 2 9 ・9 7 9 0 !0 50 12 0 114 7 4 4 ・5 3 2 15 5 ・5 1

S9 2 2 4 1 15 !2 4 6 ・2 12 9 !6 8 8 ・8 2 1 ・10 1 5 5 !0 2 3 11 5 I2 5 8 1 2 ・0 5 2 19 5 ・4 2

S 12 2 0 7 2 0 2 16 14 ・3 2 5 1 I15 4 ・0 2 6 ・9 2 3 1 1 !9 1 3 38 8 11 10 4 ・26 2 2 9 5 ・2 2

S 15 8 5 0 2 18 10 18 ・9 2 6 9 12 6 8 ・8 2 2 I5 1 3 9 6 !1 0 5 38 9 !82 6 Z l・1 3 2 2 8 5 ・2 4

S2 5 9 2 2 5 0 5 I8 6 5 ・6 6 2 5 18 2 2 ・2 2 2 ・1 2 6 6 5 I4 8 1 1 11 80 !8 87 7 1 ・4 5 2 5 4 4 ・1 0

S2 8 4 17 5 5 4 12 0 4 ・8 6 1 1 !11 8 ・5 2 2 ・19 5 4 6 14 4 1 l !0 2 1 !1 86 8 8 ・0 8 2 5 14 ・1 6

S2 8 5 8 4 5 12 115 0 ・4 6 15 !1 1 2 ・2 2 0 ・0 0 1 15 1 9 !64 6 1 122 4 !9 0 0 1 2 2 ・4 6 2 4 6 4 ・8 6

A Ve rag e 2 2 ・9 1 2 9 ・0 4 2 2 5 5 ・0 9

$$\text{APD Reduction} = \left( 1 - \frac{\text{APD of CSCD}}{\text{APD of Serial Scan}} \right) \times 100\%$$
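The same form applies to the PPD and TAT reductions. A minimal Python sketch of the calculation follows; the function name is illustrative and the numbers in the usage line are made-up values, not results from the tables.

```python
def reduction_percent(proposed: float, baseline: float) -> float:
    """Reduction of `proposed` relative to `baseline`, in percent.

    Implements (1 - proposed / baseline) * 100 for any of the APD, PPD,
    or TAT figures; `baseline` is the serial scan value.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (1.0 - proposed / baseline) * 100.0


if __name__ == "__main__":
    # Hypothetical values: 25 units for CSCD against 100 for serial scan.
    print(reduction_percent(25.0, 100.0))
```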

The CSCD scheme significantly outperforms ARL in terms of test power dissipation and test application time, although it did not achieve the best compression ratio. ARL needs a test clock that is faster than that of the ATE; however, the high test frequency increases the power consumption of the chip-under-test, which explains why the peak power dissipation during scan operation is still high even though ARL is optimized to minimize the WTM measure. "N/A" in the row of "s35932" indicates that the experimental results for s35932 are not provided in [7].

RRAS and CSCD have almost the same effect on APD reduction, but the CSCD scheme achieves much better PPD and TAT reductions than RRAS. In CSCD, scan cells with high 0-probability are classified into RAS0 and scan cells with high 1-probability are classified into RAS1. Since RAS0 and RAS1 are initialized to 0 and 1 before every test stimulus vector is loaded, the number of scan cells whose states need to be flipped is small. As a result, the test application time is short. The CSCD scheme thus achieves good reductions for all VPT issues while preserving full fault coverage.

In terms of hardware overhead, we used Synopsys Design Compiler to synthesize the serial scan version of the ISCAS'89 benchmark circuits and the CSCD version corresponding to the optimal compression ratio when the number of RAS columns is 63. The area overhead and routing overhead were then obtained with Synopsys Astro.

In Table 6, the "Area Overhead (μm²)" and "Routing Overhead (μm)" columns show the area overhead and the routing overhead of the benchmark circuits designed with the serial scan chain and with CSCD, respectively. The two "Difference Ratio" columns give the ratio by which the area and routing overhead of the CSCD version exceed those of the serial scan version. The area overhead of the Decoder and Address Decoder parts of the CSCD architecture is given in the "D and AD" column. Note that the Decoder and Address Decoder parts do not include MUXc and MUXd. We can see that the area overhead of the Decoder and Address Decoder parts increases very slowly as the benchmark circuit scale increases. Because the RAS scan cell has one AND gate and one OR gate more than the conventional multiplexed DFF, the area overhead of the CSCD scheme is mainly determined by the number of scan cells in the circuit-under-test. As the circuit scale increases, the difference between CSCD and SS decreases. As to the routing overhead, the average difference ratio between CSCD and SS is 29.04%. Note that for s38584, the routing overhead of CSCD is much smaller than that of SS. This is because the scan cells are close to the address decoders in CSCD for s38584, so the wires between the address decoders and scan cells are short.

5. Conclusions

The compression/scan co-design approach provides an efficient solution for testing large integrated circuits and system-on-a-chip. The proposed CSCD scheme achieves the goal of simultaneously reducing test data volume, scan-in power dissipation, and test application time by exploiting the characteristics of both the variable-to-fixed run-length coding and the random access scan architectures. Combined with a heuristic algorithm for an optimal arrangement of scan cells, the proposed scheme on average reduced test data volume, average scan-in power dissipation, peak scan-in power dissipation, and test application time by 51.2%, 99.5%, 99.3%, and 85.5%, respectively, in our experiments on ISCAS'89 benchmark circuits.

As future work, we are planning to make more comparisons with other scan-based compression techniques and to reduce the hardware overhead.

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China (NSFC) under grant No. 90207002, and also by the National Basic Research Program of China (973 Program) under grant No. 2005CB321604. The authors would like to thank their colleague Ge Zhang for performing place and route experiments.

References

[1] P. Girard, "Survey of low-power testing of VLSI systems," IEEE Des. Test Comput., vol.19, no.3, pp.82-92, May-June 2002.

[2] X. Wen, Y. Yamashita, S. Kajihara, L.-T. Wang, K.K. Saluja, and K. Kinoshita, "On low-capture-power test generation for scan testing," Proc. VTS, pp.265-270, California, USA, May 2005.

[3] A. Jas and N.A. Touba, "Test vector compression via cyclical scan chains and its application to testing core-based designs," Proc. ITC, pp.458-464, Washington, USA, Oct. 1998.

[4] A. Jas, J. Ghosh-Dastidar, and N.A. Touba, "Scan vector compression/decompression using statistical coding," Proc. VTS, pp.114-120, San Diego, USA, April 1999.

[5] A. Chandra and K. Chakrabarty, "System-on-a-chip test data compression and decompression architectures based on Golomb codes," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol.20, no.3, pp.355-368, March 2001.

[6] A. Chandra and K. Chakrabarty, "Frequency-directed run-length (FDR) codes with application to system-on-a-chip test data compression," Proc. VTS, pp.42-47, Marina Del Rey, USA, May 2001.

[7] A. Chandra and K. Chakrabarty, "A unified approach to reduce SOC test data volume, scan power and testing time," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol.22, no.3, pp.352-363, March 2003.

[8] A. El-Maleh and R. Al-Abaji, "Extended frequency directed run-length codes with improved application to system-on-a-chip test data compression," Proc. Int. Conf. Elec. Cir. and Sys., pp.449-452, Greece, Sept. 2002.

[9] P. Gonciari, B. Al-Hashimi, and N. Nicolici, "Improving compression ratio, area overhead, and test application time for system-on-a-chip test data compression/decompression," Proc. DATE, pp.604-611, Paris, France, March 2002.

[10] Y. Han and X. Li, "Simultaneous reduction of test data volume and testing power for scan-based test," Proc. ICVLSI, pp.374-381, Las Vegas, USA, June 2004.

[11] A. Jas, B. Pouya, and N.A. Touba, "Virtual scan chains: A means for reducing scan length in cores," Proc. VTS, pp.73-78, Montreal, Canada, May 2000.

[12] I. Bayraktaroglu and A. Orailoglu, "Test volume and application time reduction through scan chain concealment," Proc. DAC, pp.151-155, Las Vegas, USA, June 2001.

[13] W. Rao, I. Bayraktaroglu, and A. Orailoglu, "Test application time and volume compression through seed overlapping," Proc. DAC, pp.732-737, Anaheim, USA, June 2003.

[14] I. Hamzaoglu and J. Patel, "Reducing test application time for full scan embedded cores," Proc. DFT, pp.260-267, Albuquerque, USA, Nov. 1999.

[15] K. Miyase and S. Kajihara, "Optimal scan tree construction with test vector modification for test compression," Proc. ATS, pp.136-141, Xian, China, Nov. 2003.

[16] H. Yotsuyanagi, T. Kuchii, S. Nishikawa, M. Hashizume, and K. Kinoshita, "Reducing scan shifts using folding scan trees," Proc. ATS, pp.6-11, Xian, China, Nov. 2003.

[17] Y. Bonhomme, T. Yoneda, H. Fujiwara, and P. Girard, "An efficient scan tree design for test time reduction," Proc. ETS, pp.321-326, Corsica, France, May 2004.

[18] D. Xiang and Y. Wu, "A cost-effective scan architecture for scan testing with non-scan test power and test application cost," Proc. DAC, pp.744-746, Anaheim, USA, June 2003.

[19] L.-T. Wang, X. Wen, H. Furukawa, F. Hsu, S. Lin, S. Tsai, K.S. Abdel-Hafez, and S. Wu, "VirtualScan: A new compressed scan technology for test cost reduction," Proc. ITC, pp.916-925, Charlotte, USA, Oct. 2004.

[20] O. Sinanoglu and A. Orailoglu, "A novel scan architecture for power-efficient, rapid test," Proc. ICCAD, pp.299-303, San Jose, USA, Nov. 2002.

[21] S. Samaranayake, N. Sitchinava, R. Kapur, M.B. Amin, and T.W. Williams, "Dynamic scan: Driving down the cost of test," Computer, vol.35, no.10, pp.63-68, Oct. 2002.

[22] B. Arslan and A. Orailoglu, "CircularScan: A scan architecture for test cost reduction," Proc. DATE, vol.2, pp.1290-1295, Paris, France, Feb. 2004.

[23] D.H. Baik, K.K. Saluja, and S. Kajihara, "Random access scan: A solution to test power, test data volume and test time," Proc. VLSID, pp.883-888, Mumbai, India, Jan. 2004.

[24] H. Ando, "Testing VLSI with random access scan," Proc. COMPCON, pp.50-52, San Francisco, USA, Feb. 1980.

[25] K. Chakrabarty, V. Iyengar, and A. Chandra, Test resource partitioning for system-on-a-chip, pp.147-148, Kluwer Academic Publishers, 2002.

[26] R. Sankaralingam, R.R. Oruganti, and N.A. Touba, "Static compaction techniques to control scan vector power dissipation," Proc. VTS, pp.35-40, Montreal, Canada, May 2000.

[27] J. Aerts and E.J. Marinissen, "Scan chain design for test time reduction in core-based ICs," Proc. ITC, pp.448-457, Washington, DC, USA, Oct. 1998.

[28] I. Hamzaoglu and J. Patel, "Test set compaction algorithms for combinational circuits," Proc. ICCAD, pp.283-289, San Jose, USA, Nov. 1998.

Yu Hu received her B.S., M.S. and Ph.D. degrees all in electrical engineering from the University of Electronic Science and Technology, Chengdu, China, in 1997, 1999 and 2003, respectively. She is currently an associate professor at the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China. Her research interests include DFT, test resource optimization and EDA tools development. She is a member of IEEE.

Yinhe Han received his B.Eng. from Nanjing University of Aeronautics and Astronautics (China) in 1997 and his Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include VLSI/SOC design and test. He received the Test Technology Technical Council Best Paper Award at ATS 2003. He is a student member of IEEE.

Xiaowei Li received his B.Eng. and M.Eng. degrees in computer science from Hefei University of Technology (China) in 1985 and 1988 respectively, and his Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences in 1991. Dr. Li joined Peking University as a Postdoctoral Research Associate in 1991, and was promoted to Associate Professor in 1993, all with the Department of Computer Science and Technology. From 1997 to 1998, he was a Visiting Research Fellow in the Department of Electrical and Electronic Engineering at the University of Hong Kong. In 1999 and 2000, he was a Visiting Professor in the Graduate School of Information Science, Nara Institute of Science and Technology, Japan. He joined the Institute of Computing Technology, Chinese Academy of Sciences as a professor in 2000. His research interests include VLSI/SoC design verification and test generation, design for testability, low-power design, and dependable computing. Dr. Li received the Natural Science Award from the Chinese Academy of Sciences in 1992 and the Certificate of Appreciation from the IEEE Computer Society in 2001. He is a senior member of IEEE. He is an area editor of the Journal of Computer Science and Technology and an associate editor-in-chief of the Journal of Computer-Aided Design & Computer Graphics (in Chinese).


Huawei Li received her B.S. in computer science from Xiangtan University (China) in 1996, and M.S. and Ph.D. degrees from the Institute of Computing Technology, Chinese Academy of Sciences in 1999 and 2001 respectively. She is now an associate professor at the Institute of Computing Technology, Chinese Academy of Sciences. Her research interests include VLSI/SoC design verification and test generation, delay test, and dependable computing. She is a member of IEEE.

Xiaoqing Wen received the B.E. degree from Tsinghua University, Beijing, China, in 1986, the M.E. degree from Hiroshima University, Hiroshima, Japan, in 1990, and the Ph.D. degree from Osaka University, Osaka, Japan, in 1993. From 1993 to 1997, he was a Lecturer at Akita University. He was a Visiting Researcher at the University of Wisconsin, Madison, U.S.A., from Oct. 1995 to March 1996. He joined SynTest Technologies, Inc., U.S.A., in 1998, and served as its CTO until 2003. Since 2004, he has been an Associate Professor at Kyushu Institute of Technology, Iizuka, Japan. His research interests include VLSI test, diagnosis, and testable design. He is a member of IEEE.