compression


CHAPTER 1

INTRODUCTION

1.1 PROLOGUE

The main concern of testing a digital system or device is to test it as thoroughly as possible with minimum power dissipation and test data volume. With the number of transistors doubling roughly every 24 months (Moore's law, 1965), we already have chips with billions of transistors, and this number continues to grow. Testing devices with this many transistors quickly and thoroughly is a very challenging task and requires proper strategies and systematic approaches for generating tests and for compressing the test vectors to reduce the test data volume. There is additional power dissipation during testing because switching activity is higher than in functional operation. This stems from a fundamental conflict between the aims of low-power design, where correlation between input patterns is increased, and traditional DFT methodologies, where correlation between test vectors is decreased in order to reduce test application time. A number of techniques to control power consumption in test mode are available; these can be broadly classified as structural, algorithmic and tester based.

The scan-based design is the most commonly used DFT methodology for the sequential part of an integrated circuit. It provides a structural approach to improve controllability and observability and thus renders ever more complex circuits testable. Scan flip-flops operate in two modes: shift and capture. In the shift mode, scan flip-flops are connected as a scan chain so that test patterns can be shifted in via the scan input and test responses are shifted out via the scan output. In the capture mode, the circuit output responses are captured into the scan chain. The power constraints defined for normal operation are usually much lower than the power consumed in the test mode, and it is likely that a CUT's power rating is violated in both the shift mode and the capture mode of scan tests. A significant amount of research has addressed this problem in the literature, and it can be broadly divided into two categories: DFT-based solutions and software-based solutions. Generally speaking, DFT-based solutions are more effective for test power reduction, since they introduce dedicated DFT hardware to suppress switching activity in the CUT. Software-based solutions, on the other hand, usually cannot achieve the same amount of test power reduction as DFT-based solutions, but they do not involve any DFT overhead and can be easily integrated into a conventional IC design flow. Techniques for test power reduction, including peak and average power, help to avoid unwanted test failures.

In this work, we focus on one of the most widely used software-based solutions for test power reduction, which tries to reduce the CUT's switching activity by intelligently filling the don't-care bits (i.e., X-bits) in the given test cubes; this is known as the X-filling technique. Of the two power concerns, peak power consumption is the more critical, since excessive heat dissipation may damage the circuit under test while IR-drop may reduce test yield. In order to perform a non-destructive test, the test power should meet the power constraints defined in the design phase. The peak test power results from capture power consumption, which can be addressed by low-capture-power (LCP) testing. After appropriate filling of the X-bits we obtain fully specified test vectors.

Test data volume is also a major problem encountered in testing SOCs. A large number of test patterns must be sent from the ATE to the embedded cores through the ATE channels. However, since the bandwidth of the ATE channels and the capacity of the ATE memory are limited, the cost of testing and the test application time increase considerably. To alleviate these problems, test compression techniques are usually adopted in today's SOC era.

1.2 NEED FOR TESTING

During test application in a scan-based circuit, power is dissipated in both the sequential scan elements and the combinational logic. While scan values are loaded into a scan chain, the effect of the scan ripple propagates to the combinational block, and redundant switching occurs in the combinational gates during the entire scan-in/out period. It has been observed that about 78% of the total test energy is dissipated in the combinational block alone. Hence, a low-power scan design should address techniques to reduce power dissipation in the combinational block. Reduction in test power is important to improve battery lifetime in portable electronic devices employing periodic self-test, to increase the reliability of testing, and to reduce test cost. In scan-based testing, a significant fraction of the total test power is dissipated in the combinational block. It has been reported that the power dissipated during testing can be about three times that of normal functional operation. Power dissipation during test mode can be significantly higher than during functional mode, since input vectors during functional mode are usually strongly correlated, compared to the statistically independent consecutive input vectors applied during testing. Zorian showed that the test power could be twice as high as the power consumed during the normal mode. Test power is an important design concern for increasing battery lifetime in handheld electronic devices that incorporate BIST circuitry for periodic self-test.

It is also important for improving test cost: increased peak power is likely to create noise problems in a chip by causing a drop in the supply voltage. Peak and average power reduction during test contributes to enhanced test reliability and improved yield. It is, therefore, important to ensure a reduction in power dissipation during the test mode. Reduced correlation between consecutive test vectors increases the switching activity and eventually the power dissipation in the circuit. A second reason for increased power dissipation during test is that test engineers may test cores in parallel to reduce the test application time. This extra power (average or peak) can cause problems such as instantaneous power surges that damage the circuit, difficulty in performance verification, and decreased overall product yield and increased cost. Low-power test application has therefore become important in today's VLSI design and test.

Problems in present testing:
1) Large test cost.
2) Large test data volume.
3) High test power dissipation due to more transitions.

1.3 DESIGN FOR TESTABILITY

Testability has become a critical concern for ASIC designers. Design for Test (DFT) techniques provide measures to comprehensively test the manufactured device for quality and coverage. The main DFT techniques available today are:
Scan insertion
Boundary scan insertion
Memory BIST insertion
PTAM insertion
Logic BIST insertion

DFT is a methodology that improves testability in terms of controllability and observability by adding test hardware and introducing specific test-oriented decisions during the VLSI design flow. This often results in shorter test application time, higher fault coverage (and hence better test efficiency), and easier ATPG. The most common DFT methodology is scan-based DFT, where sequential elements are modified into scan cells and connected into a serial shift register. This is done by giving each scan cell a scan mode in which data is not loaded in parallel from the combinational part of the circuit but is shifted in serially from the previous scan cell in the shift register.

The most common method for delivering test data from chip inputs to internal circuits under test (CUTs, for short), and observing their outputs, is called scan design. In scan design, registers (flip-flops or latches) in the design are connected into one or more scan chains, which are used to gain access to internal nodes of the chip. Test patterns are shifted in via the scan chain(s), functional clock signals are pulsed to test the circuit during the "capture cycle(s)", and the results are then shifted out to chip output pins and compared against the expected "good machine" results.

DFT is a methodology to ensure a design works correctly after manufacturing:
DFT tools add test circuitry (RTL or gate level) for design testability.
DFT tools generate test sets applied to manufactured designs to detect defects.
DFT-based diagnostics facilitate failure analysis.
Testability is a design attribute that measures how easy it is to create a program to comprehensively test a manufactured design's quality.

Traditionally, design and test processes were kept separate, with test considered only at the end of the design cycle. But in contemporary design flows, test merges with design much earlier in the process, creating what is called a design-for-test (DFT) process flow. Testable circuitry is both controllable and observable. In a testable design, setting specific values on the primary inputs results in values on the primary outputs which indicate whether or not the internal circuitry works properly.


1.4 SCAN CELL STRUCTURE

1.4.1 Definitions of Shift and Capture Power

Scan test power is composed of two components corresponding to the two modes of operation: shift power and capture power. It has been shown that the switching activity in the scan chains is very closely correlated with the switching activity in the circuit under test (CUT). Based on this observation, many techniques have been presented in the literature to reduce shift/capture power by reducing the switching activity in the scan chains. The shift operation sets up the stimulus in the scan cells by serially shifting in the test data bit by bit; the distribution of specified bits and the inserted DFT logic determine the toggle activity in the CUT. Techniques proposed for reducing shift power can be broadly classified as based on: (a) design modification, (b) test data modification, (c) scan chain modification, (d) clocking modification and (e) test set scheduling. Capture power reduction is a much more severe problem to solve than shift power reduction, because functional-mode input vectors are usually strongly correlated, whereas test input vectors and their responses are statistically independent. Capture power reduction has been studied in prior work, where the test stimulus is modified to reduce peak power. It is observed that more than 90% of the bits in test vectors are don't-cares. In multiple scan chains, if the care bits of each pattern are distributed over a minimum number of chains, the shift power is reduced, since only the few chains with care bits need to load test data. Under this condition there are many chains in which the values of all scan flip-flops are don't-cares; how such chains are filled affects the capture power and the n-detection of a pattern. This project focuses on the filling strategy of such chains.

1.4.2 Overview of the Test Process

During the test process, the chip is tested for manufacturing defects by analyzing the logic on the chip to detect logic faults. During testing, the test process puts the circuit into the following test modes:

Capture mode: The part of the test process that analyzes the combinational logic on the chip. The flip-flops act first as pseudo-primary inputs (using ATPG-generated test vector data), and then as pseudo-primary outputs (capturing the output of the combinational logic).


Scan-shift mode: The part of the test process in which flip-flops act as shift registers in a scan chain. Test vector data is shifted into the scan flip-flops and the captured data (from capture mode) is shifted out of the scan flip-flops.

System mode: The normal, or intended, operation of the chip. Any logic dedicated to DFT purposes is not active in system mode. Fig 1.1 below shows the equivalent circuit in system mode.

In system mode, the flip-flops act as intermediate storage devices, storing data between clock cycles.

Fig 1.1 Equivalent circuit in system mode

Fig 1.2 Equivalent circuit in scan mode

Fig 1.2 above shows the scan mode, where the flip-flops act as shift registers in a scan chain. The captured data is shifted out and the test vector data is shifted in.


The scan flip-flops are linked together using the scan data input of the mux to form a scan chain that functions like a shift register in scan-shift mode. During scan-shift mode, test data can be shifted through the scan chains: test data is shifted in through the Scan_in pins and shifted out through the Scan_out pins. This process, which is called MUXED-based scan, is carried out in the following steps (a toy model of the flow is sketched after the list):
1. Shift the first test vector into the scan chain.
2. Capture one cycle of system function.
   a. Apply the primary input vector states to the primary inputs.
   b. The loaded test vector (the shifted-in flip-flop states and the primary input states) supplies the input to the combinational logic. The system's response is propagated to the primary outputs and flip-flop inputs.
   c. Compare the actual and expected test results at the primary outputs.
   d. Clock the system's response to the test vector into the flip-flops.
3. Shift the test result out of the scan chain flip-flops and simultaneously shift in the new test vector.
4. Compare the actual and expected test results as each bit is shifted out of the scan chain.
5. Repeat steps 2, 3, and 4 until all test vectors have been run.
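The following C sketch is a minimal, hypothetical model of this shift/capture flow on a three-cell scan chain. The combinational function, chain length and vectors are made up for illustration only and are not part of the original design.

#include <stdio.h>
#include <string.h>

#define CHAIN_LEN 3

/* Toy combinational block: next state of each flip-flop is a simple
 * mix of the current scan-cell values (purely illustrative). */
static void combinational(const int *ff, int *next)
{
    next[0] = ff[1] ^ ff[2];
    next[1] = ff[0] & ff[2];
    next[2] = ff[0] ^ ff[1];
}

/* Shift a new pattern in, bit by bit, while the old contents fall out
 * of Scan_out (steps 1 and 3 of the procedure above). */
static void scan_shift(int *ff, const int *pattern, int *scanned_out)
{
    for (int cycle = 0; cycle < CHAIN_LEN; cycle++) {
        scanned_out[cycle] = ff[CHAIN_LEN - 1];      /* Scan_out     */
        for (int i = CHAIN_LEN - 1; i > 0; i--)      /* ripple along */
            ff[i] = ff[i - 1];
        ff[0] = pattern[cycle];                      /* Scan_in      */
    }
}

int main(void)
{
    int ff[CHAIN_LEN] = { 0, 0, 0 };
    int test_vec[CHAIN_LEN] = { 1, 0, 1 };
    int next_vec[CHAIN_LEN] = { 0, 1, 1 };
    int out[CHAIN_LEN], response[CHAIN_LEN];

    scan_shift(ff, test_vec, out);        /* 1. load first test vector     */
    combinational(ff, response);          /* 2. capture one system cycle   */
    memcpy(ff, response, sizeof response);
    scan_shift(ff, next_vec, out);        /* 3. unload response, load next */
    printf("captured response (scan-out order): %d%d%d\n",
           out[0], out[1], out[2]);
    return 0;
}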

Muxed-D Scan Cell :- This scan cell is composed of a D flip-flop and a multiplexer. The multiplexer uses an additional scan enable input SE to select between the data input DI and the scan input SI. In normal/capture mode, SE is set to 0. The value present at the data input DI is captured into the internal D flip-flop when a rising clock edge is applied. In shift mode, SE is set to 1. The scan input SI is used to shift in new data to the D flip-flop, while the content of the D flip-flop is being shifted out.


Fig 1.3 The original circuitry with normal flip-flops (combinational logic with ordinary D flip-flops driven by Data_in and Clock, before scan insertion)

Fig 1.4 The circuitry with the normal flip-flops converted to scan flip-flops (scan insertion adds the Scan_in, Scan_out and Shift_enable signals around the same combinational logic)

Several techniques have also been proposed to reduce or cancel the switching activity in the CUT during scan shifting. Various researchers have tried to find an input vector, called a control vector, such that when this vector is applied to the primary inputs of the CUT during scan shifting, the switching activity in the combinational part of the CUT is minimized. Fig. 1.5 shows the transitions in scan cells that cause shift and capture power consumption during scan tests. In this circuit, the first test vector 10001 is shifted into the scan chain in five clock cycles. After one capture cycle, the response vector 00110 is captured into the scan chain and scanned out while the next test vector 01100 is scanned in simultaneously. Each vector row in this figure represents the states of the scan cells in one test cycle, and the dashed lines highlight where transitions happen.


During the shift phase, transitions on the scan chain occur when adjacent bits in a test vector have different logic values, and the effect of each transition depends on the position at which it occurs, since it ripples through the remaining scan cells. In contrast, capture power is caused by transitions that occur when scan cells hold different values before and after the capture.

[Fig 1.5 lists the scan-cell contents cycle by cycle: over cycles 1-5 the test vector 10001 is shifted in through Scan_in, the combinational portion is then captured, and in the following cycles the response 00110 is shifted out through Scan_out while the next vector 01100 is shifted in.]

Fig 1.5 Shift and capture power during scan tests.

To understand the impact of shift time on test application time, consider the typical sequence involved in processing a single scan test pattern shown in Figure 1.5. All these require one clock period on the tester. The shift operations, however, take as many clock periods as required by the longest scan chain.


CHAPTER 2

LITERATURE SURVEY

2.1 TEST POWER

A high-density system such as an ASIC or SoC always demands a non-destructive test that satisfies all the power constraints defined during the design phase. On the other hand, the current testing philosophy leads to much higher power consumption during test than during functional mode. This section describes the reasons for and effects of such high power consumption. Most of the power dissipated in a CMOS circuit comes from the charging and discharging of capacitances during switching. In order to explain this power dissipation during test, let us consider a circuit composed of N nodes and a test sequence of length L applied to the circuit inputs. The average energy consumed at node i per switching event is (1/2)·Ci·Vdd², where Ci is the equivalent output capacitance at node i and Vdd is the power supply voltage. A good approximation of the energy consumed at node i in a time interval t is (1/2)·Ci·Si·Vdd², where Si is the average number of transitions during this interval (also called the switching activity factor at node i). Furthermore, nodes connected to more than one logic gate in the circuit have a higher output capacitance. Based on this fact, and as a first approximation, it can be stated that the output capacitance Ci is proportional to the fanout at node i, denoted Fi. Therefore, an estimate of the energy Ei consumed at node i during the time interval t is given by:

Ei = (1/2)·C0·Fi·Si·Vdd²

where C0 is the minimum output capacitance of the circuit. According to this expression, energy consumption at the logic level is a function of the fanout Fi and the switching activity factor Si. The fanout Fi is defined by the circuit topology, and the activity factor Si can be estimated by a logic simulator. The product Fi·Si is called the weighted switching activity (WSA) at node i and represents the only variable part of the energy consumed at node i during test application. According to the above formulation, the energy consumed in the circuit after application of a pair of successive input vectors (Vk-1, Vk) can be expressed as

E(Vk) = (1/2)·C0·Vdd² · Σi Fi·Si(k)

where i ranges over all the nodes of the circuit and Si(k) is the number of transitions provoked by Vk at node i.
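As a rough illustration of this energy model, the C sketch below evaluates E(Vk) for one vector pair from per-node fanouts and transition counts. The node data, C0 and Vdd values are hypothetical; in practice Fi would come from the netlist and Si(k) from a logic simulator.

#include <stdio.h>

/* Rough energy estimate for one vector pair (Vk-1, Vk):
 * E(Vk) = 0.5 * C0 * Vdd^2 * sum_i( F_i * S_i(k) )            */
double vector_pair_energy(const int *fanout, const int *transitions,
                          int nodes, double c0, double vdd)
{
    double wsa = 0.0;                     /* weighted switching activity */
    for (int i = 0; i < nodes; i++)
        wsa += (double)fanout[i] * transitions[i];
    return 0.5 * c0 * vdd * vdd * wsa;
}

int main(void)
{
    int fanout[]      = { 1, 3, 2, 4 };   /* F_i    (hypothetical values) */
    int transitions[] = { 1, 0, 2, 1 };   /* S_i(k) for this vector pair  */
    double e = vector_pair_energy(fanout, transitions, 4, 1e-15, 1.8);
    printf("estimated energy: %e J\n", e);
    return 0;
}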


Now, let us consider the complete test sequence of length L required to achieve the target fault coverage. The total energy consumed in the circuit after application of the complete test sequence is

Etotal = (1/2)·C0·Vdd² · Σk Σi Fi·Si(k)

where k ranges over all the vectors of the test sequence.

2.1.1 Reasons for High Power Consumption during Test

There are several reasons for this increased test power. The main ones are as follows.
(i) Test efficiency has been shown to correlate strongly with the toggle rate; hence, in test mode, the switching activity of all nodes is often several times higher than during normal operation.
(ii) In an SoC, parallel testing is frequently employed to reduce the test application time, which may result in excessive energy and power dissipation.
(iii) The design-for-testability circuitry embedded in a circuit to reduce test complexity is often idle during normal operation but may be used intensively in test mode.
(iv) Successive functional input vectors applied to a given circuit during system mode have significant correlation, while the correlation between consecutive test patterns can be very low. This can cause significantly larger switching activity, and hence power dissipation, during test than during normal operation.

2.1.2 Effects of High Power Dissipation

The most adverse effect of very high power dissipation during test is the destruction of the IC itself. In addition, even when the IC is not destroyed, power dissipation during test can affect cost, reliability, autonomy, performance verification, and yield. Some of the effects are as follows.
(i) The growing need for at-speed testing can be constrained by high power dissipation: stuck-at faults can still be tested without much effect, but testing delay faults becomes difficult.
(ii) During functional testing of the die just after wafer etching, the unpackaged bare die has very little provision for power or heat dissipation. This can be a problem for applications based on multichip-module technology, for example, in which designers cannot realize the potential advantages in circuit density and performance without access to fully tested bare dies.

(iii) Circuits can fail because of erosion of conductors caused by electromigration.
(iv) Online BIST in battery-operated remote and portable systems consumes considerable power purely for testing. Such remote systems operate mostly in standby mode with almost no power consumption, interrupted by periodic self-tests; hence, power savings during test mode directly prolong battery lifetime.
(v) Elevated temperature and excessive current density severely decrease circuit reliability.
(vi) Because of excessive power dissipation, the simple low-cost plastic packages used in consumer electronics cannot be used, and expensive packages that can remove the excess heat must be used instead.

Based on the above discussion, let us first list the characteristics of a power reduction technique suitable for an IP core-based SoC, and then compare each of the available techniques with this ideal model.

Characteristics of a power reduction scheme suitable for an IP core-based SoC:
(i) It should not demand knowledge of the internal structure of the design.
(ii) It should not make any modification to the internal design.
(iii) It can, however, add hardware as required, without modifying the available I/O pin configuration.
(iv) It should deal with a ready-made test sequence rather than the test architecture.
(v) It should not depend on testing tools such as ATPG or fault simulation, which deal with the netlist of the design.

Now, comparing the available techniques with the above characteristics:

Modification of the LFSR: The implementation of this method
(i) deals with the test sequence rather than the test architecture,
(ii) requires knowledge of the internal details of the design,
(iii) requires additional hardware to modify the test pattern sequence, and
(iv) requires modification of the internal structure.


Partitioning the Circuit: The implementation of this method
(i) deals with the test architecture rather than the test sequence,
(ii) requires a well-defined internal hierarchical structure of the design,
(iii) requires knowledge of the internal details of the design,
(iv) requires additional hardware to modify the test pattern sequence, and
(v) requires modification of the internal structure.

Separate Testing Strategy for Memory: The implementation of this method
(i) can be applied for memory when and where it is required, and
(ii) is not applicable to functional blocks.

Improved ATPG Algorithms: The implementation of this method
(i) deals with the generation of a new test set rather than the available test sequence or test architecture, and
(ii) requires the netlist of the design, so it cannot be applied directly to hard cores.

Ordering Techniques: The implementation of this method
(i) deals with the test sequence rather than the test architecture,
(ii) requires a well-defined test sequence, that is, a test data set,
(iii) does not require knowledge of the internal details of the design,
(iv) requires additional hardware to reorder the test pattern sequence, and
(v) does not require modification of the internal structure.

Exploring Don't-Care Bits: The implementation of this method
(i) deals with the test data bit sequence rather than the test vector sequence or test architecture,
(ii) requires a well-defined test sequence, that is, a test data set,
(iii) does not require knowledge of the internal details of the design,
(iv) does not require any additional hardware, and
(v) does not require modification of the internal structure.


Of the above-mentioned categories, all methods except the ordering techniques and the exploration of don't-care bits require the internal details of the design under test. Hence, in the context of an IP core-based SoC, only these two categories are suitable. To reduce cost and time to market, the modular design approach is widely adopted for SoCs. The structure of such predesigned, ready-to-use intellectual property (IP) cores is often hidden from the system integrator, so testing such cores is even more daunting, and power reduction during their test puts many constraints on current low-power testing methodology. To develop the right testing strategy for such an SoC, it is necessary to survey the available low-power testing approaches and find the most suitable one.

2.1.3 Methods to Handle Test Power

Many approaches have been proposed for test power reduction. The Minimum Transition Fill (MT-fill) technique [4] achieves low test power by minimizing the transitions within a scan chain. Another method [29] considers the successive test pattern when filling the X bits in a test pattern; this approach can also reduce test power. Reordering techniques [7],[12] reduce the test power by minimizing the transition count among the test data. Multiple-scan-chain techniques reduce test power or test time [6]-[9],[15]. There is an architecture [8] that can shut down or enable a partial range of the sub scan chains, which reduces both peak power and total power consumption. Others [9],[18] use decoders to decode pre-encoded test data, providing a reduction in power and test data volume. There are also techniques that manipulate the embedded deterministic testing (EDT) structure [25],[26] and its patterns to achieve low-power test.

MT-fill technique

Our main concern is to reduce the number of transitions that take place inside the IC when the input test vectors are applied; a vector such as 1010101 has a large number of transitions, which increases the power dissipation and produces spikes inside the IC that can ultimately damage it. Hence, to minimize the power dissipation we use the MT-fill technique, in which the input vectors are filled in such a way that both the computation time and the power dissipation are reduced. A static compaction procedure compacts a set of test cubes (i.e., test vectors where the unspecified values have been left as Xs) in a way that minimizes either average power or peak power.


Static compaction is performed to reduce the number of test cubes; it involves trying to merge two compatible test cubes into one. When power is a consideration, it is generally better to use a minimum transition fill (MT-fill). MT-fill involves filling strings of Xs with the same value to minimize the number of transitions. The power dissipated when applying two vectors can be compared by counting the number of weighted transitions in the vector [4].
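The sketch below is a minimal C illustration of these two ideas: a simple left-to-right X-fill that extends the nearest specified value (one common way to realise MT-fill, not necessarily the exact procedure of [4]) and a weighted transition count in which transitions closer to the scan input are weighted more heavily. The example cube is made up.

#include <stdio.h>
#include <string.h>

/* MT-fill style filling: replace every run of don't-care bits ('X') with
 * the value of the nearest specified bit on its left (or, for a leading
 * run, the first specified bit), so no extra transition is created. */
static void mt_fill(char *cube)
{
    size_t n = strlen(cube);
    char last = '0';                       /* default if cube is all X */
    for (size_t i = 0; i < n; i++)
        if (cube[i] != 'X') { last = cube[i]; break; }
    for (size_t i = 0; i < n; i++) {
        if (cube[i] == 'X') cube[i] = last;
        else                last    = cube[i];
    }
}

/* Weighted transition count used to compare the scan-in power of two
 * vectors: each transition is weighted by its position in the chain,
 * since bits shifted in earlier ripple through more scan cells. */
static int weighted_transitions(const char *v)
{
    size_t n = strlen(v);
    int wtc = 0;
    for (size_t i = 0; i + 1 < n; i++)
        if (v[i] != v[i + 1])
            wtc += (int)(n - 1 - i);
    return wtc;
}

int main(void)
{
    char cube[] = "1XX0XXX1X";             /* hypothetical test cube */
    mt_fill(cube);
    printf("filled cube: %s  WTC: %d\n", cube, weighted_transitions(cube));
    return 0;
}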

2.2 TEST DATA VOLUME

Test data volume is the total amount of data used for testing a CUT. Test compression [30] involves compressing the amount of test data (both stimulus and response) that must be stored on the ATE for testing with a deterministic ATPG-generated test set. This is done by adding some on-chip hardware before the scan chains to decompress the test stimulus coming from the ATE, and after the scan chains to compress the response going back to the ATE. Test compression can provide a 10x or even 100x reduction in the amount of test data stored on the ATE. This greatly reduces ATE memory requirements and, even more importantly, reduces test time, because less data has to be transferred across the limited bandwidth between the ATE and the chip. Moreover, test compression methodologies are easy to adopt in industry because they are compatible with the conventional design rules and test generation flows used for scan testing. The ATE has limited speed, memory, and I/O channels. The test data bandwidth between the tester and the chip, illustrated in Fig 2.1, is relatively low and is generally the bottleneck for how fast a chip can be tested. The chip cannot be tested any faster than the time required to transfer the test data:

Test time >= (Amount of test data) / (Test data bandwidth)

The idea in test compression is to compress the amount of test data (both stimulus and response) that is stored on the tester. This provides two advantages. The first is that it reduces the amount of tester memory that is required.


[Fig 2.1: the tester sends test data to the chip over a link with Test Data Bandwidth = (#Channels x Clock Rate)]

Fig.2.1: Block diagram illustrating test data bandwidth

The second and more important advantage is that it reduces test time, because less test data has to be transferred across the low-bandwidth link between the tester and the chip. Test compression is achieved by adding some on-chip hardware before the scan chains to decompress the test stimulus coming from the tester, and after the scan chains to compact the response going back to the tester. This is illustrated in Fig 2.2. The extra on-chip hardware allows the test data to be stored on the tester in compressed form. Test data is inherently highly compressible: test vectors have many unspecified bits that are not assigned values during ATPG (i.e., they are don't-cares that can be filled with any value with no impact on the fault coverage). In fact, typically only 1% to 5% of the bits have specified (care) values, and even the specified values tend to be highly correlated, because faults are structurally related in the circuit.

[Fig 2.2: a low-cost ATE sends the compressed stimulus to an on-chip decompressor, which feeds the stimulus to the scan-based circuit (CUT); the response passes through an on-chip compactor and the compacted response is returned to the ATE]

Fig.2.2: Architecture for test compression

Consequently, lossless compression techniques can be used to significantly reduce the amount of test stimulus data that must be stored on the tester. The on-chip decompressor expands the compressed test stimulus back into the original test vectors (matching in all the care bits) as they are shifted into the scan chains.

Output response is even more compressible than test stimulus because lossy compression (also known as compaction) can be used. Output response compaction converts long output response sequences into short signatures. Because the compaction is lossy, some fault coverage can be lost due to aliasing, when a faulty output response signature is identical to the fault-free output response signature; however, with proper design of the compaction circuitry, the probability of aliasing can be kept negligibly small. A more challenging issue for output response compaction is dealing with unknown (non-deterministic) values (commonly referred to as Xs) that may appear in the output sequence, as they can corrupt the compacted signatures. This can be addressed by X-blocking or X-bounding, where the design is modified to eliminate any sources of Xs in the output, as is done in BIST; however, this adds design complexity. Alternatively, X-masking can be used to selectively mask off the Xs in the output sequence, or an X-tolerant compaction technique can be used. If it is still impossible to prevent all Xs from reaching the compactor, then an X-impact technique that uses ATPG assignments to avoid propagating Xs can be used. A test cube is defined as a deterministic test vector in which the bits that are not assigned values by the ATPG procedure are left as don't-cares (Xs). Normally, ATPG procedures perform random fill, in which all the Xs in the test cubes are filled randomly with 1s and 0s to create fully specified test vectors; however, for test stimulus compression, random fill is not performed during ATPG, so the resulting test set consists of incompletely specified test cubes. The Xs make the test cubes much easier to compress than fully specified test vectors. As mentioned earlier, test stimulus compression should be an information-lossless procedure with respect to the specified (care) bits in order to preserve the fault coverage of the original test cubes. After decompression, the resulting test patterns shifted into the scan chains should match the original test cubes in all the specified (care) bits. Many schemes for compressing test cubes have been proposed. They can be broadly classified into the three categories below; these schemes are described in detail in the following subsections.
Code-based schemes use data compression codes to encode test cubes.
Linear-decompression-based schemes decompress the data using only linear operations (that is, LFSRs and XOR networks).
Broadcast-scan-based schemes rely on broadcasting the same values to multiple scan chains.


One approach to test compression is to use data compression codes to encode the test cubes; this approach is exploited in this project. Data compression codes partition the original data into symbols, and then each symbol is replaced with a codeword to form the compressed data. Decompression is performed by a decoder that simply converts each codeword back into the corresponding symbol. Data compression codes can be classified into four categories, depending on whether the symbols have a fixed size (i.e., each symbol contains exactly n bits) or a variable size (i.e., different symbols have different numbers of bits), and whether the codewords have a fixed or variable size. The categories, each with an example, are:
Fixed-to-fixed - dictionary code
Fixed-to-variable - Huffman code
Variable-to-fixed - run-length code
Variable-to-variable - Golomb code
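As a small illustration of the variable-to-variable category, the C sketch below Golomb-codes runs of 0s from an already 0-filled test-data stream. The group size, the prefix/tail convention and the run lengths are assumptions chosen for illustration; exact parameterizations differ between published schemes.

#include <stdio.h>

/* Golomb-code one run of 0s (terminated by a 1) with group size m,
 * a power of two (here m = 4): a unary prefix of q = len/m ones
 * followed by a 0, then the remainder r = len%m in log2(m) bits. */
static void golomb_encode(unsigned run_len, unsigned m, char *out)
{
    unsigned q = run_len / m, r = run_len % m, bits = 0, p = 0;
    while ((1u << bits) < m) bits++;               /* log2(m)          */
    while (q--)  out[p++] = '1';                   /* unary prefix     */
    out[p++] = '0';
    for (int i = (int)bits - 1; i >= 0; i--)       /* binary remainder */
        out[p++] = ((r >> i) & 1u) ? '1' : '0';
    out[p] = '\0';
}

int main(void)
{
    /* Hypothetical 0-filled test-data stream: runs of 0s ending in 1. */
    unsigned runs[] = { 7, 0, 12, 3 };
    char code[64];
    for (unsigned i = 0; i < sizeof runs / sizeof runs[0]; i++) {
        golomb_encode(runs[i], 4, code);
        printf("run of %2u zeros -> %s\n", runs[i], code);
    }
    return 0;
}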

2.3.1 Methods to Handle Test Data Volume

Test data volume can be reduced by compression techniques. This is another way to solve the test data problem, and it sometimes also provides low-power testing. Because test volume reduction is necessary, various techniques have been developed. Statistical and Huffman encoding have been used for test volume and test application time reduction [5],[14]; in [5], the test data is divided into 4-bit blocks and encoded into Huffman codes by calculating the pattern frequencies. Run-length encoding and Huffman encoding techniques are also combined to compress data.

The basis of run-length encoding is to encode a long data string by counting the number of repeated characters or patterns. The nine-code encoding technique [13] is applied to compress test data; it uses 9 code words to encode the data. Some methods use structure- and memory-based schemes to encode test data. The reconfigurable switch structure of [22] applies fewer bits to a reconfigurable switch block, and there are inverter/interconnect-based decompression networks to decode the test data. The broadcast approach in [20],[26] uses a broadcaster to distribute a few control bits and generate a large number of bits for the internal scan chains. The Multi-layer Data Copy (MDC) scheme duplicates the test data inside the chip. Others [19],[27] use a dictionary-based method with memory to encode test data.


For unified solutions, [11],[13] use finite state machine (FSM) mechanisms and various kinds of data compression techniques to reduce scan power, test data volume, and test time. A modified run-length code combined with resource partitioning achieves low power and a small test data volume, and [11] also combines Golomb coding to deal with the power and volume issues.

LFSR reseeding approach

This is a low-power test data compression scheme. A new encoding scheme is introduced which reduces the test power and even the test storage. This encoding scheme classifies blocks into transition, non-transition and don't-care blocks depending on the test data present. Some of the transition blocks are converted into non-transition blocks by using the don't-care bits present in the test cubes. Depending on the hold cube generated by the encoding scheme, the test cubes are reordered so that compatible test cubes occupy successive positions, which further reduces the test storage.

Selective pattern compression scheme

Selective pattern compression has the following stages for test power and test data reduction.

(i) Pattern Selection Stage. This stage separates the test patterns into two groups by the X-bit omit ratio. If a test pattern's X-bit ratio is smaller than the given X-bit omit ratio, the pattern belongs to the NSC group; otherwise it belongs to CSC. The CSC data set is selected for the next stage because it contains more X bits, which are easier to compress.

[Fig 2.3: example patterns such as 01X00X, 0XX0XX, X00XXX, X1XX0X, 01100X and 01X010 are split into the NSC group (fewer X bits) and the CSC group (more X bits)]

Fig.2.3. Pattern selection stage.
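A minimal C sketch of this selection step is shown below. The pattern strings are taken loosely from Fig 2.3, and the omit-ratio value of 0.6 is only an assumed setting (the experiments in Chapter 4 also fix it at 0.6).

#include <stdio.h>
#include <string.h>

/* Split test patterns into NSC and CSC by their X-bit ratio: patterns
 * whose fraction of X bits is below the omit ratio go to NSC (kept and
 * X-filled later); the rest, easier to compress, go to CSC. */
static double x_ratio(const char *p)
{
    size_t n = strlen(p), x = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] == 'X') x++;
    return n ? (double)x / (double)n : 0.0;
}

int main(void)
{
    const char *patterns[] = { "01X00X", "0XX0XX", "X00XXX",
                               "X1XX0X", "01100X", "01X010" };
    const double omit_ratio = 0.6;        /* X-bit omit ratio (assumed) */

    for (unsigned i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        const char *grp = x_ratio(patterns[i]) < omit_ratio ? "NSC" : "CSC";
        printf("%s -> %s\n", patterns[i], grp);
    }
    return 0;
}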

(ii) Pattern Compression Stage. Once the test patterns for CSC are determined, these patterns are used in the compression stage, in which we attempt to shrink the coding size as much as possible. We use 4 bits as the fixed output size to introduce the process: if the original test data length is 21 bits, we divide it into six segments.

The compression procedure compresses the test patterns from the first segment to the sixth segment. The next step merges compatible patterns within each segment; for instance, this step will merge the patterns XX11 and 1XX1 into the pattern 1X11. Each compressed pattern is given a number as an index (a sketch of this merging step follows below).
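The following C sketch shows a simple greedy version of this per-segment merging, in which each pattern is folded into the first compatible representative already kept. The segment contents are hypothetical apart from the XX11/1XX1 pair quoted above, and the exact merging order used by the original tool may differ.

#include <stdio.h>
#include <string.h>

#define SEG 4   /* fixed output width of the VIFO decoder (4-bit segment) */

/* Two segment patterns are compatible when every bit position is equal
 * or at least one of them is a don't-care.  Merging keeps the specified
 * bit wherever one exists, e.g. XX11 and 1XX1 merge into 1X11. */
static int compatible(const char *a, const char *b)
{
    for (int i = 0; i < SEG; i++)
        if (a[i] != 'X' && b[i] != 'X' && a[i] != b[i])
            return 0;
    return 1;
}

static void merge(char *a, const char *b)      /* merge b into a */
{
    for (int i = 0; i < SEG; i++)
        if (a[i] == 'X') a[i] = b[i];
}

int main(void)
{
    /* One 4-bit column of the CSC set (hypothetical values). */
    char seg[][SEG + 1] = { "XX11", "1XX1", "X000", "0X00", "111X" };
    int n = sizeof seg / sizeof seg[0];

    /* Greedy pass: fold each pattern into the first compatible earlier one. */
    int kept[8], nkept = 0;
    for (int i = 0; i < n; i++) {
        int j;
        for (j = 0; j < nkept; j++)
            if (compatible(seg[kept[j]], seg[i])) {
                merge(seg[kept[j]], seg[i]);
                break;
            }
        if (j == nkept) kept[nkept++] = i;
    }
    for (int j = 0; j < nkept; j++)            /* each survivor gets an index */
        printf("index %d: %s\n", j, seg[kept[j]]);
    return 0;
}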

(iii) Power Optimization Stage

After the pattern compression stage, this stage applies the power optimization method to minimize the shift-in power in CSC. We apply a greedy search to find a lower-power coding, with the bit shifted in before the first bit of each pattern set to 0. This stage maps the new encoded data to the compressed pattern indices. With 3-bit codes there are 8 possible codewords, so the number of permutations of encodings is 8! = 40320. The optimization method tries the permutations from the first segment to the last segment and selects the encoding with the fewest switching transitions; this result becomes the new encoded data.
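A brute-force version of this search is small enough to sketch in C: it tries every permutation of the eight 3-bit codewords over a hypothetical index stream, assumes a leading 0 before the first codeword, and keeps the assignment with the fewest bit transitions. The text above mentions both a greedy search and trying all permutations; this sketch takes the exhaustive route, which is still cheap for 8! assignments.

#include <stdio.h>
#include <limits.h>

#define NCODES 8
#define NBITS  3

static const int stream[] = { 0, 1, 2, 2, 0, 3, 1, 0, 4, 2, 1 };  /* hypothetical */
#define NSTREAM (int)(sizeof stream / sizeof stream[0])

static int best_cost = INT_MAX;
static int best_map[NCODES];

/* Count bit transitions in the serialised codeword stream, assuming
 * the bit shifted in before the first codeword is 0. */
static int cost(const int *map)
{
    int prev = 0, c = 0;
    for (int k = 0; k < NSTREAM; k++)
        for (int b = NBITS - 1; b >= 0; b--) {
            int bit = (map[stream[k]] >> b) & 1;
            if (bit != prev) c++;
            prev = bit;
        }
    return c;
}

/* Enumerate all assignments of the 8 codewords to the 8 indices. */
static void permute(int *map, int used, int pos)
{
    if (pos == NCODES) {
        int c = cost(map);
        if (c < best_cost) {
            best_cost = c;
            for (int i = 0; i < NCODES; i++) best_map[i] = map[i];
        }
        return;
    }
    for (int cw = 0; cw < NCODES; cw++)
        if (!(used & (1 << cw))) {
            map[pos] = cw;
            permute(map, used | (1 << cw), pos + 1);
        }
}

int main(void)
{
    int map[NCODES];
    permute(map, 0, 0);
    printf("minimum transitions: %d\n", best_cost);
    for (int i = 0; i < NCODES; i++)
        printf("index %d -> codeword %d%d%d\n", i,
               (best_map[i] >> 2) & 1, (best_map[i] >> 1) & 1, best_map[i] & 1);
    return 0;
}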


CHAPTER 3

METHODOLOGY

3.1 METHODOLOGY OF THE PROJECT

In order to deal with the shift-in power and test data volume problems, decoders with fixed (F) or variable (V) input (I) and output (O) are defined here. We have four choices:
FIFO
FIVO
VIFO
VIVO

However, the FIFO scheme has less flexibility than VIFO on the input side, which may cause unnecessary input bits on the decoders. Since FIFO would reduce the compression efficiency, we do not implement the FIFO scheme. The other scheme we do not implement is VIVO, since it needs additional constraints to keep the implementation complexity manageable. We implement the FIVO scheme, which is a pseudo-VIVO scheme: we define a maximum fan-out as a constraint in the FIVO scheme, and if the number of output bits would be larger than the defined maximum fan-out, we reduce the number of input bits, so the input size becomes variable. After this preliminary analysis, we choose the fixed-output (VIFO) and fixed-input (FIVO) schemes for implementation in this work. Our methodology consists of three steps:

1. Define the solution scheme.
2. Deal with the complex and simple patterns separately (CSC and NSC).
3. Adjust the final solution.

With this methodology, the two schemes present different behaviours: the compressed scan units of the VIFO scheme have a fixed number of output bits but a variable number of input bits, while the compressed scan units of the FIVO scheme have a fixed number of input bits but a variable number of output bits. Here we use the adjacency-fill technique to reduce the shift power, and the VIFO and FIVO methods to reduce the test data volume. The methodology is as follows:

1. Extract the required patterns and store them in a file.
2. Set the X-bit omit ratio and separate the patterns in the file into NSC and CSC.
3. The X-bits in the test patterns of NSC are filled according to the adjacency-fill technique; this reduces the shift-in power of those patterns.
4. The compression technique is performed on the test patterns in CSC.

Variable Input Fixed Output (VIFO) method: The CSC file is partitioned such that each partition has 4 columns. The task is to iteratively compare and merge the compatible rows in each partition. The X-bits in the resulting patterns are filled by adjacency filling; this is an improvement over the existing technique, where the X bits were filled with 0s. An index number is assigned to each merged pattern obtained in a partition, and these index numbers are then assigned back to the original patterns in CSC. The index numbers are encoded such that the encoding gives a minimum number of transitions; the encoding should be based on a greedy search so that it results in fewer transitions. Here, however, binary coding was used, which gives the same test-volume efficiency but not the same power efficiency. Even though the power is not considerably reduced, the time complexity and the decoder complexity are significantly reduced.

Fixed Input Variable Output (FIVO) method: This section introduces the FIVO scheme optimization methodology, which is similar to the previous scheme. Here we use a fixed number of encodings, say 8, with n = 3 (2^3). We combine columns, say from m = 1 to m = x, such that merging the x columns results in at most 8 codes, and repeat this process until the last column is reached. The X-bits in the merged data are filled with 0s. The three stages are pattern selection, pattern compression, and power optimization. Pattern selection is the same as in the previous scheme; the techniques in pattern compression and power optimization are different, and we focus on these two stages in this section. All of the test patterns in the same partition go through these three stages. Test patterns for CSC are compressed by the pattern compression stage, which contains the following steps:
Merging
Extending
Maximizing

We use n = 3 to illustrate the procedure of this stage. The compression procedure starts compressing the test patterns in 3-bit segments from the first bit of the CSC test pattern set.

Each 3-bit decoder provides 8 different codes. Each code represents one compressed pattern merged from the original test data. For instance, this step will merge the patterns X11 and XX1 into the pattern X11; moreover, XX11, X111 and 0111 will be merged into the code 0111 in the same segment. If the total number of compression results is smaller than 8, the segment is extended to 4 bits. When the total number of compression results is at its maximum but still smaller than or equal to 8, the results are encoded into 3 bits in CSC. Next, the procedure starts to encode another 3 bits. At the end, a 2-bit or 1-bit decoder may remain: if the number of compression results is 3 or 4, the results are encoded into 2 bits, and if the compression results number 1 or 2, they are encoded into 1 bit. We also limit the maximum number of outputs of one decoder to 256; if the output count of a decoder reaches 256, we finish processing this segment, and the input width of this segment's encoder may be fewer than 3 bits. Finally, we can also use a 4-bit or 5-bit decoder that provides 16 or 32 different codes. In order to minimize the shift-in power with n-bit based encoding, the search method finds the most frequently occurring pairs and assigns the code words so that the number of transitions is minimized. This technique reduces the time complexity and the decoder complexity compared to the existing technique. A sketch of the segment-extension step follows.
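The C sketch below illustrates the segment-extension idea: starting from a column, the window is widened while a greedy merge of the patterns still needs no more than 2^n codes. The pattern set is a shortened version of the illustration in Section 3.5.2, the greedy merge order is an assumption, and the resulting counts may therefore differ slightly from the hand-worked example.

#include <stdio.h>
#include <string.h>

#define MAXPAT 16
#define MAXLEN 32

/* Greedy count of how many merged codes the column window [start, start+w)
 * needs: each pattern is folded into the first compatible representative,
 * exactly as in the per-segment merging sketch earlier. */
static int codes_needed(char pat[][MAXLEN + 1], int npat, int start, int w,
                        char rep[][MAXLEN + 1])
{
    int nrep = 0;
    for (int p = 0; p < npat; p++) {
        int r;
        for (r = 0; r < nrep; r++) {
            int ok = 1;
            for (int c = 0; c < w; c++) {
                char a = rep[r][c], b = pat[p][start + c];
                if (a != 'X' && b != 'X' && a != b) { ok = 0; break; }
            }
            if (ok) {                      /* merge into representative r */
                for (int c = 0; c < w; c++)
                    if (rep[r][c] == 'X') rep[r][c] = pat[p][start + c];
                break;
            }
        }
        if (r == nrep) {                   /* start a new representative  */
            memcpy(rep[nrep], pat[p] + start, (size_t)w);
            rep[nrep][w] = '\0';
            nrep++;
        }
    }
    return nrep;
}

int main(void)
{
    char pat[][MAXLEN + 1] = {
        "X0001XXXX0001X", "X001101100X001", "X11100XXX10010", "XX1X11100110X1",
        "XXX11010111000", "XXXXXXX1101010", "XXXXXXX1X00010", "XXXXXXXXX00100",
        "XXXXXXXXX00101", "XXXXXXXXX10111", "XXXXXXXXX11101", "XXXXXXXXXXXX0X",
        "XXXXXXXXXXXXXX"
    };
    char rep[MAXPAT][MAXLEN + 1];
    const int npat = 13, len = 14, n = 3;  /* n-bit decoder -> 2^n codes */

    int start = 0;
    while (start < len) {
        int w = 1;
        /* widen the segment while the merged codes still fit in 2^n */
        while (start + w < len &&
               codes_needed(pat, npat, start, w + 1, rep) <= (1 << n))
            w++;
        int used = codes_needed(pat, npat, start, w, rep);
        printf("columns %2d..%2d -> %d merged codes (encoded in %d bits)\n",
               start, start + w - 1, used, n);
        start += w;
    }
    return 0;
}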


3.2 FLOW OF THE PROJECT

The whole project is divided into different sections, as shown in the flow chart below.

[Flow chart: pattern extraction, checking the X-bit ratio, calculating and increasing the compression length, and moving to the next column until the data reaches the last bit]

Fig.3.2: Flow chart of pattern compression


3.3 ALGORITHM OF THE PROJECT

main();
  print_usage();
  process();
    extract();                  // extract the patterns that are required
    for i = 0; i < MAX_LINES; i = i + 1
      allocate memory for each pattern
      stripnewlinechar();
      xcount();
      if X-count < X-bit-omit ratio then put the pattern in NSC, else in CSC
    nsc_fill_adj();             // adjacency fill is performed and the transition
                                // count is determined for patterns in NSC
    // the following are performed on the CSC test cubes
    if optimization == 0 then goto merge(); else goto merge_fivo();
    merge();                    // partition_count = pattern_length / MERGE_COL;
                                // iteratively merge each partition
    merge_fivo();               // iteratively partition and merge until eight
                                // encoded patterns are reached
    fillpatterns();             // X -> 0 in the patterns
    codepatterns();
    getpartitioncode();         // returns the generated code
    comparepatterns();          // compares the merged data with the original patterns
    for (j = 0; j < available_partitions; j = j + 1)
      if partition_result_row[j] > 16 then coder = coder5;
      else if partition_result_row[j] > 8 then coder = coder4;
      else if partition_result_row[j] > 4 then coder = coder3;
      else if partition_result_row[j] > 2 then coder = coder2;
      else if partition_result_row[j] > 1 then coder = coder1;
    free the allocated memories;
    stop();

3.4 ALGORITHM FOR POWER CALCULATION

start
  determine MAX_PATTERN_LENGTH and MAX_LINES
  sum = 0; avg_power = 0; max_peak = -1
  for (j = 1; j <= MAX_LINES; j = j + 1)
    row_sum = 0
    for (i = MAX_PATTERN_LENGTH; i >= 1; i = i - 1)
      val1 = patterns[j-1][i-1] - '0'
      val2 = patterns[j-1][i] - '0'      // adjacent bit of the same pattern (assumed)
      row_sum = row_sum + i * (val1 xor val2)
    if row_sum > max_peak then max_peak = row_sum
    sum = sum + row_sum
  avg_power = sum / lines                // lines = number of patterns
  // max_peak now holds the maximum of all obtained row powers
stop

3.5 ILLUSTRATIONS

3.5.1 Illustration of the VIFO methodology

Step 1. The original data is partitioned such that each partition has 4 columns:

X000 X001 X111 XX1X XXX1 XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX 1XXX 1011 00XX 1110 1010 XXX0 XXX1 XXX1 XXXX XXXX XXXX XXXX XXXX X000 00X0 X100 0110 1110 X001 1010 X000 X001 X001 X101 X111 XXXX 1XXX 01XX 10XX X1XX 00XX 00XX 10XX 10XX 00XX 01XX 11XX 01XX 0X0X XXXX XXXX XXXX XXXX XXX0 X1X1 XXXX XXXX XXX0 XXX0 XXX0 XXX0 XXX1 X 1 0 X 0 X 1 X X X X X X


XXXX XXXX XXXX XXXX 1XXX X

Step 2. The result of merging the above example is as follows:

X000 1011 0000 01XX 1XXX 1 X001 00XX X100 10XX X1X1 0 X111 1110 0110 00XX 1010 1110 00XX X001 11XX 1010 X101 X111

Step 3. After this, fill the X bits with zeros and assign code words as follows:

0 1 2 3 4 5 6 7 0000 1011 0000 0100 1000 1 0001 0011 0100 1000 0101 0 0111 1110 0110 0000 1010 1110 1100 0001 1010 0101 0111

Step 4. The final output is obtained by comparing the original unmerged data with the coded data: the first 4 bits of each pattern take the code word from the first column, the next 4 bits from the second column, and so on until the last column is reached.

0 1 2 2 2 0 0 0 0 0 0 0 0 0 2 0 1 2 3 2 1 1 2 2 2 2 2 2 0 0 1 2 3 4 5 0 4 4 6 7 0 0 3 0 1 3 2 2 1 1 2 0 3 0 2 2 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 1 1 0 1 1 1 1 1 1 1

Step 5. Encoding. In the first column we have three codes (0, 1, 2), so we can encode them using two bits (00, 01, 10, 11); in the same way, the second column can also be encoded using these four symbols. In the third column we have eight codes (0 to 7), so we have to encode using three bits, from 000 to 111, and the assignment of code words to each element should be such that the number of transitions in each row is minimized, using a power optimization method.

3.5.2 Illustration of the FIVO methodology

Consider the original test data:

X0001XXXX0001XXXXXXXX X001101100X001XXXXXX1 X11100XXX10010XXXXXX0 XX1X11100110X1XXXXXXX


XXX11010111000XXXXX00 XXXXXXX1101010XXXXXX1 XXXXXXX1X00010XXXXXXX XXXXXXXXX00100XXXXX0X XXXXXXXXX00101XXXXX0X XXXXXXXXX10111XXXXX0X XXXXXXXXX11101XXXXX0X XXXXXXXXXXXX0X0XXXX1X XXXXXXXXXXXXXXXX1XXXX

Here we check column by column. We choose n = 3, so we limit the encoding to 8 possible combinations.

No. of columns merged -> possible combinations:
1  -> 1 (X)
2  -> 2 (X0, X1)
3  -> 2 (X00, X11)
4  -> 3 (X000, X001, X111)
5  -> 4 (X0001, X0011, X1110, XX111)
6  -> 4 (X0001X, X00110, X11100, XX1X11)
7  -> 4 (X0001XX, X001101, X11100X, XX1X11)
8  -> 5 (X0001XXX, X0011011, X11100XX, XX1X1110, XXX11010)
9  -> 6 (X0001XXXX, X000110110, XX1X11100, XXX110101, XXXXXXX11)
10 -> 6 (X0001XXXX0, X001101100, X11100XXX1, XX1X111001, XXX11010111, XXXXXXX110)
11 -> 6 (X0001XXXX00, X001101100X, X11100XXX10, XX1X1110011, XXX11010111, XXXXXXX1101)
12 -> 9 (X0001XXXX000, X001101100X0, X11100XXX100, XXX1X11100110, XXX110101110, XXXXXXX11011, XXXXXXX11010, XXXXXXXXX101, XXXXXXXXX111)

So we take the first 11 columns for merging. Similarly, we start from column 12 and keep increasing the number of columns until we find the next set of columns that can be encoded in 8 possible ways. A stepwise illustration is shown below.

Step 1. The original data:

X0001XXXX0001XXXXXXXX X001101100X001XXXXXX1 X11100XXX10010XXXXXX0 XX1X11100110X1XXXXXXX XXX11010111000XXXXX00 XXXXXXX1101010XXXXXX1 XXXXXXX1X00010XXXXXXX XXXXXXXXX00100XXXXX0X XXXXXXXXX00101XXXXX0X XXXXXXXXX10111XXXXX0X XXXXXXXXX11101XXXXX0X xXXXXXXXXXXX0X0XXXX1X XXXXXXXXXXXXXXXX1XXXX

Step 2. Data after merging the columns:

X0001XXXX00 X001101100X X11100XXX10 XX1X1110011 XXX11010111 XXXXXXX1101 001XXXXXX1 010XXXXX10 000XXXXX00 100XXX1X1X 010XXXXXX1 100XXXXX0X 101XXXXX0X 111XXXXX0X

Step 3. Fill the X bits with 0's and assign code words to each of the following:


0 1 2 3 4 5 6 7

00001000000 0010000001 00011011000 0100000010 01110000010 0000000000 00101110011 1000001010 00011010111 0100000001 00000001101 1000000000 1010000000 1110000000

Step 4. Comparing the patterns of Step 1 with the code words in Step 3, we get the coded data as follows:

0 1 2 3 4 0 5 1 1 1 2 4 5 5 4 0 1 0 2 3 4 4 5 6 7 6 1 0

Step 5. Here the power is optimized as follows. Count the frequency of occurring pairs: the pair (1,5) occurs twice (1,5 and 5,1), the pair (0,3) occurs twice, and all remaining pairs occur once. So encode the elements with priority as follows, first the most frequently occurring pairs and then the remaining elements: (i) 000, 111 (ii) 001, 100 (iii) 011, 100 (iv) 101, 010

Step 6. This is the final encoded data:

04 10 21 30 42 03 54 14 15 16 27 46 51 50 000 110 001 000 011 001 111 000 110 011 000 111 101 110 001 110 001 100 001 101 011 010 110 101 100 001 100 000

The above illustration shows that there is a large reduction in test data volume, and the test power is also considerably reduced.


CHAPTER 4

EXPERIMENTAL RESULTS

4.1 TOP DOWN TEST SYNTHESIS FLOW

Fig.4.1: Top-down test synthesis flow


Using Cadence RTL Compiler, we obtain the netlist and the pin-assignment file; the simulation waveforms are viewed using SimVision in Cadence.

4.2 Simulation and synthesis using Cadence

Simulation using Cadence ncsim. The steps are as follows:
1. Invoking IUS (Incisive Unified Simulator) of Cadence for simulation of the benchmark circuit.
2. Elaborating the top-level module.
3. Checking the simulation results through waveforms in SimVision.
4. Completing the simulation to verify the correctness of the code.

RTL synthesis using Cadence RTL Compiler. Cadence provides RTL Compiler for synthesis of the benchmark circuit code and generation of the netlist.

Input: Verilog RTL code

Technology library: TSMC 180 nm
Output: Gate-level netlist

The steps are as follows:
1. Invoking RTL Compiler for generation of the netlist.
2. Elaborating the design and generating the schematic.
3. Synthesizing and mapping the design.
4. The netlist is generated.

From the generated netlist we obtain the necessary details using the command:

rc -f setup.tcl -logfile filename.log -gui

The ATPG tool used in the project is Atalanta, from Virginia Polytechnic Institute and State University. The input to the tool is the ISCAS89 benchmark circuits in .bench format, and it produces output in .test format. The required part of this output is taken and manipulated according to the proposed methodology, and the results are obtained.


All the experimental results are obtained using the ISCAS89 benchmark circuits. Table 4.1 shows the statistics generated by Atalanta. Table 4.2 shows the experimental results obtained by using the VIFO methodology on the Atalanta test set, and Table 4.3 shows the results obtained by using the FIVO methodology on the same test set. Tables 4.2 and 4.3 show the number of transitions in NSC after applying the adjacency-fill technique, the volume of CSC before and after encoding with the respective methodology, and the number of transitions in the encoded CSC patterns. The transition count determines the amount of test power dissipation, and the compression efficiency determines the amount of test data volume reduction. All the results were obtained by keeping the X-bit omit ratio fixed at 0.6.

Circuit     #PI     #Gates    #Test Patterns    Fault Coverage
S5378       214     2863      4511              99.121
S9234       247     5597      6470              93.403
S13207      700     7979      9661              98.431
S15850      611     9775      11322             96.563
S38417      1644    22257     31007             99.445

Table 4.1. Statistics generated by Atalanta on ISCAS89 benchmark circuits

In Table 4.4, the results obtained by VIFO are compared for the test sets generated by Mintest and Atalanta. In Table 4.5, the results obtained by FIVO are compared for the same two test sets. The compression efficiency is determined in both cases. The optimal values for test power and test data depend on the X-bit omit ratio and differ from circuit to circuit. The tables are self-explanatory. The test power and test data volume calculated here are technology independent. The test data volume is measured in terms of the number of bits; the volume for a CUT is determined by multiplying the pattern length by the number of patterns. The test power is estimated from the number of transitions: the higher the transition count, the higher the power, and vice versa.


Circuit     Trans. in NSC     Trans. in       Original size    Size of          % compression
            after adj. fill   encoded CSC     of CSC           encoded CSC
S27         34                5               84               24               71.40
S208        165               316             2964             1872             36.8
S298        152               709             4641             3003             35.29
S344        12                990             8160             4220             45.83
S349        12                1000            8304             4498             45.83
S382        0                 987             9600             6800             29.16
S386        1075              172             1001             693              30.76
S400        0                 1183            10056            7123             29.16
S420.1      502               794             9724             6006             38.235
S510        119               2063            13650            9282             32.00
S526        336               1462            11712            6344             45.83
S526n       332               1442            11712            6344             45.83
S641        34                1230            24678            11425            53.70
S713        44                1706            28998            13425            53.70
S820        912               2188            14904            9720             32.47
S832        914               2393            15019            9795             34.78
S838.1      951               2266            40326            25662            36.36
S953        0                 5127            48600            27000            44.44
S1196       1234              4280            33632            19969            40.625
S1238       1294              5085            34496            22638            34.375
S1423       0                 5649            135863           86594            36.26
S1488       3642              608             3416             1952             37.03
S1494       3683              725             3346             2151             35.70
S5378       0                 21720           965568           618144           36.00
S9234       0                 49566           1598337          757107           52.6
S13207      0                 53950           6762700          2086776          69.10
S15850      0                 63427           6917742          2253078          67.4
S38417      0                 165590          51595648         31255056         39.4

Table 4.2. Experimental results obtained by the VIFO methodology


Circuit     Trans. in NSC     Trans. in       Original size    Size of          % compression
            after adj. fill   encoded CSC     of CSC           encoded CSC
S27         34                0               84               12               85.70
S208        165               324             2964             1248             57.89
S298        152               663             4641             2457             47.05
S344        12                827             8160             3060             62.50
S349        12                839             8304             3114             62.50
S382        0                 986             9600             5200             45.83
S386        1057              163             1001             462              53.84
S400        0                 1250            10056            5447             45.83
S420        502               727             9724             5148             42.43
S510        119               1643            13650            4914             64.00
S526        336               1416            11712            5856             50.00
S526n       332               1400            11712            6832             41.70
S641        34                1087            24678            5941             75.92
S713        44                1493            28998            8592             70.37
S820        912               2210            14904            6480             56.52
S832        914               2384            15019            6530             56.52
S838.1      951               2359            40326            18330            54.54
S953        0                 5594            48600            18360            62.22
S1196       1234              4794            33632            18918            43.75
S1238       1294              5492            34496            19404            43.75
S1423       0                 5057            135863           62706            53.84
S1488       3642              629             3416             1708             50.00
S1494       3683              631             3346             1673             50.00
S5378       0                 22045           965568           419616           56.50
S9234       0                 47829           1598337          362376           77.30
S13207      0                 46420           6762700          405762           94.00
S15850      0                 56291           6917742          600066           91.30
S38417      0                 156293          51595648         17891039         65.30

Table 4.3: Experimental results obtained by the FIVO methodology


Circuit    Original SPL    Mintest set volume [2]                  Atalanta test volume
                           Before      After      % comp.          Before        After         % comp.
S5378      214             23754       13404      77.25            965568        618144        36.00
S9234      247             39273       23535      40.00            1598337       757170        52.60
S13207     700             165200      62843      61.01            6762700       2086776       69.10
S15850     611             76986       36863      52.11            6917742       2253078       67.40
S38417     1644            164736      88968      45.90            57595648      31255056      39.40

Table 4.4. Comparison of experimental results for test volume with 4-bit VIFO (ISCAS89 benchmarks)

Circuit    Original SPL    Mintest set volume [2]                  Atalanta test volume
                           Before      After      % comp.          Before        After         % comp.
S5378      214             23754       8830       62.82            965568        419616        56.50
S9234      247             39273       17607      55.16            1598337       362376        77.30
S13207     700             165200      32048      80.60            6762700       405762        94.00
S15850     611             76986       22266      71.07            6917742       600066        91.30
S38417     1644            164736      61064      62.93            57595648      17891039      65.30

Table 4.5: Comparison of experimental results for test volume with the FIVO method, n=3 (ISCAS89 benchmarks)


CHAPTER 5 CONCLUSION AND FUTURE WORK

The demand for low-power VLSI digital circuits in the growing area of portable communication and computing systems will continue to increase in the future. The cost and life cycle of these products will depend not only on low-power synthesis techniques but also on new DFT methods targeting power minimization during test application, because traditional DFT methods are not suitable for testing low-power VLSI circuits: they reduce reliability and manufacturing yield. Recent advances in manufacturing technology have provided the opportunity to integrate millions of transistors on an SOC. Even though the design process of embedded core-based SOCs is conceptually analogous to traditional board design, their manufacturing processes are fundamentally different. True interoperability can be achieved only if the tests for these cores can also be reused. Therefore, novel problems need to be addressed, and new challenges arise for the research community in order to provide unified solutions that simplify the VLSI design flow and provide a plug-and-play methodology for the core-based design paradigm. Due to the increasing complexity of future SOCs, combined with the multitude of factors that influence the cost of test, a new generation of power-conscious test techniques, DFT automation tools, structured methodologies with seamless integration into the design flow, and novel architecture-specific, software-centred embedded structural and deterministic test approaches are anticipated. The procedure described above, X-filling to reduce the shift power together with selective pattern compression using the VIFO and FIVO encoding techniques to reduce the test data volume and test application time, is shown to perform better than the existing techniques.

Future work includes:
1. Proposing a technique to handle X bits efficiently for both shift and capture power.
2. Proposing a better power optimization technique.


APPENDICES

Gate-level netlist of s27 before scan chain insertion.

Schematic of s27 before scan chain insertion


Pin assignment file of s27

Gate-level netlist of s27 after scan insertion


Schematic of s27 after scan insertion

The input to the Atalanta tool is in the .bench format.

# s27
# 7 inputs
# 4 outputs
# 2 inverters
# 8 gates ( 1 AND + 1 NAND + 2 ORs + 4 NORs + 1 BUFF )
INPUT(G0)
INPUT(G1)
INPUT(G2)
INPUT(G3)
INPUT(G5)
INPUT(G6)
INPUT(G7)
OUTPUT(G17)
OUTPUT(G10)
OUTPUT(G11_EXTRA)
OUTPUT(G13)
G14 = NOT(G0)
G17 = NOT(G11)
G8 = AND(G14, G6)
G15 = OR(G12, G8)
G16 = OR(G3, G8)
G9 = NAND(G16, G15)
G10 = NOR(G14, G11)
G11 = NOR(G5, G9)
G12 = NOR(G1, G7)
G13 = NOR(G2, G12)
G11_EXTRA = BUFF(G11)
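The .bench description above is simple enough to process with a few lines of code. The sketch below is illustrative only (it is not part of the tool flow): it extracts the primary inputs, primary outputs and gate definitions of the circuit, assuming the listing has been saved to a file named s27.bench.

```python
import re

def parse_bench(path: str):
    """Minimal .bench reader: returns (inputs, outputs, gates), where
    gates maps an output net name to (gate_type, [fan-in nets])."""
    inputs, outputs, gates = [], [], {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue                      # skip blanks and comments
            m = re.match(r"INPUT\((\w+)\)", line)
            if m:
                inputs.append(m.group(1)); continue
            m = re.match(r"OUTPUT\((\w+)\)", line)
            if m:
                outputs.append(m.group(1)); continue
            m = re.match(r"(\w+)\s*=\s*(\w+)\(([^)]*)\)", line)
            if m:
                net, gtype, fanins = m.groups()
                gates[net] = (gtype, [x.strip() for x in fanins.split(",")])
    return inputs, outputs, gates

ins, outs, gates = parse_bench("s27.bench")
print(len(ins), "inputs,", len(outs), "outputs,", len(gates), "gate lines")
```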

The Atalanta tool produces its output in the .test format. This is the actual input file on which the manipulations are carried out to reduce test power and test data volume.

* Name of circuit: s27.bench
* Primary inputs : G0 G1 G2 G3 G5 G6 G7
* Primary outputs: G17 G10 G11_EXTRA G13
* Test patterns and fault free responses:
G13 /0          1: x10xxxx xxx1   2: x00xxx1 xxx1
G2 /0           1: x11xxxx xxx0
G12->G13 /0     1: x00xxx0 xxx0
G13 /1          1: xx1xxxx xxx0   2: x00xxx0 xxx0
G11_EXTRA /1    1: xxxx1xx 1x0x   2: xxx000x 1x0x
G11_EXTRA /0    1: x0x10x0 0010   2: 01x1010 001x
G10 /0          1: 1xxx1xx 110x   2: 1xx00xx 110x
G11->G10 /0     1: 10x10x0 0010
G14->G10 /0     1: 0xxx1xx 100x   2: 0xx000x 100x
G10 /1          1: 0xxxxxx x0xx   2: 10x10x0 0010
G12 /0          1: x00xxx0 xxx0   2: x011000 0x10
G1 /0           1: x10xxx0 xxx1
G7 /0           1: x00xxx1 xxx1
G12 /1          1: x10xxxx xxx1   2: x00xxx1 xxx1
G14 /0          1: 0xxx1xx 100x   2: 0xx000x 100x
G14 /1          1: 1xxx1xx 110x   2: 1xx000x 110x
G6 /1           1: 01x100x 100x
G8 /0           1: 01x101x 001x   2: 01x001x 001x
G14->G8 /1      1: 11x101x 110x   2: 11x001x 110x
G8 /1           1: x1x100x 1x0x   2: x1x000x 1x0x
G16 /1          1: x0x0000 1x00   2: 10x0010 1100
G15 /1          1: x1x100x 1x0x   2: 11x101x 110x
G9 /0           1: xxx000x 1x0x   2: 1xx001x 110x
G11 /0          1: x0x10x0 0x10   2: 01x1010 001x
G5 /0           1: x0x11x0 1x00
G3 /0           1: x0x1000 0x10
G8->G16 /0      1: 0xx001x 001x
G8->G15 /0      1: 01xx01x 001x   2: 00xx011 001x
G12->G15 /0     1: x0x1000 0x10   2: 10x1010 0010
G11 /1          1: xxxx1xx 1x0x   2: xxx000x 1x0x
G17 /0          1: xxxx1xx 1x0x   2: xxx000x 1x0x
G17 /1          1: x0x10x0 0010   2: 01x1010 001x
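The don't-care density of this file is what both the X-filling step and the selective pattern compression step exploit. The sketch below reads the input-side cubes out of an Atalanta .test file and reports how many bits are unspecified; the file name s27.test and the regular expression are assumptions about the exact layout, and the helper is purely illustrative.

```python
import re

def read_test_cubes(path: str):
    """Collect the input-side test cubes from an Atalanta .test file.
    Each pattern has the form 'n: <input cube> <fault-free response>'."""
    text = open(path).read()
    # every 'number:' is followed by the input cube and then the response
    return [cube.lower() for cube, _resp in
            re.findall(r"\d+:\s+([01xX]+)\s+([01xX]+)", text)]

cubes = read_test_cubes("s27.test")
total_bits = sum(len(c) for c in cubes)
x_bits = sum(c.count("x") for c in cubes)
print(f"{len(cubes)} patterns, {x_bits} of {total_bits} input bits are "
      f"don't-cares ({100.0 * x_bits / total_bits:.1f}%)")
```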


REFERENCES

[1] N. Nicolici and B. M. Al-Hashimi, Power-Constrained Testing of VLSI Circuits. Norwell, MA: Kluwer, 2003.

[2] Chia-Yi Lin, Hsiu-Chuan Lin, and Hung-Ming Chen, "On reducing test power and test volume by selective pattern compression schemes," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 8, Aug. 2010.

[3] J. Lee and N. A. Touba, "LFSR-reseeding scheme achieving low-power dissipation during test," IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol. 26, no. 2, pp. 396-401, Feb. 2007.

[4] R. Sankaralingam, R. R. Oruganti, and N. A. Touba, "Static compaction techniques to control scan vector power dissipation," in Proc. IEEE VLSI Test Symp., 2000, pp. 35-40.

[5] S. Kajihara, K. Taniguchi, I. Pomeranz, and S. M. Reddy, "Test data compression using don't-care identification and statistical encoding," in Proc. IEEE Int. Workshop Electronic Design, Test Appl., 2002, pp. 413-416.

[6] C.-Y. Lin and H.-M. Chen, "A selective pattern-compression scheme for power and test-data reduction," in Proc. Int. Conf. Computer-Aided Design, 2007, pp. 520-525.

[7] Lee, J. H. Jeong, and T. Ambler, "Two efficient methods to reduce power and testing time," in Proc. Int. Symp. Low Power Electron. Design, 2005, pp. 167-172.

[8] K. J. Lee, S. J. Hsu, and C. M. Ho, "Test power reduction with multiple capture orders," in Proc. IEEE Asian Test Symp., 2004, pp. 26-31.

[9] Y. Shi, N. Togawa, S. Kimura, M. Yanagisawa, and T. Ohtsuki, "Low power test compression technique for designs with multiple scan chains," in Proc. IEEE Asian Test Symp., 2005, pp. 386-389.

[10] A. Chandra and K. Chakrabarty, "A unified approach to reduce SOC test data volume, scan power and testing time," IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol. 22, no. 3, pp. 352-362, Mar. 2003.

[11] A. Chandra and K. Chakrabarty, "Low-power scan testing and test data compression for system-on-a-chip," IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol. 21, no. 5, pp. 597-604, May 2002.

[12] L.-C. Hsu and H.-M. Chen, "On optimizing scan testing power and routing cost in scan chain design," in Proc. Int. Symp. Quality Electronic Design, 2006, pp. 451-456.

[13] M. Nourani, M. Tehranipour, and K. Chakrabarty, "Nine-coded compression technique with application to reduced pin-count testing and flexible on-chip decompression," in Proc. Design, Automation, Test in Europe, Mar. 2004, pp. 1284-1289.

[14] Jas, J. G. Dastidar, M. E. Ng, and N. A. Touba, "An efficient test vector compression scheme using selective Huffman coding," IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol. 22, no. 6, pp. 797-806, Jun. 2003.

[15] L. Whetsel, "Adapting scan architectures for low power operation," in Proc. IEEE Int. Test Conf., 2000, pp. 863-872.

[16] A. Chandra and K. Chakrabarty, "Frequency-directed run-length (FDR) codes with application to system-on-a-chip test data compression," in Proc. IEEE VLSI Test Symp., May 2001, pp. 42-47.

[17] G. Seroussi and M. J. Weinberger, "On adaptive strategies for an extended family of Golomb-type codes," in Proc. Data Compression Conf., 1997, pp. 131-140.

[18] Y. Shi, N. Togawa, S. Kimura, M. Yanagisawa, and T. Ohtsuki, "An efficient multiscan-based test compression technique for test cost reduction," in Proc. ACM/IEEE Design Automation Conf., Jul. 2006, pp. 653-658.

[19] S. Tautermann, A. Wurtenberger, and S. Hellebrand, "Data compression for multiple scan using dictionaries with corrections," in Proc. IEEE Int. Test Conf., Oct. 2004, pp. 926-935.

[20] L.-T. Wang, X. Wen, H. Furukawa, F.-S. Hsu, S.-H. Lin, S.-W. Tsai, K. S. Abdel-Hafez, and S. Wu, "VirtualScan: A new compressed scan technology for test cost reduction," in Proc. IEEE Int. Test Conf., Oct. 2004, pp. 916-925.

[21] Orailoglu, W. Rao, and G. Su, "Frugal linear network-based test decompression for drastic test cost reduction," in Proc. Int. Conf. Computer-Aided Design, Nov. 2004, pp. 721-725.

[22] H. Tang, S. M. Reddy, and I. Pomeranz, "On reducing test data volume and test application time for multiple scan chain designs," in Proc. IEEE Int. Test Conf., 2003, pp. 1079-1088.

[23] S. P. Lin, C. L. Lee, J. E. Chen, J. J. Chen, K. L. L, and W. C. Wu, "A multilayer data copy scheme for low cost test with controlled scan-in power for multiple scan chain designs," in Proc. IEEE Int. Test Conf., 2006.

[24] R. Pandey and J. H. Patel, "An incremental algorithm for test generation in Illinois scan architecture based designs," in Proc. Design, Automation, Test in Europe, 2002, pp. 368-375.

[25] G. Mrugalski, J. Rajski, D. Czysz, and J. Tyszer, "New test data decompressor for low power applications," in Proc. ACM/IEEE Design Automation Conf., 2007, pp. 539-544.

[26] M. Elm et al., "Scan chain clustering for test power reduction," in Proc. ACM/IEEE Design Automation Conf., 2008, pp. 828-833.

[27] L. Li and K. Chakrabarty, "Test data compression using dictionaries with fixed-length indices," in Proc. IEEE VLSI Test Symp., 2003, pp. 219-224.

[28] K. J. Balakrishnan and N. A. Touba, "Improving linear test data compression," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, pp. 1227-1237, 2006.

[29] J. Li, Q. Xu, Y. Hu, and X. Li, "On reducing both shift and capture power for scan-based testing," in Proc. IEEE Asia South Pacific Design Automation Conf., 2008, pp. 653-658.

[30] Laung-Terng Wang, Cheng-Wen Wu, and Xiaoqing Wen, VLSI Test Principles and Architectures: Design for Testability, Morgan Kaufmann Publishers.
