INTEGRATION, the VLSI journal 32 (2002) 77–97

Fault security analysis of CMOS VLSI circuits using defect-injectable VHDL models

Donald Shaw^a, Dhamin Al-Khalili^b,*, Come Rozon^b

^a Gennum Corporation, Burlington, Ont., Canada L7R 3Y3
^b Department of Electrical and Computer Engineering, Royal Military College of Canada, P.O. Box 17000, Stn. Forces, Kingston, Ont., Canada, K7K 7B4

Abstract

This paper introduces a methodology for assessing the fault security attributes of Fault Secure (FS) circuits. Structural VHDL circuit descriptions are used to simulate the fault effects of realistic transistor-level defects that occur in CMOS ICs. Defective standard cells are simulated at the analog level of abstraction and the resultant fault effects are implemented in defect-injectable VHDL models to allow logic simulation. Typical fault effects include functional changes, propagation delay increases, sequential logic faults, stuck-at faults, reduced noise margins, and increased IDDQ. The defect-injectable VHDL models are swapped into FS circuit designs and the effects of the defects are analyzed in the context of the digital circuit. The FS circuits can then be assigned a figure of merit based on the ratio of detected defects to those that actually cause output errors. To facilitate the execution of the methodology, an integrated software tool has been developed that, in combination with a commercial VHDL simulation tool, provides an automated means for determining the figure of merit. Implemented using a GUI, the new tool is user friendly and flexible enough to be used with various logic circuits and different IC technologies. Three different checkers were evaluated as benchmarks to demonstrate the FSA tool and the methodology and to assess their relative fault security.

Crown Copyright © 2002 Published by Elsevier Science B.V. All rights reserved.

Keywords: Fault security; Fault modeling; VHDL; CMOS defects

1. Introduction

Modern VLSI devices, with their reduced operating voltage and decreased feature size, are becoming increasingly susceptible to failure due to electromigration, electrostatic discharge, and gate oxide wearout [1]. Furthermore, latent design mistakes, process/fabrication defects, and mask imperfections contribute to the likelihood of VLSI circuits not performing as expected. Thus, for security critical applications, such as in aircraft flight control systems or space applications, circuits are required to detect when they are failing or contain defects that could cause failure. Such circuits are referred to as fault secure (FS) circuits. This paper describes a methodology and a CAD tool for assessing FS VLSI circuits and computing a relative fault security figure of merit for the circuit being analyzed. Using this figure of merit, circuit designers can evaluate and compare design alternatives with respect to fault security and reliability. It is important here to realize that the objective of this analysis is to assess only critical building blocks within a system rather than to evaluate a complete system.

*Corresponding author. Fax: +1-613-544-8107.
E-mail address: [email protected] (D. Al-Khalili).

0167-9260/02/$ - see front matter Crown Copyright © 2002 Published by Elsevier Science B.V. All rights reserved.
PII: S0167-9260(02)00043-3

The discussion begins with fundamental definitions and background information about FS circuits and fault simulation. Then, a realistic transistor-level defect model is presented along with a description of the faults observed during defect injection. This is followed by an overview of the system for assessing FS circuits by using defect-injectable VHDL gate models. Then, the fault security analysis (FSA) tool, which has been specifically designed for automated implementation of the system, will be presented. Details of the benchmark circuit used to validate the FSA software will then be given. Finally, the results of the software validation and a detailed analysis of these results are presented to illustrate the technical merit of this methodology.

Nomenclature

tpHL         propagation time, high to low transition
tpLH         propagation time, low to high transition
tDD          defect induced propagation delay
F            fan-out
Ceff         effective defect coverage
P(SecOp)     probability of secure operation
P(SecOp|I)   probability of secure operation using code checker with IDDQ test
C            number of defects flagged by checker
I            number of defects that cause increased IDDQ
N            number of defects that cause no observed error

Acronyms

ASIC   application-specific integrated circuit
FS     fault secure
FSA    fault security analysis
POs    primary outputs
Tcl    tool command language
VHDL   very high speed integrated circuit hardware description language


2. Background

2.1. Fault secure circuits

A circuit is said to be fault secure if any single fault results in that circuit producing either the correct output or a known invalid output, for any valid input vector [2]. Thus, for any application requiring FS circuits, it is critically important to be able to detect, during the operation of the device, the existence of defects that cause, or could potentially cause, errors to occur. Before further discussing the general topic of FS circuits, it is necessary to define certain terminologies as they are used in this paper. First, a defect is any physical imperfection that may exist within a circuit. Examples of defects include shorts or opens due to imperfections in the fabrication process or wear-out mechanisms such as electromigration. Physical defects remain dormant, causing no apparent problems, until certain conditions induce a fault to occur. A fault is defined as any type of abnormal circuit behavior, such as an incorrect voltage level or increased signal delay. The fault is said to be latent until its effects have propagated to some observable point in the circuit, at which point it is referred to as an error. Typically, an error would be observed directly at the primary outputs (POs) of the circuit.

In general, to design a circuit that performs fault detection during normal operation, two elements are required. The first is that the circuit has design elements, such as a code-based error checking scheme, for determining if an error exists. If an error does occur, and it is detected by the checker, then the circuit is said to have failed to a secure state and it can be removed from service as required. Otherwise, if the error is not detected, an unsecure failure has occurred and potentially damaging or fatal consequences could result from continued circuit operation. The entire sequence from physical defect to the generation of checker results is summarized in Fig. 1.

The second element required for fault detection is that an appropriate input pattern, or sequence of input patterns, be applied to the circuit that causes a specific fault to be manifested as an error at some observable point in the circuit. Ideally, latent faults will become errors promptly after their occurrence to allow detection and subsequent efforts to eliminate the potentially dangerous consequences of these errors. A generally accepted convention for designing FS circuits is the assumption that faults can occur only one at a time [3]. Furthermore, it is assumed that when a fault does occur, the mean time between faults is long enough to allow the detection of that fault. Therefore, during fault simulation and testing, it is critical that the set of test patterns be selected to maximize the number of individual faults that can be excited to the error state. As a trade-off, the number of test patterns in the set must be kept to a minimum to reduce the time spent during simulation or testing. Because efficient test pattern sets are critical for testing any VLSI device, there is a significant amount of research devoted to the generation and selection of test patterns. The same techniques are similarly applicable to testing FS circuits and confirming their ability to detect faults. However, the test pattern set should also reflect the input vectors that would be typical during actual operation of the circuit. The topic of test pattern generation is beyond the scope of this paper.

[Figure: a physical defect is dormant until excitation produces a circuit fault; the fault is latent until excitation produces an error at the POs; the checker result is then either secure or unsecure.]
Fig. 1. Defect to checker results sequence.

2.2. Defect modeling and fault simulation

The conventional method for assessing the on-line fault detection capabilities of FS circuit designs is to perform fault simulation using the stuck-at fault model. Using the stuck-at fault model, the interconnects between standard cells within an ASIC circuit are forced to logic 0 or 1 values to simulate the effects of physical defects. Then, the outputs of the circuit and the results of the checking circuitry are examined, and some figure of merit for that circuit is calculated after the complete set of test vectors has been applied for every possible fault.

The stuck-at fault model was originally developed as a means to generate test patterns that would cause the logical effects of circuit faults to be propagated to the circuit POs during testing. The model assumes that, in the presence of any circuit defect, one or more nets will become permanently stuck-at 1 or stuck-at 0. Using the stuck-at fault model, the input pattern set intended to test a VLSI device can be evaluated for effectiveness by determining, through simulation, the percentage of possible stuck-at faults that cause observable errors. The popularity of the stuck-at fault model is largely due to its cost-effective implementation, supported by efficient algorithms for generating test patterns. However, a significant problem that is now known to exist with the stuck-at fault model is that it does not cover many of the faults caused by realistic physical defects in CMOS VLSI circuits [4–6]. For instance, there is a class of stuck-open defects, unique to CMOS technology, which cause sequential behavior to occur in combinational logic gates. Other common fault occurrences that are not considered include increased propagation delays and change of logical function for individual cells. Clearly, then, if accurate fault modeling is to be performed for testing of fault-secure circuits, an alternative to the stuck-at fault model must be utilized. Section 3 describes a more appropriate fault model that is closely tied to actual physical defects.
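As a hedged illustration of the stuck-open behavior mentioned above, the following Python sketch models a two-input CMOS NAND gate whose PMOS pull-up transistor on input A is open. The function and variable names are assumptions made for this example, and the floating output node is idealized as holding its previous value indefinitely (leakage is ignored).

```python
def nand_with_open_pmos_a(a: int, b: int, previous_out: int) -> int:
    """Two-input CMOS NAND whose PMOS driven by input A is open (illustrative).

    Pull-down network: NMOS A and NMOS B in series, conducts when a = b = 1.
    Pull-up network: only the PMOS driven by B still works, conducts when b = 0.
    """
    pull_down = a == 1 and b == 1
    pull_up = b == 0  # the PMOS on input A can no longer conduct
    if pull_down:
        return 0
    if pull_up:
        return 1
    # a = 0, b = 1: neither network conducts, so the output node floats and
    # (idealized) keeps the value left by the previous input vector.
    return previous_out


def run(vectors, initial_out=1):
    """Apply a sequence of (a, b) vectors and return the output after each one."""
    out = initial_out
    trace = []
    for a, b in vectors:
        out = nand_with_open_pmos_a(a, b, out)
        trace.append(out)
    return trace


print(run([(1, 1), (0, 1)]))  # [0, 0]  (a fault-free NAND would give [0, 1])
print(run([(0, 0), (0, 1)]))  # [1, 1]  (matches the fault-free response)
```

The same input vector (A = 0, B = 1) yields 0 after (1, 1) but 1 after (0, 0), so the response depends on history, which a stuck-at fault on a net cannot represent.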

2.3. IDDQ testing for circuit defects

Another popular technique for detection of defects in CMOS VLSI circuits involves online monitoring of the power supply current. In defect-free CMOS circuits, the gate node of each transistor is typically connected to either VDD through a network of PMOS transistors or to ground through a network of NMOS transistors. Consequently, there is only negligible leakage current drawn from the power supply in the static state. Research has shown that many common defects, such as transistor node shorts and gate-oxide pinholes, enable current to flow from VDD to ground during the steady state [7]. Thus, while monitoring the quiescent supply current, IDDQ, any substantial increase from the typically low levels is a strong indication that some type of defect exists in the circuit. It has been shown that IDDQ testing, while not an adequate testing method on its own, is an excellent complement to logic-based tests [8]. The FSA system proposed in this paper considers the use of IDDQ testing for fault security in addition to other methods involving circuit redundancy and code checking. However, it is worth noting that the applicability of IDDQ testing may be in question for complex systems using advanced CMOS technologies, although it remains effective for smaller blocks with localized monitors.

3. The nine defect CMOS transistor model

Extensive work has been conducted by the authors to characterize the fault effects of nine transistor-level defects that typically occur in CMOS integrated circuits [9]. These defects, equally applicable to NMOS and PMOS transistors, consist of short and open circuits as shown in Fig. 2. Three of these defects, the gate–drain short, gate–source short, and gate–channel short, represent gate oxide failures that have been found to exist throughout the life of a CMOS device [10]. The source–substrate and drain–substrate shorts caused by faulty diffusion/ion implantation processes or material aging are also modeled. The drain–source short, caused by process-related spot defects in a conducting layer and by metallic electromigration, was also found to be prevalent and is therefore included in the transistor defect model [11]. Finally, open gate, drain, and source defects due to process-related spot defects and metallic electromigration are also modeled.

To enable study of the fault effects caused by the nine defects, defect-injectable macromodels were developed for use with a commercial analog circuit simulator. In general, the shorts are modeled with a subcircuit consisting of a small resistor, Rsh, in parallel with a small capacitor, Cc. The opens are modeled similarly, using a large resistance, Rop, in parallel with a small capacitor, Cc. Fig. 3 shows the defect-injectable macromodel of an NMOS transistor for simulating the majority of the nine transistor defects. One type of defect not covered by this model is the gate open with a trapped charge. Various sources such as hot electrons, photons, coupling capacitance from nearby circuit elements, and logic states prior to the defect occurrence can all contribute to a trapped charge existing on an unconnected transistor gate terminal. Thus, to model a severe gate open with a trapped charge, the gate is disconnected from the circuit and connected to a voltage source to provide the necessary charge. The macromodel for the PMOS transistor is identical except that a PMOS transistor is used in place of the NMOS.

[Figure: transistor schematic marking the nine defects: drain-source short, gate-drain short, gate-source short, gate-channel short, drain-substrate short, source-substrate short, gate open, source open, and drain open.]
Fig. 2. Nine common defects in CMOS transistors.

The other defect not covered by the transistor macromodel described above is the gate–channel short. This defect typically causes parasitic transistors to be created within the circuit and, therefore, cannot be modeled in such a straightforward manner. Fig. 4 is a diagram of the circuit used to model the NMOS gate–channel short and Fig. 5 shows the model for the PMOS gate–channel short [11–13].

In the gate–channel short macromodels, the effective width of the main transistor is reduced according to the diameter of the modeled defect. The length of the gate remains unchanged. The channel area immediately under the defect becomes a localized area of charge with the same polarity as the gate. This localized channel area becomes both the source and drain for two new parasitic transistors. The relative sizing of the three transistors in the circuit is dependent on the size and location of the defect in the gate oxide.

With consideration to manageable simulation time requirements, yet still covering the important defect scenarios, two different levels of defects are injected into the models described above. These levels are referred to as "hard" and "soft", as indicated by the resistance values used for modeling the shorts and opens. Hard defects are those which would typically cause the resultant fault to be manifested as an incorrect logic value at some point within the circuit. Soft defects typically cause performance degradation effects such as increased gate propagation delays and reduced noise margins. In this FSA methodology, critical values for hard and soft defects were determined by simulation. Hard defect resistor values are 1 kΩ for shorts and 100 MΩ for opens, and soft defect resistor values are established at 5 kΩ for shorts and 500 kΩ for opens. These values are consistent with process data supplied by the manufacturer and with other related research work on defect modeling [9,14–18]. It is important to keep in mind that two values for the shorts and two for the opens were selected to demonstrate the methodology with values as close as possible to practically obtained ones. Once the statistical variations of the model values are available from a manufacturer, the effect of these variations can be incorporated in the FSA tool.

[Figure: NMOS transistor with drain, gate, source, and substrate terminals, surrounded by Rsh, Rop, and Cc elements at the defect sites.]
Fig. 3. Defect-injectable NMOS transistor macromodel.

[Figure: main NMOS transistor with two parasitic NMOS transistors between drain, gate, source, and substrate; element labels Rsh and two 1 Ω resistors.]
Fig. 4. NMOS gate–channel short macromodel.

[Figure: main PMOS transistor with two parasitic PMOS transistors between drain, gate, source, and substrate; element labels Rsh and two 100 MΩ resistors.]
Fig. 5. PMOS gate–channel short macromodel.

For an overview of the defect-to-fault mapping procedure, consider the CMOS NAND gate shown in Fig. 6. Since this cell has four transistors, there are 36 defects that can be injected into the cell using the nine-defect transistor macromodels. Furthermore, if a complete fault analysis is to be performed, each of the defects would be studied at the hard and soft levels, for a total of 72 defects. Table 1 summarizes the fault effects observed for the NAND gate when simulated using the macromodels.
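The defect count for the NAND gate can be reproduced with a short enumeration. This is only a sketch: the transistor labels (PA, PB, NA, NB) and the helper `defect_resistance` are hypothetical names introduced here, while the defect names and resistance values are the ones quoted in the text. Defects that need a dedicated macromodel (the gate open with trapped charge and the gate–channel short) are treated here as ordinary opens and shorts for counting purposes.

```python
from itertools import product

# The nine transistor-level defects named in the paper.
TRANSISTOR_DEFECTS = [
    "gate-drain short", "gate-source short", "gate-channel short",
    "drain-source short", "drain-substrate short", "source-substrate short",
    "gate open", "drain open", "source open",
]

# Hypothetical labels for the four transistors of a two-input CMOS NAND gate.
NAND_TRANSISTORS = ["PA", "PB", "NA", "NB"]

SEVERITY_LEVELS = ["hard", "soft"]

# Resistor values quoted in the text, in ohms.
RESISTANCE_OHMS = {
    ("short", "hard"): 1e3,
    ("open", "hard"): 100e6,
    ("short", "soft"): 5e3,
    ("open", "soft"): 500e3,
}


def defect_resistance(defect: str, level: str) -> float:
    """Return the modeling resistance for a named defect at a severity level."""
    kind = "open" if defect.endswith("open") else "short"
    return RESISTANCE_OHMS[(kind, level)]


hard_defects = list(product(NAND_TRANSISTORS, TRANSISTOR_DEFECTS, ["hard"]))
all_defects = list(product(NAND_TRANSISTORS, TRANSISTOR_DEFECTS, SEVERITY_LEVELS))

print(len(hard_defects), len(all_defects))  # 36 72
```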

[Figure: two-input CMOS NAND gate with inputs A and B, output Out, and supply VDD.]
Fig. 6. CMOS NAND gate.

Table 1
Fault effects observed in CMOS NAND circuit

Fault            Hard defects   Soft defects
Stuck-at 0/1     10             4
Sequential       12             0
Delay            0              23
Noise margin     0              23
IDDQ increase    25             17

It can be seen that 14 of the 72 defects cause a stuck-at fault to occur in the NAND gate. Thus, the stuck-at fault model would be guaranteed to cover only 19% of the defects that could typically occur in this cell. The actual coverage would probably be closer to 90% [19], which is still unacceptable for FSA. There are also 12 sequential faults that have been observed. The values shown in the delay and noise margin rows refer to performance degradation faults that generally occur due to soft defects. The noise margin faults occur when the defect affects the circuit in such a way that no logical fault occurs, but the noise margin is reduced considerably. In this case, the circuit would be significantly more susceptible to various noise sources such as power supply variation, lightning, α-particles, and electromagnetic interference. Finally, the defects that are shown to cause IDDQ increases are those that would likely be detected by current monitoring circuitry, if present.
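The percentages quoted for the NAND gate follow directly from Table 1. The sketch below recomputes them; the dictionary layout and variable names are assumptions made for illustration, and only the counts are taken from the table.

```python
# Counts from Table 1 as (hard defects, soft defects).
TABLE_1 = {
    "stuck-at 0/1": (10, 4),
    "sequential": (12, 0),
    "delay": (0, 23),
    "noise margin": (0, 23),
    "IDDQ increase": (25, 17),
}

# 4 transistors x 9 defects x 2 severity levels.
TOTAL_DEFECTS = 4 * 9 * 2

stuck_at_total = sum(TABLE_1["stuck-at 0/1"])
guaranteed_stuck_at_coverage = stuck_at_total / TOTAL_DEFECTS
iddq_total = sum(TABLE_1["IDDQ increase"])

print(stuck_at_total)                                # 14
print(round(100 * guaranteed_stuck_at_coverage, 1))  # 19.4
print(round(100 * iddq_total / TOTAL_DEFECTS, 1))    # 58.3

# The hard-defect column sums to more than the 36 hard defects injected,
# so a single defect can appear in more than one row of the table.
hard_column_sum = sum(hard for hard, _soft in TABLE_1.values())
print(hard_column_sum)                               # 47
```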

The defect injection and fault mapping procedure was applied, as described above, to several CMOS gates from a 0.8 µm cell library provided by the Canadian Microelectronics Corporation. The next logical step in this analysis procedure is to apply the nine-defect transistor model to larger circuit blocks made up of several logic gates. However, it would be unrealistic to attempt this defect injection technique with large circuits because the analog simulation time requirements would become immense. For this reason, the fault statistics compiled for the various logic gates have been implemented directly into VHDL gate models, which allow higher level digital simulation to be performed.

4. A VHDL-based FSA system

4.1. Defect modeling using VHDL

Several recent studies have explored the use of VHDL for fault analysis of VLSI circuits [20–25]. The technique implemented in the FSA tool utilizes VHDL models as described by [21,24] for use with structural level VHDL circuit simulation. Other research tends to emphasize behavioral level simulation, perhaps justifiably so a few years ago, because structural level analysis was far too time consuming for any practical implementation. However, by utilizing modern computers, efficient VHDL simulators, and optimized fault injection algorithms, the benefits recognized from structural level simulation have become a reality. Using a structural level approach, it is possible to accurately model realistic transistor defects at the gate level rather than being limited to the stuck-at fault model on nets that may or may not exist after the synthesis of the behavioral design.

Two structural level VHDL modeling techniques have been considered for this system. The first is called a saboteur, which is a controllable component that is physically added to the VHDL netlist of a design [21]. It is placed on the nets between existing components as per the two examples shown in Fig. 7 and can be used to model a wide variety of fault cases. For instance, saboteur 1 could be coded to cause the output of the XOR gate to appear as a stuck-at fault to subsequent circuit elements. Alternatively, by changing the defect control line, additional delays or reduced voltage swings could be modeled. The saboteur can also be used to model defects that cause shorts, or bridging faults, as demonstrated by saboteur 2 in Fig. 7. The saboteur component is implemented using a behavioral VHDL description that completely defines its output response based on the defect control state and input pattern sequence. A significant drawback of the saboteur model is that it has no direct access to the input ports of the preceding gate and therefore cannot model gate level faults that are input pattern dependent. However, the saboteur is an excellent choice for modeling net defects that cause bridging faults, line noise, and propagation delays.

[Figure: saboteur 1 and saboteur 2 inserted on nets between gates, each driven by a defect control input.]
Fig. 7. The saboteur defect model.
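The saboteur idea can be sketched in a few lines of Python rather than VHDL. Everything named below (`SaboteurMode`, `saboteur`, `xor_gate`) is hypothetical, and the wired-AND treatment of a bridging fault is one common simplification chosen for this example, not necessarily the behavior used in the FSA tool.

```python
from enum import Enum
from typing import Optional


class SaboteurMode(Enum):
    """Illustrative defect-control states for a saboteur placed on a net."""
    TRANSPARENT = "transparent"
    STUCK_AT_0 = "stuck-at-0"
    STUCK_AT_1 = "stuck-at-1"
    BRIDGE = "bridge"


def saboteur(mode: SaboteurMode, driven_value: int,
             neighbour_value: Optional[int] = None) -> int:
    """Return the value seen downstream of the saboteur.

    The saboteur only sees the net it sits on (and, for a bridge, one
    neighbouring net). It has no view of the inputs of the driving gate.
    """
    if mode is SaboteurMode.TRANSPARENT:
        return driven_value
    if mode is SaboteurMode.STUCK_AT_0:
        return 0
    if mode is SaboteurMode.STUCK_AT_1:
        return 1
    if mode is SaboteurMode.BRIDGE:
        if neighbour_value is None:
            raise ValueError("a bridge needs the value of the neighbouring net")
        # Assumption: the short behaves as a wired-AND of the two nets.
        return driven_value & neighbour_value
    raise ValueError(f"unknown mode: {mode}")


def xor_gate(a: int, b: int) -> int:
    return a ^ b


# Saboteur on the XOR output: downstream logic sees a stuck-at-1 net.
print(saboteur(SaboteurMode.STUCK_AT_1, xor_gate(1, 1)))  # 1
```

Because the function signature carries only net values, the limitation noted above is visible in the code: nothing in `saboteur` can depend on which input pattern the driving gate received.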

The second type of defect-injectable model is called a mutable gate [21]. This component physically replaces any gate in the VHDL netlist of a design and functions correctly until a defect is injected via the defect control line. To implement the mutable gate, a behavioral level component description is developed and stored in a mutable cell library. Substituting it into a design is simply a matter of changing the name of the component in the netlist and adding the framework for a defect control signal to that component. Since the mutable gate is actually in place of the original component, with full access to the gate's input and output ports, the fault effects of any defect can be modeled accurately. These effects include logic errors, precise delay increases, and flags indicating reduced noise margins or increased IDDQ conditions. Therefore, this technique is the clear choice for modeling the transistor level defects described previously. As an example, the schematic representation of a mutant NAND gate is shown in Fig. 8. For this work, a set of mutable VHDL gate models has been developed to reflect the observed fault effects of several gates from a commercial 0.8 µm CMOS cell library.
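A mutable gate can be pictured as a table-driven component with a defect selector. The Python sketch below is an illustration only: the defect names, table contents, and IDDQ flags are invented for the example and are not the characterization data of the cell library used in the paper.

```python
HOLD = None  # table entry meaning "keep the previous output" (sequential fault)

# Hypothetical defect codes and responses for a two-input NAND gate.
NAND2_TABLES = {
    "none":       {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0},
    "stuck_at_1": {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1},
    "sequential": {(0, 0): 1, (0, 1): HOLD, (1, 0): 1, (1, 1): 0},
}

# Hypothetical flags: whether each defect would raise the quiescent current.
IDDQ_FLAG = {"none": False, "stuck_at_1": True, "sequential": False}


class MutableNand2:
    """NAND gate that behaves correctly until a defect is selected."""

    def __init__(self) -> None:
        self.defect = "none"
        self.out = 1  # previous output, needed for sequential faults

    def inject(self, defect: str) -> None:
        if defect not in NAND2_TABLES:
            raise KeyError(defect)
        self.defect = defect

    def evaluate(self, a: int, b: int) -> int:
        entry = NAND2_TABLES[self.defect][(a, b)]
        if entry is not HOLD:
            self.out = entry
        return self.out

    @property
    def iddq_flag(self) -> bool:
        return IDDQ_FLAG[self.defect]


gate = MutableNand2()
print(gate.evaluate(1, 1))  # 0, fault-free behavior
gate.inject("stuck_at_1")
print(gate.evaluate(1, 1), gate.iddq_flag)  # 1 True
```

Unlike a saboteur, the table is indexed by the gate's own inputs, so input-pattern-dependent faults can be expressed directly.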

[Figure: inputs A, B, and the defect control signal feed a look-up table; the other blocks shown are a 4 x 1 MUX, a latch, and a delay element, with VDD and VSS; the outputs are Out, an IDDQ flag, and an NM flag.]
Fig. 8. Schematic representation of mutant NAND gate.

The mutable gate descriptions have been coded using a look-up table approach to facilitate simulation efficiency and flexibility. According to the example shown in Fig. 8, the cell inputs are considered in combination with the defect control signal to determine if a logical fault will occur. When the selected defect is known to exhibit sequential behavior, the previous state of the cell is also considered. For computing the defect-induced delay, the logic transition taking place at the output of the gate is considered. Therefore, distinct delay times will be assigned for tpHL and tpLH depending on the pending logic transition and documented effects of a specific defect on that transition. Also, the delay computation is approximated as a linear function of the number of gate loads beyond the single load propagation delay. Thus, when a defect which causes a delay fault occurs, rather than simply adding a certain fixed delay due to the defect, tDD, the increased propagation delay also considers the fan-out, F, in accordance with

$$ t_{pLH} = \left(t_{pLH1} + (F - 1)\,m\right)\left(1 + \frac{t_{DD}}{t_{pLH1}}\right), \tag{1} $$

where m is the slope of the line for the relationship between delay versus additional loads beyond one and tpLH1 represents the defect-free propagation delay for a fan-out of one. This also assumes that tDD is the additional delay caused by the defect under a single load condition only. As an alternative to the assumed linear relationship between propagation delay and fan-out, defect-induced analog simulations at various load increments would provide more accurate information that could be stored in an efficient look-up table. Of course, the drawback to this approach would be the increased set-up and simulation time required to gather the data.
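Eq. (1) translates directly into a small function. The sketch below uses illustrative parameter names and made-up numerical values (in nanoseconds); only the formula itself comes from the text.

```python
def defective_delay(t_p1: float, t_dd: float, fan_out: int, slope: float) -> float:
    """Propagation delay of a defective gate following Eq. (1).

    t_p1    : defect-free propagation delay for a fan-out of one
    t_dd    : additional delay caused by the defect under a single load
    fan_out : number of gate loads F (at least 1)
    slope   : m, the delay added per load beyond the first
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    if t_p1 <= 0:
        raise ValueError("t_p1 must be positive")
    return (t_p1 + (fan_out - 1) * slope) * (1 + t_dd / t_p1)


# Hypothetical values: 0.2 ns defect-free delay, 0.1 ns defect-induced delay,
# 0.05 ns per extra load.
single_load = defective_delay(t_p1=0.2, t_dd=0.1, fan_out=1, slope=0.05)
three_loads = defective_delay(t_p1=0.2, t_dd=0.1, fan_out=3, slope=0.05)
print(single_load, three_loads)
```

For a fan-out of one the expression reduces to t_p1 + t_dd, which matches the stated assumption that tDD is measured under a single load. For larger fan-outs the defect-induced delay is scaled by the same factor as the defect-free delay.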

4.2. The FSA process

This section gives a general overview of the tasks that must be performed to conduct the FSA of an ASIC using the defect-injectable VHDL models. The FSA process is divided into three distinct phases as shown in Fig. 9.

4.2.1. Setup phase

The setup phase consists of all required tasks to be conducted before the FSA actually occurs. This begins with a defect analysis of the technology in which a particular design is to be implemented, whereby physical anomalies are studied to find likely defects. Then, transistor level defect-injectable macromodels are developed to enable study of the possible transistor defects. Analog simulation of these defects is conducted and a defect to fault mapping is determined for the logic cell. Using this information, the mutable VHDL gate models are coded.

[Figure: the Setup phase produces the mutable VHDL gate models, the VHDL netlist of the circuit, and the simulation test vectors; the Simulation phase produces the simulation result files; the Analysis phase produces the FSA results.]
Fig. 9. Overview of FSA process.


Also included in the setup phase is the design of the FS circuit by any means convenient to the design engineer. This involves the typical iterative process whereby a design is produced, simulated, and refined until it satisfies the basic functional requirements. Once this has been achieved, a VHDL netlist of the design is generated using a commercial tool. This is the VHDL netlist that will be used for substituting the mutable components into the design during the FSA.

The final task performed in the setup phase is test pattern generation. Ideally, the test pattern set would be generated automatically such that it is capable of exciting as many defects as possible to the error state. At the same time the size of the test pattern set should be kept to a minimum in consideration of reduced simulation time. Methods to achieve a balance between these factors, while considering the realistic transistor defects, are not fully developed at this time. To further complicate matters, another important factor is that the test pattern set should also include the input vectors that would be expected during actual operation of the circuit. Once the set of test patterns is generated, by whatever means, a fault-free simulation is conducted so that the results can be used for comparison to those obtained from fault simulation.

4.2.2. Simulation phase

During the simulation phase, individual gates from the netlist of the design are selected for replacement by their mutable counterparts and subsequent defect injection. For a comprehensive FSA, every component in the design should be selected. Then, individual defects from the list of available defects are selected for injection into the mutable gates. Again, every defect should be selected unless there is some compelling reason not to select certain defects. After these selections have been made, an iterative process begins, swapping the first component in the list with its mutable counterpart as shown in Fig. 10.

After the VHDL netlist has been compiled, the simulator is invoked and the first defect from the list is injected into the single mutable component. Then, the entire set of test patterns is applied sequentially to the circuit and the observable circuit outputs are written to a file for analysis. The circuit is then reset and each subsequent defect in the list is individually injected and simulated. When the end of the defect list is reached, the next component in the gate list is replaced, the VHDL netlist is compiled, and the simulation continues until all selected components have been completed.
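The iteration described above is a pair of nested loops: one over components and one over defects, with the full test pattern set applied for each pair. The Python sketch below shows only that control flow. The `simulate` callback is a hypothetical stand-in for compiling the netlist and invoking the commercial VHDL simulator, and `toy_simulate` is a made-up single-gate example used to make the sketch runnable.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pattern = Tuple[int, int]
Simulator = Callable[[str, str, Sequence[Pattern]], List[int]]


def run_defect_campaign(components: Sequence[str], defects: Sequence[str],
                        patterns: Sequence[Pattern],
                        simulate: Simulator) -> Dict[Tuple[str, str], List[int]]:
    """Inject every selected defect into every selected component, one at a time."""
    results: Dict[Tuple[str, str], List[int]] = {}
    for component in components:
        # In the flow described in the text, the component is swapped for its
        # mutable counterpart and the netlist is recompiled once at this point.
        for defect in defects:
            # Each call represents a fresh (reset) run over the whole pattern set.
            results[(component, defect)] = simulate(component, defect, patterns)
    return results


def toy_simulate(component: str, defect: str,
                 patterns: Sequence[Pattern]) -> List[int]:
    """Made-up circuit: a single NAND gate whose output 'sa1' forces to 1."""
    return [1 if defect == "sa1" else 1 - (a & b) for a, b in patterns]


patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
golden = toy_simulate("U1", "none", patterns)  # fault-free reference run
results = run_defect_campaign(["U1"], ["sa1", "benign"], patterns, toy_simulate)
erroneous = [key for key, outputs in results.items() if outputs != golden]
print(erroneous)  # [('U1', 'sa1')]
```

Keeping the simulator behind a callback mirrors the point made in the text that each component run is independent, so the outer loop could be distributed across processors.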

It is important to address briefly the small time penalty incurred by compiling the design after each gate is replaced during the iterative process. Circuit components are replaced individually to reduce the signal overhead that could potentially result from adding an extra control line for every component in the circuit. Experience has shown that by reducing this overhead, and the subsequent control processes needed to drive each signal, the simulation speed is significantly improved. Furthermore, simulation time savings are realized by eliminating the exclusive use of dormant mutable components, which are significantly more complex than the original gate models. Finally, and most importantly, this arrangement allows for distributed simulation among multiple processors. It is also the intention of the authors to consider other fault simulation techniques such as parallelism and concurrency in future developments.

[Figure: a NAND2 gate replaced by a mutable NAND2 with an added defect control line.]
Fig. 10. Replacement of a gate with its mutable equivalent.

4.2.3. Analysis phase

The final phase of the FSA is to parse the output files generated during the simulation phase, compare them to the fault-free results, and extract the pertinent information. Specifically, the following data are obtained:

(i) number of defects injected;
(ii) number of errors observed;
(iii) number of errors flagged by the checker;
(iv) number of errors not flagged by the checker;
(v) number of potential IDDQ detectable faults; and
(vi) number of reduced noise margin incidences.

From these data, several important figures can be computed. The first is referred to as the effective defect coverage, Ceff, of the test pattern set. A relatively high value for Ceff means that the set of test patterns excited a high number of defects to the state where they caused errors to occur. Furthermore, it shows that a comprehensive analysis has been performed, allowing a high degree of confidence in the FSA results. Conversely, a low value for Ceff shows that many defects remained dormant due to inadequate test pattern coverage, thereby reducing the confidence in the FSA results.

The intuitive method for calculating Ceff is to divide the number of observed errors by the number of defects injected into the circuit. However, referring back to the NAND gate data in Table 1, it is seen that not all transistor defects cause faults to occur in the logic gate. Therefore, it is impossible to realize a test pattern set that can achieve 100% defect coverage for the logic circuit. As an additional observation, delay faults often remain dormant unless the defective component lies along a nearly critical path in the circuit and the clock speed of the circuit is set to run near its maximum for the fault-free condition. Thus, since soft defects typically cause delay faults that can be masked by any input combination, Ceff does not include these defects. Additionally, since there are some hard defects that do not cause faults and others that cause only delay faults, calculation of Ceff should not include these defects. It can be argued that neglecting delay faults during test pattern selection negates the benefits of simulating them during the FSA process. However, if a set of test patterns is known to provide good defect coverage for logical defects, it can be reasonably assumed that most of the signal paths are being exercised and that several delay faults will also be covered [26,27].

Considering the above arguments, Ceff is presently based only on hard defects that can cause logical faults to occur. Future research may augment this set to include relevant delay faults as well. As an example of the present computation, consider the circuit shown in Fig. 11. Above each component is the ratio of hard defects that can cause logical faults versus the total number of hard defects. The maximum hard defect induced logical fault coverage is simply a combined ratio of these values as shown. Assume now that an exhaustive hard defect injection process yields an actual coverage (the number of observed errors divided by the number of defects) of 80.0%. Since the maximum coverage is 85.8%, the effective defect coverage, Ceff, is computed as 80.0/85.8 = 93.2%.
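The worked example can be checked numerically. In the sketch below, the per-component ratios (90/108, 16/18, 33/36) and the 80.0% actual coverage are the figures from the example, while the variable names are assumptions introduced for illustration.

```python
# (hard defects that can cause logical faults, total hard defects) per component.
per_component = [(90, 108), (16, 18), (33, 36)]

fault_capable = sum(capable for capable, _total in per_component)
total_hard = sum(total for _capable, total in per_component)

max_coverage = fault_capable / total_hard  # combined ratio for the circuit
actual_coverage = 0.80                     # observed errors / defects injected
c_eff = actual_coverage / max_coverage     # effective defect coverage

print(fault_capable, total_hard)      # 139 162
print(round(100 * max_coverage, 1))   # 85.8
print(round(100 * c_eff, 1))          # 93.2
```

Normalizing by the maximum achievable coverage means that a test pattern set exciting every fault-capable hard defect would score 100%, even though the raw coverage could never exceed 85.8% for this circuit.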

The next calculation performed during the analysis phase is to find a relative figure of merit for the FS design. This is referred to as the probability of secure operation, P(SecOp), and represents the probability that the circuit will operate in a secure manner, with the given test pattern set, under a single transistor defect. The Venn diagram in Fig. 12 depicts the results scenario obtained from this FSA method.

Note that the IDDQ subset shown would be valid only if the design had IDDQ monitoring circuitry. Furthermore, it is assumed that all defects cause observable errors unless they are in subset N. Without an IDDQ checker, the defects that cause security failures are those in the shaded areas of the Venn diagram described by

$$\overline{(C \cup N)}. \tag{2}$$

Fig. 11. Computing maximum hard defect coverage for a circuit. [Component ratios: 90/108, 33/36, and 16/18; combined ratio: (90 + 16 + 33)/(108 + 18 + 36) = 139/162 = 85.8%.]

Fig. 12. FSA results scenario. [Venn diagram: the defect population P contains the subsets C (checker flag), I (IDDQ flag), and N (no error at primary outputs); the shaded remainder represents security failures.]


Therefore, to calculate P(SecOp), the number of errors flagged by the checker, C, is divided by the number of defects that cause errors at the primary outputs, $\overline{N}$. It is important to emphasize that P(SecOp) is merely a relative figure of merit for the circuit and assumes that all defects have equal probability of occurring. It is intended that P(SecOp) be compared directly to results achieved for other circuit designs using this FSA system exclusively. A more general computation for P(SecOp), reflecting actual defect distributions, is presented at the end of this subsection.

Extending the figure of merit computation to include potential IDDQ testing provides useful information and can be accomplished in a straightforward manner. First, P(SecOp|I) is defined as the probability of secure operation provided by the code checking circuitry and an IDDQ checker. On the Venn diagram, the defects that do not cause security failures are those in the three subsets C, I, and N as in

$$C \cup N \cup I. \tag{3}$$

To compute P(SecOp|I), the number of detected defects, $(C \cup I)$, is determined by counting those that are detected either by the code checker, the IDDQ checker, or both. Then, this value is divided by the number of defects that cause errors, $\overline{N}$, to obtain P(SecOp|I). Typically, the value obtained for P(SecOp|I) is substantially higher than P(SecOp) and provides a strong indication of the benefits that can be gained by including IDDQ checking circuitry in the design.
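Both figures of merit reduce to set counting over the defect list. The sketch below is a minimal illustration, assuming each injected defect has a unique identifier and that every flagged defect also causes an error at the primary outputs (that is, C and I do not overlap N). The ten-defect population is hypothetical and does not come from the paper's results.

```python
def secure_operation_figures(population, checker_flagged, iddq_flagged, no_error):
    """Return (P(SecOp), P(SecOp|I)) from sets of defect identifiers.

    population      -- P, every injected defect
    checker_flagged -- C, defects flagged by the code checker
    iddq_flagged    -- I, defects flagged by an IDDQ checker
    no_error        -- N, defects causing no error at the primary outputs
    """
    causing_errors = population - no_error   # complement of N within P
    if not causing_errors:
        raise ValueError("no defect causes an output error")
    p_secop = len(checker_flagged) / len(causing_errors)
    p_secop_i = len(checker_flagged | iddq_flagged) / len(causing_errors)
    return p_secop, p_secop_i

# Hypothetical ten-defect population, for illustration only.
P = set(range(10))
N = {0, 1}             # no error at the primary outputs
C = {2, 3, 4, 5, 6}    # flagged by the code checker
I = {6, 7, 8}          # flagged by the IDDQ checker

p_secop, p_secop_i = secure_operation_figures(P, C, I, N)
print(p_secop, p_secop_i)   # 0.625 0.875
```

In this made-up population, defect 9 is the only one that escapes both checkers, while defects 7 and 8 are security failures only when no IDDQ checker is present.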

In the unlikely event that actual defect distributions are known based on information from an IC fabrication and test facility, a generalized P(SecOp(t)) can be computed for some point in time. For instance, assume that there is no IDDQ checker and that the hard gate–drain short has a known probability, $P_{GDS}(t)$, of existing anywhere within the circuit at a certain time, t. Then, using the results of the FSA, the probability of hard gate–drain shorts causing security failures, $P_{GDSSF}$, can be determined by dividing the number of security failures caused by hard gate–drain shorts by the number of those defects that were injected. Using these two values, the probability of secure operations at time t, assuming that the hard gate–drain defect is the only possible defect, can be computed by

$$P(SecOp(t), GDS) = 1 - [P_{GDS}(t) \cdot P_{GDSSF}]. \tag{4}$$

Of course, this computation could be extended to find an aggregate P(SecOp(t)), considering all possible defects, by applying Eq. (4) to each defect and summing the results. Furthermore, this expression could be extended in a straightforward manner to include the effects of an IDDQ checker, if available.
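A minimal sketch of this time-dependent computation follows. The single-defect function mirrors the expression for the hard gate–drain short. The aggregate function is only one possible reading of "summing the results" (it sums the per-defect failure terms and subtracts the total from one) and should be treated as an assumption. All numeric values are hypothetical.

```python
def failure_given_defect(security_failures, injected):
    """P_GDSSF-style ratio: security failures divided by injected defects of that type."""
    return security_failures / injected

def p_secop_single(p_defect_t, p_defect_sf):
    """P(SecOp(t)) when one defect type is the only possible defect."""
    return 1.0 - p_defect_t * p_defect_sf

def p_secop_aggregate(defect_terms):
    """Assumed aggregate: one minus the sum of per-defect failure terms.

    defect_terms -- iterable of (P_defect(t), P_defectSF) pairs
    """
    return 1.0 - sum(p_t * p_sf for p_t, p_sf in defect_terms)

# Hypothetical values: the short exists with probability 0.01 at time t,
# and 12 of 200 injected shorts caused security failures.
p_gdssf = failure_given_defect(12, 200)   # 0.06
print(f"{p_secop_single(0.01, p_gdssf):.4f}")   # 0.9994
```

The aggregate form is a reasonable approximation only while the individual defect probabilities are small enough that simultaneous defects can be neglected, in keeping with the single-defect assumption used throughout the analysis.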

4.3. The FSA tool

To implement the fault security analysis process described in this paper, a new software tool has been developed. Written using Tcl and the Tk toolkit, it is a graphical user interface based tool that can be run on virtually any computing platform. The FSA tool invokes specialized subroutines and commercial VHDL tools as required to accomplish all of the steps outlined in the previous subsection except for the technology defect analysis, coding of the defect-injectable VHDL models, and the FS circuit design. It is flexible enough such that new designs can be analyzed with minimal set-up time and support for various IC technologies is integral. Finally, it has features which enable simultaneous simulations to be conducted across multiple processors on a network. Thus, depending on available network resources, this reduces simulation time significantly, making the technique feasible for detailed analysis of moderate-sized circuit designs.
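The idea of farming out one simulation per injected defect can be illustrated with a short sketch. This is not the FSA tool's implementation, which is written in Tcl/Tk and drives a commercial VHDL simulator across networked processors. Here Python threads stand in for the remote processors, and `simulate_defect` is a hypothetical placeholder that fakes the outcome of a run.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_defect(defect_id):
    """Placeholder for one defect-injection run.

    A real run would launch the VHDL simulator with the defect-injectable
    model swapped in; the outcome is faked here so the sketch is runnable.
    """
    checker_flag = defect_id % 3 != 0   # hypothetical outcome
    output_error = defect_id % 5 != 0   # hypothetical outcome
    return defect_id, checker_flag, output_error

def run_defect_list(defect_ids, workers=4):
    """Run every defect simulation, several at a time, preserving list order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_defect, defect_ids))

results = run_defect_list(range(12))
flagged = sum(1 for _, checker_flag, _ in results if checker_flag)
print(len(results), flagged)
```

Because each single-defect simulation is independent of the others, the defect list can be split freely among workers, which is why the approach scales with the number of processors available.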

5. The ALU testbench and results

5.1. Testbench circuit description

To demonstrate the utility of the FSA tool and the analysis method presented in this paper, an 8-bit ALU testbench circuit, as per Fig. 13, has been synthesized from a behavioral VHDL description. The circuit implements eight typical ALU functions as selected by the three bit function select vector, S(2:0). The CLK signal drives registers that store values for A, B, S, and Cin at the input of the ALU. In total, the ALU circuit consists of 316 logic gates. A complementary ALU circuit has also been synthesized for use in a duplication with comparison checking scheme. The complement circuit, implemented with a different design architecture as a routine measure against common mode failures, has a total of 408 logic gates.

A total of three different FS circuit configurations were analyzed. The first circuit used a simple even parity checking scheme with the parity bit generated from a separate module that is similar to the ALU. The entire circuit, including the parity checker, consists of 3930 transistors. The second FS design follows a complementary duplication with comparison model. It uses a multiple-bit TSC two rail checker [28] to compare the results from the ALU and its complement circuit. This design has a total of 3952 transistors. The third circuit is similar to the second, except it uses a commercial TSC complementary code checker. This checker is somewhat larger and brings the circuit size up to 4964 transistors.
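The first two checking schemes can be sketched behaviorally as follows. This is an illustration under stated assumptions, not the gate-level design analyzed in the paper: the two rail cell uses the textbook equations f = x0·x1 + y0·y1 and g = x0·y1 + y0·x1, the cells are chained rather than arranged in a tree, and all function names and bit patterns are hypothetical.

```python
from functools import reduce

def even_parity_bit(bits):
    """Parity bit that makes the total number of 1s even."""
    return sum(bits) % 2

def parity_error(data_bits, parity_bit):
    """True when data plus parity bit contain an odd number of 1s."""
    return (sum(data_bits) + parity_bit) % 2 == 1

def two_rail_cell(a, b):
    """Combine two rail pairs (x, y) into one; a valid pair has x != y."""
    (x0, y0), (x1, y1) = a, b
    f = (x0 & x1) | (y0 & y1)
    g = (x0 & y1) | (y0 & x1)
    return f, g

def dwc_error(result_bits, complement_bits):
    """Duplication with comparison: error if the final rail pair is not complementary."""
    f, g = reduce(two_rail_cell, zip(result_bits, complement_bits))
    return f == g

result = [1, 0, 1, 1, 0, 0, 1, 0]        # hypothetical ALU output
complement = [1 - b for b in result]     # output of the complement circuit
corrupted = [0] + result[1:]             # single-bit error in the ALU output

print(parity_error(result, even_parity_bit(result)), dwc_error(result, complement))
print(parity_error(corrupted, even_parity_bit(result)), dwc_error(corrupted, complement))
```

A single-bit error is caught by both schemes in this example. An error affecting an even number of result bits would escape the parity check but not the two rail comparison, which is one reason the duplication scheme is expected to give a higher P(SecOp).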

5.2. Parity checker configuration

For the parity checker circuit, four separate test scenarios were analyzed. In the first test, 100 pseudorandom test vectors were generated. These were simulated for every transistor defect. The clock frequency was set to 41.7 MHz, which is about 75% of the maximum clock speed for this design. Then, the test was repeated with an augmented test set consisting of 200 test vectors. The final two test scenarios were identical to the first two except that the circuit clock was increased to 55.6 MHz. This clock speed was found to be near the maximum that would allow signals to propagate to the outputs for all input patterns. The results computed for probability of secure operations, P(SecOp), and the effective defect coverage of the test pattern set, Ceff, are presented in Table 2.

Fig. 13. ALU testbench circuit. [Block diagram: 8-bit ALU with inputs A(7:0), B(7:0), Cin, S(2:0), and CLK; outputs Result(7:0), Sign, and Cout.]

As expected, these results show that the defect coverage is higher for the larger test set at both clock speeds. However, by noting that Ceff increases by only 1.3% from 100 to 200 test vectors, we know that we are using a statistically sufficient number of test patterns. Also, as the defect coverage increases, the probability of secure operation decreases. Again, this is the intuitive result since the larger set of test vectors should excite more defects to the error state. Comparing P(SecOp) at the different clock speeds shows that more security failures occur at the faster clock rate. This result is due to the effects of delay faults along the critical path in the circuit. For this ALU design, the critical path is exercised only when the input vector specifies the addition operation and causes the carry bit, Cout, to be set. This difference in P(SecOp) would be substantially larger in a circuit with a more frequently exercised critical path.
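The dependence of delay faults on clock speed can be illustrated with a simple timing check. The two clock frequencies are those used in the parity checker tests; the path delay and the delay added by the defect are hypothetical values chosen for illustration, and register setup time is ignored.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

def delay_fault_observed(path_delay_ns, added_delay_ns, freq_mhz):
    """True if the slowed signal misses the clock edge (setup time ignored)."""
    return path_delay_ns + added_delay_ns > clock_period_ns(freq_mhz)

# Hypothetical values: a 17.5 ns critical path and a defect adding 3 ns.
for freq in (41.7, 55.6):
    print(f"{freq} MHz: period {clock_period_ns(freq):.1f} ns, "
          f"fault observed: {delay_fault_observed(17.5, 3.0, freq)}")
```

Under these assumed delays the defect stays dormant at 41.7 MHz, where the period is about 24 ns, but produces an error at 55.6 MHz, where the period is about 18 ns. The same defect on a shorter path would remain dormant at both speeds.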

5.3. Two rail TSC checker configuration

To allow direct comparison to results achieved for the parity checker configuration, the two rail checker circuit was tested with the same two test pattern sets. Since the two rail checker has more levels of logic than the parity checker, the propagation time required for results to reach the checker output is slightly increased. Simulating the defect-free circuit showed that a 45.4 MHz clock speed would be sufficient to allow the signals to reach the checker outputs. After conducting a fault security analysis with the two test sets, the tests were repeated under a two defect scenario. The second defect was held constant in the checker module while the entire defect list was applied for the rest of the circuit. The results obtained for these four test scenarios are shown in Table 3.

Table 2
Parity checker configuration results

                 41.7 MHz               55.6 MHz
                 P(SecOp)    Ceff       P(SecOp)    Ceff
100 Vectors      83.2%       96.8%      82.5%       96.8%
200 Vectors      82.5%       98.1%      81.7%       98.1%

Table 3
Two rail checker configuration results

                 One defect, 45.4 MHz   Two defects, 45.4 MHz
                 P(SecOp)    Ceff       P(SecOp)    Ceff
100 Vectors      94.0%       94.9%      79.5%       N/A
200 Vectors      94.0%       97.3%      77.1%       N/A

With a single defect, the effective coverage is again better for the 200 vector test pattern set. Furthermore, due to the TSC property of the two rail checker, significantly better results are achieved than for the parity checker circuit. For the cases where there are two defects, the defect coverage has been marked as not applicable because the constant logical defect in the checker causes deceivingly perfect defect coverage. The results also show that P(SecOp) decreases significantly when there are two defects.

5.4. Commercial TSC code checker configuration

The test scenarios used for the commercial TSC code checker circuit were identical to those used to test the two rail checker configuration. The results are shown in Table 4.

The results clearly show that this checker provides fault security against the two defects and performs significantly better than both of the other checking systems. The cost associated with this increased security is about 1000 transistors, an increase of about 25% over the other checking schemes. In a more realistic, larger VLSI circuit, this increased overhead would likely be insignificant.
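The overhead figures quoted here follow directly from the transistor counts given in Section 5.1. The short sketch below reproduces them; the dictionary keys are illustrative labels only.

```python
# Transistor counts for the three FS configurations (Section 5.1).
transistors = {"parity": 3930, "two_rail": 3952, "commercial_tsc": 4964}

def overhead(base, larger):
    """Return (extra transistors, relative increase) of `larger` over `base`."""
    extra = larger - base
    return extra, extra / base

for name in ("parity", "two_rail"):
    extra, ratio = overhead(transistors[name], transistors["commercial_tsc"])
    print(f"commercial TSC vs {name}: +{extra} transistors ({ratio:.1%})")
```

The commercial checker adds 1034 transistors (26.3%) relative to the parity configuration and 1012 transistors (25.6%) relative to the two rail configuration, consistent with the rounded figures in the text.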

5.5. Effect of IDDQ testing on testbench circuit

As a final point of consideration, the effect of the IDDQ testing will be examined with respect to its impact on fault security. Since the commercial TSC code checker circuit yielded acceptable results already for our fault model, it is concluded that no further benefit can be gained from IDDQ testing other than for the detection of dormant defects. However, it should be pointed out that there are important defect classes such as interconnect bridges and breaks that have not yet been considered, which may cause the commercial TSC code checker to fail. Notwithstanding that, significant improvement can be achieved with the addition of online IDDQ checking circuitry in the other two configurations. The results obtained from tests using the 200 vector test pattern set are shown in Table 5.

Table 4
Commercial TSC checker configuration results

                 One defect, 45.4 MHz   Two defects, 45.4 MHz
                 P(SecOp)    Ceff       P(SecOp)    Ceff
100 Vectors      100%        95.6%      100%        N/A
200 Vectors      100%        96.1%      100%        N/A

Table 5
Effect of IDDQ testing (200 vector test set)

                 Parity checker           Two rail checker
                 41.7 MHz    55.6 MHz     1 defect    2 defects
P(SecOp)         82.5%       81.7%        94.0%       77.1%
P(SecOp|I)       96.0%       95.6%        99.9%       94.7%
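The size of the improvement can be read off Table 5 by subtracting P(SecOp) from P(SecOp|I) for each scenario. The sketch below does this using the table values; the scenario labels are illustrative.

```python
# (P(SecOp), P(SecOp|I)) in percent, taken from Table 5 (200 vector test set).
table5 = {
    "parity checker, 41.7 MHz": (82.5, 96.0),
    "parity checker, 55.6 MHz": (81.7, 95.6),
    "two rail checker, 1 defect": (94.0, 99.9),
    "two rail checker, 2 defects": (77.1, 94.7),
}

# Gain in percentage points from adding an IDDQ checker.
gains = {name: round(with_i - without_i, 1)
         for name, (without_i, with_i) in table5.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

The gains are 13.5, 13.9, 5.9, and 17.6 percentage points respectively, so the two defect scenario for the two rail checker benefits the most from IDDQ checking.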

6. Future work

The approach presented in this paper represents only part of a comprehensive fault security analysis system. Extensive fault security analysis must consider interconnect defects such as bridges and open circuits as well as transistor defects. Research is presently underway to develop similar VHDL models for interconnect defects based on analog simulation results. These models will be integrated directly into the FSA tool upon completion.

References

[1] K. Seshan, T.J. Maloney, K.J. Wu, The quality and reliability of Intel’s quarter micron process, Int. Technol. J. Q3 (1998).

[2] D.A. Anderson, G. Metze, Design of totally self-checking check circuits for m-out-of-n codes, IEEE Trans. Comput. 3 (1973) 263–269.

[3] C. Bolchini, F. Salice, D. Sciuto, A TSC evaluation function for combinational circuits, Proceedings of the International Conference on Computer Design (ICCD)’97, Austin, U.S.A., 1997, pp. 555–560.

[4] R.L. Wadsack, Fault modeling and logic simulation of CMOS and MOS integrated circuits, Bell System Tech. J. 57 (5) (1978) 1449–1474.

[5] J. Galiay, Y. Crouzet, M. Vergniault, Physical versus logical fault models MOS LSI circuits: impact on their testability, IEEE Trans. Comput. C-29 (1980) 527–531.

[6] N. Burgess, R.I. Damper, S.J. Shaw, D.R.J. Wilkins, Faults and fault effects in NMOS circuits-impact on design for testability, IEEE Proc. 132 (3) (1985).

[7] C.F. Hawkins, J.M. Soden, Electrical characteristics and testing considerations for gate oxide shorts in CMOS ICs, Proceedings of the International Test Conference, IEEE, November 1985, pp. 544–555.

[8] D.M. Wu, Can IDDQ test replace conventional stuck-fault test? Proceedings of the IEEE Custom Integrated Circuits Conference, 1991, pp. 13.2.1–13.2.4.

[9] C. Rozon, D. Al-Khalili, J. Coppens, M. Hossain, D. Racz, Fault modes for VLSI circuits, Tech. Report CSE-13, Dept. of Electrical and Computer Engineering, Royal Military College of Canada, 1997.

[10] J.M. Soden, C.F. Hawkins, Test considerations for gate oxide shorts in CMOS ICs, IEEE Des. Test Comput. 3 (4) (1986) 56–64.

[11] M. Hossain, D. Al-Khalili, C. Rozon, Defect modeling and testability analysis of CMOS circuits, Technical Report, CSE-6, Department of Electrical and Computer Engineering, Royal Military College of Canada, July 1996, 48 pp.

[12] M. Syrzycki, Modeling of gate oxide shorts in MOS transistors, IEEE Trans. Computer-Aided Des. 8 (3) (1989) 193–202.

[13] R. Rodriguez-Montanes, J.A. Segura, V.H. Champac, J. Figueras, J.A. Rubio, Current vs. logic testing of gate oxide short, floating gate and bridging failures in CMOS, Proceedings of the International Test Conference, IEEE, 1991, pp. 510–519.

[14] F. Anderson, Emitter coupled logic and cascode current switch testability and design for test, Proceedings of the 1988 IEEE Southern Tier Technical Conference, 1988, pp. 119–126.

[15] D.M. Wu, Can IDDQ test replace conventional stuck-fault test, Proceedings of the IEEE 1991 Custom Integrated Circuits Conference, 1991, pp. 13.2.1–13.2.4.

[16] K. Roy, M.E. Levitt, J.A. Abraham, Test considerations for BiCMOS logic families, Proceedings of the IEEE 1991 Custom Integrated Circuits Conference, 1991, pp. 17.2.1–17.2.4.

[17] M. Syrzycki, Modeling of gate oxide shorts in MOS transistors, IEEE Trans. Computer-Aided Des. 8 (1989) 193–202.

[18] J. Segura, A. Rubio, A detailed analysis of CMOS SRAM’s with gate oxide short defects, IEEE J. Solid-State Circuits 32 (10) (1997) 1513–1550.

[19] F.J. Ferguson, J.P. Shen, Extraction and simulation of realistic CMOS faults using inductive fault analysis, Proceedings of the International Test Conference, IEEE, New York, 1998, pp. 475–484.

[20] P.C. Ward, J.R. Armstrong, Behavioral fault simulation in VHDL, Proceedings of the 27th IEEE/ACM Design Automation Conference, IEEE CS Press, Silver Spring, MD, 1990, pp. 587–593.

[21] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, J. Karlsson, Fault injection into VHDL models: the MEFISTO tool, Proceedings of the 24th International Symposium Fault-Tolerant Computing, IEEE, 1994, pp. 66–75.

[22] T.A. DeLong, B.W. Johnson, J.A. Profeta, A fault injection technique for VHDL behavioral-level models, IEEE Des. Test Comput. (1996) 24–33.

[23] J.M. Coppens, D. Al-Khalili, C. Rozon, VHDL modeling and analysis of fault secure systems, Proceedings of the 1998 Design Automation Test in Europe, IEEE, New York, 1998, pp. 148–152.

[24] J.M. Coppens, Logical fault analysis of fault secure systems using VHDL, M.E. Thesis, Royal Military College of Canada, 1997.

[25] R.J. Baucom, T.A. DeLong, D.T. Smith, B.W. Johnson, VHDL-based distributed fault simulation using SAVANT, Proceedings of the National Aerospace and Electronics Conference, July 13–17, 1998.

[26] M.A. Gharaybeh, M.L. Bushnell, V.D. Agrawal, Classification and test generation for path-delay faults using single stuck-fault tests, Proceedings of the International Test Conference, IEEE, New York, 1995, pp. 139–148.

[27] C.C. Liaw, S.Y.H. Su, Y.K. Malaiya, Test generation for delay faults using stuck-at-fault test set, Digest of papers, 1980 Test Conference, IEEE, New York, 1980, pp. 167–175.

[28] B.W. Johnson, Design and Analysis of Fault-Tolerant Digital Systems, Addison-Wesley Publishing Company, Inc., Reading, MA, 1984, 584 pp.

Dr. Dhamin Al-Khalili received the B.Sc. degree in 1966 and the M.Sc. and Ph.D. degrees in electrical engineering from the University of Manchester, U.K., in 1970 and 1972 respectively. He joined the Ontario Centre for Microelectronics, OCM, in 1982, as a Senior Consultant, then worked with the Canadian Semiconductor Design Association as a Research Director. Before joining OCM, he worked at the Winnipeg Microelectronics Center as Senior Engineer and Project Leader. He also worked for one year as a Research Industrial Fellow at Northern Telecom, Ottawa. He is presently a Professor with the Department of Electrical and Computer Engineering at the Royal Military College of Canada and Adjunct Professor at Concordia University, Montreal. His research interests include VLSI architecture, low power electronics, testability analysis and design automation.

Dr. Côme N. Rozon received the M.Sc. degree (June 1977) in solid state physics from Sherbrooke University, Sherbrooke, Que., Canada, and the Ph.D. degree (August 1987) in Electrical Engineering from Queen’s University, Kingston, Ont., Canada. From 1975 to 1983 he served as a Combat Systems Engineer in the Royal Canadian Navy and retired at the rank of Naval Lieutenant. In 1983 he joined the teaching staff of the Electrical & Computer Engineering Department at the Royal Military College of Canada, Kingston, Ont. He has worked as a consultant for Newbridge Networks and Nortel Networks. He has teaching commitments in electronics and digital design, and research interests in VLSI and Design For Testability applied to both binary and multi-valued logic systems. He has held the position of Director of Computing Services at the Royal Military College of Canada and currently he is head of the Department of Electrical and Computer Engineering at the same institution.


Donald B. Shaw received B.Sc. (Electrical Engineering) and M.Sc. (Computer Engineering) degrees from the University of Manitoba, Winnipeg, MB, Canada in 1994 and 1997 respectively. In 2001, he received the Ph.D. degree from the Royal Military College of Canada, conducting research in CMOS device/interconnect defects, fault modeling and fault secure systems. He is currently employed as a video ASIC design engineer at Gennum Corporation in Burlington, Ontario, Canada.

