

The inadequacy of the stuck-at fault model for testing MOS LSI circuits: a review of MOS failure mechanisms and some implications for computer-aided design and test of MOS LSI circuits

N. Burgess and R.I. Damper

Indexing terms: Fault testing, Modelling

Abstract: The stuck-at fault model is widely used as the basis for automatic test pattern generation in digital circuit testing, for example the D-algorithm. However, there have been growing doubts over the ability of the model to cover faults that occur in MOS LSI circuits. The paper consists of a review of the failure mechanisms that produce faults in MOS LSI circuits, a discussion of the problems that arise when using the stuck-at fault model to test MOS LSI circuits and a set of guidelines for the future development of computer-aided design and test of such circuits.

1 Introduction

Integrated circuits are designed, fabricated and tested by three separate groups of engineers. These three groups need to understand one another's disciplines as never before in order to exploit the possibilities offered by LSI/VLSI technology. For example, if integrated-circuit designers take account in their designs of the problems faced by the processing and test engineers, LSI circuits with a high yield and reliability will be produced. Similarly, if test engineers are to develop test sets with a high fault coverage, they need to have a working knowledge of the faults that may be introduced into integrated circuits and their associated fault effects. The stuck-at fault model, devised in 1959 [1], forms the basis of most automatic test pattern generation algorithms, for example the D-algorithm [2]. However, there have been growing doubts over the ability of the stuck-at fault model to cover the faults that occur in MOS LSI circuits [3-7]. This paper has been written for test engineers unfamiliar with the details of MOS LSI silicon processing, and reviews the failure mechanisms that produce faults in MOS circuits (Section 2), highlighting those introduced by LSI processing technology (Section 3). An extensive list of References is supplied, making this whole area accessible to the test engineer. Finally, the problems of testing MOS LSI circuits are discussed in the light of the failure mechanisms review, and a set of implications for the future development of MOS LSI computer-aided circuit design and test is presented (Section 4).

2 Process-related faults

MOS circuits fail in a wide variety of ways [8-12], of which the most commonly reported are gate-oxide shorts, metallisation failures (including Al-Si contact window failures), threshold voltage shifts and various packaging- and assembly-induced failures. Those papers [4, 13-18] that give relative incidences of failure modes for a given device, process and processing plant do not, when taken as a whole, provide a clear picture of overall relative incidences of MOS failures. This is as much due to the different authors' classifications of different failure modes as to the wide variations in reported incidences which, in turn, result from the different devices and samples studied by each author. For example, Galiay [6] studied failed microprocessors and reported that 55% of them failed as a result of faults in the chips' metallisations. However, Peck [18] found a similar failure rate in memory chips due to oxide defects. Pappu [16], on the other hand, reported that 25% of memories failed as a result of oxide defects, while 45% failed as a result of 'surface defects' (presumably surface potential instability problems). Also, Johnson and Stitch [15] show the manufacturer dependency of relative failure incidences in 'identical' devices. In the following discussion, details of the more common MOS failure mechanisms are presented with some discussion of their likely fault effects in digital circuit performance.

Paper SM78, received 14th February 1984. The authors are with the Department of Electronics & Information Engineering, University of Southampton, Southampton SO9 5NH, England.

2.1 Photolithography-related failures

MOS integrated circuits are fabricated as a sequence of up to nine patterned layers on the surface of a silicon wafer. These patterns are transferred from a set of masks to the silicon surface by a photolithographic process that has become standardised in industry [19].

Faults may be introduced during the photolithography process by any of the following mechanisms:

(i) Mask defects (extra or missing details), dirt and photoresist impurities can result in open- and short-circuits, poor ohmic contacts between layers of interconnect and increased leakage currents, due to the creation of extra conductive paths in the substrate. These defects occur randomly across the mask; thus whether or not a defect has any effect on a circuit's performance depends on its location.

Software & Microsystems, Vol. 3, No. 2, April 1984

(ii) Dirt on the wafer surface may also affect the etching process by creating defects in the photoresist, or by covering an area of material that is to be etched, producing pinholes in the material under the photoresist.

(iii) Other etching problems include underetching, leading to high-resistance or open-circuit contact windows and incomplete diffusions, and overetching, leading to extended diffusion areas whose associated depletion layers may short together.

(iv) Finally, the linewidth of a detail on the mask and the linewidth of the same detail on the processed chip can differ by 1 μm or more. This is due to the wet, isotropic etching of the material under the photoresist after the resist has itself been etched. This variation in linewidth affects the gain of the MOSFETs on the chip, although it may be allowed for by making the relevant shapes on the mask a little larger or smaller. However, the final size of the processed transistor is not precisely predictable, and thus the transistor's gain (or drive capability) is not guaranteed. This problem is exacerbated at smaller linewidths because the error of 1 μm does not scale with the linewidth of the process technology unless anisotropic etching techniques are used.
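The scaling argument in (iv) can be made concrete with a little arithmetic. The sketch below assumes the usual first-order relation that MOSFET gain is proportional to W/L, and that the processed channel ends up a fixed 1 μm shorter than drawn while the width is unchanged. The function name and the drawn lengths are illustrative assumptions, not values taken from the paper.

```python
def gain_change(drawn_length_um, etch_error_um=1.0):
    """Fractional change in W/L (and so in first-order gain) when the processed
    channel is etch_error_um shorter than drawn; the width is assumed unchanged."""
    processed_length_um = drawn_length_um - etch_error_um
    if processed_length_um <= 0:
        raise ValueError("etch error exceeds drawn length")
    return drawn_length_um / processed_length_um - 1.0


if __name__ == "__main__":
    # Illustrative drawn channel lengths in micrometres (assumed values).
    for drawn in (10.0, 5.0, 3.0):
        print(f"drawn L = {drawn:4.1f} um -> gain change {gain_change(drawn):+.0%}")
```

Under these assumptions the same 1 μm error changes the gain by about 11% at a 10 μm drawn length but by 50% at 3 μm, which is why the problem worsens as linewidths shrink.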

2.2 Gate-oxide-related failures [20]

Many device failures can be attributed to oxide-related defects such as:

(i) ionic contamination of the gate oxide

(ii) charge accumulations at the oxide-silicon interface ('slow trapping')

(iii) lateral spreading of charge across the field oxide

(iv) pinholes in, or dielectric breakdown of, the gate-oxide layers.

These last two mechanisms are usually caused by faulty processing and result in shorts between the gate electrode and the channel, source or drain of a device. (Electrical screening at higher-than-rated value is performed to eliminate devices which have oxide regions of only marginal quality.)

The contamination of silicon dioxide by mobile alkali ions (particularly sodium) has been a major problem in MOS integrated-circuit manufacture. Sodium is one of the most abundant elements and appears in the oxide after the etching of the aluminium layer, which itself contains sodium ions. Under the influence of the applied gate voltage, the alkali ions move toward the oxide-silicon interface under the gate regions, where they accumulate in the potential well (Fig. 1). This accumulation of positive charge induces negative charge at the 'top' of the silicon in the channel of the NMOS device, resulting in a decrease in the device's threshold voltage or an increase in the subthreshold conduction.

Fig. 1 Ion accumulation at the silicon dioxide-substrate interface (figure not reproduced; labels: potential, Na+, silicon dioxide, silicon substrate)

Two other charge accumulation mechanisms have been identified. First, a fixed surface-state charge is always produced because of the way in which the thermal oxidation process terminates the silicon lattice and forms the oxide. This fixed charge is positive, very close to the interface and its magnitude is independent of the oxide thickness. This phenomenon is well known and taken into account in calculating threshold voltages of transistors. The second mechanism, 'slow trapping', involves interfacial trap sites that capture holes and which are believed to be caused by incomplete silicon-oxygen bonds. These traps can capture charges such as are produced by thermal electron-hole generation or by ionising radiation in the oxide, or charge carriers that have tunnelled from the silicon surface. Both these mechanisms result in a reduced threshold voltage or increased subthreshold conduction for the same reasons as were previously mentioned in connection with the ionic contamination.
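The direction of these threshold shifts follows from the standard first-order MOS relation, given here as a sketch. The symbols are assumptions introduced for illustration and do not appear in the original: Q_s is a sheet charge per unit area at the oxide-silicon interface, C_ox the gate-oxide capacitance per unit area, t_ox the oxide thickness, and ρ(x) a volume charge density in the oxide with x measured from the gate electrode.

```latex
% Sheet charge Q_s located at the oxide-silicon interface:
\Delta V_T = -\frac{Q_s}{C_{ox}}, \qquad C_{ox} = \frac{\varepsilon_{ox}}{t_{ox}}

% Charge distributed through the oxide, weighted by its distance x from the gate:
\Delta V_T = -\frac{1}{C_{ox}} \int_0^{t_{ox}} \frac{x}{t_{ox}}\,\rho(x)\,\mathrm{d}x
```

Positive charge therefore lowers the threshold voltage of an n-channel device, and the weighting factor x/t_ox shows why mobile ions have their largest effect once they have drifted to the interface.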

Under certain circumstances, the silicon-dioxide 'upper' surface may become conductive as a result of positive charge spreading out from positively biased metal lines. This positive charge is transported by lateral ion movement or by the surface becoming conductive in the presence of moisture. In this latter case, leakage currents between neighbouring conductors can be created. Usually, however, the charge induces extended inversion layers in the substrate below the field oxide, producing either high leakage currents (away from the channel) or a conductive path between two diffused regions, resulting in a short-circuit.

2.3 Metallisation-related failures [11, 12]

The following metallisation failures have all been reported:

(i) microcracks in the aluminium lines where an oxide step is crossed

(ii) breaks and 'hillocks' in the aluminium lines due to electromigration (sometimes producing shorts between adjacent aluminium lines also)

(iii) faulty etching of the metallisation

(iv) corrosion of the metal as a result of moisture within the chip


(v) contact window failures due to the misalignments or the Al-Si interactions that can take place, resulting in high resistance contacts or no contact at all.

Almost all of these failures may be reasonably characterised as shorts or opens in the metal lines.

2.4 Other failures

Silicon bulk crystal defects are a further source of circuit failures, and are either introduced during crystal growth or else are process induced. These defects include dislocations, stacking faults, swirl defects and various point defects. Their major fault effect in MOS circuits is junction leakage (in bipolar technologies, they produce transistor 'pipes'), leading, for example, to charge-storage failure in a dynamic RAM [21]. However, they also produce a reduction in channel carrier mobility and an increase in the threshold voltage of a device [22], although these last two effects are only likely to be a problem in submicrometre VLSI transistors.

Cracks or pits in the passivation layer are, in general, caused by mishandling or by a mismatch between the thermal expansion coefficients of silicon and the glass passivation layer, thus introducing stress and strain during subsequent thermal processing. Defects of this kind are usually spotted during visual inspection before packaging and, as such, tend not to be a testing problem.

Packaging- and assembly-related failures are also not a major test problem because their effects are, on the whole, so catastrophic, resulting in pin shorts or opens. Examples include the 'purple plague', a gold-aluminium reaction that results in an open-circuit instead of a bond between the silicon chip and integrated-circuit external pin, some forms of corrosion, which may be related to the hermeticity of the encapsulation, or breakage of the device as a result of mechanical shock.

3 Faults in scaled-down devices

The increasing integration of devices onto silicon chips has been achieved by reducing the dimensions of the devices (minimum feature size 3-5 μm) and by developing new processing techniques. These trends introduce new failure mechanisms as well as exacerbating existing ones, but the outlook is not entirely gloomy. For example, high-performance MOS technology has been developed by reducing the channel length, gate-oxide thickness and junction depth of the MOS device, while retaining the same supply voltages to maintain compatibility with other MOS technologies and to keep noise problems to a minimum [23]. The increased field across the gate oxide will thus accelerate time-dependent breakdown or degradation of MOS devices, as discussed previously. However, the use of polysilicon gate electrodes and of phosphosilicate glass 'gettering' layers with silicon nitride reduces both the extent of the ionic (Na+) contamination and its mobility.

3.1 Failures introduced by LSI processing technology

Anisotropic etching has become widely used in LSI integrated-circuit fabrication because it greatly reduces the 'sideways etching' that leads to large discrepancies between the linewidths on the mask and those on the chip. However, anisotropic etching has its associated problems, the main one being that 'fillets' may be left between tracks over a step, shorting them together (see Fig. 2). This is a particular problem in the etching of the polysilicon layer, where anisotropic etching is used to make sure that the transistor dimensions on the chip do not differ from those on the mask.

Fig. 2 Production of 'fillets' by anisotropic etching (figure not reproduced; panels: before etching, after etching; labels: track, 'fillet', step)

LSI circuit fabrication would not have been achieved without ion implantation being used instead of diffusion. Using ion implantation, it became relatively easy to control the low-concentration doping of the silicon substrate necessary for threshold-voltage control of MOSFETs, and it became possible to use 'self-alignment' techniques rather than photolithography steps, thus eliminating the photolithography-related failures from certain steps of the fabrication process (Section 2.1). However, implantation, together with the dramatic heat changes used in silicon-chip processing, produces dislocations, stacking faults and other crystallographic faults in the silicon substrate. These defects are often 'decorated' by impurity atoms (for example, copper, iron, gold, etc.), and, besides producing excessive p-n junction leakage currents, can, if particularly numerous, increase the threshold voltage and reduce the transconductance of a MOSFET as noted earlier.

3.2 Failures in LSI metallisation

The use of a smaller linewidth for the metallisation exacerbates existing failure modes such as electromigration and contact window failures [24]. The use of multilevel metallisation schemes has given some of these an added significance. In order to avoid electromigration problems in the smaller lines, the aluminium tracks are made 'taller': this means that the second-level metal lines have to cross steeper, bigger, oxide steps, thus increasing the chances of a microcrack.

The two levels of metallisation are connected by contact windows called 'vias'. Any contamination or oxidation of the exposed aluminium in the first layer will produce high-resistance contacts or smaller-than-expected 'via' areas leading to electromigration-related failures. The smaller contact windows between the first metal layer and the silicon also increase the incidence of failed contacts, but the use of metal silicides, or alloys, together with heat-treatment techniques has reduced the likelihood of this failure's occurrence [25]. Breakdowns of the dielectric between conducting levels may also occur [26].

3.3 Failures in LSI memories

Alpha-particle-induced errors have received wide attention since their discovery in dynamic RAMs [27]. An alpha particle entering the silicon substrate with an energy of 5 MeV penetrates to an average depth of 25 μm, generating some 2.5 × 10^6 electron-hole pairs along its path as it loses energy. In n-channel dynamic RAMs, the electrons collect in the storage cell and the holes disperse in the substrate, and, as a result, the information stored (as presence or absence of charge) may be lost because of the smaller amounts of charge used in larger RAMs. These 'soft' errors are random, recoverable (if detected), and it is only possible to test a cell's liability to be so affected.

It has also been found that interactions among neighbouring memory cells are common. These interactions are pattern sensitive, i.e. dependent on the data being stored, and are so common that testing strategies have been devised to test for them (WALKPAT, GALPAT etc.). The physical mechanism causing these interactions is capacitative coupling between physically adjacent memory cells, and they have appeared in memories because memory processing technology is a relatively advanced art, utilising the smallest linewidths. These interactions are starting to appear in other devices, for example the register arrays of microprocessors that have passed their manufacturers' quality assurance tests [28], and will become more and more common as the scale of integration increases.
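The way a walking-pattern test exposes such interactions can be sketched as follows. This is a simplified illustration in the spirit of WALKPAT, not the published algorithm: the complementary pass is omitted, and the `Memory` class with its `coupling` map (a 0-to-1 write to an aggressor cell also sets a victim cell) is a hypothetical fault model introduced only for this example.

```python
class Memory:
    """Hypothetical bit-addressable memory model with optional coupling faults."""

    def __init__(self, size, coupling=None):
        self.size = size
        self.cells = [0] * size
        # aggressor address -> victim address: a 0->1 write to the aggressor
        # also forces the victim to 1 (assumed fault behaviour).
        self.coupling = coupling or {}

    def write(self, addr, value):
        rising = self.cells[addr] == 0 and value == 1
        self.cells[addr] = value
        if rising and addr in self.coupling:
            self.cells[self.coupling[addr]] = 1

    def read(self, addr):
        return self.cells[addr]


def cell_by_cell(mem):
    """Write and read back each cell in isolation; returns failing addresses."""
    failures = []
    for addr in range(mem.size):
        for value in (1, 0):
            mem.write(addr, value)
            if mem.read(addr) != value:
                failures.append(addr)
    return failures


def walking_ones(mem):
    """Walk a single 1 through a background of 0s, checking every other cell.
    Returns (base, disturbed_address) pairs."""
    failures = []
    for addr in range(mem.size):
        mem.write(addr, 0)
    for base in range(mem.size):
        mem.write(base, 1)
        for other in range(mem.size):
            if other != base and mem.read(other) != 0:
                failures.append((base, other))
        if mem.read(base) != 1:
            failures.append((base, base))
        mem.write(base, 0)
    return failures
```

With a coupling fault from cell 2 to cell 5, `cell_by_cell(Memory(8, {2: 5}))` reports nothing, whereas `walking_ones(Memory(8, {2: 5}))` reports cell 5 disturbed when the 1 is at cell 2. The price is a test length that grows with the square of the memory size.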

3.4 Failures in short-channel VLSI devices [24, 29-32]

There are three widely reported failure mechanisms in VLSI devices (feature size < 1.5 μm) that result from the increased electric fields within the scaled-down MOSFET:

(i) hot electron injection into the gate oxide

(ii) drain-source breakdown resulting from the activation of the parasitic bipolar device

(iii) punch-through.

In saturation, an NMOS transistor has a large electric field in the drain depletion region, which produces greatly accelerated electrons near the drain. Here, impact ionisation collisions scatter these 'hot' electrons, which may have sufficient energy to surmount the oxide-silicon potential barrier and may then be trapped in the gate oxide. ('Hot' electrons may also be produced from thermal hole-electron pair generation in the substrate, from where some electrons, and holes, may tunnel into the gate oxide.) Over a period of time, the trapped charge will cause instability in the form of a threshold voltage increase. The trap density in the oxide has been increased in VLSI devices as a result of damage caused by electron-beam lithography.

Fig. 3 Short-channel effects in MOSFETs (figure not reproduced; labels: gate, drain, p- substrate, depletion region)

Fig. 4 Example of fault obscured by gate-level representation of a circuit (after Reference 6) (figure not reproduced)

An n-p-n parasitic bipolar transistor exists in the n-channel MOSFET structure (Fig. 3), and may become active in short-channel devices, as follows. The above-mentioned impact ionisation collisions also produce hole currents that flow in the substrate (the electrons flow to the drain) eventually forward biasing the source-substrate junction. This means a direct electron current flows between the source and the drain which can result in the avalanche breakdown of the drain junction.

The third short-channel effect is punch-through, and is similar in effect to the bipolar 'latch-up' phenomenon just discussed. The depletion layer associated with the drain spreads across the channel and reaches the source depletion layer, producing a large increase in the subthreshold leakage current. In a long-channel device, the drain and source depletion layers do not penetrate very far into the channel region, and thus the channel potential barrier is solely a function of the applied gate voltage along most of the device's length. However, in short-channel devices where the source and drain depletion regions have merged, the channel potential barrier is lowered, resulting in an increased subthreshold leakage current. As the drain voltage is increased, extra charge is imaged in the source region and produces a still larger subthreshold current.

Design rules exist to minimise these effects, but process variations in the channel length can lead to a fraction of the devices having a shorter channel length than the design rules allow, thus producing faulty devices.

4 Implications for testing

As noted in the Introduction, there has been growing concern over the validity of the 'single stuck-at' fault model for use in testing MOS LSI circuits. This Section first discusses these doubts, and then goes on to consider some of the conditions necessary for the production of 'reasonable' length test input sequences that will cover the faults described in the preceding Sections.

4.1 Shortcomings of the stuck-at model

The single 'stuck-at' fault model postulates that a circuit will only fail in such a way that one logic gate input or output becomes permanently fixed at logical 1 or at logical 0. The problem with this approach to testing MOS LSI circuits is that the physical faults that can occur (Sections 2 and 3) do not necessarily manifest themselves as 'stuck' nodes in a gate-level description of the circuit for two reasons.

First, some physical failures in the silicon chip do not produce 'stuck' nodes [4, 33]. They might produce a short-circuit between adjacent tracks, or some nearest-neighbour interaction between improperly isolated transistors; they might produce a transistor that only just turns on or off as a result of a shifted threshold voltage, or which takes longer to switch states than the clock rate allows for, due to some leakage current mechanism. Obviously, these failures do not result in 'stuck node' faults, although it should be possible to test for them if they are built into the fault model that is guiding the test pattern generation. Indeed, the 'stuck-at' fault model is now routinely expanded to include bridging faults between adjacent tracks of printed-circuit boards so that the fault coverage of the test set is enhanced [34].
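A small fault simulation illustrates the point that full stuck-at coverage does not imply coverage of a bridging fault. The sketch below is an assumed example, not a circuit from the paper: two 2-input AND gates (y1 = a AND b, y2 = c AND d), with a hypothetical short between the adjacent input tracks b and c modelled as a wired-AND. All names are illustrative.

```python
LINES = ("a", "b", "c", "d", "y1", "y2")


def simulate(vector, stuck=None, bridge=None):
    """Evaluate y1 = a AND b, y2 = c AND d for one input vector (a, b, c, d).

    stuck  -- optional (line, value): that line is held at value
    bridge -- optional pair of input lines shorted together (assumed wired-AND)
    """
    v = dict(zip("abcd", vector))
    if bridge:
        p, q = bridge
        v[p] = v[q] = v[p] & v[q]
    if stuck and stuck[0] in v:
        v[stuck[0]] = stuck[1]
    out = {"y1": v["a"] & v["b"], "y2": v["c"] & v["d"]}
    if stuck and stuck[0] in out:
        out[stuck[0]] = stuck[1]
    return out["y1"], out["y2"]


def detects(tests, **fault):
    """True if any test vector gives a different output with the fault present."""
    return any(simulate(t) != simulate(t, **fault) for t in tests)


# Three vectors that apply 11, 01 and 10 to both gates in parallel.
TESTS = [(1, 1, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0)]

stuck_faults = [(line, value) for line in LINES for value in (0, 1)]
stuck_coverage = sum(detects(TESTS, stuck=f) for f in stuck_faults) / len(stuck_faults)
bridge_detected = detects(TESTS, bridge=("b", "c"))
```

Here `stuck_coverage` evaluates to 1.0 while `bridge_detected` is False: none of the three vectors drives b to 1 and c to 0 while also propagating the difference to an output. A vector such as (1, 1, 0, 1) does expose the bridge, but only a fault model that includes the bridge would lead a test generator to ask for it.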

Secondly, nodes that exist in the silicon layout need not have any obvious counterpart in the gate-level description [3, 4, 35, 36]. This means that faults occurring in the silicon may not be tested for because their possible existence is not obvious from inspection of the gate-level circuit description. An example is shown in Fig. 4, where only the test vectors 0110 and 1001 would detect an open-circuit at node M (due to electromigration, photolithography error etc.). It would be impossible to derive these tests from the corresponding logic diagram. (This kind of fault is sometimes referred to as 'alteration of logic function'.)

Two other factors have led to the 'stuck-at' fault model becoming all but completely useless in the LSI/VLSI era:

(i) Gate-level circuit diagrams of LSI circuits are often not available, thus eliminating the use of the 'stuck-at' fault model.

(ii) Although algorithms exist to enable computer-aided testing, the amount of man/computer time involved in generating and evaluating test input sequences to cover the 'stuck-at' faults in an LSI circuit containing thousands of gates remains prohibitively expensive.

The problem of generating and evaluating a test set at a minimum cost that is as short as possible and which covers as many failures in the silicon as possible (contact window failures, gate-oxide pinholes, threshold voltage instabilities, metallisation faults etc.) has thus become more acute.

4.2 Implications for computer-aided design and test

The recent trend in LSI design has been towards the use of a small number of particular types of device (ROM, RAM, PLA, microprocessor), and test strategies have been developed for each device that take into account its general structure and the ways in which it commonly fails. For example, programmable logic arrays are widely used to implement random logic functions because they are relatively simple (and therefore quick) to design. They are also eminently testable, as their regular structure may be exploited to produce computationally efficient test generation programs. PLAs are often drawn as a grid of lines with dots to represent crosspoint connections (transistors). The physical failures that are most likely to occur in PLAs (bridging between adjacent lines, defective crosspoint devices, 'stuck' lines) are, thus, more readily modellable using this representation (rather than a gate-level representation), and, as a result, this leads to the simple generation of a test set with a high fault coverage [37]. Reviews of other test strategies appear elsewhere [38, 39], but two points to be made here are:

(i) The higher the level at which the device under test is modelled (layout level, circuit level, gate level, functional primitive level, register transfer level), the lower the fault coverage will be, due to the loss of information at the higher level about the device's performance in the presence of a fault [40].

(ii) The more complex (random) a device's structure is, the higher the level at which it has to be modelled, because of the inability of the human mind to handle highly detailed descriptions of complex systems and the need to keep CAD costs reasonable.

Faced with a choice between producing test sequences that have a high 'real fault' coverage and which are expensive to generate (transistor-level modelling), and sequences that are easier to generate but which have a lower fault coverage (gate- and functional-level modelling), test engineers have begun to model MOS LSI circuits at the transistor level where possible, in order to obtain a higher fault coverage [41-44]. This approach has its problems, not the least of which is the availability and analysis of a transistor-level diagram of an LSI circuit containing thousands of transistors, each one of which is individually susceptible to failure. Obviously, it is not feasible to test whole MOS LSI circuits by working at the transistor-level representation; however, if the fault coverage of test sets generated for LSI circuits is to remain high, while the scale of the process of generating and evaluating these test sets is to be kept to manageable proportions, the implications are clear:

(i) LSI circuits should be designed using circuit primitives that may be tested by easy-to-generate test sequences with a high fault coverage (ROMs, PLAs, repetitive structures etc.), and should be partitionable to allow these primitives to be tested independently.

(ii) The effects of the failures discussed in Sections 2 and 3 on the performance of MOS transistors must be evaluated in digital terms [32]. This will enable models of faulty transistors to be developed, which, in turn, will assist in the efficient generation of test input sequences.

(iii) The effects of these failures on functional primitives (counter, shift register etc.) must also be evaluated for functional-level testing to achieve a high fault coverage (functional-level testing is simpler, and therefore cheaper, than transistor-level testing). If necessary, the silicon layout of these functional primitives may have to be redesigned so that

(a) 'hard-to-test' failures do not occur, and

(b) in the presence of a failure the circuit exhibits fault effects coverable by a predefined functional fault model.

(iv) Design for testability must be incorporated into the circuit design to ease the test pattern generation and application problem [45, 46]. A node is difficult to test if it is not easily controllable or observable (in other words, if it is difficult to set the node to a desired value, or to propagate the value on the node to an output). In LSI circuits containing thousands of transistors but only 40 or so pins, such nodes are becoming more and more common. Design for testability measures (such as scan-in, scan-out techniques) make these nodes more accessible to the tester, thus reducing the effort needed to generate and apply a test input sequence, and simultaneously increasing its potential fault coverage.
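The scan-in, scan-out idea in (iv) can be sketched as a shift register wrapped around the circuit's internal state. The model below is a minimal illustration, assuming a single scan chain and a hypothetical 3-bit counter as the logic under test; the class and function names are not from the paper.

```python
class ScanChain:
    """Internal flip-flops that can be linked into a shift register for test."""

    def __init__(self, length, next_state):
        self.state = [0] * length
        self.next_state = next_state  # combinational logic: state -> new state

    def shift(self, bits):
        """Scan mode: shift bits in serially and return the bits shifted out."""
        shifted_out = []
        for bit in bits:
            shifted_out.append(self.state[-1])
            self.state = [bit] + self.state[:-1]
        return shifted_out

    def capture(self):
        """Normal mode for one clock: latch the logic's response to the state."""
        self.state = list(self.next_state(self.state))


def increment(state):
    """Hypothetical logic under test: a binary counter, most significant bit first."""
    n = (int("".join(map(str, state)), 2) + 1) % (1 << len(state))
    return [int(c) for c in format(n, f"0{len(state)}b")]


def scan_test(chain, pattern):
    """Scan a state in, apply one functional clock, scan the response out."""
    chain.shift(pattern[::-1])             # the last bit shifted in ends up first
    chain.capture()
    return chain.shift([0] * len(pattern))[::-1]
```

For example, `scan_test(ScanChain(3, increment), [0, 1, 1])` sets the counter directly to 3 and observes the response 4 as `[1, 0, 0]`, without first clocking the counter through its earlier states. Every internal node held in the chain becomes controllable and observable through two extra pins.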

In the final analysis, though, the set of assumptions made about the ways in which a digital circuit can fail is what primarily determines the fault coverage of a test input sequence. The stuck-at fault model is an example of one such set of assumptions, and it has been shown above why alternatives to it are being sought in the MOS LSI era. High or full 'real fault' coverage will only be achieved when the fault effects of failure mechanisms in MOS LSI logic circuits are fully understood in terms of the digital operation of these circuits. When that happens, fault models will be devised that more accurately reflect the fault effects present in MOS LSI circuits, and the phrase 'design for testability' will also come to cover techniques whereby 'hard-to-test' failures become easier to test for or are completely eliminated, as a function of the way in which a circuit is laid out in silicon [3, 4, 47].
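The dependence of fault coverage on the assumed fault model can be made concrete with a small fault simulation. The Python sketch below enumerates the single stuck-at faults of a hypothetical three-input circuit, y = (a AND b) OR c, and counts how many of them a given test set detects. The circuit, the node names and the functions `simulate` and `stuck_at_coverage` are assumptions made for illustration only.

```python
NODES = ["a", "b", "c", "n1", "y"]  # n1 is the output of the AND gate


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally with one node stuck at 0 or 1.

    `fault` is None (fault-free circuit) or a pair (node_name, stuck_value).
    """
    def node(name, value):
        # A faulty node ignores its driven value and holds the stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = node("a", a), node("b", b), node("c", c)
    n1 = node("n1", a & b)
    return node("y", n1 | c)


def stuck_at_coverage(tests):
    """Return (detected, total) over all single stuck-at faults."""
    faults = [(name, value) for name in NODES for value in (0, 1)]
    detected = [
        f for f in faults
        if any(simulate(*t) != simulate(*t, fault=f) for t in tests)
    ]
    return len(detected), len(faults)


one_test = stuck_at_coverage([(1, 1, 0)])
four_tests = stuck_at_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)])
```

Here the single pattern detects 4 of the 10 stuck-at faults and the four-pattern set detects all 10. A figure of 10 out of 10 is, however, a statement about the stuck-at faults alone; it gives no information about failures that lie outside the assumed model.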

5 Summary

The stuck-at fault model is inappropriate for testing MOS LSI/VLSI circuits for several reasons. A review of the failure mechanisms that produce malfunctions in MOS circuits shows that not all failures are modellable as 'stuck' nodes on the gate-level representation of a logic circuit. Further, LSI circuits consist of thousands of gates, each susceptible to failure, and the amount of man/computer time required to generate and evaluate test pattern sequences is prohibitive. To overcome the problem of generating test input patterns that detect as many on-chip defects as possible, circuits should be designed using primitives that are 'easy to test' by virtue of their structure (scan-in/scan-out design, regular array) or whose behaviour in the presence of a fault is predictable, allowing functional-level testing to be performed.
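One failure that cannot be modelled as a stuck node is a transistor that never conducts (a 'stuck-open' transistor) in a CMOS gate. The following Python sketch is a simplified switch-level model of a two-input CMOS NOR gate with one such pull-down transistor, under the assumption that a floating output node retains its previous value. The function names `cmos_nor` and `run` and the parameter `n_a_stuck_open` are hypothetical.

```python
def cmos_nor(a, b, previous_output, n_a_stuck_open=False):
    """Switch-level sketch of a 2-input CMOS NOR gate.

    Pull-up:   two p-transistors in series (conducts when a == 0 and b == 0).
    Pull-down: two n-transistors in parallel (conducts when a == 1 or b == 1).
    If neither network conducts, the output floats and is assumed to hold
    its previous value.
    """
    pull_up = (a == 0 and b == 0)
    n_a_conducts = (a == 1) and not n_a_stuck_open
    pull_down = n_a_conducts or (b == 1)
    if pull_up:
        return 1
    if pull_down:
        return 0
    return previous_output  # floating node: stored charge holds the old value


def run(sequence, n_a_stuck_open=False):
    """Apply input pairs in order and return the list of output values."""
    out, trace = 0, []
    for a, b in sequence:
        out = cmos_nor(a, b, out, n_a_stuck_open)
        trace.append(out)
    return trace


all_patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
two_pattern_test = [(0, 0), (1, 0)]

good_all = run(all_patterns)
faulty_all = run(all_patterns, n_a_stuck_open=True)
good_pair = run(two_pattern_test)
faulty_pair = run(two_pattern_test, n_a_stuck_open=True)
```

In this model the faulty gate gives the same outputs as the good gate when all four input patterns are applied in the order shown, because the floating output happens to hold the correct value. The fault is exposed only by the ordered pair (0, 0) then (1, 0), which first sets the output to 1 and then relies on the open transistor to discharge it. The faulty gate therefore behaves as a memory element, which a fixed stuck-at value on a node does not describe.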

6 Acknowledgments

This work was performed under a UK Science & Engineering Research Council CASE award in conjunction with British Telecom.

The authors readily acknowledge the support of D.R.J. Wilkins and S.J. Shaw of British Telecom Research Laboratories, Martlesham Heath, in the preparation of this paper.

7 References

1 ELDRED, R.D.: 'Test routines based on symbolic logic statements', J. Assoc. Comput. Mach., 1959, 6, pp. 33-36

2 ROTH, J.P.: 'Diagnosis of automata failures: a calculus and a method', IBM J. Res. & Dev., 1966, 10, pp. 278-281

3 BANERJEE, P., and ABRAHAM, J.A.: 'Fault characterisation of VLSI MOS circuits'. Proceedings of IEEE International Conference on Circuits and Computers, 1982

4 BEH, C.C., ARYA, K.H., RADKE, C.E., and TORKU, K.E.: 'Do stuck fault models reflect manufacturing defects?' Proceedings of IEEE Test Conference, 1982, pp. 35-42

5 EL-ZIQ, Y.: 'Classifying, testing and eliminating VLSI MOS failures', VLSI Des., 1983, Sept., pp. 30-35

6 GALIAY, J., CROUZET, Y., and VERGNIAULT, M.: 'Physical vs. logical fault models', IEEE Trans., 1980, TC-29, pp. 527-531

7 WADSACK, R.L.: 'Fault modelling and logic simulation of CMOS and MOS integrated circuits', Bell Syst. Tech. J., 1978, 57, pp. 1449-1474

8 COLBOURNE, E.D., COVERLEY, G.P., and BEHERA, S.K.: 'Reliability of MOS LSI circuits', Proc. IEEE, 1974, 62, pp. 244-259

9 EDWARDS, D.G.: 'Testing for MOS IC failure modes', IEEE Trans., 1982, TR-31, pp. 9-17

10 SCHNABLE, G.L., and COMMOZOLI, R.B.: 'CMOS integrated circuit reliability', Microelectron. & Reliab., 1981, 21, pp. 33-50

11 STOJADINOVIC, N.D., and RISTIC, S.D.: 'Failure physics of integrated circuits', Phys. Status Solidi (a), 1983, 75, pp. 11-48

12 WOOD, J.: 'Reliability and degradation of silicon devices and integrated circuits' in HOWES, M.J., and MORGAN, D.V. (Eds.): 'Reliability and degradation' (Wiley, London, 1980), pp. 191-236

13 'How some Japanese IC's fail', Electron. Int., 1980, 53, pp. 142-143

14 BRAMBILLA, P., FANTINI, F., MALBERTI, P., and MATTANA, G.: 'CMOS reliability', Microelectron. & Reliab., 1981, 21, pp. 191-201

15 JOHNSON, G.M., and STITCH, M.: 'Microcircuit accelerated testing reveals life-limiting failure modes'. Proceedings of 15th International Reliability Physics Symposium, 1977, pp. 179-195

16 PAPPU, R., HARRIS, E., and YATES, M.: 'Screening methods and experience with MOS memory', Microelectron. & Reliab., 1977, 17, pp. 193-197

17 PEATTIE, C.G., ADAMS, J.D., CARRELL, S.L., GEORGE, T.D., and VALEK, M.H.: 'Elements of semiconductor device reliability', Proc. IEEE, 1974, 62, pp. 149-168

18 PECK, D.S.: 'New concerns about integrated circuit reliability'. Proceedings of 16th International Reliability Physics Symposium, 1978, pp. 1-6

19 MAVOR, J., JACK, M., and DENYER, P.: 'Introduction to MOS LSI design' (Addison Wesley, 1983)

20 LYCOUDES, N.E., and CHILDERS, C.C.: 'Semiconductor instability failure mechanisms review', IEEE Trans., 1980, TR-29, pp. 237-248

21 STRACK, H., MAYER, K.R., and KOLBESEN, B.O.: 'The influence of stacking faults on MOS memories', Solid-State Electron., 1979, 22, pp. 134-140

22 VELCHEV, N., TONCHEVA, L., and DIMITROV, I.: 'Electrical properties of MOS structures containing process-induced faults', Cryst. Lattice Defects, 1980, 8, pp. 159-166

23 ROSENBERG, S., CROOK, D., and EUZENT, B.: 'HMOS reliability', IEEE Trans., 1979, TED-26, pp. 48-51

24 FANTINI, F., and MATTANA, G.: 'Failure mechanisms and analysis of VLSI circuits'. European Conference on Testing, 'Testing Integrated Circuits: A Challenge', Lausanne, Switzerland, 1983, pp. 85-104

25 OTTAVIANI, G., and MAYER, J.W.: 'Mechanisms and interfacial layers in silicide formation in reliability and degradation', in HOWES, M.J., and MORGAN, D.V. (Eds.): 'Reliability and degradation' (Wiley, 1980), pp. 105-149

26 GAJDA, J.J., LINDSFROM, G.J., and DeLORENZO, D.J.: 'Interlevel insulation reliability evaluation', IEEE Trans., 1981, TCHMT-4, pp. 509-514


27 MAY, T.C., and WOODS, M.H.: 'Alpha-particle-induced soft errors in dynamic memories', ibid., 1979, TED-26, pp. 2-9

28 NICKEL, V.V.: 'VLSI - the inadequacy of the stuck-at-fault model'. Proceedings of IEEE Test Conference, 1980, pp. 378-381

29 CHATTERJEE, P.K., YANG, P., and SHICHIJO, H.: 'Modelling of small MOS devices and device limits', IEE Proc. I, Solid-State & Electron Dev., 1983, 130, pp. 105-125

30 COTTRELL, P.E., TROUTMAN, R.R., and NING, T.J.: 'Hot electron emission in N-channel IGFETs', IEEE Trans., 1979, JSC-14, pp. 442-455

31 EITAN, B., and FROHMAN-BENTCHKOWSKI, D.: 'Surface conduction in short-channel MOS devices', ibid., 1982, TED-29, pp. 254-266

32 EL-MANSEY, Y.: 'MOS device and technology constraints in VLSI', ibid., 1982, TED-29, pp. 197-203

33 BURGESS, N., DAMPER, R.I., WILKINS, D.R.J., and SHAW, S.J.: 'Fault effects in MOS circuits and their implications for digital circuit testing'. Proceedings of IEE EDA 84 Conference

34 MEI, K.C.Y.: 'Bridging and stuck-at faults', IEEE Trans., TC-23, pp. 720-727

35 COURTOIS, B.: 'Failure mechanisms, fault hypotheses and analytical testing of MOS LSI circuits' in GRAY, J.P. (Ed.): 'VLSI 81' (Academic Press, 1981), pp. 341-350

36 CHIANG, K.W., and VRANISEK, Z.G.: 'Test generation for MOS complex gate networks'. Proceedings of International Symposium on Fault Tolerant Computing, 1982, pp. 144-157

37 OSTAPKO, D.L., and HONG, S.J.: 'Fault analysis and test generation for PLAs', IEEE Trans., 1979, TC-28, pp. 617-626

38 BENNETTS, R.G.: 'An introduction to digital board testing' (Arnold, 1982)

39 GAI, S., MEZZALAMA, M., and PRINETTO, P.: 'A review of fault models for LSI/VLSI devices', Software & Microsyst., 1983, 2, pp. 44-53

40 ABRAHAM, J.A.: 'Functional level test generation for complex digital systems'. Proceedings of IEEE Test Conference, 1981, pp. 461-462

41 BANERJEE, P., and ABRAHAM, J.A.: 'Generating tests for physical failures in MOS logic circuits'. Proceedings of IEEE Test Conference, 1983, pp. 554-559

42 BRYANT, R.E., and SCHUSTER, M.D.: 'Fault simulation of MOS digital circuits', VLSI Des., 1983, Oct., pp. 24-30

43 COURTOIS, B.: 'Analytical testing of data processing sections of integrated CPU's'. Proceedings of IEEE Test Conference, 1981, p. 21

44 TIMOC, S., BUEHLER, M., GRISWOLD, T., PINA, C., STOTT, F., and HESS, L.: 'Logical models of physical failures'. Proceedings of IEEE Test Conference, 1983, pp. 546-553

45 ABRAHAM, J.A.: 'Design for testability'. Proceedings of IEEE Custom Integrated Circuits Conference, 1983, pp. 278-83

46 PARKER, K.P., and WILLIAMS, T.W.: 'Design for testability - a survey', Proc. IEEE, 1983, 71, pp. 98-112

47 MASUDA, I., UENO, M., and TASHIRO, K.: 'A fault-tolerant MOS-LSI for train controller applications'. IEEE International Solid State Circuits Digest of Technical Papers, 1983, pp. 138-139

Book review

Operating system concepts
J.L. Peterson and A. Silberschatz
Addison Wesley, 1983, 548pp., £11.95
ISBN: 0-201 06097-3

Operating system design is often considered to be a very black art. Many programmers are unsure of exactly what happens within an operating system; understanding usually stops at the program/operating-system interface. This book illuminates the area behind this interface.

Until recently, there have been comparatively few textbooks covering the wide field of operating system design at a level suitable for the newcomer. The authors seem to have tackled this task with some success and have produced a clear, comprehensive and well presented book. The stated aim of the book is to provide 'a clear description of the concepts underlying operating systems'. It does not concentrate on any single existing system or hardware.

The book starts with a useful introduction to operating systems and the services which they provide, and goes on to cover various important topics such as processor, memory and disc management, file systems, deadlock, protection, design principles and distributed systems. There are also two good chapters dealing with concurrent processes and concurrent programming. The book concludes with a historical perspective (where several operating systems are rather too briefly presented) and a comprehensive bibliography. Each chapter ends with a summary, a collection of good exercises and a section of notes referring to the main bibliography.

The chapters covering processes and concurrency appear quite late in the book, and the authors justify this organisation by suggesting that the student needs to be familiar with the basic ideas of operating systems before tackling the harder aspects of the process model. The later chapters covering topics such as concurrent programming languages and techniques, protection and distributed systems are especially important since they introduce the reader to some of the current research directions in operating systems. Most topics relevant to an undergraduate course are covered with adequate detail except that the section on concurrent languages is surprisingly short.

The book is of course primarily intended for the college student, but it would be equally suitable for almost any programmer or analyst seeking a better understanding of operating systems. A basic knowledge of assembly-language programming and computer organisation is required.

This is a well-written book with clear diagrams and frequent useful examples. It would be an excellent addition to a student's reading list.

D.J. WATSON

36 Software & Microsystems, Vol. 3, No. 2, April 1984