
MEMORY REPAIR PRIMER A Guide to Understanding Embedded Memory

Repair Options and Issues

APRIL 2007

MEMORY TEST & REPAIR PRIMER


TABLE OF CONTENTS

INTRODUCTION
EFFECTIVE TESTING FOR EFFECTIVE REPAIR
MEMORY REPAIR APPROACHES
MANUFACTURING REPAIR FLOW
HOW MUCH REDUNDANCY?
HOW MANY FUSES?
CHOOSING A REDUNDANCY SCHEME
MEMORY REPAIR SOLUTION CHECKLIST
MEMORY IP VENDOR INFO
FOUNDRY INFO
REFERENCES



INTRODUCTION

One of the most notable consequences of the semiconductor industry moving to deeper nanoscale technology nodes is the significant growth in both the number and densities of embedded memories. Designs have migrated from containing a handful of memories to containing hundreds and in some cases over a thousand memories of all types. This explosion in embedded memories is driving the need for rethinking the manufacturing test strategy for these designs [1]. In particular, embedded memories now represent in most cases a die’s largest contributor to yield loss due to the very large area and density of these regular circuits. A successful memory strategy must now incorporate some form of repair methodology in order to achieve profitable yield levels.

Formulating a repair methodology often requires combining IP from memory providers, automation from DFT providers, and data from foundries. This often represents a significant challenge as not only are there several combinations and choices to consider, but more importantly, there is generally very little information on how to best make these choices. This document attempts to address this challenge by explaining the memory repair process along with all of its components and choices as well as by providing repair related information on popular memory IP vendors and foundries.



EFFECTIVE TESTING FOR EFFECTIVE REPAIR

The memory repair process has three basic components: test, repair analysis, and repair delivery. A comprehensive test capability is fundamental as the repair process is only effective if it addresses all existing defects. In the great majority of cases, embedded memories today are tested with Built-In Self-Test (BIST). In its simplest form, memory BIST consists of an on-chip engine placed next to each embedded memory that writes algorithmically generated patterns to the memory and then reads these patterns back to discover and possibly log any defects. The memory BIST engine is typically designed to generate patterns based on a pre-determined memory test algorithm encoded in a finite state machine (figure 1a).

Decreases in process geometries and associated increases in memory densities are resulting in a growing number of memory defect types. Many of these new defect mechanisms are difficult to predict and hence properly test for. These defects are therefore being discovered during the production testing of a device or worse during the analysis of field returns. This can result in significant quality and cost issues if the predetermined test algorithm does not detect a newly discovered defect type. The cost issues are worse when repair is used as the added repair cost is wasted on a part that will remain defective.

To address this growing problem, some commercial memory BIST solutions now provide programmable BIST engines (figure 1b). With these engines it is possible to download (on the tester or in-system) program code that implements an arbitrary memory test algorithm, allowing new or enhanced algorithms to be applied as needed to specific memories as new defect mechanisms need to be addressed. To maintain a simplified manufacturing test flow, these programmable BIST engines will typically support predetermined default algorithms as well. This removes the need to program the BIST engine if the default algorithm is sufficient. Only when new defect mechanisms are discovered does it become necessary to program each BIST engine before having it execute the memory test. The programmable BIST engines are larger than the more traditional hard-coded ones and therefore should only be used when the need is justified. This tends to be when a new memory design and/or a new foundry process are to be used.
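As an illustration of what a BIST engine does, the following minimal Python sketch models a march-style test. The source does not name a specific algorithm; March C- is used here only because it is a widely known example, and the `FaultyMemory` model with stuck-at cells is a hypothetical stand-in for a real memory. The algorithm is held as a data table rather than hard-coded, loosely mirroring the idea of a programmable engine that can be loaded with a different algorithm.

```python
class FaultyMemory:
    """Hypothetical bit-oriented memory model with optional stuck-at cells."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


# March C- expressed as data: (address order, [(operation, value), ...]).
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]


def run_march(memory, size, algorithm=MARCH_C_MINUS):
    """Apply each march element to every address and log failing addresses."""
    failing = set()
    for direction, ops in algorithm:
        addresses = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addresses:
            for op, value in ops:
                if op == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:  # read and compare
                    failing.add(addr)
    return sorted(failing)


good = run_march(FaultyMemory(16), 16)
bad = run_march(FaultyMemory(16, stuck_at={5: 1}), 16)
print(good, bad)
```

Swapping in another table in place of `MARCH_C_MINUS` changes the test without touching `run_march`, which is the flexibility the programmable approach trades extra area for.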

Figure 1: Memory BIST Architectures. (a) BIST controller; blocks shown: FSM, address generator, control generator, data generator, comparator, TAP, connections to and from the memory. (b) Programmable BIST controller; blocks shown: the same, plus pointer and loop control and a scannable microcode memory.


MEMORY REPAIR APPROACHES

In addition to an effective test capability, a memory repair solution consists of two additional basic components: repair analysis and repair delivery. These are described in detail in this section.

REPAIR ANALYSIS

This component of the repair process consists of determining which of a memory’s defective sections (typically rows or columns) must be replaced with available spares. Repair analysis can be performed on or off chip. In the off-chip approach, all memory failures are logged on the tester and the resulting fail data is post-processed offline. A significant drawback of the off-chip approach is that logging all of the fail data off-chip results in a large increase in test time. Because of this, the majority of today’s repair approaches use an on-chip repair analysis capability, often referred to as BIRA for Built-In Repair Analysis. With BIRA, absolutely no fail data needs to be logged externally as the BIRA circuitry or engine analyzes the fail data coming out of an associated BIST controller on the fly. By the end of the memory test, the BIRA engine has determined the spare element allocation necessary to repair the chip.

A key requirement for a BIRA engine is to maximize its success at finding spare allocation solutions. If only spare rows or only spare columns are used, the repair analysis is straightforward, as any defective row or column is simply replaced. The analysis becomes much more complex, however, when both spare rows and columns are available. Take for example the memory represented in figure 2a, which contains two spare rows and one spare column and has the six defects shown. If a simple linear algorithmic approach is taken to allocate spares, then the allocation shown in figure 2b would be the outcome and the repair would not be successful. A successful allocation is possible in this case, however, as shown in figure 2c. In general, determining the optimum allocation when both spare rows and columns are used is in mathematical terms an NP-complete problem, or more simply put, a problem that grows exponentially in difficulty with a growing number of spare elements. Fortunately, when the number of spare rows and columns is relatively small (which is generally the case), an optimal solution can typically be computed. Commercial BIRA solutions exist that can always find an optimal solution when one spare column and any number of spare rows are used.

Figure 2: Optimal spare allocation. (a) Memory with two spare rows, one spare column, and six defects; (b) unsuccessful linear allocation; (c) successful allocation.
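The difference between a simple linear allocation and an exhaustive search can be sketched in a few lines of Python. The defect map below is hypothetical (the actual defect positions of figure 2 are not recoverable from the text), and the "linear" strategy is assumed to hand out spare rows first, in scan order, before falling back to spare columns. The exhaustive version branches on each uncovered defect, trying a row repair and then a column repair.

```python
def greedy_allocate(defects, spare_rows, spare_cols):
    """Linear pass: use spare rows first, then spare columns. None on failure."""
    rows, cols = set(), set()
    for r, c in defects:
        if r in rows or c in cols:
            continue  # defect already covered by an allocated spare
        if len(rows) < spare_rows:
            rows.add(r)
        elif len(cols) < spare_cols:
            cols.add(c)
        else:
            return None  # out of spares
    return rows, cols


def exhaustive_allocate(defects, spare_rows, spare_cols,
                        rows=frozenset(), cols=frozenset()):
    """Branch on the first uncovered defect: repair it by row or by column."""
    uncovered = [(r, c) for r, c in defects if r not in rows and c not in cols]
    if not uncovered:
        return set(rows), set(cols)
    r, c = uncovered[0]
    if len(rows) < spare_rows:
        found = exhaustive_allocate(defects, spare_rows, spare_cols,
                                    rows | {r}, cols)
        if found is not None:
            return found
    if len(cols) < spare_cols:
        found = exhaustive_allocate(defects, spare_rows, spare_cols,
                                    rows, cols | {c})
        if found is not None:
            return found
    return None


# Hypothetical map: three defects in column 3, two in row 5, one isolated.
defects = [(0, 3), (1, 3), (2, 3), (5, 0), (5, 1), (7, 6)]
print(greedy_allocate(defects, spare_rows=2, spare_cols=1))
print(exhaustive_allocate(defects, spare_rows=2, spare_cols=1))
```

The search depth is bounded by the total number of spares, which is why an exhaustive approach stays practical when only a few spare elements are available.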

REPAIR DELIVERY

There are two general forms of repair delivery: hard repair and soft repair.

Hard Repair

In this approach, repair instructions are stored permanently within the die through the programming of fuses. The two common fuse types are laser and electrical. Laser fuses are programmed by cutting a metal link, while electrical fuses (eFuses) are typically one-time programmable or flash memory elements and are programmed using an elevated voltage level. eFuse usage is growing rapidly because eFuses are generally smaller than laser fuses, typically by a factor of 2 to 3 (e.g. 0.02 mm² vs. 0.05 mm²), and they do not require special equipment or a separate test insertion to be programmed. For this last reason, eFuses are also associated with the Self-Repair approaches described later in this section.

Soft Repair

In this approach, repair instructions are stored in volatile memory, typically in scan registers, at each power up of the device. Soft repair has the advantage of being able to address defects that may arise over time as new repair instructions can be created and stored throughout the life of the device. This provides higher long term availability and reliability. Because the repair instructions are not permanently stored within the device, they have to be either stored somewhere external to the device (somewhere in the system) or they have to be generated on-the-fly at power-up. Storing the repair instructions in the system can be daunting from a logistics point of view as the repair instructions for typically many different memories within many different devices have to be properly managed. For this reason, soft repair is almost exclusively associated with a BIRA mechanism to calculate repair instructions on-chip at power up.

Self-Repair

A self-repair solution, typically referred to as BISR (Built-In Self-Repair), is one where both the repair analysis and repair delivery are performed on-chip. In its simplest form, a BISR solution consists of the combined BIRA and soft repair capabilities described above. One important disadvantage of this approach, however, is that since the repair instructions are calculated once at power-up, they may not take into account defects that only manifest themselves under specific operating conditions such as high temperature. For this reason, more advanced BISR solutions now incorporate a combination of both soft and hard repair capabilities. Hard repair is used to store repair instructions determined during manufacturing test, and soft repair is then used at each power-up to address any new defects. These advanced solutions provide several advantages, including a simplified manufacturing test and repair process, support for long term reliability using soft repair as explained above, and significant silicon area savings through pooling of fuse data as explained below. A potential drawback of this incremental soft repair approach is that the power-up cycle time for a device becomes longer as the BIST must be executed twice. For some applications this extended time may be problematic.

The on-chip architecture for LogicVision’s BISR solution is shown in figure 3. A key component of this architecture is the concept of a centralized fuse pool (eFuse array). Because most memories with redundancy will typically need little to no repair on any given die, sharing a pool of fuses for all memories allows for much better fuse utilization. Memories needing little to no repair will require little to no fuse information to be stored, freeing that fuse storage for other memories. In order to simplify the fuse data allotment, standard data compression techniques are used to implicitly allocate the necessary amount of fuse storage per memory. On-chip management of a centralized programmable fuse pool is performed by a fuse controller. This controller together with one or more BIST controllers perform all necessary activities for testing and repairing memories. In this architecture, the BIST interfaces to memories containing redundancy are equipped with a BIRA engine to analyze failures and generate any necessary repair instructions in the form of fuse data. A dedicated chip-wide (BISR) scan chain is used by the fuse controller to transfer fuse data to and from the eFuse array and the various memories. This scan chain contains a BISR register for each memory with redundancy. The operation of this BISR architecture is described in detail in the next section.
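The source only states that "standard data compression techniques" are used for the shared fuse pool. As a rough illustration, the Python sketch below assumes a zero-run-length scheme, which is what the ZeroCountBits parameter in the later fuse count section suggests: each run of zero bits in the BISR chain is stored as a fixed-width count, and each non-zero BISR register is stored verbatim. The function name and the convention that an all-zero register means "no repair" are assumptions made for this example.

```python
def encode_bisr_chain(registers, reg_size, zero_count_bits):
    """Encode BISR register values as alternating zero-run counts and repair words.

    Assumption: a register value of 0 means "no repair"; any non-zero value
    is repair data that is stored as-is.
    """
    fields = []
    zero_run = 0  # length, in chain bits, of the current run of zeros
    for value in registers:
        if value == 0:
            zero_run += reg_size
            continue
        fields.append(format(zero_run, f"0{zero_count_bits}b"))
        fields.append(format(value, f"0{reg_size}b"))
        zero_run = 0
    fields.append(format(zero_run, f"0{zero_count_bits}b"))  # trailing run
    return "".join(fields)


# 80 eight-bit BISR registers, two of which hold repair data.
chain = [0] * 80
chain[12] = 0b10010110
chain[47] = 0b10000011
encoded = encode_bisr_chain(chain, reg_size=8, zero_count_bits=10)
print(len(encoded), "fuse bits instead of", 8 * len(chain))
```

Under these assumptions, memories that need no repair consume no dedicated fuse storage at all; they only lengthen a zero run that is already being counted.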

Figure 3: BISR Architecture. Blocks shown: TAP, fuse controller, eFuse array, BIST controller, BIST interface with BIRA engine and BIRA register, BISR registers, and RAM with redundancy and repair port.


MANUFACTURING REPAIR FLOW

The manufacturing repair flow is typically performed at wafer sort and depends on the repair capabilities used. The following steps define the flow when the on-chip self-repair solution described in the previous section is used. Each of the steps is also represented graphically in figure 4.

Step 1
Action: Power up/Reset the chip.
Result: The fuse controller is automatically enabled and loads the chip-wide BISR chain with all 0s. This ensures that only the non-redundant portion of each memory is tested in the next step.

Step 2
Action: Run the memory BIST/BIRA controllers (typically at different test corners).
Result: Each memory is fully tested and any necessary repair info is automatically calculated and accumulated across the different test corners. The repair info is stored in the local BIRA registers.

Step 3
Action: Run the “transfer BIRA to BISR” instruction.
Result: The local repair info is transferred into the chip-wide BISR chain for subsequent fuse processing.

Step 4
Action: Scan out the BIRA registers from each BIST controller.
Result: The BIRA registers contain the repair status. If any memory is not reparable, then exit as the chip is bad. If no repair info is generated for any of the memories, then exit as the chip is good.

Step 5
Action: Run the fuse controller in programming mode.
Result: The repair info contained in the chip-wide BISR chain is scanned out, compressed, and programmed on-the-fly into the eFuse array. High voltage is applied to the eFuse array during this step.

Figure 4: Manufacturing repair flow


Step 6
Action: Power up/Reset the chip.
Result: The fuse controller is automatically enabled and loads the chip-wide BISR chain with all of the stored repair info. This results in all the memories with redundancy being repaired, as the BISR registers directly drive the memory repair ports. Note that some memory types contain an internal scannable repair register rather than a repair port. For these memories, the internal repair register is scan loaded in parallel with the BISR register.

Step 7
Action: Run the memory BIST controllers.
Result: The (repaired) memories are fully re-tested to ensure that the repair was successful.

Generally, only steps 6 and 7 are repeated during final test, as it is assumed that any additional memory yield fallout at final test is too small to warrant the test time cost of the extra repair steps. Nevertheless, if additional repair is mandated at final test, the above seven steps can be re-executed with the following modifications:

Step 1: This step is now equivalent to step 6, as the fuse controller will load the stored wafer sort repair info into the BISR chain.

Step 2: Before executing this step, the BIRA register is loaded with the BISR register contents in order to create a baseline or starting point for any additional repair info calculation.

Step 5: In this step the fuse controller is now run in incremental programming mode. Only new additional repair info is compressed and stored in the eFuse array. This is necessary as the eFuse array cannot be reprogrammed and therefore the incremental repair info must be stored separately.

In the field, all memories are repaired automatically at power up by the fuse controller as described in step 6. For long term reliability, it is possible to perform additional incremental soft repair at power up to address any defects that may have developed over time. To accomplish this, once the fuse controller has loaded the BISR registers to repair the memories, the BIRA registers are then loaded with the BISR register contents to create a baseline as in the modified step 2 above. The BIST controllers (with BIRA) are then executed and the BIRA registers are then updated to contain the baseline repair info combined with any new repair info. This combined repair info is then transferred back into the BISR registers to repair the memories. This is a soft repair as the new repair info can not be programmed into the eFuse array and is therefore only available while the device is powered up.
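The exit decision taken at step 4 of the wafer sort flow can be sketched as follows. This is only an illustration: the `BiraResult` record and the outcome labels are hypothetical, and a real BIRA register encodes its status in an implementation-specific way.

```python
from dataclasses import dataclass


@dataclass
class BiraResult:
    """Hypothetical summary of one memory's BIRA register after step 2."""
    repairable: bool   # False if the defects exceed the available spares
    repair_data: int   # 0 means no spare element needs to be allocated


def classify_die(results):
    """Step 4: decide how to proceed from the scanned-out BIRA status."""
    if any(not r.repairable for r in results):
        return "reject"         # at least one memory cannot be repaired
    if all(r.repair_data == 0 for r in results):
        return "pass"           # nothing to repair, no fuse programming needed
    return "program_fuses"      # continue with steps 5 through 7


die = [BiraResult(True, 0), BiraResult(True, 0b10010110), BiraResult(True, 0)]
print(classify_die(die))
```

Only dies classified as `program_fuses` incur the cost of fuse programming and the subsequent re-test.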



HOW MUCH REDUNDANCY?

The most basic question regarding memory repair is: how much redundancy, if any, should be added to each embedded memory? The answer to that question depends on several factors which are explored in this section. Although some guidelines are presented here, the reader is encouraged to work with his or her memory IP provider and/or foundry to determine the optimum redundancy strategy for a particular design.

Redundancy is added to improve memory yield and thus die yield. A method to calculate yield is therefore required in order to analyze redundancy requirements. There is a long history of work on determining accurate yield models [2]. A common model for memory yield used by several companies is one based on the Negative Binomial model:

$$Y_{MEM} = (1 + D_{MEM} A_{MEM})^{-C} \qquad (1)$$

where:

$D_{MEM}$ = memory defect density (defects/mm²)

$A_{MEM}$ = memory core size (mm²)

C = complexity factor. This parameter relates to the complexity of the underlying process and is derived from the number of critical steps in the manufacturing flow. Values between 5 and 15 have been used successfully with processes of varying complexity.

Figure 5 plots yields for memories ranging in size from 1 mm² to 10 mm², for complexity factor values ranging from 8 to 14. A relatively high defect density of 0.002 defects/mm² (1.3 defects/in²) is assumed. It is clear that redundancy will be needed if even a few of the larger memories are placed together on a die, as the die’s yield will be the product of the already low memory yields.

To calculate the effect of redundancy on a memory’s yield, consider first the case when one spare element (row or column) is added to a memory. In this case the memory can be viewed as being divided into N equal parts, each the size of the spare element. For example, if a spare row is added to a memory then N is equal to the number of rows. With the spare element, the memory has N+1 parts, each with the same yield value, $Y_{MEM/N}$, which can be calculated using equation (1) with an area value N times smaller than that of the full memory. The yield of the memory can be closely approximated by the probability of no more than one of the N+1 parts being bad. The memory yield with one spare element can therefore be calculated by:

$$Y_{1SP} = (Y_{MEM/N})^{N+1} + (N+1)\,(Y_{MEM/N})^{N}\,(1 - Y_{MEM/N}) \qquad (2)$$

Figure 5: Memory Yield ($Y_{MEM}$ versus memory core size from 1 to 10 mm², for C = 8, 10, 12, and 14, with $D_{MEM}$ = 0.002 defects/mm²)


As additional independent spare elements are added, the yield calculation becomes increasingly complex, as all allowable combinations of bad parts must be taken into account. With 2 spare elements, the yield calculation grows to:

$$Y_{2SP} = (Y_{MEM/N})^{N+2} + (N+2)\,(Y_{MEM/N})^{N+1}\,(1 - Y_{MEM/N}) + \tfrac{1}{2}(N+2)(N+1)\,(Y_{MEM/N})^{N}\,(1 - Y_{MEM/N})^{2} \qquad (3)$$

Figure 6 shows the improved yield values for both the single and double spare element cases for the same memory sizes and defect density used in figure 5. It is interesting to note that even at the relatively high defect density, a single spare element seems sufficient for all but the largest memories. It also appears from this data that it will be rare to need more than 2 spare elements within an individual memory.
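Equations (1) through (3) can be evaluated with a short Python sketch. The function names are hypothetical, and the value N = 512 (for example, a memory with 512 rows and spare rows) is an assumption made for illustration; the source does not state which N was used for its figures. Equations (2) and (3) are the first terms of a binomial sum, so one function covers any number of spares.

```python
from math import comb


def memory_yield(defect_density, area, complexity):
    """Equation (1): yield of a memory block with no redundancy."""
    return (1.0 + defect_density * area) ** (-complexity)


def yield_with_spares(defect_density, area, complexity, n_parts, spares):
    """Equations (2) and (3), generalized: probability that at most `spares`
    of the n_parts + spares equal-sized parts are bad."""
    y_part = memory_yield(defect_density, area / n_parts, complexity)
    total = n_parts + spares
    return sum(
        comb(total, bad) * y_part ** (total - bad) * (1.0 - y_part) ** bad
        for bad in range(spares + 1)
    )


D_MEM = 0.002   # defects/mm², as in figure 5
C = 12          # complexity factor, as in figure 6
N = 512         # assumed number of parts (not given in the source)

for area in (1.0, 5.0, 10.0):
    y0 = memory_yield(D_MEM, area, C)
    y1 = yield_with_spares(D_MEM, area, C, N, spares=1)
    y2 = yield_with_spares(D_MEM, area, C, N, spares=2)
    print(f"{area:4.0f} mm2: Y0SP={y0:.3f}  Y1SP={y1:.3f}  Y2SP={y2:.3f}")
```

The absolute numbers depend on the assumed N and will not necessarily match the plotted curves, but the ordering of the three yields follows directly from the equations.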

Determining the optimum number of spares to use within any given memory requires more than just analyzing the memory yield improvement. Redundancy of course increases the memory’s area and thus the die’s area. The increase in area results in an increase in cost. Another way to view this is that the increase in the die’s area reduces the number of die that can be manufactured per wafer. The end goal is to maximize the number of good die per wafer, $DPW_{GOOD}$, which is the product of the die yield and the number of die that can be manufactured per wafer:

$$DPW_{GOOD} = Y_{DIE} \cdot DPW = Y_{DIE} \cdot \frac{A_{WAFER}}{A_{DIE}} = A_{WAFER} \cdot \frac{Y_{DIE}}{A_{DIE}}$$

Adding redundancy to a given memory will therefore increase the $DPW_{GOOD}$ value if the resulting percentage increase in the die yield, $Y_{DIE}$ (which is equal to the percentage increase in the given memory yield, as yields are multiplied together), is greater than the resulting percentage increase in the die area, $A_{DIE}$. The ratio $Y_{DIE} / A_{DIE}$ must increase to justify the added redundancy. For example, if one spare element is added to a memory, then the ratio $Y_{1SP} / A_{1SP}$ must be greater than the ratio $Y_{0SP} / A_{0SP}$, or

$$\frac{Y_{1SP} / A_{1SP}}{Y_{0SP} / A_{0SP}} > 1$$

where:

$Y_{1SP}$ = die yield with one spare element added to the memory

$Y_{0SP}$ = die yield with no spare element added to the memory

$A_{1SP}$ = die area with one spare element added to the memory

$A_{0SP}$ = die area with no spare element added to the memory

Figure 6: Improved Yield from added redundancy ($Y_{0SP}$, $Y_{1SP}$, and $Y_{2SP}$ versus memory core size from 1 to 10 mm², for C = 12)

The graph in figure 7a displays the above ratio for the memory sizes and yield data used in figure 6. The ratio is greater than one for all memory sizes and therefore indicates that one spare element should be added in all cases. The graph also displays the ratio

$$\frac{Y_{2SP} / A_{2SP}}{Y_{1SP} / A_{1SP}}$$

which measures whether a second spare element should be added. In this case, the values indicate that a second spare element should only be added for memory sizes of 4 mm² or greater.

Figure 7b shows the same two ratios for the same memory sizes but with the defect density decreased from 0.002 defects/mm² (1.3 defects/in²) to 0.0002 defects/mm² (0.13 defects/in²). At this reduced defect density, two spare elements are never justified, and one spare element helps only for memory sizes of 3 mm² or greater.

Note that if the design starts off pad-limited then there is some unused silicon area in the core that can be used for redundancy without any cost. The above ratios are still useful in this case as they serve to rank the relative benefits of adding redundancy to the various memories.

The above analysis also assumes that the die area can grow to an arbitrary size. This is often not the case as specific die sizes may only be available due to packaging and other issues. In this case adding redundancy may force a change to the next die size resulting in a more significant area increase.
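The good-die-per-wafer criterion described in this section reduces to a simple ratio test, sketched below in Python. All numbers in the example (die yields, die areas, and the wafer area) are hypothetical and chosen only to show one case where added redundancy pays off and one where it does not.

```python
def good_die_per_wafer(die_yield, die_area, wafer_area):
    """Good die per wafer: die yield times wafer area over die area
    (edge losses ignored)."""
    return die_yield * wafer_area / die_area


def redundancy_gain(yield_before, area_before, yield_after, area_after):
    """Ratio of yield/area after adding a spare to yield/area before.
    A value above 1 means the spare increases good die per wafer."""
    return (yield_after / area_after) / (yield_before / area_before)


WAFER_AREA = 70000.0  # mm², hypothetical usable wafer area

# Case 1: die yield rises from 0.80 to 0.84 while die area grows 50 -> 51 mm².
worthwhile = redundancy_gain(0.80, 50.0, 0.84, 51.0)

# Case 2: die yield rises only from 0.990 to 0.992 for the same area increase.
not_worthwhile = redundancy_gain(0.990, 50.0, 0.992, 51.0)

print(f"case 1 ratio: {worthwhile:.4f}")
print(f"case 2 ratio: {not_worthwhile:.4f}")
print(good_die_per_wafer(0.80, 50.0, WAFER_AREA),
      good_die_per_wafer(0.84, 51.0, WAFER_AREA))
```

The second case reflects the low defect density situation discussed above: when the yield is already high, the area cost of a spare can outweigh its yield benefit.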

Figure 7: Determining effectiveness of added redundancy. Both panels plot the ratios (Y1SP/A1SP)/(Y0SP/A0SP) and (Y2SP/A2SP)/(Y1SP/A1SP) for memory sizes from 1 to 10 mm²: (a) defect density of 0.002 defects/mm², (b) defect density of 0.0002 defects/mm².


HOW MANY FUSES?

Another decision that must be made when adopting a repair strategy is how many fuses (laser or programmable) to incorporate within the die. The optimal number of fuses depends on many factors, including of course the chip-level repair scheme used. This section describes how to determine the number of fuses assuming LogicVision’s BISR architecture described earlier is used.

The number of fuses to incorporate in a chip’s centralized fuse pool is calculated using the following equation:

FuseCount = (Repairs * BISRRegSize) + ((Repairs + 1) * ZeroCountBits)

Where

Repairs = number of repair operations (spare element replacements) that are to be supported per die. Note that depending on the redundancy scheme used, more than one repair operation may be performed on a single memory.

BISRRegSize = Maximum number of repair data bits required to encode a spare element allocation. This number is typically between 6 and 12 bits and depends on the type of spare element (row or column) as well as the memory size and configuration.

ZeroCountBits = ⌈log2(BISRChainLength)⌉ = number of bits needed to encode the largest run of zeros in the repair data for the entire chip. The largest run corresponds to no repair being needed for the entire chip.

BISRChainLength = BISRRegSize * number of spare elements in the chip

Example:

A design has 40 reparable memories and each memory contains two spare rows, resulting in 80 spare elements on the chip. Each spare row requires 8 bits to encode. This results in the following parameter values:

BISRRegSize = 8

BISRChainLength = 8 * 80 = 640

ZeroCountBits = ⌈log2(640)⌉ = 10

The FuseCount equation now looks like:

FuseCount = (Repairs * 8) + ((Repairs + 1) * 10)

The resulting relationship between the number of repairs to be supported per die and the number of fuses required is shown in the following table:

# Repairs Supported    % Spare Utilization    Fuse Count
5                      6%                     100
10                     13%                    190    (typical)
20                     25%                    370
40                     50%                    730
80                     100%                   1450

This reparability versus area overhead tradeoff must in general be analyzed to determine the optimal number of fuses to incorporate into the design. An exact analysis of this tradeoff is difficult, however, as it must take into account many factors such as the defect density of the process being used, the physical design of the memories, and the relative percentage of memory and logic on the chip. It is important to keep in mind that the logic on the chip cannot be repaired. Even though the defect density within the memories tends to be greater than within the logic, as the number of defects grows, the logic quickly becomes the limiting factor. As a result, a good rule of thumb is to plan for a relatively low number of repairs, ten for example, regardless of the number of memories with redundancy. In the above example, a fuse count of 190 would therefore be acceptable.
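The fuse count table for the example design can be reproduced with a few lines of Python. The function names are hypothetical; the formulas are the FuseCount, ZeroCountBits, and BISRChainLength definitions given in this section, with the logarithm taken as base 2 and rounded up.

```python
import math


def zero_count_bits(bisr_reg_size, spare_elements):
    """Bits needed for the zero-run counter, following the source formula:
    ceil(log2(BISRChainLength))."""
    chain_length = bisr_reg_size * spare_elements
    return math.ceil(math.log2(chain_length))


def fuse_count(repairs, bisr_reg_size, zero_bits):
    """FuseCount = (Repairs * BISRRegSize) + ((Repairs + 1) * ZeroCountBits)."""
    return repairs * bisr_reg_size + (repairs + 1) * zero_bits


REG_SIZE = 8        # bits per spare row, from the example
SPARES = 80         # 40 memories with two spare rows each

zbits = zero_count_bits(REG_SIZE, SPARES)
for repairs in (5, 10, 20, 40, 80):
    utilization = 100.0 * repairs / SPARES
    print(f"{repairs:3d} repairs  {utilization:5.1f}% of spares  "
          f"{fuse_count(repairs, REG_SIZE, zbits):5d} fuses")
```

Each additional supported repair costs BISRRegSize + ZeroCountBits fuses (18 in this example), so the fuse pool grows linearly with the number of repairs planned for.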



CHOOSING A REDUNDANCY SCHEME

There are three general forms of redundancy to choose from. Each is described here along with some associated advantages and disadvantages. The reader is encouraged to work with the memory vendor and/or foundry to ultimately determine the best choice for his or her design.

Row Only Redundancy

One or more spare rows are added per memory. In the case of several spare rows, some redundancy schemes force all rows to be allocated as a contiguous block while others allow each row to be allocated separately. It is rare to have more than two spare rows within a memory.

Advantages: This is the cheapest repair method from a BIST and BIRA overhead point of view. The BIST overhead is cheapest as a serial test interface between the BIST controller and memory can be used. A serial interface only requires one comparator per word rather than one per bit (I/O). The amount of BIRA logic is also low and varies only slightly with the memory size as only the most significant bits (MSBs) of the row address bits are logged.

Disadvantages: Has a slight impact on performance as the setup time on the address inputs is slightly increased. Bit level diagnostics are not possible if the serial interface is used.

Column Only Redundancy

There are two forms of column redundancy. In the IO replacement scheme, one or more entire memory sub-arrays are added. Each redundant element can repair any failing column associated with a memory IO. Figure 8a illustrates the structure of an 8-bit repairable memory with 4-to-1 column multiplexing and 1 redundant IO. The redundancy logic for the IO replacement mechanism is highlighted in grey. In the column replacement scheme, one or more single columns are added. Each redundant element can repair one failing column within any memory IO. Figure 8b illustrates the structure of an 8-bit repairable memory with 4-to-1 column multiplexing and 1 redundant column. The logic for the column replacement mechanism is highlighted in grey.

Figure 8: Column Redundancy Schemes. (a) IO replacement; (b) column replacement.

Advantages: This has the least effect on memory performance as there is no impact on address decoding.

Disadvantages: It precludes the use of a serial interface between the BIST controller and the memory as a comparator per bit (I/O) is needed. The area cost is a function of the number of I/Os so that even a small memory can require a large amount of repair circuitry. The BIRA circuitry required to encode the failing I/O number is relatively big and slow. This may reduce the maximum frequency at which the BIST and BIRA can operate.

Row and Column Redundancy

A combination of both the row and column redundancy schemes. One or more spare rows as well as one or more spare columns are added per memory. The number of spare rows or spare columns rarely exceeds two.

Advantages: Provides the highest repair success rate for a given number of spares. Having spares in both dimensions not only improves the ability to cover a random distribution of defects, but also improves the ability to cover defect mechanisms that affect an entire word (e.g. word line fault) or entire column (e.g. bit line fault).

Disadvantages: Very expensive from both a memory overhead as well as from a BIRA overhead point of view. Can only be justified for very large memories and generally for less mature processes.



MEMORY REPAIR SOLUTION CHECKLIST This section provides a laundry list of items that should be considered when assembling a comprehensive memory repair solution.

Memory redundancy options

As the amount and type of redundancy may only be determined later in the design process, it is important to ensure that the chosen family of memories or memory compiler offer a wide enough range of redundancy options

Programmable fuses If self-repair is to be used, it is necessary to choose programmable fuse IP that is compatible with the chosen solution. For example, programmable fuses come with either serial or parallel read/write interfaces. The chosen self-repair IP may only be compatible with one interface type.

Test algorithms

The repair process is contingent on proper screening of all defects. It is important therefore to ensure that test algorithms suitable for the chosen memories and foundry process are used. In most cases BIST is used for memory testing and therefore the test algorithms supported by a given BIST solution should be examined. If new memory designs or new processes are to be used, it may be valuable to adopt a BIST solution that supports soft algorithm programming.
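
As a rough illustration of what soft algorithm programming means, the sketch below describes a test algorithm as data and runs it with a generic engine, so the algorithm can be changed without changing the engine. The well-known March C- algorithm is used purely as an example; the data format, the `FaultyMemory` model, and the `run_march` function are hypothetical and are not taken from any specific BIST product.

```python
# Each element is (address order, operations applied at every address).
# "w0"/"w1" write a value; "r0"/"r1" read and expect that value.
MARCH_C_MINUS = [
    ("up",   ["w0"]),
    ("up",   ["r0", "w1"]),
    ("up",   ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("up",   ["r0"]),
]


class FaultyMemory:
    """Toy one-bit-wide memory with optional stuck-at cells (illustrative)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> stuck value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def run_march(memory, size, algorithm):
    """Apply the algorithm and return the sorted list of failing addresses."""
    failing = set()
    for direction, ops in algorithm:
        addrs = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addrs:
            for op in ops:
                value = int(op[1])
                if op[0] == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    failing.add(addr)
    return failing and sorted(failing) or []


# Address 5 is stuck at 1, so the first "r0" at that address mismatches.
print(run_march(FaultyMemory(8, stuck_at={5: 1}), 8, MARCH_C_MINUS))
```

Supporting a different algorithm in this model only requires a different list of elements, which is the property that makes soft programming useful when memories or processes are new.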

BIST and BIRA IP performance

Finding delay-related defects requires that the BIST and BIRA (repair) IP run at the operating frequencies of the memories under test. This should be investigated taking into account the chosen foundry process. In particular, BIRA logic for column redundancy analysis tends to be slow and is often the limiting factor.

BIST and BIRA area overhead

Area overhead numbers should be examined when choosing BIST and BIRA solutions. BIST solutions should be examined for how well they are able to share test resources across multiple memories. BIRA solutions should be examined for area efficiency for the chosen amount and type of redundancy.

Design automation

If a design contains a large number of memories then the level of design automation that is provided with the chosen BIST and self-repair solutions becomes very important. Comprehensive automation can result in significant design schedule savings.

Memory Vendor Support

Choosing BIST and self-repair solutions that support a large variety of memory vendor IP can be very useful as the choice of memory vendor and/or IP may change from one design to the next. It is also not uncommon to have memories from different vendors within the same design.



MEMORY IP VENDOR INFO

This section provides assistance in sorting through the majority of available third-party memory IP as well as any associated test, repair, and reliability features. For easy access, the information is presented in Tables 1a and 1b. The fields contained in these two tables are described below.

Website

Lists the memory IP vendor’s website address.

Memory Types

Lists the various types of memories offered by the memory IP vendor. Not all listed memory types are necessarily available in all of the vendor’s supported foundry process nodes. This detailed availability information can typically be found on the memory IP vendor’s website.

Memory Types Supported by ETMemory

Lists the memory types that are fully supported by LogicVision’s ETMemory memory BIST solution. This support has been fully verified by both LogicVision and the memory IP provider.

Proprietary BIST Offered?

Indicates whether the memory IP vendor offers a proprietary BIST capability for its memories. Note that these solutions work only with the vendor’s memories and cannot be used with other vendors’ memories.

Integrated ECC available?

Indicates whether Error Correction Code logic is provided with the vendor’s memories. ECC can be used as a replacement for repair, but is often used as a complementary capability.

Reparable Memory Types

Lists the various memory types offered by the memory IP vendor that support some form of redundancy for repair. Not all listed memory types are necessarily available in all of the vendor’s supported foundry process nodes. This detailed availability information can typically be found on the memory IP vendor’s website.

Supported Redundancy Types

Lists the types of redundancy available within at least some of the memory IP vendor’s memories. Not all listed redundancy types are necessarily available for all listed reparable memory types or supported foundry process nodes. This detailed availability information can typically be found on the memory IP vendor’s website.

Reparability Supported on which nodes?

Indicates on which technology nodes reparable memories are available. Most vendors started offering reparability at 130nm and continue to offer reparability at all subsequent nodes.

Proprietary BISR Offered?

Indicates whether the memory IP vendor offers a proprietary BISR capability for its memories. Note that these solutions work only with the vendor’s memories and cannot be used with other vendors’ memories.

Proprietary Fuses Offered

Indicates whether the vendor provides its own fuses for repair, and the types of fuses provided. Fuses are typically available from the foundry.

Reparable Memories supported by ETMemory-Repair?

Lists the memory types that are fully supported by LogicVision’s ETMemory-Repair BISR solution. This support has been fully verified by both LogicVision and the memory IP provider.

Automated LV Memory Library File generation

Indicates whether the vendor’s compilers automatically generate LogicVision memory library file descriptions of the memories. If this is not available for some memories, the memory library file can be generated manually or provided as a service by LogicVision.



Please Note: Memory vendors are listed in alphabetical order. LogicVision neither endorses nor recommends any particular memory vendor. Information is compiled from publicly available data. The reader is encouraged to contact the memory vendors directly for the most up-to-date information and product updates.

Table 1a: Memory IP Vendor Info

ARM / Artisan Dolphin Technology MOSAID

Website www.arm.com www.dolphin-ic.com www.mosaid.com

Single Port SRAM Single Port SRAM Single Port SRAM
Dual Port SRAM Dual Port SRAM Dual Port SRAM
Single Port Reg File Multi Port Reg Files Multi Port Reg Files
Dual Port Reg File ROM ROM
ROM Binary CAM
Ternary CAM

Single Port SRAM Single Port SRAM Single Port SRAM
Dual Port SRAM Dual Port SRAM Dual Port SRAM
Single Port Reg File Multi Port Reg Files Multi Port Reg Files
Dual Port Reg File ROM ROM
ROM

Proprietary BIST Offered? YES NO NO

Integrated ECC available? YES YES NO

Single Port SRAM Single Port SRAM None
Dual Port SRAM
Multi Port Reg Files

Multi-Row Multi-Row
Single Column I/O Dual Column I/O
Row and Column Row and Column

Select Foundry processes at


130nm, 90nm, 80nm, 65nm

130nm, 90nm, 65nm, 45nm

Proprietary BISR Offered? YES NO NO


YES YES YES

Automated LV Memory Library File generation Select Memories All Memories NO

None

N/A




Proprietary Fuses Offered None None None


Memory Types



Table 1b: Memory IP Vendor Info

MoSys TSMC Virage Logic

Website www.mosys.com www.tsmc.com www.viragelogic.com

1T-RAM Single Port SRAM Single Port SRAM
Dual Port SRAM Dual Port SRAM
Single Port Reg File Single Port Reg File
Dual Port Reg File Dual Port Reg File
ROM ROM
1T-RAM
Embedded Flash

1T-RAM Single Port SRAM Single Port SRAM
Dual Port SRAM Dual Port SRAM
Single Port Reg File Single Port Reg File
Dual Port Reg File Dual Port Reg File
ROM ROM
1T-RAM

Proprietary BIST Offered? NO YES YES

Integrated ECC available? YES NO NO

1T-RAM Single Port SRAM Single Port SRAM
Dual Port SRAM

Dual Port Reg File

Multi-Row Multi-Row
Single Column I/O Single Column I/O

Row and Column


Select TSMC processes at


130nm, 90nm, 65nm 90nm, 65nm 180nm, 130nm, 90nm, 80nm, 65nm

Proprietary BISR Offered? NO NO YES

Random Access eFuse
Serial eFuse


YES YES NO

Automated LV Memory Library File generation All Memories Select Memories Select Memories

Laser

Bank

None

Memory IP Types





Proprietary Fuses Offered



FOUNDRY INFO

This section provides a cross reference between memory IP vendors and the memory types they provide for each of the most popular foundries. For easy access, the information is presented in Tables 2a and 2b. The fields contained in these two tables are described below.

Memory Vendors

For each technology node, this field lists the third-party memory IP vendors that provide memories for each listed foundry. Details on the memories provided by each memory IP vendor and the specific process nodes supported (low power, high speed, etc.) can typically be found on the vendor’s website. Most listed vendors provide both single port and dual port SRAMs.

Reparable Memory Vendors

For each technology node, this field lists the third-party memory IP vendors that provide reparable memories for each listed foundry. Details on the memories provided by each memory IP vendor and the specific process nodes supported (low power, high speed, etc.) can typically be found on the vendor’s website. Most listed vendors provide both single port and dual port reparable SRAMs, and most provide row only, column only, and row and column redundancy options.

Please Note: Foundries and memory vendors are listed in alphabetical order. LogicVision neither endorses nor recommends any particular foundry or memory vendor. Information is compiled from publicly available data. The reader is encouraged to contact the foundries and memory vendors directly for the most up-to-date information and product updates.



Table 2a: Memory IP per Foundry Info

Chartered IBM SMIC

www.charteredsemi.com www.ibm.com www.smics.com

ARM / Artisan
ARM / Artisan Dolphin ARM / Artisan
Virage Logic Virage Logic

Dolphin

ARM / Artisan ARM / Artisan ARM / Artisan
MoSys Dolphin MoSys

Virage Logic Virage Logic

ARM / Artisan ARM / Artisan ARM / Artisan
MoSys Dolphin MoSys

ARM / Artisan ARM / Artisan ARM / Artisan
MoSys Virage Logic Virage Logic

Virage Logic

ARM / Artisan ARM / Artisan ARM / Artisan
Virage Logic Virage Logic Virage Logic

ARM / Artisan ARM / Artisan

ARM / Artisan ARM / Artisan

90nm

Reparable Memory vendors

Memory vendors

Memory vendors

65nm



Memory vendors


180nm

Memory vendors

130nm



Table 2b: Memory IP per Foundry Info

Tower TSMC UMC

www.towersemi.com www.tsmc.com www.umc.com

ARM / Artisan ARM / Artisan
ARM / Artisan Dolphin Virage Logic

Tower Virage Logic
Virage Logic

Virage Logic Dolphin Virage Logic
Virage Logic

ARM / Artisan ARM / Artisan ARM / Artisan
Virage Logic Dolphin Dolphin

MoSys Virage Logic
Virage Logic

ARM / Artisan ARM / Artisan ARM / Artisan
Virage Logic Dolphin Dolphin

ARM / Artisan ARM / Artisan
Dolphin Virage Logic
MoSys
TSMC

Virage Logic

ARM / Artisan ARM / Artisan
Dolphin Virage Logic
TSMC

Virage Logic

Dolphin
TSMC

Dolphin
TSMC

180nm

Memory vendors

130nm

Memory vendors



90nm

Memory vendors


65nm

Memory vendors




REFERENCES

[1] S. Pateras, “Best Practices for Cost Effective Test and Yield Optimization of Embedded Memories”, FSA Forum, vol. 13, no. 4, December 2006.

[2] J.A. Cunningham, “The Use and Evaluation of Yield Models in Integrated Circuit Manufacturing”, IEEE Transactions on Semiconductor Manufacturing, vol. 3, no. 2, May 1990, pp. 60-71.

[3] S. Shoukourian, V. Vardanian and Y. Zorian, “SoC Yield Optimization via an Embedded-Memory Test and Repair Infrastructure”, IEEE Design & Test of Computers, May-June 2004, pp. 200-207.
