A Class of Random Multiple Bits in a Byte Error Correcting and Single Byte Error Detecting (St/bEC-SbED) Codes

Ganesan Umanesan, Member, IEEE, and Eiji Fujiwara, Fellow, IEEE

Abstract: Correcting multiple random bit errors that corrupt a single DRAM chip becomes very important in certain applications, such as semiconductor memories used in computer and communication systems, mobile systems, aircraft, and satellites. This is because, in these applications, the presence of strong electromagnetic waves in the environment or the bombardment of an energetic particle on a DRAM chip is highly likely to upset more than just one bit stored in that chip. On the other hand, entire chip failures are often presumed to be less likely events and, in most applications, detection of errors caused by single chip failures is preferred to correction due to check bit length considerations. Under this situation, codes capable of correcting random multiple bit errors that are confined to a single chip output and simultaneously detecting errors caused by single chip failures are attractive for application in high speed memory systems. This paper proposes a class of codes called Single t/b-error Correcting-Single b-bit byte Error Detecting (St/bEC-SbED) codes, which have the capability of correcting random t-bit errors occurring within a single b-bit byte and simultaneously indicating single b-bit byte errors. For the practical case where the chip data output is 8 bits, i.e., $b = 8$, the S3/8EC-S8ED code proposed in this paper, for example, requires only 12 check bits at information length 64 bits. Furthermore, this S3/8EC-S8ED code is capable of correcting errors caused by single subarray data faults, i.e., single 4-bit byte errors, as well. This paper also shows that perfect S(b-1)/bEC-SbED codes, i.e., perfect St/bEC-SbED codes for the case where $t = b - 1$, do exist and provides a theorem to construct these codes.

Index Terms: Random multiple bits in a byte error, single t/b-error correcting-single b-bit byte error detecting (St/bEC-SbED) codes.

1 INTRODUCTION

SEMICONDUCTOR memory chips are highly vulnerable to multiple random bit errors when they are exposed to strong electromagnetic waves or radioactive particles such as neutrons [1]. In certain applications, such as semiconductor memories used in mobile systems, aircraft, or satellites, the bombardment of an energetic cosmic particle on a DRAM chip also upsets multiple bits stored in that chip [2]. Consequently, multiple random bit errors that are confined to a single byte become the most significant errors in memory codewords. In other words, this error is a random t-bit error confined to a single b-bit byte, where $1 \le t \le b$, and it is necessary to correct this type of error to reduce the bit error rate to an acceptable level [3]. The b-bit byte errors, on the other hand, are caused by entire chip failures, which are often presumed to be less likely events. In most cases, single b-bit byte error detection is preferred to single b-bit byte error correction because single b-bit byte error correction requires at least $2b$ check bits regardless of what the information bit length is.

We know that byte error control codes such as Single b-bit byte Error Correcting-Double b-bit byte Error Detecting (SbEC-DbED) codes and Single bit Error Correcting-Double bit Error Detecting-Single b-bit byte Error Detecting (SEC-DED-SbED) codes have been successfully applied to high speed memory systems using DRAM chips with $b = 4$ bits I/O data [4]. However, these codes, especially at practical information bit lengths, do not qualify as suitable codes for memory systems that employ recent high-density DRAM chips. The reason is that the recent high-density DRAM chips have wide I/O data, such as 8, 16, or 32 bits, that is, $b = 8$, 16, or 32 [5], [6], [7]. As such, the main drawback of using these codes is that the minimum number of check bits required is often disproportionately large at practical information bit lengths. For a memory system using recent 16-bit I/O data DRAM chips, a single 16-bit byte error correcting code, i.e., an S16EC code, requires at least 32 check bits, which is too expensive for the practical case where the information length is 64 bits.

Existing Single bit Error Correcting-Double bit Error Detecting-Single b-bit byte Error Detecting (SEC-DED-SbED) codes [8] and Adjacent Double bit within a b-bit byte Error Correcting-Single b-bit byte Error Detecting (ADECb-SbED) codes [9] have good code rates, but they do not correct multiple random bit errors. On the other hand, existing multiple random bit error correcting codes such as Double bit Error Correcting (DEC) codes and Triple bit Error Correcting (TEC) codes have two objectionable features when considered for high speed DRAM applications: 1) they introduce too many check bits and 2) the parallel encoding/decoding hardware for these codes is complex [4], [10]. Most of the other existing codes which have bit and byte error control capability [11], [12], [13], [15] are often based on BCH codes or tensor products of existing codes, and thus they have the same limitations as the above codes. For example, the DEC-S4EC code presented in [11] is obtained by rearranging the binary columns in the parity check matrix of a BCH double bit error correcting code.

IEEE TRANSACTIONS ON COMPUTERS, VOL. 52, NO. 7, JULY 2003

The authors are with the Graduate School of Information Science and Engineering, Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, Tokyo 152-8552, Japan. E-mail: {nezz, fujiwara}@fuji.cs.titech.ac.jp.

Manuscript received 26 Oct. 2001; revised 18 Sept. 2002; accepted 20 Sept. 2002. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number 115259.

0018-9340/03/$17.00 © 2003 IEEE. Published by the IEEE Computer Society

From the above, it is clear that existing codes are not necessarily suitable for correcting multiple random bit errors confined to a single DRAM chip output. In this paper, we call a random t-bit error confined to a b-bit byte a t/b-error, where $1 \le t \le b$, and propose a new class of codes called Single t/b-Error Correcting-Single b-bit Byte Error Detecting (St/bEC-SbED) codes. The proposed codes correct random t-bit errors confined to a single b-bit byte and simultaneously detect single b-bit byte errors.

The organization of this paper is as follows: Section 2 deals with the St/bEC-SbED code, where fundamental theorems, a construction method, and the b-bit byte error correction capability are illustrated. A decoding method and an evaluation are provided for the proposed codes in Section 3 and Section 4, respectively. Section 5 considers the special case where $t = b - 1$ and shows that perfect S(b-1)/bEC-SbED codes do exist. Here, a theorem is also provided to construct these perfect codes. Section 6 shows how the proposed codes can be suitably made to correct errors caused by single subarray data faults as well. Finally, the paper concludes in Section 7 with some remarks on future work. Table 1 shows the notations used in this paper. All the matrices and vectors used in this paper are over the field $GF(2)$, and vectors are row vectors unless stated otherwise.

2 St/bEC-SbED CODE

An St/bEC-SbED code corrects single t/b-errors and simultaneously detects b-bit byte errors. The St/bEC-SbED code function becomes single bit error correction and single b-bit byte error detection, i.e., SEC-SbED [4], if $t = 1$, and single b-bit byte error correction, i.e., SbEC, if $t = b$.

2.1 Fundamental Theorems

First, we define the t/b-error as follows:

Definition 1. An error is called a t/b-error if t or fewer bits within a b-bit byte are in error, where $1 \le t \le b$.

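Theorem 1. Let $H_i$ denote an $r \times b$ binary submatrix for $0 \le i \le n-1$. The null space of $H = [H_0\ H_1\ H_2\ H_3\ \cdots\ H_{n-1}]$ is an St/bEC-SbED code if and only if:

1. $e \cdot H_i^T \neq 0$ for $0 \le i \le n-1$, $\forall e \in GF(2^b)$, and
2. $e_1 \cdot H_i^T \neq e_2 \cdot H_j^T$ for $0 \le i < j \le n-1$, $\forall e_1 \in E_{t/b}$, $\forall e_2 \in GF(2^b)$,

where $E_{t/b} = \{ e \in GF(2^b) \mid 1 \le w(e) \le t \}$ and $H^T$ denotes the transpose of matrix $H$.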

Proof. Condition 1 of Theorem 1 ensures that all the b-bit byte error patterns, including single t/b-error patterns, generate a nonzero syndrome. Condition 2 includes the following condition:

$$e_1 \cdot H_i^T \neq e_2 \cdot H_j^T \quad \text{for } 0 \le i \neq j \le n-1,\ \forall e_1, e_2 \in E_{t/b},$$

which ensures that syndromes generated by different single t/b-error patterns are distinguishable. Therefore, the code is capable of correcting single t/b-errors. On the other hand, Condition 2 itself asserts that a syndrome generated by a b-bit byte error pattern is different from that of a t/b-error pattern; thus, b-bit byte error patterns are detectable. □

Theorem 2. A linear binary St/bEC-SbED code requires at least $b + t$ check bits.

Proof. From Condition 1 of Theorem 1, it is clear that all the binary columns of the parity check matrix $H$ of an St/bEC-SbED code are nonzero vectors. Then, according to Condition 2 of Theorem 1, there exist at least $b + t$ binary columns of $H$, corresponding to a byte error and a t/b-error, that are linearly independent. Therefore, a linear binary St/bEC-SbED code requires at least $b + t$ check bits. □

Theorem 3. An $(N, N-R)$ St/bEC-SbED code exists only if

$$N \le b \cdot \frac{2^R - 2^b + \sum_{i=1}^{t} \binom{b}{i}}{\sum_{i=1}^{t} \binom{b}{i}}.$$

Proof. The total number of t/b-errors that corrupt a single b-bit byte is given by $\sum_{i=1}^{t} \binom{b}{i}$. There are $N/b$ bytes in a codeword with length $N$ bits. Therefore, we need $(N/b) \sum_{i=1}^{t} \binom{b}{i}$ different syndrome patterns to correct all single t/b-error patterns. On the other hand, in a b-bit byte, the total number of single b-bit byte error patterns that are not single t/b-errors is given by $\sum_{i=t+1}^{b} \binom{b}{i}$. According to Conditions 1 and 2 of Theorem 1, all these error patterns should generate distinguishable nonzero syndromes when considered for any particular byte, but they do not have to be distinguishable from syndromes generated by the $\sum_{i=t+1}^{b} \binom{b}{i}$ byte error patterns belonging to another byte. In other words, we only need another $\sum_{i=t+1}^{b} \binom{b}{i}$ syndromes for detecting single b-bit byte errors. From this we have:

$$2^R - 1 \ge \frac{N}{b} \sum_{i=1}^{t} \binom{b}{i} + \sum_{i=t+1}^{b} \binom{b}{i} = \frac{N}{b} \sum_{i=1}^{t} \binom{b}{i} + (2^b - 1) - \sum_{i=1}^{t} \binom{b}{i}.$$

Subsequently, the inequality in Theorem 3 holds. □

Corollary 1. An SEC-SbED code exists only if $N \le 2^R - 2^b + b$.

TABLE 1. Notations Used in This Paper



Proof. Substituting $t = 1$ into the inequality in Theorem 3, we get $N \le 2^R - 2^b + b$. □

Also, we know that the St/bEC-SbED code function becomes an SbEC code when $t = b$. Substituting $t = b$ into the inequality in Theorem 3 yields the bound for SbEC codes, i.e., $N \le b(2^R - 1)/(2^b - 1)$.
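The bound of Theorem 3 and its two special cases can be evaluated numerically. The following is a minimal sketch in Python; the function name `max_code_length` is a hypothetical helper introduced for illustration only, and the result is rounded down to an integer number of bits.

```python
from math import comb


def max_code_length(R: int, b: int, t: int) -> int:
    """Largest code length N (in bits) permitted by the Theorem 3 bound."""
    # Number of nonzero error patterns of weight <= t inside one b-bit byte.
    tb_errors = sum(comb(b, i) for i in range(1, t + 1))
    # N <= b * (2^R - 2^b + tb_errors) / tb_errors, rounded down.
    return (b * (2 ** R - 2 ** b + tb_errors)) // tb_errors


if __name__ == "__main__":
    print(max_code_length(12, 8, 1))  # t = 1: the SEC-S8ED case of Corollary 1
    print(max_code_length(16, 8, 8))  # t = b: the S8EC case
    print(max_code_length(12, 8, 3))  # S3/8EC-S8ED with 12 check bits
```

With $t = 1$ the value reduces to $2^R - 2^b + b$, and with $t = b$ it reduces to $b(2^R - 1)/(2^b - 1)$, in agreement with the two substitutions above.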

2.2 Construction Method

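Lemma 1. For $1 \le t \le b$, let $H' = [h_0\ h_1\ h_2\ \cdots\ h_{b-1}]$ be an $r \times b$ binary matrix such that $e \cdot H'^T \neq 0$ holds $\forall e \in GF(2^b)$ with $w(e) \le t$. Here, $h_i$ denotes a binary column vector of $GF(2^r)$ for $0 \le i \le b-1$. Let $\gamma$ be a primitive element of $GF(2^{R-b})$, where $R \ge b + r$. Define:

$$\gamma^i H' = [\gamma^i \phi(h_0)\ \ \gamma^i \phi(h_1)\ \ \gamma^i \phi(h_2)\ \cdots\ \gamma^i \phi(h_{b-1})]$$

for $0 \le i \le 2^{R-b} - 2$, where $\phi : GF(2^r) \to GF(2^{R-b})$ is a homomorphism of $GF(2^r)$ into $GF(2^{R-b})$ under addition. Then, the null space of

$$H = [H_R \mid I_R] = \left[\begin{array}{ccccc|cc} I_b & I_b & I_b & \cdots & I_b & I_b & O_{b \times (R-b)} \\ \gamma^0 H' & \gamma^1 H' & \gamma^2 H' & \cdots & \gamma^{2^{R-b}-2} H' & O_{(R-b) \times b} & I_{R-b} \end{array}\right]$$

is a systematic St/bEC-SbED code with check bit length $R$ and code length $N = b(2^{R-b} - 1) + R$. Here, $I_p$ and $O_{p \times q}$ denote the binary $p \times p$ identity matrix and the $p \times q$ all zero matrix, respectively.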

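Proof. Clearly, $e \cdot I_b \neq 0$ for any $e \in GF(2^b)$, and $e^{\dagger} \cdot I_{R-b} \neq 0$ for any $e^{\dagger} \in GF(2^{R-b})$; therefore, Condition 1 of Theorem 1 is satisfied. Furthermore, it is clear that the following holds for any $e_1 \in E_{t/b}$, $e_2 \in GF(2^b)$, and $e_2^{\dagger} \in GF(2^{R-b})$:

1. $e_1 \cdot \begin{bmatrix} I_b \\ \gamma^i H' \end{bmatrix}^T \neq e_2 \cdot \begin{bmatrix} I_b \\ O_{(R-b) \times b} \end{bmatrix}^T$,
2. $e_1 \cdot \begin{bmatrix} I_b \\ \gamma^i H' \end{bmatrix}^T \neq e_2^{\dagger} \cdot \begin{bmatrix} O_{b \times (R-b)} \\ I_{R-b} \end{bmatrix}^T$,
3. $e_2 \cdot \begin{bmatrix} I_b \\ O_{(R-b) \times b} \end{bmatrix}^T \neq e_2^{\dagger} \cdot \begin{bmatrix} O_{b \times (R-b)} \\ I_{R-b} \end{bmatrix}^T$

for any $i$ where $0 \le i \le 2^{R-b} - 2$. Therefore, to show that the code is St/bEC-SbED, we only need to show that Condition 2 of Theorem 1 is satisfied by the binary columns corresponding to the first $2^{R-b} - 1$ bytes. Let $e_1 \in E_{t/b}$ and $e_2 \in GF(2^b)$ be such that

$$e_1 \cdot \begin{bmatrix} I_b \\ \gamma^i H' \end{bmatrix}^T = e_2 \cdot \begin{bmatrix} I_b \\ \gamma^j H' \end{bmatrix}^T$$

holds for $0 \le i \neq j \le 2^{R-b} - 2$. Then, $e_1 = e_2$ because the upper submatrix is an identity matrix. We know that $e_1 \cdot H'^T \neq 0$ because $w(e_1) \le t$. Therefore,

$$e_1 \cdot (\gamma^i H')^T = (\gamma^i)^T \cdot \left( e_1 \cdot \phi(H')^T \right) = (\gamma^i)^T \cdot \phi\left( e_1 \cdot H'^T \right) \neq 0$$

because $\phi$ is a homomorphism of $GF(2^r)$ into $GF(2^{R-b})$ under addition. From this, we know that $e_1 \cdot (\gamma^i H')^T = e_2 \cdot (\gamma^j H')^T$ leads to $i = j$, which is a contradiction. Thus, Condition 2 of Theorem 1 is satisfied as well. This proves that the null space of $H$ is an St/bEC-SbED code. It is apparent that the check length is $R$ bits and the code length is $N = b(2^{R-b} - 1) + R$ bits. □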

In Lemma 1, depending on whether $t < b$ or $t = b$, we can consider the $H'$ matrix as an $r \times b$ parity check matrix of a binary $(b, b-r)$ code with minimum distance $t + 1$, or a $b \times b$ binary matrix with full rank. In other words, if $t < b$, $H'$ represents a binary t-error detecting code. Otherwise, $H'$ can be any $b \times b$ binary matrix with rank $b$, including the identity matrix.

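Lemma 2. The null space of

$$H = \left[\begin{array}{c|c|c} H_R & \begin{matrix} 0_{b \times b}\ \ 0_{b \times b}\ \cdots\ 0_{b \times b} \\ H_{R-b} \end{matrix} & I_R \end{array}\right],$$

where $R \ge 2b + r$, is a systematic St/bEC-SbED code with check bit length $R$ and code length $N = b(2^{R-b} - 1) + b(2^{R-2b} - 1) + R$. Here, $0_{b \times b}$ is a $b \times b$ all zero binary matrix.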

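Proof. According to Lemma 1, the null spaces of $[H_R \mid I_R]$ and $[H_{R-b} \mid I_{R-b}]$ are St/bEC-SbED codes. Since the null space of $[H_{R-b} \mid I_{R-b}]$ is an St/bEC-SbED code and all the nonzero elements of the subspace spanned by the binary columns of $\begin{bmatrix} I_b \\ 0_{(R-b) \times b} \end{bmatrix}$ have nonzero upper b-bit column vectors, the null space of

$$\left[\begin{array}{c|c} \begin{matrix} 0_{b \times b}\ \cdots\ 0_{b \times b} \\ H_{R-b} \end{matrix} & I_R \end{array}\right] = \left[\begin{array}{c|cc} 0_{b \times b}\ \cdots\ 0_{b \times b} & I_b & 0_{b \times (R-b)} \\ H_{R-b} & 0_{(R-b) \times b} & I_{R-b} \end{array}\right]$$

is an St/bEC-SbED code. On the other hand, the null space of

$$\left[\begin{array}{c|c} H_R & \begin{matrix} 0_{b \times b}\ \ 0_{b \times b}\ \cdots\ 0_{b \times b} \\ H_{R-b} \end{matrix} \end{array}\right]$$

is also an St/bEC-SbED code, because any b-bit error pattern corrupting a byte in the first partition generates a nonzero syndrome with a nonzero upper b-bit column vector, whereas such a b-bit error pattern corrupting a byte in the second partition generates a nonzero syndrome with an all-zero upper b-bit column vector. This proves that the null space of $H$ is an St/bEC-SbED code, as required. It is apparent that the check length of the code is $R$ bits and the code length is $N = b(2^{R-b} - 1) + b(2^{R-2b} - 1) + R$ bits. □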


Lemmas 1 and 2 suggest that if $R \ge \omega b + r$, we can concatenate $\omega$ iterative partitions to obtain a long St/bEC-SbED code. The following theorem uses iterative concatenation of partitions to obtain a new class of St/bEC-SbED codes. Here, we use $P_c$ to denote the partition that corresponds to the check bit portion of the codeword and $P_i$, where $0 \le i \le \omega - 1$, to denote the $i$th partition that corresponds to a portion of information bits in the codeword.

Theorem 4. Let $R \ge \omega b + r$. Then, the null space of the $H$ matrix shown in Fig. 1 is an St/bEC-SbED code with check length $R$ bits and code length $N = R + \sum_{i=1}^{\omega} b(2^{R-ib} - 1)$ bits. Here, $0_{b \times b}$ is a $b \times b$ all zero binary matrix and $\omega\ (\ge 1)$ is an integer.

Proof. By iteratively applying Lemma 2, we can show that the null space of $H$ is an St/bEC-SbED code. The code length of this code is given by:

$$N = R + b(2^{R-b} - 1) + b(2^{R-2b} - 1) + \cdots + b(2^{R-\omega b} - 1) = R + \sum_{i=1}^{\omega} b(2^{R-ib} - 1),$$

where $R$ denotes the check length in bits. □
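The code length formula of Theorem 4 is easy to evaluate for concrete parameters. The sketch below is illustrative only; `code_length` is a hypothetical helper name, and the function does not check the side condition $R \ge \omega b + r$, which the caller is assumed to guarantee.

```python
def code_length(R: int, b: int, omega: int) -> int:
    """N = R + sum_{i=1..omega} b * (2^(R - i*b) - 1), as stated in Theorem 4."""
    return R + sum(b * (2 ** (R - i * b) - 1) for i in range(1, omega + 1))


if __name__ == "__main__":
    print(code_length(12, 8, 1))  # one partition, b = 8, 12 check bits
    print(code_length(8, 3, 2))   # two partitions, b = 3, 8 check bits
```

For $R = 12$, $b = 8$, $\omega = 1$ the formula gives 132 bits, and for $R = 8$, $b = 3$, $\omega = 2$ it gives 110 bits, which are the code lengths of the two examples in Section 2.3.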

Corollary 2. If $\mathrm{rank}(H') = b$, the code obtained by applying Theorem 4 is a single b-bit byte error correcting (SbEC) code.

Proof. From Conditions 1 and 2 of Theorem 1, it is clear that when $t = b$, the St/bEC-SbED code function becomes single b-bit byte error correction (SbEC). Therefore, if $\mathrm{rank}(H') = b$, the code obtained by applying Theorem 4 is an SbEC code. □

For the case where $t = b$, we can choose the $b \times b$ binary identity matrix as the rank $b$ matrix in the constructions, i.e., $H' = I_b$. Then, the code obtained by applying Theorem 4 represents the well-known Hong-Patel codes [14]. In other words, the code class given by Theorem 4 includes the Hong-Patel code as a special case when $t = b$ and $H' = I_b$.

2.3 Example

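We will illustrate the construction of a practical St/bEC-SbED code where $t = 3$ and $b = 8$. The resulting code is called an S3/8EC-S8ED code. We can construct this code by using binary distance-4 Hamming codes. Let $C'$ be a distance-4 binary extended cyclic Hamming code with code length 8 bits, i.e., $C'$ is an (8, 4) SEC-DED code. Let $H'$ denote the $4 \times 8$ binary parity check matrix of the code $C'$. From this, we have

$$H' = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 \end{bmatrix} \rightarrow [\gamma^0\ \ \gamma^4\ \ \gamma^8\ \ \gamma^{14}\ \ \gamma^{10}\ \ \gamma^{13}\ \ \gamma^{12}\ \ \gamma^{7}],$$

where $\alpha$ is a primitive element of $GF(2^3)$ corresponding to the primitive polynomial $p(x) = x^3 + x + 1$ and $\gamma$ is a primitive element of $GF(2^4)$ corresponding to the primitive polynomial $p(x) = x^4 + x + 1$. Then, as required by Lemma 1, $H'$ represents a binary three-error detecting code. Notice that, in this case, the homomorphism $\phi : GF(2^4) \to GF(2^4)$ is simply given by $\phi(x) = x$ for any $x \in GF(2^4)$. The parity check matrix of the resulting (132, 120) S3/8EC-S8ED code is given by the following matrix:

$$H = \left[\begin{array}{ccccc|cc} I_8 & I_8 & I_8 & \cdots & I_8 & I_8 & O_{8 \times 4} \\ \gamma^0 H' & \gamma^1 H' & \gamma^2 H' & \cdots & \gamma^{14} H' & O_{4 \times 8} & I_4 \end{array}\right],$$

where

$$\gamma^i H' = [\gamma^{i}\ \ \gamma^{i+4}\ \ \gamma^{i+8}\ \ \gamma^{i+14}\ \ \gamma^{i+10}\ \ \gamma^{i+13}\ \ \gamma^{i+12}\ \ \gamma^{i+7}]$$

for $0 \le i \le 14$. Fig. 2 shows the parity check matrix in binary form. By deleting any 56 binary columns from the information bit part of this matrix, we can obtain a practical (76, 64) S3/8EC-S8ED systematic code with 12 check bits.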
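The example can be checked by brute force. The following Python sketch rebuilds the columns of the (132, 120) matrix from the exponents listed for $H'$ and tests the two conditions of Theorem 1. It rests on several assumptions made for illustration: field elements of $GF(2^4)$ are stored as 4-bit integers in the polynomial basis of $x^4 + x + 1$, each 12-bit column keeps the identity part in the upper 8 bits, Condition 2 is checked for every ordered pair of distinct bytes, and all function names are hypothetical.

```python
from itertools import combinations

# GF(2^4) from p(x) = x^4 + x + 1: EXP[k] is gamma^k as a 4-bit integer.
EXP = []
_v = 1
for _ in range(15):
    EXP.append(_v)
    _v <<= 1
    if _v & 0b10000:
        _v ^= 0b10011

H_PRIME_EXPONENTS = (0, 4, 8, 14, 10, 13, 12, 7)  # exponents listed for H'
B, R = 8, 12


def xor_all(values):
    """XOR of a sequence of integers (sum of columns over GF(2))."""
    acc = 0
    for v in values:
        acc ^= v
    return acc


def syndromes(columns, max_weight):
    """Set of syndromes of all error patterns of weight 1..max_weight in one byte."""
    out = set()
    for w in range(1, min(max_weight, len(columns)) + 1):
        for subset in combinations(columns, w):
            out.add(xor_all(subset))
    return out


def build_bytes():
    """Columns of the (132, 120) matrix as 12-bit integers, grouped by byte."""
    low = R - B  # number of lower syndrome bits (4)
    info = [[(1 << (low + k)) | EXP[(e + i) % 15]
             for k, e in enumerate(H_PRIME_EXPONENTS)] for i in range(15)]
    check_upper = [1 << (low + k) for k in range(B)]  # columns of [I_8 ; O]
    check_lower = [1 << k for k in range(low)]        # columns of [O ; I_4]
    return info + [check_upper, check_lower]


def satisfies_theorem1(byte_list, t):
    """Brute-force test of Conditions 1 and 2 of Theorem 1."""
    every = [syndromes(cols, len(cols)) for cols in byte_list]
    light = [syndromes(cols, t) for cols in byte_list]
    if any(0 in s for s in every):  # Condition 1: no zero syndrome
        return False
    n = len(byte_list)
    return all(light[i].isdisjoint(every[j])  # Condition 2 for i != j
               for i in range(n) for j in range(n) if i != j)


if __name__ == "__main__":
    code = build_bytes()
    print(sum(len(cols) for cols in code))  # total code length in bits
    print(satisfies_theorem1(code, 3))
```

Running the same check with $t = 4$ is expected to fail, because $H'$ has minimum distance 4 and a weight-4 codeword of $C'$ produces an all-zero lower syndrome part.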

Fig. 3 shows another example of an St/bEC-SbED code for the case where $t = 2$ and $b = 3$, i.e., a (110, 102) S2/3EC-S3ED code. As shown in Theorem 4, this code has two iterative partitions, $H_8$ and $H_5$, corresponding to the information bit portion of the codeword, and an identity matrix $I_8$ corresponding to the check bit portion of the codeword.

2.4 Byte Error Correcting Capability

The St/bEC-SbED codes obtained by applying the above methods can correct some b-bit byte errors as well. Here, we consider a lower bound on the b-bit byte error correcting capability for St/bEC-SbED codes obtained by applying Lemma 1. The b-bit byte error correcting capability of a code is given by:

$$P_{SbEC} = \frac{\text{Number of correctable } b\text{-bit byte error patterns}}{\text{Total number of } b\text{-bit byte error patterns}}.$$


Fig. 1. Hong-Patel [14] type structure of the parity check matrix.



The following theorem illustrates the b-bit byte error correcting capability of a code obtained by applying Lemma 1.

Theorem 5. Let $C$ be an $(N, N-R)$ systematic code obtained by applying Lemma 1. Assume that the null space of $H'$ represents a $(b, b-r)$ binary linear code. Then, the b-bit byte error correcting capability $P_{SbEC}$ of the code $C$ satisfies:

$$P_{SbEC} \ge \frac{2^{R-b}(2^b + 1) - 2^{R-r} - 1}{2^R - 1}.$$

Proof. Consider the $R \times N$ binary systematic parity check matrix obtained by applying Lemma 1. This represents a code where the last byte has only $R - b$ bits, and the code length is $N = b \cdot 2^{R-b} + R - b$ bits. All the syndrome patterns that are generated by single byte errors are given in Fig. 4.

Here, the syndrome patterns resulting from b-bit byte errors occurring in any of the first $(N - R)/b$ bytes are shown in the left two patterns, whereas the right two patterns are the results of b-bit byte errors corrupting the second last byte and $(R - b)$-bit byte errors corrupting the last byte. The syndromes which have a nonzero upper part and a nonzero lower part indicate correctable error patterns because the upper part indicates the error pattern and the lower part indicates the error location. Also, the syndromes with an all-zero upper part and a nonzero lower part are generated by correctable errors because these are errors corrupting the last byte. Only the syndromes that have a nonzero upper part and an all-zero lower part are generated by uncorrectable error patterns. To that end, we know that an error pattern $e \in GF(2^b)$ for which $e \cdot H'^T = 0$ holds generates such syndromes. These error patterns represent the null space of $H'$. Considering the null space of $H'$ as a $(b, b-r)$ linear code, we know that the total number of such nonzero error patterns in a byte is given by $2^{b-r} - 1$. Therefore, the single b-bit byte error correcting capability of the code is given by:

$$P_{SbEC} = \frac{\text{Number of correctable } b\text{-bit byte error patterns}}{\text{Total number of } b\text{-bit byte error patterns}} = 1 - \frac{\frac{N-R+b}{b}\,(2^{b-r} - 1)}{\frac{N-R+b}{b}\,(2^b - 1) + (2^{R-b} - 1)} = 1 - \frac{2^{b-r} - 1}{(2^b - 1) + (2^{R-b} - 1)\,\frac{b}{N-R+b}}.$$

However, we know that $b/(N - R + b) \ge 2^{-(R-b)}$ because $N \le b \cdot 2^{R-b} + R - b$ for codes obtained by applying Lemma 1. Therefore, we end up with:

$$P_{SbEC} \ge 1 - \frac{2^{b-r} - 1}{(2^b - 1) + 2^{-(R-b)}(2^{R-b} - 1)} = \frac{2^{R-b}(2^b + 1) - 2^{R-r} - 1}{2^R - 1},$$

as required by Theorem 5. □

Fig. 2. Example of (132, 120) S3/8EC-S8ED code.

Fig. 3. Example of (110, 102) S2/3EC-S3ED code.

For the S3/8EC-S8ED code shown in Fig. 2, we have $R = 12$, $b = 8$, and $r = 4$. Then, the 8-bit byte error correcting capability of the S3/8EC-S8ED code satisfies $P_{S8EC} \ge \{2^4(2^8 + 1) - 2^8 - 1\}/(2^{12} - 1) = 0.9413$; that is, any shortened S3/8EC-S8ED code with check bit length $R = 12$ corrects 94.13 percent or more of the total 8-bit byte errors.
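The numerical value quoted above follows directly from the bound of Theorem 5. A minimal Python sketch, using exact rational arithmetic and a hypothetical helper name `sbec_capability_bound`, is:

```python
from fractions import Fraction


def sbec_capability_bound(R: int, b: int, r: int) -> Fraction:
    """Lower bound on P_SbEC from Theorem 5, as an exact fraction."""
    numerator = 2 ** (R - b) * (2 ** b + 1) - 2 ** (R - r) - 1
    return Fraction(numerator, 2 ** R - 1)


if __name__ == "__main__":
    bound = sbec_capability_bound(12, 8, 4)
    print(bound, float(bound))
```

For $R = 12$, $b = 8$, $r = 4$ the numerator is $16 \cdot 257 - 256 - 1 = 3855$ and the denominator is 4095, which gives the 0.9413 stated in the text.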

3 DECODING

In this section, we will consider a decoding method for codes derived by Lemma 1 and evaluate the decoding hardware complexity for the practical S3/8EC-S8ED code. This decoding method can be easily extended to the codes in Theorem 4. The decoding method presented in this section is guaranteed to correct all single t/b-errors. In addition, it corrects all b-bit byte errors which are correctable. Uncorrectable b-bit byte error patterns are only detected. Fig. 5 shows the decoding logic flow chart.

Let $v$ be the received word. The syndrome $S$ can be calculated as follows: $v \cdot H^T = S = [s_1\ s_2]$, where $s_1 \in GF(2^b)$ and $s_2 \in GF(2^r)$ (the lower part of the syndrome has $R - b$ bits, and $R - b = r$ in the S3/8EC-S8ED example).

We know that the syndrome vector $s_1$ corresponds to the error pattern in $GF(2^b)$, whereas the syndrome vector $s_2$ indicates the location of a single t/b-error or a correctable b-bit byte error pattern, except for the case where the error has corrupted the last byte. Based on this, we can devise a decoding algorithm as follows:

1. $s_1 = 0$, $s_2 = 0$: There are no errors. The received word is a codeword.
2. $s_1 = 0$, $s_2 \neq 0$: The last byte is in error. This byte has $r$ bits in it. The error pattern itself is given by $s_2$.
3. $s_1 \neq 0$, $s_2 = 0$: If $s_1 \cdot H'^T \neq 0$, it is a correctable error pattern. The error pattern is given by $s_1$, and the error location is the second last byte in the codeword. On the other hand, if $s_1 \cdot H'^T = 0$, the error pattern cannot be corrected. In this case, we generate a signal to detect such an error.
4. $s_1 \neq 0$, $s_2 \neq 0$: In this case, the error pattern is given by $s_1$. To find the error location, we calculate $s_1 \cdot (\gamma^i H')^T$ in parallel for $0 \le i \le 2^r - 2$. The $i$th byte is in error if $s_1 \cdot (\gamma^i H')^T = s_2$ holds; otherwise it is not.

It is apparent from the discussion in Section 2.4 that the above decoding method corrects all correctable single byte errors, including all single t/b-errors.
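The four cases above can be sketched in software for the (132, 120) S3/8EC-S8ED example. This is only an illustration under stated assumptions, not the hardware decoder of the paper: $GF(2^4)$ elements are stored as 4-bit integers for $p(x) = x^4 + x + 1$, bit $k$ of an 8-bit error pattern selects column $k$ of $H'$, bytes are numbered 0 to 14 (information), 15 (8-bit check byte), and 16 (final 4-bit check byte), and the error location in case 4 is found with a discrete logarithm table instead of the parallel comparison over all $i$. All function names are hypothetical.

```python
# GF(2^4) from p(x) = x^4 + x + 1: EXP[k] is gamma^k, LOG is its inverse map.
EXP = []
_v = 1
for _ in range(15):
    EXP.append(_v)
    _v <<= 1
    if _v & 0b10000:
        _v ^= 0b10011
LOG = {value: k for k, value in enumerate(EXP)}

# Columns of H' as field elements, from the exponents given in Section 2.3.
H_PRIME = [EXP[e] for e in (0, 4, 8, 14, 10, 13, 12, 7)]


def h_prime_syndrome(s1: int) -> int:
    """Compute s1 * H'^T; bit k of s1 selects column k of H'."""
    acc = 0
    for k, col in enumerate(H_PRIME):
        if (s1 >> k) & 1:
            acc ^= col
    return acc


def syndrome_of(byte_index: int, error: int):
    """Syndrome (s1, s2) produced by a single-byte error in the given byte."""
    if byte_index == 16:          # final 4-bit check byte
        return 0, error
    if byte_index == 15:          # 8-bit check byte
        return error, 0
    x = h_prime_syndrome(error)   # information byte: s2 = gamma^i * (e * H'^T)
    s2 = 0 if x == 0 else EXP[(LOG[x] + byte_index) % 15]
    return error, s2


def decode(s1: int, s2: int):
    """Return (status, byte_index, error_pattern) following cases 1 to 4."""
    if s1 == 0 and s2 == 0:
        return "no_error", None, 0
    if s1 == 0:
        return "corrected", 16, s2
    x = h_prime_syndrome(s1)
    if x == 0:                    # s1 * H'^T = 0: location cannot be determined
        return "detected", None, None
    if s2 == 0:
        return "corrected", 15, s1
    return "corrected", (LOG[s2] - LOG[x]) % 15, s1


if __name__ == "__main__":
    print(decode(*syndrome_of(3, 0b00010110)))  # a 3-bit error in byte 3
```

Every error of weight three or less in an 8-bit byte should come back as `"corrected"` with the right byte index, while an 8-bit pattern that is a codeword of $C'$ is reported as `"detected"`.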

Fig. 6 shows the 8-bit byte error detector for the S3/8EC-S8ED code. This implements the function $s_1 \cdot H'^T$ and outputs a logical 1 to indicate error detection whenever $s_1 \cdot H'^T = 0$ for a nonzero syndrome. Here, $s_1$ is the vector represented by the upper 8 bits of the syndrome. Fig. 7 shows the syndrome decoder and error corrector circuitry corresponding to the first byte of the S3/8EC-S8ED code shown in Fig. 2. Table 2 shows the decoder hardware complexity of the S3/8EC-S8ED code for practical information lengths such as 64, 128, and 256 bits. In this table, a 4-input AND/OR gate counts as one gate and a 2-input XOR gate counts as 1.5 gates.

4 EVALUATION

Fig. 8 shows the check bit length versus information bit length relationship of S3/8EC-S8ED and S8EC codes along with the S3/8EC-S8ED bound given by Theorem 3. Clearly, compared to the 16 check bits required by the S8EC code, the S3/8EC-S8ED code requires only 12 check bits for the practical information length of 64 bits. Furthermore, it is clear from Fig. 8 that the S3/8EC-S8ED code requires fewer check bits than its counterpart S8EC code for other practical information lengths such as 128 and 256 bits. For the practical case where the chip output is 8 bits, i.e., $b = 8$, the check bit length versus information bit length relationship of St/8EC-S8ED codes for $t = 3, 4, 7$, and 8 is shown in Fig. 9. This figure also shows the $H'$ matrices used in the code constructions. On the other hand, Fig. 10 shows the check bit length versus information bit length relationship of S3/4EC-S4ED codes along with the S3/4EC-S4ED bound given by Theorem 3. This figure also shows the S3/4EC-S4ED codes obtained by Theorem 7 in the next section as perfect codes with check bit lengths 7, 10, 13, etc. The S3/4EC-S4ED codes obtained by Theorem 4 require no more check bits than the S3/4EC-S4ED bound for most of the practical information bit lengths.

The error detection capabilities and single 8-bit byte error correcting capabilities of the S3/8EC-S8ED code are shown in Tables 3 and 4, respectively. The error detection capabilities are shown for three types of errors: random double bit errors, random triple bit errors, and single 4-bit byte plus single bit errors. Table 4 shows that the proposed codes correct more than 94 percent of single 8-bit byte errors, including all single 3/8-errors. In both tables, we use systematic (76, 64) S3/8EC-S8ED, (141, 128) S3/8EC-S8ED, and (270, 256) S3/8EC-S8ED codes, which are shortened codes of the original (132, 120) S3/8EC-S8ED, (261, 248) S3/8EC-S8ED, and (518, 504) S3/8EC-S8ED codes, respectively.

Fig. 4. Syndrome patterns.

Fig. 5. Decoding logic flow chart.

5 PERFECT St/bEC-SbED CODE WITH t = b - 1

We know that, for the case where $t = b$, the St/bEC-SbED code becomes a single b-bit byte error correcting code, i.e., an SbEC code. It is well known that perfect SbEC codes do exist and can be constructed by the Hong-Patel construction method [14]. In this section, we consider the case where $t = b - 1$; that is, S(b-1)/bEC-SbED codes, which correct all single b-bit byte error patterns except when all $b$ bits in a b-bit byte are in error. They detect the case where all $b$ bits in a b-bit byte are in error. We will show that perfect S(b-1)/bEC-SbED codes exist whenever $R - 1$ is an integer multiple of $b - 1$ and $N = b(2^{R-1} - 1)/(2^{b-1} - 1)$, and provide a theorem to construct these perfect codes.

It is clear from Theorem 2 that an S(b-1)/bEC-SbED code requires at least $2b - 1$ check bits. The following lemma states that a binary linear S(b-1)/bEC-SbED code with one shortened byte cannot be a perfect code.

Lemma 3. A perfect binary linear S(b-1)/bEC-SbED code with one shortened c-bit byte, where $1 \le c \le b - 1$, cannot exist.

Proof. Assume that such a perfect code exists with code length $N = nb + c$ and check bit length $R$, where $n$ is a natural number. Since $c \le b - 1$, the S(b-1)/bEC-SbED code corrects all error patterns that corrupt the shortened byte with $c$ bits. The total number of different error patterns with at most $b - 1$ bits corrupted within a single b-bit byte is given by $\sum_{i=1}^{b-1} \binom{b}{i}$. There are $n$ bytes with $b$ bits; therefore, we need a total of $n \sum_{i=1}^{b-1} \binom{b}{i}$ different syndromes to correct all these errors. Further, $2^c - 1$ syndromes are necessary to correct all $2^c - 1$ different errors that corrupt the shortened byte. Finally, we need just one syndrome to detect b-bit byte error patterns with all $b$ bits corrupted. A code is perfect if and only if it uses all available nonzero syndromes. Therefore, we have:

$$2^R - 1 = n \sum_{i=1}^{b-1} \binom{b}{i} + (2^c - 1) + 1 = n(2^b - 2) + (2^c - 1) + 1.$$

From the above, we get the following equation:

$$n = \frac{(2^R - 2) - (2^c - 1)}{2^b - 2}. \qquad (1)$$

We know that $2^R - 2$ and $2^b - 2$ are even numbers, whereas $2^c - 1$ is an odd number. Therefore, the numerator $(2^R - 2) - (2^c - 1)$ of (1) is an odd number and the denominator $2^b - 2$ is an even number. But this is impossible because $n$ is a natural number. Thus, there is no perfect S(b-1)/bEC-SbED code with one shortened byte with $c$ bits, where $1 \le c \le b - 1$. □

Fig. 6. Byte error detector for S3/8EC-S8ED code.

Fig. 7. Syndrome decoder and error corrector corresponding to the first byte of S3/8EC-S8ED code.

Theorem 6. A perfect binary linear $(N, N-R)$ S(b-1)/bEC-SbED code exists only if $R - 1$ is an integer multiple of $b - 1$ and $N = b(2^{R-1} - 1)/(2^{b-1} - 1)$.

Proof. From Lemma 3, we know that a perfect S(b-1)/bEC-SbED code with one shortened byte cannot exist; therefore, we assume that the code length $N$ is a multiple of the byte length $b$. Then, from the fact that a perfect code uses all available syndromes, we have:

$$2^R - 1 = \frac{N}{b} \sum_{i=1}^{b-1} \binom{b}{i} + 1.$$

From the above equation, we obtain the following equation:

$$\frac{N}{b} = \frac{2^R - 2}{2^b - 2} = \frac{2^{R-1} - 1}{2^{b-1} - 1}. \qquad (2)$$

We know that (2) is true for an integer $N/b$ only when $R - 1$ is an integer multiple of $b - 1$. Consequently, a perfect $(N, N-R)$ S(b-1)/bEC-SbED code exists only when $R - 1$ is an integer multiple of $b - 1$, and the code length is given by $N = b(2^{R-1} - 1)/(2^{b-1} - 1)$ bits. □

The following theorem shows how we can construct a perfect S(b-1)/bEC-SbED code by using the $GF(2^{b-1})$ subfield cosets of $GF(2^{R-1})$ whenever $R - 1$ is an integer multiple of $b - 1$.

Theorem 7. Let $\gamma$ be a primitive element of $GF(2^{R-1})$ such that $R - 1$ is an integer multiple of $b - 1$. For $0 \le i \le s - 1$, define the $(R - 1) \times b$ binary matrix $H_i$ as follows:

$$H_i = [\gamma^{i}\ \ \gamma^{i+s}\ \ \gamma^{i+2s}\ \cdots\ \gamma^{i+(b-2)s}\ \ f_i],$$

where $s = (2^{R-1} - 1)/(2^{b-1} - 1)$ and $f_i = \sum_{j=0}^{b-2} \gamma^{i+js}$. The null space of

[parity check matrix $H$ built from the submatrices $H_i$; the matrix is not reproduced here]

is a perfect S(b-1)/bEC-SbED code with code length $N = b \cdot s = b(2^{R-1} - 1)/(2^{b-1} - 1)$ bits and check bit length $R$ bits.

Proof. Consider the submatrix $H_i$ for any $0 \le i \le s - 1$. Since $\gamma^{i} + \gamma^{i+s} + \gamma^{i+2s} + \cdots + \gamma^{i+(b-2)s} = f_i$, the summation of all binary column vectors in $H_i$ results in a zero vector. This means that any $b - 1$ or fewer columns in $H_i$ are linearly independent, because the first $b - 1$ elements $\gamma^{i}, \gamma^{i+s}, \gamma^{i+2s}, \ldots, \gamma^{i+(b-2)s}$ are linearly independent. Further, the subspace spanned by the binary column vectors of $H_i$, in fact, represents a multiplicative coset of the subfield $GF(2^{b-1})$. Therefore, the subspaces spanned by the binary columns of $H_i$ and $H_j$ are disjoint for all $i, j$ with $0 \le i \neq j \le s - 1$. This implies that the code has S(b-1)/bEC capability. On the other hand, when all the $b$ bits are in error, the resulting syndrome is

$$\begin{pmatrix} 1 \\ \mathbf{0} \end{pmatrix},$$

which is clearly nonzero. Also, this syndrome is distinguishable from any (b-1)/b-error syndromes because (b-1)/b-errors generate a syndrome of the form

$$\begin{pmatrix} a \\ \mathbf{b} \end{pmatrix},$$

where $a \in GF(2)$ and $\mathbf{b} \in GF(2^{R-1})$. The optimality of the code in Theorem 7 can be easily proven by showing that the code length $b(2^{R-1} - 1)/(2^{b-1} - 1)$ meets the upper bound given by the inequality in Theorem 3. □

TABLE 2. Decoder Gate Amount for S3/8EC-S8ED Code

Fig. 8. Check bit length versus information bit length for S8EC and S3/8EC-S8ED codes.

By using Theorem 7, we can design perfect S(b-1)/bEC-SbED codes for any value of $b \ge 2$ and any value of $R$ such that $R - 1$ is an integer multiple of $b - 1$. Fig. 11 shows a perfect S(b-1)/bEC-SbED code for the case where $b = 4$ and $R = 7$. In this case, there are $9 \times \sum_{i=1}^{3} \binom{4}{i} = 126$ different error patterns with at most 3 bits in error within a 4-bit byte; thus, we need 126 syndromes to correct them. On the other hand, there are nine different byte error patterns with all 4 bits corrupted. However, we need just one more syndrome to detect them. Therefore, the total number of syndromes required is 127, which is the same as $2^7 - 1$. Since the code uses only 7 check bits, it is a perfect code.

Fig. 9. Check bit length versus information bit length graph of St/8EC-S8ED codes and $H'$ matrices for $2 \le t \le 8$.

Fig. 10. Check bit length versus information bit length graph of S3/4EC-S4ED code.
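The syndrome count for the $b = 4$, $R = 7$ example can be reproduced with a short calculation. The sketch below is illustrative only; `perfect_code_length` and `syndromes_used` are hypothetical helper names, and the first function merely evaluates the necessary condition of Theorem 6 without constructing a code.

```python
from math import comb


def perfect_code_length(b: int, R: int):
    """N = b(2^(R-1) - 1)/(2^(b-1) - 1) when (b - 1) divides (R - 1), else None."""
    if b < 2 or (R - 1) % (b - 1) != 0:
        return None
    return b * (2 ** (R - 1) - 1) // (2 ** (b - 1) - 1)


def syndromes_used(b: int, N: int) -> int:
    """Syndromes needed: one per correctable pattern plus one for all-b-bit errors."""
    per_byte = sum(comb(b, i) for i in range(1, b))  # patterns of weight 1..b-1
    return (N // b) * per_byte + 1


if __name__ == "__main__":
    n_bits = perfect_code_length(4, 7)
    print(n_bits, syndromes_used(4, n_bits), 2 ** 7 - 1)
```

For $b = 4$ and $R = 7$ the code length is 36 bits (nine bytes), and the count $9 \times 14 + 1 = 127$ equals $2^7 - 1$, matching the argument above.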

6 St/BEC-SbEC-SBED CODE

Recent high-density DRAM chips have a multibank architecture where each bank usually has a number of memory subarrays which are physically separated from each other [5], [6], [7]. As such, the binary bits stored in a memory subarray become highly independent of bits stored in other memory subarrays. It is therefore advantageous to consider the entire chip output as a B-bit block and the subarray output as a b-bit byte [15], [16]. Fig. 12 illustrates these concepts by showing the architecture of a recent 16 Mb high-density DRAM chip [5] along with its corresponding organization of bit, byte, and block in a codeword.

In these memory chips, apart from multiple random bit errors, i.e., t/B-errors, errors caused by subarray data faults, i.e., b-bit byte errors, are a source of concern, too. Therefore, in addition to correcting multiple random bit errors corrupting a single memory chip, correction of errors caused by single subarray data faults is desired as well. In other words, an St/BEC-SBED code with SbEC capability, i.e., an St/BEC-SbEC-SBED code, is desirable under this situation.

The following theorems show that the proposed construction method can be used to design St/BEC-SbEC-SBED codes. We know that codes capable of correcting single t/B-errors for the case where $t \ge b$ are also capable of correcting b-bit byte errors. Here, we will consider the case where $t < b$.

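Theorem 8. Let

$$H_i = [h_{i,0}\ \ h_{i,1}\ \ h_{i,2}\ \cdots\ h_{i,B/b-1}]$$

for $i = 0, 1, 2, \ldots, n-1$, where $h_{i,j}$ is an $r \times b$ binary submatrix for $0 \le j \le B/b - 1$. The null space of $H = [H_0\ H_1\ H_2\ H_3\ \cdots\ H_{n-1}]$ is an St/BEC-SbEC-SBED code if and only if:

1. $e \cdot H_i^T \neq 0$ for $0 \le i \le n-1$, $\forall e \in GF(2^B)$,
2. $e_1 \cdot H_i^T \neq e_2 \cdot H_j^T$ for $0 \le i < j \le n-1$, $\forall e_1 \in E_{t/B}$, $\forall e_2 \in GF(2^B)$, and
3. $e_1 \cdot h_{i,p}^T \neq e_2 \cdot H_j^T$ for $0 \le i < j \le n-1$, $0 \le p \le B/b - 1$, $\forall e_1 \in GF(2^b)$, $\forall e_2 \in GF(2^B)$,

where $E_{t/B} = \{ e \in GF(2^B) \mid 1 \le w(e) \le t \}$ and $H^T$ denotes the transpose of matrix $H$.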

Proof. Condition 1 of Theorem 8 ensures that all the B-bit block error patterns, including single t/B-error patterns and b-bit byte error patterns, generate a nonzero syndrome. Condition 2 includes the following condition:

$$e_1 \cdot H_i^T \ne e_2 \cdot H_j^T \quad \text{for } 0 \le i \ne j \le n-1,\ \forall e_1, e_2 \in E_{t/B};$$

that is, single t/B-error patterns are distinguishable. Therefore, the code is capable of correcting single t/B-errors.

Similarly, Condition 3 includes the following condition:

$$e_1 \cdot h_{i,p}^T \ne e_2 \cdot h_{j,q}^T \quad \text{for } 0 \le i \ne j \le n-1,\ 0 \le p, q \le B/b-1,\ \forall e_1, e_2 \in GF(2^b);$$

that is, single b-bit byte error patterns are distinguishable. Therefore, the code is capable of correcting single b-bit byte errors. On the other hand, Conditions 2 and 3 themselves assert that syndromes generated by B-bit block error patterns are different from those of t/B-error patterns and b-bit byte error patterns, respectively. Therefore, B-bit block error patterns are detectable. □
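For very small parameters, the conditions of Theorem 8 can be tested by exhaustive search. The sketch below is illustrative only and rests on several assumptions that are not in the original: columns of H are encoded as integers, all error patterns are taken to be nonzero, every ordered pair of distinct blocks is checked (which is at least as strict as the printed range i < j), and the names `syndrome` and `check_theorem8` are hypothetical. The toy matrices use B = 2, b = 1, t = 1 purely to keep the search tiny.

```python
def syndrome(block, e):
    """XOR of the columns of `block` selected by the bit mask e."""
    s = 0
    for pos, col in enumerate(block):
        if (e >> pos) & 1:
            s ^= col
    return s

def check_theorem8(blocks, b, t):
    """Brute-force test of Conditions 1-3 for H = [H_0 ... H_{n-1}].

    Each block is a list of B integers; each integer encodes one r-bit column.
    """
    B = len(blocks[0])
    nonzero = range(1, 2**B)
    weight_le_t = [e for e in nonzero if bin(e).count("1") <= t]
    in_one_byte = [m << k for k in range(0, B, b) for m in range(1, 2**b)]
    for i, Hi in enumerate(blocks):
        # Condition 1: every nonzero block error gives a nonzero syndrome.
        if any(syndrome(Hi, e) == 0 for e in nonzero):
            return False
        # Syndromes of the patterns in block i that must be corrected.
        correctable = {syndrome(Hi, e) for e in weight_le_t + in_one_byte}
        # Conditions 2 and 3: no error in another block may reproduce them.
        for j, Hj in enumerate(blocks):
            if i != j and any(syndrome(Hj, e) in correctable for e in nonzero):
                return False
    return True

good = [[0b001, 0b010], [0b100, 0b111]]   # hypothetical toy H with r = 3
bad = [[0b001, 0b010], [0b100, 0b101]]    # 0b100 ^ 0b101 collides with 0b001
print(check_theorem8(good, b=1, t=1), check_theorem8(bad, b=1, t=1))
```

The search grows as $2^B$ per block pair, so this is useful only as a sanity check on small examples, not as a design tool.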

Theorem 9. A linear binary St/BEC-SbEC-SBED code requires at least $B + b$ check bits.

Proof. According to Condition 3 of Theorem 8, $B + b$ binary columns of the parity check matrix of an St/BEC-SbEC-SBED code are linearly independent. Therefore, a linear binary St/BEC-SbEC-SBED code requires at least $B + b$ check bits. □
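As a worked instance of this bound (an illustration, not part of the original proof), take the parameters of the S3/8EC-S4EC-S8ED example, i.e., B = 8 and b = 4:

```latex
R \;\ge\; B + b \;=\; 8 + 4 \;=\; 12 .
```

This is consistent with the 12 check bits quoted in this paper for the S3/8EC-S8ED code at information length 64 bits.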

TABLE 3
Error Detection Capabilities of S3/8EC-S8ED Code for Other Types of Errors (%)

TABLE 4
Single 8-Bit Byte Error Correction Capability $P_{S8EC}$ of S3/8EC-S8ED Code (%)

Fig. 11. Example of a perfect (36, 29) S3/4EC-S4ED code.

Theorem 10. A linear binary $(N, N-R)$ St/BEC-SbEC-SBED code exists only if

$$N \le \frac{2^R - 2^B + \sum_{i=1}^{t}\binom{B}{i} + \frac{B}{b}\sum_{i=t+1}^{b}\binom{b}{i}}{\frac{1}{B}\sum_{i=1}^{t}\binom{B}{i} + \frac{1}{b}\sum_{i=t+1}^{b}\binom{b}{i}}.$$

Proof. The total number of t/B-errors that corrupt a single B-bit block is given by $\sum_{i=1}^{t}\binom{B}{i}$. There are $N/B$ blocks in a codeword with length N bits. Therefore, we need $\frac{N}{B}\sum_{i=1}^{t}\binom{B}{i}$ different syndrome patterns to correct all single t/B-error patterns. Similarly, we need $\frac{N}{b}\sum_{i=t+1}^{b}\binom{b}{i}$ different syndrome patterns to correct all single b-bit byte error patterns which are not t/B-error patterns. On the other hand, in a B-bit block, the total number of single B-bit block error patterns that are not single t/B-errors or b-bit byte errors is given by $2^B - 1 - \left\{\sum_{i=1}^{t}\binom{B}{i} + \frac{B}{b}\sum_{i=t+1}^{b}\binom{b}{i}\right\}$. According to Conditions 1, 2, and 3 of Theorem 8, all these block error patterns should generate distinguishable nonzero syndromes when considered for any particular block. Therefore, we need another $2^B - 1 - \left\{\sum_{i=1}^{t}\binom{B}{i} + \frac{B}{b}\sum_{i=t+1}^{b}\binom{b}{i}\right\}$ syndromes for detecting single B-bit block errors. From this, we have:

$$2^R - 1 \ge \frac{N}{B}\sum_{i=1}^{t}\binom{B}{i} + \frac{N}{b}\sum_{i=t+1}^{b}\binom{b}{i} + 2^B - 1 - \left\{\sum_{i=1}^{t}\binom{B}{i} + \frac{B}{b}\sum_{i=t+1}^{b}\binom{b}{i}\right\}.$$

Subsequently, the inequality in Theorem 10 holds. □
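The bound of Theorem 10 is easy to evaluate numerically. The sketch below uses exact rational arithmetic; the function names `max_code_length` and `min_check_bits` are hypothetical, and starting the search at R = B + b is an assumption that simply applies Theorem 9 first. The bound is only a necessary condition, so the result is a lower limit on the check bit length, not a guarantee that a code exists, and the requirement that N be a multiple of B is ignored.

```python
from fractions import Fraction
from math import comb

def max_code_length(R: int, B: int, b: int, t: int) -> int:
    """Largest N allowed by the Theorem 10 inequality for R check bits."""
    s_block = sum(comb(B, i) for i in range(1, t + 1))      # t/B-errors per block
    s_byte = sum(comb(b, i) for i in range(t + 1, b + 1))   # extra byte errors
    numerator = 2**R - 2**B + s_block + Fraction(B, b) * s_byte
    denominator = Fraction(s_block, B) + Fraction(s_byte, b)
    return numerator // denominator                          # exact floor

def min_check_bits(K: int, B: int, b: int, t: int) -> int:
    """Smallest R permitted by Theorems 9 and 10 for K information bits."""
    R = B + b                                # Theorem 9 lower bound
    while K + R > max_code_length(R, B, b, t):
        R += 1
    return R

# S3/8EC-S4EC-S8ED parameters: B = 8, b = 4, t = 3.
print(max_code_length(12, 8, 4, 3))
print(min_check_bits(64, 8, 4, 3))
```

For 64 information bits the search stops immediately at R = 12, so with these parameters it is Theorem 9, not Theorem 10, that determines the minimum check bit length.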

Theorem 11. Let the $r \times B$ matrix $H'$ with $\mathrm{rank}(H') \ge t$ be defined as follows:

$$H' = [\, H_0 \;\; H_1 \;\; H_2 \;\; \cdots \;\; H_i \;\; \cdots \;\; H_{B/b-1} \,],$$

where $H_i$ is an $r \times b$ binary submatrix for $i = 0, 1, 2, \ldots, B/b-1$. Then, if $\mathrm{rank}(H_i) = b$ for $i = 0, 1, 2, \ldots, B/b-1$, the code defined by Theorem 4 is an St/BEC-SbEC-SBED code.

Proof. Clearly, Conditions 1 and 2 of Theorem 8 are satisfied. We only need to show that Condition 3 is satisfied as well. Let $S_x$ and $S_y$ denote the syndromes generated by any B-bit error patterns corrupting single blocks in $P_x$ and $P_y$, respectively, where $0 \le x \ne y \le \omega - 1$. Let $S_c$ denote the syndrome generated by any B-bit error pattern corrupting a single block in $P_c$. We can easily prove that $S_x \ne S_c \ne S_y$ and $S_x \ne S_y$, where $x \ne y$. Therefore, we only need to show that the binary columns corresponding to the matrix $H_{R-zb}$, where $0 \le z \le \omega - 1$, satisfy Condition 3 of Theorem 8. To prove that, we need to show that

$$\begin{bmatrix} I_B \\ \gamma^i H' \end{bmatrix} \cdot e_b^T \ne \begin{bmatrix} I_B \\ \gamma^j H' \end{bmatrix} \cdot e_B^T \qquad (3)$$

holds for $0 \le i \ne j \le 2^{R-(z+1)b} - 2$, where $\gamma$ is a primitive element of $GF(2^{R-(z+1)b})$ and $\forall e_b, e_B \in GF(2^B)$ such that $e_b$ represents a b-bit byte error pattern and $e_B$ represents a B-bit block error pattern. Suppose that

$$\begin{bmatrix} I_B \\ \gamma^i H' \end{bmatrix} \cdot e_b^T = \begin{bmatrix} I_B \\ \gamma^j H' \end{bmatrix} \cdot e_B^T$$

holds; then $e_b = e_B$ holds because the upper submatrices are identity matrices. However, the fact that $\mathrm{rank}(H_k) = b$ for any $k = 0, 1, \ldots, B/b-1$ implies that $\gamma^i H' e_b^T \ne 0$. Subsequently, $\gamma^i H' e_b^T = \gamma^j H' e_B^T$ leads to $i = j$, which is a contradiction. Therefore, (3) holds, as required by Theorem 8. □

Fig. 12. Organization of bit, byte, and block for a 16-Mb high-density DRAM chip with nibbled-page architecture [5].

The example code shown in Fig. 2 is in fact capable of correcting 4-bit byte errors, i.e., it is an S3/8EC-S4EC-S8ED code, because the $H'$ matrix in this case, whose entries are expressed through primitive elements of $GF(2^3)$ and $GF(2^4)$, respectively, satisfies $\mathrm{rank}(H_0) = \mathrm{rank}(H_1) = 4$, as required by Theorem 11.
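The rank condition of Theorem 11 can be tested mechanically with Gaussian elimination over GF(2). The sketch below is a minimal illustration: the helper names `gf2_rank` and `has_full_rank_bytes` are hypothetical, rows are written as bit strings for readability, and the two example matrices are made up for the demonstration; they are not the H' matrix of the code in Fig. 2.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bit masks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot          # lowest set bit of the pivot row
            # Eliminate that bit from every remaining row.
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def has_full_rank_bytes(h_rows, b):
    """True if every r x b submatrix H_k of H' = [H_0 ... H_{B/b-1}] has rank b."""
    B = len(h_rows[0])
    return all(
        gf2_rank(int(row[k:k + b], 2) for row in h_rows) == b
        for k in range(0, B, b)
    )

# Hypothetical 4 x 8 matrices with B = 8 and b = 4.
ok = ["10001000", "01000100", "00100010", "00010001"]    # [I_4 | I_4]
not_ok = ["10001000", "01000100", "00100010", "00010000"]  # right half has rank 3
print(has_full_rank_bytes(ok, 4), has_full_rank_bytes(not_ok, 4))
```

Only the submatrix rank condition is checked here; the remaining requirements of Theorem 4 on H' would have to be verified separately.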

Fig. 13 shows the check bit length versus information bit length relationship of S3/8EC-S4EC-S8ED and S8EC codes along with the S3/8EC-S4EC-S8ED bound. Also, the parallel decoding method illustrated in Section 3 can be directly employed for decoding St/BEC-SbEC-SBED codes.

7 CONCLUSION

In this paper, we have proposed a class of systematic codes called St/bEC-SbED codes. The proposed codes are capable of correcting random t-bit errors occurring within a b-bit byte and simultaneously detecting b-bit byte errors. We have illustrated how the proposed codes can be made to correct errors caused by subarray data faults as well. For the practical case where the chip data output is 8 bits, i.e., b = 8, the S3/8EC-S8ED code presented in this paper requires only 12 check bits at information length 64 bits, whereas its counterpart S8EC code requires 16 check bits. Furthermore, this S3/8EC-S8ED code corrects more than 94 percent of the 8-bit byte errors, including all single 4-bit byte errors caused by subarray data faults and all single 3/8-errors. We have proved that, in addition to perfect SbEC codes designed by the Hong-Patel construction method, perfect St/bEC-SbED codes do exist when t = b - 1, i.e., perfect S(b-1)/bEC-SbED codes, and provided a theorem to construct these codes. This paper also clarifies that, for the special case where t = b, the proposed codes include the Hong-Patel codes, which are known as maximal SbEC codes.

Fig. 13. Check bit length versus information bit length for S8EC and S3/8EC-S4EC-S8ED codes.

REFERENCES

[1] T.J. O'Gorman, J.M. Ross, A.H. Taber, et al., "Field Testing for Cosmic Ray Soft Errors in Semiconductor Memories," IBM J. Research and Development, vol. 40, no. 1, pp. 41-50, Jan. 1996.

[2] L.W. Massengill, "Cosmic and Terrestrial Single Event Radiation Effects in Dynamic Random Access Memories," IEEE Trans. Nuclear Science, vol. 43, no. 2, pp. 576-593, Apr. 1996.

[3] G. Umanesan and E. Fujiwara, "A Class of Random Multiple Bits in a Byte Error Correcting (St/bEC) Codes for Semiconductor Memory Systems," Proc. 2002 Pacific Rim Int'l Symp. Dependable Computing, Dec. 2002.

[4] T.N. Rao and E. Fujiwara, Error Control Coding for Computer Systems. Prentice-Hall, 1989.

[5] K. Numata, Y. Oowaki, Y. Itoh, et al., "New Nibbled-Page Architecture for High-Density DRAMs," IEEE J. Solid State Circuits, vol. 24, no. 4, pp. 900-904, Aug. 1989.

[6] T. Saeki, Y. Nakaoka, M. Fujita, et al., "A 2.5-ns Clock Access, 256MHz, 256Mb SDRAM with Synchronous Mirror Delay," IEEE J. Solid State Circuits, vol. 31, no. 11, pp. 1656-1668, Nov. 1996.

[7] T. Sunaga, K. Hosokawa, Y. Nakamura, et al., "A Full Bit Perfect Architecture for Synchronous DRAMs," IEEE J. Solid State Circuits, vol. 30, no. 9, pp. 998-1005, Nov. 1995.

[8] S. Kaneda, "A General Class of Odd-Weight-Column SEC-DED-SbED Codes for Memory System Applications," IEEE Trans. Computers, vol. 33, no. 8, pp. 737-739, Aug. 1984.

[9] G. Umanesan and E. Fujiwara, "Adjacent Double Bit Error Correcting Codes with Single Byte Error Detecting Capability for Memory Systems," IEICE Trans. Fundamentals, vol. E85-A, no. 2, pp. 490-496, Feb. 2002.

[10] W. Peterson, Jr. and E.J. Weldon, Error-Correcting Codes. MIT Press, 1972.

[11] A.A. Davydov and A.Y. Drozhzhina-Labinskaya, "Length 4 Byte Error and Double Independent Error Correction by BCH Code in Semiconductor Memories," Automation and Remote Control, vol. 50, no. 11, pp. 1570-1579, Nov. 1989.

[12] N.H. Vaidya and D.K. Pradhan, "A New Class of Bit and Byte Error Control Codes," IEEE Trans. Information Theory, vol. 38, no. 5, pp. 1617-1623, Sept. 1992.

[13] G. Umanesan and E. Fujiwara, "Random Double Bit Error Correcting-Single b-Bit Byte Error Correcting (DEC-SbEC) Codes for Memory Systems," IEICE Trans. Fundamentals, vol. E85-A, no. 1, pp. 273-276, Jan. 2002.

[14] S.J. Hong and A.M. Patel, "A General Class of Maximal Codes for Computer Applications," IEEE Trans. Computers, vol. 21, no. 12, pp. 1322-1331, Dec. 1972.

[15] G. Umanesan and E. Fujiwara, "Single Byte Error Correcting Codes with Double Bit within a Block Error Correcting Capability for Memory Systems," IEICE Trans. Fundamentals, vol. E85-A, no. 2, pp. 513-517, Feb. 2002.

[16] Y. Joji and E. Fujiwara, "A Class of Byte Error Control Codes Based on Hierarchical Error Model," Technical Report of IEICE, FTS96-58, Feb. 1997.

Ganesan Umanesan received the BSc degree in mathematics and the BEng degree in electrical and electronic engineering from the University of Melbourne, Australia, in 1995 and 1996, respectively. He received the MEng and PhD degrees in computer science from the Tokyo Institute of Technology, Japan, in 1999 and 2002, respectively. Currently, he is a postdoctoral research student in the Department of Computer Science, Tokyo Institute of Technology, Japan. He is also a visiting student fellow at IBM's Tokyo Research Laboratory, Yamato, Japan. His research interests include error control coding for high-speed memories and advanced optical communication channels and fault-tolerant computing. He is a member of the IEEE and the IEEE Computer Society.

Eiji Fujiwara received the BS and MS degrees in electronic engineering in 1968 and 1970, respectively, and the DrEng degree in 1981, all from the Tokyo Institute of Technology, Tokyo, Japan. In 1970, he joined the NTT Electrical Communication Laboratories and engaged in developing DIPS-1 and DIPS-11 computer systems and in research on fault-tolerant computing. From June 1985 to July 1986, he was a visiting professor at the Center for Advanced Computer Studies, the University of Southwestern Louisiana. In October 1988, he moved to the Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan, and is now a full professor and a director of the Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology. His current research interests include error correcting codes, fault-tolerant computing, and error tolerance in data compression. Dr. Fujiwara received the Young Engineer Award from the IEICE in 1978 and the Teshima Memorial Research Award in 1991. He is a coauthor of Error-Control Coding for Computer Systems (Englewood Cliffs, New Jersey: Prentice-Hall, 1989), Essentials of Error-Control Coding Techniques (New York: Academic Press, 1990), and Japanese books. He became a fellow of the IEEE and of the IEICE in 1997 and 2001, respectively. He is a member of the Information Processing Society of Japan.
