Elements of Coding Theory
Error-detecting and -correcting codes

Radu Trîmbiţaş (UBB)
January 2013



Outline

1. Hamming's Theory
   - Definitions
   - Hamming Bound

2. Linear Codes
   - Basics
   - Singleton bounds
   - Reed-Solomon Codes
   - Multivariate Polynomial Codes
   - BCH Codes


Hamming’s Problem

Hamming studied magnetic storage devices. He wanted to build (out of magnetic tapes) a reliable storage medium where data was stored in blocks of size 63 (this is a nice number; we will see why later). When you try to read information from this device, bits may be corrupted, i.e. flipped (from 0 to 1, or from 1 to 0). Let us consider the case that at most 1 bit in every block of 63 bits may be corrupted. How can we store the information so that all is not lost? We must design an encoding of the message to a codeword with enough redundancy that we can recover the original sequence from the received word by decoding. (In Hamming's problem about storage we still say "received word".)

Naive solution (the repetition code): store each bit three times, so that any one bit that is erroneously flipped can be detected and corrected by majority decoding on its block of three.


Basic Notions I

The Hamming encoding tries to do better with the following matrix:

G = [ 1 0 0 0 0 1 1
      0 1 0 0 1 0 1
      0 0 1 0 1 1 0
      0 0 0 1 1 1 1 ]

Given a sequence of bits, we chop it into 4-bit chunks. Let b be the vector representing one such chunk; then we encode b as the 7-bit word bG, where all arithmetic is performed mod 2. Clearly the efficiency is a much better 4/7, though we still need to show that this code can correct one-bit errors.

Claim 1

∀ b1 ≠ b2, b1G and b2G differ in ≥ 3 coordinates.
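The claim can be checked by brute force over all 16 messages. A minimal Python sketch (the helper names `encode` and `dist` are ours, not part of the slides):

```python
import itertools

# Generator matrix G of the [7,4] Hamming code, copied from the slide above.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(b):
    """Encode a 4-bit message b as the 7-bit word bG, arithmetic mod 2."""
    return tuple(sum(b[i] * G[i][j] for i in range(4)) % 2 for j in range(7))

def dist(x, y):
    """Hamming distance: number of coordinates where x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))

codewords = [encode(b) for b in itertools.product((0, 1), repeat=4)]
d = min(dist(x, y) for x, y in itertools.combinations(codewords, 2))
print(d)  # → 3, so any two distinct codewords differ in ≥ 3 coordinates
```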

First we present some definitions. We denote by Σ the alphabet; the ambient space Σ^n is the set of n-letter words over the alphabet Σ.


Basic Notions II

Definition 2

The Hamming distance ∆(x, y) between x, y ∈ Σ^n is the number of coordinates i where x_i ≠ y_i.

The Hamming distance is a metric, since it is easy to verify that:

∆(x, y) = ∆(y, x)

∆(x, z) ≤ ∆(x, y) + ∆(y, z)

∆(x, y) = 0 ⇔ x = y

In our case, consider the space of all possible encoded words {0, 1}7.

If we can prove our claim, then this means that in this space, under the Hamming metric, each code word bG has no other code word within a radius of 2 around it.


Basic Notions III

In fact, any point at Hamming distance 1 from a code word is guaranteed to be closer to that code word than to any other, and thus we can correct one-bit errors.

Definition 3

An Error Correcting Code is a set of code words C ⊆ Σ^n. The minimum distance of C, written ∆(C), is the minimum Hamming distance between pairs of different code words in C.

Definition 4

An Error Correcting Code is said to be e-error detecting if it can tell that an error occurred whenever there were ≤ e errors and at least one error occurred. It is said to be t-error correcting if it can tell where the errors are whenever there were ≤ t errors and at least one error occurred.


Basic Notions IV

To formalize these notions we define the ball of radius t centered at x:

Ball(x, t) = {y ∈ Σ^n | ∆(x, y) ≤ t}

Definition 5

Formally: a code C is t-error correcting if ∀ x ≠ y ∈ C, Ball(x, t) ∩ Ball(y, t) = ∅.


Basic Notions V

Definition 6

Formally: a code is e-error detecting if ∀ x ∈ C, Ball(x, e) ∩ C = {x}.


Basic Notions VI

Definition 7

The Hamming weight wt(v) of a vector v is the number of non-zero coordinates of v.


Basic Notions VII

Proposition 8

∆(C) = 2t + 1 ⇔ the code C is 2t-error detecting and t-error correcting.

Proof.

(⇒) ∆(C) = 2t + 1 ⇒ 2t-error detecting, since a word with at most 2t errors would simply not be a code word. ∆(C) = 2t + 1 ⇒ t-error correcting, since the received word would be closer to the original code word than to any other code word. We omit the reverse implications for now, though we note that the case for t-error correcting is easy.

We now present some key code parameters:

q = |Σ|
n = block length of the code (encoded length)
k = message length (pre-encoded length) = log_q |C|
d = ∆(C)


Basic Notions VIII

Usually, we fix three of the above parameters and try to optimize thefourth.

Clearly, larger k and d values and smaller n values are desirable. It also turns out that smaller q values are desirable.

We may also try to maximize the rate of the code (efficiency ratio) k/n and the relative distance d/n. We denote an error correcting code with these parameters as an (n, k, d)_q code.

Thus, proving our claim boils down to showing that {bG | b ∈ {0, 1}^4} is a (7, 4, 3)_2 code.


Basic Notions IX

Proof.

Assume that ∆(b1G, b2G) < 3 for some b1 ≠ b2. Then ∆((b1 − b2)G, 0) < 3, so there exists a non-zero c ∈ {0, 1}^4 with wt(cG) < 3. Consider the matrix

H = [ 0 0 0 1 1 1 1
      0 1 1 0 0 1 1
      1 0 1 0 1 0 1 ]^T

It can be shown that {bG | b} = {y | yH = 0}. Hence, it suffices to prove that any non-zero y ∈ {0, 1}^7 with yH = 0 has wt(y) ≥ 3, since this contradicts wt(cG) < 3 for some non-zero c. Assume wt(y) = 2. Then 0 = yH = h_i + h_j, where h_1, h_2, ..., h_7 are the rows of the matrix H. But by the construction of H (the rows are distinct), this is impossible. Assume wt(y) = 1. Then some row h_i would be all zeros; again impossible by construction. Thus wt(y) is at least 3 and we are done.


Basic Notions X

From the properties of the matrix used above, we see that we may generalize the H matrix to codes of block length n = 2^ℓ − 1 and minimum distance ≥ 3 simply by forming the (2^ℓ − 1) × ℓ matrix whose rows are the binary representations of all integers between 1 and 2^ℓ − 1. The message length k that this generalized H corresponds to is left as an exercise.

Error correcting codes find application in a variety of different fields inmathematics and computer science.

- In Algorithms, they can be viewed as interesting data structures.
- In Complexity, they are used for pseudo-randomness, hardness amplification, and probabilistically checkable proofs.
- In Cryptography, they are applied to implement secret sharing schemes and proposed cryptosystems.
- Finally, they also arise in combinatorics and recreational mathematics.


Hamming Bound I

Lemma 9

For Σ = {0, 1} and x ∈ Σ^n,

|Ball(x, t)| = ∑_{i=0}^{t} (n choose i) ≡ Vol(n, t)

Proof.

A vector y ∈ Ball(x, t) iff the number of coordinates of y that differ from x is at most t. Since Σ = {0, 1}, a coordinate can differ in only one way, namely by being the opposite bit (0 if the bit of x is 1, and 1 if the bit of x is 0). Thus, to count the points in the ball, we choose i of the n coordinates to flip, for i from 0 to t. We define Vol(n, t) to be the number of points in the ball Ball(x, t).


Hamming Bound II

Theorem 10

For Σ = {0, 1}, if C is t-error correcting, then

|C| ≤ 2^n / Vol(n, t) = 2^n / ∑_{i=0}^{t} (n choose i).

Proof.

If the code C is t-error correcting, then ∀ x ≠ y ∈ C, Ball(x, t) ∩ Ball(y, t) = ∅; that is, the balls do not intersect. Thus, |C| · Vol(n, t) ≤ Vol(ambient space). Note that Vol(ambient space) = 2^n, so dividing gives the result.
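As a concrete check (a Python sketch; the names are ours), the (7, 4, 3) Hamming code meets this bound with equality for t = 1, i.e. it is a perfect code:

```python
from math import comb

def vol(n, t):
    """Vol(n, t): number of points in a Hamming ball of radius t in {0,1}^n."""
    return sum(comb(n, i) for i in range(t + 1))

n, t = 7, 1
bound = 2**n // vol(n, t)
print(vol(n, t), bound)  # → 8 16: at most 16 codewords, and 2^4 = 16 are achieved
```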


Hamming Bound III

To decode a Hamming code (of minimum distance 3), we note that multiplying the received word by the parity check matrix H associated to the code gives 0 if no error occurred, while if 1 error occurred it gives the binary representation of the index of the bit where the error occurred.

This is true because, if we receive the codeword c with an error e_i (where e_i is the 0/1 vector which is 1 only in the i-th coordinate), then

(c + e_i)H = cH + e_iH = e_iH.

Now note that e_iH is simply the i-th row of H, which by construction is the binary representation of i.
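This decoding rule can be verified directly. A sketch in Python (H is built as above, with row i the binary representation of i; variable names are ours):

```python
# Parity check matrix H (7x3): row i-1 is the binary representation of i,
# most significant bit first.
H = [[(i >> s) & 1 for s in (2, 1, 0)] for i in range(1, 8)]

def syndrome(y):
    """Compute yH mod 2 for a received 7-bit word y."""
    return [sum(y[i] * H[i][j] for i in range(7)) % 2 for j in range(3)]

c = [0, 0, 0, 1, 1, 1, 1]       # a codeword (row 4 of G); its syndrome is 0
assert syndrome(c) == [0, 0, 0]

r = c[:]
r[5] ^= 1                        # corrupt bit 6 (1-based position)
s = syndrome(r)
error_pos = int("".join(map(str, s)), 2)
print(error_pos)  # → 6, the position of the flipped bit
```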



Richard Wesley Hamming (1915-1998). American mathematician and computer scientist. Contributions to Information Theory, Coding Theory and Numerical Analysis. Involved in the Manhattan Project. Fellow of the ACM.


Linear Codes I

Definition 11

If the alphabet Σ is a finite field, then we say that a code C is linear if it is a linear subspace of Σ^n. That is, x, y ∈ C implies x + y ∈ C, and x ∈ C, a ∈ Σ implies ax ∈ C.

Notationally, we represent linear codes with square brackets: [n, k, d]_q. All the codes we will see in this class are linear.

Linear codes are interesting for many reasons. For example, the encoding function is simple: just matrix multiplication. It is also easy to detect errors: since the code is linear, there is a parity check matrix H such that C = {y : Hy = 0}, and therefore we can detect errors again by simple matrix multiplication.


Linear Codes II

Another interesting feature of a linear code is its dual. Let C be a linear code generated by the matrix G, and let H be a parity check matrix of C. We define the dual code of C as the code generated by H^T, the transpose of the matrix H. It can be shown that the dual of the dual of C is C itself.

For example, consider the Hamming code with block length n = 2^ℓ − 1. Its dual is the code generated by an ℓ × n matrix whose columns are all the non-zero binary strings of length ℓ. It is easy to see that the encoding of a message b is

⟨b, x⟩_{x ∈ {0,1}^ℓ \ {0}},

where ⟨·, ·⟩ denotes the inner product modulo 2. In other words, the encoding of b is the parity check of all the non-empty subsets of the bits of b.


Linear Codes III

It can be shown that if b ≠ 0 then ⟨b, x⟩ = 1 for at least 2^{ℓ−1} of the x's in {0,1}^ℓ \ {0}. This implies that the dual code is a [2^ℓ − 1, ℓ, 2^{ℓ−1}]_2 code. It can be shown that this is the best possible, in the sense that there is no (2^ℓ − 1, ℓ + ε, 2^{ℓ−1})_2 code. This code is called the simplex code or the Hadamard code. The second name comes from the French mathematician Jacques Hadamard, who studied the n × n matrices M such that MM^T = nI, where I is the n × n identity matrix. It can be shown that if we form a matrix with the codewords of the previous code, replace 0 with −1, and pad the last bit with 0, then we obtain a Hadamard matrix.
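The weight claim is easy to check exhaustively for small ℓ. A Python sketch with ℓ = 3 (a toy size of our choosing):

```python
import itertools

ell = 3
nonzero_xs = [x for x in itertools.product((0, 1), repeat=ell) if any(x)]

def simplex_encode(b):
    """Codeword of the dual (simplex) code: <b, x> mod 2 over all non-zero x."""
    return tuple(sum(bi * xi for bi, xi in zip(b, x)) % 2 for x in nonzero_xs)

# Every non-zero message yields a codeword of weight exactly 2^(ell-1) = 4,
# so the code has parameters [2^ell - 1, ell, 2^(ell-1)] = [7, 3, 4].
weights = {sum(simplex_encode(b))
           for b in itertools.product((0, 1), repeat=ell) if any(b)}
print(weights)  # → {4}
```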


Singleton bounds I

So far we have discussed Shannon's theory, Hamming's metric and Hamming codes, and Hadamard codes. We are looking for asymptotically good codes that correct some constant fraction of errors while still transmitting the information through the noisy channel at a positive rate. In our usual notation of [n, k, d]_q codes, Hamming's construction gives an [n, n − log2 n, 3]_2 code, while Hadamard codes were [n, log2 n, n/2]_2 codes.

But we are looking for codes where k/n and d/n have a lower bound independent of n, and where q does not grow to ∞. In this scenario, we discuss the following simple impossibility result:

Theorem 12

For a code C : Σk → Σn with minimum distance d, n ≥ k + d − 1.


Singleton bounds II

Proof.

In other words, we want to prove that d ≤ n − (k − 1). Project all the codewords onto the first k − 1 coordinates. Since there are q^k different codewords, by the pigeonhole principle at least two of them must agree on these k − 1 coordinates. But then these two codewords disagree on at most the remaining n − (k − 1) coordinates, and hence the minimum distance of the code satisfies d ≤ n − (k − 1).


Reed-Solomon Codes I

Definition 13

Let Σ = F_q be a finite field and let α_1, ..., α_n be distinct elements of F_q. Given n, k and F_q such that k ≤ n ≤ q, we define the encoding function for Reed-Solomon codes C : Σ^k → Σ^n as follows: on message m = (m_0, m_1, ..., m_{k−1}), consider the polynomial p(X) = ∑_{i=0}^{k−1} m_i X^i and set C(m) = ⟨p(α_1), p(α_2), ..., p(α_n)⟩.
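A toy instantiation of this definition in Python (the parameters q = 7, n = 5, k = 2 and the points α_i = i are our choice), brute-forcing the minimum distance:

```python
import itertools

q, n, k = 7, 5, 2                     # toy parameters with k ≤ n ≤ q
alphas = list(range(1, n + 1))        # distinct evaluation points in F_7

def rs_encode(m):
    """C(m) = <p(α_1), ..., p(α_n)> with p(X) = Σ m_i X^i, arithmetic mod q."""
    return tuple(sum(m[i] * pow(a, i, q) for i in range(k)) % q for a in alphas)

codewords = [rs_encode(m) for m in itertools.product(range(q), repeat=k)]

def dist(x, y):
    return sum(xi != yi for xi, yi in zip(x, y))

d = min(dist(x, y) for x, y in itertools.combinations(codewords, 2))
print(d)  # → 4 = n - (k - 1)
```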

Theorem 14

The Reed-Solomon code matches the Singleton bound, i.e. it is an [n, k, n − (k − 1)]_q code.


Reed-Solomon Codes II

Proof.

The proof is based only on the simple fact that a non-zero polynomial of degree l over a field can have at most l zeros. For the Reed-Solomon code, two codewords (with corresponding polynomials p_1 and p_2) agree in the i-th coordinate iff (p_1 − p_2)(α_i) = 0. But p_1 − p_2, by the above fact, can have at most k − 1 zeros, which means that the minimum distance is d ≥ n − (k − 1).

Reed-Solomon codes are linear. This can easily be verified from the fact that the polynomials of degree ≤ k − 1 form a vector space (i.e. if p_1, p_2 are polynomials of degree ≤ k − 1, then so are βp_1 and p_1 + p_2). Since the polynomials p(X) ≡ 1, p(X) = X, ..., p(X) = X^{k−1} form a basis for this


Reed-Solomon Codes III

vector space, we can also find a generator matrix for Reed-Solomon codes.

G = [ 1          1          ...  1
      α_1        α_2        ...  α_n
      ...        ...        ...  ...
      α_1^{k−1}  α_2^{k−1}  ...  α_n^{k−1} ]

One can also prove the theorem about the minimum distance of Reed-Solomon codes by using the fact that any k columns of G are linearly independent (because the α_i's are distinct, and thus the Vandermonde matrix formed by any k columns is non-singular).


Reed-Solomon Codes IV

Using Reed-Solomon codes with k = n/2 we get an [n, n/2, (n + 2)/2]_q code, which means that we can correct many more errors than before. Typically Reed-Solomon codes are used for storing information on CDs, because they are robust against bursty errors that arrive in a contiguous manner, unlike the random error model studied by Shannon. Also, if some information is erased in the corrupted encoding, we can still retrieve the original message by interpolating the polynomial on the remaining values.
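Erasure recovery by interpolation can be sketched as follows (a self-contained Python example; the field GF(7), the message and the surviving positions are all our choices):

```python
q, k = 7, 2                      # toy prime field size and message length

def lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree < len(points) passing
    through `points` at x, working mod the prime q."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        total = (total + yi * num * pow(den, q - 2, q)) % q  # Fermat inverse
    return total

# Message (3, 2), i.e. p(X) = 3 + 2X; only the evaluations at α = 2 and
# α = 5 survive an erasure: p(2) = 0, p(5) = 6 (mod 7).
survivors = [(2, 0), (5, 6)]

m0 = lagrange_eval(survivors, 0)             # p(0) = m0
m1 = (lagrange_eval(survivors, 1) - m0) % q  # p(1) - p(0) = m1 since deg p ≤ 1
print(m0, m1)  # → 3 2
```

Any k surviving evaluations suffice, since they determine the degree < k polynomial uniquely.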

A way to visualize a Reed-Solomon code on a binary alphabet is to consider q = n = 2^m. This gives C : Σ^k = F_2^{k log2 n} → Σ^n = F_2^{n log2 n}, an [n log2 n, k log2 n, n − (k − 1)]_2 code. And if you put n log2 n = N and n − (k − 1) = 5, then this gives an [N, N − 4 log2 N, 5]_2 code. As we have already seen, Hamming's construction gave an [N, N − log2 N, 3]_2 code, and Hamming's impossibility result said that


Reed-Solomon Codes V

for an [N, k, 2t + 1]_2 code, k ≤ N − t log2 N. Reed-Solomon codes don't achieve this, but give the fairly close k ≥ N − 2t log2 N. (We will discuss later the BCH codes, which match this bound.)


Irving Stoy Reed (1923–2012), mathematician and engineer. He is best known for co-inventing a class of algebraic error-correcting and error-detecting codes known as Reed–Solomon codes, in collaboration with Gustave Solomon. He also co-invented the Reed–Muller code. Reed made many contributions to areas of electrical engineering including radar, signal processing, and image processing.

Gustave Solomon (1930–1996) was a mathematician and electrical engineer who was one of the founders of the algebraic theory of error detection and correction. Solomon was best known for developing, along with Irving S. Reed, the algebraic error correction and detection codes named the Reed–Solomon codes.


Multivariate Polynomial Codes I

Here, instead of considering polynomials in one variable (as in Reed-Solomon codes), we will consider multivariate polynomials. For example, consider the following encoding, similar to Reed-Solomon codes but using bivariate polynomials.

Definition 15

Let Σ = F_q and let k = l^2 and n = q^2. A message m = (m_00, m_01, ..., m_ll) is treated as the coefficients of a bivariate polynomial p(X, Y) = ∑_{i=0}^{l} ∑_{j=0}^{l} m_ij X^i Y^j, which has degree l in each variable. The encoding is just the evaluation of the polynomial at all the elements of F_q × F_q, i.e. C(m) = ⟨p(x, y)⟩_{(x,y) ∈ F_q × F_q}.

But what is the minimum distance of this code? We will use the following lemma to find it.


Multivariate Polynomial Codes II

Lemma 16 (Schwartz-Zippel lemma)

A multivariate polynomial Q(X_1, ..., X_m) (not identically 0) of total degree L is non-zero on at least a (1 − L/|S|) fraction of the points in S^m, where S ⊆ F_q.
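A quick numeric sanity check of the lemma (the polynomial Q = XY − 1 over S = F_5 is our choice; its total degree is L = 2):

```python
q = 5
S = range(q)
Q = lambda x, y: (x * y - 1) % q   # total degree L = 2, not identically 0

# Count the points of S^2 where Q is non-zero; the lemma promises at least
# a (1 - L/|S|) = 3/5 fraction of the q^2 = 25 points.
nonzero = sum(1 for x in S for y in S if Q(x, y) != 0)
print(nonzero, nonzero / q**2)  # → 21 0.84, and indeed 0.84 ≥ 0.6
```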


Multivariate Polynomial Codes III

Proof of the lemma (first part).

We use induction on the number of variables. For m = 1 it is easy, since we know that a univariate Q(X_1) can have at most L zeros over the field F_q. Now assume that the induction hypothesis is true for multivariate polynomials with up to m − 1 variables, for m > 1. Consider

Q(X_1, ..., X_m) = ∑_{i=0}^{t} X_1^i Q_i(X_2, ..., X_m),

where t ≤ L is the largest exponent of X_1 in Q(X_1, ..., X_m). So the total degree of Q_t(X_2, ..., X_m) is at most L − t.


Proof of the lemma (conclusion).

By the induction hypothesis, Q_t(X_2, ..., X_m) ≠ 0 on at least a (1 − (L−t)/|S|) fraction of the points in S^{m−1}. But whenever Q_t(s_2, ..., s_m) ≠ 0, the polynomial Q(X_1, s_2, ..., s_m) is not identically zero and has degree t in X_1, and therefore is non-zero for at least a (1 − t/|S|) fraction of the choices of X_1. Putting it all together, Q(X_1, ..., X_m) is non-zero on at least a (1 − (L−t)/|S|)(1 − t/|S|) ≥ (1 − L/|S|) fraction of the points in S^m.

Using this for m = 2 and S = F_q, we get that the above bivariate polynomial code is a [q^2, l^2, (1 − 2l/q)q^2]_q code.

Some interesting cases of multivariate polynomial codes include the Reed-Solomon code (l = k and m = 1), the bivariate polynomial code (l = √k and m = 2), and the Hadamard code (l = 1, m = k and q = 2)!


BCH Codes I

A class of important and widely used cyclic codes that can correct multiple errors, developed by R. C. Bose and D. K. Ray-Chaudhuri (1960) [4] and independently by A. Hocquenghem (1959) [5].

Let C_1 be the code over F_n = F_{2^t}, n = 2^t, with the following parity check matrix:

H = [ 1  α_1  α_1^2  α_1^3
      1  α_2  α_2^2  α_2^3
      ...
      1  α_n  α_n^2  α_n^3 ]

C_1 is an [n, n − 4, 5]_n code. The reason why the code has distance 5 is as follows: if the code had distance at most 4, there would exist 4 rows of H that are linearly dependent. However, this cannot happen, because the submatrix consisting of any 4 rows of H is a Vandermonde matrix, whose determinant is non-zero when the elements are distinct.


BCH Codes II

Now consider C_BCH = C_1 ∩ {0, 1}^n. Clearly, the length and the distance of the code do not change, so C_BCH is an [n, ?, 5]_2 code. The main question here is how many codewords there are in C_BCH. We know that the all-zero codeword is in C_BCH, but is there any other codeword?

Let us represent all entries of F_n by vectors, such that:

F_{2^t} ←→ F_2^t
α ←→ V_α
1 ←→ (1 0 0 ... 0)


BCH Codes III

Applying this representation to H, we get a new matrix:

H' = [ 1  V_{α_1}  V_{α_1^2}  V_{α_1^3}
       1  V_{α_2}  V_{α_2^2}  V_{α_2^3}
       ...
       1  V_{α_n}  V_{α_n^2}  V_{α_n^3} ]

If a {0, 1} vector X = [x_1 ... x_n] satisfies XH' = 0, then

∑ x_i V_{α_i} = 0  ⇒  V_{∑ x_i α_i} = 0  ⇒  ∑ x_i α_i = 0  ⇒  XH = 0.


BCH Codes IV

Consider the matrix H̄ equal to H' with the third column removed:

H̄ = [ 1  V_{α_1}  V_{α_1^3}
      1  V_{α_2}  V_{α_2^3}
      ...
      1  V_{α_n}  V_{α_n^3} ]

Proposition 17

For any X ∈ {0, 1}^n such that XH̄ = 0, we also have XH = 0.


BCH Codes V

Proof.

The only question is whether ∑ x_i α_i^2 = 0. Over F_{p^t} we have (x + y)^p = x^p + y^p for all x, y. Therefore, using x_i^2 = x_i for x_i ∈ {0, 1},

∑ x_i α_i^2 = ∑ x_i^2 α_i^2 = (∑ x_i α_i)^2 = 0.

The second and third columns of H̄ impose log n linear constraints each on the code, so the dimension of the code is n − 2 log n − 1. Thus, C_BCH is an [n, n − 2 log n − 1, 5] code.


BCH Codes VI

In general, the BCH code of distance d has the following parity checkmatrix:

[ 1  α_1  α_1^2  ...  α_1^{d−2}
  1  α_2  α_2^2  ...  α_2^{d−2}
  ...
  1  α_n  α_n^2  ...  α_n^{d−2} ]
        →
[ 1  α_1  α_1^3  ...  α_1^{d−2}
  1  α_2  α_2^3  ...  α_2^{d−2}
  ...
  1  α_n  α_n^3  ...  α_n^{d−2} ]

The number of columns is 1 + ⌈(d−2)/2⌉ log n, so the binary code we get satisfies n − k ≤ ⌈(d−2)/2⌉ log n.

By the Hamming bound for a code of length n and distance d,

2^k (n choose d/2) ≤ 2^n  →  n − k ≥ (d/2) log(n/d).


BCH Codes VII

In the case d = n^{o(1)}, n − k ≥ (d/2) log n. Thus, BCH codes are essentially optimal as long as d is small.

The problem is more difficult for bigger alphabets. Consider codes over the ternary alphabet {0, 1, 2}. The Hamming bound is n − k ≥ (d/2) log_3 n, while the BCH technique gives n − k = (2/3) d log_3 n, and we do not know how to get a better code in this case.


Raj Chandra Bose (19 June 1901 – 31 October 1987) was an Indian American mathematician and statistician best known for his work in design theory and the theory of error-correcting codes, in which the class of BCH codes is partly named after him.

Dwijendra Kumar Ray-Chaudhuri (born November 1, 1933), a Bengali-born mathematician and statistician, is a professor emeritus at Ohio State University. He is best known for his work in design theory and the theory of error-correcting codes, in which the class of BCH codes is partly named after him and his Ph.D. advisor Bose.

Alexis Hocquenghem (1908?–1990) was a French mathematician. He is known for his discovery of Bose–Chaudhuri–Hocquenghem codes, today better known under the acronym BCH codes.


References I

Thomas M. Cover, Joy A. Thomas, Elements of Information Theory,2nd edition, Wiley, 2006.

David J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.

Robert M. Gray, Entropy and Information Theory, Springer, 2009

John C. Bowman, Coding Theory, University of Alberta, Edmonton,Canada, 2003

D. G. Hoffman, Coding Theory: The Essentials, Marcel Dekker, 1991.

C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, 1948.

R. W. Hamming, Error detecting and error correcting codes, Bell System Technical Journal, 29: 147–160, 1950.


References II

Reed, Irving S.; Solomon, Gustave, Polynomial Codes over Certain Finite Fields, Journal of the Society for Industrial and Applied Mathematics (SIAM) 8 (2): 300–304, 1960.

Bose, R. C.; Ray-Chaudhuri, D. K., On A Class of Error Correcting Binary Group Codes, Information and Control 3 (1): 68–79, 1960.

Hocquenghem, A., Codes correcteurs d'erreurs, Chiffres (Paris) 2: 147–156, 1959.
