Error Detection and Correction
ETM3136 Chapter 2
[Figure: Block diagram of a digital communication system. An information source (data, speech, video, etc.) feeds the transmitter (source encoder, then channel encoder, then modulator); the transmitted signal passes through the channel, where noise is added; the receiver (demodulator, then channel decoder, then source decoder) delivers the received information to the user. Caption: Coding position on a transmission system.]
Information Theory & Coding
What is Information Theory?
Introduction
The purpose of a communication system is the transmission of information from one point to another with high efficiency and reliability.
Information theory was originally known as the mathematical theory of communications; it deals with the modeling and analysis of a communication system rather than with physical sources and physical channels.
It is the scientific study of information and of the communication systems designed to handle it, such as telegraphy, radio communications, etc.
It tells us the minimum number of bits per symbol required to fully represent the source, and the maximum transmission rate: the channel capacity.
In summary, it provides a quantitative measure of the information contained in message signals and allows us to determine the capacity of a communication system to transfer this information from source to destination.
Information Sources
An information source is an object that produces events, the outcomes of which are random and in accordance with some probability distribution.
Sources of information:
- Analog: a source that has a continuous set of amplitudes.
- Digital (Discrete): a source that has only a finite set of symbols as output, called the "source alphabet"; the elements of the set are called "symbols".
- Source with memory: one for which a current symbol depends on the previous symbols.
- Memoryless source: each symbol produced is independent of the previous symbols (statistically independent).
A discrete memoryless source (DMS) is specified by the list of symbols, the probability assigned to each symbol, and the rate at which the source generates them; each symbol is independent of the previous symbols.
Information Content of a Discrete Memoryless Source
• The amount of information contained in an event is closely related to its uncertainty.
• Messages with a high probability of occurrence convey relatively little information.
• An event with probability 1 conveys zero information.
• Information should be proportional to the uncertainty of an outcome.
Amount of Information (Information Content)
For a DMS with source alphabet S = {s0, s1, ..., sK-1}, let the event S = sk, k = 0, 1, ..., K-1, occur with probability of occurrence pk. The information content of the event sk is defined as
I(sk) = loga(1/pk)
Properties:
• I(sk) = 0 for pk = 1: an absolutely certain event conveys no information.
• I(sk) ≥ 0 for 0 ≤ pk ≤ 1: an event provides some or no information.
• I(sk) > I(si) for pk < pi: the less probable the event, the more information it conveys.
• I(sk sl) = I(sk) + I(sl) for sk and sl statistically independent: information contained in statistically independent outcomes adds.
The standard practice today is to use a logarithm to base 2. The resulting unit of information is called the bit (a contraction of binary digit):
I(sk) = log2(1/pk) bits
Example: A source emits one of four possible symbols during each signalling interval. These symbols occur with the probabilities p0 = 0.4, p1 = 0.3, p2 = 0.2, and p3 = 0.1. Find the amount of information gained by observing the source emitting each of these symbols.
Solution: Let the event S=sk denote the emission of symbol sk by
the source.
Hence, I(sk) = log2 (1/pk) bits
I(s0) = log2 (1/0.4) =1.322 bits
I(s1) = log2 (1/0.3) =1.737 bits
I(s2) = log2 (1/0.2) =2.322 bits
I(s3) = log2 (1/0.1) =3.322 bits
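As a quick numerical check, here is a minimal Python sketch that evaluates I(sk) = log2(1/pk) for the four probabilities above (the symbol labels are just names):

import math

probabilities = {"s0": 0.4, "s1": 0.3, "s2": 0.2, "s3": 0.1}
for symbol, p in probabilities.items():
    info = math.log2(1 / p)              # amount of information in bits
    print(f"I({symbol}) = {info:.3f} bits")
# Prints 1.322, 1.737, 2.322 and 3.322 bits, matching the worked solution.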
Usually long sequences of symbols are transmitted, so the average information is of greater interest than the information content of a single symbol.
AVERAGE INFORMATION or ENTROPY
The average information content per symbol is called the entropy:
H = E[I(sk)] = Σk pk I(sk) = Σk pk log2(1/pk) bits/symbol
Example: Consider a discrete memoryless source with source
alphabet {s0, s1, s2} with probabilities p0=1/4 , p1=1/4 and
p2=1/2. Find the entropy of the source.
Solution: The entropy of the given source is
H = p0 log2 (1/p0) + p1 log2 (1/p1) + p2log2 (1/p2)
= ¼ log2 (4) + ¼ log2 (4) + ½ log2 (2)
= 2/4 + 2/4 + ½
= 3/2 bits per symbol
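The same computation can be packaged as a small Python sketch of the entropy formula (the function name is ours); applied to this source it returns 1.5 bits per symbol:

import math

def entropy(probs):
    # H = sum over k of pk * log2(1/pk), in bits per symbol
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.5]))  # -> 1.5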
Parity Bits
A parity bit, or check bit, is a bit added to a string of binary code that indicates whether the number of 1-bits in the string is even or odd. Parity bits are the simplest form of error-detecting code. There are two variants: the even parity bit and the odd parity bit.
In the case of even parity, for a given set of bits, the occurrences of bits whose value is 1 are counted. If that count is odd, the parity bit value is set to 1, making the total count of 1's in the whole set (including the parity bit) an even number. If the count of 1's in the given set of bits is already even, the parity bit's value is 0. For example:
a. 1 1 1 1 1 0 1 _0_
b. 1 0 0 0 1 1 0 _1_
In example a the number of 1's is counted and it is even, so the parity bit is 0. In example b the number of 1's is odd, so the parity bit is 1, making the total number of 1's even.
In the case of odd parity, the coding is reversed. For a given set of bits, if the count of bits with a value of 1 is even, the parity bit value is set to 1, making the total count of 1's in the whole set (including the parity bit) an odd number. If the count of bits with a value of 1 is odd, the count is already odd, so the parity bit's value is 0. For example:
a. 0 0 1 1 0 0 0 _1_
b. 1 1 0 1 1 0 1 _0_
In example a the number of 1's is counted and it is even, so the parity bit is 1 to make the total odd. In example b the number of 1's is already odd, so the parity bit is 0. (A sketch of both variants follows.)
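Both parity variants can be sketched in a few lines of Python (the function name and the bit-string representation are our own choices); it reproduces the four examples above:

def parity_bit(bits, even=True):
    # Count the 1's; for even parity the appended bit makes the total even,
    # for odd parity it makes the total odd.
    ones = bits.count("1")
    if even:
        return "0" if ones % 2 == 0 else "1"
    return "1" if ones % 2 == 0 else "0"

print(parity_bit("1111101", even=True))   # -> 0 (even parity, example a)
print(parity_bit("1000110", even=True))   # -> 1 (even parity, example b)
print(parity_bit("0011000", even=False))  # -> 1 (odd parity, example a)
print(parity_bit("1101101", even=False))  # -> 0 (odd parity, example b)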
Source Coding
• Source coding is a process by which the symbols generated by a discrete source are efficiently represented. The device that performs the representation is called a source encoder.
• For a source encoder to be efficient, we require knowledge of the statistics of the source.
• In particular, if some source symbols are known to be more probable than others, then we may exploit this feature by assigning short code words to frequent source symbols, and long code words to rare source symbols.
• We refer to such a source code as a variable-length code.
Our primary interest is in the development of an efficient
source encoder that satisfies two functional requirements:
• The code words produced by the encoder are in
binary form, i.e. they consist purely of 1’s and 0’s
• The source code is uniquely decodable, so that the
original source sequence can be reconstructed
without ambiguity from the encoded binary
sequence.
Prefix Coding
• To be of practical use, a source code should not only be efficient, it should also be uniquely decodable.
• A special class of efficient, uniquely decodable source codes is the prefix code. A prefix code is so named because it satisfies a property known as the prefix condition: no codeword is the prefix of any other codeword. (A quick check of this condition is sketched below.)
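A quick Python sketch of the prefix-condition check (the helper name and the two example codes are illustrative, not from the lecture):

def is_prefix_code(codewords):
    # Prefix condition: no codeword may be a prefix of any other codeword.
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))  # -> True (a prefix code)
print(is_prefix_code(["0", "01", "11"]))          # -> False ("0" is a prefix of "01")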
Huffman Coding
• Having been introduced to what prefix codes are, you will now learn how to construct a type of prefix code known as the Huffman code.
• The basic idea behind Huffman coding is to encode each symbol with a binary codeword that is roughly equal in length to the amount of information conveyed by the symbol in question.
• The end result is a source code whose average codeword length, L, is roughly equal to the fundamental limit, H(S).
The Huffman Coding Algorithm
• STEP 1: The Splitting Stage
List the source symbols in order of decreasing probability. The two symbols of lowest probability are assigned a 0 and a 1.
• STEP 2: The Combining Stage
Combine the probabilities of the last two symbols and reorder the resultant probabilities. The probability of the new symbol is placed in the list in accordance with its value. The list of symbols is thereby reduced by one.
• STEP 3: Repeat
Repeat STEP 1 and STEP 2 until only two symbols are left, to which a 0 and a 1 are assigned.
• STEP 4: Encode
The code for each source symbol is found by working backward and tracing the sequence of 0's and 1's assigned to that symbol as well as its successors.
Example
• In this example, we demonstrate how a prefix code is constructed for a DMS with alphabet {s0, s1, s2, s3, s4} and corresponding probabilities {0.4, 0.2, 0.2, 0.1, 0.1}.
• Following through the Huffman algorithm, our computation ends after four iterations, resulting in a Huffman tree. [Figure: Huffman tree.]
• The Huffman code for the source is read off the tree; the resulting codeword lengths are 2, 2, 2, 3 and 3 bits.
• Before we can estimate the efficiency of the code, we need to compute the average codeword length and the source entropy:
L = 0.4(2) + 0.2(2) + 0.2(2) + 0.1(3) + 0.1(3) = 2.2 bits
H(S) = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1) = 2.12193 bits
• Finally, the Huffman code's efficiency in encoding the source is:
Efficiency = H(S)/L = 2.12193/2.2 = 96.5 %
(The construction and these numbers are reproduced in the sketch below.)
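A minimal Python sketch of the Huffman construction for this source, using the standard library's heapq. The tie-breaking order is an implementation choice, so the exact codewords may differ from the lecture's tree, but the codeword lengths (2, 2, 2, 3, 3), the average length, and the efficiency match the numbers above:

import heapq, math

def huffman_code(probs):
    # Each heap entry: (probability, tiebreaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)          # the two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}
code = huffman_code(probs)
L = sum(probs[s] * len(w) for s, w in code.items())
H = sum(p * math.log2(1 / p) for p in probs.values())
print(code)            # one valid Huffman code; lengths 2, 2, 2, 3, 3
print(L, H, H / L)     # -> 2.2, ~2.12193, ~0.965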
From the above example, we may make several observations:
• No two codewords consist of an identical arrangement of bits.
• No codeword is a prefix of another codeword => the Huffman code is a type of prefix code.
• Higher-probability symbols have shorter codewords, and vice versa => the Huffman code is a variable-length code.
• The two least probable codewords have equal length and differ only in the final digit.
• The average codeword length is very close to the source entropy.
• The average codeword length satisfies H(S) ≤ L < H(S) + 1.
Channel Coding
Error-Control Coding
Various types of error-control coding techniques that provide different ways of implementing Shannon's channel-coding theorem will be discussed. Each technique involves the use of a channel encoder in the transmitter and a decoding algorithm in the receiver.
The task facing the designer of a digital communication system is that of providing:
– cost effectiveness
– reliability
– quality
From Shannon's third theorem, there are three parameters:
– transmitted signal power
– channel bandwidth
– noise
These determine Eb/No, which uniquely determines the BER for a particular modulation scheme.
• In practice, we find that for a limited value of Eb/No, it is not possible to provide acceptable data quality.
• Also, a reduction in Eb/No for a fixed BER may reduce the system cost (reduced transmitted power, or lower hardware cost through a smaller antenna size).
Solution:
ERROR-CONTROL CODING
Error-Control Coding
• The channel encoder accepts message bits and adds redundancy
according to a prescribed rule, thereby producing encoded data at a
higher bit rate.
• The channel decoder exploits the redundancy to decide which message
bits were actually transmitted.
• Their combined goal is to minimize the effect of channel errors.
Error-Control Coding
• Adding redundancy to the code increases the transmission bandwidth and increases the system complexity (especially the decoding part).
• Hence, the design trade-offs in the use of error-control coding to achieve acceptable error performance include considerations of bandwidth and system complexity.
Error-Control Coding
The distinguishing feature is the presence or absence of memory in the encoders. Error-control codes can be classified into:
– Block codes
– Convolutional codes
• Block codes accept information in successive blocks, producing an encoded block (codeword) for each.
• Convolutional codes accept message bits as a continuous sequence and thereby generate a continuous sequence of encoded bits at a higher rate.
Linear Block Coding
The basic property of a linear block code is called closure; according to this property, the modulo-2 sum of any two codewords is another codeword.
Example 3.1:
C = {0000, 0101, 1010, 1111}
0000 + 0101 = 0101
……
0101 + 1010 = 1111
……
1010 + 1111 = 0101
(A sketch that checks closure over all pairs follows.)
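A small Python sketch that verifies the closure property over all pairs of codewords in this code (helper names are ours):

from itertools import combinations

C = ["0000", "0101", "1010", "1111"]

def xor(a, b):
    # Bitwise modulo-2 sum of two codewords
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

# Closure: the sum of any two codewords must itself be a codeword.
print(all(xor(a, b) in C for a, b in combinations(C, 2)))  # -> True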
Linear Block Coding
• Consider an (n, k) linear block code:
– The first portion of k bits is always identical to the message sequence to be transmitted.
– The second portion is n - k generalized parity-check bits, computed from the message bits in accordance with a prescribed encoding rule that determines the mathematical structure of the code.
Block codes in which the message bits are transmitted in unaltered form are called systematic codes.
[Figure: Structure of a systematic codeword, k message bits followed by n - k parity-check bits.]
Some Definitions
Hamming Weight:
The Hamming weight of a code vector x is defined as the
number of non-zero components of code vector x.
Hamming Distance or Distance:
The Hamming distance between two vectors is defined as the
number of components in which they differ.
Minimum Distance:
The minimum distance of a block code is the smallest distance
between any pair of codewords in the code.
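A short Python sketch of these three definitions, applied to the code C of Example 3.1 (helper names are ours):

from itertools import combinations

def weight(x):
    # Hamming weight: number of non-zero components
    return x.count("1")

def distance(x, y):
    # Hamming distance: number of components in which x and y differ
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    # Smallest distance over all pairs of distinct codewords
    return min(distance(a, b) for a, b in combinations(code, 2))

C = ["0000", "0101", "1010", "1111"]
print(weight("0101"))            # -> 2
print(distance("0101", "1010"))  # -> 4
print(min_distance(C))           # -> 2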
Types of Error Correcting Codes
• Repetition Code
• Linear Block Code, e.g. Hamming
• Cyclic Code, e.g. Cyclic Redundancy Code, CRC
• BCH and RS Code
• Convolutional Code
– Traditional, with Viterbi decoding
– Turbo Code
– LDPC Code
• Coded Modulation
– TCM
– BICM
Cyclic Redundancy Check (CRC) Codes
Invented by W. Wesley Peterson and published in 1961.
A type of linear block code, basically based on binary division.
Generally not cyclic, but derived from cyclic codes.
A systematic error-detecting code: a group of error-control bits (which is the remainder of a polynomial division of a message polynomial by a generator polynomial) is appended to the end of the message block.
It has considerable burst-error detection capability.
The receiver generally has the ability to send retransmission requests back to the data source through a feedback channel.
CRC Generator and Checker
Quick exercise: x^3 + x^2 + x + 1 = ?? (as a binary word, this is 1111)
Problem: Determine the codeword to be sent for the message 1011011, with generator P(x) = 1101 = x^3 + x^2 + x^0, n = 3.
• Check that the output message is correct. (A sketch that carries out the modulo-2 division follows.)
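Here is a minimal Python sketch of the modulo-2 long division. Under the usual CRC procedure it gives the remainder 001 for this problem, so the transmitted word would be 1011011001; it also reproduces the [11100110] example further below:

def crc_remainder(message, generator):
    # Modulo-2 long division: append n = len(generator) - 1 zeros, XOR the
    # generator under each leading 1, and keep the last n bits as remainder.
    n = len(generator) - 1
    dividend = [int(b) for b in message + "0" * n]
    g = [int(b) for b in generator]
    for i in range(len(message)):
        if dividend[i] == 1:          # the leading bit drops out after each XOR
            for j in range(len(g)):
                dividend[i + j] ^= g[j]
    return "".join(str(b) for b in dividend[-n:])

r = crc_remainder("1011011", "1101")
print(r, "1011011" + r)                    # -> 001, codeword 1011011001
print(crc_remainder("11100110", "11001"))  # -> 0110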
Binary (N, k) CRC codes
k message or data bits are encoded into N code bits by appending to the message bits a sequence of n = N - k bits. (During the modulo-2 division, the leftmost bit of each intermediate result drops out after the XOR operation.)
Cyclic Redundancy Check (CRC) Codes
Polynomial representation:
Message bits: m = [mk-1 mk-2 ... m1 m0]
m(X) = mk-1 X^(k-1) + mk-2 X^(k-2) + ... + m1 X + m0, degree (k-1)
Appended bits: R = [rn-1 rn-2 ... r1 r0]
R(X) = rn-1 X^(n-1) + rn-2 X^(n-2) + ... + r1 X + r0, degree (n-1)
CRC code bits: C = [cN-1 cN-2 ... c1 c0] = [mk-1 mk-2 ... m1 m0 rn-1 rn-2 ... r1 r0]
C(X) = cN-1 X^(N-1) + cN-2 X^(N-2) + ... + c1 X + c0 = X^n m(X) + R(X), degree (N-1)
Example: (k = 10, N = 13, n = N-k = 3), CRC code
Cyclic Redundancy Check (CRC) Codes
How to obtain the polynomial R(X) (the appended bits):
CRC codes are designed by a generator polynomial g(X) of degree n; g = [gn gn-1 ... g1 g0]
g(X) = gn X^n + gn-1 X^(n-1) + ... + g1 X + g0, degree (n)
Divide X^n m(X) by g(X) (modulo-2 division) and obtain the remainder, which is R(X):
X^n m(X) = p(X) g(X) + R(X)
The remainder R(X) is always a polynomial of maximum degree (n-1).
Cyclic Redundancy Check (CRC) Codes
Example: the polynomial R(X) (the appended bits)
Message [11100110], 8 bits
m(X) = X^7 + X^6 + X^5 + X^2 + X
Generator polynomial g(X) = X^4 + X^3 + 1, i.e. [11001]
Given N - k = n = 4, there are 4 bits of redundancy: one bit less than the generator length.
X^n m(X) is the polynomial corresponding to the message bit sequence to which n 0's are appended: [111001100000]
The remainder is R(X) = X^2 + X, therefore the appended bits are [0110] (since n = 4).
The CRC code bits are [11100110 0110].
Cyclic Redundancy Check (CRC) Codes
The polynomial for the received code word T(X) is divided by the generator polynomial g(X).
Upon reception without error:
- T(X) = C(X) = X^n m(X) + R(X)
- The remainder of T(X)/g(X) is R(X) + R(X) = the all-zero word (modulo-2 addition).
- Example:
g(X) = X^4 + X^3 + 1
The transmitted CRC code bits are [111001100110]
T(X) = C(X) = X^11 + X^10 + X^9 + X^6 + X^5 + X^2 + X = (X^7 + X^5 + X^4 + X^2 + X) g(X)
The remainder of [C(X)/g(X)] = 0 → [0000]
Cyclic Redundancy Check (CRC) Codes
The polynomial for the received code word T(X) is divided by the generator polynomial g(X). If the remainder is not zero, the received word is in error.
Using Polynomials
• We can use a polynomial to represent a binary word.
• Each bit, from right to left, is mapped onto a power term.
• The rightmost bit represents the "0" power term, the bit next to it the "1" power term, etc.
• If a bit has the value zero, its power term is deleted from the expression.
(A small converter is sketched below.)
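A small Python sketch of this bit-to-polynomial mapping (the formatting choices are ours):

def bits_to_poly(bits):
    # Map each 1-bit, right to left, onto its power term; 0-bits are dropped.
    terms = []
    for i, b in enumerate(bits):
        power = len(bits) - 1 - i
        if b == "1":
            terms.append("x^%d" % power if power > 1 else ("x" if power == 1 else "1"))
    return " + ".join(terms) if terms else "0"

print(bits_to_poly("1111"))      # -> x^3 + x^2 + x + 1
print(bits_to_poly("11100110"))  # -> x^7 + x^6 + x^5 + x^2 + x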
Cyclic Redundancy Check Codes
• Cyclic codes are extremely well-suited for error detection.
• They can be designed to detect many combinations of likely errors.
• The implementation of both encoding and error-detecting circuits is practical.
• It is for these reasons that virtually all error-detecting codes used in practice are of the cyclic-code type.
Cyclic Redundancy Check Codes
• A cyclic code used for error detection is referred to as a cyclic redundancy check (CRC) code.
• We define an error burst of length B in an n-bit received word as a contiguous sequence of B bits in which the first and last bits and any number of intermediate bits are received in error.
Convolutional Codes
• The constraint length of a convolutional code, K, expressed in terms of message bits, is defined as the number of shifts over which a single message bit can influence the encoder output.
• In convolutional coding, the generated codes are non-systematic codes.
Convolutional codes offer an approach to error-control coding substantially different from that of block codes.
– A convolutional encoder:
• encodes the entire data stream into a single codeword.
• maps information to code bits sequentially by convolving a sequence of information bits with "generator" sequences.
• does not need to segment the data stream into blocks of fixed size (convolutional codes are often forced into a block structure by periodic truncation).
• is a machine with memory.
– This fundamental difference imparts a different nature to the design and evaluation of the code.
• Block codes are based on algebraic/combinatorial techniques.
• Convolutional codes are based on construction techniques.
o Easy implementation using a linear finite-state shift register.
A convolutional code is specified by three parameters, (n, k, K) or (k/n, K), where:
– k is the number of inputs and n the number of outputs. In practice, k = 1 is usually chosen.
– R = k/n is the coding rate, determining the number of data bits per coded bit.
– K is the constraint length of the convolutional code: expressed in terms of message bits, it is defined as the number of shifts over which a single message bit can influence the encoder output.
• Given that the message sequence (101) is applied to the input of the convolutional encoder with constraint length K = 3 and rate r = 1/2, find the corresponding output sequence.
The output sequence is (11, 10, 00, 10).
Convolutional encoder (rate 1/2, K = 3)
– 3 shift-register stages, where the first one takes the incoming data bit and the rest form the memory of the encoder.
• Given that the message sequence (10011) is applied to the input of the convolutional encoder with constraint length K = 3 and rate r = 1/2, find the corresponding output sequence.
The output sequence is (11, 10, 11, 11, 01).
[Figure: Convolutional encoder, constraint length K = 3, rate r = 1/2. The input bit and a two-stage flip-flop register (M1, M2) feed two mod-2 adders, producing the Path 1 and Path 2 outputs; the register states M1M2 take the values 00, 10, 01, 11. For input 10011 the output sequence is (11, 10, 11, 11, 01).]
Input data: m = 1 1 0 1 1
Codeword: X = 11 01 01 00 01
(These examples are reproduced by the encoder sketch below.)
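A minimal Python sketch of the encoder. The figures are not reproduced in this transcript, so the generator connections are assumed to be the common (111, 101) pair (Path 1 = input + M1 + M2, Path 2 = input + M2, modulo 2); with that assumption the sketch reproduces all three output sequences quoted in this section (for (101), the fourth output pair comes from one flushing 0):

def conv_encode(bits):
    # Rate-1/2, constraint length K = 3 convolutional encoder.
    m1 = m2 = 0                                # the two flip-flops (M1, M2)
    out = []
    for b in bits:
        out.append((b ^ m1 ^ m2, b ^ m2))      # (Path 1 bit, Path 2 bit)
        m1, m2 = b, m1                         # shift the register
    return out

print(conv_encode([1, 0, 1, 0]))      # -> (1,1)(1,0)(0,0)(1,0)
print(conv_encode([1, 0, 0, 1, 1]))   # -> (1,1)(1,0)(1,1)(1,1)(0,1)
print(conv_encode([1, 1, 0, 1, 1]))   # -> (1,1)(0,1)(0,1)(0,0)(0,1)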
Code Tree
• The structural properties of a convolutional encoder are portrayed in graphical form by its code tree.
[Figure: code tree for the encoder with constraint length K = 3, rate r = 1/2.]
Code Tree
• Starting point: at the left; it corresponds to the situation before the occurrence of the first message bit.
• Input bit = 0: diverge upward in the tree.
• Input bit = 1: diverge downward in the tree.
• So, a specific path in the tree is traced from left to right in accordance with the input (message) sequence.
Code Tree - Example
• Given that the message sequence (10011) is applied to the input of the convolutional encoder with constraint length K = 3, rate r = 1/2:
The first input bit is 1: move downward from the input node.
The second bit is 0: move upward from this node.
Continue proceeding along the branches in this manner.
Reading in order, the output sequence is (11, 10, 11, 11, 01).
Code Tree - Example
[Figure: path through the code tree for message sequence (10011), giving output sequence (11, 10, 11, 11, 01).]
State Diagram
• The current output of the convolutional encoder depends on its past history, which is recorded in terms of the outputs of the flip-flops in the shift register.
• For a 2-stage shift register, there are four possible states a, b, c and d, corresponding to M1M2 = 00, 10, 01, 11 respectively.
State Diagram - convolutional encoder with constraint length K = 3, rate r = 1/2
[Figure: state diagram over the states a, b, c, d, with branch labels 00, 11, 10, 01.]
• Solid branch: represents a transition from one state to another state in response to input 0.
• Dashed branch: represents a transition in response to input 1.
• The binary label on each branch represents the encoder's output as it moves from one state to another.
State Diagram
• Simply start at a and walk through the state diagram in accordance with the message sequence (10011).
• The states visited are 10, 01, 00, 10, 11.
• Finally, the path a-b-c-a-b-d is walked through, and the encoder therefore outputs the sequence (11, 10, 11, 11, 01).
[Figure: the walk through the state diagram; the first transition, a to b, outputs 11.]
For a 2-stage shift register, there are four possible states a, b, c and d, corresponding to M1M2 = 00, 10, 01, 11 respectively.
Trellis Diagram
• The third method of representing the structural properties of a convolutional encoder is by its trellis.
• Example:
– A central portion of the trellis for the convolutional encoder with constraint length K = 3, rate r = 1/2.
– It illustrates the possible state transitions from time instant j to j+1.
[Figure: trellis section from time j to j+1 over the states a, b, c, d, with branch labels 00, 11, 01, 10.]
The corresponding state-transition table:

i/p  P.S.  N.S.  o/p
0    00-a  00-a  00
1    00-a  10-b  11
0    10-b  01-c  10
1    10-b  11-d  01
0    01-c  00-a  11
1    01-c  10-b  00
0    11-d  01-c  01
1    11-d  11-d  10
Trellis Diagram
• Left nodes: represent the four possible current states of the encoder.
• Right nodes: represent the next states.
• Solid line: represents a code branch produced by an input 0.
• Dashed line: represents a code branch produced by an input 1.
Trellis Diagram
[Figure: trellis for the convolutional encoder with constraint length K = 3, rate r = 1/2.]
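As a sketch of how the trellis is used in decoding, here is a minimal hard-decision Viterbi decoder for the same encoder (generators again assumed to be the (111, 101) pair); it recovers the message (10011) from the error-free output sequence:

def viterbi_decode(received):
    def branch(state, b):
        # Trellis branch: next state and output pair for input bit b
        m1, m2 = state
        return (b, m1), (b ^ m1 ^ m2, b ^ m2)

    states = [(0, 0), (1, 0), (0, 1), (1, 1)]            # a, b, c, d
    metric = {s: 0 if s == (0, 0) else float("inf") for s in states}
    path = {s: [] for s in states}
    for r in received:
        new_metric = {s: float("inf") for s in states}
        new_path = {}
        for s in states:
            if metric[s] == float("inf"):
                continue                                  # state not yet reachable
            for b in (0, 1):
                ns, out = branch(s, b)
                d = metric[s] + (out[0] != r[0]) + (out[1] != r[1])  # Hamming metric
                if d < new_metric[ns]:
                    new_metric[ns], new_path[ns] = d, path[s] + [b]
        metric, path = new_metric, new_path
    best = min(states, key=lambda s: metric[s])           # smallest-metric survivor
    return path[best]

print(viterbi_decode([(1, 1), (1, 0), (1, 1), (1, 1), (0, 1)]))  # -> [1, 0, 0, 1, 1]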
Thank you very much