
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 41, NO. 2, FEBRUARY 1993, pp. 261-265

Transactions Letters

Quantization Loss in Convolutional Decoding

I. M. Onyszchuk, K.-M. Cheung, and O. Collins

Abstract-We study the loss in quantizing coded symbols from

the AWGN channel with BPSK or QPSK modulation. A new

quantization scheme and branch metric calculation method are

presented. For the uniformly quantized AWGN channel, cutoff

rate is used to determine the stepsize and the smallest number

of quantization bits needed for a given bit signal-to-noise ratio

(Eb/No) loss. A 9-level quantizer is presented, along with 3-bit

branch metrics for a rate 1/2 code, which causes an Eb/No

loss of only 0.14 dB. These results also apply to soft-decision

decoding of block codes. A tight upper bound is derived for the

range of path metrics in a Viterbi decoder. The calculations are

verified by simulations of several convolutional codes, including

the new memory 14, rate 1/4 or 1/6 codes used by the big Viterbi

decoders at JPL.

I. INTRODUCTION

QUANTIZING AWGN channel symbols may significantly

increase the bit signal-to-noise ratio (Eb/No) required to

achieve a particular decoded bit error rate (BER). This

increase in Eb/No, called quantization loss, depends mainly

upon the channel instead of the particular code utilized. It

is well-known that using 3 bits of quantization causes a loss

of 0.2-0.25 dB [1]-[7]. Furthermore, channel cutoff rate Ro

is useful for estimating quantization loss and for determining

good quantizer thresholds [3], [4].

The hardware complexity of one section in a convolutional

decoder increases linearly with both the number of bits (q)

used to quantize demodulated channel symbols and the number

of bits (ℓ) used to represent path metrics. Also, the decoding

speed depends strongly upon q and ℓ when bit-serial arithmetic

is used. Therefore, these parameters should be made the

smallest values which do not cause a significant Eb/No loss.

We find that 4-bit quantization yields a good compromise

between complexity and performance. The 9-level quantizer

described in the next section may be useful for very high-

speed decoders. More bits of quantization may be required

when coding gain is crucial, as in deep-space communications.

Paper approved by the Editor for Coding Theory and Applications. Manuscript received March 2, 1990; revised November 7, 1991. This work was supported by the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. This work was presented in part at the IEEE International Symposium on Information Theory, San Diego, CA, January 14-19, 1990.

I. M. Onyszchuk was with the Jet Propulsion Laboratory, Pasadena, CA 91109. He is now at 2305-2000 Hungerford Gate, Kanata, Ont. K2L 2T4, Canada.

K.-M. Cheung is with the Jet Propulsion Laboratory, Pasadena, CA 91109.

O. Collins is with the Department of Electrical Engineering, The Johns Hopkins University, Baltimore, MD 21218.

IEEE Log Number 9207339.

Then, since the memory 14 Viterbi decoders at JPL perform

double the computation of a memory 13 decoder, but for rate

1/4 or 1/6 require about 0.1 dB less Eb/No for a bit error rate

(BER) of 0.001, even a quantization loss of 0.05 dB may be

unacceptable. Furthermore, extra quantization bits compensate

for inaccuracy (particularly at low Eb/No) in the automatic

gain control (AGC) within a receiver.

In Section II, a new branch metric formula and uniform

quantization scheme are derived for the AWGN channel.

Nonuniform quantizers are not considered because they did

not decrease BER significantly for two test cases with q = 3.

In Sections III and IV, cutoff rate is used to estimate a good

quantizer stepsize Δ and the corresponding Eb/No power

loss for the uniformly quantized AWGN channel. In order to

determine ℓ, a tight bound is derived in Section V for the

maximum difference between any two path metrics compared

in a Viterbi decoder.

The theoretical results are verified by simulations of

three convolutional codes with octal generator polynomials:

(171,133), the memory 6, rate 1/2, NASA standard

code; (46321,51271,63667,70535), the memory 14,

experimental code for the Galileo mission to Jupiter; and

(46321,51271,63667,70535,73277,76513), the new rate

1/6, memory 14 code for the Cassini mission. In all cases,

the quantization loss measured from simulations was close to

the loss in channel cutoff rate Ro, and the quantizer stepsize

which maximized Ro nearly minimized BER, even though

the memory 14 code rates are above Ro for the operating

Eb/No of 0-1 dB. Simulations of a Viterbi decoder yielded

the same quantization loss in dB measured with respect to bit

error rate or symbol error rate, where a symbol is a block of

4 to 8 consecutive decoded bits. Therefore, the results in this

paper also apply when an outer block code is concatenated

with the convolutional code. Although the examples presented

here are for rate 1/n Viterbi decoders, this work applies

(except Section V) to soft-decision decoding of block codes

and to other convolutional decoders such as those using a

stack algorithm.

II. BRANCH METRICS

Suppose convolutionally encoded bits are sent over an AWGN channel with BPSK modulation. Then an encoded 0 or 1 is mapped to +1 or -1, respectively, and used to modulate the carrier phase for T_s seconds. Zero-mean, additive, white Gaussian noise with one-sided power spectral density No W/Hz affects each received channel symbol independently at the receiver. The demodulator output is a conditionally Gaussian random variable y with mean +a or -a, where a = √Es and Es is the received energy per symbol. The variance of y is σ² = No/2, the same as the noise. Ideal coherent detection is assumed and QPSK modulation is treated as two independent BPSK streams. The automatic gain control (AGC) in a wideband receiver divides y by an estimate of σ, which makes the noise variance essentially 1 [1], [2].

0090-6778/93$03.00 © 1993 IEEE

For the AWGN channel with fixed Es/No, a maximum-likelihood sequence decoder finds the trellis path with minimum Euclidean distance, equivalently minimum negative inner product, to the received sequence of demodulated channel symbols. So the metric for a trellis branch is the dot product of the branch label [l_1, l_2, ..., l_n], l_i = +1 or -1, and the negative of a received vector [y_1, y_2, ..., y_n]. Note that branches with lower metrics are closer to the received vector. Now adding the same value to all branch metrics at a trellis level, or multiplying them all by the same positive value, does not change the decoder output. Therefore, the decoder may add (-y_i + |y_i|)/2 or (y_i + |y_i|)/2, instead of -y_i or +y_i, to the metrics of those branches for which l_i = +1 or -1, respectively, yielding the branch metric

\[
\sum_{i=1}^{n} \frac{-l_i y_i + |y_i|}{2}
  = \sum_{i=1}^{n}
    \begin{cases}
      |y_i| & \text{if } l_i \neq \operatorname{sign}(y_i), \\
      0     & \text{otherwise,}
    \end{cases}
\]

which is equivalent to Euclidean distance. This “sign-magnitude” method appears in [1, pp. 41, 87], [8], [10, p. 22], but the derivation above seems to be new. This method will be used herein because it halves the branch and path/state metric maximum ranges which result from the correlation metric -Σ_{i=1}^{n} y_i l_i.
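The relation between the two metrics can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names and the sample vector y are illustrative assumptions. It confirms that the sign-magnitude metric equals (correlation metric + Σ|y_i|)/2 for every branch label, so both metrics rank the labels identically while the sign-magnitude range is half as wide.

```python
import itertools

def correlation_metric(label, y):
    """Negative inner product -sum(l_i * y_i); lower means closer."""
    return -sum(l * v for l, v in zip(label, y))

def sign_magnitude_metric(label, y):
    """Add |y_i| only where the label sign disagrees with sign(y_i)."""
    return sum(abs(v) for l, v in zip(label, y) if l * v < 0)

# Hypothetical unquantized demodulator outputs for one trellis branch (n = 3).
y = [0.9, -0.4, 1.7]
labels = list(itertools.product((+1, -1), repeat=len(y)))
offset = sum(abs(v) for v in y)

# Sign-magnitude metric = (correlation metric + sum|y_i|) / 2 for every label.
for lab in labels:
    expected = (correlation_metric(lab, y) + offset) / 2
    assert abs(sign_magnitude_metric(lab, y) - expected) < 1e-12

# Both metrics order the candidate labels identically.
by_corr = sorted(labels, key=lambda lab: correlation_metric(lab, y))
by_sm = sorted(labels, key=lambda lab: sign_magnitude_metric(lab, y))

# The sign-magnitude range is half the correlation range.
corr_values = [correlation_metric(lab, y) for lab in labels]
sm_values = [sign_magnitude_metric(lab, y) for lab in labels]
corr_range = max(corr_values) - min(corr_values)
sm_range = max(sm_values) - min(sm_values)
print(by_sm[0], corr_range, sm_range)
```

Because the sign-magnitude metric is never negative and is zero for the label that matches every received sign, it needs no sign bit and one fewer magnitude bit than the correlation metric.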

Let the random variable J_i be the quantized value of y_i. A decoder which uses q-bit signed integers to represent J_i could conceptually use 0, ±Δ, ±2Δ, ..., ±(2^(q-1) - 1)Δ for any real number Δ because multiplying all metrics by Δ has no effect. For a uniform quantizer with spacing Δ volts, the thresholds should be ±Δ/2, ±3Δ/2, ..., ±(2^q - 3)Δ/2. Several simulations of the NASA code using 3-bit integer branch metrics and nonuniform quantization schemes, including the ones that maximized channel capacity or cutoff rate, never produced lower BERs than using the best Δ. Furthermore, any potential gain by a nonuniform scheme would decrease rapidly with q > 3. Thus, only uniform quantization schemes, characterized by q and Δ, are considered herein.

For the above uniform quantizer and q = 3, there are seven

zones because J_i ∈ {-3, ..., +3}. To improve the quantizer

performance, zones +4 and -4 are appended as shown in

Fig. 1. In rate 1/2 decoders, a branch metric of 8 is decreased

to 7 so that q = 3 bits are still sufficient to represent the

branch metrics. As shown in the next section, this modification

leads to a lower BER than for 8 levels and standard integer

metrics [2].
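The 9-level scheme can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the decoder's actual logic: the names quantize and branch_metric, and the stepsize of 0.5 used in the usage lines, are hypothetical. The quantizer is uniform with thresholds at odd multiples of Δ/2 and saturates at ±4; the rate 1/2 sign-magnitude branch metric is capped at 7 so that it still fits in 3 bits.

```python
def quantize(y, step, max_level):
    """Uniform quantizer with thresholds at odd multiples of step/2,
    saturating at +/- max_level (max_level = 4 gives the 9-level quantizer)."""
    level = int(abs(y) / step + 0.5)   # nearest integer multiple of step
    level = min(level, max_level)      # saturation zone
    return level if y >= 0 else -level

def branch_metric(label, symbols, cap):
    """Sign-magnitude branch metric on quantized symbols, clipped to cap."""
    total = sum(abs(j) for l, j in zip(label, symbols) if l * j < 0)
    return min(total, cap)

# Usage: two received values (in units of sigma) for one rate 1/2 branch.
received = [-2.6, -3.1]
symbols = [quantize(v, step=0.5, max_level=4) for v in received]
print(symbols)                                  # both values saturate
print(branch_metric((+1, +1), symbols, cap=7))  # 4 + 4 = 8 is reduced to 7
print(branch_metric((-1, -1), symbols, cap=7))  # label agrees with both signs
```

The cap only affects the single case in which both symbols are saturated and both disagree with the branch label, which is why it costs little in performance.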

III. QUANTIZATION LOSS

In this section, quantization losses for different q are mea-

sured from simulations and compared to calculated losses in

AWGN channel cutoff rate Ro from an unquantized channel.

Fig. 1. A 9-level quantizer.

TABLE I
QUANTIZATION AND CUTOFF RATE LOSSES FOR THE MEMORY 6, RATE 1/2, NASA CODE, Eb/No = 2.25 dB

bits q   loss in cutoff rate (dB)   quantization loss measured from simulations (dB)
3        0.134                      0.14
4        0.054                      0.05
5        0.016                      <0.01
6        0.005                      <0.01

TABLE II
QUANTIZATION AND CUTOFF RATE LOSSES IN dB FOR TWO MEMORY 14 CODES, Eb/No = 0.5 dB

bits q   loss in cutoff rate (dB)   quantization loss measured from simulations (dB)
3        0.130
4        0.053                      ~0.05
5        0.015                      0.02
6        0.005                      <0.01

Define

\[
p_j = \Pr(J = j \mid +1)
    = \int_{(j-0.5)\Delta}^{(j+0.5)\Delta} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y-a)^2/2\sigma^2}\, dy,
\qquad -2^{q-1}+2 \le j \le 2^{q-1}-2 .
\]

For j = ±(2^(q-1) - 1), p_j is the above integral with limits (j - 0.5)Δ and +∞, or -∞ and (j + 0.5)Δ. To obtain the p_j's for the 9-level quantizer in Fig. 1, 2^(q-1) above must be replaced with 5. Then a measure of the channel noise level is

\[
\gamma = \sum_{j=-2^{q-1}+1}^{2^{q-1}-1} \sqrt{p_j\, p_{-j}},
\]

which is nearly 0 for high Es/No and approaches 1 at low Es/No. The binary-input, q-bit uniformly quantized, AWGN channel cutoff rate is Ro(q) = 1 - log2(1 + γ) bits per channel use. This quantity is useful for estimating the Eb/No loss with respect to the unquantized AWGN channel, for which

\[
R_0 = 1 - \log_2\bigl(1 + \exp(-E_s/N_0)\bigr).
\]

In Tables I and II, quantization losses measured from simulations are essentially the same as the cutoff rate losses. All the numbers in Table I are insensitive to Eb/No in the range 0 to 4 dB. The 0.134 dB and 0.14 dB losses shown in Table I for q = 3 are for a 9-level quantizer (Fig. 1) and 3-bit branch metrics. They are significantly less than the 0.2-0.25 dB often reported for 8 levels [2], [3], [6], [7]. The quantization and cutoff rate losses in the tables are very close, even though the rates of the three codes simulated are all above Ro for the AWGN channel at the operating Eb/No values shown.

Fig. 2. Bit error rates for different quantizers, NASA (7, 1/2) code at Eb/No = 2.25 dB. (Horizontal axis: quantizer stepsize; vertical axis: bit error rate. One curve is labeled “7 levels, double-width erasure zone.”)
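The cutoff rate expressions above are straightforward to evaluate numerically. The sketch below is one possible reading, not the authors' code: the function names are hypothetical, the AGC is assumed to be perfect (σ = 1, so a = √(2Es/No)), the stepsize 0.5 is only an illustrative value, and the loss in dB is assumed to mean the extra Es/No the quantized channel needs to reach the unquantized Ro with the stepsize held fixed in units of σ.

```python
import math

def gaussian_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def transition_probs(es_n0_db, step, max_level):
    """p_j = Pr(J = j | +1 sent); step is in units of sigma (sigma = 1)."""
    a = math.sqrt(2.0 * 10 ** (es_n0_db / 10.0))   # signal mean, a / sigma
    probs = {}
    for j in range(-max_level, max_level + 1):
        lo = -math.inf if j == -max_level else (j - 0.5) * step
        hi = math.inf if j == max_level else (j + 0.5) * step
        probs[j] = gaussian_cdf(hi - a) - gaussian_cdf(lo - a)
    return probs

def cutoff_rate_quantized(es_n0_db, step, max_level):
    """Ro(q) = 1 - log2(1 + gamma), gamma = sum_j sqrt(p_j * p_{-j})."""
    p = transition_probs(es_n0_db, step, max_level)
    gamma = sum(math.sqrt(p[j] * p[-j]) for j in p)
    return 1.0 - math.log2(1.0 + gamma)

def cutoff_rate_unquantized(es_n0_db):
    """Ro = 1 - log2(1 + exp(-Es/No))."""
    return 1.0 - math.log2(1.0 + math.exp(-(10 ** (es_n0_db / 10.0))))

def cutoff_rate_loss_db(es_n0_db, step, max_level, tol=1e-4):
    """Extra Es/No (dB) needed by the quantized channel (assumed definition).
    Bisection over a 3 dB bracket, which is assumed to contain the answer."""
    target = cutoff_rate_unquantized(es_n0_db)
    lo, hi = es_n0_db, es_n0_db + 3.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cutoff_rate_quantized(mid, step, max_level) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - es_n0_db

# Rate 1/2 code at Eb/No = 2.25 dB: Es/No = Eb/No + 10 log10(1/2).
es_n0_db = 2.25 + 10 * math.log10(0.5)
loss = cutoff_rate_loss_db(es_n0_db, step=0.5, max_level=4)   # 9 levels
print(f"estimated cutoff rate loss: {loss:.3f} dB")
```

Here max_level is 2^(q-1) - 1 for a q-bit quantizer and 4 for the 9-level quantizer of Fig. 1. Repeating the calculation over a range of stepsizes shows how the loss depends on Δ.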

IV. QUANTIZER STEPSIZE

The uniform quantizer stepsize Δ which maximizes Ro(q) for a given Eb/No almost minimizes BER and b-bit block error rate when Eb/No varies by up to 1 dB. The stepsize Δ is also called the AGC “slice” because the AGC effectively controls the quantizer level settings. Δ should be chosen to minimize BER for the lowest operating Eb/No. Then the quantizer will work well but not optimally for larger Eb/No and it will saturate at high Eb/No.

At high Es/No, the union bound may be used to calculate both quantization loss and the value of Δ which minimizes BER [5, p. 292]. However, exact calculation is tricky and depends upon the particular code, whereas quantization loss is really a function of the channel. Furthermore, using Ro(q) is sufficient, as was demonstrated in the last section.
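Choosing Δ by maximizing Ro(q) amounts to a one-dimensional search. The following sketch assumes a simple grid search, which the paper does not prescribe; the function names, the grid from 0.20 to 1.00 in units of σ, and the perfect-AGC assumption (σ = 1) are all illustrative.

```python
import math

def cutoff_rate(step, max_level, a):
    """Ro of the quantized channel; step and signal mean a in units of sigma."""
    def cdf(x):
        return 0.5 * math.erfc(-x / math.sqrt(2.0))
    gamma = 0.0
    for j in range(-max_level, max_level + 1):
        lo = -math.inf if j == -max_level else (j - 0.5) * step
        hi = math.inf if j == max_level else (j + 0.5) * step
        p_plus = cdf(hi - a) - cdf(lo - a)    # Pr(J = j | +1 sent)
        p_minus = cdf(hi + a) - cdf(lo + a)   # Pr(J = j | -1 sent)
        gamma += math.sqrt(p_plus * p_minus)
    return 1.0 - math.log2(1.0 + gamma)

def best_step(max_level, a, steps):
    """Grid point that maximizes the quantized cutoff rate."""
    return max(steps, key=lambda s: cutoff_rate(s, max_level, a))

# Rate 1/2 code at Eb/No = 2.25 dB, 9-level quantizer (max_level = 4).
es_n0 = 10 ** ((2.25 + 10 * math.log10(0.5)) / 10.0)
a = math.sqrt(2.0 * es_n0)                      # a / sigma = sqrt(2 Es/No)
steps = [0.20 + 0.01 * k for k in range(81)]    # 0.20, 0.21, ..., 1.00
best = best_step(4, a, steps)
print(f"stepsize maximizing Ro: {best:.2f} sigma = {best / a:.2f} a")
```

The last line also reports the stepsize as a fraction of the signal mean a, obtained by dividing by a/σ = √(2Es/No).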

Viterbi decoder bit error rates for several quantizers are shown in Fig. 2 for the (7, 1/2) NASA code at Eb/No = 2.25 dB. For q ≥ 4, there is negligible quantization loss. The quantizer stepsize plotted is a fraction of σ = √(No/2) because we assume that the AGC divides the received values y_i by σ. Since the AGC estimates σ in order to perform this normalization, the BER curves in Fig. 2 actually show the effect of AGC inaccuracy. Multiplying all stepsizes in Fig. 2 by (2Es/No)^(-1/2) yields values as fractions of a. The BER curve labeled “9 levels” is for the quantizer shown in Fig. 1, with branch metrics of 8 reduced to 7. The BER curve labeled “8 levels” is for the uniform quantizer described in [2].

It appears from Fig. 2 that it is safer to make the quantizer stepsize larger than the value which minimizes BER. The stepsize which maximizes Ro(q) is labelled by “Ro”. In all cases, the stepsize which maximized Ro(q) nearly minimized BER. The corresponding curves for error rates of 8-bit bytes have the same relative shape and spacing. Therefore, the quantization losses and stepsizes here also apply when a block code (e.g., 8-bit Reed-Solomon) is used outside of the convolutional code.

Fig. 3. Bit error rates for 3 nonoptimum quantizers and 3 bit branch metrics, NASA (7, 1/2) code at Eb/No = 2.25 dB. (Horizontal axis: quantizer stepsize.)

Fig. 3 shows Viterbi decoder bit error rates for three nonop-

timum quantizers and 3 bit branch metrics used with the (7,

1/2) NASA code operating at the same Eb/No of 2.25 dB

as in Fig. 2. The curve labelled “9 levels,” copied directly

from Fig. 2, is for the quantizer in Fig. 1. The BER curve

“8 levels” applies to an 8-level quantizer [6, p. 13] and 3 bit

branch metrics [6, p. 258]. The BER curve “7 levels” applies

to a quantizer with sign-magnitude 3 bit branch metrics and

zone assignments -3, -2, -1, 0, 0, +1, +2, +3 [1, p. 87].

Clearly, these last two quantizers and branch metric assign-

ment techniques cause a substantial BER increase for large

quantizer stepsizes.

V. STATE METRIC RANGE AND RENORMALIZATION IN VITERBI DECODERS

For each received n-vector of channel symbols and encoder state s, a Viterbi decoder finds the trellis path into s having the least total branch metric. The metric of this “survivor path” becomes the state metric for s and is stored inside an ℓ-bit register associated with the add-compare-select (ACS) unit for s. When received symbols are noisy, the minimum over all state metrics increases as received n-vectors are processed. So occasionally, all state metrics must be renormalized: decreased by a constant in order to avoid overflowing an ℓ-bit register. Subtracting a constant c from all state metrics when every state metric is ≥ c works but may be difficult to implement in some parallel Viterbi decoders. Renormalization can occur when the most significant bit (msb) is 1 in every state metric register because then zeroing these msbs is equivalent to subtracting 2^(ℓ-1) from every state metric. This approach is implemented in JPL's prototype, fully parallel, Big Viterbi Decoder, which contains 16384 ACS units [8].

Detecting when every state metric has msb = 1 works as follows [10]. Let the random variable M be the difference between the maximum and minimum state metrics after the ACS units have completed calculations for a trellis level. As proven later, M is upper bounded by a constant which depends only upon the code and the quantizer. Now choose ℓ as the least integer such that M < 2^(ℓ-2) and monitor the metric (μ) of any one state. When the two most significant bits of μ become 1, μ ≥ 2^(ℓ-1) + 2^(ℓ-2), so every state metric is ≥ 2^(ℓ-1) (because M < 2^(ℓ-2)), which means that the msb is 1 in all state metrics. However, 2 bits of ℓ are used for this method since ℓ = 2 + ⌈log2 M⌉, where M is the designed maximum state metric range. Furthermore, a global signal to every ACS unit is needed to zero the msb of every state metric.

Renormalization is automatic when state metrics are represented by two's complement integers and path metric arithmetic is performed modulo 2M [9]. Metrics are conceptually arranged on a circle of circumference 2M and comparisons are made using two's complement arithmetic (Fig. 4). It is possible to find the smaller path metric of a pair, even though metrics are crossing the top of the circle (i.e., renormalizing) at different times. This method is good because renormalization occurs within the ACS units (i.e., locally) and because ℓ = 1 + ⌈log2 M⌉ only. For a given M, a smaller value of ℓ seems possible only if a constant c is subtracted from all state metrics when every state metric is ≥ c. When modulo path metric arithmetic is used in ACS units, M should be made greater than the maximum possible difference between path metrics compared inside any ACS unit, a number derived below. First, the maximum difference M is determined between any two state metrics after ACS computations. Let D be the maximum, over all nonzero states s, of the least weight of any trellis path from state 0 into state s. Jmax will denote the maximum absolute value of a quantized, demodulated channel symbol. The next result follows from [1, pp. 89-92].

Fig. 4. Binary modulo metrics. If msb(L1 - L0) = 1, then L1 < L0.

Claim: For any convolutional code, M ≤ D · Jmax.

Proof: Let b and w be the states with lowest and highest metrics. Since a convolutional code is linear, there exist two trellis paths from some state e, one into state b and the other into state w, whose branch labels differ in D or fewer positions. Since the maximum contribution to a branch metric by one quantized channel symbol is Jmax, by the Viterbi algorithm the state metric of w is at most the metric of b plus D · Jmax.

The above bound is usually much better than the upper bound (k - 1)n · Jmax often referenced [5], [9]. However, neither of these bounds applies to path metrics compared inside ACS units, which is the critical point, particularly for modulo arithmetic.

Lemma: Suppose a Viterbi decoder for rate 1/n convolutional codes uses sign-magnitude quantized symbols and branch metrics. Then the difference between any two path metrics compared inside an ACS unit is at most dfree · Jmax.

Proof: At any instance of time, for each encoder state s, label b_s and w_s as the two trellis paths which end in state s. The corresponding path metrics μ(b_s) and μ(w_s) are compared by the ACS unit for state s. Let μ(b_s) ≤ μ(w_s). Now since a convolutional code is linear, adding an error event of weight dfree to b_s yields a path c_s which differs from b_s in dfree positions but which coincides with w_s on the branch from some state s_p to state s. Thus, |μ(b_s) - μ(c_s)| ≤ dfree · Jmax. By the Viterbi algorithm, w_{s_p} is the path with least metric into state s_p at the previous trellis level. Therefore, μ(w_{s_p}) ≤ μ(c_{s_p}), so μ(w_s) ≤ μ(c_s) because w and c coincide on the branch from state s_p to state s. Combining the above inequalities, μ(b_s) ≤ μ(w_s) ≤ μ(c_s) ≤ μ(b_s) + dfree · Jmax, completes the proof.

Corollary 1: Equality occurs when the demodulated channel symbols corresponding to the dfree 1's in an error event of weight dfree fall into saturation zones.

Corollary 2: For rate k/n nonpunctured convolutional codes with k > 1, dfree must be replaced by d in the above bound. d is the maximum, over all 2^k - 1 nonzero predecessor states p_s for which there is a transition branch into state 0, of the branch weight from state p_s to state 0, plus the minimum weight of any trellis path from state 0 into state p_s.

Examples: The NASA code has dfree = 10 and D = 8. For the memory 14 Galileo code, dfree = 35 and D = 33. Also, dfree = 56 and D = 50 for the memory 14, rate 1/6 code to be used by JPL in the Cassini mission. In general, D can be smaller than, equal to, or larger than dfree. Indeed, D = 4 and dfree = 3 for the rate 1/2 code with octal generators 1 and 5.

The length ℓ of state metric registers in a Viterbi decoder using modulo metrics should accommodate the maximum possible difference between two path metrics compared in an ACS unit. If ℓ < 1 + log2(dfree · Jmax), the decoder will sometimes make incorrect decisions between trellis paths. Particularly at high Eb/No, these incorrect decisions may increase BER dramatically.

Example: Consider the convolutional code with octal generators 5 and 7 and dfree = 5. For q = 4, Jmax = 7. Thus, if ℓ = 7 in a Viterbi decoder using modulo path/state metrics, then M = 64 > dfree · Jmax = 35, so all ACS decisions will be correct. Indeed, a simulation at Eb/No = 5.0 dB yields a decoded BER of 0.000089. But if ℓ is 6, then M = 32 < dfree · Jmax = 35 and the BER rises to 0.082 because the trellis path at distance dfree from the transmitted path is being chosen incorrectly over the path sent.

Design Example: A new single-board version of the Big Viterbi Decoder (BVD) has been built at JPL with 6 bit quantized input symbols, sign-magnitude branch metrics, modulo path/state metrics, and ℓ = 12 bit state metric registers. The value q = 6 was chosen to make the quantization loss negligible (0.01 dB), but more importantly, to make the decoder BER relatively insensitive to AGC inaccuracy. The prototype BVD has q = 8 to avoid AGC altogether. For a memory 14, rate 1/6 code, dfree ≤ 56. Thus, if Jmax = 31 and ℓ = 12, M = 2048 > dfree · Jmax guarantees that correct ACS decisions will always occur. On the other hand, simulations of the new memory 14, rate 1/6 code at Eb/No = 0-2 dB showed that ℓ = 11 did not cause any detectable change in BER, even though a few incorrect ACS decisions occurred.
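The register-length rule and the modulo comparison of Fig. 4 can be illustrated as follows. This is a sketch under stated assumptions, not the BVD logic: the function names are hypothetical, metrics are held as unsigned residues modulo 2^ℓ, and the true metric values 1000 and 1035 are made up so that they differ by exactly dfree · Jmax = 35, as in the example with generators 5 and 7.

```python
def register_bits(d_free, j_max):
    """Smallest register length l such that M = 2**(l-1) > d_free * j_max."""
    return 1 + (d_free * j_max).bit_length()

def modulo_less_than(m1, m0, bits):
    """Compare two metrics stored modulo 2**bits: the msb of the wrapped
    difference m1 - m0 is 1 exactly when m1 < m0, provided the true
    difference is smaller in magnitude than 2**(bits-1)."""
    diff = (m1 - m0) % (1 << bits)
    return bool(diff >> (bits - 1))

# Register lengths for the two examples in the text.
print(register_bits(5, 7))     # generators 5 and 7, q = 4
print(register_bits(56, 31))   # memory 14, rate 1/6 code, q = 6

# Hypothetical true path metrics that differ by dfree * Jmax = 35.
better, worse = 1000, 1035
for bits in (7, 6):
    mask = (1 << bits) - 1
    decision = modulo_less_than(better & mask, worse & mask, bits)
    print(bits, "bits:", "correct" if decision else "incorrect", "ACS decision")
```

With 7 bits the wrapped difference still carries the right sign, while with 6 bits the difference of 35 exceeds M = 32 and the comparison selects the wrong path, which mirrors the BER increase reported in the example.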

ACKNOWLEDGMENT

The authors thank S. Dolinar, R. J. McEliece, and L.

Swanson for their suggestions and comments. Their help and

encouragement led to the publication of this work.

REFERENCES

[1] K. S. Gilhousen, J. A. Heller, I. M. Jacobs, and A. J. Viterbi, “Coding systems study for high data rate telemetry links,” Linkabit Corp., Jet Prop. Lab., Pasadena, CA, NASA Rep. CR-114278, Jan. 1971.

[2] J. A. Heller and I. M. Jacobs, “Viterbi decoding for satellite and space communication,” IEEE Trans. Commun. Technol., vol. COM-19, pp. 835-848, Oct. 1971.

[3] J. P. Odenwalder, “Optimal decoding of convolutional codes,” Ph.D. dissertation, Univ. California, Los Angeles, 1970.

[4] J. L. Massey, “Coding and modulation in digital communications,” in Proc. Int. Sem. Digital Commun., Zurich, Switzerland, 1974, pp. E21-4.

[5] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding. New York: McGraw-Hill, 1979.

[6] G. C. Clark and J. B. Cain, Error-Correction Coding for Digital Communications. New York: Plenum, 1981.

[7] Y. Yasuda, Y. Hirata, and A. Ogawa, “Optimum soft decision for Viterbi decoding,” in Proc. 5th Int. Conf. Digital Satellite Commun., Genoa, Italy, Mar. 1981, pp. 251-258.

[8] J. Statman, G. Zimmerman, F. Pollara, and O. Collins, “A long constraint length VLSI Viterbi decoder for the DSN,” Jet Prop. Lab., Pasadena, CA, pp. 134-142, TDA Prog. Rep. 42-95, Nov. 15, 1988.

[9] A. Hekstra, “An alternative to metric rescaling in Viterbi decoders,” IEEE Trans. Commun., vol. 37, pp. 1220-1222, Nov. 1989.

[10] O. Collins, “Coding beyond the computational cutoff rate,” Ph.D. dissertation, California Inst. Technol., Pasadena, CA, 1989.
