ekt430/4 digital signal processing 2007/2008 chapter 2 finite word length effect
EKT430/4 DIGITAL SIGNAL PROCESSING 2007/2008
CHAPTER 2 FINITE WORD LENGTH EFFECT

Finite word length effect in fixed-point processing
Digital signal processors have a data bus of finite width.
When the word length of a result after an arithmetic operation exceeds the bus width, the excess bits have to be discarded. This is a source of serious errors.
We now discuss the factors that cause such errors.

Causes of word length error
The different causes of the errors are:
Run-time error, or register overflow.
Arithmetic and coefficient truncation.
Data scaling in an attempt to reduce overflow.
Zero-input limit cycling.

Fixed-point design procedure
[Flow chart: ideal floating-point design, then floating-point to fixed-point conversion, then realizable filter, then test and evaluate.]
Pass: luck.
Fail, causes:
1. Register overflow
2. Coefficient errors
3. Arithmetic errors
Options:
1. Adjust the binary point
2. Change the architecture
3. Scale the parameters

Run-time / overflow error: definition
A data word begins with the sign bit, followed by the bits from MSB to LSB, extending into the fractional part (mantissa). When the data length is larger than the bus width, the sign bit and then the most significant bits overflow the register on the sign-bit side. Because the most significant part is lost, what remains in the register is useless.

Solution for run-time / overflow error:
For fixed-point format, 2's complement arithmetic (subtraction by adding the 2's complement) should be preferred. Here the effect of overflow is least significant.
Once the sign bit reaches the overflow point of the accumulator and an overflow is detected, a flag is activated that clamps the input to the data bus by stopping further less significant bits from entering. Inaccuracy is incurred because data is lost after clamping.
Clamping is saturation.
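The difference between an unprotected overflow and clamping can be sketched as follows. This is a minimal illustration assuming an 8-bit two's-complement register; the function names `wrap` and `saturate` are hypothetical and do not come from the slides.

```python
def wrap(value, bits=8):
    """Keep only the low `bits` bits and reinterpret them as two's complement."""
    mask = (1 << bits) - 1
    value &= mask
    # If the sign bit is set, the bit pattern represents a negative number.
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def saturate(value, bits=8):
    """Clamp to the largest or smallest value the register can hold."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


total = 100 + 60            # the true sum, 160, does not fit in 8 signed bits
wrapped = wrap(total)       # overflow flips the sign
clamped = saturate(total)   # clamping holds the positive limit
print(wrapped, clamped)
```

With wrap-around the result changes sign, while saturation only loses the amount by which the sum exceeds the limit.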

Scaling to reduce overflow
Here we scale the input to reduce its dynamic range. It is at the cost of performance of the system. The methods to scale are:
RMS scaling (preferred):
$$h_s(m) = \frac{h(m)}{\sqrt{\sum_{k=0}^{N-1} h^2(k)}}$$
Absolute scaling:
$$h_s(m) = \frac{h(m)}{\sum_{k=0}^{N-1} |h(k)|}$$
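The two scaling rules can be tried on a short impulse response. This sketch assumes that RMS scaling divides each coefficient by the square root of the sum of squared coefficients, and that absolute scaling divides by the sum of absolute values. The impulse response `h` is a made-up example.

```python
import math


def rms_scale(h):
    """Divide each tap by the square root of the sum of squared taps."""
    norm = math.sqrt(sum(x * x for x in h))
    return [x / norm for x in h]


def absolute_scale(h):
    """Divide each tap by the sum of absolute tap values."""
    norm = sum(abs(x) for x in h)
    return [x / norm for x in h]


h = [0.5, -1.0, 2.0, -1.0, 0.5]   # hypothetical impulse response
h_rms = rms_scale(h)
h_abs = absolute_scale(h)
print(h_rms)
print(h_abs)
```

Absolute scaling is the more conservative of the two: with the absolute values summing to 1, the output magnitude can never exceed the largest input magnitude. RMS scaling attenuates less, so it preserves more signal level but does not rule out overflow.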

Arithmetic truncation errors
When two inputs of length N bits are multiplied, the word length of the product can grow to 2N bits. The 2N-bit product is rounded off to S (< 2N) bits. These S bits are added to another S bits, and the result is, say, T bits.
The sum is again rounded off to S bits and fed to the accumulator.
The output is truncated to M bits, the bus width.
These repeated round-offs cause additive errors.
If M < T < 2N, an arithmetic error may result. Typically, for N = 16, a 16 x 16-bit multiplication yields at most 32 bits, while the accumulator has a width of 40 bits.
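The effect of rounding inside a multiply-accumulate loop can be illustrated with integer arithmetic. The sketch below assumes 16-bit signed operands and keeps the top 16 bits of every 32-bit product. The operand values and the helper `round_shift` are illustrative only.

```python
def round_shift(value, shift):
    """Drop `shift` low bits with round-off (add half an LSB, then shift)."""
    return (value + (1 << (shift - 1))) >> shift


# Hypothetical 16-bit signed data and coefficients (N = 16).
x = [12345, -23456, 3210, 29999, -15000, 777]
c = [2222, 11111, -30000, 1234, 4321, -32000]

SHIFT = 16  # keep the top 16 bits of each 32-bit product

# Path A: round every product before it reaches the accumulator.
acc_rounded = sum(round_shift(a * b, SHIFT) for a, b in zip(x, c))

# Path B: accumulate full-precision products and round once at the end.
acc_exact = round_shift(sum(a * b for a, b in zip(x, c)), SHIFT)

print(acc_rounded, acc_exact, acc_rounded - acc_exact)
```

Each rounding contributes an error of up to half an LSB, so the two paths can differ by a few LSBs. A wide accumulator, such as the 40-bit one mentioned above, lets the hardware follow path B.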

Implementation of a first-order recursive filter
[Block diagram: the input $x_k$ and the product $A y_{k-1}$ enter an adder, giving the N-bit output $y_k = A y_{k-1} + x_k$. A storage register holds the output to supply $y_{k-1}$, which the coefficient multiplier scales by A, taken from an N-bit coefficient storage register.]

Multiply-accumulate (MAC) unit
[Block diagram: two N-bit inputs are multiplied to give a 2N-bit product, which is rounded off (r/o) to S bits. The adder output of T bits is rounded off to S bits and held in the S-bit accumulator. The accumulator output is rounded off from S to M bits for the M-bit bus.]

Coefficient quantization
A transfer function is represented by a ratio of numerator and denominator polynomials. The binary representation of the coefficients may not fit the data bus width, so quantization/truncation is required.
Stability can be sensitive to the denominator coefficients, since they set the pole positions. Selecting a suitable structure may reduce this sensitivity.

Limit cycling
In IIR filters, even when the present input is zero, the output may show undesirable variations due to past signal. This happens when the response of the unforced system does not decay to zero in the stipulated time. The larger the word length, the smaller the effect of limit cycling.
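A zero-input limit cycle is easy to reproduce with a first-order recursive filter whose product is rounded to an integer. The sketch below is only an illustration: the coefficient -7/8, the starting state 10 and the function name are assumptions, not values from the slides.

```python
def first_order_zero_input(y0, steps):
    """Iterate y[k] = round(-7/8 * y[k-1]) with zero input, in integer arithmetic."""
    y, out = y0, [y0]
    for _ in range(steps):
        # Multiply by -7, add half an LSB (4), then shift by 3 to divide by 8.
        y = (-7 * y + 4) >> 3
        out.append(y)
    return out


sequence = first_order_zero_input(10, 12)
print(sequence)
```

With exact arithmetic the output would decay towards zero, because the coefficient has magnitude below 1. With the rounded product the sequence instead settles into alternating between -3 and 3.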

Methods to control the errors:
Scale the input and/or the coefficients.
Increase the word length.
Select an alternative DSP architecture.

Example 1
After arithmetic calculations, the binary representation of the filter coefficients can turn out to be incompatible with the length of the data bus.
For a data bus of width n, the actual word length of the data may be larger, so we need to truncate it.
Example: $H(z) = 1/[1 + a z^{-1} + b z^{-2}]$, where a = -1.957558 and b = 0.995813.

Truncation (continued)
Let the bus width be n = 8. One bit is dedicated to the sign, leaving 7 bits.
If the magnitude of the coefficient exceeds 1, as for a here, one bit has to be spared for the integer part.
That leaves (n - 2) = 6 bits for the fractional part (mantissa). The coefficients are quantized with round-off.
For fixed point data processing:

Manipulation:
1. Add 0.5 to the scaled result when round-off is desired.
2. For bottom (floor), add zero.
3. For top (ceiling), add 1.0.
In each case the integer part of the sum is then kept.
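The three rules can be written as one small helper: add an offset, then keep the integer part. This is a sketch for non-negative scaled values. The function name `to_level` and the mode labels are illustrative.

```python
import math


def to_level(scaled, mode="round"):
    """Turn a scaled, non-negative coefficient into an integer level.

    Follows the recipe above: add an offset, then keep the integer part.
    """
    offset = {"round": 0.5, "floor": 0.0, "ceil": 1.0}[mode]
    return math.floor(scaled + offset)


scaled = 125.283712
levels = {mode: to_level(scaled, mode) for mode in ("round", "floor", "ceil")}
print(levels)
```

One caveat of the add-1.0 rule: a value that is already an exact integer, such as 125.0, is pushed up to 126, whereas a true ceiling would leave it at 125.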

For fixed-point data processing:
CASE 1: the coefficient has an integer part before the decimal point.
a = -1.957558. The magnitude of a is multiplied by $2^{n-2} = 64$ to yield 125.283712. Add 0.5 for round-off: 125.783712. We obtain its binary equivalent, truncated to 8 bits including the sign bit. This amounts to 10000011 = -125 (two's complement).

Case 1 (continued)
In practice, we need not convert to binary. We take the integer part and drop the fractional part. We now convert -125 back to the coefficient scale by dividing by $2^{6} = 64$. The resulting quantized coefficient is
$\tilde{a} = -125/64 = -1.953125$.

For fixed-point data processing:
CASE 2: the coefficient is a pure fraction (mantissa only). b = 0.995813.
This b is multiplied by $(2^{n-1} - 1)$. With b = 0.995813 multiplied by $2^{7} - 1 = 127$, the result is 126.468251. For round-off add 0.5: 126.968251. When converted to binary and truncated, this becomes 0111 1110, which is 126 in decimal.
The integer level is 126. Here it can be read directly from 126.468251 by dropping the fractional part.
To return to the coefficient scale, 126 is divided by $2^{7} = 128$.
The quantized coefficient b is 0.984375.
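Both cases can be collected into one routine that reproduces the numbers above. This is a sketch of the procedure as the slides describe it, assuming the integer part needs at most one bit. It keeps the slides' choice of scaling a pure fraction up by $2^{n-1} - 1$ and back down by $2^{n-1}$. The function name is hypothetical.

```python
import math


def quantize_coefficient(value, n_bits=8):
    """Quantize one coefficient with round-off, following the two cases above.

    Returns (integer level, quantized value). One of the n_bits is the sign bit.
    """
    magnitude = abs(value)
    sign = -1 if value < 0 else 1
    if magnitude >= 1.0:
        # Case 1: one bit for the integer part, n_bits - 2 fractional bits.
        scale_up = scale_down = 2 ** (n_bits - 2)
    else:
        # Case 2: pure fraction; scale up by 2^(n-1) - 1, divide back by 2^(n-1).
        scale_up = 2 ** (n_bits - 1) - 1
        scale_down = 2 ** (n_bits - 1)
    level = math.floor(magnitude * scale_up + 0.5)
    return sign * level, sign * level / scale_down


a_level, a_q = quantize_coefficient(-1.957558)
b_level, b_q = quantize_coefficient(0.995813)
print(a_level, a_q)
print(b_level, b_q)
```

For the coefficients of Example 1 this gives the levels -125 and 126 and the quantized values -1.953125 and 0.984375.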

Final solution and comments
The quantized transfer function is $H'(z) = 1/[1 + a' z^{-1} + b' z^{-2}]$, where a' = -1.953125 and b' = 0.984375. Check it for stability.
Try the Matlab functions beq = a2dR(d,n) and beq = a2dT(d,n). These functions should generate the decimal equivalent of the binary representation, in sign-magnitude form, of a decimal number d with a specified number of bits n for the fractional part, obtained by rounding off (a2dR) or by truncation (a2dT).
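One quick way to carry out the stability check is the stability-triangle test for a second-order denominator. The sketch assumes the denominator has the form $1 + a z^{-1} + b z^{-2}$; the function name is illustrative.

```python
def is_stable_second_order(a, b):
    """Check 1/(1 + a z^-1 + b z^-2) with the stability-triangle test.

    Both poles lie strictly inside the unit circle when |b| < 1 and |a| < 1 + b.
    """
    return abs(b) < 1.0 and abs(a) < 1.0 + b


stable_original = is_stable_second_order(-1.957558, 0.995813)
stable_quantized = is_stable_second_order(-1.953125, 0.984375)
print(stable_original, stable_quantized)
```

Both the original and the quantized coefficient pairs pass the test, so in this example quantization moves the poles without pushing them outside the unit circle.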

Example 2
A digital signal processor takes a word length of 7 bits plus 1 sign bit. A set of coefficients has been calculated and tabulated as follows.
Quantize the coefficients and find the values the DSP will accept in place of the calculated coefficients.

| Filter coefficient | Value |
|---|---|
| a0 = a8 | 0.7252 |
| a1 = a7 | -1.1111 |
| a2 = a6 | -0.5920 |
| a3 = a5 | 3.1993 |
| a4 | 5.5555 |

Solution 2
The number of bits excluding the sign bit is 7.
For a coefficient with an integer part and a fractional part, the number of effective bits is 6.
For a coefficient with no integer part, only a fraction, the number of effective bits is 7.
The ADC being of the successive-approximation type, the flooring scheme applies. Hence 0.5 is not added.

Solution 2 (continued)
$0.7252 \times (2^{7} - 1) = 92.1004 \rightarrow 92$; 92/128 = 0.71875
$1.1111 \times 2^{6} = 71.1104 \rightarrow 71$; 71/64 = 1.109375
$0.5920 \times (2^{7} - 1) = 75.184 \rightarrow 75$; 75/128 = 0.5859375
$3.1993 \times 2^{6} = 204.7552 \rightarrow 204$; 204/64 = 3.1875
$5.5555 \times 2^{6} = 355.552 \rightarrow 355$; 355/64 = 5.546875
Note that the levels 204 and 355 exceed 127, the largest magnitude that 7 bits can hold, so these two coefficients would need more integer bits (and fewer fractional bits) or prior scaling to fit the word.

Results in tabular form

| Filter coefficient | Value | Quantized (SAR) |
|---|---|---|
| a0 = a8 | 0.7252 | 0.71875 |
| a1 = a7 | -1.1111 | -1.109375 |
| a2 = a6 | -0.5920 | -0.5859375 |
| a3 = a5 | 3.1993 | 3.1875 |
| a4 | 5.5555 | 5.546875 |
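The table can be regenerated with a flooring quantizer. The sketch below follows the solution's rule (6 effective bits when there is an integer part, 7 for a pure fraction, no 0.5 added) and, as an extra not in the slides, reports whether each integer level fits in 7 magnitude bits. The function name is hypothetical.

```python
import math


def floor_quantize(value, n_bits=7):
    """Quantize with flooring; n_bits counts magnitude bits (sign excluded).

    Returns (quantized value, True if the integer level fits in n_bits).
    """
    magnitude = abs(value)
    sign = -1 if value < 0 else 1
    if magnitude >= 1.0:
        scale_up = scale_down = 2 ** (n_bits - 1)              # integer + fraction
    else:
        scale_up, scale_down = 2 ** n_bits - 1, 2 ** n_bits    # fraction only
    level = math.floor(magnitude * scale_up)
    fits = level <= 2 ** n_bits - 1                            # 127 for 7 bits
    return sign * level / scale_down, fits


coefficients = [0.7252, -1.1111, -0.5920, 3.1993, 5.5555]
results = [floor_quantize(c) for c in coefficients]
for c, (q, fits) in zip(coefficients, results):
    print(c, q, fits)
```

The quantized values match the table. The fit flag is false for 3.1993 and 5.5555, whose levels (204 and 355) are larger than 127 when 6 fractional bits are used.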

Example 3
A filter is represented by the transfer function
$H(z) = 1/[1 + 1.19 z^{-1} + 0.198 z^{-2}]$. The poles are at z = -0.99 and z = -0.20 (magnitudes 0.99 and 0.20), so the system is stable. The coefficients are quantized by a 6-bit unipolar round-off quantizer. Does the system remain stable?

Solution 3
(A) 1.19: the unipolar quantizer has 6 bits, excluding the sign bit. One bit is needed for the integer part, leaving 6 - 1 = 5 fractional bits. $1.19 \times 2^{5} = 38.08 \rightarrow 38$; $38/2^{5} = 1.1875$.
(B) 0.198: the unipolar quantizer has 6 bits, excluding the sign bit. $0.198 \times (2^{6} - 1) + 0.5 = 12.974 \rightarrow 12$; $12/2^{6} = 0.1875$.
The TF becomes: $\tilde{H}(z) = 1/[1 + 1.1875 z^{-1} + 0.1875 z^{-2}]$.
The TF is marginally stable, as the roots are [-1, -0.1875]: one pole lies on the unit circle.
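The pole positions before and after quantization can be confirmed by solving the quadratic $z^2 + a z + b = 0$ directly. This is a small sketch using only the standard library; the function name is illustrative.

```python
import cmath


def poles_second_order(a, b):
    """Poles of 1/(1 + a z^-1 + b z^-2): roots of z^2 + a z + b = 0."""
    root = cmath.sqrt(a * a - 4.0 * b)
    return (-a + root) / 2.0, (-a - root) / 2.0


original = poles_second_order(1.19, 0.198)
quantized = poles_second_order(1.1875, 0.1875)
print(original, max(abs(p) for p in original))
print(quantized, max(abs(p) for p in quantized))
```

The largest pole magnitude moves from about 0.99 to exactly 1, which is why the quantized filter is only marginally stable.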

Effect of quantization on performance
Problem 4
A band-pass digital filter is to be used for digital clock recovery at 4.8 kbaud with a sampling frequency of 153.6 kHz.
The filter is characterized by $H(z) = 1/[1 - 1.957558 z^{-1} + 0.995913 z^{-2}]$.
Assess the effect of 8-bit quantization on (a) the pole locations of the filter and (b) its center frequency.

Solution 4
The TF is a normalized second-order BPF, so the poles are a complex-conjugate pair. For simplicity we write
TF = $1/[1 + a z^{-1} + b z^{-2}]$, with a = -1.957558 and b = 0.995913.
The sampling frequency is $f_s$ = 153.6 kHz.
The radial distance p of the poles from the origin (their magnitude) is $p = \sqrt{b} = 0.99795$. Their angle with respect to the positive x-axis is
$\theta = \cos^{-1}\{-a/(2p)\} = \cos^{-1}\{(1.957558/2)/0.99795\} = 11.25^\circ$.
Thus the poles are located at $0.99795 \angle \pm 11.25^\circ$.

Solution 4 (continued)
The least count per degree of angle is 153.6/360 = 0.426666 kHz.
For $\theta = 11.25^\circ$, the unquantized center frequency is 11.25 x 0.426666 = 4.80 kHz.
In quantization to 8 bits, 1 bit is the sign bit, so the effective number of bits is only 7.
The rule is: for a coefficient with integer and fractional parts, the effective bits are 6; for a pure fraction, the effective bits are 7.

Solution 4 (continued)
Case 1: Quantization of a = -1.957558. Apart from the sign, it contains an integer part. Therefore $1.957558 \times 2^{6} = 125.283712 \rightarrow 125$.
Its quantized equivalent is $125/2^{6} = 1.953125$, so a' = -1.953125.
Case 2: Quantization of b = 0.995913.
Its equivalent integer is $0.995913 \times (2^{7} - 1) = 126.480951 \rightarrow 126$.
The resulting quantized value is $126/2^{7} = 0.984375$.

Solution 4 (continued)
The new value of p is $\sqrt{0.984375} = 0.992156$, and the new angle is
$\theta = \cos^{-1}\{1.953125/(2 \times 0.992156)\} = 10.17^\circ$.
The least count per degree of angle is 0.426666 kHz, so the new center frequency is 10.17 x 0.426666 = 4.339 kHz.
The quantized poles are located at $0.992156 \angle \pm 10.17^\circ$.

| Status | Location of poles | Center frequency |
|---|---|---|
| Unquantized | $0.99795 \angle \pm 11.25^\circ$ | 4.80 kHz |
| Quantized | $0.992156 \angle \pm 10.17^\circ$ | 4.34 kHz |
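The pole radius, pole angle and center frequency in this solution can be recomputed in a few lines. The sketch assumes complex-conjugate poles of $1/[1 + a z^{-1} + b z^{-2}]$, so that the radius is $\sqrt{b}$ and the angle is $\cos^{-1}\{-a/(2\sqrt{b})\}$. The function name is illustrative.

```python
import math


def pole_and_center_frequency(a, b, fs_khz):
    """Pole radius, pole angle (degrees) and center frequency (kHz)
    for 1/(1 + a z^-1 + b z^-2) with complex-conjugate poles."""
    radius = math.sqrt(b)
    angle_deg = math.degrees(math.acos(-a / (2.0 * radius)))
    return radius, angle_deg, angle_deg * fs_khz / 360.0


FS_KHZ = 153.6
unquantized = pole_and_center_frequency(-1.957558, 0.995913, FS_KHZ)
quantized = pole_and_center_frequency(-1.953125, 0.984375, FS_KHZ)
print(unquantized)
print(quantized)
```

The results agree with the table to the precision shown: the 8-bit quantization pulls the poles slightly inwards and lowers the center frequency from about 4.80 kHz to about 4.34 kHz.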