11.5.1 adaptive quantization in dpcm - the … speech coding (fig 11.15 – segment of speech which...

11.5.1 Adaptive Quantization in DPCM

Use backward adaptive quantization (a variation of the backward adaptive Jayant quantizer).

Ex 11.5.1/Pg 338: Use a backward adaptive quantizer in the DPCM system to code the speech sample (Fig 11.7), with a 3rd-order predictor and an 8-level quantizer. Use the multipliers M0 = M4 = 0.90, M1 = M5 = 0.90, M2 = M6 = 1.25, M3 = M7 = 1.75.

See Fig. 11.10/Pg 339. The reconstruction is better; however, the quantizer is not expanding rapidly enough. Increase the value of M3.

(The speech output has a large spike around sample 3500.)
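The step-size adaptation used in this example can be sketched as follows. This is only an illustration of a Jayant-style backward adaptive quantizer, not the exact system of the example: the function name `jayant_quantize`, the initial step size and the step-size limits are assumptions, and the 3rd-order predictor is left out so that only the quantizer is shown. The multipliers are the ones quoted above, indexed by output-level magnitude (M0 = M4 for the innermost level, M3 = M7 for the outermost).

```python
# Multipliers from Ex 11.5.1, indexed by level magnitude 0..3
# (M0 = M4, M1 = M5, M2 = M6, M3 = M7).
MULTIPLIERS = [0.90, 0.90, 1.25, 1.75]


def jayant_quantize(diffs, step=0.5, step_min=1e-3, step_max=1e3):
    """Quantize a sequence with an 8-level midrise Jayant quantizer.

    step, step_min and step_max are illustrative assumptions.
    Returns (reconstructed values, step size used for each sample).
    """
    recon, steps = [], []
    for d in diffs:
        # Magnitude index 0..3: which of the four intervals |d| falls in.
        level = min(int(abs(d) / step), 3)
        sign = 1.0 if d >= 0 else -1.0
        recon.append(sign * (level + 0.5) * step)
        steps.append(step)
        # Backward adaptation: the next step size depends only on the
        # level just transmitted, so the decoder can track it as well.
        step = min(max(step * MULTIPLIERS[level], step_min), step_max)
    return recon, steps
```

Small inputs land in the inner levels and shrink the step size by 0.90 per sample, while an input in the outermost level expands it by M3 = 1.75. Raising M3 makes that expansion faster, which is the remedy suggested above.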

11.10/Pg 339: Adaptive prediction in DPCM

In Fig 11.7, different speech segments have different characteristics. Adapt the predictor (forward or backward) to match the local statistics.

DPCM with forward adaptive prediction (DPCM-APF)

Forward adaptive: divide the input into segments or blocks. Speech coding: 16 ms blocks (i.e. 128 samples at 8 kHz). Image coding: (8x8) blocks.

1] Compute the autocorrelation coefficients for each block.
2] Obtain the predictor weights.
3] Quantize the predictor weights and transmit them (6 bits/weight).

Assume sample values outside each block are zero. With block length M, the autocorrelation for the l-th block is

$$R_{xx}^{(l)}(k) = \frac{1}{M-k} \sum_{i=(l-1)M+1}^{lM-k} x_i x_{i+k} \qquad (11.38)$$

$$R_{xx}^{(l)}(k) = R_{xx}^{(l)}(-k)$$
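Steps 1] and 2] can be sketched as follows. This is a minimal illustration and not the notes' own implementation: the function names are hypothetical, the predictor weights are obtained by solving the usual normal equations R a = p built from the block autocorrelation (a standard choice assumed here), and step 3] (quantizing the weights to 6 bits) is not shown. The input list `x` is 0-based, so `x[0]` plays the role of x_1.

```python
def block_autocorr(x, l, M, k):
    """R_xx^(l)(k) of Eq. (11.38) for block l (1-based) of length M."""
    k = abs(k)                    # R(k) = R(-k)
    start = (l - 1) * M           # 0-based index of sample x_{(l-1)M+1}
    total = sum(x[i] * x[i + k] for i in range(start, start + M - k))
    return total / (M - k)


def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * a[j] for j in range(r + 1, n))
        a[r] = (aug[r][n] - s) / aug[r][r]
    return a


def predictor_weights(x, l, M, order):
    """Predictor weights a_1..a_order for block l from its autocorrelation."""
    R = [block_autocorr(x, l, M, k) for k in range(order + 1)]
    A = [[R[abs(i - j)] for j in range(order)] for i in range(order)]
    return solve(A, R[1:])
```

For a first-order predictor this reduces to a_1 = R(1)/R(0) for the block.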

Ex: let l = 1, i.e. the 1st block, and first take k positive.



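For k > 0:

$$R_{xx}^{(l)}(k) = \frac{1}{M-k} \sum_{i=(l-1)M+1}^{lM} x_i x_{i+k}$$

(the terms with i > lM - k vanish because samples outside the block are taken as zero).

Let k = 1:

$$R_{xx}^{(1)}(1) = \frac{1}{M-1} \sum_{i=1}^{M-1} x_i x_{i+1} = \frac{1}{M-1} (x_1 x_2 + x_2 x_3 + \dots + x_{M-1} x_M)$$

For k negative, let k = -1:

$$R_{xx}^{(1)}(-1) = \frac{1}{M-1} \sum_{i=2}^{M} x_i x_{i-1} = \frac{1}{M-1} (x_2 x_1 + x_3 x_2 + \dots + x_M x_{M-1}) = R_{xx}^{(1)}(1)$$

DPCM with backward adaptive prediction (DPCM-APB), Pg 340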

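Assume a first-order predictor, $p_n = a_1 x_{n-1}$. The prediction error is $d_n = x_n - p_n$, so

$$d_n^2 = (x_n - a_1 x_{n-1})^2$$

The optimal value $(a_1)_{opt}$ is the value of $a_1$ at which $d_n^2$ is minimum.

When $a_1 < (a_1)_{opt}$, $\partial d_n^2 / \partial a_1$ is negative: move $a_1$ to the right.

When $a_1 > (a_1)_{opt}$, $\partial d_n^2 / \partial a_1$ is positive: move $a_1$ to the left.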




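$$a_1^{(n+1)} = a_1^{(n)} - \alpha \frac{\partial d_n^2}{\partial a_1} \qquad (11.41)$$

where $\alpha$ is a constant.

$$\frac{\partial d_n^2}{\partial a_1} = 2 (x_n - a_1 x_{n-1})(-x_{n-1}) = -2 d_n x_{n-1} \qquad (11.43)$$

Substitute (11.43) in (11.41) and absorb the factor 2 into $\alpha$:

$$a_1^{(n+1)} = a_1^{(n)} + \alpha d_n x_{n-1} \qquad (11.44)$$

Replace $d_n$ by its quantized value $\hat{d}_n$, which is also available at the decoder:

$$a_1^{(n+1)} = a_1^{(n)} + \alpha \hat{d}_n x_{n-1}$$

Extend this to an Nth-order predictor:

$$d_n^2 = \left( x_n - \sum_{i=1}^{N} a_i x_{n-i} \right)^2$$

$$\frac{\partial d_n^2}{\partial a_j} = 2 \left( x_n - \sum_{i=1}^{N} a_i x_{n-i} \right)(-x_{n-j}) = -2 d_n x_{n-j}, \quad j = 1, 2, \dots, N$$

Absorbing the 2 into $\alpha$ and taking $d_n \approx \hat{d}_n$:

$$a_j^{(n+1)} = a_j^{(n)} - \alpha \frac{\partial d_n^2}{\partial a_j} = a_j^{(n)} + \alpha \hat{d}_n x_{n-j} \qquad (11.47)$$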
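The coefficient update can be sketched inside a DPCM loop as follows. This is a minimal illustration under stated assumptions: the function name `dpcm_apb`, the fixed uniform quantizer with step size `step`, the zero initial coefficients and the default values of `alpha` and `order` are all hypothetical choices. The history holds reconstructed samples rather than the original x_{n-j}, since those are the only values a decoder has.

```python
def dpcm_apb(x, order=3, alpha=0.01, step=0.1):
    """DPCM with a backward adaptive (LMS-style) predictor.

    Returns (reconstructed samples, final predictor coefficients).
    """
    a = [0.0] * order        # predictor coefficients a_1 .. a_N
    hist = [0.0] * order     # reconstructed samples at n-1, ..., n-N
    recon = []
    for xn in x:
        p = sum(ai * hi for ai, hi in zip(a, hist))   # prediction p_n
        d = xn - p                                    # prediction error d_n
        dq = step * round(d / step)                   # quantized error
        xr = p + dq                                   # reconstructed sample
        # Update a_j <- a_j + alpha * (quantized d_n) * (sample at n-j).
        a = [ai + alpha * dq * hi for ai, hi in zip(a, hist)]
        hist = [xr] + hist[:-1]
        recon.append(xr)
    return recon, a
```

Because the update uses only the quantized error and past reconstructed samples, the decoder can run the same recursion without any side information. The step size `alpha` has to be small relative to the signal power for the recursion to stay stable.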

In vector form, with $\hat{d}_n$ the quantized prediction error:

$$A^{(n+1)} = A^{(n)} + \alpha \hat{d}_n X_{n-1} \qquad (11.49)$$

where

$$A^{(n+1)} = [a_1^{(n+1)}, a_2^{(n+1)}, \dots, a_N^{(n+1)}]^T \quad (N \times 1)$$

$$A^{(n)} = [a_1^{(n)}, a_2^{(n)}, \dots, a_N^{(n)}]^T \quad (N \times 1)$$

$$X_{n-1} = [x_{n-1}, x_{n-2}, \dots, x_{n-N}]^T \quad (N \times 1)$$

Equation (11.49) is the LMS algorithm.

1] R. Goldberg, "A Practical Handbook of Speech Coders", Boca Raton, FL: CRC Press, 1999.
2] M. Bosi and R. E. Goldberg, "Introduction to Digital Audio Coding Standards", Norwell, MA: Kluwer, 2002.
3] K. Brandenburg, "Applications of DSP to Audio & Acoustics", e-book in SEL.




11.7 Speech Coding

(Fig 11.15: segment of speech which is highly periodic)

$$R_{xx}(k) = \frac{1}{M-k} \sum_{i=1}^{M-k} x_i x_{i+k} \qquad (11.35)$$

The autocorrelation function peaks at a lag value of 47 and at multiples of 47: $R_{xx}(47)$, $R_{xx}(94)$. The pitch period is 47 samples = 47 x 125 x 10^-6 sec (about 5.9 ms) at fs = 8 kHz (16 ms of speech).

Build an outer prediction loop with a single-coefficient predictor, tau = pitch period = 47 (Fig 11.16: inner loop and outer loop).

Minimize perceptual distortion rather than the mean squared prediction error (MSPE).
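The pitch-period estimate from the autocorrelation peak can be sketched as follows. This is only an illustration: the function names and the restriction of the search to a lag range `[min_lag, max_lag]` are assumptions, introduced so that the zero-lag maximum and the multiples of the pitch period are excluded. The list `x` is 0-based, so `x[0]` plays the role of x_1 in (11.35).

```python
def autocorr(x, k):
    """R_xx(k) of Eq. (11.35) for a segment x of length M."""
    M = len(x)
    return sum(x[i] * x[i + k] for i in range(M - k)) / (M - k)


def pitch_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] at which the autocorrelation is largest."""
    return max(range(min_lag, max_lag + 1), key=lambda k: autocorr(x, k))
```

For a 128-sample segment at fs = 8 kHz, searching lags 20 to 80 would correspond to pitch frequencies between 400 Hz and 100 Hz. The returned lag is the delay tau used by the single-coefficient outer-loop predictor.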




Noise feedback coding (DPCM): N. S. Jayant and P. Noll, "Digital Coding of Waveforms", Prentice Hall, 1984.

11.7.1 G.726 ITU-T Speech Coding (ADPCM at 40, 32, 24 and 16 kbps)

Input: fs = 8 kHz (8000 samples/sec), 8 bits/sample, 64 kbps.

L     bit rate (kbps)   bits/sample   CR
31    40                5             1.6:1
15    32                4             2:1
7     24                3             2.67:1
      16                2             4:1

L = number of quantizer levels, nb = number of bits/sample, L = 2^nb - 1.
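The bit rates and compression ratios listed for G.726 follow directly from the sampling rate and the number of bits per sample. A minimal sketch of that arithmetic (the function name `adpcm_rate` is hypothetical):

```python
FS_HZ = 8000      # sampling rate of the input speech
PCM_BITS = 8      # bits/sample of the 64 kbps input


def adpcm_rate(nb):
    """Return (bit rate in kbps, compression ratio) for nb bits/sample."""
    rate_kbps = FS_HZ * nb / 1000
    input_kbps = FS_HZ * PCM_BITS / 1000
    return rate_kbps, input_kbps / rate_kbps
```

For example, nb = 3 gives 24 kbps and a compression ratio of 64/24, i.e. about 2.67:1.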

(Quantizer figure: midtread quantizer Q; the input d_k is scaled by 1/alpha_k before quantization and the output is rescaled by alpha_k. In the log domain the quantizer input is log2 d_k - log2 alpha_k.)

alpha_k = scale factor, adapted to the input.

Adaptation algorithm:

$$y(k) = \log_2 \alpha_k \qquad (11.60)$$

Input                    Sample-to-sample difference   a_l(k)   y(k)
Speech or speech-like    fluctuates considerably       1        y_u (unlocked)
Voice-band data          fluctuation is small          0        y_l (locked)

To handle both these situations use

$$y(k) = a_l(k) y_u(k-1) + (1 - a_l(k)) y_l(k-1) \qquad (11.61)$$

Speech: a_l(k) = 1, y(k) = y_u(k-1)

Voice-band data: a_l(k) = 0, y(k) = y_l(k-1)
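The scale-factor mixing of (11.61) and the log-domain quantizer input can be sketched as follows. This is only an illustration of the two formulas: the function names are hypothetical, the updates of y_u, y_l and a_l themselves are not shown, and taking the magnitude of d_k before the logarithm (with the sign handled separately) is an assumption.

```python
import math


def scale_factor(a_l, y_u_prev, y_l_prev):
    """y(k) of Eq. (11.61): mix of unlocked and locked scale factors."""
    return a_l * y_u_prev + (1.0 - a_l) * y_l_prev


def log_quantizer_input(d_k, y_k):
    """log2|d_k| - log2(alpha_k), with y_k = log2(alpha_k) as in (11.60)."""
    if d_k == 0:
        return float("-inf")   # zero difference: below every threshold
    return math.log2(abs(d_k)) - y_k
```

With a_l = 1 the result is the unlocked value y_u(k-1) (speech), with a_l = 0 it is the locked value y_l(k-1) (voice-band data), and intermediate values of a_l blend the two.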

11.8/Pg 349 Image coding




Fixed predictor. Neighbours of pixel (j, k): (j - 1, k) and (j, k - 1).

$$p_{j,k} = \begin{cases} x_{j,k-1} & k > 0 \\ x_{j-1,k} & k = 0,\ j > 0 \\ 128 & k = 0,\ j = 0 \end{cases}$$

4-level uniform quantizer; use an arithmetic coder: 1 bpp.

        DPCM (fixed)   JPEG
SNR     22.33 dB       32.52 dB
PSNR    31.42 dB       41.60 dB

Adaptive predictor (recursively indexed quantizer). Neighbours of pixel (j, k): (j - 1, k - 1), (j - 1, k) and (j, k - 1).

$$P_1 = 0.5 [x_{j-1,k} + x_{j,k-1}]$$

$$P_2 = 0.5 [x_{j-1,k-1} + x_{j,k-1}]$$

$$P_3 = 0.5 [x_{j-1,k-1} + x_{j-1,k}]$$

Median predictor:

$$p_{j,k} = \mathrm{median}(P_1, P_2, P_3)$$

1 bpp:

        Adaptive DPCM   JPEG
SNR     29.2 dB         32.52 dB
PSNR    38.28 dB        41.60 dB

Fig 11.19/Pg 351 (quantizer input-output characteristic). Out: 2.91, 2.13, 1.05, -.06. In: 0.06, 1.7, 2.58.
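The fixed and median predictors defined in this section can be sketched as follows. This is only an illustration: the function names are hypothetical, j is taken to be the row index and k the column index (an assumption inferred from the fixed predictor's handling of k = 0), and falling back to the fixed predictor on the first row and column is also an assumption. In a real DPCM loop the array would hold reconstructed pixel values.

```python
def fixed_predict(img, j, k):
    """Fixed predictor: left neighbour, else the pixel above, else 128."""
    if k > 0:
        return img[j][k - 1]
    if j > 0:
        return img[j - 1][k]
    return 128


def median_predict(img, j, k):
    """Median of the three two-pixel averages P1, P2, P3."""
    if j == 0 or k == 0:
        return fixed_predict(img, j, k)   # border fallback (assumption)
    above = img[j - 1][k]
    left = img[j][k - 1]
    diag = img[j - 1][k - 1]
    p1 = 0.5 * (above + left)
    p2 = 0.5 * (diag + left)
    p3 = 0.5 * (diag + above)
    return sorted([p1, p2, p3])[1]
```

Taking the median discards whichever of the three averages is most affected by an edge passing through the neighbourhood.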




P.S. These notes, including images, tables, figures etc., are adapted from K. Sayood, "Introduction to Data Compression", 3rd edition, Morgan Kaufmann, 2006.



