
Diss. ETH No. 10980

Sample-Rate Conversion: Algorithms and VLSI Implementation

A dissertation submitted to the
SWISS FEDERAL INSTITUTE OF TECHNOLOGY ZURICH

for the degree of
Doctor of technical sciences

presented by
FRITZ MARKUS ROTHACHER
Dipl. El.-Ing.
born 24. 8. 1965
citizen of Blumenstein BE

accepted on the recommendation of
Prof. Dr. W. Fichtner, examiner
Prof. Dr. W. Guggenbuhl, co-examiner

1995

Acknowledgements

I would like to thank my adviser, Prof. W. Fichtner, for his confidence in me and my work and for establishing a generous working environment. I enjoyed working at the Integrated Systems Laboratory over the last four years.

I am grateful to my associate advisor, Prof. W. Guggenbuhl, for reading and commenting on my thesis.

I want to thank Gunnar Lehtinen, Andreas Steiner, Dominique Muller, and Christian Siegrist for their effort in implementing the sample-rate converter during their semester and diploma theses.

I would also like to thank the staff of the Integrated Systems Lab: all secretaries for handling the administrative work, Hanspeter Mathys and Hansjorg Gisler for maintaining the logistics, the system administrators Christoph Wicki and Adam Feigin, who fixed all computer-related problems, and Hubert Kaeslin and Andreas Wieland of the Microelectronics Design Center for their support concerning VLSI design methodology and tools.

My special thanks go to Norbert Felber, who contributed many valuable ideas and support in the hardware lab. He always found time, was interested in discussing digital signal processing and related topics, and offered competent advice. I owe most of my knowledge of real-world measurement techniques to him.

Tom Heynemann, Robert Rogenmoser, Hubert Kaeslin, and Norbert Felber improved the quality of this text by their constructive proofreading.

This work would not have been possible without the support and encouragement of many people outside the ETH. In particular I would like to thank my parents for their tolerance and support.


Contents

Acknowledgements
Abstract
Zusammenfassung

1 Introduction
    1.1 Motivation
    1.2 Concept
    1.3 Related Work
    1.4 Structure of the Thesis

2 Principle in Time and Frequency Domain
    2.1 Test Signals and Sampling
    2.2 Interpolation
    2.3 Decimation
    2.4 Sample-rate Conversion for Fixed Ratios
    2.5 Sample-rate Conversion for Rational M/L
        2.5.1 Upmode
        2.5.2 Downmode
        2.5.3 Continuous Mode
    2.6 Sample-rate Conversion for Any Arbitrary Ratio

3 High-Order Digital Filters
    3.1 Required Filter Characteristics
        3.1.1 Transition Band
        3.1.2 Stopband
    3.2 Realization Considerations
        3.2.1 FIR Filter and Hold Operation
        3.2.2 FIR Filter and Lagrange Interpolation
        3.2.3 Alternatives
    3.3 Required Resources
    3.4 Synthesis of High-Order FIR Filters
        3.4.1 Specifications
        3.4.2 Design Technique
        3.4.3 Direct Synthesis
        3.4.4 Prototype Method
        3.4.5 Relaxed Specifications
    3.5 Quantization of Filter Coefficients
        3.5.1 Quantization of FIR Coefficients
        3.5.2 Quantization of Interpolated Coefficients

4 Frequency Tracking
    4.1 Requirements
    4.2 Principle
    4.3 Frequency Counter
    4.4 Digital Phase Locked Loop
        4.4.1 Phase Detector
        4.4.2 Loop Filter
    4.5 Non-uniform Resampling
    4.6 Performance Comparison
        4.6.1 Frequency Counter
        4.6.2 Digital PLL

5 Implementation
    5.1 Frequency Tracking
        5.1.2 Digital PLL
    5.2 Datapath for FIR Filter
        5.2.1 Review of Operations
        5.2.2 Finite Word-Length in Datapath
    5.3 DSP Realization Considerations
    5.4 ASIC Realization
        5.4.1 Implementation
        5.4.2 Discussion

6 Measurement Principle & Setup
    6.1 Generator
    6.2 Analyzer
        6.2.1 Frequency Matching for the FFT
        6.2.2 Windowing for the FFT
        6.2.3 Complex Windows
    6.3 Measurement Procedure

7 Results
    7.1 Frequency Tracking Measurements
        7.1.2 Digital PLL
    7.2 System Measurements
    7.3 Discussion

8 Conclusions

A List of Symbols and Integrals

B Error Caused by Hold Effect
    B.1 Sine Wave
    B.2 White Noise
    B.3 Pink Noise

C Error Caused by Linear Interpolation
    C.1 Sine Wave
    C.2 White Noise
    C.3 Pink Noise

D Performance Measurements

List of Figures
List of Tables
Curriculum Vitae
Bibliography

Abstract

Many modern signal processing tasks are performed in the digital domain. Various sample rates are used depending on the required signal quality and the available bandwidth. Sample-rate conversion is therefore inevitable to interface systems with different sample rates. In this thesis the concept of digital sample-rate conversion by integer ratios is extended to arbitrary ratios and the trade-offs of various design parameters for audio applications are discussed.

Conceptually, the source samples are interpolated to a heavily oversampled intermediate signal using a high-order digital filter. To allow a conversion between arbitrary sample rates, this intermediate signal is transformed into a pseudo-analog representation by holding the value in between the interpolated samples. The resulting continuous-time signal is then resampled at the sink sample rate.

For an infinite interpolation factor, ideal filters, and jitter-free sampling clocks, digital sample-rate conversion is lossless. In this thesis the error contributions of the various non-idealities are quantified. The synthesis of high-order (> 20000) narrow-band lowpass filters and their quantization properties are discussed. A digital PLL is presented, which allows the sink sampling moment to be measured precisely relative to the source samples while attenuating jitter of the sampling clocks. The effects of the non-uniform sampling due to clock jitter are treated.

An ASIC for fully digital sample-rate conversion has been developed, fabricated, and characterized using an all-digital measurement system. A new (complex) window technique for the FFT is introduced, which minimizes the window side effects when doing spectrum analysis of multirate discrete-time signals. Based on the results of this thesis, the design parameters for a converter with full 20-bit quality are proposed.


Zusammenfassung

Many modern signal processing tasks are solved in the digital domain. Different sample rates are used depending on the required signal quality and the available bandwidth. Sample-rate conversion is therefore unavoidable when connecting systems with different sample rates. In this doctoral thesis the concepts of digital sample-rate conversion are extended from integer ratios to arbitrary ratios. Quality and implementation effort for audio applications are examined as a function of various parameters.

In principle, the input values are interpolated to a heavily oversampled intermediate signal using a high-order digital filter. To enable conversion between arbitrary sample rates, this intermediate signal is converted into a pseudo-analog representation. To this end, the value of the intermediate signal is held between two samples. The resulting continuous-time signal is then resampled with the output clock.

For an infinitely large interpolation factor, ideal filters, and jitter-free clock signals, digital sample-rate conversion is lossless. In this doctoral thesis the error contributions of various non-idealities are estimated. The synthesis of high-order (> 20000) narrow-band lowpass filters and their properties for limited word lengths are treated. A digital PLL is presented which measures the sampling instants at the output relative to the input clock very precisely and at the same time attenuates jitter of the clock signals. The effects of non-equidistant sampling instants caused by clock jitter are also discussed.

An ASIC for exclusively digital sample-rate conversion was developed, fabricated, and measured with a fully digital measurement system. For the spectral analysis of discrete-time signals with several sample rates, a new (complex) window technique for the FFT is used, which reduces the side effects of the window to a minimum. Based on the results of this doctoral thesis, the parameters for a converter with full 20-bit quality are proposed.

1 Introduction

1.1 Motivation

Modern signal processing problems are often solved in the digital domain due to the availability of powerful VLSI circuits, which allow complex operations to be performed in real time without the well-known shortcomings of analog implementations. The source signal is transformed into the digital domain by an A/D converter. All data processing, e.g. filtering, shaping, mixing etc., is done in the digital domain and only the final result is converted back to analog. To overcome the degradation caused by successive A/D-D/A conversion, all processing blocks must have digital interfaces.

Depending on the available bandwidth of the channel, the required quality, and the data rate of the interfaces, a wide variety of sample rates is used: e.g. 48 kHz for professional digital audio systems, 44.1 kHz for consumer digital audio (CD), 32 kHz for digital satellite radio (DSR), or 44.056 kHz for video compatibility (VCR). Combining all these systems is nevertheless trouble-free if a sample-rate converter is used at each interface.

Sample-rate conversion is also used in the growing field of digital audio compression systems. Downsampling followed by data compression allows transmission of a high-quality audio signal over a 64 kbit/s ISDN telephone line [dGL94].

From source to sink, signals are repeatedly converted for editing, storage, and transmission. To retain the signal quality of the original recording, even after several cascaded converters, the conversion must be virtually transparent.

1.2 Concept

Many digital signal processing applications necessitate a conversion of the sample rate at the interface to match it to the following system. Conceptually, sample-rate conversion by non-rational ratios is accomplished in the digital domain by interpolating the incoming samples to a high enough sampling frequency (usually in the GHz range) and subsequently decimating this oversampled data to the target sample rate by choosing the appropriate samples. In practice, the problem is partitioned into a frequency tracking unit, which determines precisely the sink phase relative to the source samples, and a high-order digital interpolation/decimation filter. This filter limits the signal either to half the source or to half the sink sample rate to avoid aliasing. Two different modes of operation therefore result, depending on whether the source or the sink sample rate is larger.

Research activities of our laboratory demonstrated a way to overcome the need for two separate modes for up- and downsampling: the cutoff frequency of the interpolation/decimation filter is dynamically adjusted, according to the ratio of sink and source sample rate, by using the frequency scaling property of the Fourier transform. We call this capability for a smooth transition between up- and downsampling CONTINUOUS MODE [RW90]. Based on the CONTINUOUS MODE we proposed and realized a VLSI architecture of a sample-rate converter for arbitrary ratios [LS91]. We will refer to this implementation as SARCO (SAmple-Rate COnverter).

For the interpolation/decimation filter, the precise ratio of the source and the sink sample rates must be measured. If both sample rates are synchronous, i.e. derived from the same master clock, and if this master clock is available, both rates can be measured by a counter running with the master clock. If the master clock is not available, it can be recovered by means of a phase-locked loop (PLL), which additionally suppresses clock jitter. A buffer (FIFO) is then used for the data exchange. However, non-synchronous interfaces require a new solution.

In general three cases can be distinguished:

A. Synchronous Interfaces

B. Plesiochronous Interfaces

C. Asynchronous Interfaces

Synchronous interfaces are the easiest to handle (as shown above), but real-world problems are often plesiochronous or asynchronous. In the plesiochronous case two systems may have the same nominal sample rates, but due to tolerances or temperature drift of the two independent oscillators, the two frequencies are not phase-locked, but differ by a small, varying amount, e.g. in the range of 100 ppm. Asynchronous interfaces arise, for example, where sources with different independent sample rates are to be mixed or equipment with arbitrary sample rates has to be connected.

The frequency measurement can be realized accurately by several methods:

1. Measurement performed with a very high-frequency clock (GHz)

2. Averaging measurements performed with a medium-frequency clock

3. Determination of the precise sample phase by a (digital) phase-locked loop

For dynamically changing sample rates an optimal trade-off between precise tracking and noise suppression must be found. High-frequency measurement allows sample-rate changes to be followed immediately. Unfortunately, jitter of the sampling frequencies is not suppressed in this case, but is modulated onto the audio signal. The lowpass filters used for systems of type 2 and 3 remove high-frequency noise at the cost of slower tracking.

In this thesis the emphasis is put on a unified approach to the implementation of sample-rate conversion of audio signals with arbitrarily changing ratios. The effects of non-ideal filtering and non-uniform sampling are analyzed and their impact on the signal quality is discussed. Additionally, considerations on VLSI implementation are presented. This field of research involves a large number of digital signal processing topics, including high-order filter synthesis, digital phase-locked loops (DPLL), real-time filtering, time-varying and adaptive filters, polyphase structures, finite word-length effects, spectrum analysis, and VLSI architecture.

1.3 Related Work

The simplest solution for sample-rate conversion in the digital domain is to repeat or omit a sample from time to time to match the sample rates. This method is especially applicable for plesiochronous sample rates, but yields only low-quality results. In professional applications the problem of sample-rate conversion by arbitrary ratios is often reduced to the synchronous case by equipping all external sources, such as CD players or DAT machines, with a synchronization input. The speed of the external source is then adapted according to the system master clock.

The basic operations for synchronous sample-rate conversion (type A) from a signal processing point of view have been covered in [CR81]. [LK81, LPW82] extended this work to non-rational ratios, but distinguished two modes depending on whether the sink sample rate is larger or smaller than the source rate. Plesiochronous interfaces (type B) cannot be realized with this distinction without smooth switching between modes. A first hardware implementation has been realized following the above principles, which required some 400 integrated circuits per audio channel [Lag82]. In 1983 a theoretical exposition of many aspects of multirate digital signal processing appeared in [CR83]. It treats the advantages and disadvantages of FIR and IIR filters for interpolation and decimation by rational ratios in detail and gives many design examples. These realizations all have the disadvantage that they are not well suited for the CONTINUOUS MODE, but only for fixed sample-rate ratios. [CR83] also covered the realization of high-order filters by cascading several low-order stages, which are easier to realize. [Ram84] discussed how to reduce the storage requirements for the filter coefficients by using various interpolation methods.

Most publications on sample-rate conversion do not cover in depth the need for proper frequency tracking, although this is crucial for high-quality audio applications. [LPW82] suggested the use of a moving average filter for frequency tracking. [Sti92] used a digital PLL running at a multiple of either the source or the sink sample rate. Frequency measurement with a GHz counter has not yet been realized due to the lack of jitter suppression and technology limitations.

In addition to the prototype hardware implementation mentioned above, several sample-rate converters have been realized using general-purpose digital signal processors. [PHR91] implemented a sample-rate converter on a DSP56001. The interpolation/decimation filter has been realized by multiplying a SINC function by the Blackman-Harris window function, as suggested by [Ram82]. [CDPS91] realized a sample-rate converter between 44.1 kHz and 48.0 kHz using B-splines. A two-stage FIR interpolation, each stage by a factor of 2, is followed by a 6th-order B-spline interpolation. However, these implementations are limited to fixed sample-rate ratios.

The first commercial VLSI implementation of a sample-rate converter using the CONTINUOUS MODE has appeared recently [AK93]. A second commercial realization of a sample-rate converter by [JTG+94] follows an alternative approach, which will be discussed briefly.

1.4 Structure of the Thesis

This thesis describes theory, experience, and results that have been gained during the development and test of an application-specific integrated circuit (ASIC) for sample-rate conversion of audio signals with arbitrarily changing ratios. In particular, the error contributions of various non-idealities are discussed and the parameters for a true 20-bit converter are derived.

In Chapter 2 an overview of the basic principle for sample-rate conversion in the digital domain is given. The well-known concepts of interpolation and decimation by integer factors are applied to sample-rate conversion by rational ratios. The CONTINUOUS MODE already published by the author in [RW90], which extends the sample-rate conversion to arbitrary sample-rate ratios, is described in detail.

Chapter 3 gives insight into the strategies for the design and synthesis of very high-order (> 20 000) narrow-band lowpass filters. Several one- and multistage implementations are compared in performance, implementation cost, and required computation power. An interpolation method which reduces the storage requirements for the filter coefficients is presented. Additionally, the effects of filter coefficient quantization on the transfer function are discussed.

Chapter 4 compares several alternatives for the realization of the frequency tracking unit. The performance and cost of high-frequency measurement, averaged medium-frequency measurement, and a digital PLL are compared for synchronous, plesiochronous, and asynchronous sampling frequencies. In addition, the effects of non-uniform (jittered) sampling are discussed.

Chapter 5 evaluates the system complexity and compares ASIC and DSP implementations according to hardware requirements and costs.

In Chapter 6 the measurement principles for multirate digital audio systems and the hardware setup used for the measurements in this thesis are presented. A new window technique for the FFT is introduced, which minimizes the side effects of the window when doing spectrum analysis of multirate discrete-time signals.

Chapter 7 compares the different concepts, architectures, and implementations by measurements.

2 Principle in Time and Frequency Domain

The obvious method to transfer an audio signal between unassociated sample rates is digital-to-analog (D/A) conversion followed by analog-to-digital (A/D) conversion (Fig. 2.1).

Figure 2.1: D/A conversion followed by A/D conversion (blocks: D/A, reconstruction filter, anti-aliasing filter, A/D; clocks $f_{\mathrm{source}}$ and $f_{\mathrm{sink}}$)

This method works for any ratio of sink sample rate $f_{\mathrm{sink}}$ to source sample rate $f_{\mathrm{source}}$, but suffers from several limitations. First of all, to retain the large signal-to-noise ratio (S/N) of a digital audio signal, expensive high-quality D/A and A/D converters have to be used. Secondly, the output signal of the D/A converter must be reconstructed by an analog filter of high precision with cutoff frequency $f_{\mathrm{source}}/2$. Additionally, to fulfill the Nyquist theorem, the input signal of the A/D converter must be band-limited to $f_{\mathrm{sink}}/2$ by an anti-aliasing filter.

If the sink sample rate $f_{\mathrm{sink}}$ is larger than the source sample rate $f_{\mathrm{source}}$ (we will refer to this case as UPMODE), the second filter can be omitted. In the opposite case (DOWNMODE) the first filter is not necessary. Therefore both lowpass filters can be merged into one with cutoff frequency

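$$f_{\mathrm{cutoff}} = \begin{cases} \dfrac{f_{\mathrm{source}}}{2} & \text{if } f_{\mathrm{sink}} > f_{\mathrm{source}} \quad \text{(UPMODE)} \\[1ex] \dfrac{f_{\mathrm{sink}}}{2} & \text{otherwise} \quad \text{(DOWNMODE)} \end{cases} \tag{2.1}$$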

This reconstruction/anti-aliasing filter, which must be realized in the analog domain, needs a flat passband, a narrow transition band, and a highly attenuated stopband. The required specifications for 16-bit quality make the problem hardly solvable.
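The mode selection of Equation 2.1 can be sketched in a few lines of Python. This is only an illustration of the formula; the function name `cutoff_frequency` is an assumption introduced here and does not appear in the thesis.

```python
def cutoff_frequency(f_source: float, f_sink: float) -> float:
    """Cutoff of the merged reconstruction/anti-aliasing lowpass (Eq. 2.1).

    Hypothetical helper: returns half the smaller of the two sample rates.
    """
    if f_sink > f_source:
        return f_source / 2.0  # UPMODE: the source bandwidth is preserved
    return f_sink / 2.0        # DOWNMODE: band-limit to half the sink rate


# 44.1 kHz -> 48 kHz is UPMODE, 48 kHz -> 32 kHz is DOWNMODE
print(cutoff_frequency(44_100.0, 48_000.0))
print(cutoff_frequency(48_000.0, 32_000.0))
```

In both modes the result is simply half the minimum of the two rates, which is the form used later when the two filters are merged.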

A third disadvantage of the D/A-A/D approach, besides the need for high-quality D/A and A/D converters and the expensive analog filter, is that any jitter on the sampling clocks will translate into signal distortion and thus needs to be suppressed by additional circuitry.

We can avoid many of the above problems by performing the sample-rate conversion process in the digital domain. The A/D and D/A converters are omitted and the analog filters are replaced by digital ones. This concept will be explained in detail in the following sections.

2.1 Test Signals and Sampling

The spectrum of the example signal that is used for illustrative purposes throughout this thesis is shown in Figure 2.2.

Figure 2.2: Spectrum of example signal ($X(f)$ versus $f$, with axis marks at $\tfrac{1}{2} \cdot f_{\mathrm{sample}}$ and $f_{\mathrm{sample}}$)

It is composed of three parts: a low-frequency sine wave, a high-frequency sine wave, and a third component, which declines linearly with the frequency. This third component shall represent the spectrum of a typical audio signal, while the first two show the response for low- and high-frequency components.

For noise calculations three sine waves at 1, 10, and 20 kHz, white noise, and pink noise are used as test signals. White noise is a random signal which has an equal amount of energy per Hertz of bandwidth. Pink noise, however, is random noise delivering an equal energy per octave [Ben88]. This distribution matches the sensitivity of the ear more precisely. Figure 2.3 shows the spectral power density of pink noise and white noise respectively.

Figure 2.3: Pink noise $S(f) = \frac{1}{6f}$, white noise $S(f) = \frac{1}{\Delta f}$ (axes: $S(f)$ in dB/Hz versus $f$ in Hz from 20 Hz to 20 000 Hz; pink noise falls from -20.8 dB/Hz to -50.8 dB/Hz, white noise lies at -46.0 dB/Hz)
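As a quick plausibility check of Figure 2.3, the following sketch evaluates the pink-noise density in dB/Hz and the energy per octave. It assumes that the caption formula reads $S(f) = 1/(6f)$, which is consistent with the axis values -20.8 to -50.8 dB/Hz at 20 Hz to 20 kHz; the function names are hypothetical.

```python
import math


def pink_psd_db(f_hz: float) -> float:
    """Pink-noise power density in dB/Hz, assuming S(f) = 1 / (6 f)."""
    return 10.0 * math.log10(1.0 / (6.0 * f_hz))


def pink_band_energy(f_low: float, f_high: float) -> float:
    """Integral of 1 / (6 f) from f_low to f_high, i.e. ln(f_high / f_low) / 6."""
    return math.log(f_high / f_low) / 6.0


# Levels at the decade marks of the frequency axis
for f in (20, 200, 2_000, 20_000):
    print(f, round(pink_psd_db(f), 1))

# Every octave carries the same energy, ln(2) / 6, regardless of its position
print(pink_band_energy(20.0, 40.0), pink_band_energy(10_000.0, 20_000.0))
```

The density drops by 10 dB per decade, while the energy in any octave stays constant, which is the defining property of pink noise quoted above.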

The well-known sampling theorem states that a continuous-time signal $x(t)$ with a spectrum $X(f)$ that is band-limited to $f_{\mathrm{sample}}/2$ can be reconstructed uniquely and without error from its samples $x[n]$. The spectrum of these samples is composed of the original spectrum $X(f)$, which repeats at all multiples of the sampling frequency (Fig. 2.4). If the analog signal $x(t)$ contains components at frequencies higher than $f_{\mathrm{sample}}/2$, they will distort the sampled signal in the form of spectral fold-over (aliasing) and cannot be recovered anymore.

Figure 2.4: Spectrum of a sampled signal with sample rate $f_{\mathrm{sample}}$ (baseband and its images at $f_{\mathrm{sample}}$, $2 \cdot f_{\mathrm{sample}}$, ..., $5 \cdot f_{\mathrm{sample}}$)
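The fold-over mentioned above can be made concrete with a small sketch that computes where a single tone lands after sampling. The helper `alias_frequency` is a hypothetical name introduced only for this illustration; it relies on the fact that the sampled spectrum repeats at all multiples of the sample rate.

```python
def alias_frequency(f_hz: float, f_sample_hz: float) -> float:
    """Apparent frequency in [0, f_sample/2] of a tone at f_hz after sampling."""
    # Distance to the nearest spectral image at an integer multiple of f_sample
    nearest_image = round(f_hz / f_sample_hz) * f_sample_hz
    return abs(f_hz - nearest_image)


print(alias_frequency(10_000.0, 48_000.0))  # below f_sample/2: unchanged
print(alias_frequency(30_000.0, 48_000.0))  # above f_sample/2: folds back
```

A 30 kHz tone sampled at 48 kHz becomes indistinguishable from an 18 kHz tone, which is why the band limitation has to take place before the sampling.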

A second parameter, besides the sampling frequency, which characterizes the sampled data is the word-length used to represent the data samples. The effects of finite word-length will be neglected in the following sections, but will be discussed in Sections 3.5 and 5.2.2.

2.2 Interpolation

In this section the increase of the sample rate by an integer factor $L$ is described. In the following we refer to this process as interpolation.

Figure 2.5: Interpolation by a factor $L$ (lowpass $H_L$; sample rate raised from $f_{\mathrm{sample}}$ to $L \cdot f_{\mathrm{sample}}$)

If we increase the sample rate of a signal we can preserve the full signal content according to the sampling theorem. After the interpolation the signal spectrum repeats only at multiples of the new sample rate $L \cdot f_{\mathrm{sample}}$ (Fig. 2.5). The interpolation is accomplished by inserting $L-1$ zeros between successive samples (zero-padding) and using an (ideal) lowpass filter with

$$H_L(f) = \begin{cases} L & \text{for } 0 < f < \dfrac{f_{\mathrm{sample}}}{2} \\[1ex] 0 & \text{for } \dfrac{f_{\mathrm{sample}}}{2} < f < \dfrac{L \cdot f_{\mathrm{sample}}}{2} \end{cases} \tag{2.2}$$

The multiplication of a signal with spectrum $X(f)$ by a transfer function $H(f)$ in the frequency domain, $Y(f) = H(f) \cdot X(f)$, corresponds to a convolution of the signal in the time domain, $y(t) = h(t) * x(t)$. For the zero-padded discrete-time signal $x[\cdot]$ we get

$$y[m] = \sum_{k=-\infty}^{+\infty} h_L[m-k] \; x\!\left[\frac{k}{L}\right], \quad \text{for } k = L, 2L, 3L, \ldots \tag{2.3}$$

If we substitute $k$ by $(m \,\mathrm{div}\, L) \cdot L - n \cdot L$ we get

$$y[m] = \sum_{n=-\infty}^{+\infty} h_L\big[m - (m \,\mathrm{div}\, L) \cdot L + n \cdot L\big] \; x[m \,\mathrm{div}\, L - n] \tag{2.4}$$

$$\phantom{y[m]} = \sum_{n=-\infty}^{+\infty} h_L[n \cdot L + m \bmod L] \; x[m \,\mathrm{div}\, L - n] \tag{2.5}$$

where $m \,\mathrm{div}\, L$ is the integer portion of the division $m/L$, and $m \bmod L$ denotes the remainder.

For each output value $y[m]$ we therefore multiply and accumulate $n$ samples of the impulse response $h_L[\cdot]$ with the corresponding input samples $x[\cdot]$. Note that the samples of the impulse response $h_L[\cdot]$ used for one output value are equidistant with spacing $L$.
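The equivalence of Equations 2.3 and 2.5 can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the filter is causal and of finite length (indices 0 to `len(h) - 1`), the input is zero outside the given list, and a three-tap triangle is used as a stand-in lowpass for $L = 2$. None of the function names come from the thesis.

```python
def interpolate_direct(x, h, L):
    """Zero-pad x by the factor L and convolve with h (Eq. 2.3)."""
    padded = []
    for sample in x:
        padded.append(sample)
        padded.extend([0.0] * (L - 1))
    y = []
    for m in range(len(padded)):
        acc = 0.0
        for k in range(len(padded)):
            if 0 <= m - k < len(h):
                acc += h[m - k] * padded[k]
        y.append(acc)
    return y


def interpolate_polyphase(x, h, L):
    """Evaluate Eq. 2.5: only products with non-zero input samples are formed."""
    y = []
    for m in range(len(x) * L):
        ptr, phase = divmod(m, L)  # m div L and m mod L
        acc = 0.0
        n = 0
        while n * L + phase < len(h):  # coefficients are spaced L apart
            if ptr - n >= 0:
                acc += h[n * L + phase] * x[ptr - n]
            n += 1
        y.append(acc)
    return y


x = [1.0, 3.0, 5.0]
h = [0.5, 1.0, 0.5]  # triangle: linear interpolation for L = 2, DC gain L
print(interpolate_direct(x, h, 2))
print(interpolate_polyphase(x, h, 2))
```

Both functions return the same sequence, but the second one needs roughly $1/L$ of the multiplications, since the products with the inserted zeros are never computed.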

2.3 Decimation

To decrease the sample rate by an integer factor $M$ (decimation) we must first band-limit the signal to $f_{\mathrm{sample}}/(2M)$ (Fig. 2.6) by the lowpass filter $H_M$ (Eq. 2.6) to comply with the sampling theorem and then keep only every $M$th sample. As a result, we lose all signal content above half the target sampling frequency $f_{\mathrm{sample}}/M$.

Figure 2.6: Decimation by a factor $M$ (lowpass $H_M$; sample rate reduced from $f_{\mathrm{sample}}$ to $\tfrac{1}{M} \cdot f_{\mathrm{sample}}$)

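$$H_M(f) = \begin{cases} 1 & \text{for } 0 < f < \dfrac{f_{\mathrm{sample}}}{2M} \\[1ex] 0 & \text{for } \dfrac{f_{\mathrm{sample}}}{2M} < f < \dfrac{f_{\mathrm{sample}}}{2} \end{cases} \tag{2.6}$$

To get the decimated signal we start from an initial phase $\varphi_0$, which can be chosen arbitrarily, keep every $M$th sample and skip all other samples (Eq. 2.7). There exist therefore $M$ different sets of samples (depending on the initial phase $\varphi_0$), which all represent the same signal.

$$y[m] = \sum_{n=-\infty}^{+\infty} h_M[m \cdot M + \varphi_0 - n] \; x[n], \quad \text{for } \varphi_0 = 0, \ldots, M-1 \tag{2.7}$$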
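Equation 2.7 can be sketched directly in code. The example assumes a causal finite-length filter and uses a two-tap moving average as a crude stand-in for $H_M$; it is not the ideal lowpass of Equation 2.6, and the function name `decimate` is hypothetical.

```python
def decimate(x, h, M, phi0=0):
    """Eq. 2.7: y[m] = sum_n h[m*M + phi0 - n] * x[n] for a causal finite h."""
    if not 0 <= phi0 < M:
        raise ValueError("the initial phase must satisfy 0 <= phi0 < M")
    y = []
    m = 0
    while m * M + phi0 < len(x):
        t = m * M + phi0  # position of the kept sample at the high rate
        acc = 0.0
        for n in range(len(x)):
            if 0 <= t - n < len(h):
                acc += h[t - n] * x[n]
        y.append(acc)
        m += 1
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
h = [0.5, 0.5]  # moving average, assumed here only for illustration
print(decimate(x, h, 2, phi0=0))
print(decimate(x, h, 2, phi0=1))
```

The two calls produce the $M = 2$ different sample sets mentioned above; they differ only in the initial phase and describe the same band-limited signal.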

2.4 Sample-rate Conversion for Fixed Ratios

In the preceding sections we have considered interpolation and decimation as two separate operations. To perform sample-rate conversion by a fixed ratio $M/L$ these two operations have to be cascaded (Fig. 2.7). Note that the interpolation must precede the decimation to keep the maximum bandwidth of the source signal.

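Figure 2.7: Sample-rate conversion by a fixed ratio $M/L$ (interpolation by $L$ with $H_L$ from $f_{\mathrm{source}}$ to $L \cdot f_{\mathrm{source}}$, followed by $H_M$ and decimation by $M$ to $f_{\mathrm{sink}} = \tfrac{L}{M} \cdot f_{\mathrm{source}}$)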

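The two lowpass filters can be merged into a single one with a cutoff frequency which is half of the minimum of source and sink sample rate (Eq. 2.8).

$$H_{LM}(f) = \begin{cases} L & \text{for } 0 < f < \min\!\left(\dfrac{f_{\mathrm{source}}}{2}, \dfrac{f_{\mathrm{sink}}}{2}\right) \\[1ex] 0 & \text{otherwise} \end{cases} \tag{2.8}$$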

The interpolation of the source signal by the factor $L$ is followed by the decimation by the factor $M$. Therefore, instead of calculating the interpolated signal at the rate $L \cdot f_{\mathrm{source}}$ and keeping only every $M$th value, we can operate the interpolation/decimation filter at the rate $\tfrac{L}{M} \cdot f_{\mathrm{source}}$, as indicated by the dotted line in Figure 2.7, but we must choose the correct set of filter coefficients for each sink sample.

If we substitute $m$ in Equation 2.5 by $\varphi_{\mathrm{sink}}[m] = m \cdot M + \varphi_0$ (Eq. 2.7) we obtain the formula for sample-rate conversion by fixed integer ratios

$$y[m] = \sum_{j=-\infty}^{+\infty} h_{LM}\big[j \cdot L + \varphi_{\mathrm{sink}}[m] \bmod L\big] \; x\big[\varphi_{\mathrm{sink}}[m] \,\mathrm{div}\, L - j\big] \tag{2.9}$$

with $\varphi_{\mathrm{sink}}[m] = m \cdot M + \varphi_0$; mod: modulo division; div: integer division.

This is the all-digital equivalent to the D/A-A/D solution (Fig. 2.1, Eq. 2.1), if we use an infinitely large interpolation factor $L$.
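Equation 2.9 can be illustrated with a short sketch that computes each sink sample directly and compares the result with the naive cascade (interpolate by $L$, then keep every $M$th value). The assumptions are a causal finite-length filter, an input that is zero outside the given list, and a three-tap triangle as a stand-in lowpass; all names are hypothetical.

```python
def convert_fixed_ratio(x, h, L, M, phi0=0):
    """Eq. 2.9 for a causal finite-length filter h (indices 0 .. len(h)-1)."""
    y = []
    m = 0
    while (m * M + phi0) // L < len(x):
        phi_sink = m * M + phi0         # sink phase on the L-times finer grid
        ptr, dphi = divmod(phi_sink, L)  # phi_sink div L and phi_sink mod L
        acc = 0.0
        j = 0
        while j * L + dphi < len(h):
            if ptr - j >= 0:
                acc += h[j * L + dphi] * x[ptr - j]
            j += 1
        y.append(acc)
        m += 1
    return y


def interpolate_then_decimate(x, h, L, M, phi0=0):
    """Reference: compute every value at rate L*f_source, keep every M-th."""
    full = [
        sum(h[t - i * L] * x[i] for i in range(len(x)) if 0 <= t - i * L < len(h))
        for t in range(len(x) * L)
    ]
    return full[phi0::M]


x = [0.0, 2.0, 4.0, 6.0]
h = [0.5, 1.0, 0.5]  # linear interpolation for L = 2 (illustrative only)
print(convert_fixed_ratio(x, h, L=2, M=3))
print(interpolate_then_decimate(x, h, L=2, M=3))
```

Both routes give the same sink samples; the direct form never computes the intermediate values that the decimation would discard.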

The theoretical concept for sample-rate conversion by fixed integer ratios has already been treated in [CR83]. But for an actual realization of Equations 2.8 and 2.9 the following problems are to be solved:

1. Equation 2.8 specifies a lowpass filter with a constant passband gain and a transition band of width zero. The impulse response of such a rectangular filter with cutoff frequency $f_{\mathrm{cutoff}}$ and a sample rate $L \cdot f_{\mathrm{sample}}$ is

$$h[n] = \frac{\sin\!\left[2\pi n \dfrac{f_{\mathrm{cutoff}}}{L \cdot f_{\mathrm{source}}}\right]}{2\pi n \dfrac{f_{\mathrm{cutoff}}}{L \cdot f_{\mathrm{source}}}} = \mathrm{SINC}\!\left[n \cdot \frac{2 f_{\mathrm{cutoff}}}{L \cdot f_{\mathrm{source}}}\right] \tag{2.10}$$

and extends from $n = -\infty$ to $n = +\infty$. An actual realization can only be a high-order lowpass filter of finite length due to the causality principle. The synthesis problems for such high-order (lowpass) filters are discussed in Chapter 3. In the following we assume that the filter is of order $Q \cdot L + 1$ with a sufficiently large $Q$.

2. For a lossless sample-rate conversion we assumed an infinite stopband attenuation of the lowpass. A filter of finite length can only reach a finite stopband attenuation, however. Furthermore, the quantization of the filter coefficients $h[\cdot]$ additionally alters the filter transfer function. In Chapter 3 we will evaluate the achievable stopband attenuation of the lowpass $h[\cdot]$ and introduce the concept of quasi-floating-point (QFP) notation for the filter coefficients. The effects of finite word-length computation in the datapath are discussed in Section 5.2.

3. The cutoff frequency of the lowpass (Eq. 2.8) is not constant for variable ratios $M/L$, but is defined by

$$f_{\mathrm{cutoff}} = \begin{cases} \dfrac{f_{\mathrm{source}}}{2} & \text{if } f_{\mathrm{sink}} > f_{\mathrm{source}} \quad \text{(UPMODE)} \\[1ex] \dfrac{f_{\mathrm{sink}}}{2} & \text{otherwise} \quad \text{(DOWNMODE)} \end{cases} \tag{2.11}$$

[Pel82] distinguished two modes of operation depending on whether the sink sample rate is larger or smaller than the source rate. The same lowpass filter either runs with the source or the sink sample rate and thereby only one set of filter coefficients is needed. The disadvantage of this approach is that switching from one mode to the other requires muting of the output signal. The CONTINUOUS MODE to be presented in Section 2.5.3 overcomes this limitation.

4. For fixed sample rates $M$ and $L$ can be calculated in a straightforward manner, e.g. for $f_{\text{source}} = 48$ kHz and $f_{\text{sink}} = 44.1$ kHz we get $M = 160$ and $L = 147$. On the other hand, if the two clock sources are not synchronous their ratio can be non-rational and may even vary over time. The sample-rate converter must be able to accept any ratio of sink and source sample rate. Furthermore it must suppress short-time jitter of the sampling clocks. The different frequency tracking principles for arbitrary ratios $M/L$ are described in Chapter 4, whereas in Section 2.6 we will estimate the required interpolation factor $L$ for professional audio quality.
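The ratio computation of item 4 and the mode selection of Equation 2.11 can be illustrated with a short Python sketch. The function names `ratio_m_over_l` and `cutoff_frequency` are hypothetical and only serve this illustration; the sample rates are assumed to be exact integers in Hz.

```python
from fractions import Fraction

def ratio_m_over_l(f_source_hz, f_sink_hz):
    """Return (M, L) with M/L = f_source/f_sink in lowest terms (integer rates assumed)."""
    r = Fraction(f_source_hz, f_sink_hz)
    return r.numerator, r.denominator

def cutoff_frequency(f_source_hz, f_sink_hz):
    """Eq. 2.11: half the source rate in UPMODE, half the sink rate otherwise."""
    if f_sink_hz > f_source_hz:
        return "UPMODE", f_source_hz / 2
    return "DOWNMODE", f_sink_hz / 2

M, L = ratio_m_over_l(48000, 44100)          # the example from the text
mode, f_cutoff = cutoff_frequency(48000, 44100)
print(M, L, mode, f_cutoff)
```

For asynchronous clocks the ratio is not known as a pair of integers, so this exact reduction only applies to the fixed-rate case.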

2.5 Sample-rate Conversion for Rational M/L

In Section 2.4 we treated the sample-rate conversion for fixed ratios and stated that the cutoff frequency of the interpolation/decimation lowpass must be either $f_{\text{source}}/2$ (if $f_{\text{source}} < f_{\text{sink}}$) or $f_{\text{sink}}/2$ (Eq. 2.11). In this section we describe how the frequency scaling property of the Fourier transform can be used to adjust the cutoff frequency of the lowpass filter dynamically to allow a smooth transition between both modes.

[Figure 2.8: Sample-rate conversion by a rational ratio $M/L$. Blocks: interpolation by $L$, adaptive filter, decimation by $M$; rates $f_{\text{source}}$, $L \cdot f_{\text{source}}$, $f_{\text{sink}}$.]

Note that since $M/L$ is rational all operations can be performed on a discrete time-grid. The time unit is defined by the source sample rate as

$$t_{\text{unit}} = \frac{1}{L \cdot f_{\text{source}}} \tag{2.12}$$

Using this time unit, the source sample period is $L \cdot t_{\text{unit}}$ and the sink sample period is accordingly $M \cdot t_{\text{unit}}$.

2.5.1 Upmode

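If the sink sample rate is larger than the source sample rate we can preserve the full bandwidth of the source signal. If we assume filter $h$ to be of order $Q \cdot L + 1$, using Equation 2.9, we get

$$y[m] = \sum_{j=-Q/2}^{+Q/2} h\Big[\, j \cdot L + \overbrace{\varphi_{\text{sink}}[m] \bmod L}^{\Delta\varphi[m]} \,\Big] \; x\Big[\, \overbrace{\varphi_{\text{sink}}[m] \;\mathrm{div}\; L}^{\mathrm{ptr}[m]} - j \,\Big] \tag{2.13}$$

with $\varphi_{\text{sink}}[m] = m \cdot M + \varphi_0$.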

Figure 2.9 gives a graphical representation of Equation 2.13. Because all mathematical operations in Equation 2.13 are integer operations, we will stay on the time-grid defined in Equation 2.12.

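[Figure 2.9: UPMODE, $f_{\text{sink}} > f_{\text{source}}$, $L > M$. The drawing shows the source samples (spacing $L$), the sink samples (spacing $M$), the initial phase $\varphi_0$, the phase $\varphi_{\text{sink}}[m]$, the pointer $\mathrm{ptr}[m]$, and the impulse response $h[n]$ sampled at $\Delta\varphi[m]$, $\pm L + \Delta\varphi[m]$, $-2 \cdot L + \Delta\varphi[m]$.]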

Starting from a measured initial phase $\varphi_0$ (on the grid) the phase of the sink samples $\varphi_{\text{sink}}[m]$ is incremented by $M$ (time units) for every value $y[m]$. Each value of $\varphi_{\text{sink}}[m]$ determines both the phase of the sink sample relative to the last source sample $\Delta\varphi[m]$ as well as its position $\mathrm{ptr}[m]$. Depending on the phase difference $\Delta\varphi[m]$ one particular subset of the samples of the impulse response $h$ is taken. The samples of the impulse response have a spacing of $L$. Since the lowpass filter $h$ is of order $Q \cdot L + 1$, $L$ different subsets of $Q + 1$ values exist.
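The phase bookkeeping of Equation 2.13 can be sketched in Python as follows. This is an illustrative sketch, not the hardware algorithm of the thesis: the function name `upmode_resample` is hypothetical, the filter is assumed to be stored as $Q \cdot L + 1$ taps with the centre tap $h[0]$ in the middle of the list, and taps or source samples outside the stored range are assumed to be zero.

```python
def upmode_resample(x, h, L, M, Q, phi0=0):
    """Evaluate Eq. 2.13 for every sink sample whose phase lies inside the source block.

    x    : source samples
    h    : Q*L + 1 filter taps, h[(len(h) - 1) // 2] is the centre tap h[0] (assumption)
    L, M : source and sink sample periods in units of the time-grid
    Q    : even integer; Q + 1 taps are used per sink sample
    """
    centre = (len(h) - 1) // 2

    def h_at(n):                          # taps outside the stored range count as zero
        i = n + centre
        return h[i] if 0 <= i < len(h) else 0.0

    def x_at(n):                          # samples outside the block count as zero
        return x[n] if 0 <= n < len(x) else 0.0

    y = []
    phi_sink = phi0                       # phase of the current sink sample on the grid
    while phi_sink < len(x) * L:
        ptr, dphi = divmod(phi_sink, L)   # ptr[m] and delta-phi[m] of Eq. 2.13
        acc = 0.0
        for j in range(-Q // 2, Q // 2 + 1):
            acc += h_at(j * L + dphi) * x_at(ptr - j)
        y.append(acc)
        phi_sink += M                     # advance by one sink period
    return y

# Usage: a triangular impulse response (Q = 2) turns Eq. 2.13 into linear interpolation,
# so a ramp x[n] = n is resampled to y[m] = m * M / L away from the block edge.
L, M, Q = 3, 2, 2
h = [1 - abs(n) / L for n in range(-L, L + 1)]
y = upmode_resample([float(n) for n in range(10)], h, L, M, Q)
```

The triangular filter is chosen only because its result is easy to check by hand; a real converter would use the high-order lowpass discussed in Chapter 3.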

2.5.2 Downmode

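If the sink sample rate is smaller than the source sample rate we are not able to preserve the full frequency range of the source signal, but must limit it to $f_{\text{sink}}/2$ by the lowpass filter $h'$ (Eq. 2.11).

$$y[m] = \sum_{j=-Q/2}^{+Q/2} h'\Big[\, j \cdot L + \overbrace{\varphi_{\text{sink}}[m] \bmod L}^{\Delta\varphi[m]} \,\Big] \; x\Big[\, \overbrace{\varphi_{\text{sink}}[m] \;\mathrm{div}\; L}^{\mathrm{ptr}[m]} - j \,\Big] \tag{2.14}$$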

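In order to use the same lowpass filter coefficients as in the UPMODE ($f_{\text{cutoff}} = f_{\text{source}}/2$) we use the frequency scaling property of the Fourier transform to alter the cutoff frequency from $f_{\text{source}}/2$ to $f_{\text{sink}}/2$.

$$h'[i] = \frac{f_{\text{sink}}}{f_{\text{source}}} \, h\!\left[\frac{f_{\text{sink}}}{f_{\text{source}}} \, i\right] = \frac{L}{M} \, h\!\left[\frac{L}{M} \, i\right] \tag{2.15}$$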

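If Equations 2.14 and 2.15 are combined we get for $y[m]$ in the DOWNMODE

$$y[m] = \frac{L}{M} \sum_{j=-\frac{M}{L}\frac{Q}{2}}^{+\frac{M}{L}\frac{Q}{2}} h\Big[\, \frac{L}{M} \big( j \cdot L + \overbrace{\varphi_{\text{sink}}[m] \bmod L}^{\Delta\varphi[m]} \big) \Big] \; x\Big[\, \overbrace{\varphi_{\text{sink}}[m] \;\mathrm{div}\; L}^{\mathrm{ptr}[m]} - j \,\Big] \tag{2.16}$$

with $\varphi_{\text{sink}}[m] = m \cdot M + \varphi_0$.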

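[Figure 2.10: DOWNMODE, $f_{\text{sink}} < f_{\text{source}}$, $L < M$. The drawing shows the source samples (spacing $L$), the sink samples (spacing $M$), the initial phase $\varphi_0$, the phase $\varphi_{\text{sink}}[m]$, the pointer $\mathrm{ptr}[m]$, the impulse responses $h[n]$ and $h'[n]$, and the sampling positions $\Delta\varphi_D[m]$, $\pm L^2/M + \Delta\varphi_D[m]$, $-2 \cdot L^2/M + \Delta\varphi_D[m]$.]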

Figure 2.10 illustrates Equation 2.16. The phase $\varphi_{\text{sink}}[m]$ for each sink sample $y[m]$ is calculated in the same way as in the UPMODE, but the filter coefficients must be scaled by $L/M$. The effective phase difference $\Delta\varphi_D[m]$ is therefore $L/M \cdot (\varphi_{\text{sink}}[m] \bmod L)$ and the samples of the impulse response have a spacing of $L^2/M$. Since $L/M$ determines only the cutoff frequency of the decimation filter we can alter it slightly so that $L^2/M$ is an integer and stays on the time-grid defined in Equation 2.12.

The number of multiplications and accumulations ($\sum x[\cdot] \, h[\cdot]$) increases from $Q + 1$ in the UPMODE to $Q \cdot M/L + 1$ in the DOWNMODE.

2.5.3 Continuous Mode

The CONTINUOUS MODE combines both UPMODE and DOWNMODE. Since the cutoff frequency of the lowpass filter is adapted continually, there is a smooth transition between both modes.

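$$y[m] = \varrho \sum_{j=-Q_D/2}^{+Q_D/2} h\big[\, j \cdot L \cdot \varrho + \Delta\varphi_D[m] \,\big] \; x\big[\, \mathrm{ptr}[m] - j \,\big] \tag{2.17}$$

with

| | UPMODE | DOWNMODE |
|---|---|---|
| Condition | if $L > M$, i.e. $f_{\text{sink}} > f_{\text{source}}$ | if $L < M$, i.e. $f_{\text{sink}} < f_{\text{source}}$ |
| $\varphi_{\text{sink}}[m]$ | $m \cdot M + \varphi_0$ | $m \cdot M + \varphi_0$ |
| $\mathrm{ptr}[m]$ | $\varphi_{\text{sink}}[m] \;\mathrm{div}\; L$ | $\varphi_{\text{sink}}[m] \;\mathrm{div}\; L$ |
| $\varrho$ | $1$ | $L/M$ |
| $\Delta\varphi_D[m]$ | $\varphi_{\text{sink}}[m] \bmod L$ | $L/M \cdot (\varphi_{\text{sink}}[m] \bmod L)$ |
| $Q_D$ | $Q$ | $M/L \cdot Q$ |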

Recall that to stay on the time-grid, $\varrho$ must be altered slightly so that $L^2/M$ is an integer. Additionally, $Q_D$ must be rounded up to the next even integer and the initial phase $\Delta\varphi_D[0] = L/M \cdot \varphi_0$ must be rounded to the next integer.
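The parameter selection for Equation 2.17 can be sketched as below. The text only states that $\varrho$ is altered "slightly"; rounding $L^2/M$ to the nearest integer, treating $L = M$ like the UPMODE, and the function name `continuous_mode_parameters` are assumptions made for this illustration.

```python
def continuous_mode_parameters(L, M, Q):
    """Return (rho, spacing, Q_D) for Eq. 2.17.

    spacing is the distance L * rho between the filter taps that are actually used;
    it is forced onto the integer time-grid. Q is assumed to be even.
    """
    if L >= M:                                      # UPMODE (L == M handled here by assumption)
        return 1.0, L, Q
    spacing = max(1, (2 * L * L + M) // (2 * M))    # L^2 / M rounded to the nearest integer
    rho = spacing / L                               # slightly altered L / M
    q_d = -(-Q * L // spacing)                      # ceil(Q / rho) using integer arithmetic
    if q_d % 2:                                     # round up to the next even integer
        q_d += 1
    return rho, spacing, q_d

# 48 kHz to 44.1 kHz (L = 147, M = 160): L^2 / M = 135.06 is moved to a spacing of 135
rho, spacing, q_d = continuous_mode_parameters(147, 160, 100)
```

Because the spacing is rounded, the effective cutoff frequency differs slightly from $f_{\text{sink}}/2$, which is the alteration the text refers to.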

2.6 Sample-rate Conversion for Any Arbitrary Ratio

Up to now we have assumed that the ratio of source to sink sampling frequency is a rational number and can consequently be written as $M/L$. If we choose $L$ large enough we can approximate any rational number to a sufficient precision. However there are practical implementation limits for $L$. In this section we will estimate the minimum required order of $L$ for 18- and 20-bit quality.

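[Figure 2.11: Continuous-time representation $y(t)$ by holding the value in-between samples $\hat{x}[n]$. The drawing shows $x(t)$, the interpolated samples $\hat{x}[n]$ with spacing $T = 1/(L \cdot f_{\text{source}})$, and the held signal $y(t)$.]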

If the source and the sink sample rates have an arbitrary (or even varying) ratio, we need to change from the discrete time-grid (Eq. 2.12) to a continuous-time representation. This can be accomplished by interpolating the source signal with a large, but fixed interpolation factor $L$ and holding the value in-between the interpolated samples (Fig. 2.11). This continuous-time signal is then resampled at the sink sample rate.

Note that instead of holding the sample value between successive samples, linear or higher order interpolation between samples could be used as well. However, since the available time-resolution is limited, these operations must be followed by a hold operation to get a continuous-time signal. We will discuss this approach further in Chapter 3.

Figure 2.12 shows a block diagram of the required operations using the hold operation. The source signal $x[n]$ is interpolated to $\hat{x}[n]$ and then converted to the continuous-time signal $y(t)$ by the hold operation. $y(t)$ can then be resampled at any required moment.
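A minimal sketch of the hold and re-sample steps, assuming the interpolated signal is already available as a list; the function name `hold_and_resample` and its parameters are hypothetical and not taken from the thesis.

```python
import math

def hold_and_resample(x_hat, L, f_source, f_sink, num_out):
    """Resample the held signal y(t) built from x_hat at the sink sample rate.

    x_hat   : source signal already interpolated by the factor L
    num_out : number of sink samples to produce
    """
    y = []
    for m in range(num_out):
        # index of the interpolated sample that is held at the instant t = m / f_sink
        k = math.floor(m * L * f_source / f_sink)
        y.append(x_hat[min(k, len(x_hat) - 1)])   # clamp at the end of the block
    return y

# Usage: with L = 4, 48 kHz -> 32 kHz every sixth interpolated sample is picked
print(hold_and_resample(list(range(100)), 4, 48000, 32000, 5))
```

The floor operation is what makes this a hold: every sampling instant between two interpolated samples returns the earlier one, which causes the error analysed in the remainder of this section.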

[Figure 2.12: Sample-rate conversion by an arbitrary ratio $M/L$. Blocks: interpolation by $L$ and adaptive filter (discrete time, $x[n] \to \hat{x}[n]$), hold and re-sample (continuous time, $y(t) \to y[m]$); rates $f_{\text{source}}$, $L \cdot f_{\text{source}}$, $f_{\text{sink}}$.]

The hold operation of the value of the interpolated source signal during the time $T = 1/(L \cdot f_{\text{source}})$ can be expressed in the time domain as a convolution of the interpolated samples with a rectangle of width $T$ (Eq. 2.18).

$$y(t) = \hat{x}[n] \ast \mathrm{RECT}\left(\frac{1}{L \cdot f_{\text{source}}}\right) \tag{2.18}$$

This convolution in the time domain corresponds to a multiplication with a SINC function in the frequency domain (Eq. 2.19).

$$Y(f) = \hat{X}[f] \cdot H_{\text{Hold}}(f) = \hat{X}[f] \cdot \mathrm{SINC}\left(\frac{f}{L \cdot f_{\text{source}}}\right) \tag{2.19}$$

After interpolation by a factor $L$ the spectrum of the source signal repeats at all multiples of $L \cdot f_{\text{source}}$. Figure 2.13 shows both the spectrum of the interpolated source signal and the transform of the hold operation. The resulting spectrum is the multiplication of both.

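[Figure 2.13: Interpolation by $L$ and hold operation. The drawing shows $H_{\text{Hold}}$ over the frequency axis with marks at $1 \cdot f_i = L \cdot f_{\text{source}}$, $2 \cdot f_i = 2 \cdot L \cdot f_{\text{source}}$, and $3 \cdot f_i = 3 \cdot L \cdot f_{\text{source}}$.]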

The baseband signal up to $f_{\text{source}}/2$ is left (almost) unchanged by the hold operation, while all remaining signal components at multiples of $f_i = L \cdot f_{\text{source}}$ are attenuated by the SINC. They will be folded back into the baseband after the decimation. If $f_{\text{source}}$ and $f_{\text{sink}}$ are uncorrelated we can assume that the signal amplitudes of different folding products that are folded back to the same location in the baseband neither add nor totally cancel each other. Therefore their energy can be summed up. The total error power due to the hold operation for a full-scale signal is listed in Table 2.1 for five different signals. The derivation of these values can be found in Appendix B. Note that these errors correspond to full-scale signals, and scale down proportionally for smaller signal levels.

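| Error in [dB] for $L =$ | $2^k$ | $2^{16}$ | $2^{18}$ | $2^{20}$ |
|---|---|---|---|---|
| Sine @ 1 kHz | $-28.5 - k \cdot 6.02$ | $-124.8$ | $-136.9$ | $-148.9$ |
| Sine @ 10 kHz | $-8.5 - k \cdot 6.02$ | $-104.8$ | $-116.9$ | $-128.9$ |
| Sine @ 20 kHz | $-2.4 - k \cdot 6.02$ | $-98.7$ | $-110.8$ | $-122.8$ |
| White Noise | $-8.6 - k \cdot 6.02$ | $-104.9$ | $-117.0$ | $-129.0$ |
| Pink Noise | $-13.2 - k \cdot 6.02$ | $-109.5$ | $-121.6$ | $-133.6$ |

Table 2.1: Error caused by hold operation ($f_{\text{source}} = 48$ kHz)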

The largest portion of the error is contributed by the components at the first zero crossing of the SINC function (Fig. 2.13, $f = f_i$). If the audio signal is a sine wave with frequency $f_{\text{sig}}$ we get two contributions around $f_i$ at $f = L \cdot f_{\text{source}} \pm f_{\text{sig}}$ with an amplitude of $f_{\text{sig}}/(L \cdot f_{\text{source}})$ (Eq. B.2). These two components contain about 60 % of the total error energy due to the hold effect. I.e. for a 1 kHz (20 kHz) signal sampled at $f_{\text{source}} = 48$ kHz and an interpolation factor of $L = 2^{16}$ the two error peaks appear 5.2 dB below the value given in Table 2.1 at $-130$ dB ($-104$ dB).
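The peak levels quoted above can be checked numerically. The sketch below uses the amplitude $f_{\text{sig}}/(L \cdot f_{\text{source}})$ from the text and, for the total, the hold summand $\frac{\pi^2}{3}\left(f_{\text{sig}}/(L \cdot f_{\text{source}})\right)^2$ that appears as the third term of Equation 3.6; the function names are hypothetical.

```python
import math

def hold_peak_db(f_sig, f_source, L):
    """Level of one alias component at L*f_source +/- f_sig, relative to full scale."""
    return 20 * math.log10(f_sig / (L * f_source))

def hold_total_sine_db(f_sig, f_source, L):
    """Total hold error for a full-scale sine (third summand of Eq. 3.6)."""
    return 10 * math.log10(math.pi ** 2 / 3 * (f_sig / (L * f_source)) ** 2)

L = 2 ** 16
for f_sig in (1000, 20000):
    peak = hold_peak_db(f_sig, 48000, L)
    total = hold_total_sine_db(f_sig, 48000, L)
    print(f_sig, round(peak, 1), round(total, 1), round(total - peak, 1))

# Energy share of the two peaks next to the first zero crossing: 2 / (pi^2 / 3)
share = 6 / math.pi ** 2
```

Under these formulas the difference between total and peak is $10 \log_{10}(\pi^2/3) \approx 5.2$ dB for every signal frequency, and the two peaks together carry $6/\pi^2 \approx 61$ % of the energy, in line with the "about 60 %" stated above.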

In this section we have calculated the error introduced due to an arbitrary, non-rational ratio of source and sink sample rate depending on the interpolation factor and various source signals. Thereby we presumed the sample-rate ratio and thus the resampling moment to be known precisely. We will discuss this assumption further in Chapter 4, after the presentation of the frequency tracking unit.

For performance and cost estimations in the following chapters we will use two different interpolation factors:

1. For $L = 2^{16}$ 18-bit quality can be reached for a pink noise audio signal, but for full-scale signals close to $f_{\text{source}}/2$ only 16-bit accuracy is possible.

2. For $L = 2^{20}$ the sum of all error components is below $-122$ dB for all five signals shown in Table 2.1. 20-bit quality can therefore be achieved over the full audio bandwidth.

3 High-Order Digital Filters

In Chapter 2 we assumed that the interpolation/decimation lowpass has a constant passband gain, a transition band of width zero, and an infinite stopband suppression (Eq. 2.8). In this chapter we will estimate the minimally required filter specifications for professional digital audio applications and discuss various realization methods. Furthermore the quantization properties of high-order narrow-band lowpass filters are discussed.

As shown in Section 2.5.3, the CONTINUOUS MODE uses the same filter coefficients for the UPMODE and the DOWNMODE. In the following sections we will therefore design the filter and estimate the performance for the UPMODE. We will discuss the implications on the DOWNMODE in Section 3.3.

3.1 Required Filter Characteristics

The interpolation/decimation filter is specified by a passband from 0 to $f_p$ with a ripple smaller than $\delta_p$ and a stopband that starts at $f_s$ with an attenuation of at least $\delta_s$ (Fig. 3.1).


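[Figure 3.1: Filter with finite stopband attenuation. The drawing shows the passband ripple $\delta_p$, the stopband level $\delta_s$, the band edges $f_p$ and $f_s$, and the frequencies $f_{\text{source}}$ and $f_i = L \cdot f_{\text{source}}$.]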

3.1.1 Transition Band

The transition band of the filter must be narrow, because signal components just below $f_{\text{source}}/2$ must pass unaltered, whereas those just above must be sufficiently suppressed. For high-quality audio $f_p = 15/32 \cdot f_{\text{source}}$ and $f_s = 17/32 \cdot f_{\text{source}}$ is chosen [Pel82]. Note that the transition band is symmetric with respect to $f_{\text{source}}/2$ and will therefore not alias into the baseband (below $f_p = 15/32 \cdot f_{\text{source}}$). Table 3.1 gives the borders of the transition band for the three most common sampling frequencies. With 48 kHz for example, the passband extends up to 22.5 kHz.

| $f_{\text{source}}$ | $f_p$ | $f_s$ |
|---|---|---|
| 48.0 kHz | 22.5 kHz | 25.5 kHz |
| 44.1 kHz | 20.7 kHz | 23.4 kHz |
| 32.0 kHz | 15.0 kHz | 17.0 kHz |

Table 3.1: Transition bandwidth for common sampling frequencies

3.1.2 Stopband

In Section 2.3 we stated that the interpolated signal is to be band-limited to half the target sample rate before the decimation takes place. This requirement is only approximated by a filter with a finite stopband attenuation $\delta_s$ (Fig. 3.1). The remaining stopband energy (from $f_{\text{source}}/2$ upwards) of the higher order images is therefore attenuated by $\delta_s$ and is then folded back into the baseband.

The alias error of the residual signals at multiples of $f_i = L \cdot f_{\text{source}}$ caused by the hold operation has already been discussed in Section 2.6 (Fig. 2.13, Table 2.1). This contribution is not affected by the finite stopband attenuation and is therefore left blank in Figure 3.2.

[Figure 3.2: Weighted stopband error. The drawing shows the stopband level $\delta_s$ weighted by the SINC function, with marks at $1 \cdot f_i$, $2 \cdot f_i$, and $3 \cdot f_i$.]

As mentioned above, an additional error contribution results from the non-zero stopband which is weighted by the SINC function of the hold operation (Fig. 3.2). For a large interpolation factor $L$ the signal energy is approximately equally distributed over the stopband since the signal spectrum is repeated many times. If we further assume that the filter has a constant stopband attenuation $\delta_s$ the total stopband energy is obtained by

$$E_{\text{stop}} = 2 \sum_{j=0}^{\infty} \int_{j f_i + \frac{f_{\text{source}}}{2}}^{(j+1) f_i - \frac{f_{\text{source}}}{2}} \frac{(\delta_s)^2}{f_{\text{source}}} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{2} df \tag{3.1}$$

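Assuming the same error energy density in the (narrow) passband as in the stopband right through, Equation 3.1 can be simplified (using Eq. A.2) to

$$E_{\text{stop}} \approx 2 \int_{0}^{\infty} \frac{(\delta_s)^2}{f_{\text{source}}} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{2} df = 2 \, \frac{(\delta_s)^2}{f_{\text{source}}} \cdot \frac{f_i}{2} = L \cdot (\delta_s)^2 \tag{3.2}$$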

If the error introduced by the finite stopband attenuation should be smaller than $-110$ dB and $L = 2^{16}$ (hold error at $-110$ dB for pink noise; Table 2.1) then $\delta_s$ must be less than $-160$ dB (Eq. 3.2). For a stopband error of $-130$ dB and $L = 2^{20}$, $\delta_s$ must even be below $-190$ dB.
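Solving Equation 3.2 for $\delta_s$ is a one-line calculation in dB. The sketch below, with the hypothetical function name `required_stopband_db`, reproduces the orders of magnitude quoted above.

```python
import math

def required_stopband_db(e_stop_db, L):
    """Solve E_stop = L * delta_s^2 (Eq. 3.2) for delta_s, all quantities in dB."""
    return e_stop_db - 10 * math.log10(L)

print(round(required_stopband_db(-110, 2 ** 16), 1))   # roughly -158 dB
print(round(required_stopband_db(-130, 2 ** 20), 1))   # roughly -190 dB
```

Each doubling of $L$ tightens the requirement on $\delta_s$ by about 3 dB, because the number of folded stopband images doubles.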

3.2 Realization Considerations

In this section we discuss several alternatives for the realization of the interpolation/decimation filter specified in the previous section. In Section 3.2.1 the order and computational cost of a single stage FIR filter is estimated. This filter is followed by the hold operation and the decimation. In Section 3.2.2 the hold operation is replaced by linear interpolation, allowing a lower order filter. In Section 3.2.3 alternative implementations are discussed briefly.

3.2.1 FIR Filter and Hold Operation

The most direct implementation is the use of a single stage FIR filter followed by the hold operation as depicted in Figure 3.3. What is the order of a single stage FIR filter, which fulfills the specifications given in Section 3.1?

[Figure 3.3: Single stage FIR filter followed by hold operation. Blocks: interpolation by $L$ and single stage FIR (discrete time), hold and re-sample (continuous time); rates $f_{\text{source}}$, $f_{\text{sink}}$.]

[Rab73] derived an empirical formula to estimate the order of a Chebyshev lowpass filter for specified $\delta_p$, $\delta_s$, $f_p$, $f_s$. Table 3.2 gives the approximate order $N$ of the lowpass filter according to this formula for various values of $E_{\text{stop}}$, $L$, and $\delta_p$. The parameters $\delta_s$, $f_p$, $f_s$ are calculated according to Equation 3.3.

$$\delta_s = \sqrt{E_{\text{stop}}/L}; \qquad f_p = \frac{15}{32} \cdot \frac{1}{L}; \qquad f_s = \frac{17}{32} \cdot \frac{1}{L} \tag{3.3}$$

Assuming a decent passband ripple of 0.01 dB, a filter of order 6 600 000 is required. If the passband ripple is reduced to 0.001 dB, the filter order increases only slightly to $7.5 \cdot 10^6$. A filter of order $8.2 \cdot 10^6$ results, if the stopband error is reduced to $-130$ dB. If the interpolation factor is increased to lower the influence of the hold operation the filter order increases rapidly.

| $E_{\text{stop}}$ [dB] | $L$ | $\delta_p$ [dB] | $\delta_s$ [dB] | $N$ | $Q$ |
|---|---|---|---|---|---|
| $-110$ | $2^{16}$ | 0.01 | $-158$ | $6.6 \cdot 10^6$ | 102 |
| $-110$ | $2^{16}$ | 0.001 | $-158$ | $7.5 \cdot 10^6$ | 115 |
| $-130$ | $2^{16}$ | 0.001 | $-178$ | $8.2 \cdot 10^6$ | 126 |
| $-130$ | $2^{18}$ | 0.001 | $-184$ | $33.9 \cdot 10^6$ | 129 |
| $-130$ | $2^{20}$ | 0.001 | $-190$ | $138.9 \cdot 10^6$ | 132 |

Table 3.2: Estimated filter order

Recall that to compute one sink sample only a subset of $Q$ coefficients of the impulse response of the filter is taken ($Q = N/L$, Sec. 2.5.1). The computational cost (which is proportional to $Q$) is consequently about the same independent of the interpolation factor (last three rows of Table 3.2). However, the number of filter coefficients that must be stored increases proportionally to the interpolation factor. For $L = 2^{20}$ over $10^8$ filter coefficients must be stored. Additionally, direct synthesis of a lowpass filter of order $10^8$ is clearly beyond practical feasibility.
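The order of magnitude of $N$ and $Q$ can be cross-checked with a closed-form estimate. The sketch below does not use the empirical formula of [Rab73] on which Table 3.2 is based; it substitutes Kaiser's simpler estimate $N \approx (-20 \log_{10}\sqrt{\delta_p \delta_s} - 13)/(14.6 \, \Delta f)$, so the results are expected to agree with the table only approximately. The function name `kaiser_fir_order` is hypothetical.

```python
import math

def kaiser_fir_order(ripple_db, stopband_db, transition_width):
    """Kaiser's estimate of the order of an equiripple lowpass FIR filter.

    ripple_db        : passband ripple in dB (e.g. 0.001)
    stopband_db      : stopband level in dB, negative (e.g. -158)
    transition_width : (f_s - f_p) normalised to the sample rate of the filter
    """
    dp = 10 ** (ripple_db / 20) - 1          # linear passband deviation
    ds = 10 ** (stopband_db / 20)            # linear stopband deviation
    return (-20 * math.log10(math.sqrt(dp * ds)) - 13) / (14.6 * transition_width)

L = 2 ** 16
N = kaiser_fir_order(0.001, -158, (17 / 32 - 15 / 32) / L)   # band edges of Eq. 3.3
Q = N / L                                                    # taps per sink sample
print(round(N), round(Q))
```

With these inputs the estimate lands close to the second row of Table 3.2, which illustrates that the transition width $1/(16 L)$ is what drives the order into the millions.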

3.2.2 FIR Filter and Lagrange Interpolation

3.2.2.1 Principle

In Section 2.6 we used the hold operation to convert the highly interpolated source signal to a continuous-time signal, which was then resampled at the sink sample rate. Instead of holding the value between two consecutive samples of the interpolated signal, we now use first order Lagrange interpolation (from now on simply called linear interpolation) between the two samples to calculate the sink value (Fig. 3.4).

The frequency response of a linear interpolator has a $\mathrm{SINC}^2$ characteristic. The noise suppression around the zero crossings of a $\mathrm{SINC}^2$ is much higher compared to the SINC function of the hold operation. Therefore a substantially smaller interpolation factor is sufficient. The error energy around the zero crossings for a given interpolation factor $L_1 = 2^k$ is shown in Table 3.3 and derived in Appendix C. If the error caused by the linear interpolation (Table 3.3) is compared to the error caused by the hold operation (Table 2.1) it can be seen that $L_1 = 2^7$ corresponds to $L = 2^{16}$ and $L_1 = 2^9$ corresponds to $L = 2^{20}$. The interpolation factor can therefore be reduced drastically, if the hold operation is replaced by linear interpolation.

[Figure 3.4: FIR filter followed by linear interpolation. Blocks: interpolation by $L_1$ and FIR filter (discrete time), linear interpolation and re-sample (continuous time); rates $f_{\text{source}}$, $f_{\text{sink}}$.]

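| Error in [dB] for $L_1 =$ | $2^k$ | $2^7$ | $2^8$ | $2^9$ |
|---|---|---|---|---|
| Sine @ 1 kHz | $-63.9 - k \cdot 12.04$ | $-148.2$ | $-160.2$ | $-172.3$ |
| Sine @ 10 kHz | $-23.9 - k \cdot 12.04$ | $-108.2$ | $-120.2$ | $-132.3$ |
| Sine @ 20 kHz | $-11.9 - k \cdot 12.04$ | $-96.2$ | $-108.2$ | $-120.3$ |
| White Noise | $-18.7 - k \cdot 12.04$ | $-103.0$ | $-115.0$ | $-127.1$ |
| Pink Noise | $-25.7 - k \cdot 12.04$ | $-110.0$ | $-122.0$ | $-134.1$ |

Table 3.3: Error caused by linear interpolation ($f_{\text{source}} = 48$ kHz)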

So far in this section we assumed that the linear interpolation is calculated with an infinite resolution on the time axis. Therefore we got a continuous-time signal after the interpolator (Fig. 3.4).


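[Figure 3.5: FIR filter followed by discrete-time linear interpolation. Blocks: interpolation by $L_1$ and FIR filter, interpolation by $L_2$ and linear interpolation (together the interpolation/decimation filter, discrete time), hold and re-sample (continuous time); rates $f_{\text{source}}$, $f_{\text{sink}}$.]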

In a practical realization we only have a finite time resolution of $t_{\text{unit}}$,

$$t_{\text{unit}} = \frac{1}{L \cdot f_{\text{source}}} \quad \text{with } L = L_1 \cdot L_2,$$

where $L_2$ is the number of samples resolved by the linear interpolator between two FIR-interpolated samples. The block diagram of Figure 3.4 is therefore completed with additional function blocks resulting in Figure 3.5.

We actually have a cascade of three filters (Fig. 3.6): the FIR filter $H_{\text{FIR}}$, the linear interpolator $H_{\text{Lin}}$ with the first zero crossing at $L_1 \cdot f_{\text{source}}$, and the hold operation $H_{\text{Hold}}$ with the zero at $L_1 \cdot L_2 \cdot f_{\text{source}}$. In the following section we discuss the error contributions of the three filters.

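[Figure 3.6: Transfer characteristic of FIR filter, linear interpolator, and hold operation. The drawing shows $H_{\text{FIR}}$, $H_{\text{Lin}}$, and $H_{\text{Hold}}$ with frequency marks at $L_1 \cdot f_{\text{source}}$ and $L_1 \cdot L_2 \cdot f_{\text{source}}$.]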

Figure 3.7 shows the time domain representation of Figure 3.6. The source samples are interpolated by the FIR filter resulting in the bold samples with spacing $1/(L_1 \cdot f_{\text{source}})$ in Figure 3.7. Linear interpolation leads to $L_2$ intermediate samples with spacing $1/(L_1 \cdot L_2 \cdot f_{\text{source}}) = 1/(L \cdot f_{\text{source}})$. Finally, the hold operation converts the discrete-time signal to a continuous-time representation.

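[Figure 3.7: FIR filter and linear interpolation followed by hold operation. The drawing shows $x(t)$, the interpolated samples $\hat{x}[n]$ with spacing $T = 1/(L_1 \cdot f_{\text{source}})$, and the held signal $y(t)$ with step width $T = 1/(L \cdot f_{\text{source}})$.]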

3.2.2.2 Performance Estimation

The error introduced through the three cascaded filters consists of three components. In the first place the stopband attenuation of the FIR filter is weighted by the $\mathrm{SINC}^2$ transfer function of the linear interpolator, secondly the remaining components of the higher order FIR passband are attenuated by the $\mathrm{SINC}^2$ as well, and thirdly the residuals of the hold operation remain.

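The sum of the weighted stopband attenuation is (using Eq. A.3)

$$E_{\text{stop1}} \approx 2 \, \frac{(\delta_{s1})^2}{f_{\text{source}}} \int_{0}^{\infty} \left( \frac{\sin\left(\pi \frac{f}{L_1 f_{\text{source}}}\right)}{\pi \frac{f}{L_1 f_{\text{source}}}} \right)^{4} df = \frac{2}{3} \, L_1 \, (\delta_{s1})^2 \tag{3.4}$$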

If we assume a white noise signal, we get for the contributions around the zero crossings of the linear interpolator (Eq. C.7) and the zero-order hold (Eq. B.7)

$$E_{\text{stop2}} \approx \frac{\pi^4}{7200 \, (L_1)^4} + \frac{\pi^2}{72 \, (L_1 L_2)^2} \tag{3.5}$$

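The total error for a FIR filter with a finite stopband attenuation $\delta_{s1}$ is therefore for the three different audio signals

$$E_{\text{Sine}} \approx \frac{2}{3} \, L_1 \, (\delta_{s1})^2 + \frac{\pi^4}{45} \left( \frac{f_{\text{sig}}}{L_1 f_{\text{source}}} \right)^{4} + \frac{\pi^2}{3} \left( \frac{f_{\text{sig}}}{L_1 L_2 f_{\text{source}}} \right)^{2} \tag{3.6}$$

$$E_{\text{White}} \approx \frac{2}{3} \, L_1 \, (\delta_{s1})^2 + \frac{\pi^4}{7200 \, (L_1)^4} + \frac{\pi^2}{72 \, (L_1 L_2)^2} \tag{3.7}$$

$$E_{\text{Pink}} \approx \frac{2}{3} \, L_1 \, (\delta_{s1})^2 + \left( \frac{10\,960}{L_1 f_{\text{source}}} \right)^{4} + \left( \frac{10\,472}{L_1 L_2 f_{\text{source}}} \right)^{2} \tag{3.8}$$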

Table 3.4 gives the contribution of the three summands for a white noise signal (Eq. 3.7) at $f_{\text{source}} = 48$ kHz and sample values of $L_1$, $L_2$, and $\delta_{s1}$. Note that the second summand is reduced by 7 dB and the third by 4.6 dB, if pink noise is used instead of white noise.

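| $L_1$ | $L_2$ | $\delta_{s1}$ [dB] | $L$ | Weighted stopband (Eq. 3.4) [dB] | Linear interpolation (Table 3.3) [dB] | Zero order hold (Table 2.1) [dB] |
|---|---|---|---|---|---|---|
| $2^7$ | $2^9$ | $-137$ | $2^{16}$ | $-118$ | $-103$ | $-105$ |
| $2^8$ | $2^8$ | $-137$ | $2^{16}$ | $-115$ | $-115$ | $-105$ |
| $2^9$ | $2^7$ | $-137$ | $2^{16}$ | $-112$ | $-127$ | $-105$ |
| $2^8$ | $2^{10}$ | $-137$ | $2^{18}$ | $-115$ | $-115$ | $-117$ |
| $2^9$ | $2^{11}$ | $-152$ | $2^{20}$ | $-127$ | $-127$ | $-129$ |

Table 3.4: Error contributions from interpolation and finite stopband attenuation (white noise signal)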
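The three summands of Equation 3.7 are easy to evaluate numerically, which gives a quick check of the rows of Table 3.4. The following sketch is an illustration only; the function name `white_noise_error_terms_db` is hypothetical.

```python
import math

def white_noise_error_terms_db(L1, L2, delta_s1_db):
    """Three summands of Eq. 3.7 in dB: weighted stopband, linear interpolation, hold."""
    delta_s1_sq = 10 ** (delta_s1_db / 10)          # (delta_s1)^2 as a power ratio
    stopband = 2 / 3 * L1 * delta_s1_sq
    linear = math.pi ** 4 / (7200 * L1 ** 4)
    hold = math.pi ** 2 / (72 * (L1 * L2) ** 2)
    return tuple(10 * math.log10(e) for e in (stopband, linear, hold))

for L1, L2, ds1 in ((2 ** 7, 2 ** 9, -137), (2 ** 8, 2 ** 8, -137), (2 ** 9, 2 ** 11, -152)):
    print(L1, L2, [round(v, 1) for v in white_noise_error_terms_db(L1, L2, ds1)])
```

The stopband term grows by 3 dB per doubling of $L_1$ while the linear interpolation term falls by 12 dB, which is the trade-off visible in the first three rows of the table.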

For the first three rows of Table 3.4 the same interpolation factor $L = L_1 \cdot L_2 = 2^{16}$ is used, but the order of the interpolation filter increases and likewise the number of coefficients that have to be stored.

For a sine wave test signal the second and the third error summand of Eq. 3.6 depend on the audio signal frequency. Figure 3.8 shows plots using the parameters in the last four rows of Table 3.4. All three summands of Equation 3.6 are plotted with dashed lines and their sum with a solid line.

For the parameters of the two plots in the upper half of Figure 3.8 18-bit performance for full-scale signals is only reached up to about 5 kHz due to the hold operation ($L = 2^{16}$). The lower two plots represent parameters for 18- and 20-bit performance over the full range of input signals.

So far we only discussed the integral over the error contributions. If we apply a sine wave test signal, the energy will not be equally distributed and we will get distinct error components in the decimated sink signal. In the following the size of these ‘peaks’ is calculated.

For a sine wave close to $f_{\text{source}}/2$ the largest contribution to the stopband error is caused by the insufficient attenuation of the linear interpolator and the zero-order hold at the first zero crossing (Fig. 3.6).

[Figure 3.8: Error contributions vs signal frequency for sinusoidal audio signals. Four panels, each with the signal frequency in [Hz] from 100 to 10000 on the horizontal axis and the error in [dB] from $-130$ to $-100$ on the vertical axis. Panel titles: $L_1 = 256$, $L_2 = 256$, $\delta_{s1} = -137$ dB; $L_1 = 512$, $L_2 = 128$, $\delta_{s1} = -137$ dB; $L_1 = 256$, $L_2 = 1024$, $\delta_{s1} = -137$ dB; $L_1 = 512$, $L_2 = 2048$, $\delta_{s1} = -152$ dB.]

The minimal attenuation of the linear interpolator at the first zero crossing for $f = L_1 \cdot f_{\text{source}} \pm f_{\text{source}}/2$ is (Eq. C.2)

$$\left( \frac{f_{\text{source}}/2}{L_1 f_{\text{source}}} \right)^{4} = \frac{1}{16 \, (L_1)^4} \tag{3.9}$$

and for the zero-order hold we get (Eq. B.2)

$$\left( \frac{f_{\text{source}}/2}{L_1 L_2 f_{\text{source}}} \right)^{2} = \frac{1}{4 \, (L_1 L_2)^2} \tag{3.10}$$

For $L_1 = L_2 = 256$ the two above equations result in $-108$ dB and $-102$ dB respectively. This means that for a source signal close to $f_{\text{source}}/2$ the sink signal contains two distinct error components at $-102$ dB caused by the hold operation and two error components at $-108$ dB caused by the linear interpolation. The alias frequency of these error components depends on the exact source and sink sample rates and the signal frequency. Therefore not only the accumulated stopband error is a matter of concern, but also its distribution.
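Equations 3.9 and 3.10 can be evaluated directly; the short sketch below (with the hypothetical function name `worst_case_peaks_db`) reproduces the two levels quoted for $L_1 = L_2 = 256$.

```python
import math

def worst_case_peaks_db(L1, L2):
    """Eq. 3.9 (linear interpolator) and Eq. 3.10 (zero-order hold) in dB."""
    linear = 1 / (16 * L1 ** 4)
    hold = 1 / (4 * (L1 * L2) ** 2)
    return 10 * math.log10(linear), 10 * math.log10(hold)

linear_db, hold_db = worst_case_peaks_db(256, 256)
print(round(linear_db, 1), round(hold_db, 1))
```

Both expressions are independent of $f_{\text{source}}$ because the signal frequency is fixed at half the source rate.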

3.2.2.3 Implementation

Figure 3.9 shows a simplified block diagram of the principle outlined in the previous section. The two interpolators (by $L_1$ and $L_2$) are merged into one and the FIR filter and the linear interpolator are combined in one block.


[Figure 3.9: FIR filter with linear interpolator. Blocks: interpolation by $L$ and combined FIR filter & linear interpolation (discrete time), hold and re-sample (continuous time); rates $f_{\text{source}}$, $f_{\text{sink}}$.]

In the time domain the multiplications of the source signal by the filter transfer functions can be expressed as successive convolutions. We will discuss further the two implementations that are indicated with brackets in Equation 3.11.

$$(x \ast h_{\text{FIR}}) \ast h_{\text{Lin}} = x \ast (h_{\text{FIR}} \ast h_{\text{Lin}}) \tag{3.11}$$

In the left hand case the source signal is interpolated by the FIR filter. Additional intermediate values are calculated by linear interpolation of the oversampled signal. In the right hand case the coefficients of the FIR filter are linearly interpolated and the resulting filter is then applied to the source signal. Both cases result in the same transfer function, since the convolution is associative.

The above paragraph describes only the conceptual model. As in the case of the single stage FIR filter not all values of the interpolated source signal must be calculated since the interpolation is followed by the decimation. In the left hand case of Equation 3.11 two successive output samples of the FIR filter are computed per sink sample. The sink sample is then calculated by linear interpolation between the two. In the right hand case only one output sample of the FIR filter is calculated, but each filter coefficient requires an interpolation.
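The equivalence of the two cases for a single sink sample can be illustrated with a small Python sketch. The function names, the index convention (the filter taps are stored from index 0, and the sink sample lies a fraction `frac` of the way from interpolated sample `k` to sample `k + 1`), and the short test filter are assumptions made for this illustration.

```python
def fir_sample(x, h, L1, k):
    """Sample k of the L1-times interpolated signal: sum_j h[k - j*L1] * x[j]."""
    return sum(h[k - j * L1] * x[j]
               for j in range(len(x)) if 0 <= k - j * L1 < len(h))

def sink_from_two_fir_outputs(x, h, L1, k, frac):
    """Left hand case: two FIR outputs, then one linear interpolation."""
    a = fir_sample(x, h, L1, k)
    b = fir_sample(x, h, L1, k + 1)
    return (1 - frac) * a + frac * b

def sink_from_interpolated_coefficients(x, h, L1, k, frac):
    """Right hand case: interpolate every coefficient, then one FIR output."""
    def tap(n):                                   # taps outside the filter are zero
        return h[n] if 0 <= n < len(h) else 0.0
    return sum(((1 - frac) * tap(k - j * L1) + frac * tap(k + 1 - j * L1)) * x[j]
               for j in range(len(x)))

x = [0.5, -1.0, 2.0, 0.25]
h = [0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1]
left = sink_from_two_fir_outputs(x, h, 2, 4, 0.3)
right = sink_from_interpolated_coefficients(x, h, 2, 4, 0.3)
```

Both functions return the same value up to rounding; they differ only in where the multiplications are spent, which is the trade-off quantified in Section 3.3.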

This second implementation resembles the single stage FIR case. But since the linear interpolation is performed in real-time fewer filter coefficients have to be stored. Instead of the transfer characteristic of a true single stage filter in Figure 3.1, we get the one in Figure 3.10.

[Figure 3.10: FIR with interpolated impulse response. The drawing shows the stopband level $\delta_{s1}$ and frequency marks at $L_1 \cdot f_{\text{source}}$ and $L_1 \cdot L_2 \cdot f_{\text{source}}$.]

In Section 3.2.1 (Table 3.2) we estimated the order of a single stage FIR interpolation/decimation filter and stated that such a filter can not be directly synthesized. Table 3.5 gives the order $N$ of the FIR filter and the number of taps $Q$ that must be calculated for the FIR filter if a linear interpolator is used.

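| $E_{\text{stop}}$ [dB] | $L_1$ | $\delta_p$ [dB] | $\delta_s$ [dB] | $N$ | $Q$ |
|---|---|---|---|---|---|
| $-118$ | $2^7$ | 0.001 | $-137$ | $13.3 \cdot 10^3$ | 104 |
| $-115$ | $2^8$ | 0.001 | $-137$ | $26.6 \cdot 10^3$ | 104 |
| $-112$ | $2^9$ | 0.001 | $-137$ | $53.1 \cdot 10^3$ | 104 |
| | | | | | |
| $-115$ | $2^8$ | 0.001 | $-137$ | $26.6 \cdot 10^3$ | 104 |
| $-127$ | $2^9$ | 0.001 | $-152$ | $57.3 \cdot 10^3$ | 112 |

Table 3.5: Estimated filter order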

The values of $L_1$ and $E_{\text{stop}}$ are taken from Table 3.4. Note that the number of taps is the same for the first three cases, but the number of stored coefficients doubles. The parameters for 20-bit performance increase the number of taps only from 104 to 112.

3.2.3 Alternatives

3.2.3.1 High-Order Coefficient Interpolation

For a single stage FIR implementation with an interpolation factor $L = 2^{20}$ a filter of order $10^8$ is required. If the coefficients are interpolated using first order Lagrange interpolation the filter order can be reduced to $57 \cdot 10^3$ as discussed in the previous sections. The filter order could be further reduced by using higher order Lagrange interpolation. Table 3.6 shows the resulting filter order. Note that the order $Q$ of the actually computed filter stays about the same, but the reduced storage requirements are traded for a more complicated coefficient interpolation.

| Interpolation | $L$ | $L_1$ | $L_2$ | $E_{\text{stop}}$ [dB] | $\delta_{s1}$ [dB] | $N$ | $Q$ |
|---|---|---|---|---|---|---|---|
| Zero order | $2^{20}$ | $2^{20}$ | $2^{0}$ | $-127$ | $-187$ | 137 225 217 | 131 |
| 1st order | $2^{20}$ | $2^{9}$ | $2^{11}$ | $-127$ | $-152$ | 57 288 | 112 |
| 2nd order | $2^{20}$ | $2^{6}$ | $2^{14}$ | $-127$ | $-144$ | 6 886 | 108 |
| 3rd order | $2^{20}$ | $2^{4}$ | $2^{16}$ | $-127$ | $-138$ | 1 671 | 104 |

Table 3.6: Coefficient interpolation and filter order

3.2.3.2 Separate Interpolation and Decimation Filter

[JTG+94] presented an implementation of a sample-rate converter which performs interpolation by only $L = 64$ followed by a proprietary hold operation called ‘controlled validation’ (Fig. 3.11). Before the decimation the error introduced by the hold operation is attenuated by a second lowpass filter.

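[Figure 3.11: Separate interpolation and decimation filters. Blocks: interpolation by 64 and filter, hold (‘controlled validation’, repeating samples 1×, 2×, 3×), filter and decimation by 128; rates $f_{\text{source}}$, $64 \cdot f_{\text{source}}$, $128 \cdot f_{\text{sink}}$, $f_{\text{sink}}$.]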

This special hold operation repeats each interpolated source sample 1-3 times in such a way that only high-frequency error components are produced, which can be attenuated by the decimation filter. If source and sink sample rate are the same, each interpolated sample is repeated twice on average. Since the interpolation filter runs with $f_{\text{source}}$ and the decimation filter runs with $f_{\text{sink}}$ the sink samples are automatically band-limited to half the minimum of source and sink sample rate by-passing the need to adapt the cutoff frequency of the filter as in the CONTINUOUS MODE.

3.3 Required Resources

Before we discuss the synthesis problem for high-order digital filters we recapitulate the required resources for the approaches presented in the previous section. We use the parameters $L = 2^{16}$ and $L_1 = 2^8$, which results in 18-bit performance for full-scale signals up to 5 kHz and assume a (stereo) audio signal.

Single Stage FIR Filter

If we assume a filter order of $N = 7.5 \cdot 10^6$, an interpolation factor of $L = 2^{16}$, and use the hold operation, the number of multiplications for each stereo sink sample is $2 \cdot Q = 2 \cdot N/L = 230$. For a sink sample rate of 48 kHz we get a cycle time of 90 ns for the multiplication (and accumulation). This approach has the drawback that all $N = 7.5 \cdot 10^6$ filter coefficients must be stored in a way that they can be accessed in real-time.

Interpolation of FIR Impulse Response

If we use linear interpolation on the impulse response with $L_1 = 2^8$, $L_2 = 2^8$ ($L = 2^{16}$), and a filter order of $N = 26 \cdot 10^3$ the number of multiplications for the filter is $2 \cdot Q = 2 \cdot N/L_1 = 208$. We need one additional multiplication for each interpolation of the filter coefficients. This results in $3 \cdot N/L_1 = 312$ multiplications and $L_1 = 2^8$ coefficients to store.

FIR Filter followed by Interpolation

If we interpolate the samples after the FIR filter, two FIR calculations are needed for each (stereo) channel. One additional multiplication per channel is used for the linear interpolation. Therefore the number of multiplications is $2 \cdot 2 \cdot Q + 2 = 4 \cdot N/L_1 + 2 = 418$. The number of stored coefficients is the same as in the previous case.
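The multiplication counts of the three approaches follow from the tap counts in Tables 3.2 and 3.5. The sketch below recomputes them; the assumption that a single multiply-accumulate unit performs all operations sequentially (which is how the 90 ns cycle time is read here) and the function name `cycle_time_ns` are not taken from the thesis.

```python
def cycle_time_ns(multiplications_per_sample, f_sink_hz):
    """Time per multiply-accumulate if one unit does all the work (assumption)."""
    return 1e9 / (multiplications_per_sample * f_sink_hz)

Q_single = 115     # taps per sink sample for N = 7.5e6, L = 2**16 (Table 3.2)
Q_lin = 104        # taps per sink sample for N = 26.6e3, L1 = 2**8 (Table 3.5)

single_stage = 2 * Q_single              # stereo, single stage FIR with hold
coeff_interp = 2 * Q_lin + Q_lin         # stereo FIR plus one multiplication per coefficient
output_interp = 2 * 2 * Q_lin + 2        # two FIR outputs per channel plus interpolation

for name, count in (("single stage", single_stage),
                    ("coefficient interpolation", coeff_interp),
                    ("output interpolation", output_interp)):
    print(name, count, round(cycle_time_ns(count, 48000), 1), "ns")
```

The interpolated-coefficient variant needs about a third more multiplications than the single stage filter but only a small fraction of its coefficient storage.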

Discussion

In Section 3.2 we considered different concepts for the realization of the interpolation/decimation filter. We have seen that a trade-off between the computational cost and the number of stored coefficients must be made.

The noise estimations in this chapter have been done for full-scale signals. However, a full-scale sine wave at 20 kHz is not a realistic audio signal. Therefore we could reduce the requirements for higher frequencies. Additionally, the noise level scales proportionally to the signal level. As a result we have less distortion for lower level signals. However, these measures reduce the required filter order only by a small fraction.

So far in this chapter we have only treated the UPMODE. In the DOWNMODE (i.e. if the sink sample rate is smaller than the source sample rate) the width of the FIR filter is reduced and therefore the error contributions from the higher order passbands are smaller. The contribution from the finite stopband attenuation of the FIR filter is about the same. On the other hand the signal energy is also reduced due to the smaller passband.

In the following chapters we will treat further only the concept of interpolation of the FIR impulse response. The synthesis of the FIR filter of order $N = 26 \cdot 10^3$ is covered in Section 3.4.

3.4 Synthesis of High-Order FIR Filters

In this section we discuss methods for synthesizing the high-order interpolation/decimation filter. For the examples, the following parameter values are used:

$$L_1 = 256; \qquad N \approx 26 \cdot 10^3; \qquad \delta_{s1} = -137 \text{ dB}$$

A filter with these parameters is suited for a sample-rate converter with 18-bit performance up to 5 kHz (Fig. 3.8).

3.4.1 Specifications

Let us review the specifications for the interpolation/decimation filter for the parameter values given above.

Figure 3.12: Filter specifications
[Horizontal axis: f/f_sample, 0 to 0.5; marked quantities: δp, δs1, fp, fs]

The filter is used for an interpolation by a factor $L_1 = 256$. Its (ideal) cutoff frequency (with a normalized sample rate of 1.0) is therefore $0.5/L_1 = 0.00195$. The passband and the stopband borders are given in Table 3.7 for a transition bandwidth of $f_{source}/16$. The maximum allowed passband ripple for high-quality digital audio is a disputed subject. We demand a (conservative) value below 0.001 dB. An average attenuation of at least 137 dB must be reached by the filter. Since not only the average stopband attenuation is a matter of concern, but also its distribution, we request a minimum attenuation of 137 dB over the full stopband. A Chebyshev filter which meets these specifications is of order 26 103.
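The band edges listed in Table 3.7 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming that the transition band is centered on the ideal cutoff frequency (consistent with the 22.5 kHz to 25.5 kHz band quoted in Section 3.4.5); the function and parameter names are illustrative and do not come from the original text.

```python
def band_edges(l1, transition_fraction):
    """Return (cutoff, passband_edge, stopband_edge), normalized so that
    the interpolated sample rate L1 * f_source equals 1.0.

    transition_fraction is the transition bandwidth as a fraction of
    f_source (1/16 in the text). Assumption: the band is centered on
    the ideal cutoff.
    """
    cutoff = 0.5 / l1                              # source Nyquist frequency
    half_width = 0.5 * transition_fraction / l1    # half the transition band
    return cutoff, cutoff - half_width, cutoff + half_width


cutoff, f_p, f_s = band_edges(l1=256, transition_fraction=1 / 16)
print(f"cutoff = {cutoff:.5f}, f_p = {f_p:.4f}, f_s = {f_s:.4f}")
```

Rounded to four decimals, the two edges agree with the passband upper edge and the stopband lower edge of Table 3.7.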

How could the order of the interpolation/decimation filter be reduced? The required order of a Chebyshev lowpass filter meeting certain specifications for $\delta_p$ and $\delta_s$ is about inversely proportional to its transition bandwidth. If the source signal does not extend over the full bandwidth, or if some level of aliasing can be tolerated for high frequencies, the filter order and thus the computational complexity can be reduced drastically by allowing a slightly wider transition band.

                  Passband   Stopband
Lower Edge        0.0000     0.0021
Upper Edge        0.0018     0.5000
Deviation [dB]    0.001      137

Table 3.7: Filter Requirements

An alternative approach is to split up the stopband into multiple parts with different attenuations. The filter order is determined to a large extent by the required stopband attenuation of the part immediately following the transition band. The following stopband sections can reach much higher attenuations. We will analyze the above possibility to reduce the filter order in Section 3.4.5.

3.4.2 Design Technique

A wide variety of design techniques for FIR digital lowpass filters have been developed [RG75, Ant93]. We used the technique of equiripple design based on Chebyshev approximation methods. The filters designed by this method are optimal in the sense that they fulfill the minimax criterion, i.e. the peak approximation error in the frequency domain is minimized. The resulting filter has an equiripple transfer characteristic in the passband and in the stopband.

To solve the Chebyshev approximation problem the well-known computer program presented in [MPR73] is used. It is based on the multiple-exchange Remez algorithm. The program was originally written to synthesize (low-order) filters up to N = 128. We have extended the program for the synthesis of high-order filters. However, the computational cost of the program is approximately proportional to the square of the filter length.

Many elaborate methods have been proposed to reduce the computation time of the above mentioned program [AW85, Ant93]. Since our main concern is not runtime, but the convergence of the algorithm, we optimized the program to allow synthesis of lowpass filters up to order N = 30 000 (Sec. 3.4.3). Additionally we used the prototype method proposed in [AW85] to estimate the performance of the high-order filter with a low-order approximation (Sec. 3.4.4).

3.4.3 Direct Synthesis

Numerical problems in the original Remez exchange program [Rab73] prevent the computation of high-order filters. We have improved the program for better numerical properties. The original program often computes expressions of type

$$\frac{1}{\cos(\alpha) - \cos(\alpha + \epsilon)} \tag{3.12}$$

For values of $\epsilon \ll 1$ the evaluation of this expression is inaccurate and may even lead to an arithmetic exception (division by zero) due to finite precision arithmetic. We have replaced the above expressions by

$$\frac{1}{2 \sin\!\left(\alpha + \frac{\epsilon}{2}\right) \sin\!\left(\frac{\epsilon}{2}\right)} \tag{3.13}$$

which has much better numerical properties for small values of $\epsilon$. Additionally, range checking and scaling of intermediate results have been added to prevent overflow of accumulator variables during calculation of high-order filters.
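The difference between the two forms is easy to observe in double precision. The sketch below is only an illustration of the identity $\cos\alpha - \cos(\alpha+\epsilon) = 2\sin(\alpha+\epsilon/2)\sin(\epsilon/2)$; the names `alpha` and `eps` for the angle and the small offset are assumptions made here, and the code is not taken from the Remez program itself.

```python
import math


def reciprocal_direct(alpha, eps):
    """Evaluate 1 / (cos(alpha) - cos(alpha + eps)) literally (Eq. 3.12)."""
    return 1.0 / (math.cos(alpha) - math.cos(alpha + eps))


def reciprocal_stable(alpha, eps):
    """Evaluate the same quantity in the product form (Eq. 3.13)."""
    return 1.0 / (2.0 * math.sin(alpha + eps / 2.0) * math.sin(eps / 2.0))


alpha = 1.0

# Moderate offset: both forms agree closely.
moderate = (reciprocal_direct(alpha, 1e-3), reciprocal_stable(alpha, 1e-3))

# Tiny offset: alpha + eps rounds to alpha in double precision, so the
# difference of the cosines is exactly zero and the direct form fails.
try:
    reciprocal_direct(alpha, 1e-17)
    direct_failed = False
except ZeroDivisionError:
    direct_failed = True

tiny_stable = reciprocal_stable(alpha, 1e-17)
print(moderate, direct_failed, tiny_stable)
```

The product form never subtracts two nearly equal numbers, which is why it stays finite where the direct form divides by zero.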

Using these modifications, direct synthesis of filters with more than 20 000 filter taps is feasible. Table 3.8 lists the runtimes on a SUN SPARCstation 10/41 for sample specifications.

3.4.4 Prototype Method

The prototype method for narrow-band lowpass filters starts out from the fact that the computational cost is roughly proportional to the square of the filter length. A low-order prototype filter is synthesized and is then scaled to the high-order filter. This approach makes it possible to evaluate the performance of different filter specifications rapidly and at low computational cost.

The frequency scaling property of the Fourier transform is used to derive the specifications of the prototype. Consider the impulse response $h(t)$ and its Fourier transform $H(f)$. Scaling the impulse response by a factor $\lambda$ yields a new Fourier pair

$$\frac{1}{\lambda}\, h\!\left(\frac{t}{\lambda}\right) \;\longleftrightarrow\; H(\lambda f)$$

The edges of the pass- and the stopband are scaled to $\lambda f_p$ and $\lambda f_s$. Due to the wider transition band the filter order is reduced by a factor $\lambda$. Figure 3.14 shows an example for $\lambda = 4$. The above scaling transformation operates precisely in the continuous-time domain and with good approximation in the discrete-time domain for narrow-band high-order lowpass filters.
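The derivation of the prototype specifications can be sketched as a small helper. The relation between the orders used here (the number of taps, order + 1, is divided by the scaling factor) is an assumption inferred from the filter orders listed in Table 3.8; the function name and the rounded band edges from Table 3.7 are illustrative.

```python
def prototype_spec(order, f_p, f_s, lam):
    """Scale a narrow-band lowpass specification by the integer factor lam.

    The band edges move to lam * f_p and lam * f_s. Assumption: the filter
    length (order + 1 taps) shrinks by the same factor.
    """
    taps = order + 1
    if taps % lam != 0:
        raise ValueError("filter length must be divisible by the scaling factor")
    return taps // lam - 1, lam * f_p, lam * f_s


for lam in (1, 2, 4, 16, 64):
    proto_order, p_edge, s_edge = prototype_spec(26879, 0.0018, 0.0021, lam)
    print(lam, proto_order, round(p_edge, 4), round(s_edge, 4))
```

For the target order 26 879 this yields the prototype orders 13439, 6719, 1679, and 419 that appear in Table 3.8.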

Filter order   λ    δp [dB]   δs [dB]   Runtime [min]
419            64   0.001     138.8     0.37
1679           16   0.001     139.0     7.73
6719           4    0.001     139.0     143
13439          2    0.001     139.0     564
26879          1    0.001     139.0     3656

Table 3.8: Runtime and deviation depending on λ

Figure 3.13: Graphical representation of Table 3.8
[Logarithmic axes: runtime in minutes versus 1/λ²]

Table 3.8 lists the runtime on a SUN SPARCstation 10/41 for a target filter of order 26 879 using the specifications shown in Table 3.7 for different values of $\lambda$. The runtime is about inversely proportional to $\lambda^2$, except for $\lambda = 1$, where the program converges only slowly. The resulting stopband attenuation decreases only for large values of $\lambda$.

Figure 3.14: Scaling of prototype filter by λ = 4 using zero padding
[Panels: low-order filter in the time domain (upper left) and frequency domain (upper right); zero-padded filter in the frequency domain (lower right); high-order filter in the time domain (lower left); horizontal axes in samples]

To avoid the computation-intensive direct synthesis of the high-order filter, the Fourier transform can be used to scale the low-order prototype filter to a higher order filter. Figure 3.14 shows the required operations. In a first step the impulse response of the prototype filter is transformed into the frequency domain using the discrete Fourier transform (upper left to right). In a second step zeros are inserted at the Nyquist frequency in the frequency domain to increase the filter order by the factor $\lambda$ (right top to bottom). This zero-padded transfer function is then converted back into the time domain using the inverse Fourier transform (lower right to left).
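The three steps can be sketched with a naive discrete Fourier transform. This is an illustration under stated assumptions, not the procedure used for the thesis filters: it uses an O(N²) DFT instead of an FFT, requires an even number of taps, splits the Nyquist bin evenly between the two halves of the padded spectrum, and takes a short raised-cosine sequence as a stand-in for a real prototype filter.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]


def idft(spec):
    n = len(spec)
    return [sum(spec[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]


def scale_by_zero_padding(h, lam):
    """Stretch the impulse response h (even length) by the integer factor lam >= 2."""
    n = len(h)
    if n % 2 != 0 or lam < 2:
        raise ValueError("this sketch assumes an even length and lam >= 2")
    spec = dft(h)                        # step 1: to the frequency domain
    half = n // 2
    nyquist = spec[half] / 2             # split the Nyquist bin between both halves
    padded = (spec[:half] + [nyquist]    # step 2: insert zeros at the Nyquist frequency
              + [0j] * (lam * n - n - 1)
              + [nyquist] + spec[half + 1:])
    return [value.real for value in idft(padded)]   # step 3: back to the time domain


# Stand-in prototype: 16-tap raised-cosine shape with unit DC gain (illustrative only).
h = [math.sin(math.pi * (k + 0.5) / 16) ** 2 / 8 for k in range(16)]
g = scale_by_zero_padding(h, 4)
print(len(g), sum(g))
```

Every fourth sample of `g` equals the corresponding prototype sample divided by 4, which is the discrete counterpart of the factor $1/\lambda$ in the scaling relation; the DC gain is preserved.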

Due to the high stopband attenuation of the filter at the Nyquist frequency, the discontinuity introduced by the zero padding is very small. Figure 3.15 shows a comparison of the transfer function of the scaled prototype ($\lambda = 16$) with the directly synthesized filter.

Figure 3.15: Comparison of zero-padded prototype (λ = 16) and direct synthesis
[Panels: "Low-Order Filter and Zero-Padding" and "Direct Synthesis"; magnitude in dB versus frequency in Hz, logarithmic frequency axis]

3.4.5 Relaxed Specifications

The specifications for the interpolation/decimation filter are always a trade-off between the computational cost for real-time operation and the quality of the filter. For example, if the transition band is widened, the order of the filter can be reduced substantially, but aliasing occurs for high frequency signal components. The prototype method discussed in the previous section makes it possible to evaluate the cost and performance of filters with different specifications at low computational cost.

To reduce the filter order the specifications given in Figure 3.12 are altered to allow an intermediate stopband with a lower attenuation (Fig. 3.16).

For a sample rate of 48 kHz the transition band extends from 22.5 kHz to 25.5 kHz for the original specifications (Table 3.1). Figure 3.17 shows an example of a resulting filter for reduced aliasing requirements. The stopband attenuation and the passband ripple are the same as for the original specifications, but the filter order N is reduced from 26 879 to 21 759. Thereby the order Q of the actually computed filter is reduced by 19 % from 105 to 85. The aliasing components in the baseband that result from the relaxed specifications are attenuated by 78 dB (Fig. 3.17) and are located above 21 kHz ($f_{sample}$ = 48 kHz) and 14 kHz ($f_{sample}$ = 32 kHz), respectively. The error contribution by the higher order transition bands can be neglected since they are attenuated by at least 90 dB ($L_1 = L_2 = 256$) by the hold operation and the linear interpolation (Eq. 3.9, 3.10). The degradation caused by the relaxed specifications is therefore only marginal.

Figure 3.16: Relaxed specifications with intermediate stopband
[Horizontal axis: f/f_sample, 0 to 0.5; marked quantities: δp, δs1, fp, fs]

Figure 3.17: Filter with relaxed specifications (fsample = 48 kHz)
[Magnitude in dB, 0 to -160, versus frequency, 0 to 45 kHz]

3.5 Quantization of Filter Coefficients

In Section 3.4 a narrow-band FIR lowpass filter of order 26 879 with a minimum stopband attenuation of 139 dB has been synthesized (average attenuation 142 dB). As described in Section 3.2.2 the (virtual) interpolation/decimation filter of order $10^6$ is obtained by linear interpolation of the filter coefficients (impulse response) of the 26 879 filter. So far the quantization of the filter coefficients has not been treated. For the implementation the ‘basic coefficients’ are quantized twice: once for reducing the storage requirements and to match the word-length of the multiplier, and a second time after the linear interpolation of those stored values. The following two sections cover the effects of each quantization step.

3.5.1 Quantization of FIR Coefficients

Changing the coefficients of a filter alters its transfer function. In particular, quantizing the coefficients of a lowpass filter reduces the stopband attenuation. Figure 3.18 shows the transfer characteristic of the 26 879 filter with coefficients quantized to 20 bits: the average stopband attenuation is reduced from 142 dB to 129 dB. The equiripple characteristic in the stopband is replaced by stopband noise.

Figure 3.18: FIR filter, N = 26 879, 20 bit coefficients
[Magnitude in dB, 0 to -180, versus f/f_sample, 0 to 0.5]

To increase the dynamic range and therefore to reduce the quantization effects, a floating-point format can be used for the coefficients. The impulse response $h$ of the FIR filter has approximately a SINC shape, i.e. a $|1/x|$ asymptotic behavior from the central to the outer coefficients (Fig. 3.19). The evaluation of one sink sample is a scalar product of the form $\sum h \cdot x$. If the summation is started with the outermost coefficients, a special block floating point format can be used, which allows the exponent to increase only. We call this easy-to-implement format quasi floating point (QFP). It can be realized in hardware by shifting the accumulator to the right whenever the exponent changes. No storage of the exponent is necessary, since a simple controller can do the job.
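The idea of an exponent that can only increase along the summation order can be modelled in software. The sketch below is one possible interpretation and not the hardware format of the thesis: the exponent is assumed to follow the running maximum magnitude of the coefficients seen so far, the mantissa grid is assumed to be `bits` wide including the sign, and the test coefficients are a hypothetical sequence with a 1/x envelope.

```python
import math


def quantize_fixed(coeffs, bits):
    """Round each coefficient (|c| < 1) to a signed fixed-point grid."""
    step = 2.0 ** -(bits - 1)
    return [round(c / step) * step for c in coeffs]


def quantize_qfp(coeffs, bits, min_exponent=-32):
    """Quantize coefficients ordered from the outermost tap to the center.

    Assumed model: a shared exponent follows the running maximum magnitude,
    so it can only increase; each coefficient is rounded to a mantissa grid
    at the current exponent. Returns (quantized values, exponents).
    """
    quantized, exponents = [], []
    running_max = 0.0
    for c in coeffs:
        running_max = max(running_max, abs(c))
        if running_max == 0.0:
            exponent = min_exponent
        else:
            # frexp returns running_max = m * 2**e with 0.5 <= m < 1
            exponent = max(min_exponent, math.frexp(running_max)[1])
        step = 2.0 ** (exponent - (bits - 1))
        quantized.append(round(c / step) * step)
        exponents.append(exponent)
    return quantized, exponents


def rms_error(reference, approximated):
    total = sum((r - a) ** 2 for r, a in zip(reference, approximated))
    return math.sqrt(total / len(reference))


# Hypothetical half impulse response, outermost coefficient first.
coeffs = [0.9 * math.sin(0.3 * n) / n for n in range(200, 0, -1)]
fixed = quantize_fixed(coeffs, 19)
qfp, exponents = quantize_qfp(coeffs, 19)
print(rms_error(coeffs, fixed), rms_error(coeffs, qfp))
```

In this model the small outer coefficients are rounded on a finer grid than in the fixed-point case, which is the mechanism behind the improved stopband attenuation reported for the QFP format.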

Figure 3.19: Impulse response of FIR filter using fixed-point coefficients

The stopband attenuation of the 26 879 order FIR filter for a given precision of the coefficients is given in Table 3.9. Note that to reach a stopband attenuation of 137 dB (Table 3.4) 22-bit coefficients are required in the fixed-point format, but only 19 bits if the QFP format is used (Tab. 3.9).

Coefficient    Stopband attenuation
[bit]          Fixed point   QFP
18             117 dB        134 dB
19             123 dB        139 dB
20             129 dB        141 dB
22             139 dB        142 dB
24             142 dB        142 dB

Table 3.9: Stopband attenuation depending on coefficient precision

The quantization of the FIR coefficients can therefore be modelled by adjusting the stopband attenuation $\delta_{s1}$ of the FIR filter according to the spectrum of the quantized coefficients. In other words, the FIR coefficient word-length must be chosen such that the stopband requirements given in Section 3.2.2 are fulfilled for the quantized coefficients.

3.5.2 Quantization of Interpolated Coefficients

The (already quantized) FIR coefficients are linearly interpolated to yield the interpolation/decimation filter of order $10^6$ (Sec. 3.2.2). In Figure 3.10 the transfer characteristic of the resulting filter has been sketched: the FIR filter shape is repeated $L_2$ times and attenuated by the SINC² transfer function representing the linear interpolator in the frequency domain. Figure 3.20 shows the spectrum of the FIR filter for $L_2 = 64$ using 22-bit (non-QFP) FIR coefficients with ideal interpolation.

Figure 3.20: Interpolated FIR filter, L2 = 64
[Magnitude in dB, 0 to -180, versus f/f_sample, 0 to 0.5]

The quantization error of the ‘basic FIR coefficients’ (previous section) showed approximately random behavior, resulting in a white noise spectrum. The quantization of the linearly interpolated (quantized) FIR coefficients results in a strongly correlated error function, which can assume only discrete values. This results in a non-white spectrum of the quantization error.

Figure 3.21: Quantization noise of interpolated FIR filter, L2 = 64, 22 bit
[Upper panel: detail of quantization noise in the time domain, first 500 coefficients; lower panel: spectrum of quantization noise versus f/f_sample, average level = -159 dB]

The upper graph in Figure 3.21 shows the quantization error of the first 500 coefficients. The corresponding spectrum of this function is shown in the lower half. Note that the same 22-bit coefficients have been used as for Figure 3.20, but the interpolated coefficients have been quantized to 22 bit as well. The average error level is at -159 dB with some peaks up to -126 dB.

Since all error peaks caused by the quantization of the interpolated coefficients (Fig. 3.21) are dominated by the highest peaks of the (non-quantized) interpolation operation (Fig. 3.20), the quantization error contributes only by its average level. Table 3.10 shows the average and the peak quantization error level for 19 and 22-bit quantization and sample values of $L_2$. Note that the average quantization error level is lowered by 17 dB using the QFP principle, and the peak value even by 30 dB.

Quantization noise level [dB]

       19-bit fixed-point   19-bit QFP      22-bit fixed-point   22-bit QFP
L2     peak    avg.         peak    avg.    peak    avg.         peak    avg.
4      -94     -130         -123    -147    -111    -148         -141    -165
16     -102    -136         -129    -153    -117    -153         -148    -171
64     -106    -142         -138    -159    -126    -159         -157    -177
256    -107    -148         -148    -165    -128    -166         -166    -183

Table 3.10: Quantization noise caused by second quantization

Figures 3.22 and 3.23 show the resulting filter transfer characteristic with both the ‘basic FIR coefficients’ and the interpolated coefficients quantized to 22 bit using the fixed-point and the QFP format.

Figure 3.22: FIR filter, L2 = 64, quantized twice to 22 bit
[Magnitude in dB, 0 to -180, versus f/f_sample, 0 to 0.5]

Figure 3.23: FIR filter, L2 = 64, quantized twice to 22 bit QFP
[Magnitude in dB, 0 to -180, versus f/f_sample, 0 to 0.5]

4 Frequency Tracking

In Section 2.6 we have presented the concept for sample-rate conversion by arbitrary ratios. The source signal is interpolated by a constant factor L, transformed into a continuous-time representation, and is then resampled at the new (sink) sample rate. In this chapter we discuss methods to determine precisely the sink sample moment in the digital domain.

4.1 Requirements

In order to allow the computation of a sink value of the sample-rate converter, its precise phase (i.e. the sink sampling time relative to the source samples) must be known. Inaccuracy in measuring this phase results in a wrong sink sampling moment and thus in a non-uniform resampling of the sink signal. In Section 4.5 the effects of the non-uniform sampling on the signal spectrum will be covered in detail. It will be shown that the sample phase error is modulated onto the audio signal as shown in Figure 4.1. A precision of at least L bits is required for the sink phase.

The (virtual) sample rate after the interpolation and thus the time resolution for the sampling moment of the sink samples is $L \cdot f_{source}$ (Eq. 2.12). For example, a time resolution of approximately 320 ps is needed for $L = 2^{16}$ and a sample rate of 48 kHz.
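The quoted time resolution follows directly from the virtual sample rate. A minimal check, with variable names chosen here for illustration:

```python
L = 2 ** 16            # interpolation factor
f_source = 48_000.0    # source sample rate in Hz

virtual_rate = L * f_source        # sample rate after the interpolation
resolution = 1.0 / virtual_rate    # spacing of the virtual samples in seconds

print(f"{resolution * 1e12:.1f} ps")   # close to the 320 ps quoted in the text
```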


Figure 4.1: Effects of non-uniform sampling on the signal spectrum
[Rows: uniform and non-uniform sampling; columns: samples, spectrum of phase error, spectrum of resampled sine]

4.2 Principle

A straightforward solution to determine the phase of the sink samples is to use a high-speed counter which starts at each source sample moment and measures the position of the sink samples relative to the source samples (Fig. 4.2). Besides the need for a counting frequency in the GHz range and a very stable clock source, this direct approach has the shortcoming that jitter of either of the sampling clocks translates directly into phase errors of the sink samples. Since jitter-free sampling clocks are hardly feasible, methods which determine the precise average phase are required.

Figure 4.2: Phase relations
[Source samples spaced by L, sink samples spaced by M[m], with the phases Δφ[m] and Δφ[m+1] of consecutive sink samples]

Figure 4.2 gives a graphical representation of the phase relations as defined in Chapter 2 (Eq. 2.17). Obviously there is no need to measure each sink sample moment directly. Instead, the subsequent sink sample phases are computed from the previous ones by adding the precisely measured ratio of source and sink sample rate (Eq. 4.1).

This indirect solution makes it possible to suppress jitter of the sampling clocks by evaluating a long term average $\bar{M}/L$ of the ratio $f_{source}/f_{sink}$, which in general varies only slowly in time. Note that in Equation 4.1 the interpolation factor $L$ is constant, but the decimation factor $M[m]$ may vary slowly. The average decimation factor $\bar{M}$ must be equal to $L$ times the ratio of source and sink sample rate. Since $L$ is chosen to be a power of 2 the modulo operation in Equation 4.1 can be implemented easily.

$$\Delta\varphi[m+1] = \left(\Delta\varphi[m] + M[m]\right) \bmod L, \quad \text{with} \quad \frac{\bar{M}}{L} = \frac{f_{source}}{f_{sink}} \tag{4.1}$$
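The recursion of Equation 4.1 amounts to a phase accumulator. The following sketch assumes that $L$ is a power of two, so that the modulo operation reduces to a bit mask; interpreting the carry out of the accumulator as the number of source periods that elapsed between two sink samples is an assumption made here for illustration, as are all names and the example rates.

```python
def sink_phases(decimation_factors, log2_l, start=0):
    """Iterate phase[m+1] = (phase[m] + M[m]) mod L with L = 2**log2_l.

    Returns the phase at each sink sample and, as an assumed interpretation,
    the number of whole source periods that elapsed in each step.
    """
    mask = (1 << log2_l) - 1              # modulo L as a bit mask
    phase, phases, carries = start, [], []
    for m_value in decimation_factors:
        phases.append(phase)
        total = phase + m_value
        carries.append(total >> log2_l)   # carry out of the accumulator
        phase = total & mask
    return phases, carries


# Hypothetical example: 48 kHz source, 44.1 kHz sink, L = 2**16.
log2_l = 16
m_avg = round((1 << log2_l) * 48_000 / 44_100)
phases, carries = sink_phases([m_avg] * 8, log2_l)
print(m_avg, phases, carries)
```

Because `m_avg` is rounded to an integer, the accumulated phase slowly departs from the ideal one, which is the drift addressed in the following paragraph.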

If the sink phase $\Delta\varphi[\cdot]$ is computed according to Equation 4.1, the calculated phase may drift away from the actual phase due to the finite word-length of $M[m]$ and the filtering used for the jitter reduction. Therefore measures must be taken to avoid this drift.

In the following section a frequency counter solution is presented. An alternative approach to measure the average frequency ratio is a digital PLL (Sec. 4.4). Section 4.5 treats the effects of non-uniform sampling due to variations of $M[m]$. A comparison of the performance of the frequency counter and the digital PLL is given in Section 4.6.

4.3 Frequency Counter

In order to reach the desired overall precision, the ratio of source and sink sample rate has to be determined with a precision of at least L bits. This measurement can be accomplished by a frequency counter using two different methods:

1. The source sample rate is multiplied by a phase locked loop (PLL) to $2^{L-k} \cdot f_{source}$ and is then used to measure the sink sample rate (Fig. 4.3). If a $2^{L-k}$-fold multiple of the source sampling clock is already available, the PLL can be omitted. An additional precision of $k$ bits is obtained by averaging $2^k$ consecutive measurements. This averaging operation additionally serves as lowpass filter to suppress short-time jitter.

2. The additional PLL used in the above method can be avoided by using a fast counter, which runs independently of the source and sink sampling clock. The basic idea is to measure and average both periods $1/f_{source}$ and $1/f_{sink}$ and divide the resulting values by each other (Fig. 4.4).

Figure 4.3: Frequency counter with PLL
[Blocks: PLL multiplying f_source to 2^(L-k) · f_source, counter gated by f_sink, averaging over 2^k values with error tracking; output f_sink/f_source]

Figure 4.4: Frequency counter with high-speed reference clock
[Blocks: high-speed counter clocked by f_ref ≈ 2^(L-k) · f_sample, one averaging stage (2^k values, with error tracking) each for f_source and f_sink, divider; output f_sink/f_source]
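The second method can be sketched for ideal, jitter-free clocks. The model below is an assumption made for illustration: every clock edge is timestamped by the integer number of reference ticks that have elapsed, both clocks start aligned with the reference, and error tracking is omitted. All names are hypothetical.

```python
def counter_readings(f_ref, f_clock, count):
    """Reference ticks counted in each of `count` consecutive clock periods."""
    edges = [(n * f_ref) // f_clock for n in range(count + 1)]  # tick index at each edge
    return [edges[n + 1] - edges[n] for n in range(count)]


def averaged_ratio(f_ref, f_source, f_sink, k):
    """Estimate f_sink / f_source from 2**k averaged period measurements each."""
    n = 1 << k
    source_period = sum(counter_readings(f_ref, f_source, n)) / n
    sink_period = sum(counter_readings(f_ref, f_sink, n)) / n
    return source_period / sink_period    # T_source / T_sink = f_sink / f_source


estimate = averaged_ratio(f_ref=50_000_000, f_source=48_000, f_sink=44_100, k=7)
print(estimate, 44_100 / 48_000)
```

A single reading resolves a period only to one reference tick; averaging $2^k$ readings refines the estimate, at the price of slower tracking of sample-rate changes.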

Both methods demand a trade-off between the measurement frequency, the memory requirements of the moving average filter and the jitter reduction. Direct measurements require a GHz counter frequency with a low-jitter clock source and must be followed by a lowpass filter to eliminate jitter caused by the sampling clocks. Low-frequency measurements need a high-order averaging filter to achieve the required accuracy and can therefore track sample-rate changes only slowly.

A VLSI implementation with a 50 MHz reference counter and two moving average filters of length 128 has been realized at our laboratory [MS93]. In order to avoid drifting of the group delay, care has been taken that the sink phase calculated from the averaged values of source and sink sample rates (Eq. 4.1) does not drift away from the actual value. We implemented the additional error-tracking circuitry suggested by [Pel82]. This VLSI implementation fulfilled our expectations for rational ratios of source and sink sample rates. If both rates differ only by a small amount (plesiochronous case) this solution is not accurate enough. Consequently an alternative approach to resolve this problem was studied (Sec. 4.4). The cause of the inaccuracy of the above frequency counter implementation and performance simulations are given in Sections 4.5 and 4.6.

4.4 Digital Phase Locked Loop

An alternative approach to measure the average frequency ratio is a digital phase locked loop (DPLL). A DPLL is a digital circuit that controls an output signal in a way that its phase matches the phase of an input signal [dD89, Bes93]. PLLs (especially analog PLLs with analog or digital phase detectors) are widely used e.g. in telecommunication applications for modulation and demodulation of signals, clock recovery, or as frequency synthesizers.

In the case of the frequency tracking unit, the purpose of the DPLL is to control the ratio of source and sink sample rate ($M[m]/L$) in a way that the calculated sink phase matches the phase of the (de-jittered) sink sampling clock. The lowpass characteristic of the PLL is used to suppress short-time jitter of the sampling clocks.

We use an approach similar to the servo controlled loop proposed in [Ada93], but we use an adaptive loop filter and avoid the resynchronization of the sampling clocks.

Figure 4.5: Digital PLL (DPLL)
[Blocks: counter clocked by f_source (increment L) producing φ_source, phase detector, loop filter, and VCO (integrator) clocked by f_sink producing φ_sink and Δφ]

The all-digital PLL consists of the same three basic building blocks as a conventional PLL (Fig. 4.5):

1. Phase detector: The phase detector generates a signal that is proportional to the difference of source phase $\varphi_{source}$ and sink phase $\varphi_{sink}$.

2. Loop filter: The loop filter is a lowpass filter that determines the dynamic behavior of the PLL. It attenuates short-time jitter of the phase detector output and minimizes the average deviation. Again, a trade-off between fast tracking and noise suppression must be made.

3. VCO: A VCO in the strict sense of the term is a voltage-controlled oscillator. The input value corresponds to the (filtered) phase difference. The output frequency of the oscillator is altered proportionally to the input. The VCO can therefore be looked at as an integrator in the digital domain, which is realized as an accumulator.

4.4.1 Phase Detector

The purpose of the phase detector is to measure the phase difference between the source and the sink samples. At first glance this phase detection can be realized by a simple subtraction of source and sink phase. However, this simple solution is not applicable, because the two phases are not updated at the same time. The source phase $\varphi_{source}$ is incremented by $L$ at the rate $f_{source}$. The sink phase $\varphi_{sink}$ on the other hand is incremented by a variable amount $M[m]$ at the rate $f_{sink}$. For precise operation of the phase detector both phase values must be fetched exactly at the sink sample moment. It is therefore undesirable to introduce additional jitter by synchronizing both sampling clocks to a common master clock.

Figure 4.6: Synchronization with Gray counter
[Blocks: Gray counter clocked by f_source, register clocked by f_sink, Gray decoder, scaling by L; output φ_source]

An elegant solution to avoid the resynchronization is shown in Figure 4.6. The counter running at the source sample rate $f_{source}$ is realized as a Gray counter. Since the Gray code is a unit-distance code, i.e. only one bit changes for subsequent counter values, consistent readout of the counter is guaranteed at any moment.
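The unit-distance property can be checked with the standard reflected binary Gray code. This is a generic sketch of the encoding and decoding, not the counter circuit of Figure 4.6; the function names are illustrative.

```python
def binary_to_gray(value):
    """Reflected binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Invert binary_to_gray by folding all right shifts back in."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def changed_bits(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


for n in range(8):
    print(n, format(binary_to_gray(n), "03b"))
```

Because consecutive codes differ in one bit only, a register that samples the counter at an arbitrary moment captures either the old or the new value, never an inconsistent mixture of both.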

4.4.2 Loop Filter

The output of the phase detector is not constant, but fluctuates considerably. This phase noise originates from several sources. Two sources of the phase noise are the jitter of the source and sink sampling clocks. A third source of phase noise is the fact that the source and the sink phase are incremented only by discrete amounts at different rates: $\varphi_{source}$ is incremented by $L$ at the rate of $f_{source}$, whereas $\varphi_{sink}$ is incremented by $M[m]$ at the sink sample rate $f_{sink}$. The loop filter following the phase detector must suppress the phase noise caused by all these sources.

The design of the loop filter is a trade-off between the noise bandwidth of the filter and the dynamic behavior (fast tracking). We use a first order lowpass filter with lead-lag compensation to stabilize the full loop as proposed by [Sti92].

Figure 4.7: Open-loop transfer characteristic of the DPLL
[Upper panel: gain in dB versus frequency in Hz; lower panel: phase in degrees versus frequency in Hz; logarithmic frequency axes from 0.1 Hz to 1 kHz]

The cutoff frequency of the lowpass filter can be adapted in four steps. Figure 4.7 shows the four open-loop transfer characteristics of the loop filter. The zero and pole of the lead-lag compensation are 1.5 decades apart. The resulting phase margin is 69°.

The variable bandwidth of the loop filter makes it possible to adapt the dynamic behavior of the DPLL. With a wide filter a fast pull-in and an immediate tracking for varying sample rates is accomplished. If the sample rates are stable over time the bandwidth of the loop is reduced, resulting in an improved noise suppression. The integrator following the lead-lag section avoids discontinuities of the output phase when switching the bandwidth of the loop filter.

Figure 4.8: Adaptive digital PLL
[Blocks: phase detector (φ_source, φ_sink), loop filter with power-of-two gains set by K, K1, K2, K3, in-lock detector selecting K1, K2, K3, and accumulator (ACCU) clocked by f_sink producing M and Δφ]

Figure 4.8 shows the loop filter and the surrounding blocks. Note that all multiplications can be realized as shift operations. The output of the phase detector is monitored to determine whether the PLL is locked. The bandwidth of the loop filter can then be reduced (parameters $K_1$, $K_2$, $K_3$), which results in a higher jitter rejection.
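How a power-of-two coefficient turns a multiplication into a shift, and how the shift amount sets the bandwidth, can be illustrated with a generic first-order lowpass section. This sketch is not the loop filter of Figure 4.8: the lead-lag compensation, the integrator, and the in-lock detector are not modelled, and the shift parameter `k` is only loosely analogous to the parameters K1, K2, K3.

```python
def shift_lowpass(samples, k, state=0):
    """Generic first-order lowpass y += (x - y) / 2**k using add and shift only."""
    out = []
    for x in samples:
        state += (x - state) >> k      # arithmetic shift replaces the multiplier
        out.append(state)
    return out


def steps_to_settle(target, k, tolerance):
    """Count iterations until a step input is tracked to within `tolerance`."""
    if tolerance < (1 << k):
        raise ValueError("tolerance must be at least 2**k (truncation residue)")
    state, steps = 0, 0
    while target - state > tolerance:
        state += (target - state) >> k
        steps += 1
    return steps


wide = steps_to_settle(1 << 20, k=3, tolerance=1 << 10)
narrow = steps_to_settle(1 << 20, k=6, tolerance=1 << 10)
print(wide, narrow)
```

A larger shift gives a narrower filter that settles more slowly but averages over more samples, which mirrors the trade-off between fast pull-in and jitter rejection described above.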

The amount of jitter attenuation of the sampling clocks can be derived from the closed-loop transfer characteristic of the DPLL shown in Figure 4.12. For example, sinusoidal jitter on a sampling clock at 50 Hz (400 Hz) is attenuated by 70 dB (105 dB) for the narrowest loop filter.

4.5 Non-uniform Resampling

In the previous chapters we have treated the error introduced through the finite interpolation factor and the limited stopband attenuation of the interpolation/decimation filter. So far we have assumed ideal resampling of the interpolated signal. In this section we go a step further and estimate the error caused by the non-ideal frequency tracking. A comprehensive treatment of non-uniform sampling can be found e.g. in [Mar93]. [Ada93] studied the effects of clock jitter on the performance of various D/A converters.

The calculated phase $\varphi_{sink}[m] = \sum M[m] \bmod L$ is restricted to discrete values due to the finite word-length of $M$. In the following the difference between the actual phase $\varphi_{ideal}[m]$ and the phase computed by hardware $\varphi_{sink}[m]$ is called $\varphi_q[m]$. Note that $\varphi_q[m]$ does not only include the quantization of $M$, but also all other by-products of the non-ideal frequency tracking.

Consider an ideal sine wave with amplitude $A$ and frequency $f_{sig}$, which is resampled at the (computed) sink sample moment. The resulting sink samples are

$$y[m] = A \sin\!\left( 2\pi f_{sig}\, \frac{\varphi_{ideal}[m] + \varphi_q[m]}{L\, f_{source}} \right) \tag{4.2}$$

[Har90] approximated the spectrum of $y[m]$ for a sinusoidal shape of $\varphi_q[m]$. He interpreted Equation 4.2 as an FM modulation with the (ideal) sine wave as carrier and a modulation index of $2\pi f_{sig}/(L\, f_{source})$. The resulting spectrum is a sum of Bessel functions. The Bessel functions can be approximated for small modulation indices by simple functions [Fet90]. [Har90] supported his findings with simulations and measurements.

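We use a generalized approach, which leads to the same result for a sinusoidal shape of $\varphi_q$, but is valid for any shape of $\varphi_q$. Equation 4.2 can be modified for small modulation indices by using the trigonometric equations

$$\sin(\alpha + \beta) = \sin\alpha \cos\beta + \sin\beta \cos\alpha \tag{4.3}$$

$$\sin(\alpha + \beta) \approx \sin\alpha + \beta \cos\alpha \quad \text{with} \quad \beta \ll 1 \tag{4.4}$$

Applying Equation 4.4 to Equation 4.2 yields

$$y[m] = A \sin\!\left( \frac{2\pi f_{sig}\, \varphi_{ideal}[m]}{L\, f_{source}} \right) + A\, \frac{2\pi f_{sig}}{L\, f_{source}} \cos\!\left( \frac{2\pi f_{sig}\, \varphi_{ideal}[m]}{L\, f_{source}} \right) \varphi_q[m] \tag{4.5}$$

The error introduced by the above approximation (Eq. 4.4) is of order $\beta^2 \propto (\varphi_q / L)^2$ and can therefore be neglected for large interpolation factors.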
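The size of the neglected term can be checked numerically. The sketch below compares the exact resampled sine of Equation 4.2 with the linearized form of Equation 4.5 and uses the Taylor bound $|\sin(\alpha+\beta) - \sin\alpha - \beta\cos\alpha| \le \beta^2/2$. The example values (997 Hz sine, 48 kHz source, 44.1 kHz sink, a sinusoidal phase deviation of 4 LSB) and the use of the accumulated, unwrapped phase for $\varphi_{ideal}$ are assumptions made for illustration.

```python
import math


def exact_sample(a, w, phi_ideal, phi_q):
    """Eq. 4.2 with w = 2*pi*f_sig / (L * f_source)."""
    return a * math.sin(w * (phi_ideal + phi_q))


def linearized_sample(a, w, phi_ideal, phi_q):
    """Eq. 4.5: original sine plus the first-order error term."""
    return a * math.sin(w * phi_ideal) + a * w * math.cos(w * phi_ideal) * phi_q


big_l, f_source, f_sig, amplitude = 2 ** 16, 48_000.0, 997.0, 1.0
w = 2 * math.pi * f_sig / (big_l * f_source)
m_step = 71332                      # hypothetical decimation factor (44.1 kHz sink)

max_error, max_phi_q = 0.0, 0.0
for m in range(1000):
    phi_ideal = m * m_step                      # accumulated ideal phase
    phi_q = 4.0 * math.sin(0.05 * m)            # hypothetical phase deviation in LSB
    diff = exact_sample(amplitude, w, phi_ideal, phi_q) \
        - linearized_sample(amplitude, w, phi_ideal, phi_q)
    max_error = max(max_error, abs(diff))
    max_phi_q = max(max_phi_q, abs(phi_q))

bound = amplitude * (w * max_phi_q) ** 2 / 2    # A * beta_max**2 / 2
print(max_error, bound)
```

For these values the bound lies many orders of magnitude below the signal amplitude, which supports dropping the second-order term for large interpolation factors.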

The resulting signal consists of two orthogonal parts: the original sine wave and the phase deviation $\varphi_q$ multiplied by an attenuated cosine function. This multiplication in the time domain corresponds to a convolution in the frequency domain. The spectrum of $\varphi_q$ is shifted by the cosine function to the same frequency as the original sine wave, but is attenuated by the factor $2\pi f_{sig}/(L\, f_{source})$ relative to the signal amplitude.

Note that it is not the deviation of the decimation factor $M[m]$ that is modulated onto the signal, but its sum, the phase deviation $\varphi_q[m]$. The attenuation is proportional to both the signal frequency and signal amplitude. Therefore the jitter attenuation is the same for a full-scale 1 kHz signal or a -20 dB 10 kHz signal.

Figure 4.9 shows sample traces for a uniform distribution of the quantization error of $M[m]$ between -1 and +1 LSB. The second trace shows the phase deviation $\varphi_q[m]$ of the sink sample moment, which is the integral of the first trace. The third trace shows the second summand (the error term) of Equation 4.5 for a sampling frequency of 48 kHz. Note that the error spectrum has a $1/f$ characteristic, since it is the integral over a signal with uniform distribution. The phase deviation is modulated onto the signal according to Equation 4.5. The lowest trace in Figure 4.9 shows the resulting spectrum for a sine wave at 997 Hz.

Figure 4.9: Simulation of distortion caused by uniform jitter of M
[Panels from top to bottom: frequency deviation in LSB_16 versus samples; phase deviation in degrees versus samples; attenuated spectrum of phase deviation in dB versus Hz; phase deviation modulated on 997 Hz sine in dB versus Hz]

4.6 Performance Comparison

In the previous section we have discussed the influence of non-ideal sink phase computation on the signal quality. In this section the performance of the frequency counter method and the DPLL are discussed. Measurements will be given in Section 7.1. To compare both solutions we distinguish three cases: synchronous, plesiochronous, and asynchronous sampling clocks.

In the synchronous case source and sink sample rates are subharmonicsof a common master clock. On the other hand, if source and sink samplerate are asynchronous no such common master clock exists. The two clocksare plesiochronous if they have the same nominal frequency, but are derivedfrom different oscillators. In this case the two sampling clocks are then notphase-locked, and their frequencies may differ slightly due to tolerances of theoscillators. Plesiochronous conditions are often found at system interfaces.

4.6.1 Frequency Counter

In this section the performance of the frequency counter (running at 50 MHz) for synchronous, plesiochronous, and asynchronous sample rates is discussed.

The moving average filter which follows the counter (Fig. 4.3) averages the last 2^k counter readings. The transfer characteristic of the filter is a SINC function with the first zero crossing at f_sample/2^k and a decay of 20 dB/dec. Figure 4.10 shows the transfer characteristic for k = 7 and f_sample = 48 kHz. Note that for these parameters only frequencies above 150 Hz are suppressed.
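The magnitude response of a moving average over 2^k samples can be evaluated directly. The sketch below uses the standard closed form |sin(Nx)/(N sin(x))| with x = π f/f_sample; the function name and default parameters are illustrative assumptions, not part of the original design.

```python
import math


def moving_average_gain_db(f, f_sample=48_000.0, k=7):
    """Magnitude response in dB of a moving average over 2**k samples."""
    n = 2 ** k
    x = math.pi * f / f_sample
    if x == 0.0:
        return 0.0
    mag = abs(math.sin(n * x) / (n * math.sin(x)))
    return 20.0 * math.log10(mag) if mag > 0.0 else float("-inf")


first_null = 48_000.0 / 2 ** 7             # first zero crossing in Hz
gain_150 = moving_average_gain_db(150.0)   # attenuation near the passband edge
print(first_null, round(gain_150, 2))
```

For k = 7 and 48 kHz the first null lies at 375 Hz, and the attenuation at 150 Hz is between 2 and 3 dB, which is in line with the remark that only components above roughly 150 Hz are suppressed.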

[Figure: Movingaverage.eps. Magnitude [dB] over frequency [Hz], 10 Hz to 10 kHz]

Figure 4.10: Spectrum of moving average filter

Figure 4.11 shows traces of the distortion caused by the non-ideal frequency tracking of the frequency counter for synchronous, plesiochronous, and asynchronous operation. Three subplots are given for each case (corresponding to Figure 4.9 in the previous section): the deviation of M[m] in LSBs, the deviation of the phase φ_q[m], and the error spectrum of the phase deviation according to Eq. 4.5.

In the synchronous case the filter is not needed at all, since the counter output is constant and no phase distortion results from the frequency tracking.

In the asynchronous case the error energy is roughly equally distributed over the frequency band and the moving average filter reduces it by a factor of 2^k. All distortion products are below −125 dB for the example given in Figure 4.11.

In the plesiochronous case the error energy is concentrated at low frequencies and cannot be suppressed by the moving average filter. In the example in Figure 4.11 the frequency difference between the quartz that is used to generate the source sampling clock and the quartz of the reference counter is 100 Hz. This difference frequency is not suppressed by the moving average filter, and can clearly be seen in the plots of the phase deviation for the plesiochronous case in Figure 4.11. Distortion products at multiples of 100 Hz are produced for the given example.

The frequency counter approach is therefore well suited for the synchronous case: both sample rates can be measured exactly by a counter running with the common master clock. If the counter frequency is asynchronous to the sampling clock, the counter readings alternate between two values which differ by one LSB of the counter. These counter readings can be averaged by a moving average filter to increase the precision of the frequency measurement. In the plesiochronous case the counter readings also alternate between only two values, as in the asynchronous case, but these variations occur at low frequencies which are not suppressed by the filter. Therefore distortion products at the difference frequency of the source and sink sample rates result.
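The behaviour of the counter readings can be illustrated with a small simulation. This is a simplified model and not the circuit of the thesis: it assumes an ideal counter that reports the number of reference-clock ticks falling into each sample period, and the 49.152 MHz master clock of the synchronous example is a hypothetical value (1024 × 48 kHz).

```python
def counter_readings(f_clock, f_sample, n):
    """Reference-clock ticks counted in each of n successive sample periods.

    Frequencies are integers in Hz so that the arithmetic stays exact."""
    return [((i + 1) * f_clock) // f_sample - (i * f_clock) // f_sample
            for i in range(n)]


def moving_average(readings, k):
    """Average of the last 2**k readings."""
    window = readings[-(2 ** k):]
    return sum(window) / len(window)


# Synchronous: the counter clock is an integer multiple of the sample rate.
sync = counter_readings(49_152_000, 48_000, 512)
# 50 MHz counter clock that is not an integer multiple of the sample rate.
non_sync = counter_readings(50_000_000, 48_000, 512)

print(set(sync), set(non_sync), moving_average(non_sync, 7))
```

In the first case every reading is identical; in the second the readings alternate between two neighbouring values, and averaging 2^7 of them moves the estimate closer to the true ratio of about 1041.667.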

The performance can be improved by using a higher-order moving average filter, which reduces the cutoff frequency, or by a different type of filter. For a cutoff frequency of 0.4 Hz (as with the digital PLL) a moving average filter of length 2^16 is required. A high-order moving average filter has the drawback that it tracks more slowly and needs more storage resources. A different type of filter with a low enough cutoff frequency must be followed by a controller to ensure that the phase calculated from the frequency ratio does not drift away from the effective phase (see page 53). These limitations can be overcome by the DPLL solution.

[Figure: sync.eps, async.eps, ples.eps. For each of the three cases (Synchronous, Asynchronous, Plesiochronous) three panels: deviation of M [LSB_16] over [Samples]; Phase Deviation [Deg] over [Samples]; error spectrum [dB] over [Hz]]

Figure 4.11: Simulation of frequency counter followed by moving average filter (f_sample = 48 kHz, f_sig = 997 Hz, k = 7)

4.6.2 Digital PLL

An architecture of the digital PLL has already been presented in Figure 4.8. The Gray counter used as input to the DPLL shows an output behavior similar to that of the frequency counter described in the previous section. The counter is incremented at the rate f_source, but the counter value is fetched at the rate f_sink. If source and sink sample rates are plesiochronous, the output of the phase detector contains mostly low-frequency components. For proper operation of the PLL this low-frequency distortion must be suppressed by the loop filter.

[Figure: Closedloop.eps. Two panels: Phase [Deg] over Frequency [Hz]; Gain [dB] over Frequency [Hz], 0.1 Hz to 1 kHz]

Figure 4.12: Closed-loop transfer characteristic of the DPLL

Figure 4.12 shows the closed-loop transfer characteristic of the DPLL for the four bandwidths of the loop filter (0.4, 1.9, 29, 480 Hz). If source and sink sample rates are stable, the cutoff frequency can be reduced stepwise. This improves the performance in the plesiochronous case. For varying sample rates a wider filter allows fast tracking. Performance simulations similar to Figure 4.11 for M ≈ 2^16 showed that all distortion products are below −125 dB for the DPLL solution.

5 Implementation

In this chapter we discuss alternatives for the implementation of a sample-rate converter following the concepts presented in the previous chapters. Two different realization methods are analyzed:

1. Application specific integrated circuit (ASIC)

We have implemented a prototype ASIC for 18-bit audio sample-rate conversion with arbitrary sampling frequency ratios [LS91, RF94]. We will refer to this implementation as SARCO. Alternatives, improvements, and extensions to this architecture and realization are considered in this chapter. Performance measurements are given in Chapter 7.

2. Digital signal processor (DSP)

Modern DSPs have powerful instruction sets to implement digital signal processing tasks. A programmable solution allows the implementation to be tailored to each application, but is not cost-effective when large volumes are required. We will estimate the necessary resources for integer DSPs with a 24 × 24 bit multiplier, such as the members of the DSP5600x family [Mot88]. The dynamic range of a 24-bit datapath with 56-bit accumulation is well suited for high-quality audio applications.

Figure 5.1 shows the basic building blocks of the sample-rate converter. The frequency tracking unit (Chap. 4) determines the ratio of source to sink sample rate, while the interpolation/decimation filter does the actual conversion (Chap. 2). The third block represents the storage of the filter coefficients, which are interpolated from a stored subset as described in Chapter 3.

[Figure: BlockDiagramm.epsi. Blocks: frequency tracking (inputs: source clock, sink clock; output: frequency ratio); high-order digital interpolation/decimation filter (source samples in, sink samples out); filter coefficients]

Figure 5.1: Block diagram

5.1 Frequency Tracking

The implementation of the two methods for frequency tracking proposed in Chapter 4, frequency counter and digital PLL, will be discussed here.

5.1.1 Frequency Counter

A frequency counter which runs at the chip clock (50 MHz) according to Figure 4.4 with a moving average filter of length 128 has been implemented [MS93] for the prototype ASIC. The frequency tracking unit is connected to the filter datapath by a serial interface which is accessible from external pins. This concept made it possible to analyze the output of the frequency tracking unit and to evaluate alternative solutions.

Measurements showed a precision of 14 bit for the worst case, where source and sink sample rates are plesiochronous. Inaccuracies in the measurement of the sample-rate ratio produce phase noise of the sink sampling moment, which is modulated onto the audio signal according to Equation 4.5 (Fig. 4.9). Since the error is proportional to the signal amplitude A and the signal frequency f_sig, the distortion has the largest influence for signal frequencies close to f_sample/2 with an amplitude near full scale.

The frequency counter of SARCO occupies an area of 13.6 mm². 30% of this area is used for the RAMs of the moving average filter.

5.1.2 Digital PLL

A digital PLL solution has been developed for frequency tracking in order to reach an accuracy of 20 bit for the ratio of source to sink sample rate. The basic architecture of the digital PLL has already been given in Figure 4.8. Since the loop filter is a recursive (IIR) filter with poles close to the unit circle, large word-lengths are needed for the filter datapath to achieve the required 20-bit precision for M. Because of the feedback involved, the circuit must be carefully designed to guarantee stable operation and to avoid limit cycles due to quantization effects.

Extensive simulations have been performed using a finite word-length model of the PLL. A minimal word-length of 44 bits is required for the accumulator. After initialization, during pull-in of the PLL, the internal registers have not yet reached their stable values, which may lead to overflow in the accumulator. Therefore a limiter has been added after the phase detector, which saturates the output of the phase detector to avoid overflow.

[Figure: PLL.epsi. Block diagram with phase detector (ϕsource, ϕsink, ∆ϕ), saturation stage, loop filter with parameters K1, K2, K3, accumulator (ACCU), in-lock detector, inputs f_source and f_sink, and outputs M, ptr, ∆ϕ]

Figure 5.2: Implementation of digital PLL

The in-lock detector adapts the bandwidth of the loop filter using the parameters K1, K2, and K3 according to the output of the phase detector. If the output of the phase detector stays below a certain limit for some time, the bandwidth is gradually reduced. If a large phase difference is detected, the bandwidth of the loop filter is increased at once to allow fast tracking of sample-rate changes and to guarantee that the PLL stays locked.

The PLL in Figure 5.2 has been realized in a Field-Programmable Gate Array (FPGA) having the same serial interface as SARCO. Unfortunately the interface of SARCO limits the frequency ratio to a precision of only 16 bits. Therefore an additional circuit has been added, which generates a 16-bit approximation of the 20-bit ratio M by controlling the ‘duty cycle’ of the least significant bit. This FPGA realization of the PLL is, however, well suited for a sample-rate converter with an (audio) precision of up to 20 bit.

The datapath of the PLL implemented in a Xilinx XC4010 FPGA runs at 2 MHz, since only 7 additions/subtractions must be calculated for each sink sample. The frequency ratio M is transmitted over the serial interface running at 20 MHz. The XC4010 contains 400 configurable logic blocks (CLBs), which are claimed to be equivalent to 10 000 gate equivalents (GE). Each CLB provides two independent function generators with four inputs each and two flip-flops. Considerable effort was required to fit the PLL into one FPGA, since all CLBs are occupied: 91% of the CLBs are used for logic functions, 9% as additional routing resources.

To compare the complexity of the DPLL with that of the frequency counter solution, the die size for an ASIC realization of the PLL has been estimated for the same target technology as the frequency counter. The DPLL requires only 6.3 mm², which is about half the area of the frequency counter. The accuracy has been improved from 14 to 20 bit.

Table 5.1 summarizes the technical data of both approaches.

               ASIC                    FPGA
               [mm²]   [GE]            [CLB]   [GE]
Freq. counter  13.6    9 100 + RAM     -       -
DPLL           6.3     5 900           400     10 000

Table 5.1: Implementation alternatives for frequency tracking

5.2 Datapath for FIR Filter

5.2.1 Review of Operations

In Chapter 2 we have explained the principle of sample-rate conversion for arbitrary ratios: it can be realized by an FIR filter of order Q with time-varying coefficients (Eq. 2.17). An equidistant subset of the Q·L filter coefficients of the high-order interpolation/decimation filter is chosen for each output sample. The selection of the subset depends on the difference between source and sink sample phase Δφ[m]. Equation 2.17 is rewritten below for time-varying M[m] and ρ[m].

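\begin{align}
y[m] &= \rho[m] \sum_{j=-Q_D[m]/2}^{+Q_D[m]/2} h_{\,j L \rho[m] + \Delta\varphi_D[m]} \; x_{\mathrm{ptr}[m]-j} \tag{5.1}\\
\varphi_{\mathrm{sink}}[m] &= \varphi_{\mathrm{sink}}[m-1] + M[m-1], \qquad \varphi_{\mathrm{sink}}[0] = \varphi_0 \tag{5.2}\\
\mathrm{ptr}[m] &= \varphi_{\mathrm{sink}}[m] \;\mathrm{div}\; L \tag{5.3}\\
\Delta\varphi[m] &= \varphi_{\mathrm{sink}}[m] \bmod L \tag{5.4}\\
\rho[m] &= \begin{cases} 1 & \text{UPMODE} \\ L / M[m] & \text{DOWNMODE} \end{cases} \tag{5.5}\\
Q_D[m] &= \frac{M[m]}{L}\, Q \tag{5.6}\\
\Delta\varphi_D[m] &= \rho[m]\, \Delta\varphi[m] \tag{5.7}
\end{align}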

To clarify the required operations, a step-by-step description of the algorithm is given below.

1. Determine the new sink sample phase φ_sink[m] according to the current sample-rate ratio M[m − 1] fetched from the frequency tracking unit (Eq. 5.2). Split the sink phase φ_sink[m] into a lower part Δφ[m] (Eq. 5.4), which determines the sink phase relative to the source samples, and an upper part ptr[m] (Eq. 5.3), which points to the source sample. Note that the values of φ_sink[m], ptr[m], and Δφ[m] are directly available if the frequency tracking is realized with a DPLL according to Figure 5.2.

2. If the sink sample rate is smaller than the source sample rate (DOWNMODE), the number Q of coefficients and the relative phase Δφ[m] are multiplied by M[m]/L and ρ[m] respectively (Eq. 5.6, 5.7), yielding Q_D[m] and Δφ_D[m]. Note that since M[m] > L in DOWNMODE, ρ is smaller than 1.

3. The ‘inner loop’ of the algorithm is of the form Σ h[hadr]·x[xadr]. The address of the source samples xadr[j] = ptr[m] − j is incremented by one for each calculation. The spacing of the filter coefficients h is equidistant with spacing L (UPMODE) or ρ[m]·L (DOWNMODE).

Since only a fraction of the filter coefficients h are stored, the actual coefficients must be interpolated in real time. The coefficient address hadr has to be split into two parts: hadrHigh = hadr div L2 is used to access the stored FIR coefficients hFIR; hadrLow = hadr mod L2 is used for the interpolation. The difference ∆h between successive coefficients needed for the interpolation can either be stored or calculated in real time; both methods need two memory accesses.

Multiplication            Memory access            Address update
                          xl = xLeft[xadr]
h  = hadrLow·∆h + h1      xr = xRight[xadr]
al = h·xl + al            h1 = hFIR[hadrHigh]      hadr += ρ·L
ar = h·xr + ar            ∆h = hFIR[hadrHigh]      xadr++

For each iteration of the inner loop a minimum of three multiplications, four memory accesses, and two address updates must be performed.

4. In DOWNMODE the accumulators al, ar must be multiplied by ρ due to the scaling of the filter cutoff frequency (Eq. 2.15).
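The four steps above can be sketched in software as follows. This is a floating-point, single-channel illustration under stated assumptions and not the SARCO datapath: the function names are hypothetical, a triangular kernel (plain linear interpolation) stands in for the high-order low-pass prototype so that the result is easy to verify, and coefficients outside the stored table are treated as zero.

```python
import math


def triangle_table(Q, L1):
    """Stored coefficient subset h_FIR: Q*L1 + 1 points, L1 per source sample.

    Assumption: a triangular kernel replaces the real prototype filter."""
    centre = Q * L1 // 2
    return [max(0.0, 1.0 - abs(i - centre) / L1) for i in range(Q * L1 + 1)]


def coefficient(h_fir, t, L2):
    """Coefficient at fine-grid offset t from the kernel centre.

    The address is split into hadrHigh (stored coefficient) and hadrLow
    (position between two stored coefficients); then h = hadrLow*dh + h1."""
    pos = t / L2 + (len(h_fir) - 1) / 2
    if pos < 0 or pos > len(h_fir) - 1:
        return 0.0                      # outside the filter length
    hadr_high = min(int(pos), len(h_fir) - 2)
    hadr_low = pos - hadr_high          # fraction in [0, 1]
    h1 = h_fir[hadr_high]
    dh = h_fir[hadr_high + 1] - h1
    return hadr_low * dh + h1


def convert(x, M_seq, h_fir, L1, L2, Q, phi0=0):
    """One output sample per entry of M_seq (integer ratios in units of 1/L)."""
    L = L1 * L2
    phi_sink = phi0
    y = []
    for M in M_seq:
        ptr, dphi = divmod(phi_sink, L)            # Eq. 5.3, 5.4
        rho = 1.0 if M <= L else L / M             # Eq. 5.5
        half = math.ceil(Q / rho / 2)              # Q_D / 2, Eq. 5.6
        dphi_d = rho * dphi                        # Eq. 5.7
        acc = 0.0
        for j in range(-half, half + 1):           # inner loop, Eq. 5.1
            n = ptr - j
            if 0 <= n < len(x):
                acc += coefficient(h_fir, j * L * rho + dphi_d, L2) * x[n]
        y.append(rho * acc)                        # step 4: scale by rho
        phi_sink += M                              # Eq. 5.2
    return y


L1, L2, Q = 4, 4, 2
table = triangle_table(Q, L1)
ramp = [float(n) for n in range(40)]
up = convert(ramp, [10] * 6, table, L1, L2, Q)             # UPMODE, M < L
down = convert(ramp, [32] * 6, table, L1, L2, Q, phi0=50)  # DOWNMODE, M = 2L
print(up, down)
```

With a ramp as input, each output sample equals the sink phase divided by L in both modes, which gives a quick consistency check of the address arithmetic and of the scaling by ρ.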

Note that the number of repetitions of the inner loop is inversely proportional to ρ[m] (Eq. 5.6). In UPMODE it is constant. For the extreme case f_sink = f_source/2 (DOWNMODE) the number of loop repetitions doubles compared to UPMODE. Since y[m] is calculated at the sink sample rate (which is f_source/2 in this case), the overall speed requirements for the inner loop in DOWNMODE are the same as for f_sink = f_source.

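If the computation effort outside the inner loop is neglected, the time for one repetition of the inner loop must be below

\begin{equation}
\begin{cases} 1/(Q\, f_{\mathrm{sink}}) & \text{UPMODE} \\ 1/(Q\, f_{\mathrm{source}}) & \text{DOWNMODE} \end{cases} \tag{5.8}
\end{equation}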
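As a worked example of Equation 5.8, the bound can be evaluated for concrete sample rates. The function name and the chosen values (Q = 85, 48 kHz) are illustrative assumptions.

```python
def max_inner_loop_time(Q, f_sink, f_source):
    """Upper bound in seconds for one inner-loop repetition (Eq. 5.8).

    UPMODE (f_sink >= f_source) is limited by f_sink, DOWNMODE by f_source."""
    return 1.0 / (Q * max(f_sink, f_source))


t_up = max_inner_loop_time(85, 48_000, 44_100)     # UPMODE
t_down = max_inner_loop_time(85, 24_000, 48_000)   # DOWNMODE, f_sink = f_source/2
print(round(t_up * 1e9, 1), round(t_down * 1e9, 1))  # both in nanoseconds
```

Both cases yield about 245 ns per repetition, which illustrates that halving the sink rate does not relax the speed requirement of the inner loop.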

5.2.2 Finite Word-Length in Datapath

In this section the implications of the finite word-lengths in the datapath for the signal quality are discussed. Figure 5.3 shows the basic structure for the filter calculation.

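[Figure: Datapath.epsi. Blocks: circular buffer (input x[n], word-length b_x, outputs xl, xr); address generator (input M[m], outputs xadr, hadr); multiply-accumulate unit (coefficients h1, ∆h of word-length b_h, factor ρ[m], product word-length b_m, accumulator word-length b_a); scale; output y[m] of word-length b_y]

Figure 5.3: Datapath: word-lengths for FIR filter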

The source samples x[n] are stored in a circular buffer. The size of this buffer is mainly determined by the filter order Q_D. The allowed short-time variations of the sample rates require a (few) additional words. A larger buffer is more tolerant of sample-rate variations, but increases the group delay of the filter.

The address generator calculates the coefficient address hadr and the sample address xadr according to Equation 5.1.

The product of the source samples xl, xr of word-length b_x and the filter coefficients h1, ∆h of word-length b_h is quantized to b_m bits and accumulated. The word-length b_a of the accumulator must be large enough to avoid overflow. If QFP coefficients are used (Sec. 3.5.1), the accumulator is renormalized by shifting when the exponent changes (block ‘Scale’ in Figure 5.3). As a last step the accumulator value is multiplied by ρ. For this last multiplication the same multiplier can be used as before, but the accumulator output must be quantized to b_h bits. Note that this last multiplication also makes it possible to alter the overall gain if desired. Table 5.2 shows sample word-lengths for three implementations.

                     b_x    b_h    b_m    b_a    b_y    Quantization
                     [bit]  [bit]  [bit]  [bit]  [bit]  noise [dB]
SARCO [RF94]         18     22     22     25     18     −123
AD1890 [Dev93]       20     22     25     27     24     −127
DSP5600X [Mot88]     24     24     48     56     24     −140

Table 5.2: Datapath width in bits ( = QFP)

The influence of the finite word-length filter coefficients on the filter transfer function has already been shown in Section 3.5. The finite word-lengths in the datapath give rise to several sources of quantization noise:

1. Quantization to b_m bits after the multiplication in the inner loop

2. Quantization caused by the renormalization for QFP coefficients

3. Quantization to b_h − 1 and b_m − 1 bits before and after the multiplication with ρ

4. Output quantization to b_y bits

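The quantization noise, neglecting output quantization, is therefore

\begin{align*}
\text{for fixed-point coefficients:}\quad & q_{b_h-1} + q_{b_m-1} + Q\, q_{b_m} \\
\text{for QFP coefficients:}\quad & q_{b_h-1} + q_{b_m-1} \\
& {} + 2 q_{b_m} + 4 q_{b_m-1} + 4 q_{b_m-2} + 8 q_{b_m-3} + \dots \\
& {} + q_{b_m} + q_{b_m-1} + q_{b_m-2} + q_{b_m-3} + \dots
\end{align*}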

For the QFP coefficients we assumed that two coefficients have exponent 0, four have exponent 1, four have exponent 2, etc. The resulting quantization noise for Q = 104 is given in the last column of Table 5.2. The error contribution of the datapath can therefore be neglected in the given examples for audio word-lengths of up to 20 bit.

5.3 DSP Realization Considerations

The operations reviewed in the previous section seem to be well suited for a DSP implementation, since a fast multiply-add unit and a large memory bandwidth are required. DSP solutions have been reported, e.g., for the following implementations:

[CDPS91] realized a sample-rate converter for fixed frequency ratios on a DSP56001 using B-splines. Two cascaded FIR interpolators, each by a factor of 2, are followed by a 6th-order B-spline interpolation. 300 instruction cycles are needed per sink sample, allowing real-time operation on a DSP56001 at 33 MHz.

[PHR91] implemented a 16-bit sample-rate converter for rational frequency ratios on a DSP56001. The interpolation/decimation filter has been synthesized by multiplying a SINC function with the Blackman-Harris window, as suggested by [Ram82]. The interpolation/decimation filter is of order N = 10239 using an interpolation factor L of 160. Therefore the number of repetitions Q of the inner loop is 63. Each repetition of the inner loop requires only two instruction cycles, since no coefficients are interpolated. Therefore only 126 instruction cycles have to be executed in the inner loop for each sink sample.

Due to the real-time coefficient interpolation and the frequency tracking, an implementation of our algorithm is more complicated than the examples shown above. The frequency tracking is not well suited for a DSP implementation due to the synchronization problems, but can be realized in an FPGA (Sec. 5.1.2). Some problems of the filter implementation on a DSP56001 are pointed out below.

The DSP56001 has two 256 × 24 bit on-chip data memories, a 512 × 24 bit program memory, and many powerful addressing modes, but only one access to external memory per instruction cycle is possible. Due to the 24-bit data bus and 16-bit address bus the external memory space is limited to 64k × 24 bit. The address ALUs, which allow auto-increment and modulo addressing, are limited to a 16-bit word-length. The data ALU has two 56-bit accumulators and allows single-cycle multiply-accumulate from the X and Y registers.

The source samples x[n] and the program are stored in the on-chip (X-)RAM. A maximum of 64k filter coefficients can be stored in an external RAM, which must be fast enough for one-cycle accesses. Since only two accumulators are available in the DSP56001, the ‘inner loop’ is split into two loops, which are executed one after the other: in the first loop the filter coefficients h are precomputed and stored in the internal (Y-)RAM; in the second loop the actual filter function Σ h·x (al += h·xl, ar += h·xr) is calculated.

In the first loop the coefficient address is updated: hadr += ρ·L, hadrHigh = hadr div L2, hadrLow = hadr mod L2, and the coefficients h1, ∆h are accessed from external memory. The actual filter coefficient is then interpolated, h = hadrLow·∆h + h1, and stored in the internal memory. Figure 5.4 shows two realization alternatives: on the left-hand side the DSP is used for all of the above computations; on the right-hand side the FPGA calculates the coefficient address update and addresses the RAM. This second alternative requires a larger FPGA, but needs fewer (DSP) instruction cycles.

The second loop, an FIR filter with constant coefficients, is easy to implement in two MAC instruction cycles, since both operands are stored in the internal RAM.

[Figure: BlockDiagramm-DSPnew.epsi. Two block diagrams, each with an FPGA for frequency tracking (XC4010 or XC4013), a DSP56001 for the filter (x[n] in, y[m] out), and a 64k × 24 RAM for the filter coefficients (h1, ∆h); signals between the blocks: ρ[m], M[m], hadrHigh, hadrLow]

Figure 5.4: Block diagram using DSP and FPGA

The author estimates that 5 to 7 instruction cycles are required for an actual implementation of the inner loop. A DSP5600x running at 40 (66) MHz can execute 400 (660) MAC instructions per sample at a sample rate of 50 kHz. If 6 instruction cycles are used for the inner loop, and if we assume that 90% of the computation time is spent in the inner loop, 60 (100) loop repetitions can be realized. A DSP/FPGA implementation of a sample-rate converter for arbitrary ratios therefore seems possible, but rather expensive if needed in large quantities.
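The estimate can be reproduced with a short calculation. The sketch assumes two clock cycles per instruction cycle, which is what the figures in the text imply (40 MHz giving 400 instruction cycles per 50 kHz sample); the function and parameter names are illustrative.

```python
def loop_repetitions(f_clock_hz, f_sample_hz=50_000.0, cycles_per_loop=6,
                     inner_fraction=0.9, clocks_per_instruction=2):
    """Inner-loop repetitions that fit into one sample period."""
    instr_per_sample = f_clock_hz / clocks_per_instruction / f_sample_hz
    return instr_per_sample * inner_fraction / cycles_per_loop


reps_40 = loop_repetitions(40e6)   # DSP at 40 MHz
reps_66 = loop_repetitions(66e6)   # DSP at 66 MHz
print(round(reps_40), round(reps_66))
```

The 66 MHz case comes out at 99 repetitions, which the text rounds to 100.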

5.4 ASIC Realization

5.4.1 Implementation

A VLSI implementation of a sample-rate converter (SARCO) has been realized in several semester and diploma theses [RW90, LS91, MS93]. This ASIC is realized in a standard cell and macrocell technique using a 1.2 µm double-metal CMOS technology. It contains 230 000 transistors, including 14 kbit of static RAM and an 18 × 22 bit pipelined multiplier. The total die area is 6.4 mm × 9.6 mm. Figures 5.5 and 5.6 show the floorplan and a photomicrograph of the chip.

The architecture of a 16-bit sample-rate converter for arbitrary ratios and the high-order filter synthesis program were developed in a first diploma thesis [RW90]. A sophisticated scheduling scheme for the pipelined multiplier-accumulator has been elaborated, which keeps the pipeline fully loaded during the filter computation. Based on the results of this work, the specifications for an 18-bit sample-rate converter have been established.

To verify the performance of the algorithm for the chosen parameters, a software model of the sample-rate converter has been written and simulated [LS91]. In a second step this model has been partitioned into several functional units according to the planned chip architecture. An additional model has then been written for each functional block, which matched exactly the intended implementation, e.g. concerning the word-length of the datapath. This bit-level model made it possible to simulate the quantization properties of the whole system. However, the integer model executed much faster than the bit-level model.

The bit-level model has been used to generate stimuli and expected responses for the simulation of the VLSI circuit during development. This methodology made it possible to verify each functional block separately as soon as it was implemented. Additional simulations have been performed using several functional blocks.

Some key parameters of the actual implementation of SARCO are given below. A more detailed description of the filter implementation can be found in [LS91]. The design of the frequency counter is documented in detail in [MS93].

[Figure: Floorplan.epsi. Regions: Frequency Tracking (frequency measurement unit, moving average RAM); Arithmetic Unit (pipelined multiplier, accu, FIR address generator, registers); Circular Buffer (circular buffer left, circular buffer right)]

Figure 5.5: Floorplan of SARCO

[Figure: Sarco-scan50dpi.eps]

Figure 5.6: Photomicrograph of SARCO

The (stereo) input samples are stored in two 256 × 18 bit circular buffers, one each for the left and right channel. Special address generation units compute the required addresses for these sample buffers and the address of the filter coefficients. The heart of the digital filter architecture is a microprogrammed pipelined multiply-accumulate unit. A cycle time of 50 ns has been reached using three-stage pipelining of the multiplier. The multiply-accumulate unit is used both for the coefficient interpolation and for the FIR filter calculation.

The FIR filter coefficients of this prototype design are stored off-chip in the quasi floating point format (QFP): 22 bit mantissa, 4 bit exponent. The 25 bit wide accumulator is dynamically rescaled according to the exponent of the filter coefficient. However, only subsequent shifts by either 1, 2, or 3 bits are implemented.

A (total) interpolation factor L of 65536 is used (L1 = 256, L2 = 256). Therefore the required number of filter taps (Q) is 105 for the full and 85 for the relaxed specifications (Chapter 3). Each filter tap needs three multiplications: one for the coefficient interpolation and two for the FIR filter calculation of the stereo channels. Besides the inner loop, 30 additional cycles per output sample are required for initialization and postprocessing. To compute one sink sample, a total of 3 × 85 + 30 = 285 cycles (at 20 MHz) is needed. Therefore the maximum sample rate which can be processed is 70 kHz.
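The cycle budget above can be written out as a small calculation. The function name and its parameters are illustrative; the numbers (three multiplications per tap, 30 overhead cycles, 50 ns cycle time) are those given in the text.

```python
def max_sample_rate(taps, mults_per_tap=3, overhead_cycles=30,
                    cycle_time_s=50e-9):
    """Cycles per sink sample and the resulting maximum sample rate in Hz."""
    cycles = mults_per_tap * taps + overhead_cycles
    return cycles, 1.0 / (cycles * cycle_time_s)


cycles_85, rate_85 = max_sample_rate(85)      # relaxed specification
cycles_105, rate_105 = max_sample_rate(105)   # full specification
print(cycles_85, round(rate_85), cycles_105, round(rate_105))
```

For Q = 85 this gives 285 cycles and about 70 kHz; under the same assumptions Q = 105 would need 345 cycles, which corresponds to roughly 58 kHz.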

SARCO is targeted at 18-bit performance. Therefore the input word-length is 18 bits and the output is rounded to 18 bits. Figure 3.8 showed that for the chosen parameters (L1 = 256, L2 = 256, s1 = 137 dB) 18-bit performance for full-scale sine waves is only reached up to a signal frequency of 5 kHz. Improvements are suggested in the following section.

5.4.2 Discussion

During the development and measurement of the sample-rate converter SARCO a deeper understanding of all side effects involved has been gained. The effects of the various parameters on the resulting signal quality have been discussed in detail in the previous chapters. The following improvements to the above implementation make full 20-bit performance possible.

The frequency tracking unit realized on the ASIC reaches only a worst-case accuracy of 14 bit. Replacing this unit with the digital phase-locked loop solution (realized as an FPGA) improves the resolution of the frequency tracking to 20 bit. As an additional benefit the chip area required by this unit is reduced from 13.6 mm² to 6.3 mm².

[Figure: Filter57855bit22QFP-256bit22QFP.eps. Magnitude [dB], 0 to −180 dB, over normalized frequency [f/f_sample], 0 to 0.5]

Figure 5.7: Interpolated FIR filter (N = 57855), L2 = 256, 22 bit QFP

The filter coefficients of SARCO are stored in an external memory. To allow one-cycle access, SRAMs buffered by a lithium battery are used. To reduce the number of pins and the external components of the 20-bit version, the coefficients can be stored on-chip. The parameter Q, which defines the order of the interpolation/decimation filter of SARCO, is programmable for Q ≤ 85 (N ≤ 21 759). For full 20-bit performance a filter order of at least 57 855 is required. Since the impulse response is symmetric, a ROM size of 28 928 × 22 bit is sufficient if the QFP notation is used. For the 1.2 µm technology of SARCO an additional die area of 26 mm² is necessary.

The number of filter coefficients can be substantially reduced by using higher-order coefficient interpolation as outlined in Section 3.2.3.1. A second multiplier is then needed to allow coefficient interpolation in real time.

Figure 5.7 shows the spectrum of the interpolation/decimation filter of order 57 855 using an interpolation factor L2 = 256 and 22 bit QFP quantization. The average quantization noise level is at −186 dB with a peak value of −165 dB. The 22 bit QFP format is therefore well suited for a filter of order 57 855.

The circular buffer for the source samples and the multiplier must be extended from 18 to 20 bit. This leads to an area increase of approximately 1 mm².

The above enhancements from the (nearly) 18-bit implementation of SARCO to a full 20-bit implementation increase the chip area from 61.4 mm² to 81 mm² (+32%).

To estimate the die area of the above 20-bit architecture in a 0.8 µm technology, the main blocks have been retargeted to this technology. Based on these results the author estimates that a die area of approximately 50 mm² is sufficient.

6 Measurement Principle & Setup

In this chapter the setup of an all-digital measurement system, which has been built for the characterization of the sample-rate converter, is described. In addition to the digital measurements, informal listening tests have been performed using a CD player as source.

At first glance it seems to be an easy task to measure the performance of a digital system with digital input and output using DSP-based equipment. However, many particularities must be taken into account for accurate, consistent, and relevant measurement of high-quality digital audio equipment.

The measurement system is composed of generator and analyzer hardware and a workstation (Fig. 6.1). The generator repeatedly applies a sequence of 24-bit audio samples to the device under test (DUT) in real time. The analyzer acquires the response samples, which are then processed, displayed, and analyzed on a workstation using MATLAB [Mat88], a high-performance numeric computation and visualization software.

6.1 Generator

The source sample rate f_source is generated by dividing the frequency f_gen of a quartz oscillator by U. This concept allows for a very stable low-jitter sampling clock source. With a 50 MHz quartz and sampling frequencies in the range of 48 kHz, discrete frequencies with a resolution of 46 Hz can be generated.

[Figure: Ssetup.epsi. Blocks: digitally generated (sine) waves; generator RAM 256k × 24 bit (clocked at f_source); source samples; DUT; sink samples; analyzer RAM 256k × 24 bit (clocked at f_sink); spectrum analysis with MATLAB on a workstation]

Figure 6.1: Setup of all-digital measurement system

A large memory (256 k × 24 bit) holds the source sample values and is read out at the source sample rate. Any sequence of source values of length up to the size of the memory can be loaded into the RAM and is then applied periodically to the DUT. This approach allows the use of dedicated test signals, such as pseudo-random noise, artificially distorted sine waves, and multiple-frequency signals, as well as music signals of a length of up to 6 s.

The ratio of the test signal frequency to the sampling frequency has a significant effect on the spectral content of the signal [HB86]. If the sampling frequency is a multiple of the test signal frequency, the quantization error components are located at the harmonics of the test signal. This is often undesirable, since the true harmonic distortion should be separated from the quantization noise.

If adequate dither is added prior to the quantization, the above effect is reduced, since dither randomizes the quantization process and results in a white-noise characteristic of the quantization noise [VL84, LWV92]. The same effect can be reached if the test signal frequency and the sampling frequency are chosen to be of non-integer (preferably close to irrational) ratio and the analysis is performed over several periods of the test signal. Using this method, the signal degradation caused by the added dither can be avoided.

If the generator RAM contains LOOP values and NP periods of the test signal, the following equations hold:

$$ f_{source} = \frac{1}{U}\, f_{gen} \tag{6.1} $$

$$ f_{sig} = \frac{NP}{LOOP}\, f_{source} = \frac{NP}{LOOP \cdot U}\, f_{gen} \tag{6.2} $$

NP should be chosen to be a (large) prime number to ensure that NP and LOOP have no common factors for reasonable choices of f_sig and f_source. For f_source ≈ 48 kHz and f_sig ≈ 997 Hz, for example, the following parameters result (f_gen = 50 MHz):

U = 1 042; NP = 5 087; LOOP = 244 832

The resulting frequencies are f_source = 47.985 kHz and f_sig = 997.002 Hz.

Using this method, only discrete but closely spaced test signal frequencies can be generated (the next larger signal frequency would be 997.006 Hz in the above example). It can be checked easily that the signal frequency f_sig is not related to the other frequencies by a small integer ratio, neither to f_source nor to f_gen.
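The parameter choice above can be reproduced with a few lines of Python. This is only a sketch of Eqs. 6.1 and 6.2: the function name, the rounding strategy, and the RAM-size check are assumptions made for illustration, not a description of the actual generator software.

```python
from math import gcd

def generator_parameters(f_gen, f_source_target, f_sig_target, n_periods,
                         ram_words=256 * 1024):
    """Pick divider U and loop length LOOP for a requested test tone (sketch)."""
    U = round(f_gen / f_source_target)            # integer clock divider
    f_source = f_gen / U                          # Eq. 6.1
    LOOP = round(n_periods * f_source / f_sig_target)
    if LOOP > ram_words:
        raise ValueError("sequence does not fit into the generator RAM")
    if gcd(n_periods, LOOP) != 1:
        raise ValueError("NP and LOOP share a common factor")
    f_sig = n_periods / LOOP * f_source           # Eq. 6.2
    return U, LOOP, f_source, f_sig

f_gen, NP = 50e6, 5087
U, LOOP, f_source, f_sig = generator_parameters(f_gen, 48e3, 997.0, NP)

# Next larger tone: same NP, one sample fewer in the loop.
f_sig_next = NP / (LOOP - 1) * f_source

# Spacing of the available sample rates around 48 kHz.
step = f_gen / U - f_gen / (U + 1)

print(U, LOOP, f_source, f_sig, f_sig_next, step)
```

With the values of the example (f_gen = 50 MHz, NP = 5 087), the sketch yields the same U and LOOP as listed above.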

6.2 Analyzer

The analyzer is built in a fashion similar to the generator (Fig. 6.1). The analyzer quartz frequency f_anl is divided by V to generate the sink sampling clock f_sink. Alternatively, the sink sample rate can be derived from the generator quartz instead of the separate analyzer quartz to allow synchronous measurements. The samples are acquired at the sink sample rate with 24-bit precision and written into the analyzer RAM. The spectrum of the acquired samples is calculated using the FFT and displayed graphically on the workstation using MATLAB.

The power spectrum of an N-point FFT consists of N discrete frequency values, so-called bins. The frequency resolution is therefore limited to f_sink/N. When calculating the FFT, it is assumed that the sequence of N points is repeated periodically. The discontinuities at the beginning and at the end of the sequence, which may result from this repetition, distort the spectrum. The two well-known methods to cope with this problem are given in the next two sections.

6.2.1 Frequency Matching for the FFT

If the test signal frequency is chosen to lie on a bin, i.e.

$$ f_{sig} = \frac{n\, f_{sink}}{N} \tag{6.3} $$

the data sequence contains exactly n periods of the test signal and no discontinuity occurs at the boundary of the measurement interval. This technique is often used in A/D- and D/A-converter testing [Der87], but is not well suited for sample-rate conversion characterization, since it must be possible to choose the test signal frequency independently of the sink sample rate to comply with the requirements given in the previous section. In addition, this method is only feasible if Equation 6.3 can be met exactly, i.e. if the exact sink sampling frequency is available. A small mismatch leads to the spectral leakage discussed below.

6.2.2 Windowing for the FFT

The second method to avoid the discontinuity at the border of the data sequence is to multiply the data by a window function w[i]. [Har78] compared many different windows and concluded that the Kaiser-Bessel and the Blackman-Harris windows perform best. We use the 4-term Blackman-Harris window (BH4) and the Kaiser-Bessel window with α = 6 (KB6), due to their high sidelobe attenuation.

In Figure 6.2 the spectrum of the BH4 window is plotted. The highest sidelobes are at −92 dB. The 6-dB bandwidth of the window is 2.72 bins. Therefore signal frequencies which are at least 3 bins apart can be distinguished [Har78]. Note that if the signal frequency falls exactly on a bin of the FFT, the BH4 window is sampled at its (equidistant) zero-crossings and the effect of the window disappears. On the other hand, if the signal frequency falls midway between two bins, the BH4 window is sampled on the upper envelope in Figure 6.2.

The parameter α of the Kaiser-Bessel window allows a trade-off between mainlobe width and sidelobe attenuation. The Kaiser-Bessel window for α = 6 is plotted in Figure 6.3. The 6-dB bandwidth of the KB6 window is 3.31 bins, and the highest sidelobes are at −145 dB. The window coefficients are Bessel functions and therefore the zero-crossings are not equidistant.

Figure 6.2: Spectrum of 4-term Blackman-Harris window (BH4) [axes: frequency in bins (−80 to 80), magnitude in dB (−160 to 0)]

Figure 6.3: Spectrum of Kaiser-Bessel window with α = 6 (KB6) [axes: frequency in bins (−80 to 80), magnitude in dB (−180 to 0)]

What is the required accuracy of the measurement system for high-quality digital audio characterization?

For a full-scale sine wave quantized with k bits the quantization noise floor is 6.02 k + 1.76 dB below the signal level. Therefore the window sidelobes of the BH4 window at −92 dB would be only a minor error for 16-bit measurements. However, this estimate is misleading, since the quantization noise is modified by the window function and distributed over N/2 bins of the FFT, resulting in a lower noise level per bin. Several measurements are usually averaged to obtain a uniform distribution of the noise over all bins.

Two characteristic values of a specific window w[i] are its coherent gain cg and its equivalent noise bandwidth nbw [Har78]:

$$ cg = \frac{1}{N} \sum_{i=1}^{N} w[i]; \qquad nbw = \frac{1}{N\, cg^2} \sum_{i=1}^{N} w^2[i] \tag{6.4} $$

The coherent gain cg is the average value of the window. To preserve the energy of the signal, the amplitude of the spectrum is divided by cg. The window function amplifies the noise level by its noise bandwidth nbw. In an N-point FFT the average noise floor for k-bit quantization therefore appears at

$$ -10 \log_{10}\!\left(\frac{N}{2\, nbw}\right) - 6.02\, k - 1.76 \ \mathrm{dB} \tag{6.5} $$

Table 6.1 gives example values of Eq. 6.5 for the BH4 window (nbw = 2.004). The listed values must be increased by 1.0 dB for the KB6 window, due to its larger noise bandwidth (nbw = 2.492).

N        k = 16 bit    k = 18 bit    k = 20 bit
1024     −122.2 dB     −134.2 dB     −146.2 dB
16384    −134.2 dB     −146.2 dB     −158.2 dB
65536    −140.2 dB     −152.2 dB     −164.2 dB

Table 6.1: Average noise floor level of a sine wave with k-bit quantization using an N-point FFT with BH4 window
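The quantities in Eqs. 6.4 and 6.5 are easy to evaluate numerically. The following sketch is not taken from the measurement software: the helper names are hypothetical, the four window coefficients are the commonly published values of the −92 dB 4-term Blackman-Harris window from [Har78], and the window is indexed from 0 to N − 1 in its periodic form rather than from 1 to N.

```python
from math import cos, log10, pi

# Assumed coefficients of the 4-term (-92 dB) Blackman-Harris window [Har78].
BH4_COEFFS = (0.35875, 0.48829, 0.14128, 0.01168)

def bh4_window(N):
    """Periodic 4-term Blackman-Harris window of length N."""
    a0, a1, a2, a3 = BH4_COEFFS
    return [a0 - a1 * cos(2 * pi * i / N) + a2 * cos(4 * pi * i / N)
            - a3 * cos(6 * pi * i / N) for i in range(N)]

def coherent_gain(w):
    """cg of Eq. 6.4: average value of the window."""
    return sum(w) / len(w)

def noise_bandwidth(w):
    """nbw of Eq. 6.4: equivalent noise bandwidth in bins."""
    cg = coherent_gain(w)
    return sum(x * x for x in w) / (len(w) * cg * cg)

def noise_floor_db(N, k, nbw):
    """Average noise floor of Eq. 6.5 for k-bit quantization, in dB."""
    return -(10 * log10(N / (2 * nbw)) + 6.02 * k + 1.76)

w = bh4_window(1024)
cg, nbw = coherent_gain(w), noise_bandwidth(w)
for N in (1024, 16384, 65536):
    print(N, [round(noise_floor_db(N, k, nbw), 1) for k in (16, 18, 20)])
```

Under these assumptions the computed noise bandwidth agrees with nbw = 2.004, and the noise-floor values agree with Table 6.1 to within about 0.1 dB; the small differences in some entries presumably stem from rounding.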

6.2.3 Complex Windows

In Sections 6.2.1 and 6.2.2 we have presented two methods to allow accurate spectrum analysis of high-quality audio signals. Both methods exhibit certain limitations: the frequency matching method requires the precise sampling frequencies to be available, and the performance of the windowing method depends strongly on the chosen window. The BH4 window has sidelobes as high as −92 dB, and the KB6 window needs coefficients which are expensive to calculate and allows only a limited frequency resolution, due to its wider mainlobe.

To overcome the inaccuracies caused by the sidelobes of the BH4 window, the signal frequency must be chosen to fall exactly on a bin as mentioned above. To avoid this restriction we make use of the frequency transformation property of the Fourier transform. The signal frequency is determined by a first FFT and is then shifted slightly onto a bin (Eq. 6.6) without affecting the signal energy.

$$ s' = s \cdot e^{j \Omega t} \tag{6.6} $$

We have called this combination of the complex correction and the window the ‘complex BH4 window’ (BH4C) [RF94]. It is now also used in the newest version of the ‘Audio Precision One’ software. This method is only applicable to windows which have equidistant zero-crossings that can be matched on a bin (i.e. not to the KB6 window).

If the frequency measurement is not accurate enough, remnants of the window will still be visible. Figure 6.4 shows the Hanning, the BH4, and the KB6 window with three different offsets.

The BH4C window is a good compromise between the mainlobe width and the inaccuracies introduced by imprecise frequency measurement. The KB6 window is well suited for multi-tone measurements, where it is not possible (and not desirable) to match all frequencies on a bin.
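The effect of the complex correction of Eq. 6.6 can be illustrated with a small experiment. The sketch below makes several simplifying assumptions that are not part of the text: the test tone is a single complex exponential, its bin offset is taken as known exactly instead of being estimated by a first FFT, the window coefficients are the published values from [Har78], and a direct DFT of only 64 points replaces the FFT. All function names are hypothetical.

```python
import cmath
from math import cos, pi

BH4_COEFFS = (0.35875, 0.48829, 0.14128, 0.01168)  # assumed, from [Har78]

def bh4_window(N):
    a0, a1, a2, a3 = BH4_COEFFS
    return [a0 - a1 * cos(2 * pi * i / N) + a2 * cos(4 * pi * i / N)
            - a3 * cos(6 * pi * i / N) for i in range(N)]

def dft_magnitude(x):
    """Direct O(N^2) DFT; adequate for a short demonstration."""
    N = len(x)
    return [abs(sum(x[i] * cmath.exp(-2j * pi * k * i / N) for i in range(N)))
            for k in range(N)]

def leakage(mag, k0, guard=4):
    """Largest bin outside the mainlobe (more than guard bins from k0),
    relative to the spectral peak."""
    N = len(mag)
    far = [m for k, m in enumerate(mag)
           if min((k - k0) % N, (k0 - k) % N) > guard]
    return max(far) / max(mag)

N = 64
k_true = 10.3                         # tone lies 0.3 bin above bin 10
w = bh4_window(N)
s = [cmath.exp(2j * pi * k_true * i / N) for i in range(N)]

# Eq. 6.6 in sample form: multiply by a complex exponential that moves
# the tone onto the nearest bin (offset assumed to be known here).
offset = k_true - round(k_true)
s_shifted = [s[i] * cmath.exp(-2j * pi * offset * i / N) for i in range(N)]

plain = leakage(dft_magnitude([w[i] * s[i] for i in range(N)]), 10)
corrected = leakage(dft_magnitude([w[i] * s_shifted[i] for i in range(N)]), 10)
print(plain, corrected)
```

Without the shift, the sidelobes of the window appear at roughly the −92 dB level; with the shift, the window is sampled at its zero-crossings and only numerical noise remains outside the mainlobe. For a real-valued sine the image at negative frequency is not moved onto a bin by the same shift, so a small residual is to be expected.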

6.3 Measurement Procedure

This section concludes this chapter with the step-by-step description of the procedure used for the measurements in this thesis.

Figure 6.4: Influence of (bin-)offset on window performance: BH4, Hanning, KB6 [three panels: Offset = 0.000 bin, Offset = 0.004 bin, Offset = 0.500 bin; axes: frequency in bins (−10 to 10), magnitude in dB (−160 to 0)]

1. The source and the sink sample rates are chosen. Since they are generated by dividing the frequency of the generator and the analyzer quartz respectively, the parameters U and V are set. If a particular precise sample rate, such as 44.100 kHz, must be generated, a quartz whose frequency is a multiple of this rate must be used.

2. The source samples are loaded from the computer into the generator RAM. A signal frequency according to Equation 6.2 should be chosen. If this is not possible, dither must be added to the source signal.

3. N sink samples are acquired into the analyzer RAM and transferred to the computer. A first FFT is used to determine the signal frequency. The signal is then multiplied by the complex Blackman-Harris window and a second FFT is calculated in double precision using MATLAB.

To average several (Avg) measurements the signal acquisition is repeated Avg times. Each set of N acquired sink samples is transformed into the frequency domain by the FFT, and the resulting spectra are then averaged.

4. The current standard for measurement of digital audio equipment [Cab91] states that the notch filter to suppress the signal frequency for ‘total harmonic distortion and noise’ (THD+N) measurements should have an ‘electrical Q’ between 1 and 5. We realized the notch filter in the frequency domain by setting the appropriate bins to zero. Two THD+N values are given in each plot: one for Q = 5 and one for a notch filter with a constant width of 16 bins (i.e. Q = 21 for f_sig = 1 kHz, f_sink = 48 kHz, N = 16k).

According to the standard, only noise components up to 20 kHz are taken into account. Therefore it is possible to get THD+N values which are slightly below the minimum for k-bit quantization.
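Step 4 can be sketched as follows. The code is an illustration under stated assumptions and not the routine used for the plots: the function names are hypothetical, the power in the notched bins is taken as the signal power, all remaining bins from DC up to the 20 kHz limit count as distortion and noise, and the Q of a brick-wall notch is taken as the signal frequency divided by the notch width.

```python
from math import log10

def notch_q(f_sig, f_sink, N, width_bins):
    """Equivalent Q of a brick-wall notch that is width_bins bins wide."""
    return f_sig / (width_bins * f_sink / N)

def thd_plus_n_db(power, f_sink, N, sig_bin, half_width, f_max=20e3):
    """THD+N in dB relative to the signal, from a one-sided power spectrum.

    power[k] is the power in bin k of an N-point FFT. Bins within
    half_width of sig_bin form the notch; only bins up to f_max count.
    """
    k_max = min(int(f_max * N / f_sink), len(power) - 1)
    notch = range(sig_bin - half_width, sig_bin + half_width + 1)
    signal = sum(power[k] for k in notch if 0 <= k <= k_max)
    residual = sum(power[k] for k in range(k_max + 1) if k not in notch)
    return 10 * log10(residual / signal)

# Q of the constant 16-bin notch for the settings quoted in step 4.
print(notch_q(1000.0, 48000.0, 16384, 16))

# Tiny synthetic spectrum: N = 16, bins 3 kHz apart, tone in bin 2.
# The large values in bins 7 and 8 lie above 20 kHz and are ignored.
spectrum = [0.0, 1e-6, 1.0, 1e-6, 0.0, 0.0, 0.0, 5.0, 5.0]
print(thd_plus_n_db(spectrum, 48000.0, 16, 2, 0))
```

The first call reproduces the Q of about 21 quoted for the 16-bin notch at 1 kHz.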

For further reference, Figure 6.5 shows the spectrum of an (ideal) sine wave quantized with 18 bit. A 16k-FFT has been calculated using the BH4C window. The signal level is 0 dB, and the total harmonic distortion and noise is the same for the narrow and the standard (Q = 5) notch filter. 16 sequences have been averaged (Avg = 16). The slow rise of the noise floor towards low frequencies is a residual of the BH4C compensation.

Figure 6.5: Sine wave quantized with 18 bit [plot title: 16384 FFT, BH4C, 997Hz, 0dB, THD+N = −110.5dB; −110.5dB, Avg = 16; axes: frequency in Hz (10^2 to 10^4), magnitude in dB (−160 to 0)]

7 Results

In this chapter measurements of our VLSI implementation (SARCO) of the sample-rate converter are presented to verify the theoretical concepts developed in the previous chapters. Measurements of a commercial implementation are added for comparison purposes.

Three major sources of signal degradation due to sample-rate conversion have been identified in this thesis:

Non-uniform sampling due to finite precision frequency tracking

Non-ideal reconstruction/anti-aliasing lowpass filter (including the hold operation, the coefficient interpolation and quantization, and the finite stopband attenuation)

Quantization noise in the datapath

The phase deviation of the computed sink sample moment caused by the finite precision frequency tracking and jitter of the sampling clocks results in a non-uniform resampling of the signal. The phase deviation is modulated onto the audio signal and leads to a wider signal peak. The level of the modulated error signal is proportional to both the signal frequency f_sig and the signal amplitude A. Note, however, that jitter on the sampling clocks is attenuated by the digital PLL and therefore the signal quality may even be improved by the sample-rate conversion.

The zero-order hold operation and the coefficient interpolation lead to spurious signal peaks near the zero-crossings of the interpolation function spectrum (SINC for the hold operation, SINC² for linear interpolation). These peaks are folded back into the baseband by the resampling process. Note that the resulting filter transfer function is still strictly linear phase, since the filter coefficients are exactly symmetric even after interpolation and quantization. The error components caused by the interpolation and hold operation can easily be identified in the spectrum, since they always appear in pairs with (nearly) identical amplitude, but with a frequency difference of twice the signal frequency f_sig. The amplitude of these peaks scales proportionally to the signal amplitude A. The error peaks caused by the hold operation are proportional to the signal frequency f_sig, whereas the peaks caused by the coefficient interpolation scale with (f_sig)².

In Section 7.1 measurements of the distortion caused by the non-ideal frequency tracking are shown and discussed. In Section 7.2 the performance of the entire sample-rate converter is measured and error components caused by the non-ideal filtering are pointed out. Measurements of the error caused by the finite datapath width are not shown, since this quantization is masked by the 18-bit output quantization of SARCO.

7.1 Frequency Tracking Measurements

The phase of the sink samples φ_sink[m] is calculated starting from an initial phase φ_0. It is incremented for each sink sample by the (measured) ratio of source and sink sample rate M[m] (Eq. 5.2, Fig. 4.2). Jitter of the source and sink sampling clock and the finite arithmetic precision of the frequency tracking unit lead to a (small) inaccuracy of the calculated sink sample moment. For the characterization in this section the output of the frequency measurement unit is acquired with the analyzer (Sec. 6.2). This measured data is used to sample an ideal, non-quantized sine wave at the corresponding sink sample moments. The spectrum of this non-uniformly sampled sine wave is then calculated and displayed.

The following measurements are based on source and sink sample rates derived from different quartz oscillators as described in Chapter 6. The sample-rate ratio M[m] is tracked by either the frequency counter integrated on SARCO (Sec. 5.1.1) or the digital PLL realized as FPGA (Sec. 5.1.2). 16 384 values of M[m] are acquired for different source and sink sample-rate pairs using the digital analysis system. Plots of the deviation caused by the non-ideal frequency tracking are shown on Pages 96 and 97. Each plot is subdivided into three traces. The top trace shows the deviation of M[m] from the average value in (16-bit) LSBs. The sink phase is computed as the integral of M[m]. The phase deviation φ_q[m] is shown in the middle trace. The bottom trace shows the (simulated) spectrum of a full-scale sine wave at 997 Hz, which is sampled non-uniformly with the measured phase deviation shown in the middle trace.

The measurements on Page 96 correspond to the asynchronous sample rates f_source = 44.1 kHz and f_sink = 48.0 kHz, whereas on Page 97 measurements for f_source = 48.0 kHz and f_sink = 47.9 kHz are shown. This second case corresponds to plesiochronous sample rates with a difference frequency of 100 Hz. Note that the error distribution of frequency tracking measurements depends on the precise quartz frequencies of generator and analyzer, since their difference frequency may also appear in the measurements.


7.1.1 Frequency Counter

Figures 7.1 and 7.4 show measurements of the performance of the frequency counter. The measured frequency ratio M[m] varies by up to ±2 LSBs (16 bit) even though both sample rates are constant. The resulting phase noise contains both low-frequency and high-frequency components up to −80 dB for a 1 kHz full-scale signal. The frequency and amplitude of the error components depend on the precise frequency of the source and sink sampling clocks and on the clock frequency of the counter. If the counting frequency is unrelated to the sampling frequency, the amplitude of the error peaks in Figures 7.1 and 7.4 is reduced.

7.1.2 Digital PLL

In Figures 7.3 and 7.6 performance measurements of the (20-bit) digital PLL are shown. Due to a limitation in the SARCO implementation the frequency ratio M[m] can only be used with a precision of 16 bits. For comparison, Figures 7.2 and 7.5 show the performance of the PLL with M[m] reduced to 16 bit.

Figure 7.1: Frequency counter 16 bit, 44.1 kHz → 48 kHz

Figure 7.2: Digital PLL 16 bit, 44.1 kHz → 48 kHz

Figure 7.3: Digital PLL 20 bit, 44.1 kHz → 48 kHz

Figure 7.4: Frequency counter 16 bit, 48 kHz → 47.9 kHz

Figure 7.5: Digital PLL 16 bit, 48 kHz → 47.9 kHz

Figure 7.6: Digital PLL 20 bit, 48 kHz → 47.9 kHz

[Figures 7.1 to 7.6 each contain three traces: frequency deviation in 16-bit LSBs versus samples, phase deviation in degrees versus samples, and the phase deviation modulated on a 997 Hz sine in dB versus frequency in Hz]

The output of the PLL frequency tracking unit varies only by one LSB. Due to the lowpass characteristic of the PLL these variations are restricted to low frequencies. Figure 7.3 shows a measurement for f_source = 44.1 kHz and f_sink = 48.0 kHz. During the measurement interval of 16k values the 20-bit LSB (= 0.06 16-bit LSB) switches only once. Because the frequency ratio is in general not a power of 2, it cannot be represented exactly by a single binary number, but is approximated by an appropriate duty cycle of the LSB. Since the LSB of the frequency ratio changes only at low frequencies, its integral, the phase deviation, may reach large values as shown in the middle trace of Figure 7.3. This phase deviation leads to a widening of the signal spectrum below −120 dB as shown in the bottom trace.

Figure 7.2 shows the measurements corresponding to Figure 7.3, but with M[m] reduced to 16 bit. The duty cycle of the (16-bit) LSB is 15 : 1 as shown in the upper trace. The resulting phase deviation has a sawtooth shape with a frequency of approximately 48 kHz/15 = 3.2 kHz, which leads to spectral lines at (0.997 ± 3.2) kHz for a sine wave at 997 Hz, as shown in the bottom trace of Figure 7.2. Note that the spectral line at (0.997 − 3.2) kHz is aliased to 2.2 kHz.
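The positions of the two spectral lines can be checked with a small folding helper. The function below is a generic illustration (its name is hypothetical); it mirrors negative frequencies and folds anything above half the sample rate back into the baseband.

```python
def alias_frequency(f, f_sample):
    """Fold an arbitrary frequency into the baseband 0 ... f_sample / 2."""
    f = abs(f) % f_sample
    return f_sample - f if f > f_sample / 2 else f

f_sig, f_saw, f_sink = 997.0, 3200.0, 48000.0   # values from the text, in Hz
lines = [alias_frequency(f_sig + sign * f_saw, f_sink) for sign in (-1, +1)]
print(lines)
```

For the values of the example the lower line lands at about 2.2 kHz and the upper line at about 4.2 kHz.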

In the plesiochronous case shown in Figures 7.5 and 7.6 the difference frequency of the sample rates (100 Hz) is not totally suppressed by the lowpass of the PLL, but only attenuated. This leads to sidebands at multiples of 100 Hz. These sidebands cause only a minor degradation of the audio quality, since they are all below −110 dB and will additionally be masked by the full-scale 997 Hz signal.

The performance of the digital PLL discussed in this section could be improved further by carefully dithering the computed sink phase φ_sink to distribute the quantization error over the full spectrum. For many implementations the inherent dither caused by the jitter of the sampling clocks may be sufficient.

7.2 System Measurements

In this section measurements of the performance of SARCO using the digital PLL (FPGA) for frequency measurement are shown. Additionally, measurements of a commercial realization (AD1890 [Dev93]) and of SARCO using the frequency counter are given in Appendix D for comparison purposes.

All the following measurements have been carried out using an undithered 18-bit input signal and follow the concepts outlined in Chapter 6. The source and the sink sample rate have been derived from separate quartz oscillators with a nominal frequency of 50 MHz.

16 384 sink samples are acquired, multiplied by the complex 4-term Blackman-Harris window (BH4C), and Fourier transformed. The spectra of 16 consecutive measurements are averaged (Avg = 16). The ‘notch filter’ for the THD+N calculations is realized in the frequency domain either as a 16-bin-wide brick-wall filter (THD+N = −107.4 dB in Fig. 7.7) or by using a filter with an ‘electrical Q’ of 5 (THD+N = −107.7 dB).

Figure 7.7: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz [plot title: 16384 FFT, BH4C, 996Hz, 0dB, THD+N = -107.4dB; -107.7dB, Avg = 16]

Figure 7.7 shows the resulting spectrum for sample-rate conversion from 44.1 kHz to 48.0 kHz. The THD+N almost reaches the ideal value of a requantized 18-bit signal! The THD+N value is 3 dB above the ideal value (Fig. 6.5, THD+N = −110.5 dB) due to the 18-bit output quantization of SARCO.

The error caused by the coefficient interpolation is below the noise floor for a signal frequency of 1 kHz (compare to Figure 3.8, L1 = L2 = 256, Eq. 3.6), but the hold operation causes error components up to −128 dB (Eq. 3.10). For the measurement in Figure 7.7 they do not appear in the audio band, but if source and sink sample rates are exchanged they appear at frequencies around 20 kHz (Fig. 7.8). Since the error components caused by the hold operation are aliased from their original frequency k · L · f_source ± f_sig into the baseband, their frequency depends sensitively on the exact values of f_source and f_sink.

Figure 7.8: SARCO + FPGA: 0 dB 997 Hz, 48.0 → 44.1 kHz

Figure 7.9: SARCO + FPGA: 0 dB 15 kHz, 48.0 → 44.1 kHz

Figure 7.10: SARCO + FPGA: 0 dB 997 Hz, 48.0 → 47.9 kHz

Figure 7.11: SARCO + FPGA: −20 dB 997 Hz, 32.0 → 31.96 kHz [plot title: 16384 FFT, BH4C, 997Hz, -20dB, THD+N = -106.1dB; -106.5dB, Avg = 16]

The amplitude of the error caused by the hold operation scales linearly with the signal frequency. If the signal frequency is increased from 1 kHz to 15 kHz the error components rise by 24 dB (≈ 20 log(15/1)). Figure 7.9 shows the resulting spectrum for a 15 kHz signal converted from 48 kHz to 44.1 kHz. For a signal frequency of 15 kHz the error components caused by the coefficient interpolation also become visible, with an amplitude below −113 dB (Eq. 3.9). Note that both error sources could be drastically reduced by increasing the interpolation factors L1 and L2.
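As a worked check of the two scaling laws stated in this chapter (hold error proportional to the signal frequency, coefficient interpolation error proportional to its square), the rise of both error components between 1 kHz and 15 kHz can be written out as follows. The second line is only an inference from the quadratic scaling and is not a figure quoted in the text.

$$ \Delta E_{hold} = 20 \log_{10}\frac{15\ \mathrm{kHz}}{1\ \mathrm{kHz}} \approx 23.5\ \mathrm{dB} $$

$$ \Delta E_{interp} = 20 \log_{10}\left(\frac{15\ \mathrm{kHz}}{1\ \mathrm{kHz}}\right)^{2} = 40 \log_{10} 15 \approx 47.0\ \mathrm{dB} $$

This is consistent with the coefficient interpolation error being hidden in the noise floor at 1 kHz but visible at 15 kHz.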

Figure 7.10 shows the artifacts caused by the frequency tracking in the plesiochronous case. Sidebands with a modulation frequency of 100 Hz appear 113 dB below the signal peak. The phase noise caused by the non-ideal frequency tracking scales proportionally to the signal amplitude. Figure 7.11 shows the spectrum of a −20 dB signal converted from 32 kHz to 31.96 kHz, resulting in a reduced (absolute) amplitude of the modulated phase error.

For plesiochronous sample rates derived from crystal oscillators, small artifacts may result from the finite precision frequency tracking. However, short-time jitter of the sampling clocks is suppressed by the lowpass characteristic of the PLL. In Figure 7.12 the closed-loop transfer function, and thus the jitter attenuation, is replotted from Figure 4.12. Clock jitter above 20 Hz is attenuated by 50 dB; jitter above 300 Hz even by over 100 dB.

Figure 7.12: Jitter rejection on sampling clocks [axes: frequency in Hz (10^0 to 10^3), jitter gain in dB (−100 to 0)]

To measure the jitter rejection capabilities of the sample-rate converter, the source sampling clock f_source is generated by a signal generator which allows the output frequency to be modulated by an external signal (Fig. 7.13).

Figure 7.13: Jitter measurement set-up [block diagram: signal generator supplying the (jittered) f_source, source samples, D/A converter, sample-rate converter, f_sink]

The source samples and the (jittered) sampling clock are fed to a D/A converter and the spectrum is displayed using an HP35670A dynamic signal analyzer. Figure 7.14 shows the spectrum of a 997 Hz signal with sinusoidal 50 Hz jitter modulated on the source sampling clock. In Figure 7.15 the spectrum of the sink signal is shown. The 50 Hz jitter is attenuated by 70 dB as expected from Figure 7.12. For the measurements in Figures 7.16 and 7.17 triangular 400 Hz jitter at −40 dB was modulated on the source sampling clock. Since 400 Hz jitter is attenuated by more than 100 dB it is suppressed below the 18-bit noise floor in Figure 7.17. Additional jitter measurements are given in Appendix D.

7.3 Discussion

In this chapter we have presented measurements of the frequency tracking unit and the entire sample-rate converter and identified the sources of the different error components. In spite of all these non-idealities, the overall conversion quality is very high. A total harmonic distortion and noise (THD+N) value below −106 dB (−94 dB) has been reached for 1 kHz (15 kHz) full-scale signals. For signals with jittered sampling clocks the quality is even improved!

If we take into account the decreasing sensitivity of the human ear towards higher frequencies and the lower signal amplitude of realistic audio signals, the performance decrease for audio frequencies above 10 kHz is of minor importance. Informal hearing tests revealed no audible artifacts.

As pointed out in Chapter 3 (Fig. 3.8), the conversion quality can be enhanced to 20 bit by increasing the interpolation factors L1 and L2 to 512 and 2048 respectively.

Figure 7.14: Analog source signal with 20 dB 50 Hz sinusoidal jitter of the source sampling clock (expanded scale due to limited resolution of HP35670A Dynamic Signal Analyzer)

Figure 7.15: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with 20 dB 50 Hz sinusoidal jitter of the source sampling clock

Figure 7.16: Analog source signal with −40 dB 400 Hz triangular jitter of the source sampling clock (expanded scale due to limited resolution of HP35670A Dynamic Signal Analyzer)

Figure 7.17: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock

Figure 7.18: Total error contributions of (unquantized) interpolation/decimation filter for sinusoidal audio signals: SARCO, SARCO18, SARCO20 [axes: frequency in Hz (100 to 10000), error in dB (−130 to −90)]

In Figure 7.18 the error contributions caused by the hold operation, the coefficient interpolation, and the finite stopband attenuation are (re)plotted for three different sets of L1 and L2: SARCO (L1 = L2 = 256); SARCO18 using the same basic filter, but with higher coefficient interpolation (L1 = 256, L2 = 1024); SARCO20 redesigned for full 20-bit performance. Note that the plots in Figure 7.18 do not include the degradation caused by the quantization of the filter coefficients and by the finite word-length datapath.

8 Conclusions

In this thesis the concepts for fully digital sample-rate conversion of stereo audio signals between arbitrary sample rates have been covered. It has been shown that a converter with 20-bit audio quality over the full audio band is feasible.

Conceptually, the process of converting the digital source signal into the analog domain by a D/A converter and resampling it at the sink sample rate by an A/D converter is imitated in the digital domain by a high-order interpolation/decimation process. The nearly ideal reconstruction filter, which transmits the baseband unaltered but totally suppresses the higher-order images, cannot be realized efficiently in a direct form, but is approximated by a three-stage digital filter: high-order FIR filter, interpolator, and zero-order hold. A trade-off between the target quality, the memory requirements, and the computational cost must be found for each implementation.

It has been shown that digital lowpass filters of order up to 30 000 can be directly synthesized using a filter design program that has been optimized for (very) high-order filters.

An 18-bit prototype ASIC for sample-rate conversion of audio signals by arbitrary ratios has been developed and fabricated in a standard-cell and macrocell technique using a 1.2 µm double-metal CMOS technology. It contains 230 000 transistors, including 14 kbit of static RAM and an 18 × 22 bit pipelined multiplier. To improve the conversion quality for plesiochronous sample rates, a digital PLL has been implemented in an FPGA and interfaced to the ASIC. Short-time jitter of the sampling clocks is suppressed by the loop filter of the PLL. The adaptive bandwidth of the loop filter allows for vari-speed operation, where fast tracking is required. The ASIC is therefore well suited to synchronize an input with a jittered sampling clock to the master clock of a digital audio system.

In this work we concentrated on the effects caused by digital signal processing and identified the sources of the different error components, but did not evaluate the results with respect to psycho-acoustic phenomena. However, informal hearing tests have been carried out. None of the persons involved could distinguish the sound of the original CD signal from the audio signal converted by the ASIC.

An all-digital measurement system has been built for the characterization of the sample-rate converter in the digital domain. A generator applies in real time a sequence of 24-bit audio samples to the circuit and the analyzer acquires 256k sink values, which are then Fourier transformed. Two windows have been found suitable for spectrum analysis of multirate signals: the complex 4-term Blackman-Harris window (BH4C) and the Kaiser-Bessel window for α = 6 (KB6). Using these windows, accurate spectrum analysis of high-quality signals is possible without constraining the allowed signal frequencies or sample rates.

In this thesis the contribution of various non-idealities to the error in audio sample-rate conversion has been evaluated and listed. This does not imply that digital sample-rate conversion is of low quality; rather, it allows sample-rate converters to be customized for each application according to the required quality and the allowed cost.

The principles of digital sample-rate conversion can be extended to other applications, such as the conversion of rectangular pixels of a CCD camera to square shape for pattern recognition applications.

Appendix A List of Symbols and Integrals

Continuous-time signal       (.)
Discrete-time signal         [.]
Sample rate                  f_sample
Source sample rate           f_source
Sink sample rate             f_sink
Signal frequency             f_sig
Cutoff frequency             f_cutoff
Generator frequency          f_gen
Analyzer frequency           f_anl
Impulse response             h(.)
Transfer characteristic      H(.)
Interpolation filter         H_L
Decimation filter            H_M
Adaptive filter              H_LM
Passband ripple              δ_p
Stopband ripple              δ_s
Passband edge                f_p
Stopband edge                f_s
Modulo division              mod
Integer division             div
Convolution                  ∗
Source samples               x[n]
Sink samples                 y[m]
Interpolation factor         L
Decimation factor            M
Time grid                    t_unit
Initial phase                φ_0
Sink phase                   φ_sink[m]
Phase difference             Δφ[m]
SINC(x)                      sin(πx)/(πx)
RECT(x)                      1 if |x| < 0.5, 0 otherwise

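The following definite integrals are used [GR94]:

$$\int_0^\infty \frac{\sin(a x)}{a x}\,dx = \frac{\pi}{2a} \qquad [a > 0] \tag{A.1}$$

$$\int_0^\infty \left(\frac{\sin(a x)}{a x}\right)^{2} dx = \frac{\pi}{2a} \qquad [a > 0] \tag{A.2}$$

$$\int_0^\infty \left(\frac{\sin(a x)}{a x}\right)^{4} dx = \frac{\pi}{3a} \qquad [a > 0] \tag{A.3}$$

$$\int_0^\infty \left(\frac{\sin(a x)}{a x}\right)^{6} dx = \frac{11\pi}{40a} \qquad [a > 0] \tag{A.4}$$

$$\int_0^\infty \left(\frac{\sin(a x)}{a x}\right)^{8} dx = \frac{151\pi}{630a} \qquad [a > 0] \tag{A.5}$$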
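The closed forms (A.2) to (A.5) can be spot-checked numerically. The following is a minimal sketch using a midpoint rule on a truncated interval; the truncation limit, step count and function name are arbitrary choices made here, and (A.1) is left out because its integrand decays too slowly for this simple approach.

```python
import math

def sinc_power_integral(a, power, upper=1000.0, steps=200_000):
    """Midpoint-rule estimate of the integral of (sin(a x)/(a x))**power over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h  # midpoint of the k-th subinterval, never exactly zero
        total += (math.sin(a * x) / (a * x)) ** power
    return total * h

# Coefficients c such that the closed form reads c * pi / a, taken from (A.2) to (A.5).
CLOSED_FORM = {2: 1 / 2, 4: 1 / 3, 6: 11 / 40, 8: 151 / 630}

if __name__ == "__main__":
    a = 2.0
    for power, c in CLOSED_FORM.items():
        print(power, sinc_power_integral(a, power), c * math.pi / a)
```

For the second power the neglected tail beyond the upper limit is the dominant error, so agreement to about three decimal places is all that should be expected from these settings.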

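The following approximations are used [BK85]:

$$\sin x \approx x + O(x^3) \qquad [x \ll 1] \tag{A.6}$$

$$\sin(n\pi \pm x) \approx \pm x + O(x^3) \qquad [x \ll 1] \tag{A.7}$$

$$\cos(x) \approx 1 + O(x^2) \qquad [x \ll 1] \tag{A.8}$$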

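The following summation formulas are used [BK85]:

$$\sum_{x=1}^{\infty} \frac{1}{x^2} = \frac{\pi^2}{6} \tag{A.9}$$

$$\sum_{x=1}^{\infty} \frac{1}{x^4} = \frac{\pi^4}{90} \tag{A.10}$$

$$\sum_{x=1}^{\infty} \frac{1}{x^6} = \frac{\pi^6}{945} \tag{A.11}$$

$$\sum_{x=1}^{\infty} \frac{1}{x^8} = \frac{\pi^8}{9450} \tag{A.12}$$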
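The summation formulas (A.9) to (A.12) are easy to confirm with partial sums. The sketch below is only a sanity check; the number of terms and the helper name are arbitrary choices made here.

```python
import math

def power_sum(exponent, terms=100_000):
    """Partial sum of 1 / x**exponent for x = 1 .. terms."""
    return sum(1.0 / x ** exponent for x in range(1, terms + 1))

# Closed forms as listed in (A.9) to (A.12).
CLOSED_FORM = {
    2: math.pi ** 2 / 6,
    4: math.pi ** 4 / 90,
    6: math.pi ** 6 / 945,
    8: math.pi ** 8 / 9450,
}

if __name__ == "__main__":
    for exponent, value in CLOSED_FORM.items():
        print(exponent, power_sum(exponent), value)
```

The sum over 1/x^2 converges slowly (the remainder after N terms is about 1/N), while the higher powers agree to machine precision long before the last term.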

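Appendix B
Error Caused by Hold Effect

B.1 Sine Wave

$$E_{hold} = \sum_{j=1}^{\infty} \left( \frac{\sin\left(\pi\,\frac{j f_i \pm f_{sig}}{f_i}\right)}{\pi\,\frac{j f_i \pm f_{sig}}{f_i}} \right)^{2} \tag{B.1}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx \sum_{j=1}^{\infty} 2 \left( \frac{f_{sig}}{f_i\, j} \right)^{2} = 2 \left( \frac{f_{sig}}{f_i} \right)^{2} \sum_{j=1}^{\infty} \frac{1}{j^2} \tag{B.2}$$

and finally, if we substitute $f_i$ by $L f_{source}$ we get a total of

$$E_{hold} \approx \frac{\pi^2}{3} \left( \frac{f_{sig}}{L f_{source}} \right)^{2} \tag{B.3}$$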
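To see how close the approximation (B.3) is to the sum (B.1), both can be evaluated numerically. The sketch below reads the two signs in (B.1) as two image components per multiple of f_i, which is an interpretation suggested by the factor 2 in (B.2). It also assumes that the error is a power ratio, so that decibels are 10 log10; the example values (1 kHz tone, 48 kHz source rate, L = 64) are illustrative only.

```python
import math

def sinc_pi(x):
    """sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def hold_error_sine_exact(f_sig, f_source, L, images=100_000):
    """Truncated evaluation of (B.1), summing both images j*f_i + f_sig and j*f_i - f_sig."""
    r = f_sig / (L * f_source)  # signal frequency relative to f_i
    return sum(sinc_pi(j + r) ** 2 + sinc_pi(j - r) ** 2 for j in range(1, images + 1))

def hold_error_sine_approx(f_sig, f_source, L):
    """Closed-form approximation (B.3)."""
    return (math.pi ** 2 / 3.0) * (f_sig / (L * f_source)) ** 2

def to_db(power_ratio):
    """Power ratio in decibels (assumes 10 log10 is the intended measure)."""
    return 10.0 * math.log10(power_ratio)

if __name__ == "__main__":
    exact = hold_error_sine_exact(1000.0, 48000.0, 64)
    approx = hold_error_sine_approx(1000.0, 48000.0, 64)
    print("exact  [dB]:", to_db(exact))
    print("approx [dB]:", to_db(approx))
```

For signal frequencies far below f_i the two values differ by much less than 0.01 dB, so the approximation is adequate for sizing L.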


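B.2 White Noise

$$E_{hold} = \frac{1}{\Delta f} \sum_{j=1}^{\infty} \int_{j f_i - \frac{\Delta f}{2}}^{j f_i + \frac{\Delta f}{2}} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{2} df \tag{B.4}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx \sum_{j=1}^{\infty} \int_{-\frac{\Delta f}{2}}^{+\frac{\Delta f}{2}} \frac{1}{\Delta f} \left( \frac{f}{f_i\, j} \right)^{2} df = \frac{1}{\Delta f\,(f_i)^2} \sum_{j=1}^{\infty} \frac{1}{j^2} \int_{-\frac{\Delta f}{2}}^{+\frac{\Delta f}{2}} f^2\, df \tag{B.5}$$

$$= \frac{1}{\Delta f\,(f_i)^2}\,\frac{\pi^2}{6}\,\frac{\Delta f^3}{3 \cdot 4} = \frac{\pi^2}{72} \left( \frac{\Delta f}{f_i} \right)^{2} \tag{B.6}$$

with $\Delta f = f_{source}$ and $f_i = L f_{source}$ we finally get

$$E_{hold} \approx \frac{\pi^2}{72\, L^2} \tag{B.7}$$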
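The result (B.7) can be compared with a direct numerical evaluation of (B.4). The sketch below works in the normalized variable u = f / f_i, so that each image band has width 1/L; the number of images, the number of integration points and the function names are arbitrary choices made for this illustration.

```python
import math

def sinc_pi(x):
    """sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def hold_error_white_numeric(L, images=2000, points=200):
    """Midpoint-rule evaluation of (B.4) with delta_f = f_source and f_i = L * f_source."""
    width = 1.0 / L  # delta_f / f_i
    h = width / points
    total = 0.0
    for j in range(1, images + 1):
        start = j - width / 2.0  # lower edge of the j-th image band
        for k in range(points):
            u = start + (k + 0.5) * h
            total += sinc_pi(u) ** 2
    return total * h / width  # the prefactor 1/delta_f becomes 1/width

def hold_error_white_approx(L):
    """Closed-form approximation (B.7)."""
    return math.pi ** 2 / (72.0 * L ** 2)

if __name__ == "__main__":
    for L in (16, 64):
        print(L, hold_error_white_numeric(L), hold_error_white_approx(L))
```

The remaining difference comes from the small-angle approximation and from truncating the sum over images; both shrink as L grows.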

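B.3 Pink Noise

$$E_{hold} \approx 2 \sum_{j=1}^{\infty} \int_{j f_i + 20}^{j f_i + 20\,000} \frac{1}{6 f} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{2} df \tag{B.8}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx 2 \sum_{j=1}^{\infty} \int_{20}^{20\,000} \frac{1}{6 f} \left( \frac{f}{f_i\, j} \right)^{2} df = \frac{2}{6\,(f_i)^2} \sum_{j=1}^{\infty} \frac{1}{j^2} \int_{20}^{20\,000} f\, df \tag{B.9}$$

$$\approx \frac{2}{6\,(f_i)^2}\,\frac{\pi^2}{6}\,\frac{1}{2}\,(20\,000)^2 = \left( \frac{\pi \cdot 20\,000}{6 f_i} \right)^{2} \tag{B.10}$$

with $f_i = L f_{source}$ we finally get

$$E_{hold} \approx \left( \frac{10\,472}{L f_{source}} \right)^{2} \tag{B.11}$$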
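The constant 10 472 in (B.11) is pi * 20 000 / 6 rounded to an integer. The following sketch reproduces it and shows that neglecting the lower integration limit of 20 Hz in (B.10) has no practical effect. The function names and the example values for L and the source rate are illustrative assumptions.

```python
import math

def hold_error_pink(L, f_source, f_low=20.0, f_high=20_000.0, keep_lower_limit=True):
    """Evaluate the right-hand side of (B.9), optionally keeping the 20 Hz lower limit."""
    f_i = L * f_source
    lower = f_low ** 2 if keep_lower_limit else 0.0
    integral_f = 0.5 * (f_high ** 2 - lower)  # integral of f df over the audio band
    return (2.0 / (6.0 * f_i ** 2)) * (math.pi ** 2 / 6.0) * integral_f

def hold_error_pink_final(L, f_source):
    """Final form (B.11) with the rounded constant."""
    return (10_472.0 / (L * f_source)) ** 2

if __name__ == "__main__":
    print("constant:", math.pi * 20_000.0 / 6.0)
    print("with lower limit:", hold_error_pink(64, 48000.0))
    print("final form:      ", hold_error_pink_final(64, 48000.0))
```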

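Appendix C
Error Caused by Linear Interpolation

C.1 Sine Wave

$$E_{hold} = \sum_{j=1}^{\infty} \left( \frac{\sin\left(\pi\,\frac{j f_i \pm f_{sig}}{f_i}\right)}{\pi\,\frac{j f_i \pm f_{sig}}{f_i}} \right)^{4} \tag{C.1}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx \sum_{j=1}^{\infty} 2 \left( \frac{f_{sig}}{f_i\, j} \right)^{4} = 2 \left( \frac{f_{sig}}{f_i} \right)^{4} \sum_{j=1}^{\infty} \frac{1}{j^4} \tag{C.2}$$

and finally, if we substitute $f_i$ by $L f_{source}$ we get a total of

$$E_{hold} \approx \frac{\pi^4}{45} \left( \frac{f_{sig}}{L f_{source}} \right)^{4} \tag{C.3}$$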
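A short numerical comparison illustrates how much linear interpolation gains over the hold operation for a sine wave. The sketch below evaluates (C.1) with both image components per multiple of f_i (an interpretation of the two signs), compares it with (C.3), and contrasts it with the hold result (B.3). Expressing the error in decibels as 10 log10 and the example values are assumptions made here.

```python
import math

def sinc_pi(x):
    """sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def linear_error_sine_exact(f_sig, f_source, L, images=1000):
    """Truncated evaluation of (C.1), summing both images around each multiple of f_i."""
    r = f_sig / (L * f_source)
    return sum(sinc_pi(j + r) ** 4 + sinc_pi(j - r) ** 4 for j in range(1, images + 1))

def linear_error_sine_approx(f_sig, f_source, L):
    """Closed-form approximation (C.3)."""
    return (math.pi ** 4 / 45.0) * (f_sig / (L * f_source)) ** 4

def hold_error_sine_approx(f_sig, f_source, L):
    """Closed-form approximation (B.3), repeated here for comparison."""
    return (math.pi ** 2 / 3.0) * (f_sig / (L * f_source)) ** 2

def to_db(power_ratio):
    """Power ratio in decibels (assumes 10 log10 is the intended measure)."""
    return 10.0 * math.log10(power_ratio)

if __name__ == "__main__":
    args = (1000.0, 48000.0, 64)
    print("hold   [dB]:", to_db(hold_error_sine_approx(*args)))
    print("linear [dB]:", to_db(linear_error_sine_approx(*args)))
    print("linear, summed (C.1) [dB]:", to_db(linear_error_sine_exact(*args)))
```

Because the linear-interpolation error falls with the fourth power of f_sig / f_i, its level in decibels is roughly twice that of the hold error for the same L.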


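C.2 White Noise

$$E_{hold} = \sum_{j=1}^{\infty} \int_{j f_i - \frac{\Delta f}{2}}^{j f_i + \frac{\Delta f}{2}} \frac{1}{\Delta f} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{4} df \tag{C.4}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx \frac{1}{\Delta f} \sum_{j=1}^{\infty} \int_{-\frac{\Delta f}{2}}^{+\frac{\Delta f}{2}} \left( \frac{f}{f_i\, j} \right)^{4} df = \frac{1}{\Delta f\,(f_i)^4} \sum_{j=1}^{\infty} \frac{1}{j^4} \int_{-\frac{\Delta f}{2}}^{+\frac{\Delta f}{2}} f^4\, df \tag{C.5}$$

$$= \frac{1}{\Delta f\,(f_i)^4}\,\frac{\pi^4}{90}\,\frac{\Delta f^5}{5 \cdot 16} = \frac{\pi^4}{7200} \left( \frac{\Delta f}{f_i} \right)^{4} \tag{C.6}$$

with $\Delta f = f_{source}$ and $f_i = L f_{source}$ we finally get

$$E_{hold} \approx \frac{\pi^4}{7200\, L^4} \tag{C.7}$$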
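The closed forms (B.7) and (C.7) can be inverted to estimate the interpolation factor L needed for a given error level with a white-noise signal. This is only a sketch of that arithmetic: it assumes the error is a power ratio expressed in decibels as 10 log10, and the function names and the -120 dB target are hypothetical examples.

```python
import math

def required_L_hold_white(error_db):
    """Smallest (real-valued) L for which (B.7) does not exceed the target error."""
    error = 10.0 ** (error_db / 10.0)
    return math.sqrt(math.pi ** 2 / (72.0 * error))

def required_L_linear_white(error_db):
    """Smallest (real-valued) L for which (C.7) does not exceed the target error."""
    error = 10.0 ** (error_db / 10.0)
    return (math.pi ** 4 / (7200.0 * error)) ** 0.25

if __name__ == "__main__":
    target_db = -120.0
    print("hold only:           L >=", math.ceil(required_L_hold_white(target_db)))
    print("linear interpolation: L >=", math.ceil(required_L_linear_white(target_db)))
```

The gap of roughly three orders of magnitude between the two factors shows why a linear interpolation stage ahead of the hold operation reduces the required interpolation factor so strongly.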

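C.3 Pink Noise

$$E_{hold} \approx 2 \sum_{j=1}^{\infty} \int_{j f_i + 20}^{j f_i + 20\,000} \frac{1}{6 f} \left( \frac{\sin\left(\pi \frac{f}{f_i}\right)}{\pi \frac{f}{f_i}} \right)^{4} df \tag{C.8}$$

for $f_{sig} \ll f_i$ we get

$$E_{hold} \approx 2 \sum_{j=1}^{\infty} \int_{20}^{20\,000} \frac{1}{6 f} \left( \frac{f}{f_i\, j} \right)^{4} df = \frac{2}{6\,(f_i)^4} \sum_{j=1}^{\infty} \frac{1}{j^4} \int_{20}^{20\,000} f^3\, df \tag{C.9}$$

$$\approx \frac{2}{6\,(f_i)^4}\,\frac{\pi^4}{90}\,\frac{1}{4}\,(20\,000)^4 = \frac{1}{1080} \left( \frac{\pi \cdot 20\,000}{f_i} \right)^{4} \tag{C.10}$$

with $f_i = L f_{source}$ we finally get

$$E_{hold} \approx \left( \frac{10\,960}{L f_{source}} \right)^{4} \tag{C.11}$$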
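The constant 10 960 in (C.11) follows from (C.10) as pi * 20 000 divided by the fourth root of 1080. The sketch below reproduces it and compares the final form with the right-hand side of (C.9); the function names and the example values for L and the source rate are illustrative assumptions.

```python
import math

def linear_error_pink(L, f_source, f_low=20.0, f_high=20_000.0):
    """Evaluate the right-hand side of (C.9), keeping the 20 Hz lower limit."""
    f_i = L * f_source
    integral_f3 = 0.25 * (f_high ** 4 - f_low ** 4)  # integral of f^3 df over the audio band
    return (2.0 / (6.0 * f_i ** 4)) * (math.pi ** 4 / 90.0) * integral_f3

def linear_error_pink_final(L, f_source):
    """Final form (C.11) with the rounded constant."""
    return (10_960.0 / (L * f_source)) ** 4

# Unrounded value of the constant in (C.11).
CONSTANT = math.pi * 20_000.0 / 1080.0 ** 0.25

if __name__ == "__main__":
    print("constant:", CONSTANT)
    print("from (C.9): ", linear_error_pink(64, 48000.0))
    print("from (C.11):", linear_error_pink_final(64, 48000.0))
```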

Appendix D
Performance Measurements

[Plot AD1890-44k-48k-1k.eps: sin_44k_997_0dB_18_, ser_48k_16k_async, AD1890]

Figure D.1: AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz

[Plot PROTO-44k-48k-1k.eps: sin_44k_997_0dB_18_, par_48k_16k_async, PROTO]

Figure D.2: SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz


Figure D.3: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz

[Plot AD1890-48k-44k-1k.eps]

Figure D.4: AD1890: 0 dB 997 Hz, 48.0 → 44.1 kHz

Figure D.5: SARCO: 0 dB 997 Hz, 48.0 → 44.1 kHz

[Figure D.6: plot and caption not preserved]




[Plot AD1890-48k-44k-15k.eps]

Figure D.7: AD1890: 0 dB 15 kHz, 48.0 → 44.1 kHz


Figure D.8: SARCO: 0 dB 15 kHz, 48.0 → 44.1 kHz


Figure D.9: SARCO + FPGA: 0 dB 15 kHz, 48.0 → 44.1 kHz

[Plot AD1890-48k-47k-1k.eps]

Figure D.10: AD1890: 0 dB 997 Hz, 48.0 → 47.9 kHz

Figure D.11: SARCO: 0 dB 997 Hz, 48.0 → 47.9 kHz

[Figure D.12: caption not preserved; panel title: sin_48k_997_0dB_18_, ser_47k_16k_sync, SARCO]



[Plot AD1890-32k-31k-1k.eps: sin_32k_997_-20dB_18_, ser_31k_16k_async, AD1890]

Figure D.13: AD1890: −20 dB 997 Hz, 32.0 → 31.96 kHz

[Plot: sin_32k_997_-20dB_18_, par_31k_16k_async, PROTO]

Figure D.14: SARCO: −20 dB 997 Hz, 32.0 → 31.96 kHz

[Plot: sin_32k_997_-20dB_18_, par_31k_16k_async, SARCO]

Figure D.15: SARCO + FPGA: −20 dB 997 Hz, 32.0 → 31.96 kHz

[Plot AD1890-twin.eps, level in dB over a linear frequency axis up to 2 × 10^4: 16384 FFT, KB6, 12000Hz, −6dB, THD+N = −6.0dB; −102.4dB, Avg = 16; sin_44k_11000_0dB_18__twin, ser_48k_16k_async, AD1890]

Figure D.16: AD1890: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz

[Plot PROTO-twin.eps, level in dB over a linear frequency axis up to 2 × 10^4: sin_44k_11000_0dB_18__twin, ser_48k_16k_async, PROTO]

Figure D.17: SARCO: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz

[Plot SARCO-twin.eps, level in dB over a linear frequency axis up to 2 × 10^4: sin_44k_11000_0dB_18__twin, ser_48k_16k_async, SARCO]

Figure D.18: SARCO + FPGA: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz

[Plot AD1890-jitter.eps, level in dB]

Figure D.19: AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock

[Plot PROTO-jitter.eps, level in dB: sin_44k_997_0dB_18_, ser_48k_16k_async, PROTO]

Figure D.20: SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock

[Plot Source-jitter.epsi]

Figure D.21: Analog source signal with wideband jitter of the source sampling clock (expanded scale due to limited resolution of HP35670A Dynamic Signal Analyzer)

[Plot SARCO-jitter.eps, level in dB]

Figure D.22: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock

[Plot AD1890-jitter-50Hz-20dB.eps, level in dB]

Figure D.23: AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock

[Plot PROTO-jitter-50Hz-20dB.eps, level in dB]

Figure D.24: SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock


Figure D.25: Analog source signal with −20 dB 50 Hz sinusoidal jitter of the source sampling clock (expanded scale due to limited resolution of HP35670A Dynamic Signal Analyzer)


Figure D.26: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock

[Plot AD1890-jitter-400Hz-40dB.eps, level in dB]

Figure D.27: AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock

[Plot PROTO-jitter-400Hz-40dB.eps, level in dB]

Figure D.28: SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock


Figure D.29: Analog source signal with −40 dB 400 Hz triangular jitter of the source sampling clock (expanded scale due to limited resolution of HP35670A Dynamic Signal Analyzer)


Figure D.30: SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock

List of Figures

2.1 D/A conversion followed by A/D conversion
2.2 Spectrum of example signal
2.3 Pink noise $S(f) = 1/(6f)$, white noise $S(f) = 1/\Delta f$
2.4 Spectrum of a sampled signal with sample rate $f_{sample}$
2.5 Interpolation by a factor $L$
2.6 Decimation by a factor $M$
2.7 Sample-rate conversion by a fixed ratio $M/L$
2.8 Sample-rate conversion by a rational ratio $M/L$
2.9 UPMODE, $f_{sink} > f_{source}$, $L > M$
2.10 DOWNMODE, $f_{sink} < f_{source}$, $L < M$
2.11 Continuous-time representation $y(t)$ by holding value in-between samples $x[n]$
2.12 Sample-rate conversion by an arbitrary ratio $M/L$
2.13 Interpolation by $L$ and hold operation
3.1 Filter with finite stopband attenuation
3.2 Weighted stopband error
3.3 Single stage FIR filter followed by hold operation
3.4 FIR filter followed by linear interpolation
3.5 FIR filter followed by discrete-time linear interpolation
3.6 Transfer characteristic of FIR filter, linear interpolator, and hold operation
3.7 FIR filter and linear interpolation followed by hold operation
3.8 Error contributions vs signal frequency for sinusoidal audio signals
3.9 FIR filter with linear interpolator
3.10 FIR with interpolated impulse response
3.11 Separate interpolation and decimation filters
3.12 Filter specifications
3.13 Graphical representation of Table 3.8
3.14 Scaling of prototype filter by a factor of 4 using zero padding
3.15 Comparison of zero-padded prototype (factor 16) and direct synthesis
3.16 Relaxed specifications with intermediate stopband
3.17 Filter with relaxed specifications ($f_{sample}$ = 48 kHz)
3.18 FIR filter, $N$ = 26 879, 20 bit coefficients
3.19 Impulse response of FIR filter using fixed-point coefficients
3.20 Interpolated FIR filter, $L_2$ = 64
3.21 Quantization noise of interpolated FIR filter, $L_2$ = 64, 22 bit
3.22 FIR filter, $L_2$ = 64, quantized twice to 22 bit
3.23 FIR filter, $L_2$ = 64, quantized twice to 22 bit QFP

4.1 Effects of non-uniform sampling on the signal spectrum
4.2 Phase relations
4.3 Frequency counter with PLL
4.4 Frequency counter with high-speed reference clock
4.5 Digital PLL (DPLL)
4.6 Synchronization with Gray counter
4.7 Open-loop transfer characteristic of the DPLL
4.8 Adaptive digital PLL
4.9 Simulation of distortion caused by uniform jitter of $M$
4.10 Spectrum of moving average filter
4.11 Simulation of frequency counter followed by moving average filter ($f_{sample}$ = 48 kHz, $f_{sig}$ = 997 Hz, $k$ = 7)
4.12 Closed-loop transfer characteristic of the DPLL
5.1 Block diagram
5.2 Implementation of digital PLL
5.3 Datapath: word-lengths for FIR filter
5.4 Block diagram using DSP and FPGA
5.5 Floorplan of SARCO
5.6 Photomicrograph of SARCO
5.7 Interpolated FIR filter ($N$ = 57855), $L_2$ = 256, 22 bit QFP
6.1 Setup of all-digital measurement system
6.2 Spectrum of 4-term Blackman-Harris window (BH4)
6.3 Spectrum of Kaiser-Bessel window with a parameter value of 6 (KB6)
6.4 Influence of (bin-)offset on window performance
6.5 Sine wave quantized with 18 bit
7.1 Frequency counter 16 bit, 44.1 kHz → 48 kHz
7.2 Digital PLL 16 bit, 44.1 kHz → 48 kHz
7.3 Digital PLL 20 bit, 44.1 kHz → 48 kHz
7.4 Frequency counter 16 bit, 48 kHz → 47.9 kHz
7.5 Digital PLL 16 bit, 48 kHz → 47.9 kHz
7.6 Digital PLL 20 bit, 48 kHz → 47.9 kHz
7.7 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz
7.9 SARCO + FPGA: 0 dB 15 kHz, 48.0 → 44.1 kHz
7.11 SARCO + FPGA: −20 dB 997 Hz, 32.0 → 31.96 kHz
7.12 Jitter rejection on sampling clocks
7.13 Jitter measurement set-up
7.14 Analog source signal with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
7.15 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
7.16 Analog source signal with −40 dB 400 Hz triangular jitter of the source sampling clock
7.17 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock
7.18 Total error contributions of (unquantized) interpolation/decimation filter for sinusoidal audio signals

D.1 AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz
D.2 SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz
D.3 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz
D.4 AD1890: 0 dB 997 Hz, 48.0 → 44.1 kHz
D.5 SARCO: 0 dB 997 Hz, 48.0 → 44.1 kHz
D.7 AD1890: 0 dB 15 kHz, 48.0 → 44.1 kHz
D.8 SARCO: 0 dB 15 kHz, 48.0 → 44.1 kHz
D.9 SARCO + FPGA: 0 dB 15 kHz, 48.0 → 44.1 kHz
D.10 AD1890: 0 dB 997 Hz, 48.0 → 47.9 kHz
D.11 SARCO: 0 dB 997 Hz, 48.0 → 47.9 kHz
D.13 AD1890: −20 dB 997 Hz, 32.0 → 31.96 kHz
D.14 SARCO: −20 dB 997 Hz, 32.0 → 31.96 kHz
D.15 SARCO + FPGA: −20 dB 997 Hz, 32.0 → 31.96 kHz
D.16 AD1890: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz
D.17 SARCO: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz
D.18 SARCO + FPGA: −6 dB 11 + 12 kHz, 44.1 → 48.0 kHz
D.19 AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock
D.20 SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock
D.21 Analog source signal with wideband jitter of the source sampling clock
D.22 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with wideband jitter of the source sampling clock
D.23 AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
D.24 SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
D.25 Analog source signal with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
D.26 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −20 dB 50 Hz sinusoidal jitter of the source sampling clock
D.27 AD1890: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock
D.28 SARCO: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock
D.29 Analog source signal with −40 dB 400 Hz triangular jitter of the source sampling clock
D.30 SARCO + FPGA: 0 dB 997 Hz, 44.1 → 48.0 kHz with −40 dB 400 Hz triangular jitter of the source sampling clock

List of Tables

2.1 Error caused by hold operation ($f_{source}$ = 48 kHz)
3.1 Transition bandwidth for common sampling frequencies
3.2 Estimated filter order
3.3 Error caused by linear interpolation ($f_{source}$ = 48 kHz)
3.4 Error contributions from interpolation and finite stopband attenuation (white noise signal)
3.5 Estimated filter order
3.6 Coefficient interpolation and filter order
3.7 Filter Requirements
3.8 Runtime and deviation depending on
3.9 Stopband attenuation depending on coefficient precision
3.10 Quantization noise caused by second quantization
5.1 Implementation alternatives for frequency tracking
5.2 Datapath width in bits
6.1 Average noise floor level of a sine wave with $k$-bit quantization using an $N$-point FFT with BH4 window

Curriculum Vitae

I was born in Thalwil, Switzerland, on August 24, 1965. After finishing high school at the Kantonsschule Rämibühl Zurich (Matura Typ C) in 1984, I enrolled in Electrical Engineering at the Swiss Federal Institute of Technology (ETH) Zurich. I received an M.Sc. in Electrical Engineering (Dipl. El.-Ing. ETH) in 1990 and then joined the Integrated Systems Laboratory of the ETH, where I worked as a research and teaching assistant in the ASIC design and test group. Besides the work on sample-rate conversion presented in this thesis, my second main project was the evaluation of template matching algorithms and the realization of an ASIC for real-time 2D correlation of gray-scale images for an industrial process automation system.

Bibliography

[Ada93] Robert Adams. Jitter analysis of asynchronous sample-rate conversion. In Proceedings of the 95th Convention of the AES, Preprint # 3712. Audio Engineering Society, 1993.

[AK93] Robert Adams and Tom Kwan. Theory and VLSI architectures for asynchronous sample-rate converters. Journal of the Audio Eng. Soc. (AES), 41(7/8):539 – 555, July/August 1993.

[Ant93] A. Antoniou. Digital Filters: Analysis, Design, and Applications. McGraw-Hill Book Company, second edition, 1993.

[AW85] John W. Adams and Alan N. Willson. On the fast design of high-order FIR digital filters. IEEE Transactions on Circuits and Systems, 32(9):958 – 960, September 1985.

[Ben88] K. Blair Benson. Audio engineering handbook. McGraw-Hill Book Company, first edition, 1988.

[Bes93] Roland E. Best. Phase-locked loops. McGraw-Hill Book Company, second edition, 1993.

[BK85] I.N. Bronstein and K.A. Semendjajew. Taschenbuch der Mathematik. Verlag Harri Deutsch, 22nd edition, 1985.

[Cab91] Richard Cabot. AES standard method for digital audio engineering – measurement of digital audio equipment. Journal of the Audio Eng. Soc. (AES), 39(12):962 – 975, December 1991.

[CDPS91] S. Cucchi, F. Desinan, G. Parladori, and G. Sicuranza. DSP implementation of arbitrary sampling frequency conversion for high quality sound applications. In Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing, pages 3609 – 3612, Vol. 5, Toronto, 1991.

[CR81] Ronald E. Crochiere and Lawrence R. Rabiner. Interpolation and decimation of digital signals - a tutorial review. In Proceedings of the IEEE, pages 300–331, 1981.

[CR83] Ronald E. Crochiere and Lawrence R. Rabiner. Multirate Digital Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey 07632, 1983.

[dD89] Richard C. den Dulk. An approach to systematic phase-lock loop design. PhD thesis, TU Delft, 1989.

[Der87] F. Deravi. Design issues in FFT test systems for high performance converters. Colloquium on ’Advanced A/D Conversion Techniques’, 15. April 1987.

[Dev93] Analog Devices. Stereo asynchronous sample rate converters AD1890/AD1891. Datasheet, 1993.

[dGL94] R. de Gaudenzi and M. Luise. Audio and video digital radio broadcasting systems and techniques. Elsevier Science B.V., Amsterdam, 1994.

[Fet90] A. Fettweis. Elemente nachrichtentechnischer Systeme. B. G. Teubner, Stuttgart, 1990.

[GR94] I.S. Gradshteyn and I.M. Ryzhik. Table of integrals, series, and products. Academic Press, 1994.

[Har78] Fredric J. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE, 66(1):51 – 83, January 1978.

[Har90] Steven Harris. The effects of sampling clock jitter on Nyquist sampling analog-to-digital converters, and on oversampling delta-sigma ADCs. Journal of the Audio Eng. Soc. (AES), 38(7/8):537 – 542, July 1990.

[HB86] Joel M. Halbert and R. Allan Belcher. Selection of test signals for DSP-based testing of digital audio systems. Journal of the Audio Eng. Soc. (AES), 34(7/8):546 – 555, July 1986.

[JTG+94] J. Janssen, D. Therssen, J. Van Ginderdeuren, L. Van Paepegem, P. Van Lierop, Z.L. Wu, and H. Verhoeven. A new principle/IC for audio sampling rate conversion. In Proceedings of the 96th Convention of the AES, Preprint # 3807. Audio Engineering Society, 1994.

[Lag82] Roger Lagadec. Digital sampling frequency conversion. In Digital Audio, Collected Papers from the AES Premiere Conference, New York, pages 90 – 96. Audio Engineering Society, 1982.

[LK81] Roger Lagadec and Henry O. Kunz. An universal, digital sampling frequency converter for digital audio. In Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing, pages 595 – 598, Vol. 2, Atlanta, 1981.

[LPW82] Roger Lagadec, Daniele Pelloni, and Daniel Weiss. A 2-channel, 16-bit digital sampling frequency converter for professional digital audio. In Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing, pages 93 – 96, Vol. 1, Paris, 1982.

[LS91] Gunnar Lehtinen and Andreas Steiner. Digital audio sampling converter. Master’s thesis, Integrated Systems Lab, Swiss Federal Institute of Technology (ETH Zurich), 1991.

[LWV92] Stanley P. Lipshitz, Robert A. Wannamaker, and John Vanderkooy. Quantization and dither: A theoretical survey. Journal of the Audio Eng. Soc. (AES), 40(5):355 – 375, May 1992.

[Mar93] Robert J. Marks II. Advanced Topics in Shannon Sampling and Interpolation Theory. Springer-Verlag New York, 1993.

[Mat88] Matlab. Matlab user’s manual. MathWorks, Inc., Natick, MA, 1988.

[Mot88] Motorola. DSP56000 digital signal processor user’s manual, 1988.

[MPR73] J.H. McClellan, T.W. Parks, and L.R. Rabiner. A computer program for designing optimum FIR linear phase digital filters. IEEE Trans. Audio Electroacoust., AU-21(6):506 – 526, December 1973.

[MS93] D. Muller and Ch. Siegrist. Frequency measurement for audio sampling rate conversion. Semesterarbeit, 1993. Integrated Systems Lab, Swiss Federal Institute of Technology (ETH Zurich).

[Pel82] Daniele Pelloni. Der einstufige Abtastratenumsetzer. Unpublished, 1982.

[PHR91] Sangil Park, Garth Hillman, and Roman Robles. A novel structure for real-time digital sample-rate converters with finite precision error analysis. In Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing, pages 3613 – 3616, Vol. 5, Toronto, 1991.

[Rab73] L.R. Rabiner. Approximate design relationships for lowpass FIR digital filters. IEEE Trans. Audio Electroacoust., AU-21:456 – 460, October 1973.

[Ram82] Tor A. Ramstad. Sample-rate conversion by arbitrary ratios. In Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing, pages 101 – 104, Vol. 1, Paris, 1982.

[Ram84] Tor A. Ramstad. Digital methods for conversion between arbitrary sampling frequencies. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(3):577 – 591, June 1984.

[RF94] F. Rothacher and N. Felber. VLSI implementation of a fully digital asynchronous audio sample-rate converter. In Proceedings of the 96th Convention of the AES, Preprint # 3832. Audio Engineering Society, 1994.

[RG75] Lawrence R. Rabiner and Bernard Gold. Theory and Application of Digital Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey 07632, 1975.

[RW90] Fritz Rothacher and Andreas Wieland. DIDI: Audio sampling-frequency converter for arbitrary changing ratios. Master’s thesis, Integrated Systems Lab, Swiss Federal Institute of Technology (ETH Zurich), 1990.

[Sti92] Eduard F. Stikvoort. Some Subjects in Digital Audio, Noise Shaping, Sampling-Rate Conversion, Dynamic Range Compression and Testing. PhD thesis, TU Eindhoven, 1992.

[VL84] John Vanderkooy and Stanley P. Lipshitz. Resolution below the least significant bit in digital systems with dither. Journal of the Audio Eng. Soc. (AES), 32(3):106 – 113, March 1984. Corrections in JAES 32(11):889.
