3 signal processing essentials representation of audio signals

Upload: ciro3039

Post on 06-Apr-2018


  • 8/3/2019 3 Signal Processing Essentials Representation of Audio Signals

    1/73

    Signal Processing Essentials

    Representation of Audio Signals

    06.10.2011, 52.219, Waldo Nogueira

    Music Technology Group

    My Self

    Contact Details
    Email: [email protected]
    Telephone: +34 93 542 2806
    Web: https://dtic.upf.edu/~wnogueira (under construction)
    Office: 55.314
    Address: Music Technology Group, Tànger Building - Communication Campus-Poblenou, Universitat Pompeu Fabra, Roc Boronat, 138, 08018 Barcelona
    Moodle:

    Contents

    Review of the last 3 Lectures: From Analogue to Digital

    Lecture 1: Introduction

    Medium, data, codes and artifacts
    The window of hearing. Historical overview of digital coding
    Nyquist theorem: sampling and aliasing
    Characteristics of the analog anti-aliasing and reconstruction filters
    Standard sampling frequencies (32/44.1/48 kHz)
    Sample and hold in A/D conversion
    Importance of the sampling clock: jitter and PLL
    Aperture effect (in A/D and in D/A reconstruction)

    Lecture 2: Quantization

    Quantization
    Quantization error (harmonic distortion)
    Dither: definition and techniques (rectangular, triangular, Gaussian)
    Applications of dither and selection criteria in the professional environment
    Basic implementation technology of A/D/A conversion

    Lecture 3: Sampling, PCM and more

    Non-PCM coding: differential (delta, DPCM)
    Other PCM codings: sigma-delta (Σ-Δ), Σ-DPCM
    Oversampling: definition, information theory
    Noise shaping (NS): definition and implementation examples
    Overall diagram of A/D/A conversion (with oversampling and NS)
    Commercial converters in the recording studio: parameters and criteria
    Relationship between commercial formats and conversion types (CD, DVD-A, SACD, solid state)

    Contents for Today: Part I/II

    After Analogue to Digital conversion

    Part I: Introduction and Overview
      Source Coding in a digital information system
      Quantization and Coding of signals with PCM
      Problem: Large amount of information
      PCM for Audio and Speech Signals
      Information Theory: The message
      Overview: Structure of the next 6 Lectures

    Part II: Signal Processing Fundamentals
      Spectra of Analogue Signals, Convolution and Filtering, Uniform Sampling
      Discrete-time signal processing (Transforms, DFT, DCT, STFT)
      Difference equations and digital filters (Poles, Zeros, Freq. Resp.)
      Review of Multirate signal processing (Down/Up-sampling, QMF)
      Discrete-Time Random Signals
      Information Theory

    Contents

    PART I

    Digital Coding

    Representation of analog signals using binary digital sequences

    Goals:
      Acceptable quality with the minimum number of bits
      For a fixed speed (bit rate), the minimum degradation

    Source coding in a digital information system

    [Figure: block diagram. Source → Source Coder → Channel Coder → Modulator → Channel (Noise + Interference) → Demodulator → Channel Decoder → Source Decoder → Sink. Labels: Binary Symbols, Discrete Channel.]

    Quantization and Coding of Signals with PCM

    [Figure: axis label "Time"]

    Problem

    The digital representation of audio signals requires relatively large data rates.

    Digital audio signal = 24 x digital speech signal
    2 x 768 kbit/s = 24 x 64 kbit/s

    PCM Format for Audio and Speech Signals

    Advantages of digital coding

    Regeneration: possibility to clean the pulses. When reconstructing the signal it is possible to improve the SNR. In analog communications, if the signal is amplified, the noise is amplified too.
    Error protection
    Encryption
    Storage
    Multiplexing (time and frequency)
    Less degradation
    Use of signal processing techniques

    Bit Rate

    At the source coder output we obtain:
      Bits/sample: R
      Sampling rate: Fs
      Transmission speed: I = Fs · R bits/second

    [Figure: Source → Sampling & Quantization → Source Coder. Fs: samples/s, R: bits/sample.]

    How can the transmission rate be reduced, or used optimally? Information Theory
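The relation I = Fs · R can be checked against the rates quoted on these slides. The sketch below is only illustrative: the sampling rates and word lengths are assumed, commonly used PCM settings (8 kHz / 8 bit for telephone speech, 44.1 kHz / 16 bit for CD, 48 kHz / 16 bit for studio audio); they are not stated in the slides, and `bit_rate` is a hypothetical helper name.

```python
def bit_rate(fs_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate I = Fs * R (times the channel count), in bit/s."""
    return fs_hz * bits_per_sample * channels

# Assumed PCM settings (not given in the slides).
speech = bit_rate(8_000, 8)             # telephone-quality speech
cd_mono = bit_rate(44_100, 16)          # one CD channel
cd_stereo = bit_rate(44_100, 16, 2)     # both CD channels
studio = bit_rate(48_000, 16, 2)        # 2 x 768 kbit/s

for name, rate in [("speech", speech), ("CD mono", cd_mono),
                   ("CD stereo", cd_stereo), ("48 kHz stereo", studio)]:
    print(f"{name:14s} {rate / 1000:8.1f} kbit/s")
print("audio / speech ratio:", studio // speech)
```

Under these assumptions the results line up with the 64 kbit/s, 706 kbit/s, 1410 kbit/s and 2 x 768 kbit/s figures that appear in the deck.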

    Information Theory: Message

    [Figure: the message is divided into redundant (known) and not redundant (unknown) parts, and into irrelevant (no meaning) and relevant (has meaning) parts. Other labels: R, H, Interesting.]

    General Block Diagram of an Audio Coder

    [Figure: block diagram with the blocks Time-Frequency Transform, Quantization and Coding (three parallel blocks), Frame Construction, and Psychoacoustic Model. Input: PCM signal, 706 kbit/s. Output: coded audio signal, 128 kbit/s.]

    Evaluation of coding schemes

    4 parameters to evaluate a coding scheme:
    1) Speed
         64 kbps      2.4 kbps
         256 kbps     16 kbps
         1410 kbps    128 kbps
    2) Quality
         Objective measures
         Subjective measures
    3) Complexity
         MIPS
    4) Delay
         < 100 ms

    Overview

    Theory 1, 2, 3, 4.5 (until slide 27): Done by Enric Gine
    Theory 4.5: Signal Processing Essentials
    Theory 5: Quantization and Entropy Coding (coming from sigma-delta)
    Theory 6: Time-Frequency Analysis
    Theory 7: Psychoacoustics & Psychoacoustic Models
    Theory 8: Bit Allocation Strategies
    Theory 9: Linear Prediction and Narrow/Wideband Coding

    Seminar 1, 2, 3: Done by Enric Gine
    Seminar 4: Audio Coding Standards 1 (CELP, AC-3, mp-1-2-3, AAC)
    Seminar 5: Audio Coding Standards 2 (MPEG Surround, USAC)
    Seminar 6: Lossless Audio Coding
    Seminar 7: Quality Measures for Audio Coding (First Seminar?)
      Calibration of the system from input/output (look at Watkinson)
    Seminar 8: Sinusoidal Coders

    Overview: Labs

    Lab 1: Enric
    Lab 2: Signal Processing Essentials
    Lab 3: Quantization & Entropy Coding
    Lab 4: Psychoacoustics
    Lab 5: To be defined!!!

    General Block Diagram of an Audio Coder

    [Figure: block diagram with the blocks Time-Frequency Transform, Quantization and Coding (three parallel blocks), Frame Construction, and Psychoacoustic Model. Input: PCM signal, 706 kbit/s. Output: coded audio signal, 128 kbit/s. The blocks are annotated with the lectures that cover them: Theory 5, Theory 6, Theory 7, Theory 8.]

    Contents

    PART II

    Contents

    PART II:
    Introduction
      Spectra of Analog Signals
      Review of convolution and filtering
      Uniform Sampling
    Discrete-time signal processing
      Transforms for Discrete-Time Signals
      The Discrete and the Fast Fourier Transform
      The Discrete Cosine Transform
      The Short-Time Fourier Transform
    Difference equations and digital filters
    The transfer and the frequency response functions
      Poles, Zeros, and Frequency Response
      Examples of digital filters for audio applications
    Review of Multirate signal processing
      Down-sampling by an integer
      Up-sampling by an integer
      Sampling rate changes by noninteger factors
      Quadrature mirror filter banks
    Discrete-Time Random Signals
      Random signals processed by LTI digital filters
      Autocorrelation estimation from finite-length data
    Review on Information Theory
      Entropy, Shannon, Rate Distortion

    Spectra of Analog Signals (1)

    The frequency spectrum of an analog signal is described in terms of the continuous Fourier transform (CFT):

    $X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$,

    where $\omega$ is the frequency in radians per second (rad/s), $\omega = 2\pi f$, and $f$ is the frequency in Hz.

    Example 1: The pulse-sinc CFT pair

    Spectra of Analog Signals (1)

    Example 2: Loss of resolution and spectral leakage (time-domain truncation)

    Spectra of Analog Signals

    The inverse CFT is given by

    $x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\, d\omega$

    In CFT theory, x(t) and X(ω) are called a transform pair, i.e.,

    $x(t) \leftrightarrow X(\omega)$

    Spectra of Analog Signals

    In real-life signal processing all signals have finite length, so time-domain truncation always occurs.

    Truncation is done using rectangular windows.

    To smooth out frame transitions and control spectral leakage (sidelobes), the signal is often tapered prior to truncation using windows such as the Hamming, Bartlett, or trapezoidal windows.
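The effect of tapering on spectral leakage can be illustrated numerically. The following is a minimal sketch, assuming a 64-sample frame, a test sinusoid placed halfway between two DFT bins, and the usual 0.54/0.46 Hamming coefficients; the helper names `dft_mag` and `far_leakage_db` are hypothetical and the DFT is evaluated directly for clarity, not speed.

```python
import cmath
import math

def dft_mag(x):
    """Magnitude of the N-point DFT, evaluated directly from the definition."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N)]

N = 64
f0 = 10.5  # cycles per frame: deliberately between two DFT bins
x = [math.sin(2 * math.pi * f0 * n / N) for n in range(N)]
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def far_leakage_db(mag, peak_bin=10, guard=6):
    """Largest magnitude more than `guard` bins away from the peak,
    in dB relative to the peak (bins 0 .. N/2 - 1 only)."""
    half = mag[: len(mag) // 2]
    peak = max(half)
    far = max(m for k, m in enumerate(half) if abs(k - peak_bin) > guard)
    return 20 * math.log10(far / peak)

rect_db = far_leakage_db(dft_mag(x))  # plain truncation = rectangular window
hamm_db = far_leakage_db(dft_mag([xi * wi for xi, wi in zip(x, hamming)]))
print(f"rectangular: {rect_db:.1f} dB, Hamming: {hamm_db:.1f} dB")
```

The Hamming-tapered frame shows markedly lower leakage far from the spectral peak, at the cost of a wider main lobe (lower frequency resolution).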

    Review of Convolution and Filtering

    A linear time-invariant (LTI) system: a linear filter satisfies the property of superposition. The output, y(t), is the convolution of the input, x(t), with the filter impulse response, h(t).

    Convolution is represented by the integral

    $y(t) = \int_{-\infty}^{\infty} h(\tau)\, x(t - \tau)\, d\tau = x(t) * h(t)$.

    [Figure: x(t) → LTI system h(t) → y(t) = x(t) * h(t)]

    Review of Convolution and Filtering

    The CFT of the impulse response, h(t), is the frequency response of the filter, i.e.,

    $h(t) \leftrightarrow H(\omega)$.

    Example: first-order RC circuit (low-pass filter)

    The impulse response is a decaying exponential, and the frequency response is a rational function. This function is complex-valued, and its magnitude represents the gain of the filter with respect to frequency at steady state. A sinusoid passing through the filter is scaled in amplitude according to the magnitude response and shifted in phase according to the phase response.
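As a rough illustration of the RC example, the sketch below evaluates the standard first-order low-pass response H(ω) = 1 / (1 + jωRC) at a few frequencies. The component values (1 kΩ, 1 µF) and the function name `rc_lowpass_response` are assumptions chosen only for the example.

```python
import cmath
import math

def rc_lowpass_response(f_hz, r_ohm, c_farad):
    """Frequency response H(omega) = 1 / (1 + j*omega*R*C) of a first-order RC low-pass."""
    omega = 2 * math.pi * f_hz
    return 1 / (1 + 1j * omega * r_ohm * c_farad)

R, C = 1_000.0, 1e-6                 # hypothetical values: 1 kOhm, 1 uF
f_c = 1 / (2 * math.pi * R * C)      # cutoff frequency in Hz

for f in (f_c / 10, f_c, 10 * f_c):
    H = rc_lowpass_response(f, R, C)
    gain = abs(H)                            # amplitude scaling of a sinusoid at f
    phase_deg = math.degrees(cmath.phase(H)) # phase shift of that sinusoid
    print(f"{f:8.1f} Hz  gain {gain:.3f}  phase {phase_deg:6.1f} deg")
```

At the cutoff frequency the gain is 1/√2 and the phase shift is -45 degrees; well above it the gain falls off roughly in proportion to 1/f.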

    Uniform Sampling

    Nyquist: $F_s \ge 2B$

    Multiplication in the time domain is convolution in the frequency domain.

    Sampling is multiplication with a pulse train, so the spectrum of the sampled signal is the convolution of the signal spectrum with the spectrum of the pulse train:

    $X_s(\omega) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X(\omega - k\omega_s)$, with $\omega_s = 2\pi / T_s$.

    An ideal LPF can recover the baseband and hence perfectly reconstruct the analog signal from the digital one.

    Uniform Sampling

    Reconstruction process:
      The reconstruction LPF interpolates between the samples and reproduces the analog signal.
      The interpolation becomes evident once the filtering operation is interpreted in the time domain as convolution.
      Reconstruction occurs by interpolating with the sinc function, which is the impulse response of the ideal LPF.

    Note that if Fs is less than 2B, aliasing occurs and perfect reconstruction is no longer possible.

    In real life, the analog signal is not ideally band-limited and the sampling process is not perfect (sampling pulses have finite amplitude and duration), which leads to aliasing.

    To reduce aliasing, the signal is prefiltered by an anti-aliasing low-pass filter and usually oversampled (ωs > 2B), as in sigma-delta conversion.
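Aliasing can be demonstrated with two sinusoids that become indistinguishable after sampling. This is a small sketch with assumed values (Fs = 8 kHz, tones at 5 kHz and 3 kHz); `sample_cosine` is a hypothetical helper.

```python
import math

def sample_cosine(f_hz, fs_hz, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]

fs = 8_000.0                                      # assumed sampling rate; fs/2 = 4 kHz
above_nyquist = sample_cosine(5_000.0, fs, 16)    # violates Fs >= 2B
alias = sample_cosine(fs - 5_000.0, fs, 16)       # 3 kHz tone

max_diff = max(abs(a - b) for a, b in zip(above_nyquist, alias))
print("largest sample difference:", max_diff)
```

The two sample sequences agree to within rounding error, so once sampled the 5 kHz tone cannot be told apart from a 3 kHz tone. This is why the anti-aliasing filter has to act before the sampler.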

    2.5 Discrete-time signal processing

    Audio coding algorithms operate on a quantized discrete-time signal.

    Prior to compression, most algorithms require that the audio signal is acquired with high-fidelity characteristics.

    Typically it is assumed that the signal is band-limited at 20 kHz, sampled at 44.1 kHz, and quantized at 16 bits per sample.

    2.5.1 Transforms for Discrete-Time Signals

    Discrete-time signals are described in the transform domain using the z-transform and the discrete-time Fourier transform (DTFT).

    The z-transform is defined as

    $X(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n}$, where z is complex.

    If the z-transform is evaluated on the unit circle, i.e., for $z = e^{j\Omega}$, $\Omega = 2\pi f T$, then the z-transform becomes the discrete-time Fourier transform (DTFT):

    $X(e^{j\Omega}) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\Omega n}$.

    2.5.2 The Discrete and the Fast Fourier Transform

    The discrete Fourier transform (DFT) is developed by starting from the DTFT analysis expression and considering a finite-length signal consisting of N points, i.e.,

    $X(e^{j\Omega}) = \sum_{n=0}^{N-1} x(n)\, e^{-j\Omega n}$.

    Furthermore, the frequency-domain signal is sampled uniformly at N points within one period, Ω = 0 to 2π, i.e.,

    $\Omega_k = \frac{2\pi k}{N}$, k = 0, 1, ..., N-1.

    It is typical in the DSP literature to replace $\Omega_k$ with the frequency index k, and hence the DFT can be written as:

    $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$, k = 0, 1, 2, ..., N-1.

    2.5.2 The Discrete and the Fast Fourier Transform (2)

    The sampling in the frequency domain forces periodicity in the time domain, i.e., x(n) = x(n+N).

    We also have periodicity in the frequency domain, X(k) = X(k+N), because the signal in the time domain is also discrete. These periodicities create circular effects when convolution is performed by frequency-domain multiplication, i.e.,

    $x(n) \circledast h(n) \leftrightarrow X(k)\, H(k)$,

    where $x(n) \circledast h(n) = \sum_{m=0}^{N-1} h(m)\, x((n - m) \bmod N)$.
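The circular effect can be seen on a four-point example. The sketch below, with hypothetical helper names and directly evaluated transforms, compares circular convolution computed in the time domain with the result of multiplying the two DFTs.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolve(x, h):
    """Time-domain circular convolution: sum over m of h(m) * x((n - m) mod N)."""
    N = len(x)
    return [sum(h[m] * x[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 1.0, 0.0, 0.0]

direct = circular_convolve(x, h)
via_dft = [v.real for v in idft([a * b for a, b in zip(dft(x), dft(h))])]
print(direct)
print([round(v, 6) for v in via_dft])
```

The linear convolution of these two sequences would be 1, 3, 5, 7, 4 (five samples). With N = 4 the last sample wraps around and adds to the first one, which is the circular effect mentioned above; zero-padding both sequences to at least five points would avoid it.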

    2.5.2 The Discrete and the Fast Fourier Transform (2)

    The N-point inverse DFT (IDFT) is written as:

    $x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}$, n = 0, 1, ..., N-1.

    The DFT can be computed efficiently using the fast Fourier transform (FFT).

    The FFT takes advantage of redundancies in the DFT sum by decimating the sequence into subsequences with even and odd indices.

    It can be shown that if N is a radix-2 integer (a power of two), the N-point DFT can be computed using a series of butterfly stages.

    The complexity associated with the DFT is of the order of $N^2$, and for the FFT it is roughly of the order of $N \log_2 N$.
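The even/odd decimation idea can be sketched as a short recursive routine. This is a minimal illustration, not an optimized implementation: it assumes the input length is a power of two, and the names `dft` and `fft` are local to the example.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) evaluation of the DFT sum."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])   # DFT of the even-indexed samples
    odd = fft(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * math.pi * k / N) * odd[k]   # twiddle factor times odd part
        out[k] = even[k] + t                            # butterfly, upper output
        out[k + N // 2] = even[k] - t                   # butterfly, lower output
    return out

x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
err = max(abs(a - b) for a, b in zip(dft(x), fft(x)))
print("largest difference between DFT and FFT:", err)
```

Each recursion level performs N/2 butterflies and there are log2(N) levels, which is where the N log2 N operation count comes from.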

    2.5.3 The Discrete Cosine Transform

    The discrete cosine transform (DCT) of x(n) can be defined as

    $X(k) = c(k) \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x(n) \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right) k\right]$, $0 \le k \le N-1$,

    where $c(0) = 1/\sqrt{2}$ and $c(k) = 1$ for $1 \le k \le N-1$.

    Depending on the periodicity and symmetry of the input signal, x(n), the DCT can be computed using different orthonormal transforms (usually DCT-1, DCT-2, DCT-3, and DCT-4). (More details will be given in Chapter 6.)
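A direct implementation of an orthonormal DCT makes the role of the scaling factor c(k) visible. The sketch assumes the DCT-2 form X(k) = c(k) sqrt(2/N) Σ x(n) cos[(π/N)(n + 1/2)k] with c(0) = 1/√2 and c(k) = 1 otherwise; the function names are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT (DCT-2 form) evaluated directly from the sum."""
    N = len(x)
    out = []
    for k in range(N):
        c = 1 / math.sqrt(2) if k == 0 else 1.0
        s = sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
        out.append(c * math.sqrt(2 / N) * s)
    return out

def idct(X):
    """Inverse of dct(): because the transform is orthonormal, this is its transpose."""
    N = len(X)
    out = []
    for n in range(N):
        s = 0.0
        for k in range(N):
            c = 1 / math.sqrt(2) if k == 0 else 1.0
            s += c * X[k] * math.cos(math.pi / N * (n + 0.5) * k)
        out.append(math.sqrt(2 / N) * s)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
X = dct(x)
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in X)
print([round(v, 4) for v in X])
print(energy_in, round(energy_out, 9))
```

With this scaling the transform preserves signal energy and is inverted by its transpose. For a smooth input such as this ramp, most of the energy ends up in the first few coefficients.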

    2.5.4 The Short-Time Fourier Transform

    Spectral analysis of non-stationary signals cannot be accommodated by the classical Fourier transform, since the signal has time-varying characteristics.

    Time-varying spectral analysis can be performed using the short-time Fourier transform (STFT):

    $X(n, \Omega) = \sum_{m=-\infty}^{\infty} x(m)\, h(n - m)\, e^{-j\Omega m} = h(n) * \left[x(n)\, e^{-j\Omega n}\right]$,

    where $\Omega = \omega T = 2\pi f T$ is the normalized frequency in radians, and h(n) is the sliding analysis window.

    The synthesis expression (inverse transform) is given by

    $h(n - m)\, x(m) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(n, \Omega)\, e^{j\Omega m}\, d\Omega$.

    If n = m and h(0) = 1, then x(n) can be obtained from the previous equation.

    The basic assumption in this type of analysis-synthesis is that the signal is slowly time-varying and can be modeled by its short-time spectrum.

    The temporal and spectral resolution of the STFT are controlled by the length of the window (usually constrained to be about 5-20 ms, and hence spectral resolution is sacrificed).

    2.5.4 The Short-Time Fourier Transform

    The sequence h(n) can also be viewed as the impulse response of an LTI filter, which is excited by a frequency-shifted signal. This leads to the filter-bank interpretation of the STFT, i.e., for a discrete frequency variable $\Omega_k = k(\Delta\Omega)$, k = 0, 1, ..., N-1, with $\Delta\Omega$ and N chosen such that the speech band is covered.

    Then the analysis expression is written as:

    $X_k(n, \Omega_k) = \sum_{m=-\infty}^{\infty} x(m)\, h(n - m)\, e^{-j\Omega_k m} = h(n) * \left[x(n)\, e^{-j\Omega_k n}\right]$.

    2.5.4 The Short-Time Fourier Transform

    And the synthesis expression is:

    $\hat{x}_{STFT}(n) = \sum_{k=0}^{N-1} X_k(n, \Omega_k)\, e^{j\Omega_k n}$,

    where $\hat{x}_{STFT}(n)$ is the signal reconstructed within the band of interest. If h(n), $\Delta\Omega$, and N are chosen carefully, the signal is reconstructed as depicted in the following figure, where $h_k(n) = h(n)\, e^{j\Omega_k n}$.

    [Figure: the k-th channel of the analysis-synthesis filter bank]

    Overlap-Add Decomposition

    Consider breaking an input signal x into frames using a finite, zero-phase, length-M window w. Then we may express the m-th windowed data frame as:

    $x_m(n) = x(n)\, w(n - mR)$, $n \in (-\infty, \infty)$,

    where
      R: frame step (hop size)
      m: frame index

    The hop size is the number of samples between the begin-times of adjacent frames. Specifically, it is the number of samples by which we advance each successive window.

    Overlap-Add Decomposition

    For frame-by-frame spectral processing to work, we must be able to reconstruct x from the individual overlapping frames, ideally by simply summing them in their original time positions:

    $\sum_{m=-\infty}^{\infty} x_m(n) = \sum_{m=-\infty}^{\infty} x(n)\, w(n - mR) = x(n) \sum_{m=-\infty}^{\infty} w(n - mR)$.

    Hence, $x = \sum_m x_m$ if and only if

    $\sum_{m=-\infty}^{\infty} w(n - mR) = 1$ for all integers n.

    This is the constant-overlap-add (COLA) constraint for the FFT analysis window. It has also been called the partition of unity property.
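The COLA constraint is easy to verify numerically for a given window and hop size. The sketch below assumes a periodic Hann window of length M = 16 with hop size R = M/2, and uses a causal window (starting at n = 0) instead of a zero-phase one, which only shifts the time origin. `hann_periodic` and `overlap_add_sum` are hypothetical helper names.

```python
import math

def hann_periodic(M):
    """Periodic Hann window of length M."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / M) for n in range(M)]

def overlap_add_sum(window, hop, n_frames):
    """Sum of the shifted windows w(n - m*hop) for m = 0 .. n_frames - 1."""
    M = len(window)
    total = [0.0] * (hop * (n_frames - 1) + M)
    for m in range(n_frames):
        for n, w in enumerate(window):
            total[m * hop + n] += w
    return total

M, R = 16, 8
s = overlap_add_sum(hann_periodic(M), R, 6)
steady = s[M - R: len(s) - (M - R)]       # skip the ramp-up and ramp-down at the edges
print(min(steady), max(steady))

no_overlap = overlap_add_sum(hann_periodic(M), M, 6)   # hop = M: frames do not overlap
print(min(no_overlap), max(no_overlap))
```

With 50% overlap the shifted Hann windows sum to a constant in the steady-state region, so the frames add back to x. With no overlap the sum just follows the window shape, and summing the frames would impose an amplitude modulation on the signal.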

    2.6 Difference Equations and Digital Filters

    Digital filters are characterized by difference equations of the form

    $y(n) = \sum_{i=0}^{L} b_i\, x(n - i) - \sum_{i=1}^{M} a_i\, y(n - i)$.

    y(n) is given as a linear combination of present and past inputs minus a linear combination of past outputs (feedback term).

    The parameters $a_i$ and $b_i$ are the filter coefficients and control the frequency response characteristics of the digital filter.

    Filter coefficients can be made adaptive (time-varying).

    IIR filter: when the feedback coefficients are nonzero, the impulse response is infinitely long.
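A difference equation of this kind, with feedforward coefficients b_i and feedback coefficients a_i, translates almost line by line into code. This is a minimal sketch with a hypothetical function name and example coefficients; it takes a[0] = 1 and zero initial conditions.

```python
def iir_filter(b, a, x):
    """y(n) = sum_i b[i] x(n-i) - sum_{i>=1} a[i] y(n-i), with a[0] = 1 and zero initial state."""
    y = []
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)       # feedforward part
        acc -= sum(a[i] * y[n - i] for i in range(1, len(a)) if n - i >= 0)   # feedback part
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 7

# No feedback (FIR): the impulse response is just the b coefficients.
fir_response = iir_filter([1.0, 1.0, 1.0], [1.0], impulse)

# One feedback coefficient (IIR): y(n) = x(n) + 0.5 y(n-1).
iir_response = iir_filter([1.0], [1.0, -0.5], impulse)

print(fir_response)
print(iir_response)
```

The FIR response dies out after three samples, while the IIR response 0.5^n never reaches exactly zero, which illustrates why nonzero feedback coefficients give an infinitely long impulse response.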

    2.6 Difference Equations and Digital Filters

    The impulse response is denoted h(n).

    In statistical signal representation, this pole-zero filter corresponds to an ARMA (autoregressive moving-average) model.

    The filter coefficients are chosen such that the filter is stable.

    An input-output equation of a causal filter can also be written in terms of the impulse response of the filter, i.e.,

    $y(n) = \sum_{m=0}^{\infty} h(m)\, x(n - m)$.

    2.7 The Transfer and the Frequency Response Functions

    The z-transform of the impulse response of a filter is called the transfer function and is given by

    $H(z) = \sum_{n=-\infty}^{\infty} h(n)\, z^{-n}$.

    Considering the difference equation, we can also obtain the transfer function in terms of the filter parameters, i.e.,

    $X(z) \left( \sum_{i=0}^{L} b_i z^{-i} \right) = Y(z) \left( 1 + \sum_{i=1}^{M} a_i z^{-i} \right)$.

    The ratio of output over input in the z domain gives the transfer function in terms of the coefficients:

    $H(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + \dots + b_L z^{-L}}{1 + a_1 z^{-1} + \dots + a_M z^{-M}}$.

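    2.7 The Transfer and the Frequency Response Functions

    The frequency response function is a special case of the transfer function of the filter. That is, for $z = e^{j\Omega}$,

    $H(e^{j\Omega}) = \sum_{n=-\infty}^{\infty} h(n)\, e^{-j\Omega n}$.

    2.7.1 Poles, Zeros and Frequency Response

    A z-domain function, H(z), can be written in terms of its poles and zeros as follows:

    $H(z) = G\, \frac{(z - \zeta_1)(z - \zeta_2) \cdots (z - \zeta_L)}{(z - p_1)(z - p_2) \cdots (z - p_M)} = G\, \frac{\prod_{i=1}^{L} (z - \zeta_i)}{\prod_{i=1}^{M} (z - p_i)}$,

    where $\zeta_i$ and $p_i$ are the zeros and poles of H(z) and G is a constant.

    The location of the poles and zeros affects the shape of the frequency response.

    The magnitude of the frequency response can be written as:

    $|H(e^{j\Omega})| = |G|\, \frac{\prod_{i=1}^{L} |e^{j\Omega} - \zeta_i|}{\prod_{i=1}^{M} |e^{j\Omega} - p_i|}$.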

    2.7 The Transfer and the Frequency Response Functions

    It is evident that when an isolated zero is close to the unit circle, the magnitude frequency response will assume a small value at that frequency.

    When an isolated pole is close to the unit circle, it will give rise to a peak in the magnitude frequency response at that frequency.

    In speech processing:
      The presence of poles in z-domain representations of the vocal tract has been associated with the speech formants.
      Formant synthesizers use the pole locations to form synthesis filters for certain phonemes.
      The presence of zeros has been associated with the coupling of the nasal tract. For example, zeros are associated with nasal sounds such as m and n.
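The influence of pole and zero locations on the magnitude response can be checked numerically. The sketch below assumes a conjugate pair at radius 0.95 and angle π/4, used once as poles (a resonator) and once as zeros (a notch); the function name and the values are illustrative only.

```python
import cmath
import math

def freq_response(b, a, omega):
    """H(e^{j*omega}) for H(z) = (sum_i b[i] z^-i) / (sum_i a[i] z^-i), with a[0] = 1."""
    z_inv = cmath.exp(-1j * omega)
    num = sum(bi * z_inv ** i for i, bi in enumerate(b))
    den = sum(ai * z_inv ** i for i, ai in enumerate(a))
    return num / den

r, theta = 0.95, math.pi / 4     # hypothetical radius and angle, close to the unit circle
# A conjugate pair at r*exp(+-j*theta) gives the polynomial 1 - 2 r cos(theta) z^-1 + r^2 z^-2.
pair = [1.0, -2 * r * math.cos(theta), r * r]

grid = [math.pi * k / 512 for k in range(513)]    # frequencies from 0 to pi
peak_omega = max(grid, key=lambda w: abs(freq_response([1.0], pair, w)))   # pair used as poles
dip_omega = min(grid, key=lambda w: abs(freq_response(pair, [1.0], w)))    # pair used as zeros

print(f"pole angle {theta:.4f} rad, peak at {peak_omega:.4f} rad, dip at {dip_omega:.4f} rad")
```

The resonator peaks, and the notch dips, very close to the angle of the pair. This is the mechanism by which vocal-tract poles produce formant peaks and nasal zeros produce spectral valleys.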

    2.8 Review of multirate signal processing (MSP)

    MSP involves changing the sampling rate while the signal is in the digital domain.

    Applications:
      Reduce algorithmic (SW/HW) complexity
      Increase resolution
      Oversampling A/D conversion relaxes the requirements on the anti-aliasing filter but increases the data rate, so downsampling is required
      Downsampling requires a low-pass anti-aliasing filter in the digital domain
      Sigma-delta conversion moves complexity from the analog domain into the digital domain
      For D/A conversion the signal is interpolated in the digital domain, reducing the requirements on the analog reconstruction (interpolation) filter

    2.8.1 Downsampling by an Integer

    Downsampling: increase the sampling period, i.e., decrease the sampling frequency and data rate of the digital signal:

    $x_d(n) = x(nL)$.

    Given the DTFT transform pairs

    $x(n) \leftrightarrow X(e^{j\Omega})$ and $x_d(n) \leftrightarrow X_d(e^{j\Omega})$,

    it can be shown that the DTFTs of the original and the decimated signal are related by

    $X_d(e^{j\Omega}) = \frac{1}{L} \sum_{l=0}^{L-1} X\left(e^{j(\Omega - 2\pi l)/L}\right)$.

    Downsampling introduces L copies of the original DTFT that are both amplitude and frequency scaled by L. This may introduce aliasing.

    Aliasing can be eliminated if the DTFT is band-limited to π/L, i.e.:

    $X(e^{j\Omega}) = 0$, $\pi/L \le |\Omega| \le \pi$.

    2.8.1 Downsampling by an Integer

    Example: DTFTs of the signal during the downsampling process

    2.8.2 Upsampling by an Integer

    Involves reducing the sampling period by introducing additional regularly spaced samples in the signal sequence.

    The DTFT of the up-sampled signal, $X_u(e^{j\Omega})$, relates to the DTFT of the original signal as follows:

    $X_u(e^{j\Omega}) = X(e^{j\Omega M})$, where M is the upsampling factor.

    2.8.2 Upsampling by an Integer

    Therefore, the DTFT of the up-sampled signal is described by a series of compressed images of the DTFT of the original signal, located at integer multiples of 2π/M rad.

    To complete the upsampling process, an interpolation stage is required that fills in appropriate values in the time domain to replace the artificial zero-valued samples introduced by the upsampling.

    2.8.3 Sampling Rate Changes by Noninteger Factors

    Can be accomplished by cascading upsampling and downsampling:
      The upsampling stage precedes the downsampling stage.
      The low-pass interpolation and anti-aliasing filters are combined into one filter whose bandwidth is the minimum of the two filters.

    Example: we want a noninteger sampling period modification such that Tnew = 12T/5.
      In this case we choose L = 12 and M = 5.
      Hence, the bandwidth of the low-pass filter is the minimum of π/12 and π/5.
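The example Tnew = 12T/5 corresponds to upsampling by 5 followed by downsampling by 12, with a single low-pass filter in between. The sketch below is one possible illustration under stated assumptions: a Hamming-windowed sinc filter with 241 taps (both choices are arbitrary), direct convolution instead of an efficient polyphase structure, and hypothetical function names.

```python
import math

def lowpass_taps(cutoff, num_taps):
    """Hamming-windowed sinc low-pass; cutoff in rad/sample, normalized to unity gain at DC."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff / math.pi if t == 0 else math.sin(cutoff * t) / (math.pi * t)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [v / gain for v in taps]

def resample(x, up, down, num_taps=241):
    # 1) Upsampling: insert up - 1 zeros between input samples.
    stuffed = [0.0] * (len(x) * up)
    stuffed[::up] = x
    # 2) One combined low-pass at min(pi/up, pi/down); a gain of `up` restores the amplitude.
    h = [up * v for v in lowpass_taps(min(math.pi / up, math.pi / down), num_taps)]
    delay = (num_taps - 1) // 2          # compensate the filter's group delay
    filtered = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, hk in enumerate(h):
            idx = n + delay - k
            if 0 <= idx < len(stuffed):
                acc += hk * stuffed[idx]
        filtered.append(acc)
    # 3) Downsampling: keep every down-th sample.
    return filtered[::down]

x = [1.0] * 120                      # constant test signal
y = resample(x, up=5, down=12)       # new sampling period = 12T/5
print(len(x), "->", len(y))
print([round(v, 3) for v in y[20:25]])
```

Here the combined filter has a cutoff of π/12, the smaller of π/5 and π/12, so it acts both as the interpolation filter and as the anti-aliasing filter. Away from the edges of the block, the constant input comes out as a constant of (approximately) the same value at the new rate.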

    Quadrature Mirror Filter Banks

    Analysis of a signal in a perceptual audio coder is accomplished using filter banks or frequency-domain transformations (or combinations of both).

    The filter bank is used to decompose the signal into several frequency subbands.

    Different coding strategies are then derived and implemented in each subband (subband coding).
      Aliasing between subbands because of imperfect frequency responses
      Aliasing effects: no perfect reconstruction
      Solution: combine up- and downsampling operations with appropriate filter designs

    Quadrature Mirror Filter

    A perfect-reconstruction filter bank is called a quadrature mirror filter bank (QMF).

    Analysis: H0(z) and H1(z) + downsampling
    Synthesis: F0(z) and F1(z)

    If the process includes quantizers, those will be placed after the downsampling stages.

    Quadrature Mirror Filter

    The input signal x(n) is first filtered and downsampled.

    The DTFT of the downsampled signal in band k is

    $X_{d,k}(e^{j\Omega}) = \frac{1}{2} \left[ X_k(e^{j\Omega/2}) + X_k(e^{j(\Omega - 2\pi)/2}) \right]$, k = 0, 1,

    where $X_k(e^{j\Omega}) = H_k(e^{j\Omega})\, X(e^{j\Omega})$ is the DTFT of the filtered signal.

    [Figure: DTFTs of the original and the down-sampled signals]

    Quadrature Mirror Filter

    The reconstructed signal, $\hat{x}(n)$, is derived by adding the contributions from the upsampling and the interpolation of the low and the high band.

    It can be shown that the reconstructed signal in the z-domain has the form

    $\hat{X}(z) = \frac{1}{2} \left[ H_0(z) F_0(z) + H_1(z) F_1(z) \right] X(z) + \frac{1}{2} \left[ H_0(-z) F_0(z) + H_1(-z) F_1(z) \right] X(-z)$.

    The signal X(-z) is associated with the aliasing term. The aliasing term can be cancelled by designing the filters to have the following mirror symmetries:

    $F_0(z) = H_1(-z)$ and $F_1(z) = -H_0(-z)$.

    Quadrature Mirror Filter

    Under these conditions, the overall transfer function of the filter bank can be written as

    $T(z) = \frac{1}{2} \left[ H_0(z) F_0(z) + H_1(z) F_1(z) \right]$.

    If T(z) = 1, then the filter bank allows perfect reconstruction.

    Perfect delayless reconstruction is not realizable, but an all-pass filter bank with linear phase characteristics can be designed easily.

    For example, the first-order FIR filters

    $H_0(z) = 1 + z^{-1}$, $H_1(z) = 1 - z^{-1}$

    result in alias-free reconstruction. The overall transfer function is:

    $T(z) = \frac{1}{2} \left[ (1 + z^{-1})^2 - (1 - z^{-1})^2 \right] = 2 z^{-1}$.

    The signal is reconstructed within a delay of one sample and with an overall gain of 2.
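The one-sample delay and the gain of 2 can be verified with a tiny two-channel filter bank. The sketch assumes the first-order pair H0(z) = 1 + z^-1, H1(z) = 1 - z^-1 with synthesis filters chosen by the mirror conditions F0(z) = H1(-z) and F1(z) = -H0(-z); the helper names and the test signal are arbitrary.

```python
def fir(h, x):
    """Causal FIR filtering with zero initial state; output truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def down2(x):
    """Keep every second sample."""
    return x[::2]

def up2(x):
    """Insert one zero after every sample."""
    out = [0.0] * (2 * len(x))
    out[::2] = x
    return out

h0, h1 = [1.0, 1.0], [1.0, -1.0]     # H0(z) = 1 + z^-1,  H1(z) = 1 - z^-1
f0, f1 = [1.0, 1.0], [-1.0, 1.0]     # F0(z) = H1(-z),    F1(z) = -H0(-z)

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
low = fir(f0, up2(down2(fir(h0, x))))     # low band: analysis, decimation, synthesis
high = fir(f1, up2(down2(fir(h1, x))))    # high band
x_hat = [a + b for a, b in zip(low, high)]

print(x)
print(x_hat)
```

Each branch on its own contains aliasing from the decimation, but the aliasing terms cancel in the sum, and the output equals 2 x(n-1).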

    Quadrature Mirror Filter

    QMF filter banks can be cascaded to form tree structures.

    If we represent the analysis stage of a filter bank as a block that divides the signal into low- and high-frequency subbands, then by cascading several such blocks we can divide the signal into smaller subbands (association with wavelet transform theory).

    Discrete-Time Random Signals

    Signals can be classified as deterministic or random.
      Deterministic: values at any point in time can be defined precisely by a mathematical equation (e.g., x(n) = sin(πn/4)).
      Random: values are uncertain and are usually described using statistics.

    A discrete-time random process involves an ensemble of sequences x(n, m), where m is the index of the m-th sequence in the ensemble and n is the time index.

    In practice one does not have access to all possible sample signals of a random process. Therefore, the statistical structure of a random process is often determined from the observed waveform. This approach becomes valid, and simpler, if the random signal is ergodic.

    Ergodicity: the statistics of a random process can be determined using time-averaging operations on a single observed signal.

    Discrete-Time Random Signals

    Ergodicity requires that the statistics of the signal are independent of the time of observation.

    Stationarity: a random process is wide-sense stationary if its statistics, up to the second order, are independent of time.

    Although it is difficult to show analytically that signals with various statistical distributions are ergodic, it can be shown that a stationary zero-mean Gaussian process is ergodic up to second order. (In many practical applications involving stationary processes, it is assumed that the process is also ergodic.)

    Music

    Technology

    GroupDiscrete-Time Random Signals

  • 8/3/2019 3 Signal Processing Essentials Representation of Audio Signals

    62/73

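    The mean value, $\mu_x$, of the discrete-time, wide-sense stationary signal, x(n), is a first-order statistic that is defined as the expected value of x(n), i.e.,

    $$\mu_x = E[x(n)] = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n),$$

    where $E[\cdot]$ denotes statistical expectation.

    The variance, $\sigma_x^2 = E[(x(n) - \mu_x)^2]$, is a second-order statistic; its square root, $\sigma_x$, is the standard deviation of the signal. For a zero-mean signal, the variance is simply $E[x^2(n)]$.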
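    On a finite record the expectation is replaced by a time average. The following minimal sketch estimates the mean, variance, and standard deviation of a short sequence and checks that the variance equals the mean square minus the squared mean. The function names and the four-sample test signal are assumptions made for illustration.

```python
import math


def mean(x):
    """Time-average estimate of E[x(n)] over a finite record."""
    return sum(x) / len(x)


def variance(x):
    """Time-average estimate of E[(x(n) - mean)^2]."""
    mu = mean(x)
    return sum((v - mu) ** 2 for v in x) / len(x)


x = [1.0, 2.0, 3.0, 4.0]  # assumed toy signal

mu_x = mean(x)
var_x = variance(x)
std_x = math.sqrt(var_x)  # standard deviation is the square root of the variance
mean_square = mean([v * v for v in x])

# Removing the mean leaves a zero-mean signal whose variance is its mean square.
x_zero_mean = [v - mu_x for v in x]
zero_mean_power = mean([v * v for v in x_zero_mean])

print(mu_x, var_x, std_x, mean_square - mu_x ** 2, zero_mean_power)
```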

    The autocorrelation of a signal is a second-order statistic defined by

    $$r_{xx}(m) = E[x(n+m)\,x(n)] = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n+m)\,x(n),$$

    where m is the autocorrelation lag index. The autocorrelation can be viewed as a measure of the predictability of the signal, in the sense that a future value of a correlated signal can be predicted by processing information associated with its past values.

    For example, speech is a correlated waveform and, hence, it can be modelled by linear prediction mechanisms that predict its current value from a linear combination of past values.
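    The link between correlation and predictability can be sketched with a first-order linear predictor. The code below generates a correlated toy signal, estimates its autocorrelation at lags 0 and 1 by time averaging, and uses their ratio as the predictor coefficient. The first-order recursive signal model, the coefficient 0.9, and the helper name `autocorr` are assumptions for illustration; real speech needs a higher-order predictor.

```python
import random


def autocorr(x, lag):
    """Time-average estimate of E[x(n + lag) x(n)] for lag >= 0."""
    n_terms = len(x) - lag
    return sum(x[n + lag] * x[n] for n in range(n_terms)) / n_terms


rng = random.Random(1)  # fixed seed so the run is repeatable
A_TRUE = 0.9            # assumed correlation coefficient of the toy signal

# Correlated signal: each sample is 0.9 times the previous one plus white noise.
x = [0.0]
for _ in range(50000):
    x.append(A_TRUE * x[-1] + rng.gauss(0.0, 1.0))

r0 = autocorr(x, 0)  # signal power
r1 = autocorr(x, 1)
a_hat = r1 / r0      # first-order predictor coefficient

# Prediction error: what is left after predicting x(n) from x(n - 1).
errors = [x[n] - a_hat * x[n - 1] for n in range(1, len(x))]
error_power = sum(e * e for e in errors) / len(errors)

print(f"a_hat = {a_hat:.3f}, signal power = {r0:.2f}, error power = {error_power:.2f}")
```

    Because the signal is strongly correlated, the prediction error carries much less power than the signal itself, which is the property that predictive coders exploit.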

    Correlation can also be viewed as a measure of redundancy in the signal: correlated waveforms can be parameterized in terms of statistical time-series models and, hence, represented by a reduced number of information bits.

    The autocorrelation sequence, $r_{xx}(m)$, is symmetric and positive definite, i.e.,

    $$r_{xx}(-m) = r_{xx}(m), \qquad r_{xx}(0) \ge |r_{xx}(m)|.$$

    Example 1: Autocorrelation of a white noise signal

    $$r_{xx}(m) = \sigma_x^2\,\delta(m),$$

    where $\sigma_x^2$ is the variance of the noise. The autocorrelation of white noise is an impulse, i.e. the signal is uncorrelated.
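    A quick numerical check of Example 1, as a sketch: the time-average autocorrelation of Gaussian white noise should be close to the variance at lag 0 and close to zero elsewhere. The standard deviation of 2 (variance 4), the record length, and the helper name `autocorr` are assumptions for illustration.

```python
import random


def autocorr(x, lag):
    """Time-average autocorrelation estimate; uses |lag| since r(-m) = r(m)."""
    lag = abs(lag)
    n_terms = len(x) - lag
    return sum(x[n + lag] * x[n] for n in range(n_terms)) / n_terms


rng = random.Random(3)  # fixed seed so the run is repeatable
SIGMA = 2.0             # assumed standard deviation, so the variance is 4

noise = [rng.gauss(0.0, SIGMA) for _ in range(100000)]
r = {m: autocorr(noise, m) for m in range(-3, 4)}

for m in sorted(r):
    print(f"r({m:+d}) = {r[m]:.3f}")
```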

    Example 2: The autocorrelation of the output of an FIR digital filter, H(z), to a white noise input of zero mean and unit variance is

    $$r_{yy}(m) = \delta(m+2) + 2\,\delta(m+1) + 3\,\delta(m) + 2\,\delta(m-1) + \delta(m-2).$$

    Cross-correlation is a measure of similarity between two signals. The cross-correlation of a signal, x(n), relative to a signal, y(n), is given by

    $$r_{xy}(m) = E[x(n+m)\,y(n)].$$

    Similarly, the cross-correlation of a signal, y(n), relative to a signal, x(n), is given by

    $$r_{yx}(m) = E[y(n+m)\,x(n)].$$

    Note that the symmetry property of the cross-correlation is $r_{yx}(m) = r_{xy}(-m)$.
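    For a white noise input of zero mean and unit variance, the output autocorrelation of an FIR filter reduces to the deterministic autocorrelation of its impulse response, i.e. the sum over k of h(k) h(k + |m|). The sketch below evaluates this for an assumed second-order filter H(z) = 1 + z^-1 + z^-2; these coefficients and the function name are hypothetical, chosen only for illustration.

```python
def fir_output_autocorr(h):
    """Output autocorrelation for unit-variance white input, as {lag: value}."""
    order = len(h) - 1
    result = {}
    for m in range(-order, order + 1):
        lag = abs(m)  # the autocorrelation is symmetric in the lag
        result[m] = sum(h[k] * h[k + lag] for k in range(len(h) - lag))
    return result


h = [1, 1, 1]  # assumed impulse response of H(z) = 1 + z^-1 + z^-2

r_yy = fir_output_autocorr(h)
for m in sorted(r_yy):
    print(m, r_yy[m])
```

    With these assumed coefficients the values at lags -2 to 2 are 1, 2, 3, 2, 1: the filter has introduced correlation between neighbouring output samples even though the input is uncorrelated.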

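    The power spectral density (PSD) of a random signal is defined as the DTFT of the autocorrelation sequence,

    $$R_{xx}(e^{j\Omega}) = \sum_{m=-\infty}^{\infty} r_{xx}(m)\,e^{-j\Omega m}.$$

    The PSD is real-valued and non-negative, and describes how the power of the random process is distributed across frequency.

    Example: The PSD of a white noise signal with variance $\sigma_x^2$ is

    $$R_{xx}(e^{j\Omega}) = \sum_{m=-\infty}^{\infty} \sigma_x^2\,\delta(m)\,e^{-j\Omega m} = \sigma_x^2,$$

    i.e. the power is spread evenly over all frequencies.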
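    The definition can be evaluated directly for short autocorrelation sequences. This sketch computes the DTFT of two sequences on a small frequency grid: an impulse (white noise with an assumed variance of 4) and an assumed symmetric sequence with values 1, 2, 3, 2, 1. The function name `psd`, the dictionary representation, and both sequences are illustrative assumptions.

```python
import cmath
import math


def psd(r, omega):
    """DTFT of an autocorrelation sequence given as {lag: value}."""
    return sum(val * cmath.exp(-1j * omega * m) for m, val in r.items())


white = {0: 4.0}  # impulse at lag 0 with assumed variance 4
coloured = {-2: 1.0, -1: 2.0, 0: 3.0, 1: 2.0, 2: 1.0}  # assumed symmetric sequence

omegas = [k * math.pi / 8 for k in range(9)]  # frequencies from 0 to pi
white_psd = [psd(white, w) for w in omegas]
coloured_psd = [psd(coloured, w) for w in omegas]

for w, pw, pc in zip(omegas, white_psd, coloured_psd):
    print(f"omega = {w:.3f}  white = {pw.real:.3f}  coloured = {pc.real:.3f}")
```

    The white-noise PSD is flat, while the symmetric sequence gives a real, non-negative spectrum that concentrates power at low frequencies.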

    Random Signals Processed by LTI Digital Filters

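    This section reviews how to characterize the statistics of the output of a causal LTI digital filter that is excited by a random signal.

    The output of a causal digital filter can be computed by convolving the input with its impulse response, i.e.,

    $$y(n) = \sum_{k=0}^{\infty} h(k)\,x(n-k).$$

    We can derive the following expressions for the mean, cross-correlation, and power spectral density at the output:

    $$\mu_y = \mu_x \sum_{k=0}^{\infty} h(k) = \mu_x\,H(e^{j\Omega})\big|_{\Omega=0},$$

    $$r_{yx}(m) = \sum_{k=0}^{\infty} h(k)\,r_{xx}(m-k),$$

    $$R_{yy}(e^{j\Omega}) = |H(e^{j\Omega})|^2\,R_{xx}(e^{j\Omega}).$$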
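    The output mean of a filtered stationary signal equals the input mean multiplied by the sum of the impulse response, which is the frequency response evaluated at zero frequency. The sketch below checks this by simulation. The filter coefficients, the input mean of 2, and the function name `fir_filter` are assumptions for illustration.

```python
import random


def fir_filter(h, x):
    """Causal convolution: y(n) = sum over k of h(k) x(n - k), x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


rng = random.Random(2)  # fixed seed so the run is repeatable
MU_X = 2.0              # assumed input mean
h = [1.0, 1.0, 1.0]     # assumed FIR impulse response

x = [MU_X + rng.gauss(0.0, 1.0) for _ in range(50000)]
y = fir_filter(h, x)

dc_gain = sum(h)                 # frequency response at zero frequency
predicted_mean = MU_X * dc_gain

steady = y[len(h) - 1:]          # skip the start-up transient
measured_mean = sum(steady) / len(steady)

print(f"predicted mean = {predicted_mean:.3f}, measured mean = {measured_mean:.3f}")
```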

    Information Theory

    In the paper titled "A Mathematical Theory of Communication", Claude Shannon wrote:

    "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."

    Information Theory (IT): a mathematical framework for approaching a large class of problems related to encoding, transmission, and decoding of information in a systematic and disciplined way.

    Since audio (speech, music, etc.) is a form of communication, information theory has served as a basis for audio coding.

    Entropy can be used to describe the quantity of information:

    1) The amount of uncertainty before seeing an event

    2) The amount of surprise when seeing an event

    3) The amount of information after seeing an event

    (These three are virtually the same)

    According to IT, the information derivable from an outcome $x_i$ depends on its probability $P(x_i)$:

    $P(x_i)$ small → large amount of information

    $P(x_i)$ large → small amount of information

    The amount of information is defined as:

    $$I(x_i) = \log \frac{1}{P(x_i)}$$
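    A minimal sketch of the amount of information as the logarithm of the inverse probability, here with base 2 so that the result is in bits. The function name `information_bits` and the example probabilities are assumptions for illustration.

```python
import math


def information_bits(p):
    """Amount of information log2(1 / p), in bits, of an outcome with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    return math.log2(1.0 / p)


# Rare outcomes carry more information than common ones.
for p in (1.0, 0.5, 0.125, 0.001):
    print(f"P = {p} -> {information_bits(p):.3f} bits")
```

    A certain outcome (P = 1) carries no information, while the outcome with P = 0.001 carries almost 10 bits.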

    Interpretation of the logarithm:

    The information of two independent events occurring together (where the joint probability is the product of the individual probabilities) is simply the sum of the information of each event.

    When the logarithm base is 2, the unit of information is called the bit (e.g. 1 bit of information is needed to specify the outcome of a fair coin toss).
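    The additivity follows in one step from the product rule for independent events. In the worked equation below, the amount of information of an outcome is written $I(x) = \log \frac{1}{P(x)}$, and the joint notation $I(x_i, x_j)$ and the probabilities 1/2 and 1/4 are assumptions introduced for the example.

```latex
\begin{align*}
I(x_i, x_j) &= \log \frac{1}{P(x_i)\,P(x_j)}
             = \log \frac{1}{P(x_i)} + \log \frac{1}{P(x_j)}
             = I(x_i) + I(x_j), \\[4pt]
\text{e.g. } P(x_i) = \tfrac{1}{2},\; P(x_j) = \tfrac{1}{4}: \quad
\log_2 8 &= \log_2 2 + \log_2 4 = 1 + 2 = 3 \text{ bits}.
\end{align*}
```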

    X is a discrete random variable taking a value $x_i$ (symbol) from a finite sample space $S = \{x_1, x_2, \ldots, x_i, \ldots\}$ (alphabet). Each $x_i$ is produced from the alphabet S according to the probability distribution of the random variable X.

    Entropy of the random variable X:

    $$H(X) = \sum_i P(x_i) \log \frac{1}{P(x_i)}$$

    H(X) is the average amount of information required to specify which symbol has occurred.
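    Entropy is the probability-weighted average of the information of each symbol. The sketch below computes it in bits for three assumed alphabets; the function name `entropy_bits` and the probability values are illustrative only.

```python
import math


def entropy_bits(probs):
    """Average information in bits; zero-probability symbols contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)


fair_coin = entropy_bits([0.5, 0.5])
biased_coin = entropy_bits([0.9, 0.1])
four_symbols = entropy_bits([0.25, 0.25, 0.25, 0.25])

print(f"fair coin:    {fair_coin:.3f} bits")
print(f"biased coin:  {biased_coin:.3f} bits")
print(f"four symbols: {four_symbols:.3f} bits")
```

    The biased source needs fewer bits per symbol on average than the fair one, which is the redundancy an entropy coder can exploit.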

    Summary

    Concepts covered during this chapter:

    Continuous Fourier transform

    Spectral leakage effects

    Convolution, sampling, and aliasing issues

    Discrete-time Fourier transform and z-transform

    The DFT, FFT, DCT, and STFT basics

    Information theory

    Any questions?

    Thanks!