3 signal processing essentials representation of audio signals
TRANSCRIPT
-
8/3/2019 3 Signal Processing Essentials Representation of Audio Signals
1/73
Signal Processing Essentials
Representation of Audio Signals
06.10.2011, 52.219, Waldo Nogueira
-
Myself
Contact details
Email: [email protected]
Telephone: +34 93 542 2806
Web: https://dtic.upf.edu/~wnogueira (under construction)
Office: 55.314
Address: Music Technology Group, Tànger Building, Communication Campus-Poblenou, Universitat Pompeu Fabra, Roc Boronat, 138, 08018 Barcelona
Moodle:
-
Contents
Review of the last 3 Lectures: From Analogue to Digital
-
Lecture 1: Introduction
Medium, data, codes, and artifacts
The hearing window
Historical overview of digital coding
Nyquist theorem: sampling and aliasing
Characteristics of the analog anti-aliasing and reconstruction filters
Standard sampling frequencies (32/44.1/48 kHz)
Sample and hold in A/D conversion
Importance of the sampling clock: jitter and PLL
Aperture effect (in A/D conversion and in D/A reconstruction)
-
Lecture 2: Quantization
Quantization
Quantization error (harmonic distortion)
Dither: definition and techniques (rectangular, triangular, Gaussian)
Applications of dither and selection criteria in professional settings
Basic implementation technology of A/D/A conversion
-
Lecture 3: Sampling, PCM and more
Non-PCM coding: differential (delta Δ, DPCM)
Other PCM codings: sigma-delta Σ-Δ, Σ-DPCM
Oversampling: definition, information theory
Noise shaping (NS): definition and implementation examples
Overall diagram of A/D/A conversion (with oversampling and NS)
Commercial converters in the recording studio: parameters and criteria
Relationship between commercial formats and conversion types (CD, DVD-A, SACD, solid state)
-
Contents for Today: Part I/II
After Analogue to Digital conversion
Part I: Introduction and Overview
  Source Coding in a digital information system
  Quantization and Coding of signals with PCM
  Problem: Large amount of information
  PCM for Audio and Speech Signals
  Information Theory: The message
  Overview: Structure of the next 6 Lectures
Part II: Signal Processing Fundamentals
  Spectra of Analogue Signals, Convolution and Filtering, Uniform Sampling
  Discrete-time signal processing (Transforms, DFT, DCT, STFT)
  Difference equations and digital filters (Poles, Zeros, Freq. Resp.)
  Review of Multirate signal processing (Down/Up-sampling, QMF)
  Discrete-Time Random Signals
  Information Theory
-
Contents
PART I
-
Digital Coding
Representation of analog signals using binary digital sequences
Goals:
  Acceptable quality with the minimum number of bits
  For a fixed bit rate (speed), the minimum degradation
-
Source coding in a digital information system
[Block diagram: Source -> Source Coder -> Channel Coder -> Modulator -> Channel (Noise + Interference) -> Demodulator -> Channel Decoder -> Source Decoder -> Sink. Additional labels: Binary Symbols, Discrete Channel.]
-
Quantization and Coding of Signals with PCM
[Figure: quantization and coding of a signal with PCM; horizontal axis: time]
-
Problem
The digital representation of audio signals requires relatively large data rates.
Digital audio signal (2 x 768 kbit/s) = 24 x digital speech signal (64 kbit/s)
-
PCM Format for Audio and Speech Signals
-
Advantages of digital coding
Regeneration
  Possibility to clean the pulses: when reconstructing the signal, it is possible to improve the SNR.
  In analog communications, if the signal is amplified, the noise is amplified too.
Error protection
Encryption
Storage
Multiplexing (time and frequency)
Less degradation
Use of signal processing techniques
-
Bit Rate
At the source coder output we obtain:
  Bits/sample: R
  Sampling rate: Fs
  Transmission speed: I = Fs x R bits/second
[Diagram: Source -> Quantization & Sampling -> Source Coder; Fs: samples/s, R: bits/sample]
How can the transmission rate be reduced, or used optimally? -> Information Theory
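As a small worked example of I = Fs x R, the sketch below reproduces the 64 kbit/s and 2 x 768 kbit/s figures quoted earlier. The specific sampling rates and word lengths (8 kHz / 8 bits for speech, 48 kHz / 16 bits per channel for stereo audio) are assumptions chosen because they yield those figures; the slides do not state them here.

```python
def bit_rate(fs_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """PCM bit rate I = Fs * R, multiplied by the number of channels, in bit/s."""
    return fs_hz * bits_per_sample * channels

# Assumed formats (not stated on the slide):
speech_rate = bit_rate(8_000, 8)                # telephone-style speech
audio_rate = bit_rate(48_000, 16, channels=2)   # stereo audio

print(speech_rate, audio_rate, audio_rate // speech_rate)
```

Under these assumptions the audio stream needs 24 times the rate of the speech stream, which is the ratio the "Problem" slide points out.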
-
Information Theory: Message
[Diagram of a message. Labels: R, H; Redundant (known) / Not redundant (unknown); Irrelevant (no meaning) / Relevant (has meaning); Interesting]
-
General Block Diagram of an Audio Coder
[Block diagram: PCM signal (706 kbit/s) -> Time-Frequency Transform -> Quantization and Coding (block shown three times) -> Frame Construction -> coded audio signal (128 kbit/s); a Psychoacoustic Model block is also shown.]
-
Evaluation of coding schemes
4 parameters to evaluate a coding scheme:
1) Speed
  64 kbps -> 2.4 kbps
  256 kbps -> 16 kbps
  1410 kbps -> 128 kbps
2) Quality
  Objective measures
  Subjective measures
3) Complexity
  MIPS
4) Delay
  < 100 ms
-
Overview
Theory 1, 2, 3, 4.5 (until slide 27): Done by Enric Gine
Theory 4.5: Signal Processing Essentials
Theory 5: Quantization and Entropy Coding (coming from sigma-delta)
Theory 6: Time-Frequency Analysis
Theory 7: Psychoacoustics & Psychoacoustic Models
Theory 8: Bit Allocation Strategies
Theory 9: Linear Prediction and Narrow/Wideband Coding
Seminar 1, 2, 3: Done by Enric Gine
Seminar 4: Audio Coding Standards 1 (CELP, AC-3, mp-1-2-3, AAC)
Seminar 5: Audio Coding Standards 2 (MPEG Surround, USAC)
Seminar 6: Lossless Audio Coding
Seminar 7: Quality Measures for Audio Coding (First Seminar?)
  Calibration of the system from input/output (look at Watkinson)
Seminar 8: Sinusoidal Coders
-
Overview: Labs
Lab 1: Enric
Lab 2: Signal Processing Essentials
Lab 3: Quantization & Entropy Coding
Lab 4: Psychoacoustics
Lab 5: To be defined
-
General Block Diagram of an Audio Coder
[Block diagram: PCM signal (706 kbit/s) -> Time-Frequency Transform -> Quantization and Coding (block shown three times) -> Frame Construction -> coded audio signal (128 kbit/s); a Psychoacoustic Model block is also shown. The blocks are annotated with Theory 5, Theory 6, Theory 7, and Theory 8.]
-
Contents
PART II
-
Contents
PART II:
Introduction
  Spectra of Analog Signals
  Review of convolution and filtering
  Uniform Sampling
Discrete-time signal processing
  Transforms for Discrete-Time Signals
  The Discrete and the Fast Fourier Transform
  The Discrete Cosine Transform
  The Short-Time Fourier Transform
Difference equations and digital filters
  The transfer and the frequency response functions
  Poles, Zeros, and Frequency Response
  Examples of digital filters for audio applications
Review of Multirate signal processing
  Down-sampling by an integer
  Up-sampling by an integer
  Sampling rate changes by noninteger factors
  Quadrature mirror filter banks
Discrete-Time Random Signals
  Random signals processed by LTI Digital Filters
  Autocorrelation Estimation from Finite-Length Data
Review on Information Theory
  Entropy, Shannon, Rate Distortion
-
Spectra of Analog Signals (1)
The frequency spectrum of an analog signal is described in terms of the continuous Fourier transform (CFT):
$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$,
where $\omega$ is the frequency in radians per second (rad/s), $\omega = 2\pi f$, and $f$ is the frequency in Hz.
Example 1: The pulse-sinc CFT pair
-
Spectra of Analog Signals (1)
Example 2:
Loss of resolution and spectral leakage (time-domain truncation)
-
Spectra of Analog Signals
The inverse CFT is given by
$x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\, d\omega$.
In CFT theory, $x(t)$ and $X(\omega)$ are called a transform pair, i.e.,
$x(t) \leftrightarrow X(\omega)$
-
Spectra of Analog Signals
In real-life signal processing, all signals have finite length, so time-domain truncation always occurs.
Plain truncation is equivalent to applying a rectangular window.
To smooth out frame transitions and control spectral leakage (sidelobes), the signal is often tapered prior to truncation using windows such as the Hamming, Bartlett, and trapezoidal windows.
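The effect of tapering on leakage can be illustrated numerically. The following is a minimal sketch, assuming a 64-sample frame, a cosine placed halfway between two DFT bins (10.5 cycles per frame), and a symmetric Hamming window; these values and the helper name `dft_magnitude` are illustrative choices, not taken from the slides.

```python
import cmath
import math

def dft_magnitude(x):
    """Magnitude of the N-point DFT of x (direct evaluation)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]

N = 64
# Cosine with a non-integer number of cycles per frame, so truncation causes leakage.
tone = [math.cos(2 * math.pi * 10.5 * t / N) for t in range(N)]
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * t / (N - 1)) for t in range(N)]

rect_mag = dft_magnitude(tone)                                    # rectangular window
ham_mag = dft_magnitude([s * w for s, w in zip(tone, hamming)])   # Hamming window

# Total magnitude in bins well away from the tone (bins 20..31).
far_bins = range(20, 32)
rect_leak = sum(rect_mag[k] for k in far_bins)
ham_leak = sum(ham_mag[k] for k in far_bins)
print(rect_leak, ham_leak)
```

The Hamming-windowed frame shows far less energy in the distant bins, at the cost of a wider main lobe around the tone.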
-
Review of Convolution and Filtering
A linear filter is a linear time-invariant (LTI) system and satisfies the property of superposition.
The output, y(t), is the convolution of the input, x(t), with the filter impulse response, h(t).
Convolution is represented by the integral
$y(t) = \int_{-\infty}^{\infty} h(\tau)\, x(t - \tau)\, d\tau = x(t) * h(t)$.
[Diagram: x(t) -> LTI system h(t) -> y(t) = x(t) * h(t)]
-
Review of Convolution and Filtering
The CFT of the impulse response, h(t), is the frequency response of the filter, i.e.,
$h(t) \leftrightarrow H(\omega)$.
Example: first-order RC circuit (low-pass filter)
The impulse response is a decaying exponential, and the frequency response is a rational function. This function is complex-valued, and its magnitude represents the gain of the filter with respect to frequency at steady state. A sinusoid passing through the filter is scaled in amplitude according to the magnitude response and shifted in phase according to the phase response.
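The RC example can be made concrete with the standard first-order low-pass expressions $h(t) = \frac{1}{RC} e^{-t/RC}$ for $t \ge 0$ and $H(\omega) = \frac{1}{1 + j\omega RC}$. The sketch below evaluates the gain and phase shift at $\omega = 1/(RC)$; the component values are arbitrary assumptions for illustration.

```python
import cmath
import math

R = 1_000.0   # ohms (illustrative value)
C = 1e-6      # farads (illustrative value)
tau = R * C   # time constant RC in seconds

def impulse_response(t: float) -> float:
    """h(t) of the first-order RC low-pass: decaying exponential for t >= 0."""
    return math.exp(-t / tau) / tau if t >= 0 else 0.0

def frequency_response(omega: float) -> complex:
    """H(omega) = 1 / (1 + j*omega*RC), a rational function of omega."""
    return 1.0 / (1.0 + 1j * omega * tau)

cutoff = 1.0 / tau                      # rad/s
H_c = frequency_response(cutoff)
gain = abs(H_c)                         # amplitude scaling of a sinusoid at the cutoff
phase_deg = math.degrees(cmath.phase(H_c))  # phase shift of that sinusoid
print(gain, phase_deg)
```

At the cutoff frequency a sinusoid is attenuated by a factor of $1/\sqrt{2}$ and delayed by 45 degrees, which is the amplitude scaling and phase shift the slide describes.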
-
Uniform Sampling
Nyquist: $F_s \ge 2B$
Multiplication in the time domain is convolution in the frequency domain.
Sampling is a multiplication with a pulse train, so the spectrum of the sampled signal is the convolution of the signal spectrum with the spectrum of the pulse train:
$X_s(\omega) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X(\omega - k\omega_s)$.
An ideal LPF can recover the baseband and hence perfectly reconstruct the analog signal from the digital one.
-
Uniform Sampling
Reconstruction process:
The reconstruction LPF interpolates between the samples and reproduces the analog signal.
The interpolation becomes evident once the filtering operation is interpreted in the time domain as convolution.
Reconstruction occurs by interpolating with the sinc function, which is the impulse response of the ideal LPF.
Note that if Fs is less than 2B, aliasing occurs, and perfect reconstruction is no longer possible.
In real life, the analog signal is not ideally band-limited and the sampling process is not perfect (sampling pulses have finite amplitude and duration), so aliasing occurs.
To reduce aliasing, the signal is prefiltered by an anti-aliasing low-pass filter and usually oversampled ($\omega_s > 2B$), as in sigma-delta conversion.
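Aliasing is easy to demonstrate with two cosines. In this minimal sketch, with an assumed sampling rate of 8 kHz, a 7 kHz tone (above Fs/2) produces exactly the same samples as a 1 kHz tone, so no reconstruction filter can tell them apart. The frequencies are illustrative choices.

```python
import math

FS = 8_000  # sampling rate in Hz (assumed for illustration)

def sample_cosine(freq_hz: float, n_samples: int, fs: int = FS):
    """Samples x(n) = cos(2*pi*f*n/Fs) of an analog cosine."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

low = sample_cosine(1_000, 32)    # below Fs/2: sampled without aliasing
high = sample_cosine(7_000, 32)   # above Fs/2: aliases onto Fs - 7000 = 1000 Hz

max_difference = max(abs(a - b) for a, b in zip(low, high))
print(max_difference)
```

The two sample sequences agree up to floating-point rounding, which is why the 7 kHz component has to be removed by the anti-aliasing filter before sampling.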
-
2.5 Discrete-time signal processing
Audio coding algorithms operate on a quantized discrete-time signal.
Prior to compression, most algorithms require that the audio signal is acquired with high-fidelity characteristics.
Typically it is assumed that the signal is band-limited at 20 kHz, sampled at 44.1 kHz, and quantized at 16 bits per sample.
-
2.5.1 Transforms for Discrete-Time Signals
Discrete-time signals are described in the transform domain using the z-transform and the discrete-time Fourier transform (DTFT).
The z-transform is defined as:
$X(z) = \sum_{n=-\infty}^{\infty} x(n)\, z^{-n}$, where z is complex.
If the z-transform is evaluated on the unit circle, i.e., for
$z = e^{j\Omega}$, $\Omega = 2\pi f T$,
then the z-transform becomes the discrete-time Fourier transform (DTFT):
$X(e^{j\Omega}) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\Omega n}$.
-
2.5.2 The Discrete and the Fast Fourier Transform
The discrete Fourier transform (DFT) is developed by starting from the DTFT analysis expression and considering a finite-length signal consisting of N points, i.e.,
$X(e^{j\Omega}) = \sum_{n=0}^{N-1} x(n)\, e^{-j\Omega n}$.
Furthermore, the frequency-domain signal is sampled uniformly at N points within one period, $\Omega = 0$ to $2\pi$, i.e.,
$\Omega_k = \frac{2\pi k}{N}$, k = 0,1,...,N-1.
It is typical in the DSP literature to replace $\Omega_k$ with the frequency index k, and hence the DFT can be written as:
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$, k = 0,1,2,...,N-1
-
2.5.2 The Discrete and the Fast Fourier Transform (2)
The sampling in the frequency domain forces periodicity in the time domain, i.e.,
x(n) = x(n+N).
We also have periodicity in the frequency domain, X(k) = X(k+N), because the signal in the time domain is also discrete. These periodicities create circular effects when convolution is performed by frequency-domain multiplication, i.e.,
$x(n) \otimes h(n) \leftrightarrow X(k)H(k)$,
where $x(n) \otimes h(n) = \sum_{m=0}^{N-1} h(m)\, x((n - m) \bmod N)$.
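The circular effect can be seen by comparing circular and linear convolution directly in the time domain. The sketch below uses short illustrative sequences; the function names are hypothetical helpers, not part of the course material.

```python
def circular_convolve(x, h):
    """N-point circular convolution: sum over m of h(m) * x((n - m) mod N)."""
    n_points = len(x)
    return [sum(h[m] * x[(n - m) % n_points] for m in range(n_points))
            for n in range(n_points)]

def linear_convolve(x, h):
    """Ordinary (linear) convolution of two finite sequences."""
    out = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out

x = [1, 2, 3, 4]
circular = circular_convolve(x, [1, 1, 0, 0])          # 4-point: tail wraps around
linear = linear_convolve(x, [1, 1])                     # length 4 + 2 - 1 = 5
padded = circular_convolve(x + [0], [1, 1, 0, 0, 0])    # zero-padded to 5 points
print(circular, linear, padded)
```

With only 4 points the last output sample of the linear convolution wraps around and is added to the first one; zero-padding both sequences to at least the linear-convolution length removes the wrap-around.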
-
2.5.2 The Discrete and the Fast Fourier Transform (2)
The N-point inverse DFT (IDFT) is written as:
$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}$, n = 0,1,...,N-1
The DFT can be computed efficiently using the fast Fourier transform (FFT).
The FFT takes advantage of redundancies in the DFT sum by decimating the sequence into subsequences with even and odd indices.
It can be shown that if N is a power of 2, the N-point DFT can be computed using a series of butterfly stages.
The complexity associated with the DFT is of the order of $N^2$, and for the FFT it is roughly of the order of $N \log_2 N$.
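The even/odd decimation idea can be sketched in a few lines. This is a minimal recursive radix-2 decimation-in-time FFT, checked against the direct DFT sum and the IDFT; it is an illustration under the assumption that N is a power of 2, not an optimized implementation, and the test signal is arbitrary.

```python
import cmath

def dft(x):
    """Direct N-point DFT, on the order of N^2 operations."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle            # butterfly, upper output
        out[k + n // 2] = even[k] - twiddle   # butterfly, lower output
    return out

def idft(X):
    """Direct N-point inverse DFT with the 1/N scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

signal = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0, -2.0, 1.5]
max_fft_error = max(abs(a - b) for a, b in zip(dft(signal), fft(signal)))
max_roundtrip_error = max(abs(a - b) for a, b in zip(signal, idft(fft(signal))))
print(max_fft_error, max_roundtrip_error)
```

Each recursion level performs N/2 butterflies and there are log2(N) levels, which is where the N log2 N operation count comes from.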
-
2.5.3 The Discrete Cosine Transform
The discrete cosine transform (DCT) of x(n) can be defined as
$X(k) = c(k) \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x(n) \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right)k\right]$, $0 \le k \le N-1$,
where $c(0) = 1/\sqrt{2}$, and $c(k) = 1$ for $1 \le k \le N-1$.
Depending on the periodicity and symmetry of the input signal, x(n), the DCT can be computed using different orthonormal transforms (usually DCT-1, DCT-2, DCT-3, and DCT-4). (More details will be given in Chapter 6.)
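As a quick check of the definition, the sketch below implements the DCT with the $c(k)\sqrt{2/N}$ scaling and verifies on a short, arbitrarily chosen input that the transform preserves signal energy, as expected from an orthonormal transform.

```python
import math

def dct(x):
    """DCT with scaling c(k) * sqrt(2/N), c(0) = 1/sqrt(2), c(k) = 1 otherwise."""
    n = len(x)
    out = []
    for k in range(n):
        c_k = 1 / math.sqrt(2) if k == 0 else 1.0
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(c_k * math.sqrt(2 / n) * acc)
    return out

x = [1.0, 2.0, 3.0, 4.0]          # illustrative input
X = dct(x)
energy_time = sum(v * v for v in x)
energy_dct = sum(v * v for v in X)
print(X[0], energy_time, energy_dct)
```

For this input the k = 0 coefficient is the scaled sum of the samples, and the energies in both domains coincide.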
-
2.5.4 The Short-Time Fourier Transform
Spectral analysis of non-stationary signals cannot be accommodated by the classical Fourier transform, since the signal has time-varying characteristics.
Time-varying spectral analysis can be performed using the short-time Fourier transform (STFT):
$X(n, \Omega) = \sum_{m=-\infty}^{\infty} x(m)\, h(n - m)\, e^{-j\Omega m} = h(n) * x(n)\, e^{-j\Omega n}$,
where $\Omega = \omega T = 2\pi f T$ is the normalized frequency in radians, and h(n) is the sliding analysis window.
The synthesis expression (inverse transform) is given by
$h(n - m)\, x(m) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(n, \Omega)\, e^{j\Omega m}\, d\Omega$.
If n = m and h(0) = 1, then x(n) can be obtained from the previous equation.
The basic assumption in this type of analysis-synthesis is that the signal is slowly time-varying and can be modeled by its short-time spectrum.
The temporal and spectral resolution of the STFT are controlled by the length of the window (usually constrained to be about 5-20 ms, and hence spectral resolution is sacrificed).
-
2.5.4 The Short-Time Fourier Transform
The sequence, h(n), can also be viewed as the impulse response of an LTI filter, which is excited by a frequency-shifted signal. This leads to the filter-bank interpretation of the STFT, i.e., for a discrete frequency variable $\Omega_k = k(\Delta\Omega)$, k = 0,1,...,N-1, with $\Delta\Omega$ and N chosen such that the speech band is covered.
Then the analysis expression is written as:
$X(n, \Omega_k) = \sum_{m=-\infty}^{\infty} x(m)\, h(n - m)\, e^{-j\Omega_k m} = h(n) * x(n)\, e^{-j\Omega_k n}$
-
2.5.4 The Short-Time Fourier Transform
And the synthesis expression is:
$\tilde{x}(n) = \sum_{k=0}^{N-1} X(n, \Omega_k)\, e^{j\Omega_k n}$,
where $\tilde{x}(n)$ is the signal reconstructed within the band of interest. If h(n), $\Delta\Omega$, and N are chosen carefully, the reconstruction given in the previous slide is as depicted in the following figure, where
$h_k(n) = h(n)\, e^{j\Omega_k n}$.
[Figure: the k-th channel of the analysis-synthesis filter bank]
-
Overlap-Add Decomposition
Consider breaking an input signal x into frames using a finite, zero-phase, length-M window w. Then we may express the m-th windowed data frame as:
$x_m(n) = x(n)\, w(n - mR)$, $n \in (-\infty, \infty)$,
where
R: frame step (hop size)
m: frame index
The hop size is the number of samples between the begin-times of adjacent frames. Specifically, it is the number of samples by which we advance each successive window.
-
Overlap-Add Decomposition
For frame-by-frame spectral processing to work, we must be able to reconstruct x from the individual overlapping frames, ideally by simply summing them in their original time positions:
$\sum_{m=-\infty}^{\infty} x_m(n) = \sum_{m=-\infty}^{\infty} x(n)\, w(n - mR) = x(n) \sum_{m=-\infty}^{\infty} w(n - mR)$
Hence, $x = \sum_m x_m$ if and only if
$\sum_{m=-\infty}^{\infty} w(n - mR) = 1$, for all integers n.
This is the constant-overlap-add (COLA) constraint for the FFT analysis window. It has also been called the partition of unity property.
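The COLA constraint can be checked numerically for a given window and hop size. The sketch below assumes a periodic Hann window of length M = 8 with hop size R = M/2, indexed causally from 0 for simplicity rather than as a zero-phase window; these are illustrative choices, not values from the slides.

```python
import math

M = 8        # window length (assumed)
R = M // 2   # hop size: 50% overlap (assumed)

# Periodic Hann window, indexed 0..M-1.
window = [0.5 - 0.5 * math.cos(2 * math.pi * n / M) for n in range(M)]

def overlap_add_sum(n: int, w, hop: int) -> float:
    """Sum of w(n - m*hop) over all frames m >= 0 whose window covers sample n."""
    return sum(w[n - m * hop] for m in range(n // hop + 1)
               if 0 <= n - m * hop < len(w))

# Evaluate away from the start-up region, where every sample is fully covered.
sums = [overlap_add_sum(n, window, R) for n in range(M, 4 * M)]
print(min(sums), max(sums))
```

Every sum equals 1, so this window and hop satisfy the constraint; changing the hop to a value that breaks the overlap pattern (for example R = 3) makes the sum vary with n.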
-
2.6 Difference Equations and Digital Filters
Digital filters are characterized by difference equations of the form
$y(n) = \sum_{i=0}^{L} b_i\, x(n - i) - \sum_{i=1}^{M} a_i\, y(n - i)$
y(n) is given as a linear combination of present and past inputs minus a linear combination of past outputs (feedback term).
The parameters $a_i$ and $b_i$ are the filter coefficients and control the frequency response characteristics of the digital filter.
Filter coefficients can be made adaptive (time-varying).
IIR filter: when the feedback coefficients are nonzero, the impulse response is infinitely long.
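The difference equation translates almost line by line into code. In this minimal sketch, `b` holds $b_0,\dots,b_L$ and `a` holds the feedback coefficients $a_1,\dots,a_M$; the function name and the example coefficients are illustrative assumptions.

```python
def difference_equation(b, a, x):
    """y(n) = sum_i b[i] x(n-i) - sum_i a_i y(n-i), with a = [a_1, ..., a_M]."""
    y = []
    for n in range(len(x)):
        # Feedforward part: present and past inputs.
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        # Feedback part: past outputs (note the minus sign).
        acc -= sum(a[i - 1] * y[n - i] for i in range(1, len(a) + 1) if n - i >= 0)
        y.append(acc)
    return y

impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# One feedback coefficient a_1 = -0.5: the impulse response 0.5**n never reaches zero (IIR).
iir_response = difference_equation([1.0], [-0.5], impulse)

# No feedback coefficients: the impulse response equals b and is finite (FIR).
fir_response = difference_equation([1.0, 1.0, 1.0], [], impulse)
print(iir_response, fir_response)
```

Feeding an impulse through the recursion shows the distinction on the slide: with feedback the response decays geometrically without ending, whereas without feedback it stops after the last $b_i$.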
-
2.6 Difference Equations and Digital Filters
The impulse response h(n) is:
$h(n) = \sum_{i=0}^{L} b_i\, \delta(n - i) - \sum_{i=1}^{M} a_i\, h(n - i)$
In statistical signal representation, this form corresponds to an ARMA model.
The filter coefficients are chosen such that the filter is stable.
An input-output equation of a causal filter can also be written in terms of the impulse response of the filter, i.e.,
$y(n) = \sum_{m=0}^{\infty} h(m)\, x(n - m)$.
-
2.7 The Transfer and the Frequency Response Functions
The z-transform of the impulse response of a filter is called the transfer function and is given by
$H(z) = \sum_{n=-\infty}^{\infty} h(n)\, z^{-n}$.
Considering the difference equation, we can also obtain the transfer function in terms of the filter parameters, i.e.,
$X(z) \sum_{i=0}^{L} b_i z^{-i} = Y(z) \left(1 + \sum_{i=1}^{M} a_i z^{-i}\right)$
The ratio of output over input in the z domain gives the transfer function in terms of the coefficients:
$H(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + \dots + b_L z^{-L}}{1 + a_1 z^{-1} + \dots + a_M z^{-M}}$
-
2.7 The Transfer and the Frequency Response Functions
The frequency response function is a special case of the transfer function of the filter. That is, for $z = e^{j\Omega}$,
$H(e^{j\Omega}) = H(z)\big|_{z = e^{j\Omega}}$
2.7.1 Poles, Zeros and Frequency Response
A z-domain function, H(z), can be written in terms of its poles and zeros as follows:
$H(z) = G\, \frac{\prod_{i=1}^{L} (1 - \zeta_i z^{-1})}{\prod_{i=1}^{M} (1 - p_i z^{-1})}$,
where $\zeta_i$ and $p_i$ are the zeros and poles of H(z), and G is a constant.
The location of the poles and zeros affects the shape of the frequency response.
The magnitude of the frequency response can be written as:
$|H(e^{j\Omega})| = |G|\, \frac{\prod_{i=1}^{L} |1 - \zeta_i e^{-j\Omega}|}{\prod_{i=1}^{M} |1 - p_i e^{-j\Omega}|}$
-
2.7 The Transfer and the Frequency Response Functions
It is evident that when an isolated zero is close to the unit circle, the magnitude frequency response will assume a small value at that frequency.
When an isolated pole is close to the unit circle, it will give rise to a peak in the magnitude frequency response at that frequency.
In speech processing:
  The presence of poles in z-domain representations of the vocal tract has been associated with the speech formants.
  Formant synthesizers use the pole locations to form synthesis filters for certain phonemes.
  The presence of zeros has been associated with the coupling of the nasal tract. For example, zeros are associated with nasal sounds such as m and n.
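The influence of pole and zero locations can be explored by evaluating the magnitude response from the factored form, $|H(e^{j\Omega})| = |G| \prod_i |1 - \zeta_i e^{-j\Omega}| / \prod_i |1 - p_i e^{-j\Omega}|$. The sketch below assumes a conjugate pole pair at radius 0.95 and angle $\pi/4$ and a conjugate zero pair at radius 0.95 and angle $3\pi/4$; these positions are illustrative and do not model any particular vocal tract.

```python
import cmath
import math

def magnitude_response(gain, zeros, poles, omega):
    """|H(e^{j*omega})| from the pole-zero factorization of H(z)."""
    z_inv = cmath.exp(-1j * omega)
    numerator = math.prod(abs(1 - zero * z_inv) for zero in zeros)
    denominator = math.prod(abs(1 - pole * z_inv) for pole in poles)
    return abs(gain) * numerator / denominator

radius = 0.95  # close to the unit circle (assumed)
poles = [radius * cmath.exp(1j * math.pi / 4), radius * cmath.exp(-1j * math.pi / 4)]
zeros = [radius * cmath.exp(3j * math.pi / 4), radius * cmath.exp(-3j * math.pi / 4)]

grid = [math.pi * i / 511 for i in range(512)]            # 0 .. pi
response = [magnitude_response(1.0, zeros, poles, w) for w in grid]

peak_omega = grid[response.index(max(response))]
dip_omega = grid[response.index(min(response))]
print(peak_omega, dip_omega)
```

The peak appears near the pole angle and the dip near the zero angle, mirroring the formant (pole) and nasal (zero) behavior described above.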
-
2.8 Review of Multirate Signal Processing (MSP)
MSP involves changing the sampling rate while the signal is in the digital domain.
Applications:
  Reduce algorithmic (SW/HW) complexity
  Increase resolution
  Oversampling A/D relaxes the requirements on the anti-aliasing filter but causes a data-rate increase, so downsampling is required
  Downsampling requires a low-pass anti-aliasing filter in the digital domain
  Sigma-delta conversion moves complexity from the analog domain into the digital domain
  For D/A conversion, the signal is interpolated in the digital domain, reducing the requirements on the analog reconstruction (interpolation) filter
-
2.8.1 Downsampling by an Integer
Downsampling increases the sampling period, i.e., it decreases the sampling frequency and data rate of the digital signal:
$x_d(n) = x(nL)$
Given the DTFT transform pairs
$x(n) \leftrightarrow X(e^{j\Omega})$ and $x_d(n) \leftrightarrow X_d(e^{j\Omega})$,
it can be shown that the DTFTs of the original and decimated signals are related by
$X_d(e^{j\Omega}) = \frac{1}{L} \sum_{l=0}^{L-1} X\left(e^{j(\Omega - 2\pi l)/L}\right)$.
Downsampling introduces L copies of the original DTFT that are both amplitude and frequency scaled by L. This may introduce aliasing.
Aliasing can be eliminated if the DTFT is band-limited to $\pi/L$, i.e.,
$X(e^{j\Omega}) = 0$, $\pi/L \le |\Omega| \le \pi$.
-
2.8.1 Downsampling by an Integer
Example: DTFTs of the signal during the downsampling process
-
2.8.2 Upsampling by an Integer
Upsampling involves reducing the sampling period by introducing additional regularly spaced samples in the signal sequence.
The DTFT of the up-sampled signal relates to the DTFT of the original signal as follows:
$X_u(e^{j\Omega}) = X(e^{j\Omega M})$.
-
2.8.2 Upsampling by an Integer
Therefore, the DTFT of the up-sampled signal is described by a series of compressed images of the DTFT of the original signal located at integer multiples of $2\pi/M$ rad.
To complete the upsampling process, an interpolation stage is required that fills in appropriate values in the time domain to replace the artificial zero-valued samples introduced by the upsampling.
-
2.8.3 Sampling Rate Changes by Noninteger Factors
Noninteger rate changes can be accomplished by cascading upsampling and downsampling:
  The upsampling stage precedes the downsampling stage.
  The low-pass interpolation and anti-aliasing filters are combined into one filter whose bandwidth is the minimum of the two filters.
Example: we want a noninteger sampling period modification such that Tnew = 12T/5.
In this case we choose L = 12 and M = 5.
Hence, the bandwidth of the low-pass filter is the minimum of $\pi/12$ and $\pi/5$.
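The bookkeeping for this example can be sketched as follows, using the notation of this section (downsampling factor L, upsampling factor M). The helper name `rate_change` and the 48 kHz starting rate are assumptions for illustration only.

```python
import math
from fractions import Fraction

def rate_change(period_ratio: Fraction):
    """Given T_new / T, return (M, L, cutoff): upsample by M, then downsample by L."""
    down_l = period_ratio.numerator      # L: downsampling factor
    up_m = period_ratio.denominator      # M: upsampling factor
    # Combined low-pass filter: the narrower of the interpolation and
    # anti-aliasing bandwidths.
    cutoff = min(math.pi / up_m, math.pi / down_l)
    return up_m, down_l, cutoff

M_up, L_down, cutoff = rate_change(Fraction(12, 5))   # T_new = 12T/5
original_fs = Fraction(48_000)                        # assumed starting rate in Hz
new_fs = original_fs * M_up / L_down
print(M_up, L_down, cutoff, new_fs)
```

The combined filter bandwidth is pi/12, since the downsampling stage imposes the tighter limit, and the assumed 48 kHz signal ends up at 20 kHz.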
-
Quadrature Mirror Filter Banks
Analysis of a signal in a perceptual audio coder is accomplished using filter banks or frequency-domain transformations (or combinations of both).
The filter bank is used to decompose the signal into several frequency subbands. Different coding strategies are then derived and implemented in each subband (subband coding).
- Aliasing between subbands because of imperfect frequency responses
- Aliasing effects -> no perfect reconstruction
- Solution: combine up- and down-sampling operations with appropriate filter designs
-
Quadrature Mirror Filter
A perfect-reconstruction filter bank of this kind is called a quadrature mirror filter (QMF) bank.
Analysis: H0(z) and H1(z) + downsampling
Synthesis: F0(z) and F1(z)
If the process includes quantizers, those will be placed after the downsampling stages.
-
Quadrature Mirror Filter
The input signal x(n) is first filtered and downsampled.
The DTFT of the downsampled signal in band k is
$X_{d,k}(e^{j\Omega}) = \frac{1}{2}\left[X_k(e^{j\Omega/2}) + X_k(e^{j(\Omega - 2\pi)/2})\right]$, k = 0, 1,
where $X_k(e^{j\Omega}) = H_k(e^{j\Omega})\, X(e^{j\Omega})$.
[Figure: DTFT of the original and down-sampled signals]
-
Quadrature Mirror Filter
The reconstructed signal, $\hat{x}(n)$, is derived by adding the contributions from the up-sampling and the interpolations of the low and the high band.
It can be shown that the reconstructed signal in the z-domain has the form
$\hat{X}(z) = \frac{1}{2}\left[H_0(z)F_0(z) + H_1(z)F_1(z)\right]X(z) + \frac{1}{2}\left[H_0(-z)F_0(z) + H_1(-z)F_1(z)\right]X(-z)$
The signal X(-z) is associated with the aliasing term. The aliasing term can be cancelled by designing the filters to have the following mirror symmetries:
$F_0(z) = H_1(-z)$ and $F_1(z) = -H_0(-z)$
-
Quadrature Mirror Filter
Under these conditions, the overall transfer function of the filter bank can then be written as
$T(z) = \frac{1}{2}\left[H_0(z)F_0(z) + H_1(z)F_1(z)\right]$.
If T(z) = 1, then the filter bank allows perfect reconstruction.
Perfect delayless reconstruction is not realizable, but an all-pass filter bank with linear phase characteristics can be designed easily.
For example, the choice of first-order FIR filters
$H_0(z) = 1 + z^{-1}$, $H_1(z) = 1 - z^{-1}$
results in alias-free reconstruction. The overall transfer function is:
$T(z) = \frac{1}{2}\left[(1 + z^{-1})^2 - (1 - z^{-1})^2\right] = 2z^{-1}$
The signal is reconstructed within a delay of one sample and with an overall gain of 2.
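The first-order example can be verified with plain polynomial arithmetic in $z^{-1}$. In the sketch below a filter is a list of coefficients `[c0, c1, ...]` meaning $c_0 + c_1 z^{-1} + \dots$; the helper names are hypothetical, and the mirror conditions $F_0(z) = H_1(-z)$, $F_1(z) = -H_0(-z)$ are applied directly.

```python
def poly_mul(a, b):
    """Product of two polynomials in z^-1 given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Sum of two equal-length coefficient lists."""
    return [ai + bi for ai, bi in zip(a, b)]

def negate_z(p):
    """Coefficients of P(-z): the n-th coefficient is multiplied by (-1)^n."""
    return [c * (-1) ** n for n, c in enumerate(p)]

h0 = [1, 1]                          # H0(z) = 1 + z^-1
h1 = [1, -1]                         # H1(z) = 1 - z^-1
f0 = negate_z(h1)                    # F0(z) = H1(-z)
f1 = [-c for c in negate_z(h0)]      # F1(z) = -H0(-z)

# T(z) = 1/2 [H0 F0 + H1 F1]; aliasing gain = 1/2 [H0(-z) F0 + H1(-z) F1].
transfer = [0.5 * c for c in poly_add(poly_mul(h0, f0), poly_mul(h1, f1))]
alias = [0.5 * c for c in poly_add(poly_mul(negate_z(h0), f0),
                                   poly_mul(negate_z(h1), f1))]
print(transfer, alias)
```

The transfer polynomial has a single nonzero coefficient, 2, at $z^{-1}$ (gain 2, one-sample delay), and every coefficient of the aliasing term is zero.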
-
Quadrature Mirror Filter
QMF filter banks can be cascaded to form tree structures.
If we represent the analysis stage of a filter bank as a block that divides the signal into low- and high-frequency subbands, then by cascading several such blocks we can divide the signal into smaller subbands (association with wavelet transform theory).
-
Discrete-Time Random Signals
Signals can be classified as deterministic or random.
  Deterministic: values at any point in time can be defined precisely by a mathematical equation (e.g., $x(n) = \sin(\pi n/4)$).
  Random: values are uncertain and are usually described using statistics.
A discrete-time random process involves an ensemble of sequences x(n,m), where m is the index of the m-th sequence in the ensemble and n is the time index.
In practice one does not have access to all possible sample signals of a random process. Therefore, the statistical structure of a random process is often determined from the observed waveform. This approach becomes valid, and simpler, if the random signal is ergodic.
Ergodicity: the statistics of a random process can be determined using time-averaging operations on a single observed signal.
-
Discrete-Time Random Signals
Ergodicity requires that the statistics of the signal are independent of the time of observation.
Stationarity: a random process is wide-sense stationary if its statistics, up to the second order, are independent of time.
Although it is difficult to show analytically that signals with various statistical distributions are ergodic, it can be shown that a stationary zero-mean Gaussian process is ergodic up to second order. (In many practical applications involving stationary processes, it is assumed that the process is also ergodic.)
-
Discrete-Time Random Signals
The mean value, $\mu_x$, of the discrete-time, wide-sense stationary signal, x(n), is a first-order statistic that is defined as the expected value of x(n), i.e.,
$\mu_x = E[x(n)] = \lim_{N \to \infty} \frac{1}{2N + 1} \sum_{n=-N}^{N} x(n)$,
where E[.] denotes statistical expectation.
The variance is $\sigma_x^2 = E[(x(n) - \mu_x)^2]$; its square root is the standard deviation of the signal. For a zero-mean signal, the variance is simply $E[x^2(n)]$.
-
Discrete-Time Random Signals
The autocorrelation of a signal is a second-order statistic defined by:
$r_{xx}(m) = E[x(n + m)\, x(n)] = \lim_{N \to \infty} \frac{1}{2N + 1} \sum_{n=-N}^{N} x(n + m)\, x(n)$,
where m is the autocorrelation lag index. The autocorrelation can be viewed as a measure of predictability of the signal, in the sense that a future value of a correlated signal can be predicted by processing information associated with its past values.
For example, speech is a correlated waveform, and hence it can be modelled by linear prediction mechanisms that predict its current value from a linear combination of past values.
-
Discrete-Time Random Signals
Correlation can also be viewed as a measure of redundancy in the signal:
  Correlated waveforms can be parameterized in terms of statistical time-series models and, hence, represented by a reduced number of information bits.
The autocorrelation sequence, $r_{xx}(m)$, is symmetric and positive definite, i.e.,
$r_{xx}(-m) = r_{xx}(m)$, $r_{xx}(0) \ge |r_{xx}(m)|$.
Example 1: Autocorrelation of a white noise signal
$r_{ww}(m) = \sigma_w^2\, \delta(m)$,
where $\sigma_w^2$ is the variance of the noise. The autocorrelation of white noise is an impulse: an uncorrelated signal.
-
Discrete-Time Random Signals
Example 2:
The autocorrelation of the output of an FIR digital filter, H(z), to a white noise input of zero mean and unit variance is
$r_{yy}(m) = \delta(m + 2) + 2\delta(m + 1) + 3\delta(m) + 2\delta(m - 1) + \delta(m - 2)$
Cross-correlation is a measure of similarity between two signals. The cross-correlation of a signal, x(n), relative to a signal, y(n), is given by
$r_{xy}(m) = E[x(n + m)\, y(n)]$.
Similarly, the cross-correlation of a signal, y(n), relative to a signal, x(n), is given by
$r_{yx}(m) = E[y(n + m)\, x(n)]$.
Note that the symmetry property of the cross-correlation is
$r_{xy}(m) = r_{yx}(-m)$.
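For a zero-mean, unit-variance white input, the output autocorrelation of an FIR filter reduces to the deterministic correlation of its coefficients, $r_{yy}(m) = \sum_k h(k)\, h(k + |m|)$. The sketch below evaluates this for $h = [1, 1, 1]$, i.e., $H(z) = 1 + z^{-1} + z^{-2}$. This particular filter is an assumption: the slide does not state H(z), but this choice reproduces the sequence given in Example 2.

```python
def output_autocorrelation(h, max_lag):
    """r_yy(m) for unit-variance white noise through FIR filter h, for |m| <= max_lag."""
    return {
        m: sum(h[k] * h[k + abs(m)] for k in range(max(len(h) - abs(m), 0)))
        for m in range(-max_lag, max_lag + 1)
    }

h = [1, 1, 1]   # assumed second-order FIR filter: H(z) = 1 + z^-1 + z^-2
r_yy = output_autocorrelation(h, max_lag=3)
print(r_yy)
```

The result is symmetric in m, peaks at m = 0, and vanishes beyond lag 2, which matches the properties of the autocorrelation sequence listed earlier.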
-
Discrete-Time Random Signals
The power spectral density (PSD) of a random signal is defined as the DTFT of the autocorrelation sequence,
$R_{xx}(e^{j\Omega}) = \sum_{m=-\infty}^{\infty} r_{xx}(m)\, e^{-j\Omega m}$.
The PSD is real-valued and nonnegative and describes how the power of the random process is distributed across frequency.
Example: The PSD of a white noise signal is
$R_{ww}(e^{j\Omega}) = \sigma_w^2$
-
Random Signals Processed by LTI Digital Filters
We review the characterization of the statistics of the output of a causal LTI digital filter that is excited by a random signal.
The output of a causal digital filter can be computed by convolving the input with its impulse response, i.e.,
$y(n) = \sum_{m=0}^{\infty} h(m)\, x(n - m)$.
We can derive the following expressions for the mean, cross-correlation, and power spectral density at the output:
$\mu_y = \mu_x\, H(e^{j\Omega})\big|_{\Omega = 0}$
$r_{yx}(m) = \sum_{i=0}^{\infty} h(i)\, r_{xx}(m - i)$
$R_{yy}(e^{j\Omega}) = |H(e^{j\Omega})|^2\, R_{xx}(e^{j\Omega})$
-
Information Theory
In the paper titled "A Mathematical Theory of Communication", Claude Shannon wrote:
"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."
Information Theory (IT): a mathematical framework for approaching a large class of problems related to encoding, transmitting, and decoding information in a systematic and disciplined way.
Since audio (speech, music, etc.) is a form of communication, information theory has served as a basis for audio coding.
-
Information Theory
Entropy can be used to describe the quantity of information:
  1) The amount of uncertainty before seeing an event
  2) The amount of surprise when seeing an event
  3) The amount of information after seeing an event
  (These three are virtually the same.)
According to IT, the information derivable from outcome $x_i$ depends on its probability $P(x_i)$:
  $P(x_i)$ small -> large amount of information
  $P(x_i)$ large -> small amount of information
The amount of information is defined as:
$I(x_i) = \log \frac{1}{P(x_i)} = -\log P(x_i)$
-
Information Theory
Interpretation of the logarithm:
  The information for two independent events occurring together (where the joint probability is the product of the individual probabilities) is simply the sum of the individual information of each event.
  When the logarithm base is 2, the unit of information is called a bit. (1 bit of information is needed to specify the outcome of a choice between two equally likely events.)
X is a discrete random variable taking values $x_i$ (symbols) from a finite sample space S = {x1, x2, ..., xi, ...} (alphabet). Each $x_i$ is produced from the alphabet S according to the probability distribution of the random variable X.
Entropy of the random variable X:
$H(X) = \sum_i P(x_i) \log_2 \frac{1}{P(x_i)}$
H(X) is the amount of information required, on average, to specify which symbol has occurred.
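The two definitions can be tried on small examples. The sketch below computes the self-information $\log_2(1/P(x_i))$ of a single outcome and the entropy of a few distributions; the probability values are illustrative choices.

```python
import math

def self_information(p: float) -> float:
    """Information in bits conveyed by an outcome of probability p."""
    return math.log2(1 / p)

def entropy(probabilities) -> float:
    """H(X) in bits: average self-information over the distribution."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

rare_event_bits = self_information(0.125)    # a 1-in-8 outcome
fair_coin = entropy([0.5, 0.5])
skewed_source = entropy([0.5, 0.25, 0.25])
certain_source = entropy([1.0])
print(rare_event_bits, fair_coin, skewed_source, certain_source)
```

A fair coin carries exactly 1 bit per symbol, a fully predictable source carries none, and the three-symbol source needs 1.5 bits per symbol on average, which is the lower bound an entropy coder aims for.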
-
Summary
Concepts covered during this chapter:
  Continuous Fourier Transform
  Spectral leakage effects
  Convolution, sampling, and aliasing issues
  Discrete-time Fourier transform and z-transform
  The DFT, FFT, DCT, and STFT basics
  Information Theory
-
Any questions?
-
Thanks!