digital speech processing— lecture 10 short-time fourier
TRANSCRIPT
Review of STFT
1. $X_{\hat n}(e^{j\hat\omega}) = \sum_{m=-\infty}^{\infty} x[m]\, w[\hat n - m]\, e^{-j\hat\omega m}$
   – function of $\hat n$ (looks like a time sequence)
   – function of $\hat\omega$ (looks like a transform)
   – $X_{\hat n}(e^{j\hat\omega})$ defined for $\hat n = 1, 2, 3, \ldots$; $0 \le \hat\omega \le \pi$
2. Interpretations of $X_{\hat n}(e^{j\hat\omega})$
   1. $\hat n$ fixed, $\hat\omega$ variable; $X_{\hat n}(e^{j\hat\omega}) = \mathrm{DTFT}\{x[m]\, w[\hat n - m]\}$ ⇒ DFT View ⇒ OLA implementation
   2. $\hat n$ variable, $\hat\omega$ fixed; $X_n(e^{j\hat\omega}) = \left(x[n]\, e^{-j\hat\omega n}\right) * w[n]$ ⇒ Linear Filtering view ⇒ filter bank implementation ⇒ FBS implementation
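A minimal numerical sketch of the two interpretations, assuming a causal Hamming window of length L and a zero-valued signal for n < 0 (the helper names `stft_dft_view` and `stft_filter_view` are hypothetical). Both views should give the same values of the STFT at a fixed analysis frequency:

```python
import numpy as np

def stft_dft_view(x, w, n_hat, omega):
    """Fix n_hat: DTFT of the windowed segment x[m] w[n_hat - m], evaluated at omega."""
    L = len(w)
    m = np.arange(n_hat - L + 1, n_hat + 1)       # support of w[n_hat - m]
    m = m[(m >= 0) & (m < len(x))]                # x[m] assumed zero outside 0..len(x)-1
    return np.sum(x[m] * w[n_hat - m] * np.exp(-1j * omega * m))

def stft_filter_view(x, w, omega):
    """Fix omega: modulate x[n] by e^{-j omega n}, then lowpass filter with w[n]."""
    n = np.arange(len(x))
    return np.convolve(x * np.exp(-1j * omega * n), w)[:len(x)]

rng = np.random.default_rng(0)                    # hypothetical test signal
x = rng.standard_normal(200)
L = 32
w = np.hamming(L)
omega = 2 * np.pi * 5 / L                         # one analysis frequency

X_filter = stft_filter_view(x, w, omega)
X_dft = np.array([stft_dft_view(x, w, n, omega) for n in range(len(x))])
max_diff = np.max(np.abs(X_filter - X_dft))       # agreement up to rounding
```

The DFT view computes one spectrum per time index, while the filtering view computes one time sequence per frequency; the sketch evaluates only a single frequency to keep the comparison short.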
Review of STFT
3. Sampling Rates in Time and Frequency ⇒ recover $x[n]$ exactly from $X_{\hat n}(e^{j\hat\omega})$
   1. time: $W(e^{j\omega})$ has bandwidth of $B$ Hertz ⇒ $2B$ samples/sec rate
      Hamming Window: $B = 2F_S/L$ (Hz) ⇒ sample at $4F_S/L$ (Hz) or every $L/4$ samples
   2. frequency: $w[n]$ is time limited to $L$ samples ⇒ need at least $L$ frequency samples to avoid time aliasing
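One consequence of the L/4 shift can be checked numerically: Hamming windows shifted by L/4 samples add up to a constant, which is what overlap-add synthesis relies on. The sketch below assumes the periodic form of the Hamming window, $w[n] = 0.54 - 0.46\cos(2\pi n/L)$, and a hypothetical length L = 64:

```python
import numpy as np

L = 64                      # window length (hypothetical)
R = L // 4                  # window shift of L/4 samples
n = np.arange(L)
w = 0.54 - 0.46 * np.cos(2 * np.pi * n / L)   # periodic Hamming window (assumed form)

# Overlap-add shifted copies of the window.
total = np.zeros(4 * L)
for start in range(0, len(total) - L + 1, R):
    total[start:start + L] += w

interior = total[L:-L]      # samples covered by a full set of L/R overlapping windows
```

In the interior region every sample is covered by L/R = 4 windows, and the cosine terms cancel, leaving the constant 4 × 0.54.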
Review of STFT
• with OLA method can recover $x[n]$ exactly using lower sampling rates in either time or frequency, e.g., can sample every $L$ samples (and divide by window), or can use fewer than $L$ frequency samples (filter bank channels); but these methods are highly subject to aliasing errors with any modifications to STFT
• can use windows (LPF) that are longer than $N$ samples and still recover $x[n]$ with $N < L$ frequency channels; e.g., ideal LPF is infinite in time duration, but with zeros spaced $N$ samples apart, where $F_S/N$ is the BW of the ideal LPF
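Review of STFT
[Figure: FBS filter bank; the input $X(e^{j\omega})$ feeds parallel channels $H_0(e^{j\omega}), H_1(e^{j\omega}), \ldots, H_{N-1}(e^{j\omega})$ whose outputs are summed to give $Y(e^{j\omega})$]
$h_k[n] = w[n]\, e^{j\omega_k n}$, with $\omega_k = 2\pi k/N$
$\tilde H(e^{j\omega}) = \sum_{k=0}^{N-1} H_k(e^{j\omega})$
$\tilde h[n] = \sum_{k=0}^{N-1} h_k[n] = w[n] \sum_{k=0}^{N-1} e^{j\omega_k n} = w[n]\, p[n]$
$p[n] = N \sum_{r=-\infty}^{\infty} \delta[n - rN]$
$\tilde h[n] = N \sum_{r=-\infty}^{\infty} w[rN]\, \delta[n - rN] = N w[0]\, \delta[n] + N w[N]\, \delta[n - N] + \ldots$
⇒ need to design digital filters that match criteria for exact reconstruction of $x[n]$ and which still work with modifications to STFT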
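The composite impulse response can be checked directly. This is a small sketch assuming N = 8 channels, $\omega_k = 2\pi k/N$, and a causal Hamming window of hypothetical length L = 20 (longer than N, so the term at n = N is visible):

```python
import numpy as np

N = 8                                   # number of channels (hypothetical)
L = 20                                  # window length, chosen longer than N
w = np.hamming(L)
n = np.arange(L)
omega_k = 2 * np.pi * np.arange(N) / N

# h_k[n] = w[n] e^{j omega_k n}, one row per channel
h = w[None, :] * np.exp(1j * np.outer(omega_k, n))
h_tilde = h.sum(axis=0)                 # composite response, sum over all channels

# Predicted result: N w[rN] at n = rN, zero elsewhere
expected = np.zeros(L)
expected[::N] = N * w[::N]
```

Only the samples w[0], w[N], w[2N], ... survive, so exact reconstruction needs all of them except w[0] to be zero (for example, a window no longer than N samples).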
Tree-Decimated Filter Banks
• can sample STFT in time and frequency using lowpass filter (window) which is moved in jumps of R<L samples if L≤N where:
– L is the window length,
– R is the window shift, and
– N is the number of frequency channels
• for a given channel at ω=ωk, the sampling rate of the STFT need only be twice the bandwidth of the window Fourier transform
– down-sample STFT estimates by a factor of R at the transmitter
– up-sample back to original sampling rate at the receiver
– final output formed by convolution of the up-sampled STFT with an appropriate lowpass filter, f[n]
Filter Bank Channels
Fully decimated and interpolated filter bank channels; (a) analysis with bandpass filter, down-shifting frequency and down-sampling; (b) synthesis with up-sampler followed by lowpass interpolation filter and frequency up-shift; (c) analysis with frequency down-shift followed by lowpass filter followed by down-sampling; (d) synthesis with up-sampling followed by frequency up-shift followed by bandpass filter.
Full Implementation of Analysis-Synthesis System
Full implementation of STFT analysis/synthesis system with channel decimation by a factor of R, and channel interpolation by the same factor R (including a box for short-time modifications).
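Review of Down and Up-Sampling
• down-sampling by $R$: $x[n] \rightarrow\ \downarrow R\ \rightarrow x_d[n] = x[nR]$
$X_d(e^{j\omega}) = \frac{1}{R} \sum_{r=0}^{R-1} X(e^{j(\omega - 2\pi r)/R})$ (aliasing addition)
• up-sampling by $R$: $x[n] \rightarrow\ \uparrow R\ \rightarrow x_u[n]$, where $x_u[n] = x[n/R]$ for $n = 0, \pm R, \pm 2R, \ldots$ and $x_u[n] = 0$ otherwise
$X_u(e^{j\omega}) = X(e^{j\omega R})$ (imaging removal)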
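Both spectral relations can be verified on a short sequence. The sketch below assumes R = 3 and a hypothetical random test sequence, and evaluates the DTFT by direct summation (the helper `dtft` is not from the slides):

```python
import numpy as np

def dtft(x, omega):
    """DTFT of a finite sequence x[0..len(x)-1] at the frequencies in omega."""
    n = np.arange(len(x))
    return np.array([np.sum(x * np.exp(-1j * w * n)) for w in np.atleast_1d(omega)])

R = 3
rng = np.random.default_rng(1)            # hypothetical test sequence
x = rng.standard_normal(12)
omega = np.linspace(-np.pi, np.pi, 50)

# Down-sampling: R frequency-scaled, shifted copies of X add together (aliasing).
xd = x[::R]
Xd = dtft(xd, omega)
Xd_formula = sum(dtft(x, (omega - 2 * np.pi * r) / R) for r in range(R)) / R

# Up-sampling: the frequency axis is compressed by R (images appear).
xu = np.zeros(len(x) * R)
xu[::R] = x
Xu = dtft(xu, omega)
Xu_formula = dtft(x, omega * R)
```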
Filter Bank Reconstruction
• assuming no short-time modifications ($W_{rR}$ denotes the set of samples $m$ covered by the window positioned at $rR$):
$X_{rR}[k] = X_{rR}(e^{j\omega_k}) = \sum_{m \in W_{rR}} w[rR - m]\, x[m]\, e^{-j\omega_k m}$
$y[n] = \frac{1}{N} \sum_{k=0}^{N-1} P[k]\, Y_n(e^{j\omega_k})\, e^{j\omega_k n}$
$\quad = \frac{1}{N} \sum_{k=0}^{N-1} P[k] \left[ \sum_{r=-\infty}^{\infty} X_{rR}[k]\, f[n - rR] \right] e^{j\omega_k n}$
$\quad = \sum_{r=-\infty}^{\infty} \sum_{m \in W_{rR}} x[m]\, w[rR - m]\, f[n - rR] \left[ \frac{1}{N} \sum_{k=0}^{N-1} P[k]\, e^{j\omega_k (n - m)} \right]$
• recognizing that when $P[k] = 1,\ \forall k$:
$\frac{1}{N} \sum_{k=0}^{N-1} e^{j\omega_k (n - m)} = \sum_{q=-\infty}^{\infty} \delta[n - m - qN]$
• can show that the condition for $y[n] = x[n],\ \forall n$, is
$\sum_{r=-\infty}^{\infty} w[rR - n + qN]\, f[n - rR] = \delta[q]$ ⇒ $m = n$
Filter Bank Reconstruction
• frequency domain equations, assuming:
$h_k[n] = w[n]\, e^{j\omega_k n}$; $g_k[n] = f[n]\, e^{j\omega_k n}$;
$H_k(e^{j\omega}) = W(e^{j(\omega - \omega_k)})$; $G_k(e^{j\omega}) = F(e^{j(\omega - \omega_k)})$;
$Y(e^{j\omega}) = \frac{1}{N} \sum_{k=0}^{N-1} P[k]\, G_k(e^{j\omega}) \cdot \left[ \frac{1}{R} \sum_{\ell=0}^{R-1} H_k(e^{j(\omega - 2\pi\ell/R)})\, X(e^{j(\omega - 2\pi\ell/R)}) \right]$
$\quad = X(e^{j\omega}) \left[ \frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, G_k(e^{j\omega})\, H_k(e^{j\omega}) \right] + \sum_{\ell=1}^{R-1} X(e^{j(\omega - 2\pi\ell/R)}) \left[ \frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, G_k(e^{j\omega})\, H_k(e^{j(\omega - 2\pi\ell/R)}) \right]$
$\quad = \tilde H(e^{j\omega})\, X(e^{j\omega}) + \text{aliasing terms}$
Filter Bank Reconstruction
• conditions for perfect reconstruction ($Y(e^{j\omega}) = X(e^{j\omega})$):
Flat Gain: $\tilde H(e^{j\omega}) = \frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, G_k(e^{j\omega})\, H_k(e^{j\omega}) = 1$
and
Alias Cancellation: $\frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, G_k(e^{j\omega})\, H_k(e^{j(\omega - 2\pi\ell/R)}) = 0$ for $\ell = 1, 2, \ldots, R-1$
Decimated Analysis-Synthesis Filter Bank
• practical solution for the case $w[n] = f[n]$, $N = R = 2$ (2-band solution), where the conditions for exact reconstruction become
$\frac{1}{4} \left[ P[0]\, F(e^{j\omega})\, W(e^{j\omega}) + P[1]\, F(e^{j(\omega - \pi)})\, W(e^{j(\omega - \pi)}) \right] = 1$
and
$\frac{1}{4} \left[ P[0]\, F(e^{j\omega})\, W(e^{j(\omega - \pi)}) + P[1]\, F(e^{j(\omega - \pi)})\, W(e^{j(\omega - 2\pi)}) \right] = 0$
• aliasing cancels exactly if $P[0] = -P[1] = 1$ (with $F(e^{j\omega}) = W(e^{j\omega})$), giving flat gain to within a delay of $M$ samples:
$\frac{1}{4} \left[ W^2(e^{j\omega}) - W^2(e^{j(\omega - \pi)}) \right] = e^{-j\omega M}$
$\hat y[n] = x[n - M]$
Decimated Analysis-Synthesis Filter Bank
• when aliasing is completely eliminated (by suitable choice of filters), the overall analysis-synthesis system has frequency response:
$\tilde H(e^{j\omega}) = \frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, W_e(e^{j(\omega - \omega_k)})$
where
$W_e(e^{j\omega}) = W(e^{j\omega})\, F(e^{j\omega})$
$\tilde h[n] = w_e[n]\, p[n]$
where
$w_e[n] = w[n] * f[n]$
$p[n] = \frac{1}{RN} \sum_{k=0}^{N-1} P[k]\, e^{j\omega_k n}$
Decimated Analysis-Synthesis Filter Bank
• To design a decimated analysis/synthesis filter bank system, need to determine the following:
1. the number of channels, N, and the decimation/interpolation ratio R. (Usually determined by the desired frequency resolution.)
2. the window, w[n], and the interpolation filter impulse response, f[n]. (These are lowpass filters that should have good frequency-selective properties such that the bandpass channel responses do not overlap into more than one band on either side of the channel.)
3. the complex gain factors, P[k], k=0,1,…,N-1. (These constants are important for achieving flat overall response and alias cancellation.)
Maximally Decimated Filter Banks
• we show here that it is possible to obtain exact reconstruction with R=N and N<L.
– R=N is termed maximal decimation because this is the largest decimation factor that can be used with an N-channel filter bank and still achieve exact reconstruction.
– nominal bandwidth of the bandpass channel signal is increased from ±π/N to ±π
– down-sampling by N accomplishes frequency down-shifting normally accomplished by modulation
Two Channel Filter Banks
• For this two-band system, we have: $R = N = 2$; $\omega_k = \pi k$, $k = 0, 1$
• Applying the frequency domain analysis we get (not including the $1/N$ multiplier or the gain factors $P[0]$, $P[1]$):
$Y(e^{j\omega}) = \frac{1}{2} \left[ G_0(e^{j\omega}) H_0(e^{j\omega}) + G_1(e^{j\omega}) H_1(e^{j\omega}) \right] X(e^{j\omega}) + \frac{1}{2} \left[ G_0(e^{j\omega}) H_0(e^{j(\omega - \pi)}) + G_1(e^{j\omega}) H_1(e^{j(\omega - \pi)}) \right] X(e^{j(\omega - \pi)})$
• Aliasing can be eliminated completely by choosing the filters such that:
$G_0(e^{j\omega}) H_0(e^{j(\omega - \pi)}) + G_1(e^{j\omega}) H_1(e^{j(\omega - \pi)}) = 0$ (aliasing cancellation)
• Perfect reconstruction requires:
$\frac{1}{2} \left[ G_0(e^{j\omega}) H_0(e^{j\omega}) + G_1(e^{j\omega}) H_1(e^{j\omega}) \right] = 1$ (flat gain condition)
Quadrature Mirror Filter (QMF) Banks
• Alias cancellation achieved if the filters are chosen to satisfy the conditions:
$h_1[n] = e^{j\pi n} h_0[n] \Leftrightarrow H_1(e^{j\omega}) = H_0(e^{j(\omega - \pi)})$
$g_0[n] = 2 h_0[n] \Leftrightarrow G_0(e^{j\omega}) = 2 H_0(e^{j\omega})$
$g_1[n] = -2 h_1[n] \Leftrightarrow G_1(e^{j\omega}) = -2 H_0(e^{j(\omega - \pi)})$
• Note that there is only a single lowpass filter, with impulse response $h_0[n]$, on which all other filters are based; filter has nominal cutoff of $\pi/2$ rad with a narrow transition to a stopband with adequate attenuation to isolate the two bands
• The filter $h_1[n] = e^{j\pi n} h_0[n]$ is the complementary highpass filter
• The filters $h_0[n]$ and $h_1[n]$ are called Quadrature Mirror Filters (QMF)
Quadrature Mirror Filter (QMF) Banks
• Substituting for the filter relationships we get:
$2 H_0(e^{j\omega}) H_0(e^{j(\omega - \pi)}) - 2 H_0(e^{j(\omega - \pi)}) H_0(e^{j(\omega - 2\pi)}) = 0$
i.e., perfect alias cancellation for any choice of lowpass filter
$Y(e^{j\omega}) = \left[ H_0^2(e^{j\omega}) - H_0^2(e^{j(\omega - \pi)}) \right] X(e^{j\omega}) = \tilde H(e^{j\omega})\, X(e^{j\omega})$
• For perfect reconstruction we require that:
$H_0^2(e^{j\omega}) - H_0^2(e^{j(\omega - \pi)}) = e^{-j\omega M}$
i.e., flat magnitude with delay of $M$ samples, due to the delay of the causal filters of the filter bank
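The claim that aliasing cancels for any choice of $h_0[n]$ can be tested by simulating the two-band system in the time domain. This sketch uses a random (hypothetical) $h_0[n]$ rather than a designed lowpass filter, and models the down-sampler followed by the up-sampler as zeroing every second sample:

```python
import numpy as np

rng = np.random.default_rng(2)
h0 = rng.standard_normal(9)              # any filter; it need not be a good lowpass
n = np.arange(len(h0))
h1 = (-1.0) ** n * h0                    # h1[n] = e^{j pi n} h0[n]
g0 = 2 * h0                              # g0[n] = 2 h0[n]
g1 = -2 * h1                             # g1[n] = -2 h1[n]

def channel(x, h, g):
    """Analysis filter, down-sample by 2, up-sample by 2, synthesis filter."""
    v = np.convolve(x, h)
    u = np.zeros_like(v)
    u[::2] = v[::2]                      # keep even-indexed samples, zero the rest
    return np.convolve(u, g)

x = rng.standard_normal(50)              # hypothetical input
y = channel(x, h0, g0) + channel(x, h1, g1)

# Impulse response of the alias-free system H0^2(w) - H0^2(w - pi)
t = np.convolve(h0, h0) - np.convolve(h1, h1)
y_lti = np.convolve(x, t)
```

The filter bank output matches a purely linear time-invariant system, so no aliasing remains; whether that system has flat gain is a separate question that depends on the design of $h_0[n]$.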
Quadrature Mirror Filters
• Assume FIR lowpass filter of length $L$ samples such that $h_0[n] = h_0[(L-1) - n]$, $0 \le n \le L-1$; this filter has linear phase with a delay of $(L-1)/2$ samples and a frequency response of the form:
$H_0(e^{j\omega}) = A_0(e^{j\omega})\, e^{-j\omega(L-1)/2}$
where $A_0(e^{j\omega})$ is real.
• Using this lowpass filter in the filter bank, the condition for perfect reconstruction becomes:
$\tilde H(e^{j\omega}) = \left[ A_0^2(e^{j\omega}) - e^{j\pi(L-1)} A_0^2(e^{j(\omega - \pi)}) \right] e^{-j\omega(L-1)}$ ⇒ $(L-1)$ must be odd
• If $(L-1)$ is odd, we get:
$\tilde H(e^{j\omega}) = \left[ A_0^2(e^{j\omega}) + A_0^2(e^{j(\omega - \pi)}) \right] e^{-j\omega(L-1)}$
linear phase is guaranteed, but not perfect reconstruction
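The requirement that $(L-1)$ be odd can be illustrated at $\omega = \pi/2$, where the two squared terms are equal in magnitude. The sketch below assumes a Hamming-windowed ideal half-band lowpass filter as a stand-in for $h_0[n]$ (the helpers `lowpass` and `overall_response` are hypothetical):

```python
import numpy as np

def lowpass(L):
    """Symmetric windowed-sinc lowpass with cutoff pi/2 (assumed example design)."""
    n = np.arange(L) - (L - 1) / 2
    return 0.5 * np.sinc(0.5 * n) * np.hamming(L)

def overall_response(h0, omega):
    """H0^2(e^{j omega}) - H0^2(e^{j(omega - pi)}) evaluated at one frequency."""
    n = np.arange(len(h0))
    H = lambda w: np.sum(h0 * np.exp(-1j * w * n))
    return H(omega) ** 2 - H(omega - np.pi) ** 2

gain_L16 = abs(overall_response(lowpass(16), np.pi / 2))   # L - 1 = 15 is odd
gain_L15 = abs(overall_response(lowpass(15), np.pi / 2))   # L - 1 = 14 is even
```

With $L - 1$ even the two terms cancel at $\omega = \pi/2$ and the overall response has a null there, while with $L - 1$ odd they add.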
Quadrature Mirror Filters
• For flat gain we need to satisfy the condition:
$A_0^2(e^{j\omega}) + A_0^2(e^{j(\omega - \pi)}) = 1$
• Solution is to use high order linear phase QMF filters
• Johnston proposed an algorithm for design of the basic lowpass filter that minimized the deviation from unity while providing a desired level of stopband attenuation
Quadrature Mirror Filters
• practical realization of QMF filters
• note that aliasing is cancelled, but the overall frequency response is not perfectly flat
Lowpass filter designed by windowing
Highpass filter is mirror image of lowpass filter
Issues with Perfect Reconstruction
• The motivation for the subband approach is to process different speech properties independently in each band and thus being able to localize the band (compact bandwidth) is important.
• Also, based on the original motivation of short-time analysis, temporal focus (within a limited time duration) is also important.
• Filter bank design is thus the result of a trade-off between perfect reconstruction and bandwidth concentration in a joint criterion:
$J = \alpha \int_{\omega_s}^{\pi} |W(e^{j\omega})|^2\, d\omega + (1 - \alpha) \int_{0}^{\pi} \left(1 - |T(e^{j\omega})|\right)^2 d\omega$
(first term: stop-band leakage; second term: deviation from PR)
• When processing (e.g., quantization, a non-linear operation) is involved, harmonic distortions are inevitable and perfect reconstruction cannot be achieved, as seen in the simple example.
• There exists a design procedure (Smith-Barnwell) for QMF filters that attempts to meet the design constraints as best as possible
Smith-Barnwell QMF Design
1. Start with a sequence $g(n)$ of length, say, 2L-1 (L even) that satisfies $g(n) + (-1)^n g(n) = 2 C_0\, \delta(n)$; that is, even samples=0, except at n=0. $C_0$, the value of the sequence at n=0, needs to be large enough to satisfy the positivity condition (see below).
2. Factor $G(z) = \mathcal{Z}\{g(n)\} = W_0(z)\, W_0(z^{-1})$ such that all roots of $W_0(z)$ are inside the unit circle.
3. Positivity condition: $G(e^{j\omega}) = W_0(e^{j\omega})\, W_0(e^{-j\omega}) = |W_0(e^{j\omega})|^2 > 0$
4. The mirror channel: $W_1(e^{j\omega}) = -e^{-j\omega N}\, W_0(-e^{-j\omega})$, $N = L - 1$; i.e., $W_1(z) = -z^{-N}\, W_0(-z^{-1})$ or equivalently $w_1(n) = (-1)^n\, w_0(N - n)$
5. To eliminate aliasing: $F_0(z) = W_1(-z)$ and $F_1(z) = -W_0(-z)$; i.e., $f_0(n) = (-1)^n\, w_1(n)$ and $f_1(n) = -(-1)^n\, w_0(n)$
Smith-Barnwell QMF Design - Example
• choose $g(n) = \frac{\sin(\pi n/2)}{\pi n}$, $n = 0, \pm 1, \pm 2, \ldots, \pm 7$; L = 8, N = L – 1 = 7
• at n=0, 0.1 is added to satisfy the positivity condition
• Length of $w_0(n)$ = L, then $g(n)$ has length 2L-1
g(n) = [-0.0455 0.0000 0.0637 -0.0000 -0.1061 0.0000 0.3183 0.6 0.3183 0.0000 -0.1061 -0.0000 0.0637 0.0000 -0.0455]
• Factorize $G(z) = \mathcal{Z}\{g(n)\} = W_0(z)\, W_0(z^{-1})$; $W_0(z)$: length L with all roots inside the unit circle
w_0(n) = [1 0.8091 0.3426 -0.2332 -0.2032 0.1590 0.1182 -0.1461]
w_1(n) = $(-1)^n w_0(N - n)$ = [-0.1461 -0.1182 0.1590 0.2032 -0.2332 -0.3426 0.8091 -1]
f_0(n) = $(-1)^n w_1(n)$ = [-0.1461 0.1182 0.1590 -0.2032 -0.2332 0.3426 0.8091 1]
f_1(n) = $-(-1)^n w_0(n)$ = [-1 0.8091 -0.3426 -0.2332 0.2032 0.1590 -0.1182 -0.1461]
all filter arrays start with n=0
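The example filters can be checked by running a signal through the two-band system. The sketch below copies the rounded coefficients of $w_0(n)$ from the example, builds the other three filters from the stated relations, and uses a hypothetical random input. Because $w_0(n)$ as listed is scaled so that its first coefficient is 1, the output is expected to equal the input delayed by N = 7 samples and scaled by $\sum_n w_0^2(n)$ (that scaling is an observation about these numbers, not something stated on the slide):

```python
import numpy as np

# Coefficients copied from the example (rounded to 4 decimals).
w0 = np.array([1, 0.8091, 0.3426, -0.2332, -0.2032, 0.1590, 0.1182, -0.1461])
N = len(w0) - 1                      # N = L - 1 = 7
sign = (-1.0) ** np.arange(N + 1)    # (-1)^n

w1 = sign * w0[::-1]                 # w1(n) = (-1)^n w0(N - n)
f0 = sign * w1                       # f0(n) = (-1)^n w1(n)
f1 = -sign * w0                      # f1(n) = -(-1)^n w0(n)

def channel(x, h, f):
    """Analysis filter h, down/up-sample by 2, synthesis filter f."""
    v = np.convolve(x, h)
    u = np.zeros_like(v)
    u[::2] = v[::2]                  # keep even-indexed samples, zero the rest
    return np.convolve(u, f)

rng = np.random.default_rng(3)       # hypothetical test signal
x = rng.standard_normal(64)
y = channel(x, w0, f0) + channel(x, w1, f1)

gain = np.sum(w0 ** 2)               # overall gain implied by the unnormalized w0
recon_error = np.max(np.abs(y[N:N + len(x)] / gain - x))
```

The residual error comes only from the 4-decimal rounding of the coefficients; aliasing cancels exactly by construction of $f_0$ and $f_1$.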
QMF Frequency Responses (Example)
[Figure: magnitude (dB) and phase (degrees, ×100) of the example QMF filters versus normalized frequency (×π)]
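Quantized Channel Signals
[Figure: two-band system $h_k[n] \rightarrow\ \downarrow 2\ \rightarrow\ \uparrow 2\ \rightarrow g_k[n]$, $k = 0, 1$, with quantization noise $e_0[n]$, $e_1[n]$ added to the channel signals; the channel outputs $\hat y_0[n]$ and $\hat y_1[n]$ add to give $\hat y[n]$]
• Quantization noise is concentrated in the band that it is generated in.
$P_{\hat y}(e^{j\omega}) = \sigma_{e_0}^2\, |G_0(e^{j\omega})|^2 + \sigma_{e_1}^2\, |G_1(e^{j\omega})|^2$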
Subband Coding
• advantages of subband coding
• each subband can be encoded according to the perceptual criterion appropriate to that band
• quantization noise can be confined to the band that produces it
• low energy bands can be encoded so as to produce less noise
• applicable to coding in the 9.6-16 Kbps range
Practical Example of Subband Coder
• subband coder using tree-structured filter bank (Crochiere, 1981)
Concept of Time-Frequency Resolution
• When interested in the changing characteristics of the signal, short time Fourier transform or short time spectral analysis is used.
$F(t, \omega) = \int_{-\infty}^{\infty} f(\tau)\, g(\tau - t)\, e^{-j\omega\tau}\, d\tau$; $g(t)$ is the window function providing focus in time.
Temporal spread: $\sigma_t^2 = \int_{-\infty}^{\infty} (t - t_c)^2 |g(t)|^2\, dt \cdot \left[ \int_{-\infty}^{\infty} |g(t)|^2\, dt \right]^{-1}$
Frequency spread: $\sigma_\omega^2 = \int_{-\infty}^{\infty} (\omega - \omega_c)^2 |G(\omega)|^2\, d\omega \cdot \left[ \int_{-\infty}^{\infty} |G(\omega)|^2\, d\omega \right]^{-1}$
If $g(t)$ is Gaussian, it achieves the minimum time-bandwidth product $\sigma_t \sigma_\omega = 0.5$.
Continuous prolate spheroidal wave function: a bandlimited signal that has the highest energy concentration in a specified time interval.
• To maintain a similar level of uncertainty, the window function should be short to provide time resolution for high frequency components and long to provide frequency resolution for low frequency components.
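The time-bandwidth product of a Gaussian window can be estimated numerically from the two spread definitions. This sketch assumes a hypothetical width parameter and approximates both the Fourier transform and the spread integrals by sums on uniform grids:

```python
import numpy as np

def spread(axis, density):
    """RMS width of a non-negative density sampled on a uniform axis."""
    total = np.sum(density)
    center = np.sum(axis * density) / total
    return np.sqrt(np.sum((axis - center) ** 2 * density) / total)

sigma = 0.7                                   # hypothetical window width
t = np.linspace(-10, 10, 2001)
g = np.exp(-t ** 2 / (2 * sigma ** 2))        # Gaussian window g(t)

# Approximate G(omega) by numerical integration of g(t) e^{-j omega t}.
omega = np.linspace(-12, 12, 601)
dt = t[1] - t[0]
G = np.exp(-1j * np.outer(omega, t)) @ g * dt

sigma_t = spread(t, np.abs(g) ** 2)
sigma_w = spread(omega, np.abs(G) ** 2)
product = sigma_t * sigma_w
```

Changing `sigma` trades time spread against frequency spread, but the product should stay near 0.5 as long as both grids cover the window and its transform.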
Wavelet-Based Methods
Subband method (Band 1, Band 2, …, Band N): Analysis Filters → ↓R → Q → Q⁻¹ → ↑R → Synthesis Filters
Wavelet method: Wavelet Transform → ↓R → Q → Q⁻¹ → ↑R → Inverse Transform
What is in the Wavelet Transform block? What is the difference?
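Wavelet Transform
• Built on set of expansion (or basis) functions that are derived from a mother wavelet through scaling and translation:
Let $\psi(t)$ be a mother wavelet: $\psi_{a,\tau}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left( \frac{t - \tau}{a} \right)$
Let $\Psi(\omega) = \mathcal{F}\{\psi(t)\}$ and $\Psi_{a,\tau}(\omega) = \mathcal{F}\{\psi_{a,\tau}(t)\}$
Admissibility condition: $C_\psi = \int_0^{\infty} \frac{|\Psi(\omega)|^2}{\omega}\, d\omega < \infty$, which requires $\Psi(0) = 0$
Continuous Wavelet Transform:
$F_w(a, \tau) = \int_{-\infty}^{\infty} f(t)\, \psi^*_{a,\tau}(t)\, dt = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^*\!\left( \frac{t - \tau}{a} \right) dt = \frac{1}{\sqrt{a}}\, f(\tau) * \psi^*\!\left( -\frac{\tau}{a} \right)$
$F_w(a, \tau) = \sqrt{a} \int_{-\infty}^{\infty} f(at)\, \psi^*\!\left( t - \frac{\tau}{a} \right) dt$ (stretch or compress the signal for analysis)
Inverse transform: $f(t) = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F_w(a, \tau)\, \psi_{a,\tau}(t)\, \frac{da\, d\tau}{a^2}$
$|F_w(a, \tau)|^2$ = scalogram; $|F(\omega, \tau)|^2$ = spectrogram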
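Wavelet and Fourier Transform
$F_w(a, \tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^*\!\left( \frac{t - \tau}{a} \right) dt = \frac{1}{\sqrt{a}}\, f(\tau) * \psi^*\!\left( -\frac{\tau}{a} \right)$
$\Psi(\omega) = \mathcal{F}\{\psi(t)\}$ and $\Psi_{a,\tau}(\omega) = \mathcal{F}\{\psi_{a,\tau}(t)\}$; $F(\omega) = \mathcal{F}\{f(t)\}$
$\psi_{a,\tau}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left( \frac{t - \tau}{a} \right) \Leftrightarrow \Psi_{a,\tau}(\omega) = \sqrt{a}\, \Psi(a\omega)\, e^{-j\omega\tau}$
$W(a, \omega) = \mathcal{F}\{F_w(a, \tau)\} = \mathcal{F}\left\{ \frac{1}{\sqrt{a}}\, f(\tau) * \psi^*\!\left( -\frac{\tau}{a} \right) \right\} = \sqrt{a}\, F(\omega)\, \Psi^*(a\omega)$
$W(a, \omega)$ is the frequency domain representation of the wavelet transform.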
Scaling and Translation
• For discrete wavelet transform, $a = a_0^{-m}$ and $\tau = n \tau_0 a_0^{-m}$, where m, n are integers:
$\psi_{m,n}(t) = \psi_{a,\tau}(t) = a_0^{m/2}\, \psi(a_0^m t - n\tau_0)$, $m, n \in Z$
• If $a_0 = 2$ and $\tau_0 = 1$ (dyadic or octave sampling):
$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n)$, $m, n \in Z$
• If the set $\{\psi_{m,n}(t)\}$ is complete, it is called affine wavelets.
$F_w(a, \tau) = w_{m,n} = a_0^{m/2} \int_{-\infty}^{\infty} f(t)\, \psi^*(a_0^m t - n\tau_0)\, dt$
$f(t) = \sum_m \sum_n w_{m,n}\, \psi_{m,n}(t)$
Orthogonality condition: $\int_{-\infty}^{\infty} \psi_{m,n}(t)\, \psi^*_{p,q}(t)\, dt = \delta(m - p)\, \delta(n - q)$
• The orthogonality condition may not be easy to satisfy for a general set of "wavelets"; but it is still possible to invert the discrete wavelet transform, sometimes.
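STFT and CWT
[Figure: time-frequency tilings (time on the horizontal axis, frequency on the vertical axis). STFT: constant bandwidth, with bands at $\omega_0, 3\omega_0, 5\omega_0$. CWT: constant-Q bandwidth, with bands at $\omega_0, 2\omega_0, 4\omega_0, 8\omega_0$]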
Other Simpler Transforms or Filter Banks
• Short-time analysis can also be viewed as data transformation, executed in a block-by-block manner, using for example:
– Discrete Cosine Transform
– Discrete Sine Transform
– Filter bank based on DFT/FFT
• General remarks:
– Discussion so far aims at a filter bank framework that allows perfect reconstruction (i.e., no loss of information in the original signal if desired);
– But many speech processing systems do not require perfect reconstruction; rather, one may want to focus on "extraction" of key parameters or features from the speech signal;
– Therefore, other analysis methods (e.g., transformations that do not have an inverse) may still apply.
DFT and DCT
• N-point DFT (Discrete Fourier Transform): $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j(2\pi/N)kn}$
implies $\tilde x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(e^{j(2\pi/N)k})\, e^{j(2\pi/N)kn} = \sum_{r=-\infty}^{\infty} x(n + rN)$
– Possible excessive discontinuity at edges
• 2N-point symmetric sequence: 2N-point DFT ⇒ N-point DCT
$X(k) = \sum_{n=0}^{2N-1} x(n)\, e^{-j(2\pi/2N)kn}$ and $x(n) = x(2N - n)$ ⇒ $X(k) = 2 \sum_{n=0}^{N-1} x(n) \cos(2\pi kn/2N)$, up to end-point terms at $n = 0$ and $n = N$ that depend on the point of symmetry
– Relates to real part of 2N-point DFT when the signal is symmetric
– But, be careful about the point of symmetry!!
Discrete Cosine Transforms
DCT-I: $c_k = \frac{1}{2} \left( x(0) + (-1)^k x(N-1) \right) + \sum_{n=1}^{N-2} x(n) \cos\!\left( \frac{\pi nk}{N-1} \right)$
DCT-II: $c_k = \sum_{n=0}^{N-1} x(n) \cos\!\left[ \frac{\pi}{N} \left( n + \frac{1}{2} \right) k \right]$
– Most common DCT; equivalent (up to a scale factor of 2) to a DFT of 4N real inputs of even symmetry with the even-indexed elements set to zero; i.e., an artificial shift of half sample in the original sampling setup.
DCT-III (IDCT): $c_k = \frac{1}{2} x(0) + \sum_{n=1}^{N-1} x(n) \cos\!\left[ \frac{\pi}{N}\, n \left( k + \frac{1}{2} \right) \right]$
– $x_n$ is even around $n = 0$ and odd around $n = N$; $c_k$ is even around $k = -\frac{1}{2}$ and even around $k = N - \frac{1}{2}$.
Still more variants.
Extensively used in audio, image and video
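The link between the DCT-II and the DFT of a symmetric extension can be sketched as follows. This version uses a 2N-point extension that is symmetric about the half-sample point $n = N - \frac{1}{2}$, plus a phase correction for that half-sample shift; it is one of several equivalent constructions, not necessarily the 4N-point one mentioned above, and the function names are hypothetical:

```python
import numpy as np

def dct2_direct(x):
    """DCT-II by direct summation: c_k = sum_n x(n) cos(pi (n + 1/2) k / N)."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi * (n + 0.5) * k / N)) for k in range(N)])

def dct2_via_dft(x):
    """DCT-II from the 2N-point DFT of the extension [x, reversed x]."""
    N = len(x)
    y = np.concatenate([x, x[::-1]])          # symmetric about n = N - 1/2
    Y = np.fft.fft(y)[:N]
    k = np.arange(N)
    # Remove the half-sample phase shift, then take the real part and halve.
    return 0.5 * np.real(np.exp(-1j * np.pi * k / (2 * N)) * Y)

x = np.array([1.0, 2.0, -1.0, 0.5, 3.0])      # hypothetical test sequence
c_direct = dct2_direct(x)
c_dft = dct2_via_dft(x)
```

The phase factor is where the point of symmetry enters: a different symmetric extension leads to a different DCT type.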
Discrete Sine Transform
$s_{ij} = \sqrt{\frac{2}{N+1}}\, \sin\!\left( \frac{\pi (i+1)(j+1)}{N+1} \right)$, $i, j = 0, 1, 2, \ldots, N-1$; $S = [s_{ij}]_{i,j=0}^{N-1}$
• 2-D case; eliminate one dimension for 1-D
• Relates to imaginary part of ~2N-point DFT
[Figure: a 3-point sequence and its 8-point anti-symmetric extension; the sequence corresponds to an 8-point DFT]
Discrete (Time) Fourier Transform - Revisited
• The DTFT (discrete time Fourier transform) of an N-point sequence is
$X(e^{j\omega}) = \sum_{n=0}^{N-1} x(n)\, e^{-j\omega n}$; frequency $\omega$ is continuous.
• Sample the DTFT at $\omega_k = (2\pi/N)k$, $k = 0, 1, \ldots, N-1$.
• The result is the DFT (Discrete Fourier Transform)
$X(e^{j\omega}) \big|_{\omega = 2\pi k/N} = \sum_{n=0}^{N-1} x(n)\, e^{-j(2\pi/N)kn} = X(k)$
• If we compute the inverse DFT, we obtain
$\tilde x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(e^{j(2\pi/N)k})\, e^{j(2\pi/N)kn} = \sum_{r=-\infty}^{\infty} x(n + rN)$
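The periodic replication in the last relation becomes visible when the DTFT is sampled at fewer points than the sequence length. The sketch below assumes a hypothetical 6-point sequence whose DTFT is sampled at N = 4 frequencies:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # 6-point sequence (hypothetical)
N = 4                                           # fewer frequency samples than len(x)
n = np.arange(len(x))

# Samples of the DTFT at omega_k = 2 pi k / N
X_samples = np.array([np.sum(x * np.exp(-1j * 2 * np.pi * k * n / N)) for k in range(N)])

# The inverse DFT returns one period of the time-aliased sequence sum_r x(n + rN)
x_tilde = np.real(np.fft.ifft(X_samples))
aliased = np.array([x[0] + x[4], x[1] + x[5], x[2], x[3]])
```

With N at least as large as the sequence length, the shifted copies do not overlap and the inverse DFT returns the original sequence.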