Speech Coding Using LPC
What is Speech Coding?
Speech coding is the procedure of transforming a speech signal into a more compact form for:
• Transmission within the available bandwidth
• Encryption
Uncompressed Speech Signal
Analog speech is a band-pass signal between 200 and 3400 Hz.
Uncompressed digital speech is a bit stream at 64 kbit/s (for example, 8000 samples per second at 8 bits per sample).
Transmission technology must transmit the signals from point A to point B:
• with minimum degradation
• using minimum bandwidth
Speech coding
By coding we mean an efficient representation of the signal
– COMPRESSION
The main approaches:
• waveform coding
• transform coding
• parametric / hybrid coding
(waveform and transform coders are, in effect, smart quantizers)
How each of these works:
• Waveform coders: try to find an efficient representation of the waveform, directly.
• Transform coders: try to find an efficient representation in the frequency domain (FFT, etc.).
• Parametric coders: try to find a small set of parameters that are an efficient representation of the signal (excitation → H(ω) → speech).
Comparison of speech coders
LPC (Linear Predictive coding)
LPC is a model for signal production: it is based on the assumption that the speech signal is produced by a very specific model.
Speech Production in Humans
The speech signal is created by:
• A pressure source (lungs), exciting ...
• A filter (vocal tract: pharynx - mouth [soft palate, tongue] - nasal cavity)
For the DSP Engineer
• An excitation source
• A time-varying filter H(t, ω)
Excitation → filter → speech
The model and its representation
The LPC model looks at speech as:
• Excitation:
  – periodic (voiced) - originating in the larynx
  – noise (unvoiced) - fricative, produced in the mouth
• An all-pole filter H(ω) representing the vocal tract
Excitation → all-pole filter → speech
Block Diagram
Why the name “Linear Predictive Coding”?
It is assumed that the new sample is a weighted linear combination of the previous p samples, plus a scaled excitation term:
$$s(n) = \sum_{i=1}^{p} a_i\, s(n-i) + G\, e(n)$$
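The recursion above can be run directly as a synthesis filter. The following is a minimal sketch, assuming the sign convention s(n) = Σ a_i s(n-i) + G e(n); the function name `lpc_synthesize`, the coefficient values, and the 80-sample impulse period are illustrative assumptions, not values from the slides.

```python
import numpy as np

def lpc_synthesize(a, excitation, gain=1.0):
    """Compute s(n) = sum_i a[i-1] * s(n-i) + gain * e(n), sample by sample."""
    a = np.asarray(a, dtype=float)
    p = len(a)
    s = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = gain * excitation[n]
        for i in range(1, p + 1):
            if n - i >= 0:          # samples before the start are taken as zero
                acc += a[i - 1] * s[n - i]
        s[n] = acc
    return s

# Voiced-style excitation: an impulse train (hypothetical period of 80 samples)
excitation = np.zeros(400)
excitation[::80] = 1.0
a_example = [1.3, -0.8]             # hypothetical stable 2nd-order predictor
speech_like = lpc_synthesize(a_example, excitation)
```

Replacing the impulse train with white noise would give the unvoiced case of the same model.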
Z-Plane Representation
In the z-plane we can write the model as a transfer function:
$$H(z) = \frac{G}{1 - \sum_{i=1}^{p} a_i z^{-i}}$$
• Clearly this transfer function has only poles, which is why it represents an all-pole filter.
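To make the all-pole property concrete, the poles can be computed as the roots of the denominator polynomial. This is a small sketch assuming the form H(z) = G / (1 - Σ a_i z^{-i}); the coefficient values are hypothetical.

```python
import numpy as np

a = [1.3, -0.8]                     # hypothetical predictor coefficients a_1, a_2

# Denominator 1 - a_1 z^-1 - a_2 z^-2, multiplied through by z^2: z^2 - a_1 z - a_2
denominator = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
poles = np.roots(denominator)

# The filter is stable when every pole lies strictly inside the unit circle
is_stable = bool(np.all(np.abs(poles) < 1.0))
```

For these values the poles form a complex-conjugate pair inside the unit circle, which corresponds to a single resonance of the modeled vocal tract.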
Mathematical analysis
Reminder: our problem is to find the LPC parameters, for a given speech signal. This is called the Inverse Problem.
How do we find the set of parameters that gives the best match to the signal?
What are these parameters?
• The coefficients of the all-pole filter
• The pitch of the speech
How do we find the Coefficients: least squares
Formulation: given a signal s(n), define the prediction error as:
$$e(n) = s(n) - \sum_{i=1}^{p} a_i\, s(n-i)$$
Find the set of a_i that will minimize the mean square error:
$$E = \sum_{n} e^2(n)$$
Solution:
Simply equate the derivative of E to zero:
$$\frac{\partial E}{\partial a_i} = 0, \qquad i = 1, \ldots, p$$
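The intermediate step can be written out as follows. This is a sketch that assumes the error is defined as e(n) = s(n) - Σ_k a_k s(n-k); the sign convention is an assumption of this note.

```latex
\begin{align*}
\frac{\partial E}{\partial a_i}
  &= \sum_n 2\, e(n)\, \frac{\partial e(n)}{\partial a_i}
   = -2 \sum_n e(n)\, s(n-i) = 0 \\
\Rightarrow\quad
  &\sum_n \Big( s(n) - \sum_{k=1}^{p} a_k\, s(n-k) \Big)\, s(n-i) = 0,
  \qquad i = 1, \ldots, p
\end{align*}
```

In words, the optimal prediction error is uncorrelated with each of the p past samples used to form the prediction.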
• Which gives us the Normal Equations:
$$\sum_{k=1}^{p} a_k \sum_{n} s(n-k)\, s(n-i) = \sum_{n} s(n)\, s(n-i), \qquad i = 1, \ldots, p$$
• These are no more than p linear equations in p unknowns...
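Or in matrix form:
$$
\begin{bmatrix}
\sum_n s(n-1)s(n-1) & \sum_n s(n-1)s(n-2) & \cdots & \sum_n s(n-1)s(n-p) \\
\sum_n s(n-2)s(n-1) & \sum_n s(n-2)s(n-2) & \cdots & \sum_n s(n-2)s(n-p) \\
\vdots & \vdots & \ddots & \vdots \\
\sum_n s(n-p)s(n-1) & \sum_n s(n-p)s(n-2) & \cdots & \sum_n s(n-p)s(n-p)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}
=
\begin{bmatrix}
\sum_n s(n-1)s(n) \\ \sum_n s(n-2)s(n) \\ \vdots \\ \sum_n s(n-p)s(n)
\end{bmatrix}
$$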
What is each element of the form
$$\sum_{n} s(n-k)\, s(n-i)\ ?$$
A correlation; in other words: take the signal, multiply it by a shifted version, and sum.
Since our signal is long and time-varying, we did it on short windows.
Two variants:
• autocorrelation method
• covariance method
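As an illustration of the autocorrelation variant, the sketch below builds the matrix from correlations of a single window and solves the p linear equations directly. It assumes the window is treated as zero outside its limits, so each matrix element depends only on |i - k|. The function names and the synthetic second-order test signal are assumptions made for this example.

```python
import numpy as np

def autocorrelation(frame, max_lag):
    """r[k] = sum_n frame[n] * frame[n - k], for k = 0..max_lag."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[k:], frame[:n - k]) for k in range(max_lag + 1)])

def solve_normal_equations(frame, p):
    """Autocorrelation method: solve R a = r with R[i][k] = r[|i - k|]."""
    r = autocorrelation(frame, p)
    R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:p + 1])

# Synthetic test signal generated by a known (hypothetical) 2nd-order model
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.8])
noise = rng.standard_normal(20000)
signal = np.zeros_like(noise)
for n in range(len(noise)):
    past = sum(true_a[i] * signal[n - 1 - i] for i in range(2) if n - 1 - i >= 0)
    signal[n] = past + noise[n]

estimated_a = solve_normal_equations(signal, p=2)   # should be close to true_a
```

A real coder would apply this to short windows of speech rather than one long synthetic signal.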
Solving the Matrix
The coefficients a(i) were found using the Levinson-Durbin recursion.
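A minimal sketch of the Levinson-Durbin recursion is given below. It assumes the autocorrelation method, where the matrix is symmetric Toeplitz, and the convention that the coefficients satisfy Σ_k a_k r(|i-k|) = r(i). The function name and the example autocorrelation values are hypothetical.

```python
import numpy as np

def levinson_durbin(r, p):
    """Solve sum_k a_k r[|i-k|] = r[i], i = 1..p, for a symmetric Toeplitz system.

    r holds autocorrelation values r[0..p]. Returns (a, residual_energy).
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(p)
    err = r[0]
    for m in range(p):
        # Reflection coefficient for model order m + 1
        k = (r[m + 1] - np.dot(a[:m], r[m:0:-1])) / err
        previous = a[:m].copy()
        a[:m] = previous - k * previous[::-1]   # update lower-order coefficients
        a[m] = k
        err *= 1.0 - k * k                      # remaining prediction-error energy
    return a, err

r_example = np.array([1.0, 0.5, 0.2])           # hypothetical autocorrelation values
a_ld, residual = levinson_durbin(r_example, 2)
```

The recursion needs on the order of p² operations, compared with p³ for a general linear solver, and produces the reflection coefficients of every lower order along the way.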
Second Parameter
The pitch was found by computing the correlation of the signal window with itself (its autocorrelation).
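One way to sketch this is to pick the lag at which the window's autocorrelation peaks. The search range of 50 to 400 Hz, the 8 kHz sampling rate in the example, and the function name are assumptions for illustration, not values from the slides.

```python
import numpy as np

def estimate_pitch_period(frame, fs, f_min=50.0, f_max=400.0):
    """Return the lag (in samples) at which the frame's autocorrelation peaks."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Keep the non-negative lags 0..N-1 of the full correlation
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    return lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))

fs = 8000                                        # assumed sampling rate in Hz
t = np.arange(400) / fs
test_frame = np.sin(2 * np.pi * 100 * t)         # 100 Hz tone, period 80 samples
period = estimate_pitch_period(test_frame, fs)
```

A weak or absent peak in the same search range is one possible basis for the voiced/unvoiced decision.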
These parameters were then transmitted:
Bit rate for plain LPC vocoder
Parameter                  Bits
Predictor coefficients     18 * 8 = 144
Gain                       5
Pitch period               6
Voiced/unvoiced switch     1
Total                      156
Overall bit rate           50 * 156 = 7800 bits / second
Bit rate for voice-excited LPC vocoder with DCT
Parameter                  Bits
Predictor coefficients     18 * 8 = 144
Gain                       5
DCT coefficients           40 * 4 = 160
Total                      309
Overall bit rate           50 * 309 = 15450 bits / second
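The arithmetic in both tables can be checked with a few lines. This sketch assumes the factor of 50 is the number of frames per second, which the slides do not state explicitly; the dictionary and function names are illustrative.

```python
FRAMES_PER_SECOND = 50   # assumed meaning of the factor 50 in the tables

plain_lpc = {
    "predictor coefficients": 18 * 8,
    "gain": 5,
    "pitch period": 6,
    "voiced/unvoiced switch": 1,
}
voice_excited_lpc = {
    "predictor coefficients": 18 * 8,
    "gain": 5,
    "DCT coefficients": 40 * 4,
}

def bit_rate(fields, frames_per_second=FRAMES_PER_SECOND):
    """Bits per second = frames per second * total bits per frame."""
    return frames_per_second * sum(fields.values())

plain_rate = bit_rate(plain_lpc)
voice_excited_rate = bit_rate(voice_excited_lpc)
```

Both rates are well below the 64 kbit/s figure quoted earlier for uncompressed digital speech.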
Conclusion
• Sound produced by the LPC method is not exactly the original sound, but it remains intelligible.
• LPC can be used in speech recognition systems.
• LPC was widely used in military applications because of its low transmission bit rate.
• There are many variants over the basic scheme: LPC-10, CELP, MELP, RELP, VSELP, ASELP, LD-CELP...