LPC10: 2.4kbps federal standard in speech coding

Soo Hyun Bae

School of Electrical & Computer Engineering

Georgia Institute of Technology <[email protected]>

ECE 8873 Data Compression & Modeling

03/17/2004



Page 2: Agenda

1. Taxonomy of Speech Coders

2. LPC10 Properties

3. Voicing Classification

4. Levinson-Durbin Recursion

5. Pitch Detection

6. Synthesize Speech

7. Speech Coder Comparison

Page 3: Linear Prediction

Speech coder standards built on linear prediction (LP):

• FS1015 LPC10: LP, 10 Coefficients

• FS1016 CELP: Code Excited LP

• MELP: Mixed Excitation LP

• IS-54 VSELP: Vector Sum Excited LP

• IS-96 QCELP: QualComm Code Excited LP

• G.728 LD-CELP: Low-Delay Code-Excited LP

• G.729 CS-ACELP: Conjugate-Structure Algebraic-Code-Excited LP

Page 4: Where is LPC10?

• Taxonomy of Speech Coders

Speech Coders
    Waveform Coders
        Time Domain: PCM, ADPCM
        Frequency Domain: Sub-band coders, Adaptive transform coders
    Vocoders
        Linear Predictive Coders (LPC10 belongs here)
        Formant Coders

• Waveform Coders: Preserve the signal waveform; they are not specific to speech

• Vocoders: Analyze speech, extract parameters, and use the parameters to synthesize speech

Page 5: Properties (1)

• So called LPC10 because 10 LP coefficients are used

• Bit rate: 2.4 kbps

• Samples/frame: 180 samples

• Bits/frame: 54 bits

• Frame size: 22.5 ms (44.44 frames/sec)

• Target stream: 8 kHz sampling rate, 16-bit quantization
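The figures on this slide are mutually consistent, which a few lines of arithmetic can confirm. The sketch below only recomputes the slide's numbers; the comparison against raw 16-bit PCM is an added illustration, not something the slide states.

```python
# Figures quoted on the slide.
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 180
BITS_PER_FRAME = 54
BITS_PER_SAMPLE = 16

frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ   # frame duration in ms
frames_per_sec = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME    # about 44.44
bit_rate_bps = BITS_PER_FRAME * frames_per_sec         # coded bit rate

# Added illustration: rate of the uncompressed input stream.
raw_pcm_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
compression_ratio = raw_pcm_bps / bit_rate_bps

print(f"{frame_ms} ms/frame, {frames_per_sec:.2f} frames/s, {bit_rate_bps:.0f} bps")
print(f"raw PCM {raw_pcm_bps} bps, ratio about {compression_ratio:.1f}:1")
```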

Page 6: Properties (2)

• Sounds “buzzy” because of noise introduced through parameter updates

• Perfectly regular voiced excitation sounds unnatural; natural speech has some pitch jitter

• Voicing errors produce significant distortion

• Only models speech, so it does not work with background noise. Not suitable for mobile phone applications

Page 7: Encoded stream

Frame layout (54 bits, with field boundaries at bits 0, 41, 48 and 53):

• LP Coefficients: bits 0 to 40 (41 bits)

• Pitch & Voicing: bits 41 to 47 (7 bits)

• Energy: bits 48 to 52 (5 bits)

• The remaining 1 bit is for synchronization

How each field is obtained:

• LP Coefficients: Levinson-Durbin Recursion

• Pitch & Voicing: Causal & Noncausal Prediction Gain

• Energy: Low-Band Speech Energy
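To make the 54-bit budget concrete, here is a minimal sketch that packs and unpacks one frame using the field widths implied by the boundaries 0, 41, 48 and 53. The contiguous, lowest-bits-first ordering and the names `FIELDS`, `pack_frame` and `unpack_frame` are assumptions for illustration; the slide does not specify the bit ordering used on the wire.

```python
# Field widths implied by the slide's boundaries (41 + 7 + 5 + 1 = 54 bits).
# Contiguous, lowest-bits-first placement is an assumption of this sketch.
FIELDS = [
    ("lp_coefficients", 41),
    ("pitch_voicing", 7),
    ("energy", 5),
    ("sync", 1),
]

def pack_frame(values):
    """Pack the field values into a single 54-bit integer."""
    word, offset = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << offset
        offset += width
    return word

def unpack_frame(word):
    """Recover the field values from a 54-bit integer."""
    out, offset = {}, 0
    for name, width in FIELDS:
        out[name] = (word >> offset) & ((1 << width) - 1)
        offset += width
    return out

example = {"lp_coefficients": 123456789, "pitch_voicing": 80, "energy": 17, "sync": 1}
frame_word = pack_frame(example)
```

Round-tripping `unpack_frame(pack_frame(example))` returns the original dictionary, and every packed word fits in 54 bits.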

Page 8: Vocoder

[Figure: vocoder block diagram. Encoder: the original speech is analyzed to obtain the voiced/unvoiced (V/U) decision, the pitch period (voiced only) and the signal power (gain). Decoder: the V/U decision selects either a pulse train, spaced by the pitch period, or random noise; the excitation is scaled by the gain G (signal power) and passed through the vocal tract model to produce the synthesized speech.]

Page 9: Voicing Classification (1)

Voiced Source

– Generated by vocal cords’ vibrations

– Periodic; the spacing is the pitch, $F_0$

Unvoiced Source

– Generated without vibrations

– Excitation is modeled by a White Gaussian Noise source

– No pitch

How to discriminate? Fisher’s Method

Page 10: Voicing Classification (2)

[Flowchart] Compute R(0), then test: is R(0) > R(0) for noise?

• Yes: compute LPC and run pitch detection

• No: silence period
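The energy test in the flowchart can be sketched in a few lines. This is an illustration only: `noise_r0` is a hypothetical reference value, assumed here to have been measured beforehand on a frame known to contain only background noise; the slide does not say how that reference is obtained.

```python
def frame_energy(frame):
    """R(0): the zero-lag autocorrelation, i.e. the sum of squared samples."""
    return sum(s * s for s in frame)

def is_silence(frame, noise_r0):
    """Flowchart test: the frame is processed further only if R(0) > noise R(0)."""
    return frame_energy(frame) <= noise_r0

# Hypothetical 180-sample frames: one quiet, one clearly active.
quiet_frame = [0.01] * 180
active_frame = [1.0] * 180
noise_r0 = 1.0  # assumed noise-floor energy for this illustration

quiet_is_silence = is_silence(quiet_frame, noise_r0)
active_is_silence = is_silence(active_frame, noise_r0)
```

Only frames that fail the silence test would go on to LPC analysis and pitch detection.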

Page 11: Pitch & Voicing (1)

• If x(n) is periodic in N, R(k) is also periodic in N

$$R(k) = \sum_{m=0}^{N-k-1} x(m)\,x(m+k)$$

• Hard to compute, so the autocorrelation is taken on a three-level center-clipped signal $x_c(n)$ with clipping level $C_L$:

$$R_c(k) = \sum_{m=0}^{N-k-1} x_c(m)\,x_c(m+k)$$

$$x_c(n) = \begin{cases} 1 & \text{if } x(n) > C_L \\ -1 & \text{if } x(n) < -C_L \\ 0 & \text{otherwise} \end{cases}$$
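The clipped autocorrelation above can be turned into a simple pitch-lag estimate by picking the lag with the largest $R_c(k)$. The following is a minimal sketch under stated assumptions: the clipping level `c_l = 0.3`, the lag search range and the synthetic sine test signal are all illustrative choices, not values taken from the slides.

```python
import math

def center_clip(x, c_l):
    """Three-level center clipper: +1 above c_l, -1 below -c_l, 0 otherwise."""
    return [1 if s > c_l else -1 if s < -c_l else 0 for s in x]

def autocorr(x, k):
    """R(k) = sum over m = 0 .. N-k-1 of x(m) * x(m+k)."""
    return sum(x[m] * x[m + k] for m in range(len(x) - k))

def estimate_pitch_lag(x, c_l, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes the clipped autocorrelation."""
    xc = center_clip(x, c_l)
    return max(range(min_lag, max_lag + 1), key=lambda k: autocorr(xc, k))

# Synthetic test signal: a 100 Hz tone sampled at 8 kHz (period of 80 samples),
# one 180-sample frame long. Search range and clipping level are assumptions.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs) for n in range(180)]
lag = estimate_pitch_lag(frame, c_l=0.3, min_lag=20, max_lag=156)
pitch_hz = fs / lag
```

Because the clipped samples take only the values -1, 0 and +1, each product in $R_c(k)$ reduces to a sign comparison, which is what makes this form cheaper than the full autocorrelation.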

Page 12: Pitch & Voicing (2)

Page 13: Reflection Coefficient (1)

• The human auditory system is more sensitive to poles than to zeros

$$H(z) = \frac{G}{\prod_{i=1}^{p} (1 - a_i z^{-1})(1 - a_i^{*} z^{-1})}$$

where G is the gain, p is the order, and the a’s are the poles

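Page 14: Reflection Coefficient (2)

• Levinson-Durbin Recursion for all-pole model

Normal equations (the autocorrelation matrix is Toeplitz):

$$
\begin{bmatrix}
R(0) & R(1) & R(2) & \cdots & R(p-1) \\
R(1) & R(0) & R(1) & \cdots & R(p-2) \\
R(2) & R(1) & R(0) & \cdots & R(p-3) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
R(p-1) & R(p-2) & R(p-3) & \cdots & R(0)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_p \end{bmatrix}
= -\begin{bmatrix} R(1) \\ R(2) \\ R(3) \\ \vdots \\ R(p) \end{bmatrix}
$$

Order update from order $j$ to order $j+1$, where $a_j(i)$ is the $i$-th coefficient at order $j$ (so $a_k = a_p(k)$), $\varepsilon_j$ is the prediction error at order $j$ and $\Gamma_{j+1}$ is the reflection coefficient:

$$
\mathbf{R}_{j+1}
\left\{
\begin{bmatrix} 1 \\ a_j(1) \\ a_j(2) \\ \vdots \\ a_j(j) \\ 0 \end{bmatrix}
+ \Gamma_{j+1}
\begin{bmatrix} 0 \\ a_j(j) \\ a_j(j-1) \\ \vdots \\ a_j(1) \\ 1 \end{bmatrix}
\right\}
=
\begin{bmatrix} \varepsilon_j \\ 0 \\ 0 \\ \vdots \\ 0 \\ \gamma_j \end{bmatrix}
+ \Gamma_{j+1}
\begin{bmatrix} \gamma_j \\ 0 \\ 0 \\ \vdots \\ 0 \\ \varepsilon_j \end{bmatrix}
$$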
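The recursion solves the Toeplitz system one order at a time and yields the reflection coefficients as a by-product. Below is a minimal sketch for real-valued autocorrelations. The sign convention is an assumption of this sketch: the denominator is written as A(z) = 1 + sum_k a(k) z^-k, so the coefficients satisfy R a = -r and the final error equals R(0) + sum_k a(k) R(k). The function name `levinson_durbin` and the test autocorrelation are illustrative.

```python
def levinson_durbin(r, p):
    """Levinson-Durbin recursion for a real autocorrelation sequence.

    r: autocorrelation values [R(0), ..., R(p)].
    Returns (a, reflection, error), where a = [1, a(1), ..., a(p)] under the
    assumed convention A(z) = 1 + sum_k a(k) z^-k, reflection holds the
    reflection coefficient found at each order, and error is the final
    prediction error.
    """
    a = [1.0]
    error = r[0]
    reflection = []
    for j in range(p):
        # gamma_j = R(j+1) + sum_{i=1..j} a_j(i) * R(j+1-i)
        gamma = sum(a[i] * r[j + 1 - i] for i in range(j + 1))
        k = -gamma / error
        # a_{j+1} = [a_j, 0] + k * reversed([a_j, 0])
        padded = a + [0.0]
        a = [padded[i] + k * padded[j + 1 - i] for i in range(j + 2)]
        # The error shrinks by (1 - k^2) at every order.
        error *= 1.0 - k * k
        reflection.append(k)
    return a, reflection, error

# Illustrative input: autocorrelation of a first-order process, R(k) = 0.5**k.
a, reflection, error = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this input the second reflection coefficient is zero, which reflects the fact that a first-order model already explains the autocorrelation. Each order costs O(j) operations, so the whole solve is O(p^2) rather than the O(p^3) of a general linear solver.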

Page 15: Energy – Gain Coefficient

• From the autocorrelation matching property, G is calculated from the MSE $\varepsilon_p$ given by the Levinson-Durbin Recursion

$$G^2 = \varepsilon_p = R(0) + \sum_{k=1}^{p} a_k R(k)$$

• Transmit the coefficient G

• Recall

$$H(z) = \frac{G}{\prod_{i=1}^{p} (1 - a_i z^{-1})(1 - a_i^{*} z^{-1})}$$
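A small worked case may help fix the idea. It assumes the convention in which the predictor coefficients satisfy R a = -r (so the gain formula carries a plus sign), and the values R(0) = 1, R(1) = 0.5 and p = 1 are chosen purely for illustration.

```latex
% First-order example, assuming the convention R(0) a_1 = -R(1).
\begin{aligned}
a_1 &= -\frac{R(1)}{R(0)} = -\frac{0.5}{1} = -0.5 \\
G^2 &= R(0) + a_1 R(1) = 1 + (-0.5)(0.5) = 0.75 \\
G   &= \sqrt{0.75} \approx 0.866
\end{aligned}
```

The gain is smaller than $\sqrt{R(0)} = 1$ because part of the signal energy is explained by the predictor rather than by the excitation.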

Page 16: Synthesize speech

• Recall the Encoder/Decoder structure

[Figure: decoder block diagram. The V/U decision selects either a pulse train, spaced by the pitch period, or random noise; the excitation is scaled by the gain G (signal power) and filtered by H(z) to produce the synthesized speech.]
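The decoder structure can be sketched as an excitation generator followed by an all-pole filter. This is an illustration under stated assumptions, not the standard's synthesizer: the filter is written as H(z) = G / A(z) with A(z) = 1 + sum_k a(k) z^-k, the pulse train uses unit impulses, the noise is unit-variance Gaussian, and the name `synthesize_frame` and its parameters are hypothetical.

```python
import random

def synthesize_frame(a, gain, voiced, pitch_period, n_samples, state=None, seed=0):
    """Generate one frame of speech from vocoder parameters.

    a: [1, a(1), ..., a(p)] for the assumed A(z) = 1 + sum_k a(k) z^-k.
    state: the last p output samples of the previous frame, most recent first.
    Returns (samples, new_state).
    """
    rng = random.Random(seed)
    p = len(a) - 1
    history = list(state) if state is not None else [0.0] * p
    out = []
    for n in range(n_samples):
        if voiced:
            # Pulse train: one unit impulse every pitch_period samples.
            excitation = 1.0 if n % pitch_period == 0 else 0.0
        else:
            # White Gaussian noise for unvoiced frames.
            excitation = rng.gauss(0.0, 1.0)
        # All-pole filter: s(n) = G * e(n) - sum_k a(k) * s(n - k)
        s = gain * excitation - sum(a[k] * history[k - 1] for k in range(1, p + 1))
        out.append(s)
        history = ([s] + history)[:p]
    return out, history

# Illustrative first-order filter with a pitch period of 4 samples.
voiced_out, voiced_state = synthesize_frame([1.0, -0.5], 1.0, True, 4, 6)
unvoiced_out, _ = synthesize_frame([1.0, -0.5], 1.0, False, 0, 180)
```

Passing the returned state into the next call keeps the filter memory continuous across frame boundaries, which avoids a click at every 22.5 ms frame edge.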

Page 17: Speech Coder Comparison

Original
