homomorphic speech processing

Unit 3: Homomorphic Speech Processing

General Discrete-Time Model of Speech Production

Basic Speech Model

A short segment of speech can be modelled as having been generated by exciting an LTI system with either a quasi-periodic impulse train or a random noise signal.

Speech analysis => estimate the parameters of the speech model and measure their variations with time (and perhaps even their statistical variability, for quantization).

speech = excitation * system response (here * denotes convolution)

We want to deconvolve speech into excitation and system response => do this using homomorphic filtering methods.
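The model above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the pitch period, the two-pole resonator standing in for the system, and all variable names are hypothetical and do not come from the slides. It shows that convolution in time becomes a product of spectra, and that the logarithm turns that product into a sum, which is the property homomorphic filtering relies on.

```python
import numpy as np

# Excitation: quasi-periodic impulse train (perfectly periodic here, for simplicity).
pitch_period = 80                      # samples; hypothetical value
excitation = np.zeros(400)
excitation[::pitch_period] = 1.0

# System: truncated impulse response of a two-pole resonator (hypothetical parameters).
r, theta = 0.9, 0.3 * np.pi
n = np.arange(200)
h = (r ** n) * np.sin(theta * (n + 1)) / np.sin(theta)

# speech = excitation * system response (linear convolution).
speech = np.convolve(excitation, h)

# With an FFT long enough to hold the whole convolution, convolution becomes a
# product of spectra, and the logarithm turns that product into a sum.
nfft = 1024
E = np.fft.rfft(excitation, nfft)
H = np.fft.rfft(h, nfft)
S = np.fft.rfft(speech, nfft)

log_S = np.log(np.abs(S))
log_sum = np.log(np.abs(E)) + np.log(np.abs(H))
```

Because the two components are added in the log-spectral domain, a linear operation there can in principle separate them, provided they occupy different regions after the inverse transform.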

Superposition Principle

Generalized Superposition for Convolution

Homomorphic Filter

Canonic Form for Homomorphic Deconvolution

Canonic Form for Homomorphic Convolution

Properties of Characteristic Systems

Discrete-Time Fourier Transform Representations

Canonic Form for Deconvolution Using DTFTs

Characteristic System for Deconvolution Using DTFTs

Inverse Characteristic System for Deconvolution Using DTFTs

Issues with Logarithms

Problems with arg Function
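The difficulty with the arg function is that its principal value jumps by 2*pi, while the complex logarithm needs a phase that is continuous in frequency. The sketch below is one possible way to handle this, assuming NumPy's `np.angle` and `np.unwrap` and a simple linear-phase removal step; the function name `complex_cepstrum` and the FFT length are hypothetical choices, and the unwrapping can fail for signals with zeros very close to the unit circle.

```python
import numpy as np

def complex_cepstrum(x, nfft=1024):
    """Sketch of a complex cepstrum via the DFT with an unwrapped phase."""
    X = np.fft.fft(x, nfft)
    log_mag = np.log(np.abs(X))
    # np.angle returns the principal value in (-pi, pi]; np.unwrap removes the
    # 2*pi jumps so that the phase is continuous across frequency samples.
    phase = np.unwrap(np.angle(X))
    # Remove any linear-phase (pure delay) term, estimated from the phase at omega = pi.
    half = nfft // 2
    delay = int(round(-phase[half] / np.pi))
    phase = phase + np.pi * delay * np.arange(nfft) / half
    # Inverse DFT of the complex logarithm; the result is real for real input.
    x_hat = np.fft.ifft(log_mag + 1j * phase).real
    return x_hat, delay

# Minimum-phase example: X(z) = 1 - 0.5 z^-1, whose complex cepstrum is
# -(0.5 ** n) / n for n > 0 and zero for n <= 0.
x = np.array([1.0, -0.5])
x_hat, delay = complex_cepstrum(x)
```

For this minimum-phase example the cepstrum is confined to positive quefrencies, which is why negative-index samples such as `x_hat[-1]` come out numerically zero.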

Complex Cepstrum Properties

Complex and Real Cepstrum

Terminology

Spectrum: Fourier transform of signal autocorrelation
Cepstrum: inverse Fourier transform of log spectrum
Analysis: determining the spectrum of a signal
Alanysis: determining the cepstrum of a signal
Filtering: linear operation on time signal
Liftering: linear operation on cepstrum
Frequency: independent variable of spectrum
Quefrency: independent variable of cepstrum
Harmonic: integer multiple of fundamental frequency
Rahmonic: integer multiple of fundamental quefrency

z-Transform Representation

Characteristic System for Deconvolution

Inverse Characteristic System for Deconvolution

Homomorphic Vocoder

Time-dependent homomorphic processing leads to a convenient representation in which the basic speech parameters are clearly displayed and isolated from one another. The time-dependent complex cepstrum retains all the information of the time-dependent Fourier transform, which is an exact representation of the speech wave.

The cepstrum, however, ignores the phase of the time-dependent Fourier transform and therefore cannot uniquely represent the speech waveform. It is nevertheless a convenient basis for estimating pitch, voicing, and formant frequencies. The cepstrum has also been used directly as a representation of speech in a system called the homomorphic vocoder.
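As a rough illustration of cepstral pitch estimation, the sketch below computes the real cepstrum of a Hamming-windowed frame and picks the largest peak in a plausible pitch range. The function names, the search range (60 to 400 Hz), the sampling rate, and the synthetic test signal are all assumptions made for this example, not details from the slides.

```python
import numpy as np

def real_cepstrum(frame, nfft=2048):
    """Inverse DFT of the log magnitude spectrum (phase is discarded)."""
    log_mag = np.log(np.abs(np.fft.fft(frame, nfft)) + 1e-12)
    return np.fft.ifft(log_mag).real

def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0, nfft=2048):
    """Return (pitch estimate in Hz, height of the cepstral peak)."""
    c = real_cepstrum(frame * np.hamming(len(frame)), nfft)
    qmin = int(fs / fmax)              # shortest pitch period searched, in samples
    qmax = int(fs / fmin)              # longest pitch period searched, in samples
    peak = qmin + int(np.argmax(c[qmin:qmax + 1]))
    return fs / peak, c[peak]

# Synthetic voiced frame: impulse train (period 80 samples at 8 kHz, i.e. 100 Hz)
# driving a hypothetical two-pole resonator.
fs = 8000
excitation = np.zeros(1000)
excitation[::80] = 1.0
n = np.arange(200)
r, theta = 0.9, 0.3 * np.pi
h = (r ** n) * np.sin(theta * (n + 1)) / np.sin(theta)
speech = np.convolve(excitation, h)
frame = speech[200:840]                # 80 ms analysis frame

f0, peak_value = cepstral_pitch(frame, fs)
```

In a practical system the height of the peak would also be compared against a threshold to make the voiced/unvoiced decision; the threshold value is left out here because it depends on the data.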

In the homomorphic vocoder, the cepstrum is computed once every 10-20 msec. Pitch and voicing are estimated from the cepstrum, and the low-time part of each cepstrum is quantized and encoded for transmission or storage. At the synthesizer, an approximation to the impulse response is computed from the quantized low-time cepstrum and explicitly convolved with an excitation function created at the synthesizer from the pitch, voicing, and amplitude information.
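The analysis and synthesis steps described above might be sketched as follows. Several choices here are assumptions rather than part of the slides: quantization is omitted, the impulse response is reconstructed as minimum phase (zero-phase or maximum-phase reconstructions are also possible), and the function names, the number of retained cepstral values (30), and the FFT length are hypothetical.

```python
import numpy as np

def low_time_cepstrum(frame, n_keep=30, nfft=1024):
    """Analysis: keep only the low-time part of the real cepstrum."""
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed, nfft)) + 1e-12)
    return np.fft.ifft(log_mag).real[:n_keep]

def min_phase_impulse_response(c_low, n_out=256, nfft=1024):
    """Synthesis: minimum-phase impulse response from the low-time cepstrum."""
    c_hat = np.zeros(nfft)
    c_hat[0] = c_low[0]
    c_hat[1:len(c_low)] = 2.0 * c_low[1:]   # fold the even cepstrum onto n > 0
    H = np.exp(np.fft.fft(c_hat))           # inverse characteristic system
    return np.fft.ifft(H).real[:n_out]

def synthesize_frame(c_low, pitch_period, voiced, gain, n_samples, rng):
    """Convolve the impulse response with an excitation built from pitch/voicing."""
    if voiced:
        excitation = np.zeros(n_samples)
        excitation[::pitch_period] = 1.0
    else:
        excitation = rng.standard_normal(n_samples)
    h = min_phase_impulse_response(c_low)
    return gain * np.convolve(excitation, h)[:n_samples]

# Demo on a synthetic voiced frame (hypothetical resonator, 80-sample pitch period).
rng = np.random.default_rng(0)
exc = np.zeros(1000)
exc[::80] = 1.0
n = np.arange(200)
h_true = (0.9 ** n) * np.sin(0.3 * np.pi * (n + 1)) / np.sin(0.3 * np.pi)
frame = np.convolve(exc, h_true)[200:840]

c_low = low_time_cepstrum(frame)
voiced_out = synthesize_frame(c_low, 80, True, 1.0, 640, rng)
unvoiced_out = synthesize_frame(c_low, 80, False, 1.0, 640, rng)
```

Only `c_low` plus the pitch, voicing, and gain values would need to be transmitted per frame, which is where the low information rate comes from.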

Homomorphic vocoders produce very high-quality, natural-sounding speech. As with all vocoder systems that attempt to separate the speech parameters into excitation and vocal tract parameters, the homomorphic vocoder achieves a low information rate and provides added flexibility in manipulating the speech signal, at the expense of added complexity in the representation and some degradation in quality.

Nature of Interfering Sounds

Different types of interference may need different suppression techniques.

Noise may be continuous, impulsive, or periodic, and its amplitude may vary across frequency. For example, background or transmission noise is often continuous and broadband.

Noise that is not additive (for example, multiplicative or convolutional distortion) can be handled by applying a logarithmic transformation to the noisy signal, either in the time domain or in the frequency domain, which converts the distortion to an additive one.
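A small time-domain sketch of the logarithmic transformation is given below. The slowly varying gain used as the multiplicative distortion and the random stand-in for the clean signal are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(1000)

s = rng.standard_normal(1000)                   # stand-in for a clean signal
g = 1.0 + 0.5 * np.sin(2 * np.pi * n / 500)     # hypothetical slowly varying gain (> 0)
y = s * g                                       # multiplicative (non-additive) distortion

# Taking the log of the magnitude turns the product into a sum.
log_y = np.log(np.abs(y))
log_s = np.log(np.abs(s))
log_g = np.log(g)
```

Once the distortion is additive in the log domain, techniques designed for additive noise can be applied there before transforming back.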

SPEECH ENHANCEMENT (SE) TECHNIQUES

SPECTRAL SUBTRACTION (SS)
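One common formulation of spectral subtraction is sketched below as an assumption, since the slide gives only the title: an average noise magnitude spectrum, estimated from noise-only frames, is subtracted from the magnitude spectrum of each noisy frame, and the noisy phase is reused. The function name, the spectral floor of 0.01, and the test signal are hypothetical, and windowing with overlap-add is left out to keep the example short.

```python
import numpy as np

def spectral_subtract_frame(noisy_frame, noise_mag, floor=0.01):
    """Magnitude spectral subtraction on one frame, keeping the noisy phase."""
    spectrum = np.fft.rfft(noisy_frame)
    mag = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Subtract the noise estimate; clamp to a small fraction of the original
    # magnitude so that no bin becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy_frame))

rng = np.random.default_rng(0)
frame_len = 256
t = np.arange(frame_len)

# Noise estimate: average magnitude spectrum over noise-only frames.
noise_frames = 0.3 * rng.standard_normal((20, frame_len))
noise_mag = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)), axis=0)

# Noisy frame: a sinusoid standing in for speech, plus white noise.
clean = np.sin(2 * np.pi * 0.05 * t)
noisy = clean + 0.3 * rng.standard_normal(frame_len)

enhanced = spectral_subtract_frame(noisy, noise_mag)
```

The floor is a design choice: clamping to a fraction of the original magnitude instead of zero tends to reduce the isolated spectral peaks that are heard as musical noise.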

FILTERING AND ADAPTIVE NOISE CANCELLATION

Filtering

Multi-Microphone Adaptive Noise Cancellation (ANC)

Thank U
