RASTA Processing of Speech

Post on 13-May-2015

Category:

Science

DESCRIPTION

A presentation of Hermansky & Morgan's 1994 paper, RASTA Processing of Speech. Learn the dramatic effect of RASTA on critical band analysis when combined with PLP to do speech detection! Hermansky, Hynek, and Nelson Morgan. "RASTA processing of speech." Speech and Audio Processing, IEEE Transactions on 2.4 (1994): 578-589.

TRANSCRIPT

RASTA Processing of Speech

Hynek Hermansky & Nelson Morgan

The Question

Using stochastic techniques to derive information from sound seems wasteful, especially since non-speech components have a predictable effect on the speech signal.

Can we suppress spectral components that change too quickly or slowly to be speech?

The Answer

RASTA, much like human listeners, isolates not the speech components themselves but the relative spectral changes, in order to reduce slowly changing or steady-state factors (noise!). This emphasizes changes, or “edges”.

Quick disclaimer: we definitely know what we’re talking about

Edge Detection

Inspiration

Humans perceive speech-like sounds based on the spectral difference between the current sound and the preceding sound.

Sounds!

An analogous situation might occur in time-reversed speech:

Intelligibility of Time Reversed Speech

Filters

More Sounds!

What band pass filters sound like from Chris’ experiments.

Speech Processing Review

http://www.learnartificialneuralnetworks.com/images/srfig01.jpg

Perceptual Linear Prediction

http://svr-www.eng.cam.ac.uk/~ajr/SA95/img181.gif

Replace the conventional critical-band short-term spectrum in PLP analysis with a spectral estimate in which each frequency channel is band-pass filtered by a filter with a sharp spectral zero at zero frequency.

The new estimate is less sensitive to slow variations in the short-term spectrum.

The RASTA Method

1. Compute critical-band power spectrum (PLP)
2. Transform spectral amplitude through compressing static nonlinear transformation (RASTA)
3. Filter the time trajectory of each transformed spectral component (RASTA)
4. Transform the filtered speech representation through expanding static nonlinear transformation (RASTA)
5. Multiply by the equal loudness curve and exponentiate by 0.33 to simulate hearing (PLP)
6. Compute an all-pole model of the result (PLP)
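Steps 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the input layout (`band_power[t][b]`), the function names, and the `floor` guard are hypothetical, and the filter coefficients are the causal form of the band-pass filter as reported in the 1994 paper (they do not appear in this transcript).

```python
import math

# Assumed coefficients: causal form of the band-pass filter reported in the
# 1994 paper. Dropping the pure time advance only delays the output.
NUMERATOR = [0.2, 0.1, 0.0, -0.1, -0.2]
POLE = 0.98


def bandpass(trajectory):
    """Filter one band over time: y[n] = POLE*y[n-1] + sum_k b[k]*x[n-k]."""
    out = []
    prev_y = 0.0
    for n in range(len(trajectory)):
        acc = POLE * prev_y
        for k, b in enumerate(NUMERATOR):
            if n - k >= 0:
                acc += b * trajectory[n - k]
        out.append(acc)
        prev_y = acc
    return out


def log_rasta(band_power, floor=1e-12):
    """Apply steps 2-4 to band_power[t][b] (frame t, critical band b)."""
    n_frames = len(band_power)
    n_bands = len(band_power[0])
    result = [[0.0] * n_bands for _ in range(n_frames)]
    for b in range(n_bands):
        # Step 2: compressing static nonlinearity (natural log).
        log_traj = [math.log(max(frame[b], floor)) for frame in band_power]
        # Step 3: band-pass filter the time trajectory of this band.
        filtered = bandpass(log_traj)
        # Step 4: expanding static nonlinearity (exponential).
        for t in range(n_frames):
            result[t][b] = math.exp(filtered[t])
    return result
```

A constant gain on a band becomes a constant offset after the log, and the band-pass filter removes constant offsets once its start-up transient has decayed. Calling `log_rasta` on a spectrogram and on the same spectrogram scaled by a fixed factor should therefore give nearly identical late frames.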

RASTA-PLP

The Key → suppress constant factors in each component of the auditory-like spectrum, prior to estimation of the all-pole model.

Research issues:
- What domain is filtering in?
- What filter to use?

Speech Signal

Spectral Analysis

Bank of Compressing Static Nonlinearities

Bank of Linear Bandpass Filters

Bank of Expanding Static Nonlinearities

Continued Processing

For this paper: an IIR filter with this transfer function

Resulting Filter
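The transfer function itself did not survive in this transcript. As a hedged sketch, assume the form reported in the 1994 paper, H(z) = 0.1 z^4 (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1), and a frame rate of 100 Hz (one spectral frame every 10 ms, also an assumption). The function name `rasta_response` is hypothetical. The code evaluates the magnitude response at a few modulation frequencies:

```python
import cmath
import math

FRAME_RATE_HZ = 100.0  # assumption: one spectral frame every 10 ms


def rasta_response(freq_hz, pole=0.98):
    """Magnitude of the assumed H(z) at z = exp(j*2*pi*f/frame_rate)."""
    z = cmath.exp(2j * math.pi * freq_hz / FRAME_RATE_HZ)
    numerator = 0.1 * z**4 * (2 + z**-1 - z**-3 - 2 * z**-4)
    denominator = 1 - pole * z**-1
    return abs(numerator / denominator)


# Print the gain at a few modulation frequencies, from DC up to Nyquist.
for f in (0.0, 0.1, 1.0, 4.0, 10.0, 50.0):
    print(f"{f:5.1f} Hz  |H| = {rasta_response(f):.4f}")
```

Under these assumptions the numerator coefficients sum to zero, so the gain at 0 Hz is exactly zero (the sharp spectral zero that removes constant components), while modulation frequencies of a few hertz pass with gain close to one.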

- Affects choice of compressing/expanding static nonlinear function (The domain):

1. Logarithmic
2. Lin-Log

Two Flavors of RASTA

Logarithmic amplitude transformation (step 2)

Antilogarithmic (exponential) transformation (step 4)

Lin-log: y = ln(1 + Jx), where J is a signal-dependent positive constant. The transformation is linear-like for J ≪ 1 and logarithmic-like for J ≫ 1.

J=0.1

J=1.0
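To make the two regimes concrete, here is a small sketch assuming the lin-log form y = ln(1 + Jx). The function names are hypothetical. The always-positive approximate inverse e^y / J reflects my reading of the paper rather than anything on the slides, so treat it as an assumption.

```python
import math


def lin_log(x, j):
    """Compressing nonlinearity y = ln(1 + J*x), for x >= 0 and J > 0."""
    return math.log1p(j * x)


def lin_log_inverse_exact(y, j):
    """Exact inverse x = (exp(y) - 1) / J; negative whenever y < 0."""
    return math.expm1(y) / j


def lin_log_inverse_approx(y, j):
    """Assumed approximate inverse x = exp(y) / J; always positive."""
    return math.exp(y) / j


# Small J*x: the output tracks J*x (linear-like).
print(lin_log(0.001, 0.1), 0.1 * 0.001)
# Large J*x: the output tracks ln(J) + ln(x) (logarithmic-like).
print(lin_log(1e6, 1.0), math.log(1e6))
```

The behaviour is governed by the product Jx rather than J alone, which is why J is described as signal-dependent. A band-pass filtered trajectory can go negative, so the exact inverse can return negative "power"; that is the motivation for showing the positive approximation alongside it.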

Results

Digits recorded over phone lines, with or without noise or changes in noise over time

Isolated Digits Recognition

Large Vocab Continuous Speech

Four speakers each reading 2,652 sentences

Sentences were preserved as recorded or had a low-pass filter applied to them

Next Experiments

● Let’s train the model with no noise and then test it with noise in the background

● Analogous to software assembled in the factory and used in the real world

● RASTA > PLP when noise changes between training and test

● Success of RASTA depends on transform of signal

Isolated Digits Recognition

Large Vocab Continuous Speech

● Again, success depends on filter used

Optimizing J

● It seems important, then, to pick an appropriate J (the domain parameter) for each level of noise

● This can be approximated by measuring energy at the first part of an utterance

● Performance improves even more!
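One way to picture the energy-based choice of J is sketched below. Everything specific here is an assumption for illustration: the function name `estimate_j`, the use of the first 10 frames as a noise-only segment, and the inverse-proportional rule with a `scale` constant are hypothetical and are not taken from the slides.

```python
def estimate_j(frame_energies, n_leading=10, scale=1.0):
    """Pick J from the mean energy of the first n_leading frames.

    Assumes those leading frames contain background noise only.
    """
    leading = frame_energies[:n_leading]
    if not leading:
        raise ValueError("need at least one frame")
    noise_energy = sum(leading) / len(leading)
    # Hypothetical rule: J shrinks as the noise grows, so noise-level
    # values fall in the linear-like region of ln(1 + J*x).
    return scale / max(noise_energy, 1e-12)


quiet = [0.5] * 10 + [40.0] * 20   # low background noise, then speech
noisy = [8.0] * 10 + [40.0] * 20   # high background noise, then speech
print(estimate_j(quiet), estimate_j(noisy))
```

With a rule of this shape, a noisier recording gets a smaller J, which pushes the processing toward the linear-like end of the lin-log transform.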

Consequences of RASTA Processing

● Most important advance of RASTA: compare current information to previous information

● This highlights transitions and changes → edge detection!