An Analysis/Synthesis Auditory Filterbank
Based on an IIR Implementation of the Gammachirp
Toshio Irino* and Masashi Unoki*,**
* ATR Human Information Processing Research Labs.
2-2 Hikaridai Seika-cho Soraku-gun Kyoto, 619-0288, JAPAN
** Japan Advanced Institute of Science and Technology
1-1 Asahidai Tatsunokuchi Nomi Ishikawa, 923-1292, JAPAN
Irino and Unoki:Analysis/Synthesis gammachirp filterbank
J. Acoust. Soc. Jpn.(E),Vol. 20,No. 5, pp397-406, Nov. 1999
Abstract
This paper proposes a new auditory filterbank that enables signal resynthesis
from dynamic representations produced by a level-dependent auditory
filterbank. The filterbank is based on a new IIR implementation of the
gammachirp, which has been shown to be an excellent candidate for
asymmetric, level-dependent auditory filters. First, the gammachirp filter is
shown to decompose into a cascade of a gammatone filter and an
asymmetric function. The asymmetric function is closely approximated by a
minimum-phase IIR filter, named the "asymmetric compensation filter". Then,
two filterbank structures are presented, each based on the combination of a
gammatone filterbank and a bank of asymmetric compensation filters controlled
by a signal level estimation mechanism. The inverse filter of the asymmetric
compensation filter is always stable because the minimum-phase condition is
satisfied. When a bank of inverse filters is utilized after the gammachirp
analysis filterbank and the idea of wavelet transform is applied, it is possible to
resynthesize signals with small time-invariant errors and achieve a guaranteed
precision. This feature has never been accomplished by conventional active
auditory filterbanks. The proposed analysis/synthesis gammachirp filterbank is
expected to be useful in various applications where human auditory filtering has
to be modeled.
Keywords: Auditory filterbank, Level-dependent asymmetric spectrum,
Analysis/synthesis system, Wavelet, Gammatone
1. INTRODUCTION
Intensive efforts have been made to introduce human auditory
characteristics into the signal processing for telecommunications systems
including a recent example in audio coding. A number of auditory models have
been proposed to simulate the peripheral auditory system (for a review, see
Giguère and Woodland, 1994), but none of them have been used as successfully
as linear predictive analysis and the Fourier transform in such systems. One
obvious reason for this has been the processing speed; fast digital signal
processors should resolve this problem in the near future. One of the other
major reasons might be that no signal resynthesis procedure is provided with
any realistic auditory model.
Linear auditory filterbanks, or wavelet transforms, have been used for
signal resynthesis (Combes et al., 1989; Yang et al., 1992), but they are unable
to account for the dynamic characteristics of basilar membrane motion. Iterative
procedures to reconstruct signals from cochleagrams (i.e., short-time averaged
amplitude responses of basilar membrane motion without phase information)
(Irino and Kawahara, 1993; Slaney, 1995) are applicable to such nonlinear
filterbanks, but they do not guarantee the precision of the resynthesis due to
local minima. Thus, it would be desirable to have a dynamic auditory
filterbank that also provides a sound resynthesis procedure resulting in no
perceptual distortion. This paper shows that it is possible to derive such an
analysis/synthesis filterbank with time-varying coefficients through a new
implementation of the "gammachirp" (Irino, 1995, 1996; Irino and Patterson,
1997).
The gammachirp was analytically derived as a function satisfying
minimal uncertainty in joint time-scale representations (Cohen, 1993; Irino,
1995, 1996). The gammachirp auditory filter is an extension of the popular
gammatone filter (for a review, see Patterson et al., 1995); it has an additional
frequency-modulation term to produce an asymmetric amplitude spectrum.
When the degree of asymmetry is associated with the stimulus level, the
gammachirp filter can provide an excellent fit to 12 sets of notched-noise
masking data from three different studies (Irino and Patterson, 1997). The
gammachirp has a much simpler impulse response than recent physiological
models of cochlear mechanics (Giguère and Woodland, 1994), which have not
provided a good fit to human masking data. Moreover, the chirp term in the
gammachirp is consistent with physiological observations on frequency-
modulations or frequency “glides” in mechanical responses of the basilar
membrane (Møller and Nilsson, 1979; de Boer and Nuttall, 1997; Recio et al.,
1998).
The gammachirp filter has been implemented as a finite impulse
response (FIR) filter because the gammachirp is defined as a time-domain
function. Application to an auditory filterbank, however, poses some problems.
For simulation of the dynamic characteristics of the cochlea, for instance, the
filter coefficients have to be recalculated and applied to the signal for each
sample time. Unfortunately, the large number of FIR coefficients, especially at
low frequencies, precludes fast filtering. Moreover, this simulation becomes
unrealistic if the signals are stored until all FIR coefficients are recalculated for
every sample point. The calculation of the filter output and the update of the
filter coefficients should be performed almost simultaneously. Therefore, the
gammachirp filter needs to be implemented with a small number of filter
coefficients, which dictates that it be an infinite impulse response (IIR) filter
(Irino and Unoki, 1997a,b).
IIR implementations of modified gammatone filters have been
developed to introduce asymmetry into auditory filter shapes, i.e., the All-Pole
Gammatone Filter (APGF) or the One-Zero Gammatone Filter (OZGF)
(Slaney, 1993; Lyon, 1996). The degree of filter asymmetry has been associated
with signal level using a level estimation circuit (Pflueger et al., 1998). The
shapes of these filters, however, depend on the sampling rate of the system
(Irino and Unoki, 1997a) and have not been directly fitted to psychoacoustical
masking data. Moreover, it has not been demonstrated that signals are
resynthesized from the output of such nonlinear filterbanks with time-varying
asymmetric filters. These issues are the main topics of this paper.
2. IMPLEMENTATION OF GAMMACHIRP FILTER
2.1 Definition and Fourier transform of the gammachirp
The complex impulse response of the gammachirp (Irino, 1995, 1996;
Irino and Patterson, 1997) is given as
$$g_c(t) = a\,t^{n-1}\exp\bigl(-2\pi b\,\mathrm{ERB}(f_r)\,t\bigr)\exp\bigl(j2\pi f_r t + jc\ln t + j\phi\bigr), \tag{1}$$
where time t > 0, a is the amplitude, n and b are parameters defining the
envelope of the gamma distribution, and fr is the asymptotic frequency. c is a
parameter for the frequency modulation or the chirp rate, φ is the initial phase,
ln t is the natural logarithm of time, and ERB(fr) is the equivalent rectangular
bandwidth of an auditory filter at fr. At moderate levels, ERB(fr) = 24.7 + 0.108 fr
in Hz (Glasberg and Moore, 1990). When c = 0, the chirp term, c ln t, vanishes
and this equation represents the complex impulse response of the gammatone,
whose envelope is a gamma distribution function and whose carrier
is a sinusoid at frequency fr (Patterson et al., 1995). Accordingly, the
gammachirp is an extension of the gammatone with a frequency modulation
term.
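As a minimal sketch of Eq. (1), the following Python fragment evaluates the complex gammachirp on a sampled time axis. The sampling rate, the duration, and the parameter values (n = 4, b = 1.0, c = -2, fr = 2000 Hz) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990)."""
    return 24.7 + 0.108 * f_hz

def gammachirp(t, fr, n=4, b=1.0, c=-2.0, a=1.0, phi=0.0):
    """Complex gammachirp of Eq. (1), evaluated for t > 0 (in seconds)."""
    t = np.asarray(t, dtype=float)
    envelope = a * t ** (n - 1) * np.exp(-2 * np.pi * b * erb(fr) * t)
    phase = 2 * np.pi * fr * t + c * np.log(t) + phi
    return envelope * np.exp(1j * phase)

fs = 48000.0                       # sampling rate in Hz (assumed)
t = np.arange(1, 2401) / fs        # 50 ms, starting one sample after t = 0
gc = gammachirp(t, fr=2000.0, c=-2.0)
gt = gammachirp(t, fr=2000.0, c=0.0)   # c = 0 reduces to the gammatone
```

The two responses share the same gamma envelope; only the instantaneous phase differs through the c ln t term.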
The Fourier transform of the gammachirp in Eq. (1) is derived as
follows.
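$$
\begin{aligned}
G_C(f) &= \frac{a\,\Gamma(n+jc)\,e^{j\phi}}{\{2\pi b\,\mathrm{ERB}(f_r) + j2\pi(f-f_r)\}^{n+jc}} \\
&= \frac{\tilde{a}}{\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\cdot e^{j\theta}\}^{n+jc}} \\
&= \tilde{a}\cdot\frac{1}{\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\}^{n}\cdot e^{jn\theta}}\cdot\frac{1}{\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\}^{jc}\cdot e^{-c\theta}} \\
&= \tilde{a}\cdot\frac{1}{\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\}^{n}}\cdot e^{-jn\theta}\cdot e^{c\theta}\cdot e^{-jc\ln\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\}}
\end{aligned}
\tag{2}
$$
$$\theta = \arctan\frac{f-f_r}{\tilde{b}}, \tag{3}$$
where $\tilde{a} = a\,\Gamma(n+jc)\,e^{j\phi}$ and $\tilde{b} = b\,\mathrm{ERB}(f_r)$. The first term $\tilde{a}$ is a constant. The
second term is known as the Fourier spectrum of the gammatone, GT(f). The
third term represents an asymmetric function, HA(f), that is described in more
detail in the next subsection. When the amplitude is normalized ($\tilde{a} = 1$), the
frequency response of the gammachirp is
$$G_C(f) = G_T(f)\cdot H_A(f). \tag{4}$$
The amplitude spectrum is
$$|G_C(f)| = |G_T(f)|\cdot|H_A(f)| = \frac{1}{\{2\pi\sqrt{\tilde{b}^2+(f-f_r)^2}\}^{n}}\cdot e^{c\theta}. \tag{5}$$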
Obviously, when c=0, |HA(f)| (= e^{cθ}) becomes unity and Eq. (5) represents the
amplitude spectrum of the gammatone, |GT(f)|. Figure 1 shows the amplitude
spectra of (a) a gammachirp filter |GC(f)|, (b) a gammatone filter |GT(f)|,
and (c) an asymmetric function |HA(f)| when the chirp parameter c=-2. The
amplitude of |HA(f)| is biased by about -4 dB to normalize the peak of
|GC(f)| to 0 dB. Since the amplitude spectrum of the gammatone filter
|GT(f)| is symmetric on a linear frequency axis, the asymmetric function
|HA(f)| introduces spectral asymmetry and a shift of the peak frequency into
the gammachirp spectrum |GC(f)|.
--- Insert Figure 1 about here ---
The peak frequency fp in the amplitude spectrum can be obtained
analytically by setting the derivative of Eq. (5) to zero and solving the equation
for the frequency. The result is
$$f_p = f_r + \frac{c\cdot\tilde{b}}{n} = f_r + \frac{c\cdot b\,\mathrm{ERB}(f_r)}{n}. \tag{6}$$
Therefore, the size of the peak shift is proportional to the chirp parameter c and
the ratio of the envelope parameter b ERB(fr) to n.
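The peak shift of Eq. (6) can be checked numerically against the amplitude spectrum of Eq. (5). The sketch below is illustrative only: the parameter values and the frequency grid are assumptions, and the function names are hypothetical.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990)."""
    return 24.7 + 0.108 * f_hz

def gammachirp_amplitude(f, fr, n=4, b=1.0, c=-2.0):
    """|GC(f)| of Eq. (5) with the amplitude constant set to 1."""
    bw = b * erb(fr)                                   # b ERB(fr)
    theta = np.arctan((f - fr) / bw)                   # Eq. (3)
    gt = (2 * np.pi * np.sqrt(bw**2 + (f - fr)**2)) ** (-n)   # |GT(f)|
    ha = np.exp(c * theta)                             # |HA(f)|
    return gt * ha

fr, n, b, c = 2000.0, 4, 1.0, -2.0
f = np.linspace(0.0, 4000.0, 400001)                   # 0.01 Hz grid
f_peak_numeric = f[np.argmax(gammachirp_amplitude(f, fr, n, b, c))]
f_peak_eq6 = fr + c * b * erb(fr) / n                  # Eq. (6)
print(f_peak_numeric, f_peak_eq6)
```

With these values ERB(fr) = 240.7 Hz, so Eq. (6) gives 2000 - 120.35 = 1879.65 Hz, and the grid search lands on the same frequency to within the grid spacing.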
2.2 Characteristics of the gammachirp and the asymmetric function
To describe the spectral characteristics of the gammachirp and the
asymmetric function precisely, Eq. (4) is rewritten in a form that explicitly uses
the relevant parameters; that is,
$$G_C(f; n, b, c, f_r) = G_T(f; n, b, f_r)\cdot H_A(f; b, c, f_r). \tag{7}$$
The asymmetric function uses parameters b, c, and fr whereas the gammatone
uses parameters n, b, and fr.
--- Insert Figure 2 about here ---
Figure 2 shows the amplitude spectra of (a) the gammachirp
|GC(f; n, b, c, fr)| and (b) the asymmetric function |HA(f; b, c, fr)| when the
values of the chirp parameter c are integers between –3 and 3. Several
characteristics are derived from this figure and the equations described above.
(a) Figure 2(a) shows that the filter slope below the peak frequency is shallower
than the slope above it in the gammachirp when the parameter c is negative.
The situation is the reverse when the parameter c is positive. The filter shape
is symmetric when c is zero because it is the gammatone.
(b) The asymmetric function | HA ( f;b,c, fr ) | in Fig. 2(b) is an all-pass filter
when c=0. Using Eq. (2),
$$H_A(f; b, 0, f_r) = 1. \tag{8}$$
| HA ( f;b,c, fr ) | is a high-pass filter when c>0, and a low-pass filter when
c<0. The slope and the range of the amplitude increase when the absolute
value of c increases. The filter shapes of the gammachirps in Fig. 2(a) reflect
these characteristics.
(c) | HA ( f;b,c, fr ) | changes monotonically in frequency. Neither a peak nor a
dip ever occurs in this function.
(d) The gain of the asymmetric function is anti-symmetric about fr on a
logarithmic scale. For an arbitrary frequency f1,
$$|H_A(f_r - f_1; b, c, f_r)| = |H_A(f_r + f_1; b, c, f_r)|^{-1}. \tag{9}$$
(e) With Eq. (2), this produces
$$H_A(f; b, c, f_r) = H_A(f; b, -c, f_r)^{-1}. \tag{10}$$
(f) For arbitrary chirp parameters c1 and c2,
$$H_A(f; b, c_1 + c_2, f_r) = H_A(f; b, c_1, f_r)\cdot H_A(f; b, c_2, f_r). \tag{11}$$
(g) Using Eqs. (7), (10), and (11),
$$
\begin{aligned}
G_C(f; n, b, c_1, f_r) &= G_T(f; n, b, f_r)\cdot H_A(f; b, c_1, f_r) \\
&= G_T(f; n, b, f_r)\cdot H_A(f; b, c_1 + c_2, f_r)\cdot H_A(f; b, -c_2, f_r) \\
&= G_C(f; n, b, c_1 + c_2, f_r)\cdot H_A(f; b, -c_2, f_r)
\end{aligned}
\tag{12}
$$
Equation (12) states that a gammachirp, with an arbitrary chirp parameter c1, is
a product of a gammachirp with a different chirp parameter c1 +c2, and an
asymmetric function having the difference between them, -c2. This is because
the asymmetric function HA( f ;b,c, fr ) is an exponential function in parameter
c.
These characteristics are necessary conditions for designing the
approximation filter in the next subsection, and they act as a guide for
establishing an analysis/synthesis filterbank in Section 3.
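Properties (b), (d), (e), and (f) follow directly from the exponential form of the third term of Eq. (2), and they can be confirmed numerically. The following sketch assumes arbitrary test values for fr, b, c1, and c2; the helper names are hypothetical.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990)."""
    return 24.7 + 0.108 * f_hz

def asymmetric_function(f, fr, b, c):
    """Complex HA(f; b, c, fr): the third term of Eq. (2)."""
    bw = b * erb(fr)
    theta = np.arctan((f - fr) / bw)
    radius = 2 * np.pi * np.sqrt(bw**2 + (f - fr)**2)
    return np.exp(c * theta) * np.exp(-1j * c * np.log(radius))

fr, b = 1000.0, 1.35              # assumed test values
f1 = np.linspace(1.0, 900.0, 500) # offsets from fr in Hz
c1, c2 = -2.0, 0.7

def ha(f, c):
    return asymmetric_function(f, fr, b, c)

check_8 = np.allclose(ha(fr + f1, 0.0), 1.0)                                   # Eq. (8)
check_9 = np.allclose(np.abs(ha(fr - f1, c1)), 1 / np.abs(ha(fr + f1, c1)))    # Eq. (9)
check_10 = np.allclose(ha(fr + f1, c1), 1 / ha(fr + f1, -c1))                  # Eq. (10)
check_11 = np.allclose(ha(fr + f1, c1 + c2), ha(fr + f1, c1) * ha(fr + f1, c2))  # Eq. (11)
print(check_8, check_9, check_10, check_11)
```

All four checks hold because HA depends on c only through an exponent that is linear in c, and because θ is an odd function of f - fr.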
2.3 Asymmetric compensation filter
As shown by Eq. (4), a gammachirp filter can be implemented by
cascading a gammatone filter and an asymmetric filter. Since efficient
implementations of the gammatone are already known (Slaney, 1993; Patterson
et al., 1995), this section concentrates on an approximation filter for the
asymmetric function described in the previous section. It is necessary to design
a filter satisfying the conditions (a) to (g) in the previous section. As a first
step, a filter satisfying condition (d) is considered because this characteristic
seems the most relevant for filter design purposes.
IIR filters satisfying Eq. (9) have pairs of a pole and a zero located
symmetrically at fr+∆fk and fr−∆fk; the number of pairs corresponds
to the number of ∆fk. The corresponding poles and zeros have the same
absolute value, r. Both must lie inside the unit circle: poles inside the
unit circle make the IIR filter stable, and zeros inside it as well make the
filter minimum phase (Oppenheim and Schafer, 1975). Since the bandwidth gets
narrower when r gets closer to unity, r is negatively correlated with the
bandwidth parameter bERB(fr). Condition (b) implies that ∆f is proportional
to c and is positively correlated with bERB(fr). A cascaded second-order
digital filter satisfying these properties is
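$$H_C(z) = \prod_{k=1}^{N} H_{Ck}(z), \tag{13}$$
$$H_{Ck}(z) = \frac{(1 - r_k e^{j\varphi_k} z^{-1})(1 - r_k e^{-j\varphi_k} z^{-1})}{(1 - r_k e^{j\phi_k} z^{-1})(1 - r_k e^{-j\phi_k} z^{-1})}, \tag{14}$$
$$r_k = \exp\{-k\cdot p_1\cdot 2\pi b\,\mathrm{ERB}(f_r)/f_s\}, \tag{15}$$
$$\phi_k = 2\pi\{f_r + p_0^{\,k-1}\cdot p_2\cdot c\cdot b\,\mathrm{ERB}(f_r)\}/f_s, \tag{16}$$
$$\varphi_k = 2\pi\{f_r - p_0^{\,k-1}\cdot p_2\cdot c\cdot b\,\mathrm{ERB}(f_r)\}/f_s, \tag{17}$$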
where p0, p1, and p2 are positive coefficients and fs is the sampling rate. The
reason for cascading filters with slightly offset poles and zeros is to satisfy
condition (c) approximately. This filter is referred to as an "asymmetric
compensation (AC)" filter.
--- Insert Figure 3 about here ---
Figure 3 shows the amplitude spectra of this digital filter | HC( f ) |
(dashed lines) and the asymmetric function | HA ( f) | (solid lines) in Eq. (5) as a
function of the chirp parameter c. The number of cascaded filters was four, the
amplitude was normalized at frequency fr, and the values of p0, p1, and p2 were
set properly (described in the next subsection). The dashed lines are very close
to the solid lines when the frequency is less than 3000 Hz. Above 3000 Hz,
the disparity gets larger. This does not cause serious problems, however,
because the asymmetric compensation filter is always accompanied
by the gammatone filter, which is a band-pass filter.
Actually, the results will show that four cascaded second-order filters
are sufficient when the parameter b is equal to or greater than unity and the
chirp parameter c is between -3 and 1 (see subsection 2.4.1). In this case, the
numbers of poles and zeros are 16 in total. Although it is possible to improve
the fitting by increasing the number of cascaded filters, a reasonable number of
stages is determined by considering the trade-off between the number of
coefficients and the degree of fitting.
2.4 Asymmetric compensation gammachirp
The asymmetric compensation filter cascaded to the gammatone filter
approximates the gammachirp filter. The amplitude spectrum of this filter is
found, by replacing | HA ( f) | with | HC( f ) | in Eq. (5),
$$|G_{CAC}(f)| = |G_T(f)|\cdot|H_C(f)|. \tag{18}$$
This filter GCAC ( f ) is referred to as an "Asymmetric Compensation -
gammachirp" or "AC-gammachirp" filter until the end of Section 2, so as to
distinguish it from the original gammachirp defined by Eq. (1).
--- Insert Figure 4 about here ---
2.4.1 Comparison of the amplitude spectrum
Figure 4 shows the amplitude spectra of the gammachirp | GC( f ) | in Eq.
(5) (solid lines) and the AC-gammachirp | GCAC( f ) | in Eq. (18) (dashed lines).
The amplitude | GCAC( f ) | has been normalized to improve the fit. The
frequency for normalizing the amplitude of each second-order filter is closely
related to the peak shift in Eq. (6), and for the k-th filter,
$$f = f_r + k\cdot p_3\cdot c\cdot b\,\mathrm{ERB}(f_r)/n. \tag{19}$$
The coefficients p0, p1, p2, and p3 are set heuristically as
p0 = 2, (20)
p1 = 1.35 - 0.19 |c|, (21)
p2 = 0.29 - 0.0040 |c|, (22)
p3 = 0.23 + 0.0072 |c|. (23)
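To make the construction concrete, the sketch below builds the second-order sections of Eqs. (13) to (17) with the heuristic coefficients of Eqs. (20) to (22) and evaluates their cascaded frequency response. It is an illustration under stated assumptions rather than the authors' implementation: the sampling rate and parameter values are arbitrary, the function names are hypothetical, and the amplitude normalization of Eq. (19) is omitted.

```python
import numpy as np

def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990)."""
    return 24.7 + 0.108 * f_hz

def ac_filter_sections(fr, b, c, fs, n_sections=4):
    """Return (numerator, denominator) coefficient pairs, one per section."""
    p0 = 2.0                                   # Eq. (20)
    p1 = 1.35 - 0.19 * abs(c)                  # Eq. (21)
    p2 = 0.29 - 0.0040 * abs(c)                # Eq. (22)
    bw = b * erb(fr)
    sections = []
    for k in range(1, n_sections + 1):
        r = np.exp(-k * p1 * 2 * np.pi * bw / fs)           # Eq. (15)
        shift = p0 ** (k - 1) * p2 * c * bw
        pole_angle = 2 * np.pi * (fr + shift) / fs          # Eq. (16)
        zero_angle = 2 * np.pi * (fr - shift) / fs          # Eq. (17)
        # (1 - r e^{ja} z^-1)(1 - r e^{-ja} z^-1) = 1 - 2 r cos(a) z^-1 + r^2 z^-2
        num = np.array([1.0, -2 * r * np.cos(zero_angle), r * r])
        den = np.array([1.0, -2 * r * np.cos(pole_angle), r * r])
        sections.append((num, den))
    return sections

def ac_filter_response(f, sections, fs):
    """Complex frequency response of the cascade, Eq. (13), at frequencies f."""
    z_inv = np.exp(-2j * np.pi * np.asarray(f, dtype=float) / fs)
    h = np.ones_like(z_inv)
    for num, den in sections:
        h *= np.polyval(num[::-1], z_inv) / np.polyval(den[::-1], z_inv)
    return h

fs = 48000.0                                   # assumed sampling rate
sections = ac_filter_sections(fr=2000.0, b=1.0, c=-2.0, fs=fs)
gain_below, gain_above = np.abs(ac_filter_response([1500.0, 2500.0], sections, fs))
flat = ac_filter_response([500.0, 2000.0, 4000.0],
                          ac_filter_sections(2000.0, 1.0, 0.0, fs), fs)
minimum_phase = all(np.all(np.abs(np.roots(p)) < 1.0)
                    for pair in sections for p in pair)
```

For c < 0 the poles sit below fr and the zeros above it, so the gain below fr exceeds the gain above it; for c = 0 the poles and zeros coincide and the response is unity. Because every pole and zero has radius r_k < 1, exchanging numerator and denominator yields a stable inverse.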
The root-mean-squared (rms) error between the original gammachirp
filter and the AC-gammachirp filter is less than 0.41 dB in Fig. 4 in the range
where | GC( f ) |> −50 dB . The average rms error is only 0.63 dB for 90 sets of
parameter combinations {n = 4; b = 1.0, 1.35, and 1.7; c = 1, 0, -1, -2, and -3; fr
= 250, 500, 1000, 2000, 4000, and 8000 (Hz) }, i.e., about the range of
parameter values in a typical fit (Irino and Patterson, 1997). The rms error
exceeds 2 dB only for three sets when fr = 8000 Hz and c = -3.
The fit improved only slightly when the coefficients in Eqs. (21), (22),
and (23) were optimized using an iterative least squared-error method. It would
be possible to improve the fit by changing the locations of the poles and zeros
defined in Eqs. (15), (16), and (17), but that is beyond the scope of this paper.
--- Insert Figure 5 about here ---
2.4.2 Comparison of the impulse response and the phase spectrum
Figure 5(a) shows an example of the impulse response of the
gammachirp defined in Eq. (1) (solid line) and the AC-gammachirp
corresponding to Eq. (18) (dashed line). The difference in the impulse response
between the original gammachirp and the AC-gammachirp is about -50 dB
SNR and therefore is negligible. Their phase spectra shown in Fig. 5(b) are very
close to each other. Therefore, the AC-gammachirp is able to provide an
excellent approximation to the original gammachirp in terms of phase
characteristics, i.e.,
$$G_C(f) \cong G_{CAC}(f) = G_T(f)\cdot H_C(f), \tag{24}$$
$$g_c(t) \cong g_{cAC}(t) = g_t(t) * h_c(t), \tag{25}$$
where * denotes the convolution.
2.4.3 Similarity between the AC-gammachirp and the original gammachirp
The similarity between the AC-gammachirp filter and the original
gammachirp filter is discussed in this subsection. The characteristics of the
asymmetric function HA( f ) are listed as the conditions (b), (c), (d), (e), and (f)
for the design of the asymmetric compensation filter HC ( f) . For condition (b),
HC ( f) strictly satisfies Eq. (8) and the other conditions. When setting c to 0,
the phases of the poles and zeros in Eqs. (16) and (17) become the same, and
then, Eqs. (13) and (14) become unity. It is obvious from Fig. 3 that HC ( f) is
high-pass when c>0 and low-pass when c<0. For condition (c), the asymmetric
compensation filter HC ( f) approximately satisfies the condition in positive
frequencies when fr> 0. HC ( f) has slopes centered at +fr and -fr in the
amplitude spectrum, whereas HA( f ) has a slope centered at fr. This is
because HC (z) is designed to have real coefficients using conjugate pairs of
poles and zeros to accompany a gammatone filter with a real sinusoidal carrier.
For condition (d), HC(f) strictly satisfies Eq. (9) because changing fr+f1 to fr-f1
simply exchanges the denominator and the numerator of Eq. (14). HC(f) strictly
satisfies Eq. (10) for condition (e) because changing the sign of c exchanges the
poles and zeros in Eqs. (16) and (17). Moreover, it is possible to derive a stable
inverse filter since the asymmetric compensation filter satisfies the minimum
phase condition. The inverse filter is always stable even if the parameter values
are time varying. Accordingly, it is possible to cancel out the forward filter with
the inverse filter. Then, the total response of the combination is a unit impulse.
This feature leads to an analysis/synthesis filterbank (described in subsection
3.3). For condition (f), HC ( f) approximately satisfies Eq. (11), as shown in
Fig. 3.
Consequently, the AC-gammachirp filter follows condition (a) and
approximately satisfies condition (g) (Eq. (12)). Since the IIR asymmetric
compensation filter has few coefficients, fast level-dependent auditory filtering
can be performed by combining the compensation filter with a fast
implementation of the gammatone (Slaney, 1993; Patterson et al., 1995).
3. GAMMACHIRP FILTERBANK
This section presents an analysis/synthesis gammachirp filterbank with
time-varying coefficients and a parameter controller based on sound level
estimation. We first describe examples of the analysis filterbank in order to
identify a basic structure for which the synthesis procedure is
independent of the method of parameter control, so that it can be used in various
applications.
--- Insert Figure 6 about here ---
3.1 Analysis filterbanks
Figure 6 shows an example of a gammachirp filterbank consisting of a
gammatone filterbank, a bank of asymmetric compensation filters, and a
parameter controller (Irino and Unoki, 1997a,b, 1998). It is a straightforward
implementation of Eqs. (24) and (25) for each filter. Since the auditory filter
shape is level-dependent (Lutfi and Patterson, 1984; Glasberg and Moore, 1990
for a review; Irino and Patterson, 1997), the sound level of incoming signals is
estimated in the parameter controller using the output of the asymmetric
compensation filterbank. An example of the parameter controller is shown at
the bottom right of the figure. The controller consists of a bank of rectifiers, leaky
integrators (LI), and level-to-parameter converters. A number of
implementations are possible for level estimation (for example, Giguère and
Woodland, 1994; Pflueger et al., 1998), but they all basically estimate the level
at the output of bandpass filters (see discussion in Rosen and Baker,
1994). This filterbank has been applied to noise suppression (Irino, 1999); here,
we introduce physiological knowledge into the filterbank structure.
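A single channel of such a parameter controller might be sketched as follows. The half-wave rectifier and the one-pole leaky integrator follow the description above, but the time constant, the dB range, and in particular the linear level-to-c mapping are hypothetical placeholders; the paper does not specify these values.

```python
import numpy as np

def estimate_level_db(channel, fs, time_constant=0.008, floor=1e-10):
    """Half-wave rectify one channel and smooth it with a leaky integrator."""
    rectified = np.maximum(np.asarray(channel, dtype=float), 0.0)
    alpha = np.exp(-1.0 / (time_constant * fs))    # leak per sample (assumed 8 ms)
    level = np.empty_like(rectified)
    state = 0.0
    for i, x in enumerate(rectified):
        state = alpha * state + (1.0 - alpha) * x
        level[i] = state
    return 20.0 * np.log10(np.maximum(level, floor))

def level_to_chirp(level_db, c_low=0.0, c_high=-3.0, db_low=-60.0, db_high=0.0):
    """Hypothetical converter: interpolate c linearly between two levels."""
    frac = np.clip((level_db - db_low) / (db_high - db_low), 0.0, 1.0)
    return c_low + frac * (c_high - c_low)

fs = 16000.0                                       # assumed sampling rate
t = np.arange(int(0.2 * fs)) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)              # stand-in for a channel output
level_loud = estimate_level_db(tone, fs)
c_loud = level_to_chirp(level_loud)
c_quiet = level_to_chirp(estimate_level_db(0.01 * tone, fs))
```

In this sketch a higher level produces a more negative c, i.e. a stronger asymmetry; any realistic mapping would have to be fitted to masking data.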
When the sound level is sufficiently high, the cochlear filter has a broad
bandwidth and behaves like a passive and linear filter. As the signal level
decreases, the filter gain increases and the bandwidth becomes narrower
because of the active processes (see Pickles, 1988, for a review). This suggests that
a physiologically plausible auditory filter would be a combination of a linear
broadband filter and a nonlinear level-dependent filter that sharpens the filter
shape. Recent observations have shown that the frequency modulation or
"glide" persists even post-mortem or at high sound pressure levels (Recio et
al., 1998). Accordingly, the linear filter can be simulated with a broadband
gammachirp filter. As shown in Eq. (12), a gammachirp filter with an arbitrary
chirp parameter c can be produced with a combination of another gammachirp
filter and an asymmetric function. Therefore, the second filter can be simulated
using a level-dependent asymmetric compensation filter as long as the total
filter response is simulated using the gammachirp.
--- Insert Figure 7 about here ---
Accordingly, a candidate filterbank structure is proposed in Fig. 7. It
consists of a linear gammatone filterbank, a linear asymmetric compensation
filterbank, and a level-dependent asymmetric compensation filterbank
controlled by a parameter controller. The output of the linear asymmetric
compensation filterbank is equivalent to the output of a linear gammachirp
filterbank (dashed box (c) at top). This output is fed into the asymmetric
compensation filterbank to obtain the total output (d). The parameter controller
is similar to that described above. The structure is based on a combination of a
linear filterbank with bandpass filters and a nonlinear asymmetric compensation
filterbank, as in the previous filterbank. For signal processing applications, this
filterbank structure is very important in facilitating the synthesis procedure.
However, to determine the parameters, it is necessary to wait for results on
gammachirp fits to psychoacoustical masking data across the full range of
center frequencies (for preliminary results, Irino and Patterson, 1999).
--- Insert Figure 8 about here ---
3.2 Analysis/synthesis filterbank
One of the most important features of the gammachirp filterbank is its
ability to establish an analysis/synthesis system as shown in Fig. 8. Moreover,
this feature, as discussed in the following, is valid for any kind of parameter
controller. Initially, a signal (a) is filtered by a linear gammachirp filterbank
(A). When the chirp parameter c is set to zero for all channels, it is a
gammatone filterbank. The output of the linear filterbank (b) is converted into
the output of the level-dependent gammachirp filterbank (c) using a bank of
asymmetric compensation filters (B) controlled by the parameter controller (C).
Since the asymmetric compensation filter is an IIR minimum phase filter, it is
possible to make a bank of inverse asymmetric compensation filters (D) by
exchanging poles and zeros. When the time-varying coefficients for the
original and inverse filterbanks are always the same at each sampling point, the
output of the level-dependent gammachirp filterbank (c) is converted into a
representation (d), which is strictly the same as the output of the linear
gammachirp filterbank (b). The filterbank output is, then, equalized in phase
using the time-reversal gammachirp filterbank (E), which is the same as the
linear filterbank (A), except that the impulse response of each filter is reversed
in time. Finally, the output with phase equalization is summed with a
weighting function to reproduce the signal.
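The cancellation between blocks (B) and (D) can be illustrated for a single channel. The sketch below runs four time-varying second-order sections, Eqs. (14) to (17) with Eqs. (20) to (22), in direct form I and then applies the inverse sections with the same coefficients at each sample. Several choices here are assumptions made for the illustration: the chirp trajectory `c_track` is an arbitrary slow sinusoid rather than the output of a parameter controller, the inverse sections are applied in reverse order, and all names are hypothetical.

```python
import numpy as np

def biquad_coeffs(fr, bw, c, fs, k, p0=2.0):
    """(b1, b2) and (a1, a2) of the k-th section; b0 = a0 = 1."""
    p1 = 1.35 - 0.19 * abs(c)
    p2 = 0.29 - 0.0040 * abs(c)
    r = np.exp(-k * p1 * 2 * np.pi * bw / fs)
    shift = p0 ** (k - 1) * p2 * c * bw
    num = (-2 * r * np.cos(2 * np.pi * (fr - shift) / fs), r * r)   # zeros
    den = (-2 * r * np.cos(2 * np.pi * (fr + shift) / fs), r * r)   # poles
    return num, den

def time_varying_section(x, c_track, fr, bw, fs, k, inverse=False):
    """Direct-form I section whose coefficients are recomputed every sample."""
    y = np.zeros_like(x)
    for i in range(len(x)):
        num, den = biquad_coeffs(fr, bw, c_track[i], fs, k)
        if inverse:
            num, den = den, num                 # exchange poles and zeros
        x1 = x[i - 1] if i >= 1 else 0.0
        x2 = x[i - 2] if i >= 2 else 0.0
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y[i] = x[i] + num[0] * x1 + num[1] * x2 - den[0] * y1 - den[1] * y2
    return y

fs, fr = 16000.0, 1000.0                        # assumed values
bw = 1.0 * (24.7 + 0.108 * fr)                  # b ERB(fr) with b = 1
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)                   # stand-in for a channel signal
c_track = -1.5 + 1.5 * np.sin(2 * np.pi * 5.0 * np.arange(x.size) / fs)

y = x
for k in (1, 2, 3, 4):                          # forward AC filter (B)
    y = time_varying_section(y, c_track, fr, bw, fs, k)
x_hat = y
for k in (4, 3, 2, 1):                          # inverse AC filter (D)
    x_hat = time_varying_section(x_hat, c_track, fr, bw, fs, k, inverse=True)
max_error = np.max(np.abs(x_hat - x))
print(max_error)
```

Because each inverse section uses the stored past inputs and outputs with the same coefficients as its forward counterpart, the recovered signal matches the input up to rounding error, regardless of how c varies over time.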
A combination of the linear analysis filterbank (A), the linear synthesis
filterbank (E), and the weighted sum (F) is almost equivalent to a linear,
wavelet, analysis/synthesis procedure (Combes et al., 1989). Since the
combination of the asymmetric compensation filterbank (B) and its inverse
filterbank (D) produces unit impulses for all channels, the error between the
original and synthetic signals is strictly determined by this linear
analysis/synthesis filterbank.
--- Insert Figure 9 about here ---
Figure 9 shows an example of analysis/synthesis frequency
characteristics for a level-dependent gammachirp filterbank whose filters are
equally spaced on the ERB-rate scale between 100 and 6000 Hz, using a gammatone
filterbank in (A) and (E), i.e., the gammachirp filterbank when c = 0 for all
channels. Figure 9(b) shows the same graph with a magnified ordinate scale.
The maximum error is less than 0.01 dB with 100 channels and is only about
0.03 dB even with 50 channels. It appears that about 100 channels are sufficient
to minimize the errors. Moreover, the errors are completely independent of
parameter control. Consequently, the gammachirp filterbank is able to perform
signal resynthesis without producing any undesirable distortion.
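Channel centre frequencies that are equally spaced on the ERB-rate scale can be generated as sketched below. The ERB-rate formula of Glasberg and Moore (1990), 21.4 log10(4.37 f / 1000 + 1), is assumed here because the text does not state which mapping was used; the function names are hypothetical.

```python
import numpy as np

def hz_to_erb_rate(f_hz):
    """ERB-rate scale of Glasberg and Moore (1990); assumed mapping."""
    return 21.4 * np.log10(4.37e-3 * f_hz + 1.0)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 4.37e-3

def centre_frequencies(f_low=100.0, f_high=6000.0, n_channels=100):
    """Centre frequencies equally spaced in ERB rate between f_low and f_high."""
    rates = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_channels)
    return erb_rate_to_hz(rates)

fc = centre_frequencies()          # 100 channels from 100 Hz to 6000 Hz
```

Halving `n_channels` to 50 doubles the ERB-rate spacing between neighbouring filters, which corresponds to the coarser of the two conditions compared in the text.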
The discussion above guarantees the minimum distortion of the
analysis/synthesis filterbank system. The filterbank can be used in various
applications by inserting a modification block between the asymmetric
compensation filterbank (B) and its inverse filterbank (D). For example, it is
possible to construct a noise-suppression filterbank which does not produce any
musical noise that is perceptually undesirable (Irino, 1999).
4. SUMMARY
This paper presents an analysis/synthesis auditory filterbank using the
gammachirp. Initially, the gammachirp function is analyzed to find
characteristics for effective digital filter simulation. The gammachirp filter is
shown to be approximated excellently by the combination of a gammatone filter
and an IIR asymmetric compensation filter. The new implementation reduces
the computational cost for time-varying filtering because both filters can be
implemented with only a few filter coefficients. Since the IIR asymmetric
compensation filter is a minimum phase filter, the inverse filter is also stable.
Two examples of gammachirp filterbanks are presented; each is a combination
of a linear gammachirp filterbank and a bank of level-dependent asymmetric
compensation filters, controlled by the signal-level estimation mechanism. A
synthesis procedure for such analysis filterbanks is proposed to accomplish
signal resynthesis with a guaranteed precision and no undesirable distortion.
This feature has never been accomplished with conventional auditory
filterbanks. The analysis/synthesis gammachirp filterbank with time-varying,
level-dependent coefficients is usable in various signal processing applications
requiring the modeling of human auditory filtering.
ACKNOWLEDGMENTS
The authors wish to thank Roy D. Patterson of Cambridge Univ., for his
continuous advice, Minoru Tsuzaki and Hani Yehia of ATR-HIP, and Malcolm
Slaney of Interval Research for their valuable comments. A part of this work
was performed while the second author was a visiting student at ATR-HIP. The
authors also wish to thank Masato Akagi of JAIST, and Yoh'ichi Tohkura,
Hideki Kawahara, and Shigeru Katagiri of ATR-HIP for the arrangements. This
work was partially supported by CREST (Core Research for Evolutional
Science and Technology) of the Japan Science and Technology Corporation
(JST).
REFERENCES
Cohen, L. (1993). "The scale transform," IEEE Trans. Signal Processing,
41, 3275-3292.
Combes, J. M., Grossmann, A. and Tchamitchian, Ph. Eds. (1989). "Wavelets,"
Springer-Verlag, Berlin.
de Boer, E. and Nuttall, A. L. (1997). "The mechanical waveform of the basilar
membrane. I. Frequency modulations (''glides'') in impulse responses and
cross-correlation functions," J. Acoust. Soc. Am., 101, 3583-3592.
Giguère, C. and Woodland, P. C. (1994). "A computational model of the
auditory periphery for speech and hearing research. I. Ascending path," J.
Acoust. Soc. Am., 95, 331-342.
Glasberg, B. R. and Moore, B. C. J. (1990). "Derivation of auditory filter
shapes from notched-noise data," Hear. Res., 47, 103-138.
Irino, T. and Kawahara, H. (1993). "Signal reconstruction from modified
auditory wavelet transform," IEEE Trans. Signal Processing, 41, 3549-
3554.
Irino, T. (1995). "An optimal auditory filter," in IEEE Signal Processing
Society, 1995 Workshop on Applications of Signal Processing to Audio
and Acoustics, New Paltz, NY.
Irino, T. (1996). "A 'gammachirp' function as an optimal auditory filter with the
Mellin transform," IEEE Int. Conf. Acoust., Speech Signal Processing
(ICASSP-96), 981-984, Atlanta GA.
Irino, T. (1999). "Noise suppression using a time-varying, analysis/synthesis
gammachirp filterbank," IEEE Int. Conf. Acoust., Speech Signal Processing
(ICASSP-99), Phoenix, AZ.
Irino, T. and Patterson, R.D. (1997). "A time-domain, level-dependent auditory
filter: The gammachirp," J. Acoust. Soc. Am., 101, 412-419.
Irino, T. and Patterson, R.D. (1999). "A gammachirp summary of cochlear
mechanics that can also explain level-dependent auditory masking in
humans quantitatively," Symposium on Recent Developments in Auditory
Mechanics, Sendai, Japan.
Irino, T. and Unoki, M. (1997a). "An efficient implementation of the
gammachirp filter and its filterbank design," ATR Technical Report, TR-H-
225.
Irino, T. and Unoki, M. (1997b). "An efficient implementation of the
gammachirp filter and filterbank," Trans. Tech. Com. Psycho. Physio.
Acoust., Acoust. Soc. Jpn., H-97-69 (in Japanese).
Irino, T. and Unoki, M. (1998). "A time-varying, analysis/synthesis auditory
filterbank using the gammachirp," IEEE Int. Conf. Acoust., Speech Signal
Processing (ICASSP-98), 3653-3656, Seattle WA.
Lyon, R. F. (1996). "The all-pole gammatone filter and auditory models," in
Forum Acusticum '96, Antwerp, Belgium.
Lutfi, R.A. and Patterson, R.D. (1984). "On the growth of masking asymmetry
with stimulus intensity," J. Acoust. Soc. Am., 76, 739-745.
Møller, A.R. and Nilsson, H.G. (1979). "Inner ear impulse response and basilar
membrane modelling," Acustica, 41, 258-262.
Oppenheim, A. V. and Schafer, R. W. (1975). "Digital Signal Processing,"
Prentice-Hall, New Jersey.
Patterson, R. D., Allerhand, M. and Giguère, C. (1995). "Time-domain
modelling of peripheral auditory processing: a modular architecture and a
software platform," J. Acoust. Soc. Am., 98, 1890-1894.
Pflueger, M., Hoeldrich, R. and Reidler, W. (1998). "Nonlinear All-Pole and
One-Zero Gammatone Filters," Acta Acustica, 84, 513-519.
Pickles, J.O. (1988). "An Introduction to the Physiology of Hearing," Academic
Press, London.
Recio, A.R., Rich, N.C., Narayan, S.S. and Ruggero, M.A. (1998). "Basilar-
membrane response to clicks at the base of the chinchilla cochlea," J.
Acoust. Soc. Am., 103, 1972-1989.
Rosen, S. and Baker, R.J. (1994). "Characterising auditory filter nonlinearity,"
Hear. Res., 73, 231-243.
Slaney, M. (1993). "An efficient implementation of the Patterson-Holdsworth
auditory filter bank," Apple Computer Technical Report #35.
Slaney, M. (1995). "Pattern Playback from 1950 to 1995," IEEE Conf. Syst.
Man, Cybern., Vancouver, Canada.
Yang, X., Wang, K. and Shamma, S. A. (1992). "Auditory representations of
acoustic signals," IEEE Trans. Information Theory, 38, 824-839.
Figure Captions
Figure 1. Amplitude spectra of (a) a gammachirp filter |Gc(f)|, (b) a
gammatone filter |GT(f)|, and (c) an asymmetric function |HA(f)|, where
n=4, b=1.019, c=-2, and fr=2000 Hz.
Figure 2. Amplitude spectra of (a) a gammachirp filter |Gc(f)| and (b) an
asymmetric function |HA(f)| as a function of the chirp parameter c,
where n=4, b=1.019, and fr=2000 Hz. The amplitude is normalized to 0 dB
at the peak frequency in panel (a) and at fr in panel (b).
Figure 3. Amplitude spectra of asymmetric functions |HA(f)| (solid lines) and
asymmetric compensation filters |HC(f)| (dashed lines), where n=4,
b=1.019, c is an integer between -3 and 3, and fr=2000 Hz. The amplitude is
normalized to 0 dB at fr.
Figure 4. Amplitude spectra of original FIR gammachirp filters |Gc(f)| (solid
lines) and asymmetric compensation (AC) gammachirp filters |GCAC(f)|
(dashed lines), where n=4, b=1.019, c=-1, and the values for fr are 250, 500,
1000, 2000, 4000, and 8000 Hz.
Figure 5. (a) Impulse responses and (b) phase spectra of an FIR gammachirp
filter (Eq. (1)) (solid lines) and an asymmetric compensation (AC)
gammachirp filter (dashed lines). The parameters are n=4, b=1.019, c=-1,
and fr=2000 Hz.
Figure 6. Block diagram of a level-dependent gammachirp filterbank.
Figure 7. Block diagram of a gammachirp filterbank based on physiological
constraints.
Figure 8. Block diagram of a level-dependent, analysis/synthesis gammachirp
filterbank.
Figure 9. Frequency responses of the analysis/synthesis gammachirp filterbank
shown in Fig. 8 when the frequency range of the filterbank is between 100
and 6000 Hz and the number of channels is 50 (dashed lines) or 100 (solid
lines). Panel (b) shows the same responses as panel (a) on a magnified ordinate.
Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
[Block diagram. Blocks and signals: Signal Input; Gammatone Filterbank; Gammatone Filterbank Output; Asymmetric Compensation Filterbank; Gammachirp Filterbank Output; Parameter Controller. Inset, Parameter Control Unit for the k-th channel: LI blocks; inputs from adjacent channels; summation (Σ); activity to parameter; k-th control-output.]
Fig. 7
[Block diagram. Blocks: (A) Linear Gammatone Filterbank; (B) Linear Asymmetric Compensation Filterbank; (C) Asymmetric Compensation Filterbank; (D) Parameter Controller. Signals: (a) Signal Input; (b) Linear Gammatone Filterbank Output; (c) Linear Gammachirp Filterbank Output; (d) Gammachirp Filterbank Output.]
Fig. 8
[Block diagram with Analysis and Synthesis sections. Blocks: (A) Linear Gammachirp Filterbank; (B) Asymmetric Compensation Filterbank; (C) Parameter Controller; (D) Inverse Asymmetric Compensation Filterbank; (E) Time-reversal Linear Gammachirp Filterbank; (F) Weighted Sum (Σ). Signals: (a) Signal Input; (b) Linear Gammachirp Filterbank Output; (c) Gammachirp Filterbank Output; (d) Recovered Linear Gammachirp Filterbank Output; (e) Resynthesized Signal.]
Fig. 9