
ENGINEERING REPORTS

Methods for Reducing Audible Artifacts in a Wavelet-Based Broad-Band Denoising System*

PAVAN K. RAMARAPU AND ROBERT C. MAHER, AES Member

Department of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588-0511, USA

A wavelet-based method for denoising broad-band noisy audio signals is proposed and implemented. The method is a modified algorithm based on the method of Coifman and Wickerhauser (1992). The modified algorithm shows noticeable improvement in quality and computational overhead compared to the original method when denoising digitized music signals with different levels of additive broad-band noise. Several implementation issues are described.

* Manuscript received 1996 November 25; revised 1997 September 2. J. Audio Eng. Soc., Vol. 46, No. 3, 1998 March.

0 INTRODUCTION

There is a wide variety of applications where it is desirable to enhance degraded audio signals. The degradation of audio may be due to errors or data loss, poor analog recording techniques or storage media, the presence of high levels of background noise in the recording itself, or several of these causes. The typical objectives of enhancement are to improve perceived quality, increase intelligibility, reduce listener fatigue, and so on. Depending on the specific application, the enhancement system may be directed at only one of these objectives or at several [1]. In this engineering report the main objective is to enhance the quality of degraded audio signals, music (broad-band audio) in particular, and thereby to reduce listener fatigue.

The distinct types of degradation common in audio recordings fall into two general classes:

1) Local degradation: Clicks, scratches, and so on, where the signal is only disrupted at certain positions and restoration is only required at these places.

2) Global degradation: Hiss, wow, hum, and so on, where the signal is disrupted to some extent at all times.

There are several notable methods for audio denoising. The spectral subtraction (SS) method has become a standard procedure in speech enhancement [2]. SS gained popularity because it is easy to understand and implement: it requires only a discrete Fourier transform (DFT) of each frame of the noisy signal, application of a gain function, and calculation of the inverse DFT. Furthermore, the SS method makes minimal assumptions about the signal and noise, and when carefully implemented, it results in reasonably clear enhanced signals. Moreover, this approach has been heuristically developed and experimentally optimized. The main drawback of the SS method is that it can create noticeable residual noise with annoying tonal characteristics, often referred to as "musical noise." This musical noise is a common side effect of most broad-band denoising techniques [3]-[5].

Vaseghi and Frayling-Cork [3] used a frequency interpolation scheme to restore archival gramophone recordings. Their method was effective, but had the drawback that more than one copy of the recording was required. Ephraim and Van Trees [6] used a nonparametric signal subspace approach and showed that the SS method is a special case of this approach. Coifman and Wickerhauser [7] have used a wavelet-based approach to reduce broad-band noise. Other approaches to enhance noisy speech and other audio signals can be found in [1].

In all of these approaches the audio (time-domain) signal is first transformed to the frequency or time-frequency domain, and the processing of the noisy signal is accomplished in the transform domain. After processing, the altered signal is transformed back into the time domain to reconstruct the noise-reduced audio signal.

In this engineering report, signal quality enhancement of broad-band noisy audio signals using a wavelet packet basis is explored. In particular, the algorithm proposed by Coifman and Wickerhauser [7] for denoising music signals is taken as the starting point.

This algorithm is modified in several ways in order to obtain better perceptual quality. In the next section the original algorithm of Coifman and Wickerhauser is reviewed. The modified algorithm and the simulations concerned with broad-band noise removal are described in Sections 3 and 4. Finally, the report concludes with a summary and some suggestions for further research in this area.

1 ORIGINAL DENOISING ALGORITHM

The approach for signal quality enhancement explored in this engineering report is guided by the so-called best-basis paradigm [7]-[9]. This paradigm consists of three main steps:

1) Select a "best" basis (or coordinate system) for the problem at hand from a library of bases (a fixed yet flexible set of bases consisting of wavelets and wavelet packets).

2) Sort the coordinates (features) by "importance" for the problem at hand and discard "unimportant" features (noisy ones).

3) Use the retained coordinates to solve the problem at hand (audio signal quality enhancement) by projecting the signal onto retained bases.

The method extracts from the noisy signal a coherent part which is well represented by the given wavelets and a noisy or incoherent part which cannot be "well compressed." This algorithm adaptively uses libraries of orthonormal waveforms, such as wavelet packets and local trigonometric waveform libraries, for separating the coherent signal from the presumably incoherent unwanted noise.

1.1 Denoising Algorithm

The best-basis algorithm of Coifman and Wickerhauser starts with a signal segment $x(t)$ of length $K$ as well as a library of orthonormal bases. The library could have, for example, the Fourier basis, the Haar-Walsh wavelet packets, various QMF Daubechies filters defining smoother wavelet packets, local trigonometric waveforms, and perhaps other orthonormal bases [10]-[12]. Trigonometric bases and wavelet packets were used for this purpose by Berger et al. [8]. The following sequence is used for the analysis procedure in [8].

1) The signal $x(t)$ is first expanded in each basis, that is, $x(t) = \sum_{i=1}^{N} \alpha_i \omega_i(t)$, where $\omega_i$ are the orthogonal basis waveforms [10], $\alpha_i$ are the coefficients, and $N$ is the order of the expansion. A cost function is assigned to each expansion. Berger et al. use the Shannon second-order entropy $-\sum_{i=1}^{N} \alpha_i^2 \log_2 \alpha_i^2$.

2) The basis giving the least cost is chosen from the collection of bases.

3) The best-basis coefficients are sorted in the order of decreasing magnitude, that is, upon renaming, $|\alpha_1| \ge |\alpha_2| \ge \cdots$

4) A number of leading (large-magnitude) terms are kept as the coherent part, based on a predetermined threshold cost of the remaining terms. The cost is based on the vectors of successive residual tails given by $V_1 = x(t)$, $V_2 = \sum_{i=2}^{N} \alpha_i \omega_i(t)$, $V_3 = \sum_{i=3}^{N} \alpha_i \omega_i(t)$, and so on.

5) For each $V_j$, starting at $j = 1$, the normalized entropy $H(V_j)$ is calculated. When $H(V_{j_0})$ is greater than or equal to a specified threshold $\tau$, this step is stopped. Then the leading terms $\sum_{i<j_0} \alpha_i \omega_i(t)$ are the coherent part and the rest are the residual terms.

6) The residual terms by definition constitute the noisy part of the signal and can be treated as a new signal which in turn can be expanded and separated into its coherent and noisy components. Thus at each iteration a coherent part is extracted from the signal currently considered, and the leftover noise is treated in turn. The coherent parts, one from each iteration, are added together to produce the total coherent portion. Berger et al. [8] report that approximately seven iterations were used to process typical signals.

This published algorithm solves the problem of audio enhancement by attacking it in the manner of audio compression. The algorithm relies heavily on selecting an appropriate value of the prespecified threshold, and needs several iterations to denoise a broad-band noisy signal. In the next sections this original algorithm is modified to determine whether the enhancement can be accomplished in a more efficient way.

2 PROBLEM FORMULATION

Consider a discrete degradation model for the broad-band noise corrupted audio signal

$$y = x + \epsilon \tag{1}$$

where $y, x, \epsilon \in \Re^n$ and $n$ is a power of two. The vector $y$ represents the broad-band noisy audio signal and $x$ is the unknown true (noise-free) audio signal to be recovered. The vector $\epsilon$ is assumed to be white Gaussian noise (WGN), that is, $\epsilon \sim N(0, \sigma^2 I)$, and $\sigma^2$ is not known a priori.

Now consider an algorithm, based on the method outlined in Section 1, to estimate the desired audio signal $x$ from the available broad-band noise corrupted signal $y$. First, the library of orthonormal bases $L = \{B_1, B_2, \ldots, B_M\}$ is prepared with the Daubechies wavelets (orders 2-16) and coiflets (orders 2-6) [9], [10]. Let $B_m$ represent the best basis selected from a dictionary of orthonormal wavelet packet bases. The number of bases $M$ in this library typically ranges from 5 to 20 or more, depending on the available resources and any a priori knowledge of the signal $x$.

The audio signals treated in this work are all discrete: they are vectors of a finite number of real-valued samples. The audio signals are monophonic (single channel), 5 to 10 seconds in duration, and sampled at a rate of 44 100 samples per second. The signals are divided into frames of 2048 samples (about 46 ms) in order to follow the local characteristics of the audio. The best-basis algorithm is applied separately on each frame.

In the next section the modified algorithm is discussed. The reason to modify the published algorithm of Coifman et al. is threefold. The first aim is to select the threshold level adaptively for each frame while processing the noise-corrupted signal.
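The analysis sequence of Section 1.1, applied to a single frame, can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the implementation used in this report: the library holds only two bases (the standard basis and an orthonormal Haar transform, both chosen here for brevity), the entropy is evaluated on coefficients scaled to unit energy, and the normalization of $H(V_j)$ by $\log_2$ of the tail length is an assumption, since the text does not state which normalization is used. All function names are hypothetical.

```python
import math


def haar_transform(x):
    """Orthonormal Haar wavelet transform; len(x) must be a power of two."""
    coeffs = list(x)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = approx + detail  # keep transforming the approximation part
        n = half
    return coeffs


def entropy_cost(coeffs):
    """Shannon cost -sum(p * log2 p), with p = a^2 / total energy (assumed scaling)."""
    energy = sum(a * a for a in coeffs)
    if energy == 0.0:
        return 0.0
    cost = 0.0
    for a in coeffs:
        p = a * a / energy
        if p > 0.0:
            cost -= p * math.log2(p)
    return cost


def best_basis(frame, library):
    """Steps 1-2: expand the frame in every basis and keep the least-cost expansion."""
    best = None
    for name, transform in library.items():
        coeffs = transform(frame)
        cost = entropy_cost(coeffs)
        if best is None or cost < best[2]:
            best = (name, coeffs, cost)
    return best


def coherent_terms(coeffs, tau):
    """Steps 3-5: indices of the leading terms kept before the first noise-like tail."""
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    ranked = [coeffs[i] for i in order]
    for j in range(len(ranked)):
        tail = ranked[j:]
        # Assumed normalization: divide by the largest entropy a tail of this length can have.
        h = entropy_cost(tail) / math.log2(len(tail)) if len(tail) > 1 else 0.0
        if h >= tau:
            return order[:j]
    return order
```

For the piecewise-constant frame `[4, 4, 4, 4, 0, 0, 0, 0]`, calling `best_basis(frame, {"standard": list, "haar": haar_transform})` selects the Haar expansion, which concentrates the energy in two coefficients, whereas the standard basis spreads it over four.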

The second and third aims of the modification are to further reduce the clicks and whistles, which are subjectively more annoying, in the processed output signal, and to optimize the algorithm for the audio enhancement case, since the existing method is geared toward solving the problem of data compression.

3 MODIFIED DENOISING ALGORITHM

It is assumed that the noise-free signal frames $x$ can be completely represented by basis $B_m$,

$$x = W_m \alpha_m \tag{2}$$

where $W_m \in \Re^{n \times n}$ is an orthogonal matrix whose column vectors are the basis elements of $B_m$ and $\alpha_m \in \Re^n$ is the vector of expansion coefficients of $x$. The basis $B_m$ is the best basis calculated for the frame $x$ using the minimum-entropy criterion [7].

In order to reduce the sensitivity of the algorithm to $\tau$ (threshold level), the wavelet coefficients obtained from the best-basis decomposition are subjected to signal-dependent thresholding. The rationale for this method is that the signal can be well decomposed into a wavelet basis whereas noise cannot be represented efficiently by any of the bases. For example, if the frame under consideration has only noise, then it cannot be represented efficiently by any of the bases, so the threshold level should be high. On the other hand, when no noise is present in the frame under consideration, the signal is well represented by the best wavelet basis and so the threshold level should be near zero. Hence a method is needed to adjust the threshold according to the nature of the frame under treatment.

As can be seen by the contour map of the wavelet coefficients in Fig. 1, the broad-band noise cannot be represented well by the basis [no visible structure in Fig. 1(b)], whereas the noiseless audio signal (music alone) can be effectively represented by a few components. A different viewpoint is shown in Fig. 2 with the wavelet scales (parameter $j$) and the time (parameter $i$) shown along the x axis itself. This gives a better representation for sorting the coefficients and removing the ones below the threshold level, as shown in Fig. 3.

Fig. 1. Contour plots of wavelet transforms. Note that different bases are used for each plot. (a) Noise-free audio signal. (b) Broad-band noise. (c) Broad-band noise corrupted audio signal. x axis: time index; 3rd axis (darkness of pixel location): magnitude of wavelet coefficient.

Fig. 2. Wavelet coefficients. (a) Noise-free audio signal. (b) Broad-band noise. (c) Broad-band noise corrupted audio signal. x axis: wavelet index; y axis: magnitude of wavelet coefficient.

The signal-dependent thresholding chosen here exploits the second-order statistics of the signal and adaptively selects the threshold level. The threshold level for soft thresholding is selected with the algorithm shown in Fig. 4. The characteristic function to select the threshold is derived from experimentation with sample audio data. The threshold function depends on the maximum value of the decomposed wavelet coefficients, the root mean square (rms) value, and the average (mean) value of the wavelet coefficients for the signal under consideration.

So-called birdy noise occurs as components are switched on and off near the threshold. Although the chirp effect is not as annoying as the broad-band noise, the birdy noise is often irritating if it is not masked by the signal itself.

One way to alleviate the birdy noise problem is to use a soft threshold instead of a hard one. After selecting the nominal threshold level using the scheme shown in Fig. 4, soft thresholding is done to the coefficients below the threshold level. The governing equation for the soft threshold is given by

$$\alpha_m(i, j) \leftarrow \alpha_m(i, j)\, e^{[\alpha_m(i, j)/\tau - 1]/\gamma}, \qquad \alpha_m(i, j) \le \tau \tag{3}$$

where $i$ is the shift parameter and $j$ the dilation parameter, $\gamma$ is the decay constant, and $\tau$ the threshold level. A small value of $\gamma$ gives results that are similar to hard thresholding, and a higher value implies that thresholding is hardly done. The suggested value of $\gamma$, chosen after some experimentation, is 0.2.

An exponential function was chosen as the kernel for the soft thresholding since all functions in general can be approximated and represented by exponential functions. Moreover, the exponential function is a smooth function and decays asymptotically to zero.

Reconstruction (via the inverse wavelet transform) is performed using the soft-thresholded coefficients. The resulting signal is expected to contain a reduced amount of broad-band noise.
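The soft threshold of Eq. (3) can be sketched in a few lines. This is an illustrative reading of the equation rather than the code used in this report: it assumes that the comparison with $\tau$ and the exponent use the coefficient magnitude, so that negative coefficients keep their sign, and the function names are hypothetical. A hard threshold is included only for comparison.

```python
import math


def soft_threshold(coeffs, tau, gamma=0.2):
    """Attenuate coefficients whose magnitude is at or below tau, following Eq. (3).

    Coefficients above tau pass through unchanged. Using abs() is an assumption
    made here so that negative coefficients are handled symmetrically.
    """
    out = []
    for a in coeffs:
        mag = abs(a)
        if tau > 0.0 and mag <= tau:
            # The gain is 1 at mag == tau and falls to exp(-1/gamma) as mag -> 0.
            out.append(a * math.exp((mag / tau - 1.0) / gamma))
        else:
            out.append(a)
    return out


def hard_threshold(coeffs, tau):
    """Zero every coefficient whose magnitude is at or below tau."""
    return [a if abs(a) > tau else 0.0 for a in coeffs]
```

With the suggested $\gamma = 0.2$, a coefficient at half the threshold level is scaled by $e^{-2.5}$, about 0.08, while a coefficient exactly at the threshold is left unchanged, so components near the threshold are faded rather than switched on and off.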

The block diagram of the modified algorithm is shown in Fig. 5.

3.1 Overlapping Windows

A naive way to segment a signal into frames is to divide the signal into nonoverlapping segments, which is equivalent to multiplying the signal by an abrupt segmentation function. This type of windowing can give rise to sharp, regularly spaced discontinuities in the reconstructed signal. On playback, the background noise is noticeably reduced in the enhanced audio file, but the periodic clicks caused by the abrupt window segmentation degrade the subjective quality of the reconstructed signal.

A better approach to alleviate the problems caused by abrupt window transitions is to employ overlapping windows. In this overlapping window approach the input signal is divided into overlapping frames. Each frame is decomposed using the best basis, followed by soft thresholding and then reconstruction of the time-domain signal from the thresholded wavelet coefficients. A cross-fade function derived from the raised-cosine (Hann) window technique is used to reconstruct the sample values in the overlapping regions. This scheme is shown in Fig. 6. The cross-fade technique merges the reconstructed sample values of the previous frame with the reconstructed sample values of the current frame at the overlap region of the two frames. An overlap length of 128 samples was used with satisfactory results. The frame "hop" or spacing is 2048 - 128 = 1920 samples long.

3.2 Additional Reduction of Birdy Noise

Although the soft threshold reduced the audibility of the birdy noise, additional reduction was still desirable. The wavelet coefficients usually do not vary much from one frame to the next.

Fig. 3. Noisy audio signal. (a) Unsorted wavelet coefficients. (b) Sorted wavelet coefficients. (c) Soft thresholded coefficients. (d) Hard thresholded coefficients. x axis: wavelet index; y axis: magnitude of wavelet coefficient.

Accordingly, coefficients that have not been thresholded in the adjacent frames are left without thresholding, even if they lie below the current adaptive threshold level. This knowledge of preceding and succeeding frames is used to reduce the birdy noise level in the processed signals in the manner of hysteresis. For example, let $\alpha^{(k-1)}(i, j)$, $\alpha^{(k)}(i, j)$, and $\alpha^{(k+1)}(i, j)$ be the wavelet coefficients for the frames $k - 1$, $k$, and $k + 1$, respectively, and let $\tau^{(k-1)}$, $\tau^{(k)}$, and $\tau^{(k+1)}$ be the threshold levels for the wavelet coefficients of those frames.

Fig. 4. Scheme for threshold level selection. [Flowchart legend: $\alpha$ = wavelet coefficients from the previous stage; $m$ = maximum value of $\alpha$; $r$ = rms value of $\alpha$; $a$ = average value of $\alpha$; $\tau$ = threshold level. The legible branches set the threshold to a fraction of $m$ ($m/2$ and $m/10$ appear) before going to the next stage; the decision conditions are not legible in the extracted text.]

Fig. 5. Block diagram of modified algorithm. [Blocks: for each frame along the time axis (..., $k-2$, $k-1$, $k$, $k+1$, $k+2$, ...), find the best basis and compute the DWT; calculate the rms, average, and maximum value of the DWT coefficients; select the threshold level $\tau$; soft threshold the coefficients below $\tau$; reconstruct the audio signal.]
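The frame-to-frame rule of Section 3.2 can be sketched as follows. This is one reading of the rule, given as an assumption rather than as the implementation used in this report: a coefficient of frame $k$ is thresholded only when it lies at or below its own threshold and the coefficients at the same position in frames $k-1$ and $k+1$ also lie at or below theirs. The sketch further assumes that the three frames share the same coefficient indexing and that magnitudes are compared; the function names and the `shrink` argument are hypothetical.

```python
def threshold_mask(prev, curr, nxt, tau_prev, tau_curr, tau_next):
    """True where the coefficient of the current frame should be thresholded."""
    return [
        abs(p) <= tau_prev and abs(c) <= tau_curr and abs(n) <= tau_next
        for p, c, n in zip(prev, curr, nxt)
    ]


def apply_with_hysteresis(prev, curr, nxt, tau_prev, tau_curr, tau_next, shrink):
    """Apply `shrink` (for example a soft threshold) only where the mask allows it."""
    mask = threshold_mask(prev, curr, nxt, tau_prev, tau_curr, tau_next)
    return [shrink(c) if m else c for c, m in zip(curr, mask)]
```

In this reading, a small coefficient that sits between two frames in which the same component is strong is left untouched, which is what keeps components from being switched on and off from frame to frame.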

In terms of the coefficients $\alpha^{(k)}(i, j)$ and threshold levels $\tau^{(k)}$ of frames $k - 1$, $k$, and $k + 1$, if

$$\alpha^{(k-1)}(i, j) \le \tau^{(k-1)} \quad \text{and} \quad \alpha^{(k+1)}(i, j) \le \tau^{(k+1)} \tag{4}$$

then the threshold is applied to coefficient $\alpha^{(k)}(i, j)$ if $\alpha^{(k)}(i, j) \le \tau^{(k)}$. However, if $\alpha^{(k)}(i, j)$ is above the threshold level, then no thresholding is done. With this approach the birdy noise was reduced appreciably, although it still could be heard faintly on certain examples. In any case, this approach seems to be more effective than averaging in the time or the frequency domain.

4 SIMULATIONS

To test the techniques described in Section 3, various audio signals with 5- to 10-second duration were considered. All of the examples were 16-bit mono signals sampled at 44.1 kHz. These signals were artificially corrupted with -10-dB, -20-dB, and -30-dB levels of additive broad-band (white) noise, and then processed with the modified algorithm proposed earlier.

The enhanced signals were then compared with the undegraded (original) audio signals. Informal listening tests performed on these signals show that the proposed method gave a noticeable reduction of the additive noise. The quality of the enhanced signal was judged quite close to that of the original signals, except for the introduction of occasional clicks and background warble or birdy noise due to the processing of the signal. In the informal tests, 11 out of 14 subjects preferred the enhanced signal from the modified algorithm when compared to the enhanced signal from the original method.

Fig. 7 shows the results of processing a few time frames of the xan.aif (synthesized music) audio signal corrupted with -10 dB of broad-band noise, and Fig. 8 gives the corresponding Fourier transform.

Fig. 9 shows the results of processing a few time frames of the vibraphone.aif audio signal corrupted with -10 dB of broad-band noise, and Fig. 10 gives the corresponding Fourier transform. The enhanced waveforms in both cases can be seen to be largely free of broad-band noise.

Figs. 11 and 12 present the wavelet coefficients of the enhanced frame and the error between the wavelet coefficients of the corrupted and the enhanced frames.

5 DISCUSSION

The modified algorithm presented in this engineering report removes broad-band noise to a great extent from the music signals.

Fig. 6. Scheme for cross-fade technique. [Diagram: the reconstructed values of the previous frame, $X_1$, are weighted by $g = [1 + \cos(\pi n/N)]/2$ and the reconstructed values of the current frame, $X_2$, by $1 - g$, for $n$ running from 0 to $N$ over the overlap; the final reconstructed values for the overlapping region are $X = X_1 g + X_2 (1 - g)$.]
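The overlapping-window scheme of Section 3.1 and the cross-fade of Fig. 6 can be sketched as below. It is a minimal sketch under stated assumptions: the gain is taken to be the raised-cosine $g = [1 + \cos(\pi n/N)]/2$ with $N$ equal to the overlap length, the index $n$ runs over the overlap samples only, and the per-frame denoising step is omitted so that the framing and merging can be checked in isolation. The function names are hypothetical.

```python
import math


def crossfade_gain(n, N):
    """Raised-cosine weight of the previous frame: 1 at n = 0, falling toward 0 at n = N."""
    return 0.5 * (1.0 + math.cos(math.pi * n / N))


def split_frames(signal, frame_len, overlap):
    """Cut the signal into frames of frame_len samples spaced frame_len - overlap apart."""
    hop = frame_len - overlap
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def overlap_add(frames, overlap):
    """Join reconstructed frames, cross-fading `overlap` samples between neighbours."""
    out = list(frames[0])
    for frame in frames[1:]:
        start = len(out) - overlap
        for n in range(overlap):
            g = crossfade_gain(n, overlap)
            # X = X1 * g + X2 * (1 - g): previous frame fades out, current frame fades in.
            out[start + n] = out[start + n] * g + frame[n] * (1.0 - g)
        out.extend(frame[overlap:])
    return out
```

With the values quoted in Section 3.1, `split_frames(signal, 2048, 128)` gives a hop of 1920 samples. Because the two weights always sum to one, frames that agree in the overlap region are merged without any change in level.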

At the same time, the algorithm introduces faint birdy noise in the processed output signals. The algorithm has some difficulty with high percussion instruments, presumably because of the similarity between the inharmonic components of the percussion signals and the corrupting broad-band noise. Apart from degrading the high percussion instruments, the algorithm works well on a wide range of music signals.

One problem with the denoising algorithm is that it seems to work unevenly in the regions of high and low activity of the audio signal. The high-energy portions of the music are cleaned and made more vibrant, whereas denoising the low-energy portions of music sounds unnatural because too much harmonic content is removed.

The incremental modifications made to the method of Berger et al. [8] provide the following benefits:

• The enhanced audio signal retains more of the desired signal structure and contains fewer clicks and whistles compared to the original method.

• The original method is geared more toward the problem of data compression, whereas the proposed method is intended for the problem of audio enhancement only.

• The threshold value in the original method must be specified in advance, whereas the proposed method automatically selects the threshold adaptively for each frame.

• There is less computational overhead in the proposed noniterative algorithm when compared to the original method.

Fig. 7. (a) Time frames of xan.aif signal. (b) Signal corrupted with broad-band noise 10 dB below signal level. (c) Enhanced signal using modified algorithm. (d) Difference between (b) and (c). (e) Residual error between (a) and (c). x axis: wavelet index; y axis: magnitude of wavelet coefficient.

6 CONCLUSION

A denoising algorithm based on the method of Berger et al. [8] has been suggested and successfully implemented to reduce broad-band noise in recorded music signals. The algorithm is based on the assumption that the broad-band noise cannot be efficiently represented by any of the bases in a library of wavelets. The algorithm has been implemented and evaluated with the goal that the enhanced music should be as clean as possible without removing a noticeable amount of the musical content. The enhanced audio signals processed by this scheme are found to be largely free from the broad-band noise, although some minor problems, such as faint birdy noise and warble arising from the processing method, still remain for future investigation.

Fig. 8. Fourier transforms. (a) Time frames of xan.aif signal. (b) Signal corrupted with broad-band noise 10 dB below signal level. (c) Enhanced signal using modified algorithm. (d) Difference between (b) and (c). (e) Residual error between (a) and (c). x axis--frequency [Hz]; y axis--magnitude of Fourier coefficient.


Fig. 9. (a) Time frames of vibraphone.aif signal. (b) Signal corrupted with broad-band noise 10 dB below signal level. (c) Enhanced signal using modified algorithm. (d) Difference between (b) and (c). (e) Residual error between (a) and (c). x axis--wavelet index; y axis--magnitude of wavelet coefficient.

7 REFERENCES

[1] J. S. Lim, Speech Enhancement (Prentice-Hall, Englewood Cliffs, NJ, 1983).

[2] S. F. Boll, "Suppression of Acoustic Noise in Speech Using Spectral Subtraction," IEEE Trans. Acoustics, Speech, Signal Process., vol. ASSP-27, pp. 112-120 (1979 Apr.).

[3] S. V. Vaseghi and R. Frayling-Cork, "Restoration of Old Gramophone Recordings," J. Audio Eng. Soc., vol. 40, pp. 791-801 (1992 Oct.).

[4] S. V. Vaseghi and P. J. W. Rayner, "A New Application of Adaptive Filters for Restoration of Archived Gramophone Recordings," in Proc. IEEE Int. Conf. on Acoustics, Speech, Signal Processing (1988), pp. 2548-2551.

[5] S. V. Vaseghi, "Algorithms for Restoration of Archived Gramophone Recordings," Ph.D. thesis, University of Cambridge, Cambridge, UK (1988).

[6] Y. Ephraim and H. L. Van Trees, "A Signal Subspace Approach for Speech Enhancement," IEEE Trans. Speech and Audio Process., vol. 3, pp. 251-266 (1995 July).

[7] R. R. Coifman and M. V. Wickerhauser, "Entropy-Based Algorithms for Best Basis Selection," IEEE Trans. Inform. Theory, vol. 38, pp. 713-718 (1992 Mar.).

[8] J. Berger, R. R. Coifman, and M. J. Goldberg, "Removing Noise from Music Using Local Trigonometric Bases and Wavelet Packets," J. Audio Eng. Soc., vol. 42, pp. 808-818 (1994 Oct.).

[9] N. Saito, "Local Feature Extraction and Its Applications Using a Library of Bases," Ph.D. dissertation, Yale University, New Haven, CT (1994 Dec.).

[10] G. Kaiser, A Friendly Guide to Wavelets (Birkhäuser, Boston, 1994).

[11] I. Daubechies, "Orthonormal Bases of Compactly Supported Wavelets," Comm. Pure Appl. Math., vol. 41, pp. 909-996 (1988).

[12] I. Daubechies, "Ten Lectures on Wavelets," in CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61, pp. 909-996 (1992).

Fig. 10. Fourier transforms. (a) Time frames of vibraphone.aif signal. (b) Signal corrupted with broad-band noise 10 dB below signal level. (c) Enhanced signal using modified algorithm. (d) Difference between (b) and (c). (e) Residual error between (a) and (c). x axis--frequency [Hz]; y axis--magnitude of Fourier coefficient.


Fig. 11. Contour plots. (a) Wavelet transforms for enhanced audio signal. (b) Difference in wavelet transforms between corrupted and enhanced signals. x axis--time index; 3rd axis (darkness of pixel location)--magnitude of wavelet coefficient.

Fig. 12. Wavelet coefficients. (a) Enhanced audio signal. (b) Error between corrupted and enhanced audio signals. x axis--wavelet index; y axis--magnitude of wavelet coefficient.


THE AUTHORS

Pavan K. Ramarapu was born in Hyderabad, India, in 1973. He received a B.E. degree from Osmania University, Hyderabad, India, in 1994, and an M.S. degree from the University of Nebraska-Lincoln, in 1996, both in electrical engineering. From 1996 to 1997 he was a doctoral student at the University of Nebraska. In 1997 he joined Zilog, Inc., of Campbell, California, where he is involved with modem system design. His interests include digital signal processing and telecommunications.

Robert C. (Rob) Maher was born in Cambridge, UK, in 1962. He holds a B.S. degree from Washington University, St. Louis, MO (1984), an M.S. degree from the University of Wisconsin-Madison (1985), and a Ph.D. degree from the University of Illinois-Urbana (1989), all in electrical engineering. While a doctoral student, he worked as a graduate research assistant under James Beauchamp with the University of Illinois Computer Music Project. His doctoral dissertation dealt with the separation of voices in musical signals. Dr. Maher joined the University of Nebraska-Lincoln in 1989, serving first as an assistant professor (1989-1995) and then as a tenured associate professor (1995-96) within the Department of Electrical Engineering. His primary teaching and research interests involved applications of digital signal processing to audio engineering, music synthesis, and acoustics. He received several teaching awards, including a University of Nebraska College Distinguished Teaching Award in 1995.

Dr. Maher left academia in 1997 to become director of engineering with EuPhonics, Inc., of Boulder, Colorado, where he was recently promoted to vice president of engineering. His research and development efforts involve advanced digital music synthesis, sound localization techniques, and audio DSP architectures. He is a member of several professional organizations, including the Audio Engineering Society, IEEE, Acoustical Society of America, ASEE, ICMA, and the Tau Beta Pi, Eta Kappa Nu, Phi Kappa Phi, and Sigma Xi honor societies.
