Single Channel Speech Music Separation Using Nonnegative Matrix Factorization and Spectral Masks
![Page 1: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/1.jpg)
SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS
Jain-De Lee
Emad M. Grais, Hakan Erdogan
17th International Conference on Digital Signal Processing, 2011
![Page 2: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/2.jpg)
Outline
INTRODUCTION
NON-NEGATIVE MATRIX FACTORIZATION
SIGNAL SEPARATION AND MASKING
EXPERIMENTS AND DISCUSSION
CONCLUSION
![Page 3: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/3.jpg)
Introduction
There are two main stages in this work
– Training stage
– Separation stage
NMF is used with different types of masks to improve the separation process
– The separation process is faster
– NMF needs fewer iterations
![Page 4: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/4.jpg)
Introduction
Problem formulation
– We observe a signal x(t), which is the mixture of two sources, s(t) and m(t)
$$X(t,f) = S(t,f) + M(t,f)$$
$$|X(t,f)|\,e^{j\phi_X(t,f)} = |S(t,f)|\,e^{j\phi_S(t,f)} + |M(t,f)|\,e^{j\phi_M(t,f)}$$
where X(t, f) is the STFT of x(t)
– Assume the sources have the same phase angle as the mixed signal, so the magnitude spectrograms add
$$X = S + M$$
![Page 5: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/5.jpg)
Non-negative Matrix Factorization
Non-negative matrix factorization algorithm
$$V \approx BW$$
Minimization problem
$$\min_{B,W} C(V, BW)$$
subject to elements of B, W ≥ 0
Different cost functions C of NMF
– Euclidean distance
– KL divergence
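The minimization above can be sketched with the standard multiplicative update rules for the KL-divergence cost. This is a minimal illustration, not the authors' implementation: the function names `nmf_kl` and `kl_divergence`, the random initialization, the iteration count, and the synthetic matrix `V` are all assumptions made for the example.

```python
import numpy as np

def nmf_kl(V, n_basis, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix V (F x T) as B @ W with multiplicative
    updates for the generalized KL divergence (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    B = rng.random((F, n_basis)) + eps   # nonnegative random start
    W = rng.random((n_basis, T)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so B and W stay >= 0.
        B *= ((V / (B @ W + eps)) @ W.T) / (ones @ W.T + eps)
        W *= (B.T @ (V / (B @ W + eps))) / (B.T @ ones + eps)
    return B, W

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its approximation V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

# Synthetic nonnegative test matrix (assumption: stands in for a spectrogram).
rng = np.random.default_rng(1)
V = rng.random((20, 5)) @ rng.random((5, 30))
B, W = nmf_kl(V, n_basis=5)
print("KL cost after fitting:", kl_divergence(V, B @ W))
```

In the training stage, the same routine would be run once on the speech training spectrogram and once on the music training spectrogram, keeping only the resulting basis matrices.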
![Page 6: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/6.jpg)
Non-negative Matrix Factorization
The training magnitude spectrograms $S_{train}$ and $M_{train}$ are factorized by NMF
$$S_{train} \approx B_{speech} W_{speech}$$
$$M_{train} \approx B_{music} W_{music}$$
Larger number of basis vectors
– Lower approximation error
– Redundant set of basis vectors
– Requires more computation time
![Page 7: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/7.jpg)
Signal Separation and Masking
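NMF is used to decompose the magnitude spectrogram matrix X of the mixture, with the trained bases held fixed
$$X \approx [B_{speech}\ \ B_{music}]\,W$$
The initial spectrogram estimates for the speech and music signals are respectively calculated as follows
$$\tilde{S} = B_{speech} W_S$$
$$\tilde{M} = B_{music} W_M$$
where $W_S$ and $W_M$ are the submatrices of W that multiply $B_{speech}$ and $B_{music}$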
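The separation step can be sketched as follows: the two basis matrices are stacked side by side and only the weights W are updated. This is a sketch under stated assumptions, not the authors' code: the function name `separate_initial`, the KL update rule, the iteration count, and the random stand-in bases are all hypothetical.

```python
import numpy as np

def separate_initial(X, B_speech, B_music, n_iter=100, seed=0, eps=1e-12):
    """Fit X ~= [B_speech B_music] @ W with the bases held fixed and
    return the initial estimates (S_tilde, M_tilde). Hypothetical helper."""
    B = np.hstack([B_speech, B_music])          # fixed, pre-trained bases
    rng = np.random.default_rng(seed)
    W = rng.random((B.shape[1], X.shape[1])) + eps
    ones = np.ones_like(X)
    for _ in range(n_iter):
        # KL multiplicative update applied to the weights only.
        W *= (B.T @ (X / (B @ W + eps))) / (B.T @ ones + eps)
    k = B_speech.shape[1]
    W_S, W_M = W[:k], W[k:]                     # rows paired with each basis set
    return B_speech @ W_S, B_music @ W_M

# Random stand-ins for trained bases and a mixture spectrogram (assumptions).
rng = np.random.default_rng(0)
B_speech = rng.random((16, 3))
B_music = rng.random((16, 4))
X = B_speech @ rng.random((3, 10)) + B_music @ rng.random((4, 10))
S_tilde, M_tilde = separate_initial(X, B_speech, B_music)
print(S_tilde.shape, M_tilde.shape)
```

Because the bases are fixed, each iteration is cheaper than a full NMF update, and the masks applied afterwards mean the weights do not need to be fitted to full convergence.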
![Page 8: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/8.jpg)
Signal Separation and Masking
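Use the initial estimated spectrograms $\tilde{S}$ and $\tilde{M}$ to build a mask as follows
$$H = \frac{\tilde{S}^{p}}{\tilde{S}^{p} + \tilde{M}^{p}}$$
Source signals reconstruction
$$\hat{S} = H \otimes X$$
$$\hat{M} = (1 - H) \otimes X$$
where 1 is a matrix of ones, $\otimes$ is element-wise multiplication, and the powers and the division in H are also element-wise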
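A small numerical sketch of the mask and the reconstruction may help. The function name `apply_mask`, the toy 2×2 matrices, and the phase values are made up for illustration; the last step reuses the mixture phase, following the same-phase assumption from the problem formulation.

```python
import numpy as np

def apply_mask(X, S_tilde, M_tilde, p=2.0, eps=1e-12):
    """Build H = S~^p / (S~^p + M~^p) element-wise and apply it to X.
    Hypothetical helper; eps only guards against division by zero."""
    Sp = S_tilde ** p
    Mp = M_tilde ** p
    H = Sp / (Sp + Mp + eps)
    S_hat = H * X              # element-wise product
    M_hat = (1.0 - H) * X
    return H, S_hat, M_hat

# Toy magnitude spectrograms (made-up numbers).
X = np.array([[4.0, 1.0], [2.0, 3.0]])
S_tilde = np.array([[3.0, 0.5], [1.0, 1.0]])
M_tilde = np.array([[1.0, 0.5], [1.0, 3.0]])
H, S_hat, M_hat = apply_mask(X, S_tilde, M_tilde, p=2.0)

# Attach the mixture phase to get a complex spectrogram for the inverse STFT.
X_complex = X * np.exp(1j * np.array([[0.1, -0.4], [1.2, 2.0]]))
S_complex = S_hat * np.exp(1j * np.angle(X_complex))
print(H)
```

Since H and 1 − H sum to one in every time-frequency bin, the two estimates always add up to the mixture magnitude X, which the raw NMF estimates do not guarantee.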
![Page 9: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/9.jpg)
Signal Separation and Masking
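Two specific values of p correspond to special masks
– Wiener filter (soft mask), p = 2
$$H_{Wiener} = \frac{\tilde{S}^{2}}{\tilde{S}^{2} + \tilde{M}^{2}}$$
– Hard mask, p → ∞
$$H_{hard} = \mathrm{round}\!\left(\frac{\tilde{S}^{2}}{\tilde{S}^{2} + \tilde{M}^{2}}\right)$$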
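A short derivation shows why the rounded mask can be read as the limiting case of the mask family. The ratio $r = \tilde{S}/\tilde{M}$ is notation introduced here for the example (element-wise, assuming $\tilde{M} > 0$); it does not appear on the slides.

$$H = \frac{\tilde{S}^{p}}{\tilde{S}^{p} + \tilde{M}^{p}} = \frac{r^{p}}{1 + r^{p}}, \qquad \lim_{p \to \infty} \frac{r^{p}}{1 + r^{p}} = \begin{cases} 1, & r > 1 \\ 0, & r < 1 \end{cases}$$

$$\mathrm{round}\!\left(\frac{r^{2}}{1 + r^{2}}\right) = 1 \iff \frac{r^{2}}{1 + r^{2}} > \frac{1}{2} \iff r > 1 \qquad (r \neq 1)$$

So, away from the tie $r = 1$, rounding the Wiener mask assigns each time-frequency bin entirely to whichever source has the larger initial estimate, which is the same decision the mask makes as p grows without bound.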
![Page 10: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/10.jpg)
Signal Separation and Masking
The value of the mask versus the linear ratio for different values of p
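The curves in this figure can be reproduced numerically. The sketch below assumes that the "linear ratio" on the horizontal axis is $q = \tilde{S}/(\tilde{S} + \tilde{M})$, that is, the mask for p = 1; the slide does not define it, so this reading and the function name `mask_from_linear_ratio` are assumptions.

```python
def mask_from_linear_ratio(q, p):
    """Mask value as a function of the linear ratio q = S~ / (S~ + M~).
    Dividing S~^p / (S~^p + M~^p) through by (S~ + M~)^p gives
    q^p / (q^p + (1 - q)^p). Hypothetical helper for illustration."""
    qp = q ** p
    return qp / (qp + (1.0 - q) ** p)

ratios = [0.1, 0.3, 0.5, 0.7, 0.9]
for p in (1, 2, 4, 16):
    row = [round(mask_from_linear_ratio(q, p), 3) for q in ratios]
    print(f"p={p:>2}: {row}")
```

Under this reading, p = 1 gives a straight line, and larger p pushes values above 0.5 toward 1 and values below 0.5 toward 0, which is the saturation behaviour that the parameter controls.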
![Page 11: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/11.jpg)
Experiments and Discussion
Simulation
– 16 kHz sampling rate
– Speech
• Training speech data: 540 short utterances
• Testing speech data: 20 utterances
– Music
• 38 pieces for training
• One piece for testing
– Hamming window: 512 points
– FFT size: 512 points
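The analysis settings listed above can be sketched as a magnitude-spectrogram routine. The sampling rate, window type, window length, and FFT size come from the slide; the hop size of 256 samples is an assumption (the slide does not state the overlap), and the function name and the 1 kHz test tone are made up for the example.

```python
import numpy as np

FS = 16000     # sampling rate from the slide
N_WIN = 512    # Hamming window length from the slide
N_FFT = 512    # FFT size from the slide
HOP = 256      # assumption: 50% overlap, not stated on the slide

def magnitude_spectrogram(x, n_win=N_WIN, n_fft=N_FFT, hop=HOP):
    """Return (magnitude, phase) of a Hamming-windowed STFT, one column per frame."""
    window = np.hamming(n_win)
    n_frames = 1 + (len(x) - n_win) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_win] * window for i in range(n_frames)],
        axis=1,
    )
    spectrum = np.fft.rfft(frames, n=n_fft, axis=0)
    return np.abs(spectrum), np.angle(spectrum)

# One second of a 1 kHz tone as a stand-in for real audio.
t = np.arange(FS) / FS
x = np.sin(2 * np.pi * 1000.0 * t)
X_mag, X_phase = magnitude_spectrogram(x)
print(X_mag.shape)
```

The magnitude array is what NMF would factorize, while the phase array is kept aside to rebuild the time-domain signals after masking.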
![Page 12: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/12.jpg)
Experiments and Discussion
![Page 13: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/13.jpg)
Experiments and Discussion
![Page 14: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/14.jpg)
Experiments and Discussion
![Page 15: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/15.jpg)
Experiments and Discussion
![Page 16: SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIXFACTORIZATION AND SPECTRAL MASKS](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816353550346895dd3f403/html5/thumbnails/16.jpg)
Conclusion
The family of masks has a parameter p that controls the saturation level
The proposed algorithm gives better separation results and speeds up the separation process