convex optimization in sinusoidal modeling for audio signal processing michelle daniels phd student,...
Post on 19-Dec-2015
![Page 1: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/1.jpg)
Convex Optimization in Sinusoidal Modeling for Audio Signal Processing
Michelle Daniels
PhD Student, University of California, San Diego
![Page 2: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/2.jpg)
Outline

- Introduction to sinusoidal modeling
- Existing approach
- Proposed optimization post-processing
- Testing and results
- Conclusions
- Future work
![Page 3: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/3.jpg)
Analysis of Audio Signals

- Audio signals have rapid variations
  - Speech
  - Music
  - Environmental sounds
- Assume minimal change over short segments (frames)
- Analyze on a frame-by-frame basis
  - Constant-length frames (46 ms)
  - Frames typically overlap
- Any audio signal can be represented as a sum of sinusoids (deterministic components) and noise (stochastic components)
![Page 4: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/4.jpg)
Sinusoidal Modeling of Audio Signals

Given a signal $y$ of length $N$, represent it as $K$ component sinusoids plus noise $e$:

$$y_n = \sum_{k=1}^{K} a_k \cos(\omega_k n + \phi_k) + e_n, \quad 0 \le n \le N-1$$

- $y$ and $e$ are $N$-dimensional vectors
- Each sinusoid has frequency ($\omega$), magnitude ($a$), and phase ($\phi$) parameters
- $K$ is determined during the analysis process
- Frequencies have higher resolution than DFT bins; no harmonic relationship is required
- Model, encode, and/or process these components independently
- Applications:
  - Effects processing (time-scale modification, pitch shifting)
  - Audio compression
  - Feature extraction for machine listening
  - Auditory scene analysis
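As a minimal sketch of the signal model above, the following NumPy snippet synthesizes one frame from a set of sinusoid parameters and adds noise. The function name `synthesize`, the parameter values, and the 44.1 kHz sample rate implied by the frame length are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def synthesize(freqs, mags, phases, num_samples):
    """Sum of K sinusoids: y_hat[n] = sum_k a_k * cos(w_k * n + phi_k)."""
    n = np.arange(num_samples)                       # n = 0, ..., N-1
    freqs, mags, phases = map(np.asarray, (freqs, mags, phases))
    # The outer product gives an (N, K) grid of w_k * n values.
    return np.cos(np.outer(n, freqs) + phases) @ mags

# Hypothetical frame: 2048 samples is about 46 ms at an assumed 44.1 kHz.
rng = np.random.default_rng(0)
N = 2048
y_hat = synthesize([0.10, 0.25], [1.0, 0.5], [0.0, np.pi / 3], N)
e = 0.01 * rng.standard_normal(N)    # stochastic (noise) component
y = y_hat + e                        # observed frame
```

Frequencies here are in radians per sample, so they are not tied to DFT bin centers.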
![Page 5: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/5.jpg)
Estimation Algorithm

Using frequency-domain analysis (e.g. FFT), iterate up to $K$ times, until the residual signal is small and/or has a flat spectrum:

1. Identify the highest-magnitude sinusoid in the signal
2. Estimate its frequency $\omega$
3. Given $\omega$, estimate its magnitude $a$ and phase $\phi$
4. Reconstruct the sinusoid
5. Subtract the reconstructed sinusoid to produce a residual signal

After all sinusoids have been removed, the final residual contains only noise.
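The iterative procedure above can be sketched roughly as follows. The slides do not say which estimators are used for frequency, magnitude, and phase, so this example substitutes simple stand-ins: a zero-padded, Hann-windowed FFT peak with parabolic interpolation for frequency, and a projection onto a complex exponential for magnitude and phase. The names `estimate_peak` and `analyze`, the padding factor, and the energy threshold are assumptions; the spectral-flatness stopping test is omitted.

```python
import numpy as np

def estimate_peak(residual, pad_factor=8):
    """Estimate (w, a, phi) of the strongest sinusoid in `residual`.

    Illustrative estimator only; DC and Nyquist edge cases are ignored.
    """
    N = len(residual)
    nfft = pad_factor * N
    spectrum = np.abs(np.fft.rfft(residual * np.hanning(N), nfft))
    k = int(np.argmax(spectrum))
    # Parabolic interpolation on log-magnitude refines the peak location.
    shift = 0.0
    if 0 < k < len(spectrum) - 1:
        lo, mid, hi = np.log(spectrum[k - 1:k + 2] + 1e-12)
        denom = lo - 2.0 * mid + hi
        if denom != 0.0:
            shift = 0.5 * (lo - hi) / denom
    w = 2.0 * np.pi * (k + shift) / nfft          # radians per sample
    # Project onto a complex exponential at w for magnitude and phase.
    n = np.arange(N)
    c = 2.0 / N * np.sum(residual * np.exp(-1j * w * n))
    return w, np.abs(c), np.angle(c)

def analyze(y, max_peaks, energy_tol=1e-3):
    """Greedy analysis: estimate a peak, subtract it, repeat."""
    residual = np.array(y, dtype=float)
    n = np.arange(len(residual))
    total = np.sum(residual ** 2)
    peaks = []
    for _ in range(max_peaks):
        if np.sum(residual ** 2) <= energy_tol * total:
            break                                  # residual is small enough
        w, a, phi = estimate_peak(residual)
        residual -= a * np.cos(w * n + phi)        # subtract the reconstruction
        peaks.append((w, a, phi))
    return peaks, residual

# Hypothetical two-sinusoid frame with well-separated frequencies.
n = np.arange(1024)
y = 1.0 * np.cos(0.3 * n + 0.2) + 0.5 * np.cos(1.1 * n - 1.0)
peaks, residual = analyze(y, max_peaks=2)
```

Because each subtraction uses the current estimates, any error in an early peak stays in the residual that later iterations analyze.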
![Page 6: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/6.jpg)
Sinusoidal Analysis Example
![Page 7: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/7.jpg)
Sinusoidal Analysis Example
![Page 8: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/8.jpg)
Sinusoidal Analysis Example
![Page 9: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/9.jpg)
Sinusoidal Analysis Example
![Page 10: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/10.jpg)
Estimation Challenges

- Energy in any DFT bin can come from:
  - Multiple sinusoids with similar frequency
  - Both sinusoids and noise
- Interference from other sinusoids and/or noise results in inaccurate estimates
- Incorrect estimation of a single sinusoid corrupts the residual signal and affects all subsequent estimates
![Page 11: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/11.jpg)
Possible Solution

- Optimize frequency, magnitude, and phase to minimize the energy in the residual signal
- The original parameter estimates are initial estimates for the optimization

Sinusoidal approximation:

$$\hat{y}_n = \sum_{k=1}^{K} a_k \cos(\omega_k n + \phi_k), \quad 0 \le n \le N-1$$

Residual:

$$e = y - \hat{y}$$

Optimization problem:

$$\min_{\omega, a, \phi} \|y - \hat{y}\|_2 \quad \text{subject to} \quad a_k \ge 0, \; 1 \le k \le K$$
![Page 12: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/12.jpg)
Is it Convex?

- Want convexity so the problem is practical to solve
- Not a convex optimization problem, because each element of $\hat{y}$ is a sum of cosine functions of $\omega$ and $\phi$
- Want a convex function inside the 2-norm instead
- With fixed frequencies, the optimization of magnitudes and phases can be reformulated as a convex problem
- Fix frequencies to their initial estimates

$$\min_{\omega, a, \phi} \|y - \hat{y}\|_2 \quad \text{subject to} \quad a_k \ge 0, \; 1 \le k \le K$$
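A quick numerical check illustrates why the problem is not convex in the phase. The residual energy is periodic in $\phi$, so it is zero at both $\phi$ and $\phi + 2\pi$ but large at the midpoint $\phi + \pi$, which violates the convexity inequality $f(\tfrac{u+v}{2}) \le \tfrac{1}{2}(f(u) + f(v))$. The parameter values below are arbitrary assumptions chosen for illustration.

```python
import numpy as np

n = np.arange(256)
w, a, phi = 0.4, 1.0, 0.7            # hypothetical "true" parameters
y = a * np.cos(w * n + phi)

def residual_energy(phase):
    """Squared 2-norm of y - y_hat as a function of phase alone."""
    return np.sum((y - a * np.cos(w * n + phase)) ** 2)

f_left = residual_energy(phi)                 # exact match, energy ~0
f_right = residual_energy(phi + 2 * np.pi)    # same sinusoid, energy ~0
f_mid = residual_energy(phi + np.pi)          # sign-flipped sinusoid

# Convexity would require f_mid <= (f_left + f_right) / 2; it fails here.
convexity_violated = f_mid > 0.5 * (f_left + f_right)
```

The same periodicity argument applies to frequency on integer sample indices, which is one way to see why the frequencies are held fixed.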
![Page 13: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/13.jpg)
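Convex Optimization Problem

Classic least-squares problem:

$$\min_{x} \|Ax - y\|_2, \quad x \in \mathbb{R}^{2K}, \; y \in \mathbb{R}^{N}, \; A \in \mathbb{R}^{N \times 2K}$$

$$A = \begin{bmatrix}
\sin(0) & \sin(0) & \cdots & \sin(0) & \cos(0) & \cos(0) & \cdots & \cos(0) \\
\sin(\omega_1) & \sin(\omega_2) & \cdots & \sin(\omega_K) & \cos(\omega_1) & \cos(\omega_2) & \cdots & \cos(\omega_K) \\
\sin(2\omega_1) & \sin(2\omega_2) & \cdots & \sin(2\omega_K) & \cos(2\omega_1) & \cos(2\omega_2) & \cdots & \cos(2\omega_K) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\sin((N-1)\omega_1) & \sin((N-1)\omega_2) & \cdots & \sin((N-1)\omega_K) & \cos((N-1)\omega_1) & \cos((N-1)\omega_2) & \cdots & \cos((N-1)\omega_K)
\end{bmatrix}$$

Magnitude and phase recovered as:

$$a_k = \sqrt{x_k^2 + x_{k+K}^2} \quad \text{and} \quad \phi_k = \tan^{-1}\!\left(\frac{x_{k+K}}{x_k}\right) - \frac{\pi}{2}$$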
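A minimal sketch of the unconstrained least-squares step, assuming the frequencies are already known: it builds $A$ with sine columns followed by cosine columns, solves with `np.linalg.lstsq`, and recovers magnitude and phase. The four-quadrant `np.arctan2` is used in place of a plain inverse tangent so that the phase lands in the correct quadrant for either sign of $x_k$. Function names and test values are assumptions for illustration.

```python
import numpy as np

def build_A(freqs, N):
    """N x 2K matrix: columns sin(n * w_k), then cos(n * w_k)."""
    n = np.arange(N)[:, None]
    w = np.asarray(freqs, dtype=float)[None, :]
    return np.hstack([np.sin(n * w), np.cos(n * w)])

def solve_mag_phase(y, freqs):
    """Least-squares magnitudes and phases for fixed frequencies."""
    K = len(freqs)
    A = build_A(freqs, len(y))
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    mags = np.hypot(x[:K], x[K:])                    # sqrt(x_k^2 + x_{k+K}^2)
    raw = np.arctan2(x[K:], x[:K]) - np.pi / 2       # four-quadrant arctangent
    phases = (raw + np.pi) % (2 * np.pi) - np.pi     # wrap to [-pi, pi)
    return mags, phases

# Hypothetical frame: two sinusoids close in frequency, no noise.
n = np.arange(1024)
true_freqs = [0.30, 0.32]
y = 1.0 * np.cos(0.30 * n + 0.5) + 0.8 * np.cos(0.32 * n - 2.0)
mags, phases = solve_mag_phase(y, true_freqs)
```

With exact frequencies and no noise the fit is essentially exact; the slides attribute the unrealistic magnitudes seen in practice to errors in the frequency estimates.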
![Page 14: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/14.jpg)
Related Work

- Petre Stoica, Hongbin Li, and Jian Li. “Amplitude estimation of sinusoidal signals: Survey, new results, and an application”, 2000.
  - Mentions least-squares as one approach to estimate the amplitude of complex exponentials
  - No discussion of phase estimation
- Hing-Cheung So. “On linear least squares approach for phase estimation of real sinusoidal signals”, 2005.
  - Focuses on phase estimation
  - Theoretical analysis
- Not applied specifically to audio signals
![Page 15: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/15.jpg)
Constraints

- The analytic least-squares solution frequently results in unrealistic magnitude values
  - This is possibly the result of errors in the frequency estimates
- Constraints on magnitudes were required

Ideal constraint:

$$0 \le \sqrt{x_k^2 + x_{k+K}^2} \le a_{\max}, \quad 1 \le k \le K$$

Relaxed constraint:

$$-a_{\max} \le x_k \le a_{\max}, \quad 1 \le k \le 2K$$

The result is a constrained least-squares problem that can be solved using a generic quadratic program (QP) solver.
![Page 16: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/16.jpg)
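Final Formulation

Quadratic Program:

$$\min_{x} \|Ax - y\|_2 \quad \text{subject to} \quad -a_{\max} \le x_k \le a_{\max}, \; 1 \le k \le 2K$$

Magnitude and phase recovered from $x$ as:

$$a_k = \sqrt{x_k^2 + x_{k+K}^2} \quad \text{and} \quad \phi_k = \tan^{-1}\!\left(\frac{x_{k+K}}{x_k}\right) - \frac{\pi}{2}$$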
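The slides solve this problem with a generic QP solver. As a dependency-light stand-in, the sketch below uses projected gradient descent instead, which handles box constraints by clipping after each gradient step. It minimizes the squared norm, which has the same minimizer as the 2-norm. The function name, iteration count, and the values of `a_max` are assumptions, and the bound is applied to all $2K$ entries of $x$.

```python
import numpy as np

def box_constrained_lstsq(A, y, a_max, num_iters=500):
    """Minimize ||Ax - y||^2 subject to -a_max <= x_k <= a_max.

    Projected gradient descent (a swapped-in technique, not a QP solver).
    """
    # Step 1/L, with L the largest eigenvalue of A^T A (gradient Lipschitz
    # constant of 0.5 * ||Ax - y||^2).
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - y)
        x = np.clip(x - step * grad, -a_max, a_max)   # project onto the box
    return x

# Hypothetical frame with fixed (assumed known) frequencies.
n = np.arange(512)
freqs = np.array([0.3, 1.1])
A = np.hstack([np.sin(np.outer(n, freqs)), np.cos(np.outer(n, freqs))])
y = 1.0 * np.cos(0.3 * n + 0.4) + 0.2 * np.cos(1.1 * n - 1.0)

x_loose = box_constrained_lstsq(A, y, a_max=10.0)   # bounds inactive
x_tight = box_constrained_lstsq(A, y, a_max=0.5)    # bounds active
x_free, *_ = np.linalg.lstsq(A, y, rcond=None)      # unconstrained reference
```

Because the box contains the disk of radius $a_{\max}$, a recovered magnitude $\sqrt{x_k^2 + x_{k+K}^2}$ can still reach $\sqrt{2}\,a_{\max}$ under the relaxed constraint.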
![Page 17: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/17.jpg)
Test Signals

- Model test signals that reproduce challenging aspects of real-world signals
- Reconstruct the signal based on the original model parameters and on the optimized parameters
- Compare both reconstructions to the original test signal and to each other
![Page 18: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/18.jpg)
Test Signal 1: Overlapping Sinusoids

- Signal consists of two sinusoids close in frequency
- There is no additive noise, so the residual (the noise component of the model) should be zero
![Page 19: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/19.jpg)
Results 1: Overlapping Sinusoids

- Without optimization, there is significant energy left in the residual (very audible)
- With optimization, the residual power at individual frequencies is reduced by as much as 50 dB (now barely audible)
- The improvement with optimization generally decreases as the frequency separation is increased
![Page 20: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/20.jpg)
Test Signal 2: Sudden Onset

- A single sinusoid starts halfway through an analysis frame (the first half is silence)
![Page 21: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/21.jpg)
Results 2: Sudden Onset

- Original: MSE = $2.76 \times 10^{-5}$
- Optimized: MSE = $4.13 \times 10^{-6}$

MSE = Mean Squared Error
![Page 22: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/22.jpg)
Test Signal 3: Chirp

- A single sinusoid with constant magnitude and continuously increasing frequency
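For readers who want to reproduce a signal of this kind, here is a small sketch that generates a constant-magnitude chirp. The slides do not specify how the frequency increases, so a linear sweep is assumed; the function name and the sweep endpoints are also illustrative.

```python
import numpy as np

def linear_chirp(num_samples, w_start, w_end, mag=1.0):
    """Constant-magnitude sinusoid whose frequency rises linearly.

    Instantaneous frequency is w_start + rate * n (radians per sample),
    so the phase is its running integral: w_start * n + 0.5 * rate * n^2.
    """
    n = np.arange(num_samples)
    rate = (w_end - w_start) / num_samples
    phase = w_start * n + 0.5 * rate * n ** 2
    return mag * np.cos(phase)

# Hypothetical sweep over several analysis frames' worth of samples.
chirp = linear_chirp(8192, w_start=0.1, w_end=0.5)
```

Within any single frame the frequency is not constant, so a fixed-frequency sinusoid can only approximate the chirp over that frame.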
![Page 23: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/23.jpg)
Results 3: Chirp

- Non-optimized peak magnitudes are close to constant between consecutive frames
- Optimized peak magnitudes vary significantly from frame to frame
- The optimization produces peak parameters that do not reflect the underlying real-world phenomenon
![Page 24: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/24.jpg)
Conclusions

- The problem can be formulated using convex programming
- For several classic challenging signals, optimization produces a more accurate model
- Constraints are necessary to ensure parameter estimates reflect possible real-world phenomena
- The final formulation is a quadratic program
- Parameters obtained via optimization may still not represent the underlying real-world phenomenon as well as the original analysis (e.g. the chirp)
![Page 25: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/25.jpg)
Future Work

- Explore robust optimization techniques to compensate for errors in frequency estimates
- Integrate optimization into the original analysis instead of a post-processing stage
- Experiment with more real-world signals
- Further investigate constraints
- The ultimate goal: three-way joint optimization of frequency, magnitude, and phase
![Page 26: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/26.jpg)
References

- M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 1.21. http://cvxr.com/cvx, May 2010.
- R. McAulay and T. Quatieri. Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(4):744-754, Aug 1986.
- Xavier Serra. A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition. PhD thesis, Stanford University, 1989.
- Kevin M. Short and Ricardo A. Garcia. Accurate low-frequency magnitude and phase estimation in the presence of DC and near-DC aliasing. In Proceedings of the 121st Convention of the Audio Engineering Society, 2006.
- Kevin M. Short and Ricardo A. Garcia. Signal analysis using the complex spectral phase evolution (CSPE) method. In Proceedings of the 120th Convention of the Audio Engineering Society, 2006.
- Hing-Cheung So. On linear least squares approach for phase estimation of real sinusoidal signals. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E88-A(12):3654-3657, December 2005.
- Petre Stoica, Hongbin Li, and Jian Li. Amplitude estimation of sinusoidal signals: Survey, new results, and an application. IEEE Transactions on Signal Processing, 48(2):338-352, 2000.
![Page 27: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/27.jpg)
Thanks for your attention!
For further information:
http://ccrma.stanford.edu/~danielsm/ifors2011.html
![Page 28: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/28.jpg)
THE END
![Page 29: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/29.jpg)
Convex Reformulation

Define:

Change of variables:

Define:
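The equations for this slide did not survive in the transcript. One derivation that is consistent with the column order of $A$ (sines, then cosines) and with the magnitude and phase recovery formulas is sketched below; it is offered as an assumption, not as the author's exact steps. It uses the angle-addition identity to make each sinusoid linear in two new variables once $\omega_k$ is fixed.

$$
\begin{aligned}
a_k \cos(\omega_k n + \phi_k) &= -a_k \sin\phi_k \, \sin(\omega_k n) + a_k \cos\phi_k \, \cos(\omega_k n) \\
x_k &= -a_k \sin\phi_k = a_k \cos\!\left(\phi_k + \tfrac{\pi}{2}\right) \\
x_{k+K} &= a_k \cos\phi_k = a_k \sin\!\left(\phi_k + \tfrac{\pi}{2}\right) \\
\hat{y}_n &= \sum_{k=1}^{K} \left[ x_k \sin(\omega_k n) + x_{k+K} \cos(\omega_k n) \right] \;\;\Longrightarrow\;\; \hat{y} = Ax
\end{aligned}
$$

Under this change of variables, $x_k^2 + x_{k+K}^2 = a_k^2$ and $\tan(\phi_k + \tfrac{\pi}{2}) = x_{k+K} / x_k$, which matches the recovery formulas up to the choice of arctangent branch.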
![Page 30: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/30.jpg)
Test Signal: Sinusoid in noise

- A single sinusoid with stationary frequency, corrupted by additive white Gaussian noise
- Noise is present at all frequencies, including that of the sinusoid, corrupting magnitude and phase estimates
- Test repeated using different variances for the noise (varying signal-to-noise ratios)
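A test signal of this kind could be generated as in the sketch below, which scales white Gaussian noise so that the frame hits a requested signal-to-noise ratio. The function name `add_noise`, the specific SNR, and the sinusoid parameters are assumptions for illustration; the slides do not list the values that were used.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target SNR in dB."""
    noise = rng.standard_normal(len(signal))
    # SNR = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    target_power = np.mean(signal ** 2) / 10 ** (snr_db / 10)
    noise *= np.sqrt(target_power / np.mean(noise ** 2))
    return signal + noise, noise

rng = np.random.default_rng(1)
n = np.arange(2048)
clean = 0.8 * np.cos(0.25 * n + 1.0)          # hypothetical stationary sinusoid
noisy, noise = add_noise(clean, snr_db=20.0, rng=rng)

measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

Rescaling the noise after drawing it fixes the SNR of this particular frame exactly, rather than only on average.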
![Page 31: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/31.jpg)
Results: Sinusoid in noise

- Without optimization, the sinusoid’s magnitude is overestimated and the noise’s energy is underestimated
- The optimization gives residual energy slightly closer to the true noise energy
![Page 32: Convex Optimization in Sinusoidal Modeling for Audio Signal Processing Michelle Daniels PhD Student, University of California, San Diego](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2d5503460f94a039f7/html5/thumbnails/32.jpg)
Results: Overlapping Sinusoids
The optimization is able to compensate for some of the errors in initial magnitude and phase estimation, resulting in a lower MSE.