The evaluation and optimisation of multiresolution FFT Parameters
![Page 1: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/1.jpg)
The evaluation and optimisation of multiresolution FFT Parameters
For use in automatic music transcription algorithms
![Page 2: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/2.jpg)
Automatic music transcription (AMT)
![Page 3: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/3.jpg)
AMT Algorithms
![Page 4: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/4.jpg)
Time & Frequency Resolution
Short Window: Time Resolution Increases, Frequency Resolution Decreases
Long Window: Time Resolution Decreases, Frequency Resolution Increases
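The trade-off can be put into numbers with a minimal sketch. It assumes a 44.1 kHz sample rate, which is not stated on the slides but is consistent with the 21.53 Hz bin spacing of the 2048 FFT plots in this deck. The function name `resolutions` is illustrative only.

```python
SAMPLE_RATE = 44100  # Hz; assumed, not stated on the slides


def resolutions(fft_length, sample_rate=SAMPLE_RATE):
    """Return (bin spacing in Hz, window duration in ms) for one FFT length."""
    freq_res_hz = sample_rate / fft_length
    time_res_ms = 1000.0 * fft_length / sample_rate
    return freq_res_hz, time_res_ms


for n in (256, 2048, 8192):
    f, t = resolutions(n)
    print(f"N={n:5d}  bin spacing={f:7.2f} Hz  window={t:7.2f} ms")
```

A shorter window gives a wider bin spacing (coarser frequency resolution) but a shorter analysis frame (finer time resolution), and a longer window does the opposite.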
![Page 5: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/5.jpg)
Multiresolution FFT (MRFFT)
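[Figure: sub-bands bounded by the cut-off frequencies FcA, FcB, FcC and FcD, analysed by FFT A, FFT B, FFT C and FFT D; the axis runs from High Frequency Resolution to High Time Resolution]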
![Page 6: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/6.jpg)
Time-Frequency Plane (Dressler)
![Page 7: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/7.jpg)
Window Length - Bin Alignment
Note-bin alignment – The position of a fundamental frequency relative to an FFT bin frequency.
![Page 8: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/8.jpg)
Note bin alignment
[Figure: A 2048 FFT Decomposition of a 376.83 Hz Sine Wave. x-axis: FFT Bin (Hz), 215.33 to 495.26 in steps of about 21.53 Hz; y-axis: FFT Bin Magnitude, 0 to 250]
![Page 9: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/9.jpg)
Note bin alignment
[Figure: A 2048 FFT Decomposition of a 366.06 Hz Sine Wave. x-axis: FFT Bin (Hz), 215.33 to 495.26 in steps of about 21.53 Hz; y-axis: FFT Bin Magnitude, 0 to 600]
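The effect of note-bin alignment can be reproduced with a short sketch. It assumes a 44.1 kHz sample rate and no window function (rectangular window), neither of which is stated on the slides, so the absolute magnitudes will not match the plots; only the qualitative behaviour is meant to carry over. At this sample rate 366.06 Hz sits almost exactly on bin 17 of a 2048 FFT, while 376.83 Hz falls halfway between bins 17 and 18.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz; assumed, not stated on the slides
FFT_LENGTH = 2048


def sine_spectrum(freq_hz, n=FFT_LENGTH, fs=SAMPLE_RATE):
    """Magnitude spectrum of one frame of a unit-amplitude sine wave."""
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * freq_hz * t)
    return np.abs(np.fft.rfft(x))


def peak_and_leakage(freq_hz):
    """Return (peak bin index, fraction of energy outside the peak bin)."""
    mag = sine_spectrum(freq_hz)
    peak_bin = int(np.argmax(mag))
    leakage = 1.0 - mag[peak_bin] ** 2 / np.sum(mag ** 2)
    return peak_bin, float(leakage)


for f in (366.06, 376.83):
    peak_bin, leakage = peak_and_leakage(f)
    print(f"{f} Hz: peak bin {peak_bin}, energy outside peak bin {leakage:.1%}")
```

When the sine is aligned with a bin, nearly all of its energy lands in that bin; when it sits between two bins, the energy spreads across neighbouring bins and the peak is lower.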
![Page 10: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/10.jpg)
MRFFT Optimisation
Parameters optimised:
- Cut-off frequencies
- Sub-band FFT length

Optimised based on 3 characteristics determined by window length:
- Time Resolution
- Frequency Resolution
- Note Bin Alignment
![Page 11: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/11.jpg)
Scoring
- Calculate a score for time, frequency and note-bin alignment in each sub-band
- Weight the score according to the notes in the sub-band
- Range-correct the score to lie between 0 and 1
- Sum all scores across all bands to generate the MRFFT Score
![Page 12: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/12.jpg)
Note Bin Scoring
If two note frequencies fall within the same bin, the FFT length is discounted as unsuitable.
Weighted Sub-band FFT Bin Score = Sub-band FFT Bin Score * (notes in sub-band/total notes across all bands)
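The two rules on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a 44.1 kHz sample rate, equal-tempered note frequencies with A4 = 440 Hz, and "falling within the same bin" interpreted as mapping to the same nearest bin. The 80 to 160 Hz band and all function names are hypothetical.

```python
SAMPLE_RATE = 44100  # Hz; assumed, not stated on the slides


def note_frequencies(low_hz, high_hz, a4_hz=440.0):
    """Equal-tempered note frequencies (MIDI 0..127) inside [low_hz, high_hz)."""
    freqs = [a4_hz * 2 ** ((m - 69) / 12) for m in range(128)]
    return [f for f in freqs if low_hz <= f < high_hz]


def has_shared_bin(freqs, fft_length, fs=SAMPLE_RATE):
    """True if two notes map to the same nearest FFT bin (length unsuitable)."""
    bins = [round(f * fft_length / fs) for f in freqs]
    return len(set(bins)) < len(bins)


def weighted_bin_score(bin_score, notes_in_band, total_notes):
    """Sub-band bin score scaled by the share of notes in that sub-band."""
    return bin_score * notes_in_band / total_notes


low_band = note_frequencies(80.0, 160.0)  # hypothetical sub-band
for n in (2048, 8192):
    print(n, "unsuitable" if has_shared_bin(low_band, n) else "usable")
```

In this hypothetical low band a 2048 FFT puts neighbouring semitones into the same bin, so it would be discounted, while an 8192 FFT separates them.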
![Page 13: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/13.jpg)
Scoring Process
The algorithm moves the cut-off frequencies A, B and C through all combinations of positions. For each position, all FFT lengths between 256 and 8192 samples in increments of 128 are evaluated on each sub-band. All combinations of FFT lengths on all combinations of sub-bands are evaluated and scored.
[Figure: Subband A, Subband B, Subband C and Subband D, bounded by the cut-off frequencies FcA, FcB, FcC and FcD; frequency axis from 80 Hz to 5 kHz]
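The size of this search space can be sketched as below. The FFT length range and step come from the slide; the grid of candidate cut-off positions is purely hypothetical, since the slides do not say which positions are tried.

```python
from itertools import combinations

# FFT lengths from the slide: 256 to 8192 samples in increments of 128.
fft_lengths = list(range(256, 8192 + 1, 128))

# One FFT length is chosen independently for each of the four sub-bands.
num_bands = 4
combos_per_position = len(fft_lengths) ** num_bands

# Hypothetical candidate positions (Hz) for the movable cut-offs A, B and C.
candidate_cutoffs_hz = [160, 320, 640, 1280, 2560]
cutoff_positions = list(combinations(candidate_cutoffs_hz, 3))  # FcA < FcB < FcC

print(len(fft_lengths), "FFT lengths per sub-band")
print(combos_per_position, "FFT length combinations per cut-off position")
print(len(cutoff_positions), "cut-off positions on the hypothetical grid")
```

Even a coarse grid of cut-off positions multiplies an already large number of FFT length combinations, which is why the evaluation is scripted rather than done by hand.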
![Page 14: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/14.jpg)
Solutions
1. 4 band MRFFT, 256-8192 range
2. 3 band MRFFT, 256-8192 range
3. Dressler 4 band MRFFT, 256-2048 range
4. Dressler fixed FFT length, variable bands, 256-2048 range
5. 4 band MRFFT, 256-2048 range
6. 1 band FFT, 8192
![Page 15: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/15.jpg)
Results – Subband Divisions
[Figure legend: Band A, Band B, Band C, Band D]
![Page 16: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/16.jpg)
Results – MRFFT Score
![Page 17: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/17.jpg)
Transcription Test – Low F Bands
[Figure panels: Original, Solution 1, Solution 6; cut-off frequencies FcA and FcB marked]
The high frequency resolution of Solution 6 is reflected in its low-frequency transcription accuracy.
![Page 18: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/18.jpg)
Transcription Test – High F Bands
[Figure panels: Solution 1, Solution 3, Solution 6]
![Page 19: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/19.jpg)
F-Measure Results
[Figure: Recall, Precision and F-measure for Solutions 1 to 6. x-axis: Solution; y-axis: Score, 0.000 to 1.000]
Recall is the fraction of the relevant notes that were retrieved, i.e. how many of the correct notes the system extracted.
Precision is the fraction of the retrieved notes that are relevant, i.e. how many of the extracted notes were correct.
F-Measure is the weighted harmonic mean of precision and recall.
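These three measures can be computed from note counts as in the sketch below. The F-measure here is the standard weighted harmonic mean, with `beta=1.0` weighting precision and recall equally; the slides do not state which weighting was used, and the counts in the usage line are made up for illustration.

```python
def precision_recall_f(true_positives, false_positives, false_negatives, beta=1.0):
    """Return (precision, recall, F-measure) from note counts."""
    retrieved = true_positives + false_positives  # notes the system extracted
    relevant = true_positives + false_negatives   # notes actually present
    precision = true_positives / retrieved if retrieved else 0.0
    recall = true_positives / relevant if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# Illustrative counts only: 8 correct notes, 2 spurious notes, 8 missed notes.
print(precision_recall_f(8, 2, 8))
```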
![Page 20: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/20.jpg)
Peak Picker
A threshold is dynamically set for each analysis window of the STFT as a percentage of the maximum magnitude within the window, with a heuristically chosen minimum threshold. If a bin magnitude exceeds the threshold, a note is transcribed at that point.
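A minimal sketch of such a picker for a single analysis window follows. The values `fraction=0.5` and `min_threshold=1.0` are placeholders, since the slides give neither the percentage nor the minimum threshold, and the function name is illustrative.

```python
import numpy as np


def pick_peaks(frame_magnitudes, fraction=0.5, min_threshold=1.0):
    """Return (bin indices above the threshold, threshold) for one frame.

    The threshold is a fraction of the frame's maximum magnitude, but never
    lower than min_threshold, so quiet frames do not produce notes.
    """
    mags = np.asarray(frame_magnitudes, dtype=float)
    threshold = max(fraction * mags.max(), min_threshold)
    return np.flatnonzero(mags > threshold), threshold


bins, threshold = pick_peaks([0.1, 5.0, 0.2, 10.0, 4.0])
print(bins, threshold)
```

Because the threshold follows the largest magnitude in each frame, a single strong partial can push weaker notes in the same frame below the threshold.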
![Page 21: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/21.jpg)
Peak Picker Robustness
![Page 22: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/22.jpg)
Solution 1 Vs Solution 6 Picker
![Page 23: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/23.jpg)
MRFFT Implementation
A 6016 FFT is performed on the entire frequency spectrum. The spectral information is then filtered to include only the frequencies required by that band.
A note frequency (orange magnitude) that is not in the frequency band under consideration generates cross-channel interference (red magnitudes) that contributes to the magnitudes in the sub-band of interest.
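The sub-band construction and the resulting interference can be sketched as below. The sketch assumes a 44.1 kHz sample rate, no window function, and a hypothetical sub-band of 80 to 320 Hz; the test tone (466.16 Hz) is chosen to lie outside that band and between bins of a 6016 FFT.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz; assumed, not stated on the slides


def subband_magnitudes(signal, fft_length, low_hz, high_hz, fs=SAMPLE_RATE):
    """FFT the full-band frame, then keep only the bins inside the sub-band."""
    frame = signal[:fft_length]
    mags = np.abs(np.fft.rfft(frame, n=fft_length))
    freqs = np.fft.rfftfreq(fft_length, d=1.0 / fs)
    in_band = (freqs >= low_hz) & (freqs < high_hz)
    return freqs[in_band], mags[in_band]


n = 6016
t = np.arange(n) / SAMPLE_RATE
out_of_band_note = np.sin(2 * np.pi * 466.16 * t)  # lies above the sub-band

freqs, mags = subband_magnitudes(out_of_band_note, n, 80.0, 320.0)
print(f"largest in-band magnitude caused by the out-of-band note: {mags.max():.1f}")
```

Although the note lies outside the sub-band, its spectral leakage leaves non-zero magnitudes in the retained bins, which is the cross-channel interference described above.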
![Page 24: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/24.jpg)
Cross talk indicators
![Page 25: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/25.jpg)
Adjacent bins
Adjacent bins in the optimised MRFFT represent fundamental frequencies. Therefore any cross-channel interference will contribute to the energy contained in FFT bins representing note frequencies. This may contribute to false positives.
![Page 26: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/26.jpg)
F-Measure conclusions
The F-Measure results are largely disappointing, which can be attributed to the inability of the implemented peak picker to handle fluctuations in the magnitude of local maxima. Characteristics of the MRFFT, such as adjacent bins representing notes and the interference generated by the sub-band division method, contribute to this problem.
Large variations in spectral magnitudes also contribute.
![Page 27: The evaluation and optimisation of multiresolution FFT Parameters](https://reader036.vdocuments.us/reader036/viewer/2022062301/56816156550346895dd0dda7/html5/thumbnails/27.jpg)
Conclusions
The theoretical scoring of MRFFT parameters produced favourable results for the optimised FFT.
The ‘real world’ sinusoidal extraction test demonstrated initially disappointing F-Measure results for the MRFFT solutions compared to the single band 8192 FFT. However, upon closer analysis of the transcribed files, positive aspects of the MRFFT analysis were found as performance improved in the higher frequencies.
Further investigation of the results revealed inadequacies of the peak picker implemented and also indicated issues with the construction of the MRFFT that require further investigation.