Pitch Recognition with Wavelets: 1.130 Final Presentation by Stephen Geiger
What is pitch recognition?
Well, what is pitch? . . .
How HIGH or LOW a sound is
Which note?
Perceived Frequency
For Example:
For Middle C:
Frequency = 262 Hz
MATLAB CODE:
fs = 22050;                   % Sampling frequency.
f = 262;                      % Fundamental freq of middle C.
t = 0:1/fs:1;                 % Time range of 0 to 1 seconds.
sound(cos(2*pi*f*t)/2, fs);   % Make some noise!
For an A Scale:
A  = 220*2^(0/12)  = 220 Hz
A# = 220*2^(1/12)  = 233 Hz
B  = 220*2^(2/12)  = 247 Hz
C  = 220*2^(3/12)  = 262 Hz
C# = 220*2^(4/12)  = 277 Hz
D  = 220*2^(5/12)  = 294 Hz
D# = 220*2^(6/12)  = 311 Hz
E  = 220*2^(7/12)  = 330 Hz
F  = 220*2^(8/12)  = 349 Hz
F# = 220*2^(9/12)  = 370 Hz
G  = 220*2^(10/12) = 392 Hz
G# = 220*2^(11/12) = 415 Hz
A  = 220*2^(12/12) = 440 Hz
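The table follows one rule: each semitone above the 220 Hz reference multiplies the frequency by 2^(1/12). A minimal sketch of that rule in MATLAB is given here; the variable names (fA, names, freqs) are illustrative and not taken from the presentation.

```matlab
fA    = 220;                         % Reference pitch (A) in Hz, as in the table.
names = {'A','A#','B','C','C#','D','D#','E','F','F#','G','G#','A'};
n     = 0:12;                        % Semitone offsets from the reference.
freqs = fA * 2.^(n/12);              % Equal temperament: one semitone = factor 2^(1/12).

% Print the table, rounded to the nearest Hz.
for k = 1:numel(n)
    fprintf('%-2s = 220*2^(%d/12) = %.0f Hz\n', names{k}, n(k), freqs(k));
end
```

Twelve semitones give a factor of exactly 2, which is why the last entry is one octave above the first.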
An Octave Up:
For C5:
Frequency = 524 Hz
MATLAB CODE:
fs = 22050;                   % Sampling frequency.
f = 524;                      % Fundamental freq of C5.
t = 0:1/fs:1;                 % Time range of 0 to 1 seconds.
sound(cos(2*pi*f*t)/2, fs);   % Make some noise!
A Sum with 2 Frequencies:
MATLAB CODE:
fs = 22050;                   % Sampling frequency.
f1 = 262;                     % Fundamental freq of middle C.
f2 = 524;                     % Fundamental freq of C5.
t = 0:1/fs:1;                 % Time range of 0 to 1 seconds.
sound((cos(2*pi*f1*t) + ...
       0.25*cos(2*pi*f2*t))/2, fs);
Frequency = 262 Hz
and
Frequency = 524 Hz
Mono vs. Poly
Monophonic
one note at a time
(e.g. trumpet)
Polyphonic
multiple notes at a time
(e.g. piano, orchestra)
Creates a problem for pitch recognition.
(especially octaves!)
Some Existing Methods
Time Domain – Pitch period estimation
  With wavelets.
  With the autocorrelation function.
Freq. Domain – Find the fundamental
Auditory Scene Analysis
Blackboard Systems
Neural Networks
Perceptual Models
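To make the first two entries concrete, here is a rough sketch of an autocorrelation pitch-period estimate and an FFT peak pick, run on the two-frequency test tone used earlier. It is not the presentation's code: the 50 to 1000 Hz search range and all variable names are assumptions, and conv is used in place of any toolbox autocorrelation routine.

```matlab
fs = 22050;                           % Sampling frequency.
f0 = 262;                             % Test tone: middle C plus a weaker octave.
t  = 0:1/fs:0.1;
x  = cos(2*pi*f0*t) + 0.25*cos(2*pi*2*f0*t);
N  = numel(x);

% Time domain: autocorrelation, computed as x convolved with its reverse.
r      = conv(x, fliplr(x));          % Lags -(N-1) .. (N-1).
r      = r(N:end);                    % Keep lags 0 .. N-1 (r(k) is lag k-1).
minLag = floor(fs/1000);              % Assumed search range: 50 Hz to 1000 Hz.
maxLag = ceil(fs/50);
[~, idx] = max(r(minLag+1:maxLag+1)); % Strongest repeat within the search range.
lag    = minLag + idx - 1;            % Pitch period in samples.
fAuto  = fs / lag;                    % Estimated pitch in Hz.

% Frequency domain: largest FFT magnitude, skipping the DC bin.
X      = abs(fft(x));
half   = X(1:floor(N/2));             % Bins 0 .. floor(N/2)-1.
[~, k] = max(half(2:end));            % k is the 0-based bin number.
fFFT   = k * fs / N;                  % Accurate only to the bin spacing fs/N.

fprintf('Autocorrelation: %.1f Hz, FFT peak: %.1f Hz\n', fAuto, fFFT);
```

Picking the largest FFT peak only finds the fundamental when the fundamental is the strongest partial, which is one reason polyphonic input is hard.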
What applications are there?
Transcription of Music
Modeling of Musical Instruments
Speech Analysis
Besides, it's an interesting problem
A Novel Wavelet Approach
Based on an observation made by
Jeremy Todd, that:
For a piano playing these notes, a CWT
could be used to identify a ‘G’
with certain scale/wavelet combinations.
Even with some polyphony!
The Continuous Wavelet Transform
Definition of a CWT:
$$C_{a,b} = \int f(t)\, \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right) dt$$
Where:
a = scaling factor
b = shift factor
f(t) = function we start with
ψ(t) = Mother wavelet
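As an illustration of the definition, the CWT at one scale can be approximated by sliding a scaled, sampled wavelet along the signal. The sketch below assumes a Mexican hat mother wavelet and a scale of 20 samples; the slides do not say which wavelet or scales were used, so both are hypothetical choices, and the code uses only base MATLAB rather than a toolbox cwt call.

```matlab
% Assumed mother wavelet: Mexican hat, psi(t) = (1 - t^2) * exp(-t^2/2).
psi = @(u) (1 - u.^2) .* exp(-u.^2/2);

fs = 22050;                          % Sampling frequency.
t  = 0:1/fs:0.05;                    % 50 ms of signal.
f  = cos(2*pi*262*t);                % f(t): middle C.

a  = 20;                             % Scaling factor, in samples (hypothetical).
n  = -5*a:5*a;                       % psi is negligible beyond about +/- 5a.
w  = psi(n/a) / sqrt(a);             % Samples of (1/sqrt(a)) * psi(t/a).

% C(a,b) = sum over t of f(t) * w(t - b), for every shift b.
% The Mexican hat is symmetric, so this correlation equals a convolution.
C  = conv(f, w, 'same');             % One coefficient per shift b.

fprintf('max |C(a,b)| at scale %d: %.3f\n', a, max(abs(C)));
```

Repeating this for a range of values of a gives one row of coefficients per scale.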
What is Scale?
LOW SCALE
Compressed Wavelet
Lots of Detail
High Frequency
(You are here) (And here)
HIGH SCALE
Stretched Wavelet
Coarse Features
Low Frequency
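A common rule of thumb makes the inverse relation between scale and frequency explicit: a wavelet whose spectrum peaks at fc (in cycles per unit time) responds, at scale a in samples, to frequencies near fc*fs/a. The sketch below assumes a Mexican hat wavelet, whose spectrum peaks at sqrt(2)/(2*pi); the wavelet and the scale values are illustrative choices, not taken from the slides.

```matlab
fs      = 22050;                     % Sampling frequency.
fc      = sqrt(2) / (2*pi);          % Spectral peak of the (assumed) Mexican hat wavelet.
scales  = [10 20 40 80];             % Hypothetical scales, in samples.
fApprox = fc * fs ./ scales;         % Approximate frequency matched by each scale.

for k = 1:numel(scales)
    fprintf('scale %3d -> about %.0f Hz\n', scales(k), fApprox(k));
end
```

Doubling the scale halves the matched frequency, so with these assumptions the values run from roughly 496 Hz at scale 10 down to roughly 62 Hz at scale 80.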
Initial Work
Took an empirical approach.
Ran a number of CWTs at varying scales, and looked at the results.
Picked out a CWT scale for each note in the C scale.
Why does this work?
The scale parameter
in the CWT affects
frequency response.
However, our “scales” that
work don’t seem to follow
a clear pattern.
Training Algorithm
Again, took an empirical approach.
Ran CWTs at varying scales, on sample files containing one note.
Picked out scales where: maximum of the CWT for one note >> other notes (and collected results).
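One way this selection step could be sketched is shown below. It is an assumed formalization, not the presentation's code: synthetic cosines stand in for the one-note sample files, the wavelet is a Mexican hat, the candidate scales 5 to 40 are hypothetical, and ">>" is interpreted as the largest ratio between a note's peak response and the strongest peak response of any other note.

```matlab
fs        = 22050;
t         = 0:1/fs:0.1;
% C major scale from middle C (semitone offsets from A = 220 Hz).
noteFreqs = 220 * 2.^([3 5 7 8 10 12 14 15]/12);
psi       = @(u) (1 - u.^2) .* exp(-u.^2/2);   % Assumed Mexican hat wavelet.
scales    = 5:40;                              % Hypothetical candidate scales.

% M(i,j) = maximum |CWT| of note j at scale i.
M = zeros(numel(scales), numel(noteFreqs));
for i = 1:numel(scales)
    a = scales(i);
    n = -5*a:5*a;
    w = psi(n/a) / sqrt(a);
    for j = 1:numel(noteFreqs)
        x = cos(2*pi*noteFreqs(j)*t);          % Stand-in for a one-note sample file.
        M(i,j) = max(abs(conv(x, w, 'valid'))); % 'valid' avoids edge effects.
    end
end

% For each note, keep the scale where it most exceeds every other note.
bestScale = zeros(1, numel(noteFreqs));
for j = 1:numel(noteFreqs)
    others       = M;
    others(:, j) = [];                         % Responses of all other notes.
    ratio        = M(:, j) ./ max(others, [], 2);
    [~, idx]     = max(ratio);
    bestScale(j) = scales(idx);
end
disp(bestScale);
```

With recorded instruments the matrix M would be filled from the sample files instead, and a threshold on the ratio could be used to reject notes for which no scale stands out.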
Training on a ‘Real’ Guitar
Only able to find 5 of 8 pitches for C Scale
training case. (With limited attempt).
Results on a test file were not completely
accurate.
Expected to be a more difficult case than a
piano.
Could merit a more thorough try.
Entire 88 Keys on a Piano
Work in progress.
It takes a long time to run many
CWTs on 88 different sound files.
Initial results able to
identify notes 70-88.
Resulting Scales for 22 Piano Notes
(Plot. Axes: SCALE, 0 to 2500; NOTE NUMBER, 0 to 22.)
Resulting Scales for 8 Sinusoidal Notes
(Plot. Axes: SCALE, 0 to 14000; NOTE NUMBER, 0 to 8.)