amdfcalc


Upload: niravsinghdabhi

Post on 17-Aug-2015


MATLAB Exercise – AMDF Calculation

Program Directory: matlab_gui\amdf
Program Name: amdf_GUI25.m
GUI data file: amdf.mat
Callbacks file: Callbacks_amdf_GUI25.m
TADSP: Section 6.7, pp. 275-277, Problem 6.21

This MATLAB exercise computes and displays the short-time average magnitude difference function (AMDF) of a speech frame and, in cases that are determined to represent voiced speech frames, estimates the pitch period of the current frame as the constrained minimum (i.e., within the range of pitch period estimates) of the AMDF array.

AMDF Calculation – Theory of Operation

This MATLAB exercise calculates and displays the AMDF of a frame of speech from a designated speech file and implements a pitch detection algorithm based on using the AMDF on a frame-by-frame basis. The program can operate in a frame mode, displaying each analysis frame and its associated AMDF (with a marking for best estimate of pitch period in voiced regions), or in a pitch detection mode. In the AMDF pitch detection mode a non-voiced frame detector (when the frame log energy falls below a fixed threshold) is utilized. The code also uses a procedure for determining whether the location of the minimum of the AMDF, for a given analysis frame, represents a case of pitch period doubling or even pitch period tripling, and provides an appropriate correction measure for such cases. The exercise plots the pitch period contour along with a plot of a confidence measure based on how strong a minimum of the AMDF was obtained for each analysis frame.

AMDF Calculation – GUI Design

The GUI for this exercise consists of two panels, 2 graphics panels, 1 title box and 13 buttons. The functionality of the two panels is:
1. one panel for the graphics display,
2. one panel for parameters related to the signal processing parameters for AMDF calculation, and for running the program.

The set of two graphics panels is used to display the following:
1.
in frame mode one graphics panel shows the current frame speech waveform on a normalized amplitude scale; in pitch detector mode this same graphics panel shows the smoothed pitch period contour for the speech utterance,
2. in frame mode the second graphics panel shows the AMDF for the current frame along with the estimated location of a valid pitch period minimum for the current speech frame; in pitch detector mode this same graphics panel shows the smoothed confidence score for the pitch detector pitch periods.

The title box displays information about the file selected for analysis, including frame size, frame shift, and the maximum and minimum pitch periods used when searching for a pitch period estimate.

The functionality of the 13 buttons is:
1. a pushbutton to select the directory with the speech file that is to be analyzed using short-time analysis methods; the default directory is 'speech files',
2. a popupmenu button that allows the user to select the speech file for analysis,
3. a pushbutton to play the speech file being processed,
4. an editable button that specifies the frame duration, Lm, (in msec) for short-time analysis; (the default value is Lm = 40 msec),
5. an editable button that specifies the frame shift, Rm, (in msec) for short-time analysis; (the default value is Rm = 10 msec),
6. a popupmenu button that lets the user choose a pitch range to search for the current pitch period estimate, depending on the gender of the talker; (the default is Male pitch range),
7. an editable button that specifies the AMDF threshold, amdfthresh, for determining that a frame of signal is voiced (minimum AMDF below threshold) or not voiced (minimum AMDF above threshold),
8. a text button that displays the starting sample, ss, of the current frame for frame analysis; (the default value is 1 for starting sample),
9. a pushbutton to determine the single frame starting sample, ss, using the interactive method described below; this starting sample defines the current analysis frame,
10.
a pushbutton to run the analysis code and display the signal processing results using the current frame of the speech signal; this button can be pressed and used as often as desired, changing one or more analysis parameters while keeping the frame starting sample the same,
11. a pushbutton to run the analysis code and display the signal processing results using the next frame of signal; i.e., the frame with starting sample set to ss+R where R is the frame shift in samples; this button can be pushed repeatedly to provide a frame-by-frame analysis,
12. a pushbutton to run the pitch detector code and to display the smoothed pitch period and confidence score contours on the graphics panels,
13. a pushbutton to close the GUI.

Interactive Method of Defining the Speech Analysis Frame Starting Sample

Several MATLAB Exercises rely on frame-based analysis methods where the user needs to specify both the speech file for analysis and the starting sample of the speech analysis frame of interest. The method that we have chosen to define the frame starting sample is an interactive analysis which homes in on an appropriate analysis frame in a series of steps. The operations of this interactive method for determining the starting sample of the speech analysis frame for AMDF analysis proceed as follows:
1. In a specified graphics frame (or figure sub-frame) a single line plot of the entire speech waveform is obtained, as illustrated at the top panel of Figure 1. A graphics cursor then appears, allowing the user to move the cursor to the region of speech that is of interest for specifying the current analysis frame. A solid vertical cursor is shown at the place selected by the user. For the example of Figure 1 the cursor location is approximately sample 13000, as indicated by the solid red bar.
2.
In another specified graphics frame (or figure sub-frame) the speech signal is plotted over a region of about 1000 samples on either side of the cursor location from the previous step; i.e., from sample 12000 to sample 14000. A second graphics cursor appears, allowing the user to move the cursor to the exact starting sample of interest (to within the resolution of the display) for specifying the current analysis frame, as illustrated in the middle graphics panel of Figure 1. Here the cursor is again shown in the area of sample 13000.
3. The current analysis frame is then defined as the frame of speech from the starting sample of step 2 minus half the window length, to the starting sample of step 2 plus half the window length. The designated analysis frame is then weighted by the analysis window (Hamming in the case here) and plotted in the bottom graphics panel.

It should be clear that the three steps of the above process for choosing an analysis frame can be implemented in either a single graphics panel or frame (by simply overwriting the graphics panel with the new speech signal) or in a series of graphics panels or frames. The current exercise uses one of the 2 graphics panels and overwrites the speech waveform plot at each step of the analysis. This process is a very useful and efficient one for choosing a region of interest within the speech signal, and then homing in on a particular analysis frame using the steps outlined above.

Figure 1: Sequence of waveform plots defining how the user can interactively choose a starting sample for the current analysis frame.

AMDF Calculation – Scripted Run

A scripted run of the program amdf_GUI25.m is as follows:
1. run the program amdf_GUI25.m from the directory matlab_gui\amdf,
2. hit the pushbutton Directory; this will initiate a system call to locate and display the file system for the directory 'speech files',
3.
using the popupmenu button, select the speech file for short-time feature analysis; choose the file we were away a year ago suzanne.wav for this example,
4. hit the pushbutton Play Speech File to play the speech file being processed,
5. using the editable buttons, choose the default values for the frame length, Lm (40 msec), for the frame shift, Rm (10 msec), and for the AMDF threshold, amdfthresh (0.6),
6. using the popupmenu button, specify the gender of the talker as female to modify the search region for pitch period estimates,
7. hit the Get Frame Starting Sample button to interactively choose the initial analysis frame starting sample, ss, using the interactive method described above; try to choose the starting sample as close as possible to 3010 so as to match the plotted results for this example exercise,
8. hit the Run Current Frame button to initiate single frame analysis of the speech beginning at the current frame starting sample, ss; the results of AMDF analysis are shown in the various graphical plots; the Run Current Frame button can be hit repeatedly after making changes in the analysis frame parameters; a red vertical line indicates the estimate of pitch period for the current frame of speech,
9. hit the Run Next Frame button to initiate single frame analysis on the next frame of speech, i.e., where the starting sample of the next frame is set to ss+R, where R is the frame shift in samples,
10. hit the Run Pitch Detector button to run in pitch detection mode; the resulting pitch period contour is displayed in the upper graphics panel, and the AMDF minimum value (which serves as a confidence score) is displayed in the lower graphics panel,
11. experiment with different choices of speech file, and with different values for Lm, Rm, gender and amdfthresh,
12.
hit the Close GUI button to terminate the run.

Examples of the graphical output obtained from this exercise using the speech file we were away a year ago suzanne.wav are shown in Figure 2 (for the frame mode graphics), and in Figure 3 (for the pitch detection mode graphics). The displays for the frame mode graphics are the short-time analysis frame (upper graphics panel) and the short-time AMDF (lower graphics panel). The displays for the pitch detection mode graphics are the smoothed pitch period contour (upper graphics panel) and the smoothed confidence score contour (lower graphics panel).

Figure 2: Plots of short-time AMDF analysis in frame mode: the upper graphics panel shows the current speech frame, and the lower graphics panel shows the short-time AMDF with the best estimate of pitch period denoted by a red vertical line at the pitch period location.

Figure 3: Plots of AMDF pitch detector output. The upper graphics panel shows the smoothed pitch period contour. The lower graphics panel shows the smoothed confidence scores.

AMDF Calculation – Issues for Experimentation

1. run the scripted exercise above (remembering to set the male/female switch to female for this speech file), and answer the following:
   • at what non-zero lag does the AMDF estimate attain a minimum?
   • what is the pitch frequency estimate for this frame of speech?
2. run the exercise in the Pitch Detector mode by hitting the Run Pitch Detector button; a new plot is generated showing the pitch period contour along with the AMDF score, which is used as a measure of confidence (the closer to zero the AMDF is at its minimum, the higher the confidence in the pitch period estimate):
   • how smooth is the resulting pitch period contour from the AMDF estimates?
   • what percentage of the frames have minimum AMDF estimates whose level is below 0.6?
   • how correlated are high AMDF minimum value scores with unreliable pitch period estimates?
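
Before answering the questions above, it may help to see the core computation in isolation. The following is a minimal sketch, not the code in amdf_GUI25.m: it builds a synthetic periodic frame with a known 200 Hz pitch, computes the AMDF over a range of lags, and takes the constrained minimum as the pitch period estimate. The sampling rate, the 120-400 Hz search range, the peak normalization of the AMDF, and all variable names other than Lm and amdfthresh are assumptions made for illustration. The log-energy test and the period doubling/tripling correction described earlier are not included; the search range here is deliberately narrow enough to exclude the lag at twice the true period.

```matlab
% Minimal AMDF pitch-period sketch on a synthetic frame.
% Assumed for illustration: fs, the search range, and the normalization.
fs = 10000;                         % sampling rate in Hz (assumed)
Lm = 40;                            % frame duration in msec (exercise default)
L  = round(Lm * fs / 1000);         % frame length in samples
f0 = 200;                           % known pitch of the synthetic signal (Hz)
n  = (0:L-1)';
frame = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);  % periodic test frame

% Assumed pitch-period search range in samples (not the GUI's actual values)
pdmin = round(fs / 400);            % shortest period searched (400 Hz)
pdmax = round(fs / 120);            % longest period searched (120 Hz)

% AMDF: mean absolute difference between the frame and its k-sample shift
amdf = zeros(pdmax, 1);
for k = 1:pdmax
    amdf(k) = mean(abs(frame(k+1:L) - frame(1:L-k)));
end
amdf = amdf / max(amdf);            % normalize so the largest value is 1 (assumed)

% Constrained minimum: search only lags inside [pdmin, pdmax]
[amdfmin, idx] = min(amdf(pdmin:pdmax));
period = pdmin + idx - 1;           % pitch period estimate in samples

amdfthresh = 0.6;                   % voiced if the AMDF minimum is below threshold
voiced  = amdfmin < amdfthresh;
pitchHz = fs / period;              % pitch frequency estimate in Hz
fprintf('period = %d samples, pitch = %.1f Hz, voiced = %d\n', ...
        period, pitchHz, voiced);
```

For this synthetic frame the AMDF falls essentially to zero at a lag of 50 samples, giving a pitch estimate of 200 Hz. On real speech the minimum is shallower, which is why its depth can serve as a confidence score.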