
Pattern Recognition Matlab Manual

Aggelos Pikrakis, Sergios Theodoridis, Kostantinos Koutroumbas and Dionisis Cavouras

February 2009

Copyright 2009, Elsevier Inc


Contents

1 Preface

2 Clustering
   2.1 BSAS.m
   2.2 reassign.m
   2.3 GMDAS.m
   2.4 gustkess.m
   2.5 k_means.m
   2.6 possibi.m
   2.7 k_medoids.m
   2.8 spectral_Ncut2.m
   2.9 spectral_Ncut_gt2.m
   2.10 spectral_Ratiocut2.m
   2.11 spectral_Ratiocut_gt2.m
   2.12 competitive.m
   2.13 valley_seeking.m
   2.14 distant_init.m
   2.15 rand_data_init.m
   2.16 rand_init.m
   2.17 cost_comput.m
   2.18 dgk.m
   2.19 dist.m

3 Feature Selection
   3.1 simpleOutlierRemoval.m
   3.2 normalizeStd.m
   3.3 normalizeMnmx.m
   3.4 normalizeSoftmax.m
   3.5 ROC.m
   3.6 divergence.m
   3.7 divergenceBhata.m
   3.8 ScatterMatrices.m
   3.9 ScalarFeatureSelectionRanking.m
   3.10 SequentialBackwardSelection.m
   3.11 SequentialForwardSelection.m
   3.12 exhaustiveSearch.m

4 Image Features
   4.1 generateCoOccMat.m
   4.2 CoOccMatFeatures.m
   4.3 CoOccASM.m
   4.4 CoOccContrast.m
   4.5 CoOccCOR.m
   4.6 CoOccVariance.m
   4.7 CoOccIDM.m
   4.8 CoOccSUA.m
   4.9 CoOccSUV.m
   4.10 CoOccSUE.m
   4.11 CoOccEntropy.m
   4.12 CoOccDEN.m
   4.13 CoOccDVA.m
   4.14 CoOccCIMI.m
   4.15 CoOccCIMII.m
   4.16 CoOccPXandPY.m
   4.17 CoOccPXminusY.m
   4.18 CoOccPxplusY.m
   4.19 ImageHist.m
   4.20 HistMoments.m
   4.21 HistCentralMoments.m
   4.22 LawMasks.m
   4.23 RL_0_90.m
   4.24 RL_45_135.m
   4.25 SRE.m
   4.26 LRE.m
   4.27 GLNU.m
   4.28 RLNU.m
   4.29 RP.m

5 Audio Features
   5.1 sfSpectralCentroid.m
   5.2 sfSpectralRolloff.m
   5.3 sfFundAMDF.m
   5.4 sfFundAutoCorr.m
   5.5 sfFundCepstrum.m
   5.6 sfFundFreqHist.m
   5.7 sfMFCCs.m
   5.8 computeMelBank.m
   5.9 stEnergy.m
   5.10 stZeroCrossingRate.m
   5.11 stSpectralCentroid.m
   5.12 stSpectralRolloff.m
   5.13 stSpectralFlux.m
   5.14 stFundAMDF.m
   5.15 stMelCepstrum.m
   5.16 stFundFreqHist.m
   5.17 stFundAutoCorr.m
   5.18 stFundCepstrum.m
   5.19 stFourierTransform.m

6 Dynamic Time Warping
   6.1 editDistance.m
   6.2 DTWSakoe.m
   6.3 DTWSakoeEndp.m
   6.4 DTWItakura.m
   6.5 DTWItakuraEndp.m
   6.6 BackTracking.m


1 Preface

The Matlab m-files provided here are intended to serve pedagogical needs. We have not covered all the algorithms that are described in the book. We have focused on the most popular methods, as well as on methods that can help the reader become familiar with the basic concepts associated with the different methodologies.

All the m-files have been developed by the authors and, with very few exceptions, we have not included code that is already available in the Matlab toolboxes, e.g., the Statistics Toolbox and the Image Processing Toolbox. Examples are the routines related to Support Vector Machines, the k-NN classifier, etc.

A companion book is currently being developed; it will include short descriptions of the theory as well as a number of Matlab exercises. That book will be based on the routines given here, as well as on those available in the Matlab toolboxes.

We would appreciate any comments from readers, and we will be happy to try to accommodate them.

A. Pikrakis
S. Theodoridis
K. Koutroumbas
D. Cavouras


2 Clustering

2.1 BSAS.m

• Syntax : [bel, m]=BSAS(X,theta,q,order)

• Description: This function implements the BSAS (Basic Sequential Algorithmic Scheme) algorithm. It performs a single pass on the data. If the currently considered vector lies at a significant distance (greater than a given dissimilarity threshold) from the clusters formed so far, a new cluster is formed with the current vector being its representative. Otherwise, the considered vector is assigned to its closest cluster. The results of the algorithm are influenced by the order of the presentation of the data.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– theta: the dissimilarity threshold.

– q : the maximum allowable number of clusters.

– order : an N-dimensional vector containing a permutation of the integers 1, 2, ..., N. The i-th element of this vector specifies the order of presentation of the i-th vector to the algorithm.

• Output :

– bel : an N-dimensional vector whose i-th element indicates the cluster where the i-th data vector has been assigned.

– m: a matrix, each column of which contains the l-dimensional (mean) representative of each cluster.
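• Example: a minimal usage sketch on synthetic data; the values theta=2 and q=10 are illustrative assumptions, not recommendations:

X = [randn(2,50) randn(2,50)+4];  % l=2, N=100: two well-separated groups
order = randperm(100);            % random order of presentation
[bel,m] = BSAS(X,2,10,order);     % theta=2, q=10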

2.2 reassign.m

• Syntax : [bel]=reassign(X,m,order)

• Description: This function performs a single pass on the data set and re-assigns the data vectors to their closest clusters, taking into account their distance from the cluster representatives. It may be applied to the clustering produced by BSAS in order to obtain more compact clusters.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– m: the matrix whose columns contain the l-dimensional (mean) representatives of the clusters.

– order : an N-dimensional vector containing a permutation of the integers 1, 2, ..., N. The i-th element of this vector specifies the order of presentation of the i-th vector to the algorithm.

• Output :

– bel : an N-dimensional vector whose i-th element indicates the cluster where the i-th data vector has been assigned.
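• Example: an illustrative continuation of the BSAS sketch of the previous subsection; chaining the two calls in this way is an assumption about typical use:

[bel,m] = BSAS(X,2,10,order);  % initial clustering and representatives
bel = reassign(X,m,order);     % re-assign each vector to its closest representative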


2.3 GMDAS.m

• Syntax : [ap,cp,mv,mc,iter,diffvec]=GMDAS(X,mv,mc,e,maxiter,sed)

• Description: This function implements the GMDAS (Generalized Mixture Decomposition Algorithmic Scheme) algorithm, where each cluster is characterized by a normal distribution. The aim is to estimate the means and the covariance matrices of the distributions characterizing each cluster, as well as the a priori probabilities of the clusters. This is carried out in an iterative manner, which terminates when no significant change in the values of the previous parameters is encountered between two successive iterations. Once more, the number of clusters m is assumed to be known.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– mv : an l × m dimensional matrix, each column of which contains an initial estimate of the mean corresponding to the i-th cluster.

– mc: an l × l × m dimensional matrix whose i-th l × l two-dimensional “slice” is an initial estimate of the covariance matrix corresponding to the i-th cluster.

– e: the threshold that controls the termination of the algorithm. Specifically, the algorithm terminates when the sum of the absolute differences of the “mv”s, “mc”s, and a priori probabilities between two successive iterations is smaller than “e”.

– maxiter : The maximum number of iterations the algorithm is allowed to run.

– sed : The seed used for the initialization of the random generator function “rand”.

• Output :

– ap: an m-dimensional vector whose i-th coordinate contains the a priori probability of the i-th cluster.

– cp: an N × m dimensional matrix whose (i, j) element contains the probability that the i-th vector belongs to the j-th cluster.

– mv : the l × m dimensional matrix, each column of which contains the final estimate of the mean corresponding to the i-th cluster.

– mc: an l × l × m dimensional matrix whose i-th l × l two-dimensional “slice” is the final estimate of the covariance matrix corresponding to the i-th cluster.

– iter : the number of iterations performed by the algorithm.

– diffvec: a vector whose i-th coordinate contains the sum of the absolute differences of the “mv”s, “mc”s, and a priori probabilities between the i-th and the (i − 1)-th iteration.
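• Example: a minimal sketch with synthetic data; the initial estimates and the values e=0.01, maxiter=100, sed=0 are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];  % l=2, N=100 data matrix
mv = rand(2,2);                   % initial means, one column per cluster
mc = repmat(eye(2),[1 1 2]);      % initial covariances: identity slices
[ap,cp,mv,mc,iter,diffvec] = GMDAS(X,mv,mc,0.01,100,0);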

2.4 gustkess.m

• Syntax : [u,c,S,iter]=gustkess(X,u,c,S,q,e)

• Description: This function implements the Gustafson-Kessel algorithm, an algorithm of a fuzzy nature that is able to unravel planar clusters. Once more, the number of clusters, m, is a prerequisite for the algorithm. The j-th cluster is represented by a center c(:,j) and a covariance matrix S(:,:,j). The distance of a point X(:,i) from the j-th cluster is a weighted form of the Mahalanobis distance and is a function of X(:,i), c(:,j) and S(:,:,j). The algorithm aims at grouping points that lie around a line (a hyperplane, in general) into the same cluster via iterative adjustment of the cluster parameters (centers and covariances).


• Input :

– X : an l × N dimensional matrix each column of which corresponds to a data vector.

– u: an N × m dimensional matrix whose (i, j) element is an initial estimate of the “grade of membership” of the i-th data vector to the j-th cluster (all elements of u are in the interval [0, 1] and the entries of each row sum up to unity).

– c: an l × m dimensional matrix whose j-th column is an initial estimate of the center for the j-th cluster.

– S : an l × l × m dimensional matrix whose j-th l × l two-dimensional “slice” is an initial estimate of the covariance for the j-th cluster.

– q : the fuzzifier parameter.

– e: the parameter used in the termination criterion of the algorithm (the algorithm terminates when the summation of the absolute differences of u’s between two successive iterations is less than e).

• Output :

– u: an N × m dimensional matrix with the final estimates of the grades of membership of each vector to each cluster.

– c: an l × m dimensional matrix with the final estimates of the centers of the clusters.

– S : an l × l × m dimensional matrix whose j-th l × l two-dimensional “slice” is the final estimate of the covariance for the j-th cluster.

– NOTE : This function calls dgk.m for the computation of the distance of a point from a cluster.
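• Example: a minimal sketch; all initial estimates and the values q=2, e=0.001 are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];               % l=2, N=100 data matrix
u = rand(100,2); u = u./repmat(sum(u,2),1,2);  % random memberships, rows sum to unity
c = rand(2,2);                                 % initial centers
S = repmat(eye(2),[1 1 2]);                    % initial covariance slices
[u,c,S,iter] = gustkess(X,u,c,S,2,0.001);      % q=2 (fuzzifier), e=0.001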

2.5 k_means.m

• Syntax : [w,bel]=k_means(X,w)

• Description: This function implements the k-means algorithm, which requires as input the number of clusters underlying the data set. The algorithm starts with an initial estimate of the cluster representatives and iteratively tries to move them into regions that are “dense” in data vectors, so that a suitable cost function is minimized. This is achieved by performing (usually) a few passes on the data set. The algorithm terminates when the values of the cluster representatives remain unaltered between two successive iterations.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– w : a matrix, whose columns contain the l-dimensional (mean) representatives of the clusters.

• Output :

– w : a matrix, whose columns contain the final estimates of the representatives of the clusters.

– bel : an N-dimensional vector, whose i-th element indicates the cluster where the i-th data vector has been assigned.
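• Example: a minimal sketch; initializing w with rand_data_init.m (Section 2.15) is one possible choice:

X = [randn(2,50) randn(2,50)+4];  % l=2, N=100 data matrix
w = rand_data_init(X,2,0);        % m=2 representatives drawn from X, seed 0
[w,bel] = k_means(X,w);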


2.6 possibi.m

• Syntax : [U,w]=possibi(X,m,eta,q,sed,init_proc,e_thres)

• Description: Implements the possibilistic algorithm, when the squared Euclidean distance is adopted. The algorithm iteratively moves the cluster representatives to regions that are “dense” in data, so that a suitable cost function is minimized. It terminates when no significant difference in the values of the representatives is encountered between two successive iterations. Once more, the number of clusters is required a priori. However, when it is overestimated, the algorithm has the ability to return a solution where more than one representative coincide.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to a data vector.

– m: the number of clusters.

– eta: an m-dimensional array of the eta parameters of the clusters.

– q : the q parameter of the algorithm. When this is not equal to 0, the original cost function is considered, while if it is 0, the alternative one is considered.

– sed : a scalar integer, which is used as the seed for the random generator function “rand”.

– init_proc: an integer taking the value “1”, “2” or “3”, with
  - “1” corresponding to the rand_init.m initialization procedure (this procedure randomly chooses m vectors from the smallest hyperrectangle that contains all the vectors of X and whose sides are parallel to the axes),
  - “2” corresponding to rand_data_init.m (this procedure randomly chooses m vectors among the vectors of X), and
  - “3” corresponding to distant_init.m (this procedure chooses the m vectors of X that are “most distant” from each other; this is a more computationally demanding procedure).

– e_thres: the threshold controlling the termination of the algorithm. Specifically, the algorithm terminates when the sum of the absolute differences of the representatives between two successive iterations is smaller than e_thres.

• Output :

– U : an N × m dimensional matrix, whose (i, j) element denotes the possibility that the i-th data vector belongs to the j-th cluster, after the convergence of the algorithm.

– w : an l × m dimensional matrix, each column of which corresponds to a cluster representative, after the convergence of the algorithm.

– NOTE : This function calls rand_init.m, rand_data_init.m and distant_init.m.
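• Example: a minimal sketch; the values of eta, q, sed, init_proc and e_thres are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];       % l=2, N=100 data matrix
eta = ones(1,2);                       % one eta parameter per cluster
[U,w] = possibi(X,2,eta,2,0,2,0.001);  % m=2, q=2, sed=0, init_proc=2, e_thres=0.001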

2.7 k_medoids.m

• Syntax : [bel,cost,w,a]=k_medoids(X,m,sed)

• Description: This function implements the k-medoids algorithm. The aim of this algorithm is the same as that of k-means, i.e., to iteratively move the cluster representatives to regions that are “dense” in data, so that a suitable cost function is minimized. However, now the representatives are constrained to be vectors of the data set. The algorithm terminates when no change in the representatives is encountered between two successive iterations.

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– m: the number of clusters.

– sed : a scalar integer, which is used as the seed for the random generator function “rand”.

• Output :

– bel : an N-dimensional vector, whose i-th element contains the cluster to which the i-th data vector is assigned, after the convergence of the algorithm.

– cost : a scalar which is the summation of the distances of each data vector from its closest representative, computed after the convergence of the algorithm.

– w : an l × m dimensional matrix, each column of which corresponds to a cluster representative, after the convergence of the algorithm.

– a: an m-dimensional vector containing the indices of the data vectors that are used as representatives.

– NOTE : This function calls cost_comput.m to compute the cost associated with a specific partition.
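• Example: a minimal sketch on synthetic data; m=2 and sed=0 are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];    % l=2, N=100 data matrix
[bel,cost,w,a] = k_medoids(X,2,0);  % w(:,j) coincides with the data vector X(:,a(j))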

2.8 spectral_Ncut2.m

• Syntax : [bel]=spectral_Ncut2(X,e,sigma2)

• Description: This function performs spectral clustering based on the N × N dimensional normalized graph Laplacian L1, produced from an l × N dimensional matrix X, each column of which corresponds to a data vector. The number of clusters in this case is fixed to 2. The algorithm determines the N-dimensional eigenvector that corresponds to the 2nd smallest eigenvalue of L1. A new N-dimensional vector y is produced by multiplying the above eigenvector with a suitable matrix D. Finally, the elements of y are divided into two groups according to whether they are greater or less than the median value of y. This division specifies the clustering of the vectors in the original data set X. The algorithm minimizes the so-called Ncut criterion.

• Input :

– X : an l × N dimensional matrix, each column of which is a data vector.

– e: the parameter that defines the size of the neighborhood around each vector.

– sigma2 : the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

• Output :

– bel : an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.
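• Example: a minimal sketch; the neighborhood size e=1.5 and kernel width sigma2=2 are illustrative assumptions that generally need tuning for each data set:

X = [randn(2,50) randn(2,50)+4];  % l=2, N=100 data matrix
bel = spectral_Ncut2(X,1.5,2);    % the two resulting clusters are encoded in bel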


2.9 spectral_Ncut_gt2.m

• Syntax : [bel]=spectral_Ncut_gt2(X,e,sigma2,k)

• Description: This function performs spectral clustering based on the N × N dimensional normalized graph Laplacian L1, produced from an l × N dimensional matrix X, each column of which corresponds to a data vector. The number of clusters, k, is again assumed to be known (it may be greater than or equal to 2). The algorithm determines an N × k dimensional matrix U, the j-th column of which corresponds to the j-th smallest eigenvalue of L1. Matrix V is produced by multiplying U with a suitable matrix. Then, the i-th data vector X(:,i) is mapped to the i-th row vector U(i,:) of U. Finally, the data vectors X(:,i) are clustered based on the clustering produced by the k-means algorithm applied on the U(i,:)’s. The algorithm minimizes the so-called Ncut criterion.

• Input :

– X : an l × N dimensional matrix, each column of which is a data vector.

– e: the parameter that defines the size of the neighborhood around each vector.

– sigma2 : the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

– k : the number of clusters.

• Output :

– bel : an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.

2.10 spectral_Ratiocut2.m

• Syntax : [bel]=spectral_Ratiocut2(X,e,sigma2)

• Description: This function performs spectral clustering based on the N × N dimensional unnormalized graph Laplacian L, produced from an l × N dimensional matrix X, each column of which corresponds to a data vector. The number of clusters in this case is fixed to 2. The algorithm determines the N-dimensional eigenvector y that corresponds to the 2nd smallest eigenvalue of L. The elements of y are divided into two groups according to whether they are greater or less than the median value of y. This division specifies the clustering of the vectors in the original data set X. The algorithm minimizes the so-called Ratiocut criterion.

• Input :

– X : an l × N dimensional matrix, each column of which is a data vector.

– e: the parameter that defines the size of the neighborhood around each vector.

– sigma2 : the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

• Output :

– bel : an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.


2.11 spectral_Ratiocut_gt2.m

• Syntax : [bel]=spectral_Ratiocut_gt2(X,e,sigma2,k)

• Description: This function performs spectral clustering based on the N × N dimensional unnormalized graph Laplacian L, produced from an l × N dimensional matrix X, each column of which corresponds to a data vector. The number of clusters, k, is again assumed to be known (it may be greater than or equal to 2). The algorithm determines an N × k dimensional matrix V, the j-th column of which corresponds to the j-th smallest eigenvalue of L. Then, the i-th data vector X(:,i) is mapped to the i-th row vector V(i,:) of V. Finally, the data vectors X(:,i) are clustered based on the clustering produced by the k-means algorithm applied on the V(i,:)’s. The algorithm minimizes the so-called Ratiocut criterion.

• Input :

– X : an l × N dimensional matrix, each column of which is a data vector.

– e: the parameter that defines the size of the neighborhood around each vector.

– sigma2 : the parameter that controls the width of the Gaussian kernel (here, only the case where all the kernels have the same sigma2 parameter is considered).

– k : the number of clusters.

• Output :

– bel : an N-dimensional vector whose i-th element contains the index of the cluster to which the i-th data vector is assigned.

2.12 competitive.m

• Syntax : [w,bel,epoch]=competitive(X,w_ini,m,eta_vec,sed,max_epoch,e_thres,init_proc)

• Description: This function implements the competitive leaky learning algorithm. It is an iterative algorithm where the representatives are updated after the presentation of each data vector X(:,i). Specifically, all representatives move towards X(:,i). However, the learning rate for the one that lies closest to X(:,i) (the winner) is much higher than the learning rate for the remaining representatives (the losers). As a consequence, the closest representative moves much closer to X(:,i) than the remaining ones. In this way, the representatives are moved towards regions that are “dense” in data vectors. The number of representatives is assumed to be known. The basic competitive learning scheme (where only the closest representative moves towards the current vector) can be viewed as a special case of the leaky learning algorithm where the learning rate for the losers is set equal to 0.
IMPORTANT NOTE: In this implementation,
(a) the vectors are presented in random order within each epoch, and
(b) the termination condition of the algorithm is “the clustering remains unaltered during two successive epochs”.

• Input :

– X : an l × N dimensional matrix containing the data points.

– w_ini : an l × m dimensional matrix containing the initial estimates of the representatives. If it is empty, the representatives are initialized by the algorithm.


– m: the number of representatives. This is utilized only when w_ini is empty.

– eta_vec: a 2-dimensional parameter vector whose first component is the learning rate for the winning representative, while its second component is the learning rate for all the remaining representatives.

– sed : a seed for the rand function of MATLAB.

– max_epoch: the maximum number of epochs.

– e_thres: the parameter used in the termination condition (in this version of the algorithm its value is of no importance).

– init_proc: an integer taking the value “1”, “2” or “3”, with
  - “1” corresponding to the rand_init.m initialization procedure (this procedure randomly chooses m vectors from the smallest hyperrectangle that contains all the vectors of X and whose sides are parallel to the axes),
  - “2” corresponding to rand_data_init.m (this procedure randomly chooses m vectors among the vectors of X), and
  - “3” corresponding to distant_init.m (this procedure chooses the m vectors of X that are “most distant” from each other; this is a more computationally demanding procedure).
  This choice is activated only if the user does not provide the initial conditions.

• Output :

– w : an l × m dimensional matrix containing the final estimates of the representatives.

– bel : an N-dimensional vector, whose i-th element contains the index of the representative closest to X(:,i).

– epoch: the number of epochs required for convergence.

– NOTE : This function calls rand_init.m, rand_data_init.m and distant_init.m.
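• Example: a minimal sketch with algorithm-side initialization; all parameter values are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];  % l=2, N=100 data matrix
eta_vec = [0.1 0.001];            % winner and loser learning rates
[w,bel,epoch] = competitive(X,[],2,eta_vec,0,100,0,2);  % empty w_ini, m=2, init_proc=2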

2.13 valley_seeking.m

• Syntax : [bel,iter]=valley_seeking(X,a,bel_ini,max_iter)

• Description: This function implements the valley-seeking algorithm. The algorithm starts with an initial clustering of the data vectors and iteratively adjusts it in order to identify the regions that are “dense” in data (which correspond to the physically formed clusters). Specifically, at each iteration and for each data vector X(:,i), its closest neighbors are considered, and X(:,i) is assigned to the cluster that has the most vectors among the neighbors of X(:,i). The algorithm terminates when no point is reassigned to a different cluster between two successive iterations.

• Input :

– X : an l × N dimensional data matrix whose columns correspond to the data vectors.

– a: a parameter that specifies the size of the neighborhood.

– bel_ini : an N-dimensional vector whose i-th coordinate contains the index of the cluster to which the i-th vector is initially assigned.

– max_iter : a parameter that specifies the maximum allowable number of iterations.


• Output :

– bel : an N-dimensional vector, which has the same structure as bel_ini, described above.

– iter : the number of iterations performed until convergence is achieved.

– NOTES :
  - This function calls dist.m, which computes the squared Euclidean distance between two vectors.
  - It is assumed that the cluster indices are in the set {1, 2, . . . , m}.
  - The algorithm is extremely sensitive to the parameter settings (a and bel_ini). It can be used after, e.g., a sequential algorithm; the valley-seeking algorithm can then take over, using that clustering as the initial condition, in order to identify the true number of clusters.
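• Example: a minimal sketch in which, as suggested in the notes above, the initial clustering is produced by a sequential algorithm (BSAS); the values a=1 and max_iter=50 are illustrative assumptions:

X = [randn(2,50) randn(2,50)+4];            % l=2, N=100 data matrix
[bel_ini,m0] = BSAS(X,2,10,randperm(100));  % rough initial clustering
[bel,iter] = valley_seeking(X,1,bel_ini,50);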

2.14 distant_init.m

• Syntax : [w]=distant_init(X,m,sed)

• Description: This function chooses the m vectors that are “most distant” from each other among the N vectors contained in a data set X. Specifically, the mean vector of the N data vectors of X is computed, and the vector of X that lies furthest from the mean is assigned first to the set w containing the “most distant” points. The i-th element of w is selected via the following steps: (a) the minimum distance of each of the N − i + 1 points of X − w from the i − 1 vectors already in w is computed, and (b) the point with the maximum of the above N − i + 1 computed minimum distances joins w as its i-th element.

• Input :

– X : an l × N dimensional matrix, whose columns are the data vectors.

– m: the number of vectors to be chosen.

– sed : the seed for the random number generator (in this case it does not affect the results of the algorithm).

• Output :

– w : an l × m dimensional matrix, whose columns are the selected “most distant” vectors.

2.15 rand_data_init.m

• Syntax : [w]=rand_data_init(X,m,sed)

• Description: This function randomly chooses m vectors among the N vectors contained in a data set X.

• Input :

– X : an l × N dimensional matrix, whose columns are the data vectors.

– m: the number of vectors to be chosen.

– sed : the seed for the random number generator.

• Output :

– w : an l × m dimensional matrix, whose columns are the randomly selected vectors of X.


2.16 rand_init.m

• Syntax : [w]=rand_init(X,m,sed)

• Description: This function randomly chooses m vectors from the smallest hyperrectangle whose edges are parallel to the axes and which contains all the vectors of a given data set X.

• Input :

– X : an l × N dimensional matrix, whose columns are the data vectors.

– m: the number of vectors to be chosen.

– sed : the seed for the random number generator.

• Output :

– w : an l × m dimensional matrix, whose columns are the randomly selected vectors.

2.17 cost_comput.m

• Syntax : [bel,cost]=cost_comput(X,w)

• Description: This is an auxiliary function, called by the k_medoids function. Its aim is twofold: (a) it computes the value of the cost function employed by the k-medoids algorithm, i.e., the summation of the distances of each data vector from its closest representative, and (b) it assigns each vector X(:,i) to the cluster whose representative lies closest to X(:,i).

• Input :

– X : an l × N dimensional matrix, each column of which corresponds to an l-dimensional data vector.

– w : an l × m dimensional matrix, each column of which corresponds to a cluster representative.

• Output :

– bel : an N-dimensional vector, whose i-th element contains the cluster to which the i-th data vector is assigned.

– cost : a scalar which is the summation of the distances of each data vector from its closest representative.

– NOTE : This function calls dist.m, which computes the squared Euclidean distance between two vectors.

2.18 dgk.m

• Syntax : [z]=dgk(x,c,S)

• Description: This function determines the distance, used in the G-K algorithm, between a point x and a cluster characterized by center c and covariance S.

• Input :


– x : an l-dimensional column vector.

– c: the center of the cluster at hand.

– S : the covariance of the cluster at hand.

• Output :

– z : the distance between the point and the cluster, as it is defined in the framework of the Gustafson-Kessel algorithm.

2.19 dist.m

• Syntax : [z]=dist(x,y)

• Description: Computes the squared Euclidean distance between two column vectors of equal length.

• Input :

– x, y : two column vectors of equal length.

• Output :

– z : the squared Euclidean distance between the vectors x and y.

– NOTE : It is called from the cost_comput.m function.


3 Feature Selection

3.1 simpleOutlierRemoval.m

• Syntax : [outls,Index,dat]=simpleOutlierRemoval(dat,ttimes)

• Description: Detects and removes outliers from a normally distributed data set by means of a thresholding technique. The threshold depends on the median and std values of the data set.

• Input :

– dat : holds the normally distributed data.

– ttimes: sets the outlier threshold.

• Output :

– outls: outliers that have been detected.

– Index : indices of the outliers in the input dat matrix.

– dat : the reduced data set, after the outliers have been removed.
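• Example: a minimal sketch with one planted outlier; ttimes=3 (roughly a “three-sigma” rule) is an illustrative assumption:

dat = randn(1,200); dat(1) = 15;  % normal data plus one planted outlier
[outls,Index,dat] = simpleOutlierRemoval(dat,3);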

3.2 normalizeStd.m

• Syntax : [class1n,class2n]=normalizeStd(class1,class2)

• Description: Data normalization to zero mean and standard deviation 1.

• Input :

– class1 : row vector of data for class1.

– class2 : row vector of data for class2.

• Output :

– class1n: row vector of normalized data for class1.

– class2n: row vector of normalized data for class2.
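• Example: a minimal sketch on two synthetic one-dimensional classes:

class1 = 10 + 2*randn(1,100);
class2 = 14 + 2*randn(1,100);
[class1n,class2n] = normalizeStd(class1,class2);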

3.3 normalizeMnmx.m

• Syntax : [c1,c2]=normalizeMnmx(class1,class2,par1,par2)

• Description: A linear technique for data normalization that limits the feature values to the range [par1, par2] (e.g., [-1, 1]) by proper scaling.

• Input :

– class1 : row vector of data for class1.

– class2 : row vector of data for class2.

– par1 : desired minimum value.

– par2 : desired maximum value.


• Output :

– c1 : normalized data for class1.

– c2 : normalized data for class2.

3.4 normalizeSoftmax.m

• Syntax : [c1,c2]=normalizeSoftmax(class1,class2,r)

• Description: Softmax normalization. This is basically a squashing function limiting the data to the range [0, 1]. The range of values that corresponds to the linear section depends on the standard deviation and the input parameter r. Values away from the mean are squashed exponentially.

• Input :

– class1 : row vector of data for class1.

– class2 : row vector of data for class2.

– r : factor r affects the range of values that corresponds to the linear section (e.g., r=0.5).

• Output :

– c1 : normalized data for class1.

– c2 : normalized data for class2.

3.5 ROC.m

• Syntax : [auc]=ROC(x,y)

• Description: Plots the ROC curve and computes the area under the curve.

• Input :

– x : row vector of data for both classes.

– y : row vector with data labels. Each element is -1 or 1.

• Output :

– auc: area under ROC curve.
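• Example: a minimal sketch for a single feature measured on two classes of 100 patterns each:

x = [randn(1,100) randn(1,100)+1];  % feature values for both classes
y = [-ones(1,100) ones(1,100)];     % the corresponding class labels
auc = ROC(x,y);                     % plots the curve and returns the area under it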

3.6 divergence.m

• Syntax : [D]=divergence(c1,c2)

• Description: Computes the divergence between two classes.

• Input :

– c1 : data of the first class, one pattern per column.

– c2 : data of the second class, one pattern per column.

• Output :

– D : value of divergence.


3.7 divergenceBhata.m

• Syntax : [D]=divergenceBhata(c1,c2)

• Description: Computes the Bhattacharyya distance between two classes.

• Input :

– c1 : data of the first class, one pattern per column.

– c2 : data of the second class, one pattern per column.

• Output :

– D : Bhattacharyya distance.

3.8 ScatterMatrices.m

• Syntax : [J]=ScatterMatrices(class1,class2)

• Description: Computes the distance measure (J3) between two classes with scattered (non-Gaussian) feature samples.

• Input :

– class1 : data of the first class, one pattern per column.

– class2 : data of the second class, one pattern per column.

• Output :

– J : distance measure (J3) which is computed from the within-class and mixture scatter matrices.

3.9 ScalarFeatureSelectionRanking.m

• Syntax : [T]=ScalarFeatureSelectionRanking(c1,c2,sepMeasure)

• Description: Features are treated individually and are ranked according to the adopted separability criterion (given in sepMeasure).

• Input :

– c1 : matrix of data for the first class, one pattern per column.

– c2 : matrix of data for the second class, one pattern per column.

– sepMeasure: class separability criterion. Possible parameter values are ’t-test’, ’divergence’, ’Bhata’, ’ROC’, ’Fisher’.

• Output :

– T : feature ranking matrix. The first column contains the class separability costs and the second column the respective feature ids.
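• Example: a minimal sketch ranking three features with the t-test criterion; the synthetic class means are illustrative assumptions:

c1 = randn(3,100);                            % 3 features, 100 patterns per class
c2 = randn(3,100) + repmat([1;0.5;0],1,100);  % only the first two features are informative
T = ScalarFeatureSelectionRanking(c1,c2,'t-test');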


3.10 SequentialBackwardSelection.m

• Syntax : [cLbestOverall,JMaxOverall]=SequentialBackwardSelection(class1,class2)

• Description: Feature vector selection by means of the Sequential Backward Selection technique. Uses the function ScatterMatrices to compute the class separability measure.

• Input :

– class1 : matrix of data for the first class, one pattern per column.

– class2 : matrix of data for the second class, one pattern per column.

• Output :

– cLbestOverall : Selected feature subset. Row vector of feature ids.

– JMaxOverall : class separability cost derived from the scatter-matrices measure.

3.11 SequentialForwardSelection.m

• Syntax : [cLbestOverall,JMaxOverall]=SequentialForwardSelection(class1,class2)

• Description: Feature vector selection by means of the Sequential Forward Selection technique. Uses the function ScatterMatrices for the class separability measure.

• Input :

– class1 : matrix of data for the first class, one pattern per column.

– class2 : matrix of data for the second class, one pattern per column.

• Output :

– cLbestOverall : Selected feature subset. Row vector of feature ids.

– JMaxOverall : class separability cost derived from the scatter-matrices measure.

3.12 exhaustiveSearch.m

• Syntax : [cLbest,Jmax]=exhaustiveSearch(class1,class2,CostFunction)

• Description: Exhaustive search for the best feature combination, depending on the adopted class separability measure (given in CostFunction).

• Input :

– class1 : data of the first class, one pattern per column.

– class2 : data of the second class, one pattern per column.

– CostFunction: Possible choices are ’divergence’, ’divergenceBhata’, ’ScatterMatrices’.

• Output :

– cLbest : the best feature combination. Row vector of feature ids.

– Jmax : value of the adopted cost function for the best feature combination.
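• Example: a minimal sketch using the scatter-matrices criterion on four synthetic features:

class1 = randn(4,100);
class2 = randn(4,100) + 0.8;
[cLbest,Jmax] = exhaustiveSearch(class1,class2,'ScatterMatrices');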


4 Image Features

4.1 generateCoOccMat.m

• Syntax : [P_0,P_45,P_90,P_135]=generateCoOccMat(dat,Ng)

• Description: Generates four co-occurrence matrices corresponding to the directions of 0, 45, 90 and 135 degrees.

• Input :

– dat : gray-level image (matrix).

– Ng : the desired number of gray levels; the image is reduced to Ng gray levels as a preprocessing step.

• Output :

– P_0, P_45, P_90, P_135 : the four co-occurrence matrices.

– NOTE : This function differs from Matlab’s graycomatrix.m because it implements a two-way scan for each direction, i.e., for the horizontal direction the data are scanned both from left-to-right and from right-to-left.
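• Example: a minimal sketch on a random gray-level image; requantizing to Ng=16 levels is an illustrative assumption:

dat = round(255*rand(64,64));  % a synthetic 8-bit gray-level image
[P_0,P_45,P_90,P_135] = generateCoOccMat(dat,16);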

4.2 CoOccMatFeatures.m

• Syntax : [features]=CoOccMatFeatures(CoMat)

• Description: Computes a total of 13 image features, given a co-occurrence matrix. Calls 13 functions, one per feature. These 13 functions are described in the sequel.

• Input :

– CoMat : the co-occurrence matrix.

• Output :

– features: a feature vector of 13 features.
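• Example: an illustrative continuation of the sketch in Section 4.1; averaging the feature vectors over the four directional matrices is one common, though not mandatory, choice:

features = (CoOccMatFeatures(P_0) + CoOccMatFeatures(P_45) + ...
            CoOccMatFeatures(P_90) + CoOccMatFeatures(P_135))/4;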

4.3 CoOccASM.m

• Syntax : [ASM]=CoOccASM(M)

• Description: Computes the Angular Second Moment, given a co-occurrence matrix. This feature is a measure of the smoothness of the image.

• Input :

– M : the co-occurrence matrix.

• Output :

– ASM : the value of the Angular Second Moment.


4.4 CoOccContrast.m

• Syntax : [CON]=CoOccContrast(M)

• Description: Computes the Contrast, given a co-occurrence matrix. This is a measure of the image contrast, i.e., a measure of local gray-level variations.

• Input :

– M : the co-occurrence matrix.

• Output :

– CON : the value of contrast.

4.5 CoOccCOR.m

• Syntax : [COR]=CoOccCOR(M)

• Description: Computes the Correlation, given a co-occurrence matrix. This feature is a measure of the gray-level linear dependencies of an image.

• Input :

– M : the co-occurrence matrix.

• Output :

– COR: the value of Correlation.

4.6 CoOccVariance.m

• Syntax : [variance]=CoOccVariance(M)

• Description: Computes the Variance of a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– variance: the value of Variance.

4.7 CoOccIDM.m

• Syntax : [INV]=CoOccIDM(M)

• Description: Computes the Inverse Difference Moment of a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– INV : the value of the Inverse Difference moment.


4.8 CoOccSUA.m

• Syntax : [SUA]=CoOccSUA(M)

• Description: Computes the Sum (Difference) Average of a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– SUA: the value of the Sum (Difference) Average.

4.9 CoOccSUV.m

• Syntax : [SUV]=CoOccSUV(M)

• Description: Computes the Sum Variance of a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– SUV : the value of Sum Variance.

4.10 CoOccSUE.m

• Syntax : [SUE]=CoOccSUE(M)

• Description: Computes the Sum Entropy of a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– SUE : the value of Sum Entropy.

4.11 CoOccEntropy.m

• Syntax : [entropy]=CoOccEntropy(M)

• Description: Computes the Entropy of a co-occurrence matrix. Entropy is a measure of randomness and takes low values for smooth images.

• Input :

– M : the co-occurrence matrix.

• Output :

– entropy : the value of Entropy.


4.12 CoOccDEN.m

• Syntax : [DEN]=CoOccDEN(M)

• Description: Computes the Difference Entropy, given a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– DEN : the value of Difference Entropy.

4.13 CoOccDVA.m

• Syntax : [DVA]=CoOccDVA(M)

• Description: Computes the Difference Variance, given a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– DVA: the value of Difference Variance.

4.14 CoOccCIMI.m

• Syntax : [CIMI]=CoOccCIMI(M)

• Description: Computes the Information Measure I, given a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– CIMI : the value of the Information Measure I.

4.15 CoOccCIMII.m

• Syntax : [CIMII]=CoOccCIMII(M)

• Description: Computes the Information Measure II, given a co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– CIMII : the value of the Information Measure II.


4.16 CoOccPXandPY.m

• Syntax : [px,py]=CoOccPXandPY(M,px,py)

• Description: It is used by other feature generation functions. It computes the marginal probability matrices by summing the rows (px) and the columns (py) of the co-occurrence matrix.

• Input :

– M : the co-occurrence matrix.

• Output :

– px : vector formed by sums of rows of co-occurrence matrix.

– py : vector formed by sums of columns of co-occurrence matrix.

4.17 CoOccPXminusY.m

• Syntax : [Px_minus_y]=CoOccPXminusY(M)

• Description: It is used by other feature generation functions. It generates probability matrices by scanning the diagonals of the co-occurrence matrix at 135 degrees.

• Input :

– M : the co-occurrence matrix.

• Output :

– Px_minus_y : vector formed by the sums of the diagonals of M at the direction of 135 degrees.

4.18 CoOccPxplusY.m

• Syntax : [Px_plus_y]=CoOccPxplusY(M)

• Description: It is used by other feature generation functions. It generates probability matrices by scanning the diagonals of the co-occurrence matrix at 45 degrees.

• Input :

– M : the co-occurrence matrix.

• Output :

– Px_plus_y : vector formed by the sums of the diagonals of M at the direction of 45 degrees.


4.19 ImageHist.m

• Syntax : [h]=ImageHist(A,Ng)

• Description: Generates the histogram of a graylevel image for Ng levels of gray.

• Input :

– A: gray-level image (matrix).

– Ng : desired number of gray levels.

• Output :

– h: the histogram of the image (an Ng × 1 vector).

4.20 HistMoments.m

• Syntax : [feat]=HistMoments(dat,mom)

• Description: Computes the moment of order mom from the histogram of a gray-level image.

• Input :

– dat : graylevel image (matrix).

– mom: order of moment (integer ≥1).

• Output :

– feat : the value of moment of order mom.

4.21 HistCentralMoments.m

• Syntax : [feat]=HistCentralMoments(dat,c_mom)

• Description: Computes the central moment of order c_mom from the histogram of a gray-level image.

• Input :

– dat : graylevel image (matrix).

– c_mom: order of moment (integer ≥ 1).

• Output :

– feat : the value of the central moment of order c_mom.


4.22 LawMasks.m

• Syntax : [A]=LawMasks(kernelLength)

• Description: Generates the Laws masks [Laws 80], given the kernel length (3 or 5).

• Input :

– kernelLength: length of kernels (3 or 5).

• Output :

– A: Cell array with 9 or 16 masks (matrices), depending on the adopted kernel length.

4.23 RL_0_90.m

• Syntax : [Q]=RL_0_90(m,N_runs,Ng,degrees)

• Description: Generates the Run Length matrix from a gray-level image, given the number of runs, the levels of gray and the direction (0 or 90 degrees).

• Input :

– m: gray-level image (matrix).

– N_runs: number of runs.

– Ng : number of gray levels.

– degrees: direction in degrees (0 or 90).

• Output :

– Q : the resulting run length matrix.

4.24 RL_45_135.m

• Syntax : [Q]=RL_45_135(m,N_runs,Ng,degrees)

• Description: Generates the Run Length matrix from a gray-level image, given the number of runs, the levels of gray and the direction (45 or 135 degrees).

• Input :

– m: gray-level image (matrix).

– N_runs: number of runs.

– Ng : number of gray levels.

– degrees: direction in degrees (45 or 135).

• Output :

– Q : the resulting run length matrix.


4.25 SRE.m

• Syntax : [SRuEm]=SRE(M)

• Description: Computes the Short-Run Emphasis (SRE) from a Run Length matrix. This feature emphasizes small run lengths, and it is thus expected to be large for coarser images.

• Input :

– M : the Run Length matrix.

• Output :

– SRuEm: the value of Short-Run Emphasis.

4.26 LRE.m

• Syntax : [LRunEmph]=LRE(M)

• Description: Computes the Long-Run Emphasis (LRE) from a Run Length matrix. This feature is expected to be large for smoother images.

• Input :

– M : the Run Length matrix.

• Output :

– LRunEmph: the value of Long-Run Emphasis.

4.27 GLNU.m

• Syntax : [GLNonUn]=GLNU(M)

• Description: Computes the Gray-Level Nonuniformity (GLNU) from a Run Length matrix. When the runs are uniformly distributed among the gray levels, this feature takes small values.

• Input :

– M : the Run Length matrix.

• Output :

– GLNonUn: the value of Gray-Level Nonuniformity.


4.28 RLNU.m

• Syntax : [RLNonUn]=RLNU(M)

• Description: Computes the Run Length Nonuniformity (RLN) from a Run Length matrix. Analogous to gray-level nonuniformity, this feature takes small values when the runs are uniformly distributed among the possible run lengths.

• Input :

– M : the Run Length matrix.

• Output :

– RLNonUn: the value of Run Length Nonuniformity.

4.29 RP.m

• Syntax : [RuPe]=RP(M)

• Description: Computes the Run Percentage (RP) from a Run Length matrix. This feature takes low values for smooth images. A sketch combining the run-length routines of Sections 4.23–4.29 follows the output description.

• Input :

– M : the Run Length matrix.

• Output :

– RuPe: the value of Run Percentage.
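
A sketch combining Sections 4.23–4.29: build a run-length matrix at 0 degrees and evaluate the five features on it. It assumes gray levels coded 0, ..., Ng−1 (adjust if the m-file expects 1, ..., Ng); the image and parameter choices are illustrative:

    Ng     = 8;                               % work with 8 gray levels
    m      = round((Ng - 1) * rand(64, 64));  % synthetic quantized image
    N_runs = 64;                              % longest run considered
    Q      = RL_0_90(m, N_runs, Ng, 0);       % run-length matrix, 0 degrees
    fprintf('SRE  = %.4f\n', SRE(Q));
    fprintf('LRE  = %.4f\n', LRE(Q));
    fprintf('GLNU = %.4f\n', GLNU(Q));
    fprintf('RLNU = %.4f\n', RLNU(Q));
    fprintf('RP   = %.4f\n', RP(Q));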


5 Audio Features

5.1 sfSpectralCentroid.m

• Syntax : [Sc]=sfSpectralCentroid(x,Fs)

• Description: Computes the Spectral Centroid of a single frame.

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

• Output :

– Sc: spectral centroid (Hz).

5.2 sfSpectralRolloff.m

• Syntax : [Sr]=sfSpectralRolloff(x,Fs,RolloffThresh)

• Description: Computes the Spectral Rolloff frequency of a single frame given a rolloff threshold. A sketch covering this function and the spectral centroid follows the output description.

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– RolloffThresh: rolloff threshold, 0≤RolloffThresh≤1.

• Output :

– Sr : rolloff frequency (Hz).
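
A sketch for the two single-frame spectral features above; the frame length, tone frequencies, and the 0.9 threshold are illustrative:

    Fs = 16000;
    t  = (0:511) / Fs;                            % one 512-sample frame
    x  = sin(2*pi*440*t) + 0.5*sin(2*pi*3000*t);  % two sinusoids
    Sc = sfSpectralCentroid(x, Fs);               % centroid (Hz)
    Sr = sfSpectralRolloff(x, Fs, 0.9);           % 90% rolloff (Hz)
    fprintf('centroid = %.1f Hz, rolloff = %.1f Hz\n', Sc, Sr);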

5.3 sfFundAMDF.m

• Syntax : [Fr]=sfFundAMDF(x,Fs,Tmin,Tmax)

• Description: Computes the fundamental frequency of a single frame using the Average Magnitude Difference Function for periodicity detection [Rabi 78].

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : fundamental frequency (Hz).


5.4 sfFundAutoCorr.m

• Syntax : [Fr]=sfFundAutoCorr(x,Fs,Tmin,Tmax)

• Description: Computes the fundamental frequency of a single frame using the autocorrelation function for periodicity detection [Rabi 78].

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : fundamental frequency (Hz).

5.5 sfFundCepstrum.m

• Syntax : [Fr]=sfFundCepstrum(x,Fs,Tmin,Tmax)

• Description: Computes the fundamental frequency of a single frame using the cepstrum coefficients [Rabi 78]. Requires the function rceps from Matlab’s Signal Processing Toolbox. A sketch comparing the three single-frame pitch estimators follows the output description.

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : fundamental frequency (Hz).
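
A sketch comparing the three single-frame pitch estimators on a synthetic 200 Hz tone; all three should return approximately 200 Hz. Tmin and Tmax bracket the admissible period, and the tone itself is illustrative:

    Fs   = 8000;
    t    = (0:1023) / Fs;
    x    = sin(2*pi*200*t);        % 200 Hz tone, period = 40 samples
    Tmin = round(Fs / 400);        % shortest admissible period (400 Hz)
    Tmax = round(Fs / 80);         % longest admissible period (80 Hz)
    f1 = sfFundAMDF(x, Fs, Tmin, Tmax);
    f2 = sfFundAutoCorr(x, Fs, Tmin, Tmax);
    f3 = sfFundCepstrum(x, Fs, Tmin, Tmax);  % needs rceps (Signal Processing Toolbox)
    fprintf('AMDF %.1f Hz, autocorr %.1f Hz, cepstrum %.1f Hz\n', f1, f2, f3);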

5.6 sfFundFreqHist.m

• Syntax : [Fund,FinalHist]=sfFundFreqHist(x,Fs,F1,F2,NumOfPeaks)

• Description: Computes the fundamental frequency of a single frame using Schroeder’s frequency histogram [Schr 68]. The frequency histogram is generated by taking into account only the spectral peaks (local maxima).

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– F1 : minimum fundamental frequency (Hz).


– F2 : maximum fundamental frequency (Hz).

– NumOfPeaks: number of spectral peaks to be taken into account.

• Output :

– Fund : fundamental frequency (Hz).

– FinalHist : frequency histogram.

5.7 sfMFCCs.m

• Syntax : [cMel,Y]=sfMFCCs(x,Fs,bins)

• Description: Computes the Mel-frequency cepstrum coefficients (MFCCs) from a single frame of data, assuming a filter-bank of triangular, non-overlapping filters. Prior to using this function, the center frequency and frequency range of each filter in the filter-bank need to be computed. This can be achieved by first calling the function computeMelBank and passing its output to the bins argument of this m-file.

• Input :

– x : signal frame (sequence of samples).

– Fs: sampling frequency (Hz).

– bins: Mel filter-bank (see computeMelBank for details).

• Output :

– cMel : the MFCCs. The number of MFCCs is equal to the length of the frame.

– Y : output of the filter bank after insertion of zeros has taken place. The length of Y is equal to the length of the frame.

5.8 computeMelBank.m

• Syntax : [bins]=computeMelBank(N,Fs,melStep)

• Description: Computes the frequency centers and frequency limits of the filters of a Mel filter-bank, assuming triangular, non-overlapping filters. This function computes the frequencies of the FFT of an N-point frame and then shifts the frequency centers of the filter-bank to the closest FFT frequencies. A sketch combining this function with sfMFCCs follows the output description.

• Input :

– N : number of FFT coefficients.

– Fs: sampling frequency (Hz).

– melStep: distance between successive centers of the filter-bank (mel units).

• Output :

– bins: filter-bank matrix. The i-th row contains three values for the i-th filter of the filter-bank, i.e., its lowest frequency, center frequency, and highest frequency.
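
A sketch of the two-step recipe described in Sections 5.7 and 5.8: build the Mel filter-bank once, then pass it to sfMFCCs for each frame. The frame content, the 100-mel spacing, and keeping 13 coefficients are illustrative choices:

    Fs   = 16000;
    N    = 512;                          % frame length = FFT size
    x    = randn(1, N);                  % one synthetic frame
    bins = computeMelBank(N, Fs, 100);   % filter-bank description
    [cMel, Y] = sfMFCCs(x, Fs, bins);    % all MFCCs for this frame
    mfcc13 = cMel(1:13);                 % keep the customary leading coefficients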


5.9 stEnergy.m

• Syntax : [E,T]=stEnergy(x,Fs,winlength,winstep)

• Description: Computes the short-term energy envelope.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

• Output :

– E : sequence of short-term energy values.

– T : time equivalent of the first sample of each window.

5.10 stZeroCrossingRate.m

• Syntax : [Zcr,T]=stZeroCrossingRate(x,Fs,winlength,winstep)

• Description: Computes the short-term zero-crossing rate. A sketch combining this function with stEnergy follows the output description.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

• Output :

– Zcr : sequence of zero-crossing rate values.

– T : time equivalent of the first sample of each window.
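
A sketch combining Sections 5.9 and 5.10 with the common 25 ms / 10 ms speech-processing windows (a convention, not a toolbox requirement):

    Fs        = 8000;
    x         = randn(1, Fs);              % one second of noise
    winlength = round(0.025 * Fs);         % 25 ms window
    winstep   = round(0.010 * Fs);         % 10 ms hop
    [E, T]    = stEnergy(x, Fs, winlength, winstep);
    [Zcr, Tz] = stZeroCrossingRate(x, Fs, winlength, winstep);
    plot(T, E / max(E)); hold on; plot(Tz, Zcr / max(Zcr));
    legend('normalized energy', 'normalized ZCR');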

5.11 stSpectralCentroid.m

• Syntax : [Sc,T]=stSpectralCentroid(x,Fs,winlength,winstep,windowMultiplier)

• Description: Computes the short-term spectral centroid by calling the MovingWindow function (a usage sketch follows the output description).

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).


– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

• Output :

– Sc: sequence of spectral centroids.

– T : time equivalent of the first sample of each window.
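
A sketch of the short-term spectral centroid of a frequency sweep with a Hamming window; the same calling pattern carries over to the other windowed features in this section (chirp belongs to the Signal Processing Toolbox):

    Fs = 16000;
    x  = chirp((0:Fs-1)/Fs, 200, 1, 4000);   % sweep from 200 Hz to 4 kHz
    [Sc, T] = stSpectralCentroid(x, Fs, 512, 256, 'hamming');
    plot(T, Sc); xlabel('time (s)'); ylabel('centroid (Hz)');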

5.12 stSpectralRolloff.m

• Syntax : [Sr,T]=stSpectralRolloff(x,Fs,winlength,winstep,windowMultiplier,RolloffThresh)

• Description: Computes the short-term spectral rolloff feature by calling the MovingWindow function.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

• Output :

– Sr : sequence of spectral rolloff values.

– T : time equivalent of the first sample of each window.

5.13 stSpectralFlux.m

• Syntax : [Sf,T]=stSpectralFlux(x,Fs,winlength,winstep,windowMultiplier)

• Description: Computes the short-term spectral flux.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

• Output :

– Sf : sequence of spectral flux values.

– T : time equivalent of the first sample of each window.


5.14 stFundAMDF.m

• Syntax : [Fr,T]=stFundAMDF(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)

• Description: Fundamental Frequency Tracking using the Average Magnitude Difference Function [Rabi 78].

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : sequence of fundamental frequencies (Hz).

– T : time equivalent of the first sample of each window.

5.15 stMelCepstrum.m

• Syntax : [MelCepsMat,T]=stMelCepstrum(x,Fs,winlength,winstep,windowMultiplier,melStep)

• Description: This function computes the short-term Mel Cepstrum. It calls computeMelBank to compute the center frequencies and frequency range of each filter in the Mel filter-bank.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– melStep: distance between the center frequencies of successive filters in the Mel filter-bank (mel units).

• Output :

– MelCepsMat : matrix of Mel-cepstrum coefficients, one column per frame.

– T : time equivalent of the first sample of each window.


5.16 stFundFreqHist.m

• Syntax : [FundFreqs,T]=stFundFreqHist(x,Fs,winlength,winstep,windowMultiplier,F1,F2,NumOfPeaks)

• Description: Fundamental Frequency Tracking based on Schroeder’s Histogram method [Schr 68]. This function calls MovingWindow.

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– F1 : minimum fundamental frequency (Hz).

– F2 : maximum fundamental frequency (Hz).

– NumOfPeaks: number of spectral peaks to take into account for the histogram generation.

• Output :

– FundFreqs: sequence of fundamental frequencies (Hz).

– T : time equivalent of the first sample of each window.

5.17 stFundAutoCorr.m

• Syntax : [Fr,T]=stFundAutoCorr(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)

• Description: Autocorrelation-based Fundamental Frequency Tracking [Rabi 78].

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : sequence of fundamental frequencies (Hz).

– T : time equivalent of the first sample of each window.


5.18 stFundCepstrum.m

• Syntax : [Fr,T]=stFundCepstrum(x,Fs,winlength,winstep,windowMultiplier,Tmin,Tmax)

• Description: Cepstrum-based Fundamental Frequency Tracking [Rabi 78].

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– Tmin: minimum period length (in samples).

– Tmax : maximum period length (in samples).

• Output :

– Fr : sequence of fundamental frequencies (Hz).

– T : time equivalent of the first sample of each window.

5.19 stFourierTransform.m

• Syntax : [StftMat,Freqs,T]=stFourierTransform(x,Fs,winlength,winstep,windowMultiplier,GenPlot)

• Description: Short-time Fourier Transform of a signal (a usage sketch follows the output description).

• Input :

– x : signal (sequence of samples).

– Fs: sampling frequency (Hz).

– winlength: length of the moving window (number of samples).

– winstep: step of the moving window (number of samples).

– windowMultiplier : use ’hamming’, ’hanning’, etc., i.e., any valid Matlab window multiplier (or [] for a rectangular window).

– GenPlot (optional): if set to 1, a spectrogram plot is generated.

• Output :

– StftMat : matrix of Short Time Fourier Transform coefficients (one column per frame).

– Freqs: multiples of Fs/winlength (vector of frequencies).

– T : time equivalent of the first sample of each window.
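
A sketch with plotting enabled; the window parameters are illustrative:

    Fs = 16000;
    x  = sin(2*pi*1000*(0:Fs-1)/Fs);   % 1 kHz tone, one second
    [StftMat, Freqs, T] = stFourierTransform(x, Fs, 512, 256, 'hamming', 1);
    size(StftMat)                      % one column per frame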


6 Dynamic Time Warping

6.1 editDistance.m

• Syntax : [editCost,Pred]=editDistance(refStr,testStr)

• Description: Computes the edit (Levenshtein) distance between two strings, where the first argument is the reference string (prototype). The prototype is placed on the horizontal axis of the matching grid. A usage sketch follows the output description.

• Input :

– refStr : reference string.

– testStr : string to compare with the prototype.

• Output :

– editCost : the matching cost.

– Pred : matrix of node predecessors. The real part of Pred(j, i) is the row index of the predecessor of node (j, i) and the imaginary part of Pred(j, i) is the column index of the predecessor of node (j, i).
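
A sketch on a classic string pair; with unit insertion, deletion, and substitution costs, the Levenshtein distance between ’kitten’ and ’sitting’ is 3 (the unit-cost assumption is the standard convention, which this m-file is assumed to follow):

    [editCost, Pred] = editDistance('kitten', 'sitting');
    fprintf('edit cost = %d\n', editCost);   % expected: 3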

6.2 DTWSakoe.m

• Syntax : [MatchingCost,BestPath,D,Pred]=DTWSakoe(ref,test,genPlot)

• Description: Computes the Dynamic Time Warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the Sakoe-Chiba local constraints on a cost grid, where the Euclidean distance has been used as the distance metric. No end-point constraints have been adopted. This function calls BackTracking.m to extract the best path (a usage sketch follows the output description).

• Input :

– ref : reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.

– test : test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.

– genPlot (optional): if set to 1, a plot of the best path is generated.

• Output :

– MatchingCost : matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.

– BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.

– D : cost grid. Its size is I × J .

– Pred : matrix of node predecessors. The real part of Pred(i, j) is the row index of the predecessor of node (i, j) and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).
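
A sketch matching a reference sequence against a time-stretched, noisy copy of itself. Sequences are stored one feature vector per column (m × I and m × J), as specified above; the third argument assumes the optional genPlot input listed above:

    t    = linspace(0, 2*pi, 50);
    ref  = [sin(t); cos(t)];                        % 2 x 50 reference
    t2   = linspace(0, 2*pi, 70);
    test = [sin(t2); cos(t2)] + 0.05*randn(2, 70);  % 2 x 70 warped, noisy copy
    [MatchingCost, BestPath, D, Pred] = DTWSakoe(ref, test, 1);
    fprintf('normalized matching cost = %.4f\n', MatchingCost);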


6.3 DTWSakoeEndp.m

• Syntax : [MatchingCost,BestPath,D,Pred]=DTWSakoeEndp(ref,test,omitLeft,omitRight,genPlot)

• Description: Computes the Dynamic Time Warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the Sakoe-Chiba local constraints on a type N cost grid, where the Euclidean distance has been used as the distance metric. End-point constraints are permitted for the test sequence. This function calls BackTracking.m to extract the best path.

• Input :

– ref : reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.

– test : test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.

– omitLeft : left endpoint constraint for the test sequence. This is the number of frames that can be omitted from the beginning of the test sequence.

– omitRight : right endpoint constraint for the test sequence. This is the number of frames that can be omitted from the end of the test sequence.

– genPlot (optional): if set to 1, a plot of the best path is generated.

• Output :

– MatchingCost : matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.

– BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.

– D : cost grid. Its size is I × J .

– Pred : matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).

6.4 DTWItakura.m

• Syntax : [MatchingCost,BestPath,D,Pred]=DTWItakura(ref,test,genPlot)

• Description: Computes the Dynamic Time Warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the standard Itakura local constraints on a cost grid, where the Euclidean distance has been used as the distance metric. No end-point constraints have been adopted. This function calls BackTracking.m to extract the best path.

• Input :

– ref : reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.


– test : test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.

– genPlot (optional): if set to 1, a plot of the best path is generated.

• Output :

– MatchingCost : matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.

– BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.

– D : cost grid. Its size is I × J .

– Pred : matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).

6.5 DTWItakuraEndp.m

• Syntax : [MatchingCost,BestPath,D,Pred]=DTWItakuraEndp(ref,test,omitLeft,omitRight,genPlot)

• Description: Computes the Dynamic Time Warping cost between two feature sequences. The first argument is the prototype, which is placed on the vertical axis of the matching grid. The function employs the standard Itakura local constraints on a cost grid, where the Euclidean distance has been used as the distance metric. End-point constraints are permitted for the test sequence. This function calls the function BackTracking.m to extract the best path.

• Input :

– ref : reference sequence. Its size is m × I, where m is the number of features and I the number of feature vectors.

– test : test sequence. Its size is m × J, where m is the number of features and J the number of feature vectors.

– omitLeft : left endpoint constraint for the test sequence. This is the number of frames that can be omitted from the beginning of the test sequence.

– omitRight : right endpoint constraint for the test sequence. This is the number of frames that can be omitted from the end of the test sequence.

– genPlot (optional): if set to 1, a plot of the best path is generated.

• Output :

– MatchingCost : matching cost. The matching cost is normalized, i.e., it is divided by the length of the best path.

– BestPath: backtracking path. Each node of the best path is represented as a complex number, where the real part stands for the row index and the imaginary part stands for the column index of the node.

– D : cost grid. Its size is I × J .

– Pred : matrix of node predecessors. The real part of Pred(i, j) is the row index and the imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).


6.6 BackTracking.m

• Syntax : [BestPath]=BackTracking(Pred,startNodek,startNodel,genPlot)

• Description: Performs backtracking on a matrix of node predecessors and returns the extracted best path, starting from node (startNodek, startNodel). The best path can be optionally plotted (a usage sketch follows the output description).

• Input :

– Pred : matrix of node predecessors. The real part of Pred(i, j) is the row index of the predecessor of node (i, j). The imaginary part of Pred(i, j) is the column index of the predecessor of node (i, j).

– startNodek : row index of node from which backtracking starts.

– startNodel : column index of node from which backtracking starts.

– genPlot (optional): if set to 1, a plot of the best path is generated.

• Output :

– BestPath: backtracking path, i.e., vector of nodes. Each node is represented as a complex number, where the real part stands for the row index of the node and the imaginary part stands for the column index of the node.
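
A sketch that runs backtracking explicitly on an edit-distance predecessor matrix and decodes the complex-valued nodes. It assumes the terminal node sits at the bottom-right of the grid, i.e., at (size(Pred,1), size(Pred,2)); adjust the start node if your grid is oriented differently:

    [editCost, Pred] = editDistance('kitten', 'sitting');
    [K, L]   = size(Pred);                % assumed terminal node of the grid
    BestPath = BackTracking(Pred, K, L);  % path from the terminal node to the start
    rows = real(BestPath);                % row index of each node on the path
    cols = imag(BestPath);                % column index of each node on the path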


References

[Theo 09] S. Theodoridis, K. Koutroumbas, Pattern Recognition, 4th edition, Academic Press, 2009.

[Rabi 78] L.R. Rabiner, R.W. Schafer, Digital Processing of Speech Signals, Prentice Hall, 1978.

[Schr 68] M.R. Schroeder, “Period histogram and product spectrum: New methods for fundamental frequency measurement,” Journal of the Acoustical Society of America, Vol. 43(4), pp. 829–834, 1968.

[Laws 80] K.I. Laws, Textured Image Segmentation, Ph.D. Thesis, University of Southern California, 1980.
