
Page 1 of 14

Dynamical Invariants of an Attractor and Potential Applications for Speech Data

Saurabh Prasad
Intelligent Electronic Systems
Human and Systems Engineering
Department of Electrical and Computer Engineering

Estimating Kolmogorov Entropy from Acoustic Attractors from a Recognition Perspective

Page 2 of 14

Estimating the correlation integral from a time series

Correlation integral of an attractor's trajectory: the correlation sum of a system's attractor is a measure quantifying the average number of neighbors within a neighborhood of radius $\epsilon$ along the trajectory:

$$C(\epsilon) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta\!\left(\epsilon - \lVert \mathbf{x}_i - \mathbf{x}_j \rVert\right)$$

where $\mathbf{x}_i$ represents the $i$'th point on the trajectory, $\lVert\cdot\rVert$ is a valid norm, and $\Theta(\cdot)$ is the Heaviside unit step function (serving as a count function here).

At a given embedding dimension $m$ (with $m > 2D + 1$), we have:

$$D_C = \lim_{\epsilon \to 0} \lim_{N \to \infty} \frac{\ln C(\epsilon)}{\ln \epsilon}, \qquad C(\epsilon) \sim \epsilon^{D_C}$$

$C(\epsilon) \to 0$ as $\epsilon$ approaches the attractor's resolution, and $C(\epsilon) \to 1$ as $\epsilon$ approaches the attractor's radius; $D_C$ is approximately the fractal dimension of the attractor.
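A minimal sketch of this estimate in Python, assuming a scalar time series and a uniform delay embedding; the function names (delay_embed, corr_sum), the delay tau, and the toy signal are illustrative choices, not part of the original slides:

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Delay-embed a scalar time series x into m-dimensional vectors
    x_i = [x(i), x(i+tau), ..., x(i+(m-1)*tau)]."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[k * tau : k * tau + n] for k in range(m)])

def corr_sum(points, eps):
    """Correlation sum C(eps): fraction of distinct pairs (i, j), i < j,
    whose distance is below eps (the Heaviside step acts as a pair counter)."""
    N = len(points)
    count = 0
    for i in range(N - 1):
        d = np.linalg.norm(points[i + 1:] - points[i], axis=1)  # ||x_i - x_j||
        count += np.sum(d < eps)
    return 2.0 * count / (N * (N - 1))

# Example: the slope of ln C(eps) vs. ln eps approximates the correlation dimension.
x = np.sin(0.05 * np.arange(2000)) + 0.01 * np.random.randn(2000)  # toy series
pts = delay_embed(x, m=5, tau=7)
eps_vals = np.logspace(-2, 0, 10)
C = np.array([corr_sum(pts, e) for e in eps_vals])
D_est = np.polyfit(np.log(eps_vals), np.log(C + 1e-12), 1)[0]
print("estimated correlation dimension ~", D_est)
```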

Page 3 of 14

Order-q Renyi entropy and K2 entropy

Divide the state space into disjoint boxes of size $\epsilon$. If the evolution of the state space that generated the observable is sampled at times $\Delta t, 2\Delta t, \ldots, d\Delta t$, the order-$q$ Renyi entropy is

$$K_q = -\lim_{\Delta t \to 0}\, \lim_{\epsilon \to 0}\, \lim_{d \to \infty} \frac{1}{d\,\Delta t}\, \frac{1}{q-1} \ln \sum_{i_1, i_2, \ldots, i_d} p^{\,q}(i_1, i_2, \ldots, i_d)$$

where $p(i_1, i_2, \ldots, i_d)$ represents the joint probability that $\mathbf{x}(\Delta t)$ lies in box $i_1$, $\mathbf{x}(2\Delta t)$ lies in box $i_2$, and so on.

Numerically, the Kolmogorov entropy can be estimated as the second-order Renyi entropy ($K_2$). For small $\epsilon$ and large $d$, the correlation sum at embedding dimension $d$ scales as

$$C_d(\epsilon) \sim \epsilon^{D} \exp(-d\,\Delta t\, K_2)$$

so that

$$K_2 \approx \lim_{\epsilon \to 0} \lim_{d \to \infty} \frac{1}{\Delta t} \ln \frac{C_d(\epsilon)}{C_{d+1}(\epsilon)}$$

$K_2 = 0$ for an ordered system, $K_2 \to \infty$ for a stochastic system, and $0 < K_2 < \infty$ for a chaotic system.
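A minimal sketch of this ratio estimate, reusing the delay_embed and corr_sum helpers from the previous sketch; the sampling interval, radius, and dimension range below are illustrative choices:

```python
import numpy as np

def k2_estimate(x, eps, d, dt, tau=1):
    """Second-order (correlation) entropy at one (eps, d) pair:
    K2 ~ (1/dt) * ln[ C_d(eps) / C_{d+1}(eps) ],
    using delay_embed and corr_sum from the previous sketch."""
    C_d  = corr_sum(delay_embed(x, d, tau), eps)
    C_d1 = corr_sum(delay_embed(x, d + 1, tau), eps)
    if C_d == 0 or C_d1 == 0:
        return np.nan  # too few neighbors at this radius to form the ratio
    return np.log(C_d / C_d1) / dt

# Example: sweep the embedding dimension at a fixed radius,
# using the toy series x from the previous sketch.
fs = 22500.0        # 22.5 kHz sampling rate quoted on the data slide
dt = 1.0 / fs
for d in range(2, 8):
    print(d, k2_estimate(x, eps=0.1, d=d, dt=dt))
```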

Page 4 of 14

Second-Order Kolmogorov Entropy Estimation of Speech Data

• Speech data, sampled at 22.5 kHz
  – Sustained phones (/aa/, /ae/, /eh/, /sh/, /z/, /f/, /m/, /n/)
• Output: second-order Kolmogorov entropy
• We wish to analyze:
  – The presence or absence of chaos in each time series.
  – The discrimination characteristics of these estimates across attractors from different sound units (for classification).

Page 5 of 14

The analysis setup

• Currently, this analysis includes estimates of K2 for different embedding dimensions (a sketch of the sweep appears after this list)
• Variation of the entropy estimates with the neighborhood radius ε was studied
• Variation of the entropy estimates with the SNR of the signal was studied
• So far, the analysis has been performed on 3 vowels, 2 nasals, and 2 fricatives
• Results show that vowels and nasals have much smaller entropy than fricatives
• K2 consistently decreases with embedding dimension for vowels and nasals, while for fricatives it consistently increases
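A minimal sketch of how such a sweep over radius, embedding dimension, and SNR might be organized, assuming the k2_estimate helper sketched earlier; add_noise, k2_grid, the ε grid, the toy frame, and the SNR value are illustrative choices, not taken from the slides:

```python
import numpy as np

def add_noise(frame, snr_db):
    """Return the frame plus white Gaussian noise at the requested SNR (dB)."""
    p_sig = np.mean(frame ** 2)
    p_noise = p_sig / (10.0 ** (snr_db / 10.0))
    return frame + np.sqrt(p_noise) * np.random.randn(len(frame))

def k2_grid(frame, dt, eps_grid, dims, tau=1):
    """K2 estimates over a grid of radii and embedding dimensions,
    using k2_estimate from the previous sketch; rows index eps, columns index d."""
    return np.array([[k2_estimate(frame, e, d, dt, tau) for d in dims]
                     for e in eps_grid])

# Example: one sustained-phone-like frame, clean and at 10 dB SNR.
fs = 22500.0                                               # sampling rate from the data slide
frame = np.sin(2 * np.pi * 150 * np.arange(4096) / fs)     # toy "vowel-like" frame
eps_grid = np.logspace(-2, 0, 8)
dims = range(2, 8)
clean = k2_grid(frame, 1.0 / fs, eps_grid, dims)
noisy = k2_grid(add_noise(frame, snr_db=10.0), 1.0 / fs, eps_grid, dims)
```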

Page 6 of 14

The analysis setup (in progress / coming soon)…

• Data size (length of the time series):
  – This is crucial for our purpose, since we wish to extract information from short time series (sample data from utterances).
• Speaker variation:
  – We wish to study variations in the Kolmogorov entropy of phone- or word-level attractors
    • across different speakers
    • across different phones/words
    • across different broad phone classes

Page 7 of 14
[Figure: Correlation Entropy vs. Embedding Dimension, for various epsilons]

Page 8 of 14
[Figure: Correlation Entropy vs. Embedding Dimension, for various epsilons]

Page 9 of 14
[Figure: Correlation Entropy vs. Embedding Dimension, for various epsilons]

Page 10 of 14
[Figure: Correlation Entropy vs. Embedding Dimension, for various SNRs]

Page 11 of 14
[Figure: Correlation Entropy vs. Embedding Dimension, for various data lengths]

Page 12 of 14

Measuring Discrimination Information in K2-based Features

Kullback-Leibler (KL) divergence: provides an information-theoretic distance measure between two statistical models.

Average discriminating information between class $i$ and class $j$:

$$J(i, j) = \underbrace{\int_x p_i(x) \ln \frac{p_i(x)}{p_j(x)}\, dx}_{\text{likelihood: } i \text{ vs. } j} \;+\; \underbrace{\int_x p_j(x) \ln \frac{p_j(x)}{p_i(x)}\, dx}_{\text{likelihood: } j \text{ vs. } i} \;=\; I(i, j) + I(j, i)$$

For normal densities $p_i = \mathcal{N}(\mu_i, C_i)$ and $p_j = \mathcal{N}(\mu_j, C_j)$ in $d$ dimensions:

$$I(i, j) = \frac{1}{2} \ln \frac{|C_j|}{|C_i|} + \frac{1}{2} \operatorname{tr}\!\left[C_j^{-1} C_i\right] + \frac{1}{2} (\mu_i - \mu_j)^T C_j^{-1} (\mu_i - \mu_j) - \frac{d}{2}$$
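A minimal sketch of this closed form in Python; the function names (kl_gauss, j_divergence) and the toy class statistics are illustrative, not from the slides:

```python
import numpy as np

def kl_gauss(mu_i, C_i, mu_j, C_j):
    """I(i, j): KL divergence between N(mu_i, C_i) and N(mu_j, C_j),
    using the standard Gaussian closed form shown above."""
    d = len(mu_i)
    Cj_inv = np.linalg.inv(C_j)
    diff = mu_i - mu_j
    return 0.5 * (np.log(np.linalg.det(C_j) / np.linalg.det(C_i))
                  + np.trace(Cj_inv @ C_i)
                  + diff @ Cj_inv @ diff
                  - d)

def j_divergence(mu_i, C_i, mu_j, C_j):
    """Symmetric (average) discriminating information J(i, j) = I(i, j) + I(j, i)."""
    return kl_gauss(mu_i, C_i, mu_j, C_j) + kl_gauss(mu_j, C_j, mu_i, C_i)

# Example: hypothetical K2 feature statistics for two phone classes.
mu_a, C_a = np.array([0.4]), np.array([[0.01]])   # stand-in for a vowel
mu_b, C_b = np.array([2.5]), np.array([[0.30]])   # stand-in for a fricative
print(j_divergence(mu_a, C_a, mu_b, C_b))
```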

Page 13 of 14

Measuring Discrimination Information in K2-based Features

Statistics of entropy estimates over several frames, for various phones
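A minimal sketch of how these per-frame statistics might feed the divergence computation, assuming per-frame K2 feature vectors have already been extracted; class_stats, the feature shapes, and the synthetic data are illustrative, and j_divergence refers to the sketch on the previous page:

```python
import numpy as np

def class_stats(feature_vectors):
    """Mean and covariance of per-frame K2 feature vectors for one phone class.
    feature_vectors: array of shape (num_frames, num_features)."""
    mu = np.mean(feature_vectors, axis=0)
    C = np.cov(feature_vectors, rowvar=False).reshape(len(mu), len(mu))
    return mu, C

# Example with synthetic per-frame features for two hypothetical phones.
feats_aa = 0.5 + 0.1 * np.random.randn(200, 3)   # stand-in for /aa/ frames
feats_f  = 2.0 + 0.5 * np.random.randn(200, 3)   # stand-in for /f/ frames
mu_a, C_a = class_stats(feats_aa)
mu_f, C_f = class_stats(feats_f)
print(j_divergence(mu_a, C_a, mu_f, C_f))
```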

Page 14 of 14

Measuring Discrimination Information in K2-based Features

[Figure: KL-divergence measure between K2 features from various phonemes, for two speakers (one panel per speaker); compared pairs: /aa/ vs. /f/, /ae/ vs. /sh/, /eh/ vs. /z/, /aa/ vs. /m/, /aa/ vs. /n/, /ae/ vs. /n/, /m/ vs. /f/, /n/ vs. /z/, /m/ vs. /sh/]

Page 15 of 14

Plans

• Finish studying the use of K2 entropy as a feature characterizing phone-level attractors
  – We will perform a similar analysis on Lyapunov exponent and correlation dimension estimates
• Measure speaker dependence in this invariant
• Use this setup in a meaningful recognition task
• Noise robustness, parameter tuning, integrating these features with MFCCs
• Statistical modeling…