1 Robust HMM classification schemes for speaker recognition using integral decode Marie Roch Florida International University


Post on 27-Dec-2015


Page 1

Robust HMM classification schemes for speaker recognition

using integral decode

Marie Roch

Florida International University

Page 2

Who am I?

Page 3

Speaker Recognition

• Types of speaker recognition
– Verification / Identification
– Text dependent / Text independent

Page 4

Speaker Recognition

• Why is it hard?

• Minimal training data

• Background noise

• Transducer mismatch

• Channel distortions

• People’s voices change over time and under stress

• Performance

Page 5

Feature Extraction

• Extract speech

• Spectral analysis

• Cepstrum: $\mathrm{dft}^{-1}(\log(S))$

• Cepstral means removal
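The cepstrum and mean-removal steps above can be sketched as follows. This is a minimal illustration, not the author's front end: it assumes NumPy, takes the log of the magnitude spectrum (the usual real cepstrum), and the names `real_cepstrum`, `cepstral_mean_removal`, `n_coeffs`, and `eps` are hypothetical.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=12, eps=1e-10):
    """Real cepstrum of one windowed frame: inverse DFT of the log magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(frame))
    log_spectrum = np.log(spectrum + eps)      # eps avoids log(0)
    cepstrum = np.fft.irfft(log_spectrum, n=len(frame))
    return cepstrum[1:n_coeffs + 1]            # drop c0, keep the next n_coeffs terms

def cepstral_mean_removal(features):
    """Subtract the per-coefficient mean over all frames (rows are frames)."""
    return features - features.mean(axis=0, keepdims=True)

# Synthetic stand-in for speech: 50 Hamming-windowed frames of 256 samples (assumption).
rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 256)) * np.hamming(256)
features = np.array([real_cepstrum(f) for f in frames])
normalized = cepstral_mean_removal(features)
```

Removing the mean cancels any component that is constant across the utterance in the cepstral domain, which is why it is commonly used against fixed channel effects.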

Page 6

Hidden Markov Models

• Statistical pattern recognition

• State dependent modeling
– Distribution/state
– Radial basis functions common

• State sequence unobservable

Page 7

HMM

• Efficient decoders: $O(N^2 T)$

• Training
– EM algorithm
– Convergence to local maxima guaranteed
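As an illustration of where the N-squared-times-T decoding cost comes from, here is a small forward-algorithm sketch. It assumes a discrete-output HMM for brevity (the talk uses continuous state distributions), and `forward_likelihood` and its parameters are hypothetical names.

```python
def forward_likelihood(init, trans, emit, observations):
    """Pr(observations | model) for a discrete HMM via the forward algorithm.

    init[i]     : initial probability of state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of emitting symbol o in state i
    """
    n_states = len(init)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:                      # T - 1 time steps
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)                  # N states, N predecessors each
        ]
    return sum(alpha)

init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
```

Each time step touches every pair of states once, so the work grows as N squared per frame and linearly in the number of frames T.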

Page 8

Recognition

• Model for each speaker

• Maximum a posteriori (MAP) decision rule

[Figure: Features → Models → Scores → ArgMax]
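The score-then-argmax pipeline on this slide can be sketched as below. The speaker names and log scores are made up for illustration, and `identify_speaker` is a hypothetical helper; with equal priors the rule reduces to picking the highest-scoring model.

```python
def identify_speaker(log_scores, log_priors=None):
    """Return the speaker whose model gives the highest prior-weighted log score."""
    if log_priors is None:
        log_priors = {speaker: 0.0 for speaker in log_scores}  # equal priors
    return max(log_scores, key=lambda s: log_scores[s] + log_priors[s])

# Hypothetical per-model log likelihoods for one test utterance.
scores = {"speaker_a": -1520.4, "speaker_b": -1498.7, "speaker_c": -1510.2}
best = identify_speaker(scores)
```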

Page 9

The MAP decision rule

• Optimal decision rule provided we have accurate distribution parameters & observations.

• Problem:
– Corruption of feature vectors.
– Distribution known to be inaccurate.

Page 10

A case of mistaken identity

Page 11

Integral decode

• Goal: Include uncorrupted observation $\hat{o}_t$.

• Problem: $\hat{o}_t$ unobservable.

• Determine a local neighborhood $\Omega_t$ about $o_t$ and use a priori information to weight the likelihood:

$$\Pr(\hat{o}_t \mid M) \approx \int_{\Omega_t} \Pr(o \mid M)\,\Pr(o \mid o_t)\,do$$

Page 12

Integral decode issues

• Problems approximating the integral
– High frame rate * number of models
– Non-trivial dimensionality

• Selection of the neighborhood

Page 13

Approximating the integral

• Monte Carlo impractical

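• Use simplified cubature technique:

$$\Pr(\hat{o}_t \mid M) \approx \sum_{i_1}\sum_{i_2}\cdots\;\underbrace{f(\mathrm{step}(i) \mid M)}_{\text{pdf}}\;\underbrace{\Pr(\mathrm{step}(i) \mid o_t)}_{\text{prior}}\;\underbrace{C}_{\text{area}}$$

[Equation: definition of $\mathrm{step}(i)$; garbled beyond recovery in this transcript]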
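A rough sketch of the pdf times prior times area sum follows. The slide's own definition of step(i) did not survive extraction, so this assumes a uniform midpoint grid over a box centred on the observation, a diagonal Gaussian for the model pdf, and a Gaussian weight for the prior. All names (`gauss_pdf`, `integral_decode_score`, `half_width`, `k`) are hypothetical.

```python
import itertools
import math

def gauss_pdf(x, mean, var):
    """Diagonal-covariance Gaussian density at point x."""
    return math.prod(
        math.exp(-(xi - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for xi, m, v in zip(x, mean, var)
    )

def integral_decode_score(o_t, half_width, k, model_pdf, prior_pdf):
    """Midpoint-rule sum of model_pdf * prior_pdf over a box centred on o_t."""
    cell = [2 * h / k for h in half_width]      # cell edge length per dimension
    cell_volume = math.prod(cell)               # the "area" factor
    total = 0.0
    for index in itertools.product(range(k), repeat=len(o_t)):
        point = [o - h + (i + 0.5) * c
                 for o, h, i, c in zip(o_t, half_width, index, cell)]
        total += model_pdf(point) * prior_pdf(point) * cell_volume
    return total

o_t = [0.3, -0.1]
model = lambda x: gauss_pdf(x, [0.0, 0.0], [1.0, 1.0])   # stand-in for f(. | M)
prior = lambda x: gauss_pdf(x, o_t, [0.05, 0.05])        # stand-in for Pr(. | o_t)
score = integral_decode_score(o_t, [1.0, 1.0], 21, model, prior)
```

The loop visits k points per dimension, so the number of pdf evaluations grows exponentially with dimensionality, which is the cost concern raised on the previous slide.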

Page 14

Neighborhood choice

• Choosing an appropriate neighborhood:
– Upper bound difference neighborhoods [Merhav and Lee 93]
– Error source modeling

Page 15

Upper bound difference neighborhoods

• Arbitrary signal pairs with a few general conditions.

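• PSD: [Equation: $S$ written as a ratio of products of first-order factors; symbols lost in extraction]

• Cepstra: [Equation: cepstral coefficients $c_k$; symbols lost in extraction]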

Page 16

Taking the upper bound

• Asymptotic difference between cepstral parameters:

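[Equation: upper bound on the difference between the cepstral coefficients $c^{(1)}$ and $c^{(2)}$ of two signals; garbled beyond recovery in this transcript]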

Page 17

Error source modeling

• Multiple error sources

• Simplifying assumption of one normal distribution with zero mean

• Use time series analysis to estimate the noise

• Trend

[Equations: noise model involving $n_t$ and the trend estimate; garbled beyond recovery in this transcript]

Page 18

Error Source Modeling

• Estimate variance from detrended signal
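A minimal sketch of estimating the noise variance from a detrended signal. The trend estimator used in the talk is not recoverable from this transcript, so a centred moving average is assumed here purely for illustration; `moving_average_trend`, `noise_variance`, and `window` are hypothetical names, and the series stands for one feature dimension over time.

```python
def moving_average_trend(series, window=5):
    """Centred moving average; the window shrinks near the edges."""
    half = window // 2
    trend = []
    for t in range(len(series)):
        lo, hi = max(0, t - half), min(len(series), t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend

def noise_variance(series, window=5):
    """Mean squared residual after detrending, treating the noise as zero mean."""
    trend = moving_average_trend(series, window)
    residual = [x - m for x, m in zip(series, trend)]
    return sum(r * r for r in residual) / len(residual)

# Slow ramp plus an alternating disturbance standing in for noise.
track = [0.1 * t + (0.5 if t % 2 == 0 else -0.5) for t in range(20)]
variance = noise_variance(track)
```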

Page 19

Error source modeling

• Problem:
– The neighborhood $\Omega_t$ is infinite

• Solution:
– Most of the points are outliers
– Set percentage of distribution beyond which points are culled.

$$\Pr(\hat{o}_t \mid M) \approx \int_{\Omega_t} \Pr(o \mid M)\,\Pr(o \mid o_t)\,do$$
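One way to turn "cull beyond a set percentage" into a finite neighborhood is to keep the central portion of the zero-mean normal error model. The sketch below assumes symmetric culling in one dimension; `neighborhood_half_width` and `coverage` are hypothetical names, and the 95% figure is only an example value.

```python
from statistics import NormalDist

def neighborhood_half_width(noise_std, coverage=0.95):
    """Half-width that keeps the central `coverage` fraction of a zero-mean normal."""
    tail = (1.0 - coverage) / 2.0
    return NormalDist(mu=0.0, sigma=noise_std).inv_cdf(1.0 - tail)

half_width = neighborhood_half_width(noise_std=0.2, coverage=0.95)
```

Points farther than `half_width` from the observation would be dropped from the integration region under this assumption.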

Page 20

Complexity of integration

• Expensive

[Equation: big-O cost with factors labeled Speakers ($S$), Decoder ($N^2 T$), Mixtures ($M$), pdf ($E$), and Integration ($C$); exact form garbled in extraction]

• Ways to reduce/cope
– Implemented
  • Top K processing
  • Principal Component Analysis
– Possible
  • Gaussian Selection
  • Sub-band Models
  • SIMD or MIMD parallelism

Page 21

Top K Processing

[Equation: big-O cost with Top K processing; garbled in extraction]

[Figure panels: 1 second, 3 seconds, 5 seconds]
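The transcript does not say exactly what is ranked in Top K processing. The sketch below assumes one common reading: score every speaker with the cheap baseline decoder, then apply the expensive scorer only to the K best candidates. The function name, the scores, and this interpretation are all assumptions.

```python
import heapq

def top_k_rescoring(baseline_scores, expensive_score, k):
    """Rescore only the k best baseline candidates with the costly scorer."""
    shortlist = heapq.nlargest(k, baseline_scores, key=baseline_scores.get)
    rescored = {speaker: expensive_score(speaker) for speaker in shortlist}
    return max(rescored, key=rescored.get), shortlist

# Hypothetical log scores: cheap baseline pass and costly second pass.
baseline = {"a": -10.0, "b": -9.5, "c": -14.0, "d": -9.8}
costly = {"a": -9.0, "b": -9.6, "c": -8.0, "d": -9.1}
winner, shortlist = top_k_rescoring(baseline, costly.get, k=2)
```

Under this reading the expensive term is paid for K speakers rather than all of them, at the risk of pruning the true speaker when the baseline ranks it outside the top K.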

Page 22

Principal Component Analysis

• Choose P most important directions

Page 23

Principal Component Analysis

• Integrate using new basis set for step function
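Choosing the P most important directions can be sketched with an eigendecomposition of the feature covariance. This assumes NumPy and synthetic data; `principal_directions` is a hypothetical helper, and how the talk maps the integration grid onto the new basis is not shown in the transcript.

```python
import numpy as np

def principal_directions(features, p):
    """Return the p covariance eigenvectors with the largest eigenvalues."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]       # indices of the p largest
    return eigvecs[:, order], eigvals[order]

# Synthetic features with very different spread along each axis.
rng = np.random.default_rng(1)
data = rng.standard_normal((500, 3)) * np.array([5.0, 1.0, 0.1])
basis, variances = principal_directions(data, p=2)
```

Integrating along only the columns of `basis` would cut the grid from k points in every feature dimension to k points in each of P directions.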

Page 24

Speech Corpus

• King-92
– Used San Diego subset

• 26 male speakers

• Long distance telephone speech

• Quiet room environment

• 5 sessions recorded one week apart
– Sessions 1-3 train
– Sessions 4-5 partitioned into test segments

Page 25

Baseline performance

Page 26

Integral decode performance

Test length (s)   Baseline   Upper Bound Difference     Error Modeling
                  Error      Error    Δ       %         Error    Δ       %
1                 0.4420     0.4237   0.0183  4.14      0.4401   0.0019  0.43
3                 0.1833     0.1554   0.0279  15.22     0.1753   0.0080  4.64
5                 0.0872     0.0738   0.0134  15.37     0.0638   0.0234  26.83

[Figure panels: 1 second, 3 seconds, 5 seconds]
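The percentage columns read as relative error reductions, that is, (baseline error minus method error) divided by baseline error. The short check below recomputes them from the rounded error rates in the table; `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(baseline, error):
    """Relative error reduction over the baseline, in percent."""
    return 100.0 * (baseline - error) / baseline

# Error rates copied from the table: test length -> (baseline, upper bound, error modeling).
rows = {
    1: (0.4420, 0.4237, 0.4401),
    3: (0.1833, 0.1554, 0.1753),
    5: (0.0872, 0.0738, 0.0638),
}
reductions = {
    length: (relative_reduction(base, ub), relative_reduction(base, em))
    for length, (base, ub, em) in rows.items()
}
```

Five of the six entries reproduce the table to two decimals. The 3-second error-modeling entry works out to about 4.36 from these rounded rates, against 4.64 in the table; the difference may come from unrounded figures behind the slide.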

Page 27

Integral decode with other conditions

• Performance on
– high quality speech
– transducer mismatch

Page 28

Future work

• Extensions to the integral decode
– Automatic parameter selection
– Gaussian selection
– Distributed computation

• Efficient multiple class preclassifiers


Page 30

Optimal/utterance hyperparameters – 5 seconds

[Figure panels: King NB 26, King WB 51, Spidre F 18 XDR, Spidre M 27 XDR]

Page 31

95% Confidence Intervals

• Caveat:
– Per speaker means
– Large granularity

Page 32

Pattern Recognition

• Long term statistics [Bricker et al 71, Markel et al 77]

• Vector Quantization [Soong et al 87]

• HMM [Rosenberg et al 90, Tishby 91, Matsui & Furui 92, Reynolds et al 95]

• Connectionist frameworks
– Feed forward [Oglesby & Mason 90]
– Learning vector quantization [He et al 99]

Page 33

Pattern Recognition Contd.

• Hybrid/Modified HMMs
– Min Classification Error discriminant [Liu et al 95]
– Tree structured neural classifiers [Liou & Mammone 95]

• Trajectory modeling [Russell et al 85, Liu et al 95, Ostendorf et al 96, He et al 99]

• Sub-band recognition [Besacier & Bonastre 97]