A3: Incremental Specification in Context

Cooperation: LSS, IMS

Grzegorz Dogil, Bin Yang, Wolfgang Wokurek

Stefan Uhlich, Andreas Madsack

Content

• Current research topics:

– Subglottal resonances

– Robust speech representation

– Landmark/Phoneme detection

• Future research direction

Subglottal Resonances

• Topic:

– Measurement of subglottal resonances

– Relationship to vowel chart

• Results: (in cooperation with Steven Lulich, MIT)

– Recordings of 20 speakers with sensors

– Analysis of Swabian Diphthongs

• Future (until 2010):

– Measure 50+ speakers with new sensor

– For these speakers, calculate the vowel space using the first two subglottal (sg) resonances

[Figure: schematic with the labels VT, SG and G]

Robust Speech Representation

• How can we identify features that are important?

• So far: Measuring relevance with mean squared error

– A feature is more important if it allows a good reconstruction in the mean-squared-error sense

– Mathematically more tractable

• General question for the next project phase:

– How do we measure perceptual relevance?

[Figure: Features s_1,...,s_N → Random Erasure Process → Subset of Features y_1,...,y_M → Reconstruction → Reconstructed Features ŝ_1,...,ŝ_N]
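The erasure-and-reconstruction idea on this slide can be illustrated with a small sketch. It is not the project's actual method: it assumes a linear minimum-MSE estimator built from the sample covariance, independent random erasures with a fixed keep probability, and a toy data set. The function names (`lmmse_mse`, `erasure_relevance`) and all parameters are hypothetical. A feature's relevance is taken to be the average drop in reconstruction MSE when that feature survives the erasure.

```python
import numpy as np

def lmmse_mse(cov, kept):
    """MSE (averaged over all N features) of the linear MMSE
    reconstruction of every feature from the kept subset."""
    n = cov.shape[0]
    if len(kept) == 0:
        # Nothing observed: the best guess is the mean, the error is the variance.
        return np.trace(cov) / n
    c_kk = cov[np.ix_(kept, kept)]   # covariance among kept features
    c_sk = cov[:, kept]              # cross-covariance all vs. kept
    # Error covariance C - C_sk C_kk^{-1} C_ks (pinv guards against singular C_kk).
    err = cov - c_sk @ np.linalg.pinv(c_kk) @ c_sk.T
    return np.trace(err) / n

def erasure_relevance(features, keep_prob=0.5, n_patterns=200, seed=0):
    """Assumed relevance measure: mean MSE reduction gained by keeping
    feature i, averaged over random erasure patterns of the others."""
    rng = np.random.default_rng(seed)
    cov = np.cov(features, rowvar=False)
    n = cov.shape[0]
    gain = np.zeros(n)
    for _ in range(n_patterns):
        mask = rng.random(n) < keep_prob      # one random erasure pattern
        for i in range(n):
            with_i, without_i = mask.copy(), mask.copy()
            with_i[i], without_i[i] = True, False
            gain[i] += (lmmse_mse(cov, np.flatnonzero(without_i))
                        - lmmse_mse(cov, np.flatnonzero(with_i)))
    return gain / n_patterns

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(42)
base = rng.normal(scale=3.0, size=2000)
features = np.column_stack([
    base,                                     # high-variance feature
    rng.normal(scale=1.0, size=2000),         # independent, medium variance
    rng.normal(scale=0.1, size=2000),         # independent, low variance
    base + rng.normal(scale=0.1, size=2000),  # near-duplicate of feature 0
])
relevance = erasure_relevance(features)
print(np.round(relevance, 3))
```

In this toy setting a high-variance feature scores higher than a low-variance one, which is exactly the limitation the slide points to: the MSE criterion is tractable but says nothing about perceptual relevance.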

Landmark/Phoneme Detection

• Topic:

– Find relevant features for landmarks/phonemes using statistical evaluation methods

– Identify characteristic temporal contour of relevant features

• Used features (only a subset selected):

– Energy envelopes (A2), Liu bands

– LPC (LSS), VQP (Wokurek), f0, MFCC, ...

• Results (for different tasks):

– Segment-wise detection

– Evaluation of the performance difference between fixed and phoneme-based segmentation

Future Research Direction (I)

• Identify perceptually relevant regions of speech

Future Research Direction (II)

• Example: /ae/ of handbag

• Exemplars: Different versions of perceptually relevant regions for the same phoneme

[Figure: Set of all /ae/'s in corpus and corresponding feature values → Feature Selection (e.g. find best five features) → Statistical Classifier → New Exemplar (i.e. coverage of 80 %) / not covered (i.e. 20 %)]
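The pipeline in the figure (feature selection, statistical classifier, coverage) might be prototyped along the following lines. The slide names neither the selection criterion nor the classifier, so this sketch assumes a Fisher-ratio ranking to pick the best five features and a nearest-class-mean classifier, and it reads "coverage" as the share of new /ae/ exemplars the classifier accepts. The data are synthetic and all function names are hypothetical.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher ratio for binary labels (1 = target phoneme)."""
    pos, neg = X[y == 1], X[y == 0]
    num = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    den = pos.var(axis=0) + neg.var(axis=0) + 1e-12  # avoid division by zero
    return num / den

def select_best(X, y, k=5):
    """Indices of the k features with the highest Fisher ratio."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]

def fit_centroids(X, y):
    """Class means for the non-target (0) and target (1) class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict(X, centroids):
    """Assign each row to the nearer class mean (1 = target phoneme)."""
    d0 = np.linalg.norm(X - centroids[0], axis=1)
    d1 = np.linalg.norm(X - centroids[1], axis=1)
    return (d1 < d0).astype(int)

def coverage(X_target, centroids):
    """Fraction of target-phoneme exemplars accepted by the classifier."""
    return predict(X_target, centroids).mean()

# Synthetic stand-in for "all /ae/'s in corpus and corresponding feature values".
rng = np.random.default_rng(0)
n, d = 400, 20
shift = np.zeros(d)
shift[:5] = 2.0                                # only features 0..4 carry information
X_other = rng.normal(size=(n, d))              # non-/ae/ segments
X_ae = rng.normal(size=(n, d)) + shift         # /ae/ exemplars
X = np.vstack([X_other, X_ae])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

best = select_best(X, y, k=5)
centroids = fit_centroids(X[:, best], y)

X_new = rng.normal(size=(200, d)) + shift      # unseen /ae/ exemplars
cov_frac = coverage(X_new[:, best], centroids)
print(sorted(best.tolist()), f"coverage = {cov_frac:.0%}")
```

Exemplars that the classifier rejects correspond to the "not covered" branch of the figure; under the exemplar view they would be candidates for a new exemplar of the same phoneme.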

Future Research Direction (III)

• Work packages:

– Regions: How to identify perceptually relevant regions in the (t,f)-plane?

  • Feature extraction: IMS, LSS (part already done), robustness

  • Feature selection: IMS (phonetically motivated), LSS (statistically motivated) + Combination of both + memory decay

– Evaluation: Are the identified regions relevant?

  • ... for speech representation in context? (IMS)

  • ... for usage-induced context information? (LSS)

– Transition to higher levels (pitch-accents (A1), syllables (A2), words (A4))

References

• Subglottal Resonances

– W. Wokurek and A. Madsack (2008), Messung subglottaler Resonanzen mit Beschleunigungssensoren, Fortschritte der Akustik--DAGA-2008 (Dresden) pp. 125-126

– A. Madsack, S. Lulich, W. Wokurek and G. Dogil (2008), Subglottal Resonances and Vowel Formant Variability: A Case Study of High German Monophthongs and Swabian Diphthongs, LabPhon11, Wellington

• Robust Speech Representation, Incremental Specification

– M. Lugger and B. Yang (2007), An incremental analysis of different feature groups in speaker independent emotion recognition, Proc. ICPhS 2007

– S. Uhlich and B. Yang (2008), A generalized optimal correlating transform for multiple description coding and its theoretical analysis, Proc. IEEE ICASSP 2008

– R. Blind, S. Uhlich, B. Yang and F. Allgöwer, Robustification and Optimization of a Kalman Filter with Measurement Loss using Linear Precoding, submitted to Proc. ACC 2009

– M. Lugger and B. Yang, Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features, to be published in: Speech Recognition, I-Tech Education and Publishing, Vienna, Austria

• Landmark/Phoneme Detection

– A. Madsack, G. Dogil, S. Uhlich, Y. Zeng and B. Yang, On the Importance of Timing Information in Plosive Detection, submitted to Proc. ICASSP 2009
