king’s speech foreign language: pronounce with...
TRANSCRIPT
KING’S SPEECH Foreign language: pronounce with style!
“Towards automatic evaluation of pronunciation”
July 5th 2017
KING’S SPEECH: SYSTEM OVERVIEW
Speech & video acquisition and visualization
Pronunciation analysis
Game scheduler
Avatar-based rendering
Empowerment analysis
Speech synthesis / database
STARTING POINT: TEXT AND SPEECH
Student
Teacher
Phonetic Transcription (fənɛtɪk trænskrɪpʃən)
IPA: Elva etydje
How well has the student repeated the teacher’s phrase?
What to evaluate?
Vowels
Consonants
Intonation
Accentuation
Rhythmic groups
Liaison
… and more …
[Diagram: Student; Teacher; Audio validation & pre-processing; VAD]
IS THE INPUT SPEECH GOOD ENOUGH?
Check the input signal!
Correct sampling frequency?
Sufficient signal energy?
Reasonable duration?
Has the signal clipped?
Does it contain noise?
…?
Apply Voice Activity Detection (VAD)
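The checks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the function names, the thresholds (expected rate, minimum RMS, duration bounds, clipping level, VAD threshold) and the simple energy-based VAD are all hypothetical choices, and noise estimation is left out.

```python
import math

def check_signal(samples, sample_rate, expected_rate=16000,
                 min_rms=0.01, min_dur=0.3, max_dur=10.0, clip_level=0.99):
    """Run basic sanity checks on a mono signal scaled to [-1, 1].

    All thresholds are illustrative assumptions. Returns a dict mapping
    each check name to True (passed) or False (failed).
    """
    duration = len(samples) / sample_rate
    rms = math.sqrt(sum(x * x for x in samples) / len(samples)) if samples else 0.0
    clipped = sum(1 for x in samples if abs(x) >= clip_level)
    return {
        "sampling_frequency": sample_rate == expected_rate,
        "energy": rms >= min_rms,
        "duration": min_dur <= duration <= max_dur,
        # tolerate at most 0.1 % of samples at (or beyond) the clipping level
        "no_clipping": clipped <= 0.001 * len(samples),
    }

def energy_vad(samples, sample_rate, frame_ms=20, threshold_db=-35.0):
    """Energy-based VAD (an assumed, very simple variant).

    Returns (start_s, end_s) spanning the first to the last frame whose
    level exceeds threshold_db, or None if no frame does.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    active = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        power = sum(x * x for x in frame) / frame_len
        level_db = 10 * math.log10(power + 1e-12)  # small offset avoids log(0)
        active.append(level_db > threshold_db)
    if True not in active:
        return None
    first = active.index(True)
    last = len(active) - 1 - active[::-1].index(True)
    return first * frame_len / sample_rate, (last + 1) * frame_len / sample_rate
```

A recording would typically be rejected (and the student asked to repeat) if any check fails, and the VAD interval would be used to trim leading and trailing silence before alignment.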
SPEECH ALIGNMENT: WHAT’S BEHIND THE SCENES?
Phonetic alignment
Uses speech recognition technology
Train a Hidden Markov Model (HMM)
Inputs: HMM model, speech file & transcription
Output: time intervals of phonemes in speech file
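The alignment step can be illustrated with a toy Viterbi-style forced alignment. The sketch assumes that some acoustic model (for example the trained HMM) has already produced a per-frame log-likelihood for each phoneme of the known transcription; the `loglik` table, the function name and the 10 ms frame step are hypothetical. It only shows how a fixed phoneme sequence is mapped to time intervals, not how the model is trained.

```python
def forced_align(loglik, frame_s=0.01):
    """Align a known phoneme sequence to frames (toy Viterbi, left-to-right).

    loglik[t][k] is the assumed log-likelihood of frame t under the k-th
    phoneme of the transcription. Every phoneme must cover at least one
    frame and phonemes appear in order. Returns a list of
    (phoneme_index, start_s, end_s) tuples.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phoneme")
    neg_inf = float("-inf")
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    stay = [[True] * n_phones for _ in range(n_frames)]  # True: same phoneme as previous frame
    best[0][0] = loglik[0][0]
    for t in range(1, n_frames):
        for k in range(n_phones):
            from_same = best[t - 1][k]
            from_prev = best[t - 1][k - 1] if k > 0 else neg_inf
            if from_prev > from_same:
                best[t][k] = from_prev + loglik[t][k]
                stay[t][k] = False
            else:
                best[t][k] = from_same + loglik[t][k]
    # Backtrack from the last phoneme at the last frame.
    k = n_phones - 1
    path = [k]
    for t in range(n_frames - 1, 0, -1):
        if not stay[t][k]:
            k -= 1
        path.append(k)
    path.reverse()
    # Merge runs of identical phoneme indices into time intervals.
    intervals, start = [], 0
    for t in range(1, n_frames + 1):
        if t == n_frames or path[t] != path[t - 1]:
            intervals.append((path[start], start * frame_s, t * frame_s))
            start = t
    return intervals
```

Because the transcription is given, the search only decides where each phoneme boundary falls; no recognition of unknown words takes place.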
[Diagram: Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən)]
SPEECH ALIGNMENT: HOW TO VALIDATE IT?
Remember: we don’t perform recognition!
Calculate the Low Frequency Modulated Energy (LFME)
Peaks correspond to vowels (well, most likely!)
Compare: peak locations vs. alignment results
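A rough sketch of this validation idea follows. The slides do not give the exact LFME definition, so the code uses a stand-in: short-time energy smoothed with a moving average (a crude low-pass filter on the energy envelope). The function names, the 10 ms hop, the smoothing width and the relative peak threshold are all assumptions made for illustration.

```python
def frame_energy(samples, sample_rate, hop_s=0.01):
    """Mean energy of consecutive, non-overlapping frames of hop_s seconds."""
    hop = int(sample_rate * hop_s)
    return [sum(x * x for x in samples[i:i + hop]) / hop
            for i in range(0, len(samples) - hop + 1, hop)]

def smooth(values, width):
    """Centered moving average; keeps only slow changes of the envelope."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def find_peaks(envelope, rel_threshold=0.3):
    """Indices of local maxima above rel_threshold times the global maximum."""
    limit = rel_threshold * max(envelope)
    return [i for i in range(1, len(envelope) - 1)
            if envelope[i] > limit
            and envelope[i] > envelope[i - 1]
            and envelope[i] >= envelope[i + 1]]

def validate_alignment(peak_times, vowel_intervals):
    """Fraction of aligned vowel intervals that contain at least one peak."""
    hits = sum(1 for start, end in vowel_intervals
               if any(start <= t <= end for t in peak_times))
    return hits / len(vowel_intervals)
```

With a 10 ms hop, peak index `i` corresponds roughly to time `(i + 0.5) * 0.01` seconds. A low score from `validate_alignment` suggests that the automatic alignment placed vowels where the signal shows no energy peak, so the alignment should not be trusted.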
[Diagram: Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən); LFME automatic alignment validation]
PRONUNCIATION ANALYSIS
Case study: Vowels
VOWELS ANALYSIS: WHAT FEATURES TO LOOK AT?
Formants: what are they?
How to calculate them?
Linear Prediction Coding (LPC)
Vocal tract length normalization (VTLN)
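Formants are the resonance frequencies of the vocal tract, and LPC is one common way to estimate them. The sketch below is a minimal illustration, not the project's code: it fits an all-pole model with the autocorrelation method (Levinson-Durbin) and reads formant candidates off the peaks of the LPC spectral envelope. The piecewise-linear `vtln_warp` is one widely used form of warping function; the slides do not say which form the project uses, so its shape, the `knee` parameter and all function names are assumptions.

```python
import cmath
import math

def lpc(frame, order):
    """LPC coefficients [1, a1, ..., a_order] via autocorrelation + Levinson-Durbin."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    if r[0] == 0.0:
        raise ValueError("silent frame")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a

def lpc_formants(frame, sample_rate, order, step_hz=10.0):
    """Formant candidates: local maxima of the LPC envelope 1/|A(e^jw)|."""
    a = lpc(frame, order)
    n_points = int(sample_rate / 2 / step_hz) + 1
    env = []
    for p in range(n_points):
        w = 2 * math.pi * p * step_hz / sample_rate
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(order + 1))
        env.append(1.0 / abs(denom))
    return [p * step_hz for p in range(1, n_points - 1)
            if env[p] > env[p - 1] and env[p] >= env[p + 1]]

def vtln_warp(freq_hz, alpha, nyquist_hz, knee=0.85):
    """Assumed piecewise-linear warp: scale by alpha, then join linearly to Nyquist."""
    f0 = knee * nyquist_hz * min(1.0, 1.0 / alpha)
    if freq_hz <= f0:
        return alpha * freq_hz
    # straight line from (f0, alpha * f0) to (nyquist_hz, nyquist_hz)
    slope = (nyquist_hz - alpha * f0) / (nyquist_hz - f0)
    return alpha * f0 + slope * (freq_hz - f0)
```

The LPC order controls how many resonances the model can represent (each formant needs one pole pair), which is why choosing the order matters. Warping the estimated formant frequencies with a speaker-specific `alpha` is meant to reduce differences caused by vocal tract length, so that student and teacher values become comparable.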
[Diagram: Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən); LFME automatic alignment validation; Formants estimation; LPC; Optimal order; Warping function (VTLN)]
VOWELS ANALYSIS: WHAT ABOUT NASALS?
Nasals: what are they?
How to distinguish?
Compute smoothed power spectrum
Find peaks in regions related to formants
Use peak information as acoustic parameters
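The three steps above can be sketched as follows. This is an assumed, simplified version: the power spectrum comes from a plain DFT of one frame (no analysis window, which a real system would normally apply), the smoothing is a triangular moving average over neighbouring bins, and the frequency regions are supplied by the caller. Function names and parameters are hypothetical.

```python
import cmath
import math

def power_spectrum(frame, sample_rate):
    """Naive DFT power spectrum of one frame (fine for short frames)."""
    n = len(frame)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power

def smooth_triangular(values, half_width=2):
    """Smooth across bins with triangular weights (centre bin weighs most)."""
    out = []
    for i in range(len(values)):
        total = weight_sum = 0.0
        for d in range(-half_width, half_width + 1):
            j = i + d
            if 0 <= j < len(values):
                w = half_width + 1 - abs(d)
                total += w * values[j]
                weight_sum += w
        out.append(total / weight_sum)
    return out

def region_peaks(freqs, power, regions):
    """For each (low_hz, high_hz) region return (peak_freq_hz, peak_level_db).

    Assumes every region contains at least one spectrum bin.
    """
    out = []
    for low, high in regions:
        idx = [i for i, f in enumerate(freqs) if low <= f <= high]
        best = max(idx, key=lambda i: power[i])
        out.append((freqs[best], 10 * math.log10(power[best] + 1e-12)))
    return out
```

The returned peak frequencies and levels are the kind of acoustic parameters that could feed a nasal versus oral decision, for example by comparing the level of a low-frequency peak with the peaks near the oral formants. Which regions and which decision rule to use is not stated in the slides.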
[Diagram: Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən); LFME automatic alignment validation; Formants estimation; LPC; Optimal order; Warping function (VTLN); Nasalization evaluation (/ɛ/ vs. /ɛ̃/)]
VOWELS ANALYSIS: COMPARISON AND FEEDBACK
[Diagram: Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən); LFME automatic alignment validation; Formants estimation; LPC; Optimal order; Warping function (VTLN); Nasalization evaluation; Statistical analysis]
How to compare vowels?
Distance between formant vectors (student/teacher)
Context is important!
Check distance from all other vowels
Orals with orals, nasals with nasals
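The comparison logic can be sketched as a nearest-reference check. The code below assumes plain Euclidean distance between formant vectors in Hz and a table of teacher reference vectors per vowel; both are illustrative choices, and the numbers in the usage example are invented placeholders, not measured data.

```python
import math

def distance(a, b):
    """Euclidean distance between two formant vectors, e.g. (F1, F2) in Hz."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compare_vowel(student, target, teacher_refs, nasal_vowels):
    """Compare a student vowel with the teacher's vowels of the same class.

    student      : formant vector measured for the student's vowel
    target       : label of the vowel the student was supposed to produce
    teacher_refs : dict mapping vowel label to teacher formant vector
    nasal_vowels : set of labels treated as nasal; all others count as oral
    """
    target_is_nasal = target in nasal_vowels
    # Orals are compared with orals only, nasals with nasals only.
    same_class = [v for v in teacher_refs
                  if (v in nasal_vowels) == target_is_nasal]
    dists = {v: distance(student, teacher_refs[v]) for v in same_class}
    nearest = min(dists, key=dists.get)
    return {
        "target_distance": dists[target],
        "nearest": nearest,
        "confused": nearest != target,  # student is closer to another vowel
    }

# Usage with invented placeholder values (F1, F2 in Hz), for illustration only.
refs = {"i": (300, 2300), "e": (400, 2100), "a": (750, 1400), "ɛ̃": (600, 1500)}
result = compare_vowel((310, 2280), "e", refs, nasal_vowels={"ɛ̃"})
print(result["nearest"], result["confused"])
```

Looking at the distance to all other vowels, and not only to the target, gives more useful feedback: a student vowel that is far from the target but still closest to it is a different situation from one that has drifted into a neighbouring vowel.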
VOWELS ANALYSIS: THE FULL PICTURE
[Diagram: Student; Teacher; Audio validation & pre-processing; VAD; Speech alignment; Phonetic Transcription (fənɛtɪk trænskrɪpʃən); LFME automatic alignment validation; Formants estimation; LPC; Optimal order; Warping function (VTLN); Nasalization evaluation; Statistical analysis]
KING’S SPEECH: COMING NEXT
Speech & video acquisition and visualization
Pronunciation analysis
Game scheduler
Avatar-based rendering
Empowerment analysis
Speech synthesis / database
19th of July: “Modular architecture for immersive learning applications”
Language Karaoke scenario
Deep learning phoneme evaluation
Student-Teacher Facial and Speech synchronization
THANK YOU FOR YOUR ATTENTION!