Uitspraakevaluatie & training met behulp van spraaktechnologie
Pronunciation assessment & training by means of speech technology
Helmer Strik – and many others
Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, the Netherlands
Radboud University Nijmegen
Leuven, 28-04-2007
Context
'Deviant' pronunciation (e.g., pathology, non-natives) & speech technology (applications):
Assessment: diagnosis, monitoring
Training (therapy, learning): speaking & listening; reading aloud
AAC (Augmentative & Alternative Communication): improve communication
Our research
Past:
Fluency assessment: temporal measures
CAPT: Computer Assisted Pronunciation Training
Pronunciation error detection
Recognition of dysarthric speech
Current, future:
OSTT: Ontwikkelcentrum voor Spraak- en Taaltechnologie ten behoeve van Spraak- en Taalpathologie en Revalidatietechnologie (development centre for speech and language technology in support of speech and language pathology and rehabilitation technology)
Training & error detection, not only for pronunciation but also for other (e.g. morpho-syntactic) aspects
CAPT: Computer Assisted Pronunciation Training
Pronunciation errors – detected automatically by means of Automatic Speech Recognition (ASR) → feedback
Question: ASR-based CAPT: Is it effective?
Goal: To study the effectiveness and possible advantages of ASR-based CAPT
Target users: Adult learners of Dutch with different L1s
Pedagogical goal: Improving segmental quality in pronunciation
Dutch CAPT: feedback
Content: focus on problematic phonemes
Criteria:
1. Common across speakers of various L1s
2. Perceptually salient
3. Frequent
4. Persistent
5. Robust for automatic detection (ASR)
Result: 11 'targeted phonemes': 9 vowels and 2 consonants
11 'targeted phonemes'

| IPA symbol | Example |
|---|---|
| /x/ | toch, Scheveningen |
| /h/ | hand, Helmer |
| /ɑ/ | pat |
| /aː/ | naam |
| /ɪ/ | pit |
| /ʏ/ | put |
| /y/ | vuur |
| /u/ | voer |
| /øː/ | deur |
| /ɛi/ | fijn |
| /œy/ | huis |
Video (from Nieuwe Buren)
Video: dialogue
Max. 3 times
Experiment: participants & training
Regular teacher-fronted lessons: 4-6 hrs per week
a) Experimental group (EXP): n=15 (10 F, 5 M), Dutch CAPT
b) Control group 1 (NiBu): n=10 (4 F, 6 M), reduced version of Nieuwe Buren
c) Control group 2 (noXT): n=5 (3 F, 2 M), no extra training
Extra training: 4 weeks x 1 session of 30 to 60 minutes
1 class – 1 type of training
Experiment: testing
3 analyses:
1. Participants' evaluations: questionnaires on the system's usability, accessibility, usefulness, etc.
2. Global segmental quality: 6 experts rated stimuli on a 10-point scale (pretest/posttest, phonetically balanced sentences)
3. In-depth analysis of segmental errors: expert annotations
Results: participants’ evaluations
Positive reactions
Enjoyed working with the system
Believed in the usefulness of the system
Results: Global segmental quality
[Figure: mean global segmental quality rating (y-axis 3 to 6.5) at pretest and posttest for the EXP, NiBu and noXT groups]
All 3 groups improve (mean improvement).
EXP improved most.
In-depth analysis of segmental errors
[Figure: percentage of segmental errors (y-axis 0% to 25%) at pretest and posttest, for targeted and untargeted phonemes, EXP vs. NiBu]
Conclusions
Goal: To study the effectiveness and possible advantages of ASR-based CAPT
Question: ASR-based CAPT: Is it effective?
Answer: Yes! It is effective in improving the pronunciation of targeted phonemes.
Advantages: ASR-based CAPT can provide automatic, instantaneous, individual feedback on pronunciation in a private environment.
Video: pronouncing words
Error detection
Detection of pronunciation errors:
Goodness Of Pronunciation (GOP): Silke Witt & Steve Young
Acoustic-phonetic features (APF): Khiet Truong et al.
Goal: improve error detection
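
The slide names GOP without defining it. As a minimal sketch, assuming the commonly cited Witt & Young formulation: the score is the absolute difference between the log-likelihood of the target phone (from forced alignment) and that of the best-scoring phone, divided by the number of frames in the segment. The function names, the example log-likelihoods and the threshold below are hypothetical and are not taken from the system described in this talk.

```python
def gop_score(loglik_by_phone, target, n_frames):
    """Duration-normalised GOP score for one phone segment.

    loglik_by_phone: log-likelihood of the segment under each phone model
                     (hypothetical values; in practice these come from an ASR engine).
    target:          the phone the speaker was supposed to produce.
    n_frames:        number of acoustic frames in the segment.
    """
    forced = loglik_by_phone[target]      # likelihood of the intended phone
    best = max(loglik_by_phone.values())  # likelihood of the best competing phone
    return abs(forced - best) / n_frames


def is_accepted(gop, threshold):
    """Accept the realisation when the score stays at or below the threshold."""
    return gop <= threshold


# Hypothetical segment: the learner aimed at /x/ but /k/ fits the audio better.
example = {"x": -120.0, "k": -100.0, "h": -150.0}
score = gop_score(example, target="x", n_frames=10)
print(score, is_accepted(score, threshold=1.5))  # threshold is an illustrative value
```

A score of zero means the target phone was itself the best match; larger scores indicate a realisation that some other phone model explains better.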
Goodness Of Pronunciation (GOP): Accuracy
15 participants, 2174 target phones

| | Accept | Reject | Total |
|---|---|---|---|
| Correct | CA: 59.5% | CR: 26.5% | C: 86.0% |
| False | FA: 9.2% | FR: 4.8% | F: 14.0% |
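
The four cells of the table can be turned into summary measures. The sketch below assumes the usual reading of the labels, which the slide does not spell out: CA = correct realisation accepted, CR = error correctly rejected, FA = error wrongly accepted, FR = correct realisation wrongly rejected. Under that assumption, precision and recall describe how well errors are detected. The function name is hypothetical.

```python
# Percentages from the slide (share of the 2174 target phones).
rates = {"CA": 59.5, "CR": 26.5, "FA": 9.2, "FR": 4.8}


def detection_metrics(r):
    """Summary measures for error detection, assuming CR = detected errors."""
    accuracy = r["CA"] + r["CR"]                 # all correct decisions, in %
    precision = r["CR"] / (r["CR"] + r["FR"])    # rejected phones that were real errors
    recall = r["CR"] / (r["CR"] + r["FA"])       # real errors that were rejected
    return accuracy, precision, recall


accuracy, precision, recall = detection_metrics(rates)
print(f"accuracy={accuracy:.1f}%  precision={precision:.3f}  recall={recall:.3f}")
```

The accuracy reproduces the 86.0% total in the table; precision and recall are derived values that depend on the label interpretation stated above.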
Acoustic-phonetic features (APF)
Selection of segmental pronunciation errors:
/A/ mispronounced as /a:/ (man - maan)
/Y/ mispronounced as /u/ or /y/ (tut – toet or tuut)
/x/ mispronounced as /k/ or /g/ (gat – kat or /g/at)
[Figure: amplitude and Rate Of Rise (ROR)]
Height of the highest ROR peak (‘ROR’)
1 amplitude measurement before the ROR peak (‘i1’)
3 amplitude measurements after the ROR peak (‘i2’, ‘i3’, ‘i4’)
Duration (‘rawdur’ or ‘normdur’ or not used at all ‘nodur’)
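
The feature list above can be sketched in code. This is an illustration only, under assumptions the slides do not state: the amplitude contour is a log RMS envelope, ROR is the frame-to-frame difference of that envelope, and the amplitude measurements are taken one envelope frame apart. The window size, hop size, step and function names are all hypothetical; duration is left out.

```python
import math


def log_rms_envelope(samples, win=240, hop=80):
    """Log RMS amplitude (dB) per frame; win and hop are illustrative sizes."""
    env = []
    for start in range(0, max(len(samples) - win + 1, 1), hop):
        frame = samples[start:start + win]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        env.append(20.0 * math.log10(rms + 1e-10))  # small offset avoids log(0)
    return env


def ror_features(env, step=1):
    """Return [ROR, i1, i2, i3, i4] around the highest rate-of-rise peak."""
    if len(env) < 2:
        raise ValueError("need at least two envelope frames")
    # Rate of rise: difference between consecutive envelope values.
    ror = [env[t + 1] - env[t] for t in range(len(env) - 1)]
    peak = max(range(len(ror)), key=lambda t: ror[t])

    def at(t):
        # Clamp the index so measurements near the edges stay in range.
        return env[min(max(t, 0), len(env) - 1)]

    i1 = at(peak - step)                                    # one measurement before the peak
    after = [at(peak + k * step) for k in (1, 2, 3)]        # three measurements after it
    return [ror[peak], i1] + after


# Synthetic example: near-silence followed by an abrupt burst, as in a plosive.
signal = [0.001] * 800 + [0.5] * 800
features = ror_features(log_rms_envelope(signal))
print(features)
```

An abrupt onset such as the release of /k/ gives a high ROR peak, whereas a fricative such as /x/ would be expected to rise more gradually.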
Accuracy (%), /x/ vs. /k/
[Figure: accuracy (y-axis 76 to 94) of GOP vs. APF for male (A), male (B), female (A) and female (B)]
Error detection
Goodness Of Pronunciation (GOP): one general method for all sounds; error-specific knowledge is not used.
Acoustic-phonetic features (APF): error-specific knowledge is used; works well. How to generalize? (articulatory + other features)
Combination?
Other approaches, e.g. posterior probabilities (ANN)?
[Figure: system architecture. A client and a server communicate over sockets, exchanging control signals (XML) and sound samples. Component labels, in the order extracted: XML handler, sample handler, UI, EPD, Audio I/O, content; XML handler, sample handler, daemon, utilities, feature extraction, utterance verification (error detection), ASR (HTK), database, classification, acoustic analysis, pronunciation evaluation (Praat)]
Dutch CAPT
Gender-specific, Dutch & English version.
4 units, each containing:
1 video (from Nieuwe Buren) with real-life + amusing situations
ca. 30 exercises based on the video: dialogues, question-answer, minimal pairs, word repetition
Sequential, constrained navigation: min. one attempt needed to proceed to next exercise, maximum 3
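
The navigation rule (at least one attempt before moving on, at most three attempts per exercise) can be sketched as a small state holder. This is an illustration of the rule as stated on the slide, not the actual Dutch CAPT code; the class and method names are hypothetical.

```python
MIN_ATTEMPTS = 1  # attempts required before the learner may proceed
MAX_ATTEMPTS = 3  # attempts allowed per exercise


class ExerciseSequence:
    """Sequential, constrained navigation through a fixed list of exercises."""

    def __init__(self, n_exercises):
        self.n_exercises = n_exercises
        self.current = 0   # index of the exercise being worked on
        self.attempts = 0  # attempts made on the current exercise

    def record_attempt(self):
        """Register one attempt; refuse once the maximum is reached."""
        if self.attempts >= MAX_ATTEMPTS:
            return False
        self.attempts += 1
        return True

    def next_exercise(self):
        """Move on only after the minimum number of attempts, never past the end."""
        if self.attempts < MIN_ATTEMPTS or self.current >= self.n_exercises - 1:
            return False
        self.current += 1
        self.attempts = 0
        return True


unit = ExerciseSequence(n_exercises=30)
print(unit.next_exercise())   # no attempt yet, so the learner cannot proceed
unit.record_attempt()
print(unit.next_exercise())   # one attempt made, so the learner can proceed
```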
Results: reliability global ratings
Cronbach’s α:
Intrarater: 0.94 – 1.00
Interrater: 0.83 - 0.96
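
For readers unfamiliar with the statistic, Cronbach's α for k raters is k/(k-1) multiplied by (1 minus the sum of the per-rater variances divided by the variance of the summed ratings). A minimal sketch follows; the function name and the ratings are made up for illustration and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha; each inner list holds one rater's scores for the same stimuli."""
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("need at least two raters")
    # Sum of the variances of each rater's scores.
    rater_variance_sum = sum(variance(r) for r in ratings_by_rater)
    # Variance of the per-stimulus totals across raters.
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    return k / (k - 1) * (1 - rater_variance_sum / variance(totals))


# Hypothetical ratings on a 10-point scale: 3 raters, 5 stimuli.
ratings = [
    [4, 6, 5, 7, 3],
    [5, 6, 5, 8, 3],
    [4, 7, 6, 7, 2],
]
print(round(cronbach_alpha(ratings), 2))
```

Values close to 1 indicate that the raters order the stimuli consistently, which is the sense in which the reported ranges indicate high reliability.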
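| L1 | Counts per training group (columns EXP, NiBu, NoXt; column alignment lost in extraction) | Total |
|---|---|---|
| Arabic | 6 | 6 |
| Bengali | 1 | 1 |
| Catalan | 2 | 2 |
| English | 1, 1, 1 | 3 |
| German | 1 | 1 |
| Greek | 2 | 2 |
| Hebrew | 1 | 1 |
| Italian | 1, 1 | 2 |
| Lithuanian | 1 | 1 |
| Polish | 2, 1, 2 | 5 |
| Russian | 1 | 1 |
| Spanish | 1 | 1 |
| Swedish | 1 | 1 |
| Turkish | 2 | 2 |
| Ukrainian | 1 | 1 |
| Total | EXP 15, NiBu 10, NoXt 5 | 30 |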
Results: Global ratings
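[Figure: mean global ratings (y-axis 3.0 to 6.0) at pretest and posttest for Exp, Exp_Ar, Exp_IE, NiBu and noXT. Data labels as extracted, assignment to groups not recoverable: 3.7, 4.7, 3.3, 4.0, 4.0, 5.1, 5.0, 5.6, 5.6, 6.0]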
Possible improvements
Increase sample size (more participants)
Increase training intensity (more training)
Match training groups: L1s, proficiency, etc.
Give feedback on more phonemes
More targeted systems for fixed L1-L2 pairs
Give feedback on suprasegmentals
Improve error detection?
Error detection
Pronunciation errors:
11 'problematic sounds': 9 V + 2 C
Goal: give feedback on more sounds
Morpho-syntactic errors:
maak / maakt / maken (forms of the Dutch verb 'to make')
o Ik maak (I make)
o Hij/zij maakt (he/she makes)
o Wij maken (we make)
Goal: also give feedback on morpho-syntactic aspects
Goodness Of Pronunciation (GOP)
GOP has been applied in the experimental system.
The experimental system was effective.
Evaluate GOP:
Correct vs. errors
Patterns
Pros & cons
Improve
Results /x/ vs /k/, male speakers
[Figure: scoring accuracy (in %, y-axis 50 to 100) for GOP, Weigelt, LDA-APF and LDA-MFCC under test condition A and test condition B. Data labels as extracted, assignment to bars not recoverable: 83.58, 74.15, 85.50, 86.77, 88.96, 76.97, 93.06, 89.91]
Results /x/ vs /k/, female speakers
[Figure: scoring accuracy (in %, y-axis 50 to 100) for GOP, Weigelt, LDA-APF and LDA-MFCC under test condition A and test condition B. Data labels as extracted, assignment to bars not recoverable: 85.40, 80.00, 88.93, 89.49, 81.84, 83.09, 92.90, 91.65]
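Results method II (LDA), /x/ vs /k/
Feature sets: S1 = [i1 i3], S2 = [ROR i1 i2 i3 i4]
[Figure: correct classification (in %, y-axis 80 to 100) for feature sets S1 and S2, male and female speakers, with 'nodur' and 'normdur', in two panels (Exp. A.1 and Exp. A.2). Conditions listed: Training = DL2N1-Nat, Test = DL2N1-Nat; Training = DL2N1-NN, Test = DL2N1-NN. Data labels as extracted, assignment to bars not recoverable: 97, 96, 89, 96, 94, 94, 91, 93, 96, 96, 88, 96, 93, 91, 86, 90]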