letter to phoneme alignment
LETTER TO PHONEME ALIGNMENT
Reihaneh Rabbany
Shahin Jabbari
OUTLINE
Motivation
Problem and its Challenges
Relevant Works
Our Work
◦ Formal Model
◦ EM
◦ Dynamic Bayesian Network
Evaluation
◦ Letter to Phoneme Generator
◦ AER
Result
TEXT TO SPEECH PROBLEM
Conversion of Text to Speech: TTS
◦ Automated Telecom Services
◦ E-mail by Phone
◦ Banking Systems
◦ Handicapped People
PRONUNCIATION
Pronunciation of the words
◦ Dictionary Words
◦ Non-Dictionary Words
Phonetic Analysis
Dictionary Look-up
◦ Language is alive, new words are added
◦ Proper Nouns
[Diagram: Word → Phonetic Analysis → Pronunciation]
PROBLEM
Letter to Phoneme (L2P) Alignment
◦ Letter: c a k e
◦ Phoneme: k ei k
CHALLENGES
No Consistency
◦ City / s /
◦ Cake / k /
◦ Kid / k /
No Transparency
◦ K i d (3) / k i d / (3)
◦ S i x (3) / s i k s / (4)
◦ Q u e u e (5) / k j u: / (3)
◦ A x e (3) / a k s / (3)
ONE-TO-ONE EM
Daelemans et al., 1996
Length of word = length of pronunciation
Produce all possible alignments
◦ Inserting null letter/phoneme
Alignment probability
$P(A) = \prod_i P(p_i \mid l_i)$
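The one-to-one idea can be sketched in a few lines: pad the shorter sequence with nulls in every possible way, then score each candidate as the product of P(p_i | l_i). This is only an illustration under assumptions: the `NULL` symbol, the function names, the probability floor for unseen pairs, and the toy probability table are all hypothetical, and only the shorter sequence is padded.

```python
from itertools import combinations
from math import prod

NULL = "_"  # hypothetical symbol for the null letter/phoneme


def pad_with_nulls(seq, length):
    """Yield every way to insert NULLs into seq so that it reaches `length`."""
    for keep in combinations(range(length), len(seq)):
        padded = [NULL] * length
        for pos, sym in zip(keep, seq):
            padded[pos] = sym
        yield padded


def one_to_one_alignments(letters, phonemes):
    """Yield alignments as lists of (letter, phoneme) pairs.

    Simplification: only the shorter sequence receives nulls.
    """
    n = max(len(letters), len(phonemes))
    for ls in pad_with_nulls(letters, n):
        for ps in pad_with_nulls(phonemes, n):
            yield list(zip(ls, ps))


def alignment_probability(alignment, prob, floor=1e-6):
    """P(A) = product over i of P(p_i | l_i); unseen pairs get a small floor."""
    return prod(prob.get(pair, floor) for pair in alignment)


# Toy probabilities, invented for the example only.
toy_prob = {("c", "k"): 0.6, ("a", "ei"): 0.5, ("k", "k"): 0.9, ("e", NULL): 0.7}
candidates = list(one_to_one_alignments(list("cake"), ["k", "ei", "k"]))
best = max(candidates, key=lambda a: alignment_probability(a, toy_prob))
```

With this toy table, the four candidates for "cake" differ only in where the null phoneme goes, and the highest-scoring one pairs the final "e" with the null.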
DECISION TREE
Black et al., 1996
Train a CART Using Aligned Dictionary
Why CART?
A Single Tree for Each Letter
KONDRAK
Alignments are not always one-to-one
◦ A x e / a k s /
◦ B oo k / b ú k /
Only Null Phoneme
Similar to one-to-one EM
◦ Produce All Possible Alignments
◦ Compute the Probabilities
FORMAL MODEL
Word: sequence of letters
$L = l_1 l_2 \ldots l_n$
Pronunciation: sequence of phonemes
$P = p_1 p_2 \ldots p_m$
Alignment: sequence of subalignments
$A = a_1 a_2 \ldots a_k, \quad a_i = \langle L_i, P_i \rangle, \quad |L_i|, |P_i| \le 2$
Problem: Finding the most probable alignment
$A_{best} = \arg\max_A P(A \mid L, P)$
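To make the search space concrete, the candidate alignments of one word can be enumerated recursively. The sketch below assumes each subalignment pairs a letter chunk with a phoneme chunk of one or two symbols each and that empty chunks are not allowed; the function name and the `max_len` parameter are hypothetical.

```python
def many_to_many_alignments(letters, phonemes, max_len=2):
    """Yield alignments as lists of (letter_chunk, phoneme_chunk) tuples.

    Assumption: every chunk holds between 1 and max_len symbols.
    """
    if not letters and not phonemes:
        yield []  # both sequences consumed: one complete alignment
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in many_to_many_alignments(letters[i:], phonemes[j:], max_len):
                yield [head] + rest


cake = list(many_to_many_alignments(list("cake"), ["k", "ei", "k"]))
```

For "cake" and / k ei k / this yields five candidates, one of which groups the letters "k e" with the final phoneme "k".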
MANY-TO-MANY EM
1. Initialize prob(SubAlignments)
// Expectation Step
2. For each word in training_set
   2.1. Produce all possible alignments
   2.2. Choose the most probable alignment
// Maximization Step
3. For all subalignments
   3.1. Compute new_p(SubAlignments)
$P(a_i) = \frac{M[l_i, p_i]}{M[l_i]}$
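The loop above can be sketched as a small hard-EM routine. This is not the authors' implementation: it assumes M counts occurrences in the alignments chosen during the expectation step, that unseen subalignments start at a flat value `init`, and that every training pair has at least one valid alignment. All names are hypothetical.

```python
from collections import Counter
from math import prod


def alignments(letters, phonemes, max_len=2):
    """Yield all alignments with letter and phoneme chunks of size 1..max_len."""
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in alignments(letters[i:], phonemes[j:], max_len):
                yield [head] + rest


def hard_em(training_set, iterations=5, init=0.1):
    """Alternate between choosing best alignments and re-estimating P(a_i)."""
    prob = {}  # P(subalignment); unseen subalignments fall back to `init`
    best = []
    for _ in range(iterations):
        # Expectation step: keep the most probable alignment of each word.
        best = [
            max(alignments(letters, phonemes),
                key=lambda a: prod(prob.get(sub, init) for sub in a))
            for letters, phonemes in training_set
        ]
        # Maximization step: P(a_i) = M[l_i, p_i] / M[l_i].
        pair_count = Counter(sub for a in best for sub in a)
        letter_count = Counter(sub[0] for a in best for sub in a)
        prob = {sub: c / letter_count[sub[0]] for sub, c in pair_count.items()}
    return prob, best


training = [
    (list("cake"), ["k", "ei", "k"]),
    (list("kid"), ["k", "i", "d"]),
    (list("axe"), ["a", "k", "s"]),
]
prob, best = hard_em(training)
```

Because the estimate divides by the count of the letter chunk, the probabilities of all phoneme chunks seen with a given letter chunk sum to one.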
DYNAMIC BAYESIAN NETWORK
Model
◦ Subalignments are considered as hidden variables
◦ Learn DBN by EM
[Figure: network over $l_i$, $P_i$ and $a_i$]
$P(A) = \prod_{i=1}^{k} P(a_i \mid L_i, P_i)$
$P(a_i) = \frac{M[a_i]}{M[p_i, l_i]}$
CONTEXT DEPENDENT DBN
Context independence assumption
◦ Makes the model simpler
◦ It is not always a correct assumption
◦ Example: Chat and Hat
Model
[Figure: network over $l_i$, $P_i$, $a_{i-1}$ and $a_i$]
$P(A) = \prod_{i=1}^{k} P(a_i \mid a_{i-1}, L_i, P_i)$
$P(a_i) = \frac{M[a_i]}{M[a_{i-1}, p_i, l_i]}$
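The effect of conditioning on the previous subalignment can be illustrated with a simplified first-order model. The sketch below is an assumption-laden stand-in, not the estimate on the slide: it conditions each subalignment only on its predecessor, estimates that probability from counts over already aligned words, and uses a hypothetical `START` marker and probability floor. The ASCII phoneme symbols in the toy data are also invented.

```python
from collections import Counter
from math import prod

START = ("<s>", "<s>")  # hypothetical marker for "no previous subalignment"


def train_context_model(aligned_words):
    """Count (previous, current) subalignment pairs in already aligned words."""
    bigram, context = Counter(), Counter()
    for alignment in aligned_words:
        prev = START
        for sub in alignment:
            bigram[(prev, sub)] += 1
            context[prev] += 1
            prev = sub
    return bigram, context


def context_dependent_probability(alignment, bigram, context, floor=1e-6):
    """P(A) = product over i of P(a_i | a_{i-1}), estimated from the counts."""
    factors = []
    prev = START
    for sub in alignment:
        seen = bigram[(prev, sub)]
        factors.append(seen / context[prev] if seen else floor)
        prev = sub
    return prod(factors)


# Toy aligned data for "chat" and "hat".
aligned = [
    [("ch", "tS"), ("a", "ae"), ("t", "t")],
    [("h", "h"), ("a", "ae"), ("t", "t")],
]
bigram, context = train_context_model(aligned)
p_hat = context_dependent_probability(aligned[1], bigram, context)
```

In this toy data the subalignment for "h" is only ever seen at the start of a word, so a candidate that places it after another subalignment falls back to the floor value.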
EVALUATION DIFFICULTIES
Unsupervised Evaluation
◦ No Aligned Dictionary
Solutions
◦ How much it boosts a supervised module: Letter to Phoneme Generator
◦ Comparing the result with a gold alignment: AER
LETTER TO PHONEME GENERATOR
Percentage of correctly generated phonemes and words
How does it work?
◦ Finding Chunks: Binary Classification Using Instance-Based Learning
◦ Phoneme Prediction
   Phoneme is predicted independently for each letter
   Phoneme is predicted for each chunk
◦ Hidden Markov Model
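The two percentages reported for the generator can be computed as below. The slides do not say how phonemes are matched, so this sketch assumes a plain position-by-position comparison in which missing or extra phonemes count as errors; the function names and toy data are hypothetical.

```python
def word_accuracy(predicted, reference):
    """Percentage of words whose whole phoneme sequence is correct."""
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


def phoneme_accuracy(predicted, reference):
    """Percentage of phonemes that match position by position.

    Assumption: the longer of the two sequences sets the denominator,
    so missing or extra phonemes count as errors.
    """
    correct = total = 0
    for p, r in zip(predicted, reference):
        correct += sum(a == b for a, b in zip(p, r))
        total += max(len(p), len(r))
    return 100.0 * correct / total


reference = [["k", "ei", "k"], ["k", "i", "d"]]
predicted = [["k", "ei", "k"], ["k", "i", "t"]]
word_acc = word_accuracy(predicted, reference)
phoneme_acc = phoneme_accuracy(predicted, reference)
```

Here one of the two words is fully correct and five of the six phonemes match, which shows why word accuracy is always the stricter of the two figures.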
ALIGNMENT ERROR RATIO
AER: Evaluating by Alignment Error Ratio
◦ Counting common pairs between our aligned output ($A$) and the gold alignment ($G$)
◦ Calculating AER
$\mathrm{AER} = \frac{|A \cap G|}{|A|}$
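Counting common pairs can be sketched as follows, assuming the formula on this slide is the number of pairs shared with the gold alignment divided by the number of pairs in our output. Whether the original slide subtracts this ratio from 1 cannot be recovered from the transcript. Representing each alignment as a set of (letter index, phoneme index) links is also an assumption, and the names are hypothetical.

```python
def links(alignment):
    """Turn [(letter_chunk, phoneme_chunk), ...] into (letter_idx, phoneme_idx) links."""
    result = set()
    letter_pos = phoneme_pos = 0
    for letter_chunk, phoneme_chunk in alignment:
        for a in range(len(letter_chunk)):
            for b in range(len(phoneme_chunk)):
                result.add((letter_pos + a, phoneme_pos + b))
        letter_pos += len(letter_chunk)
        phoneme_pos += len(phoneme_chunk)
    return result


def common_pair_ratio(output, gold):
    """|A ∩ G| / |A| for one word, with A and G as sets of links."""
    a, g = links(output), links(gold)
    return len(a & g) / len(a)


gold = [(("c",), ("k",)), (("a",), ("ei",)), (("k", "e"), ("k",))]
output = [(("c",), ("k",)), (("a", "k"), ("ei",)), (("e",), ("k",))]
ratio = common_pair_ratio(output, gold)
```

In this toy case three of the four output links also appear in the gold alignment.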
RESULTS
10-fold cross validation

| Model | Word Accuracy (%) | Phoneme Accuracy (%) |
|---|---|---|
| Best previous results | 66.82 | 92.45 |
| One_To_One EM | 53.87 | 85.66 |
| Many_To_Many EM | 76 | 94.5 |
| DBN, Context Independent | 79.12 | 95.23 |
| DBN, Context Dependent | 81.54 | 96.70 |