Iowa State University
Developmental Robotics Laboratory
Unsupervised Segmentation of Audio Speech using the Voting Experts Algorithm
Matthew Miller, Alexander Stoytchev
Developmental Robotics Lab
Department of Electrical and Computer Engineering
Iowa State University
[email protected], [email protected]/~mamille/
Language: A Grand Challenge
• A working example
• Automatically acquires language
• Well studied
Statistical Learning Experiments
• Saffran et al. (1996): 8-month-olds can segment speech.
Artificial Language: tupiro golabu bedaku padoti
Language: tu pi ro go la bu be da ku
Transition Prob: 1.0 1.0 .25 1.0 1.0 .25 1.0 1.0 ...
[Figure labels: Acclimate, Novel Word]
• Hypothesis: Infants use local minima in single-syllable transition probabilities to segment speech streams.
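The hypothesis can be illustrated with a small simulation. This is a minimal sketch, not the procedure used in the infant studies: it assumes words are drawn uniformly at random (repeats allowed), that every syllable is two letters, and the names `transition_probs` and `segment_at_minima` are hypothetical.

```python
import random
from collections import Counter

WORDS = ["tupiro", "golabu", "bedaku", "padoti"]  # artificial language from the slide


def syllables(word):
    """Split a word into two-letter (consonant + vowel) syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def transition_probs(stream):
    """Estimate Pr(next syllable | current syllable) from bigram counts."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment_at_minima(stream, probs):
    """Break before a syllable when the transition into it is a strict local minimum."""
    tp = [probs[(a, b)] for a, b in zip(stream, stream[1:])]
    words, current = [], [stream[0]]
    for i, syl in enumerate(stream[1:]):
        left = tp[i - 1] if i > 0 else float("inf")
        right = tp[i + 1] if i + 1 < len(tp) else float("inf")
        if tp[i] < left and tp[i] < right:
            words.append("".join(current))
            current = []
        current.append(syl)
    words.append("".join(current))
    return words


rng = random.Random(0)  # fixed seed so the sketch is repeatable
stream = [s for _ in range(400) for s in syllables(rng.choice(WORDS))]
probs = transition_probs(stream)
segments = segment_at_minima(stream, probs)
```

Within-word transitions come out at exactly 1.0 because each syllable belongs to one word, while transitions across word boundaries fall to roughly .25, so the dips line up with the word boundaries.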
Voting Experts
• An algorithm for unsupervised segmentation
• Key Idea: Natural “chunks” have:
  – Low Internal Information
  – High Boundary Entropy
itwasabrightcolddayinaprilandtheclockswere
$I(\text{"bright"}) = -\log(\Pr(\text{"bright"}))$
$I(\text{"bright"}) < I(\text{"rightc"})$
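Internal information can be estimated from n-gram counts. The sketch below is illustrative only: the toy text, the flat `Counter` used in place of a trie, and the function names are assumptions, and the probability is taken as the chunk's relative frequency among n-grams of the same length.

```python
import math
from collections import Counter


def ngram_counts(text, max_n):
    """Count every substring of length 1..max_n (a flat stand-in for an n-gram trie)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def internal_information(chunk, counts, text_len):
    """I(chunk) = -log Pr(chunk); assumes the chunk occurs at least once."""
    total = text_len - len(chunk) + 1  # number of n-grams of this length in the text
    return -math.log(counts[chunk] / total)


text = "thecatsatonthematthecatatethehat"  # hypothetical toy corpus
counts = ngram_counts(text, max_n=3)
info_the = internal_information("the", counts, len(text))  # a real word, seen 4 times
info_hec = internal_information("hec", counts, len(text))  # straddles a boundary, seen twice
```

On this toy text the word "the" is more frequent than the boundary-straddling "hec", so it carries less internal information, mirroring the "bright" versus "rightc" comparison.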
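$E(\text{"bright"}) = \sum_{c} \Pr(c \mid \text{"bright"}) \, I(c \mid \text{"bright"})$, where $I(c \mid w) = -\log(\Pr(c \mid w))$ and $c$ ranges over the symbols that follow the chunk
$E(\text{"bright"}) > E(\text{"brigh"})$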
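Boundary entropy is the entropy of the symbol that follows a chunk. The following is a minimal sketch on a hypothetical toy text; `successor_counts` and `boundary_entropy` are illustrative names, not part of any published implementation.

```python
import math
from collections import Counter


def successor_counts(text, chunk):
    """Count which symbol follows each occurrence of `chunk` in `text`."""
    n = len(chunk)
    return Counter(
        text[i + n]
        for i in range(len(text) - n)
        if text[i:i + n] == chunk
    )


def boundary_entropy(text, chunk):
    """E(chunk) = -sum over c of Pr(c | chunk) * log Pr(c | chunk)."""
    followers = successor_counts(text, chunk)
    total = sum(followers.values())
    return -sum((k / total) * math.log(k / total) for k in followers.values())


text = "thecatsatonthematthecatatethehat"  # hypothetical toy corpus
e_the = boundary_entropy(text, "the")  # followed by c, m, c, h
e_th = boundary_entropy(text, "th")    # always followed by e
```

The complete word "the" is followed by several different letters, so its boundary entropy is high; the fragment "th" is always followed by "e", so its entropy is zero.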
VE Implementation (Cohen 2006)
1. Build an n-gram trie from text.
2. Slide a window along the text sequence.
3. Two experts vote how to break the window:
   1. One minimizes internal info
   2. Other maximizes boundary entropy
i t w a s a b r i g h t c o l d d a y i n a p r i l
[Figure label: Window]
Expert 1: $\min_{s,t \in \text{window}} [I(s) + I(t)]$, e.g. $I(\text{"as"}) + I(\text{"abrig"})$
Expert 2: $\max_{s,t \in \text{window}} [E(s)]$, e.g. $E(\text{"asa"})$
4. Break at vote peaks
i t w a s a b r i g h t c o l d d a y i n a p r i l
Votes per gap:
i|t: 0, t|w: 3, w|a: 1, a|s: 0, s|a: 3, a|b: 2, b|r: 0, r|i: 1, i|g: 1, g|h: 0, h|t: 0, t|c: 6, c|o: 1, o|l: 0, l|d: 0
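The implementation steps can be sketched roughly as follows. This is a simplified illustration, not Cohen's implementation: it uses a flat dictionary of counts instead of a trie, omits any normalization of scores across n-gram lengths, and breaks ties by taking the first candidate. The function names, the window size, and the peak threshold are all assumptions. The vote counts in the usage example are those read off the slide, assuming they are listed left to right.

```python
import math
from collections import Counter


def build_counts(text, max_n):
    """Count all n-grams up to length max_n (a dict stand-in for the n-gram trie)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def voting_experts(text, window=4):
    """Return votes[i] = votes for a break between text[i] and text[i + 1]."""
    counts = build_counts(text, window)
    alphabet = sorted(set(text))

    def info(chunk):
        # Internal information: -log of the chunk's relative frequency.
        return -math.log(counts[chunk] / (len(text) - len(chunk) + 1))

    def entropy(chunk):
        # Boundary entropy of the symbol that follows the chunk.
        followers = [counts[chunk + c] for c in alphabet if counts[chunk + c] > 0]
        total = sum(followers)
        return -sum((k / total) * math.log(k / total) for k in followers)

    votes = [0] * (len(text) - 1)
    for start in range(len(text) - window + 1):
        win = text[start:start + window]
        splits = range(1, window)  # split point k means win[:k] | win[k:]
        by_info = min(splits, key=lambda k: info(win[:k]) + info(win[k:]))
        by_entropy = max(splits, key=lambda k: entropy(win[:k]))
        for k in (by_info, by_entropy):
            votes[start + k - 1] += 1
    return votes


def vote_peaks(votes, threshold=1):
    """Break positions: strict local maxima of the vote counts above a threshold."""
    padded = [-1] + list(votes) + [-1]
    return [i + 1 for i, v in enumerate(votes)
            if v > threshold and padded[i] < v and v > padded[i + 2]]


def split_at(text, breaks):
    """Cut `text` at the given break positions."""
    edges = [0] + list(breaks) + [len(text)]
    return [text[a:b] for a, b in zip(edges, edges[1:])]


# Vote counts as read off the slide for "itwasabrightcold".
slide_votes = [0, 3, 1, 0, 3, 2, 0, 1, 1, 0, 0, 6, 1, 0, 0]
chunks = split_at("itwasabrightcold", vote_peaks(slide_votes))
# chunks == ["it", "was", "abright", "cold"]

toy_votes = voting_experts("thecatsatonthematthecatatethehat", window=4)
```

With the slide's vote counts, the strict peaks fall after "it", "was", and "bright"; the gap between "a" and "bright" gets 2 votes but sits next to a 3, so under this peak rule it is not selected.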
VE Results
• Results are surprisingly good on text
  – Especially given its simplicity
  – Accuracy and hit rate about 75%
• Seems to capture something about the nature of “chunks”
• Can we use this algorithm to segment real audio?
It was a br igh t
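The two scores quoted above can be computed from sets of break positions. This sketch takes accuracy to be the fraction of induced breaks that are correct and hit rate to be the fraction of true word breaks that were found; the function name and the example break positions are hypothetical.

```python
def score_breaks(induced, true):
    """Return (accuracy, hit_rate) for two collections of break positions."""
    induced, true = set(induced), set(true)
    correct = len(induced & true)
    accuracy = correct / len(induced) if induced else 0.0
    hit_rate = correct / len(true) if true else 0.0
    return accuracy, hit_rate


# "itwasabrightcold": true breaks fall after "it", "was", "a", and "bright".
true_breaks = [2, 5, 6, 12]
induced_breaks = [2, 5, 12, 14]  # hypothetical algorithm output
accuracy, hit_rate = score_breaks(induced_breaks, true_breaks)
```

Here three of the four induced breaks are correct and three of the four true breaks are found, so both scores are 0.75.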
Acoustic Model
• Cluster spectral features using a GGSOM
• Collapse state sequence
• Run VE to get breaks
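The middle step can be sketched under one assumption: that "collapse" means merging runs of identical consecutive cluster labels into a single symbol before running VE. The clustering itself is not shown, and the per-frame labels and function names below are hypothetical.

```python
from itertools import groupby


def collapse(states):
    """Collapse runs of identical consecutive states into a single symbol."""
    return [state for state, _ in groupby(states)]


def expand_breaks(states, collapsed_breaks):
    """Map break positions in the collapsed sequence back to frame indices."""
    run_starts, frame = [], 0
    for _, run in groupby(states):
        run_starts.append(frame)
        frame += len(list(run))
    return [run_starts[b] for b in collapsed_breaks]


frame_states = [3, 3, 3, 7, 7, 1, 1, 1, 1, 7]  # hypothetical per-frame cluster labels
symbols = collapse(frame_states)               # [3, 7, 1, 7]
frame_breaks = expand_breaks(frame_states, [1, 2])
```

Breaks found on the collapsed symbols can then be mapped back to frame positions in the audio, which is what `expand_breaks` illustrates.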
Experiments and Results
• Used the model to segment “1984”
  – CD 1 of audio book (40 mins)
  – Chosen for length, consistency
  – Evaluation: Human graders
New Experiments
• Trained on infant datasets
• Tested on manually generated keys
Stream A: tupiro golabu bedaku padoti
Stream B: dapiku tilado pagotu burobi
[Diagram labels: Train, Test; Acoustic Model A, Acoustic Model B; VE Model A, VE Model B; Key A, Key B]
[Diagram labels: Test; Acoustic Model A, Acoustic Model B; VE Model A, VE Model B; Key B, Key A]
Results
• Experiment 1
  – Accuracy: 50% on all induced breaks
  – Hit Rate: 75% of word breaks
  – Significantly better than chance
• Experiment 2
  – Accuracy: 16% on all induced breaks
  – Hit Rate: 1% of word breaks
  – Worse than chance
  – 18 breaks, 3 correct
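The Experiment 2 accuracy can be checked against the reported counts, taking accuracy to mean correct breaks divided by all induced breaks:

```latex
\text{Accuracy} = \frac{\text{correct breaks}}{\text{induced breaks}} = \frac{3}{18} \approx 0.167
```

That is about 16.7%, consistent with the 16% figure if the slide truncates rather than rounds.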
Conclusions and Future Work
• VE Model can be used to segment audio
• Can reproduce the results of infant studies
• May model part of the human chunking mechanism
• Have built more sophisticated acoustic models
  – Better results (nearly perfect)
Thank You
• www.cs.iastate.edu/~mamille/