![Page 1: GPU-ACCELERATED HMM FOR SPEECH RECOGNITION Leiming Yu, Yash Ukidave and David Kaeli ECE, Northeastern University HUCAA 2014](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1e5503460f949f2123/html5/thumbnails/1.jpg)
GPU-ACCELERATED HMM FOR SPEECH RECOGNITION
Leiming Yu, Yash Ukidave and David Kaeli ECE, Northeastern University
HUCAA 2014
Outline
Background & Motivation
HMM
GPGPU
Results
Future Work
Background
• Translate Speech to Text
• Speaker Dependent / Speaker Independent
• Applications
* Natural Language Processing
* Home Automation
* In-car Voice Control
* Speaker Verification
* Automated Banking
* Personal Intelligent Assistants (Apple Siri, Samsung S Voice)
* etc.
[http://www.kecl.ntt.co.jp]
DTW (Dynamic Time Warping)
A template-based approach to measure similarity between two temporal sequences which may vary in time or speed.
[opticalengineering.spiedigitallibrary.org]
DTW (Dynamic Time Warping)
DTW Pros:
1) Handles timing variation
2) Recognizes speech at reasonable cost
DTW Cons:
1) Template choosing
2) Ending point detection (VAD, acoustic noise)
3) Words with weak fricatives, close to acoustic background
For i := 1 to n
    For j := 1 to m
        cost := D(s[i], t[j])
        DTW[i, j] := cost + minimum(DTW[i-1, j], DTW[i, j-1], DTW[i-1, j-1])
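The recurrence above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `dtw_distance`, the absolute-difference local cost, and the scalar input sequences are assumptions chosen for brevity (a speech front end would compare feature vectors instead).

```python
import math

def dtw_distance(s, t, dist=lambda a, b: abs(a - b)):
    """Accumulated DTW cost between two sequences s and t."""
    n, m = len(s), len(t)
    # dtw[i][j] is the best cost of aligning s[:i] with t[:j].
    dtw = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0.0  # two empty prefixes align at zero cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(s[i - 1], t[j - 1])
            dtw[i][j] = cost + min(dtw[i - 1][j],      # advance in s only
                                   dtw[i][j - 1],      # advance in t only
                                   dtw[i - 1][j - 1])  # advance in both
    return dtw[n][m]

# The same pattern spoken "slower" (a repeated sample) still matches exactly.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

The boundary row and column are set to infinity so that every warping path has to start at the first sample of both sequences.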
Neural Networks
Algorithms that mimic the brain.
Simplified Interpretation:
* takes a set of input features
* goes through a set of hidden layers
* produces the posterior probabilities as the output
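The three steps of this simplified interpretation can be sketched as a tiny feed-forward pass. Everything specific here is an assumption for illustration: the sigmoid hidden activation, the softmax output, the layer sizes, the helper names, and the made-up weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def dense(weights, bias, inputs):
    """Affine map: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def predict_posteriors(features, hidden_layers, output_layer):
    activation = features                      # 1) input features
    for weights, bias in hidden_layers:        # 2) hidden layers
        activation = [sigmoid(z) for z in dense(weights, bias, activation)]
    weights, bias = output_layer               # 3) posterior probabilities
    return softmax(dense(weights, bias, activation))

# Toy network: 3 input features -> 2 hidden units -> 2 output classes.
hidden = [([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])]
output = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
posteriors = predict_posteriors([1.0, 0.5, -1.5], hidden, output)
print(posteriors)
```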
Neural Networks
a_i^(j) = “activation” of unit i in layer j
Θ^(j) = matrix of weights controlling function mapping from layer j to layer j+1
[Figure: classification example with the output classes Bike, Pedestrian, Car and Parking Meter, showing the output for the case "If Pedestrian".]
[Machine Learning, Coursera]
Neural Networks
Equation Example
Neural Networks Example
Hint:
* Effective at recognizing individual phones and isolated words as short-time units
* Not ideal for continuous recognition tasks, largely due to a poor ability to model temporal dependencies
Hidden Markov Model
In a Hidden Markov Model,
* the states are hidden
* outputs that depend on the states are visible
x: states
y: possible observations
a: state transition probabilities
b: output probabilities
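To make the hidden/visible split concrete, here is a small sampling sketch for a discrete HMM. The names `a` and `b` follow the definitions above; the initial distribution `pi`, the function name `sample_hmm`, and all numbers are assumptions made up for the example.

```python
import random

def sample_hmm(pi, a, b, length, rng):
    """Draw (hidden_states, observations) from a discrete HMM."""
    def draw(probs):
        return rng.choices(range(len(probs)), weights=probs)[0]

    xs, ys = [], []
    state = draw(pi)
    for _ in range(length):
        xs.append(state)
        ys.append(draw(b[state]))  # output depends only on the current state
        state = draw(a[state])     # next state depends only on the current state
    return xs, ys

pi = [0.6, 0.4]                          # initial state probabilities (assumed)
a = [[0.7, 0.3], [0.4, 0.6]]             # a[i][j]: P(next state j | state i)
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # b[i][k]: P(observation k | state i)

hidden, visible = sample_hmm(pi, a, b, length=10, rng=random.Random(0))
print("hidden :", hidden)    # a recognizer never sees this sequence
print("visible:", visible)   # only this sequence is available
```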
[wikipedia]
Hidden Markov Model
The temporal transition of the hidden states fits well with the nature of phoneme transition.
Hint:
* Handles the temporal variability of speech well
* Gaussian mixture models (GMMs), controlled by the hidden variables, determine how well an HMM can represent the acoustic input
* Can be hybridized with NNs to leverage each modeling technique
Motivation
• Parallel Architecture
multi-core CPU to many-core GPU (graphics + general purpose)
• Massive Parallelism in Speech Recognition Systems
Neural Networks, HMMs, etc., are both computation and memory intensive
• GPGPU Evolution
* Dynamic Parallelism
* Concurrent Kernel Execution
* Hyper-Q
* Device Partitioning
* Virtual Memory Addressing
* GPU-GPU Data Transfer, etc.
• Previous work
• Our goal is to use modern GPU features to accelerate Speech Recognition
Outline
Background & Motivation
HMM
GPGPU
Results
Future Work
Hidden Markov Model
Markov chains and processes are named after Andrey Andreyevich Markov (1856-1922), a Russian mathematician whose doctoral advisor was Pafnuty Chebyshev.
1966: Leonard Baum described the underlying mathematical theory.
1989: Lawrence Rabiner wrote a paper with the most comprehensive description of it.
Hidden Markov Model
HMM Stages
* causal transitional probabilities between states
* observation depends on current state, not predecessor
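Under these two properties, the joint probability of a hidden state sequence and its observations factorizes as shown below. The symbols x, y, a and b follow the earlier definitions; the initial distribution π and the sequence length T are notation assumed here.

```latex
P(x_{1:T},\, y_{1:T}) \;=\; \pi_{x_1}\, b_{x_1}(y_1) \prod_{t=2}^{T} a_{x_{t-1} x_t}\, b_{x_t}(y_t)
```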
Hidden Markov Model
Forward
Backward
Expectation-Maximization
HMM-Forward
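A minimal serial sketch of the forward pass, assuming a discrete-observation HMM for brevity (the talk itself models observations with GMMs). The initial distribution `pi`, the function name `forward`, and the toy numbers are assumptions, and no scaling is applied, so long sequences would underflow.

```python
def forward(pi, a, b, obs):
    """Return (alpha, likelihood) for a discrete-observation HMM.

    alpha[t][j] is the probability of the first t+1 observations
    together with being in state j at time t.
    """
    n = len(pi)
    alpha = [[pi[i] * b[i][obs[0]] for i in range(n)]]
    for y in obs[1:]:
        prev = alpha[-1]
        # Each j is independent of the others within a time step.
        alpha.append([
            sum(prev[i] * a[i][j] for i in range(n)) * b[j][y]
            for j in range(n)
        ])
    return alpha, sum(alpha[-1])

pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
alpha, likelihood = forward(pi, a, b, obs=[0, 1, 2])
print(likelihood)  # approximately 0.03628
```

Within one time step the entries for different states j do not depend on each other, which is the kind of data parallelism a GPU kernel could exploit; the loop over time remains sequential.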
Hidden Markov Model
Forward
Backward
Expectation-Maximization
HMM Backward
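[Figure: trellis over time steps t-1, t, t+1, t+2 connecting state i to state j, labeled with the forward variable α_i(t), the transition probability a_ij, the observation probability b_j(x_{t+1}), and the backward variable β_j(t+1).]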
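A minimal sketch of the backward recursion, again assuming a discrete-observation HMM and unscaled probabilities. The function name `backward`, the initial distribution `pi`, and the toy numbers are assumptions made for the example.

```python
def backward(a, b, obs):
    """Return beta, where beta[t][i] is the probability of the
    observations after time t given state i at time t."""
    n = len(a)
    beta = [[1.0] * n]  # nothing left to observe after the last step
    for y_next in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [
            sum(a[i][j] * b[j][y_next] * nxt[j] for j in range(n))
            for i in range(n)
        ])
    return beta

pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]
beta = backward(a, b, obs)

# Combining beta at the first step with the initial and observation
# probabilities gives the same sequence likelihood as a forward pass.
likelihood = sum(pi[i] * b[i][obs[0]] * beta[0][i] for i in range(len(pi)))
print(likelihood)  # approximately 0.03628
```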
HMM-EM
Variable Definitions:
* Initial Probability
* Transition Prob., Observation Prob.
* Forward Variable, Backward Variable
Other Variables During Estimation:
* ε (epsilon): the estimated state transition probability matrix
* γ (gamma): the estimated probability of being in a particular state at time t
* Multivariate Normal Probability Density Function
Update Obs. Prob. From Gaussian Mixture Models
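One EM re-estimation step built from the quantities listed above can be sketched as follows. To stay short, the sketch assumes discrete observations, so the observation update is a normalized count rather than the GMM update used in the talk. The function name `baum_welch_step` and the toy numbers are hypothetical.

```python
def baum_welch_step(pi, a, b, obs):
    """One EM update of (pi, a, b); also returns the likelihood
    of obs under the parameters passed in."""
    n, m, T = len(pi), len(b[0]), len(obs)
    # E-step: forward and backward variables (unscaled).
    alpha = [[pi[i] * b[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * a[i][j] for i in range(n))
                      * b[j][obs[t]] for j in range(n)])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(a[i][j] * b[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) for i in range(n)]
    likelihood = sum(alpha[-1])
    # gamma[t][i]: probability of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    # epsilon[t][i][j]: probability of the transition i -> j at time t.
    epsilon = [[[alpha[t][i] * a[i][j] * b[j][obs[t + 1]] * beta[t + 1][j]
                 / likelihood for j in range(n)] for i in range(n)]
               for t in range(T - 1)]
    # M-step: normalized expected counts.
    new_pi = gamma[0][:]
    new_a = [[sum(epsilon[t][i][j] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_b = [[sum(gamma[t][i] for t in range(T) if obs[t] == k)
              / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_a, new_b, likelihood

pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 2, 0, 1]
new_pi, new_a, new_b, old_likelihood = baum_welch_step(pi, a, b, obs)
```

Calling the function repeatedly on its own output iterates EM; the likelihood it reports should not decrease from one call to the next.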
HMM-EM
Outline
Background & Motivation
HMM
GPGPU
Results
Future Work
GPGPU
Programming Model
GPGPU
GPU Hierarchical Memory System
[http://www.biomedcentral.com]
• Visibility
• Performance Penalty
GPGPU
[www.math-cs.gordon.edu]
• Visibility
• Performance Penalty
GPGPU
GPU-powered Ecosystem
1) Programming Model
* CUDA
* OpenCL
* OpenACC, etc.
2) High Performance Libraries
* cuBLAS
* Thrust
* MAGMA (CUDA/OpenCL/Intel Xeon Phi)
* Armadillo (C++ Linear Algebra Library), drop-in libraries, etc.
3) Tuning/Profiling Tools
* Nvidia: nvprof / nvvp
* AMD: CodeXL
4) Consortium Standards
Heterogeneous System Architecture (HSA) Foundation
Outline
Background & Motivation
HMM
GPGPU
Results
Future Work
Results
Platform Specs
Results
Mitigate Data Transfer Latency
Pinned Memory Size
current process limit: ulimit -l (in KB)
hardware limit: ulimit -H -l
increase the limit: ulimit -S -l 16384
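The same locked-memory limits can also be read from inside a process. The sketch below uses Python's standard `resource` module on a Unix-like system; the helper name `memlock_limits_kb` is an assumption, and the conversion to KB is there only to match the unit reported by `ulimit -l`.

```python
import resource

def memlock_limits_kb():
    """Return (soft, hard) locked-memory limits in KB.

    None stands for "unlimited". getrlimit reports bytes,
    while `ulimit -l` prints KB, hence the division.
    """
    def to_kb(value):
        return None if value == resource.RLIM_INFINITY else value // 1024

    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return to_kb(soft), to_kb(hard)

soft_kb, hard_kb = memlock_limits_kb()
print(f"soft limit: {soft_kb} KB, hard limit: {hard_kb} KB")
```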
Results
Results
A Practice to Efficiently Utilize Memory System
Results
Results
Hyper-Q Feature
Results
Running Multiple Word Recognition Tasks
Results
Outline
Background & Motivation
HMM
GPGPU
Results
Future Work
Future Work
• Integrate with Parallel Feature Extraction
• Power-Efficient Implementation and Analysis
• Embedded System Development, Jetson TK1 etc.
• Improve generality, LMs
• Improve robustness, Front-end noise cancelation
• Go with the trend!
QUESTIONS ?