Hidden Markov Model in Automatic Speech Recognition


8/10/2019 Hidden Markov Model in Automatic Speech Recognition

Hidden Markov Model in Automatic Speech Recognition

Z. Fodroczi

Pázmány Péter Catholic University


Discrete Markov Processes

(Figure: a fully connected three-state chain with states S1, S2, S3 and transition probabilities a11, a12, ..., a33 on the arcs.)

States = {S1, S2, S3}
a_ij = P(q_t = S_j | q_t-1 = S_i)
Sum(j=1..N)(a_ij) = 1 for every state S_i

Consider the following model of the weather:

S1: rain or snow
S2: cloudy
S3: sunny

A = {a_ij} =
    0.4  0.3  0.3
    0.2  0.6  0.2
    0.1  0.1  0.8

What is the probability that the weather for the next 3 days will be sun, sun, rain?

O = {S3, S3, S1} (observation sequence)

P(O|Model) = P(S3, S3, S1|Model) = P(S3) · P(S3|S3) · P(S1|S3) = 1 · 0.8 · 0.1 = 0.08
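This computation can be sketched in a few lines of Python, assuming the row-stochastic reading of the transition matrix (row = current state, column = next state):

```python
# Weather example: 3-state Markov chain (S1 rain/snow, S2 cloudy, S3 sunny).
# Rows of A are "from" states, columns are "to" states.
A = [
    [0.4, 0.3, 0.3],  # from S1 (rain or snow)
    [0.2, 0.6, 0.2],  # from S2 (cloudy)
    [0.1, 0.1, 0.8],  # from S3 (sunny)
]

def sequence_probability(states, A):
    """P(state sequence | model), taking the first state as given
    (probability 1), as in the slide's example."""
    p = 1.0
    for prev, cur in zip(states, states[1:]):
        p *= A[prev][cur]
    return p

# O = {S3, S3, S1}: sunny, sunny, rain (0-based indices 2, 2, 0)
print(sequence_probability([2, 2, 0], A))  # 1 * a33 * a31 = 0.8 * 0.1 ≈ 0.08
```

Each row of A sums to 1, since from any state the weather must go somewhere.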


Hidden Markov Model

The observation is a probabilistic function of the state.

Teacher-mood model. Situation:

Your school teacher gave three different types of daily homework assignments:

A: took about 5 minutes to complete
B: took about 1 hour to complete
C: took about 3 hours to complete

Your teacher did not openly reveal his mood to you daily, but you knew that your teacher was either in a bad, neutral, or good mood for a whole day. Mood changes occurred only overnight.

Question: How were his moods related to the homework type assigned that day?


Hidden Markov Model

Model parameters:

S – states {good, neutral, bad}
a_kl – probability that state k is followed by state l
Σ – alphabet {A, B, C}
e_k(x) – probability that state k emits symbol x


Hidden Markov Model

One week, your teacher gave the following homework assignments:

Monday: A
Tuesday: C
Wednesday: B
Thursday: A
Friday: C

Questions:

What did his mood curve most likely look like that week? (Searching for the most probable path – Viterbi algorithm)

What is the probability that he would assign this order of homework assignments? (Probability of a sequence – Forward algorithm)

How do we adjust the model parameters (S, a_ij, e_i(x)) to maximize P(O|Model)? (Creating an HMM for a given sequence set)


HMM – Viterbi algorithm

Matrix v_k(i), where k ∈ S and 1 ≤ i ≤ n: v_k(i) is the probability of the most probable state path for the prefix x1, ..., xi that ends in state k. Recursion:

v_k(i) = e_k(x_i) · max_l ( v_l(i-1) · a_lk )

HMM – Viterbi algorithm

(Slides 9–14: the Viterbi table, starting from an empty table, is filled in column by column; figures omitted.)

HMM – Viterbi algorithm

Question: What did his mood curve most likely look like that week?

Answer: the most probable mood curve:

Day:        Mon   Tue  Wed      Thu   Fri
Assignment: A     C    B        A     C
Mood:       good  bad  neutral  good  bad
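The dynamic program behind this answer can be sketched in Python. The slide's actual transition and emission numbers were in figures that did not survive, so the values below are purely illustrative assumptions; the recursion itself is the standard Viterbi one.

```python
# Viterbi sketch for the teacher-mood HMM.
# NOTE: all numeric parameters here are assumed for illustration only;
# the original slides' values are not recoverable from the transcript.
states = ["good", "neutral", "bad"]

start_p = {"good": 1 / 3, "neutral": 1 / 3, "bad": 1 / 3}   # assumed uniform
trans_p = {  # a_kl: P(next mood l | current mood k) -- assumed
    "good":    {"good": 0.5, "neutral": 0.3, "bad": 0.2},
    "neutral": {"good": 0.3, "neutral": 0.4, "bad": 0.3},
    "bad":     {"good": 0.2, "neutral": 0.3, "bad": 0.5},
}
emit_p = {   # e_k(x): P(homework type x | mood k) -- assumed
    "good":    {"A": 0.7, "B": 0.2, "C": 0.1},
    "neutral": {"A": 0.2, "B": 0.6, "C": 0.2},
    "bad":     {"A": 0.1, "B": 0.2, "C": 0.7},
}

def viterbi(obs):
    """Most probable state path: v_k(i) = e_k(x_i) * max_l v_l(i-1) * a_lk."""
    v = [{k: start_p[k] * emit_p[k][obs[0]] for k in states}]
    back = []  # back-pointers for the traceback
    for x in obs[1:]:
        prev, col, ptr = v[-1], {}, {}
        for k in states:
            best = max(states, key=lambda l: prev[l] * trans_p[l][k])
            ptr[k] = best
            col[k] = prev[best] * trans_p[best][k] * emit_p[k][x]
        v.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda k: v[-1][k])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path)), v[-1][last]

path, p = viterbi(["A", "C", "B", "A", "C"])  # Mon..Fri assignments
print(path, p)
```

With other parameter choices the decoded mood curve will of course differ; the point is the max-product table fill and the traceback.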


HMM – Forward algorithm

Used to compute the probability of a sequence.

Given:

Hidden Markov model: S, a_kl, Σ, e_l(x).
Observed symbol sequence E = x1, ..., xn.

What is the probability of E?

Let f_k(i) be the probability of the symbol sequence x1, x2, ..., xi ending in state k. Then:

f_k(i) = e_k(x_i) · Σ_l ( f_l(i-1) · a_lk ),   and   P(E) = Σ_k f_k(n)
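The forward recursion can be sketched in the same style as the Viterbi one, summing over predecessor states instead of maximizing. The mood-model numbers below are again illustrative assumptions, not the slides' values.

```python
# Forward-algorithm sketch: P(E) as a sum over all hidden state paths.
# NOTE: all numeric parameters are assumed for illustration only.
states = ["good", "neutral", "bad"]
start_p = {k: 1 / 3 for k in states}                        # assumed uniform
trans_p = {"good":    {"good": 0.5, "neutral": 0.3, "bad": 0.2},
           "neutral": {"good": 0.3, "neutral": 0.4, "bad": 0.3},
           "bad":     {"good": 0.2, "neutral": 0.3, "bad": 0.5}}
emit_p = {"good":    {"A": 0.7, "B": 0.2, "C": 0.1},
          "neutral": {"A": 0.2, "B": 0.6, "C": 0.2},
          "bad":     {"A": 0.1, "B": 0.2, "C": 0.7}}

def forward(obs):
    """f_k(i) = e_k(x_i) * sum_l f_l(i-1) * a_lk;  P(E) = sum_k f_k(n)."""
    f = {k: start_p[k] * emit_p[k][obs[0]] for k in states}
    for x in obs[1:]:
        f = {k: emit_p[k][x] * sum(f[l] * trans_p[l][k] for l in states)
             for k in states}
    return sum(f.values())

print(forward(["A", "C", "B", "A", "C"]))  # probability of the week's homework
```

The recursion visits each of the n columns once, so the cost is O(n·|S|²) rather than the O(|S|^n) of enumerating every path.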


HMM – Forward algorithm

Matrix f_k(i), where k ∈ S and 1 ≤ i ≤ n, is filled in column by column, just as in the Viterbi algorithm, with the max replaced by a sum.

HMM – Forward algorithm

(Slides 18–20: the forward table is filled in step by step; figures omitted.)

HMM – Parameter estimation

(Slides 21–23: adjusting the model parameters to maximize P(O|Model); figures omitted.)

Types of HMMs

So far we have discussed HMMs with discrete output (a finite alphabet, |Σ| finite).

Extensions:

continuous observation probability density functions
mixtures of Gaussian pdfs


HMMs in ASR

Separation → Feature extraction → HMM decoder → Understanding → Application


HMMs in ASR

How can HMMs be used to classify feature sequences into known classes?

Build an HMM for each class. By determining the probability of a sequence under each HMM, we can decide which HMM most probably generated the sequence.

There are several ideas about what to model:

Isolated word recognition (an HMM for each known word). Usable only with small dictionaries (digit recognition, etc.). Number of states usually >= 4; left-to-right HMM.

Monophone acoustic model (an HMM for each phone): ~50 HMMs.

Triphone acoustic model (an HMM for each three-phone sequence): 50^3 = 125,000 triphones, each with 3 states.
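The classify-by-likelihood idea can be sketched with two toy word models scored by the forward probability. The two-state models, the "lo"/"hi" alphabet, and every number below are invented for illustration; a real recognizer would use continuous features and trained parameters.

```python
# Isolated-word-recognition sketch: one toy discrete HMM per word; a
# sequence is assigned to the word whose model gives it the highest
# forward probability. All models and numbers are assumptions.

def forward(obs, start_p, trans_p, emit_p):
    """Forward probability P(obs | model) for a discrete HMM."""
    states = list(start_p)
    f = {k: start_p[k] * emit_p[k][obs[0]] for k in states}
    for x in obs[1:]:
        f = {k: emit_p[k][x] * sum(f[l] * trans_p[l][k] for l in states)
             for k in states}
    return sum(f.values())

# Two toy 2-state left-to-right word models over the alphabet {"lo", "hi"}.
word_models = {
    "yes": ({"s0": 1.0, "s1": 0.0},
            {"s0": {"s0": 0.6, "s1": 0.4}, "s1": {"s0": 0.0, "s1": 1.0}},
            {"s0": {"lo": 0.8, "hi": 0.2}, "s1": {"lo": 0.3, "hi": 0.7}}),
    "no":  ({"s0": 1.0, "s1": 0.0},
            {"s0": {"s0": 0.6, "s1": 0.4}, "s1": {"s0": 0.0, "s1": 1.0}},
            {"s0": {"lo": 0.2, "hi": 0.8}, "s1": {"lo": 0.7, "hi": 0.3}}),
}

def classify(obs):
    """Pick the word whose HMM most probably generated obs."""
    return max(word_models, key=lambda w: forward(obs, *word_models[w]))

print(classify(["lo", "lo", "hi", "hi"]))
```

Note the left-to-right structure: each state only loops on itself or advances, which is the topology the slide recommends for word models.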


HMMs in ASR

Hierarchical system of HMMs:

HMM of a triphone + HMM of a triphone + HMM of a triphone
→ higher-level HMM of a word
→ language model
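One way to picture the word level of this hierarchy is to chain left-to-right sub-HMMs into one larger left-to-right model. The helper below, its `stay` probability, and the uniform 3-states-per-triphone layout are illustrative assumptions, not the slides' actual construction.

```python
# Hierarchy sketch: build a word-level left-to-right transition matrix by
# chaining triphone sub-HMMs (3 states each, per the slides) in sequence.
# The self-loop probability `stay` is an assumed illustrative value.

def chain(n_units, states_per_unit=3, stay=0.6):
    """Left-to-right transition matrix for n_units sub-HMMs in sequence."""
    n = n_units * states_per_unit
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = stay                # self-loop on the current state
        if i + 1 < n:
            A[i][i + 1] = 1.0 - stay  # advance to the next state
        else:
            A[i][i] = 1.0             # final state of the word absorbs
    return A

A = chain(n_units=3)  # a word of 3 triphones -> one 9-state word HMM
print(len(A), all(abs(sum(row) - 1.0) < 1e-12 for row in A))
```

The same composition repeats one level up: word HMMs are chained under a language model that supplies the word-to-word transition probabilities.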


ASR state of the art

A typical state-of-the-art large-vocabulary ASR system:

- speaker independent
- 64k-word vocabulary
- trigram (2-word context) language model
- multiple pronunciations for each word
- triphone or quinphone HMM-based acoustic model
- 100-300x real-time recognition
- WER 10%-50%


HMM Limitations

Data intensive. Computationally intensive.

- 50 phones = 125,000 possible triphones
- 3 states per triphone, a 3-component Gaussian mixture for each state
- 64k-word vocabulary → 262 trillion possible trigrams
- 2-20 phonemes per word in a 64k vocabulary
- 39-dimensional feature vector sampled every 10 ms → 100 frames per second
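These resource figures are simple back-of-envelope products, and can be checked directly:

```python
# Back-of-envelope check of the slide's resource figures.
n_phones = 50
print(n_phones ** 3)           # 125000 possible triphones

vocab = 64_000
print(vocab ** 3)              # 262144000000000, i.e. ~262 trillion trigrams

frame_shift_ms = 10
print(1000 // frame_shift_ms)  # 100 feature frames per second
```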