Hidden Markov Models (HMM)
Rabiner’s Paper
Markoviana Reading Group, Computer Eng. & Science Dept., Arizona State University
Markoviana Reading Group · Fatih Gelgi · Feb 2005
Stationary and Non-stationary
Stationary Process:Its statistical properties do not vary with time
Non-stationary Process:The signal properties vary over time
HMM Example - Casino Coin

States: Fair (F) and Unfair (U)

State transition probabilities:
F→F: 0.9, F→U: 0.1
U→U: 0.8, U→F: 0.2

Symbol emission probabilities:
Fair: P(H) = 0.5, P(T) = 0.5
Unfair: P(H) = 0.7, P(T) = 0.3

Observation sequence: HTHHTTHHHTHTHTHHTHHHHHHTHTHH
State sequence:       FFFFFFUUUFFFFFFUUUUUUUFFFFFF

Motivation: Given a sequence of Hs and Ts, can you tell at what times the casino cheated?

The observation symbols (H, T) and the hidden states (F, U) are governed by these two probability tables: transitions and emissions.
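The casino model can be sketched directly from the two tables. The generator below is an illustrative sketch of my own (function and variable names are not from the slides); it samples a hidden F/U state path and an H/T observation sequence.

```python
import random

# Casino-coin HMM from the example (states F = fair, U = unfair).
TRANS = {"F": {"F": 0.9, "U": 0.1},   # state transition probabilities
         "U": {"F": 0.2, "U": 0.8}}
EMIT = {"F": {"H": 0.5, "T": 0.5},    # symbol emission probabilities
        "U": {"H": 0.7, "T": 0.3}}

def sample(table, rng):
    """Draw a key from a {outcome: probability} table."""
    r, acc = rng.random(), 0.0
    for k, p in table.items():
        acc += p
        if r < acc:
            return k
    return k  # guard against floating-point rounding

def generate(T, start="F", seed=0):
    """Generate T coin flips and the hidden state at each time step."""
    rng = random.Random(seed)
    states, obs, q = [], [], start
    for _ in range(T):
        states.append(q)
        obs.append(sample(EMIT[q], rng))
        q = sample(TRANS[q], rng)
    return obs, states
```

With these tables the generated sequences mimic the H/T and F/U strings shown above: long fair stretches interrupted by head-heavy unfair runs.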
Properties of an HMM

- First-order Markov process: q_t depends only on q_{t-1}
- Time is discrete
Elements of an HMM

- N, the number of states: S_1, S_2, …, S_N
- M, the number of observation symbols: O_1, O_2, …, O_M
- A = {a_ij}, the N×N state transition probability distribution: a_ij = P(q_{t+1} = S_j | q_t = S_i)
- B = {b_j(k)}, the N×M observation symbol probability distribution: b_j(k) = P(symbol O_k at time t | q_t = S_j)
- π = {π_i}, the initial state distribution: π_i = P(q_1 = S_i)

Compact notation: λ = (A, B, π).
HMM Basic Problems

1. Given an observation sequence O = O_1 O_2 O_3 … O_T and λ, find P(O|λ).
   Forward Algorithm / Backward Algorithm
2. Given O = O_1 O_2 O_3 … O_T and λ, find the most likely state sequence Q = q_1 q_2 … q_T.
   Viterbi Algorithm
3. Given O = O_1 O_2 O_3 … O_T, re-estimate λ so that P(O|λ) is higher than it is now.
   Baum-Welch Re-estimation
Forward Algorithm Illustration
α_t(i) is the probability of observing the partial sequence O_1 O_2 O_3 … O_t and being in state S_i at time t.
Forward Algorithm Illustration (cont'd)

[Trellis: one row per state S_1 … S_N, one column per observation O_1 … O_T. The first column holds α_1(j) = π_j b_j(O_1); column t+1 holds α_{t+1}(j) = [Σ_i α_t(i) a_ij] b_j(O_{t+1}). The total of the last column gives the solution, P(O|λ).]

α_t(i) is the probability of observing the partial sequence O_1 O_2 O_3 … O_t and being in state S_i at time t.
Forward Algorithm
Definition:
α_t(i) = P(O_1 O_2 … O_t, q_t = S_i | λ)

α_t(i) is the probability of observing the partial sequence O_1 O_2 O_3 … O_t and being in state S_i at time t.

Initialization:
α_1(i) = π_i b_i(O_1),  1 ≤ i ≤ N

Induction:
α_{t+1}(j) = [Σ_{i=1..N} α_t(i) a_ij] b_j(O_{t+1}),  1 ≤ t ≤ T−1, 1 ≤ j ≤ N

Problem 1 Answer:
P(O|λ) = Σ_{i=1..N} α_T(i)

Complexity: O(N²T)
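The three steps translate line for line into code. The sketch below is my own, not from Rabiner's paper; it uses plain Python lists, with `pi`, `A`, `B` holding π, {a_ij}, and {b_j(k)}, and observations given as symbol indices.

```python
def forward(pi, A, B, obs):
    """Forward variables: alpha[t][j] corresponds to alpha_{t+1}(S_j)
    in the slides (t is 0-based here)."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

def likelihood(pi, A, B, obs):
    """Problem 1 answer: P(O | lambda) = sum_i alpha_T(i)."""
    return sum(forward(pi, A, B, obs)[-1])
```

Each time step costs O(N²) work, giving the O(N²T) total noted above, versus O(N^T) for brute-force enumeration of state paths.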
Backward Algorithm Illustration
β_t(i) is the probability of observing the partial sequence O_{t+1} O_{t+2} O_{t+3} … O_T, given state S_i at time t.
Backward Algorithm

Definition:
β_t(i) = P(O_{t+1} O_{t+2} … O_T | q_t = S_i, λ)

β_t(i) is the probability of observing the partial sequence O_{t+1} O_{t+2} O_{t+3} … O_T, given state S_i at time t.

Initialization:
β_T(i) = 1,  1 ≤ i ≤ N

Induction:
β_t(i) = Σ_{j=1..N} a_ij b_j(O_{t+1}) β_{t+1}(j),  t = T−1, …, 1, 1 ≤ i ≤ N
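The backward pass mirrors the forward one, filling the table from the end of the sequence. Again a sketch with my own helper name, using the same `pi`, `A`, `B` list conventions:

```python
def backward(pi, A, B, obs):
    """Backward variables: beta[t][i] corresponds to beta_{t+1}(S_i)
    in the slides (t is 0-based here)."""
    N, T = len(pi), len(obs)
    beta = [[1.0] * N]                      # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):          # induction, t = T-1 .. 1
        nxt = beta[0]                       # beta_{t+1}
        beta.insert(0, [sum(A[i][j] * B[j][obs[t + 1]] * nxt[j]
                            for j in range(N)) for i in range(N)])
    return beta
```

As a sanity check, P(O|λ) can also be read off the first column: P(O|λ) = Σ_i π_i b_i(O_1) β_1(i).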
Q2: Optimality Criterion 1
* Maximize the expected number of correct individual states.

Definition:
γ_t(i) = P(q_t = S_i | O, λ) = α_t(i) β_t(i) / P(O|λ)

γ_t(i) is the probability of being in state S_i at time t, given the observation sequence O and the model λ.

Problem 2 Answer:
q_t* = argmax_{1≤i≤N} γ_t(i),  1 ≤ t ≤ T

Problem: If some a_ij = 0, the "optimal" state sequence may not even be a valid state sequence.
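Criterion 1 can be sketched end to end: run the forward and backward passes (repeated inline so the block is self-contained), normalize by P(O|λ), and pick the individually most likely state at each step. Function and variable names are illustrative, not from the paper.

```python
def posteriors(pi, A, B, obs):
    """gamma_t(i) = alpha_t(i) * beta_t(i) / P(O|lambda): the posterior
    of being in state S_i at time t given the whole sequence.
    Returns (gamma, path) where path is the per-step argmax."""
    N, T = len(pi), len(obs)
    # Forward pass
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N))
                      * B[j][obs[t]] for j in range(N)])
    # Backward pass
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    p_obs = sum(alpha[-1])
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(N)]
             for t in range(T)]
    # Criterion 1: individually most likely state at each time step
    path = [max(range(N), key=lambda i: g[i]) for g in gamma]
    return gamma, path
```

Note that `path` maximizes the expected number of correct states but, as the slide warns, may chain together transitions with a_ij = 0.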
Q2: Optimality Criterion 2
* Find the single best state sequence (path), i.e. maximize P(Q|O,λ).

Definition:
δ_t(i) = max_{q_1,…,q_{t-1}} P(q_1 q_2 … q_{t-1}, q_t = S_i, O_1 O_2 … O_t | λ)

δ_t(i) is the highest probability of a single state path that accounts for the partial observation sequence O_1 O_2 O_3 … O_t and ends in state S_i.
Viterbi Algorithm
The major difference from the forward algorithm: maximization replaces the summation in the induction step.
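In code, the change is literally a `max` where the forward pass had a `sum`, plus a ψ table recording the argmax predecessor for traceback. A sketch with my own helper names:

```python
def viterbi(pi, A, B, obs):
    """Most likely state path. `delta` holds the current column of
    delta_t(j); `psi` records the argmax predecessor for traceback."""
    N = len(pi)
    # Initialization: delta_1(i) = pi_i * b_i(O_1)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]
    psi = []
    # Recursion: delta_{t+1}(j) = [max_i delta_t(i) a_ij] * b_j(O_{t+1})
    for o in obs[1:]:
        step, new = [], []
        for j in range(N):
            best = max(range(N), key=lambda i: delta[i] * A[i][j])
            step.append(best)
            new.append(delta[best] * A[best][j] * B[j][o])
        psi.append(step)
        delta = new
    # Termination and traceback from the best final state
    q = max(range(N), key=lambda i: delta[i])
    path = [q]
    for step in reversed(psi):
        q = step[q]
        path.insert(0, q)
    return path, max(delta)
```

Unlike the per-step argmax of criterion 1, the returned path is always a valid state sequence: it only chains transitions actually taken by the maximizing predecessors.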
Viterbi Algorithm Illustration

[Trellis: one row per state S_1 … S_N, one column per observation O_1 … O_T. The first column holds δ_1(j) = π_j b_j(O_1); column t+1 holds δ_{t+1}(j) = [max_i δ_t(i) a_ij] b_j(O_{t+1}). The maximum of the last column indicates the traceback start.]

δ_t(i) is the highest probability of a single state path that accounts for the partial observation sequence O_1 O_2 O_3 … O_t and ends in state S_i.
Relations with DBN

Forward function: α_{t+1}(j) = [Σ_i α_t(i) a_ij] b_j(O_{t+1})

Backward function: β_t(i) = Σ_j a_ij b_j(O_{t+1}) β_{t+1}(j), with β_T(i) = 1

Viterbi algorithm: δ_{t+1}(j) = [max_i δ_t(i) a_ij] b_j(O_{t+1})
Some more definitions

γ_t(i) is the probability of being in state S_i at time t, given O and λ.

ξ_t(i,j) is the probability of being in state S_i at time t and in state S_j at time t+1, given O and λ.
Baum-Welch Re-estimation

An Expectation-Maximization (EM) algorithm.

Expectation:
ξ_t(i,j) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O|λ)
γ_t(i) = Σ_{j=1..N} ξ_t(i,j) = α_t(i) β_t(i) / P(O|λ)
Baum-Welch Re-estimation (cont'd)

Maximization:
π̄_i = γ_1(i)
ā_ij = Σ_{t=1..T−1} ξ_t(i,j) / Σ_{t=1..T−1} γ_t(i)
b̄_j(k) = Σ_{t : O_t = k-th symbol} γ_t(j) / Σ_{t=1..T} γ_t(j)
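One E-step plus M-step can be sketched as below, for a single observation sequence (the function name is mine; the update formulas are the re-estimation formulas above, with α and β recomputed inline so the block is self-contained):

```python
def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch (EM) re-estimation step for a single observation
    sequence. Returns the updated (pi, A, B)."""
    N, M, T = len(pi), len(B[0]), len(obs)
    # E-step: forward variables alpha_t(i) ...
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N))
                      * B[j][obs[t]] for j in range(N)])
    # ... and backward variables beta_t(i)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    p_obs = sum(alpha[-1])
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(N)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / p_obs
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # M-step: the re-estimation formulas
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1))
              / sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k)
              / sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B
```

By the EM guarantee, iterating this step never decreases P(O|λ), and each updated distribution still sums to 1.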
Notes on the Re-estimation

- If the model does not change, it has reached a local maximum.
- Depending on the model, many local maxima can exist.
- Re-estimated probabilities will sum to 1.
Implementation issues

- Scaling
- Multiple observation sequences
- Initial parameter estimation
- Missing data
- Choice of model size and type
Scaling

α calculation: as t grows, α_t(i) heads exponentially to zero and soon underflows machine precision, so each column is rescaled:

c_t = 1 / Σ_{i=1..N} α_t(i),  α̂_t(i) = c_t α_t(i)

Recursion to calculate: each new column is computed from the previous scaled column and then normalized; the likelihood is recovered from the scale factors as log P(O|λ) = −Σ_{t=1..T} log c_t.
Scaling (cont'd)

β calculation: the backward variables are scaled with the same factors, β̂_t(i) = c_t β_t(i).

Desired condition: the scale factors cancel in the re-estimation formulas, so the re-estimated parameters are unaffected by scaling.

* Note that the scaled variables are no longer the probabilities defined earlier, so e.g. Σ_i α̂_T(i) = P(O|λ) is not true!
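A sketch of the scaled forward pass (function name mine): every column of α is renormalized to sum to 1, and log P(O|λ) is accumulated from the normalizers. Here `c` is the column sum, i.e. the reciprocal of the c_t defined above, so the signs work out to log P(O|λ) = Σ_t log c.

```python
import math

def forward_scaled(pi, A, B, obs):
    """Scaled forward pass. Returns the final normalized alpha column
    and log P(O | lambda), which never underflows for long sequences."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    c = sum(alpha)                      # normalizer for column 1
    alpha = [a / c for a in alpha]
    log_p = math.log(c)
    for o in obs[1:]:
        # Induction on the scaled column, then renormalize
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
        c = sum(alpha)
        alpha = [a / c for a in alpha]
        log_p += math.log(c)
    return alpha, log_p
```

For short sequences math.exp(log_p) agrees with the unscaled P(O|λ); for long ones only the log form is representable.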
Maximum log-likelihood

The Viterbi recursion can be run entirely in the log domain to avoid underflow:

Initialization: φ_1(i) = log π_i + log b_i(O_1)

Recursion: φ_{t+1}(j) = max_i [φ_t(i) + log a_ij] + log b_j(O_{t+1})

Termination: log P* = max_i φ_T(i)
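The log-domain recursion, sketched in code (my own function name; it assumes strictly positive π, A, and B so every logarithm is defined):

```python
import math

def viterbi_log(pi, A, B, obs):
    """Viterbi in log space: phi_t(i) = log delta_t(i), so products of
    small probabilities become sums and never underflow."""
    N = len(pi)
    # Initialization: phi_1(i) = log pi_i + log b_i(O_1)
    phi = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(N)]
    back = []
    # Recursion: phi_{t+1}(j) = max_i[phi_t(i) + log a_ij] + log b_j(O_{t+1})
    for o in obs[1:]:
        step, new = [], []
        for j in range(N):
            best = max(range(N), key=lambda i: phi[i] + math.log(A[i][j]))
            step.append(best)
            new.append(phi[best] + math.log(A[best][j]) + math.log(B[j][o]))
        back.append(step)
        phi = new
    # Termination: log P* = max_i phi_T(i), then trace the path back
    q = max(range(N), key=lambda i: phi[i])
    path = [q]
    for step in reversed(back):
        q = step[q]
        path.insert(0, q)
    return path, phi[path[-1]]
```

The logs of A and B can be precomputed once, so this costs no more than the probability-domain version.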
Multiple observation sequences

Problem with re-estimation: a single sequence is often not enough to train all parameters, and simply concatenating sequences is invalid; instead, the γ and ξ statistics are accumulated over all the sequences in the numerators and denominators of the re-estimation formulas.
Initial estimates of parameters

For π and A, random or uniform initial values are sufficient.

For B (discrete symbol probabilities), a good initial estimate is needed.
Insufficient training data

Solutions:
- Increase the size of the training data
- Reduce the size of the model
- Interpolate the parameters with those of another model
References

- L. Rabiner. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proceedings of the IEEE, 1989.
- S. Russell, P. Norvig. "Probabilistic Reasoning Over Time." AI: A Modern Approach, Ch. 15, 2002 (draft).
- V. Borkar, K. Deshmukh, S. Sarawagi. "Automatic Segmentation of Text into Structured Records." ACM SIGMOD, 2001.
- T. Scheffer, C. Decomain, S. Wrobel. "Active Hidden Markov Models for Information Extraction." Proceedings of the International Symposium on Intelligent Data Analysis, 2001.
- S. Ray, M. Craven. "Representing Sentence Structure in Hidden Markov Models for Information Extraction." Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2001.