Post on 19-Dec-2015
Slide 1
EE3J2 Data Mining
Lecture 14: Introduction to Hidden Markov Models
Martin Russell
Slide 2
Objectives
Limitations of sequence matching
Introduction to hidden Markov models (HMMs)
Slide 3
Sequence retrieval using DP
……
AAGDTDTDTDD
AABBCBDAAAAAAA
BABABABBCCDF
GGGGDDGDGDGDGDTDTD
DGDGDGDGD
AABCDTAABCDTAABCDTAAB
CDCDCDTGGG
GGAACDTGGGGGAAA
…….
…….
Corpus of sequential data
‘query’ sequence Q
…BBCCDDDGDGDGDCDTCDTTDCCC…
Dynamic Programming Distance Calculation
Calculate ad(S,Q) for each sequence S in corpus
$\hat{S} = \arg\min_{S} \mathrm{ad}(S, Q)$
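The retrieval step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the unit costs for substitution, insertion and deletion and the function names `accumulated_distance` and `retrieve` are assumptions, and only some of the corpus sequences shown on the slide are used.

```python
def accumulated_distance(s, q, sub_cost=1, indel_cost=1):
    """Accumulated distance between sequences s and q by dynamic programming.

    Unit substitution, insertion and deletion costs are assumptions.
    """
    # prev is the previous row of the DP table (row 0 before the loop starts).
    prev = [j * indel_cost for j in range(len(q) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * indel_cost]
        for j in range(1, len(q) + 1):
            diagonal = prev[j - 1] + (0 if s[i - 1] == q[j - 1] else sub_cost)
            curr.append(min(diagonal, prev[j] + indel_cost, curr[j - 1] + indel_cost))
        prev = curr
    return prev[-1]


def retrieve(corpus, query):
    """Return the corpus sequence with the smallest accumulated distance to the query."""
    return min(corpus, key=lambda s: accumulated_distance(s, query))


# A few of the corpus sequences from the slide, and the visible part of the query.
corpus = [
    "AAGDTDTDTDD",
    "AABBCBDAAAAAAA",
    "BABABABBCCDF",
    "GGGGDDGDGDGDGDTDTD",
    "DGDGDGDGD",
    "GGAACDTGGGGGAAA",
]
query = "BBCCDDDGDGDGDCDTCDTTDCCC"
best = retrieve(corpus, query)
print(best, accumulated_distance(best, query))
```

Every sequence in the corpus is scored against the query, so the cost grows with both the corpus size and the product of the sequence lengths.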
Slide 4
Limitations of ‘template matching’
This type of analysis is sometimes referred to as template matching
The ‘templates’ are the sequences in the corpus
Can think of each template as representing a ‘class’
Problem is to determine which class best fits the query
Performance will depend on precisely which template is used to represent the class
Slide 5
Alternative path shapes
The basic units of path considered so far are: substitution, insertion, deletion
Others are possible and may have advantages, e.g.:
[Figure: alternative path shapes for substitution, insertion and deletion]
Slide 6
Example
Slide 7
Hidden Markov Models (HMMs)
One solution is to replace the individual template sequence with an ‘average’ sequence
But what is an ‘average sequence’?
One solution is to use a type of statistical model called a Hidden Markov Model
Slide 8
HMMs
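Suppose the following sequences are in the same class:
– ABC, YBBC, ABXC, AZ
Compute alignments:
[Figure: three alignment grids, each with A, B, C on the vertical axis, for the sequences Y B B C, A B X C and A Z on the horizontal axis]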
Slide 9
Finite State Network Representation
The sequence consists of 3 ‘states’
– First state is ‘realised’ as A (twice) or Y (once)
– Second state ‘realised’ as B (three times) or X (once)
– Second state can be repeated or deleted
– Third state can be ‘realised’ as C (twice) or Z (once)
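The counts listed above can be turned into emission probability estimates. The sketch below assumes a simple relative-frequency estimate and reads the aligned symbols off the three alignments of YBBC, ABXC and AZ; the dictionary layout and the name `relative_frequencies` are illustrative only.

```python
from collections import Counter
from fractions import Fraction

# Symbols aligned to each state, one entry per aligned symbol
# (YBBC: Y | B B | C, ABXC: A | B X | C, AZ: A | - | Z).
aligned = {
    "first": ["Y", "A", "A"],
    "second": ["B", "B", "B", "X"],
    "third": ["C", "C", "Z"],
}


def relative_frequencies(symbols):
    """Estimate emission probabilities as count / total (an assumed estimator)."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {symbol: Fraction(n, total) for symbol, n in counts.items()}


emissions = {state: relative_frequencies(symbols) for state, symbols in aligned.items()}
for state, probs in emissions.items():
    print(state, {symbol: str(p) for symbol, p in probs.items()})
```

Symbols that never align to a state get no entry, which corresponds to an emission probability of zero for that state.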
Slide 10
Network representation
Directed graph representation
Each state associated with a set of probabilities
– Called the ‘state emission’ probabilities
For the first state, for example:
$p(A) = \frac{2}{3},\quad p(Y) = \frac{1}{3},\quad p(B) = p(C) = p(X) = p(Z) = 0$
Slide 11
Transition probabilities
Transition probabilities control insertions and deletions of symbols
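[Figure: transition network with arcs labelled 1, 0.67, 0.33, 0.5, 0.5 and 1]
$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0.67 & 0.33 & 0 \\ 0 & 0 & 0.5 & 0.5 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$
$a_{jk}$ = Prob(state k follows state j)
Basic rule for drawing transition networks:
Connect state j to state k if $a_{jk} > 0$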
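The drawing rule can be checked mechanically. The sketch below reads the matrix on this slide as a 5 × 5 matrix in which state 1 is a non-emitting entry state and state 5 a non-emitting exit state, so that states 2 to 4 carry the three emitting states of the example; that reading, and the helper names `arcs` and `is_row_stochastic`, are assumptions.

```python
# Transition matrix as read off the slide (row j, column k holds a_jk).
# Assumption: state 1 = entry, states 2-4 = emitting states, state 5 = exit.
A = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.67, 0.33, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]


def arcs(matrix):
    """Return (j, k, a_jk) for every arc with a_jk > 0, using 1-based state numbers."""
    return [
        (j + 1, k + 1, p)
        for j, row in enumerate(matrix)
        for k, p in enumerate(row)
        if p > 0
    ]


def is_row_stochastic(matrix, tol=1e-9):
    """Check that every row with at least one outgoing arc sums to 1."""
    return all(abs(sum(row) - 1.0) < tol for row in matrix if any(row))


for j, k, p in arcs(A):
    print(f"state {j} -> state {k}: {p}")
```

Under this reading, the arc from state 2 to state 4 (0.33) is what allows the second emitting state to be deleted, and the self-loop on state 3 (0.5) is what allows it to be repeated.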
Slide 12
Formal Definition
A Hidden Markov Model (HMM) for the symbols 1, 2, …, K consists of:
– A number of states N
– An N × N state transition probability matrix A
– For each state j a set of probabilities $p_j(1), \dots, p_j(K)$, where $p_j(k)$ is the probability that symbol k occurs for state j
Slide 13
Alignment paths for HMMs
For HMMs, alignment paths are called state sequences
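[Figure: alignment path of the sequence Y A B B B X B C (horizontal axis) against the states A, B, C (vertical axis)]
$p(YABBBXBC) = p_2(Y)\, a_{22}\, p_2(A)\, a_{23}\, p_3(B) \cdots a_{34}\, p_4(C)$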
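The product of emission and transition probabilities along a state sequence can be sketched as follows. Several things here are assumptions rather than the lecture's definitions: the transition values are taken from the earlier 5-state example with state 1 as entry and state 5 as exit, the emission values are relative frequencies from the alignment counts, the entry and exit transitions are included in the product, and the name `joint_probability` is illustrative.

```python
# Transition probabilities a_jk keyed by (j, k); missing pairs count as 0.
TRANS = {(1, 2): 1.0, (2, 3): 0.67, (2, 4): 0.33, (3, 3): 0.5, (3, 4): 0.5, (4, 5): 1.0}
# Emission probabilities p_j(symbol); assumed relative frequencies from the alignments.
EMIT = {
    2: {"A": 2 / 3, "Y": 1 / 3},
    3: {"B": 3 / 4, "X": 1 / 4},
    4: {"C": 2 / 3, "Z": 1 / 3},
}


def joint_probability(symbols, states, trans=TRANS, emit=EMIT, entry=1, exit_state=5):
    """Probability of emitting `symbols` while following the state sequence `states`."""
    if len(symbols) != len(states):
        raise ValueError("need exactly one state per symbol")
    prob = 1.0
    previous = entry
    for symbol, state in zip(symbols, states):
        # One transition into the state, then one emission from it.
        prob *= trans.get((previous, state), 0.0) * emit[state].get(symbol, 0.0)
        previous = state
    return prob * trans.get((previous, exit_state), 0.0)


print(joint_probability("ABXC", [2, 3, 3, 4]))
print(joint_probability("AZ", [2, 4]))
```

A state sequence that uses a transition with $a_{jk} = 0$, or a symbol the state cannot emit, scores zero.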
Slide 14
State-symbol trellis
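[Figure: state-symbol trellis with the sequence Y A B B B X B C on the horizontal axis and the states A, B, C on the vertical axis]
Rule: connect state j at symbol m with state k at symbol m+1 if $a_{jk} > 0$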
Slide 15
More examples
Slide 16
Dynamic Programming
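[Figure: state-symbol trellis for the sequence Y A B B B X B C against the states A, B, C]
Writing $\delta_j(k)$ (notation assumed here) for the probability of the best partial path that ends in state j at symbol k, the value for state 4 at a symbol B is:
$\delta_4(k) = \max \{ \delta_3(k-1)\, a_{34}\, p_4(B),\ \delta_2(k-1)\, a_{24}\, p_4(B) \}$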
Slide 17
Formal Definition
A Hidden Markov Model (HMM) for the symbols 1, 2, …, K consists of:
– A number of states N
– An N × N state transition probability matrix A
– For each state j a set of probabilities $p_j(1), \dots, p_j(K)$, where $p_j(k)$ is the probability that symbol k occurs for state j
Slide 18
Alignment paths for HMMs
For HMMs, alignment paths are called state sequences
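[Figure: alignment path of the sequence Y A B B B X B C (horizontal axis) against the states A, B, C (vertical axis); the path is labelled ‘State sequence’]
$p(YABBBXBC) = p_2(Y)\, a_{22}\, p_2(A)\, a_{23}\, p_3(B) \cdots a_{34}\, p_4(C)$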
Slide 19
The optimal state sequence
Let M be a HMM and s a sequence
The probability on the previous slide depends on the state sequence x and the model, so we write:
$p(s, x \mid M)$
By analogy with dynamic programming, the optimal state sequence $\hat{x}$ is the sequence such that:
$\hat{x} = \arg\max_{x} p(s, x \mid M)$
or, $p(s, \hat{x} \mid M) = \max_{x} p(s, x \mid M)$
Slide 20
Computing the optimal state sequence: the ‘state-symbol’ trellis
[Figure: state-symbol trellis with the sequence Y A B B B X B C on the horizontal axis and the states A, B, C on the vertical axis]
Rule: connect state j at symbol m with state k at symbol m+1 if $a_{jk} > 0$
Slide 21
More examples
Slide 22
Dynamic Programming, a.k.a. Viterbi Decoding
[Figure: state-symbol trellis for the sequence Y A B B B X B C against the states A, B, C]
Writing $\delta_j(k)$ (notation assumed here) for the probability of the best partial path that ends in state j at symbol k, the value for state 4 at a symbol B is:
$\delta_4(k) = \max \{ \delta_3(k-1)\, a_{34}\, p_4(B),\ \delta_2(k-1)\, a_{24}\, p_4(B) \}$
The probability of the optimal state sequence, $p(s, \hat{x} \mid M)$, is the value of $\delta$ reached in the final state at the last symbol of s
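A minimal Viterbi decoder for a model of this shape might look like the sketch below. It is an illustration under stated assumptions, not the lecture's code: the transition values come from the earlier 5-state example (state 1 as entry, state 5 as exit), the emission values are relative frequencies from the alignment counts, the input sequence is assumed non-empty, and the function name `viterbi` is hypothetical.

```python
# Transition probabilities a_jk keyed by (j, k); missing pairs count as 0.
TRANS = {(1, 2): 1.0, (2, 3): 0.67, (2, 4): 0.33, (3, 3): 0.5, (3, 4): 0.5, (4, 5): 1.0}
# Emission probabilities p_j(symbol); assumed relative frequencies from the alignments.
EMIT = {
    2: {"A": 2 / 3, "Y": 1 / 3},
    3: {"B": 3 / 4, "X": 1 / 4},
    4: {"C": 2 / 3, "Z": 1 / 3},
}


def viterbi(symbols, trans=TRANS, emit=EMIT, entry=1, exit_state=5):
    """Return (probability, state sequence) of the best path through the trellis."""
    states = sorted(emit)
    # best[j] = (probability, path) of the best partial path ending in state j.
    best = {
        j: (trans.get((entry, j), 0.0) * emit[j].get(symbols[0], 0.0), [j])
        for j in states
    }
    for symbol in symbols[1:]:
        new_best = {}
        for k in states:
            # Maximise over the predecessor state j, as in the trellis recursion.
            prob, path = max(
                ((best[j][0] * trans.get((j, k), 0.0), best[j][1]) for j in states),
                key=lambda candidate: candidate[0],
            )
            new_best[k] = (prob * emit[k].get(symbol, 0.0), path + [k])
        best = new_best
    # Finish with the transition into the exit state.
    return max(
        ((p * trans.get((j, exit_state), 0.0), path) for j, (p, path) in best.items()),
        key=lambda candidate: candidate[0],
    )


for sequence in ["ABC", "YBBC", "ABXC", "AZ"]:
    print(sequence, viterbi(sequence))
```

Storing the whole path with each trellis cell keeps the sketch short; a back-pointer table is the usual choice for long sequences, and log probabilities avoid underflow.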
Slide 23
Sequence retrieval using HMMs
Corpus of pre-built HMMs
‘query’ sequence Q
…BBCCDDDGDGDGDCDTCDTTDCCC…
Viterbi Decoding
Calculate p(Q|M) for each HMM M in corpus
$\hat{M} = \arg\max_{M} p(Q \mid M)$
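Retrieval over a corpus of models can be sketched as an argmax over Viterbi scores. Everything concrete below is illustrative: the first model reuses the assumed values of the earlier ABC example, the second model is entirely hypothetical, and the names `viterbi_score` and `retrieve` are not from the lecture. The score used is the probability of the best state sequence, which stands in for p(Q|M).

```python
def viterbi_score(symbols, model):
    """Probability of the best state sequence for `symbols` under `model`."""
    trans, emit = model["trans"], model["emit"]
    best = {
        j: trans.get((model["entry"], j), 0.0) * emit[j].get(symbols[0], 0.0)
        for j in emit
    }
    for symbol in symbols[1:]:
        best = {
            k: max(best[j] * trans.get((j, k), 0.0) for j in emit) * emit[k].get(symbol, 0.0)
            for k in emit
        }
    return max(best[j] * trans.get((j, model["exit"]), 0.0) for j in emit)


models = {
    # Assumed values from the ABC example: state 1 = entry, state 5 = exit.
    "ABC-like": {
        "entry": 1,
        "exit": 5,
        "trans": {(1, 2): 1.0, (2, 3): 0.67, (2, 4): 0.33, (3, 3): 0.5, (3, 4): 0.5, (4, 5): 1.0},
        "emit": {2: {"A": 2 / 3, "Y": 1 / 3}, 3: {"B": 3 / 4, "X": 1 / 4}, 4: {"C": 2 / 3, "Z": 1 / 3}},
    },
    # Hypothetical second class: a single looping state that emits D or G.
    "DG-like": {
        "entry": 1,
        "exit": 3,
        "trans": {(1, 2): 1.0, (2, 2): 0.8, (2, 3): 0.2},
        "emit": {2: {"D": 0.5, "G": 0.5}},
    },
}


def retrieve(models, query):
    """Return the name of the model with the highest Viterbi score for the query."""
    return max(models, key=lambda name: viterbi_score(query, models[name]))


print(retrieve(models, "ABXC"), retrieve(models, "DGDGDG"))
```

Unlike template matching, the cost of a query depends on the number of models rather than the number of example sequences in the corpus.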
Slide 28
HMM Construction
Suppose we have a set of HMMs, each representing a different class (e.g. protein sequence)
Given an unknown sequence s:
– Use Viterbi decoding to compare s with each HMM
– Compute $\hat{p}(s \mid M) = p(s, \hat{x} \mid M)$
But how do we obtain the HMM in the first place?
Slide 29
HMM training
Given a set of example sequences S a HMM M can be built such that p(S|M) is locally maximised
Procedure is as follows:
– Obtain an initial estimate of a suitable model M0
– Apply an algorithm – the ‘Baum-Welch’ algorithm – to obtain a new model M1 such that p(S|M1) ≥ p(S|M0)
– Repeat to produce a sequence of HMMs M0, M1,…,Mn with:
p(S|M0) ≤ p(S|M1) ≤ p(S|M2) ≤… ≤ p(S|Mn)
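The non-decreasing chain of likelihoods can be observed with a small Baum-Welch sketch. This is a simplified version under stated assumptions: it uses the common formulation with an initial state distribution `pi` instead of entry and exit states, it works with unscaled probabilities (adequate only for short sequences), and the initial model M0, the three-state size and all function names are arbitrary choices for illustration.

```python
def forward(seq, pi, A, B):
    """alpha[t][j] = p(first t+1 symbols, state j at position t)."""
    n = len(pi)
    alpha = [[pi[j] * B[j][seq[0]] for j in range(n)]]
    for symbol in seq[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][symbol] for j in range(n)])
    return alpha


def backward(seq, A, B):
    """beta[t][i] = p(symbols after position t | state i at position t)."""
    n = len(A)
    beta = [[1.0] * n]
    for symbol in reversed(seq[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][symbol] * nxt[j] for j in range(n)) for i in range(n)])
    return beta


def likelihood(seqs, pi, A, B):
    """p(S|M), treating the sequences in S as independent."""
    total = 1.0
    for seq in seqs:
        total *= sum(forward(seq, pi, A, B)[-1])
    return total


def baum_welch_step(seqs, pi, A, B):
    """One re-estimation step: returns the parameters of the next model."""
    n, n_symbols = len(pi), len(B[0])
    pi_num = [0.0] * n
    a_num, a_den = [[0.0] * n for _ in range(n)], [0.0] * n
    b_num, b_den = [[0.0] * n_symbols for _ in range(n)], [0.0] * n
    for seq in seqs:
        alpha, beta = forward(seq, pi, A, B), backward(seq, A, B)
        p_seq = sum(alpha[-1])
        for t, symbol in enumerate(seq):
            for i in range(n):
                gamma = alpha[t][i] * beta[t][i] / p_seq  # p(state i at t | seq)
                if t == 0:
                    pi_num[i] += gamma
                b_num[i][symbol] += gamma
                b_den[i] += gamma
                if t + 1 < len(seq):
                    a_den[i] += gamma
                    for j in range(n):
                        # Expected use of the transition i -> j at position t.
                        a_num[i][j] += alpha[t][i] * A[i][j] * B[j][seq[t + 1]] * beta[t + 1][j] / p_seq
    new_pi = [x / len(seqs) for x in pi_num]
    # Rows of states that are never visited keep their old values.
    new_A = [[a_num[i][j] / a_den[i] for j in range(n)] if a_den[i] > 0 else A[i][:] for i in range(n)]
    new_B = [[b_num[i][k] / b_den[i] for k in range(n_symbols)] if b_den[i] > 0 else B[i][:] for i in range(n)]
    return new_pi, new_A, new_B


# Training set S: the example sequences, encoded as symbol indices.
encode = {symbol: index for index, symbol in enumerate("ABCXYZ")}
S = [[encode[c] for c in seq] for seq in ["ABC", "YBBC", "ABXC", "AZ"]]

# Arbitrary initial model M0 (an assumption): three states, all probabilities positive.
model = (
    [0.6, 0.3, 0.1],
    [[0.5, 0.4, 0.1], [0.1, 0.5, 0.4], [0.1, 0.1, 0.8]],
    [[0.3, 0.1, 0.1, 0.1, 0.3, 0.1], [0.1, 0.4, 0.1, 0.2, 0.1, 0.1], [0.1, 0.1, 0.4, 0.1, 0.1, 0.2]],
)
history = [likelihood(S, *model)]
for _ in range(10):
    model = baum_welch_step(S, *model)
    history.append(likelihood(S, *model))
print(history)
```

The printed values should never decrease from one model to the next, but nothing guarantees that the final model is the global maximum; a different M0 can converge elsewhere.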
Slide 30
Local optimality
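[Figure: plot of P(S|M) against the model M, showing the sequence M0, M1, …, Mn converging to a local maximum, with the global maximum marked separately]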
Slide 31
Summary
Hidden Markov Models
Importance of HMMs for sequence matching
Viterbi decoding
HMM training
Slide 32
Summary
Review of template matching
Hidden Markov Models
Dynamic programming for HMMs