Hidden Markov Model
Xiaole Shirley Liu
STAT115, STAT215, BIO298, BIST520
Outline
• Markov Chain
• Hidden Markov Model
  – Observations, hidden states, initial, transition, and emission probabilities
• Three problems
  – Pb(observations): forward, backward procedure
  – Infer hidden states: forward-backward, Viterbi
  – Estimate parameters: Baum-Welch
iid process
• iid: independently and identically distributed
  – Events are not correlated to each other
  – The current event has no predictive power for the future event
  – E.g. Pb(girl | boy) = Pb(girl), Pb(coin H | H) = Pb(H), Pb(die 1 | 5) = Pb(1)
Discrete Markov Chain
• Discrete Markov process
  – Distinct states: S1, S2, …, SN
  – Regularly spaced discrete times: t = 1, 2, …
  – Markov chain: the future state depends only on the present state, not on the path taken to get there
  – aij: transition probability

$P[q_t = S_j \mid q_{t-1} = S_i, q_{t-2} = S_k, \ldots] = P[q_t = S_j \mid q_{t-1} = S_i]$

$a_{ij} = P[q_t = S_j \mid q_{t-1} = S_i], \qquad a_{ij} \ge 0, \qquad \sum_{j=1}^{N} a_{ij} = 1$
Markov Chain Example 1
• States: exam grade 1 – Pass, 2 – Fail
• Discrete times: exam #1, #2, #3, …
• State transition probability

$A = \{a_{ij}\} = \begin{pmatrix} 0.8 & 0.2 \\ 0.5 & 0.5 \end{pmatrix}$

• Given PPPFFF, the probability of passing the next exam:

$P(q_7 = S_1 \mid A, O = \{S_1, S_1, S_1, S_2, S_2, S_2\}) = P(q_7 = S_1 \mid A, q_6 = S_2) = 0.5$
Markov Chain Example 2
• States: 1 – rain; 2 – cloudy; 3 – sunny
• Discrete times: day 1, 2, 3, …
• State transition probability

$A = \{a_{ij}\} = \begin{pmatrix} 0.4 & 0.3 & 0.3 \\ 0.2 & 0.6 & 0.2 \\ 0.1 & 0.1 & 0.8 \end{pmatrix}$

• Given sunny (S3) at t = 1:

$P(O = \{S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3\} \mid A, q_1 = S_3) = a_{33} a_{33} a_{31} a_{11} a_{13} a_{32} a_{23} = 0.8 \cdot 0.8 \cdot 0.1 \cdot 0.4 \cdot 0.3 \cdot 0.1 \cdot 0.2 = 1.536 \times 10^{-4}$
Markov Chain Example 3
• States: fair coin F, unfair (biased) coin B
• Discrete times: flip 1, 2, 3, …
• Initial probability: F = 0.6, B = 0.4
• Transition probability

$A = \begin{pmatrix} a_{FF} & a_{FB} \\ a_{BF} & a_{BB} \end{pmatrix} = \begin{pmatrix} 0.9 & 0.1 \\ 0.3 & 0.7 \end{pmatrix}$

• Prob(FFBBFFFB) $= \pi_F \, a_{FF} a_{FB} a_{BB} a_{BF} a_{FF} a_{FF} a_{FB} = 0.6 \cdot 0.9 \cdot 0.1 \cdot 0.7 \cdot 0.3 \cdot 0.9 \cdot 0.9 \cdot 0.1 \approx 9.2 \times 10^{-4}$ (computed in the sketch below)
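The following is a minimal Python sketch (not from the original slides) that scores a state sequence under this fair/biased-coin Markov chain; the function name chain_prob is my own.

```python
# Fair/biased-coin Markov chain from Example 3.
init = {"F": 0.6, "B": 0.4}                    # initial probabilities
trans = {("F", "F"): 0.9, ("F", "B"): 0.1,     # transition probabilities
         ("B", "F"): 0.3, ("B", "B"): 0.7}

def chain_prob(path):
    """P(path) = init[first state] * product of transition probabilities."""
    p = init[path[0]]
    for prev, cur in zip(path, path[1:]):
        p *= trans[(prev, cur)]
    return p

print(chain_prob("FFBBFFFB"))  # 0.6*0.9*0.1*0.7*0.3*0.9*0.9*0.1 ≈ 9.19e-4
```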
Hidden Markov Model
• Coin toss example
• Coin transition is a Markov chain
• The probability of H/T depends on the coin used
• The H/T observations are generated by a hidden Markov chain (the coin state is hidden)
Hidden Markov Model
• Elements of an HMM (coin toss)
  – N, the number of states (F / B)
  – M, the number of distinct observations (H / T)
  – A = {aij}, state transition probabilities
  – B = {bj(k)}, emission probabilities
  – π = {πi}, initial state distribution: πF = 0.4, πB = 0.6

$A = \begin{pmatrix} 0.9 & 0.1 \\ 0.3 & 0.7 \end{pmatrix}$

$b_F(H) = 0.5, \; b_F(T) = 0.5, \qquad b_B(H) = 0.8, \; b_B(T) = 0.2$
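As a running aid for the procedures below, here is a minimal Python sketch of these coin-toss HMM parameters; the variable names (states, pi, A, B, obs) are my own and are reused by the later snippets.

```python
# Coin-toss HMM parameters from the slides.
states = ["F", "B"]                        # fair / biased coin
pi = {"F": 0.4, "B": 0.6}                  # initial state distribution
A = {("F", "F"): 0.9, ("F", "B"): 0.1,     # transition probabilities
     ("B", "F"): 0.3, ("B", "B"): 0.7}
B = {("F", "H"): 0.5, ("F", "T"): 0.5,     # emission probabilities
     ("B", "H"): 0.8, ("B", "T"): 0.2}
obs = "HTTHHHT"                            # observed flips
```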
HMM Applications
• Stock market: the bull/bear market is the hidden Markov chain; daily stock up/down is observed and depends on the big market trend
• Speech recognition: sentences & words are the hidden Markov chain; the spoken sound is observed (heard) and depends on the words
• Digital signal processing: the source signal (0/1) is the hidden Markov chain; the arriving signal's fluctuation is observed and depends on the source
• Bioinformatics: sequence motif finding, gene prediction, genome copy number change, protein structure prediction, protein-DNA interaction prediction
Basic Problems for HMM
1. Given λ, how to compute P(O|λ) for an observation sequence O = O1O2…OT
   – Probability of observing HTTHHHT …
   – Forward procedure, backward procedure
2. Given the observation sequence O = O1O2…OT and λ, how to choose a state sequence Q = q1q2…qT
   – What is the hidden coin behind each flip?
   – Forward-backward, Viterbi
3. How to estimate λ = (A, B, π) so as to maximize P(O|λ)
   – How to estimate the coin parameters?
   – Baum-Welch (expectation maximization)
Problem 1: P(O|λ)
• Suppose we know the state sequence Q
  – O = H T T H H H T
  – Q = F F B F F B B
  – Q = B F B F B B B
• Each given path Q has a probability for O

$P(O \mid Q, \lambda) = b_{q_1}(O_1) \, b_{q_2}(O_2) \cdots b_{q_T}(O_T)$

$P(O \mid FFBFFBB, \lambda) = b_F(H) b_F(T) b_B(T) b_F(H) b_F(H) b_B(H) b_B(T) = 0.5 \cdot 0.5 \cdot 0.2 \cdot 0.5 \cdot 0.5 \cdot 0.8 \cdot 0.2$

$P(O \mid BFBFBBB, \lambda) = b_B(H) b_F(T) b_B(T) b_F(H) b_B(H) b_B(H) b_B(T) = 0.8 \cdot 0.5 \cdot 0.2 \cdot 0.5 \cdot 0.8 \cdot 0.8 \cdot 0.2$
Problem 1: P(O|λ)
• What is the probability of this path Q?
  – Q = F F B F F B B
  – Q = B F B F B B B
• Each given path Q has its own probability

$P(FFBFFBB \mid \lambda) = \pi_F \, a_{FF} a_{FB} a_{BF} a_{FF} a_{FB} a_{BB} = 0.6 \cdot 0.9 \cdot 0.1 \cdot 0.3 \cdot 0.9 \cdot 0.1 \cdot 0.7$

$P(BFBFBBB \mid \lambda) = \pi_B \, a_{BF} a_{FB} a_{BF} a_{FB} a_{BB} a_{BB} = 0.4 \cdot 0.3 \cdot 0.1 \cdot 0.3 \cdot 0.1 \cdot 0.7 \cdot 0.7$
Problem 1: P(O|λ)
• Therefore, the total probability of O = HTTHHHT
• Sum over all possible paths Q: each Q with its own probability, multiplied by the probability of O given Q
• For a sequence of length T with N hidden states there are N^T paths, an infeasible calculation (see the brute-force sketch below)

$P(O \mid \lambda) = \sum_{\text{all } Q} P(O, Q \mid \lambda) = \sum_{\text{all } Q} P(Q \mid \lambda) \, P(O \mid Q, \lambda)$
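To make the cost concrete, here is a brute-force Python sketch (my own illustration, not the slides') that enumerates all N^T hidden paths using the pi, A, B, states defined above:

```python
from itertools import product

def brute_force_prob(obs):
    """Sum P(Q|lambda) * P(O|Q,lambda) over every hidden path Q."""
    total = 0.0
    for path in product(states, repeat=len(obs)):   # all N**T paths
        p = pi[path[0]] * B[(path[0], obs[0])]      # initial state + emission
        for t in range(1, len(obs)):
            p *= A[(path[t - 1], path[t])] * B[(path[t], obs[t])]
        total += p
    return total

print(brute_force_prob("HTTHHHT"))  # exponential in T; motivates the forward procedure
```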
Solution to Problem 1: Forward Procedure
• Use dynamic programming
• Sum at every time point
• Keep previous subproblem solutions to speed up the current calculation
Forward Procedure
• Coin toss, O = HTTHHHT
• Initialization
  – Probability of seeing H1 from F1 or B1

$\alpha_1(i) = \pi_i \, b_i(O_1)$

[Trellis diagram: states F and B over flips H T T H …]
Forward Procedure
• Coin toss, O = HTTHHHT
• Initialization

$\alpha_1(i) = \pi_i \, b_i(O_1)$

• Induction
  – Probability of seeing T2 from F2 or B2
  – F2 could come from F1 or B1; each has its own probability, so add them up

$\alpha_{t+1}(j) = \Big[ \sum_{i=1}^{N} \alpha_t(i) \, a_{ij} \Big] b_j(O_{t+1})$

[Trellis diagram: paths from F1 and B1 merging into F2 and B2]
Forward Procedure
• Coin toss, O = HTTHHHT
• Initialization: $\alpha_1(F) = 0.4 \cdot 0.5 = 0.2, \quad \alpha_1(B) = 0.6 \cdot 0.8 = 0.48$
• Induction

$\alpha_2(F) = (\alpha_1(F) a_{FF} + \alpha_1(B) a_{BF}) \, b_F(T) = (0.2 \cdot 0.9 + 0.48 \cdot 0.3) \cdot 0.5 = 0.162$

$\alpha_2(B) = (\alpha_1(F) a_{FB} + \alpha_1(B) a_{BB}) \, b_B(T) = (0.2 \cdot 0.1 + 0.48 \cdot 0.7) \cdot 0.2 = 0.0712$
Forward Procedure
• Coin toss, O = HTTHHHT
• Initialization, then induction continues

$\alpha_3(F) = [0.162 \cdot 0.9 + 0.0712 \cdot 0.3] \cdot 0.5 = 0.08358$

$\alpha_3(B) = [0.162 \cdot 0.1 + 0.0712 \cdot 0.7] \cdot 0.2 = 0.013208$
Forward Procedure
• Coin toss, O = HTTHHHT
• Initialization: $\alpha_1(i) = \pi_i \, b_i(O_1)$
• Induction: $\alpha_{t+1}(j) = \Big[ \sum_{i=1}^{N} \alpha_t(i) \, a_{ij} \Big] b_j(O_{t+1})$
• Termination (a code sketch follows)

$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i) = \alpha_4(F) + \alpha_4(B) \quad \text{(for the four flips H T T H shown in the trellis)}$
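Here is a minimal Python sketch of the whole forward procedure (my own code, assuming the pi, A, B, states dictionaries defined earlier):

```python
def forward(obs):
    """Return the alpha table and P(O|lambda) for an observation string."""
    # initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [{s: pi[s] * B[(s, obs[0])] for s in states}]
    # induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({j: sum(prev[i] * A[(i, j)] for i in states) * B[(j, o)]
                      for j in states})
    # termination: P(O|lambda) = sum_i alpha_T(i)
    return alpha, sum(alpha[-1].values())

alpha, p_obs = forward("HTTHHHT")
print(alpha[1])  # {'F': 0.162, 'B': 0.0712}, matching the slide's induction step
```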
Solution to Problem 1: Backward Procedure
• Coin toss, O = HTTHHHT
• Initialization
• Probability of a coin to see the flips after it

$\beta_T(i) = 1, \quad \text{i.e.} \; \beta_T(F) = \beta_T(B) = 1$

[Trellis diagram: states F and B over flips … H H H T]
Backward Procedure
• Coin toss, O = HTTHHHT
• Initialization: $\beta_T(i) = 1$
• Induction
• Probability of a coin to see the flips after it

$\beta_t(i) = \sum_{j=1}^{N} a_{ij} \, b_j(O_{t+1}) \, \beta_{t+1}(j)$
Backward Procedure
• Coin toss, O = HTTHHHT
• Initialization: $\beta_T(i) = 1$
• Induction

$\beta_{T-1}(F) = a_{FF} b_F(T) \cdot 1 + a_{FB} b_B(T) \cdot 1 = 0.9 \cdot 0.5 + 0.1 \cdot 0.2 = 0.47$

$\beta_{T-1}(B) = a_{BF} b_F(T) \cdot 1 + a_{BB} b_B(T) \cdot 1 = 0.3 \cdot 0.5 + 0.7 \cdot 0.2 = 0.29$
Backward Procedure
• Coin toss, O = HTTHHHT
• Initialization, then induction continues

$\beta_{T-2}(F) = a_{FF} b_F(H) \beta_{T-1}(F) + a_{FB} b_B(H) \beta_{T-1}(B) = 0.9 \cdot 0.5 \cdot 0.47 + 0.1 \cdot 0.8 \cdot 0.29 = 0.2347$

$\beta_{T-2}(B) = a_{BF} b_F(H) \beta_{T-1}(F) + a_{BB} b_B(H) \beta_{T-1}(B) = 0.3 \cdot 0.5 \cdot 0.47 + 0.7 \cdot 0.8 \cdot 0.29 = 0.2329$
Backward Procedure
• Coin toss, O = HTTHHHT
• Initialization: $\beta_T(i) = 1$
• Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij} \, b_j(O_{t+1}) \, \beta_{t+1}(j)$
• Termination

$P(O \mid \lambda) = \sum_{i=1}^{N} \pi_i \, b_i(O_1) \, \beta_1(i) = \pi_F b_F(H) \beta_1(F) + \pi_B b_B(H) \beta_1(B)$

• Both forward and backward can be used to solve Problem 1 and should give identical results (see the sketch below)
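A matching Python sketch of the backward procedure (again my own code, assuming the earlier pi, A, B, states):

```python
def backward(obs):
    """Return the beta table and P(O|lambda) for an observation string."""
    # initialization: beta_T(i) = 1
    beta = [{s: 1.0 for s in states}]
    # induction, computed right to left:
    # beta_t(i) = sum_j a_ij * b_j(O_{t+1}) * beta_{t+1}(j)
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, {i: sum(A[(i, j)] * B[(j, o)] * nxt[j] for j in states)
                        for i in states})
    # termination: P(O|lambda) = sum_i pi_i * b_i(O_1) * beta_1(i)
    return beta, sum(pi[i] * B[(i, obs[0])] * beta[0][i] for i in states)

beta, p_obs_backward = backward("HTTHHHT")
print(p_obs_backward)  # identical to the forward procedure's P(O|lambda)
```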
Solution to Problem 2: Forward-Backward Procedure
• First run forward and backward separately
• Keep track of the scores at every point
• Coin toss
  – α: probability of this coin given all the flips now and before
  – β: probability of this coin given all the flips after

H      T      T      H      H      H      T
α1(F)  α2(F)  α3(F)  α4(F)  α5(F)  α6(F)  α7(F)
α1(B)  α2(B)  α3(B)  α4(B)  α5(B)  α6(B)  α7(B)
β1(F)  β2(F)  β3(F)  β4(F)  β5(F)  β6(F)  β7(F)
β1(B)  β2(B)  β3(B)  β4(B)  β5(B)  β6(B)  β7(B)
Solution to Problem 2: Forward-Backward Procedure
• Coin toss
• Gives a probabilistic prediction at every time point
• Forward-backward maximizes the expected number of correctly predicted states (coins); a code sketch follows

$\gamma_t(i) = \dfrac{\alpha_t(i) \, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j) \, \beta_t(j)}$

$\text{e.g.} \quad \gamma_3(F) = \dfrac{\alpha_3(F) \beta_3(F)}{\alpha_3(F) \beta_3(F) + \alpha_3(B) \beta_3(B)} = \dfrac{0.0025 \cdot 0.0047}{0.0025 \cdot 0.0047 + 0.0016 \cdot 0.0038} = 0.659$
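A minimal Python sketch of this posterior computation, reusing the forward() and backward() helpers above:

```python
def posterior(obs):
    """gamma_t(i) = alpha_t(i) * beta_t(i), normalized at each time point."""
    alpha, _ = forward(obs)
    beta, _ = backward(obs)
    gamma = []
    for a_t, b_t in zip(alpha, beta):
        norm = sum(a_t[s] * b_t[s] for s in states)
        gamma.append({s: a_t[s] * b_t[s] / norm for s in states})
    return gamma

for t, g in enumerate(posterior("HTTHHHT"), start=1):
    print(t, g)  # posterior probability of F vs B at every flip
```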
Solution to Problem 2: Viterbi Algorithm
• Report the path that is most likely to give the observations
• Initiation

$\delta_1(i) = \pi_i \, b_i(O_1), \qquad \psi_1(i) = 0$

• Recursion

$\delta_t(j) = \max_{1 \le i \le N} [\delta_{t-1}(i) \, a_{ij}] \, b_j(O_t), \qquad \psi_t(j) = \arg\max_{1 \le i \le N} [\delta_{t-1}(i) \, a_{ij}]$

• Termination

$P^* = \max_{1 \le i \le N} [\delta_T(i)], \qquad q_T^* = \arg\max_{1 \le i \le N} [\delta_T(i)]$

• Path (state sequence) backtracking

$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1$
Viterbi Algorithm
• Observe: HTTHHHT
• Initiation: $\delta_1(i) = \pi_i \, b_i(O_1), \quad \psi_1(i) = 0$

$\delta_1(F) = 0.4 \cdot 0.5 = 0.2$

$\delta_1(B) = 0.6 \cdot 0.8 = 0.48$
Viterbi Algorithm
• Observe: HTTHHHT
• Recursion: max instead of +, keep track of the path

$\delta_2(F) = \max[\delta_1(F) a_{FF}, \delta_1(B) a_{BF}] \, b_F(T) = \max(0.18, 0.144) \cdot 0.5 = 0.09, \qquad \psi_2(F) = F$

$\delta_2(B) = \max[\delta_1(F) a_{FB}, \delta_1(B) a_{BB}] \, b_B(T) = \max(0.02, 0.336) \cdot 0.2 = 0.0672, \qquad \psi_2(B) = B$
Viterbi Algorithm
• Max instead of +, keep track of the path
• Best path (instead of all paths) up to here

[Trellis diagram: best incoming edge into F and B at each flip of H T T H]
Viterbi Algorithm
• Observe: HTTHHHT
• Recursion continues

$\delta_3(F) = \max[\delta_2(F) a_{FF}, \delta_2(B) a_{BF}] \, b_F(T) = \max(0.09 \cdot 0.9, 0.0672 \cdot 0.3) \cdot 0.5 = 0.0405, \qquad \psi_3(F) = F$

$\delta_3(B) = \max[\delta_2(F) a_{FB}, \delta_2(B) a_{BB}] \, b_B(T) = \max(0.09 \cdot 0.1, 0.0672 \cdot 0.7) \cdot 0.2 = 0.0094, \qquad \psi_3(B) = B$
Viterbi Algorithm
• Max instead of +, keep track of the path
• Best path (instead of all paths) up to here

[Trellis diagram: best paths into F and B extended one more flip over H T T H]
Viterbi Algorithm
• Terminate: pick the state that gives the best final δ score, and backtrack to get the path
• For HTTH, $\delta_4(F) = \max(0.0405 \cdot 0.9, 0.0094 \cdot 0.3) \cdot 0.5 = 0.0182$ beats $\delta_4(B) = \max(0.0405 \cdot 0.1, 0.0094 \cdot 0.7) \cdot 0.8 = 0.0053$, and every ψ above points back to the same coin, so FFFF is the most likely path to give HTTH (a code sketch follows)

[Trellis diagram: backtracking the best path over flips H T T H]
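A minimal Python sketch of Viterbi decoding (my own code, assuming the earlier pi, A, B, states):

```python
def viterbi(obs):
    """Return the most likely hidden path for an observation string."""
    # initiation: delta_1(i) = pi_i * b_i(O_1)
    delta = [{s: pi[s] * B[(s, obs[0])] for s in states}]
    psi = [{}]
    # recursion: max over predecessors instead of sum, remembering the argmax
    for o in obs[1:]:
        d, p = {}, {}
        for j in states:
            best = max(states, key=lambda i: delta[-1][i] * A[(i, j)])
            d[j] = delta[-1][best] * A[(best, j)] * B[(j, o)]
            p[j] = best
        delta.append(d)
        psi.append(p)
    # termination: best final state, then backtrack through psi
    q = max(states, key=lambda s: delta[-1][s])
    path = [q]
    for p in reversed(psi[1:]):
        q = p[q]
        path.append(q)
    return "".join(reversed(path))

print(viterbi("HTTH"))  # 'FFFF' with P* = delta_4(F) ≈ 0.0182
```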
Solution to Problem 3
• There is no analytical way to find the global optimum, so find a local maximum
• Baum-Welch algorithm (equivalent to expectation-maximization); a simplified sketch follows this list
  – Randomly initialize λ = (A, B, π)
  – Run Viterbi based on λ and O
  – Update λ = (A, B, π):
    • π: % of F vs B on the Viterbi path
    • A: frequency of F/B transitions on the Viterbi path
    • B: frequency of H/T emitted by F/B
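Here is a minimal Python sketch of the Viterbi-based update loop the slide describes (hard EM, often called Viterbi training; full Baum-Welch would instead use the forward-backward expectations). It assumes the viterbi() helper and the pi, A, B dictionaries above; the add-one smoothing is my addition to avoid zero counts on short sequences.

```python
def viterbi_train(obs, n_iter=10):
    """Re-estimate pi, A, B from counts along the decoded Viterbi path."""
    global pi, A, B
    for _ in range(n_iter):
        path = viterbi(obs)                       # decode with current lambda
        # pi: fraction of F vs B on the Viterbi path (add-one smoothed)
        pi = {s: (path.count(s) + 1) / (len(path) + 2) for s in states}
        # A: frequency of i -> j transitions on the Viterbi path
        pairs = list(zip(path, path[1:]))
        A = {(i, j): (pairs.count((i, j)) + 1) / (path[:-1].count(i) + 2)
             for i in states for j in states}
        # B: frequency of H/T emitted while on each coin
        emits = list(zip(path, obs))
        B = {(s, o): (emits.count((s, o)) + 1) / (path.count(s) + 2)
             for s in states for o in "HT"}

viterbi_train("HTTHHHT")
```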
GenScan
• HMM model for gene structure
  – Hexamer coding statistics
  – Matrix profile for gene structure
• Needs training sequences
  – Known coding/noncoding regions
• Could miss or mispredict a whole gene/exon
Summary
• Markov Chain
• Hidden Markov Model
  – Observations, hidden states, initial, transition, and emission probabilities
• Three problems
  – Pb(observations): forward, backward procedure (give the same results)
  – Infer hidden states: forward-backward (probabilistic prediction at each state), Viterbi (best path)
  – Estimate parameters: Baum-Welch