manish shrivastava - cse, iit bombaypb/cs344-2014/hmm-manish-march-2014.… · manish shrivastava....

36
Hidden Markov Models By Manish Shrivastava

Upload: others

Post on 07-Oct-2020

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Hidden Markov Models

By

Manish Shrivastava

Page 2: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

A Simple Process

a

b

a bS1S2

A simple automata

Page 3: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

A Slightly Complicated Process

Urn 1# of Red = 100# of Green = 0 # of Blue = 0

Urn 3# of Red = 0

# of Green = 0 # of Blue = 100

Urn 2# of Red = 0

# of Green = 100 # of Blue = 0

A colored ball choosing example :

U1 U2 U3U1 0.1 0.4 0.5U2 0.6 0.2 0.2U3 0.3 0.4 0.3

Probability of transition to another Urn after picking a ball:

Page 4: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

A Slightly Complicated Process contd.U1 U2 U3

U1 0.1 0.4 0.5U2 0.6 0.2 0.2U3 0.3 0.4 0.3

Given :

Observation : RRGGBRGR

State Sequence : ??

Easily Computable.

Page 5: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Markov Processes

Properties Limited Horizon :Given previous n states, a state i,

is independent of preceding 0…i-n+1 states. P(Xt=i|Xt-1, Xt-2 ,… X0) = P(Xt=i|Xt-1, Xt-2… Xt-n)

Time invariance : P(Xt=i|Xt-1=j) = P(X1=i|X0=j) = P(Xn=i|X0-1=j)

Page 6: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

A (Slightly Complicated) Markov Process

Urn 1# of Red = 100# of Green = 0 # of Blue = 0

Urn 3# of Red = 0

# of Green = 0 # of Blue = 100

Urn 2# of Red = 0

# of Green = 100 # of Blue = 0

A colored ball choosing example :

U1 U2 U3U1 0.1 0.4 0.5U2 0.6 0.2 0.2U3 0.3 0.4 0.3

Probability of transition to another Urn after picking a ball:

Page 7: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Markov Process

Visible Markov Model Given the observation, one can easily follow the

state sequence traversed

1 32

P(3|1)

P(1|3)

Page 8: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

POS Tagging Basics

Assign “Part of Speech” to each word in a given sentence

Example:

This/DT is/VBZ a/DT sentence/NN ./. This/DT is/VBZ a/DT tagged/JJ sentence/NN

./.

Page 9: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Simple Method

Assign the most common tag Example : I/PRP bank/NN at/IN SBI/NNP ./SYM

But, the correct tag sequence in context is: I/PRP bank/VBP at/IN SBI/NNP ./SYM

Assign “Part of Speech” to each word according to its context in a given sentence

Page 10: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Mathematics of POS Tagging

Formally, POS Tagging is a sequence labeling task

For a given observation sequence W W: {w1,w2 … wn}

Produce a label/tag sequence T T: {t1,t2 … tn}

Such that they “belong” together Maximize P(W,T) or P(T|W) argmax P(T|W)

Page 11: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Computing Label sequence given Observation Sequence

Page 12: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Hidden Markov Model

Urn 1# of Red = 30

# of Green = 50 # of Blue = 20

Urn 3# of Red =60

# of Green =10 # of Blue = 30

Urn 2# of Red = 10

# of Green = 40 # of Blue = 50

A colored ball choosing example :

U1 U2 U3U1 0.1 0.4 0.5U2 0.6 0.2 0.2U3 0.3 0.4 0.3

Probability of transition to another Urn after picking a ball:

Page 13: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Hidden Markov Model

U1 U2 U3U1 0.1 0.4 0.5U2 0.6 0.2 0.2U3 0.3 0.4 0.3

Given :

Observation : RRGGBRGR

State Sequence : ??

Not So Easily Computable.

and

R G BU1 0.3 0.5 0.2U2 0.1 0.4 0.5U3 0.6 0.1 0.3

Page 14: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Hidden Markov Model

Set of states : S Output Alphabet : V Transition Probabilities : A = {aij} Emission Probabilities : B = {bj(ok)} Initial State Probabilities : π

),,( BA

Page 15: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Hidden Markov Model

Here : S = {U1, U2, U3} V = { R,G,B}

For observation: O ={o1… on}

And State sequence Q ={q1… qn}

π is

U1 U2 U3

U1 0.1 0.4 0.5

U2 0.6 0.2 0.2

U3 0.3 0.4 0.3

R G B

U1 0.3 0.5 0.2

U2 0.1 0.4 0.5

U3 0.6 0.1 0.3

A =

B=

)( 1 ii UqP

Page 16: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Three Basic Problems of HMM

1. Given Observation Sequence O ={o1… on} Efficiently estimate

2. Given Observation Sequence O ={o1… on} Get best Q ={q1… qn}

3. How to adjust to best maximize

)|( OP

),,( BA )|( OP

Page 17: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Solutions

Problem 1: Likelihood of a sequence Forward Procedure Backward Procedure

Problem 2: Best state sequence Viterbi Algorithm

Problem 3: Re-estimation Baum-Welch ( Forward-Backward Algorithm )

Page 18: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Problem 1

T

T

qqqqq

Tqq

T

ttt

QPQOPOP

QOPOP

aaQP

obob

qoPQOP

TT

T

)|(),|()|(Then,

)|,()|(know, We

....)|( And

)( ... ).(

),|(),|(

,Then }...q{q Q And

}...o{oO:Consider

1211

1 1

1

T1

T1

Page 19: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Problem 1

T

TTTqq

Tqqqqqqq obobaaOP...

11

11211)( ... ).(....)|(

Order 2TNT

Definitely not efficient!! Is there a method to tackle this problem? Yes.

Forward or Backward Procedure

Page 20: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Forward Procedure

model given the ,...n observatio partial theof and , is position at state y that theprobabilit The

)|,...()(as variableForward Define

1

1

t

i

ittt

ooSt

SqooPi

Forward Step:

Page 21: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Forward Procedure

Page 22: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Backward Procedure

Page 23: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Backward Procedure

Page 24: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Forward Backward Procedure

Benefit Order

N2T as compared to 2TNT for simple computation

Only Forward or Backward procedure needed for Problem 1

Page 25: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Problem 2

Given Observation Sequence O ={o1… oT} Get “best” Q ={q1… qT} i.e.

Solution :1. Best state individually likely at a position i2. Best state given all the previously observed

states and observations Viterbi Algorithm

Page 26: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Viterbi Algorithm• Define such that,

i.e. the sequence which has the best joint probability so far.

• By induction, we have,

Page 27: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Viterbi Algorithm

Page 28: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Viterbi Algorithm

Page 29: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Problem 3

How to adjust to best maximize Re-estimate λ

Solutions : To re-estimate (iteratively update and improve)

HMM parameters A,B, π Use Baum-Welch algorithm

),,( BA )|( OP

Page 30: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Baum-Welch Algorithm

Define

Putting forward and backward variables

Page 31: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Baum-Welch algorithm

Page 32: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Define

Then, expected number of transitions from Si

And, expected number of transitions from Sj to Si

Page 33: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process
Page 34: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Baum-Welch Algorithm

Baum et al have proved that the above equations lead to a model as good or better than the previous

Page 35: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

Questions?

Page 36: Manish Shrivastava - CSE, IIT Bombaypb/cs344-2014/hmm-manish-march-2014.… · Manish Shrivastava. A Simple Process a b a b S1 S2 A simple automata. A Slightly Complicated Process

References

Lawrence R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 77 (2), p. 257–286, February 1989.

Chris Manning and Hinrich Schütze, Chapter 9: Markov Models,Foundations of Statistical Natural Language Processing, MIT Press. Cambridge, MA: May 1999