
Cognitive Computer Vision 3R400

Kingsley Sage

Room 5C16, Pevensey III

[email protected]

Markov Models - Seminar

Practical issues in computing the Baum Welch re-estimation formulae

Choosing the number of hidden nodes

Generative modelling and stochastic sampling

Coursework

Computing the BW parameters (1)

Choose $\lambda = (\pi, A, B)$ at random (subject to probability constraints)

A (N hidden states by N hidden states; each row sums to 1)

A        Sunny   Rain    Wet
Sunny    0.6     0.3     0.1
Rain     0.1     0.6     0.3
Wet      0.2     0.2     0.6

B (N hidden states by M observable states; each row sums to 1)

B        Red     Green   Blue
Sunny    0.8     0.1     0.1
Rain     0.1     0.8     0.1
Wet      0.2     0.2     0.6

$\pi$ (N hidden states; sums to 1)

         Sunny   Rain    Wet
$\pi$    0.6     0.3     0.1
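A random starting point that respects the probability constraints can be sketched as follows. This is a minimal illustration, not the seminar's own code: the helper names `random_stochastic` and `random_hmm`, the use of NumPy, and the small positive offset are all assumptions.

```python
import numpy as np

def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix of positive entries whose rows each sum to 1."""
    m = rng.random((rows, cols)) + 1e-3   # assumed offset: keeps every entry strictly positive
    return m / m.sum(axis=1, keepdims=True)

def random_hmm(n_hidden, m_symbols, seed=0):
    """Draw lambda = (pi, A, B) at random, subject to the probability constraints."""
    rng = np.random.default_rng(seed)
    pi = random_stochastic(1, n_hidden, rng)[0]      # initial distribution, length N
    A = random_stochastic(n_hidden, n_hidden, rng)   # N x N, rows sum to 1
    B = random_stochastic(n_hidden, m_symbols, rng)  # N x M, rows sum to 1
    return pi, A, B

pi, A, B = random_hmm(n_hidden=3, m_symbols=3)
```

Keeping every entry strictly positive is a design choice here: a transition or output probability that starts at exactly zero stays at zero under Baum Welch re-estimation.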


We want to be able to calculate:

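$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}$$

$\alpha_t(i)$ comes from forwards evaluation

$\beta_{t+1}(j)$ comes from backwards evaluation

Given $O$

Have initial values for $a_{ij}$ and $b_j(O_{t+1})$

Can calculate $P(O \mid \lambda)$, but do we need to?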



$P(O \mid \lambda)$ is a normalising constant and has the same value for all $\xi_t(i,j)$ within any individual iteration

Can ignore $P(O \mid \lambda)$ if we re-normalise $\lambda = (\pi, A, B)$ at the end of the re-estimation
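The point about the normalising constant can be checked numerically. The sketch below uses the example values of $\pi$, A and B from the tables earlier in this seminar; the observation sequence `obs` and all function names are assumptions made for illustration. It re-estimates A twice, once by row re-normalisation without $P(O \mid \lambda)$ and once with the explicit division, and the two routes should agree.

```python
import numpy as np

pi = np.array([0.6, 0.3, 0.1])
A = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
B = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
obs = [0, 0, 1, 1, 2, 2, 0]   # hypothetical sequence: 0=Red, 1=Green, 2=Blue

def forward(pi, A, B, obs):
    """Unscaled forward trellis: alpha[t, i]."""
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    """Unscaled backward trellis: beta[t, i], with beta at the final step set to 1."""
    beta = np.ones((len(obs), A.shape[0]))
    for t in range(len(obs) - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def xi_unnormalised(alpha, beta, A, B, obs):
    """xi[t, i, j] = alpha_t(i) a_ij b_j(O_{t+1}) beta_{t+1}(j), not divided by P(O|lambda)."""
    return np.array([
        alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        for t in range(len(obs) - 1)
    ])

alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
xi = xi_unnormalised(alpha, beta, A, B, obs)

# Route 1: ignore P(O|lambda) and re-normalise each row of the summed counts.
A_renorm = xi.sum(axis=0)
A_renorm /= A_renorm.sum(axis=1, keepdims=True)

# Route 2: divide by P(O|lambda) explicitly, then take the usual ratio.
p_obs = alpha[-1].sum()
xi_exact = xi / p_obs
gamma = xi_exact.sum(axis=2)                        # gamma_t(i) = sum_j xi_t(i, j)
A_exact = xi_exact.sum(axis=0) / gamma.sum(axis=0)[:, None]
```

Note that this shortcut relies on the constant being the same at every t, which holds for the unscaled trellises used here.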


Recall the scaling factor $SF_t$ from the previous seminar …

Intended to prevent arithmetic underflow when calculating the $\alpha$ and $\beta$ trellises

Calculate $SF_t$ using the $\alpha$ trellis and apply the same factors to the $\beta$ trellis

$SF_t$ for $\beta_t$ = $SF_t$ for $\alpha_{t+1}$ (think why …)

$$SF_t = \sum_{i=1}^{N} \alpha_t(i)$$
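One way to realise this scaling is sketched below. It is an illustration under stated assumptions rather than the seminar's implementation: the function names and the observation sequence are made up, the model values are the example tables from earlier, and the backward pass assumes that the factor applied to the $\beta$ row at time t is the one computed for the $\alpha$ row at time t+1.

```python
import numpy as np

def forward_scaled(pi, A, B, obs):
    """Forward pass that rescales each trellis row to sum to 1; returns the rows and SF_t."""
    T, N = len(obs), len(pi)
    alpha_hat = np.zeros((T, N))
    sf = np.zeros(T)
    row = pi * B[:, obs[0]]
    for t in range(T):
        if t > 0:
            row = (alpha_hat[t - 1] @ A) * B[:, obs[t]]
        sf[t] = row.sum()              # SF_t: sum over i of the row before rescaling
        alpha_hat[t] = row / sf[t]
    return alpha_hat, sf

def backward_scaled(A, B, obs, sf):
    """Backward pass reusing the forward factors: the row at time t is divided by SF_{t+1}."""
    T, N = len(obs), A.shape[0]
    beta_hat = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta_hat[t] = (A @ (B[:, obs[t + 1]] * beta_hat[t + 1])) / sf[t + 1]
    return beta_hat

pi = np.array([0.6, 0.3, 0.1])
A = np.array([[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
obs = [0, 1, 2] * 500                  # hypothetical long sequence; unscaled values would underflow

alpha_hat, sf = forward_scaled(pi, A, B, obs)
beta_hat = backward_scaled(A, B, obs, sf)
log_lik = np.log(sf).sum()             # log P(O|lambda) recovered from the scale factors
```

With this pairing of factors, the product of the scaled $\alpha$ and $\beta$ rows at each t sums to 1 over the hidden states, which is a convenient sanity check.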


Everything else should now be straightforward …

Except … how to choose the number of hidden nodes N

Choosing N (1)

What is the actual complexity of the underlying task?

Too many nodes – over learning and lack of generalisation capability (model learns precisely only those patterns that occur in O)

Too few nodes – over generalisation (model has not adequately captured the dynamics of the underlying task)

Same problem as deciding how many hidden nodes there should be for a neural network

Choosing N (2)

[Figure: log likelihood per symbol (vertical axis) against the number of hidden nodes N (horizontal axis). Annotations: "Optimal point"; "Little additional performance with increasing N".]
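The idea of stopping where extra nodes bring little additional performance can be sketched as a simple rule. Everything here is an assumption for illustration: the function name `pick_n`, the threshold `min_gain`, and the scores, which are made-up numbers and not results from the seminar. In practice the scores would come from models trained at each candidate N.

```python
def pick_n(scores, min_gain):
    """Return the smallest candidate N after which the next candidate gains less than min_gain.

    scores maps N to log likelihood per symbol (closer to 0 is better).
    """
    ns = sorted(scores)
    for n, n_next in zip(ns, ns[1:]):
        if scores[n_next] - scores[n] < min_gain:
            return n
    return ns[-1]                      # no plateau found within the candidates tried

# Made-up numbers for illustration only.
example_scores = {2: -1.60, 4: -1.20, 6: -0.95, 8: -0.80, 10: -0.78, 12: -0.77}
best_n = pick_n(example_scores, min_gain=0.05)
```

The threshold is a judgment call: a larger `min_gain` favours smaller models and guards against over learning, at the risk of stopping too early.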

Generative modelling (1)

OK, so now we know what a (Hidden) Markov Model is, and how to learn its parameters, but how is this all relevant to Cognitive/Computer Vision?

– (H)MMs are generative models

– Perception guided by expectation

– Visual control

– An example visual task …

Generative modelling (2)

Simple case study: Visual task

[Figure: visual task, with labels 1 to 5]

Training sequence: {3,3,2,2,2,2,5,5,4,4,3,3,1,1,1}

Generative modelling (3)

Example sequence 1 generated by HMM

5 observed states & 14 hidden states

Generative modelling (4)

Example sequence 2 generated by HMM

5 observed states & 14 hidden states

Stochastic sampling

To generate a sequence from $\lambda = (\pi, A, B)$:

Select starting state according to the $\pi$ distribution

FOR t=1:T

– Generate $h_t(N)$ (a 1*N distribution) using A (part of the trellis computation)

– Select a state q according to the $h_t(N)$ distribution

– Generate $o_t(M)$ (a 1*M distribution) using q and B

– Select an output symbol $o_t$ according to the $o_t(M)$ distribution

END_FOR
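The sampling loop above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function name `sample_sequence` and the use of NumPy's random generator are not from the seminar, and the sketch emits the first symbol from the starting state itself rather than after a first transition. The model values are the example tables from earlier in the seminar.

```python
import numpy as np

def sample_sequence(pi, A, B, T, seed=None):
    """Draw T hidden states and T output symbols from lambda = (pi, A, B)."""
    rng = np.random.default_rng(seed)
    n_hidden, m_symbols = B.shape
    states, symbols = [], []
    h = pi                                   # distribution over the first hidden state
    for _ in range(T):
        q = rng.choice(n_hidden, p=h)        # select a state q according to h
        o = rng.choice(m_symbols, p=B[q])    # select an output symbol from row q of B
        states.append(int(q))
        symbols.append(int(o))
        h = A[q]                             # row q of A: distribution over the next state
    return states, symbols

pi = np.array([0.6, 0.3, 0.1])
A = np.array([[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
states, symbols = sample_sequence(pi, A, B, T=15, seed=0)
```

Because each step is a random draw, two runs with different seeds give different sequences from the same model, which is what the two example sequences on the earlier slides illustrate.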
