Cognitive Computer Vision 3R400, Kingsley Sage, Room 5C16, Pevensey III, khs20@sussex.ac.uk
![Page 2: Cognitive Computer Vision 3R400 Kingsley Sage Room 5C16, Pevensey III khs20@sussex.ac.uk](https://reader036.vdocuments.us/reader036/viewer/2022082517/56649d9c5503460f94a84b21/html5/thumbnails/2.jpg)
Markov Models - Seminar
– Practical issues in computing the Baum-Welch re-estimation formulae
– Choosing the number of hidden nodes
– Generative modelling and stochastic sampling
– Coursework
Computing the BW parameters (1)
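Choose $\lambda = (\pi, A, B)$ at random (subject to probability constraints)

$A$: N hidden states × N hidden states, each row sums to 1

| A | Sunny | Rain | Wet |
|---|---|---|---|
| Sunny | 0.6 | 0.3 | 0.1 |
| Rain | 0.1 | 0.6 | 0.3 |
| Wet | 0.2 | 0.2 | 0.6 |

$B$: N hidden states × M observable states, each row sums to 1

| B | Red | Green | Blue |
|---|---|---|---|
| Sunny | 0.8 | 0.1 | 0.1 |
| Rain | 0.1 | 0.8 | 0.1 |
| Wet | 0.2 | 0.2 | 0.6 |

$\pi$: 1 × N hidden states, sums to 1

| $\pi$ | Sunny | Rain | Wet |
|---|---|---|---|
| | 0.6 | 0.3 | 0.1 |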
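
A minimal sketch of one way to draw a random $\lambda = (\pi, A, B)$ that satisfies the probability constraints (every row sums to 1). The function names `random_distribution` and `random_hmm` are illustrative and not from the seminar; drawing uniform weights and normalising them is only one of several reasonable choices.

```python
import random

def random_distribution(size, rng):
    """Return `size` positive numbers that sum to 1."""
    # The small offset avoids exact zeros, which Baum-Welch can never move away from.
    weights = [rng.random() + 1e-6 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def random_hmm(n_hidden, n_symbols, seed=None):
    """Draw a random lambda = (pi, A, B) under the row-sum constraints."""
    rng = random.Random(seed)
    pi = random_distribution(n_hidden, rng)                             # 1 x N
    A = [random_distribution(n_hidden, rng) for _ in range(n_hidden)]   # N x N
    B = [random_distribution(n_symbols, rng) for _ in range(n_hidden)]  # N x M
    return pi, A, B

# Three hidden states (Sunny, Rain, Wet) and three observable states (Red, Green, Blue).
pi, A, B = random_hmm(n_hidden=3, n_symbols=3, seed=0)
```

Passing a fixed `seed` makes the starting point repeatable, which helps when comparing runs of the re-estimation.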
Computing the BW parameters (2)
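We want to be able to calculate:
$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}$$
– $\alpha_t(i)$ comes from forwards evaluation
– $\beta_{t+1}(j)$ comes from backwards evaluation
– Given $O$
– Have initial values for $a_{ij}$ and $b_j(O_{t+1})$
– Can calculate $P(O \mid \lambda)$ but do we need to?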
Computing the BW parameters (2)
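Can calculate $P(O \mid \lambda)$ but do we need to?
$$\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}$$
– $P(O \mid \lambda)$ is a normalising constant and is the same value for all $\xi_t(i,j)$ for any individual iteration
– Can ignore $P(O \mid \lambda)$ if we re-normalise $\lambda = (\pi, A, B)$ at the end of the re-estimation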
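
The point about the normalising constant can be checked numerically. The sketch below uses the $\pi$, A and B values from the earlier slide; the observation sequence `O` is a hypothetical one chosen only for illustration (0 = Red, 1 = Green, 2 = Blue). It re-estimates A twice, once dividing by $P(O \mid \lambda)$ and once ignoring it, and re-normalises the rows in both cases.

```python
PI = [0.6, 0.3, 0.1]
A = [[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
O = [0, 0, 1, 2]  # hypothetical observation sequence, not from the seminar

def forward(pi, A, B, O):
    """Unscaled alpha trellis: alpha[t][i]."""
    N = len(pi)
    alpha = [[pi[i] * B[i][O[0]] for i in range(N)]]
    for t in range(1, len(O)):
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][O[t]]
                      for j in range(N)])
    return alpha

def backward(A, B, O):
    """Unscaled beta trellis: beta[t][i], with beta at the final step set to 1."""
    N = len(A)
    beta = [[1.0] * N]
    for t in range(len(O) - 2, -1, -1):
        nxt = beta[0]  # beta at time t+1
        beta.insert(0, [sum(A[i][j] * B[j][O[t + 1]] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta

def xi_unnormalised(alpha, beta, A, B, O, t):
    """Numerator of xi_t(i,j), i.e. without the division by P(O | lambda)."""
    N = len(A)
    return [[alpha[t][i] * A[i][j] * B[j][O[t + 1]] * beta[t + 1][j]
             for j in range(N)] for i in range(N)]

alpha = forward(PI, A, B, O)
beta = backward(A, B, O)
p_obs = sum(alpha[-1])  # P(O | lambda)

def reestimate_A(divide_by_p_obs):
    N = len(A)
    scale = p_obs if divide_by_p_obs else 1.0
    counts = [[0.0] * N for _ in range(N)]
    for t in range(len(O) - 1):
        xi = xi_unnormalised(alpha, beta, A, B, O, t)
        for i in range(N):
            for j in range(N):
                counts[i][j] += xi[i][j] / scale
    # Re-normalise each row so that it sums to 1.
    return [[c / sum(row) for c in row] for row in counts]

A_with = reestimate_A(divide_by_p_obs=True)
A_without = reestimate_A(divide_by_p_obs=False)
```

Because every entry is divided by the same constant, it cancels in the row normalisation, so `A_with` and `A_without` agree up to floating-point rounding.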
Computing the BW parameters (3)
Recall the Scaling Factor $SF_t$ from the previous seminar …
– Intended to prevent arithmetic underflow when calculating the $\alpha$ and $\beta$ trellises
– Calculate $SF_t$ using the $\alpha$ trellis and apply the same factors to the $\beta$ trellis
– $SF_t$ for $\alpha_t$ = $SF_t$ for $\beta_{t+1}$ (think why …)
$$SF_t = \sum_{i=1}^{N} \alpha_t(i)$$
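
A minimal sketch of a scaled forward pass, assuming the common convention that each column of the $\alpha$ trellis is divided by its own sum before the next step is computed. Only the $\alpha$ side is shown; the function name `scaled_forward` and both observation sequences are illustrative assumptions. Under this convention the product of the scaling factors equals $P(O \mid \lambda)$, so the log likelihood is the sum of their logs.

```python
import math

PI = [0.6, 0.3, 0.1]
A = [[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]

def scaled_forward(pi, A, B, O):
    """Return (scaled alpha trellis, scaling factors SF_t, log P(O | lambda))."""
    N = len(pi)
    alpha_hat, scale_factors = [], []
    for t, symbol in enumerate(O):
        if t == 0:
            row = [pi[i] * B[i][symbol] for i in range(N)]
        else:
            prev = alpha_hat[-1]  # already scaled, so it sums to 1
            row = [sum(prev[i] * A[i][j] for i in range(N)) * B[j][symbol]
                   for j in range(N)]
        sf = sum(row)  # SF_t: sum over hidden states of alpha at time t
        alpha_hat.append([v / sf for v in row])
        scale_factors.append(sf)
    log_likelihood = sum(math.log(sf) for sf in scale_factors)
    return alpha_hat, scale_factors, log_likelihood

short_O = [0, 0, 1, 2]        # hypothetical short sequence
long_O = [0, 1, 2] * 1000     # hypothetical long sequence (3000 symbols)

_, _, short_ll = scaled_forward(PI, A, B, short_O)
long_alpha, long_sf, long_ll = scaled_forward(PI, A, B, long_O)
```

On the long sequence every scaled column still sums to 1 and `long_ll` stays finite, whereas the raw $\alpha$ values shrink at every step and would eventually fall below what a float can represent.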
Computing the BW parameters (4)
Everything else should now be straightforward …
Except … how to choose the number of hidden nodes N
Choosing N (1)
What is the actual complexity of the underlying task?
Too many nodes – over-learning and lack of generalisation capability (model learns precisely only those patterns that occur in O)
Too few nodes – over-generalisation (model has not adequately captured the dynamics of the underlying task)
Same problem as deciding how many hidden nodes there should be for a neural network
Choosing N (2)
[Figure: log likelihood per symbol (vertical axis) against N (horizontal axis), with the optimal point marked; little additional performance with increasing N beyond it.]
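
One way to turn the "little additional performance with increasing N" observation into a rule is to stop at the smallest N after which the gain in log likelihood per symbol drops below a threshold. The sketch below is an assumption, not the seminar's method: `choose_n` and `min_gain` are hypothetical names, and the numbers are made up purely to exercise the function.

```python
def choose_n(log_lik_per_symbol, min_gain=0.01):
    """Return the smallest N after which the next candidate gains less than min_gain.

    log_lik_per_symbol maps each candidate N to the log likelihood per symbol
    of a model trained with that many hidden nodes.
    """
    candidates = sorted(log_lik_per_symbol)
    for n, n_next in zip(candidates, candidates[1:]):
        gain = log_lik_per_symbol[n_next] - log_lik_per_symbol[n]
        if gain < min_gain:
            return n
    return candidates[-1]  # still improving at the largest N tried

# Made-up values for illustration only (not results from the seminar).
illustrative = {2: -1.60, 4: -1.10, 6: -0.80, 8: -0.62, 10: -0.615, 12: -0.613}
best_n = choose_n(illustrative)
```

With these illustrative values the gains are 0.50, 0.30, 0.18 and then 0.005, so the function settles on N = 8. Scoring each candidate on sequences held back from training would also guard against the over-learning case.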
Generative modelling (1)
OK, so now we know what a (Hidden) Markov Model is, and how to learn its parameters, but how is this all relevant to Cognitive/Computer Vision?
– (H)MMs are generative models
– Perception guided by expectation
– Visual control
– An example visual task …
Generative modelling (2)
Simple case study: Visual task
[Figure: the visual task, with labels 1 to 5]
Training sequence: {3,3,2,2,2,2,5,5,4,4,3,3,1,1,1}
Generative modelling (3)
Example sequence 1 generated by HMM
5 observed states & 14 hidden states
Generative modelling (4)
Example sequence 2 generated by HMM
5 observed states & 14 hidden states
Stochastic sampling
To generate a sequence from $\lambda = (\pi, A, B)$:
Select starting state according to the $\pi$ distribution
FOR t = 1:T
– Generate $h_t(N)$ (a 1×N distribution over hidden states) using A (part of the trellis computation)
– Select a state q according to the $h_t(N)$ distribution
– Generate $o_t(M)$ (a 1×M distribution over output symbols) using q and B
– Select an output symbol $o_t$ according to the $o_t(M)$ distribution
END_FOR
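
The sampling loop above can be sketched in Python as follows. This is a minimal illustration under one assumption that the slide leaves open: the starting state drawn from $\pi$ emits the first symbol, and transitions through A begin at the second time step. The helper names `sample_index` and `generate_sequence` are hypothetical; the parameter values are those of the Sunny/Rain/Wet example.

```python
import random

PI = [0.6, 0.3, 0.1]
A = [[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
STATES = ["Sunny", "Rain", "Wet"]
SYMBOLS = ["Red", "Green", "Blue"]

def sample_index(distribution, rng):
    """Draw an index with the probability given by `distribution`."""
    return rng.choices(range(len(distribution)), weights=distribution, k=1)[0]

def generate_sequence(pi, A, B, T, seed=None):
    """Return (hidden state indices, output symbol indices), each of length T."""
    rng = random.Random(seed)
    states, symbols = [], []
    q = sample_index(pi, rng)            # starting state from pi
    for t in range(T):
        if t > 0:
            q = sample_index(A[q], rng)  # row of A for the previous state
        states.append(q)
        symbols.append(sample_index(B[q], rng))  # row of B for the current state
    return states, symbols

hidden, observed = generate_sequence(PI, A, B, T=10, seed=42)
print([STATES[q] for q in hidden])
print([SYMBOLS[o] for o in observed])
```

Each call with a different seed gives a different sequence drawn from the same model, which is what makes the HMM usable as a generative model.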