July 2010 eNTERFACE
Application: Surveillance in trains
Video and audio processing, sound localization, pattern recognition
Lip reading
Facial expression recognition
Automatic recognition of facial expressions and lipreading using vector flow
Model based approach
What makes visual speech recognition so hard?
Visemes
Smaller word separability
Speech info in audio > speech info in video
Lip-reading by Humans
People recognize speech better when the signal is both auditory and visual
The difference in recognition rates grows with the level of noise in the environment
[Figure: % correct responses (0 to 100) versus S/N (dB), from noisy to clear, for audio only (A) and audio plus video (A+V)]
Inspiration
In Stanley Kubrick's 1968 film 2001: A Space Odyssey, the computer HAL 9000 reads the conversation of two astronauts from their lip movements.
Thirty years later, automated lip-reading has become a significant part of research on speech recognition systems.
New speech corpus
AV speech corpus
Databases of different quality and resolution
Recording a new speech corpus
AV speech corpus
Visemes|Corpus|Tracking|Features
Applications|Problem|ASR|VSR|Training|Analysis|Conclusion|Recommendations
New speech corpus
Dutch
Recorded at high speed: 100 fps
Front and profile views included
70 people: 49 male, 21 female
Students, professors, secretaries, friends
Utterances: sentences, digits, spelling, conversation starters/endings, open questions
Normal, fast, whispering
AV speech corpus
ISFER Workbench
Examples (continued)
Active Contours
Internal and external energies
Internal energy forces the contour to shrink
Locally defined external energy forces the contour to stop at the edge of the mouth
Computationally cheap
Sensitive to the initial setting of the contour
[Figure: example grid of numeric values]
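The interplay of the two energies can be illustrated with a minimal greedy snake on a synthetic image. This is only a sketch under stated assumptions, not the method used in the talk: the function names, the weights `alpha` and `beta`, the 3x3 search neighbourhood and the test image are all made up for illustration. The internal energy is the sum of squared segment lengths (which pulls the closed contour inwards), and the local external energy rewards positions with a strong intensity gradient.

```python
import math

def edge_strength(image, r, c):
    """Squared central-difference gradient magnitude at (r, c); 0 on the border."""
    rows, cols = len(image), len(image[0])
    if r <= 0 or c <= 0 or r >= rows - 1 or c >= cols - 1:
        return 0.0
    gr = (image[r + 1][c] - image[r - 1][c]) / 2.0
    gc = (image[r][c + 1] - image[r][c - 1]) / 2.0
    return gr * gr + gc * gc

def total_energy(image, pts, alpha, beta):
    """Internal energy (squared segment lengths) minus weighted edge strength."""
    n = len(pts)
    energy = 0.0
    for i, (r, c) in enumerate(pts):
        nr, nc = pts[(i + 1) % n]
        energy += alpha * ((r - nr) ** 2 + (c - nc) ** 2)
        energy -= beta * edge_strength(image, r, c)
    return energy

def local_energy(image, p, prev, nxt, alpha, beta):
    """The terms of total_energy that depend on the single point p."""
    internal = ((p[0] - prev[0]) ** 2 + (p[1] - prev[1]) ** 2
                + (p[0] - nxt[0]) ** 2 + (p[1] - nxt[1]) ** 2)
    return alpha * internal - beta * edge_strength(image, p[0], p[1])

def greedy_snake(image, pts, alpha=1.0, beta=1.0, max_iters=50):
    """Move each point to the best position in its 3x3 neighbourhood until stable."""
    rows, cols = len(image), len(image[0])
    pts = list(pts)
    n = len(pts)
    for _ in range(max_iters):
        moved = False
        for i in range(n):
            prev, nxt = pts[i - 1], pts[(i + 1) % n]
            r0, c0 = pts[i]
            best = pts[i]
            best_e = local_energy(image, best, prev, nxt, alpha, beta)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    cand = (r0 + dr, c0 + dc)
                    if not (0 <= cand[0] < rows and 0 <= cand[1] < cols):
                        continue
                    e = local_energy(image, cand, prev, nxt, alpha, beta)
                    if e < best_e:
                        best, best_e = cand, e
            if best != pts[i]:
                pts[i] = best
                moved = True
        if not moved:
            break
    return pts

# Synthetic "mouth": a dark rectangle on a bright background (hypothetical data).
image = [[0.0 if 8 <= r <= 12 and 6 <= c <= 14 else 1.0 for c in range(21)]
         for r in range(21)]
# Initial contour: 8 points on a circle of radius 9 around the image centre.
init = [(10 + round(9 * math.sin(2 * math.pi * k / 8)),
         10 + round(9 * math.cos(2 * math.pi * k / 8))) for k in range(8)]
final_pts = greedy_snake(image, init, alpha=1.0, beta=200.0)
print("energy before:", total_energy(image, init, 1.0, 200.0))
print("energy after: ", total_energy(image, final_pts, 1.0, 200.0))
```

Because every point only looks at its immediate neighbourhood, each step is cheap, but a contour that starts far from the mouth or inside a flat region has no edge information to guide it, which is the sensitivity to initialisation noted above.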
Template Matching
Internal and external energies
Internal energy forces the template to maintain its geometry
Globally defined external energy forces appropriate placement on the picture
Better results than with snakes
Integration of energy functions at each step can be very time consuming
Model
Goal: lip-reading
Needed: accurate description of the visible parts of the articulatory system
Accurate description of the shape of the mouth:
measurements of the distance from the lip to the center of the mouth
measurements of the thickness of the visible part of the lips
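Data processing (continued)
Filtered image: intensity distribution $I(x, y)$, center of mouth $(EX, EY)$
Image in polar coordinates: $\hat{I}(\alpha, r) = I(EX + r\sin\alpha,\ EY + r\cos\alpha)$
Conditional distribution: $\hat{I}(\alpha, r) \approx \mathrm{Gauss}(r;\ m_\alpha, \sigma_\alpha)$
Mean and variance functions: $M(\alpha) = m_\alpha$, $V(\alpha) = \sigma_\alpha^2$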
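The processing chain above (filtered image, polar resampling around the mouth center, mean and variance per direction) can be sketched as follows. This is a hedged illustration, not the original implementation: the function name, the nearest-pixel sampling, the use of intensity-weighted moments instead of an explicit Gaussian fit, the choice of 18 directions and the synthetic ring image are all assumptions.

```python
import math

def polar_mean_variance(image, ex, ey, n_angles=18, r_max=None):
    """Per direction alpha, intensity-weighted mean and variance of the radius r.

    image[y][x] is the filtered image, (ex, ey) the assumed mouth center.
    """
    rows, cols = len(image), len(image[0])
    if r_max is None:
        r_max = max(rows, cols)
    means, variances = [], []
    for k in range(n_angles):
        alpha = 2.0 * math.pi * k / n_angles
        w_sum = wr_sum = wrr_sum = 0.0
        for r in range(r_max):
            # Polar sampling: I_hat(alpha, r) = I(EX + r sin(alpha), EY + r cos(alpha))
            x = round(ex + r * math.sin(alpha))
            y = round(ey + r * math.cos(alpha))
            if not (0 <= x < cols and 0 <= y < rows):
                break  # the ray has left the image
            w = image[y][x]
            w_sum += w
            wr_sum += w * r
            wrr_sum += w * r * r
        if w_sum > 0.0:
            m = wr_sum / w_sum
            v = max(wrr_sum / w_sum - m * m, 0.0)
        else:
            m = v = 0.0  # no lip-coloured pixels along this ray
        means.append(m)
        variances.append(v)
    return means, variances

# Synthetic filtered image: a ring of "lip" pixels between radius 4 and 6.
size, ex, ey = 21, 10, 10
ring = [[1.0 if 4.0 <= math.hypot(x - ex, y - ey) <= 6.0 else 0.0
         for x in range(size)] for y in range(size)]
means, variances = polar_mean_variance(ring, ex, ey)
feature_vector = means + variances  # one vector per frame
print(len(feature_vector), means[:3], variances[:3])
```

In this reading the mean along a ray approximates the distance from the mouth center to the lip, and the variance reflects the visible lip thickness in that direction, which matches the two kinds of measurements the model asks for.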
Data visualization
Single-frame data vector: $(m_1, \dots, m_{18},\ \sigma_1, \dots, \sigma_{18})$
Results of Experiments
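Feed-forward network trained with backpropagation (BP)
Example utterance (Dutch): "Vanmiddag komt de pianostemmer langs om mijn vleugel te stemmen" ("This afternoon the piano tuner is coming by to tune my grand piano")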
Tracking the face – Optical flow
Capturing the apparent motion between subsequent images in a grid of motion vectors
Advantages: no lip model required; good at capturing motion
Disadvantage: slow
Face tracking
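A grid of motion vectors can be estimated in several ways; the sketch below uses exhaustive block matching with a sum of absolute differences (SAD), which is a stand-in chosen for simplicity and not necessarily the optical-flow method used in the talk. The function name, block size, search range and the random test frames are assumptions for illustration.

```python
import random

def block_matching_flow(prev, curr, block=4, search=3):
    """Return {(row, col): (dy, dx)} for a grid of blocks via exhaustive SAD search."""
    rows, cols = len(prev), len(prev[0])
    flow = {}
    # Keep a margin of `search` pixels so every candidate block stays inside the image.
    for by in range(search, rows - block - search + 1, block):
        for bx in range(search, cols - block - search + 1, block):
            best, best_sad = (0, 0), float("inf")
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    sad = 0.0
                    for y in range(by, by + block):
                        for x in range(bx, bx + block):
                            sad += abs(prev[y][x] - curr[y + dy][x + dx])
                    if sad < best_sad:
                        best, best_sad = (dy, dx), sad
            flow[(by, bx)] = best
    return flow

# Hypothetical test data: a random texture shifted down by 1 and right by 2 pixels.
rng = random.Random(0)
size, shift_y, shift_x = 24, 1, 2
prev = [[rng.random() for _ in range(size)] for _ in range(size)]
curr = [[prev[y - shift_y][x - shift_x]
         if 0 <= y - shift_y < size and 0 <= x - shift_x < size else 0.0
         for x in range(size)] for y in range(size)]
flow = block_matching_flow(prev, curr)
print(flow)
```

The four nested loops make the cost visible: every block is compared against every candidate displacement pixel by pixel, which is one reason dense motion estimation is slow compared with tracking a handful of model points.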
Tracking the face – Lip Geometry Estimation
Applying color filters and capturing the lip contours in polar coordinates
Advantages: no lip model required; more or less person-independent
Disadvantage: not robust to external factors
Face tracking
Tracking the face – Active Appearance Models
Point tracking according to a statistical lip model
Disadvantage: requires annotated training images
Advantages: robust against external factors; fast
Face tracking
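To give a feel for what a statistical lip model learned from annotated images looks like, the sketch below builds only the shape half of such a model: a mean shape plus one main mode of variation, found with power iteration. It is a simplification under stated assumptions; a full Active Appearance Model also models texture and fits both to new images iteratively, and the function names, the four-point lip annotation and the training data here are hypothetical.

```python
import math

def mean_shape(shapes):
    """Component-wise mean of shape vectors [x1, y1, x2, y2, ...]."""
    n = len(shapes)
    return [sum(s[j] for s in shapes) / n for j in range(len(shapes[0]))]

def main_mode(shapes, iters=100):
    """Mean shape and first principal mode, computed by power iteration."""
    mean = mean_shape(shapes)
    centered = [[s[j] - mean[j] for j in range(len(mean))] for s in shapes]
    # Start from the largest deviation so the start vector lies in the data subspace.
    v = max(centered, key=lambda d: sum(x * x for x in d))
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("all training shapes are identical")
    v = [x / norm for x in v]
    for _ in range(iters):
        # Covariance times v without forming the matrix: C v = sum_i d_i (d_i . v) / n
        w = [0.0] * len(v)
        for d in centered:
            dot = sum(a * b for a, b in zip(d, v))
            for j, a in enumerate(d):
                w[j] += a * dot / len(centered)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(shape, mean, mode):
    """Model parameter b that best explains a shape along the main mode."""
    return sum((s - m) * v for s, m, v in zip(shape, mean, mode))

def synthesize(mean, mode, b):
    """Shape generated by the model for parameter b."""
    return [m + b * v for m, v in zip(mean, mode)]

# Hypothetical annotations: 4 lip points (left corner, top, right corner, bottom).
base = [0.0, 0.0, 2.0, 1.0, 4.0, 0.0, 2.0, -1.0]
opening = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -1.0]  # mouth opens vertically
training = [[p + b * o for p, o in zip(base, opening)]
            for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]
mean, mode = main_mode(training)
print(mean)
print(mode)
```

Once the model is trained, a whole lip shape is described by a few parameters such as `b`, so tracking reduces to updating those parameters per frame; the price is the annotated training set needed to learn the mean and the modes.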
Active Appearance Models – Design of the lip model
Face tracking
AAM model point coordinates
Face tracking
[Figure: three panels of features plotted for “F”; x-axis: time (frames), 0 to 250; y-axis: -2.5 to 2.5]
Feature extraction
5-state HMM
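The slide only names the model size, so the following is a minimal sketch of how a 5-state HMM could score a sequence of observations, under assumptions that are not in the source: a left-to-right topology, discrete observation symbols, and made-up transition and emission probabilities. A real lip-reading system would more likely use continuous feature vectors and trained parameters.

```python
from itertools import product

def left_to_right_hmm(n_states=5, stay=0.6):
    """Initial and transition probabilities of a left-to-right HMM (assumed topology)."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = stay            # remain in the same state
        trans[i][i + 1] = 1.0 - stay  # advance to the next state
    trans[-1][-1] = 1.0               # the last state only loops
    initial = [1.0] + [0.0] * (n_states - 1)
    return initial, trans

def forward_likelihood(obs, initial, trans, emit):
    """P(obs | model) via the forward algorithm; obs is a non-empty list of symbols."""
    n = len(initial)
    alpha = [initial[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

initial, trans = left_to_right_hmm()
# Hypothetical emissions: state s mostly emits symbol s.
emit = [[0.8 if sym == s else 0.05 for sym in range(5)] for s in range(5)]

in_order = [0, 0, 1, 2, 3, 4]   # follows the state order
reversed_seq = [4, 3, 2, 1, 0, 0]
p_in_order = forward_likelihood(in_order, initial, trans, emit)
p_reversed = forward_likelihood(reversed_seq, initial, trans, emit)
# Sanity check: the likelihoods of all length-3 sequences should sum to 1.
total = sum(forward_likelihood(list(seq), initial, trans, emit)
            for seq in product(range(5), repeat=3))
print(p_in_order, p_reversed, total)
```

For recognition, one such model would be trained per unit to be recognized and an observed feature sequence assigned to the model with the highest likelihood.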
Automatic bi-modal human emotion recognition
Automatic recognition of facial expressions using Active Appearance Models
Model based approach
Face localization
User-interface prototype iCat to help users in daily tasks.
M.A.E.L.I.A. Our digital cat
H.C.I. Group
Requirements in other words…
Are you out of your mind? I am sleeping!!!
Get a life! I am still sleeping!
I am so bored! I wish I had a companion!
7:00 AM 8:00 AM
11:00 AM 2:00 PM
I feel so lonely!!! I am very sad and depressed.
4:00 PM
Finally I have a friend! I am so happy and I even managed to pick up the bone! Wow!!!
AIBO! Bring me my newspaper!!!
AIBO! Let’s play!!! Follow me
AIBO! Let’s play!!! Follow me
Multimodal Communication
Uh, ….
I have no time to do anything with you
Hello, would you like to chat with me?
Uh, what a nerd
I want a date
She looks nice
Multi-modal interaction
Would you like to join me for dinner?
Chat-session
A cup of tea?
Mmh, njeh, I don’t like tea.
What’s wrong with tea?
Tea makes me sick.
That’s nonsense!!
And my sister doesn’t like you too!
She is very disappointed!!
Hihi, I was joking!!!
Oh, that’s funny!!!
Chat-session
(f) A cup of tea? : - )
(m) Mmh, njeh, I don’t like tea. (: - (
(f) What’s wrong with tea? : - o
(m) Tea makes me sick. % - \
(f) That’s nonsense!! : - l l
(f) My sister doesn’t like you too! : - l l
(f) She is very disappointed!! : - (
(m) Hihi, I was joking!!! ; - )
(f) Oh, that’s funny!!! : - ]
A cup of tea?
: - )
Mmh, njeh, I don’t like tea.
(: - (
What’s wrong with tea?
: - o
Tea makes me sick.
% - \
That’s nonsense!!
: - l l
My sister doesn’t like you too!
: - l l
She is very disappointed!!
: - (
Hihi, I was joking!!!
; - )
Oh, that’s funny!!!
: - ]