M4 – Video Processing, Brno University of Technology 1
M4 – Video Processing
Igor Potůček, Michal Španěl, Ibrahim Abu Kteish, Olivier Lai Kan Thon, Pavel Zemčík
Faculty of Information Technology,
Brno University of Technology, Czech Republic
M4 Meeting, September 2003, Delft, The Netherlands
Outline:
• Video processing block diagram
• Mouth parametrization
• Face detection and feature extraction
• Demo
• Other video work in Brno
• Conclusions
Gesture interpretation
Face (rough) positioning, recognition
Video Processing Block Diagram
[Block diagram; only the legend and block labels survive]
Legend: Finished, In progress, Planned
• Raw (hyperbolic mirror image)
• "Corrected" geometry
• Pre-identified body parts
• Image acquisition
• Colour-based methods
• Identification of face, eyes, mouth (hands)
• Gabor wavelet networks
• Detected mouth in large video (reference)
• Geometry/graphics
• Mouth parametrization
• Mouth area (motion and shape) in low res.
• Statistical methods
• Features for audio
• Pattern matching
• Annotation/control functions
• Annotation functions
Mouth Parametrization
• Mouth parametrization may be useful for enhancing speech recognition
• Low-resolution video does not allow "proper" lip tracking (in meeting rooms the heads are always small)
• A statistical mouth area parametrization is needed; however, we also need a reference method with a "visibly correct" mouth parametrization to compare the results against
• Two methods: "full-size" lip tracking and mouth parametrization, and a "statistical" approach for small face images. The "full-size" version serves only as a reference (no further research on it is currently intended)
Mouth Parametrization
Three main parameters: width w, height h, and curvature c
We hope these parameters are sufficient, since they can be extracted quite well from low-resolution video
If the parametrization model proves insufficient, it can be extended further (probably through subcontracting)
[Figure: mouth with the parameters w, h, and c marked]
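The three parameters can be illustrated with a minimal sketch. It assumes the mouth has already been segmented into a binary mask, and it defines curvature as the quadratic coefficient of a parabola fitted through the vertical centres of the mouth columns. Both the function name `mouth_parameters` and this definition of c are assumptions made for illustration; the slides do not specify how the parameters are computed.

```python
import numpy as np

def mouth_parameters(mask):
    """Estimate (w, h, c) from a binary mouth mask (2-D array, nonzero = mouth).

    Hypothetical helper: the definitions below are illustrative assumptions.
    """
    mask = np.asarray(mask).astype(bool)
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing mouth pixels
    if cols.size == 0:
        raise ValueError("empty mouth mask")

    # Width: horizontal extent of the mouth region, in pixels.
    w = int(cols[-1] - cols[0] + 1)

    # Height: number of mouth pixels in the fullest column
    # (equals the opening only if columns have no vertical gaps).
    h = int(mask.sum(axis=0).max())

    # Curvature: fit y = a*x^2 + b*x + d to the per-column vertical centres
    # and report a. Image rows grow downward, so c > 0 means the mouth
    # corners lie below the mouth centre.
    rows = np.arange(mask.shape[0])
    centres = np.array([rows[mask[:, x]].mean() for x in cols])
    if cols.size < 3:
        c = 0.0  # too few columns for a quadratic fit
    else:
        x = cols - cols.mean()
        c = float(np.polyfit(x, centres, 2)[0])
    return w, h, c
```

With these definitions all three values depend only on the coarse shape of the mask, which is in line with the hope that they remain extractable from small face images.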
Face Detection and Feature Extraction
Face detection is based on pre-processing of the colours in the video
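One common form of colour pre-processing is thresholding in normalised rg chromaticity space, sketched below. The slides do not say which colour space or thresholds are used, so the function name `skin_mask` and the threshold ranges are purely illustrative assumptions that would need tuning on real data.

```python
import numpy as np

def skin_mask(rgb, r_range=(0.36, 0.47), g_range=(0.28, 0.36)):
    """Return a boolean mask of candidate skin pixels in an H x W x 3 RGB image.

    The default ranges are illustrative assumptions, not values from the slides.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1)
    total = np.where(total == 0, 1.0, total)  # avoid division by zero on black pixels
    # Normalised chromaticity reduces the dependence on brightness.
    r = rgb[..., 0] / total
    g = rgb[..., 1] / total
    return ((r >= r_range[0]) & (r <= r_range[1]) &
            (g >= g_range[0]) & (g <= g_range[1]))
```

A mask like this only proposes candidate regions; positive identification of faces is left to a later stage.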
• It is necessary to positively identify the faces in the video
• Gabor wavelet networks seem to be a suitable instrument for that purpose (fixed-size networks with 56 or 112 wavelets; see the paper by M. Španěl)
• Training for general face detection and positioning (spatial orientation: 5 pan and 5 tilt positions, 15° step)
• Can be re-applied to facial features such as the eyes or mouth to fine-tune the results, and specifically to adjust the scale
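As a rough illustration of what a Gabor wavelet network represents, the sketch below evaluates a single odd Gabor wavelet on a pixel grid and forms an image approximation as a weighted sum of such wavelets. The function names and the parameter layout (centre, scales, rotation) are assumptions for illustration; the training that would choose the weights and wavelet parameters for a 56- or 112-wavelet network is not shown.

```python
import numpy as np

def odd_gabor(shape, cx, cy, sx, sy, theta):
    """Evaluate one odd Gabor wavelet on a grid of the given (rows, cols) shape.

    (cx, cy) is the centre, (sx, sy) the scales, theta the rotation in radians.
    """
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(np.float64)
    dx, dy = xs - cx, ys - cy
    # Rotate the coordinates, then scale them.
    u = sx * (np.cos(theta) * dx - np.sin(theta) * dy)
    v = sy * (np.sin(theta) * dx + np.cos(theta) * dy)
    # Gaussian envelope multiplied by a sine wave along the rotated x axis.
    return np.exp(-0.5 * (u ** 2 + v ** 2)) * np.sin(u)

def reconstruct(shape, wavelets, weights):
    """Weighted sum of wavelets; each entry of `wavelets` is (cx, cy, sx, sy, theta)."""
    out = np.zeros(shape)
    for params, weight in zip(wavelets, weights):
        out += weight * odd_gabor(shape, *params)
    return out
```

For example, `reconstruct((64, 64), [(32, 32, 0.2, 0.2, 0.0)], [1.0])` renders a single wavelet centred in a 64 × 64 image.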
Face Detection and Feature Extraction
Gabor wavelet networks can also be used for face recognition (with limited capability)
• Well suited to handling occlusion – does not lose track of people
• Can be used to distinguish "registered" people from "newcomers"
• Requires entirely different training – not "zero price", but still cheap
Demo
• Face tracking with feature detection and identification
Other Video Work in Brno
• Automatic video editing – Prolog-based; Prolog clauses are generated from scenarios (general intentions and rules) and from processed audiovisual sequences, and the system produces "cutting" instructions that are used by off-line video editing software
• Video annotation tool – shown before, recently extended with multi-channel capability and broader use of XML
• A more detailed description of the tools and approaches is available at:
http://www.fit.vutbr.cz/research/projects/m4
Conclusions
• Audio/video multimodal processing is not yet very successful
• Detection of video features is fairly successful
• Face detection is very reliable
• In future work (possibly in AMI), we would like to perform some of the tasks in real time and in embedded video camera systems
• Thank you for your attention