Seminar Multimodal Rooms (Multimodale Räume), Summer Semester 2003: Introduction, 7 May, Rainer Stiefelhagen
Multimodal Rooms
“Smart Rooms”, “Intelligent Environments”
Seminar, Summer Semester 2003
User Interfaces
• In the beginning: Wimpy Computing
– Windows, Icons, Menus, Pointing
2nd Generation: Human-Machine Interaction
“Please show me… hm… all hotels in THIS area… er… part of the city”
• Speaking
• Pointing
• Gesturing
• Handwriting
• Drawing
• Presence / Focus of Attention
• Combination
– Speech + Handwriting + Gesture
– Repair
• Multimodal NLP & Dialog
“Perceptual” User Interfaces
• Perceptive
– Human-like perceptual capabilities (what is the user saying, who is the user, where is the user, what is he doing?)
• Multimodal
– People use multiple modalities to communicate (speech, gestures, facial expressions, …)
• Multimedia
– Text, graphics, audio and video
(Matthew Turk (Ed.), Proceedings of the 1998 Workshop on Perceptual User Interfaces)
Next: Pervasive Computing
Human-Computer Interaction not the Only Exchange
Humans Want to Interact with Other Humans
– Computers in the Human Interaction Loop (CHIL)
– The Transparent, Invisible Computer
– Computers Need to be Context Aware
– Should Require little or no Learning or Attention
– Should be proactive rather than command driven
– Produce Little or No Distraction
– Permit a HCI and CHIL Mix
Smart/Intelligent Rooms
• Use of computation to enhance everyday activity
• Integrate computers seamlessly into the real world (e.g. offices, homes)
• Use “natural” interfaces for communication (voice, gesture, etc.)
• Computer should adapt to the human, not vice versa!
Perception
• In order to respond appropriately, objects and rooms need to pay attention to
– People
– Context
• Machines have to be aware of their environment:
– Who, What, When, Where and Why?
• Interfaces must be adaptive to
– Overall situation
– Individual user
Intelligent Environments
• Classroom 2000 (Georgia Tech)
• Mozer’s Adaptive House
• Enhanced Meeting Rooms
• Kids Room (MIT)
• …
• Enhanced objects such as whiteboards, desks, chairs, …
• See also the Intelligent Environments Resource Page (http://www.research.microsoft.com/ierp/)
Intelligent Rooms, Univ. California, San Diego
Classroom 2000
• Capturing activity in a classroom
– Speaker’s voice
– Video
– Slides
– Handwritten notes
Classroom 2000
Presenting (recorded) lectures through a web-based interface
• Integration of slides, notes, audio, video
• Searching
• Adding additional material
Microsoft Easy Living Project
• XML-based distributed agent system
• Computer vision for person tracking and visual user interaction
• Multiple sensor modalities combined
• Use of a geometric model of the world to provide context
• Automatic or semi-automatic sensor calibration and model building
• Fine-grained events and adaptation of the user interface
• Device-independent communication and data protocols
• Ability to extend the system in many ways
Mozer’s Adaptive House
• Operated as an ordinary home
– Usual light switches, thermostats, doors, etc.
• Adjustments are measured and used to train the house to
– automatically adjust temperature
– adjust lighting
– choose music or TV channel
• The house infers the users’ desires from their actions and behaviours
Adaptive House (Mozer)
Sensors:
• Light Level
• Sound Level
• Temperature
• Motion
• Door status
• Window status
• Light settings
• Fan
• Heaters
• …
(M. Mozer, Univ. of Colorado, Boulder)
Issues in Perception
• Visual
– Face Detection / Tracking
– Body Tracking
– Face Recognition
– Gesture Recognition
– Action Recognition
– Gaze Tracking / Tracking Focus of Attention
• Auditory
– Speech Recognition
– Speaker Tracking
– Auditory Scene Analysis
– Speaker Identification
• Other: Haptic, Olfactory, …?
Enhanced Meeting Rooms
Capturing of Meetings
• Transcription
• Summarization
• Dialog Processing
• Who was there?
• Who talked to whom?
Work at ISL
• Face Tracking
• Facial Feature Tracking (Eyes, Nose, Mouth)
• Head Pose Estimation / Gaze Tracking
• Lip-Reading (Audio-Visual Speech Recognition)
• 3D Person Tracking
• Pointing Gesture Tracking
• Other Modalities: Speech (!!!, see John), Dialogue, Translation, Handwriting, ...
Tracking of Human Faces
• A face provides different functions:
– identification
– perception of emotional expressions
• Human-computer interaction requires tracking of faces:
– lip-reading
– eye/gaze tracking
– facial action analysis / synthesis
• Video conferencing / video telephony applications:
– tracking the speaker
– achieving low bit rate transmission
Demo: FaceTracker
Color Based Face Tracking
Human skin colors:
• cluster in a small area of a color space
• skin colors of different people mainly differ in intensity!
• variance can be reduced by color normalization
• distribution can be characterized by a Gaussian model
Chromatic colors:
$$r = \frac{R}{R+G+B}, \qquad g = \frac{G}{R+G+B}$$
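A minimal sketch of the chromatic normalization and Gaussian skin-color model named on this slide, in plain Python. The sample pixel values, the function names, the variance floor, and the diagonal-covariance simplification are assumptions made for illustration; the slides do not specify them.

```python
import math

def chromatic(r, g, b):
    """Map an RGB pixel to chromatic (r, g) coordinates, dividing out intensity."""
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # assumption: treat black as neutral gray
    return (r / total, g / total)

def fit_gaussian(samples):
    """Fit per-channel mean and variance (diagonal covariance, a simplification)."""
    n = len(samples)
    mean = [sum(s[i] for s in samples) / n for i in range(2)]
    # Floor the variance so near-identical samples do not give a degenerate model.
    var = [max(sum((s[i] - mean[i]) ** 2 for s in samples) / n, 1e-6)
           for i in range(2)]
    return mean, var

def skin_likelihood(pixel, mean, var):
    """Unnormalized Gaussian likelihood, in [0, 1], that a pixel is skin-colored."""
    rg = chromatic(*pixel)
    d2 = sum((rg[i] - mean[i]) ** 2 / var[i] for i in range(2))
    return math.exp(-0.5 * d2)

# Hypothetical skin samples: similar chromaticity, different intensity.
skin_pixels = [(200, 140, 120), (100, 70, 60), (150, 105, 90), (220, 150, 135)]
mean, var = fit_gaussian([chromatic(*p) for p in skin_pixels])

skin_score = skin_likelihood((180, 126, 108), mean, var)   # skin-like pixel
other_score = skin_likelihood((30, 60, 200), mean, var)    # bluish pixel
```

Because the normalization divides by R+G+B, the bright and the dark sample of the same hue map to the same chromatic point, which is why the fitted Gaussian can stay narrow.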
Color Model
Advantages:
• very fast
• orientation invariant
• stable object representation
• not person-dependent
• model parameters can be quickly adapted
Disadvantages:
• environment dependent (light sources heavily affect color distribution)
Tracking Gaze and Focus of Attention
• In meetings:
– to determine the addressee of a speech act
– to track the participants’ attention
– to analyse who was in the center of focus
– for meeting indexing / retrieval
• Interactive rooms
– to guide the environment’s focus to the right application
– to suppress unwanted responses
• Virtual collaborative workspaces (CSCW)
• Human-Robot Cooperation
• Cars (driver monitoring)
Tracking a User’s Focus of Attention
• Focus of Attention tracking:
– To detect a person’s interest
– To know what a user is interacting with
– To understand his actions/intentions
– To know whether a user is aware of something
• In meetings:
– to determine the addressee of a speech act
– to understand the dynamics of interaction
– for meeting indexing / retrieval
• Other areas
– Smart environments
– Video-conferencing
– Human-Robot Interaction
Head Pose Estimation
• Model-based approaches:
– Locate and track a number of facial features
– Compute head pose from 2D to 3D correspondences (Gee & Cipolla '94, Stiefelhagen et al. '96, Jebara & Pentland '97, Toyama '98)
• Example-based approaches:
– Estimate new pose with a function approximator such as an ANN (Beymer et al. '94, Schiele & Waibel '95, Rae & Ritter '98)
– Use a face database to encode images (Pentland et al. '94)
Model-based Head Pose estimation
[Figure: image, 3D model and real world (X, Y, Z axes), linked by feature tracking and pose estimation]
• Find correspondences between points in a 3D model and points in the image
• Iteratively solve a linear equation system to find the pose parameters (rx, ry, rz, tx, ty, tz)
Demo: Facial Feature Tracking
Demo: Model-based Head Pose
Model-based Head Pose
• Pose estimation accuracy depends on correct feature localization!
• Problems:
– Choice of good features
– Occlusion due to strong head rotation
– Fast head movement
– Detection of tracking failure / re-initialization
– Requires good image resolution
• Video
Estimating Head Pose with ANNs
• Train neural network to estimate head orientation
• Preprocessed image of the face used as input
Network Architecture
[Figure: network architecture]
Input retina: up to 3 x 20x30 pixels = 1,800 units
Hidden layer: 40 to 150 units
Output: pan (tilt)
Tracking People in a Panoramic View
[Figure: camera view, panoramic view, perspective view]
Training
• Separate nets for pan and tilt
• Trained with standard backpropagation with momentum term
• Datasets:
– Training on 6100 images from 12 users
– Cross-evaluation on 750 images from same users
– Tested on 750 images from same users
• Additional user-independent test set:
– 1500 images from two new users
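To make "backpropagation with momentum term" concrete, here is a toy sketch in plain Python. It is not the network from the slides: the tiny 4-pixel inputs, the tanh hidden units, the linear output, the learning rate, and the normalized "pan" targets are all assumptions chosen so the example runs in a fraction of a second.

```python
import math
import random

def train_mlp(data, n_hidden=3, lr=0.05, momentum=0.9, epochs=300, seed=0):
    """Train a one-hidden-layer net with batch backprop and a momentum term.

    data: list of (inputs, target) pairs.
    Returns (loss_before, loss_after, predict).
    """
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Each weight row carries a trailing bias weight.
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    v1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # momentum buffers
    v2 = [0.0] * (n_hidden + 1)

    def forward(x):
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * xi for w, xi in zip(row, xb))) for row in w1]
        hb = h + [1.0]
        return xb, hb, sum(w * hi for w, hi in zip(w2, hb))

    def loss():
        return sum((forward(x)[2] - t) ** 2 for x, t in data) / len(data)

    loss_before = loss()
    for _ in range(epochs):
        g1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g2 = [0.0] * (n_hidden + 1)
        for x, t in data:
            xb, hb, y = forward(x)
            err = (y - t) / len(data)  # gradient of half the mean squared error
            for j in range(n_hidden + 1):
                g2[j] += err * hb[j]
            for j in range(n_hidden):
                delta = err * w2[j] * (1.0 - hb[j] ** 2)  # tanh derivative
                for i in range(n_in + 1):
                    g1[j][i] += delta * xb[i]
        # Momentum: the velocity is a decaying sum of past gradient steps.
        for j in range(n_hidden + 1):
            v2[j] = momentum * v2[j] - lr * g2[j]
            w2[j] += v2[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                v1[j][i] = momentum * v1[j][i] - lr * g1[j][i]
                w1[j][i] += v1[j][i]
    return loss_before, loss(), lambda x: forward(x)[2]

# Hypothetical toy task: a bright pixel's position encodes a normalized pan angle.
toy = [([1, 0, 0, 0], -0.75), ([0, 1, 0, 0], -0.25),
       ([0, 0, 1, 0], 0.25), ([0, 0, 0, 1], 0.75)]
loss_before, loss_after, predict = train_mlp(toy)
```

Following the slide, a second network of the same form would be trained separately for tilt.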
Results
Average error in degrees for pan / tilt:

| Input | Training set | Test set | New users |
| --- | --- | --- | --- |
| histo | 6.6 / 5.0 | 9.4 / 6.9 | 11.3 / 9.1 |
| edges | 6.0 / 2.6 | 10.8 / 7.1 | 13.3 / 10.8 |
| both | 1.4 / 1.5 | 7.8 / 5.4 | 9.9 / 10.3 |

histo: Histogram-normalized image used as input
edges: Horizontal- and Vertical Edge Image used as input
both: Both, Histogram-image plus Edge Images used
Demo
Spatial-Awareness in Smart Rooms
Tracking people indoors
• To focus sensors on people
• To resolve spatial relationships
• To avoid bumping into humans
• To analyze activity
Person Tracking
Vision-based localization of people/objects:
• Single perspective:
– Pfinder, W3S, Hydra, etc.
• Multiple perspectives:
– AVIARY, Easy Living
Person Tracking in the ISL Smart Room
[Figure: cameras Cam0 to Cam3, each with a feature extractor, send features to a tracking agent that outputs people]
Person Tracking with Multiple Cameras
Goal: 3D tracking of people in rooms
• Segmentation of foreground objects in each image
• “3D intersection” of the rays through the object centers
• Kalman filter
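Rays from different cameras rarely meet exactly, so the "3D intersection" can be taken as the point closest to all rays in the least-squares sense. The sketch below shows one common way to compute it in plain Python; whether the seminar's system used this exact formulation is not stated, and the camera positions and function names are hypothetical. Note that the rays are treated as infinite lines.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x

def closest_point_to_rays(origins, directions):
    """Least-squares 3D point nearest to a set of lines (camera center + direction)."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        norm = math.sqrt(sum(c * c for c in d))
        u = [c / norm for c in d]
        for i in range(3):
            for j in range(3):
                # Projector onto the plane perpendicular to the ray: I - u u^T
                m = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += m
                b[i] += m * o[j]
    return solve3(a, b)

# Two hypothetical cameras looking at a person centered at (2, 3, 1).
origins = [(0.0, 0.0, 2.0), (4.0, 0.0, 2.0)]
directions = [(2.0, 3.0, -1.0), (-2.0, 3.0, -1.0)]
point = closest_point_to_rays(origins, directions)
```

At least two non-parallel rays are needed; with parallel rays the 3x3 system becomes singular.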
Adaptive Silhouette Extraction
Background subtraction:
• Adaptive Multi-Gaussian background model [Stauffer et al., CVPR 1998]
• Morphological operators smooth foreground output
• Connected components form silhouettes
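The pipeline above can be sketched in a simplified form. This plain-Python example deliberately swaps in a single Gaussian per pixel for the adaptive multi-Gaussian model, and a minimum blob size for the morphological smoothing; the threshold `k`, the tiny grayscale frame, and all function names are assumptions for illustration.

```python
from collections import deque

def foreground_mask(frame, bg_mean, bg_std, k=2.5):
    """Mark pixels deviating more than k standard deviations from the background."""
    return [[abs(p - m) > k * s for p, m, s in zip(fr, mr, sr)]
            for fr, mr, sr in zip(frame, bg_mean, bg_std)]

def connected_components(mask, min_size=2):
    """Group 4-connected foreground pixels; drop blobs smaller than min_size."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, blob = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_size:
                blobs.append(blob)
    return blobs

def centroid(blob):
    """Centroid (row, column) of a blob, usable as its reference point."""
    n = len(blob)
    return (sum(p[0] for p in blob) / n, sum(p[1] for p in blob) / n)

# Hypothetical 5x6 grayscale frame: background level 10, two objects, one noise pixel.
bg_mean = [[10] * 6 for _ in range(5)]
bg_std = [[2] * 6 for _ in range(5)]
frame = [
    [10, 10, 10, 10, 10, 10],
    [10, 50, 50, 10, 10, 10],
    [10, 50, 50, 10, 10, 40],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 60, 60, 10],
]
blobs = connected_components(foreground_mask(frame, bg_mean, bg_std))
centers = [centroid(b) for b in blobs]
```

The isolated pixel at row 2, column 5 passes the threshold but is discarded by the size filter, which is the role the morphological operators play in the full method.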
Locating people
• Extract reference point: centroid
• Use calibrated sensors to calculate absolute position
• Create list of location hypotheses
[Figure: location hypotheses i) (X,Y) and ii) (X,Y), with diagram labels 1, 2, 3 and a, b]
Tracking people
Best Hypothesis Tracking:
• Match location hypotheses to tracks
• Smooth tracks with Kalman filter
[Figure: hypotheses i) (X,Y) and ii) (X,Y) assigned to Track 1 and Track 2]
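A minimal sketch of the two steps, matching hypotheses to tracks and smoothing with a Kalman filter, in plain Python. The greedy nearest-neighbor matching, the random-walk motion model, the noise variances, and the gating distance are assumptions; the slides name the steps but not these details.

```python
import math

class Track:
    """One tracked person: a per-axis Kalman filter with a random-walk motion model."""

    def __init__(self, x, y, process_var=0.05, meas_var=0.5):
        self.pos = [x, y]
        self.var = [1.0, 1.0]
        self.q, self.r = process_var, meas_var

    def update(self, measurement):
        for i in range(2):
            prior_var = self.var[i] + self.q            # predict: uncertainty grows
            gain = prior_var / (prior_var + self.r)     # Kalman gain
            self.pos[i] += gain * (measurement[i] - self.pos[i])
            self.var[i] = (1.0 - gain) * prior_var

def assign(tracks, hypotheses, max_dist=1.0):
    """Greedy matching: closest track/hypothesis pairs first, within max_dist."""
    pairs = sorted(
        (math.dist(t.pos, h), ti, hi)
        for ti, t in enumerate(tracks)
        for hi, h in enumerate(hypotheses)
    )
    used_t, used_h, result = set(), set(), {}
    for d, ti, hi in pairs:
        if d <= max_dist and ti not in used_t and hi not in used_h:
            used_t.add(ti)
            used_h.add(hi)
            result[ti] = hi
    return result

# Two existing tracks and three hypothetical location hypotheses (X, Y) in meters.
tracks = [Track(0.0, 0.0), Track(3.0, 3.0)]
hypotheses = [(3.2, 2.9), (0.1, -0.1), (8.0, 8.0)]
matches = assign(tracks, hypotheses)
for ti, hi in matches.items():
    tracks[ti].update(hypotheses[hi])
```

The hypothesis at (8, 8) stays unmatched because it lies outside the gating distance of both tracks; a fuller tracker would start a new track for it or discard it as a false alarm.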
Tracking Problems
Imperfect and merged silhouettes
Counterstrategies:
• Better vision algorithm
• Probabilistic multi-hypothesis tracking
• Reference point: head
Reference point: Head
• Use head as reference point instead of centroid
• Head tracker has significantly lower tracking error and false alarm rate
[Chart: tracking error and false alarm rate for head vs. centroid reference point, axis from 0 to 0.1]
Demo
Recognition of Pointing Gestures
• Goals:
– Recognize human pointing gestures
– Extract the pointing direction in 3D
• Applications:
– Human-robot interaction
– Smart rooms
• Requirements:
– Person-independent
– Real-time operation
– Camera movement possible
Recognition of Pointing Gestures
Stereo camera: left/right image
3D Tracker: Processing Steps
[Figure panels: camera image, skin color, disparity]
3D clustering of skin-colored pixels provides cues to the positions of the head and hands.
Gesture Recognition: Movement Phases
• Pointing gestures consist of three intuitively distinguishable movement phases:
– Begin
– Hold
– End
• Precise localization of the hold phase is important for determining the pointing direction
Mean duration of the movement phases:

| Phase | μ [sec] | σ [sec] |
| --- | --- | --- |
| Complete gesture | 1.75 | 0.48 |
| Begin | 0.52 | 0.17 |
| Hold | 0.76 | 0.40 |
| End | 0.47 | 0.12 |

Gesture Recognition: Models
• The 3 phases are modeled with separate models
• Continuous HMMs with 2 Gaussians per state
• Null model as threshold for the phase models
• Training on hand-labeled data
Gesture Recognition: Detection
• A pointing gesture is detected if 3 time points $t_B < t_H < t_E$ are found such that
– $P_E(t_E) > P_B(t_E)$ and $P_E(t_E) > 0$
– $P_B(t_B) > P_E(t_B)$ and $P_B(t_B) > 0$
– $P_H(t_H) > 0$
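The detection rule can be sketched as a search over three per-frame score sequences. This sketch assumes that the scores of the begin, hold and end models are already expressed relative to the null model, so that a value above 0 means the phase model beats the threshold; the score values and the function name are hypothetical.

```python
def detect_pointing(p_begin, p_hold, p_end):
    """Return the earliest (tB, tH, tE) with tB < tH < tE meeting the conditions, else None.

    Conditions, per the detection rule:
      P_B(tB) > P_E(tB) and P_B(tB) > 0
      P_H(tH) > 0
      P_E(tE) > P_B(tE) and P_E(tE) > 0
    """
    n = len(p_begin)
    # Taking the earliest valid frame at each stage finds a triple whenever one exists.
    t_b = next((t for t in range(n)
                if p_begin[t] > p_end[t] and p_begin[t] > 0), None)
    if t_b is None:
        return None
    t_h = next((t for t in range(t_b + 1, n) if p_hold[t] > 0), None)
    if t_h is None:
        return None
    t_e = next((t for t in range(t_h + 1, n)
                if p_end[t] > p_begin[t] and p_end[t] > 0), None)
    if t_e is None:
        return None
    return (t_b, t_h, t_e)

# Hypothetical per-frame scores (relative to the null model) for six frames.
p_begin = [-1, 2, 1, -1, -2, -2]
p_hold = [-1, -1, 1, 3, 1, -1]
p_end = [-2, -1, -1, -1, 2, 3]
gesture = detect_pointing(p_begin, p_hold, p_end)
```

For these scores the begin model wins at frame 1, the hold model first exceeds the threshold at frame 2, and the end model wins at frame 4.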
Gesture Recognition: Features
• Feature vector: (r, Δθ, Δy)
• Experiments: cylindrical coordinates work better than spherical and Cartesian ones
• Hand position relative to the head: independent of the position in the room
• Δθ, Δy: no adaptation to the pointing targets seen in training
• Spline interpolation of the feature sequences to a constant 40 Hz
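A small sketch of how such a feature vector could be computed from tracked 3D head and hand positions, in plain Python. The axis convention (y pointing up, θ measured in the horizontal x-z plane), the frame-to-frame definition of the deltas, and the sample trajectory are assumptions; resampling to 40 Hz is left out.

```python
import math

def cylindrical(head, hand):
    """Hand position relative to the head in cylindrical coordinates (r, theta, y)."""
    dx, dy, dz = (hand[i] - head[i] for i in range(3))
    return math.hypot(dx, dz), math.atan2(dz, dx), dy

def feature_sequence(heads, hands):
    """Per-frame features (r, delta_theta, delta_y) from head and hand trajectories."""
    feats, prev = [], None
    for head, hand in zip(heads, hands):
        r, theta, y = cylindrical(head, hand)
        if prev is not None:
            diff = theta - prev[0]
            # Wrap the angular difference into [-pi, pi].
            d_theta = math.atan2(math.sin(diff), math.cos(diff))
            feats.append((r, d_theta, y - prev[1]))
        prev = (theta, y)
    return feats

# Hypothetical trajectory: the head stays put while the hand rises and swings out.
heads = [(0.0, 1.7, 0.0)] * 3
hands = [(0.3, 1.0, 0.0), (0.3, 1.2, 0.3), (0.0, 1.5, 0.5)]
features = feature_sequence(heads, hands)
```

Because the features use only the hand position relative to the head, and only changes in θ and y, moving the person elsewhere in the room leaves them unchanged.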
Pointing Direction
• Head-hand line
– Line of sight from eye to hand
– Easy to measure
• Forearm line
– Potentially superior when the arm is bent
– Harder to measure
Audio-Visual Speech Recognition
Lip Tracking Module
• Feature based
• detects localization failures and recovers from them automatically
• tracks facial features (pupils, nostrils, lips)
Audio-Visual Recognition
$$hyp_c = \lambda_a \, hyp_a + \lambda_v \, hyp_v, \qquad \lambda_a + \lambda_v = 1$$
Combination methods:
• SNR weights
• Entropy weights
• Trained weights
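The weighted combination can be illustrated with entropy-based weights in plain Python. The slides name entropy weighting but do not give a formula, so the rule used here (the stream with the lower-entropy, more peaked score distribution gets the larger weight) is an assumption, as are the example probabilities and function names.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_weights(p_audio, p_video):
    """Assumed scheme: weight each stream by the other stream's share of the entropy."""
    h_a, h_v = entropy(p_audio), entropy(p_video)
    if h_a + h_v == 0:
        return 0.5, 0.5
    lam_a = h_v / (h_a + h_v)
    return lam_a, 1.0 - lam_a

def combine(p_audio, p_video, lam_a):
    """Weighted sum of the two hypothesis score vectors, with lam_a + lam_v = 1."""
    lam_v = 1.0 - lam_a
    return [lam_a * a + lam_v * v for a, v in zip(p_audio, p_video)]

# Hypothetical scores over three candidate classes.
p_audio = [0.40, 0.35, 0.25]   # noisy audio: nearly flat, uncertain
p_video = [0.10, 0.80, 0.10]   # clear view of the lips: confident
lam_a, lam_v = entropy_weights(p_audio, p_video)
p_combined = combine(p_audio, p_video, lam_a)
best = max(range(len(p_combined)), key=lambda i: p_combined[i])
```

In this example the audio stream alone would pick class 0, but the flatter audio distribution receives the smaller weight and the combined decision follows the video stream.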
Fusion Levels
• Word level (vote, decide based on A and V score)
• Phoneme level (combine by different weighting schemes)
• Feature level (combine features)
Audio-Visual Speech
[Chart: acoustic, visual and combined recognition results on a 0 to 100 scale for clean speech, 16 dB SNR and 8 dB SNR]
Possible Topics
• Person Tracking
• Gesture Recognition
• Attentive Interfaces
• Face Detection
• Lip-Reading (Audio-Visual Speech Recognition)
• Audio-Visual Tracking
• Emotion Recognition
• Person Identification
• Microphone Arrays
• Sensor Fusion
• Smart Room Infrastructure
• Intelligent Camera Control
• Self-Calibration
• Other Smart Room Projects (MIT, Georgia Tech, IM2)
• Other Sensors: Pressure, IR, etc.
• Speech Recognition
– in Meetings
– Far-Field
– Efficient
• Microphone Arrays