Machines that Recognize Human Emotion
Yuan Qi
MIT Media Laboratory
A man barges into your office when you’re busy. He doesn’t apologize, doesn’t introduce himself, and
doesn’t notice you are annoyed.
He offers you useless advice. You express more annoyance. He ignores it.
He continues to be unhelpful. The clarity of your emotional expression escalates. He ignores it.
(This goes on.) Finally you have to tell him explicitly, “Go away.”
He winks, and does a little dance before exiting.
Recognition of three “basic” states:
• Expressions, behaviors: “Flared nostrils, tightened lips, a quick sharp gesture, skin conductivity = high; probably she is angry.”
• Situation, reasoning: That was an important goal to her and Bob just thwarted it, so she probably feels angry toward Bob.
“Emotion recognition”
Emotions give rise to changes that can be sensed
Distance sensing: face; voice; posture; gestures, movement, behavior
Up-close sensing: skin conductivity; pupillary dilation; respiration, heart rate, pulse; temperature; blood pressure
Internal sensing: hormones; neurotransmitters; …
Can a machine tell if a person is bored or interested?
Attentive? Fidgeting?
Application: Computer Learning Companion, Tutor, Mentor
Can we teach a chair to recognize behaviors indicative of interest and boredom? (Mota and Picard)
Postures shown: Sit upright, Lean forward, Slump back, Side lean
What can the sensor chair contribute toward inferring the user’s state: Bored vs. interested?
9-state Posture Recognition: 89-97% accurate
High/Low interest, Taking a Break: 69-83% accurate
(Results on kids not in training data, 2002)
Detecting, tracking, and recognizing facial expressions from video (Kapoor & Picard)
Computer recognition of natural head nods and shakes
Kapoor and Picard, PUI ’01
Fully automatic computer recognition of six natural facial “action units”
(Kapoor and Picard)
Accuracy:
“Expert” human: 75%
Our first system: 67%
Can the computer sense mild frustration or distress?
(e.g., for usability testing in the field?)
Things to communicate frustration
(Reynolds & Picard)
Example: data from pressure mouse
Forthcoming paper w/Jack Dennerlein, Harvard School of Public Health, and Carson Reynolds/Rosalind Picard at MIT, International Ergonomics Association, linking frustration and physical risk factors
Can the computer sense other emotions? Stress? Pleasure?…
Sensing Processing Expression
Wearable skin conductivity communicator
Making the light glow:
• Significant thoughts
• Exciting events
• Exercise
• Motion artifacts
• Lying
• Pain
Audience’s “Glow” conveys excitement (Approximate Skin Conductivity Level)
Communicate emotion in new ways
Picard and Scheirer, HCI 2001
Cybernetic wearable camera (Healey & Picard, ISWC 98)
StartleCam Filter
Video: StartleCam (Healey & Picard, ISWC 98)
Subject intentionally expressing 8 emotions:
1. Neutral
2. Anger
3. Hate
4. Grief
5. Platonic Love
6. Romantic Love
7. Joy
8. Reverence
Each emotion collected daily, for > 4 weeks
4 physiological signals: EMG on jaw, skin conductivity, BVP, respiration
Classification Accuracy: 81% on 8 emotions (person dependent)
Picard et al., IEEE Trans. Pattern Analysis Machine Intell., Oct 2001.
Autonomic Balance = LF/HF
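The LF/HF ratio can be sketched as a ratio of band powers in a heart-rate-variability spectrum. The band edges below (LF 0.04-0.15 Hz, HF 0.15-0.40 Hz) are the conventional ones and are an assumption here, since the slide does not state them; the function names and the toy spectrum are hypothetical.

```python
import numpy as np

# Assumed band edges in Hz (conventional HRV values, not given on the slide).
LF_BAND = (0.04, 0.15)
HF_BAND = (0.15, 0.40)

def band_power(freqs, psd, band):
    """Sum spectral power over the half-open band [lo, hi).

    On a uniform frequency grid the sum is proportional to the integral,
    and the constant cancels in the LF/HF ratio.
    """
    lo, hi = band
    mask = (freqs >= lo) & (freqs < hi)
    return float(np.sum(psd[mask]))

def autonomic_balance(freqs, psd):
    """Return LF/HF for a heart-rate-variability power spectrum."""
    return band_power(freqs, psd, LF_BAND) / band_power(freqs, psd, HF_BAND)

# Toy spectrum (made up for illustration): one LF line and one HF line.
freqs = np.linspace(0.0, 0.5, 501)
psd = np.zeros_like(freqs)
psd[np.argmin(np.abs(freqs - 0.10))] = 3.0  # LF component
psd[np.argmin(np.abs(freqs - 0.25))] = 1.0  # HF component
ratio = autonomic_balance(freqs, psd)
```

In practice the spectrum would come from an estimator that tolerates irregular beat-to-beat sampling, which is the motivation for the Bayesian method on the following slides.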
Bayesian Spectrum Estimation of Unevenly Sampled Nonstationary Data
(Y. Qi, T.P. Minka, and R.W. Picard 01)
Problem
Estimating spectrum with data that is
• Nonstationary
• Unevenly sampled
• Noisy
Bayesian Approach
Dynamic modeling of the time series
\[
s_i = [a_{i0}, a_{i1}, \ldots, a_{iM}, b_{i1}, \ldots, b_{iM}]^T
\]
\[
c_i = [1, \sin(2\pi f_1 t_i), \ldots, \sin(2\pi f_M t_i), \cos(2\pi f_1 t_i), \ldots, \cos(2\pi f_M t_i)]
\]
\[
s_i = s_{i-1} + w_i \tag{1}
\]
\[
x_i = c_i s_i + v_i \tag{2}
\]
w_i: the process noise at time t_i
v_i: the observation noise at time t_i
The filtering distribution p(s_i | x_{1:i}) can be sequentially estimated as
\[
p(s_i \mid x_{1:i-1}) = \int p(s_i \mid s_{i-1})\, p(s_{i-1} \mid x_{1:i-1})\, ds_{i-1}
\]
\[
p(s_i \mid x_{1:i}) = \frac{p(x_i \mid s_i)\, p(s_i \mid x_{1:i-1})}{p(x_i \mid x_{1:i-1})}
\]
Then the spectrum at time t_i can be summarized by the posterior mean of p(s_i | x_{1:i}).
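If the process noise w_i and the observation noise v_i are assumed to be zero-mean Gaussian, the sequential estimate of p(s_i | x_{1:i}) reduces to a Kalman filter over the sine and cosine coefficients. The following is a minimal sketch under that assumption, not the authors' implementation: the noise variances `q` and `r`, the prior variance `p0`, the candidate frequency grid, and the demo signal are all hypothetical choices.

```python
import numpy as np

def observation_row(t, freqs):
    """c_i = [1, sin(2*pi*f_1*t), ..., sin(2*pi*f_M*t), cos(2*pi*f_1*t), ..., cos(2*pi*f_M*t)]."""
    arg = 2.0 * np.pi * np.asarray(freqs) * t
    return np.concatenate(([1.0], np.sin(arg), np.cos(arg)))

def kalman_spectrum(times, x, freqs, q=1e-4, r=0.01, p0=10.0):
    """Return the posterior mean of s_i after each sample (one row per sample).

    q: assumed process-noise variance per step (random-walk state).
    r: assumed observation-noise variance.
    p0: assumed prior variance of every coefficient.
    """
    dim = 2 * len(freqs) + 1
    mean = np.zeros(dim)
    cov = p0 * np.eye(dim)
    history = []
    for t, xi in zip(times, x):
        # Prediction step: s_i = s_{i-1} + w_i only inflates the covariance.
        cov = cov + q * np.eye(dim)
        # Update step: x_i = c_i s_i + v_i is a scalar linear observation.
        c = observation_row(t, freqs)
        innovation_var = c @ cov @ c + r
        gain = cov @ c / innovation_var
        mean = mean + gain * (xi - c @ mean)
        cov = cov - np.outer(gain, c @ cov)
        history.append(mean.copy())
    return np.array(history)

def amplitudes(mean, n_freqs):
    """Combine sine and cosine coefficients into one amplitude per frequency."""
    a = mean[1:n_freqs + 1]
    b = mean[n_freqs + 1:]
    return np.hypot(a, b)

# Demo on made-up data: a 20 Hz tone observed at 200 random (uneven) times.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 2.0, 200))
x = np.sin(2.0 * np.pi * 20.0 * times) + rng.normal(0.0, 0.1, times.size)
freqs = np.array([10.0, 20.0, 30.0])
means = kalman_spectrum(times, x, freqs)
final_amp = amplitudes(means[-1], len(freqs))
```

Because each update uses the actual sample time t_i inside c_i, uneven spacing needs no resampling or gap filling; the random-walk state lets the coefficients drift, which is how nonstationarity is accommodated in this sketch.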
Comparison with Classical Spectrum Estimation Algorithms
Panels: Welch, Burg, MUSIC, Multitaper, New
The signal is the sum of 19, 20, and 21 Hz real sinusoid waves with amplitudes 0.5, 1, and 1 respectively. The variance of the additive white noise is 0.1. The signal is evenly sampled 128 times at 50 Hz.
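The test signal described above can be regenerated as follows, together with a plain windowed FFT periodogram as a classical baseline. This is only a sketch: the random seed, the zero initial phases, and the Hann window are assumptions, and the periodogram is a stand-in rather than any of the specific estimators compared on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)   # assumed seed, for repeatability only
fs = 50.0                        # sampling rate in Hz (from the slide)
n = 128                          # number of samples (from the slide)
t = np.arange(n) / fs

# Sum of 19, 20, and 21 Hz sinusoids with amplitudes 0.5, 1, and 1.
signal = (0.5 * np.sin(2.0 * np.pi * 19.0 * t)
          + 1.0 * np.sin(2.0 * np.pi * 20.0 * t)
          + 1.0 * np.sin(2.0 * np.pi * 21.0 * t))

# Additive white noise with variance 0.1.
noisy = signal + rng.normal(0.0, np.sqrt(0.1), n)

# Hann-windowed periodogram (assumed baseline).
window = np.hanning(n)
spectrum = np.abs(np.fft.rfft(noisy * window)) ** 2
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

bin_spacing = freqs[1] - freqs[0]          # 50 / 128 Hz per FFT bin
peak_freq = freqs[np.argmax(spectrum)]     # location of the strongest bin
```

With only 128 samples the FFT bin spacing is about 0.39 Hz, so the three components, 1 Hz apart, sit only a few bins from each other, which is what makes this a demanding resolution test.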
Spectral analysis for an unevenly sampled signal
The signal frequency jumps from 20 Hz to 40 Hz at the sampling time -0.833 second, and then jumps from 40 Hz to 60 Hz at 0.833 second.
Panels: Lomb-Scargle periodogram with a window size of 200 points; Spectrogram by the new method; Spectrogram by the new method coupled with sparsification
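For comparison, a windowed Lomb-Scargle analysis of a signal like this one can be sketched as below. The time span (-2.5 s to 2.5 s), the number of samples, the noise level, and the frequency grid are assumptions made for illustration; only the jump times, the three frequencies, and the 200-point window come from the slide. The periodogram is the classical Lomb-Scargle formula written out directly in NumPy.

```python
import numpy as np

def lomb_scargle(t, y, freqs_hz):
    """Classical Lomb-Scargle periodogram for unevenly sampled data."""
    y = y - y.mean()
    power = np.empty(len(freqs_hz))
    for k, f in enumerate(freqs_hz):
        w = 2.0 * np.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        tau = np.arctan2(np.sum(np.sin(2.0 * w * t)),
                         np.sum(np.cos(2.0 * w * t))) / (2.0 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[k] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# Assumed sampling: 1500 random times over [-2.5, 2.5] seconds.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(-2.5, 2.5, 1500))

# Frequency jumps 20 -> 40 -> 60 Hz at -0.833 s and 0.833 s (from the slide).
f_true = np.where(t < -0.833, 20.0, np.where(t < 0.833, 40.0, 60.0))
y = np.sin(2.0 * np.pi * f_true * t) + rng.normal(0.0, 0.1, t.size)

grid = np.arange(5.0, 100.5, 0.5)   # assumed candidate frequencies in Hz

# One 200-point window inside each of the three segments.
peaks = []
for start in (0, 650, 1300):
    sl = slice(start, start + 200)
    peaks.append(grid[np.argmax(lomb_scargle(t[sl], y[sl], grid))])
```

Each window yields one spectrum, so time resolution is limited by the window length; a sequential estimator that updates at every sample avoids that trade-off.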
Simultaneously examine physiology and behavior for recognizing level of stress: up to 96% accurate, across 12 drivers. (Healey and Picard, ICPR 2000)
Driver Stress Demo (work w/Jen Healey, Yuan Qi, incorporating new spectral estimation technique for assessing heart rate variability)
Stress is evident for this person when:
• driving through city
• turning around at toll booth
• hearing siren
New algorithm: analysis of heart-rate variability via real-time spectrum estimation with missing and irregularly sampled data (Qi and Picard, 2001)
Goal: recognize stress in speech of driver, over cell phone headset.
Recognizing Affect in Speech: Stress
Data: Four drivers talking over cell phone (headset)
Problem: Associate stress with cognitive load of driving/verbal task:
• 2 speeds of driving (~60 kph, ~120 kph)
• 2 speeds of questioning (every 9 sec, every 4 sec)
Features: Daubechies-4 filterbank (21 bands), Teager Energy Operator features
Models: HMM, Auto-regressive HMM, Factorial HMM, Hidden-Markov Decision Tree, Support Vector Machine, Neural Network, Mixture of HMMs
Results: 96% training/62% testing on 4 categories of stress with Mixture of HMMs; highly speaker dependent, e.g. 89-100% training, 36-96% test
Fernandez & Picard, ISCA Workshop on Speech and Emotions, Belfast 2000
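The Teager Energy Operator mentioned in the feature list has a standard discrete form, ψ[x(n)] = x(n)² - x(n-1)·x(n+1). A minimal sketch follows; the function name and the test tone are hypothetical, and the Daubechies-4 filterbank stage that would precede it in the described pipeline is not reproduced here.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the first
    and last samples have no neighbor on one side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure tone A*cos(omega*n + phi) the operator is constant and
# equals A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
n = np.arange(200)
A, omega, phi = 2.0, 0.3, 0.7
psi = teager_energy(A * np.cos(omega * n + phi))
expected = A ** 2 * np.sin(omega) ** 2
```

Applied per sub-band, the operator gives an energy track that responds to changes in both amplitude and frequency.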
Extralinguistic Markers
Breaths
Pauses
F0
Intonation
Syllables
Tempo
Rhythmicality, …
Understanding the Structure of Spoken Language for Affect Modeling
Emotions give rise to changes that can be sensed
Distance sensing: face; voice; posture; gestures, movement, behavior
Up-close sensing: skin conductivity; pupillary dilation; respiration, heart rate, pulse; temperature; blood pressure
Internal sensing: hormones; neurotransmitters; …
Conclusions & Challenges
• Steady progress w/sensors, pattern rec
• Put the desires of the user first:
– more visible vs. less visible signals
– non-tethered, wearable, portable
– psychological comfort
– cognitive load/interruptions
• Still to combine w/additional context sensing & cognitive reasoning
Papers and projects/details:
http://www.media.mit.edu/affect
http://www.media.mit.edu/~yuanqi
• Machines that “have emotion”
• Emotion and consciousness
• Concerns
• Applications
• How to sense, recognize, build
• Modeling emotion
• Affective wearables