Feedback
Elisabetta Bevacqua, Dirk Heylen, Catherine Pelachaud, Isabella Poggi, Marc Schröder
Roddy + Ruth
Modeling
Listener system
• Listener’s modules: generator of the listener’s behaviours
• Input: video and audio data from the real world
• Player: 3D agent Greta
• Backchannel library: lexicon of backchannel signals
• Whiteboard Psyclone: communication protocol system
Reactive Listener module
• The reactive module generates the listener’s responses according to the speaker’s head movements
• To detect head movements we integrated Watson (Gratch et al.)
– At the moment Greta reacts with a head nod every time the speaker performs a nod or a shake
– In the future Greta will be able to react with different backchannel signals and/or copy the speaker’s head movement
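The current reactive rule can be sketched in a few lines. This is only an illustration: the event names and the function `reactive_backchannel` are hypothetical, and the slides do not describe the format in which Watson reports head movements.

```python
from typing import Optional

# Hypothetical labels for detected speaker head movements; the actual
# output format of the head tracker is not given on the slides.
TRIGGER_MOVEMENTS = {"nod", "shake"}


def reactive_backchannel(speaker_movement: str) -> Optional[str]:
    """Return the listener signal for a detected speaker head movement.

    Current behaviour as described on the slide: the agent answers every
    speaker nod or shake with a head nod, and does nothing otherwise.
    """
    if speaker_movement in TRIGGER_MOVEMENTS:
        return "nod"
    return None
```

A later version that picks different signals or copies the speaker’s movement would replace the constant `"nod"` with a lookup into the backchannel lexicon.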
Analysis
• Head movements
– tracking
– classification
Television
Data Annotation (SAL)
Cognitive Listener module
• We use the SAL Wizard of Oz to trigger deliberative listener behaviours for Greta
• Pre-calculated FAP files are selected according to the wizard’s decision
• The Player displays the selected FAP files
Backchannel lexicon
• We aim at building a listener ECA able to display backchannel signals according to its style of behaviour: assertive/not assertive, believing/not believing, interested/not interested and so on
• We need to define a set of backchannel signals that users are able to interpret and understand
• To define such a library of recognizable signals we performed a perceptive test
Perception/Feedback
• Samples* for subjects to judge
– questionnaires (semantic scales)
– ask to label things
• Facial Expressions
• Affect Bursts
Perceptive Test
• Perceptive test: find a mapping between signals and meanings
• Questions:
– is it possible to identify a signal (or a combination of signals) for each meaning?
– can a combination of signals alter the meaning attached to each single backchannel signal?
Subjects and material
• Sixty French subjects (mean age 20.1)
• Task: select meanings for facial expressions and head signals displayed by Greta
• 21 different signals
• 12 meanings:
• agree, disagree, accept, refuse, interested, not interested, believe, disbelieve, understand, don't understand, like, dislike
• As the list of meanings was too long, subjects were divided into two groups: group1 and group2
Signals
• Signals can be simple (containing just a single action) or complex (containing several actions):
nod                          tilt
smile                        tilt and frown
raise eyebrows               tilt and sad eyebrows
nod and smile                tilt and raise eyebrows
nod and raise eyebrows       tilt and gaze right down
shake                        gaze right down
frown                        eyes roll up
tension¹                     raise left eyebrow
shake and frown              sad eyebrows
frown and tension¹           eyes wide open
shake, frown and tension¹
¹ tension of the lips
Result
Columns 1 to 15 are the movies; counts are the non-empty cells of each row, left to right.

Label              Counts                        Total
accept             2 3 2 5 1                     13
agree              1 10 11 3                     25
angry              1                             1
astounded          1                             1
attentive          2                             2
believe            1 3 2 1                       7
bored              10 1 1 10 6 1                 29
compassionate      1 2                           3
considering        1                             1
disagree           1 2 1 1 1 3 1 11 8            29
disappointed       1 1                           2
disbelieve         5 1 9 4 1 3 1 2 8 1 1 5 2     43
disdain            1                             1
disgust            1                             1
dislike            2 2 1 2 2 1 5 3 2 1           21
distrust           1 1                           2
doubt              2 1 1                         4
encourage          1 2                           3
helpless           1                             1
interested         3 3 2 5                       13
like               2 1 2 5 1                     11
meaningless        1                             1
not interested     1 1                           2
oh no not again    1                             1
pity               1                             1
pondering          1                             1
refuse             2 1 1 4 1 4 7                 20
sad                1                             1
sorrow             1                             1
surprised          1 5 1 3                       10
thinking           1 1                           2
thoughtful         1                             1
uncertain          1                             1
understand         1 1 2 3                       7
unhappy            1                             1
worried            2                             2
not understand     6 1 5 2 2 7 3                 26

Labels per movie (1 to 15): 21 20 22 20 18 19 19 24 18 15 21 18 22 17 17
Average number of labels for each movie: 19.4
Average number of times each label was employed: 7.864865
Movie 3
disbelieve 9, surprise 5, interested 3, like 2, astounded 1, bored 1, disagree 1
Movie 11
disbelieve 8, not understand 5, dislike 3, dislike 1, dislike 1, dislike 1, dislike 1, dislike 1
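The two averages reported with the result table can be checked from its bottom row. The sketch below is plain arithmetic on the per-movie totals; the count of 37 distinct labels is read off the rows of the table, and the variable names are only illustrative.

```python
# Number of labels given for each of the 15 movies (bottom row of the table).
labels_per_movie = [21, 20, 22, 20, 18, 19, 19, 24, 18, 15, 21, 18, 22, 17, 17]

# Number of distinct labels (rows) in the result table.
n_distinct_labels = 37

total_labels = sum(labels_per_movie)

# Average number of labels for each movie.
avg_labels_per_movie = total_labels / len(labels_per_movie)

# Average number of times each label was employed.
avg_uses_per_label = total_labels / n_distinct_labels

print(total_labels, avg_labels_per_movie, round(avg_uses_per_label, 6))
```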
First question
• Q1: is it possible to identify a signal (or a combination of signals) for each meaning?
• agree and accept: the signal nod proved to be very significant. All signals containing nods were interpreted as signals of agreement and acceptance
• like: the signal smile conveys this meaning
• understand: this meaning can be conveyed through the combination of smile and raise eyebrows
Second question
• Hyp2: a combination of signals can alter the meaning of single backchannel signals.
• Tension alone and frown alone do not mean dislike, but their combination does
• To convey the meaning disbelieve, tilt and frown must be displayed together
• The signal frown means don’t understand, but when a shake is added the combination loses this meaning
• Tilt alone and gaze right down alone do not mean not interested, but their combination does
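One way to hold these findings in a backchannel lexicon is a table keyed by the exact set of actions, so that a combination is looked up as a unit rather than as the sum of its parts. The sketch below is an assumption about representation, not the project's actual lexicon format; `LEXICON` and `interpret` are hypothetical names, and only the mappings stated on the two slides above are included.

```python
from typing import Iterable, Set

# Exact set of displayed actions -> meanings reported in the perceptive test.
LEXICON = {
    frozenset({"nod"}): {"agree", "accept"},
    frozenset({"smile"}): {"like"},
    frozenset({"smile", "raise eyebrows"}): {"understand"},
    frozenset({"frown"}): {"don't understand"},
    frozenset({"frown", "tension"}): {"dislike"},
    frozenset({"tilt", "frown"}): {"disbelieve"},
    frozenset({"tilt", "gaze right down"}): {"not interested"},
}


def interpret(actions: Iterable[str]) -> Set[str]:
    """Return the meanings recorded for exactly this combination of actions.

    An empty set means the slides report no meaning for the combination;
    it does not mean the signal is meaningless to users.
    """
    return LEXICON.get(frozenset(actions), set())


# frown alone is read as "don't understand", but adding a shake removes
# that reading, and frown plus tension yields a meaning neither has alone.
print(interpret(["frown"]))
print(interpret(["shake", "frown"]))
print(interpret(["frown", "tension"]))
```

Keying on the whole set keeps the lookup from inheriting the meaning of a component signal, which is the behaviour Hyp2 describes.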
Experiment: Affect bursts as listener feedback
Research questions
1. Affect bursts used as listener feedback => same emotion?
2. How acceptable is such feedback?
(3. Difference between German and Dutch listeners?)
Method
• Stimuli
– select German affect bursts from Schröder (2003)
– embed into neutral German / Dutch speaker utterance: “Yeah, then I told myself, why don’t you try it <pause> and then I did it!”
– 10 emotions, 2 affect bursts each => 20 stimuli per language
• e.g. Dutch + admiration-wow
• e.g. Dutch + anger-growl
• …
• e.g. German + worry-ohoh
• e.g. German + startle-ah
• …
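The stimulus set is a full crossing of speaker language and affect burst, which a short sketch can enumerate. The burst names are taken from the axis labels of the results charts; the dictionary layout and variable names are illustrative assumptions.

```python
# Emotion -> the two affect bursts used for it (names as on the chart axes).
BURSTS = {
    "admiration": ["wow", "boah"],
    "threat": ["hey1", "hey2"],
    "disgust": ["buäh", "ih"],
    "elation": ["ja1", "ja2"],
    "boredom": ["yawn", "hmm"],
    "relief": ["sigh", "uff"],
    "startle": ["int. breath", "ah"],
    "worry": ["oje", "ohoh"],
    "contempt": ["pha", "tse"],
    "anger": ["growl1", "growl2"],
}

# Language of the neutral speaker utterance the burst is embedded into.
LANGUAGES = ["German", "Dutch"]

# One stimulus per (speaker language, emotion, burst) combination.
stimuli = [
    (language, emotion, burst)
    for language in LANGUAGES
    for emotion, bursts in BURSTS.items()
    for burst in bursts
]

print(len(stimuli))  # 10 emotions x 2 bursts x 2 languages
```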
Results
Emotion recognition
[Figure: bar chart of % correct (0 to 100) for each affect burst: admiration-wow, admiration-boah, threat-hey1, threat-hey2, disgust-buäh, disgust-ih, elation-ja1, elation-ja2, boredom-yawn, boredom-hmm, relief-sigh, relief-uff, startle-int. breath, startle-ah, worry-oje, worry-ohoh, contempt-pha, contempt-tse, anger-growl1, anger-growl2. Series: de isol., nl isol., de context, nl context]
Results
• maybe: social appropriateness
+ admiration, elation, relief, worry
- threat, startle, anger
Acceptability as feedback
[Figure: bar chart (axis 0 to 100, labelled % correct) for each affect burst: admiration-wow, admiration-boah, threat-hey1, threat-hey2, disgust-buäh, disgust-ih, elation-ja1, elation-ja2, boredom-yawn, boredom-hmm, relief-sigh, relief-uff, startle-int. breath, startle-ah, worry-oje, worry-ohoh, contempt-pha, contempt-tse, anger-growl1, anger-growl2. Series: de, nl]
Discussion
• “Acceptability” is very ambiguous
– general appropriateness in the context (intended)
– strange as reaction to speaker utterance
– technical aspects
• mismatch between sound quality
• timing of feedback
– social appropriateness: display rules
• social norms prescribed by one’s culture as to “who can show what emotion to whom, when” (Ekman, 1977)
Discussion (2)
• Tentative set of display rules for affect bursts
– display gratifying emotions (admiration);
– display empathy emotions (elation, worry, relief);
– do not display negative evaluation (disgust, contempt)
– do not display aggression (anger, threat)
• Can explain most observations
– but not high acceptability of boredom
Summary and Questions
• For some emotions, highly recognisable affect bursts were judged to fit well with the context
• Perception of emotional feedback may depend on:
– social acceptability (display rules);
– semantic/pragmatic interaction between speaker utterance and affect burst;
– timing of feedback;
– relation between speaker/listener;
– formality of the situation;
– …
Future work
• In the future we aim at:
– integrating perception of audio data
– defining a set of rules to decide when a reactive backchannel signal must be triggered and which signal Greta should display
– defining different styles to create a variety of agents (assertive/not assertive, interested/not interested, believing/not believing, and so on) and evaluating their impression on users
CONTEXT
first phase has ended successfully
move to the next…
and put the findings in context