Understanding Affective Interaction:
Emotion, Engagement, and Internet Videos

Shaowen Bardzell, Jeffrey Bardzell, and Tyler Pace
Indiana University
901 E 10th St, Bloomington, IN 47408 USA
Abstract
As interest in experience and affect in HCI continues
to grow, particularly with regard to social media and
Web 2.0 technologies, research on techniques for
evaluating user engagement is needed. This paper
presents a study of popular Internet videos involving a
mixed method approach to user engagement. Instruments
included physiological measures, emotional self-report
measures, and personally expressive techniques, such as
open-ended prose reviews. Using triangulation to
interpret the results, we describe relationships among
perceived emotion, experienced emotion, video
preference, and contextual factors.
1. Introduction
Situated in human-computer interaction’s (HCI)
decade-long preoccupation with the personal, the
embodied, and the subjective, as seen in the forms of
experience design [28] and emotional design [30],
affective computing has risen to prominence, both in its
own right,
and also in research areas related to it, including user
experience and engagement [29, 18, 24, 5, 13, 15, 10].
The movement of affective computing began with
Picard’s seminal work on the topic [32], envisioning the
next generation of computers to “have the ability to
recognize, express, and in some cases, ‘have emotions.’”
[32, p. 1]. This line of work was highly influenced by
artificial intelligence (AI) and neurology. Since then, the
affective computing research agenda has branched to
include emotional theory [11], the examination of
significance of emotion in HCI [20, 31], application
interfaces [4, 42, 36, 16, 17, 38], modeling and sensing
emotions [1, 25, 40], measuring and evaluating affect [12,
41, 2, 7, 23, 21, 22] and ethical issues [33, 34, 19].
To research these issues, we conducted a study on the
relationships between emotion and navigation among
Internet videos. Recent reports [27, 35] from the Pew
Internet & American Life Project state that 57% of
Internet users visit Internet video portals and 20% visit
those portals daily. Additionally, 57% of Internet video
viewers share videos with friends/colleagues and 75% of
Internet users receive and watch videos sent from
friends/colleagues. However, as popular as Internet video
may be, little is known about how viewers engage with
and respond to the emotion-laden content present in
Internet videos. Moreover, Internet videos are short and
require active participation from users to navigate among
the millions of options available to them.
The goal of this study was to develop a mixed method
approach [9] that would enable us to collect a range of
data that may shed light on the relationships between
emotion and user preferences and critical assessments of
viral videos. Our research approach, described in more
detail below, includes a range of physiological measures,
emotional self-report measures, and personally
expressive techniques such as open-ended prose reviews.
These, when combined with behavioral data regarding
the navigation choices and sequences during viewing,
provide a robust set of data about the role of emotion in
the navigation of Internet video.
2. Study Design
The study uses a four-tier approach that combines
physiological, behavioral, subjective, and self-reporting
data sources, as described below.
2.1. Participants
A total of twenty-one participants took part in this
phase of the experiment. They were recruited using a
number of electronic distribution lists. Based on exit
survey data collected during the study, 75% of
participants were between the ages of 18 and 34, and 71%
were male. Participants were familiar with Internet video
(80% of participants view over 30 videos per month),
primarily fulfilling the role of a spectator/viewer (67% do
not upload or create videos). For the study, 63% of
participants said they were “mostly” or “very”
comfortable talking about their emotions.
2.2. The Videos
A collection of 60 videos was selected from three video
websites of amateur social multimedia content based on
their popularity rankings. These video portals are
www.youtube.com, www.newgrounds.com (a Flash
animation portal), and www.albinoblacksheep.com
(another Flash portal). Videos with community awards
and top positions on “most popular/viewed/rated” lists
were selected as a means to ensure a similar level of
quality among the videos used in the study. All videos in
the study were vetted by their primary user community
and perceived to be among the best possible examples of
their respective media and genre.
Videos were further categorized into eight genres:
Action, Comedy, Documentary, Drama, Family, Horror,
Mashup and Romance. The idea to use genre categories
and this particular categorization scheme was derived
from the video portals themselves (they are especially
prominent on Newgrounds and Albino Black Sheep).
Where possible, videos in the study used the same genre
classification given to them on their respective video
portals. For videos without a prior genre classification
from their source website, genre was assigned by
consensus of the research team. Each video was only
permitted classification in a single genre.
2.3. Instrument
We designed and built a Flash application as our study
instrument (Figure 1), which contains the video
collection, a video navigation interface, and a video
player. It also features several emotional feedback
interfaces (described below), and the exit survey, all in
one stand-alone application.
A separate review application for the researchers was
also built so that we could observe and take notes on
participants' activities in real time.
2.4. Study Procedure
All participants wore a Zephyr BioHarness™ for the
duration of the study, which took approximately 90
minutes. The BioHarness™ uses smart fabric sensors and
wireless technology to capture and transmit users'
physiological signals in an unobtrusive manner. At the
beginning of the study, we benchmarked
participants’ physiological states by recording their
biophysical measures for 5 minutes while they sat
inactive in a controlled environment without any stimulus
from the study. The purpose of the benchmark is two-
fold. First, it gives participants’ physiological signals
time to return to a baseline state after the activity of
traveling to the study, filling out consent paperwork and
equipping the BioHarness™. Second, the physiological
benchmark data is used to standardize the data collected
while the participant is exposed to study stimuli.
Figure 1. Participant views of the instrument include a video
navigation interface (left) and player (right)
In addition to the physiological signals benchmark, we
asked the participants to identify their present emotional
state by selecting up to three emotional descriptors from
a collection of 36, based on the affect categories
developed by Scherer [37] to obtain self-report
information on a wide range of felt emotions elicited by
a particular event (in the case of this study, viewing
Internet videos). Emotional descriptors developed by
Scherer can be further broken down into categories of
positive and negative valence. An equal number of
descriptors for negatively and positively valenced
emotions were used in the instrument. Emotions were
presented in alphabetical order so as not to bias one
valence category over the other in the instrument
interface.
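To make the tagging mechanics concrete, here is a minimal Python sketch of a balanced, alphabetized descriptor list with the up-to-three selection rule described above. The descriptor names are an illustrative subset, not the actual 36-item Scherer list, and the function is our own invention; the real instrument was built in Flash.

    # Illustrative subset of descriptors, balanced by valence; not the
    # actual 36-item Scherer list used in the study instrument.
    POSITIVE = ["amusement", "contentment", "joy", "pride"]   # valence +1
    NEGATIVE = ["anger", "disgust", "fear", "sadness"]        # valence -1

    assert len(POSITIVE) == len(NEGATIVE)      # equal counts per valence
    DESCRIPTORS = sorted(POSITIVE + NEGATIVE)  # alphabetical, to avoid ordering bias

    def validate_selection(tags):
        """Accept at most three descriptors, all drawn from the instrument list."""
        if not 1 <= len(tags) <= 3:
            raise ValueError("select between one and three descriptors")
        unknown = set(tags) - set(DESCRIPTORS)
        if unknown:
            raise ValueError("unknown descriptors: %s" % sorted(unknown))
        return tags

    validate_selection(["joy", "sadness"])  # a valid two-descriptor selection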
Participants were then asked to watch six videos of
their choosing from any combination of the 60 total
spread across the eight available genres. After watching
each video, participants were asked to complete three
different tasks (all built in and self-contained in the Flash-
based study instrument) with the objective of providing
different means for them to identify, express, and
interpret their emotions. The tasks, described in more
detail below, involved selecting up to three emotional
descriptors and an arousal level for each, rating each
video with a star rating, and writing a semi-structured
prose review.
The same procedures were repeated for each of the 6
videos. At the end of the study, we gave the participants
an exit survey, which provided demographic information
and helped us understand more about participants'
familiarity with
Internet video, their video selection criteria, and their
evaluation of the effectiveness of the different methods of
emotional expression used in the study.
2.5. Data Collection
The study instrument, combined with the
BioHarness™, enabled us to collect different traces of
participants’ emotions:
Physiological: The BioHarness™ provides
participants’ second-by-second heart rate (beats
per minute) data and breath rate data (breaths per
minute). These traces offer second-by-second
evidence of ongoing emotional response, as it
affects the body through the nervous system.
The use of heart rate and breath rate as measures
of emotion is increasingly common in HCI
research [1, 26].
Emotion Tagging: The study instrument
provides two mechanisms enabling participants
to express their responses to the videos. First,
users rate each video using a 1-5 scale, with 5
representing “most likely to share with someone
else.” Second, participants were asked to select
up to 3 out of 36 emotional descriptors (again,
from Scherer [37]) to describe the emotional
dimensions of the video they watch. They were
also asked to state the intensity level of their
emotional responses on a 1-5 scale, ranging
from “Hardly (1)” to “Very Much (5)” with the
middle points (2-4) left undefined to provide
more room for participant interpretation of their
emotional intensity. These two mechanisms
provide participants quick and easy ways to
communicate their preferences and emotional
states, and they also lend themselves to
quantitative analysis.
Emotion Prose: The study instrument also provides
a space to write semi-structured prose reviews of
each video as a way to support participant
interpretive expression of their emotional state
and reactions. Primarily, the prose reviews were
intended to offer a space in which participants
could reveal their sense-making of the videos
and their own emotions. The expressive and
indeterminate nature of the prose reviews does
not present a one-to-one mapping of a set of
discrete emotions of our participants, but rather
an active exploration of their emotional
experiences.
Behavioral: The study instrument also captures
participants’ emotional engagement data during
the study by documenting their interactions with
the videos (e.g., browsing, navigating). We
recorded participants’ video genres selections
and the sequences in which these occurred.
Combined, these four categories of affective data reflect
different aspects of emotion and offer a broad research
basis to analyze and interpret the relationship between
emotion and critical assessment of viral videos.
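As a sketch of how these four traces might be bundled for analysis, consider a per-viewing record such as the following; the Python structure and field names are our own illustration, not the schema of the actual study instrument.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class Viewing:
        """One participant watching one video; field names are illustrative."""
        participant_id: str
        video_id: str
        genre: str                  # behavioral: genre of the chosen video
        sequence_index: int         # behavioral: position among the six viewings
        heart_rate: List[int] = field(default_factory=list)   # physiological: bpm, per second
        breath_rate: List[int] = field(default_factory=list)  # physiological: breaths/min, per second
        star_rating: int = 0        # tagging: 1-5 "likely to share" scale
        emotion_tags: List[Tuple[str, int]] = field(default_factory=list)  # (descriptor, intensity 1-5)
        prose_review: str = ""      # semi-structured prose review text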
2.6. Limitations of the Study
As with all studies, there are a number of limitations to
consider before presenting our findings. First, the study
was conducted in a controlled lab environment.
Emotional reactions in a more natural environment will
likely differ from those in the lab environment; however,
the use of a lab environment permits us to achieve the
primary goal of this study, which is to inform interactional
notions of affect with informational methods.
Informational methods for measuring affect (e.g., heart
rate) are largely dependent on lab settings to ensure some
level of reliability in the collected data. Second, we
recognize the potential conflict in using a lab
environment for the study of affect. A lab environment
assumes that participants enter with a “blank” emotional
state to be intentionally modified and studied during their
time in the study. The interactional model of affect
counters the lab assumption, noting that participants enter
a study with rich and varied emotional states. In terms of
our study, we balanced these assumptions by proceeding
with the necessary benchmarking of physiological
(“blank” slate) states while also including room for
participants to express their beginning emotional state
(via the emotional tags) before being exposed to study
stimuli and after (via the prose reviews).
3. Findings
As we have described, this study has involved the
collection of a sizable quantity of data, from
physiological traces and emotional tagging to 126 prose
reviews composed by participants. Because these data
come from such diverse sources, it is not surprising that
they often seemed to suggest different interpretations.
3.1. Triangulating Conflicting Data Sources
A central problem in any study of emotional
interaction with cultural artifacts is that any given method
commits to (or at least privileges) a particular position on
the relationships among artifacts, emotions, the body,
hermeneutics, and culture. For example, measuring
physiological traces seems to privilege a stimulus-
response model—i.e., that the artifact is a stimulus that
causes a physiological response. This model seems to
entail a conception of the cultural artifact as fixed and
stable, and it treats its reception by the viewer as a
straightforward response. In contrast, the greater part of
cultural theory in the twentieth century theorized the
viewer or reader as constructing or even performing the
artifact, going so far as to suggest that an artifact only
exists when it is being viewed, with the radical
implication that an artifact never exists the same way
twice (e.g., [3], [39], [14]).
Such rival accounts have important consequences. The
stimulus-response model underlies research that seeks to
understand the physiology of media response, which is
important not only in HCI, but also in, for example,
advertising. The cultural theory model offers a rich
vocabulary for an analysis of reception, including the
phenomenology of individual reading/viewing, the
hermeneutics of intersubjective reading/viewing habits
among social groups, and more broadly the role of
culture. Together, the models offer different
perspectives into the same frustratingly elusive problem
space: the relationship between a given cultural artifact
and the intersubjective responses to it by its audience.
Our strategy for dealing with all of this data—and the
many conflicts within it—is triangulation. We looked for
instances where multiple sources seemed to be pointing
in the same direction (e.g., a user’s heart rate was low, her
rating for the video was low, and she wrote a review that
characterized the video as “boring”).
3.2. Recognizing Emotion in Individual Videos
In this section, we summarize a series of related
findings about participants’ responses to individual
videos. Much of the data was analyzed in relation to the
rating users gave the videos (1-5 stars). We extracted all
the prose reviews (n=126) and organized them by star
rating (e.g., all the 1-star reviews, all the 2-star reviews, etc.). In
the next section, we analyze these reviews side-by-side
with the quantitative measures, but first we offer some
insights into how participants talked about emotion in
their prose reviews.
3.2.1 Participants are Skilled at Identifying
Individual Emotions
The prose reviews provided abundant evidence that
viewers are able to identify discrete emotions. One
reviewer writes,
Shii [the protagonist] was feeling sadness, longing,
needing, lonliness…. The strange cat … caused me to
feel confusion…. I was sad that Shii lived on the streets
in a box, but the flowers were pretty and made the box
and street feel homey and warm, which made me feel
warm. I fully empathized with Shii….
This viewer, and this is typical of many of our
participants, expresses a range of emotions and is able to
distinguish between those of the protagonist in the video
and those experienced by her/himself.
3.2.2 Videos are Emotionally Complex
Related to the first finding is the fact that the videos
were experienced as emotionally complex. Participants
saw wide ranges of emotions in the videos and in
themselves, which undercuts the common perception that
Internet videos are simplistic and merely silly. Our
participants saw rage, frustration, pain, touching
empathy, hilarity, shock, and wonder, often in diverse
combinations within a single video.
In addition to identifying complexity within a video,
participants also identified a different kind of emotional
complexity, one that occurs in the juxtaposition of two
different sets of emotions: one in the video and the other
in themselves:
It was cute and endearing how the character was
doing his best to put into words how much he loved his
girlfriend, and of course all he could think of are
cheesy lines…. His frustration made me feel amused
because he was trying so hard I couldn’t help but
chuckle. When he was excited and sang … that was
just flat out hysterical. Overall it made me feel warm
and fuzzy inside.
Here, the participant lays out both the emotions of the
protagonist and her or his own emotions, and also
establishes their relationship: one of ironic amusement.
The more distressed the protagonist becomes, the more
the reviewer finds the video funny—and yet the viewer
simultaneously was touched by the sincerity of the
protagonist’s emotion.
3.3. Characteristics of High- Versus Low-Rated
Videos
Sorting the reviews by star rating threw into relief the
different ways people write about videos they like versus
ones they do not like. Different critical criteria emerge at
the different ends of the scale, and these shed light on two
issues of interest to the study: (a) whether emotional
characteristics relate to the overall rating and (b) the
extent to which factors external to the content of a video
are used as criteria in their judgment. Evidence from
across our data sources sheds light on both these
questions.
3.3.1 Emotional Characteristics are Key Critical
Criteria in Evaluating Internet Videos
The prose reviews alone make it clear that viewers
have no trouble talking about videos in relation to the
emotions depicted in and caused by them. A separate
matter, however, is the extent to which emotions entered
into the critical aspects of viewership. Several sources of
data suggest that viewer perception of emotions is central
to their critical judgments, that is, whether they liked a
given video. To determine this, we tested for significance
and correlations between our quantitative measures and
user ratings (i.e., the 1-5 star ratings users assigned to
individual videos), as well as among the measures
themselves. For the qualitative measures—primarily the
prose reviews—we divided them by rating, as described
above, and analyzed them to identify patterns that
emerged based on rating.
For our physiological traces, heart and breath rates
were collected second-by-second during the study, and
then each set was averaged, giving us an average heart
rate and an average breath rate for each viewing (21
subjects x 6 viewings each = 126 average heart rate scores
and average breath rate scores). Each average
physiological score was then standardized via a z-score,
with each participant's benchmark data used to compute
the population mean and standard deviation. The
resulting numbers, what we call HR Score (heart rate) and
R Score (breath rate), represent single dimensionless
values of how far above or below a participant’s
benchmark average their physiological measures were
while exposed to a particular video. Standardization is
critical in order to compare data across many subjects
whose individual physiological traces vary dramatically
(e.g., average benchmark heart rates in this study ranged
from 71 to 102 beats per minute).
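The standardization just described amounts to an ordinary z-score against each participant's own benchmark. A minimal sketch, assuming per-second heart rate lists and using only the Python standard library (the function name is ours):

    from statistics import mean, pstdev

    def hr_score(viewing_hr, benchmark_hr):
        """A viewing's average heart rate as a z-score against the
        participant's five-minute benchmark; the same procedure applied
        to breath rate yields the R Score."""
        mu = mean(benchmark_hr)       # benchmark mean as the population mean
        sigma = pstdev(benchmark_hr)  # benchmark SD as the population SD
        return (mean(viewing_hr) - mu) / sigma

    # A participant benchmarked near 80 bpm watching an arousing video
    # yields a positive, dimensionless HR Score:
    print(hr_score([88, 91, 90, 89], [79, 80, 81, 80, 80]))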
A single factor analysis of variance (ANOVA)
revealed a significant effect of video rating on HR
Score [F(4, 121) = 3.537, p = .009] (Figure 2).
Post hoc Bonferroni tests reveal that heart rates were
significantly higher for 5 star reviews than for 1 star
reviews [p = .003]. Additionally, a series of Pearson’s
correlations revealed that HR scores were weakly but
positively associated with user ratings [r(126) = .275, p <
.001]. R scores (breath rate) were not significantly
correlated with user ratings (or, for that matter, any of the
other measures). Research in both HCI and the field of
psychophysiology has identified that heightened heart
rate corresponds to emotional arousal [1, 8, 26]; our
study further supports that claim, as well as our
hypothesis that emotional response is tied to video
preference.
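The statistics reported in this section could be reproduced along the following lines. This is a sketch of the test choices (single-factor ANOVA, Bonferroni-adjusted pairwise comparisons, Pearson correlation) using SciPy, not the authors' actual analysis code, and the data layout is an assumption for illustration.

    from itertools import combinations
    from scipy import stats

    def analyze(hr_by_rating):
        """hr_by_rating: dict mapping star rating (1-5) to a list of HR Scores."""
        ratings = sorted(hr_by_rating)
        groups = [hr_by_rating[r] for r in ratings]

        # Single-factor ANOVA: does HR Score differ across rating levels?
        f, p = stats.f_oneway(*groups)
        print("ANOVA: F = %.3f, p = %.3f" % (f, p))

        # Post hoc pairwise t-tests with a Bonferroni correction.
        pairs = list(combinations(ratings, 2))
        for a, b in pairs:
            _, p_raw = stats.ttest_ind(hr_by_rating[a], hr_by_rating[b])
            print("%d vs %d stars: p = %.3f (Bonferroni)"
                  % (a, b, min(p_raw * len(pairs), 1.0)))

        # Pearson correlation of HR Score with rating across all viewings.
        xs = [r for r in ratings for _ in hr_by_rating[r]]
        ys = [s for r in ratings for s in hr_by_rating[r]]
        r_val, p_val = stats.pearsonr(xs, ys)
        print("Pearson: r = %.3f, p = %.3f" % (r_val, p_val))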
In the emotional tagging activity, subjects provided
emotional descriptors and stated the arousal level for each
video. To analyze them quantitatively, we used the 36
emotional categories identified by Scherer [37] from
which they were derived to identify the valence of each
emotion (i.e., whether it is generally positive or negative).
Positive emotions were assigned a score of 1, while
negative emotions were assigned a score of -1.
Additionally, each emotion was assigned an intensity
score from 1-5 based on the level reported by the
participant. Finally, an
overall emotional score was calculated by multiplying the
valence score by the intensity score. Participants were
able to use up to three emotion descriptors/arousal levels
per video, so up to three emotional descriptor scores
were calculated per participant per video. We studied these
scores both individually and as an average of the up to
three scores assigned by a given reviewer to a given
video.
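The scoring just described reduces to valence multiplied by intensity, averaged over a viewing's tags. A minimal sketch, with a hypothetical valence lookup standing in for the full Scherer-derived mapping:

    # Tiny illustrative valence table; the study used all 36 Scherer categories.
    VALENCE = {"joy": 1, "pride": 1, "anger": -1, "sadness": -1}

    def descriptor_score(tags):
        """tags: up to three (descriptor, intensity 1-5) pairs for one video.
        Returns the average of valence * intensity over the given tags."""
        scores = [VALENCE[name] * intensity for name, intensity in tags]
        return sum(scores) / len(scores)

    # Strong joy plus mild sadness nets out positive: (5 - 2) / 2 = 1.5
    print(descriptor_score([("joy", 5), ("sadness", 2)]))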
Figure 2: ANOVA graph of HR Score by video rating.
A single factor ANOVA revealed a significant effect
of video rating on descriptor scores (the cumulative
score derived from a participant's descriptor selections)
[F(4, 121) = 40.139, p < .001] (Figure 3). Post hoc
Bonferroni tests reveal that emotion tag scores differed
significantly between all pairs of video ratings (p < .05).
Further tests of the constituent components of the
descriptor scores, intensity and valence, reveal additional
insights into the relationship between descriptor scores
and user ratings. Additionally, we observed positive
correlations between descriptor scores and user ratings
[r(126) = .751, p < .001], as well as between descriptor
scores and HR scores [r(126) = .263, p < .001].
The emotional tagging activity offers strong evidence
of a relationship between users’ perceived emotional
responses and overall preferences for videos.
It could be objected that the emotional tagging activity
is an overly formalized measurement technique for a
phenomenon as subtle as emotion. Indeed, though [6] did
not take specific aim at Scherer’s affect categories upon
which this aspect of our study was based, their critique of
quantitative approaches would appear to apply here. Yet
the more culturally situated aspects of our study—in
particular, the open-ended prose reviews—offer evidence
to support the hypothesis as well.
Figure 3: ANOVA graph of descriptor score by video rating.
While descriptions of emotion appear in prose reviews
of high-, medium-, and low-rated videos, the
characterization of emotion nonetheless varies by rating.
In perhaps the simplest example, videos that are
experienced in an emotionally negative way also ended
up with low reviews, as in the following example:
I saw the emotions of anger, excitement, happiness,
greed, and hate. The emotion of “anger” upset me
because it was acted out in a violent way…. [T]he two
characters physically faught over the snorkel—biting
ears, riping out body organs, etc. It was quite revolting
and inappropriate.
While the content of the video showed a range of
emotions, the viewer experienced a much simpler one:
disgust. She or he also gave this video a 1-star review (on
a 1-5 scale).
Another common characteristic of low-rated videos is
that they left the viewer feeling emotionally unresolved,
as the following three examples show, all excerpted from
the final lines of low-rated reviews:
That made me laugh, but it’s true, and probably
made me feel a bit defensive and aggressive myself.
Their were feeling of sadness. This sadness was
caused by [the antagonist]. Also the [antagonist] made
the character enraged. This cartoon made me smile and
left me a little confused.
I couldn’t tell if this [satiric video on Super Mario
and communism] was supposed to be glorifying
communism or condemning it…. I felt curious about
what a lot of the symbolism was supposed to represent.
Each of these review excerpts reveals a viewer who fails
to achieve emotional resolution and the closure that
comes with it.
Regardless of star rating, reviewers articulated a range
of emotions found in the videos. The presence or absence
of certain emotions, or a certain number of emotions,
does not seem to coincide with particularly high or low
ratings. Instead, it appears to be the arrangement of
emotions that viewers respond to. Videos that have a
range of emotions, and some sort of satisfying resolution,
were associated with much higher reviews, as in the
following examples.
I felt discust at first because [certain imagery] was
really … too much for me. And then I felt tension…. I
then felt sad when the girl was caught by the moster.
However, the girl’s bear was a secret bomb! (which
really surprised me) All in all, I felt interested in this
video because it’s a good one, which is enthusiastic to
me and at the end … I laughed somehow. I will want
more!
In the video the characters displayed anxiety,
tension, fear, and anger. I held some anxiety for the
child character – cant help it I am a mom – and was
surprised and amused by the ending.
I was … touched by this video…. I felt … unsure
and anxious while I heard/read the lyrics…. Sadness
feeling comes from the unsureness and anxiety. At last,
I felt great and happy because there is still hope there!
Each of these high-rated videos contains an emotional
arc, which includes negative emotions that are resolved
into something positive by the end. It is the arc, not the
presence or absence of certain emotions, that participants
appear to appreciate.
Much of our data—physiological, affective tagging,
prose reviews, and rating—lent itself to the triangulation
approach. Emotions—both as depicted in the videos and
as experienced by the viewers—were an important
consideration in the enjoyment and rating of the videos.
Quantitative data, typically associated with the
information processing model, and qualitative data,
usually associated with the culturally embedded model of
emotion, all seemed to confirm the hypothesis that
emotional engagement is a major critical criterion when
viewing Internet videos.
3.3.2 Characteristics External to the Content of the
Videos are Important Critical Criteria
One of the critiques made in [6] of Picard’s and others’
information processing model of emotion is that it relies
on a conduit metaphor, in which emotion is understood to
be transmitted in discrete bits from one place (e.g., an
information system) to another (e.g., the user). They
argue instead that emotion arises through interaction, in
which understanding, interpretation, and emotional
response are mutually constructed. Whereas one
implication of the conduit metaphor of the information
processing model is that the content of the stimulus (in
this case, video) should drive the response, an alternative,
more culturally embedded model would instead
emphasize the sources of the experience of the video that
users bring to it.
The quantitative measures we collected—especially
the physiological measures and the emotional tagging—
may give insight into emotional phenomena attendant to
the viewing of videos, but they do not tell us about the
origins of those emotional experiences. That is, they do
not help us determine the extent to which the felt
emotions are “in” the videos themselves, as opposed to
arising as a part of the culturally mediated experience and
interpretation of the videos.
The user reviews, however, do offer evidence
supporting the idea that the experience of a video is
substantially shaped by factors external to the content of
a given video. This evidence does not support the
converse, which is that the content of the video does not
matter. Instead, it simply adds nuance to the relationship
between the content and experience of the video.
Expressions of empathy—here defined as statements
in which people relate something seen in a video to
something in their own lives and derive significance
from that analogy—were tied to the highest rated videos
in the study, as in the following quote:
The graphics do nothing for me. I do not really like
animated music videos. However, not only do I like the
Beatles, but the song reminded me about how I feel
about my fiance. I feel fine with him being that I am
happy but I also feel safe…. I am likely to share this
song with my partner.
This review, which accompanied a 5-star rating, clearly
shows that the viewer responded favorably to the video
despite liking neither its visuals nor its genre. Not only
did the viewer commit to liking the video, but she even
expressed a desire to follow up with a behavior of key
interest to user engagement researchers: she wants to
share it with someone else.
Another problem with the conduit model of emotion is
that it offers little insight into the significance of
microcommunities, inside jokes, and private visual
languages. When viewing a popular animutation video, a
genre of Internet video with a distinctive animation style
and repertoire of standard jokes, one reviewer writes,
I was quite disappointed and disgusted by the
video—poor animation, nonsense sing-a-long words at
the bottom of the screen, they couldn’t even spell porn
right, and overall I thought it was a waste of my time.
Every one of the specific features this reviewer
criticizes—the animation, the silly sing-a-long, the “leet
speak” spelling of porn as “pr0n”—is a hallmark feature
of the genre; it is hard to imagine a more comprehensive
misunderstanding. Thus, where members of that video’s
community delight in its wit, it left this reviewer
“disgusted.”
Implied in the previous example’s critique is a
normative notion that animation should meet certain
production, comprehensibility, and typographical
standards. Other normative notions turned up in many of
the reviews. One way that they appeared to help a video
was in situations where videos appear to have high
cultural capital—because they are conspicuously
educational or artistic. Thus, an expressionist art video
created by a film student generally received high ratings,
though the reviews, emotional tagging scores, and
physiological measures would otherwise have predicted
a lower rating. Similar anomalies occurred with a video
about racism and another about new media textual
practices.
Normative notions are not merely the stuff of abstract
aesthetic theorizing; they also shape expectations, e.g., of
genre, title, even the video’s poster frame. The most
commonly stated reason videos were given 1-star ratings
was that they violated the viewer’s expectations:
From the description, I expected the movie to be
more positive.
I had expected to see a nice video with Beetles
music.
I didn’t find the video funny as I was supposed to.
The ending was violent and surprised me.
The name of the video clip “Secrests of MySpace”
is I think used to just lure the viewwers into thinking
something else. There was just one mention of the
social netwrking website and thats it.
In all these cases, viewers are not complaining about the
content per se, but rather that it was other than what they
expected.
Finally, a few readers expressed selective viewing
habits within certain videos; by that we mean that they
perceived that they were supposed to watch a video in a
certain way, but rejecting that, they imposed their own
personal tastes on the video, as in the following example:
I was bored with the film so I concentrated on how
cute the puppy was no matter how pointless the film
was.
Similarly, several others who expressed dislike for
animated music videos said they simply ignored the
visuals and just enjoyed the music. Interestingly, these
selectively enjoyed videos got middle-of-the-road ratings
(typically 3 stars).
Each of these examples reveals ways that factors
external to the content of the videos, which are brought to
bear by the viewers themselves, substantially and
sometimes even dominantly shape the emotional
experience of the video.
That said, we do not believe that these examples
displace or upend the significance of information
processing approaches to emotion; rather, they help us
interpret what is going on. As noted earlier, ratings, heart
rate, and emotional tagging scores were all significantly
positively correlated. The reviews help shed light onto
what is going on behind those measures.
4. Conclusion
In this study, a first attempt at triangulating the various
data points collected yielded promising results. For both
of the major findings in this study (i.e., that emotion is
implicated in video preferences, and that factors external
to videos, including emotional arcs, affect the
experience), we found that all of the data we collected,
insofar as it told any story at all (some of our sources,
such as breath rate, failed to yield significant findings),
told the same story.
Users were responsive to the emotional complexity of
videos and it weighed into their critical evaluations. Heart
rate, video ratings, and emotional descriptor scores all
increased for videos that matched the emotional
complexity sought by participants (as evidenced by a
close reading of participants' prose reviews).
Ultimately, this study is suggestive of ways that user
experience researchers and designers can think about
designing their research and data analysis methodologies.
It is both possible and fruitful to pursue mixed method
approaches when dealing with phenomena as complex
and subjective as emotion and experience.
5. References
1. Anttonen, J. and Surakka, V. Emotions and heart rate while
sitting on a chair. Proc. of CHI’05. ACM Press (2005),
491-499.
2. Axelrod, L. and Hone, K. Affectemes and allaffects: A
novel approach to coding user emotional expression during
interactive experiences. Behaviour & Information
Technology 25, 3, Taylor & Francis (2006), 159-173.
3. Barthes, R. The death of the author. In Heath, S. (trans).
Image-Music-Text. Hill and Wang, New York (1977), 142-
148.
4. Bates, J. The role of emotion in believable agents.
Communications of the ACM 37, 7 (1994), 122-125.
5. Bernhaupt, R., Ijsselsteijn, W., Mueller, F., Tscheligi, M.,
Wixon, D. Evaluating user experiences in games. CHI’08
Extended Abstract, ACM Press (2008), 3905-3908.
6. Boehner, K., DePaula, R., Dourish, P., and Sengers, P.
Affect: From information to interaction. AARHUS’05,
ACM Press (2005), 59-67.
7. Boehner, K., DePaula, R., Dourish, P., and Sengers, P.
How emotion is made and measured. International Journal
of Human-Computer Studies 65, Elsevier Ltd (2007), 275-
291.
8. Cacioppo, J., Tassinary, L., and Berntson, G. Handbook of
Psychophysiology. Cambridge University Press. (2000).
9. Creswell, J. and Clark, V. Designing and Conducting
Mixed Methods Research. Sage Publications. (2007).
10. Csikszentmihalyi, M. Flow: The Psychology of Optimal
Experience. Harper Perennial. (1990).
11. DePaula, R., and Dourish, P. Cognitive and cultural views
of emotions. Proc. of the Human Computer Interaction
Consortium Winter Meeting. (2005).
12. Desmet, P. Measuring emotion: Development and
application of an instrument to measure emotional
responses to products. In Blythe, M., Overbeeke, K.,
Monk, A., and Wright, P. (eds.). Funology: From Usability
to Enjoyment. Kluwer Academic Publishers (2003).
13. Douglas, Y. and Hargadon, A. The pleasure principle:
Immersion, engagement, flow. Proc. of the Eleventh ACM
Conference on Hypertext and Hypermedia
(HYPERTEXT'00), ACM Press (2000), 153-160.
14. Eco, U. The Role of the Reader: Explorations in the
Semiotics of Texts. Indiana University Press, Bloomington,
Indiana, USA (1979/1984).
15. Gilleade, K. and Dix, A. Using frustration in the design of
adaptive videogames. Proc. of the 2004 ACM SIGCHI
International Conference on Advances in Computer
Entertainment Technology (ACE'04), ACM Press (2004).
16. Fagerberg, P., Ståhl, A., Höök, K. Designing gestures for
affective input: Analysis of shape, effort, and valence.
Proc. of MUM’03 (2003), 57-65.
17. Fagerberg, P., Ståhl, A., Höök, K. eMoto – Emotionally
engaging interaction. Journal of Personal and Ubiquitous
Computing 8, 5 (2004), 377-381.
18. Hassenzahl, M. and Tractinsky, N. User experience – A
research agenda. Behaviour & Information Technology 25,
2, Taylor & Francis (2006), 91-97.
19. Höök, K., Ståhl, A., Sundström, P., and Laaksolahti, J.
Interactional empowerment. Proc. of CHI’08. ACM Press
(2008), 647-656.
20. Hudlicka, E. To feel or not to feel: The role of affect in
human-computer interaction. International Journal of
Human-Computer Studies 59 (2003), 1-32.
21. Isbister, K. and Höök, K. Evaluating affective interfaces:
Innovative approaches. CHI’05 Extended Abstract, ACM
Press (2005). 2119.
22. Isbister, K., Höök, K., Sharp, M., and Laaksolahti, J. The
sensual evaluation instrument: Developing an affective
evaluation tool. Proc. of CHI’06, ACM Press (2006),
1163-1172.
23. Isomursu, M., Tähti, M., Väinämö, S., and Kuutti, K.
Experimental evaluation of five methods for collecting
emotions in field settings with mobile applications.
International Journal of Human-Computer Studies 65,
Elsevier Ltd (2007), 404-418.
24. Law, E., Roto, V., Vermeeren, A., Kort, J., Hassenzahl, M.
Towards a shared definition of user experience. CHI’08
Extended Abstract, ACM Press (2008), 2395-2398.
25. Liao, W., Zhang, W., Zhu, Z., Ji, Q., and Gray, W. Toward
a decision-theoretic framework for affect recognition and
user assistance. International Journal of Human-Computer
Studies 65 (2006), 847-873.
26. Mahlke, S., Minge, M., and Thüring, M. Measuring
multiple components of emotions in interactive contexts.
Extended Abstracts of CHI’06, ACM Press (2006), 1061-
1066.
27. Madden, M. Online Video. Pew Internet and American
Life Project. (2007).
http://www.pewinternet.org/PPF/r/219/report_display.asp
28. McCarthy, J. and Wright, P. Technology as Experience.
The MIT Press (2004).
29. McNamara, N., and Kirakowski, J. Functionality, usability,
and user experience: Three areas of concern. Interactions
13, 6, ACM Press (2006), 26-28.
30. Norman, D. Emotional Design: Why We Love (or Hate)
Everyday Things. Basic Books, New York, NY, USA,
2004.
31. Peter, C., Beale, R., Crane, E., Axelrod, L., and Blyth, G.
Emotion in HCI. Joint Proc. of British HCI Group Annual
Conference (2005, 2006, 2007).
32. Picard, R. Affective Computing. MIT Press, Cambridge,
MA, USA, 1997.
33. Picard, R. and Klein, J. Computers that recognize and
respond to user emotion: Theoretical and practical
implications. Interacting with Computers, 14, 2 (2002),
141-169.
34. Picard, R. Affective computing: Challenges. International
Journal of Human-Computer Studies 59, 1-2 (2003), 55-
64.
35. Rainie, L. Increased Use of Video-sharing Sites. Pew
Internet and American Life Project. (2008).
http://www.pewinternet.org/PPF/r/232/report_display.asp
36. Sengers, P., Liesendahl, R., Magar, W., Seibert, C.,
Müller, B., Joachims, T., Geng, W., Mårtensson, P., and
Höök, K. The enigmatics of affect. Proc. of DIS'02. ACM
Press (2002), 87-98.
37. Scherer, K. What are emotions? And how can they be
measured? Social Science Information 44, 4 (2005), 695-
729.
38. Ståhl, A., Sundström, P., and Höök, K. A foundation for
emotional expressivity. Proc. of DUX'05. AIGA (2005).
39. Tompkins, J. (ed.). Reader-Response Criticism: From
Formalism to Post-Structuralism. The Johns Hopkins
University Press, Baltimore, USA (1980).
40. Zeng, Z., Pantic, M., Roisman, G., and Huang, T. A survey
of affect recognition methods: Audio, visual, and
spontaneous expressions. IEEE Transactions on Pattern
Analysis and Machine Intelligence (2007), 1-20.
41. Ward, R. An analysis of facial movement tracking in
ordinary human-computer interaction. Interacting with
Computers 16, 5 (2004), 879-896.
42. Wensveen, S.A.G., Overbeeke, C.J., and Djajadiningrat,
J.P. Touch me, hit me, and I know how you feel: A design
approach to emotionally rich interaction. Proc. of DIS'00,
ACM Press (2000), 48-52.