Understanding Affective Interaction:
Emotion, Engagement, and Internet Videos

Shaowen Bardzell, Jeffrey Bardzell, and Tyler Pace
Indiana University
901 E 10th St, Bloomington, IN 47408 USA
Abstract
As interest in experience and affect in HCI continues
to grow, particularly with regard to social media and
Web 2.0 technologies, research on techniques for
evaluating user engagement is needed. This paper
presents a study of popular Internet videos involving a
mixed method approach to user engagement. Instruments
included physiological measures, emotional self-report
measures, and personally expressive techniques, such as
open-ended prose reviews. Using triangulation to
interpret the results, we describe relationships among
perceived emotion, experienced emotion, video
preference, and contextual factors.
1. Introduction
Situated in human-computer interaction’s (HCI)
decade-long preoccupation with the personal, the
embodied, and the subjective, as seen in the forms of
experience design [28] and emotional design [30],
affective computing has risen to prominence, both in its
own right,
and also in research areas related to it, including user
experience and engagement [29, 18, 24, 5, 13, 15, 10].
The movement of affective computing began with
Picard’s seminal work on the topic [32], envisioning the
next generation of computers to “have the ability to
recognize, express, and in some cases, ‘have emotions.’”
[32, p. 1]. This line of work was highly influenced by
artificial intelligence (AI) and neurology. Since then, the
affective computing research agenda has branched to
include emotional theory [11], the examination of
significance of emotion in HCI [20, 31], application
interfaces [4, 42, 36, 16, 17, 38], modeling and sensing
emotions [1, 25, 40], measuring and evaluating affect [12,
41, 2, 7, 23, 21, 22] and ethical issues [33, 34, 19].
To research these issues, we conducted a study on the
relationships between emotion and navigation among
Internet videos. Recent reports [27, 35] from the Pew
Internet & American Life Project state that 57% of
Internet users visit Internet video portals and 20% visit
those portals daily. Additionally, 57% of Internet video
viewers share videos with friends/colleagues and 75% of
Internet users receive and watch videos sent from
friends/colleagues. However, as popular as Internet video
may be, little is known about how viewers engage with
and respond to the emotion-laden content present in
Internet videos. Moreover, Internet videos are short and
require active participation from users to navigate among
the millions of options available to them.
The goal of this study was to develop a mixed method
approach [9] that would enable us to collect a range of
data that may shed light on the relationships between
emotion and user preferences and critical assessments of
viral videos. Our research approach, described in more
detail below, includes a range of physiological measures,
emotional self-report measures, and personally
expressive techniques such as open-ended prose reviews.
These, when combined with behavioral data regarding
the navigation choices and sequences during viewing,
provide a robust set of data about the role of emotion in
the navigation of Internet video.
2. Study Design
The study uses a four-tier approach that combines
physiological, behavioral, subjective, and self-reporting
data sources, as described below.
2.1. Participants
A total of twenty-one participants took part in this
phase of the experiment. They were recruited using a
number of electronic distribution lists. Based on exit
survey data collected during the study, 75% of
participants were between the ages of 18 and 34, and 71%
were male. Participants were familiar with Internet video
(80% of participants view over 30 videos per month),
primarily fulfilling the role of a spectator/viewer (67% do
not upload or create videos). For the study, 63% of
participants said they were “mostly” or “very”
comfortable talking about their emotions.
2.2. The Videos
A collection of 60 videos was selected from three video
websites of amateur social multimedia content based on
their popularity rankings. These video portals are
www.youtube.com, www.newgrounds.com (a Flash
animation portal), and www.albinoblacksheep.com
(another Flash portal). Videos with community awards
and top positions on “most popular/viewed/rated” lists
were selected as a means to ensure a similar level of
quality among the videos used in the study. All videos in
the study were vetted by their primary user community
and perceived to be among the best possible examples of
their respective media and genre.
Videos were further categorized into eight genres:
Action, Comedy, Documentary, Drama, Family, Horror,
Mashup and Romance. The idea to use genre categories
and this particular categorization scheme was derived
from the video portals themselves (they are especially
prominent on Newgrounds and Albino Black Sheep).
Where possible, videos in the study used the same genre
classification given to them on their respective video
portals. For videos without a prior genre classification
from their source website, genre was assigned by
consensus of the research team. Each video was only
permitted classification in a single genre.
2.3. Instrument
We designed and built a Flash application as our study
instrument (Figure 1), which contains the video
collection, a video navigation interface, and a video
player. It also features several emotional feedback
interfaces (described below), and the exit survey, all in
one stand-alone application.
A separate review application for the researchers was
also built so that we could observe and take notes on
participants' activities in real time.
2.4. Study Procedure
All participants wore a Zephyr BioHarness™ for the
duration of the study, which took approximately 90
minutes. The BioHarness™ uses smart fabric sensors and
wireless technology to capture and transmit users'
physiological signals in an unobtrusive manner. At the
beginning of the study, we benchmarked
participants’ physiological states by recording their
biophysical measures for 5 minutes while they sat
inactive in a controlled environment without any stimulus
from the study. The purpose of the benchmark is two-
fold. First, it gives participants’ physiological signals
time to return to a baseline state after the activity of
traveling to the study, filling out consent paperwork and
equipping the BioHarness™. Second, the physiological
benchmark data is used to standardize the data collected
while the participant is exposed to study stimuli.
Figure 1. Participant views of the instrument include a video
navigation interface (left) and player (right)
In addition to the physiological signals benchmark, we
asked the participants to identify their present emotional
state by selecting up to three emotional descriptors from
a collection of 36, based on the affect categories
developed by Scherer [37] to obtain self-report
information on a wide range of felt emotions elicited by
a particular event (in the case of this study, viewing
Internet videos). Emotional descriptors developed by
Scherer can be further broken down into categories of
positive and negative valence. An equal number of
descriptors for negatively and positively valenced
emotions were used in the instrument. Emotions were
presented in alphabetical order so as not to bias one
valence category over the other in the instrument
interface.
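To make the tagging mechanics concrete, here is a minimal Python sketch of a balanced, alphabetized descriptor list with the up-to-three selection rule described above. The descriptor names are an illustrative subset, not the actual 36-item Scherer list, and the function is our own invention; the real instrument was built in Flash.

    # Illustrative subset of descriptors, balanced by valence; not the
    # actual 36-item Scherer list used in the study instrument.
    POSITIVE = ["amusement", "contentment", "joy", "pride"]   # valence +1
    NEGATIVE = ["anger", "disgust", "fear", "sadness"]        # valence -1

    assert len(POSITIVE) == len(NEGATIVE)      # equal counts per valence
    DESCRIPTORS = sorted(POSITIVE + NEGATIVE)  # alphabetical, to avoid ordering bias

    def validate_selection(tags):
        """Accept at most three descriptors, all drawn from the instrument list."""
        if not 1 <= len(tags) <= 3:
            raise ValueError("select between one and three descriptors")
        unknown = set(tags) - set(DESCRIPTORS)
        if unknown:
            raise ValueError("unknown descriptors: %s" % sorted(unknown))
        return tags

    validate_selection(["joy", "sadness"])  # a valid two-descriptor selection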
Participants were then asked to watch six videos of
their choosing from any combination of the 60 total
spread across the eight available genres. After watching
each video, participants were asked to complete three
different tasks (all built in and self-contained in the Flash-
based study instrument) with the objective of providing
different means for them to identify, express, and
interpret their emotions. The tasks, described in more
detail below, involved selecting up to three emotional
descriptors and an arousal level for each, rating each
video with a star rating, and writing a semi-structured
prose review.
The same procedures were repeated for each of the 6
videos. At the end of the study, we gave the participants
an exit survey, which provided demographic information
and helped us understand more about participants'
familiarity with
Internet video, their video selection criteria, and their
evaluation of the effectiveness of the different methods of
emotional expression used in the study.
2.5. Data Collection
The study instrument, combined with the
BioHarness™, enabled us to collect different traces of
participants’ emotions:
Physiological: The BioHarness™ provides
participants’ second-by-second heart rate (beats
per minute) data and breath rate data (breaths per
minute). These traces offer second-by-second
evidence of ongoing emotional response, as it
affects the body through the nervous system.
The use of heart rate and breath rate as measures
of emotion is increasingly common in HCI
research [1, 26].
Emotion Tagging: The study instrument
provides two mechanisms enabling participants
to express their responses to the videos. First,
users rate each video using a 1-5 scale, with 5
representing “most likely to share with someone
else.” Second, participants were asked to select
up to 3 out of 36 emotional descriptors (again,
from Scherer [37]) to describe the emotional
dimensions of the video they watch. They were
also asked to state the intensity level of their
emotional responses on a 1-5 scale, ranging
from “Hardly (1)” to “Very Much (5)” with the
middle points (2-4) left undefined to provide
more room for participant interpretation of their
emotional intensity. These two mechanisms
provide participants quick and easy ways to
communicate their preferences and emotional
states, and they also lend themselves to
quantitative analysis.
Emotion Prose: The study instrument also provides
a space to write semi-structured prose reviews of
each video as a way to support participant
interpretive expression of their emotional state
and reactions. Primarily, the prose reviews were
intended to offer a space in which participants
could reveal their sense-making of the videos
and their own emotions. The expressive and
indeterminate nature of the prose reviews does
not present a one-to-one mapping of a set of
discrete emotions of our participants, but rather
an active exploration of their emotional
experiences.
Behavioral: The study instrument also captures
participants’ emotional engagement data during
the study by documenting their interactions with
the videos (e.g., browsing, navigating). We
recorded participants’ video genres selections
and the sequences in which these occurred.
Combined, these four categories of affective data reflect
different aspects of emotion and offer a broad research
basis to analyze and interpret the relationship between
emotion and critical assessment of viral videos.
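As a sketch of how these four traces might be bundled for analysis, consider a per-viewing record such as the following; the Python structure and field names are our own illustration, not the schema of the actual study instrument.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class Viewing:
        """One participant watching one video; field names are illustrative."""
        participant_id: str
        video_id: str
        genre: str                  # behavioral: genre of the chosen video
        sequence_index: int         # behavioral: position among the six viewings
        heart_rate: List[int] = field(default_factory=list)   # physiological: bpm, per second
        breath_rate: List[int] = field(default_factory=list)  # physiological: breaths/min, per second
        star_rating: int = 0        # tagging: 1-5 "likely to share" scale
        emotion_tags: List[Tuple[str, int]] = field(default_factory=list)  # (descriptor, intensity 1-5)
        prose_review: str = ""      # semi-structured prose review text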
2.6. Limitations of the Study
As with all studies, there are a number of limitations to
consider before presenting our findings. First, the study
was conducted in a controlled lab environment.
Emotional reactions in a more natural environment will
likely differ from those in the lab environment; however,
the use of a lab environment permits us to achieve the
primary goal of this study, which is to inform interactional
notions of affect with informational methods.
Informational methods for measuring affect (e.g., heart
rate) are largely dependent on lab settings to ensure some
level of reliability in the collected data. Second, we
recognize the potential conflict in using a lab
environment for the study of affect. A lab environment
assumes that participants enter with a “blank” emotional
state to be intentionally modified and studied during their
time in the study. The interactional model of affect
counters the lab assumption, noting that participants enter
a study with rich and varied emotional states. In terms of
our study, we balanced these assumptions by proceeding
with the necessary benchmarking of physiological
(“blank” slate) states while also including room for
participants to express their beginning emotional state
(via the emotional tags) before being exposed to study
stimuli and after (via the prose reviews).
3. Findings
As we have described, this study has involved the
collection of a sizable quantity of data, from
physiological traces and emotional tagging to 126 prose
reviews composed by participants. Because these data
come from such diverse sources, it is not surprising that
they often seemed to suggest different interpretations.
3.1. Triangulating Conflicting Data Sources
A central problem in any study of emotional
interaction with cultural artifacts is that any given method
commits to (or at least privileges) a particular position on
the relationships among artifacts, emotions, the body,
hermeneutics, and culture. For example, measuring
physiological traces seems to privilege a stimulus-
response model—i.e., that the artifact is a stimulus that
causes a physiological response. This model seems to
entail a conception of the cultural artifact as fixed and
stable, and it treats its reception by the viewer as a
straightforward response. In contrast, the greater part of
cultural theory in the twentieth century theorized the
viewer or reader as constructing or even performing the
artifact, going so far as to suggest that an artifact only
exists when it is being viewed, with the radical
implication that an artifact never exists the same way
twice (e.g., [3], [39], [14]).
Such rival accounts have important consequences. The
stimulus-response model underlies research that seeks to
understand the physiology of media response, which is
important not only in HCI, but also in, for example,
advertising. The cultural theory model offers a rich
vocabulary for an analysis of reception, including the
phenomenology of individual reading/viewing, the
hermeneutics of intersubjective reading/viewing habits
among social groups, and more broadly the role of
culture. Together, the models offer different
perspectives into the same frustratingly elusive problem
space: the relationship between a given cultural artifact
and the intersubjective responses to it by its audience.
Our strategy for dealing with all of this data—and the
many conflicts within it—is triangulation. We looked for
instances where multiple sources seemed to be pointing
in the same direction (e.g., a user’s heart rate was low, her
rating for the video was low, and she wrote a review that
characterized the video as “boring”).
3.2. Recognizing Emotion in Individual Videos
In this section, we summarize a series of related
findings about participants’ responses to individual
videos. Much of the data was analyzed in relation to the
rating users gave the videos (1-5 stars). We extracted all
the prose reviews (n=126) and organized them by star
rating (e.g., all the 1-star reviews, all the 2-star reviews, etc.). In
the next section, we analyze these reviews side-by-side
with the quantitative measures, but first we offer some
insights into how participants talked about emotion in
their prose reviews.
3.2.1 Participants are Skilled at Identifying
Individual Emotions
The prose reviews provided abundant evidence that
viewers are able to identify discrete emotions. One
reviewer writes,
Shii [the protagonist] was feeling sadness, longing,
needing, lonliness…. The strange cat … caused me to
feel confusion…. I was sad that Shii lived on the streets
in a box, but the flowers were pretty and made the box
and street feel homey and warm, which made me feel
warm. I fully empathized with Shii….
This viewer, and this is typical of many of our
participants, expresses a range of emotions and is able to
distinguish between those of the protagonist in the video
and those experienced by her/himself.
3.2.2 Videos are Emotionally Complex
Related to the first finding is the fact that the videos
were experienced as emotionally complex. Participants
saw wide ranges of emotions in the videos and in
themselves, which undercuts the common perception that
Internet videos are simplistic and merely silly. Our
participants saw rage, frustration, pain, touching
empathy, hilarity, shock, and wonder, often in diverse
combinations within a single video.
In addition to identifying complexity within a video,
participants also identified a different kind of emotional
complexity, one that occurs in the juxtaposition of two
different sets of emotions: one in the video and the other
in themselves:
It was cute and endearing how the character was
doing his best to put into words how much he loved his
girlfriend, and of course all he could think of are
cheesy lines…. His frustration made me feel amused
because he was trying so hard I couldn’t help but
chuckle. When he was excited and sang … that was
just flat out hysterical. Overall it made me feel warm
and fuzzy inside.
Here, the participant lays out both the emotions of the
protagonist and her or his own emotions, and also
establishes their relationship: one of ironic amusement.
The more distressed the protagonist becomes, the more
the reviewer finds the video funny—and yet the viewer
simultaneously was touched by the sincerity of the
protagonist’s emotion.
3.3. Characteristics of High- Versus Low-Rated
Videos
Sorting the reviews by star rating threw into relief the
different ways people write about videos they like versus
ones they do not like. Different critical criteria emerge at
the different ends of the scale, and these shed light on two
issues of interest to the study: (a) whether emotional
characteristics relate to the overall rating and (b) the
extent to which factors external to the content of a video
are used as criteria in their judgment. Evidence from
across our data sources sheds light on both these
questions.
3.3.1 Emotional Characteristics are Key Critical
Criteria in Evaluating Internet Videos
The prose reviews alone make it clear that viewers
have no trouble talking about videos in relation to the
emotions depicted in and caused by them. A separate
matter, however, is the extent to which emotions entered
into the critical aspects of viewership. Several sources of
data suggest that viewer perception of emotions is central
to their critical judgments, that is, whether they liked a
given video. To determine this, we tested for significance
and correlations between our quantitative measures and
user ratings (i.e., the 1-5 star ratings users assigned to
individual videos), as well as among the measures
themselves. For the qualitative measures—primarily the
prose reviews—we divided them by rating, as described
above, and analyzed them to identify patterns that
emerged based on rating.
For our physiological traces, heart and breath rates
were collected second-by-second during the study, and
then each set was averaged, giving us an average heart
rate and an average breath rate for each viewing (21
subjects x 6 viewings each = 126 average heart rate scores
and average breath rate scores). Each average
physiological score was then standardized via a z-score,
with each participant's benchmark data used to compute
the population mean and standard deviation. The
resulting numbers, what we call HR Score (heart rate) and
R Score (breath rate), represent single dimensionless
values of how far above or below a participant’s
benchmark average their physiological measures were
while exposed to a particular video. Standardization is
critical in order to compare data across many subjects
whose individual physiological traces vary dramatically
(e.g., average benchmark heart rates in this study ranged
from 71 to 102 beats per minute).
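The standardization just described amounts to an ordinary z-score against each participant's own benchmark. A minimal sketch, assuming per-second heart rate lists and using only the Python standard library (the function name is ours):

    from statistics import mean, pstdev

    def hr_score(viewing_hr, benchmark_hr):
        """A viewing's average heart rate as a z-score against the
        participant's five-minute benchmark; the same procedure applied
        to breath rate yields the R Score."""
        mu = mean(benchmark_hr)       # benchmark mean as the population mean
        sigma = pstdev(benchmark_hr)  # benchmark SD as the population SD
        return (mean(viewing_hr) - mu) / sigma

    # A participant benchmarked near 80 bpm watching an arousing video
    # yields a positive, dimensionless HR Score:
    print(hr_score([88, 91, 90, 89], [79, 80, 81, 80, 80]))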
A single factor analysis of variance (ANOVA)
revealed a significant effect of video rating on HR
Score [F(4, 121) = 3.537, p = .009] (Figure 2).
Post hoc Bonferroni tests reveal that heart rates were
significantly higher for 5 star reviews than for 1 star
reviews [p = .003]. Additionally, a series of Pearson’s
correlations revealed that HR scores were weakly but
positively associated with user ratings [r(126) = .275, p <
.001]. R scores (breath rate) were not significantly
correlated with user ratings (or, for that matter, any of the
other measures). Research in both HCI and the field of
psychophysiology has identified that heightened heart
rate corresponds to emotional arousal [1, 8, 26]; our
study further supports that claim, as well as our
hypothesis that emotional response is tied to video
preference.
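The statistics reported in this section could be reproduced along the following lines. This is a sketch of the test choices (single-factor ANOVA, Bonferroni-adjusted pairwise comparisons, Pearson correlation) using SciPy, not the authors' actual analysis code, and the data layout is an assumption for illustration.

    from itertools import combinations
    from scipy import stats

    def analyze(hr_by_rating):
        """hr_by_rating: dict mapping star rating (1-5) to a list of HR Scores."""
        ratings = sorted(hr_by_rating)
        groups = [hr_by_rating[r] for r in ratings]

        # Single-factor ANOVA: does HR Score differ across rating levels?
        f, p = stats.f_oneway(*groups)
        print("ANOVA: F = %.3f, p = %.3f" % (f, p))

        # Post hoc pairwise t-tests with a Bonferroni correction.
        pairs = list(combinations(ratings, 2))
        for a, b in pairs:
            _, p_raw = stats.ttest_ind(hr_by_rating[a], hr_by_rating[b])
            print("%d vs %d stars: p = %.3f (Bonferroni)"
                  % (a, b, min(p_raw * len(pairs), 1.0)))

        # Pearson correlation of HR Score with rating across all viewings.
        xs = [r for r in ratings for _ in hr_by_rating[r]]
        ys = [s for r in ratings for s in hr_by_rating[r]]
        r_val, p_val = stats.pearsonr(xs, ys)
        print("Pearson: r = %.3f, p = %.3f" % (r_val, p_val))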
In the emotional tagging activity, subjects provided
emotional descriptors and stated the arousal level for each
video. To analyze them quantitatively, we used the 36
emotional categories identified by Scherer [37] from
which they were derived to identify the valence of each
emotion (i.e., whether it is generally positive or negative).
Positive emotions were assigned a score of 1, while
negative emotions were assigned a score of -1.
Additionally, each emotion was assigned an intensity
score from 1-5 based on the level reported by the
participant. Finally, an
overall emotional score was calculated by multiplying the
valence score by the intensity score. Participants were
able to use up to three emotion descriptors/arousal levels
per video, so up to three emotional descriptor scores
were calculated per participant per video. We studied these
scores both individually and as an average of the up to
three scores assigned by a given reviewer to a given
video.
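The scoring just described reduces to valence multiplied by intensity, averaged over a viewing's tags. A minimal sketch, with a hypothetical valence lookup standing in for the full Scherer-derived mapping:

    # Tiny illustrative valence table; the study used all 36 Scherer categories.
    VALENCE = {"joy": 1, "pride": 1, "anger": -1, "sadness": -1}

    def descriptor_score(tags):
        """tags: up to three (descriptor, intensity 1-5) pairs for one video.
        Returns the average of valence * intensity over the given tags."""
        scores = [VALENCE[name] * intensity for name, intensity in tags]
        return sum(scores) / len(scores)

    # Strong joy plus mild sadness nets out positive: (5 - 2) / 2 = 1.5
    print(descriptor_score([("joy", 5), ("sadness", 2)]))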
Figure 2: ANOVA graph of HR Score by video rating.
A single factor ANOVA revealed a significant effect
of video rating on descriptor scores (the cumulative
score derived from a participant's descriptor selections)
[F(4, 121) = 40.139, p < .001] (Figure 3). Post hoc
Bonferroni tests reveal that emotion tag scores differed
significantly between all pairs of video ratings (p < .05).
Further tests of the constituent components of the
descriptor scores, intensity and valence, reveal additional
insights into the relationship between descriptor scores
and user ratings. Additionally, we observed positive
correlations between descriptor scores and user ratings
[r(126) = .751, p < .001], as well as between descriptor
scores and HR scores [r(126) = .263, p < .001].
The emotional tagging activity offers strong evidence
of a relationship between users’ perceived emotional
responses and overall preferences for videos.
It could be objected that the emotional tagging activity
is an overly formalized measurement technique for a
phenomenon as subtle as emotion. Indeed, though [6] did
not take specific aim at Scherer’s affect categories upon
which this aspect of our study was based, their critique of
quantitative approaches would appear to apply here. Yet
the more culturally situated aspects of our study—in
particular, the open-ended prose reviews—offer evidence
to support the hypothesis as well.
Figure 3: ANOVA graph of descriptor score by video rating.
While descriptions of emotion appear in prose reviews
of high-, medium-, and low-rated videos, the
characterization of emotion nonetheless varies by rating.
In perhaps the simplest example, videos that are
experienced in an emotionally negative way also ended
up with low reviews, as in the following example:
I saw the emotions of anger, excitement, happiness,
greed, and hate. The emotion of “anger” upset me
because it was acted out in a violent way…. [T]he two
characters physically faught over the snorkel—biting
ears, riping out body organs, etc. It was quite revolting
and inappropriate.
While the content of the video showed a range of
emotions, the viewer experienced a much simpler one:
disgust. She or he also gave this video a 1-star review (on
a 1-5 scale).
Another common characteristic of low-rated videos is
that they left the viewer feeling emotionally unresolved,
as the following three examples show, all excerpted from
the final lines of low-rated reviews:
That made me laugh, but it’s true, and probably
made me feel a bit defensive and aggressive myself.
Their were feeling of sadness. This sadness was
caused by [the antagonist]. Also the [antagonist] made
the character enraged. This cartoon made me smile and
left me a little confused.
I couldn’t tell if this [satiric video on Super Mario
and communism] was supposed to be glorifying
communism or condemning it…. I felt curious about
what a lot of the symbolism was supposed to represent.
Each of these review excerpts reveals a viewer who fails
to achieve emotional resolution and the closure that
comes with it.
Regardless of star rating, reviewers articulated a range
of emotions found in the videos. The presence or absence
of certain emotions, or a certain number of emotions,
does not seem to coincide with particularly high or low
ratings. Instead, it appears to be the arrangement of
emotions that viewers respond to. Videos that have a
range of emotions, and some sort of satisfying resolution,
were associated with much higher reviews, as in the
following examples.
I felt discust at first because [certain imagery] was
really … too much for me. And then I felt tension…. I
then felt sad when the girl was caught by the moster.
However, the girl’s bear was a secret bomb! (which
really surprised me) All in all, I felt interested in this
video because it’s a good one, which is enthusiastic to
me and at the end … I laughed somehow. I will want
more!
In the video the characters displayed anxiety,
tension, fear, and anger. I held some anxiety for the
child character – cant help it I am a mom – and was
surprised and amused by the ending.
I was … touched by this video…. I felt … unsure
and anxious while I heard/read the lyrics…. Sadness
feeling comes from the unsureness and anxiety. At last,
I felt great and happy because there is still hope there!
Each of these high-rated videos contains an emotional
arc, which includes negative emotions that are resolved
into something positive by the end. It is the arc, not the
presence or absence of certain emotions, that participants
appear to appreciate.
Much of our data—physiological, affective tagging,
prose reviews, and rating—lent itself to the triangulation
approach. Emotions—both as depicted in the videos and
as experienced by the viewers—were an important
consideration in the enjoyment and rating of the videos.
Quantitative data, typically associated with the
information processing model, and qualitative data,
usually associated with the culturally embedded model of
emotion, all seemed to confirm the hypothesis that
emotional engagement is a major critical criterion when
viewing Internet videos.
3.3.2 Characteristics External to the Content of the
Videos are Important Critical Criteria
One of the critiques made in [6] of Picard’s and others’
information processing model of emotion is that it relies
on a conduit metaphor, in which emotion is understood to
be transmitted in discrete bits from one place (e.g., an
information system) to another (e.g., the user). They
argue instead that emotion arises through interaction, in
which understanding, interpretation, and emotional
response are mutually constructed. Whereas one
implication of the conduit metaphor of the information
processing model is that the content of the stimulus (in
this case, video) should drive the response, an alternative,
more culturally embedded model would instead
emphasize the sources of the experience of the video that
users bring to it.
The quantitative measures we collected—especially
the physiological measures and the emotional tagging—
may give insight into emotional phenomena attendant to
the viewing of videos, but they do not tell us about the
origins of those emotional experiences. That is, they do
not help us determine the extent to which the felt
emotions are “in” the videos themselves, as opposed to
arising as a part of the culturally mediated experience and
interpretation of the videos.
The user reviews, however, do offer evidence
supporting the idea that the experience of a video is
substantially shaped by factors external to the content of
a given video. This evidence does not support the
converse, which is that the content of the video does not
matter. Instead, it simply adds nuance to the relationship
between the content and experience of the video.
Expressions of empathy—here defined as statements
in which people relate something seen in a video to
something in their own lives and derive significance
from that analogy—were tied to the highest rated videos
in the study, as in the following quote:
The graphics do nothing for me. I do not really like
animated music videos. However, not only do I like the
Beatles, but the song reminded me about how I feel
about my fiance. I feel fine with him being that I am
happy but I also feel safe…. I am likely to share this
song with my partner.
This review, which accompanied a 5-star rating, clearly
shows that the viewer responded favorably to the video
despite liking neither its visuals nor its genre. Not only
did the viewer commit to liking the video, but she even
expressed a desire to follow up with a behavior of key
interest to user engagement researchers: she wants to
share it with someone else.
Another problem with the conduit model of emotion is
that it offers little insight into the significance of
microcommunities, inside jokes, and private visual
languages. When viewing a popular animutation video, a
genre of Internet video with a distinctive animation style
and repertoire of standard jokes, one reviewer writes,
I was quite disappointed and disgusted by the
video—poor animation, nonsense sing-a-long words at
the bottom of the screen, they couldn’t even spell porn
right, and overall I thought it was a waste of my time.
Every one of the specific features this reviewer
criticizes—the animation, the silly sing-a-long, the “leet
speak” spelling of porn as “pr0n”—is a hallmark feature
of the genre; it is hard to imagine a more comprehensive
misunderstanding. Thus, where members of that video’s
community delight in its wit, it left this reviewer
“disgusted.”
Implied in the previous example’s critique is a
normative notion that animation should meet certain
production, comprehensibility, and typographical
standards. Other normative notions turned up in many of
the reviews. One way that they appeared to help a video
was in situations where videos appear to have high
cultural capital—because they are conspicuously
educational or artistic. Thus, an expressionist art video
created by a film student generally received high ratings,
though the reviews, emotional tagging scores, and
physiological measures would otherwise have predicted
a lower rating. Similar anomalies occurred with a video
about racism and another about new media textual
practices.
Normative notions are not merely the stuff of abstract
aesthetic theorizing; they also shape expectations, e.g., of
genre, title, even the video’s poster frame. The most
commonly stated reason videos were given 1-star ratings
was that they violated the viewer’s expectations:
From the description, I expected the movie to be
more positive.
I had expected to see a nice video with Beetles
music.
I didn’t find the video funny as I was supposed to.
The ending was violent and surprised me.
The name of the video clip “Secrests of MySpace”
is I think used to just lure the viewwers into thinking
something else. There was just one mention of the
social netwrking website and thats it.
In all these cases, viewers are not complaining about the
content per se, but rather that it was other than what they
expected.
Finally, a few readers expressed selective viewing
habits within certain videos; by that we mean that they
perceived that they were supposed to watch a video in a
certain way, but rejecting that, they imposed their own
personal tastes on the video, as in the following example:
I was bored with the film so I concentrated on how
cute the puppy was no matter how pointless the film
was.
Similarly, several others who expressed dislike for
animated music videos said they simply ignored the
visuals and just enjoyed the music. Interestingly, these
selectively enjoyed videos got middle-of-the-road ratings
(typically 3 stars).
Each of these examples reveals ways that factors
external to the content of the videos, which are brought to
bear by the viewers themselves, substantially and
sometimes even dominantly shape the emotional
experience of the video.
That said, we do not believe that these examples
displace or upend the significance of information
processing approaches to emotion; rather, they help us
interpret what is going on. As noted earlier, ratings, heart
rate, and emotional tagging scores were all significantly
positively correlated. The reviews help shed light onto
what is going on behind those measures.
4. Conclusion
In this study, a first attempt at triangulating the various
data points collected yielded promising results. For both
of the major findings in this study (i.e., that emotion is
implicated in video preferences, and that factors external
to videos, including emotional arcs, affect the
experience), we found that all of the data we collected,
insofar as it told any story at all (some of our sources,
such as breath rate, failed to yield significant findings),
told the same story.
Users were responsive to the emotional complexity of
videos and it weighed into their critical evaluations. Heart
rate, video ratings, and emotional descriptor scores all
increased for videos that matched the emotional
complexity sought by participants (as evidenced by a
close reading of participants' prose reviews).
Ultimately, this study is suggestive of ways that user
experience researchers and designers can think about
designing their research and data analysis methodologies.
It is both possible and fruitful to pursue mixed method
approaches when dealing with phenomena as complex
and subjective as emotion and experience.
5. References
1. Anttonen, J. and Surakka, V. Emotions and heart rate while
sitting on a chair. Proc. of CHI’05. ACM Press (2005),
491-499.
2. Axelrod, L. and Hone, K. Affectemes and allaffects: A
novel approach to coding user emotional expression during
interactive experiences. Behaviour & Information
Technology 25, 3, Taylor & Francis (2006), 159-173.
3. Barthes, R. The death of the author. In Heath, S. (trans).
Image-Music-Text. Hill and Wang, New York (1977), 142-
148.
4. Bates, J. The role of emotion in believable agents.
Communications of the ACM 37, 7 (1994), 122-125.
5. Bernhaupt, R., Ijsselsteijn, W., Mueller, F., Tscheligi, M.,
Wixon, D. Evaluating user experiences in games. CHI’08
Extended Abstract, ACM Press (2008), 3905-3908.
6. Boehner, K., DePaula, R., Dourish, P., and Sengers, P.
Affect: From information to interaction. AARHUS’05,
ACM Press (2005), 59-67.
7. Boehner, K., DePaula, R., Dourish, P., and Sengers, P.
How emotion is made and measured. International Journal
of Human-Computer Studies 65, Elsevier Ltd (2007), 275-
291.
8. Cacioppo, J., Tassinary, L., and Berntson, G. Handbook of
Psychophysiology. Cambridge University Press. (2000).
9. Creswell, J. and Clark, V. Designing and Conducting
Mixed Methods Research. Sage Publications. (2007).
10. Csikszentmihalyi, M. Flow: The Psychology of Optimal
Experience. Harper Perennial. (1990).
11. DePaula, R., and Dourish, P. Cognitive and cultural views
of emotions. Proc. of the Human Computer Interaction
Consortium Winter Meeting. (2005).
12. Desmet, P. Measuring emotion: Development and
application of an instrument to measure emotional
responses to products. In Blythe, M., Overbeeke, K.,
Monk, A., and Wright, P. (eds.). Funology: From Usability
to Enjoyment. Kluwer Academic Publishers (2003).
13. Douglas, Y. and Hargadon, A. The pleasure principle:
Immersion, engagement, flow. Proc. of the Eleventh ACM
Conference on Hypertext and Hypermedia
(HYPERTEXT'00), ACM Press (2000), 153-160.
14. Eco, U. The Role of the Reader: Explorations in the
Semiotics of Texts. Indiana University Press, Bloomington,
Indiana, USA (1979/1984).
15. Gilleade, K. and Dix, A. Using frustration in the design of
adaptive videogames. Proc. of the 2004 ACM SIGCHI
International Conference on Advances in Computer
Entertainment Technology (ACE'04), ACM Press (2004).
16. Fagerberg, P., Ståhl, A., Höök, K. Designing gestures for
affective input: Analysis of shape, effort, and valence.
Proc. of MUM’03 (2003), 57-65.
17. Fagerberg, P., Ståhl, A., Höök, K. eMoto – Emotionally
engaging interaction. Journal of Personal and Ubiquitous
Computing 8, 5 (2004), 377-381.
18. Hassenzahl, M. and Tractinsky, N. User experience – A
research agenda. Behaviour & Information Technology 25,
2, Taylor & Francis (2006), 91-97.
19. Höök, K., Ståhl, A., Sundström, P., and Laaksolahti, J.
Interactional empowerment. Proc. of CHI’08. ACM Press
(2008), 647-656.
20. Hudlicka, E. To feel or not to feel: The role of affect in
human-computer interaction. International Journal of
Human-Computer Studies 59 (2003), 1-32.
21. Isbister, K. and Höök, K. Evaluating affective interfaces:
Innovative approaches. CHI’05 Extended Abstract, ACM
Press (2005). 2119.
22. Isbister, K., Höök, K., Sharp, M., and Laaksolahti, J. The
sensual evaluation instrument: Developing an affective
evaluation tool. Proc. of CHI’06, ACM Press (2006),
1163-1172.
23. Isomursu, M., Tähti, M., Väinämö, S., and Kuutti, K.
Experimental evaluation of five methods for collecting
emotions in field settings with mobile applications.
International Journal of Human-Computer Studies 65,
Elsevier Ltd (2007), 404-418.
24. Law, E., Roto, V., Vermeeren, A., Kort, J., Hassenzahl, M.
Towards a shared definition of user experience. CHI’08
Extended Abstract, ACM Press (2008), 2395-2398.
25. Liao, W., Zhang, W., Zhu, Z., Ji, Q., and Gray, W. Toward
a decision-theoretic framework for affect recognition and
user assistance. International Journal of Human-Computer
Studies 65 (2006), 847-873.
26. Mahlke, S., Minge, M., and Thüring, M. Measuring
multiple components of emotions in interactive contexts.
Extended Abstracts of CHI’06, ACM Press (2006), 1061-
1066.
27. Madden, M. Online Video. Pew Internet and American
Life Project. (2007).
http://www.pewinternet.org/PPF/r/219/report_display.asp
28. McCarthy, J. and Wright, P. Technology as Experience.
The MIT Press (2004).
29. McNamara, N., and Kirakowski, J. Functionality, usability,
and user experience: Three areas of concern. Interactions
13, 6, ACM Press (2006), 26-28.
30. Norman, D. Emotional Design: Why We Love (or Hate)
Everyday Things. Basic Books, New York, NY, USA,
2004.
31. Peter, C., Beale, R., Crane, E., Axelrod, L., and Blyth, G.
Emotion in HCI. Joint Proc. of British HCI Group Annual
Conference (2005, 2006, 2007).
32. Picard, R. Affective Computing. MIT Press, Cambridge,
MA, USA, 1997.
33. Picard, R. and Klein, J. Computers that recognize and
respond to user emotion: Theoretical and practical
implications. Interacting with Computers, 14, 2 (2002),
141-169.
34. Picard, R. Affective computing: Challenges. International
Journal of Human-Computer Studies 59, 1-2 (2003), 55-
64.
35. Rainie, L. Increased Use of Video-sharing Sites. Pew
Internet and American Life Project. (2008).
http://www.pewinternet.org/PPF/r/232/report_display.asp
36. Sengers, P., Liesendahl, R., Magar, W., Seibert, C.,
Müller, B., Joachims, T., Geng, W., Mårtensson, P., and
Höök, K. The enigmatics of affect. Proc. of DIS'02. ACM
Press (2002), 87-98.
37. Scherer, K. What are emotions? And how can they be
measured? Social Science Information 44, 4 (2005), 695-
729.
38. Ståhl, A., Sundström, P., and Höök, K. A foundation for
emotional expressivity. Proc. of DUX'05. AIGA (2005).
39. Tompkins, J. (ed.). Reader-Response Criticism: From
Formalism to Post-Structuralism. The Johns Hopkins
University Press, Baltimore, USA (1980).
40. Zeng, Z., Pantic, M., Roisman, G., and Huang, T. A survey
of affect recognition methods: Audio, visual, and
spontaneous expressions. IEEE Transactions on Pattern
Analysis and Machine Intelligence (2007), 1-20.
41. Ward, R. An analysis of facial movement tracking in
ordinary human-computer interaction. Interacting with
Computers 16, 5 (2004), 879-896.
42. Wensveen, S.A.G., Overbeeke, C.J., and Djajadiningrat,
J.P. Touch me, hit me, and I know how you feel: A design
approach to emotionally rich interaction. Proc. of DIS'00,
ACM Press (2000), 48-52.