

The Effects of Image Speed and Overlap on Image Recognition

Timothy Brinded, James Mardell, Mark Witkowski and Robert Spence

Department of Electrical & Electronic Engineering

Imperial College London

London SW7 2BT

{timothy.brinded04, james.mardell, m.witkowski, r.spence}@imperial.ac.uk

Abstract: Rapid Serial Visual Presentations (RSVPs) mimic the riffling of a book’s pages and are widely employed as a way of gaining familiarity with a collection of images or selecting images of interest from a collection.

In a typical RSVP a number of images within a collection enter a display in (say) the lower left-hand corner and move towards the opposite corner where they disappear (‘diagonal RSVP’). If the rate at which images appear on the display is high, the speed with which images traverse the display may be such that the images overlap to some degree. As a consequence there is a need to know how much overlap can be tolerated without seriously affecting image recognition.

Diagonal RSVP was implemented for four different levels of overlap and four different image speeds in such a way as to separate the effects of these two parameters. Participants were required to identify theme images (such as ‘ships’ or ‘cars’) in a series of image sequences at various combinations of speed and overlap. In addition to recording their performance, participants also completed a questionnaire to gauge their opinion on perceived presentation speed and ease of recognition.

Results showed a significant effect of both overlap and speed on the percentage of correctly identified images. On the basis of the experimental results suggestions are made concerning interaction design for RSVP.

Keywords: Inspection; Obscurement; Overlap; RSVP

I. INTRODUCTION

Riffling the pages of a book in order to gain some familiarity with its contents or to conduct a search is a simple form of Rapid Serial Visual Presentation (RSVP) [1], [2]. The computational embodiment of RSVP leads to considerable flexibility in the mode of presentation, four of which are shown in Figure 1. In each mode images from a collection are introduced in turn at a given location, move along a trajectory and then disappear from view. The experiments in this paper use ‘diagonal’ mode (Figure 1a). In this diagonal mode, images appear in the bottom left and move at constant speed to the top right.

The technique of RSVP finds a wide range of applications. In e-commerce, for example, Wittenburg et al. [3] explore various RSVP-like designs, specifically for inclusion into consumer electronic products, notably for TV channel surfing and visualisation of VCR-style fast-forward and editing tasks (Figure 1d). Tse et al. [4] investigated augmented ‘gist’ acquisition for film browsing. de Bruijn & Tong [5] investigated the selection of news reports on a mobile device, notably in relation to inherently smaller screen dimensions. Most RSVP designs present images at a predetermined rate; Witkowski et al. [6] embed an RSVP design into a kiosk for product browsing and selection, in which the user has direct control of both image presentation speed and direction.

Figure 1. Typical RSVP Modes: (a) Diagonal, (b) Carousel, (c) Volcano, (d) Video Browsing [3]

Cooper et al. [7] evaluated the comparative performance of users of six different RSVP designs, while monitoring their eye-gaze activities. de Bruijn and Spence [8] have investigated eye-gaze behaviour for a number of different RSVP visualisations, including a ‘carousel’ mode (Figure 1b). In ‘carousel’ mode, images appear small at the top of the screen, move clockwise increasing in size to the bottom, and disappear again at the top.

Corsato et al. [9] investigated both comparative performance and user preference for a number of more complex RSVP forms (including ‘volcano’ mode, Figure 1c). In ‘volcano’ mode images appear at the centre of the screen and successively move to the edges, reducing in size before disappearing. Corsato et al. also included a study of eye-gaze movement characteristics across the different modes tested, indicating that design has a substantial effect on gaze behaviour and that this is correlated with image recognition rates.

Wittenburg et al. [10] have considered the effects of simulated depth using perspective and occlusion methods. In a related development Sun and Guimbretiere [11] describe ‘Flipper’, a novel method for document navigation.

2011 15th International Conference on Information Visualisation
1550-6037/11 $26.00 © 2011 IEEE
DOI 10.1109/IV.2011.10

In RSVPs containing moving images it may often be preferable, for whatever reason, to cause the images to overlap each other to some extent, a condition illustrated for all the examples of RSVP shown in Figure 1. For any given image size, the number of images on the screen at any one time is determined by the interlinked factors of pace (the rate at which new images appear on screen), image speed and overlap.

The question then arises as to how occlusion of one image by the next affects a user’s ability to recognise some feature of the obscured image. Since, for a given pace of image appearance on the display, overlap can be diminished or avoided by speeding up the movement of images, it is essential to investigate the separate effects of image speed and overlap. The experiment described in this paper was devised to identify these independent effects on the ability of a user to recognise images satisfying a given theme (e.g. ‘car’, ‘ship’ or ‘flower’).

The approach adopted here to separate the effects of overlap and image speed is presented in Section II, using a specific but artificial experimental design. This design ensures that overlap and speed are varied independently.

The diagonal RSVP mode (see Figure 1a) was chosen for investigation because this mode or a minor variant of it is incorporated in most designs. The task assigned to the user was that of identifying, by space-bar press, those target images within a collection (of 200) characterised by a particular theme. Although unknown to the participant, there were always 10 target images in each presentation. After the presentation of each image collection, participants were asked to answer three simple questions concerning their perception of image speed and identification success.

The experimental design is described in Section III. Section IV describes the experimental procedure used. Section V presents an analysis and discussion of the results obtained. Section VI provides a summary and further discussion of the experiment. In Section VII we derive, from the experimental results, some notes of guidance to the interaction designer faced with the task of designing an RSVP for a particular application. The final section presents some brief conclusions from the study.

II. INVESTIGATING OVERLAP

The purpose of these experiments was to test the combined effects of image speed and obscurement due to overlap as might be encountered in various RSVP designs. Our approach to a study of the effect of image overlap was to focus on two important features of an RSVP such as the exemplar shown in Figure 2: the degree of overlap and the speed at which the images move across the screen.

To isolate the independent effects of overlap and image speed the diagonal RSVP shown in Figure 2 was modified as shown in Figure 3, purely for experimental purposes. This experimental RSVP was so arranged that, throughout the investigation, the same number of images (or part images) was visible. The consequence of overlap was reflected in the varying obscurement of each image at one of four possible values illustrated in Figure 4. Independently of the degree of obscurement, four image speeds were investigated, with the rate at which images became visible (pace) adjusted to keep the number of images on screen constant.

Figure 2. A typical RSVP display as used in practice, showing image occlusion due to overlap

Figure 3. A typical RSVP design modified for experimental purposes to address obscurement

III. EXPERIMENTAL DESIGN

To investigate these parameters the number of images visible on the screen was kept constant, while the degree of obscurement was increased to emulate the effects of overlap. Four levels of obscurement were investigated: 0%, 25%, 50% and 75% (Figure 4). Figure 3 shows how this appears to the participant at 25% obscurement. At 0% obscurement the images just abut each other in this design.

Figure 4. Levels of obscurement, as percentage area

Table I. IMAGE SPEED SETTINGS

Travel Time   Presentation Rate   Test Duration   Image Speed   Image Speed
(s)           (images/s)          (s)             (pixels/s)    (cm/s)
2.00          3.00                68.67           1005          24.15
1.24          4.84                42.57           1612          38.95
0.89          6.74                30.56           2246          54.27
0.70          8.57                24.03           2856          69.00

The final experimental design adopted four linear image speeds across the screen, equating to 1005, 1612, 2246 and 2856 pixels/s. Table I indicates the consequences of this choice in terms of total travel time for any image across the whole screen, the rate at which images appear on the screen, the duration of each presentation of 200 images, and the equivalent speed in cm/s. Presentations were made on a 19” (48.3 cm diagonal) LCD monitor with a native display resolution of 1280x1024. Participants were seated a nominal 60 cm from the screen.
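The constant-images-on-screen constraint can be checked directly from Table I. The following is a minimal sketch, not the authors' software: it assumes that the number of images in flight is the presentation rate multiplied by the travel time, and that a presentation lasts for the time needed to launch all 200 images plus the travel time of the last one. The function names are hypothetical.

```python
# Rows of Table I: (travel time in s, presentation rate in images/s,
# listed test duration in s).
TABLE_I = [
    (2.00, 3.00, 68.67),
    (1.24, 4.84, 42.57),
    (0.89, 6.74, 30.56),
    (0.70, 8.57, 24.03),
]
IMAGES_PER_SET = 200  # stated in the text


def images_on_screen(travel_time_s, rate_per_s):
    """Average number of images in flight: arrival rate times time on screen."""
    return travel_time_s * rate_per_s


def presentation_duration(travel_time_s, rate_per_s, n_images=IMAGES_PER_SET):
    """Assumed model: time to launch every image plus the last image's travel."""
    return n_images / rate_per_s + travel_time_s


for travel, rate, listed in TABLE_I:
    print(
        f"{images_on_screen(travel, rate):.2f} images on screen, "
        f"{presentation_duration(travel, rate):.2f} s "
        f"(Table I lists {listed} s)"
    )
```

Under these assumptions every row gives approximately the same number of images on screen, and the computed durations agree with the listed ones to within rounding.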

These final choices of obscurement and image speed were selected following pilot studies with eight participants. The trials were conducted both to test and refine the experimental software and procedures, and to ensure that the test parameters bounded the range of interest. Initial trials used obscurement levels of 0, 15, 30 and 45% and speeds of 666, 999, 1665 and 2856 pixels/s. It was clear that higher levels of obscurement were required to fully test the range of effects, and that the lower speed range was too slow to challenge the participants.

Each test represented a different combination of speed and obscurement applied to an image set. With four speeds and four levels of obscurement this resulted in 16 separate tests. To ensure a fair test this required 16 separate image sets, each consisting of 200 unique images, so that participants saw each image set only once.

Within each set, 10 target images drawn from a target theme were pseudo-randomly placed in the sequence. Participants used each target theme only once. Each target theme was drawn from one of 16 easily recognised categories: birds, cars, chairs, dogs, fish, flowers, aeroplanes, ships, horses, hot air balloons, hats, oranges, (military) tanks, (British) phoneboxes, watches, and submarines. For each test the associated image set was also randomised to minimise the possibility of systematic error. The 3200 required images were sourced on-line, and represented a broad range of professional and amateur contributions.

Figure 5. The Graeco-Latin square design (each cell pairs an occlusion level, 0 to 75%, with a speed, 1005 to 2856 pixels/s)

The sequences were carefully chosen to equalise the relative difficulty of targets, the locations of identifying image features, and the target distribution within each set. In particular, care was taken to ensure that no target followed another by less than the time on screen, plus one second. This design also eliminates the possibility of attentional blink effects [12].

Prior to each presentation the participant was shown a screen indicating the target theme they were to identify. Participants were required to press the keyboard spacebar once (Figure 6) each time they identified a target image. The time of each spacebar activation was recorded automatically and subsequently compared to the actual time of target presentations to determine the accuracy data.

A Graeco-Latin square design [13] was adopted to distribute the combinations of speed and obscurement fairly. The square adopted is shown in Figure 5. The various schedules of combinations used for every participant were then derived from the square following normal practice.
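As an illustration of the idea behind a Graeco-Latin square, the sketch below superimposes two mutually orthogonal 4x4 Latin squares so that every speed and obscurement combination occurs exactly once. The pair of Latin squares is a standard textbook choice and is an assumption here; it is not necessarily the square of Figure 5, and the way the authors derived per-participant schedules is not reproduced.

```python
from itertools import product

SPEEDS = [1005, 1612, 2246, 2856]  # pixels/s, from Table I
OBSCUREMENTS = [0, 25, 50, 75]     # % area, from the text

# A standard pair of mutually orthogonal Latin squares of order 4
# (hypothetical choice for illustration only).
LATIN_A = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
LATIN_B = [[0, 1, 2, 3], [2, 3, 0, 1], [3, 2, 1, 0], [1, 0, 3, 2]]


def is_latin(square):
    """Every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def is_orthogonal(a, b):
    """Superimposing the squares yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


def graeco_latin(speeds, obscurements):
    """Map the two index squares onto (speed, obscurement) cells."""
    n = len(speeds)
    return [
        [(speeds[LATIN_A[r][c]], obscurements[LATIN_B[r][c]]) for c in range(n)]
        for r in range(n)
    ]


square = graeco_latin(SPEEDS, OBSCUREMENTS)
for row in square:
    print(row)
```

Each row and each column of the resulting square contains every speed once and every obscurement level once, which is the balancing property the design relies on.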

During testing the experimenter was seated behind the participant, out of sight and otherwise non-interfering. This was done in an attempt to reduce the potential impact of observation on the results (Figure 6), while allowing the experimenter full control of the procedure.

A. Subjective Questionnaire

As an integral part of the experimental design, each participant was required to answer three questions about their perceptions as to the speed of presentation, their confidence in seeing the targets and the overall difficulty of recognition during the test. The questions, as presented, are shown in Table II. Answers were recorded on an appropriate five-point scale, in the manner of a Likert design.

Since the participant filled out the relevant questions after each test (as opposed to at the end of the entire session) the responses were recorded with the most recent memory fresh in their minds.

Figure 6. View of the experimental set-up

Table II. INDIVIDUAL TEST QUESTIONS AND POSSIBLE ANSWERS

How did you find the speed of the presentation?
    Very Slow / Slow / Just Right / Fast / Very Fast

How confident are you that you saw all the targets?
    Not At All / Somewhat / Fairly / Quite / Very Confident

Did you find it difficult to recognise the targets?
    Very Difficult / Difficult / OK / Easy / Very Easy

IV. EXPERIMENTAL PROCEDURE

A total of 20 volunteers drawn from the general student and research population participated in the main experiment, 15 male and five female. Although none were directly familiar with the RSVP methodology, all were highly technically literate and experienced computer users.

After being welcomed, participants were seated comfortably, the general purpose of the experiment was explained and consent was obtained. Each participant was then shown an example of the diagonal RSVP presentation style to be used, at moderate speed and without obscurement. There was no interaction during the example presentation. Participants were carefully instructed to press the space bar on the keyboard once whenever they saw a picture with the target type in it, for example a ‘dog’. Participants were not informed of the number of target images in each presentation sequence. Each participant was then asked if this was clear and any required clarifications were made.

Participants then each completed 16 presentation trials, at different speed and obscurement levels, in a schedule described previously (Figure 5). Prior to each of the sixteen presentations, an on-screen prompt informed the participants of the image theme they were required to identify, for example “In the next presentation, the theme you are looking for is dogs.” The participant pressed the spacebar to start the presentation when they were ready. Each of the sixteen presentations lasted between 24 and 69 seconds (Table I). At the end of each presentation they were asked to answer each of the three questions from the questionnaire sheet, by highlighting one of the five options.

Following, and during, all sixteen presentations, any comments made by the participants were noted. The total time for each participant was between 15 and 20 minutes. No reward was offered for taking part.

V. ANALYSIS OF RESULTS

This section presents a discussion of the results obtained from the 320 individual presentations made during the main experiment, and summarises the results of the subjective questionnaire completed after each of the presentations.

This discussion is broken down into four parts: a general overview of the results; a comparison and analysis of the effects of speed and obscurement; an examination of any learning effects; and an analysis of the participant questionnaire results. In the analysis that follows, the Accuracy measure reflects the number of instances where the participant recorded a space bar press while a target image was present on the screen (plus one second following, to allow for reaction time, reflecting the design criteria of Section III).

Accuracy is defined as the number of targets correctly identified within a sequence as a fraction of the total (10); these are the True Positives (TP). Accuracy may also be expressed as a percentage. False Positives (FP), where the participant indicated a target where none was present, were also recorded. The Accuracy measure is independent of the False Positive value.
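The scoring rule described above can be sketched as follows. This is an illustrative reading of the text, not the authors' analysis code: the function name and arguments are hypothetical, a target is taken to be found if any press falls between its appearance and its disappearance plus one second, and repeated presses within the same window are assumed to be ignored rather than counted as False Positives.

```python
def score_presentation(press_times, target_onsets, travel_time_s, grace_s=1.0):
    """Return (true_positives, false_positives) for one presentation.

    press_times and target_onsets are times in seconds from the start of
    the presentation; travel_time_s is the time an image spends on screen.
    """
    windows = [(t, t + travel_time_s + grace_s) for t in target_onsets]
    found = set()
    false_positives = 0
    for press in press_times:
        hits = [i for i, (start, end) in enumerate(windows) if start <= press <= end]
        if hits:
            found.update(hits)    # repeated presses in a window count once
        else:
            false_positives += 1  # press with no target on screen
    return len(found), false_positives


# Hypothetical example: three targets at the slowest speed (2.00 s travel time).
tp, fp = score_presentation(
    press_times=[6.1, 9.5, 22.9, 30.0],
    target_onsets=[5.0, 12.0, 20.0],
    travel_time_s=2.00,
)
accuracy = tp / 3
print(tp, fp, accuracy)
```

Because the sequences were built so that no target follows another within the time on screen plus one second, the windows never overlap and each press can match at most one target.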

A. General Overview of Results

Figure 7 provides an overview of the major findings of the experiment. The bubble chart shows the combined effect of obscurement and speed across the range of experimental parameters. The width of each bubble reflects the Accuracy of identification at each setting.

As expected, there is a pronounced effect on Accuracy of both speed and obscurement. However, it may be seen that the difference between 0% and 25% obscurement is very slight, in contrast to the more substantial effects between 25% and 50% and between 50% and 75% obscurement. The effect of speed on Accuracy, by comparison, is progressive (Figure 7). This is considered in more detail in Section V-B, and is presented as an interpolated contour surface in Figure 16.

Figure 8 shows the average number of False Positives across the combinations of speed and occlusion. A False Positive indicates that a button press was recorded but did not correspond to a target.

Figure 7. Accuracy (%) when varying speed and obscurement

Obscurement   Speed (pixels/s)
(% area)      1005   1612   2246   2856
 0              93     86     73     48
25              90     83     66     47
50              70     58     44     29
75              31     24     13     14

Figure 8. False Positives when varying speed and obscurement

Obscurement   Speed (pixels/s)
(% area)      1005   1612   2246   2856
 0            1.55   1.89   1.15   1.05
25            1.42   1.35   1.10   0.95
50            1.65   0.55   0.55   0.70
75            1.65   0.65   0.35   0.30

Whilst the actual numbers of False Positives are relatively small overall (all less than two per presentation), it is interesting to note the correlation between ‘high difficulty’ combinations (upper right, Figure 8) and fewer False Positives on average. We postulate that this is due to participants ‘giving up’ on tests they deem too hard and becoming hesitant to press the space bar, if they press it at all. Furthermore, we observed that participants would raise their fingers above the spacebar during these difficult presentations, while they would tend to rest their fingers on the bar during more tractable combinations.

B. Speeds and Obscurements

In order to allow a more detailed assessment of the differences between the speeds and obscurements, Figure 9 and Figure 10 show the averaged results for speed and obscurement respectively. For example, Figure 9 shows that the mean number of True Positives is 7.1 (SD = 2.9) at a speed of 1005 pixels/s, 6.3 (SD = 3.1) at 1612 pixels/s, 4.9 (SD = 3.2) at 2246 pixels/s and 3.4 (SD = 2.8) at 2856 pixels/s. Each value is the result of averaging, at that speed, over all obscurements (0%, 25%, 50%, 75%). We note that the associated standard deviations (denoted by the error bars shown on the figure) are fairly large, owing to the range of obscurements used, and that they are quite similar across speeds. Figure 9 also indicates a very strong negative linear correlation between speed and True Positives (R² = 0.986).

Figure 9. Results across speeds (upper: average number of found targets (TP) and missed targets, with standard deviation, at 1005, 1612, 2246 and 2856 pixels/s; lower: mistaken targets (FP), with standard deviation)

Table III. TABLE OF p-VALUES FROM THE TUKEY MULTIPLE COMPARISONS OF MEANS ACROSS SPEEDS

Speed (pixels/s)   1005     1612     2246     2856
1005
1612               0.0495
2246               0.0000   0.0003
2856               0.0000   0.0000   0.0001

Table III shows the Tukey multiple comparison p-values for all combinations of speed (N ≈ 80), indicating that each of the four accuracy values is distinct at p < 0.05.

The False Positives shown in Figure 9 (lower) also exhibit a negative trend with increased speed, consistent with Figure 8. The relatively high SD across the False Positive values is indicative of the variability of the participants’ behaviour.
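The per-speed means and the linear fit quoted above can be approximately recomputed from the cell values of Figure 7. The sketch below is a cross-check under stated assumptions, not the authors' analysis: the cell percentages are read speed by speed, with obscurement 0, 25, 50 and 75% within each speed (an assumed reading of the figure that the quoted means bear out), and the cells are already rounded, so the results match the text only to within rounding.

```python
SPEEDS = [1005, 1612, 2246, 2856]  # pixels/s
OBSCUREMENTS = [0, 25, 50, 75]     # % area

# Accuracy (%) per speed, ordered by obscurement (assumed reading of Figure 7).
ACCURACY = {
    1005: [93, 90, 70, 31],
    1612: [86, 83, 58, 24],
    2246: [73, 66, 44, 13],
    2856: [48, 47, 29, 14],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def true_positives_by_speed():
    """Mean targets found (out of 10) at each speed, over all obscurements."""
    return {s: mean(ACCURACY[s]) / 10 for s in SPEEDS}


def true_positives_by_obscurement():
    """Mean targets found (out of 10) at each obscurement, over all speeds."""
    return {
        o: mean(ACCURACY[s][i] for s in SPEEDS) / 10
        for i, o in enumerate(OBSCUREMENTS)
    }


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


by_speed = true_positives_by_speed()
by_obscurement = true_positives_by_obscurement()
r2 = r_squared(SPEEDS, [by_speed[s] for s in SPEEDS])
print(by_speed)
print(by_obscurement)
print(round(r2, 3))
```

The standard deviations and the Tukey comparisons cannot be recovered this way, since they depend on the individual presentation scores rather than on the cell averages.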

Figure 10 shows a non-linear relationship between obscurement and Accuracy. Note the similarity between 0% obscurement (M = 7.5, SD = 2.6) and 25% obscurement (M = 7.2, SD = 2.6). Table IV shows that the p-value (p = 0.7580) for a Tukey comparison between the 0% and 25% obscurements is clearly non-significant at the p < 0.05 level. This is in stark contrast to the Tukey comparisons of the remaining adjacent obscurements (25% to 50% and 50% to 75%) shown in Table IV, which are clearly distinct. Again we note a high variability in False Positives across the individual participants (Figure 10, lower).

Figure 10. Results across obscurements (upper: average number of found targets (TP) and missed targets, with standard deviation, at 0, 25, 50 and 75% area obscurement; lower: mistaken targets (FP), with standard deviation)

Table IV. TABLE OF p-VALUES FROM THE TUKEY MULTIPLE COMPARISONS OF MEANS ACROSS OBSCUREMENTS

Obscurement (%)   0        25       50       75
0
25                0.7580
50                0.0000   0.0000
75                0.0000   0.0000   0.0000

Figure 11 emphasises the similarities between the effect on the spread of number of targets correctly identified at 0% and 25% obscurement (Figure 11, top left and top right). These are both heavily skewed toward the correctly identified part of the histogram. At 50% obscurement (Figure 11, bottom left), the spread of correctly identified targets is largely even, whereas 75% obscurement (Figure 11, bottom right) clearly indicates that few targets are identified correctly for the majority of presentations. There is a similar effect for speeds, but it is not as pronounced.

C. Learning Effect

Figure 12 shows the average score across all participants in the order of test presentation. To the extent possible the experimental design distributed easy and difficult combinations evenly across the presentations. The absence of any substantive improvement across the successive tests indicates that there is no significant learning effect for this type of RSVP design, suggesting that users found the RSVP style intuitive. Error bars indicate the Standard Deviation (N = 20). The error bars are large (and therefore only indicative) as each test position contains presentations at every speed and obscurement combination.

Figure 11. Effect of obscurement on accuracy (four histograms of average frequency against number of True Positives, 1 to 10; panels: 0%, 25%, 50% and 75% obscurement)

Figure 12. Potential learning effect (average number of True Positives, 0 to 10, against presentation number, 1 to 16)

It should be noted that a potential ‘learning effect’ did play a part in the questionnaire, as participants would rate the slowest speed much slower than initially, once they had experienced the fastest possible speed (question one).

D. Participant Questionnaire Findings

This section presents the findings from the user opinion survey (Section III-A). Figure 13 summarises the results for question 1, “How did you find the speed of the presentation?”, asked after every presentation. The response ‘Just Right’ is assigned a value of zero; ‘Fast’ and ‘Very Fast’ are assigned positive values of one and two respectively; ‘Slow’ and ‘Very Slow’ are assigned values of minus one and minus two respectively. The permissible range is therefore −2 to +2. Fast values are shown on Figure 13 as filled upward pointing triangles, slow values as downward pointing triangles. Smaller triangles are therefore closer to the ideal.

It may be seen that the slowest speed (1005 pixels/s) was found to be very close to ‘Just Right’. On average the second speed (1612 pixels/s) was considered just below ‘Fast’. The two highest speeds (2246 and 2856 pixels/s) were both rated between ‘Fast’ and ‘Very Fast’, with the highest speed almost uniformly considered ‘Very Fast’ (average 1.76 of a maximum value of 2). Recall that the range of speeds was partially determined by a sample of eight pilot study participants, for whom the trials’ slowest speed was evaluated as ‘Very Slow’ and reported as unreasonably slow. Note that obscurement has little apparent effect on the perception of speed in these trials.

Figure 13. Question 1: Speed (range −2 to +2; positive values Fast, negative values Slow)

Obscurement   Speed (pixels/s)
(% area)      1005    1612    2246    2856
 0            -0.11   0.75    1.28    1.89
25            -0.12   0.61    1.33    1.56
50             0.17   0.5     1.17    1.72
75             0.61   0.89    1.44    1.89

Figure 14 shows the results from the question: “How

Confident were you that you saw all the targets?” with 1

being the lowest confidence answer (‘Not At All’) and 5

being the highest (‘Very Confident’). It is interesting to note

that participants were on aggregate only between ‘Fairly’

(3, central value) and ‘Quite Confident’ (4) even at the

low speed, low obscurement presentations. Comparison with

the measured Accuracy results shown in Figure 7 indicates

that, on balance, participants were generally pessimistic

about their performance under the most benign conditions

compared to their actual performance. Confidence falls with

increases in both speed and obscurement, largely mirroring

the measured Accuracy values.

Figure 15 shows the results in response to the question

“Did you find it difficult to recognise the targets?” Results

are in the range 5 (‘Very Easy’) to 1 (‘Very Difficult’).

As with question two, for the benign settings for speed

and obscurement, participants generally underestimate their

performance regarding Accuracy, but become more realistic

at the challenging settings.

Figure 14. Question 2: Confidence (range 1 to 5)
[Plot: Speed (pixels/s, axis 500 to 3500) against Obscurement (% area, 0 to 75). Plotted values: 3.5, 3.47, 2.83, 1.78, 2.94, 2.61, 2.28, 1.5, 2.5, 2.5, 2.22, 1.22, 2.22, 2.44, 1.83, 1.11]

Figure 15. Question 3: Difficulty (range 1 to 5)
[Plot: Speed (pixels/s, axis 500 to 3500) against Obscurement (% area, 0 to 75). Plotted values: 3.72, 3.47, 2.94, 1.72, 3.00, 3.33, 2.61, 1.5, 2.72, 2.61, 2.56, 1.17, 2.56, 2.61, 2.11, 1.06]

VI. SUMMARY AND DISCUSSION

Figure 16 presents the results of fitting a cubic spline surface to the data shown in Figure 7 and indicates, for convenience, contours of constant Accuracy. The interaction designer may then choose a minimum desired accuracy level or range and simply read off a combination of speed and obscurement that would meet their requirements. Note that this figure extrapolates the accuracy level to 100% obscurement (where it may reasonably be assumed that there can only be 0% Accuracy).

Note again that the effects of speed and obscurement are not generally equivalent: the effect of speed is largely linear over the range tested, whereas the effect of obscurement is minimal at low values but increasingly pronounced at higher values. This is reflected in the shape of the contour lines presented in Figure 16.

Results from the subjective questionnaires indicate that the participants considered the speeds of presentation employed to range from ‘Just Right’ for the slowest speed to between ‘Fast’ and ‘Very Fast’ for the two fastest speeds. Opinion about speed was apparently independent of the degree of obscurement used in the presentations.

Figure 17 shows the strong proportional correlation between participants’ opinion of their performance and their actual performance as measured, across all combinations of speed and occlusion. The linear regression fit, R², is 0.94 for question 2 and 0.91 for question 3.

Figure 16. Accuracy, Speed and Obscurement
[Plot: Speed (pixels/s, 1000 to 2500) against Obscurement (% area, 0 to 100), shaded by Accuracy (%, 0 to 100), with contours of constant Accuracy from 90 down to 10 and markers for Recorded Accuracy Values]

Figure 17. Correlation of Opinion to Performance
[Plot: Average Number of True Positives (0 to 10) against Average Opinion Rating (0.0 to 4.0). Series: Question 2 - confidence; Question 3 - difficulty]
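For reference, the R² of a least-squares straight-line fit, as quoted above for Figure 17, can be computed as in the following Python sketch. The function name `r_squared` is illustrative, and the sample points in the usage note are made up purely to exercise the function; they are not the study's data. In the setting of Figure 17, `xs` would hold the average number of true positives and `ys` the average opinion rating for each speed and obscurement combination.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be equal")

    # Least-squares slope and intercept.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y values must not all be equal")

    return 1.0 - ss_res / ss_tot
```

Points lying exactly on a straight line, such as `r_squared([0, 1, 2, 3], [1, 3, 5, 7])`, give a value of 1 up to floating-point rounding; values near 0.9 indicate that about 90% of the variance in opinion is accounted for by the linear fit.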

It would seem to be an attractive property of a presentation design that users’ intuitive perception of their performance matches that which they actually achieve. However, it remains unclear why participants are generally more pessimistic than they need be at slower speeds and lower obscurement levels, even though they are, by and large, properly pessimistic about their performance at the more challenging settings.

Given the ubiquity of the problems associated with speed and overlap in RSVP design, this paper has been concerned solely with design performance issues. While we have used a highly controlled experimental design, we consider that our results may well prove useful for interaction designers developing a wide range of RSVP designs. It should also be noted that the effects of image presentation rate on individuals’ cognitive ability to both recognise and remember images have been extensively studied (e.g. [14], [15]); however, such considerations are beyond the scope of this paper.

Figure 18. Accuracy, Pace and Overlap (calculated)
[Plot: Pace (images/s, 0 to 60) against Calculated Effective Overlap (% area, 0 to 100), shaded by Accuracy (%, 0 to 100), with contours of constant Accuracy from 90 down to 20 and markers for Recorded Accuracy Values]

VII. DESIGN CONSIDERATIONS

The experiment described in this paper keeps separate

the effects of image speed and the degree to which images

are obscured. In normal practice, this obscuration would

be as a consequence of images overlapping, as shown in

Figure 2. This artificial separation results in the obscurement

of images shown in Figure 3. This ensures the same number

of images (and part-images) are on the screen at the same

time regardless of which speed or obscurement levels are

selected.

The interaction designer may be more concerned with the effect on accuracy of pace, rather than image speed. Since image speed, overlap and pace are simply related, it is possible to transform the experimental data gathered in these experiments to generate Figure 18, under a single assumption. The assumption is that the accuracy of target identification is a function only of image speed and occlusion and is independent of pace. This conjecture is as yet unproven, though various previous studies give us some confidence in its validity. The assumption needs to be tested by future experiment.
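One way the relation between image speed, overlap and pace might be written is sketched below in Python. The geometry is an assumption made for illustration and is not taken from the experiment: identical images of width `w` and height `h` travel along their own diagonal, so that consecutive images are offset by a fraction `t` of the diagonal length and each image is covered by its successor over a fraction (1 − t)² of its area. The function names and the image size in the usage note are hypothetical.

```python
import math


def overlap_fraction(speed, pace, width, height):
    """Area fraction of an image covered by its successor.

    speed  -- image speed along the diagonal path, in pixels/s
    pace   -- rate at which images enter the display, in images/s
    width, height -- image size in pixels (ASSUMED identical for all images)
    """
    if speed <= 0 or pace <= 0 or width <= 0 or height <= 0:
        raise ValueError("all arguments must be positive")
    spacing = speed / pace                   # pixels between successive images
    t = spacing / math.hypot(width, height)  # offset as a fraction of the diagonal
    if t >= 1.0:
        return 0.0                           # images no longer touch
    return (1.0 - t) ** 2


def pace_for_overlap(speed, overlap, width, height):
    """Pace (images/s) giving the requested area overlap, with 0 <= overlap < 1."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must lie in [0, 1)")
    if speed <= 0 or width <= 0 or height <= 0:
        raise ValueError("speed and image size must be positive")
    t = 1.0 - math.sqrt(overlap)             # invert overlap = (1 - t)**2
    return speed / (t * math.hypot(width, height))
```

For a hypothetical 300 × 400 pixel image (diagonal 500 pixels) moving at 1000 pixels/s, a pace of 4 images/s spaces the images 250 pixels apart, which under these assumptions corresponds to 25% area overlap. For a fixed overlap, pace scales linearly with speed in this model.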

Accepting the assumption, Figure 18 shows the freedom of design choice of pace and overlap available for a specified minimum Accuracy level of theme target identification. The area plotted in Figure 18 represents the range of the experimental evidence; the other areas represent image presentation speeds that fall outside our previously identified useful boundaries (too fast or too slow).

VIII. CONCLUSIONS

This paper identifies and investigates the independent effects of image obscurement and image speed in a diagonal RSVP. We observed a strongly proportional effect of speed on accuracy, whereas small amounts of obscurement had relatively little effect, although the effect increased markedly with further obscurement. Under the stated assumptions, we present our findings in a form (Figure 18) that allows a visual designer to choose an efficacious combination of image speed, overlap and pace to achieve a desired accuracy of target image recognition.

IX. ACKNOWLEDGEMENTS

The authors gratefully acknowledge the valuable discussions with Dr. Kent Wittenburg of Mitsubishi Electric Research Laboratories.

This work was financially supported by the Old Centralians’ Trust of Imperial College London.

REFERENCES

[1] R. Spence, “Rapid, Serial and Visual: A Presentation Technique with Potential,” Information Visualization, vol. 1, no. 1, pp. 13–19, Mar. 2002.

[2] O. de Bruijn and R. Spence, “Rapid Serial Visual Presentation: A Space-Time Trade-Off in Information Presentation,” in Proceedings of the Working Conference on Advanced Visual Interfaces (AVI 2000). Palermo, Italy: ACM Press, 2000, pp. 189–192.

[3] K. Wittenburg, C. Forlines, T. Lanning, A. Esenther, S. Harada, and T. Miyachi, “Rapid Serial Visual Presentation Techniques for Consumer Digital Video Devices,” in Proceedings of the 16th Annual ACM Symposium on User Interface Software and Technology (UIST ’03). ACM Press, 2003, pp. 115–124.

[4] T. Tse, G. Marchionini, W. Drag, L. Slaughter, and A. Komlodi, “Dynamic Key Frame Presentation Techniques for Augmenting Video Browsing,” in Proceedings of the Working Conference on Advanced Visual Interfaces (AVI ’98), L’Aquila, Italy, 1998, pp. 185–194.

[5] O. de Bruijn and C. H. Tong, “M-RSVP: Mobile Web Browsing on a PDA,” in People and Computers XVII, Proceedings of HCI 2003, 2003, pp. 297–312.

[6] M. Witkowski, B. Neville, and J. Pitt, “Agent mediated retailing in the connected local community,” Interacting with Computers, vol. 15, no. 1, pp. 5–32, Jan. 2003.

[7] K. Cooper, O. de Bruijn, R. Spence, and M. Witkowski, “A Comparison of Static and Moving Presentation Modes for Image Collections,” in Proceedings of the Working Conference on Advanced Visual Interfaces (AVI ’06). New York, New York, USA: ACM Press, 2006, pp. 381–388.

[8] O. de Bruijn and R. Spence, “Patterns of Eye Gaze during Rapid Serial Visual Presentation,” in Proceedings of the Working Conference on Advanced Visual Interfaces (AVI ’02). New York, New York, USA: ACM Press, 2002, pp. 209–217.

[9] S. Corsato, M. Mosconi, and M. Porta, “An Eye Tracking Approach to Image Search Activities Using RSVP Display Techniques,” in Proceedings of the Working Conference on Advanced Visual Interfaces (AVI 2008). ACM, 2008, pp. 416–420.

[10] K. Wittenburg, T. Lanning, C. Forlines, and A. Esenther, “Rapid Serial Visual Presentation Techniques for Visualizing a 3rd Data Dimension,” in HCI International, Crete, Greece, 2003, pp. 810–814.

[11] L. Sun and F. Guimbretiere, “Flipper: a New Method of Digital Document Navigation,” in Extended Abstracts on Human Factors in Computing Systems (CHI ’05). Portland, OR, USA: ACM Press, 2005, pp. 2001–2004.

[12] K. L. Shapiro, K. M. Arnell, and J. E. Raymond, “The Attentional Blink,” Trends in Cognitive Sciences, vol. 1, no. 8, pp. 291–296, 1997.

[13] R. A. Bailey, “Orthogonal Partitions in Designed Experiments,” Designs, Codes and Cryptography, vol. 77, no. 1-2, pp. 45–77, 1996.

[14] V. Coltheart, Ed., Fleeting Memories: Cognition of Brief Visual Stimuli, 1st ed. Cambridge, Massachusetts, United States: MIT Press, 1999.

[15] M. C. Potter, B. Wyble, R. Pandav, and J. Olejarczyk, “Picture detection in rapid serial visual presentation: Features or identity?” Journal of Experimental Psychology: Human Perception and Performance, Aug. 2010.
